Description
What happened:
Dataset.interp silently drops boolean variables.
What you expected to happen:
If I'm interpolating a group of variables, I expect to get all of them back in the correct shape with relevant values in them.
If the variables are boolean or object arrays, I don't expect linear interpolation, because it doesn't make sense for those dtypes, but stepwise interpolation such as nearest or zero-order interpolation should be reasonable to expect.
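To make the expectation concrete, here is a minimal NumPy-only sketch of what nearest-value lookup could look like for a boolean array. `nearest_lookup` is a hypothetical helper written for illustration, not part of xarray, and it assumes a 1-D coordinate. Ties go to the earlier sample because `argmin` returns the first minimum.

```python
import numpy as np


def nearest_lookup(x, y, x_new):
    """Hypothetical helper: for each x_new, return the y at the closest x."""
    x = np.asarray(x, dtype=float)
    x_new = np.asarray(x_new, dtype=float)
    # Distance from every new point (rows) to every known point (columns).
    distances = np.abs(x_new[:, None] - x[None, :])
    # Index of the closest known point; argmin picks the first one on ties.
    idx = distances.argmin(axis=1)
    # Plain indexing, so the dtype of y (bool, object, ...) is preserved.
    return np.asarray(y)[idx]


flags = np.array([False, True])
result = nearest_lookup([0, 1], flags, [0, 0.25, 1, 2])
print(result, result.dtype)
```

Because the result is produced by indexing rather than arithmetic, no cast to float is needed and the values stay boolean.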
Minimal Complete Verifiable Example:
import dask.array as da  # assumed: `da` is used below but was never imported
import numpy as np
import xarray as xr

a = np.arange(0, 5)
b = np.core.defchararray.add("long_variable_name", a.astype(str))
coords = dict(time=da.array([0, 1]))
data_vars = dict()
for v in b:
    data_vars[v] = xr.DataArray(
        name=v,
        data=np.array([0, 1]).astype(bool),
        dims=["time"],
        coords=coords,
    )
ds1 = xr.Dataset(data_vars)
# Print raw data:
print(ds1)
Out[3]:
<xarray.Dataset>
Dimensions:              (time: 2)
Coordinates:
  * time                 (time) int32 0 1
Data variables:
    long_variable_name0  (time) bool False True
    long_variable_name1  (time) bool False True
    long_variable_name2  (time) bool False True
    long_variable_name3  (time) bool False True
    long_variable_name4  (time) bool False True
# Interpolate:
ds1 = ds1.interp(
    time=da.array([0, 0.5, 1, 2]),
    assume_sorted=True,
    method="nearest",
    kwargs=dict(fill_value="extrapolate"),
)
# Print interpolated data:
print(ds1)
<xarray.Dataset>
Dimensions:  (time: 4)
Coordinates:
  * time     (time) float64 0.0 0.5 1.0 2.0
Data variables:
    *empty*
Anything else we need to know?:
ds.interp_like uses ds.reindex in these cases, which seems like a good choice for ds.interp as well. However, I think that both ds.interp and ds.interp_like should fill with the nearest value by default instead of np.nan, because we're still requesting interpolation.
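For reference, a minimal sketch of the reindex route with nearest filling, which keeps the boolean variable. The variable name `flag` and the new time values are made up for illustration, and the coordinate is given as floats to keep the label comparison simple.

```python
import numpy as np
import xarray as xr

# Small dataset with one boolean variable on a float "time" coordinate.
ds = xr.Dataset(
    {"flag": ("time", np.array([False, True]))},
    coords={"time": [0.0, 1.0]},
)
new_time = [0.0, 0.25, 1.0, 2.0]

# reindex with method="nearest" maps each new label to the closest existing
# label, so the boolean variable is carried over instead of being dropped.
filled = ds.reindex(time=new_time, method="nearest")
print(filled["flag"].values)
```

With no `tolerance` given, the out-of-range label 2.0 is also matched to its nearest neighbour (1.0) rather than being filled with a missing value.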
Environment:
Output of xr.show_versions()
xr.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.8.5 (default, Sep 3 2020, 21:29:08) [MSC v.1916 64 bit (AMD64)]
python-bits: 64
OS: Windows
libhdf5: 1.10.4
libnetcdf: None
xarray: 0.16.2
pandas: 1.1.5
numpy: 1.17.5
scipy: 1.4.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: 2.10.0
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2020.12.0
distributed: 2020.12.0
matplotlib: 3.3.2
cartopy: None
seaborn: 0.11.1
numbagg: None
pint: None
setuptools: 51.0.0.post20201207
pip: 20.3.3
conda: 4.9.2
pytest: 6.2.1
IPython: 7.19.0
sphinx: 3.4.0