Hi,
Dataset.isel and DataArray.isel are very slow for me compared with indexing the underlying numpy array directly. Do you know where this overhead comes from?
import numpy as np
import xarray as xr

ds = xr.Dataset(
    {
        "a": ("time", np.arange(55_000_000))
    }, coords={
        "time": np.arange(55_000_000)
    }
)
time_filter = ds.time > 50_000
Select some values with DataArray.isel:
%timeit ds.a.isel(time=time_filter)
2.22 s ± 375 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Use the native numpy approach:
%timeit ds.a.values[time_filter]
163 ms ± 12.1 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
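To help narrow down where the time goes, here is a rough sketch that times a few equivalent ways of doing the same selection. The array size `N`, the `variants` dictionary, and the use of `timeit.repeat` are my own choices for illustration (the size is reduced so that it runs quickly), and I have not included timings because they will depend on the machine and the xarray version:

```python
import timeit

import numpy as np
import xarray as xr

# Assumption: a smaller array than the 55_000_000 elements above,
# so that the comparison finishes quickly.
N = 1_000_000

ds = xr.Dataset(
    {"a": ("time", np.arange(N))},
    coords={"time": np.arange(N)},
)

time_filter = ds.time > 50_000    # boolean DataArray, as in the report
mask = time_filter.values         # the same mask as a plain numpy array
positions = np.flatnonzero(mask)  # integer positions of the True entries

# Equivalent selections; each one returns the same values.
variants = {
    "isel, DataArray mask": lambda: ds.a.isel(time=time_filter),
    "isel, numpy mask": lambda: ds.a.isel(time=mask),
    "isel, integer positions": lambda: ds.a.isel(time=positions),
    # Indexes only the underlying Variable, without the time coordinate.
    "Variable.isel, numpy mask": lambda: ds.a.variable.isel(time=mask),
    "numpy": lambda: ds.a.values[mask],
}

# Run each variant once to keep its result, then time it separately.
results = {name: fn() for name, fn in variants.items()}
timings = {
    name: min(timeit.repeat(fn, number=1, repeat=3))
    for name, fn in variants.items()
}

for name, seconds in timings.items():
    print(f"{name:28s} {seconds * 1e3:8.1f} ms")
```

If the variants that go through DataArray.isel are all similarly slow while Variable.isel is close to numpy, that would point at the handling of the time coordinate rather than at the data indexing itself, but that is only a guess until the numbers are in.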
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Linux
OS-release: 3.16.0-4-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_US.utf8
LOCALE: en_US.UTF-8
xarray: 0.10.4
pandas: 0.23.0
numpy: 1.14.2
scipy: 1.1.0
netCDF4: 1.4.0
h5netcdf: 0.5.1
h5py: 2.8.0
Nio: None
zarr: None
bottleneck: 1.2.1
cyordereddict: None
dask: 0.17.5
distributed: 1.21.8
matplotlib: 2.2.2
cartopy: 0.16.0
seaborn: 0.8.1
setuptools: 39.1.0
pip: 9.0.3
conda: None
pytest: 3.5.1
IPython: 6.4.0
sphinx: 1.7.4