Closed
Description
Code Sample
import xarray as xr
import numpy as np
def standardize(x):
return (x - x.mean()) / x.std()
ds = xr.Dataset()
ds["variable"] = xr.DataArray(np.random.rand(4,3,5),
{"lat":np.arange(4), "lon":np.arange(3), "time":np.arange(5)},
("lat", "lon", "time"),
)
ds["id"] = xr.DataArray(np.arange(12.0).reshape((4,3)),
{"lat": np.arange(4), "lon":np.arange(3)},
("lat", "lon"),
)
ds["id"].values[0,0] = np.nan
ds.groupby("id").apply(standardize)
Problem description
This results in an IndexError. This is mildly confusing, it took me a little while to figure out the NaN's were to blame. I'm guessing the NaN doesn't get filtered out everywhere.
The traceback:
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-2-267ba57bc264> in <module>()
15 ds["id"].values[0,0] = np.nan
16
---> 17 ds.groupby("id").apply(standardize)
C:\Miniconda3\envs\main\lib\site-packages\xarray\core\groupby.py in apply(self, func, **kwargs)
607 kwargs.pop('shortcut', None) # ignore shortcut if set (for now)
608 applied = (func(ds, **kwargs) for ds in self._iter_grouped())
--> 609 return self._combine(applied)
610
611 def _combine(self, applied):
C:\Miniconda3\envs\main\lib\site-packages\xarray\core\groupby.py in _combine(self, applied)
614 coord, dim, positions = self._infer_concat_args(applied_example)
615 combined = concat(applied, dim)
--> 616 combined = _maybe_reorder(combined, dim, positions)
617 if coord is not None:
618 combined[coord.name] = coord
C:\Miniconda3\envs\main\lib\site-packages\xarray\core\groupby.py in _maybe_reorder(xarray_obj, dim, positions)
428
429 def _maybe_reorder(xarray_obj, dim, positions):
--> 430 order = _inverse_permutation_indices(positions)
431
432 if order is None:
C:\Miniconda3\envs\main\lib\site-packages\xarray\core\groupby.py in _inverse_permutation_indices(positions)
109 positions = [np.arange(sl.start, sl.stop, sl.step) for sl in positions]
110
--> 111 indices = nputils.inverse_permutation(np.concatenate(positions))
112 return indices
113
C:\Miniconda3\envs\main\lib\site-packages\xarray\core\nputils.py in inverse_permutation(indices)
52 # use intp instead of int64 because of windows :(
53 inverse_permutation = np.empty(len(indices), dtype=np.intp)
---> 54 inverse_permutation[indices] = np.arange(len(indices), dtype=np.intp)
55 return inverse_permutation
56
IndexError: index 11 is out of bounds for axis 0 with size 11
Expected Output
My assumption was that it would throw out the values that fall within the NaN group, likepandas
:
import pandas as pd
import numpy as np
df = pd.DataFrame()
df["var"] = np.random.rand(10)
df["id"] = np.arange(10)
df["id"].iloc[0:2] = np.nan
df.groupby("id").mean()
Out:
var
id
2.0 0.565366
3.0 0.744443
4.0 0.190983
5.0 0.196922
6.0 0.377282
7.0 0.141419
8.0 0.957526
9.0 0.207360
Output of xr.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 45 Stepping 7, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
xarray: 0.10.8
pandas: 0.23.3
numpy: 1.15.0
scipy: 1.1.0
netCDF4: 1.4.0
h5netcdf: 0.6.1
h5py: 2.8.0
Nio: None
zarr: None
bottleneck: 1.2.1
cyordereddict: None
dask: 0.18.2
distributed: 1.22.0
matplotlib: 2.2.2
cartopy: None
seaborn: None
setuptools: 40.0.0
pip: 18.0
conda: None
pytest: 3.7.1
IPython: 6.4.0
sphinx: 1.7.5