Description
I'm not sure if this is a bug or I'm not using xarray correctly, but I used to be able to do this without crashing. The new behavior seems to have been introduced some time between 0.16.2 and 0.18.2.
What happened:
Traceback (most recent call last):
  File "scripts/repro.py", line 12, in <module>
    ds = ds.unstack(['c'])
  File "/home/darsh/src/notebooks/build/venv/lib/python3.8/site-packages/xarray/core/dataset.py", line 4024, in unstack
    raise ValueError(
ValueError: cannot unstack dimensions that do not have a MultiIndex: ['c']
What you expected to happen:
The code runs without the ValueError exception.
Minimal Complete Verifiable Example:
from xarray import DataArray, Dataset
a = DataArray([0], dims=['a'])
b = a.stack(b=('a',)).reset_index('b')
c = b.stack({'c': ['b']})
ds = Dataset({'d': DataArray(c.data, dims=['c'])}, coords=c.coords)
print('\nBefore:')
print(ds)
ds = ds.unstack(['c'])
print('\nAfter:')
print(ds)
Anything else we need to know?:
Here's the full output from the example on 0.18.2:
Before:
<xarray.Dataset>
Dimensions:  (c: 1)
Coordinates:
  * c        (c) MultiIndex
  - b        (c) int64 0
    a        (c) int64 0
Data variables:
    d        (c) int64 0
Traceback (most recent call last):
  File "scripts/repro.py", line 12, in <module>
    ds = ds.unstack(['c'])
  File "/home/darsh/src/notebooks/build/venv/lib/python3.8/site-packages/xarray/core/dataset.py", line 4024, in unstack
    raise ValueError(
ValueError: cannot unstack dimensions that do not have a MultiIndex: ['c']
What confuses me is that the c dimension is shown as a MultiIndex, but it still complains that it doesn't have a MultiIndex. Directly unstacking ds.d rather than the dataset itself also fails with the same exception.
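Since the repr and the error message disagree, one way to look past the repr is to inspect the pandas index object stored for the dimension. The following is only a minimal sketch: `describe_index` is a hypothetical helper (not part of xarray), and `ref` is a plain two-dimension stack used as a known-good reference, not the failing dataset from the example above.

```python
import pandas as pd
import xarray as xr


def describe_index(obj, dim):
    """Return the class name of the pandas index behind `dim`, or None.

    Hypothetical debugging helper; not part of xarray.
    """
    index = obj.indexes.get(dim)
    return None if index is None else type(index).__name__


# Known-good reference: stacking two dimensions yields a MultiIndex on 'c'.
ref = xr.Dataset({"d": (("x", "y"), [[0, 1]])}).stack(c=("x", "y"))

print(describe_index(ref, "c"))
print(isinstance(ref.indexes["c"], pd.MultiIndex))
```

Calling `describe_index(ds, 'c')` on the failing dataset just before `unstack` should show whether the object behind `c` is still a `pandas.MultiIndex` at that point, regardless of what the repr prints.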
Oddly, it seems to work if I assign the coordinates after constructing the dataset:
diff --git a/scripts/repro.py b/scripts/repro.py
index ed2ae7c..d5bd6a3 100644
--- a/scripts/repro.py
+++ b/scripts/repro.py
@@ -5,7 +5,7 @@ a = DataArray([0], dims=['a'])
b = a.stack(b=('a',)).reset_index('b')
c = b.stack({'c': ['b']})
-ds = Dataset({'d': DataArray(c.data, dims=['c'])}, coords=c.coords)
+ds = Dataset({'d': DataArray(c.data, dims=['c'])}).assign_coords(c.coords)
print('\nBefore:')
print(ds)
With that workaround, or by downgrading to 0.16.2, the example doesn't crash:
Before:
<xarray.Dataset>
Dimensions:  (c: 1)
Coordinates:
  * c        (c) MultiIndex
  - b        (c) int64 0
    a        (c) int64 0
Data variables:
    d        (c) int64 0
After:
<xarray.Dataset>
Dimensions:  (b: 1)
Coordinates:
    a        (b) int64 0
  * b        (b) int64 0
Data variables:
    d        (b) int64 0
Environment:
Output of xr.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.8.0 (default, Feb 25 2021, 22:10:10)
[GCC 8.4.0]
python-bits: 64
OS: Linux
OS-release: 5.4.0-73-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: None
libnetcdf: None
xarray: 0.18.2
pandas: 1.2.4
numpy: 1.20.3
scipy: 1.6.3
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2021.05.0
distributed: None
matplotlib: 3.4.2
cartopy: None
seaborn: None
numbagg: None
pint: 0.17
setuptools: 39.0.1
pip: 21.1.1
conda: None
pytest: 6.2.4
IPython: 7.23.1
sphinx: None
None