Regression: "ValueError: cannot unstack dimensions that do not have a MultiIndex" when unstacking a MultiIndex #5384

Closed
@dranjan

I'm not sure whether this is a bug or I'm using xarray incorrectly, but this code used to run without raising an exception. The new behavior seems to have been introduced somewhere between 0.16.2 and 0.18.2.

What happened:

Traceback (most recent call last):
  File "scripts/repro.py", line 12, in <module>
    ds = ds.unstack(['c'])
  File "/home/darsh/src/notebooks/build/venv/lib/python3.8/site-packages/xarray/core/dataset.py", line 4024, in unstack
    raise ValueError(
ValueError: cannot unstack dimensions that do not have a MultiIndex: ['c']

What you expected to happen:

The code runs without the ValueError exception.

Minimal Complete Verifiable Example:

from xarray import DataArray, Dataset


a = DataArray([0], dims=['a'])
b = a.stack(b=('a',)).reset_index('b')
c = b.stack({'c': ['b']})

ds = Dataset({'d': DataArray(c.data, dims=['c'])}, coords=c.coords)
print('\nBefore:')
print(ds)

ds = ds.unstack(['c'])
print('\nAfter:')
print(ds)

Anything else we need to know?:

Here's the full output from the example on 0.18.2:


Before:
<xarray.Dataset>
Dimensions:  (c: 1)
Coordinates:
  * c        (c) MultiIndex
  - b        (c) int64 0
    a        (c) int64 0
Data variables:
    d        (c) int64 0
Traceback (most recent call last):
  File "scripts/repro.py", line 12, in <module>
    ds = ds.unstack(['c'])
  File "/home/darsh/src/notebooks/build/venv/lib/python3.8/site-packages/xarray/core/dataset.py", line 4024, in unstack
    raise ValueError(
ValueError: cannot unstack dimensions that do not have a MultiIndex: ['c']

What confuses me is that the repr shows the c dimension as a MultiIndex, yet unstack still complains that it doesn't have one. Unstacking ds.d directly, rather than the dataset itself, fails with the same exception.

Oddly, it seems to work if I assign the coordinates after constructing the dataset:

diff --git a/scripts/repro.py b/scripts/repro.py
index ed2ae7c..d5bd6a3 100644
--- a/scripts/repro.py
+++ b/scripts/repro.py
@@ -5,7 +5,7 @@ a = DataArray([0], dims=['a'])
 b = a.stack(b=('a',)).reset_index('b')
 c = b.stack({'c': ['b']})
 
-ds = Dataset({'d': DataArray(c.data, dims=['c'])}, coords=c.coords)
+ds = Dataset({'d': DataArray(c.data, dims=['c'])}).assign_coords(c.coords)
 print('\nBefore:')
 print(ds)
 

With that workaround, or by downgrading to 0.16.2, the example doesn't crash:


Before:
<xarray.Dataset>
Dimensions:  (c: 1)
Coordinates:
  * c        (c) MultiIndex
  - b        (c) int64 0
    a        (c) int64 0
Data variables:
    d        (c) int64 0

After:
<xarray.Dataset>
Dimensions:  (b: 1)
Coordinates:
    a        (b) int64 0
  * b        (b) int64 0
Data variables:
    d        (b) int64 0
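
For anyone who wants to compare the two constructions side by side, here is a minimal sketch. The helper names try_unstack and compare_constructions are hypothetical (they are not part of xarray), and what gets printed will depend on the installed xarray version, so no expected output is shown:

```python
def try_unstack(obj, dim):
    """Call obj.unstack(dim) and report the outcome instead of raising.

    Returns (True, result) on success, or (False, error message) when a
    ValueError is raised, as in the traceback in this report.
    """
    try:
        return True, obj.unstack(dim)
    except ValueError as exc:
        return False, str(exc)


def compare_constructions():
    """Run both Dataset constructions from this report (requires xarray)."""
    # Imported lazily so that try_unstack itself has no xarray dependency.
    from xarray import DataArray, Dataset

    # Same setup as the minimal example in this report.
    a = DataArray([0], dims=['a'])
    b = a.stack(b=('a',)).reset_index('b')
    c = b.stack({'c': ['b']})
    data = {'d': DataArray(c.data, dims=['c'])}

    candidates = {
        'coords= in constructor': Dataset(data, coords=c.coords),
        'assign_coords afterwards': Dataset(data).assign_coords(c.coords),
    }
    for label, ds in candidates.items():
        # Check both the Dataset and the DataArray it contains.
        ok_ds, _ = try_unstack(ds, ['c'])
        ok_da, _ = try_unstack(ds.d, ['c'])
        print(f'{label}: Dataset ok={ok_ds}, DataArray ok={ok_da}')
```

Calling compare_constructions() under different xarray versions should make it easy to see which construction path triggers the ValueError.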

Environment:

Output of xr.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.8.0 (default, Feb 25 2021, 22:10:10)
[GCC 8.4.0]
python-bits: 64
OS: Linux
OS-release: 5.4.0-73-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: None
libnetcdf: None

xarray: 0.18.2
pandas: 1.2.4
numpy: 1.20.3
scipy: 1.6.3
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2021.05.0
distributed: None
matplotlib: 3.4.2
cartopy: None
seaborn: None
numbagg: None
pint: 0.17
setuptools: 39.0.1
pip: 21.1.1
conda: None
pytest: 6.2.4
IPython: 7.23.1
sphinx: None
None
