Description
What happened?
When I open a dataset without loading it and perform operations on it, the data array gets corrupted. The dimensions seem to be in a different order than the coordinates, so the data array can no longer be used. If I load the dataset right after opening it, the issue does not occur.
What did you expect to happen?
I expect the data array to stay consistent with its coordinates after operations are applied to it, exactly as it does when the data is loaded first.
Minimal Complete Verifiable Example
import xarray as xr
import numpy as np

coords = {
    "location": ["a", "b", "c"],
    "duration": [0.3, 0.25, 0.5, 1.0, 3.0],
    "dof": ["x", "y", "z", "rx", "ry", "rz"],
    "motion": ["dis", "vel"],
    "wave_tp": np.arange(3, 19, 1),
    "wave_dir": np.arange(0, 361, 15),
}

ds = xr.Dataset(
    {
        "X": (list(coords.keys()), np.random.rand(*[len(e) for e in coords.values()])),
    },
    coords=coords,
)

with open("tmp.nc", "wb") as fp:
    ds.to_netcdf(fp)

with open("tmp.nc", "rb") as fp:
    # If I perform a .load() here, the bug disappears
    ds = xr.open_dataset(fp)  # .load()

    a = ds["X"].sel(
        wave_dir=np.arange(0, 360, 30),
        dof="z",
        motion="vel",
    )
    b = 1 / a

# Here you can see that the dataset has the wrong coordinates.
# It says location has 12 values, but it should have 3.
display(b)

b.sel(location='a')
MVCE confirmation
- Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- Complete example — the example is self-contained, including all data and the text of any traceback.
- Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
- New issue — a search of GitHub Issues suggests this is not a duplicate.
- Recent environment — the issue occurs with the latest version of xarray and its dependencies.
Relevant log output
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[1], line 37
     33 # Here you can see that the dataset has the wrong coordinates.
     34 # It says location has 12 values, but it should have 3.
     35 display(b)
---> 37 b.sel(location='a')

File c:\tools\python312\Lib\site-packages\xarray\core\dataarray.py:1683, in DataArray.sel(self, indexers, method, tolerance, drop, **indexers_kwargs)
   1567 def sel(
   1568     self,
   1569     indexers: Mapping[Any, Any] | None = None,
   (...)
   1573     **indexers_kwargs: Any,
   1574 ) -> Self:
   1575     """Return a new DataArray whose data is given by selecting index
   1576     labels along the specified dimension(s).
   1577
   (...)
   1681     Dimensions without coordinates: points
   1682     """
-> 1683     ds = self._to_temp_dataset().sel(
   1684         indexers=indexers,
   1685         drop=drop,
   1686         method=method,
   1687         tolerance=tolerance,
   1688         **indexers_kwargs,
   1689     )
   1690     return self._from_temp_dataset(ds)

File c:\tools\python312\Lib\site-packages\xarray\core\dataarray.py:598, in DataArray._to_temp_dataset(self)
    597 def _to_temp_dataset(self) -> Dataset:
--> 598     return self._to_dataset_whole(name=_THIS_ARRAY, shallow_copy=False)

File c:\tools\python312\Lib\site-packages\xarray\core\dataarray.py:665, in DataArray._to_dataset_whole(self, name, shallow_copy)
    662 indexes = self._indexes
    664 coord_names = set(self._coords)
--> 665 return Dataset._construct_direct(variables, coord_names, indexes=indexes)

File c:\tools\python312\Lib\site-packages\xarray\core\dataset.py:1133, in Dataset._construct_direct(cls, variables, coord_names, dims, attrs, indexes, encoding, close)
   1129 """Shortcut around __init__ for internal use when we want to skip
   1130 costly validation
   1131 """
   1132 if dims is None:
-> 1133     dims = calculate_dimensions(variables)
   1134 if indexes is None:
   1135     indexes = {}

File c:\tools\python312\Lib\site-packages\xarray\core\variable.py:3072, in calculate_dimensions(variables)
   3070     last_used[dim] = k
   3071 elif dims[dim] != size:
-> 3072     raise ValueError(
   3073         f"conflicting sizes for dimension {dim!r}: "
   3074         f"length {size} on {k!r} and length {dims[dim]} on {last_used!r}"
   3075     )
   3076 return dims

ValueError: conflicting sizes for dimension 'location': length 12 on <this-array> and length 3 on {'wave_dir': 'wave_dir', 'wave_tp': 'wave_tp', 'duration': 'duration', 'location': 'location'}
Anything else we need to know?
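For reference, here is a small sketch in plain Python (no xarray involved) of the dimension sizes I would expect after the selection in the example. The helper `expected_sizes` is hypothetical and written only for this illustration. It assumes that selecting a list of labels keeps the dimension with one entry per label, and that selecting a single scalar label drops the dimension.

```python
# Coordinates from the example, written with plain Python ranges
# instead of np.arange so that no third-party package is needed.
coords = {
    "location": ["a", "b", "c"],
    "duration": [0.3, 0.25, 0.5, 1.0, 3.0],
    "dof": ["x", "y", "z", "rx", "ry", "rz"],
    "motion": ["dis", "vel"],
    "wave_tp": list(range(3, 19)),
    "wave_dir": list(range(0, 361, 15)),
}

# The same selection as in the example.
selection = {
    "wave_dir": list(range(0, 360, 30)),
    "dof": "z",
    "motion": "vel",
}


def expected_sizes(coords, selection):
    """Hypothetical helper: sizes a label-based selection should produce."""
    sizes = {}
    for dim, labels in coords.items():
        if dim not in selection:
            # Dimension not selected on: it keeps its full length.
            sizes[dim] = len(labels)
        elif isinstance(selection[dim], list):
            # List of labels: one entry per selected label.
            sizes[dim] = len(selection[dim])
        # Scalar label: the dimension is dropped, so nothing is recorded.
    return sizes


sizes = expected_sizes(coords, selection)
print(sizes)
# {'location': 3, 'duration': 5, 'wave_tp': 16, 'wave_dir': 12}
```

The length 12 that the error reports for location matches the expected length of wave_dir, which fits my impression that the dimension sizes end up in a different order than the coordinates.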
Environment
xarray: 2025.4.0
pandas: 2.2.3
numpy: 2.2.6
scipy: 1.15.3
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
zarr: None
cftime: None
nc_time_axis: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: None
pip: 25.1.1
conda: None
pytest: None
mypy: None
IPython: 9.2.0
sphinx: None