Description
What happened:
While testing xarray data storage in a Jupyter notebook with varying data sizes and storing to a netCDF file, I noticed that open_dataset and open_dataarray (both show this behaviour) continue to return data from the first testing run, even though each run deletes the previously created netCDF file.
This only happens once the repr has been used to display the xarray object.
But once in this error state, even objects that previously printed correctly show the wrong data.
This was hard to track down, as it depends on the precise sequence of cell executions in Jupyter.
What you expected to happen:
When I use open_dataset or open_dataarray, the resulting object should reflect the actual state of the file on disk.
Minimal Complete Verifiable Example:
import xarray as xr
from pathlib import Path
import numpy as np

def test_repr(nx):
    ds = xr.DataArray(np.random.rand(nx))
    path = Path("saved_on_disk.nc")
    if path.exists():
        path.unlink()
    ds.to_netcdf(path)
    return path
When executed in a cell with print for display, all is fine:
test_repr(4)
print(xr.open_dataset("saved_on_disk.nc"))
test_repr(5)
print(xr.open_dataset("saved_on_disk.nc"))
But as soon as one cell uses the Jupyter repr:
xr.open_dataset("saved_on_disk.nc")
all future file reads, even after executing the test function again and even when using print instead of the repr, show the data from the last repr use.
Anything else we need to know?:
Here's a notebook showing the issue:
https://gist.github.com/05c2542ed33662cdcb6024815cc0c72c
Environment:
Output of xr.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.7.6 | packaged by conda-forge | (default, Jun 1 2020, 18:57:50)
[GCC 7.5.0]
python-bits: 64
OS: Linux
OS-release: 5.4.0-40-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.6
libnetcdf: 4.7.4
xarray: 0.16.0
pandas: 1.0.5
numpy: 1.19.0
scipy: 1.5.1
netCDF4: 1.5.3
pydap: None
h5netcdf: None
h5py: 2.10.0
Nio: None
zarr: None
cftime: 1.2.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.1.5
cfgrib: None
iris: None
bottleneck: None
dask: 2.21.0
distributed: 2.21.0
matplotlib: 3.3.0
cartopy: 0.18.0
seaborn: 0.10.1
numbagg: None
pint: None
setuptools: 49.2.0.post20200712
pip: 20.1.1
conda: installed
pytest: 6.0.0rc1
IPython: 7.16.1
sphinx: 3.1.2