Skip to content

nan values appearing when saving and loading from netCDF due to encoding #7691

Closed
@euronion

Description

@euronion

What happened?

When writing to and reading my dataset from netCDF using ds.to_netcdf() and xr.open_dataset(...), xarray creates nan values where previously number values (float32) where.

The issue seems related to the encoding used for the original dataset, which causes the data to be stored as short. During loading, the stored values then collide with _FillValue leading to the numbers being interpreted as nan.

What did you expect to happen?

Values after saving & loading should be the same as before saving.

Minimal Complete Verifiable Example

We had a back-and-forth on SO about this, I hope it's fine to just refer to it here:

https://stackoverflow.com/a/75806771/11318472

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

See the SO link above.

Anything else we need to know?

I'm not sure whether this should be considered a bug or just a combination of conflicting features. My current workaround is resetting the encoding and letting xarray decide to store as float instead of short (cf. #7686).

Environment

INSTALLED VERSIONS

commit: None
python: 3.11.0 | packaged by conda-forge | (main, Oct 25 2022, 06:24:40) [GCC 10.4.0]
python-bits: 64
OS: Linux
OS-release: 5.15.90.1-microsoft-standard-WSL2
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: C.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: 4.8.1

xarray: 2022.11.0
pandas: 1.5.2
numpy: 1.23.5
scipy: 1.10.0
netCDF4: 1.6.2
pydap: None
h5netcdf: 1.1.0
h5py: 3.8.0
Nio: None
zarr: 2.13.6
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.3.3
cfgrib: None
iris: None
bottleneck: 1.3.5
dask: 2022.02.1
distributed: 2022.2.1
matplotlib: 3.6.2
cartopy: None
seaborn: None
numbagg: None
fsspec: 2022.11.0
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 65.5.1
pip: 22.3.1
conda: None
pytest: 7.2.0
IPython: 8.11.0
sphinx: None

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions