open_rasterio does not read coordinates from netCDF file properly with netCDF4>=1.4.2 #3185

Closed
@weiji14

MCVE Code Sample

Adapted from the test_serialization unit test in xarray's backend tests.

import xarray as xr
from xarray.tests.test_backends import assert_identical, create_tmp_geotiff, create_tmp_file

with create_tmp_geotiff(additional_attrs={}) as (tmp_file, expected):
    # write it to a netcdf and read again (roundtrip) using open_rasterio
    with xr.open_rasterio(tmp_file) as rioda:
        with create_tmp_file(suffix='.nc') as tmp_nc_file:
            # Write geotiff to netcdf file
            rioda.to_netcdf(tmp_nc_file)
            
            # Read using open_dataarray works
            with xr.open_dataarray(tmp_nc_file) as ncds:
                assert_identical(rioda, ncds)
            
            # Read using open_rasterio doesn't work!!
            with xr.open_rasterio(tmp_nc_file) as ncds:
                assert_identical(rioda, ncds)

Actual Output (using netCDF4>=1.4.2)

AssertionError: Left and right DataArray objects are not identical

Differing coordinates:
L * x        (x) float64 5.5e+03 6.5e+03 7.5e+03 8.5e+03
R * x        (x) float64 0.5 1.5 2.5 3.5
L * y        (y) float64 7.9e+04 7.7e+04 7.5e+04
R * y        (y) float64 0.5 1.5 2.5
Differing attributes:
L   transform: (1000.0, 0.0, 5000.0, 0.0, -2000.0, 80000.0)
R   transform: (1.0, 0.0, 0.0, 0.0, 1.0, 0.0)
L   res: (1000.0, 2000.0)
R   res: (1.0, -1.0)
Attributes only on the left object:
    crs: +init=epsg:32618
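
The coordinate values in the diff above follow directly from the two transform attributes. As a rough illustration (not xarray's or rasterio's actual code), the sketch below derives pixel-centre coordinates from a six-element affine transform in the (a, b, c, d, e, f) order shown in the output. The function name pixel_center_coords is hypothetical, and the sketch assumes an unrotated grid (b == d == 0).

```python
def pixel_center_coords(transform, width, height):
    """Return (xs, ys) pixel-centre coordinates for an unrotated affine transform.

    `transform` is (a, b, c, d, e, f) so that x = a*col + b*row + c and
    y = d*col + e*row + f. This sketch assumes b == d == 0.
    """
    a, b, c, d, e, f = transform
    if b != 0 or d != 0:
        raise ValueError("this sketch only handles unrotated transforms")
    # Offset by 0.5 so each coordinate refers to the centre of the pixel.
    xs = [a * (col + 0.5) + c for col in range(width)]
    ys = [e * (row + 0.5) + f for row in range(height)]
    return xs, ys


# Transform of the original GeoTIFF (left side of the diff), 4 x 3 pixels.
good_xs, good_ys = pixel_center_coords(
    (1000.0, 0.0, 5000.0, 0.0, -2000.0, 80000.0), width=4, height=3
)

# Fallback identity-like transform reported for the .nc file (right side).
bad_xs, bad_ys = pixel_center_coords(
    (1.0, 0.0, 0.0, 0.0, 1.0, 0.0), width=4, height=3
)
```

With the GeoTIFF transform this gives x = 5500, 6500, 7500, 8500 and y = 79000, 77000, 75000; with the fallback transform it gives 0.5, 1.5, ... for both axes. That matches the diff, which suggests the georeferencing is not being picked up from the .nc file at all rather than being read incorrectly.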

Expected Output (using netCDF4==1.4.1)

AssertionError: Left and right DataArray objects are not identical

Differing attributes:
L   nodatavals: (nan, nan, nan)
R   nodatavals: (nan, nan, nan)
Attributes only on the left object:
    crs: +init=epsg:32618

Problem Description

I have a script that takes either NetCDF or GeoTIFF files as input, and until now xr.open_rasterio has read both successfully; I'm basically just interested in parsing the correct XY coordinates out. However, upgrading netCDF4 from 1.4.1 to 1.4.2 breaks this behaviour: xr.open_rasterio no longer parses the coordinates properly from a .nc file.

My hunch is that it has something to do with the different NetCDF formats (e.g. 'NETCDF4', 'NETCDF4_CLASSIC', 'NETCDF3_64BIT'), but looking at the xr.open_rasterio code, I'm not sure what's going on. I'm also not sure whether this is an upstream netcdf4-python or rasterio issue, but I thought I'd report it here first.
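
Since reading the .nc file with xr.open_dataarray works in the example above, one possible stopgap for a script that accepts both formats is to choose the reader from the file extension. This is only a sketch under assumptions: the helper names and the suffix list are hypothetical, and it relies on xr.open_rasterio being available, as it is in the xarray 0.12.3 reported below.

```python
from pathlib import Path

# Hypothetical list of suffixes treated as NetCDF; adjust to your data.
NETCDF_SUFFIXES = {".nc", ".nc4", ".cdf"}


def reader_name_for(path):
    """Return the name of the xarray function to use for `path`."""
    if Path(path).suffix.lower() in NETCDF_SUFFIXES:
        return "open_dataarray"
    return "open_rasterio"


def open_raster_or_netcdf(path):
    """Open `path` with the reader chosen by reader_name_for."""
    import xarray as xr  # imported lazily so the dispatch logic is testable alone

    return getattr(xr, reader_name_for(path))(path)
```

This sidesteps the problem rather than explaining it, and open_dataarray only fits NetCDF files that contain a single data variable.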

Output of xr.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.7 | packaged by conda-forge | (default, Jul 2 2019, 02:18:42)
[GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 4.14.127+
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: C.UTF-8
LANG: C.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.8.18
libnetcdf: 4.4.1.1

xarray: 0.12.3
pandas: 0.25.0
numpy: 1.17.0rc2
scipy: 1.3.0
netCDF4: 1.4.1
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.0.3.4
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.0.24
cfgrib: None
iris: None
bottleneck: None
dask: 2.1.0
distributed: None
matplotlib: 3.1.1
cartopy: None
seaborn: None
numbagg: None
setuptools: 41.0.1
pip: 19.2.1
conda: None
pytest: 5.0.1
IPython: 7.6.1
sphinx: None
