Decode_cf fails when scale_factor is a length-1 list #4631

Closed

Description

@rabernat

Some datasets I work with have scale_factor and add_offset encoded as length-1 lists. The following code worked as of Xarray 0.16.1:

import xarray as xr
ds = xr.DataArray([0, 1, 2], name='foo',
                  attrs={'scale_factor': [0.01],
                         'add_offset': [1.0]}).to_dataset()
xr.decode_cf(ds)

In 0.16.2 (just released) and on current master, it fails with this error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-2-a0b01d6a314b> in <module>
      2                   attrs={'scale_factor': [0.01],
      3                          'add_offset': [1.0]}).to_dataset()
----> 4 xr.decode_cf(ds)

~/Code/xarray/xarray/conventions.py in decode_cf(obj, concat_characters, mask_and_scale, decode_times, decode_coords, drop_variables, use_cftime, decode_timedelta)
    587         raise TypeError("can only decode Dataset or DataStore objects")
    588 
--> 589     vars, attrs, coord_names = decode_cf_variables(
    590         vars,
    591         attrs,

~/Code/xarray/xarray/conventions.py in decode_cf_variables(variables, attributes, concat_characters, mask_and_scale, decode_times, decode_coords, drop_variables, use_cftime, decode_timedelta)
    490             and stackable(v.dims[-1])
    491         )
--> 492         new_vars[k] = decode_cf_variable(
    493             k,
    494             v,

~/Code/xarray/xarray/conventions.py in decode_cf_variable(name, var, concat_characters, mask_and_scale, decode_times, decode_endianness, stack_char_dim, use_cftime, decode_timedelta)
    333             variables.CFScaleOffsetCoder(),
    334         ]:
--> 335             var = coder.decode(var, name=name)
    336 
    337     if decode_timedelta:

~/Code/xarray/xarray/coding/variables.py in decode(self, variable, name)
    271             dtype = _choose_float_dtype(data.dtype, "add_offset" in attrs)
    272             if np.ndim(scale_factor) > 0:
--> 273                 scale_factor = scale_factor.item()
    274             if np.ndim(add_offset) > 0:
    275                 add_offset = add_offset.item()

AttributeError: 'list' object has no attribute 'item'

I'm very confused, because this feels quite similar to #4471, which I thought was resolved by #4485.
However, the behavior is different with 'scale_factor': np.array([0.01]); that works fine, with no error.
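
For reference, the mismatch is easy to see outside xarray: np.ndim treats a plain list and a 1-element array the same, but only the array has an .item() method (a minimal illustration of the branch in CFScaleOffsetCoder shown in the traceback):

import numpy as np

np.ndim([0.01])            # 1, so the ndim > 0 branch is taken
np.ndim(np.array([0.01]))  # also 1

np.array([0.01]).item()    # 0.01, since ndarrays have .item()
[0.01].item()              # AttributeError: 'list' object has no attribute 'item'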

How might I end up with a dataset whose scale_factor is a Python list? It happens when I open a NetCDF file using the h5netcdf engine (documented by @gerritholl in #4471 (comment)) and then write it to Zarr: the NumPy array gets encoded as a list in the Zarr JSON metadata. 🙃
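
Roughly, the attribute's journey looks like this (a sketch, assuming Zarr serializes array-valued attributes to JSON via tolist()):

import json
import numpy as np

attr = np.array([0.01])  # h5netcdf hands back the scalar attribute as a 1-element array
stored = json.dumps({'scale_factor': attr.tolist()})  # Zarr attribute metadata is JSON
json.loads(stored)['scale_factor']  # [0.01], now a plain Python list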

This problem would go away if we could resolve the discrepancy in how the two engines treat scalar attributes.
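
In the meantime, coercing list-valued scale_factor / add_offset attributes to numpy arrays before decoding works around it (a quick sketch against the example above, not a general fix):

import numpy as np
import xarray as xr

ds = xr.DataArray([0, 1, 2], name='foo',
                  attrs={'scale_factor': [0.01],
                         'add_offset': [1.0]}).to_dataset()

# convert list attributes to arrays so .item() succeeds inside decode_cf
for var in ds.variables.values():
    for key in ('scale_factor', 'add_offset'):
        if isinstance(var.attrs.get(key), list):
            var.attrs[key] = np.asarray(var.attrs[key])

xr.decode_cf(ds)  # succeeds, matching the np.array([0.01]) case above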
