Description
Some datasets I work with have `scale_factor` and `add_offset` encoded as length-1 lists. The following code worked as of Xarray 0.16.1:
```python
import xarray as xr

ds = xr.DataArray([0, 1, 2], name='foo',
                  attrs={'scale_factor': [0.01],
                         'add_offset': [1.0]}).to_dataset()
xr.decode_cf(ds)
```
In 0.16.2 (just released) and current master, it fails with this error:

```
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-2-a0b01d6a314b> in <module>
      2                   attrs={'scale_factor': [0.01],
      3                          'add_offset': [1.0]}).to_dataset()
----> 4 xr.decode_cf(ds)

~/Code/xarray/xarray/conventions.py in decode_cf(obj, concat_characters, mask_and_scale, decode_times, decode_coords, drop_variables, use_cftime, decode_timedelta)
    587         raise TypeError("can only decode Dataset or DataStore objects")
    588
--> 589     vars, attrs, coord_names = decode_cf_variables(
    590         vars,
    591         attrs,

~/Code/xarray/xarray/conventions.py in decode_cf_variables(variables, attributes, concat_characters, mask_and_scale, decode_times, decode_coords, drop_variables, use_cftime, decode_timedelta)
    490             and stackable(v.dims[-1])
    491         )
--> 492         new_vars[k] = decode_cf_variable(
    493             k,
    494             v,

~/Code/xarray/xarray/conventions.py in decode_cf_variable(name, var, concat_characters, mask_and_scale, decode_times, decode_endianness, stack_char_dim, use_cftime, decode_timedelta)
    333             variables.CFScaleOffsetCoder(),
    334         ]:
--> 335             var = coder.decode(var, name=name)
    336
    337     if decode_timedelta:

~/Code/xarray/xarray/coding/variables.py in decode(self, variable, name)
    271             dtype = _choose_float_dtype(data.dtype, "add_offset" in attrs)
    272             if np.ndim(scale_factor) > 0:
--> 273                 scale_factor = scale_factor.item()
    274             if np.ndim(add_offset) > 0:
    275                 add_offset = add_offset.item()

AttributeError: 'list' object has no attribute 'item'
```
I'm very confused, because this feels quite similar to #4471, and I thought it was resolved by #4485.
However, the behavior is different with `'scale_factor': np.array([0.01])`; that works fine, with no error.
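For contrast, here is a minimal sketch of the variant that decodes without error (the same snippet as above, with the attributes as length-1 numpy arrays instead of lists):

```python
import numpy as np
import xarray as xr

# identical to the failing example, except the attrs are numpy arrays
ds = xr.DataArray([0, 1, 2], name='foo',
                  attrs={'scale_factor': np.array([0.01]),
                         'add_offset': np.array([1.0])}).to_dataset()
xr.decode_cf(ds)  # no error: .item() exists on ndarrays, unlike lists
```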
How might I end up with a dataset with `scale_factor` as a Python list? It happens when I open a netcdf file using the `h5netcdf` engine (documented by @gerritholl in #4471 (comment)) and then write it to zarr. The numpy array gets encoded as a list in the zarr JSON metadata. 🙃
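A rough sketch of that round trip (the file paths are placeholders, and this assumes a netcdf file whose `scale_factor`/`add_offset` attributes `h5netcdf` reads back as length-1 arrays, per #4471):

```python
import xarray as xr

# h5netcdf hands back scalar attributes as length-1 numpy arrays (see #4471)
ds = xr.open_dataset('input.nc', engine='h5netcdf', mask_and_scale=False)

# writing to zarr serializes those numpy arrays as plain lists in the JSON metadata
ds.to_zarr('output.zarr')

# on re-open the attribute is now a python list, and decoding fails as above
reopened = xr.open_zarr('output.zarr', mask_and_scale=False)
xr.decode_cf(reopened)  # AttributeError: 'list' object has no attribute 'item'
```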
This problem would go away if we could resolve the discrepancy between the `netcdf4` and `h5netcdf` engines' treatment of scalar attributes.
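In the meantime, a possible workaround, following from the numpy-array behavior above, is to coerce the offending attributes back to arrays before decoding (just a sketch; the attribute names are hardcoded for illustration):

```python
import numpy as np
import xarray as xr

# coerce list-valued CF attributes back to numpy arrays so .item() succeeds
for var in ds.variables.values():
    for key in ('scale_factor', 'add_offset'):
        if isinstance(var.attrs.get(key), list):
            var.attrs[key] = np.asarray(var.attrs[key])

xr.decode_cf(ds)  # decodes cleanly now
```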