Description
I really should not open this can of worms but per #4450 (comment):
I'm always confused about whether
ds.groupby(..., keep_attrs=True).mean()
ords.groupby(...).mean(keep_attrs=True)
is correct. (similarly for rolling, coarsen etc.)
Also as I try to fix the keep_attr
behavior in #4510 it would be good to know where they should go. So I tried to figure out how this is currently handled and found the following:
ds.xxx(keep_attrs=True).yyy()
- all fixed
ds.xxx().yyy(keep_attrs=True)
coarsen
(fixed in coarsen: better keep_attrs #5227)groupby
groupby_bin
resample
rolling
(adjusted in rolling keep_attrs & default True #4510)rolling_exp
(fixed in rolling_exp: keep_attrs and typing #4592)weighted
So the working consensus seems to be to to ds.xxx().yyy(keep_attrs=True)
- any comments on that?
(Edit: looking at this it is only half as bad, "only" coarsen
, rolling
(#4510), and rolling_exp
would need to be fixed.)
Detailed analysis
import xarray as xr
ds = xr.tutorial.open_dataset("air_temperature")
da = ds.air
coarsen
ds.coarsen(time=2, keep_attrs=True).mean() # keeps global attributes
ds.coarsen(time=2).mean(keep_attrs=True) # keeps DataArray attributes
ds.coarsen(time=2, keep_attrs=True).mean(keep_attrs=True) # keeps both
da.coarsen(time=2).mean(keep_attrs=True) # error
da.coarsen(time=2, keep_attrs=True).mean() # keeps DataArray attributes
groupby
ds.groupby("time.month").mean(keep_attrs=True) # keeps both
da.groupby("time.month").mean(keep_attrs=True) # keeps DataArray attributes
ds.groupby("time.month", keep_attrs=True).mean() # error
da.groupby("time.month", keep_attrs=True).mean() # error
groupby_bins
ds.groupby_bins(ds.lat, np.arange(0, 90, 10)).mean(keep_attrs=True) # keeps both
da.groupby_bins(ds.lat, np.arange(0, 90, 10)).mean(keep_attrs=True) # keeps DataArray attrs
ds.groupby_bins(ds.lat, np.arange(0, 90, 10), keep_attrs=True) # errors
da.groupby_bins(ds.lat, np.arange(0, 90, 10), keep_attrs=True) # errors
resample
ds.resample(time="A").mean(keep_attrs=True) # keeps both
da.resample(time="A").mean(keep_attrs=True) # keeps DataArray attributes
ds.resample(time="A", keep_attrs=False).mean() # ignored
da.resample(time="A", keep_attrs=False).mean() # ignored
rolling
ds.rolling(time=2).mean(keep_attrs=True) # keeps both
da.rolling(time=2).mean(keep_attrs=True) # keeps DataArray attributes
ds.rolling(time=2, keep_attrs=True).mean() # DeprecationWarning; keeps both
da.rolling(time=2, keep_attrs=True).mean() # DeprecationWarning; keeps DataArray attributes
see #4510
rolling_exp
ds.rolling_exp(time=5, keep_attrs=True).mean() # ignored
da.rolling_exp(time=5, keep_attrs=True).mean() # ignored
ds.rolling_exp(time=5).mean(keep_attrs=True) # keeps both
da.rolling_exp(time=5).mean(keep_attrs=True) # keeps DataArray attributes
weighted
ds.weighted(ds.lat).mean(keep_attrs=True) # keeps both
da.weighted(ds.lat).mean(keep_attrs=True) # keeps DataArray attrs
edit: moved rolling
after #4510, moved rolling_exp
after #4592 and coarsen
after #5227