Skip to content

where should keep_attrs be set in groupby, resample, weighted etc.? #4513

Closed
@mathause

Description

@mathause

I really should not open this can of worms but per #4450 (comment):

I'm always confused about whether ds.groupby(..., keep_attrs=True).mean() or ds.groupby(...).mean(keep_attrs=True) is correct. (similarly for rolling, coarsen etc.)

Also as I try to fix the keep_attr behavior in #4510 it would be good to know where they should go. So I tried to figure out how this is currently handled and found the following:

ds.xxx(keep_attrs=True).yyy()

  • all fixed

ds.xxx().yyy(keep_attrs=True)

So the working consensus seems to be to to ds.xxx().yyy(keep_attrs=True) - any comments on that?

(Edit: looking at this it is only half as bad, "only" coarsen, rolling (#4510), and rolling_exp would need to be fixed.)

Detailed analysis

import xarray as xr
ds = xr.tutorial.open_dataset("air_temperature")
da = ds.air

coarsen

ds.coarsen(time=2, keep_attrs=True).mean() # keeps global attributes
ds.coarsen(time=2).mean(keep_attrs=True) # keeps DataArray attributes
ds.coarsen(time=2, keep_attrs=True).mean(keep_attrs=True) # keeps both

da.coarsen(time=2).mean(keep_attrs=True) # error
da.coarsen(time=2, keep_attrs=True).mean() # keeps DataArray attributes

groupby

ds.groupby("time.month").mean(keep_attrs=True) # keeps both
da.groupby("time.month").mean(keep_attrs=True) # keeps DataArray attributes

ds.groupby("time.month", keep_attrs=True).mean() # error
da.groupby("time.month", keep_attrs=True).mean() # error

groupby_bins

ds.groupby_bins(ds.lat, np.arange(0, 90, 10)).mean(keep_attrs=True) # keeps both
da.groupby_bins(ds.lat, np.arange(0, 90, 10)).mean(keep_attrs=True) # keeps DataArray attrs

ds.groupby_bins(ds.lat, np.arange(0, 90, 10), keep_attrs=True) # errors
da.groupby_bins(ds.lat, np.arange(0, 90, 10), keep_attrs=True) # errors

resample

ds.resample(time="A").mean(keep_attrs=True) # keeps both
da.resample(time="A").mean(keep_attrs=True) # keeps DataArray attributes

ds.resample(time="A", keep_attrs=False).mean() # ignored
da.resample(time="A", keep_attrs=False).mean() # ignored

rolling

ds.rolling(time=2).mean(keep_attrs=True) # keeps both
da.rolling(time=2).mean(keep_attrs=True) # keeps DataArray attributes

ds.rolling(time=2, keep_attrs=True).mean() # DeprecationWarning; keeps both
da.rolling(time=2, keep_attrs=True).mean() # DeprecationWarning; keeps DataArray attributes

see #4510

rolling_exp

ds.rolling_exp(time=5, keep_attrs=True).mean() # ignored
da.rolling_exp(time=5, keep_attrs=True).mean() # ignored

ds.rolling_exp(time=5).mean(keep_attrs=True) # keeps both
da.rolling_exp(time=5).mean(keep_attrs=True) # keeps DataArray attributes

weighted

ds.weighted(ds.lat).mean(keep_attrs=True) # keeps both
da.weighted(ds.lat).mean(keep_attrs=True) # keeps DataArray attrs

edit: moved rolling after #4510, moved rolling_exp after #4592 and coarsen after #5227

Metadata

Metadata

Assignees

No one assigned

    Labels

    topic-metadataRelating to the handling of metadata (i.e. attrs and encoding)

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions