Dataset.resample fails with certain time offset strings provided to the loffset parameter #8399

Closed
@kafitzgerald

Description

What happened?

Following #7206, resample fails when the loffset argument is given an offset alias that does not map to an unambiguous timedelta value (for example 'MS'). We're running into this over on NCAR/geocat-comp.

I realize the loffset argument is slated to be deprecated anyway, but wanted to at least document this for others who might run into it. Especially since #7596 is still open and the time offset arithmetic gets a little tricky with cftime.

What did you expect to happen?

The operation to complete without error.

Minimal Complete Verifiable Example

import xarray as xr
import numpy as np

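# Monthly cftime index on a "noleap" calendar, 24 periods long.
dates = xr.cftime_range(start="0001", periods=24, freq="MS", calendar="noleap")
da = xr.DataArray(np.arange(24), coords=[dates], dims=["time"], name="foo")
# Raises ValueError: 'MS' cannot be converted to a fixed timedelta.
dar = da.resample({'time': 'QS-DEC'}, loffset='MS').mean()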

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.
  • Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

<stdin>:1: FutureWarning: Following pandas, the `loffset` parameter to resample will be deprecated in a future version of xarray.  Switch to using time offset arithmetic.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/username/miniconda3/envs/pandas/lib/python3.12/site-packages/xarray/core/dataarray.py", line 7087, in resample
    return self._resample(
           ^^^^^^^^^^^^^^^
  File "/Users/username/miniconda3/envs/pandas/lib/python3.12/site-packages/xarray/core/common.py", line 1055, in _resample
    return resample_cls(
           ^^^^^^^^^^^^^
  File "/Users//miniconda3/envs/pandas/lib/python3.12/site-packages/xarray/core/resample.py", line 49, in __init__
    super().__init__(*args, **kwargs)
  File "/Users/username/miniconda3/envs/pandas/lib/python3.12/site-packages/xarray/core/groupby.py", line 729, in __init__
    grouper_.factorize(squeeze)
  File "/Users/username/miniconda3/envs/pandas/lib/python3.12/site-packages/xarray/core/groupby.py", line 377, in factorize
    ) = self._factorize(squeeze)
        ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/username/miniconda3/envs/pandas/lib/python3.12/site-packages/xarray/core/groupby.py", line 546, in _factorize
    full_index, first_items, codes_ = self._get_index_and_items()
                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/username/miniconda3/envs/pandas/lib/python3.12/site-packages/xarray/core/groupby.py", line 519, in _get_index_and_items
    first_items, codes = self.first_items()
                         ^^^^^^^^^^^^^^^^^^
  File "/Users/username/miniconda3/envs/pandas/lib/python3.12/site-packages/xarray/core/groupby.py", line 531, in first_items
    return self.index_grouper.first_items(self.group_as_index)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/username/miniconda3/envs/pandas/lib/python3.12/site-packages/xarray/core/resample_cftime.py", line 155, in first_items
    labels = labels + pd.to_timedelta(self.loffset)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/username/miniconda3/envs/pandas/lib/python3.12/site-packages/pandas/core/tools/timedeltas.py", line 223, in to_timedelta
    return _coerce_scalar_to_timedelta_type(arg, unit=unit, errors=errors)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/username/miniconda3/envs/pandas/lib/python3.12/site-packages/pandas/core/tools/timedeltas.py", line 233, in _coerce_scalar_to_timedelta_type
    result = Timedelta(r, unit)
             ^^^^^^^^^^^^^^^^^^
  File "timedeltas.pyx", line 1820, in pandas._libs.tslibs.timedeltas.Timedelta.__new__
  File "timedeltas.pyx", line 653, in pandas._libs.tslibs.timedeltas.parse_timedelta_string
ValueError: unit abbreviation w/o a number
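
The last frames of the traceback suggest the failure comes down to pandas being asked to parse 'MS' as a timedelta. A minimal sketch of that step in isolation, using only pandas (the helper name as_fixed_timedelta is made up for this illustration):

```python
import pandas as pd


def as_fixed_timedelta(alias):
    """Return a Timedelta for ``alias``, or None if pandas cannot parse it."""
    try:
        return pd.to_timedelta(alias)
    except ValueError:
        # Aliases such as "MS" have no fixed duration, so parsing fails.
        return None


fixed = as_fixed_timedelta("12h")     # a fixed-length offset parses fine
ambiguous = as_fixed_timedelta("MS")  # month start has no fixed length
```

Here fixed is a 12-hour Timedelta, while ambiguous is None because pd.to_timedelta raises the same ValueError shown above.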

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS

commit: None
python: 3.12.0 | packaged by conda-forge | (main, Oct 3 2023, 08:36:57) [Clang 15.0.7 ]
python-bits: 64
OS: Darwin
OS-release: 21.6.0
machine: arm64
processor: arm
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: None
libnetcdf: None

xarray: 2023.10.1
pandas: 2.1.2
numpy: 1.26.0
scipy: None
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.6.3
nc_time_axis: None
PseudoNetCDF: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.8.0
cartopy: None
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 68.2.2
pip: 23.3.1
conda: None
pytest: None
mypy: None
IPython: None
sphinx: None
