Wrong behavior of DataArray.resample #2362

Closed
@fujiisoup

From #2356, I noticed that resample and groupby work well for Dataset but not for DataArray.

Code Sample, a copy-pastable example if possible

In [14]: import numpy as np
    ...: import xarray as xr
    ...: import pandas as pd
    ...: 
    ...: time = pd.date_range('2000-01-01', freq='6H', periods=365 * 4)
    ...: ds = xr.Dataset({'foo': (('time', 'x'), np.random.randn(365 * 4, 5)), 'time': time, 
    ...:                  'x': np.arange(5)})

In [15]: ds
Out[15]: 
<xarray.Dataset>
Dimensions:  (time: 1460, x: 5)
Coordinates:
  * time     (time) datetime64[ns] 2000-01-01 ... 2000-12-30T18:00:00
  * x        (x) int64 0 1 2 3 4
Data variables:
    foo      (time, x) float64 -0.6916 -1.247 0.5376 ... -0.2197 -0.8479 -0.6719

ds.resample(time='M').mean()['foo'] and ds['foo'].resample(time='M').mean() should be the same, but currently they are not:

In [16]: ds.resample(time='M').mean()['foo']
Out[16]: 
<xarray.DataArray 'foo' (time: 12, x: 5)>
array([[-0.005705,  0.018112,  0.22818 , -0.11093 , -0.031283],
       [-0.007595,  0.040065, -0.099885, -0.123539, -0.013808],
       [ 0.112108, -0.040783, -0.023187, -0.107504,  0.082927],
       [-0.007728,  0.031719,  0.155191, -0.030439,  0.095658],
       [ 0.140944, -0.050645,  0.116619, -0.044866, -0.242026],
       [ 0.029198, -0.002858,  0.13024 , -0.096648, -0.170336],
       [-0.062954,  0.116073,  0.111285, -0.009656, -0.164599],
       [ 0.030806,  0.051327, -0.031282,  0.129056, -0.085851],
       [ 0.099617, -0.021049,  0.137962, -0.04432 ,  0.050743],
       [ 0.117366,  0.24129 , -0.086894,  0.066012,  0.004789],
       [ 0.063861, -0.015472,  0.069508,  0.026725, -0.124712],
       [-0.058683,  0.154761,  0.028861, -0.139571, -0.037268]])
Coordinates:
  * time     (time) datetime64[ns] 2000-01-31 2000-02-29 ... 2000-12-31
  * x        (x) int64 0 1 2 3 4

In [17]: ds['foo'].resample(time='M').mean()  # dimension x is gone
Out[17]: 
<xarray.DataArray 'foo' (time: 12)>
array([ 0.019675, -0.040952,  0.004712,  0.04888 , -0.015995, -0.022081,
       -0.00197 ,  0.018811,  0.044591,  0.068512,  0.003982, -0.01038 ])
Coordinates:
  * time     (time) datetime64[ns] 2000-01-31 2000-02-29 ... 2000-12-31

Problem description

resample should work identically for DataArray and Dataset: in both cases only the resampled dimension (time) should be reduced, and x should be kept.

Expected Output

ds.resample(time='M').mean()['foo'] == ds['foo'].resample(time='M').mean()
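The expected equality can be turned into a small check. This is only a sketch: the helper name resample_paths_agree is hypothetical, the random data is seeded for repeatability, and daily bins ('D') stand in for the monthly bins of the report because the spelling of the month-end alias differs between pandas versions. It returns False rather than raising when the two paths disagree on dimensions, as in Out[17] above.

```python
import numpy as np
import pandas as pd
import xarray as xr


def resample_paths_agree(ds, name, freq):
    """Hypothetical helper: compare Dataset-then-select with select-then-resample."""
    via_dataset = ds.resample(time=freq).mean()[name]
    via_dataarray = ds[name].resample(time=freq).mean()
    # If a dimension was dropped (the behavior reported here), the paths disagree.
    if set(via_dataset.dims) != set(via_dataarray.dims):
        return False
    # Align dimension order before comparing values.
    aligned = via_dataarray.transpose(*via_dataset.dims)
    return bool(np.allclose(via_dataset.values, aligned.values))


# Same layout as the report: 6-hourly samples for 365 days, 5 points along x.
time = pd.date_range("2000-01-01", periods=365 * 4, freq=pd.Timedelta(hours=6))
rng = np.random.default_rng(0)
ds = xr.Dataset(
    {"foo": (("time", "x"), rng.standard_normal((365 * 4, 5)))},
    coords={"time": time, "x": np.arange(5)},
)

# Assumption: 'D' (daily) replaces the 'M' (monthly) frequency used in the report.
print(resample_paths_agree(ds, "foo", "D"))
```

On a version affected by this issue the helper should report a mismatch, since the DataArray result loses x; on a version where the two paths behave identically it reports agreement.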
