Closed
Description
What happened?
I create a DataArray full of complex numbers, and I compute the correlation of the DataArray with itself.
What did you expect to happen?
The absolute value of the correlation coefficient should be equal to 1, up to numerical precision. However, this is not the case: the returned correlation coefficient is around 0.26 and changes depending on the number of values in the array.
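To make the expectation concrete, here is a minimal NumPy sketch of the Pearson correlation for complex data. The helper `complex_corr` and its `conjugate` flag are hypothetical names used only for illustration; they are not part of xarray. The sketch assumes the usual convention in which the covariance conjugates its second argument, and it shows on toy data that dropping the conjugate gives a self-correlation whose magnitude is below 1. Whether this is what happens inside `xr.corr` is an assumption, not something confirmed here.

```python
import numpy as np


def complex_corr(x, y, conjugate=True):
    """Pearson correlation of two 1-D arrays (hypothetical helper, not xarray API)."""
    x = np.asarray(x, dtype=complex)
    y = np.asarray(y, dtype=complex)
    dx = x - x.mean()
    dy = y - y.mean()
    # With conjugate=True the cross term for x == y is sum(|dx|**2), the variance.
    # With conjugate=False it is sum(dx * dx), which is generally a different number.
    cross = np.sum(dx * (np.conj(dy) if conjugate else dy))
    norm = np.sqrt(np.sum(np.abs(dx) ** 2) * np.sum(np.abs(dy) ** 2))
    return cross / norm


# Toy data with zero mean, chosen so the arithmetic is easy to follow by hand.
z = np.array([1 + 0j, 0 + 1j, -1 - 1j])

r_conj = abs(complex_corr(z, z))                    # conjugated covariance
r_plain = abs(complex_corr(z, z, conjugate=False))  # no conjugation
print(r_conj, r_plain)
```

With the conjugate, the magnitude of the self-correlation is 1; without it, the toy data give a smaller value, which is the same kind of deviation as reported in this issue.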
Minimal Complete Verifiable Example
import numpy as np
import xarray as xr
array = xr.DataArray([
-4.21904583e-03-1.53714478e-03j, -4.24663044e-03-1.12832926e-03j,
-4.26968892e-03-4.87451439e-04j, -6.99917538e-03+3.07376860e-04j,
0.00000000e+00+0.00000000e+00j, -2.42585590e-02+1.42052459e-02j,
-5.53404148e-03+4.60188062e-03j, -4.68829482e-03+4.90179019e-03j,
-7.02331258e-03+8.75908673e-03j, -1.31233383e-01+1.86572484e-01j,
-4.05137401e-03+6.59972035e-03j, -4.20701822e-03+7.29813816e-03j,
-3.56487231e-03+6.51759430e-03j, -3.68077200e-03+7.04388575e-03j,
-8.16459981e-02+1.70084145e-01j, -5.11737898e-03+1.98164995e-02j,
6.72772914e-04-7.28110367e-05j, 2.13957504e-03-1.82525995e-03j,
1.60369835e-03-1.54029189e-03j, 8.77788719e-02-8.45568854e-02j,
1.04277417e-01-9.38854749e-02j, 7.58465696e-03-6.07906563e-03j,
8.00776452e-03-5.70470615e-03j, 8.36166252e-03-5.14978313e-03j,
0.00000000e+00+0.00000000e+00j, 0.00000000e+00+0.00000000e+00j,
0.00000000e+00+0.00000000e+00j, 7.26422461e-03+4.40382166e-04j,
4.01364547e-03+1.09269127e-03j, -1.99069471e-01-1.20355081e-01j,
1.56511579e-01+2.59839758e-01j, 9.14046953e-04+5.42262898e-03j,
-8.37800782e-04+5.67555708e-03j, -3.36561822e-03+7.50108018e-03j,
-4.22682090e-03+5.36279242e-03j, 5.95438564e-02-3.48209841e-02j,
-6.77184281e-03+2.10711488e-03j, -4.84293269e-03+3.78698499e-04j,
-5.13547723e-03-6.86765713e-04j, 4.48392070e-01+1.54568226e-01j,
-3.17412047e-01-2.35431216e-01j, -2.95731737e-03-3.39078899e-03j,
-1.95111443e-03-3.77545168e-03j, -2.82719903e-04-1.61393513e-03j,
7.20241467e-04-1.73515565e-03j, -1.96675563e-01-4.42259734e-02j,
0.00000000e+00+0.00000000e+00j, 4.84813452e-03+7.60742077e-03j,
6.31707602e-03+1.51808252e-02j, 2.99277774e-03+1.18667410e-02j,
5.64640060e-04+1.58372118e-02j, -1.74137347e-03+1.70383706e-02j,
-5.91398408e-03+2.30008930e-02j, -7.12027831e-03+1.87732435e-02j,
9.30919156e-02-1.65255887e-01j, -2.09716130e-01+2.30490479e-01j,
-1.80115101e-02+1.37248240e-02j, -1.85851718e-02+9.23420957e-03j,
-1.88459965e-02+5.12854226e-03j, 1.09175874e+00-9.17875627e-02j,
-1.63766142e-02-5.32431671e-03j, -1.24749963e-02-9.63714407e-03j,
-7.58657222e-03-1.27728267e-02j, -1.99052439e-03-1.35879033e-02j,
-5.70595470e-01+2.27742231e+00j, 1.24516564e-02-1.21867738e-02j,
1.82174257e-02-8.67884733e-03j, 2.27204879e-02-3.77097224e-03j,
2.66143091e-02+2.68683768e-03j, 1.06983372e+00+3.19301893e-01j,
-6.86033738e-01-4.72910865e-01j, 3.00291320e-02+3.10297521e-02j,
2.22880055e-02+3.45332319e-02j, 1.61724440e-02+4.04122368e-02j,
9.78881043e-03+4.96053678e-02j, -6.51085120e-03+5.27227722e-02j,
-1.76752380e-02+5.26095806e-02j, -3.81856382e-02+6.41735764e-02j,
0.00000000e+00+0.00000000e+00j, -4.32481463e-02+3.88706950e-02j
])
r = np.abs(xr.corr(array, array).item())
assert np.isclose(r, 1.0), r
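As a point of comparison, the same kind of check can be run against `np.corrcoef`, which accepts complex input. The sketch below uses randomly generated complex data (an assumption made for brevity, not the array from this report) and only demonstrates the behaviour I would expect `xr.corr` to match.

```python
import numpy as np

# Synthetic complex samples; the seed and size are arbitrary illustration choices.
rng = np.random.default_rng(0)
z = rng.normal(size=50) + 1j * rng.normal(size=50)

# np.corrcoef returns a 2x2 matrix; the off-diagonal entry is corr(z, z).
r_ref = np.corrcoef(z, z)[0, 1]
print(abs(r_ref))
```

Here the magnitude of the self-correlation comes out as 1 up to numerical precision.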
MVCE confirmation
- Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- Complete example — the example is self-contained, including all data and the text of any traceback.
- Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
- New issue — a search of GitHub Issues suggests this is not a duplicate.
Relevant log output
The exact output I get for the self-contained example above is:
AssertionError Traceback (most recent call last)
Cell In [44], line 46
3 array = xr.DataArray([
4 -4.21904583e-03-1.53714478e-03j, -4.24663044e-03-1.12832926e-03j,
5 -4.26968892e-03-4.87451439e-04j, -6.99917538e-03+3.07376860e-04j,
(...)
43 0.00000000e+00+0.00000000e+00j, -4.32481463e-02+3.88706950e-02j
44 ])
45 r = np.abs(xr.corr(array, array).item())
---> 46 assert np.isclose(r, 1.0), r
AssertionError: 0.2664911388214005
Anything else we need to know?
Python 3.8.5 (default, Sep 4 2020, 07:30:14)
[GCC 7.3.0]
Xarray version is '2022.9.0'
Environment
<details>
INSTALLED VERSIONS
------------------
commit: None
python: 3.10.6 | packaged by conda-forge | (main, Aug 22 2022, 20:36:39) [GCC 10.4.0]
python-bits: 64
OS: Linux
OS-release: 4.18.0-193.28.1.el8_2.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: None
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.1
libnetcdf: 4.8.1
xarray: 2022.9.0
pandas: 1.5.0
numpy: 1.23.3
scipy: 1.9.1
netCDF4: 1.6.0
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2022.11.0
distributed: None
matplotlib: 3.6.2
cartopy: None
seaborn: 0.12.1
numbagg: None
fsspec: 2022.11.0
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 65.4.1
pip: 22.2.2
conda: None
pytest: None
IPython: 8.5.0
sphinx: None
</details>