Description
Code Sample
>>> import numpy as np
>>> import xarray as xr
>>> da = xr.DataArray(np.arange(3))
>>> da
<xarray.DataArray (dim_0: 3)>
array([0, 1, 2])
Dimensions without coordinates: dim_0
>>> da[0].values.fill(99)
>>> da
<xarray.DataArray (dim_0: 3)>
array([0, 1, 2])
Dimensions without coordinates: dim_0
Problem description
Indexing into xarray objects creates a view of the underlying data if possible. A surprising exception is when all dimensions are indexed out and the resulting object is 0d. Xarray insists on returning a 0d array rather than a scalar, which suggests (at least to me) that this is also a view whenever possible; however, it is always a copy, and modifying it will never affect the original array.
(The example above is a little contrived, since one could always call da[0] = 99. In my actual use case I am indexing into a Dataset in a way that creates views for all variables except the one that happens to collapse to 0d, and thus I'm unable to use the indexed Dataset to modify that variable in the original Dataset.)
The copy happens because, internally, the 0d array is created by retrieving a scalar from the underlying numpy array and then wrapping a new array around it. However, in numpy a 0d view can be created directly by indexing with Ellipsis/..., as follows:
>>> import numpy as np
>>> arr = np.arange(3)
>>> arr[0, ...]
array(0)
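To make the difference concrete, here is a minimal sketch in plain NumPy (no xarray involved; the variable names are only illustrative). It contrasts wrapping a retrieved scalar in a new array, which is roughly what the description above says happens internally, with indexing through a trailing Ellipsis.

```python
import numpy as np

arr = np.arange(3)

# Plain integer indexing returns a NumPy scalar that is detached from arr.
scalar = arr[0]
wrapped = np.array(scalar)      # new 0d array built around the scalar (a copy)
wrapped.fill(99)                # only changes the copy
after_copy_write = arr.copy()   # snapshot: arr is still [0, 1, 2] here

# A trailing Ellipsis returns a 0d array that is a view into arr.
view = arr[0, ...]
is_view = np.shares_memory(view, arr)
view.fill(99)                   # writes through to arr[0]

print(after_copy_write)  # contents of arr after writing to the copy
print(is_view)           # whether the 0d result shares memory with arr
print(arr)               # contents of arr after writing to the view
```

The `np.shares_memory` check is a convenient way to tell a view from a copy without modifying anything.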
Thus, a fix that solves my immediate issues and passes all current tests is to modify the following method (xarray/xarray/core/indexing.py, lines 1154 to 1163 in 778ffc4) to always append an ellipsis for basic and outer indexing:
  def _indexing_array_and_key(self, key):
      if isinstance(key, OuterIndexer):
          array = self.array
>         key = _outer_to_numpy_indexer(key, self.array.shape) + (Ellipsis,)
      elif isinstance(key, VectorizedIndexer):
          array = nputils.NumpyVIndexAdapter(self.array)
          key = key.tuple
      elif isinstance(key, BasicIndexer):
          array = self.array
>         key = key.tuple + (Ellipsis,)
I'm not familiar enough with all the indexing variants in xarray to know if this covers all cases of 0d arrays that are currently copies but could be views. If someone wants to share some insight (e.g., some more advanced test cases), I could try and put together a pull request.
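As a starting point for such test cases, the sketch below checks, at the NumPy level only, which kinds of keys produce a view once a trailing Ellipsis is appended. The helper `index_with_trailing_ellipsis` is hypothetical and is not part of xarray; it only mimics the `+ (Ellipsis,)` change on a plain array, so it says nothing about xarray's other array backends.

```python
import numpy as np


def index_with_trailing_ellipsis(array, key):
    """Hypothetical helper: index `array` with `key` plus a trailing Ellipsis."""
    return array[tuple(key) + (Ellipsis,)]


base = np.arange(12).reshape(3, 4)

# Keys resembling what basic and outer indexing might pass down to NumPy.
cases = {
    "all ints (0d result)": (1, 2),
    "int and slice": (1, slice(None)),
    "slices only": (slice(0, 2), slice(1, 3)),
    "integer array (advanced)": (np.array([0, 2]), slice(None)),
}

# True means the indexed result is a view that shares memory with `base`.
results = {
    name: np.shares_memory(index_with_trailing_ellipsis(base, key), base)
    for name, key in cases.items()
}

for name, shares in results.items():
    print(name, "-> view" if shares else "-> copy")
```

Keys made only of integers and slices give views, including the 0d case, while a key containing an integer array triggers NumPy's advanced indexing and yields a copy whether or not the Ellipsis is present.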
Expected Output
>>> da[0].values.fill(99)
>>> da
<xarray.DataArray (dim_0: 3)>
array([99, 1, 2])
Dimensions without coordinates: dim_0
Output of xr.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Linux
OS-release: 4.15.0-42-lowlatency
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
xarray: 0.11.0
pandas: 0.23.0
numpy: 1.14.3
scipy: 1.1.0
netCDF4: 1.4.0
h5netcdf: 0.6.2
h5py: 2.7.1
Nio: None
zarr: None
cftime: 1.0.0b1
PseudonetCDF: None
rasterio: None
iris: None
bottleneck: 1.2.1
cyordereddict: None
dask: 0.17.5
distributed: 1.21.8
matplotlib: 2.2.2
cartopy: None
seaborn: 0.8.1
setuptools: 39.1.0
pip: 10.0.1
conda: 4.5.12
pytest: 3.5.1
IPython: 6.4.0
sphinx: 1.7.4