Skip to content

expand_dims() modifies numpy.ndarray.flags to write only, upon manually reverting this flag back, attempting to set a single inner value using .loc will instead set all of the inner array values #2891

Closed
@davehemming

Description

@davehemming

I am using the newly updated expand_dims API that was recently updated with this PR #2757. However the flag setting behaviour can also be observed using the old API syntax.

>>> expanded_da = xr.DataArray(np.random.rand(3,3), coords={'x': np.arange(3), 'y': np.arange(3)}, dims=('x', 'y')) # Create a 2D DataArray
>>> expanded_da
<xarray.DataArray (x: 3, y: 3)>
array([[0.148579, 0.463005, 0.224993],
       [0.633511, 0.056746, 0.28119 ],
       [0.390596, 0.298519, 0.286853]])
Coordinates:
  * x        (x) int64 0 1 2
  * y        (y) int64 0 1 2

>>> expanded_da.data.flags # Check current state of numpy flags
  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False
  UPDATEIFCOPY : False

>>> expanded_da.loc[0, 0] = 2.22 # Set a single value before expanding
>>> expanded_da # It works, the single value is set
<xarray.DataArray (x: 3, y: 3)>
array([[2.22    , 0.463005, 0.224993],
       [0.633511, 0.056746, 0.28119 ],
       [0.390596, 0.298519, 0.286853]])
Coordinates:
  * x        (x) int64 0 1 2
  * y        (y) int64 0 1 2

>>> expanded_da = expanded_da.expand_dims({'z': 3}, -1) # Add a new dimension 'z'
>>> expanded_da
<xarray.DataArray (x: 3, y: 3, z: 3)>
array([[[2.22    , 2.22    , 2.22    ],
        [0.463005, 0.463005, 0.463005],
        [0.224993, 0.224993, 0.224993]],

       [[0.633511, 0.633511, 0.633511],
        [0.056746, 0.056746, 0.056746],
        [0.28119 , 0.28119 , 0.28119 ]],

       [[0.390596, 0.390596, 0.390596],
        [0.298519, 0.298519, 0.298519],
        [0.286853, 0.286853, 0.286853]]])
Coordinates:
  * x        (x) int64 0 1 2
  * y        (y) int64 0 1 2
Dimensions without coordinates: z

>>> expanded_da['z'] = np.arange(3) # Add new coordinates to the new dimension 'z'
>>> expanded_da
<xarray.DataArray (x: 3, y: 3, z: 3)>
array([[[2.22    , 2.22    , 2.22    ],
        [0.463005, 0.463005, 0.463005],
        [0.224993, 0.224993, 0.224993]],

       [[0.633511, 0.633511, 0.633511],
        [0.056746, 0.056746, 0.056746],
        [0.28119 , 0.28119 , 0.28119 ]],

       [[0.390596, 0.390596, 0.390596],
        [0.298519, 0.298519, 0.298519],
        [0.286853, 0.286853, 0.286853]]])
Coordinates:
  * x        (x) int64 0 1 2
  * y        (y) int64 0 1 2
  * z        (z) int64 0 1 2

>>> expanded_da.loc[0, 0, 0] = 9.99 # Attempt to set a single value, get 'read-only' error
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/dhemming/.ve/unidata_notebooks/lib/python3.6/site-packages/xarray/core/dataarray.py", line 113, in __setitem__
    self.data_array[pos_indexers] = value
  File "/Users/dhemming/.ve/unidata_notebooks/lib/python3.6/site-packages/xarray/core/dataarray.py", line 494, in __setitem__
    self.variable[key] = value
  File "/Users/dhemming/.ve/unidata_notebooks/lib/python3.6/site-packages/xarray/core/variable.py", line 714, in __setitem__
    indexable[index_tuple] = value
  File "/Users/dhemming/.ve/unidata_notebooks/lib/python3.6/site-packages/xarray/core/indexing.py", line 1174, in __setitem__
    array[key] = value
ValueError: assignment destination is read-only

>>> expanded_da.data.flags # Check flags on the DataArray, notice they have changed
  C_CONTIGUOUS : False
  F_CONTIGUOUS : False
  OWNDATA : False
  WRITEABLE : False
  ALIGNED : True
  WRITEBACKIFCOPY : False
  UPDATEIFCOPY : False

>>> expanded_da.data.setflags(write = 1) # Make array writeable again
>>> expanded_da.data.flags
  C_CONTIGUOUS : False
  F_CONTIGUOUS : False
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False
  UPDATEIFCOPY : False

>>> expanded_da.loc[0, 0, 0] # Check the value I want to overwrite
<xarray.DataArray ()>
array(2.22)
Coordinates:
    x        int64 0
    y        int64 0
    z        int64 0

>>> expanded_da.loc[0, 0, 0] = 9.99 # Attempt to overwrite single value, instead it overwrites all values in the array located at [0, 0]
>>> expanded_da
<xarray.DataArray (x: 3, y: 3, z: 3)>
array([[[9.99    , 9.99    , 9.99    ],
        [0.463005, 0.463005, 0.463005],
        [0.224993, 0.224993, 0.224993]],

       [[0.633511, 0.633511, 0.633511],
        [0.056746, 0.056746, 0.056746],
        [0.28119 , 0.28119 , 0.28119 ]],

       [[0.390596, 0.390596, 0.390596],
        [0.298519, 0.298519, 0.298519],
        [0.286853, 0.286853, 0.286853]]])
Coordinates:
  * x        (x) int64 0 1 2
  * y        (y) int64 0 1 2
  * z        (z) int64 0 1 2

Problem description

When applying the operation 'expand_dims({'z': 3}, -1)' on a DataArray the underlying Numpy array flags are changed. 'C_CONTIGUOUS' is set to False, and 'WRITEABLE' is set to False, and 'OWNDATA' is set to False. Upon changing 'WRITEABLE' back to True, when I try to set a single value in the DataArray using the '.loc' operator it will instead set all the values in that selected inner array.

I am new to Xarray so I can't be entirely sure if this expected behaviour. Regardless I would expect that adding a new dimension to the array would not make that array 'read-only'. I would also not expect the '.loc' method to work differently to how it would otherwise.

It's also not congruent with the Numpy 'expand_dims' operation. Because when I call the operation 'np.expand_dims(np_arr, axis=-1)' the 'C_CONTIGUOUS ' and 'WRITEABLE ' flags will not be modified.

Expected Output

Here is a similar flow of operations that demonstrates the behaviour I would expect from the DataArray after applying 'expand_dims':

>>> non_expanded_da = xr.DataArray(np.random.rand(3,3,3), coords={'x': np.arange(3), 'y': np.arange(3)}, dims=('x', 'y', 'z')) # Create the new DataArray to be in the same state as I would expect it to be in after applying the operation 'expand_dims({'z': 3}, -1)'
>>> non_expanded_da
<xarray.DataArray (x: 3, y: 3, z: 3)>
array([[[0.017221, 0.374267, 0.231979],
        [0.678884, 0.512903, 0.737573],
        [0.985872, 0.1373  , 0.4603  ]],

       [[0.764227, 0.825059, 0.847694],
        [0.482841, 0.708206, 0.486576],
        [0.726265, 0.860627, 0.435101]],

       [[0.117904, 0.40569 , 0.274288],
        [0.079321, 0.647562, 0.847459],
        [0.57494 , 0.578745, 0.125309]]])
Coordinates:
  * x        (x) int64 0 1 2
  * y        (y) int64 0 1 2
Dimensions without coordinates: z

>>> non_expanded_da.data.flags # Check flags
  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False
  UPDATEIFCOPY : False

>>> non_expanded_da['z'] = np.arange(3) # Set coordinate for dimension 'z'
>>> non_expanded_da
<xarray.DataArray (x: 3, y: 3, z: 3)>
array([[[0.017221, 0.374267, 0.231979],
        [0.678884, 0.512903, 0.737573],
        [0.985872, 0.1373  , 0.4603  ]],

       [[0.764227, 0.825059, 0.847694],
        [0.482841, 0.708206, 0.486576],
        [0.726265, 0.860627, 0.435101]],

       [[0.117904, 0.40569 , 0.274288],
        [0.079321, 0.647562, 0.847459],
        [0.57494 , 0.578745, 0.125309]]])
Coordinates:
  * x        (x) int64 0 1 2
  * y        (y) int64 0 1 2
  * z        (z) int64 0 1 2

>>> non_expanded_da.loc[0, 0, 0] = 2.22 # Set value using .loc method
>>> non_expanded_da # The single value referenced is set which is what I expect to happen
<xarray.DataArray (x: 3, y: 3, z: 3)>
array([[[2.22    , 0.374267, 0.231979],
        [0.678884, 0.512903, 0.737573],
        [0.985872, 0.1373  , 0.4603  ]],

       [[0.764227, 0.825059, 0.847694],
        [0.482841, 0.708206, 0.486576],
        [0.726265, 0.860627, 0.435101]],

       [[0.117904, 0.40569 , 0.274288],
        [0.079321, 0.647562, 0.847459],
        [0.57494 , 0.578745, 0.125309]]])
Coordinates:
  * x        (x) int64 0 1 2
  * y        (y) int64 0 1 2
  * z        (z) int64 0 1 2

Output of xr.show_versions()

INSTALLED VERSIONS ------------------ commit: None python: 3.6.7 (default, Dec 29 2018, 12:05:36) [GCC 4.2.1 Compatible Apple LLVM 10.0.0 (clang-1000.11.45.5)] python-bits: 64 OS: Darwin OS-release: 18.2.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_AU.UTF-8 LOCALE: en_AU.UTF-8 libhdf5: 1.10.2 libnetcdf: 4.4.1.1

xarray: 0.12.1
pandas: 0.24.2
numpy: 1.16.2
scipy: 1.2.1
netCDF4: 1.5.0
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.0.3.4
nc_time_axis: None
PseudonetCDF: None
rasterio: None
cfgrib: 0.9.6.1.post1
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.0.3
cartopy: 0.17.0
seaborn: None
setuptools: 39.0.1
pip: 10.0.1
conda: None
pytest: None
IPython: None
sphinx: None

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions