Skip to content

Writing GDAL ZARR _CRS attribute not possible #6448

Closed
@wankoelias

Description

@wankoelias

What is your issue?

Related to #6374

Writing a ZARR which is compatible with GDAL conventions using xarray.Dataset.to_zarr requires all the data variables to have a _CRS attribute which contains the Spatial Reference System encoding (SRS).
This _CRS attribute itself is a dict in which the SRS is encoded in at least one of these keys: wkt, url, projjson

Because attribute values can't be dictionaries during serialization, it does not seem possible to write GDAL compatible zarrs using xarray.

Example:

lets assume we have a Dataset ds like this:

<xarray.Dataset>
Dimensions:  (Y: 180, X: 360)
Coordinates:
  * X        (X) float64 -179.5 -178.5 -177.5 -176.5 ... 176.5 177.5 178.5 179.5
  * Y        (Y) float64 89.5 88.5 87.5 86.5 85.5 ... -86.5 -87.5 -88.5 -89.5
Data variables:
    Band1    (Y, X) uint16 0 0 0 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0 0 0
    Band2    (Y, X) uint16 0 0 0 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0 0 0
    Band3    (Y, X) uint16 0 0 0 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0 0 0

lets also assume we want to encode the _CRS as wkt like so:

wkt = 'GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AXIS["Latitude",NORTH],AXIS["Longitude",EAST],AUTHORITY["EPSG","4326"]]'

(encoding the _CRS in any of the other 2 formats results in the same problem at the end)

Setting the attributes of each data variable:

attributes = {
        "_ARRAY_DIMENSIONS": ['Y', 'X'],
         "_CRS": {"wkt": wkt},
        "AREA_OR_POINT": 'Area',
    }

for data_var in ds.data_vars:
    ds[data_var].attrs = attributes

no problem so far, ds.Band1.attrs results in:

{
    "_ARRAY_DIMENSIONS": ["Y", "X"],
    "_CRS": {
        "wkt": 'GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AXIS["Latitude",NORTH],AXIS["Longitude",EAST],AUTHORITY["EPSG","4326"]]'
    },
    "AREA_OR_POINT": "Area",
}

the problem now occurs with writing the dataset using:

ds.to_zarr("test.zarr", consolidated=True)
TypeError: Invalid value for attr '_CRS': {'wkt': 'GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AXIS["Latitude",NORTH],AXIS["Longitude",EAST],AUTHORITY["EPSG","4326"]]'}. 

For serialization to netCDF files, its value must be of one of the following types: str, Number, ndarray, number, list, tuple

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions