Description
What is your issue?
Related to #6374
Writing a ZARR which is compatible with GDAL conventions using xarray.Dataset.to_zarr
requires all the data variables to have a _CRS
attribute which contains the Spatial Reference System encoding (SRS).
This _CRS
attribute itself is a dict
in which the SRS is encoded in at least one of these keys: wkt
, url
, projjson
Because attribute values can't be dictionaries during serialization, it does not seem possible to write GDAL compatible zarrs using xarray.
Example:
lets assume we have a Dataset ds
like this:
<xarray.Dataset>
Dimensions: (Y: 180, X: 360)
Coordinates:
* X (X) float64 -179.5 -178.5 -177.5 -176.5 ... 176.5 177.5 178.5 179.5
* Y (Y) float64 89.5 88.5 87.5 86.5 85.5 ... -86.5 -87.5 -88.5 -89.5
Data variables:
Band1 (Y, X) uint16 0 0 0 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0 0 0
Band2 (Y, X) uint16 0 0 0 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0 0 0
Band3 (Y, X) uint16 0 0 0 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0 0 0
lets also assume we want to encode the _CRS
as wkt
like so:
wkt = 'GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AXIS["Latitude",NORTH],AXIS["Longitude",EAST],AUTHORITY["EPSG","4326"]]'
(encoding the _CRS in any of the other 2 formats results in the same problem at the end)
Setting the attributes of each data variable:
attributes = {
"_ARRAY_DIMENSIONS": ['Y', 'X'],
"_CRS": {"wkt": wkt},
"AREA_OR_POINT": 'Area',
}
for data_var in ds.data_vars:
ds[data_var].attrs = attributes
no problem so far, ds.Band1.attrs
results in:
{
"_ARRAY_DIMENSIONS": ["Y", "X"],
"_CRS": {
"wkt": 'GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AXIS["Latitude",NORTH],AXIS["Longitude",EAST],AUTHORITY["EPSG","4326"]]'
},
"AREA_OR_POINT": "Area",
}
the problem now occurs with writing the dataset using:
ds.to_zarr("test.zarr", consolidated=True)
TypeError: Invalid value for attr '_CRS': {'wkt': 'GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AXIS["Latitude",NORTH],AXIS["Longitude",EAST],AUTHORITY["EPSG","4326"]]'}.
For serialization to netCDF files, its value must be of one of the following types: str, Number, ndarray, number, list, tuple