Support non-string dimension/variable names #2292

Closed
@joshburkart

Problem description

Currently, it appears that dimension and coordinate names must be strings. However, in more rigorous software engineering applications it is often desirable to use something more structured for names, e.g. enums. I think it would be great if xarray supported this.

Storing to a format such as NetCDF requires string-valued field names, so calling str on each name during serialization seems appropriate. This is what pandas seems to do (see the code sample below). I imagine there might be other issues that would need to be resolved to do what I'm suggesting, though.
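To make the str-on-serialization idea concrete, here is a minimal sketch in plain Python. The helper name stringify_names is hypothetical and is not part of xarray or pandas. It only illustrates one possible conversion step, including a check for names that become identical after str is applied, which is one of the issues such a feature might need to handle.

```python
import enum
from typing import Any, Dict, Hashable, Mapping


class CoordId(enum.Enum):
    LAT = 'lat'
    LON = 'lon'


def stringify_names(fields: Mapping[Hashable, Any]) -> Dict[str, Any]:
    """Hypothetical helper (not an xarray API): convert names to str.

    Raises ValueError if two distinct names map to the same string,
    since the serialized file could not tell them apart.
    """
    out: Dict[str, Any] = {}
    for name, value in fields.items():
        key = str(name)
        if key in out:
            raise ValueError(f"names collide after str(): {key!r}")
        out[key] = value
    return out


serializable = stringify_names({CoordId.LAT: [1, 2, 3], CoordId.LON: [7, 8]})
print(serializable)
# {'CoordId.LAT': [1, 2, 3], 'CoordId.LON': [7, 8]}
```

Note that this conversion is one-way: reading the file back would yield the strings 'CoordId.LAT' and 'CoordId.LON', not the enum members, unless the caller maps them back.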

Code sample

import enum

import numpy as np
import pandas as pd
import xarray as xr

class CoordId(enum.Enum):
    LAT = 'lat'
    LON = 'lon'

pd.DataFrame({CoordId.LAT: [1,2,3]}).to_csv()
# Returns: ',CoordId.LAT\n0,1\n1,2\n2,3\n'

xr.DataArray(
    data=np.arange(3 * 2).reshape(3, 2),
    coords={CoordId.LAT: [1, 2, 3], CoordId.LON: [7, 8]},
    dims=[CoordId.LAT, CoordId.LON],
)
# Fails: TypeError: dimension CoordId.LAT is not a string

Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Linux
OS-release: 4.15.0-1010-gcp
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: C.UTF-8
LOCALE: en_US.UTF-8

xarray: 0.10.7
pandas: 0.23.1
numpy: 1.14.5
scipy: 1.1.0
netCDF4: 1.3.1
h5netcdf: 0.5.0
h5py: 2.7.1
Nio: None
zarr: None
bottleneck: None
cyordereddict: None
dask: None
distributed: None
matplotlib: 2.1.1
cartopy: 0.16.0
seaborn: None
setuptools: 39.2.0
pip: 9.0.1
conda: None
pytest: 3.6.1
IPython: 6.4.0
sphinx: None
