Clarify that chunks={} in .open_dataset reproduces the default behavior of deprecated .open_zarr #7293

Closed
@jbusecke

What is your issue?

I was wondering if we could add some language to the docstring of xr.open_dataset regarding the chunks kwarg, to make the transition easier for folks who have used xr.open_zarr a lot in the past.

The current text is:

chunks (int, dict, 'auto' or None, optional) – If chunks is provided, it is used to load the new dataset into dask arrays. chunks=-1 loads the dataset with dask using a single chunk for all arrays. chunks={} loads the dataset with dask using engine preferred chunks if exposed by the backend, otherwise with a single chunk for all arrays. chunks='auto' will use dask auto chunking taking into account the engine preferred chunks. See dask chunking for more details.

I found that when opening large zarr stores, setting chunks={} seems to reproduce the behavior of xr.open_zarr(). If this is true, I think it would be great to include something like

chunks={} loads the dataset with dask using engine preferred chunks if exposed by the backend, otherwise with a single chunk for all arrays. To reproduce the default behavior of xr.open_zarr(...), use `xr.open_dataset(..., engine='zarr', chunks={})`.

to make this clear for users who have been using xr.open_zarr in the past.
