From 5b52e61f3e61f572ddeaf40e2223371e127e8420 Mon Sep 17 00:00:00 2001 From: TomNicholas Date: Wed, 9 Oct 2024 11:21:13 -0400 Subject: [PATCH 01/15] sketch of migration guide --- DATATREE_MIGRATION_GUIDE.md | 65 +++++++++++++++++++++++++++++++++++++ 1 file changed, 65 insertions(+) create mode 100644 DATATREE_MIGRATION_GUIDE.md diff --git a/DATATREE_MIGRATION_GUIDE.md b/DATATREE_MIGRATION_GUIDE.md new file mode 100644 index 00000000000..87f963e5cc4 --- /dev/null +++ b/DATATREE_MIGRATION_GUIDE.md @@ -0,0 +1,65 @@ +# Migration guide for users of `xarray-contrib/datatree` + +This guide is for previous users of the prototype `datatree.DataTree` in the `xarray-contrib/datatree repository`. That repository has now been archived, and will not be maintained. This guide is intended to help smooth your transition to using the new, updated `xarray.DataTree`. + +.. important + + There are breaking changes! You should not expect that code written with `xarray-contrib/datatree` will work without modifications. + At the absolute minimum you will need to change the top-level import statement, but there are other changes too. + +We have made various changes compared to the prototype version. These can be split into two main types: minor API changes, which mostly consist of renaming methods to be more self-consistent, and some deeper data model changes, which affect the hierarchal structure itself. + +### Data model changes + +Internal alignment + +Coordinate inheritance + +Reflected in repr + +Can no longer represent totally arbitrary datasets in each node - some on-disk structures that `xr.open_datatree` will now refuse to load. +For these cases we made `open_groups`. + +Generally if you don't like this you can get more similar behaviour to the old package by removing indexes from coordinates. + +### Integrated backends + +`open_datatree(group=...)`? + +Performance improvements + +Can now extend other xarray backends to support `open_datatree`! 
+ +### Other API changes + +`from datatree import DataTree, open_datatree` -> `from xarray import DataTree, open_datatree` + +`.ds` -> `.dataset` + +`DataTree(ds=...)` to `DataTree(dataset=)` + +`.to_dataset()` still exists but now has options (`inherited=...`) + +`parent` kwarg removed from `DataTree.__init__` + +`.parent` property is now read-only + +`children` in `DataTree.__init__` are now shallow-copied + +`map_over_subtree` -> ? + +Arithmetic between `DataTree` and `Dataset`/scalars now raises + +`.as_array` -> `.to_dataarray` + +Disabled some methods which were not well tested. In general we have tried to only keep things that are known to work, with the plan to increase API surface incrementally after release. + +## Thank you! + +Thank you for trying out `xarray-contrib/datatree`! + +We welcome contributions of any kind, including things that never quite made it into the original datatree repository. Please also let us know if we have forgotten to mention a change that should have been listed in this guide. + +Sincerely, the datatree team + +(Tom Nicholas, Owen Littlejohns, Matt Savoie, Eni Awowale, Alfonso Ladino, Justus Magin, Stephan Hoyer) \ No newline at end of file From c39b04ccbffd37d29220b4d4d7b156e0f10da1af Mon Sep 17 00:00:00 2001 From: "pre-commit-ci[bot]" <66853113+pre-commit-ci[bot]@users.noreply.github.com> Date: Wed, 9 Oct 2024 15:29:33 +0000 Subject: [PATCH 02/15] [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --- DATATREE_MIGRATION_GUIDE.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/DATATREE_MIGRATION_GUIDE.md b/DATATREE_MIGRATION_GUIDE.md index 87f963e5cc4..83873082893 100644 --- a/DATATREE_MIGRATION_GUIDE.md +++ b/DATATREE_MIGRATION_GUIDE.md @@ -3,7 +3,7 @@ This guide is for previous users of the prototype `datatree.DataTree` in the `xarray-contrib/datatree repository`. That repository has now been archived, and will not be maintained. 
This guide is intended to help smooth your transition to using the new, updated `xarray.DataTree`. .. important - + There are breaking changes! You should not expect that code written with `xarray-contrib/datatree` will work without modifications. At the absolute minimum you will need to change the top-level import statement, but there are other changes too. @@ -62,4 +62,4 @@ We welcome contributions of any kind, including things that never quite made it Sincerely, the datatree team -(Tom Nicholas, Owen Littlejohns, Matt Savoie, Eni Awowale, Alfonso Ladino, Justus Magin, Stephan Hoyer) \ No newline at end of file +(Tom Nicholas, Owen Littlejohns, Matt Savoie, Eni Awowale, Alfonso Ladino, Justus Magin, Stephan Hoyer) From 1eff8fe2de0df7f60a7e97afdab006aba861d5cc Mon Sep 17 00:00:00 2001 From: TomNicholas Date: Wed, 9 Oct 2024 12:17:34 -0400 Subject: [PATCH 03/15] whatsnew --- doc/whats-new.rst | 2 ++ 1 file changed, 2 insertions(+) diff --git a/doc/whats-new.rst b/doc/whats-new.rst index b374721c8ee..96bddcf572f 100644 --- a/doc/whats-new.rst +++ b/doc/whats-new.rst @@ -30,6 +30,8 @@ New Features `Matt Savoie `_, `Stephan Hoyer `_ and `Tom Nicholas `_. +- A migration guide for users of the prototype `xarray-contrib/datatree repository `_ has been added, and can be found in the `DATATREE_MIGRATION_GUIDE.md` file in the repository root. + By `Tom Nicholas `_. - Added zarr backends for :py:func:`open_groups` (:issue:`9430`, :pull:`9469`). By `Eni Awowale `_. 
- Added support for vectorized interpolation using additional interpolators From 05dcf5085736d1dbd175c99c093f092401b17836 Mon Sep 17 00:00:00 2001 From: TomNicholas Date: Mon, 14 Oct 2024 18:13:31 -0400 Subject: [PATCH 04/15] add date --- DATATREE_MIGRATION_GUIDE.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/DATATREE_MIGRATION_GUIDE.md b/DATATREE_MIGRATION_GUIDE.md index 83873082893..bba1b662f45 100644 --- a/DATATREE_MIGRATION_GUIDE.md +++ b/DATATREE_MIGRATION_GUIDE.md @@ -1,6 +1,8 @@ # Migration guide for users of `xarray-contrib/datatree` -This guide is for previous users of the prototype `datatree.DataTree` in the `xarray-contrib/datatree repository`. That repository has now been archived, and will not be maintained. This guide is intended to help smooth your transition to using the new, updated `xarray.DataTree`. +_15th October 2024_ + +This guide is for previous users of the prototype `datatree.DataTree` class in the `xarray-contrib/datatree repository`. That repository has now been archived, and will not be maintained. This guide is intended to help smooth your transition to using the new, updated `xarray.DataTree` class. .. important From b0621ce159d07a5ad847873b5cc1d507f3c61b55 Mon Sep 17 00:00:00 2001 From: TomNicholas Date: Mon, 14 Oct 2024 18:24:40 -0400 Subject: [PATCH 05/15] spell out API changes in more detail --- DATATREE_MIGRATION_GUIDE.md | 50 +++++++++++++++++-------------------- 1 file changed, 23 insertions(+), 27 deletions(-) diff --git a/DATATREE_MIGRATION_GUIDE.md b/DATATREE_MIGRATION_GUIDE.md index bba1b662f45..d8b3c269705 100644 --- a/DATATREE_MIGRATION_GUIDE.md +++ b/DATATREE_MIGRATION_GUIDE.md @@ -6,10 +6,10 @@ This guide is for previous users of the prototype `datatree.DataTree` class in t .. important - There are breaking changes! You should not expect that code written with `xarray-contrib/datatree` will work without modifications. + There are breaking changes! 
You should not expect that code written with `xarray-contrib/datatree` will work without any modifications. At the absolute minimum you will need to change the top-level import statement, but there are other changes too. -We have made various changes compared to the prototype version. These can be split into two main types: minor API changes, which mostly consist of renaming methods to be more self-consistent, and some deeper data model changes, which affect the hierarchal structure itself. +We have made various changes compared to the prototype version. These can be split into three categories: minor API changes, which mostly consist of renaming methods to be more self-consistent; and some deeper data model changes, which affect the hierarchal structure itself; and integration with xarray's IO backends. ### Data model changes @@ -32,36 +32,32 @@ Performance improvements Can now extend other xarray backends to support `open_datatree`! -### Other API changes +### API changes -`from datatree import DataTree, open_datatree` -> `from xarray import DataTree, open_datatree` - -`.ds` -> `.dataset` - -`DataTree(ds=...)` to `DataTree(dataset=)` - -`.to_dataset()` still exists but now has options (`inherited=...`) - -`parent` kwarg removed from `DataTree.__init__` - -`.parent` property is now read-only - -`children` in `DataTree.__init__` are now shallow-copied - -`map_over_subtree` -> ? - -Arithmetic between `DataTree` and `Dataset`/scalars now raises - -`.as_array` -> `.to_dataarray` - -Disabled some methods which were not well tested. In general we have tried to only keep things that are known to work, with the plan to increase API surface incrementally after release. +A number of other API changes have been made, which should only require minor modifications to your code: +- The top-level import has changed, from `from datatree import DataTree, open_datatree` to `from xarray import DataTree, open_datatree`. 
Alternatively you can now just use the `import xarray as xr` namespace convention for everything datatree-related. +- The `DataTree.ds` property has been changed to `DataTree.dataset`, though `DataTree.ds` remains as an alias for `DataTree.dataset`. +- Similarly the `ds` kwarg in the `DataTree.__init__` constructor has been replaced by `dataset`, i.e. use `DataTree(dataset=)` instead of `DataTree(ds=...)`. +- The method `DataTree.to_dataset()` still exists but now has different options for controlling which variables are present on the resulting `Dataset`, e.g. `inherited=True/False`. +- The `DataTree.parent` property is now read-only. To assign a node as the parent you should instead use the `.children` property on the other node, which remains settable. +- Similarly the `parent` kwarg has been removed from the `DataTree.__init__` constuctor. +- DataTree objects passed to the `children` kwarg in `DataTree.__init__` are now shallow-copied. +- `DataTree.as_array` has been replaced by `DataTree.to_dataarray`. +- `map_over_subtree` -> ? +- A number of methods which were not well tested have been (temporarily) disabled. In general we have tried to only keep things that are known to work, with the plan to increase API surface incrementally after release. ## Thank you! Thank you for trying out `xarray-contrib/datatree`! -We welcome contributions of any kind, including things that never quite made it into the original datatree repository. Please also let us know if we have forgotten to mention a change that should have been listed in this guide. +We welcome contributions of any kind, including good ideas that never quite made it into the original datatree repository. Please also let us know if we have forgotten to mention a change that should have been listed in this guide. 
-Sincerely, the datatree team +Sincerely, the datatree team: -(Tom Nicholas, Owen Littlejohns, Matt Savoie, Eni Awowale, Alfonso Ladino, Justus Magin, Stephan Hoyer) +Tom Nicholas, +Owen Littlejohns, +Matt Savoie, +Eni Awowale, +Alfonso Ladino, +Justus Magin, +Stephan Hoyer From 8c5f41beaf70e0b48a2ede48865338abb02a6aa4 Mon Sep 17 00:00:00 2001 From: TomNicholas Date: Mon, 14 Oct 2024 18:33:36 -0400 Subject: [PATCH 06/15] details on backends integration --- DATATREE_MIGRATION_GUIDE.md | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/DATATREE_MIGRATION_GUIDE.md b/DATATREE_MIGRATION_GUIDE.md index d8b3c269705..059dc369989 100644 --- a/DATATREE_MIGRATION_GUIDE.md +++ b/DATATREE_MIGRATION_GUIDE.md @@ -26,11 +26,14 @@ Generally if you don't like this you can get more similar behaviour to the old p ### Integrated backends -`open_datatree(group=...)`? - -Performance improvements - -Can now extend other xarray backends to support `open_datatree`! +Previously `datatree.open_datatree` used a different codepath from `xarray.open_dataset`, and was hard-coded to only support opening netCDF files and Zarr stores. +Now xarray's backend entrypoint system has been generalized to include `open_datatree` and the new `open_groups`. +This means we can now extend other xarray backends to support `open_datatree`! If you are the maintainer of an xarray backend we encourage you to add support for `open_datatree` and `open_groups`! + +Additionally: +- A `group` kwarg has been added to `open_datatree` for choosing which group in the file should become the root group of the created tree. +- Various performance improvements have been made, which should help when opening netCDF files and Zarr stores with large numbers of groups. +- We anticipate further performance improvements being possible for datatree IO. 
### API changes From 5ce2d26cc8c7f98ae90a85cfbc385cb2f403125d Mon Sep 17 00:00:00 2001 From: TomNicholas Date: Mon, 14 Oct 2024 20:09:03 -0400 Subject: [PATCH 07/15] explain alignment and open_groups --- DATATREE_MIGRATION_GUIDE.md | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/DATATREE_MIGRATION_GUIDE.md b/DATATREE_MIGRATION_GUIDE.md index 059dc369989..75568019225 100644 --- a/DATATREE_MIGRATION_GUIDE.md +++ b/DATATREE_MIGRATION_GUIDE.md @@ -13,15 +13,14 @@ We have made various changes compared to the prototype version. These can be spl ### Data model changes -Internal alignment +The most important changes made are to the data model of `DataTree`. Whilst previously data in different nodes was unrelated and therefore unconstrained, now trees have "internal alignment" - meaning that dimensions and indexes in child nodes must exactly align with those in their parents. + +These alignment checks happen at tree construction time, meaning technically there are some netCDF4 files and zarr stores that previously could be opened as `datatree.DataTree` objects using `datatree.open_datatree`, but now cannot be opened as `xr.DataTree` objects using `xr.open_datatree`. For these cases we added a new opener function `xr.open_groups`, which returns a `dict[str, Dataset]`. This is intended as a fallback for tricky cases, where the idea is that you can still open the entire contents of the file using `open_groups`, edit the `Dataset` objects, then construct a valid tree from the edited dictionary using `DataTree.from_dict`. Coordinate inheritance Reflected in repr -Can no longer represent totally arbitrary datasets in each node - some on-disk structures that `xr.open_datatree` will now refuse to load. -For these cases we made `open_groups`. - Generally if you don't like this you can get more similar behaviour to the old package by removing indexes from coordinates. 
### Integrated backends From 1e8b04e89348ddf8ac4985b0f9a3379b687a0689 Mon Sep 17 00:00:00 2001 From: TomNicholas Date: Mon, 14 Oct 2024 20:14:21 -0400 Subject: [PATCH 08/15] explain coordinate inheritance --- DATATREE_MIGRATION_GUIDE.md | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/DATATREE_MIGRATION_GUIDE.md b/DATATREE_MIGRATION_GUIDE.md index 75568019225..b80d2045b9d 100644 --- a/DATATREE_MIGRATION_GUIDE.md +++ b/DATATREE_MIGRATION_GUIDE.md @@ -17,11 +17,9 @@ The most important changes made are to the data model of `DataTree`. Whilst prev These alignment checks happen at tree construction time, meaning technically there are some netCDF4 files and zarr stores that previously could be opened as `datatree.DataTree` objects using `datatree.open_datatree`, but now cannot be opened as `xr.DataTree` objects using `xr.open_datatree`. For these cases we added a new opener function `xr.open_groups`, which returns a `dict[str, Dataset]`. This is intended as a fallback for tricky cases, where the idea is that you can still open the entire contents of the file using `open_groups`, edit the `Dataset` objects, then construct a valid tree from the edited dictionary using `DataTree.from_dict`. -Coordinate inheritance +The alignment checks allowed us to add "Coordinate Inheritance", a much-requested feature where indexed coordinate variables are now "inherited" down to child nodes. This allows you to define common coordinates in a parent group that are then automatically available on every child node. The distinction between a locally-defined coordinate variables and an inherited coordinate that was defined on a parent node is reflected in the `DataTree.__repr__`. Generally if you prefer not to have these variables be inherited you can get more similar behaviour to the old `datatree` package by removing indexes from coordinates, as this prevents inheritance. 
-Reflected in repr - -Generally if you don't like this you can get more similar behaviour to the old package by removing indexes from coordinates. +For further documentation see the page in the user guide on Hierarchical Data. ### Integrated backends From b8eeaaf90dc175c751707b23cc853e5cf394d0cb Mon Sep 17 00:00:00 2001 From: "pre-commit-ci[bot]" <66853113+pre-commit-ci[bot]@users.noreply.github.com> Date: Tue, 15 Oct 2024 00:15:12 +0000 Subject: [PATCH 09/15] [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --- DATATREE_MIGRATION_GUIDE.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/DATATREE_MIGRATION_GUIDE.md b/DATATREE_MIGRATION_GUIDE.md index b80d2045b9d..486d9d9d66c 100644 --- a/DATATREE_MIGRATION_GUIDE.md +++ b/DATATREE_MIGRATION_GUIDE.md @@ -40,7 +40,7 @@ A number of other API changes have been made, which should only require minor mo - Similarly the `ds` kwarg in the `DataTree.__init__` constructor has been replaced by `dataset`, i.e. use `DataTree(dataset=)` instead of `DataTree(ds=...)`. - The method `DataTree.to_dataset()` still exists but now has different options for controlling which variables are present on the resulting `Dataset`, e.g. `inherited=True/False`. - The `DataTree.parent` property is now read-only. To assign a node as the parent you should instead use the `.children` property on the other node, which remains settable. -- Similarly the `parent` kwarg has been removed from the `DataTree.__init__` constuctor. +- Similarly the `parent` kwarg has been removed from the `DataTree.__init__` constuctor. - DataTree objects passed to the `children` kwarg in `DataTree.__init__` are now shallow-copied. - `DataTree.as_array` has been replaced by `DataTree.to_dataarray`. - `map_over_subtree` -> ? 
@@ -54,10 +54,10 @@ We welcome contributions of any kind, including good ideas that never quite made Sincerely, the datatree team: -Tom Nicholas, -Owen Littlejohns, -Matt Savoie, -Eni Awowale, -Alfonso Ladino, -Justus Magin, +Tom Nicholas, +Owen Littlejohns, +Matt Savoie, +Eni Awowale, +Alfonso Ladino, +Justus Magin, Stephan Hoyer From 366e755c9778c8cf6122aac7c54a20c99f6e8986 Mon Sep 17 00:00:00 2001 From: TomNicholas Date: Mon, 14 Oct 2024 20:17:30 -0400 Subject: [PATCH 10/15] re-trigger CI From 1c751dd9dec0d6e30971af5a00acd7f90d5e58d0 Mon Sep 17 00:00:00 2001 From: TomNicholas Date: Mon, 14 Oct 2024 20:19:03 -0400 Subject: [PATCH 11/15] remove bullet about map_over_subtree --- DATATREE_MIGRATION_GUIDE.md | 1 - 1 file changed, 1 deletion(-) diff --git a/DATATREE_MIGRATION_GUIDE.md b/DATATREE_MIGRATION_GUIDE.md index 486d9d9d66c..961988feefa 100644 --- a/DATATREE_MIGRATION_GUIDE.md +++ b/DATATREE_MIGRATION_GUIDE.md @@ -43,7 +43,6 @@ A number of other API changes have been made, which should only require minor mo - Similarly the `parent` kwarg has been removed from the `DataTree.__init__` constuctor. - DataTree objects passed to the `children` kwarg in `DataTree.__init__` are now shallow-copied. - `DataTree.as_array` has been replaced by `DataTree.to_dataarray`. -- `map_over_subtree` -> ? - A number of methods which were not well tested have been (temporarily) disabled. In general we have tried to only keep things that are known to work, with the plan to increase API surface incrementally after release. ## Thank you! 
From 68b2ab6441f8fa65d762667e54fe6e0bb9d54e46 Mon Sep 17 00:00:00 2001 From: Tom Nicholas Date: Tue, 15 Oct 2024 08:00:32 -0600 Subject: [PATCH 12/15] Markdown formatting for important warning block Co-authored-by: Matt Savoie --- DATATREE_MIGRATION_GUIDE.md | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/DATATREE_MIGRATION_GUIDE.md b/DATATREE_MIGRATION_GUIDE.md index 961988feefa..8056e393e0a 100644 --- a/DATATREE_MIGRATION_GUIDE.md +++ b/DATATREE_MIGRATION_GUIDE.md @@ -4,10 +4,8 @@ _15th October 2024_ This guide is for previous users of the prototype `datatree.DataTree` class in the `xarray-contrib/datatree repository`. That repository has now been archived, and will not be maintained. This guide is intended to help smooth your transition to using the new, updated `xarray.DataTree` class. -.. important - - There are breaking changes! You should not expect that code written with `xarray-contrib/datatree` will work without any modifications. - At the absolute minimum you will need to change the top-level import statement, but there are other changes too. +> [!IMPORTANT] +> There are breaking changes! You should not expect that code written with `xarray-contrib/datatree` will work without any modifications. At the absolute minimum you will need to change the top-level import statement, but there are other changes too. We have made various changes compared to the prototype version. These can be split into three categories: minor API changes, which mostly consist of renaming methods to be more self-consistent; and some deeper data model changes, which affect the hierarchal structure itself; and integration with xarray's IO backends. 
From 7d560cd9dfb4ac6513afaac558c8a12c2b7864ad Mon Sep 17 00:00:00 2001 From: Tom Nicholas Date: Tue, 15 Oct 2024 08:00:59 -0600 Subject: [PATCH 13/15] Reorder changes in order of importance Co-authored-by: Matt Savoie --- DATATREE_MIGRATION_GUIDE.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/DATATREE_MIGRATION_GUIDE.md b/DATATREE_MIGRATION_GUIDE.md index 8056e393e0a..f64339d170f 100644 --- a/DATATREE_MIGRATION_GUIDE.md +++ b/DATATREE_MIGRATION_GUIDE.md @@ -7,7 +7,7 @@ This guide is for previous users of the prototype `datatree.DataTree` class in t > [!IMPORTANT] > There are breaking changes! You should not expect that code written with `xarray-contrib/datatree` will work without any modifications. At the absolute minimum you will need to change the top-level import statement, but there are other changes too. -We have made various changes compared to the prototype version. These can be split into three categories: minor API changes, which mostly consist of renaming methods to be more self-consistent; and some deeper data model changes, which affect the hierarchal structure itself; and integration with xarray's IO backends. +We have made various changes compared to the prototype version. These can be split into three categories: data model changes, which affect the hierarchal structure itself, integration with xarray's IO backends; and minor API changes, which mostly consist of renaming methods to be more self-consistent. 
### Data model changes From 23d0e4416fcfd3b67b646d414c55b44b9d6e440f Mon Sep 17 00:00:00 2001 From: Tom Nicholas Date: Tue, 15 Oct 2024 08:03:45 -0600 Subject: [PATCH 14/15] Clearer wording on setting relationships Co-authored-by: Matt Savoie --- DATATREE_MIGRATION_GUIDE.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/DATATREE_MIGRATION_GUIDE.md b/DATATREE_MIGRATION_GUIDE.md index f64339d170f..90edc984e7c 100644 --- a/DATATREE_MIGRATION_GUIDE.md +++ b/DATATREE_MIGRATION_GUIDE.md @@ -37,7 +37,7 @@ A number of other API changes have been made, which should only require minor mo - The `DataTree.ds` property has been changed to `DataTree.dataset`, though `DataTree.ds` remains as an alias for `DataTree.dataset`. - Similarly the `ds` kwarg in the `DataTree.__init__` constructor has been replaced by `dataset`, i.e. use `DataTree(dataset=)` instead of `DataTree(ds=...)`. - The method `DataTree.to_dataset()` still exists but now has different options for controlling which variables are present on the resulting `Dataset`, e.g. `inherited=True/False`. -- The `DataTree.parent` property is now read-only. To assign a node as the parent you should instead use the `.children` property on the other node, which remains settable. +- The `DataTree.parent` property is now read-only. To assign ancestral relationships directly you must instead use the `.children` property on the parent node, which remains settable. - Similarly the `parent` kwarg has been removed from the `DataTree.__init__` constuctor. - DataTree objects passed to the `children` kwarg in `DataTree.__init__` are now shallow-copied. - `DataTree.as_array` has been replaced by `DataTree.to_dataarray`.
From 85d9c999b66c6b900fc6074aae4864874cb301c1 Mon Sep 17 00:00:00 2001 From: Tom Nicholas Date: Tue, 15 Oct 2024 08:06:20 -0600 Subject: [PATCH 15/15] remove "technically" Co-authored-by: Matt Savoie --- DATATREE_MIGRATION_GUIDE.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/DATATREE_MIGRATION_GUIDE.md b/DATATREE_MIGRATION_GUIDE.md index 90edc984e7c..312cb842d84 100644 --- a/DATATREE_MIGRATION_GUIDE.md +++ b/DATATREE_MIGRATION_GUIDE.md @@ -13,7 +13,7 @@ We have made various changes compared to the prototype version. These can be spl The most important changes made are to the data model of `DataTree`. Whilst previously data in different nodes was unrelated and therefore unconstrained, now trees have "internal alignment" - meaning that dimensions and indexes in child nodes must exactly align with those in their parents. -These alignment checks happen at tree construction time, meaning technically there are some netCDF4 files and zarr stores that previously could be opened as `datatree.DataTree` objects using `datatree.open_datatree`, but now cannot be opened as `xr.DataTree` objects using `xr.open_datatree`. For these cases we added a new opener function `xr.open_groups`, which returns a `dict[str, Dataset]`. This is intended as a fallback for tricky cases, where the idea is that you can still open the entire contents of the file using `open_groups`, edit the `Dataset` objects, then construct a valid tree from the edited dictionary using `DataTree.from_dict`. +These alignment checks happen at tree construction time, meaning there are some netCDF4 files and zarr stores that could previously be opened as `datatree.DataTree` objects using `datatree.open_datatree`, but now cannot be opened as `xr.DataTree` objects using `xr.open_datatree`. For these cases we added a new opener function `xr.open_groups`, which returns a `dict[str, Dataset]`. 
This is intended as a fallback for tricky cases, where the idea is that you can still open the entire contents of the file using `open_groups`, edit the `Dataset` objects, then construct a valid tree from the edited dictionary using `DataTree.from_dict`. The alignment checks allowed us to add "Coordinate Inheritance", a much-requested feature where indexed coordinate variables are now "inherited" down to child nodes. This allows you to define common coordinates in a parent group that are then automatically available on every child node. The distinction between a locally-defined coordinate variable and an inherited coordinate that was defined on a parent node is reflected in the `DataTree.__repr__`. Generally if you prefer not to have these variables be inherited you can get more similar behaviour to the old `datatree` package by removing indexes from coordinates, as this prevents inheritance.