move custom accessors to advanced and add a more complex example #168
Merged
Changes from 4 commits
Commits
27 commits
d955014 add advanced/accessors section and remove accessors part 2 from xarra… (JessicaS11)
65d6984 add geoid offset ex (JessicaS11)
5dd2950 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
30639fa add new tutorial to toc (JessicaS11)
37dc3d4 Merge branch 'main' into accessors (JessicaS11)
e33bc99 make example 1 executable (JessicaS11)
79a3420 start making ex 2 executable (JessicaS11)
0d8dd9e [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
c134429 add insar velocity accessor example (JessicaS11)
774700a [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
e0dad6a apply edits from PR (JessicaS11)
55421e5 debug new tutorial (JessicaS11)
054422f Merge branch 'main' into accessors (JessicaS11)
7432617 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
52f7d09 add cross ref in tutorial (JessicaS11)
78ea4d5 re-add raises-exception cell tag (JessicaS11)
5c9ea87 Merge branch 'main' into accessors (dcherian)
36e5402 use note directive (JessicaS11)
fd3d214 PR review suggestions (JessicaS11)
6fdadb7 Merge branch 'xarray-contrib:main' into accessors (JessicaS11)
6172e79 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
77ea3ba fix accessor path (JessicaS11)
8875193 fix accessor path (JessicaS11)
135777b add random newline to see if another push will get github to update t… (JessicaS11)
4fd7b82 make non-relative path (JessicaS11)
d1b814f fix a few typos (JessicaS11)
97162b3 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,240 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"attachments": {}, | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# Creating custom accessors" | ||
] | ||
}, | ||
{ | ||
"attachments": {}, | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Introduction\n", | ||
"\n", | ||
"An accessor is a way of attaching a custom function to xarray types so that it can be called as if it were a method while retaining a clear separation between \"core\" xarray API and custom API. It enables you to easily *extend* (which is why you'll sometimes see it referred to as an extension) and customize xarray's functionality while limiting naming conflicts and minimizing the chances of your code breaking with xarray upgrades.\n", | ||
"\n", | ||
"If you've used [rioxarray](https://corteva.github.io/rioxarray/stable/) (e.g. `da.rio.crs`) or [hvplot](https://hvplot.holoviz.org/) (e.g. `ds.hvplot()`), you may have already used an xarray accessor without knowing it!\n", | ||
"\n", | ||
"The [Xarray documentation](https://docs.xarray.dev/en/stable/internals/extending-xarray.html) has some more technical details, and this tutorial provides example custom accessors and their uses." | ||
] | ||
}, | ||
{ | ||
"attachments": {}, | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Why create a custom accessor\n", | ||
"\n", | ||
"- You can easily create a custom suite of tools that work on Xarray objects\n", | ||
"- It keeps your workflows cleaner and simpler\n", | ||
"- Your project-specific code is easy to share\n", | ||
"- It's easy to implement: you don't need to integrate any code into Xarray\n", | ||
"- It makes it easier to perform checks and write code documentation because you only have to write them once!" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Easy steps to create your own accessor\n", | ||
"\n", | ||
"1. Create your custom class, including the mandatory `__init__` method\n", | ||
"2. Decorate the class with `xr.register_dataarray_accessor()` or `xr.register_dataset_accessor()`\n", | ||
"3. Use your custom functions" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Example 1: accessing scipy functionality" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"For example, imagine you're a statistician who regularly uses a special `skewness` function which acts on dataarrays but is only of interest to people in your specific field.\n", | ||
"\n", | ||
"You can create a method which applies this skewness function to an xarray object, and then register the method under a custom `stats` accessor like this:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"import xarray as xr\n", | ||
"from scipy.stats import skew\n", | ||
"\n", | ||
"\n", | ||
"@xr.register_dataarray_accessor(\"stats\")\n", | ||
"class StatsAccessor:\n", | ||
"    def __init__(self, da):\n", | ||
"        self._da = da\n", | ||
"\n", | ||
"    def skewness(self, dim):\n", | ||
"        return self._da.reduce(func=skew, dim=dim)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Now we can conveniently access this functionality via the `stats` accessor" | ||
] | ||
}, | ||
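{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# `ds` is assumed to be a dataset with an `air` variable and a `time` dimension,\n", | ||
"# such as xarray's `air_temperature` tutorial dataset\n", | ||
"ds['air'].stats.skewness(dim=\"time\")" | ||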
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Notice how the presence of `.stats` clearly differentiates our new \"accessor method\" from core xarray methods." | ||
] | ||
}, | ||
{ | ||
"attachments": {}, | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Example 2: creating your own workflows\n", | ||
"\n", | ||
"Perhaps you find yourself running similar code for multiple xarray objects or across related projects. Packing your code into an extension makes it easy to repeat the same operation while reducing the likelihood of human-introduced errors.\n", | ||
"\n", | ||
"Consider someone who frequently converts their elevations to be relative to the geoid (rather than the ellipsoid) using a custom, local conversion (otherwise, we'd recommend using an established conversion library like [pyproj](https://pypi.org/project/pyproj/) to switch between datums)." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"@xr.register_dataset_accessor(\"geoidxr\")\n", | ||
"class GeoidXR:\n", | ||
"    \"\"\"\n", | ||
"    An extension for an Xarray dataset that will calculate geoidal elevations from a local source file.\n", | ||
"    \"\"\"\n", | ||
"\n", | ||
"    # ----------------------------------------------------------------------\n", | ||
"    # Constructors\n", | ||
"\n", | ||
"    def __init__(\n", | ||
"        self,\n", | ||
"        xrds,\n", | ||
"    ):\n", | ||
"        self._xrds = xrds\n", | ||
"        # Running this function on init will check that my dataset has all the needed dimensions and variables\n", | ||
"        # as specific to my workflow, saving time and headache later if they were missing and the computation fails\n", | ||
"        # partway through.\n", | ||
"        self._validate(req_dim=['x', 'y', 'dtime'], req_vars={'elevation': ['x', 'y', 'dtime']})\n", | ||
"\n", | ||
"    # ----------------------------------------------------------------------\n", | ||
"    # Methods\n", | ||
"\n", | ||
"    def _validate(self, req_dim=None, req_vars=None):\n", | ||
"        '''\n", | ||
"        Make sure the xarray dataset has the correct dimensions and variables\n", | ||
"\n", | ||
"        req_dim : list of str\n", | ||
"            List of all required dimension names\n", | ||
"\n", | ||
"        req_vars : dict\n", | ||
"            Required variable names as keys (only the keys are checked)\n", | ||
"        '''\n", | ||
"\n", | ||
"        # raise if ANY required dimension or variable is missing\n", | ||
"        if req_dim is not None:\n", | ||
"            if any(dim not in self._xrds.dims for dim in req_dim):\n", | ||
"                raise AttributeError(\"Required dimensions are missing\")\n", | ||
"        if req_vars is not None:\n", | ||
"            if any(var not in self._xrds.variables for var in req_vars):\n", | ||
"                raise AttributeError(\"Required variables are missing\")\n", | ||
"\n", | ||
"    # Notice that 'geoid' has been added to the req_vars list\n", | ||
"    def to_geoid(\n", | ||
"        self,\n", | ||
"        req_dim=['dtime', 'x', 'y'],\n", | ||
"        req_vars={'elevation': ['x', 'y', 'dtime', 'geoid']},\n", | ||
"        source=None,\n", | ||
"    ):\n", | ||
"        \"\"\"\n", | ||
"        Get geoid layer from your local file, which is provided to the function as \"source\",\n", | ||
"        and apply the offset to all elevation values.\n", | ||
"        Adds 'geoid_offset' to the \"offset_names\" attribute so you know the geoid offset was applied.\n", | ||
"\n", | ||
"        req_dim : list of str\n", | ||
"            List of all required dimension names.\n", | ||
"\n", | ||
"        req_vars : dict\n", | ||
"            Required variable names as keys (only the keys are checked)\n", | ||
"\n", | ||
"        source : str\n", | ||
"            Full path to your source file containing geoid offsets\n", | ||
"        \"\"\"\n", | ||
"\n", | ||
"        # check to make sure you haven't already run this function (and are thus applying the offset twice)\n", | ||
"        try:\n", | ||
"            values = self._xrds.attrs['offset_names']\n", | ||
"            assert 'geoid_offset' not in values, \"You've already applied the geoid offset!\"\n", | ||
"            # keep existing offset names in a flat list (a single name may be stored as a string)\n", | ||
"            values = [values] if isinstance(values, str) else list(values)\n", | ||
"            values = values + ['geoid_offset']\n", | ||
"        except KeyError:\n", | ||
"            values = ['geoid_offset']\n", | ||
"\n", | ||
"        self._validate(req_dim, req_vars)\n", | ||
"\n", | ||
"        # read in your geoid values\n", | ||
"        # WARNING: this implementation assumes your geoid values are in the same CRS and grid as the data you are applying\n", | ||
"        # them to. If not, you will need to reproject and/or resample them to match the data to which you are applying them.\n", | ||
"        # That step is not included here to emphasize the accessor aspect of the workflow.\n", | ||
"        # 'geoid_varname' is a placeholder for the name of the geoid variable in your source file\n", | ||
"        with xr.open_dataset(source) as src:\n", | ||
"            geoid = src['geoid_varname'].load()\n", | ||
"\n", | ||
"        # As noted above, this step will fail or produce unreliable results if your data is not properly gridded\n", | ||
"        self._xrds['elevation'] = self._xrds.elevation - geoid\n", | ||
"\n", | ||
"        self._xrds.attrs['offset_names'] = values\n", | ||
"\n", | ||
"        return self._xrds" | ||
] | ||
}, | ||
{ | ||
"attachments": {}, | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Now, each time we want to convert our ellipsoid data to the geoid, we only have to run one line of code, and it will also perform a multitude of checks for us to make sure we're performing exactly the operation we expect. Imagine the possibilities (and decrease in frustration)!" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"ds = ds.geoidxr.to_geoid(source='/Path/to/Custom/source/file.nc')" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"language_info": { | ||
"name": "python" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 2 | ||
} |
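For reference while reading the notebook above, here is a minimal, self-contained sketch of the three steps (write the class, register it, use it) together with the validate-then-apply pattern from Example 2. It is not the notebook's code: the accessor name `demo`, the methods `validate` and `apply_offset`, the `demo_offset` label, and the synthetic dataset are all hypothetical and chosen only for illustration. It assumes `numpy` and `xarray` are installed.

```python
import numpy as np
import xarray as xr


# Steps 1 and 2: write the class and register it.
# "demo" is a hypothetical accessor name used only in this sketch.
@xr.register_dataset_accessor("demo")
class DemoAccessor:
    def __init__(self, ds):
        self._ds = ds

    def validate(self, req_dims=(), req_vars=()):
        """Raise if ANY required dimension or variable is missing."""
        missing_dims = [d for d in req_dims if d not in self._ds.dims]
        missing_vars = [v for v in req_vars if v not in self._ds.variables]
        if missing_dims or missing_vars:
            raise AttributeError(
                f"Missing dimensions {missing_dims} and/or variables {missing_vars}"
            )

    def apply_offset(self, offset):
        """Return a copy with `offset` subtracted from 'elevation'.

        The applied offset is recorded in the 'offset_names' attribute so
        that a second application can be detected and refused.
        """
        self.validate(req_dims=["x", "y"], req_vars=["elevation"])
        applied = list(self._ds.attrs.get("offset_names", []))
        if "demo_offset" in applied:
            raise ValueError("The offset has already been applied")
        out = self._ds.copy()  # shallow copy; the original dataset is left unchanged
        out["elevation"] = out["elevation"] - offset
        out.attrs["offset_names"] = applied + ["demo_offset"]
        return out


# Step 3: use the accessor on a small synthetic dataset.
ds = xr.Dataset(
    {"elevation": (("x", "y"), np.full((2, 3), 100.0))},
    coords={"x": [0, 1], "y": [0, 1, 2]},
)
shifted = ds.demo.apply_offset(30.0)
```

Returning a modified copy rather than changing the dataset in place is a design choice of this sketch, not something the notebook prescribes; it keeps the original `ds` available if the offset turns out to be wrong.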
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
```{tableofcontents} | ||
|
||
``` |