docs: Move quick overview one level up #2890

Merged
merged 8 commits into from
Apr 19, 2019
Changes from 2 commits
1 change: 0 additions & 1 deletion doc/examples.rst
@@ -4,7 +4,6 @@ Examples
.. toctree::
:maxdepth: 2

examples/quick-overview
examples/weather-data
examples/monthly-means
examples/multidimensional-coords
2 changes: 2 additions & 0 deletions doc/index.rst
@@ -29,6 +29,7 @@ Documentation

* :doc:`why-xarray`
* :doc:`faq`
* :doc:`quick-overview`
* :doc:`examples`
* :doc:`installing`

@@ -39,6 +40,7 @@ Documentation

why-xarray
faq
quick-overview
examples
installing

57 changes: 45 additions & 12 deletions doc/examples/quick-overview.rst → doc/quick-overview.rst
@@ -22,12 +22,12 @@ array or list, with optional *dimensions* and *coordinates*:

.. ipython:: python

xr.DataArray(np.random.randn(2, 3))
data = xr.DataArray(np.random.randn(2, 3), coords={'x': ['a', 'b']}, dims=('x', 'y'))
data = xr.DataArray(np.random.randn(2, 3),
dims=('x', 'y'),
coords={'x': [10, 20]})
data

If you supply a pandas :py:class:`~pandas.Series` or
:py:class:`~pandas.DataFrame`, metadata is copied directly:
In this case, we have generated a 2D array, assigned the names *x* and *y* to the two dimensions respectively, and associated two *coordinate labels*, 10 and 20, with the two locations along the x dimension. If you supply a pandas :py:class:`~pandas.Series` or :py:class:`~pandas.DataFrame`, metadata is copied directly:

.. ipython:: python

@@ -44,25 +44,45 @@ Here are the key properties for a ``DataArray``:
# you can use this dictionary to store arbitrary metadata
data.attrs
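
As a minimal sketch of the properties listed above, the following rebuilds the ``data`` array from the start of this page and reads each property in turn. The variable names ``values``, ``dims``, ``x_labels`` and ``attrs`` are illustrative only and do not appear in the original docs.

```python
import numpy as np
import xarray as xr

# Same construction as the quick-overview example above.
data = xr.DataArray(
    np.random.randn(2, 3),
    dims=("x", "y"),
    coords={"x": [10, 20]},
)

values = data.values                 # the wrapped numpy.ndarray
dims = data.dims                     # tuple of dimension names
x_labels = data.coords["x"].values   # coordinate labels along x
attrs = data.attrs                   # metadata dictionary, empty until we fill it

print(values.shape, dims, x_labels, attrs)
```

Only ``x`` carries coordinate labels here; ``y`` is a dimension without an explicit coordinate.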


Indexing
--------

xarray supports four kinds of indexing. These operations are just as fast as in
pandas, because we borrow pandas' indexing machinery.
xarray supports four kinds of indexing. Since we have assigned coordinate labels to the x dimension, we can use label-based indexing along that dimension just like pandas. The four examples below all yield the same result, but at varying levels of convenience and intuitiveness.

.. ipython:: python

# positional and by integer label, like numpy
data[[0, 1]]

# positional and by coordinate label, like pandas
data.loc['a':'b']
data.loc[10:20]

# by dimension name and integer label
data.isel(x=slice(2))

# by dimension name and coordinate label
data.sel(x=['a', 'b'])
data.sel(x=[10, 20])


Unlike positional indexing, label-based indexing frees us from having to know how our array is organized. All we need to know are the dimension name and the label we wish to index, i.e. ``data.sel(x=10)`` works regardless of whether ``x`` is the first or second dimension of the array and regardless of whether ``10`` is the first or second element of ``x``. We already told xarray that ``x`` is the first dimension when we created ``data``; xarray keeps track of this so we don't have to. These operations are just as fast as in pandas, because xarray borrows pandas' indexing machinery.
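
A small sketch of that claim, assuming deterministic values in place of the random ones used above so the result is easy to check. The name ``flipped`` is hypothetical; it is simply ``data`` with its dimensions reordered.

```python
import numpy as np
import xarray as xr

# Deterministic stand-in for the random array used on this page.
data = xr.DataArray(
    np.arange(6.0).reshape(2, 3),
    dims=("x", "y"),
    coords={"x": [10, 20]},
)

# Same data, but with y first and x second.
flipped = data.transpose("y", "x")

# Label-based selection is written identically for both layouts.
by_label = data.sel(x=10)
by_label_flipped = flipped.sel(x=10)

# Positional selection has to change with the layout.
by_position = data[0]
by_position_flipped = flipped[:, 0]

print(by_label.values, by_label_flipped.values)
```

With positional indexing the caller must track which axis is ``x``; with ``sel`` the dimension name carries that information.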


Attributes
----------

While you're setting up your DataArray, it's often a good idea to set metadata attributes. A useful choice is to set ``data.attrs['long_name']`` and ``data.attrs['units']`` since xarray will use these, if present, to automatically label your plots. These special names were chosen following the `NetCDF Climate and Forecast (CF) Metadata Conventions <http://cfconventions.org/cf-conventions/cf-conventions.html>`_. ``attrs`` is just a Python dictionary, so you can assign anything you wish.

.. ipython:: python

data.attrs['long_name'] = 'random velocity'
data.attrs['units'] = 'metres/sec'
data.attrs['description'] = 'A random variable created as an example.'
data.attrs['random_attribute'] = 123
data.attrs
# you can add metadata to coordinates too
data.x.attrs['units'] = 'x units'


Computation
-----------
@@ -73,6 +93,7 @@ Data arrays work very similarly to numpy ndarrays:

data + 10
np.sin(data)
# transpose
data.T
data.sum()

@@ -121,10 +142,22 @@ xarray supports grouped operations using a very similar API to pandas:
data.groupby(labels).mean('y')
data.groupby(labels).apply(lambda x: x - x.min())
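
The definition of ``labels`` is outside this hunk, so the following is only a sketch of a grouped mean under an assumed definition: ``labels`` is taken to be a one-dimensional ``DataArray`` along ``y`` holding the group names ``'E'``, ``'F'``, ``'E'``, and the data values are made deterministic for readability.

```python
import numpy as np
import xarray as xr

data = xr.DataArray(
    np.arange(6.0).reshape(2, 3),
    dims=("x", "y"),
    coords={"x": [10, 20]},
)

# Assumed grouping variable: one label per position along y.
labels = xr.DataArray(["E", "F", "E"], dims="y", name="labels")

# Average over y within each group; the result gains a 'labels' dimension.
result = data.groupby(labels).mean(dim="y")

print(result.sel(labels="E").values)
print(result.sel(labels="F").values)
```

Group ``E`` averages the first and third columns, while group ``F`` contains only the second column.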

Plotting
--------

Visualizing your datasets is quick and convenient:

.. ipython:: python

@savefig plotting_quick_overview.png
data.plot()

Note the automatic labeling with names and units. Our effort in adding metadata attributes has paid off!

pandas
------

Xarray objects can be easily converted to and from pandas objects:
Xarray objects can be easily converted to and from pandas objects using the :py:meth:`~xarray.DataArray.to_series`, :py:meth:`~xarray.DataArray.to_dataframe` and :py:meth:`~pandas.DataFrame.to_xarray` methods:

.. ipython:: python

@@ -161,10 +194,10 @@ You can do almost everything you can do with ``DataArray`` objects with
``Dataset`` objects (including indexing and arithmetic) if you prefer to work
with multiple variables at once.
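
As a rough illustration, assuming a two-variable dataset whose variable names ``foo`` and ``bar`` are chosen here for the example, indexing and arithmetic apply to every variable at once:

```python
import numpy as np
import xarray as xr

data = xr.DataArray(
    np.arange(6.0).reshape(2, 3),
    dims=("x", "y"),
    coords={"x": [10, 20]},
)

# Two variables sharing the x dimension (names are illustrative).
ds = xr.Dataset({"foo": data, "bar": ("x", [1, 2])})

# Label-based indexing selects from every variable that has an x dimension.
row = ds.sel(x=10)

# Arithmetic is applied variable by variable.
doubled = ds * 2

print(row)
print(doubled["bar"].values)
```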

NetCDF
------
Read & write netCDF files
-------------------------

NetCDF is the recommended binary serialization format for xarray objects. Users
NetCDF is the recommended file format for xarray objects. Users
from the geosciences will recognize that the :py:class:`~xarray.Dataset` data
model looks very similar to a netCDF file (which, in fact, inspired it).
