Grouping by multiple arrays with Xarray

Tuesday, July 18th, 2023



TLDR

Xarray now supports grouping by multiple variables (docs). 🎉 😱 🤯 🥳. Try it out!

How do I use it?

Install xarray>=2024.08.0 and optionally flox for better performance with reductions.
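
As a quick sanity check after installing, the sketch below compares the installed xarray version with the minimum quoted above. The helper names `parse_calver` and `meets_minimum` are hypothetical, and the parsing assumes a year.month.patch version string; treat it as an illustration rather than an official check.

```python
import re
from importlib.metadata import PackageNotFoundError, version

MINIMUM_XARRAY = "2024.08.0"  # minimum version quoted in this post


def parse_calver(text):
    """Return (year, month, patch) from a version string such as '2024.08.0'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def meets_minimum(installed, minimum=MINIMUM_XARRAY):
    """True when `installed` is at least `minimum` (hypothetical helper)."""
    return parse_calver(installed) >= parse_calver(minimum)


try:
    installed = version("xarray")
except PackageNotFoundError:
    print("xarray is not installed")
else:
    print(installed, "ok" if meets_minimum(installed) else "too old")
```

Comparing integer tuples avoids the pitfall of comparing version strings lexically, where "2024.10.0" would sort before "2024.9.0".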

Simple example

Set up a multi-variable groupby by passing one Grouper object per variable.

import numpy as np
import xarray as xr
from xarray.groupers import UniqueGrouper

da = xr.DataArray(
    np.array([1, 2, 3, 0, 2, np.nan]),
    dims="d",
    coords=dict(
        labels1=("d", np.array(["a", "b", "c", "c", "b", "a"])),
        labels2=("d", np.array(["x", "y", "z", "z", "y", "x"])),
    ),
)

# One grouper per coordinate; groups are combinations of the two labels.
gb = da.groupby(labels1=UniqueGrouper(), labels2=UniqueGrouper())
gb

<DataArrayGroupBy, grouped over 2 grouper(s), 9 groups in total:
	'labels1': 3 groups with labels 'a', 'b', 'c'
	'labels2': 3 groups with labels 'x', 'y', 'z'>

Reductions work as usual:

gb.mean()
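
To make the semantics concrete, here is a minimal pure-Python sketch of what a mean over these two label arrays amounts to. It is not how xarray implements the reduction; it assumes NaN values are skipped (the usual default for float data) and that label combinations with no members come out as NaN. The `nanmean` helper is defined here for illustration only.

```python
import math
from itertools import product

data = [1.0, 2.0, 3.0, 0.0, 2.0, math.nan]
labels1 = ["a", "b", "c", "c", "b", "a"]
labels2 = ["x", "y", "z", "z", "y", "x"]

# Every combination of unique labels is a group, even if no element falls in it.
groups = {key: [] for key in product(sorted(set(labels1)), sorted(set(labels2)))}
for value, l1, l2 in zip(data, labels1, labels2):
    groups[(l1, l2)].append(value)


def nanmean(values):
    """Mean that ignores NaN; NaN when nothing valid is left."""
    valid = [v for v in values if not math.isnan(v)]
    return sum(valid) / len(valid) if valid else math.nan


means = {key: nanmean(values) for key, values in groups.items()}

for key, value in means.items():
    print(key, value)
```

Of the 9 label combinations, only ('a', 'x'), ('b', 'y') and ('c', 'z') contain data, with means 1.0, 2.0 and 1.5; the other six are NaN in this sketch.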

So does map:

gb.map(lambda x: x[0])

Multiple Groupers

Different grouper types can be combined: categorical grouping with UniqueGrouper, binning with BinGrouper, and resampling with TimeResampler.

from xarray.groupers import BinGrouper

ds = xr.Dataset(
    {"foo": (("x", "y"), np.arange(12).reshape((4, 3)))},
    coords={"x": [10, 20, 30, 40], "letters": ("x", list("abba"))},
)
# Bin x into (5, 15] and (15, 25], and group by the unique letters.
gb = ds.groupby(x=BinGrouper(bins=[5, 15, 25]), letters=UniqueGrouper())
gb

<DatasetGroupBy, grouped over 2 grouper(s), 4 groups in total:
	'x_bins': 2 groups with labels (5,, 15], (15,, 25]
	'letters': 2 groups with labels 'a', 'b'>

Now reduce as usual:

gb.mean()
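
The combination of binning and unique labels can be sketched in plain Python as follows. This is an illustration of the semantics, not xarray's implementation: it assumes right-closed intervals (as in the repr above) and that values outside every bin are dropped. The helpers `bin_index` and `column_means` are hypothetical names used only here.

```python
import math
from bisect import bisect_left
from itertools import product

x = [10, 20, 30, 40]
letters = ["a", "b", "b", "a"]
foo = [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]]  # rows follow x, columns follow y
bins = [5, 15, 25]


def bin_index(value, edges):
    """Index of the right-closed interval (edges[i], edges[i + 1]] holding value, else None."""
    i = bisect_left(edges, value) - 1
    return i if 0 <= i < len(edges) - 1 else None


intervals = list(zip(bins[:-1], bins[1:]))  # [(5, 15), (15, 25)]
groups = {key: [] for key in product(intervals, sorted(set(letters)))}
for value, letter, row in zip(x, letters, foo):
    i = bin_index(value, bins)
    if i is not None:  # x = 30 and x = 40 lie outside every bin and are dropped
        groups[(intervals[i], letter)].append(row)


def column_means(rows):
    """Mean over the grouped rows, one value per y column; NaN for an empty group."""
    if not rows:
        return [math.nan] * len(foo[0])
    return [sum(col) / len(rows) for col in zip(*rows)]


means = {key: column_means(rows) for key, rows in groups.items()}

for key, value in means.items():
    print(key, value)
```

Only two of the four groups are populated here: x = 10 falls in (5, 15] with letter 'a', and x = 20 falls in (15, 25] with letter 'b'.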