xarray.DataArray.drop_duplicates

DataArray.drop_duplicates(dim, *, keep='first')

Returns a new DataArray with duplicate dimension values removed.

Parameters:
  • dim (dimension label or labels) – Dimension(s) along which to drop duplicate values. Pass ... to drop duplicates along all dimensions.

  • keep ({"first", "last", False}, default: "first") – Determines which duplicates (if any) to keep.

    • "first" : Drop duplicates except for the first occurrence.

    • "last" : Drop duplicates except for the last occurrence.

    • False : Drop every occurrence of a duplicated value, keeping none.
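
The three keep modes can be illustrated without xarray. The following is a minimal pure-Python sketch, not the library's implementation; the helper name keep_positions is hypothetical and exists only for this illustration. It returns the positions along a dimension that would survive for a given list of coordinate labels.

```python
from collections import Counter


def keep_positions(labels, keep="first"):
    """Return the positions that survive de-duplication (hypothetical helper)."""
    if keep is False:
        # Discard every label that occurs more than once.
        counts = Counter(labels)
        return [i for i, value in enumerate(labels) if counts[value] == 1]
    if keep == "first":
        # Keep a position only the first time its label is seen.
        seen = set()
        kept = []
        for i, value in enumerate(labels):
            if value not in seen:
                seen.add(value)
                kept.append(i)
        return kept
    if keep == "last":
        # Later positions overwrite earlier ones, leaving the last occurrence.
        last_seen = {value: i for i, value in enumerate(labels)}
        return sorted(last_seen.values())
    raise ValueError("keep must be 'first', 'last' or False")


x = [0, 0, 1, 2, 3]
print(keep_positions(x, "first"))  # [0, 2, 3, 4]
print(keep_positions(x, "last"))   # [1, 2, 3, 4]
print(keep_positions(x, False))    # [2, 3, 4]
```

With the labels [0, 0, 1, 2, 3], "first" and "last" differ only in which of the two positions labelled 0 is retained, while False removes both.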

Returns:

DataArray

Examples

>>> da = xr.DataArray(
...     np.arange(25).reshape(5, 5),
...     dims=("x", "y"),
...     coords={"x": np.array([0, 0, 1, 2, 3]), "y": np.array([0, 1, 2, 3, 3])},
... )
>>> da
<xarray.DataArray (x: 5, y: 5)> Size: 200B
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])
Coordinates:
  * x        (x) int64 40B 0 0 1 2 3
  * y        (y) int64 40B 0 1 2 3 3
>>> da.drop_duplicates(dim="x")
<xarray.DataArray (x: 4, y: 5)> Size: 160B
array([[ 0,  1,  2,  3,  4],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])
Coordinates:
  * x        (x) int64 32B 0 1 2 3
  * y        (y) int64 40B 0 1 2 3 3
>>> da.drop_duplicates(dim="x", keep="last")
<xarray.DataArray (x: 4, y: 5)> Size: 160B
array([[ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])
Coordinates:
  * x        (x) int64 32B 0 1 2 3
  * y        (y) int64 40B 0 1 2 3 3
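
The examples here cover keep="first" and keep="last" but not keep=False. The sketch below rebuilds the same array as a standalone script and applies keep=False along x; the variable name unique_only is chosen for illustration only.

```python
import numpy as np
import xarray as xr

da = xr.DataArray(
    np.arange(25).reshape(5, 5),
    dims=("x", "y"),
    coords={"x": np.array([0, 0, 1, 2, 3]), "y": np.array([0, 1, 2, 3, 3])},
)

# keep=False discards every occurrence of a repeated label,
# so both rows with x == 0 are removed.
unique_only = da.drop_duplicates(dim="x", keep=False)

print(unique_only.x.values)  # [1 2 3]
print(unique_only.shape)     # (3, 5)
```

Only the rows whose x label is unique in the original array remain; the y dimension is untouched because it was not named in dim.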

Drop duplicates along all dimensions (keeping the first occurrence of each value):

>>> da.drop_duplicates(dim=...)
<xarray.DataArray (x: 4, y: 4)> Size: 128B
array([[ 0,  1,  2,  3],
       [10, 11, 12, 13],
       [15, 16, 17, 18],
       [20, 21, 22, 23]])
Coordinates:
  * x        (x) int64 32B 0 1 2 3
  * y        (y) int64 32B 0 1 2 3
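
Since dim accepts several labels, naming every dimension explicitly should give the same result as passing the ellipsis. The sketch below checks this and also spells out a manual equivalent built from boolean masks on the underlying pandas indexes. Treating the method as equivalent to this mask-and-select step is an assumption made for illustration, not a statement about the library's internals.

```python
import numpy as np
import xarray as xr

da = xr.DataArray(
    np.arange(25).reshape(5, 5),
    dims=("x", "y"),
    coords={"x": np.array([0, 0, 1, 2, 3]), "y": np.array([0, 1, 2, 3, 3])},
)

# Listing all dimensions by name versus passing the Ellipsis.
by_list = da.drop_duplicates(dim=["x", "y"])
by_ellipsis = da.drop_duplicates(dim=...)
print(by_list.equals(by_ellipsis))

# Manual equivalent (assumed, for illustration): mark the positions to keep
# on each pandas index, then select them positionally with isel.
keep_x = ~da.get_index("x").duplicated(keep="first")
keep_y = ~da.get_index("y").duplicated(keep="first")
manual = da.isel(x=keep_x, y=keep_y)
print(manual.equals(by_ellipsis))
```

The manual form can be handy when the mask itself is needed, for example to apply the same selection to another array that shares the dimension.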