I've downloaded a netcdf from the Climate Data Store and would like to write a CRS to it, so I can clip it for a shapefile. However, I get an error when assigning a CRS. Below my script and what is being printed. I receive this error after trying to write a crs:
MissingSpatialDimensionError: y dimension (lat) not found. Data variable: lon_bnds
# load netcdf with xarray
dset = xr.open_dataset(netcdf_fn)
print(dset)
# add projection system to nc
dset = dset.rio.write_crs("EPSG:4326", inplace=True)
# mask CMIP6 data with shapefile
dset_shp = dset.rio.clip(shp.geometry.apply(mapping), shp.crs)
dset
Out[44]:
<xarray.Dataset>
Dimensions: (time: 1825, bnds: 2, lat: 2, lon: 1)
Coordinates:
* time (time) object 2021年01月01日 12:00:00 ... 2025年12月31日 12:00:00
* lat (lat) float64 0.4712 1.414
* lon (lon) float64 31.25
spatial_ref int32 0
Dimensions without coordinates: bnds
Data variables:
time_bnds (time, bnds) object ...
lat_bnds (lat, bnds) float64 0.0 0.9424 0.9424 1.885
lon_bnds (lon, bnds) float64 ...
pr (time, lat, lon) float32 ...
Attributes: (12/48)
Conventions: CF-1.7 CMIP-6.2
activity_id: ScenarioMIP
branch_method: standard
branch_time_in_child: 60225.0
branch_time_in_parent: 60225.0
comment: none
...
title: CMCC-ESM2 output prepared for CMIP6
variable_id: pr
variant_label: r1i1p1f1
license: CMIP6 model data produced by CMCC is licensed und...
cmor_version: 3.6.0
tracking_id: hdl:21.14100/0c6732f7-2cdd-4296-99a0-7952b7ca911e
2 Answers 2
You are attempting to clip the whole dataset, but only the pr
data var can be clipped. This is because it is the only one with both x and y dimensions. This should work:
dset_shp = dset.pr.rio.clip(shp.geometry, shp.crs)
-
Thanks, but then I get the following error: NoDataInBounds: No data found in bounds. Data variable: pr.CrossLord– CrossLord2022年05月16日 07:59:51 +00:00Commented May 16, 2022 at 7:59
-
That means that there was no data when you performed the clip operation.snowman2– snowman22022年05月16日 12:07:36 +00:00Commented May 16, 2022 at 12:07
-
There seems to be data though.. When I print dset.pr this is what I get: array([[[7.368938e-07], ... [1.653116e-07]]], dtype=float32) Coordinates: * time (time) object 2021年01月01日 12:00:00 ... 2025年12月31日 12:00:00 * lat (lat) float64 0.4712 1.414 * lon (lon) float64 31.25CrossLord– CrossLord2022年05月16日 15:00:10 +00:00Commented May 16, 2022 at 15:00
-
I thought it might had something to do with the size of the polygon compared to the pixel, since the pixels are much larger than the polygon. However, adding all_touched = True does not help. See: gis.stackexchange.com/questions/390777/…CrossLord– CrossLord2022年05月16日 15:03:43 +00:00Commented May 16, 2022 at 15:03
-
Might be that the data in the cell is filled with the no data value.snowman2– snowman22022年05月16日 20:15:49 +00:00Commented May 16, 2022 at 20:15
You may need to access the shp file values and specify the drop
and invert
kwargs:
dset_shp = dset.rio.clip(shp.geometry.values, shp.crs, drop=False, invert=False)
Following examples in rioxarray documentation.