
I have a NetCDF dataset (link to file) from Copernicus Climate Data containing ~43k unevenly spaced values around the world. Instead of being indexed by (lat, lon), it is indexed by a sequence of 'stations'.
My understanding, based on the xarray docs, is that xarray does not treat station_x_coordinate and station_y_coordinate as dimension coordinates.

I have a list of my own locations for which I want to get the closest value (i.e., my locations do not necessarily match one of the data points in the NetCDF database).

I would like to use xarray's many selection and interpolation methods (e.g. xarray.Dataset.sel) to get values at locations that do not match a data point. But I run into errors or problems which are, I guess, related to the fact that these are ordinary variables rather than real dimension coordinates.
For example, I can use xarray.Dataset.sel(stations = 11.5, method='nearest'), which returns the closest value (in this case, station = 12), but this is very unintuitive, as I have no idea how the stations are ordered in space. I would rather search my target locations using the more sensible (lat, lon) indexing. To be clear, I would like to call something like xarray.Dataset.sel(x = 0, y = 0, method='nearest') to find the closest value to the location (lat = 0, lon = 0).

So far I have been able to get all the data points of the NetCDF file into a regular pandas dataframe (columns = ['lat', 'lon', 'value']) and save it as CSV. I could write my own function to find the nearest neighbors in 2D space for a given target location (this surely exists in another library, maybe even numpy or GeoPandas)... but as I said, I would like to use the available xarray methods to get values at 'unmatched' locations, interpolate, etc.

What would you do? For example, can I generate a new NetCDF file with a structure that sets station_x_coordinate and station_y_coordinate of my current NetCDF file as real dimension coordinates in the new file? I am no expert in netCDF files, but I assume this way I could use the methods mentioned above. Makes sense?


This is the structure of the NetCDF file (output of xarray.Dataset):

<xarray.Dataset>
Dimensions:                  (stations: 43119)
Coordinates:
  * stations                 (stations) uint16 0 1 2 3 ... 43731 43732 43733
    station_x_coordinate     (stations) float64 ...
    station_y_coordinate     (stations) float64 ...
Data variables:
    return_mean_surge_level  (stations) float64 ...
Attributes: (12/34)
    Conventions:                   CF-1.6
    featureType:                   timeSeries
    id:                            GTSMv3_extreme_value_analysis
    naming_authority:              https://deltares.nl/en
    Metadata_Conventions:          Unidata Dataset Discovery v1.0
    title:                         relative change in return values for surge...
    ...                            ...
    geospatial_vertical_max:       18.564
    geospatial_vertical_units:     m
    geospatial_vertical_positive:  up
    time_coverage_start:           1985
    time_coverage_end:             2050
    experiment:                    highres-future
Vince
asked Oct 10, 2023 at 10:42

1 Answer

You could do something like the following. First, create some test data:

import numpy as np
import xarray as xr

# Three test stations, each with a label, a position, and one data value.
ds = xr.Dataset(
    {"data": (["stations"], [1.2, 23.7, 77.8])},
    coords={"stations": ("stations", ["la", "lo", "li"]),
            "lat": ("stations", [45.6, 34.1, 78.2]),
            "lon": ("stations", [-49.1, 2.1, 179.1])},
)
lat, lon = 36.3, 3.8  # point of interest

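Then calculate the distance from each station to your point of interest and get the label of the station where it is smallest (idxmin returns the coordinate label, not the position). This is a plain Euclidean distance in degrees, so it is only an approximation that ignores the longitude wrap-around at ±180° and the shrinking of longitude degrees towards the poles: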

selector = np.sqrt((ds["lat"] - lat)**2 + (ds["lon"] - lon)**2).idxmin()

Then use that label to select from your dataset:

ds.sel(stations = selector)["data"]

Similarly, you could find the integer position of the nearest station with argmin and use it with isel to index your dataset:

selector = np.sqrt((ds["lat"] - lat)**2 + (ds["lon"] - lon)**2).argmin()
ds.isel(stations = selector)["data"]
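
Since the question mentions a list of target locations, the same idea can be extended to several points at once. The following is only a sketch under some assumptions: the `targets` dataset, the `points` dimension name, and the `haversine_km` helper are hypothetical names introduced here, and the mean Earth radius of 6371 km is an approximation. It swaps the degree-space distance for a great-circle (haversine) distance, which handles the ±180° wrap-around. For the real file you would presumably pass station_y_coordinate as latitude and station_x_coordinate as longitude, but check the variable attributes first.

```python
import numpy as np
import xarray as xr

# Toy dataset mirroring the test data above (values are illustrative only).
ds = xr.Dataset(
    {"data": (["stations"], [1.2, 23.7, 77.8])},
    coords={
        "stations": ("stations", ["la", "lo", "li"]),
        "lat": ("stations", [45.6, 34.1, 78.2]),
        "lon": ("stations", [-49.1, 2.1, 179.1]),
    },
)

# Hypothetical target locations, one entry per point of interest.
targets = xr.Dataset(
    {
        "lat": ("points", [36.3, 80.0]),
        "lon": ("points", [3.8, -179.5]),
    }
)


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km; inputs in degrees, broadcast by xarray."""
    lat1, lon1, lat2, lon2 = (np.deg2rad(v) for v in (lat1, lon1, lat2, lon2))
    a = (
        np.sin((lat2 - lat1) / 2) ** 2
        + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * radius_km * np.arcsin(np.sqrt(a))


# Broadcasting "stations" against "points" gives a 2-D distance table.
dist = haversine_km(ds["lat"], ds["lon"], targets["lat"], targets["lon"])

# Integer position of the closest station for each target point.
nearest = dist.argmin(dim="stations")

# Vectorized positional indexing: the result is indexed by "points".
result = ds.isel(stations=nearest)
print(result["stations"].values, result["data"].values)
```

With this toy data the first target resolves to station "lo" and the second to "li", which lies just across the antimeridian. The degree-space distance would have picked "la" for the second target, because it sees the longitudes 179.1 and -179.5 as almost 360° apart.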
answered Nov 14, 2023 at 8:08
