I’m using the "Make NetCDF Table View" tool in ArcMap to extract time series data from a NetCDF file. I’ve tested this tool with other .nc files and it works as expected. However, with one specific MSWEP daily precipitation NetCDF file, I get identical time series data regardless of the location selected.
In the attached figure, you can see that the precipitation variable varies with location in the .nc file. The variables in the NetCDF file include: precipitation, time, lon, and lat. The coordinate system for both the data points and the NetCDF file is GCS_WGS_1984.
Why does the time series data appear to be the same for all locations, even though the .nc file shows different precipitation values at different locations?
ArcGIS 10.5.1 is very nearly ancient, released seven years ago and retired 19 months ago. Using recent software with modern data is a more likely combination. – Vince, Jul 22, 2024 at 16:30
1 Answer
Found the problem: the .nc file is an h5netcdf (HDF5-based) file. Instead of ArcGIS, I'm using the following Python code:
import os
from datetime import datetime, timedelta

import h5netcdf
import numpy as np
import pandas as pd

# Define the path to the MSWEP NetCDF file
file_path = r"C:\Users\mcva\OneDrive - FCT NOVA\Artigos\Paper Angola Precipitação\Dados MSWEP_V280_past_nogauge\MSWEP_2020.nc"

# Dictionary of station names with their corresponding latitude and longitude values
stations = {
    "Point1": {"lat": -6.68333, "lon": 14.13333},
    "Point2": {"lat": -8.513, "lon": 14.59},
    # Add more stations as needed
}

# Output folder path
output_folder = r"C:\Users\mcva\OneDrive - FCT NOVA\Artigos\Paper Angola Precipitação\Dados MSWEP_V280_past_nogauge\output_python"

# Check if the output folder exists, if not, create it
if not os.path.exists(output_folder):
    os.makedirs(output_folder)

# Open the NetCDF file
try:
    with h5netcdf.File(file_path, 'r') as f:
        # Retrieve dimensions and variables
        time = f.variables['time'][:].astype(np.float64)  # Ensure time is float64
        latitudes = f.variables['lat'][:]
        longitudes = f.variables['lon'][:]
        precipitation = f.variables['precipitation'][:]

        # Convert time to datetime
        time_units = f.variables['time'].attrs['units']
        time_base = datetime.strptime(time_units.split(' since ')[1], "%Y-%m-%d %H:%M:%S")
        time_dates = [time_base + timedelta(days=float(t)) for t in time]  # Ensure days is float

        # Create a DataFrame for each station
        for station_name, coords in stations.items():
            lat = coords["lat"]
            lon = coords["lon"]

            # Find the nearest grid point in the NetCDF file
            lat_idx = np.argmin(np.abs(latitudes - lat))
            lon_idx = np.argmin(np.abs(longitudes - lon))

            # Extract time series data for the station
            time_series = precipitation[:, lat_idx, lon_idx]

            # Create a DataFrame for the station's time series
            df = pd.DataFrame({
                'time': time_dates,
                'precipitation': time_series
            })

            # Define the output .csv file path
            output_csv = os.path.join(output_folder, f"{station_name}.csv")

            # Save the DataFrame as a .csv file
            df.to_csv(output_csv, index=False)

    print("NetCDF data extraction completed.")
except Exception as e:
    print(f"Error processing file with h5netcdf: {e}")
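If you want to check whether another .nc file is also HDF5-based before deciding which tool to use, a quick look at the first bytes of the file is usually enough. This is only a minimal sketch: the function name `detect_netcdf_flavor` and the returned labels are my own (hypothetical) choices, and it assumes the HDF5 signature sits at the very start of the file, which will not hold for files that carry an HDF5 user block.

```python
# Leading bytes that identify the on-disk format.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"  # 8-byte signature of an HDF5 file
CLASSIC_VERSIONS = {1: "classic", 2: "64-bit offset", 5: "CDF-5"}


def detect_netcdf_flavor(path):
    """Guess the format of a .nc file from its first bytes (hypothetical helper).

    Assumption: the signature is at offset 0. HDF5 files with a user block
    store it further into the file and will be reported as "unknown".
    """
    with open(path, "rb") as fh:
        header = fh.read(8)

    if header == HDF5_SIGNATURE:
        return "NetCDF-4 / HDF5"
    if len(header) >= 4 and header[:3] == b"CDF":
        # The fourth byte of a classic NetCDF file is its version number.
        return "NetCDF-3 " + CLASSIC_VERSIONS.get(header[3], "unknown version")
    return "unknown"
```

Calling `detect_netcdf_flavor(file_path)` on the MSWEP file should return "NetCDF-4 / HDF5" if the diagnosis above is right. The check only identifies the container format; it says nothing about whether a given ArcGIS version can read the file correctly.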