I'm relatively new to NetCDF and working with a CMAQ model output in NetCDF format. There is no explicit lat/long variable, rather each variable is stored with a ROW/COL. I assume the Global attributes contain what I need to convert to long/lat or Northings and Eastings but I have not found a way to do it with the netCDF4 package or Xarray.
The files are 20GB so I'll just include the start of the metadata
print (nc_f)
<class 'netCDF4._netCDF4.Dataset'>
root group (NETCDF3_CLASSIC data model, file format NETCDF3):
IOAPI_VERSION: $Id: @(#) ioapi library version 3.1 $
EXEC_ID: CCTM_v521.exe
FTYPE: 1
CDATE: 2018111
CTIME: 95359
WDATE: 2018111
WTIME: 95359
SDATE: 2013001
STIME: 10000
TSTEP: 10000
NTHIK: 1
NCOLS: 201
NROWS: 306
NLAYS: 1
NVARS: 112
GDTYP: 2
P_ALP: 30.0
P_BET: 60.0
P_GAM: -121.0
XCENT: -121.0
YCENT: 49.0
XORIG: 12000.0
YORIG: 24000.0
XCELL: 4000.0
YCELL: 4000.0
VGTYP: 7
VGTOP: 10000.0
VGLVLS: [1. 0.9975]
GDNAM: 04km
UPNAM: M3CPLE
VAR-LIST: NO2 NO O3 NO3 N2O5 HNO3 HONO H2O2 NTROH NTRALK ROOH ALD2 ALDX ISOPX IEPOX FORM FACD CO AACD MEPX MEOH PAN PACD PANX CRPX OPAN NTRM NTRI SO2 SULF CL2 HOCL FMCL HCL CLNO2 MAPAN NTRCN NTRCNOH NTRPX FORM_PRIMARY ALD2_PRIMARY HG HGIIGAS ASO4J ASO4I ANH4J ANH4I ANO3J ANO3I AALK1J AALK2J AXYL1J AXYL2J AXYL3J ATOL1J ATOL2J ATOL3J ABNZ1J ABNZ2J ABNZ3J APAH1J APAH2J APAH3J ATRP1J ATRP2J AISO1J AISO2J ASQTJ AORGCJ APOCJ APOCI APNCOMJ APNCOMI AECJ AECI AOTHRJ AFEJ AALJ ASIJ ATIJ ACAJ AMGJ AKJ AMNJ ACORS ASOIL ANAJ ACLJ ASEACAT ACLK ASO4K ANH4K ANO3K AISO3J AOLGAJ AOLGBJ NH3 SV_ALK1 SV_ALK2 SV_XYL1 SV_XYL2 SV_TOL1 SV_TOL2 SV_BNZ1 SV_BNZ2 SV_PAH1 SV_PAH2 SV_TRP1 SV_TRP2 SV_ISO1 SV_ISO2 SV_SQT
FILEDESC: hourly 1-layer cross-point RADM dry deposition data
HISTORY:
dimensions(sizes): TSTEP(744), DATE-TIME(2), LAY(1), VAR(112), ROW(306), COL(201)
variables(dimensions): int32 TFLAG(TSTEP,VAR,DATE-TIME), float32 NO2(TSTEP,LAY,ROW,COL)
It's a 4km square grid that appears to start at 121W and 49N, It's not parallel to latitudes and I assume that "VGLVLS: [1. 0.9975]" would contain the angle but I have no idea what python functions to use to convert.
Looking at other questions it seemed there was a built in function as below but there is of course no longitude_in key in my dataset.
# Read variables from ncdf
lon = nc.variables['longitude_in'][:]
lat = nc.variables['latitude_in'][:]
That's from this solution but it doesn't work on this dataset: Longitude/Latitude from NetCDF with (row,column)
Ideally I'd like to get LAT and LONG of the center of each square (although knowing how to get the four corners would be good too) each in a similar 2D array so I can output a .csv file with ROW, COL, LAT, LONG, NO2, SO2, O3, NO3.....after reducing the parameter count and summing the variable data over the month (I have the rest of it figured out I think).
-
I suspect the attributes P_ALP, P_BET, P_GAM describe a coordinate projection, but unless you can see something that says if its conical or cylindrical or transverse mercator you might be stuck. The XCENT, YCENT, XORIG, YORIG look like they describe how to convert row and column to coordinates in that coordinate system but getting them into lat-long needs the full projection info beyond the alpha, beta, gamma parameters because we don't know what they mean (and I'm guessing a bit anyway). Where's the source documentation? Can we see it?Spacedman– Spacedman2021年03月15日 13:53:32 +00:00Commented Mar 15, 2021 at 13:53
-
Unfortunately there is no documentation beyond the files themselves, these were created by a previous modeler in 2014 or so. I believe CMAQ uses a 6370000 spheroid radius and I'd like to get the output in a WGS84 datum but you're right, I don't see anything in the file that would tell me the original datum.RichardS– RichardS2021年03月15日 17:28:10 +00:00Commented Mar 15, 2021 at 17:28
-
I've been attempting to find more on this across the web; CMAS sites, IOAPI etc. It seems that P_ALP is the first true latitude, P_BET is the second, and P_GAM is the origin longitude. GDTYP is the projection and 2 specifically is LCC. The VG Stuff is for vertical layers, but this is surface only data so not sure why it's included. It also looks like these were extracted in 2018 (WDATE) which is later than I expected.RichardS– RichardS2021年03月17日 01:12:52 +00:00Commented Mar 17, 2021 at 1:12
1 Answer 1
I'm going to do this in R because that's easiest for me. But if it works then this should be translatable to Python, and its possible that all you need is the PROJ string that I'm going to construct...
From those data I'd construct a proj string that looks like this:
projstring = "+ellps=WGS84 +proj=lcc +units=m +x_0=12000 +y_0=24000 +lat_1=30 +lat_2=60 +lon_0=-121 +lat_0=49.0"
### earth shape GDTYP2 x and y false origin standard parallels projection centre
### https://proj-tmp.readthedocs.io/en/docs/operations/projections/lcc.html
individual parts are annotated in the second line and the descriptions of the parameters are in the link. Now I'm not sure if the false origins are correct and your grid coordinates start at (0,0), or if your grid starts at 12000,24000 and the false origins should be 0 and 0. Things should end up in more or less the same place but that's one thing that might not be right.
Also, the LCC projection requires two latitudes and a lat-long centre, but you've got two longitudes (P_GAM=-121.0 and XCENT=-121.0). They're the same value, but it might indicate that something else is going on. Not sure.
Anyway, now we have that string we can construct a grid of (NCOLSxNROWS) points with separation XCELL, YCELL (4000 x 4000) starting at (0,0):
grid = expand.grid(x = seq(0, by=XCELL, len=NCOLS),
y = seq(0, by=YCELL, len=NROWS))
> head(grid)
x y
1 0 0
2 4000 0
3 8000 0
Then make that an R spatial object with the coordinate system given by that proj string:
library(sf)
gridsf = st_as_sf(grid, coords=1:2, crs=projstring)
and transform to lat-long (epsg coordinate code 4326):
gridll = st_transform(gridsf, "epsg:4326")
That produces a set of points that form this grid over Canada:
If you know where this grid should be then this might be correct. Or it might be that the offsets are off and you should try it the other way round.
Also, this might be the grid centre points or the grid cell bottom-left coordinates.
Also also, my assumption of the WGS84 ellipse might be wrong. If you can find out that your data comes from a model that assumes a spherical earth of radius 6300000m then you can make the proj string say that with some other parameters.
Ideally you should be able to go back to the community that generated this data and get precise information about the coordinate system used and perhaps a ready-computed grid of lat-long point coordinates. But hey, back in the real world...
Anyway, that proj string should be usable in a Python package that does PROJ transformations, like pyproj
or GDAL, or PyQGIS, to transform from your grid to lat-long.