So I have a folder with loads of .tif files showing NDVI. What I want to do is read them into a rasterStack, then subset that stack by year. I've been able to do this manually, but I'd really like to have a script that could work for any years the data might cover.
Luckily, the .tif files all have the date in their file names, in this format 'NDVI-2017年07月09日.tif'.
My manual code was this:
# Import all rasters from one folder as a rasterStack
all_rasters <- list.files(file.path("N:/R/NDVI/Rasters"),
pattern="tif$", full.names=T)
all_stack <- stack(all_rasters, quick=T)
# Now extract rasters by year from stack
NDVI_stack_2015 <- raster::subset(all_stack, grep(2015, names(all_stack), value=T))
NDVI_stack_2016 <- raster::subset(all_stack, grep(2016, names(all_stack), value=T))
NDVI_stack_2017 <- raster::subset(all_stack, grep(2017, names(all_stack), value=T))
NDVI_stack_2018 <- raster::subset(all_stack, grep(2018, names(all_stack), value=T))
NDVI_stack_2019 <- raster::subset(all_stack, grep(2019, names(all_stack), value=T))
So far I've come up with this to extract the years and make names for the new stacks:
# Extract years from file names
NDVI_years <- unique(stringr::str_extract(all_rasters, "(?<=-).*(?=-..-..\\.tif)"))
# Use those years to create names for the new stacks
NDVI_stack_names <- paste("NDVI_stack", NDVI_years, sep="_")
What I'm struggling with now is creating a loop which subsets the raster stack (all_stack) by the extracted years (NDVI_years), and naming the new stacks with NDVI_stack_names.
3 Answers 3
I'd like to show you an alternative approach using terra
instead of raster
since this a little bit more elegant from my point of view:
# list files available, I created 12 files beforehand
files <- list.files(pattern = "*.tif")
# using filenames following your pattern
head(files)
#> [1] "NDVI-2017年07月09日.tif"
#> [2] "NDVI-2017年08月09日.tif"
#> [3] "NDVI-2017年09月09日.tif"
#> [4] "NDVI-2017年10月09日.tif"
#> [5] "NDVI-2017年11月09日.tif"
#> [6] "NDVI-2017年12月09日.tif"
# create a SpatRast object using terra
stack <- terra::rast(files)
# note that nlyr = 12, so this is basically a raster stack
stack
#> class : SpatRaster
#> dimensions : 900, 900, 12 (nrow, ncol, nlyr)
#> resolution : 0.001111111, 0.001111111 (x, y)
#> extent : 0, 1, 0, 1 (xmin, xmax, ymin, ymax)
#> coord. ref. : lon/lat WGS 84
#> sources : NDVI-2017年07月09日.tif
#> NDVI-2017年08月09日.tif
#> NDVI-2017年09月09日.tif
#> ... and 9 more source(s)
#> names : NDVI-~07-09, NDVI-~08-09, NDVI-~09-09, NDVI-~10-09, NDVI-~11-09, NDVI-~12-09, ...
#> min values : -3840, -3840, -3840, -3840, -3840, -3840, ...
#> max values : 3881, 3881, 3881, 3881, 3881, 3881, ...
# parse timestamps from names attribute and set the time attribute
terra::time(stack) <- names(stack) |> stringr::str_sub(start = 6, end = 15) |> strptime(format = "%Y-%m-%d", tz = "UTC")
# your stack now has time information ranging from 2017年07月09日 to 2018年06月09日
stack
#> class : SpatRaster
#> dimensions : 900, 900, 12 (nrow, ncol, nlyr)
#> resolution : 0.001111111, 0.001111111 (x, y)
#> extent : 0, 1, 0, 1 (xmin, xmax, ymin, ymax)
#> coord. ref. : lon/lat WGS 84
#> sources : NDVI-2017年07月09日.tif
#> NDVI-2017年08月09日.tif
#> NDVI-2017年09月09日.tif
#> ... and 9 more source(s)
#> names : NDVI-~07-09, NDVI-~08-09, NDVI-~09-09, NDVI-~10-09, NDVI-~11-09, NDVI-~12-09, ...
#> min values : -3840, -3840, -3840, -3840, -3840, -3840, ...
#> max values : 3881, 3881, 3881, 3881, 3881, 3881, ...
#> time : 2017年07月09日 to 2018年06月09日 UTC
# you can either make use of partial matching using `[]` querying layer names
# holding timestamps in your case...
stack["2017"]
#> class : SpatRaster
#> dimensions : 900, 900, 6 (nrow, ncol, nlyr)
#> resolution : 0.001111111, 0.001111111 (x, y)
#> extent : 0, 1, 0, 1 (xmin, xmax, ymin, ymax)
#> coord. ref. : lon/lat WGS 84
#> sources : NDVI-2017年07月09日.tif
#> NDVI-2017年08月09日.tif
#> NDVI-2017年09月09日.tif
#> ... and 3 more source(s)
#> names : NDVI-~07-09, NDVI-~08-09, NDVI-~09-09, NDVI-~10-09, NDVI-~11-09, NDVI-~12-09
#> min values : -3840, -3840, -3840, -3840, -3840, -3840
#> max values : 3881, 3881, 3881, 3881, 3881, 3881
#> time : 2017年07月09日 to 2017年12月09日 UTC
# ... or construct an index vector as input based on your time attribute
stack[[time(stack) >= "2017年01月01日" & time(stack) < "2018年01月01日"]]
#> class : SpatRaster
#> dimensions : 900, 900, 6 (nrow, ncol, nlyr)
#> resolution : 0.001111111, 0.001111111 (x, y)
#> extent : 0, 1, 0, 1 (xmin, xmax, ymin, ymax)
#> coord. ref. : lon/lat WGS 84
#> sources : NDVI-2017年07月09日.tif
#> NDVI-2017年08月09日.tif
#> NDVI-2017年09月09日.tif
#> ... and 3 more source(s)
#> names : NDVI-~07-09, NDVI-~08-09, NDVI-~09-09, NDVI-~10-09, NDVI-~11-09, NDVI-~12-09
#> min values : -3840, -3840, -3840, -3840, -3840, -3840
#> max values : 3881, 3881, 3881, 3881, 3881, 3881
#> time : 2017年07月09日 to 2017年12月09日 UTC
Personally, I would leave it this way and not create a separate stack per year because querying is quite comfortable. If you wish to split your stack per year nevertheless, you could do something like this, even if this is maybe not the most elegant way to accomplish:
# get unique years from time attribute
years <- terra::time(stack) |> format("%Y") |> unique()
# loop over years and create new sub-stacks
for (i in 1:length(years)) {
assign(paste0("NDVI_stack_", years[i]),
stack[years[i]])
}
-
Thanks so much! I was able to get both methods to work!Xandian97– Xandian972022年08月03日 13:58:03 +00:00Commented Aug 3, 2022 at 13:58
Your script could look like this with "terra":
library(terra)
ff <- list.files(file.path(N:/R/NDVI/Rasters), pattern="tif$", full.names=T)
x <- rast(ff)
years <- str_extract(basename(ff), "(?<=-).*(?=-..-..\\.tif)")
am <- tapp(x, years, mean, na.rm=TRUE)
names(am) <- paste0("NDVI_", names(am))
NDVI_all <- mean(am, na.rm=TRUE)
names(NDVI_all) <- "NDVI_Overall"
x <- c(NDVI_all, am)
I was able to get both of Falk-env's solutions to work, but ended up using a different method which I thought I'd post here as well.
My next step was to calculate the mean for each year, and since I'm hopeless with loops and ran into some confusion due to terra being new to me, I ended up using raster::stackApply
, as below:
# First to import the rasters and stack them
all_rasters <- list.files(file.path(N:/R/NDVI/Rasters),
pattern="tif$", full.names=T)
all_stack <- stack(all_rasters)
# Then extract the years from the file names and rename the layers with them
Years <- str_extract(all_rasters, "(?<=-).*(?=-..-..\\.tif)")
names(all_stack) <- Years
# Now stackApply is used to calculate the mean value for each year in the raster stack
# (using Years as the indices), then the layers in the resulting stack are renamed
Annual_Mean_NDVI <- stackApply(Raster_stack, indices=Years, fun=mean, na.rm=T)
Years_unique <- unique(Years)
names(Annual_Mean_NDVI) <- paste("Mean_NDVI", Years_unique, sep="_")
# Finally the overall mean is calculated and stacked with the annual means
Mean_NDVI_all <- calc(Annual_Mean_NDVI, fun=mean, na.rm=T)
Mean_NDVI_Stack <- stack(Mean_NDVI_all, Annual_Mean_NDVI)
names(Mean_NDVI_Stack)[1] <- "Mean_NDVI_Overall"
As an aside, I also looked into the rts
package (Raster Time Series), which would have been a good solution had I been able to figure out how to deal with NAs and empty outputs when calculating the means.
-
2Since rgdal will retire by the end of 2023, I took the opportunity to switch from
raster
to its successorterra
. Can totally recommend to invest some time, but I'm still becoming acquainted myself.rts
seems to be a combination ofterra
andxts
- also a great package by the way. At the moment I cannot estimate the unique selling point yet since there already are several packages for raster data like e.g.terra
,stars
andgdalcubes
supporting time attributes.dimfalk– dimfalk2022年08月04日 02:16:39 +00:00Commented Aug 4, 2022 at 2:16 -
1Ah, I didn't realise
raster
was being retired. Now that I've got a fully functional script I'll have a crack at adjusting it to useterra
instead - especially since the script will hopefully be in use for a fair few years!Xandian97– Xandian972022年08月04日 14:39:47 +00:00Commented Aug 4, 2022 at 14:39