How to subset a raster stack by year with a loop in R?

Question 1

So I have a folder with loads of .tif files showing NDVI. What I want to do is read them into a rasterStack, then subset that stack by year. I've been able to do this manually, but I'd really like to have a script that could work for any years the data might cover.

Luckily, the .tif files all have the date in their file names, in this format 'NDVI-2017年07月09日.tif'.

My manual code was this:

# Import all rasters from one folder as a rasterStack
all_rasters <- list.files(file.path("N:/R/NDVI/Rasters"),
 pattern="tif$", full.names=T)
all_stack <- stack(all_rasters, quick=T)
# Now extract rasters by year from stack 
NDVI_stack_2015 <- raster::subset(all_stack, grep(2015, names(all_stack), value=T))
NDVI_stack_2016 <- raster::subset(all_stack, grep(2016, names(all_stack), value=T))
NDVI_stack_2017 <- raster::subset(all_stack, grep(2017, names(all_stack), value=T))
NDVI_stack_2018 <- raster::subset(all_stack, grep(2018, names(all_stack), value=T))
NDVI_stack_2019 <- raster::subset(all_stack, grep(2019, names(all_stack), value=T))

So far I've come up with this to extract the years and make names for the new stacks:

# Extract years from file names
NDVI_years <- unique(stringr::str_extract(all_rasters, "(?<=-).*(?=-..-..\\.tif)"))
# Use those years to create names for the new stacks
NDVI_stack_names <- paste("NDVI_stack", NDVI_years, sep="_")

What I'm struggling with now is creating a loop which subsets the raster stack (all_stack) by the extracted years (NDVI_years), and naming the new stacks with NDVI_stack_names.

Question 2

I'd like to show you an alternative approach using terra instead of raster since this a little bit more elegant from my point of view:

# list files available, I created 12 files beforehand
files <- list.files(pattern = "*.tif")
# using filenames following your pattern
head(files)
#> [1] "NDVI-2017年07月09日.tif"
#> [2] "NDVI-2017年08月09日.tif"
#> [3] "NDVI-2017年09月09日.tif"
#> [4] "NDVI-2017年10月09日.tif"
#> [5] "NDVI-2017年11月09日.tif"
#> [6] "NDVI-2017年12月09日.tif"
# create a SpatRast object using terra
stack <- terra::rast(files)
# note that nlyr = 12, so this is basically a raster stack
stack
#> class : SpatRaster 
#> dimensions : 900, 900, 12 (nrow, ncol, nlyr)
#> resolution : 0.001111111, 0.001111111 (x, y)
#> extent : 0, 1, 0, 1 (xmin, xmax, ymin, ymax)
#> coord. ref. : lon/lat WGS 84 
#> sources : NDVI-2017年07月09日.tif 
#> NDVI-2017年08月09日.tif 
#> NDVI-2017年09月09日.tif 
#> ... and 9 more source(s)
#> names : NDVI-~07-09, NDVI-~08-09, NDVI-~09-09, NDVI-~10-09, NDVI-~11-09, NDVI-~12-09, ... 
#> min values : -3840, -3840, -3840, -3840, -3840, -3840, ... 
#> max values : 3881, 3881, 3881, 3881, 3881, 3881, ...
# parse timestamps from names attribute and set the time attribute
terra::time(stack) <- names(stack) |> stringr::str_sub(start = 6, end = 15) |> strptime(format = "%Y-%m-%d", tz = "UTC")
# your stack now has time information ranging from 2017年07月09日 to 2018年06月09日
stack
#> class : SpatRaster 
#> dimensions : 900, 900, 12 (nrow, ncol, nlyr)
#> resolution : 0.001111111, 0.001111111 (x, y)
#> extent : 0, 1, 0, 1 (xmin, xmax, ymin, ymax)
#> coord. ref. : lon/lat WGS 84 
#> sources : NDVI-2017年07月09日.tif 
#> NDVI-2017年08月09日.tif 
#> NDVI-2017年09月09日.tif 
#> ... and 9 more source(s)
#> names : NDVI-~07-09, NDVI-~08-09, NDVI-~09-09, NDVI-~10-09, NDVI-~11-09, NDVI-~12-09, ... 
#> min values : -3840, -3840, -3840, -3840, -3840, -3840, ... 
#> max values : 3881, 3881, 3881, 3881, 3881, 3881, ... 
#> time : 2017年07月09日 to 2018年06月09日 UTC
# you can either make use of partial matching using `[]` querying layer names 
# holding timestamps in your case...
stack["2017"]
#> class : SpatRaster 
#> dimensions : 900, 900, 6 (nrow, ncol, nlyr)
#> resolution : 0.001111111, 0.001111111 (x, y)
#> extent : 0, 1, 0, 1 (xmin, xmax, ymin, ymax)
#> coord. ref. : lon/lat WGS 84 
#> sources : NDVI-2017年07月09日.tif 
#> NDVI-2017年08月09日.tif 
#> NDVI-2017年09月09日.tif 
#> ... and 3 more source(s)
#> names : NDVI-~07-09, NDVI-~08-09, NDVI-~09-09, NDVI-~10-09, NDVI-~11-09, NDVI-~12-09 
#> min values : -3840, -3840, -3840, -3840, -3840, -3840 
#> max values : 3881, 3881, 3881, 3881, 3881, 3881 
#> time : 2017年07月09日 to 2017年12月09日 UTC
# ... or construct an index vector as input based on your time attribute
stack[[time(stack) >= "2017年01月01日" & time(stack) < "2018年01月01日"]]
#> class : SpatRaster 
#> dimensions : 900, 900, 6 (nrow, ncol, nlyr)
#> resolution : 0.001111111, 0.001111111 (x, y)
#> extent : 0, 1, 0, 1 (xmin, xmax, ymin, ymax)
#> coord. ref. : lon/lat WGS 84 
#> sources : NDVI-2017年07月09日.tif 
#> NDVI-2017年08月09日.tif 
#> NDVI-2017年09月09日.tif 
#> ... and 3 more source(s)
#> names : NDVI-~07-09, NDVI-~08-09, NDVI-~09-09, NDVI-~10-09, NDVI-~11-09, NDVI-~12-09 
#> min values : -3840, -3840, -3840, -3840, -3840, -3840 
#> max values : 3881, 3881, 3881, 3881, 3881, 3881 
#> time : 2017年07月09日 to 2017年12月09日 UTC

Personally, I would leave it this way and not create a separate stack per year because querying is quite comfortable. If you wish to split your stack per year nevertheless, you could do something like this, even if this is maybe not the most elegant way to accomplish:

# get unique years from time attribute
years <- terra::time(stack) |> format("%Y") |> unique()
# loop over years and create new sub-stacks
for (i in 1:length(years)) {
 
 assign(paste0("NDVI_stack_", years[i]),
 
 stack[years[i]]) 
}

Question 3

Thanks so much! I was able to get both methods to work!

Question 4

Your script could look like this with "terra":

library(terra)
ff <- list.files(file.path(N:/R/NDVI/Rasters), pattern="tif$", full.names=T)
x <- rast(ff)
years <- str_extract(basename(ff), "(?<=-).*(?=-..-..\\.tif)")
am <- tapp(x, years, mean, na.rm=TRUE)
names(am) <- paste0("NDVI_", names(am))
NDVI_all <- mean(am, na.rm=TRUE)
names(NDVI_all) <- "NDVI_Overall"
x <- c(NDVI_all, am)

Question 5

I was able to get both of Falk-env's solutions to work, but ended up using a different method which I thought I'd post here as well.

My next step was to calculate the mean for each year, and since I'm hopeless with loops and ran into some confusion due to terra being new to me, I ended up using raster::stackApply, as below:

# First to import the rasters and stack them
all_rasters <- list.files(file.path(N:/R/NDVI/Rasters),
 pattern="tif$", full.names=T)
all_stack <- stack(all_rasters)
# Then extract the years from the file names and rename the layers with them
Years <- str_extract(all_rasters, "(?<=-).*(?=-..-..\\.tif)")
names(all_stack) <- Years
# Now stackApply is used to calculate the mean value for each year in the raster stack 
# (using Years as the indices), then the layers in the resulting stack are renamed
Annual_Mean_NDVI <- stackApply(Raster_stack, indices=Years, fun=mean, na.rm=T)
Years_unique <- unique(Years)
names(Annual_Mean_NDVI) <- paste("Mean_NDVI", Years_unique, sep="_")
# Finally the overall mean is calculated and stacked with the annual means
Mean_NDVI_all <- calc(Annual_Mean_NDVI, fun=mean, na.rm=T)
Mean_NDVI_Stack <- stack(Mean_NDVI_all, Annual_Mean_NDVI)
names(Mean_NDVI_Stack)[1] <- "Mean_NDVI_Overall"

As an aside, I also looked into the rts package (Raster Time Series), which would have been a good solution had I been able to figure out how to deal with NAs and empty outputs when calculating the means.

Question 6

Since rgdal will retire by the end of 2023, I took the opportunity to switch from raster to its successor terra. Can totally recommend to invest some time, but I'm still becoming acquainted myself. rts seems to be a combination of terra and xts - also a great package by the way. At the moment I cannot estimate the unique selling point yet since there already are several packages for raster data like e.g. terra, stars and gdalcubes supporting time attributes.

Question 7

Ah, I didn't realise raster was being retired. Now that I've got a fully functional script I'll have a crack at adjusting it to use terra instead - especially since the script will hopefully be in use for a fair few years!

dimfalk dimfalk 1,3411 gold badge8 silver badges18 bronze badges · Accepted Answer · 2022-08-02 18:50:17Z

I'd like to show you an alternative approach using terra instead of raster since this a little bit more elegant from my point of view:

# list files available, I created 12 files beforehand
files <- list.files(pattern = "*.tif")
# using filenames following your pattern
head(files)
#> [1] "NDVI-2017年07月09日.tif"
#> [2] "NDVI-2017年08月09日.tif"
#> [3] "NDVI-2017年09月09日.tif"
#> [4] "NDVI-2017年10月09日.tif"
#> [5] "NDVI-2017年11月09日.tif"
#> [6] "NDVI-2017年12月09日.tif"
# create a SpatRast object using terra
stack <- terra::rast(files)
# note that nlyr = 12, so this is basically a raster stack
stack
#> class : SpatRaster 
#> dimensions : 900, 900, 12 (nrow, ncol, nlyr)
#> resolution : 0.001111111, 0.001111111 (x, y)
#> extent : 0, 1, 0, 1 (xmin, xmax, ymin, ymax)
#> coord. ref. : lon/lat WGS 84 
#> sources : NDVI-2017年07月09日.tif 
#> NDVI-2017年08月09日.tif 
#> NDVI-2017年09月09日.tif 
#> ... and 9 more source(s)
#> names : NDVI-~07-09, NDVI-~08-09, NDVI-~09-09, NDVI-~10-09, NDVI-~11-09, NDVI-~12-09, ... 
#> min values : -3840, -3840, -3840, -3840, -3840, -3840, ... 
#> max values : 3881, 3881, 3881, 3881, 3881, 3881, ...
# parse timestamps from names attribute and set the time attribute
terra::time(stack) <- names(stack) |> stringr::str_sub(start = 6, end = 15) |> strptime(format = "%Y-%m-%d", tz = "UTC")
# your stack now has time information ranging from 2017年07月09日 to 2018年06月09日
stack
#> class : SpatRaster 
#> dimensions : 900, 900, 12 (nrow, ncol, nlyr)
#> resolution : 0.001111111, 0.001111111 (x, y)
#> extent : 0, 1, 0, 1 (xmin, xmax, ymin, ymax)
#> coord. ref. : lon/lat WGS 84 
#> sources : NDVI-2017年07月09日.tif 
#> NDVI-2017年08月09日.tif 
#> NDVI-2017年09月09日.tif 
#> ... and 9 more source(s)
#> names : NDVI-~07-09, NDVI-~08-09, NDVI-~09-09, NDVI-~10-09, NDVI-~11-09, NDVI-~12-09, ... 
#> min values : -3840, -3840, -3840, -3840, -3840, -3840, ... 
#> max values : 3881, 3881, 3881, 3881, 3881, 3881, ... 
#> time : 2017年07月09日 to 2018年06月09日 UTC
# you can either make use of partial matching using `[]` querying layer names 
# holding timestamps in your case...
stack["2017"]
#> class : SpatRaster 
#> dimensions : 900, 900, 6 (nrow, ncol, nlyr)
#> resolution : 0.001111111, 0.001111111 (x, y)
#> extent : 0, 1, 0, 1 (xmin, xmax, ymin, ymax)
#> coord. ref. : lon/lat WGS 84 
#> sources : NDVI-2017年07月09日.tif 
#> NDVI-2017年08月09日.tif 
#> NDVI-2017年09月09日.tif 
#> ... and 3 more source(s)
#> names : NDVI-~07-09, NDVI-~08-09, NDVI-~09-09, NDVI-~10-09, NDVI-~11-09, NDVI-~12-09 
#> min values : -3840, -3840, -3840, -3840, -3840, -3840 
#> max values : 3881, 3881, 3881, 3881, 3881, 3881 
#> time : 2017年07月09日 to 2017年12月09日 UTC
# ... or construct an index vector as input based on your time attribute
stack[[time(stack) >= "2017年01月01日" & time(stack) < "2018年01月01日"]]
#> class : SpatRaster 
#> dimensions : 900, 900, 6 (nrow, ncol, nlyr)
#> resolution : 0.001111111, 0.001111111 (x, y)
#> extent : 0, 1, 0, 1 (xmin, xmax, ymin, ymax)
#> coord. ref. : lon/lat WGS 84 
#> sources : NDVI-2017年07月09日.tif 
#> NDVI-2017年08月09日.tif 
#> NDVI-2017年09月09日.tif 
#> ... and 3 more source(s)
#> names : NDVI-~07-09, NDVI-~08-09, NDVI-~09-09, NDVI-~10-09, NDVI-~11-09, NDVI-~12-09 
#> min values : -3840, -3840, -3840, -3840, -3840, -3840 
#> max values : 3881, 3881, 3881, 3881, 3881, 3881 
#> time : 2017年07月09日 to 2017年12月09日 UTC

Personally, I would leave it this way and not create a separate stack per year because querying is quite comfortable. If you wish to split your stack per year nevertheless, you could do something like this, even if this is maybe not the most elegant way to accomplish:

# get unique years from time attribute
years <- terra::time(stack) |> format("%Y") |> unique()
# loop over years and create new sub-stacks
for (i in 1:length(years)) {
 
 assign(paste0("NDVI_stack_", years[i]),
 
 stack[years[i]]) 
}

Thanks so much! I was able to get both methods to work!

Xandian97
– Xandian97

2022年08月03日 13:58:03 +00:00
Commented Aug 3, 2022 at 13:58

Stack Exchange Network

How to subset a raster stack by year with a loop in R?

3 Answers 3

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

How to subset a raster stack by year with a loop in R?

3 Answers 3

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions