import os
# Data directory
homedir = os.path.join(os.path.expanduser('~'), "grass_foss4geu_2024")
# GRASS GIS database variables
grassbin = "grass"
grassdata = os.path.join(homedir, "grassdata")
project = "eu_laea"
mapset = "italy_LST_daily"
Part 1: Processing data in GRASS
In this notebook we’ll go through the processing of MODIS LST daily time series data to derive relevant predictor variables for modeling the distribution of Aedes albopictus in Northern Italy. Furthermore, we’ll show how to obtain and process occurrence data and background points.
Let’s first go through some temporal concepts within GRASS GIS…
The TGRASS framework
GRASS GIS was the first FOSS GIS that incorporated capabilities to manage, analyze, process and visualize spatio-temporal data, as well as the temporal relationships among time series.
- TGRASS is fully based on metadata and does not duplicate any dataset
- Snapshot approach, i.e., adds time stamps to maps
- A collection of time-stamped maps (snapshots) of the same variable is called a space-time dataset or STDS
- Maps in a STDS can have different spatial and temporal extents
- Space-time datasets can be composed of raster, raster 3D or vector maps, and so we call them:
- Space time raster datasets (STRDS)
- Space time 3D raster datasets (STR3DS)
- Space time vector datasets (STVDS)
Temporal tools
GRASS temporal tools are named and organized following the GRASS core naming scheme. This way, we have:
- t.*: general tools to handle STDS of all types
- t.rast.*: tools that deal with STRDS
- t.rast3d.*: tools that deal with STR3DS
- t.vect.*: tools that deal with STVDS
Other TGRASS notions
- Time can be defined as intervals (start and end time) or instances (only start time)
- Time can be absolute (e.g., 2017-04-06 22:39:49) or relative (e.g., 4 years, 90 days)
- Granularity is the greatest common divisor of the temporal extents (and possible gaps) of all maps in the space-time cube
- Topology refers to temporal relations between time intervals in a STDS.
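The granularity definition above can be illustrated with plain Python: it is the greatest common divisor of the temporal extents of the maps. The interval lengths below are made up for the example and do not come from any real STRDS.

```python
# Granularity: the greatest common divisor (GCD) of the temporal extents
# (and possible gaps) of all maps in the dataset.
from math import gcd
from functools import reduce

# Temporal extents (in days) of four hypothetical maps in an STRDS
extents_days = [7, 14, 21, 35]

granularity = reduce(gcd, extents_days)
print(f"{granularity} days")  # -> 7 days
```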
TGRASS framework and workflow
GRASS + Python
In this part of the studio we’ll work with GRASS and Python, so let’s first see/recall the very basics.
Python package grass.script
The grass.script or GRASS GIS Python Scripting Library provides functions for calling GRASS tools within Python scripts. The most commonly used functions include:
- run_command: used when the output of the tool is a raster or vector; no text output is expected
- read_command: used when the output of the tool is of text type
- parse_command: used with tools whose output can be converted to key=value pairs
- write_command: used with tools that expect text input, either in the form of a file or from standard input
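To make the parse_command case concrete: many GRASS tools print key=value lines (e.g., `g.region -g`), and parse_command returns them as a dict. Here is a minimal sketch of that parsing in plain Python, using invented sample output, so no GRASS session is needed:

```python
# parse_command wraps tools whose text output consists of key=value lines
# and returns them as a dict. The sample output below is hypothetical.
sample_output = """n=5442000
s=3915000
rows=1527
cols=840"""

# Split each line at the first '=' and build a dict of string values
parsed = dict(line.split("=", 1) for line in sample_output.splitlines())
print(parsed["rows"])  # -> 1527 (as a string)
```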
It also provides several wrapper functions for frequently used tools, for example:
- To get info from a raster, script.raster.raster_info() is used:
gs.raster_info('dsm')
- To get info of a vector, script.vector.vector_info() is used:
gs.vector_info('roads')
- To list the rasters in a project, script.core.list_grouped() is used:
gs.list_grouped(type=['raster'])
- To obtain the computational region, script.core.region() is used:
gs.region()
Python package grass.jupyter
The grass.jupyter library improves the integration of GRASS and Jupyter, and provides different classes to facilitate GRASS maps visualization:
- init: starts a GRASS session and sets up all necessary environment variables
- Map: 2D rendering
- Map3D: 3D rendering
- InteractiveMap: interactive visualization with folium
- SeriesMap: visualization of a series of raster or vector maps
- TimeSeriesMap: visualization of spatio-temporal data
Hands-on
So let’s start… We begin by setting variables, checking the GRASS installation, and initializing GRASS GIS.
# Check the GRASS GIS installation
import subprocess
print(subprocess.check_output([grassbin, "--config", "version"], text=True))
# Ask GRASS GIS where its Python packages are
import sys
sys.path.append(
    subprocess.check_output([grassbin, "--config", "python_path"], text=True).strip()
)
Now we are ready to start a GRASS GIS session
# Import the GRASS GIS packages we need
import grass.script as gs
import grass.jupyter as gj
# Start the GRASS GIS Session
session = gj.init(grassdata, project, mapset)
Explore data in the mapset
Let’s first explore what we have within the italy_LST_daily mapset and display vector and raster maps using different classes from the grass.jupyter library.
# List vector elements
gs.list_grouped(type="vector")['italy_LST_daily']
# Display vector map
it_map = gj.Map(width=500, use_region=True)
it_map.d_vect(map="italy_borders_0")
it_map.show()
# List raster elements
rast = gs.list_grouped(type="raster", pattern="lst*")['italy_LST_daily']
rast[0:10]
# Display raster map with interactive class
lst_map = gj.InteractiveMap(width=500, use_region=True, tiles="OpenStreetMap")
lst_map.add_raster("lst_2014.005_avg")
lst_map.add_layer_control(position="bottomright")
lst_map.show()
SDM workflow
In this part of the Studio we’ll be addressing the left part of the SDM workflow: occurrence and background data, and predictors.
Importing species records
We will use occurrence data already downloaded and cleaned. We need to import it into GRASS GIS first.
# Import mosquito records
gs.run_command("v.import",
               input=os.path.join(homedir, "aedes_albopictus.gpkg"),
               output="aedes_albopictus")
Let’s add the occurrence points over the previous interactive map
# Display raster map with interactive class
lst_map = gj.InteractiveMap(width=500, use_region=True, tiles="OpenStreetMap")
lst_map.add_raster("lst_2014.005_avg")
lst_map.add_vector("aedes_albopictus")
lst_map.add_layer_control(position="bottomright")
lst_map.show()
You can also get the mosquito occurrences (or any other species or taxa) directly from GBIF into GRASS by means of v.in.pygbif as follows:
# Set computational region
# region = gs.parse_command("g.region", raster="lst_2014.001_avg", flags="g")
# region
# Install extension (requires pygbif: pip install pygbif)
# gs.run_command("g.extension",
# extension="v.in.pygbif")
# Import data from GBIF
# gs.run_command("v.in.pygbif",
# output="aedes_albopictus",
# taxa="Aedes albopictus",
# date_from="2014-01-01",
# date_to="2018-12-31")
Creating random background points
The MaxEnt algorithm that we will use in the next part of this session requires not only the points of known occurrences, but also information on the rest of the available environment. These are not absences but background data: we do not actually know whether the species is there or not, but we need them to compare against the features of the places where the species does occur.
To avoid getting background points exactly where occurrences are, we’ll create buffers around them. Then, we need to ensure that background points are only over land within our computational region. In order to do that, we’ll create a mask over land and we’ll overlay the buffers with the mask. Can you guess what the output will be?
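Conceptually, the next few steps amount to sampling random points while rejecting any candidate that falls inside an occurrence buffer. The toy sketch below mimics that idea in plain Python; all coordinates, distances, and counts are invented for illustration, and this is not what v.buffer/v.overlay/v.random do internally.

```python
# Toy rejection sampling: keep only random candidates farther than a
# "buffer" distance from any occurrence point. Values are hypothetical.
import math
import random

occurrences = [(10.0, 45.0), (11.5, 45.5)]   # made-up occurrence points
buffer_dist = 0.02                            # made-up buffer radius (map units)
bbox = (9.0, 44.0, 13.0, 47.0)                # xmin, ymin, xmax, ymax

rng = random.Random(3749)                     # fixed seed for reproducibility
background = []
while len(background) < 100:
    x = rng.uniform(bbox[0], bbox[2])
    y = rng.uniform(bbox[1], bbox[3])
    # reject candidates inside any occurrence buffer
    if all(math.hypot(x - ox, y - oy) > buffer_dist for ox, oy in occurrences):
        background.append((x, y))

print(len(background))  # -> 100
```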
# Create buffer around Aedes albopictus records
gs.run_command("v.buffer",
               input="aedes_albopictus",
               output="aedes_buffer",
               distance=2000)
# Set computational region
region = gs.parse_command("g.region", raster="lst_2014.001_avg", flags="g")
region
# Create a vector mask to limit background points
expression = "MASK = if(lst_2014.001_avg, 1, null())"
gs.raster.mapcalc(exp=expression)

gs.run_command("r.to.vect",
               input="MASK",
               output="vect_mask",
               type="area")
# Subtract buffers from vector mask
gs.run_command("v.overlay",
               ainput="vect_mask",
               binput="aedes_buffer",
               operator="xor",
               output="mask_bg")
Let’s display the result
# Display raster map with interactive class
mask_map = gj.InteractiveMap(width=500, use_region=True, tiles="OpenStreetMap")
mask_map.add_vector("mask_bg")
mask_map.add_layer_control(position="bottomright")
mask_map.show()
Finally, let’s create the random background points…
# Generate random background points
gs.run_command("v.random",
               output="background_points",
               npoints=1000,
               restrict="mask_bg",
               seed=3749)
and display occurrence and background points together over an LST map.
# Display vector map
pb_map = gj.Map(width=500, use_region=True)
pb_map.d_rast(map="lst_2014.005_avg")
pb_map.d_vect(map="italy_borders_0", type="boundary")
pb_map.d_vect(map="background_points")
pb_map.d_vect(map="aedes_albopictus", icon="basic/diamond", fill_color="red", size=8)
pb_map.show()
Create daily LST time series
Now we’ll start processing the raster data to derive potentially relevant predictors to include in the model. Our data consists of a time series of daily LST averages. We’ll use the GRASS temporal framework. The first step is to create the time series object and register maps in it. See t.create and t.register for further details.
# Create time series
gs.run_command("t.create",
               type="strds",
               temporaltype="absolute",
               output="lst_daily",
               title="Average Daily LST",
               description="Average daily LST in degree C - 2014-2018")
# Check it is created
gs.run_command("t.list", type="strds")
# Get list of maps
map_list = gs.list_grouped(type="raster", pattern="lst_201*")['italy_LST_daily']
map_list[0:10]
# Register maps in strds
gs.run_command("t.register",
               input="lst_daily",
               maps=map_list,
               increment="1 days",
               start="2014-01-01",
               flags="i")
# Get info about the strds
gs.run_command("t.info", input="lst_daily")
Generate environmental variables from LST STRDS
Now that we have created the time series (STRDS), let’s start estimating relevant variables. We begin by calculating long-term aggregations, also called climatologies.
Long term monthly avg, min and max LST
Let’s see an example first; we’ll estimate the average of all maps whose start date falls within January.
# January average LST
gs.run_command("t.rast.series",
               input="lst_daily",
               method="average",
               where="strftime('%m', start_time)='01'",
               output="lst_average_jan")
# Get map info and check values
gs.raster_info("lst_average_jan")['min'], gs.raster_info("lst_average_jan")['max']
If we want to estimate climatologies for all months, let’s first get the list of maps that will be the input for t.rast.series; for that, we’ll test the condition with t.rast.list.
# Define list of months as required
months = ['{0:02d}'.format(m) for m in range(1, 13)]

for m in months:
    gs.run_command("t.rast.list",
                   input="lst_daily",
                   where=f"strftime('%m', start_time)='{m}'")
Now we add the methods, and we are ready to estimate climatologies for all months with three different methods.
# Now we estimate the climatologies for all months and methods
months = ['{0:02d}'.format(m) for m in range(1, 13)]
methods = ["average", "minimum", "maximum"]

for m in months:
    for me in methods:
        gs.run_command("t.rast.series",
                       input="lst_daily",
                       method=me,
                       where=f"strftime('%m', start_time)='{m}'",
                       output="lst_{}_{}".format(me, m))
# List newly created maps
map_list = gs.list_grouped(type="raster", pattern="*{average,minimum,maximum}*")['italy_LST_daily']
print(map_list)
# Remove lst_average_jan
gs.run_command("g.remove", type="raster", name="lst_average_jan", flags="f")
Bioclimatic variables
Perhaps you have heard of the Worldclim or CHELSA bioclimatic variables? These are 19 variables that represent potentially limiting conditions for species, derived from long-term averages of temperature and precipitation. As we do not have precipitation data in this exercise, we’ll only estimate the bioclimatic variables that involve temperature. See the r.bioclim manual for further details. Note that we’ll use the climatologies estimated in the previous step.
# Install extension
gs.run_command("g.extension",
               extension="r.bioclim")
# Get lists of maps needed
tmin = gs.list_grouped(type="raster", pattern="lst_minimum_??")['italy_LST_daily']
tmax = gs.list_grouped(type="raster", pattern="lst_maximum_??")['italy_LST_daily']
tavg = gs.list_grouped(type="raster", pattern="lst_average_??")['italy_LST_daily']

print(tmin, tmax, tavg)
# Estimate temperature related bioclimatic variables
gs.run_command("r.bioclim",
               tmin=tmin,
               tmax=tmax,
               tavg=tavg,
               output="worldclim_")
# List output maps
gs.list_grouped(type="raster", pattern="worldclim*")['italy_LST_daily']
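As a sanity check on what r.bioclim computes, two of the temperature-based variables can be written out by hand: bio01 is the annual mean temperature and bio02 the mean diurnal range. The monthly values below are invented for a single hypothetical pixel; only the formulas matter.

```python
# Two temperature-based bioclimatic variables, per pixel, from
# hypothetical monthly climatologies (°C):
tmin = [-2.0, -1.0, 3.0, 7.0, 11.0, 15.0, 17.0, 17.0, 13.0, 8.0, 3.0, -1.0]
tmax = [5.0, 7.0, 12.0, 16.0, 21.0, 25.0, 28.0, 27.0, 23.0, 17.0, 10.0, 6.0]
tavg = [(lo + hi) / 2 for lo, hi in zip(tmin, tmax)]

bio01 = sum(tavg) / 12                                   # annual mean temperature
bio02 = sum(hi - lo for lo, hi in zip(tmin, tmax)) / 12  # mean diurnal range

print(round(bio01, 2), round(bio02, 2))  # -> 11.96 8.92
```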
Let’s have a look at some of the maps we just created
# Display raster map with interactive class
bio_map = gj.InteractiveMap(width=500, use_region=True, tiles="OpenStreetMap")
bio_map.add_raster("worldclim_bio01")
bio_map.add_raster("worldclim_bio02")
bio_map.add_layer_control(position="bottomright")
bio_map.show()
Spring warming
We define spring warming as the velocity with which temperature increases from winter into spring, and we calculate it as slope(daily Tmean February-March-April). We will use t.rast.aggregate.
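What method="slope" computes, per pixel, is the linear trend of the daily values within the aggregation window. Here is a least-squares sketch of that idea with a synthetic series; the function name and the warming rate of 0.2 °C/day are made up for illustration.

```python
# Ordinary least-squares slope of a series against its day index:
def slope(values):
    n = len(values)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(values) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, values))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den

# Synthetic daily Tmean for Feb-Apr (~89 days), warming at 0.2 °C/day
daily_tmean = [2.0 + 0.2 * day for day in range(89)]
print(round(slope(daily_tmean), 3))  # -> 0.2
```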
# Define list of months
months = ['{0:02d}'.format(m) for m in range(2, 5)]
# Annual spring warming
gs.run_command("t.rast.aggregate",
               input="lst_daily",
               output="annual_spring_warming",
               basename="spring_warming",
               suffix="gran",
               method="slope",
               granularity="1 years",
               where=f"strftime('%m', start_time)='{months[0]}' or strftime('%m', start_time)='{months[1]}' or strftime('%m', start_time)='{months[2]}'")
# Check raster maps in the STRDS
gs.run_command("t.rast.list", input="annual_spring_warming")
# Average spring warming
gs.run_command("t.rast.series",
               input="annual_spring_warming",
               output="avg_spring_warming",
               method="average")
# Display raster map with interactive class
auc_map = gj.InteractiveMap(width=500, use_region=True, tiles="OpenStreetMap")
auc_map.add_raster("avg_spring_warming")
auc_map.add_layer_control(position="bottomright")
auc_map.show()
Autumnal cooling
We define autumnal cooling as the velocity with which temperature decreases from summer into fall, and we calculate it as slope(daily Tmean August-September-October).
# Define list of months
months = ['{0:02d}'.format(m) for m in range(8, 11)]
# Annual autumnal cooling
gs.run_command("t.rast.aggregate",
               input="lst_daily",
               output="annual_autumnal_cooling",
               basename="autumnal_cooling",
               suffix="gran",
               method="slope",
               granularity="1 years",
               where=f"strftime('%m', start_time)='{months[0]}' or strftime('%m', start_time)='{months[1]}' or strftime('%m', start_time)='{months[2]}'")
# Check raster maps in the STRDS
gs.run_command("t.rast.list", input="annual_autumnal_cooling")
# Average autumnal cooling
gs.run_command("t.rast.series",
               input="annual_autumnal_cooling",
               output="avg_autumnal_cooling",
               method="average")
# Display raster map with interactive class
spw_map = gj.InteractiveMap(width=500, use_region=True, tiles="OpenStreetMap")
spw_map.add_raster("avg_autumnal_cooling")
spw_map.add_layer_control(position="bottomright")
spw_map.show()
Number of days with LSTmean >= 20 and <= 30
Mosquitoes (and the viruses they might carry) tend to thrive within a certain range of temperatures. Let’s assume this range is 20 to 30 °C. Here, we’ll estimate the number of days within this range per year, and then average over the years. See the t.rast.algebra manual for further details.
# Keep only pixels meeting the condition
expression = "tmean_higher20_lower30 = if(lst_daily >= 20.0 && lst_daily <= 30.0, 1, null())"

gs.run_command("t.rast.algebra",
               expression=expression,
               basename="tmean_higher20_lower30",
               suffix="gran",
               nproc=7,
               flags="n")
# Count how many times per year the condition is met
gs.run_command("t.rast.aggregate",
               input="tmean_higher20_lower30",
               output="count_tmean_higher20_lower30",
               basename="tmean_higher20_lower30",
               suffix="gran",
               method="count",
               granularity="1 years")
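Per pixel, the algebra step plus this count aggregation boil down to counting, for each year, the days whose mean LST falls within [20, 30] °C. A plain-Python sketch of that per-pixel logic, using a synthetic seasonal curve rather than real LST data:

```python
# Synthetic year of daily LST for one pixel: a sine wave between 3 and 27 °C
import math

daily_lst = [15 + 12 * math.sin(2 * math.pi * day / 365) for day in range(365)]

# Count the days within the assumed suitability range [20, 30] °C
count = sum(1 for t in daily_lst if 20.0 <= t <= 30.0)
print(count)
```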
# Check raster maps in the STRDS
gs.run_command("t.rast.list",
               input="count_tmean_higher20_lower30",
               columns="name,start_time,min,max")
# Average number of days with LSTmean >= 20 and <= 30
gs.run_command("t.rast.series",
               input="count_tmean_higher20_lower30",
               output="avg_count_tmean_higher20_lower30",
               method="average")
# Display raster map with interactive class
h20_map = gj.InteractiveMap(width=500, use_region=True, tiles="OpenStreetMap")
h20_map.add_raster("avg_count_tmean_higher20_lower30")
h20_map.add_layer_control(position="bottomright")
h20_map.show()
Number of consecutive days with LSTmean <= -10.0
Likewise, there are temperature thresholds that mark a limit to mosquito survival. Here, we’ll use the lower temperature limit for survival. Most importantly, we’ll count the number of consecutive days with temperatures below this threshold.
Here, we’ll again use the temporal algebra, and we’ll recall the concept of topology that we defined at the beginning of the notebook. First, we need to create a STRDS of annual granularity that will contain only zeroes. This annual STRDS, which we call annual mask, will be the base to which we add 1 each time the condition of less than -10 °C on consecutive days is met. Finally, we estimate the median number of days with LST lower than -10 °C over the 5 years.
# Create annual mask
gs.run_command("t.rast.aggregate",
               input="lst_daily",
               output="annual_mask",
               basename="annual_mask",
               suffix="gran",
               granularity="1 year",
               method="count")
# Replace values by zero
expression = "if(annual_mask, 0)"

gs.run_command("t.rast.mapcalc",
               input="annual_mask",
               output="annual_mask_0",
               expression=expression,
               basename="annual_mask_0")
# Calculate consecutive days with LST <= -10.0
expression = "lower_m10_consec_days = annual_mask_0 {+,contains,l} if(lst_daily <= -10.0 && lst_daily[-1] <= -10.0 || lst_daily[1] <= -10.0 && lst_daily <= -10.0, 1, 0)"

gs.run_command("t.rast.algebra",
               expression=expression,
               basename="lower_m10",
               suffix="gran",
               nproc=7)
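Per pixel, the temporal expression above marks a day when both it and one of its neighbors (previous or next day) are at or below -10 °C, then sums the marks per year, so isolated cold days are ignored. The same logic in plain Python, on a short made-up stretch of winter days (function name and values are invented for the example):

```python
# Count days that are <= threshold AND adjacent to another such day,
# mirroring the lst_daily[-1]/lst_daily[1] neighbor checks above.
def cold_consecutive_days(series, threshold=-10.0):
    total = 0
    for i, t in enumerate(series):
        prev_cold = i > 0 and series[i - 1] <= threshold
        next_cold = i < len(series) - 1 and series[i + 1] <= threshold
        if t <= threshold and (prev_cold or next_cold):
            total += 1
    return total

# Hypothetical daily LST: two cold runs (3 days, 2 days) and one isolated cold day
lst = [-5, -11, -12, -13, -4, -11, -2, -12, -12]
print(cold_consecutive_days(lst))  # -> 5
```

Note that the isolated -11 °C day contributes nothing, exactly as in the temporal expression.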
# Inspect values
gs.run_command("t.rast.list",
               input="lower_m10_consec_days",
               columns="name,start_time,min,max")
# Median number of consecutive days with LST <= -10
gs.run_command("t.rast.series",
               input="lower_m10_consec_days",
               output="median_lower_m10_consec_days",
               method="median")
# Display raster map with interactive class
lt10_map = gj.InteractiveMap(width=500, use_region=True, tiles="OpenStreetMap")
lt10_map.add_raster("median_lower_m10_consec_days")
lt10_map.add_layer_control(position="bottomright")
lt10_map.show()
We have now derived many potentially relevant predictors for the mosquito habitat suitability and we could still derive some more, for example, the number of mosquito or virus cycles per year based on development temperature thresholds and growing degree days (GDD). This could be achieved with t.rast.accumulate and t.rast.accdetect.
We will now open a GRASS session from R and perform SDM there.