import os
# Data directory
homedir = os.path.join(os.path.expanduser('~'), "grass_foss4geu_2024")
# GRASS GIS database variables
grassbin = "grass"
grassdata = os.path.join(homedir, "grassdata")
project = "eu_laea"
mapset = "italy_LST_daily"
Part 1: Processing data in GRASS
In this notebook we’ll go through the processing of MODIS LST daily time series data to derive relevant predictor variables for modeling the distribution of Aedes albopictus in Northern Italy. Furthermore, we’ll show how to obtain and process occurrence data and background points.
Let’s first go through some temporal concepts within GRASS GIS…
The TGRASS framework
GRASS GIS was the first FOSS GIS that incorporated capabilities to manage, analyze, process and visualize spatio-temporal data, as well as the temporal relationships among time series.
- TGRASS is fully based on metadata and does not duplicate any dataset
- Snapshot approach, i.e., adds time stamps to maps
- A collection of time-stamped maps (snapshots) of the same variable is called a space-time dataset or STDS
- Maps in a STDS can have different spatial and temporal extents
- Space-time datasets can be composed of raster, raster 3D or vector maps, and so we call them:
- Space time raster datasets (STRDS)
- Space time 3D raster datasets (STR3DS)
- Space time vector datasets (STVDS)
Temporal tools
GRASS temporal tools are named and organized following the GRASS core naming scheme. In this way, we have:
- t.*: general tools to handle STDS of all types
- t.rast.*: tools that deal with STRDS
- t.rast3d.*: tools that deal with STR3DS
- t.vect.*: tools that deal with STVDS
Other TGRASS notions
- Time can be defined as intervals (start and end time) or instances (only start time)
- Time can be absolute (e.g., 2017-04-06 22:39:49) or relative (e.g., 4 years, 90 days)
- Granularity is the greatest common divisor of the temporal extents (and possible gaps) of all maps in the space-time cube
- Topology refers to temporal relations between time intervals in a STDS.
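To make the interval/instance and absolute/relative distinctions concrete, here is a minimal sketch (using grass.script, which we import in the Hands-on section below; the dataset and map names are hypothetical):
# Hypothetical STRDS and map names, for illustration only
# Absolute time defined as an interval: start and end timestamp
gs.run_command("t.register", input="strds_absolute", maps="map_2017",
               start="2017-04-06 22:39:49", end="2017-04-07 22:39:49")
# Relative time defined as an instance: an integer start value plus a unit
gs.run_command("t.register", input="strds_relative", maps="map_day_90",
               start=90, unit="days")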
TGRASS framework and workflow
GRASS + Python
In this part of the studio we’ll work with GRASS and Python, so let’s first see/recall the very basics.
Python package grass.script
The grass.script or GRASS GIS Python Scripting Library provides functions for calling GRASS tools within Python scripts. The most commonly used functions include (see the sketch after this list):
- run_command: used when the output of the tool is a raster or vector map and no text output is expected
- read_command: used when the output of the tool is of text type
- parse_command: used with tools whose output can be converted to key=value pairs
- write_command: used with tools that expect text input, either in the form of a file or from standard input
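As a quick illustration (a sketch assuming an active GRASS session; the tools and map names below are examples from this notebook's mapset, and the coordinates are arbitrary):
# No text output expected: set the region to a raster map
gs.run_command("g.region", raster="lst_2014.005_avg")
# Text output: read raster metadata as a string
info_text = gs.read_command("r.info", map="lst_2014.005_avg", flags="g")
# key=value output parsed into a dictionary
region = gs.parse_command("g.region", flags="g")
# Text input passed through standard input (coordinates are only an example)
gs.write_command("v.in.ascii", input="-", output="example_point",
                 stdin="4321000|2440000")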
It also provides several wrapper functions for frequently used tools, for example:
- To get info from a raster, script.raster.raster_info() is used: gs.raster_info('dsm')
- To get info from a vector, script.vector.vector_info() is used: gs.vector_info('roads')
- To list the rasters in a project, script.core.list_grouped() is used: gs.list_grouped(type=['raster'])
- To obtain the computational region, script.core.region() is used: gs.region()
Python package grass.jupyter
The grass.jupyter library improves the integration of GRASS and Jupyter, and provides different classes to facilitate the visualization of GRASS maps:
- init: starts a GRASS session and sets up all necessary environment variables
- Map: 2D rendering
- Map3D: 3D rendering
- InteractiveMap: interactive visualization with folium
- SeriesMap: visualizations for a series of raster or vector maps
- TimeSeriesMap: visualization for spatio-temporal data
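For instance, once a space time raster dataset exists, it can be animated with TimeSeriesMap. A minimal sketch, assuming an active session and the lst_daily STRDS that we create later in this notebook:
# Sketch: animate an existing STRDS (lst_daily is created later in this notebook)
lst_animation = gj.TimeSeriesMap(use_region=True)
lst_animation.add_raster_series("lst_daily")
lst_animation.show()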
Hands-on
So let’s start… We begin by setting variables, checking GRASS installation and initializing GRASS GIS
# Check the GRASS GIS installation
import subprocess
print(subprocess.check_output([grassbin, "--config", "version"], text=True))
# Ask GRASS GIS where its Python packages are
import sys
sys.path.append(
subprocess.check_output([grassbin, "--config", "python_path"], text=True).strip()
)
Now we are ready to start a GRASS GIS session.
# Import the GRASS GIS packages we need
import grass.script as gs
import grass.jupyter as gj
# Start the GRASS GIS Session
session = gj.init(grassdata, project, mapset)
Explore data in the mapset
Let’s first explore what we have within the italy_LST_daily mapset and display vector and raster maps using different classes from grass.jupyter library.
# List vector elements
gs.list_grouped(type="vector")['italy_LST_daily']
# Display vector map
it_map = gj.Map(width=500, use_region=True)
it_map.d_vect(map="italy_borders_0")
it_map.show()
# List raster elements
rast = gs.list_grouped(type="raster", pattern="lst*")['italy_LST_daily']
rast[0:10]
# Display raster map with interactive class
lst_map = gj.InteractiveMap(width = 500, use_region=True, tiles="OpenStreetMap")
lst_map.add_raster("lst_2014.005_avg")
lst_map.add_layer_control(position = "bottomright")
lst_map.show()
SDM workflow
In this part of the Studio we’ll address the left part of the SDM workflow: occurrence and background data, and predictors.
Importing species records
We will use occurrence data already downloaded and cleaned. We need to import it into GRASS GIS first.
# Import mosquito records
gs.run_command("v.import",
input=os.path.join(homedir,"aedes_albopictus.gpkg"),
output="aedes_albopictus")Let’s add the occurrence points over the previous interactive map
# Display raster map with interactive class
lst_map = gj.InteractiveMap(width = 500, use_region=True, tiles="OpenStreetMap")
lst_map.add_raster("lst_2014.005_avg")
lst_map.add_vector("aedes_albopictus")
lst_map.add_layer_control(position = "bottomright")
lst_map.show()
You can also get the mosquito occurrences (or any other species or taxa) directly from GBIF into GRASS by means of v.in.pygbif as follows:
# Set computational region
# region = gs.parse_command("g.region", raster="lst_2014.001_avg", flags="g")
# region
# Install extension (requires pygbif: pip install pygbif)
# gs.run_command("g.extension",
#                extension="v.in.pygbif")
# Import data from GBIF
# gs.run_command("v.in.pygbif",
# output="aedes_albopictus",
# taxa="Aedes albopictus",
# date_from="2014-01-01",
#                date_to="2018-12-31")
Creating random background points
The MaxEnt algorithm that we will use in the next part of this session requires not only the points of known occurrences, but also information on the rest of the available environment. These are not absences but background data: we do not actually know whether the species is there or not, but we need them to compare against the features of the places where the species does occur.
To avoid getting background points exactly where occurrences are, we’ll create buffers around them. Then, we need to ensure that background points are only over land within our computational region. In order to do that, we’ll create a mask over land and we’ll overlay the buffers with the mask. Can you guess what the output will be?
# Create buffer around Aedes albopictus records
gs.run_command("v.buffer",
input="aedes_albopictus",
output="aedes_buffer",
distance=2000)
# Set computational region
region = gs.parse_command("g.region", raster="lst_2014.001_avg", flags="g")
region
# Create a vector mask to limit background points
expression="MASK = if(lst_2014.001_avg, 1, null())"
gs.raster.mapcalc(exp=expression)
gs.run_command("r.to.vect",
input="MASK",
output="vect_mask",
type="area")# Subtract buffers from vector mask
gs.run_command("v.overlay",
ainput="vect_mask",
binput="aedes_buffer",
operator="xor",
output="mask_bg")Let’s display the result
# Display raster map with interactive class
mask_map = gj.InteractiveMap(width = 500, use_region=True, tiles="OpenStreetMap")
mask_map.add_vector("mask_bg")
mask_map.add_layer_control(position = "bottomright")
mask_map.show()
Finally, let’s create the random background points…
# Generate random background points
gs.run_command("v.random",
output="background_points",
npoints=1000,
restrict="mask_bg",
seed=3749)
and display occurrence and background points together over an LST map.
# Display vector map
pb_map = gj.Map(width=500, use_region=True)
pb_map.d_rast(map="lst_2014.005_avg")
pb_map.d_vect(map="italy_borders_0", type="boundary")
pb_map.d_vect(map="background_points")
pb_map.d_vect(map="aedes_albopictus", icon="basic/diamond", fill_color="red", size=8)
pb_map.show()
Create daily LST time series
Now we’ll start processing the raster data to derive potentially relevant predictors to include in the model. Our data consists of a time series of daily LST averages. We’ll use the GRASS temporal framework. The first step is to create the time series object and register maps in it. See t.create and t.register for further details.
# Create time series
gs.run_command("t.create",
type="strds",
temporaltype="absolute",
output="lst_daily",
title="Average Daily LST",
description="Average daily LST in degree C - 2014-2018")
# Check it is created
gs.run_command("t.list",
type="strds")# Get list of maps
map_list = gs.list_grouped(type="raster", pattern="lst_201*")['italy_LST_daily']
map_list[0:10]
# Register maps in strds
gs.run_command("t.register",
input="lst_daily",
maps=map_list,
increment="1 days",
start="2014-01-01",
flags="i")# Get info about the strds
gs.run_command("t.info",
input="lst_daily")Generate environmental variables from LST STRDS
Now that we created the time series or STRDS, let’s start estimating relevant variables. We start by calculating long term aggregations, also called climatologies.
Long term monthly avg, min and max LST
Let’s see an example first; we’ll estimate the average of all maps whose start date is within January.
# January average LST
gs.run_command("t.rast.series",
input="lst_daily",
method="average",
where="strftime('%m', start_time)='01'",
output="lst_average_jan")# Get map info and check values
gs.raster_info("lst_average_jan")['min'], gs.raster_info("lst_average_jan")['max']
If we want to estimate climatologies for all months, let’s first get the list of maps that will be the input for t.rast.series; for that, we’ll test the condition with t.rast.list.
# Define list of months as required
months=['{0:02d}'.format(m) for m in range(1,13)]
for m in months:
gs.run_command("t.rast.list",
input="lst_daily",
where=f"strftime('%m', start_time)='{m}'")Now we add the methods and we are ready to estimate climatologies for all months with three different methods.
# Now we estimate the climatologies for all months and methods
months=['{0:02d}'.format(m) for m in range(1,13)]
methods=["average","minimum","maximum"]
for m in months:
for me in methods:
gs.run_command("t.rast.series",
input="lst_daily",
method=me,
where=f"strftime('%m', start_time)='{m}'",
output="lst_{}_{}".format(me,m))# List newly created maps
map_list = gs.list_grouped(type="raster", pattern="*{average,minimum,maximum}*")['italy_LST_daily']
print(map_list)
# Remove lst_average_jan
gs.run_command("g.remove", type="raster", name="lst_average_jan", flags="f")Bioclimatic variables
Perhaps you have heard of the Worldclim or CHELSA bioclimatic variables? These are 19 variables that represent potentially limiting conditions for species. They derive from the combination of long term averages of temperature and precipitation. As we do not have precipitation data in this exercise, we’ll only estimate the bioclimatic variables that involve temperature. See the r.bioclim manual for further details. Note that we’ll use the climatologies estimated in the previous step.
# Install extension
gs.run_command("g.extension",
extension="r.bioclim")# Get lists of maps needed
tmin=gs.list_grouped(type="raster", pattern="lst_minimum_??")['italy_LST_daily']
tmax=gs.list_grouped(type="raster", pattern="lst_maximum_??")['italy_LST_daily']
tavg=gs.list_grouped(type="raster", pattern="lst_average_??")['italy_LST_daily']
print(tmin,tmax,tavg)
# Estimate temperature related bioclimatic variables
gs.run_command("r.bioclim",
tmin=tmin,
tmax=tmax,
tavg=tavg,
output="worldclim_") # List output maps
gs.list_grouped(type="raster", pattern="worldclim*")['italy_LST_daily']Let’s have a look at some of the maps we just created
# Display raster map with interactive class
bio_map = gj.InteractiveMap(width = 500, use_region=True, tiles="OpenStreetMap")
bio_map.add_raster("worldclim_bio01")
bio_map.add_raster("worldclim_bio02")
bio_map.add_layer_control(position = "bottomright")
bio_map.show()
Spring warming
We define spring warming as the velocity with which temperature increases from winter into spring and we calculate it as slope(daily Tmean February-March-April). We will use t.rast.aggregate.
# Define list of months
months=['{0:02d}'.format(m) for m in range(2,5)]
# Annual spring warming
gs.run_command("t.rast.aggregate",
input="lst_daily",
output="annual_spring_warming",
basename="spring_warming",
suffix="gran",
method="slope",
granularity="1 years",
where=f"strftime('%m',start_time)='{months[0]}' or strftime('%m',start_time)='{months[1]}' or strftime('%m', start_time)='{months[2]}'")# Check raster maps in the STRDS
gs.run_command("t.rast.list", input="annual_spring_warming")# Average spring warming
gs.run_command("t.rast.series",
input="annual_spring_warming",
output="avg_spring_warming",
method="average")# Display raster map with interactive class
auc_map = gj.InteractiveMap(width = 500, use_region=True, tiles="OpenStreetMap")
auc_map.add_raster("avg_spring_warming")
auc_map.add_layer_control(position = "bottomright")
auc_map.show()
Autumnal cooling
We define autumnal cooling as the velocity with which temperature decreases from summer into fall and we calculate it as slope(daily Tmean August-September-October).
# Define list of months
months=['{0:02d}'.format(m) for m in range(8,11)]
# Annual autumnal cooling
gs.run_command("t.rast.aggregate",
input="lst_daily",
output="annual_autumnal_cooling",
basename="autumnal_cooling",
suffix="gran",
method="slope",
granularity="1 years",
where=f"strftime('%m',start_time)='{months[0]}' or strftime('%m',start_time)='{months[1]}' or strftime('%m', start_time)='{months[2]}'")# Check raster maps in the STRDS
gs.run_command("t.rast.list", input="annual_autumnal_cooling")# Average autumnal cooling
gs.run_command("t.rast.series",
input="annual_autumnal_cooling",
output="avg_autumnal_cooling",
method="average")# Display raster map with interactive class
spw_map = gj.InteractiveMap(width = 500, use_region=True, tiles="OpenStreetMap")
spw_map.add_raster("avg_autumnal_cooling")
spw_map.add_layer_control(position = "bottomright")
spw_map.show()
Number of days with LSTmean >= 20 and <= 30
Mosquitoes (and the viruses they might carry) tend to thrive within a certain range of temperatures. Let’s assume this range is 20 to 30 °C. Here, we’ll estimate the number of days within this range per year, and then we’ll estimate the average over the years. See the t.rast.algebra manual for further details.
# Keep only pixels meeting the condition
expression="tmean_higher20_lower30 = if(lst_daily >= 20.0 && lst_daily <= 30.0, 1, null())"
gs.run_command("t.rast.algebra",
expression=expression,
basename="tmean_higher20_lower30",
suffix="gran",
nprocs=7,
flags="n")# Count how many times per year the condition is met
gs.run_command("t.rast.aggregate",
input="tmean_higher20_lower30",
output="count_tmean_higher20_lower30",
basename="tmean_higher20_lower30",
suffix="gran",
method="count",
granularity="1 years")# Check raster maps in the STRDS
gs.run_command("t.rast.list",
input="count_tmean_higher20_lower30",
columns="name,start_time,min,max")# Average number of days with LSTmean >= 20 and <= 30
gs.run_command("t.rast.series",
input="count_tmean_higher20_lower30",
output="avg_count_tmean_higher20_lower30",
method="average")# Display raster map with interactive class
h20_map = gj.InteractiveMap(width = 500, use_region=True, tiles="OpenStreetMap")
h20_map.add_raster("avg_count_tmean_higher20_lower30")
h20_map.add_layer_control(position = "bottomright")
h20_map.show()
Number of consecutive days with LSTmean <= -10.0
Likewise, there are temperature thresholds that mark a limit to mosquito survival. Here, we’ll use the lower temperature limit for survival and, most importantly, count the number of consecutive days with temperatures below this threshold.
Here, we’ll again use the temporal algebra and recall the concept of topology that we defined at the beginning of the notebook. First, we need to create a STRDS of annual granularity that contains only zeroes. This annual STRDS, which we call the annual mask, will be the base to which we add 1 each time the condition of less than -10 °C on consecutive days is met. Finally, we estimate the median number of days with LST lower than -10 °C over the 5 years.
# Create annual mask
gs.run_command("t.rast.aggregate",
input="lst_daily",
output="annual_mask",
basename="annual_mask",
suffix="gran",
granularity="1 year",
method="count")# Replace values by zero
expression="if(annual_mask, 0)"
gs.run_command("t.rast.mapcalc",
input="annual_mask",
output="annual_mask_0",
expression=expression,
basename="annual_mask_0")# Calculate consecutive days with LST <= -10.0
expression="lower_m10_consec_days = annual_mask_0 {+,contains,l} if(lst_daily <= -10.0 && lst_daily[-1] <= -10.0 || lst_daily[1] <= -10.0 && lst_daily <= -10.0, 1, 0)"
gs.run_command("t.rast.algebra",
expression=expression,
basename="lower_m10",
suffix="gran",
nprocs=7)
# Inspect values
gs.run_command("t.rast.list",
input="lower_m10_consec_days",
columns="name,start_time,min,max")# Median number of consecutive days with LST <= -10
gs.run_command("t.rast.series",
input="lower_m10_consec_days",
output="median_lower_m10_consec_days",
method="median")# Display raster map with interactive class
lt10_map = gj.InteractiveMap(width = 500, use_region=True, tiles="OpenStreetMap")
lt10_map.add_raster("median_lower_m10_consec_days")
lt10_map.add_layer_control(position = "bottomright")
lt10_map.show()
We have now derived many potentially relevant predictors of mosquito habitat suitability, and we could still derive more, for example, the number of mosquito or virus cycles per year based on development temperature thresholds and growing degree days (GDD). This could be achieved with t.rast.accumulate and t.rast.accdetect.
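As a hedged sketch of that last idea (not executed in this notebook; the 10-30 °C limits and the output names are placeholders, not species-specific values), the accumulation step could look like:
# Sketch only: accumulate growing degree days per year from the daily LST series
# (limits and output names are placeholders to be replaced by species-specific values)
gs.run_command("t.rast.accumulate",
               input="lst_daily",
               output="lst_gdd_acc",
               basename="lst_gdd",
               suffix="gran",
               start="2014-01-01",
               stop="2019-01-01",
               cycle="12 months",
               granularity="1 day",
               method="gdd",
               limits="10,30")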
We will now open a GRASS session from R and perform SDM there.