
Lesson 1 Overview
Introduction to R
•Packages: dplyr, lubridate
•Basic calculations
•Vectors and strings
•Data frames
•Data manipulation
•Time formatting
•Importing and Exporting Data
•Simple importing and exporting of .csv files
•Visualization Basics
•Packages: ggplot2
•Base R
•ggplot2
An introductory lesson to the R language, with definitions of vectors, data frames, and objects
Simple examples of importing and exporting data into and out of R
Basic concepts of data visualization in base R and ggplot2
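As a minimal sketch of the Lesson 1 topics (vectors, data frames, time formatting, and .csv import/export), here is a short base R example. The object names, values, and the temporary file are hypothetical and are not taken from the course scripts.

```r
# A numeric vector and a character (string) vector
lengths_cm <- c(42.5, 38.0, 51.2)
species <- c("A", "B", "C")

# Basic calculation on a vector
mean_length <- mean(lengths_cm)

# Combine the vectors into a data frame
snakes <- data.frame(species = species, length_cm = lengths_cm)

# Parse a timestamp and reformat it as a date string
obs_time <- as.POSIXct("2021-03-05 14:30:00", tz = "UTC")
obs_date <- format(obs_time, "%Y/%m/%d")

# Export to .csv and import it again (temporary file, for illustration only)
csv_path <- tempfile(fileext = ".csv")
write.csv(snakes, csv_path, row.names = FALSE)
snakes_in <- read.csv(csv_path)
```

The lesson itself uses dplyr and lubridate for data manipulation and time handling; base R is used here only so the sketch runs without extra packages.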
Lesson 2 Overview
•Occurrence Data
•Packages: rgbif, mapview, scrubr, sp, dplyr
•Data mining species occurrence data from GBIF
•Creating spatial points data objects
•Exploratory visualization techniques
•Species Density Maps
•Packages: sp, raster, usdm, mapview, rgbif, scrubr, GISTools, maps, ggplot2, RColorBrewer, ggspatial
•Spatial data cleaning techniques
•Querying country polygon outlines
•Clustering techniques
•Basic species density and spatial distance analyses
•Multiple data visualization techniques
A walk-through of Lesson 2, exercise 1
Density and distance analysis using species occurrence data queried from GBIF, various mapping techniques, and simple statistical analyses.
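The spatial distance step can be sketched in base R with a great-circle (haversine) distance between occurrence points. The coordinates below are made up for illustration and are not GBIF records; `haversine_km` is a hypothetical helper, and the lesson itself relies on spatial packages such as sp for this kind of work.

```r
# Hypothetical occurrence records (decimal degrees); not real GBIF data
occ <- data.frame(
  lon = c(100.50, 100.60, 101.20, 98.95),
  lat = c(13.75, 13.80, 14.30, 18.80)
)

# Great-circle (haversine) distance in km between lon/lat points
haversine_km <- function(lon1, lat1, lon2, lat2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Pairwise distance matrix between all records
n <- nrow(occ)
dist_km <- outer(seq_len(n), seq_len(n), function(i, j) {
  haversine_km(occ$lon[i], occ$lat[i], occ$lon[j], occ$lat[j])
})

# Nearest-neighbour distance for each record (ignore the zero diagonal)
diag(dist_km) <- NA
nearest_km <- apply(dist_km, 1, min, na.rm = TRUE)
```

Small nearest-neighbour distances point to clustered records, which is one simple way to start looking at density before mapping.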
Note: Following recent updates to the rgbif package, lines 37-40 of the script should now be:
key <- name_suggest(q = "Trimeresurus", rank = "genus")$data$key[1]
occ_search(taxonKey=key, limit=0)$meta$count
dat.all <- occ_search(taxonKey = key, limit = 6000)
dat.all <- dat.all$data
Lesson 3 Overview
•Environmental Data
•Packages: raster, mapview, ggplot2, dplyr, rgbif, maptools, scrubr
•Querying environmental data (elevation, temperature) from various online sources
•Forming datasets with species occurrence records and extracted environmental data
•Raster manipulation techniques
•Field Guide Maps
•Packages: rnaturalearth, rnaturalearthdata, ggspatial, sf, dismo, jsonlite, mapdata, raster, mapview, ggplot2, dplyr, rgbif, maptools, rgdal
•Use environmental data to project current climate suitability models
•Raster to polygon manipulation for range map generation
•Species Distribution Models (SDM)
•Packages: dismo, jsonlite, mapdata, raster, mapview, ggplot2, dplyr, rgbif, maptools, rgdal
•Using climate data to create current and future projections of climate suitability models
Mining environmental data using the raster package, basic raster manipulation techniques, extracting data from species occurrences, and building species data frames.
Basic techniques to create species distribution maps based on climate suitability
Correction: In the video I say the highest AIC value indicates the best-fit model. I misspoke. The best-fit model is indicated by the LOWEST AIC value. I must have still been thinking of AUC at the time, for which the highest value indicates the best model. Sorry for the confusion everyone!
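To make the difference between the two metrics concrete, here is a minimal base R sketch on simulated presence/absence data. AIC is a penalized likelihood score for which lower values are preferred; AUC measures how well a model separates presences from absences, and values closer to 1 are better. The example uses logistic regression as a stand-in rather than MaxEnt, and the predictors and the `auc` helper are hypothetical.

```r
set.seed(1)

# Simulated presence/absence data driven by one hypothetical predictor
temp <- rnorm(200)
presence <- rbinom(200, 1, plogis(1.5 * temp))
noise <- rnorm(200)  # predictor unrelated to presence

fit_good <- glm(presence ~ temp, family = binomial)
fit_poor <- glm(presence ~ noise, family = binomial)

# AUC via the rank (Mann-Whitney) formula
auc <- function(score, label) {
  r <- rank(score)
  n_pos <- sum(label == 1)
  n_neg <- sum(label == 0)
  (sum(r[label == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

aic_values <- c(good = AIC(fit_good), poor = AIC(fit_poor))
auc_values <- c(good = auc(fitted(fit_good), presence),
                poor = auc(fitted(fit_poor), presence))
```

Printing `aic_values` and `auc_values` side by side shows the informative model with the lower AIC and the higher AUC.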
A basic introduction to MaxEnt modelling in R using current and future climate data from WorldClim and species occurrence data from GBIF.
Note: The models generated in this exercise and the last exercise are simply to show you the basic functionality of MaxEnt in R. SDMs should incorporate multiple parameters and as much data as possible to make robust models when used for scientific publication.
Please read for further information:
The raster variables used in the tutorial are called "bioclim" variables. Each of the "bio1-19" rasters has a different interpretation. All of those interpretations can be found in this document:
https://pubs.usgs.gov/ds/691/ds691.pdf
Also note that this tutorial is meant for the specific purpose of getting to know the basic R code and functionality behind SDMs and is not the more elaborate workflow you would use to build an SDM for publication. To better understand the various factors that go into creating a publication-ready SDM, I highly encourage you to read the following literature:
For information on which raster variables should be used (along with bioclim) for a given animal type:
Bradie, J., & Leung, B. (2017). A quantitative synthesis of the importance of variables used in MaxEnt species distribution models. Journal of Biogeography, 44(6), 1344-1361.
https://onlinelibrary.wiley.com/doi/abs/10.1111/jbi.12894
For information on appropriate pseudo-absence selection regimes:
Barbet‐Massin, M., Jiguet, F., Albert, C. H., & Thuiller, W. (2012). Selecting pseudo‐absences for species distribution models: how, where and how many? Methods in Ecology and Evolution, 3(2), 327-338.
https://besjournals.onlinelibrary.wiley.com/doi/full/10.1111/j.2041-210X.2011.00172.x
For information on dealing with multicollinearity in variables:
Feng, X., Park, D. S., Liang, Y., Pandey, R., & Papeş, M. (2019). Collinearity in ecological niche modeling: Confusions and challenges. Ecology and Evolution, 9(18), 10365-10376.
https://onlinelibrary.wiley.com/doi/full/10.1002/ece3.5555
For information on how to properly report an SDM in a manuscript for publication:
Zurell, D., Franklin, J., König, C., Bouchet, P. J., Dormann, C. F., Elith, J., ... & Merow, C. (2020). A standard protocol for reporting species distribution models. Ecography, 43(9), 1261-1277.
https://onlinelibrary.wiley.com/doi/full/10.1111/ecog.04960
If you would like to add a basic interpretation list to your R code, please copy and paste the following into your script:
################ BIOCLIM INTERPRETATIONS #############################################
# BIO1 = Annual Mean Temperature
# BIO2 = Mean Diurnal Range (Mean of monthly (max temp - min temp))
# BIO3 = Isothermality (BIO2/BIO7) (×100)
# BIO4 = Temperature Seasonality (standard deviation ×100)
# BIO5 = Max Temperature of Warmest Month
# BIO6 = Min Temperature of Coldest Month
# BIO7 = Temperature Annual Range (BIO5-BIO6)
# BIO8 = Mean Temperature of Wettest Quarter
# BIO9 = Mean Temperature of Driest Quarter
# BIO10 = Mean Temperature of Warmest Quarter
# BIO11 = Mean Temperature of Coldest Quarter
# BIO12 = Annual Precipitation
# BIO13 = Precipitation of Wettest Month
# BIO14 = Precipitation of Driest Month
# BIO15 = Precipitation Seasonality (Coefficient of Variation)
# BIO16 = Precipitation of Wettest Quarter
# BIO17 = Precipitation of Driest Quarter
# BIO18 = Precipitation of Warmest Quarter
# BIO19 = Precipitation of Coldest Quarter
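Two of the variables in this list are derived from others: BIO7 is BIO5 minus BIO6, and BIO3 is BIO2 divided by BIO7, times 100. A small worked sketch with made-up values for a single location shows the arithmetic, along with a hypothetical lookup vector (`bioclim_labels`) for labelling layers by name. The numbers are illustrative only and do not come from WorldClim.

```r
# Hypothetical values in degrees Celsius for a single location
bio2 <- 10.5   # mean diurnal range
bio5 <- 33.0   # max temperature of warmest month
bio6 <- 12.0   # min temperature of coldest month

bio7 <- bio5 - bio6          # temperature annual range
bio3 <- bio2 / bio7 * 100    # isothermality

# Lookup vector for labelling layers (only two entries shown)
bioclim_labels <- c(
  bio1 = "Annual Mean Temperature",
  bio12 = "Annual Precipitation"
)
label_bio12 <- bioclim_labels[["bio12"]]
```

With these values, the annual range is 21 degrees and the diurnal range is half of it, giving an isothermality of 50.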
Lesson 4 Overview
•Movement Dynamics
•Packages: adehabitatHR, sp, ggplot2, dplyr, lubridate, mapview
•Analysis of time series, tracking data
•Tabulating movement summaries and saving to .csv
•Home Ranges
•Packages: move, adehabitatHR, caTools, spatialEco, reshape2, tibble, sp, ggplot2, dplyr, lubridate, mapview, cowplot, ggspatial
•Testing several utilization distribution (UD) estimators
•Visual and mathematical methods of examining Type I (over-fitting) and Type II (under-fitting) errors in UD estimators
A tutorial on basic exploratory analyses on movement data
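A movement summary of the kind described above can be sketched in base R as follows. The track is hypothetical (four fixes in projected coordinates, in metres), and the column names and output file name are assumptions made for illustration, not the ones used in the lesson script.

```r
# Hypothetical tracking fixes in projected coordinates (metres, e.g. UTM)
track <- data.frame(
  id = "snake01",
  time = as.POSIXct(c("2021-06-01 08:00:00", "2021-06-01 20:00:00",
                      "2021-06-02 08:00:00", "2021-06-03 08:00:00"),
                    tz = "UTC"),
  x = c(500000, 500030, 500030, 500150),
  y = c(1500000, 1500040, 1500040, 1500200)
)

# Step length (m) and time lag (h) between consecutive fixes
track$step_m <- c(NA, sqrt(diff(track$x)^2 + diff(track$y)^2))
track$lag_h <- c(NA, as.numeric(diff(track$time), units = "hours"))

# One-row movement summary for this individual
move_summary <- data.frame(
  id = track$id[1],
  n_fixes = nrow(track),
  total_dist_m = sum(track$step_m, na.rm = TRUE),
  mean_step_m = mean(track$step_m, na.rm = TRUE),
  tracking_days = as.numeric(difftime(max(track$time), min(track$time),
                                      units = "days"))
)

# Save the summary to .csv (temporary directory, for illustration only)
out_path <- file.path(tempdir(), "movement_summary.csv")
write.csv(move_summary, out_path, row.names = FALSE)
```

Euclidean step lengths only make sense on projected coordinates; longitude/latitude data would need to be projected first or measured with a great-circle distance.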
An in-depth lecture on movement home range models and the limitations of utilization distribution (UD) estimators regarding Type I (over-fitting) and Type II (under-fitting) errors.
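How much the smoothing choice changes a UD estimate can be illustrated with a rough sketch. It uses `kde2d()` from the MASS package as a stand-in for the estimators covered in the lesson (such as those in adehabitatHR), on simulated relocations. The bandwidth values, grid limits, and the `isopleth_area` helper are arbitrary assumptions chosen only to show the effect: a small bandwidth hugs the points (toward over-fitting), while a large one over-smooths (toward under-fitting).

```r
library(MASS)  # provides kde2d(); ships with standard R installations

set.seed(42)

# Simulated relocations (hypothetical, arbitrary units)
x <- rnorm(200)
y <- rnorm(200)

# Area enclosed by the isopleth holding `level` of the estimated use
isopleth_area <- function(x, y, h, level = 0.95,
                          lims = c(-8, 8, -8, 8), n = 100) {
  ud <- kde2d(x, y, h = h, n = n, lims = lims)
  cell_area <- diff(ud$x[1:2]) * diff(ud$y[1:2])
  p <- sort(as.vector(ud$z), decreasing = TRUE)
  p <- p / sum(p)                           # normalise to cell probabilities
  n_cells <- which(cumsum(p) >= level)[1]   # fewest cells reaching `level`
  n_cells * cell_area
}

area_small_h <- isopleth_area(x, y, h = 1)
area_large_h <- isopleth_area(x, y, h = 6)
```

Comparing `area_small_h` and `area_large_h` for the same points shows how strongly the 95% area depends on the bandwidth, which is why the estimators need to be checked rather than taken at face value.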
Lesson 5 Overview
•Trapping Analysis
•Packages: rgbif, mapview, sp, dplyr, ggplot2, viridis, scatterpie, raster, reshape2, tibble, vegan, BiodiversityR, ggspatial
•Query trapping study datasets from GBIF
•Data wrangling techniques to transpose data columns to long-form
•Visualizations of species richness
•Map richness-based pie charts for each study site
•Analysis of species richness, density, and abundance
•Species richness indices
•Rarefaction curves and rank-abundance plots
A simple walk-through on querying datasets from GBIF, exploring trapping data, and using data visualizations and statistical indexes to analyze the data.
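The reshaping and richness steps can be sketched in base R as below. The trap counts are made up, and the column names are hypothetical; the lesson's own packages (for example `diversity()` in vegan) compute the same indices, so this is only meant to show what the numbers represent.

```r
# Hypothetical trap counts in wide form: one row per site, one column per species
traps <- data.frame(
  site = c("S1", "S2"),
  sp_a = c(10, 5),
  sp_b = c(10, 0),
  sp_c = c(0, 15)
)

# Wide to long form: one row per site-species combination
long <- reshape(traps, direction = "long",
                varying = c("sp_a", "sp_b", "sp_c"), v.names = "count",
                timevar = "species", times = c("sp_a", "sp_b", "sp_c"),
                idvar = "site")

# Site-by-species count matrix
counts <- as.matrix(traps[, -1])
rownames(counts) <- traps$site

# Species richness: number of species present at each site
richness <- rowSums(counts > 0)

# Shannon index: -sum(p * log(p)) over the species present
shannon <- apply(counts, 1, function(n) {
  p <- n[n > 0] / sum(n)
  -sum(p * log(p))
})

# Simpson index (1 - D): 1 - sum(p^2)
simpson <- apply(counts, 1, function(n) 1 - sum((n / sum(n))^2))
```

Both sites hold two species, but S1 splits its captures evenly while S2 is dominated by one species, so S1 scores higher on both the Shannon and Simpson indices.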
Note: When discussing the best-fit model from the radlattice() function call on line 274, I incorrectly state that the NULL model would be the best-fit model in this case, therefore disqualifying the others. The best-fit model is in fact the Mandelbrot model, with an AIC of 350.17.
Sorry for my mistake and any confusion it may have caused!
Lesson 6 Overview
•Trait Analysis
•Packages: ape, taxize, rentrez, phytools, Select, treeio, ggtree, data.tree, tidytree, ggplot2, dplyr, traits, stringr, cowplot
•Create phylogenetic trees using a species list and taxonomic data from NCBI
•Examine various branch distance simulation options
•Combine trait data with tree data
•Subset and visualize trait data in trees for comparative analysis
•Sequence Analysis
•Packages: seqinr, adegenet, ape, ggtree, DECIPHER, viridis, ggplot2
•Query raw Internal Transcribed Spacer (ITS) sequences from NCBI
•Read FASTA data into R
•Form alignments and distance matrices
•Visualize tree data using base R
•Modify and customize tree plots in ggtree
An end-to-end tutorial on gathering sequence data from NCBI in FASTA format, importing/exporting, aligning, and building customized trees using ggtree.
Basic guide to tree building and including trait data in R for comparative analysis
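The "form alignments and distance matrices" step in the sequence analysis exercise can be sketched in base R with toy data. The sequences below are invented and already aligned (they are not real ITS data), the distance is a simple p-distance, and the tree is built with average-linkage clustering as a UPGMA-style stand-in; the lesson itself uses packages such as DECIPHER, ape, and ggtree for these steps.

```r
# Hypothetical, already-aligned toy sequences (not real ITS data)
aln <- c(
  tax_a = "ACGTACGTAC",
  tax_b = "ACGTACGTAT",
  tax_c = "ACGAACGTTT",
  tax_d = "TCGAACGTTT"
)

# Split each sequence into characters: one row per taxon, one column per site
sites <- do.call(rbind, strsplit(aln, ""))

# p-distance: proportion of sites that differ between two sequences
n <- nrow(sites)
p_dist <- matrix(0, n, n, dimnames = list(names(aln), names(aln)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    p_dist[i, j] <- mean(sites[i, ] != sites[j, ])
  }
}

# UPGMA-style tree: average-linkage hierarchical clustering
tree <- hclust(as.dist(p_dist), method = "average")

# Cut the tree into two groups to inspect the main split
groups <- cutree(tree, k = 2)
```

Calling `plot(tree)` draws the dendrogram with base R graphics, which is a quick check before moving on to customized plots.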
Learn a wide variety of ecological data analyses by mining your own species occurrences and environmental data from various online sources. R code provided in each lesson is reproducible and easy to modify for your own projects and research. Upon completion of this course, you should have the knowledge to perform these analyses and GIS techniques with your own data, with an improved knowledge and understanding of the packages, functions, and R language.