Mapping occurrence data from GBIF

R code
GBIF
iNaturalist
Leaflet
ggplot2
Venezuela
Author

José R. Ferrer-Paris

Published

February 19, 2025

Introduction

Hello, bird lovers and data enthusiasts! Today, we’re diving into a colorful world of Amazona parrots in Venezuela.

As part of the NeoMaps initiative (José R. Ferrer-Paris and Sánchez-Mercado 2024) we published a dataset of occurrence records of the Amazona parrots in Venezuela (José R. Ferrer-Paris et al. 2013).

Today, I want explore this dataset, create interactive maps with leaflet, and visualize the occurrence records of the different species using R.

Dataset Overview

Our dataset of Amazona parrots in Venezuela is available at PANGAEA, it has been ingested by the Global Biodiversity Information Facility and has been cited multiple times in the scientific literature:

It has occurrence (presence/absence) records of eight species of Amazona during the NeoMaps bird surveys, and I have found that we need to apply some tricks to avoid mixing the absences with the presences when displying this data.

Tools and Libraries

We’ll use the following R packages:

library(rgbif)
library(leaflet)
library(gameofthrones)
library(ggplot2)
library(dplyr)

Attaching package: 'dplyr'
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union
  • rgbif: Interface to access GBIF data.
  • leaflet: Create interactive maps with ease.
  • dplyr: Data manipulation with a simple and intuitive syntax.
  • gameofthrones: Why not? Let’s have some fun with the colour palette.

Step-by-Step Guide

Step 1: A quick map of the dataset

With a few lines of code we can construct a Leaflet map based on the GBIF maps API without much sweat.

Get the dataset key, call the maps API with the occurrence density and a base map:

dataset_key <- "dd282bd6-6e30-4f02-bb36-6bdae0a46090"

gbif_map_occ <- sprintf('https://api.gbif.org/v2/map/occurrence/density/{z}/{x}/{y}@2x.png?style=purpleYellow.poly&bin=hex&hexPerTile=32&datasetKey=%s&srs=EPSG%%3A3857',
 dataset_key)

gbif_map_url <- 'https://tile.gbif.org/3857/omt/{z}/{x}/{y}@1x.png?style=gbif-natural'

And add these together with these lines of code:

leaflet() |>
  setView(lng = -65, lat = 8, zoom = 5) |>
  addTiles(urlTemplate = gbif_map_url) |>
  addTiles(urlTemplate = gbif_map_occ) 

This map shows the density of the occurrence records, both presences and absences for all species recorded.

Step 2: Accessing the Dataset

We’ll use the rgbif package to download and explore the presence records found within the dataset.

First we query how many record are there:

all_occurrences <- occ_search(datasetKey = dataset_key, limit = 0) 

Now we will construct the query in a way that the GBIF API will only return the true presence records (number of observed individuals larger than 1).

nonzero_occurrences <- occ_search(datasetKey = dataset_key,
  organismQuantity="1,1000", limit = 2000) 

Compare the result counts returned by both queries:

# Inspect the data
all_occurrences$meta$count
[1] 12936
nonzero_occurrences$meta$count
[1] 274

For the rest of the code, we will focus on the downloaded presence records

count_ds <- nonzero_occurrences$data 
glimpse(count_ds)
Rows: 274
Columns: 70
$ key                    <chr> "868488963", "868488972", "868488976", "8684889…
$ scientificName         <chr> "Amazona amazonica (Linnaeus, 1766)", "Amazona …
$ decimalLatitude        <dbl> 9.72980, 9.71148, 9.70535, 9.69817, 9.68460, 9.…
$ decimalLongitude       <dbl> -66.64898, -66.65822, -66.66168, -66.66322, -66…
$ issues                 <chr> "cdc,cudc", "cdc,cudc", "cdc,cudc", "cdc,cudc",…
$ datasetKey             <chr> "dd282bd6-6e30-4f02-bb36-6bdae0a46090", "dd282b…
$ publishingOrgKey       <chr> "d5778510-eb28-11da-8629-b8a03c50a862", "d57785…
$ installationKey        <chr> "600586fe-f762-11e1-a439-00145eb45e9a", "600586…
$ hostingOrganizationKey <chr> "d5778510-eb28-11da-8629-b8a03c50a862", "d57785…
$ publishingCountry      <chr> "DE", "DE", "DE", "DE", "DE", "DE", "DE", "DE",…
$ protocol               <chr> "DWC_ARCHIVE", "DWC_ARCHIVE", "DWC_ARCHIVE", "D…
$ lastCrawled            <chr> "2025-01-18T00:51:52.275+00:00", "2025-01-18T00…
$ lastParsed             <chr> "2025-02-01T01:32:57.049+00:00", "2025-02-01T01…
$ crawlId                <int> 193, 193, 193, 193, 193, 193, 193, 193, 193, 19…
$ basisOfRecord          <chr> "HUMAN_OBSERVATION", "HUMAN_OBSERVATION", "HUMA…
$ occurrenceStatus       <chr> "PRESENT", "PRESENT", "PRESENT", "PRESENT", "PR…
$ taxonKey               <int> 2479666, 2479666, 2479666, 2479666, 2479666, 24…
$ kingdomKey             <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
$ phylumKey              <int> 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44,…
$ classKey               <int> 212, 212, 212, 212, 212, 212, 212, 212, 212, 21…
$ orderKey               <int> 1445, 1445, 1445, 1445, 1445, 1445, 1445, 1445,…
$ familyKey              <int> 9340, 9340, 9340, 9340, 9340, 9340, 9340, 9340,…
$ genusKey               <int> 2479603, 2479603, 2479603, 2479603, 2479603, 24…
$ speciesKey             <int> 2479666, 2479666, 2479666, 2479666, 2479666, 24…
$ acceptedTaxonKey       <int> 2479666, 2479666, 2479666, 2479666, 2479666, 24…
$ acceptedScientificName <chr> "Amazona amazonica (Linnaeus, 1766)", "Amazona …
$ kingdom                <chr> "Animalia", "Animalia", "Animalia", "Animalia",…
$ phylum                 <chr> "Chordata", "Chordata", "Chordata", "Chordata",…
$ order                  <chr> "Psittaciformes", "Psittaciformes", "Psittacifo…
$ family                 <chr> "Psittacidae", "Psittacidae", "Psittacidae", "P…
$ genus                  <chr> "Amazona", "Amazona", "Amazona", "Amazona", "Am…
$ species                <chr> "Amazona amazonica", "Amazona amazonica", "Amaz…
$ genericName            <chr> "Amazona", "Amazona", "Amazona", "Amazona", "Am…
$ specificEpithet        <chr> "amazonica", "amazonica", "amazonica", "amazoni…
$ taxonRank              <chr> "SPECIES", "SPECIES", "SPECIES", "SPECIES", "SP…
$ taxonomicStatus        <chr> "ACCEPTED", "ACCEPTED", "ACCEPTED", "ACCEPTED",…
$ iucnRedListCategory    <chr> "LC", "LC", "LC", "LC", "LC", "LC", "LC", "LC",…
$ continent              <chr> "SOUTH_AMERICA", "SOUTH_AMERICA", "SOUTH_AMERIC…
$ year                   <int> 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010,…
$ month                  <int> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,…
$ day                    <int> 12, 12, 12, 12, 12, 12, 12, 18, 17, 17, 17, 17,…
$ eventDate              <chr> "2010-03-12T05:56", "2010-03-12T06:16", "2010-0…
$ startDayOfYear         <int> 71, 71, 71, 71, 71, 71, 71, 77, 76, 76, 76, 76,…
$ endDayOfYear           <int> 71, 71, 71, 71, 71, 71, 71, 77, 76, 76, 76, 76,…
$ modified               <chr> "2017-08-08T07:09:13.000+00:00", "2017-08-08T07…
$ lastInterpreted        <chr> "2025-02-01T01:32:57.049+00:00", "2025-02-01T01…
$ license                <chr> "http://creativecommons.org/licenses/by/4.0/leg…
$ organismQuantity       <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
$ organismQuantityType   <chr> "individuals", "individuals", "individuals", "i…
$ isSequenced            <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE…
$ identifier             <chr> "1002_5", "1005_5", "1006_5", "1007_5", "1009_5…
$ facts                  <chr> "none", "none", "none", "none", "none", "none",…
$ relations              <chr> "none", "none", "none", "none", "none", "none",…
$ isInCluster            <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE…
$ datasetID              <chr> "https://doi.org/10.1594/PANGAEA.803430", "http…
$ recordedBy             <chr> "{'id': '43638', 'name': 'Ferrer-Paris, José R'…
$ geodeticDatum          <chr> "WGS84", "WGS84", "WGS84", "WGS84", "WGS84", "W…
$ class                  <chr> "Aves", "Aves", "Aves", "Aves", "Aves", "Aves",…
$ countryCode            <chr> "VE", "VE", "VE", "VE", "VE", "VE", "VE", "VE",…
$ recordedByIDs          <chr> "none", "none", "none", "none", "none", "none",…
$ identifiedByIDs        <chr> "none", "none", "none", "none", "none", "none",…
$ gbifRegion             <chr> "LATIN_AMERICA", "LATIN_AMERICA", "LATIN_AMERIC…
$ country                <chr> "Venezuela (Bolivarian Republic of)", "Venezuel…
$ publishedByGbifRegion  <chr> "EUROPE", "EUROPE", "EUROPE", "EUROPE", "EUROPE…
$ identifier.1           <chr> "1002_5", "1005_5", "1006_5", "1007_5", "1009_5…
$ catalogNumber          <chr> "11948840_1002", "11948840_1005", "11948840_100…
$ institutionCode        <chr> "Pangaea", "Pangaea", "Pangaea", "Pangaea", "Pa…
$ collectionCode         <chr> "doi:10.1594/PANGAEA.803430", "doi:10.1594/PANG…
$ gbifID                 <chr> "868488963", "868488972", "868488976", "8684889…
$ name                   <chr> "Amazona amazonica (Linnaeus, 1766)", "Amazona …

Step 3: Data Analysis and Visualization

Let’s use dplyr and ggplot2 to summarise and visualise the number of records per species.

# Group by species and count occurrences
species_counts <- count_ds |>
  group_by(scientificName) |>
  summarise(count = n(),
    inds=sum(organismQuantity))

# Plot species occurrences
species_plot <- ggplot(species_counts, aes(x = reorder(scientificName, count), y = count)) +
  geom_bar(stat = "identity", fill = "skyblue") +
  coord_flip() +
  labs(title = "Occurrence of Amazona Parrots in Venezuela",
       x = "Species", y = "Occurrences") +
  theme_minimal()

# Display the plot
print(species_plot)

Step 4: Interactive Mapping with Leaflet

Now, let’s visualize our parrot species on a map using leaflet.

First let’s create a palette of colours for all species in our dataset:

species <- unique(count_ds$scientificName)
pal <- got(length(species), option = "Daenerys")
names(pal) <- species

Now we initialise a leaflet map with the gbif basemap:

# Create a leaflet map with different species layers
map <- leaflet() |>
  addTiles(urlTemplate = gbif_map_url)

We add the presence records for each species:

# Split data by species and add to map
for (sN in species) {
  x <- count_ds |>
    filter(scientificName %in% sN) 
  map <- map |> addCircleMarkers( data = x,
                     lng = ~decimalLongitude, lat = ~decimalLatitude,
                     popup = ~paste("Species:", sN),
                     radius = 5,
                     color = pal[[sN]],
                     fillOpacity = 0.85,
                     group = sN)
}

We add the layer control and display the map:

# Add layer controls
map <- map |> addLayersControl(
  overlayGroups = unique(count_ds$scientificName),
  options = layersControlOptions(collapsed = FALSE)
)

# Display the map
map

Conclusion

In summary, we successfully accessed and mapped the occurrences of Amazona parrots in Venezuela using the powerful GBIF dataset. We employed interactive maps with leaflet and enriched our understanding through visualizations, unveiling the beauty and distribution of these incredible parrots.

If you’re involved in conservation, a researcher, or simply a nature lover, these tools provide an excellent way to explore and visualize biodiversity data.

Happy bird watching and mapping! 🦜🌍

References

Ferrer-Paris, José R., and Ada Sánchez-Mercado. 2024. The Neotropical Biodiversity Mapping Initiative: Transforming Venezuelan biodiversity monitoring and capacity building.” Societal Impacts 4: 100088. https://doi.org/https://doi.org/10.1016/j.socimp.2024.100088.
Ferrer-Paris, José R, Ada Sánchez-Mercado, Jon Paul Rodrı́guez, and Gustavo A Rodrı́guez. 2013. Detection histories for eight species of Amazona parrots in Venezuela during the NeoMaps bird surveys in 2010.” Dataset. PANGAEA. https://doi.org/10.1594/PANGAEA.803430.