Monitoring an iNaturalist project

Python
iNaturalist
Altair
Australia
folium
leaflet
Gayini
Author

José R. Ferrer-Paris

Published

February 18, 2025

Modified

April 5, 2025

I need to bring together some of the tools and tricks I have written about in previous posts in order to monitor a project in iNaturalist.

I will show here how to download information from a region in iNaturalist, query the observations in that region and visualise the data in three alternative ways: taxonomically, spatially, and temporally.

Prepare for the action!

I am doing this little task with Python. First, I use import statements to load the modules for this session. I will use the get_observations function and the ICONIC_TAXA constant from pyinaturalist to query and download the data, and I will visualise the data with Altair and Folium. I also use some functions from GeoPandas, pandas and datetime for convenience, for example to read the data into a data frame.

from pyinaturalist import get_observations, ICONIC_TAXA
import altair as alt
import pandas as pd
import geopandas as gpd
import folium
from datetime import datetime

What is the project about?

According to the Project page:

The Nari Nari Tribal Council manages and is actively restoring 80,000 ha of the extensive Gayini wetlands on the Murrumbidgee River. With their consortium partners, they are managing environmental flows, feral animals, cultural burning and grazing of livestock. The area is a key breeding area for waterbird rookeries, including the three ibis species, spoonbills, cormorants, herons and Australian pelicans. It also has extensive areas of lignum, river red gum and blackbox as well as terrestrial ecosystems. Nari Nari are supported by three other consortium partners, The Nature Conservancy, Murray Darling Wetlands Working Group and UNSW’s Centre for Ecosystem Science.

Querying iNaturalist from Python

The first option is to use the pyinaturalist library in Python. It is very useful and can download large amounts of information. There are many handy functions in that library, but I am only going to use one, and extract all the information I need from the downloaded json object.

I need to define a place_id that matches the area of interest. In iNaturalist, the Gayini wetlands are identified like this:

PLACE_ID = 209778
PLACE_NAME = 'Gayini wetlands'

I will use this place_id in the get_observations function to query the iNaturalist API:

observations = get_observations(place_id=PLACE_ID, 
                               per_page=200)

This object has some handy summaries and lots of results:

observations.keys()

dict_keys(['total_results', 'page', 'per_page', 'results'])

Let’s check the total first:

observations['total_results']

378

Not too many observations: this project is just getting started. Gayini is a great place to observe wildlife, but it is remote. Hopefully this code will be re-usable in future years to make comparable visualisations after more people have collected information on animals and plants.

Update: the total number of results is now larger than 200, which means that the code below only uses the 200 most recent observations.
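One way around the 200-record limit is to request the results page by page until the reported total is reached. The sketch below is only an illustration, not the code used for this post: `fetch_all_results` and `fake_fetch` are hypothetical helpers, and the fake function stands in for the real API call so that the loop can be tried offline.

```python
def fetch_all_results(fetch_page, per_page=200, max_pages=50):
    """Collect the 'results' of a paginated query (hypothetical helper).

    fetch_page(page, per_page) must return a dict with the keys
    'total_results' and 'results', like the response shown above.
    """
    results = []
    for page in range(1, max_pages + 1):
        response = fetch_page(page, per_page)
        results.extend(response['results'])
        # Stop when a page comes back empty or everything has been collected
        if not response['results'] or len(results) >= response['total_results']:
            break
    return results


# Offline stand-in for the API: 378 fake observations served in pages
FAKE_OBSERVATIONS = [{'id': i} for i in range(378)]

def fake_fetch(page, per_page):
    start = (page - 1) * per_page
    return {
        'total_results': len(FAKE_OBSERVATIONS),
        'page': page,
        'per_page': per_page,
        'results': FAKE_OBSERVATIONS[start:start + per_page],
    }

all_results = fetch_all_results(fake_fetch, per_page=200)
print(len(all_results))
```

With network access, the same helper could wrap the real query, for example `fetch_all_results(lambda page, per_page: get_observations(place_id=PLACE_ID, page=page, per_page=per_page))`. The pyinaturalist documentation also describes a `page='all'` shortcut, which is worth checking for your installed version.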

Taxonomic visualisation

So, first things first, let’s see how many species and which groups we have here.

iNaturalist groups species into iconic taxa. Since we don’t need to get into the details of taxonomic classification for this project, this will do for this exercise.

Here I prepared a bit of code that goes through the list of iNat’s observations (downloaded as a json object or, in this case, a Python dictionary), keeps only the research quality grade observations, and extracts the species name, iconic taxon, date of the observation and the preferred common name, if present.

records = list()
for obs in observations['results']:
    # Keep only research-grade observations that have a taxon assigned
    if obs['quality_grade'] == 'research':
        if obs['taxon'] is not None:
            record = {
                'rank': obs['taxon']['rank'],
                'species_name': obs['taxon']['name'],
                'iconic_taxon': obs['taxon']['iconic_taxon_name'],
                # observed_on is expected to be a datetime here; keep only the date part
                'observed_on': obs['observed_on'].date(),
            }
            # The common name is optional, so only add it when present
            if 'preferred_common_name' in obs['taxon'].keys():
                record['common_name'] = obs['taxon']['preferred_common_name']
            records.append(record)

How does this look now? Let’s use the pandas data frame function to have a look:

df = pd.DataFrame(records)
df.head()

   rank     species_name           iconic_taxon  observed_on  common_name
0  species  Haliastur sphenurus    Aves          2025-03-31   Whistling Kite
1  species  Junonia villida        Insecta       2025-03-31   Meadow Argus
2  species  Ardea alba             Aves          2025-03-31   Great Egret
3  species  Macropus giganteus     Mammalia      2025-03-31   Eastern Grey Kangaroo
4  species  Eolophus roseicapilla  Aves          2025-03-31   Galah

There are not too many observations in this data frame: we started with a small-ish set of observations and kept only the research-grade ones, so we have few species.

For example, we can group observations by species and count the total number of records.

colnames=['iconic_taxon','species_name','common_name','rank',]
df.groupby(colnames)['species_name'].agg([ 'count'])

iconic_taxon  species_name                   common_name                 rank     count
Amphibia      Limnodynastes tasmaniensis     Spotted Marsh Frog          species  1
Animalia      Cherax destructor              Common Yabby                species  1
Aves          Ardea alba                     Great Egret                 species  1
              Eolophus roseicapilla          Galah                       species  2
              Haliastur sphenurus            Whistling Kite              species  1
              Manorina melanocephala         Noisy Miner                 species  1
              Pelecanus conspicillatus       Australian Pelican          species  2
Insecta       Agonoscelis rutila             Horehound Bug               species  1
              Apis mellifera                 Western Honey Bee           species  2
              Camponotus consobrinus         Banded Sugar Ant            species  1
              Chauliognathus tricolor        Tricolor Soldier Beetle     species  1
              Junonia villida                Meadow Argus                species  2
              Monistria pustulifera          Blistered Pyrgomorph        species  1
              Scopula rubraria               Plantain moth               species  1
Mammalia      Macropus fuliginosus           Western Grey Kangaroo       species  1
              Macropus giganteus             Eastern Grey Kangaroo       species  1
              Osphranter rufus               Red Kangaroo                species  1
Mollusca      Bullastra lessoni              Southern Bubble Pond Snail  species  1
              Physella acuta                 Acute Bladder Snail         species  1
Plantae       Asphodelus fistulosus          Onion-Leafed Asphodel       species  1
              Atriplex nummularia            Old Man Saltbush            species  1
              Azolla rubra                   Red Azolla                  species  1
              Cirsium vulgare                Bull Thistle                species  2
              Citrullus amarus               Fodder Melon                species  1
              Cucumis myriocarpus            paddy melon                 species  1
              Eragrostis cilianensis         stinkgrass                  species  1
              Lachnagrostis filiformis       blown grass                 species  1
              Lactuca serriola               prickly lettuce             species  1
              Lobelia concolor               Poison Pratia               species  1
              Maireana brevifolia            Short-leaf Bluebush         species  1
              Marrubium vulgare              White Horehound             species  1
              Marsilea drummondii            Common Nardoo               species  2
              Mesembryanthemum granulicaule  Wiry Ashbush                species  2
              Nicotiana glauca               tree tobacco                species  1
              Nitraria billardierei          nitre bush                  species  1
              Pseudognaphalium luteoalbum    Jersey Cudweed              species  1
              Sonchus oleraceus              Common Sow-thistle          species  1
              Stemodia florulenta            Bluerod                     species  2
              Teucrium racemosum             Grey Germander              species  1
              Xanthium spinosum              spiny cocklebur             species  2
Reptilia      Varanus gouldii                Sand Goanna                 species  1

Now we will summarise this by the iconic taxa:

iconic_df = df.groupby('iconic_taxon')['species_name'].agg([ 'count', 'nunique']).reset_index()
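To make the two summary columns concrete: `count` is the number of records in each group, while `nunique` is the number of distinct species names. The plain-Python sketch below reproduces that logic on a handful of made-up rows; `summarise_by_group` and `toy_records` are hypothetical and only meant for illustration, they are not part of pandas or of the analysis above.

```python
from collections import defaultdict

def summarise_by_group(records, group_key='iconic_taxon', value_key='species_name'):
    """Return {group: {'count': n_records, 'nunique': n_distinct_values}}."""
    values = defaultdict(list)
    for record in records:
        values[record[group_key]].append(record[value_key])
    return {
        group: {'count': len(names), 'nunique': len(set(names))}
        for group, names in values.items()
    }

# Invented rows: two records of the same bird, one of another, two of one insect
toy_records = [
    {'iconic_taxon': 'Aves', 'species_name': 'Eolophus roseicapilla'},
    {'iconic_taxon': 'Aves', 'species_name': 'Eolophus roseicapilla'},
    {'iconic_taxon': 'Aves', 'species_name': 'Haliastur sphenurus'},
    {'iconic_taxon': 'Insecta', 'species_name': 'Junonia villida'},
    {'iconic_taxon': 'Insecta', 'species_name': 'Junonia villida'},
]
summary = summarise_by_group(toy_records)
print(summary)
```

In this toy example, Aves has three observations but only two species, which is the difference the two bar charts further down are meant to show.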

We can now use these lines of code to add a URL with an icon for each iconic taxon:

TAXON_IMAGE_URL = 'https://raw.githubusercontent.com/inaturalist/inaturalist/main/app/assets/images/iconic_taxa/{taxon}-75px.png'

iconic_df['img']=iconic_df.iconic_taxon.apply(lambda x: TAXON_IMAGE_URL.format(taxon=x.lower()))

And we will prepare some layers of visualisation with Altair, first a barchart of number of observations:

bar1 = alt.Chart(
    iconic_df
    ).mark_bar(
        color='grey', opacity=0.15
        ).encode(
            x=alt.X('iconic_taxon:N', sort='-y'), 
            y='count:Q'
            )
bar1

Then a barchart of number of species:

bar2 = alt.Chart(
    iconic_df
    ).mark_bar(
        color='blue', opacity=0.15
        ).encode(
            x=alt.X('iconic_taxon:N', sort='-y'), 
            y='nunique:Q'
            )
bar2

And we can use the icons as the icing on the cake:

img = alt.Chart(
    iconic_df,
    title=f'Research grade observations in {PLACE_NAME} by iconic taxon',
    width=750,
    height=500,
).mark_image(
    baseline='top'
    ).encode(
        x=alt.X('iconic_taxon:N', sort='-y', title='Iconic taxon'), 
        y=alt.Y('count:Q', title='Number of species/observations'), url='img')
bar1 + bar2 + img

Temporal visualisation

For the temporal component, we are only going to look at two very simple barcharts.

First we will extract the year and month from the observed_on date column in the data frame:

df['Year']=df.observed_on.apply(lambda x: x.year)
df['Month']=df.observed_on.apply(lambda x: x.month)
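An alternative to `apply` is the pandas `.dt` accessor, which becomes available once the column has been converted with `pd.to_datetime`. The snippet below is a small sketch on a made-up data frame (`toy_df` and its dates are invented for illustration).

```python
import pandas as pd
from datetime import date

# Made-up records with plain date objects, like the observed_on column
toy_df = pd.DataFrame({
    'species_name': ['Ardea alba', 'Junonia villida', 'Varanus gouldii'],
    'observed_on': [date(2025, 3, 31), date(2024, 12, 5), date(2023, 1, 17)],
})

# Convert the date objects to pandas datetimes, then use the .dt accessor
observed = pd.to_datetime(toy_df['observed_on'])
toy_df['Year'] = observed.dt.year
toy_df['Month'] = observed.dt.month
print(toy_df[['Year', 'Month']])
```

Both approaches give the same columns; the vectorised version mostly pays off on larger data frames.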

We can now group the observations by year and count the number of observations and species per year.

observations_by_year = df.groupby('Year')['species_name'].agg([ 'count', 'nunique']).reset_index()

Update: the plots now look biased because they are based on only the 200 most recent observations.

With this summary of the grouped data, we build a barchart similar to the examples above:

alt.Chart(
    observations_by_year
).mark_bar().encode(
    x=alt.X('Year:N'), 
    y=alt.Y('count:Q', title='Number of observations'))

So we can see that almost all observations come from the last five years.

Now we can do the same for the month of the year, taking all years together:

observations_by_month = df.groupby('Month')['species_name'].agg([ 'count', 'nunique']).reset_index()
alt.Chart(observations_by_month).mark_bar().encode(x=alt.X('Month:N'), y='count:Q')

We then see how most observations are from the summer months, and none from winter months.
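To check a statement like this against the numbers, the monthly counts can be added up by austral season. The sketch below assumes the meteorological convention (summer from December to February, winter from June to August); `observations_by_season` is a hypothetical helper and the example counts are invented.

```python
# Meteorological seasons for the southern hemisphere (an assumption)
AUSTRAL_SEASONS = {
    12: 'summer', 1: 'summer', 2: 'summer',
    3: 'autumn', 4: 'autumn', 5: 'autumn',
    6: 'winter', 7: 'winter', 8: 'winter',
    9: 'spring', 10: 'spring', 11: 'spring',
}

def observations_by_season(month_counts):
    """Add up monthly counts ({month number: count}) into seasonal totals."""
    totals = {season: 0 for season in ('summer', 'autumn', 'winter', 'spring')}
    for month, count in month_counts.items():
        totals[AUSTRAL_SEASONS[month]] += count
    return totals

# Invented monthly counts, not the real Gayini numbers
example_counts = {1: 10, 2: 7, 3: 12, 10: 4, 12: 5}
season_totals = observations_by_season(example_counts)
print(season_totals)
```

With the data frame from this post, the input could be built with `dict(zip(observations_by_month['Month'], observations_by_month['count']))`.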

Spatial visualisation

Now the nice part that we always like to see in this blog, the MAP!

For this, we have to do quite a lot of preparation:

  • create a map canvas for leaflet
  • get a base layer for the map
  • get a polygon of the area of interest
  • add the inat observations
  • and enjoy the map!

Folium is Python’s leaflet

Like many artists, we start our work with an empty canvas. In Python we use folium to create a leaflet widget for our website. We just need an initial location and zoom level.

m = folium.Map(location=[-34.65, 143.583333],tiles = None, zoom_start=9)

We are not going to show this yet. Let’s keep adding layers to this.

My dear base layer

As mentioned in a previous post, the base layer comes from the NSW Spatial Services. The available services are listed here: https://maps.six.nsw.gov.au/arcgis/rest/services/public.

We will use here the NSW Base Map:

NSW_basemap_url = "http://maps.six.nsw.gov.au/arcgis/rest/services/public/NSW_Base_Map/MapServer/WMTS"

nsw_basemap = NSW_basemap_url + "?Service=WMTS" + "&Request=GetTile" + "&Version=1.0.0" + "&Style=default" + "&tilematrixset=default028mm" + "&Format=image/png" +  "&layer=public_NSW_Base_Map" + "&TileMatrix={z}" + "&TileRow={y}" + "&TileCol={x}"
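The same template can also be assembled from a dictionary of parameters, which makes it easier to swap the layer or the image format later. This is only a sketch of an alternative way to build the string: `build_wmts_template` is a hypothetical helper, and the parameter values are the ones used above.

```python
def build_wmts_template(base_url, params):
    """Join WMTS key-value parameters into a tile URL template.

    The {z}, {y} and {x} placeholders are left untouched so that the
    mapping library can fill them in for each tile.
    """
    query = "&".join(f"{key}={value}" for key, value in params.items())
    return f"{base_url}?{query}"

base_url = "http://maps.six.nsw.gov.au/arcgis/rest/services/public/NSW_Base_Map/MapServer/WMTS"
wmts_params = {
    "Service": "WMTS",
    "Request": "GetTile",
    "Version": "1.0.0",
    "Style": "default",
    "tilematrixset": "default028mm",
    "Format": "image/png",
    "layer": "public_NSW_Base_Map",
    "TileMatrix": "{z}",
    "TileRow": "{y}",
    "TileCol": "{x}",
}
template = build_wmts_template(base_url, wmts_params)
print(template)
```

The resulting string is identical to the concatenated version, so it can be passed to `folium.TileLayer` in the same way.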

Let’s not forget the right attribution to the data:

attrib_string = " © State of New South Wales, Department of Customer Service, Spatial Services"

And we can add this base layer to our map with this:

folium.TileLayer(tiles=nsw_basemap, attr=attrib_string, name='NSW base map').add_to(m)

<folium.raster_layers.TileLayer object at 0x1403ffef0>

I know, you probably want to have a peek at how this is looking so far, but let’s wait a bit more, we still need to add the boundary polygon and the observations.

Get the polygon!

Luckily, iNaturalist provides an easy way to retrieve spatial information using places that have been contributed by the community. The only trick is knowing the place_id beforehand.

In my case, I know this information already, and will use it to build the path to a KML file with the boundaries of the region of interest:

path = f'https://www.inaturalist.org/places/geometry/{PLACE_ID}.kml'
gayini_polygon = gpd.read_file(path)

And voilà, we have our polygon. How did I know how to do this? The iNatForum is a great place to get answers!

Now we will transform this polygon into a geojson object, and use folium’s GeoJson method to prepare the layer for our map, complete with a pop up message:

gayini_geojson = gpd.GeoSeries(gayini_polygon["geometry"]).to_json()
geo_j = folium.GeoJson(data=gayini_geojson, style_function=lambda x: {"fillColor": "orange"})
folium.Popup('Gayini wetlands').add_to(geo_j)

<folium.map.Popup object at 0x1403d3e60>

Next, we create a feature group for our map:

pol = folium.FeatureGroup(name="Boundaries", control=True).add_to(m)

And add the GeoJson layer to it:

geo_j.add_to(pol)

<folium.features.GeoJson object at 0x121420440>

We are almost there, one more step.

A marker for each iNat obs

Now we can add the iNat observations. First let’s prepare another feature group for our map.

fg = folium.FeatureGroup(name="iNaturalist observations", control=True).add_to(m)

We will also need a template for the pop-up of each marker, for example, something like this:

popup_text = """<img src='{url}'>
<caption><i>{species}</i> observed on {observed_on} / {attribution}</caption> {desc}
   """

Next we use these lines of code to run through all the observations queried from iNaturalist in json format, flag the research quality grade observations (green pins, with grey pins for all others), and prepare our folium markers (complete with their pop-up), one for each observation.

for obs in observations['results']:
    # Green pins for research-grade observations, grey pins for the rest
    if obs['quality_grade'] == 'research':
        if obs['description'] is None:
            desc = ""
        else:
            desc = obs['description']
        pincolor = 'green'
    else:
        desc = "Observation is not research quality grade."
        pincolor = 'gray'
    # Use the first photo if there is one, otherwise a placeholder image
    if len(obs['observation_photos']) > 0:
        photo = obs['observation_photos'][0]['photo']
        photourl = photo['url']
        photoattr = photo['attribution']
    else:
        photourl = "https://static.inaturalist.org/wiki_page_attachments/3154-original.png"
        # obs['user'] is a dictionary; show the login name rather than the whole record
        photoattr = obs['user']['login']
    fg.add_child(
        folium.Marker(
            location=obs['location'],
            popup=popup_text.format(
                species=obs['species_guess'],
                observed_on=obs['observed_on'],
                desc=desc,
                url=photourl,
                attribution=photoattr),
            icon=folium.Icon(color=pincolor),
        )
    )
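Observations without coordinates may cause trouble when the marker is rendered, and free text written by users is safer when escaped before it goes into HTML. The sketch below separates that preparation step from the map itself. `marker_spec` is a hypothetical helper, the toy observations are invented, and the field names are assumed to match those used in the loop above.

```python
from html import escape

def marker_spec(obs):
    """Return the ingredients of a marker, or None if there is no location."""
    if not obs.get('location'):
        return None
    is_research = obs.get('quality_grade') == 'research'
    if is_research:
        desc = obs.get('description') or ""
    else:
        desc = "Observation is not research quality grade."
    return {
        'location': obs['location'],
        'color': 'green' if is_research else 'gray',
        # Escape user-provided text before placing it in the pop-up HTML
        'species': escape(obs.get('species_guess') or "Unknown"),
        'desc': escape(desc),
    }

# Invented observations covering the three cases handled above
toy_observations = [
    {'location': [-34.6, 143.6], 'quality_grade': 'research',
     'species_guess': 'Galah', 'description': None},
    {'location': None, 'quality_grade': 'needs_id',
     'species_guess': 'Frog', 'description': 'Heard <calling> at dusk'},
    {'location': [-34.7, 143.5], 'quality_grade': 'needs_id',
     'species_guess': None, 'description': 'Seen & photographed'},
]
specs = [spec for spec in map(marker_spec, toy_observations) if spec is not None]
print(specs)
```

Each remaining dictionary holds what is needed to build one `folium.Marker`, and the observation without coordinates is skipped instead of stopping the loop.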

Enjoy the map

Now we are almost ready. Let’s just fix the bounds of the map to snugly fit all our layers:

m.fit_bounds(m.get_bounds())
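`m.get_bounds()` works from the layers that have already been added to the map. If only the markers should drive the zoom, the bounding box can also be computed directly from the observation coordinates. This is a minimal sketch, assuming each location is a `[latitude, longitude]` pair as in the marker loop; `bounds_from_locations` is a hypothetical helper and the coordinates are invented.

```python
def bounds_from_locations(locations):
    """Return [[south, west], [north, east]] for a list of [lat, lon] pairs."""
    # Skip observations without coordinates
    valid = [loc for loc in locations if loc]
    if not valid:
        raise ValueError("no locations to compute bounds from")
    lats = [lat for lat, lon in valid]
    lons = [lon for lat, lon in valid]
    return [[min(lats), min(lons)], [max(lats), max(lons)]]

# Invented coordinates near the initial map location
toy_locations = [[-34.60, 143.60], None, [-34.75, 143.45], [-34.55, 143.70]]
bounds = bounds_from_locations(toy_locations)
print(bounds)
```

The result has the south-west and north-east corner layout that `m.fit_bounds(bounds)` expects.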

Now add the layer controls to show/hide the layers of information:

folium.LayerControl().add_to(m)

<folium.map.LayerControl object at 0x140696630>

And enjoy!

m


Conclusion

Here we used Python, pyinaturalist, Altair and folium to explore biodiversity records in one region that is part of an iNaturalist community project. We also used freely available data for the NSW base maps. Thanks to iNaturalist and NSW Spatial Services for providing wonderful tools to access their data!

Here is the basic recipe:

  • Find the place id for the iNaturalist region of interest,
  • Query the iNaturalist API,
  • Loop through the data to filter and select the variables of interest
  • Explore the taxonomic and temporal dimensions of the data with Altair
  • Mix polygons, basemaps and iNat observations location data into a dynamic map
  • Done!

That’s it for now. Will come back to this in a few years to see the progress of this project.