---
title: "UNSW Kensington campus biodiversity project"
subtitle: "Reproducible workflow using Python"
author:
  - name:
      given: "José R."
      family: "Ferrer-Paris"
    email: j.ferrer@unsw.edu.au
    orcid: 0000-0002-9554-3395
    corresponding: true
    affiliations:
      - id: ces
        name: Centre for Ecosystem Science, University of New South Wales
        city: Sydney
        country: Australia
      - id: udash
        name: UNSW Data Science Hub, University of New South Wales
        city: Sydney
        country: Australia
date: 25 Apr 2026
date-modified: last-modified
categories: [Python, Plotly, Folium, Geopandas, NSW]
citation:
  url: https://jrfep.quarto.pub/natural-code/projects
engine: jupyter
format:
  html:
    code-fold: true
    code-summary: "Show the code"
    code-tools: true
    toc: true
from: markdown
editor_options:
  chunk_output_type: console
image: "https://inaturalist-open-data.s3.amazonaws.com/photos/644927954/medium.jpg"
---
This document summarises observations from the **UNSW Kensington campus biodiversity** iNaturalist project.
> [About the project](https://www.inaturalist.org/projects/unsw-kensington-campus-biodiversity?tab=about): How much biodiversity actually occurs on campus? Which native species do we walk past every day without really noticing? Join us in developing an understanding of the urban biodiversity maintained on campus. Project created by [Mark Ooi](https://www.inaturalist.org/people/markooi) and maintained by [Natalie A](https://www.inaturalist.org/people/natalie1227)
Check out [the project page](https://www.inaturalist.org/projects/unsw-kensington-campus-biodiversity) at iNaturalist.
This document brings together data from online resources and is meant to be completely reproducible.
I will show here how to download information from this `project` in iNaturalist, query the observations and visualise the data in three complementary ways: taxonomically, spatially, and temporally.
## Overview
This document provides a reproducible overview of biodiversity observations recorded on the UNSW Kensington campus via iNaturalist. Using the official project as a data source, it demonstrates how to:
- Retrieve and update project observations programmatically,
- Enrich records with taxonomic metadata,
- Explore contributions across users,
- Visualise biodiversity taxonomically, temporally, and spatially.
While the analysis is specific to this campus project, the workflow is generic and can be reused for other urban biodiversity projects or bioblitzes.
:::{.callout-note}
### Methods overview
- Data source: iNaturalist project observations
- Access method: iNaturalist public API via pyinaturalist
- Spatial reference: WGS84 geographic coordinates
- Visualisation libraries: Plotly (taxonomic, temporal), Folium (spatial)
:::
## Reproducible workflow with Python
For this document I use functions from [PyiNaturalist](https://pyinaturalist.readthedocs.io/en/stable/) to query and download the data; [pandas](https://pandas.pydata.org/) to hold the data in data frames; [plotly](https://plotly.com/python/) and [Folium](https://python-visualization.github.io/folium/latest/index.html) for data visualisation; _itables_ for interactive tables; and selected functions from the standard-library modules _urllib_, _json_, _itertools_ and _datetime_.
```{python}
#| code-summary: "Load modules in python"
from pyinaturalist import (
    get_observations,
    get_projects_by_id,
    get_taxa_by_id,
    pprint,
)
from itertools import compress, islice
import plotly.express as px
import folium
import pandas as pd
from datetime import datetime
import urllib.parse, urllib.request, json
from itables import init_notebook_mode
init_notebook_mode(all_interactive=True)
```
I will also use a custom function to create a treemap of the taxonomic information of all records.
```{python}
#| eval: true
#| code-summary: "Define function"
def group_and_plot_data(x, aggfuncs, groupcols):
    # Aggregate the records by the nested grouping columns
    gd = x.groupby(groupcols).agg(aggfuncs).reset_index()
    # Flatten the two-level column labels, e.g. ('uuid', 'count') -> 'uuid count'
    gd.columns = [' '.join(col).strip() for col in gd.columns.values]
    # The aggregated value ends up in the last column
    value_col = gd.columns.values[-1]
    fig = px.treemap(gd,
        path=[px.Constant("UNSW Campus Bioblitz 2026"),] + groupcols,
        values=value_col,
        color=value_col,
        hover_data=[value_col],
        color_continuous_scale='RdBu')
    fig.update_layout(margin = dict(t=5, l=5, r=5, b=5))
    return fig
```
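The column-flattening step inside `group_and_plot_data` can be illustrated on its own. This is a minimal sketch using plain tuples in place of the two-level column labels that pandas produces after `groupby(...).agg({'uuid': ['count']}).reset_index()`; the `flatten_columns` helper and the example labels are hypothetical and only serve to show why the value column is the last one.

```python
# Hypothetical column labels, shaped like the two-level labels that pandas
# produces after aggregating 'uuid' with ['count'] and resetting the index.
multi_columns = [("kingdom", ""), ("phylum", ""), ("uuid", "count")]


def flatten_columns(columns):
    """Join each (name, aggregation) pair into a single label."""
    return [" ".join(col).strip() for col in columns]


flat = flatten_columns(multi_columns)
value_col = flat[-1]  # the aggregated column is the last one
print(flat)
print(value_col)
```

Grouping columns have an empty second level, so `strip()` leaves their names unchanged, while the aggregated column becomes `uuid count`.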
## Data access and download
For this workflow we load observation records and their coordinates from [iNaturalist](https://www.inaturalist.org/home), and map layers from New South Wales Spatial Services.
### iNaturalist observations for the project
The `pyinaturalist` library in Python provides convenient access to the iNaturalist API. We need the `PROJECT_ID` to query the API with the function `get_observations`.
```{python}
#| code-summary: "Get project information from iNat"
#| eval: true
projects = get_projects_by_id([285699, 281267])
pprint(projects)
```
Note: two project IDs are queried here for comparison and testing; only the UNSW Kensington Campus project is used in the analysis below.
```{python}
#| code-summary: "Get number of observations from iNat"
#| eval: true
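# Use the second project in the results; this assumes it is the
# UNSW Kensington campus project
PROJECT_ID = projects['results'][1]['id']
PROJECT_NAME = projects['results'][1]['title']
# per_page=0 returns the total count without downloading any records
observations = get_observations(project_id=PROJECT_ID,
                                per_page=0)
n_obs = observations['total_results']
print("Project _{}_ has {} observations in iNaturalist".format(PROJECT_ID,n_obs))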
```
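Picking the project by its position in the results works only while the response order stays the same. A more defensive option is to look the project up by its ID. The sketch below uses a hypothetical `select_project` helper and made-up IDs and titles; it assumes only that each entry in `results` is a dictionary with `id` and `title` keys, as used in the code above.

```python
def select_project(results, project_id):
    """Return the project record whose 'id' matches project_id."""
    for project in results:
        if project.get("id") == project_id:
            return project
    raise KeyError(f"project {project_id} not found in results")


# Hypothetical response fragment; real records carry many more fields.
example_results = [
    {"id": 111, "title": "Some other project"},
    {"id": 222, "title": "Campus biodiversity"},
]

project = select_project(example_results, 222)
print(project["id"], project["title"])
```

With this pattern the choice of project no longer depends on the order in which the API returns the two projects.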
The following snippet pages through the project's observations (returned by the API as JSON and exposed by pyinaturalist as Python dictionaries) and extracts, for each record, the species name, taxonomic information, the date of the observation and, if present, the preferred common name. Coordinates are stored in iNaturalist’s preferred latitude–longitude order and passed directly to Folium for mapping.
```{python}
#| code-summary: "Download all observations"
#| eval: true
records=list()
taxa=list()
j=1
requested=0
# Request pages of 200 observations until all n_obs have been covered
while requested < n_obs:
    print("Requesting observations from project _{}_: page {}, total of {} observations downloaded".format(PROJECT_NAME,j,min(j*200,n_obs)))
    observations = get_observations(project_id=PROJECT_ID,per_page=200,page=j)
    requested = j*200
    j=j+1
    for obs in observations['results']:
        # Skip observations that have no taxon assigned yet
        if obs['taxon'] is not None:
            # Keep every taxon id linked to the identifications of this observation
            for tid in obs['ident_taxon_ids']:
                taxon_record = {
                    'uuid': obs['uuid'],
                    'taxon id': tid
                }
                taxa.append(taxon_record)
            record = {
                'uuid': obs['uuid'],'quality_grade': obs['quality_grade'],
                'rank': obs['taxon']['rank'],
                'species_name': obs['taxon']['name'],
                'observed_on': datetime.date(obs['observed_on']),
                'location': obs['location'],
                'date': obs['observed_on_details']['date'],
                'user': obs['user']['login'],
                'user_name': obs['user']['name']
            }
            # Optional fields are added only when present
            if 'iconic_taxon_name' in obs['taxon'].keys():
                record['iconic_taxon']= obs['taxon']['iconic_taxon_name']
            if 'preferred_common_name' in obs['taxon'].keys():
                record['common_name']= obs['taxon']['preferred_common_name']
            # Use the first photo, or a placeholder image if there is none
            if len(obs['observation_photos'])>0:
                record['photourl'] = obs['observation_photos'][0]['photo']['url']
                record['photoattrb'] = obs['observation_photos'][0]['photo']['attribution']
            else:
                record['photourl'] = "https://upload.wikimedia.org/wikipedia/commons/d/d9/Icon-round-Question_mark.svg"
                record['photoattrb'] = "no image"
            records.append(record)
inat_obs = pd.DataFrame(records).sort_values('observed_on')
taxa = pd.DataFrame(taxa)
```
Note that iNaturalist API usage is subject to rate limits; for very large projects, pagination and batching strategies may need adjustment.
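The download loop above keeps requesting pages of 200 records until the total is covered. The same page count can be worked out in advance, which may help when planning batches for a larger project. This is a minimal sketch; `page_numbers` is a hypothetical helper, and the page size of 200 is taken from the loop above.

```python
import math


def page_numbers(n_obs, per_page=200):
    """Return the 1-based page numbers needed to cover n_obs records."""
    n_pages = math.ceil(n_obs / per_page)
    return list(range(1, n_pages + 1))


print(page_numbers(450))  # 450 records need pages of 200 + 200 + 50
print(page_numbers(0))    # an empty project needs no requests
```

Knowing the list of pages up front makes it straightforward to split a long download into smaller batches or to resume from a given page.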
## Observations per user
Citizen science projects rely on uneven but often highly committed participation. Here we summarise the number of observations contributed by each participant, separated by iNaturalist quality grade.
This helps identify:
- Highly active contributors,
- The proportion of research‑grade records,
- Opportunities for targeted outreach or training.
```{python}
#| code-summary: "crosstabulate users and quality grade"
#| eval: true
pd.crosstab([inat_obs.user,inat_obs.user_name,],inat_obs.quality_grade)
```
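For readers less familiar with `pd.crosstab`, the counting it performs can be sketched with the standard library alone. The records below are hypothetical and use the same `user` and `quality_grade` keys as the `records` list built earlier; the user names are made up.

```python
from collections import Counter

# Hypothetical records with the same keys as the `records` list above.
example_records = [
    {"user": "ana", "quality_grade": "research"},
    {"user": "ana", "quality_grade": "needs_id"},
    {"user": "ben", "quality_grade": "research"},
    {"user": "ana", "quality_grade": "research"},
]

# Count observations for every (user, quality grade) combination
counts = Counter((r["user"], r["quality_grade"]) for r in example_records)

for (user, grade), n in sorted(counts.items()):
    print(user, grade, n)
```

`pd.crosstab` arranges the same counts as a table, with one row per user and one column per quality grade.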
As expected for a campus‑scale project, a small number of contributors account for a large fraction of records, while many users contribute occasional observations.
## Taxonomic information
Raw iNaturalist observations often include identifications at different taxonomic depths. To allow consistent summaries, we retrieve the full taxonomic context for each identification and attach it to the observation table.
This allows us to:
- Count records at any taxonomic rank,
- Visualise the structure of campus biodiversity,
- Preserve uncertainty where species‑level identifications are not yet available.
```{python}
#| code-summary: "Add taxonomic information to data frame"
#| eval: true
all_taxa=list(set(taxa['taxon id']))
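def chunk(it, size):
    # Return an iterator over successive tuples of at most `size` items
    it = iter(it)
    return iter(lambda: tuple(islice(it, size)), ())
# Query the taxa in batches of 30 ids
for slc in chunk(all_taxa,30):
    taxa_query = get_taxa_by_id(slc, rank_level=[70,60,50,40,30,20,10])
    for res in taxa_query['results']:
        # Find the observations linked to this taxon and store its name
        # in a column named after its rank (kingdom, phylum, ...)
        qry = taxa.loc[taxa['taxon id'] == res['id'],'uuid']
        inat_obs.loc[inat_obs.uuid.isin(qry), res['rank']] = res['name']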
```
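The `chunk` helper used above splits the taxon ids into batches so that each API request stays small. Its behaviour is easiest to see on a toy input; the sketch below repeats the same function and applies it to a short range of numbers instead of real taxon ids.

```python
from itertools import islice


def chunk(it, size):
    """Return an iterator over successive tuples of at most `size` items."""
    it = iter(it)
    # iter(callable, sentinel) calls the lambda until it returns an empty tuple
    return iter(lambda: tuple(islice(it, size)), ())


batches = list(chunk(range(7), 3))
print(batches)  # [(0, 1, 2), (3, 4, 5), (6,)]
```

The last batch is simply shorter when the number of items is not a multiple of the batch size, so no ids are dropped.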
```{python}
#| eval: true
#| code-summary: "Summarise observations by nested taxonomic levels"
agg_funcs = {'uuid':['count']}
group_columns = ['kingdom','phylum','class','order','family']
fig1 = group_and_plot_data(inat_obs, agg_funcs, group_columns)
fig1.show()
```
The treemap highlights the taxonomic breadth of the campus, showing how observations are distributed across major lineages. Uneven block sizes reflect the sampling intensity for different groups. Given the high participation of Biological, Earth and Environmental Science students, we would expect a broad representation of taxonomic groups, including some that are often overlooked or underrepresented in other bioblitzes.
## Timeline of observations
To understand how the project has grown over time, we track cumulative counts of observations, species, and users.
```{python}
#| eval: true
#| code-summary: "Plot of cumulative number of records per time"
inat_obs['Observations'] = (~inat_obs['uuid'].duplicated()).cumsum()
inat_obs['Species recorded'] = (~inat_obs['species_name'].duplicated()).cumsum()
inat_obs['Participants'] = (~inat_obs['user'].duplicated()).cumsum()
timeline = (
    inat_obs
    .groupby('date')
    .agg({
        'Observations': 'max',
        'Species recorded': 'max',
        'Participants': 'max'
    })
    .reset_index()
)
```
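The cumulative counts above rely on the idiom `(~column.duplicated()).cumsum()`, which flags the first occurrence of each value and keeps a running total of those flags. The logic can be sketched without pandas; `cumulative_unique` is a hypothetical helper and the species list is made up. The sketch assumes the values are already in chronological order, as they are in `inat_obs` after sorting by `observed_on`.

```python
def cumulative_unique(values):
    """Running count of distinct values seen so far, in input order."""
    seen = set()
    totals = []
    for value in values:
        seen.add(value)          # repeated values leave the set unchanged
        totals.append(len(seen))
    return totals


# Hypothetical sequence of species names, ordered by observation date
species = ["Banksia", "Ibis", "Banksia", "Skink"]
print(cumulative_unique(species))  # [1, 2, 2, 3]
```

The count only increases when a new species (or user) appears, which is why the species curve flattens while the observation curve keeps rising.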
Cumulative curves make it easy to detect periods of intense activity (e.g. bioblitz events) and to compare rates of species discovery versus sampling effort. We use an interactive line graph to show these changes. The Plotly graph lets readers zoom and pan, and hide or show each variable.
```{python}
fig = px.line(
    timeline,
    x='date',
    y=[
        'Observations',
        'Species recorded',
        'Participants'
    ],
    labels={
        'date': 'Observation date',
        'value': 'Cumulative count',
        'variable': 'Metric'
    },
    title='Growth of the UNSW Kensington Campus Biodiversity Project'
)
fig.update_traces(line=dict(width=3))
fig.update_layout(
    legend_title_text='What is being counted',
    legend=dict(
        orientation='h',
        yanchor='bottom',
        y=1.02,
        xanchor='right',
        x=1
    ),
    hovermode='x unified',
    margin=dict(t=60, l=10, r=10, b=10)
)
fig.show()
```
The cumulative curves show how sampling effort (observations and participants) translates into biodiversity discovery over time. Periods of rapid increase typically correspond to organised events or teaching activities, while plateaus may indicate reduced sampling effort.
## Map of iNat observations
For the spatial visualisation of the data, we first select a base layer from [NSW Spatial Services](https://www.spatial.nsw.gov.au) as a WMTS layer.
```{python}
#| code-summary: "Information for creating a WebMapTileService request"
#| eval: true
NSW_basemap_url = "http://maps.six.nsw.gov.au/arcgis/rest/services/public/NSW_Base_Map/MapServer/WMTS?"
nsw_base_layer = 'public_NSW_Base_Map'
params = {
    'Service': 'WMTS',
    'Request': 'GetTile',
    'Version': '1.0.0',
    'Style': 'default',
    'tilematrixset': 'default028mm',
    'Format': 'image/png',
    'layer': nsw_base_layer,
    'TileMatrix': '{z}',
    'TileRow': '{y}',
    'TileCol': '{x}'
}
NSW_basemap_params = urllib.parse.urlencode(params, safe='{}')
NSW_basemap=NSW_basemap_url + NSW_basemap_params
nsw_base_attrib = u" © State of New South Wales, Department of Customer Service, Spatial Services"
```
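The `safe='{}'` argument in the call to `urllib.parse.urlencode` above matters because the map library needs the literal `{z}`, `{x}` and `{y}` placeholders in the tile URL. The sketch below shows the difference on a shortened parameter set; the endpoint `https://example.org/wmts?` is a placeholder, not the NSW service, and the call to `str.format` only mimics how the placeholders are filled in for each tile.

```python
import urllib.parse

base_url = "https://example.org/wmts?"  # hypothetical endpoint for illustration
params = {
    "Format": "image/png",
    "TileMatrix": "{z}",
    "TileRow": "{y}",
    "TileCol": "{x}",
}

# With safe='{}' the curly braces survive URL encoding
template = base_url + urllib.parse.urlencode(params, safe="{}")
# Without it the braces are percent-encoded and the placeholders are lost
escaped = base_url + urllib.parse.urlencode(params)

print(template)
print(escaped)

# Filling the placeholders for one tile (zoom, column and row are made up)
tile_url = template.format(z=12, x=3767, y=2458)
print(tile_url)
```

Other reserved characters, such as the slash in `image/png`, are still percent-encoded in both versions.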
Now we create a Leaflet map using `folium` and add all research-grade observations to it:
```{python}
#| code-summary: "Visualising the map of observations"
#| eval: true
m = folium.Map(location=[-33.918, 151.235],tiles = None, zoom_start=9)
folium.TileLayer(tiles=NSW_basemap,
    attr=nsw_base_attrib,
    name='NSW base map').add_to(m)
fg = folium.FeatureGroup(name="iNaturalist observations", control=True, attribution="observers @ iNaturalist").add_to(m)
popup_text = """<img src='{url}'>
<caption><i>{species}</i> observed on {date} / {attribution}</caption>
"""
for obs in records:
    # Map only research-grade records that have coordinates
    if obs['quality_grade'] == 'research' and obs['location']:
        # iconic_taxon is only set when iNaturalist provides it,
        # so use .get() to avoid a KeyError
        iconic = obs.get('iconic_taxon')
        if iconic=="Plantae":
            pincolor = 'green'
        elif iconic=="Insecta":
            pincolor = 'red'
        else:
            pincolor = 'gray'
        fM=folium.Marker(
            location=obs['location'],
            popup=popup_text.format(
                species=obs['species_name'],
                date=obs['date'],
                url = obs['photourl'],
                attribution = obs['photoattrb']),
            icon=folium.Icon(color=pincolor),
        )
        fg.add_child(fM)
folium.LayerControl().add_to(m)
# Zoom the map to the extent of the markers
m.fit_bounds(m.get_bounds())
m
```
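The marker colours in the map are chosen from the iconic taxon of each record. If more groups need their own colour, a lookup table may be easier to extend than a chain of `if`/`elif` branches. This is a minimal sketch: `PIN_COLOURS` and `pin_colour` are hypothetical names, the two colour assignments are the ones used above, and the example records are made up.

```python
# Colour per iconic taxon, as used for the map markers above
PIN_COLOURS = {"Plantae": "green", "Insecta": "red"}


def pin_colour(record, default="gray"):
    """Return the marker colour for a record, tolerating a missing iconic taxon."""
    return PIN_COLOURS.get(record.get("iconic_taxon"), default)


# Hypothetical records; the second one has no iconic taxon at all
examples = [
    {"species_name": "Banksia integrifolia", "iconic_taxon": "Plantae"},
    {"species_name": "Unknown organism"},
    {"species_name": "Threskiornis molucca", "iconic_taxon": "Aves"},
]
for record in examples:
    print(record["species_name"], pin_colour(record))
```

Adding a colour for another group then only requires a new entry in the dictionary.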
## Summary and next steps
In this document we:
1. Retrieved all observations from the UNSW Kensington Campus Biodiversity project,
2. Structured and enriched the data with taxonomic metadata,
3. Explored contributor participation,
4. Visualised biodiversity across taxonomic, temporal, and spatial dimensions.
Because the entire workflow is reproducible, rerunning this notebook will automatically reflect new observations as the project grows.
Future extensions could include:
- Seasonal analyses of flowering and insect activity,
- Comparison with other urban campuses,
- Integration with vegetation or land‑use layers.
Most importantly, the analysis highlights how much biodiversity exists on campus — often unnoticed — and how collective observation can make it visible.
## About
The aim of this code is to be re-used and adapted to track the progress of iNaturalist projects.
For this project the code can be re-run as is, and results will be updated with the latest observations. To adapt this notebook to another iNaturalist project, change the project ID, along with the hard-coded plot titles and the initial map centre; all other steps remain identical.
### Acknowledgement of country
I acknowledge the Bedegal and Gadigal peoples who are the Traditional Owners of the lands where UNSW Sydney is located.
### This document
This document was created with [Quarto](https://quarto.org/), [Jupyter](https://jupyter.org/), [Python](http://python.org), and good quality coffee.
See the code tools in the top right corner of this document for all the source code, and the citation information at the bottom of this document.
### Session information
```{python}
import session_info
session_info.show()
```