import geopandas as gpd
import zipfile, requests, io
import pandas as pd
from pyinaturalist import get_observations, get_places_by_id
import matplotlib.pyplot as plt
import numpy as np
import pyprojroot
from IPython.display import display, HTML, MarkdownVisiting National Parks in New South Wales
My iNaturalist Observations in National Parks and reserves of New South Wales
A large part of my iNat activity has been accumulating in the past years, since I moved to Australia in 2019.
See my previous posts regarding my timeline of observations or the summary of observations around the world
Here I want to zoom in into these observations in New South Wales to figure out how many species I have recorded in the terrestrial National Parks and reserves of this wonderful state.
Overview
I divided this post in four steps:
- Download iNaturalist observations from around the world.
- Download spatial data representing Australia and its subdivisions.
- Intersect observations with state and regional boundaries.
- Aggregate observations to explore number of observations at different levels.
Tools and Libraries
I will be using a Python environment with a selection of my favourite libraries, as explained here.
We will be using the following libraries in this blog post:
Declare the root folder for the repository:
repodir = pyprojroot.find_root(pyprojroot.has_dir(".git"))Step-by-Step Guide
Step 1: Downloading iNaturalist Observations
Let’s begin by fetching iNaturalist observations with pyinaturalist. We will need a selection of global observations so we select the user neomapas1, and see where in the world they have been.
username = 'neomapas'
place = 6825 # New South Wales AU
observations = get_observations(user_id=username, place_id=place, per_page=0)
n_obs = observations['total_results']Let’s print an overview of these total results:
print("user {} has {} observations in the target region (place_id: {}) in iNaturalist".format(username,n_obs,place))user neomapas has 777 observations in the target region (place_id: 6825) in iNaturalist
The maximum number of observations we can download in each query 200, so we need to use pagination to get all results. For each query we will extract a minimum selection of fields that we will use for summarising the data (coordinates, place and species guess), but there are many other fields that could be important to include for more in depth explorations.
records=list()
j=1
while len(records) < n_obs:
print("Requesting observations from user _{}_: page {}, total of {} observations downloaded".format(username,j,min(j*200,n_obs)))
observations = get_observations(user_id='neomapas',place_id=place,per_page=1000,page=j)
for obs in observations['results']:
record = {
'uuid': obs['uuid'],
'quality': obs['quality_grade'],
'description': obs['description'],
'location': obs['place_guess'],
'longitude': obs['location'][1],
'latitude': obs['location'][0],
'iconic taxon': obs['taxon']['iconic_taxon_name'],
'species guess': obs['species_guess'],
'observed on': obs['observed_on'],
'points': obs['faves_count'] * 10 + obs['comments_count'] + obs['identifications_count'] * 3,
}
if len(obs['observation_photos'])>0:
record['url'] = obs['observation_photos'][0]['photo']['url']
record['attribution'] = obs['observation_photos'][0]['photo']['attribution']
records.append(record)
j=j+1Requesting observations from user _neomapas_: page 1, total of 200 observations downloaded
Requesting observations from user _neomapas_: page 2, total of 400 observations downloaded
Requesting observations from user _neomapas_: page 3, total of 600 observations downloaded
Requesting observations from user _neomapas_: page 4, total of 777 observations downloaded
Later in the code, I will need to format a html string to define figures with captions, let’s do this now for each record in this list of records:
for record in records:
record['figure']="<figure class='mini'><a href='https://www.inaturalist.org/observations/%s' target=_blank><img src='%s' height=50><figcaption class='mini'>%s: %s</figcaption></a></figure>" % (
record['uuid'],
record['url'],
record['iconic taxon'],
record['species guess'])Now we need to bundle all these records into a data frame with geospatial information using geopandas. We define a data frame with pandas:
inat_obs=pd.DataFrame(records)And transform the numeric variables latitude and longitude into a geometry with a explicit Coordinate Reference System (CRS):
gs = gpd.points_from_xy(inat_obs.longitude, inat_obs.latitude, crs="EPSG:4326")
inat_obs_xy=gpd.GeoDataFrame(inat_obs, geometry=gs).to_crs(3112)Step 2: Downloading spatial data for Australia
This can very easy to do thanks to the great features of geopandas.read_file function and the excellent data service of the Australian Bureau of Statistics.
I am using the Collaborative Australian Protected Areas Database (CAPAD) 2024 - Terrestrial data updated in November 2025. The data is available in different formats. I downloaded a GeoJSON version and keep it in my local folder.
I read the vector file from the data folder:
CAPAD24 = gpd.read_file(repodir / "data" / "Collaborative_Australian_Protected_Areas_Database_(CAPAD)_%E2%80%93_Terrestrial.geojson")And
NSW_parks = CAPAD24.loc[CAPAD24.STATE == 'NSW']How many parks and reserves are there in New South Wales? A lot…
NSW_parks.groupby('TYPE').agg({'GAZ_AREA':['count','sum']})| GAZ_AREA | ||
|---|---|---|
| count | sum | |
| TYPE | ||
| Aboriginal Area | 16 | 1.567268e+04 |
| CCA Zone 1 National Park | 34 | 1.328628e+05 |
| CCA Zone 2 Aboriginal Area | 5 | 2.166141e+04 |
| CCA Zone 3 State Conservation Area | 23 | 1.965327e+05 |
| Conservation Reserve | 10 | 3.048435e+04 |
| Flora Reserve | 106 | 6.994412e+04 |
| Historic Site | 8 | 2.805091e+03 |
| Indigenous Protected Area | 14 | 1.939915e+04 |
| Karst Conservation Reserve | 4 | 5.228100e+03 |
| NRS Addition - Gazettal in Progress | 1 | 0.000000e+00 |
| National Park | 212 | 5.571571e+06 |
| Nature Reserve | 430 | 9.672521e+05 |
| Permanent Park Preserve | 1 | 1.340810e+03 |
| Private Nature Reserve | 2 | 0.000000e+00 |
| Regional Park | 23 | 2.166177e+04 |
| State Conservation Area | 129 | 7.401583e+05 |
It will take some time to visit them all.
Step 3: Intersect observations
We’ll overlay these observations on administrative boundaries using another geopandas function: sjoin_nearest. This will find the nearest feature, but we will need to use an appropriate projection (here we use the coordinate reference system EPSG:3112):
join_nnb_df = gpd.sjoin_nearest(inat_obs_xy, NSW_parks.to_crs(3112), distance_col="distances", how="left")This is useful to include some observations near the boundaries of National Parks (for example less than 100m) :
obs_in_parks=join_nnb_df.loc[join_nnb_df.distances<100,]print("From my {} observations in NSW, I have {} observations in terrestrial parks and reserves".format(n_obs,obs_in_parks.shape[0]))From my 777 observations in NSW, I have 248 observations in terrestrial parks and reserves
Step 4: Summarise the data
Finally, we’ll look at how many observations I have in each of the parks and reserves I have visited so far:
obs_in_parks.groupby(['TYPE','NAME']).agg({'observed on': 'count',
'species guess': 'nunique'})| observed on | species guess | ||
|---|---|---|---|
| TYPE | NAME | ||
| Flora Reserve | Letts Mountain | 1 | 1 |
| Maxwells | 1 | 1 | |
| National Park | Blue Mountains | 22 | 20 |
| Brisbane Water | 3 | 3 | |
| Deua | 10 | 9 | |
| Kamay Botany Bay | 42 | 37 | |
| Lane Cove | 12 | 8 | |
| Malabar Headland | 12 | 12 | |
| Monga | 9 | 8 | |
| Morton | 2 | 2 | |
| Royal | 32 | 27 | |
| South East Forest | 1 | 1 | |
| Sydney Harbour | 46 | 39 | |
| Nature Reserve | Tarawi | 44 | 35 |
| Windsor Downs | 6 | 6 | |
| Regional Park | Killalea | 1 | 1 |
| State Conservation Area | Garawarra | 4 | 3 |
obs_in_parks.groupby(['iconic taxon']).agg({'species guess':['count', 'nunique'], 'NAME':'unique'})| species guess | NAME | ||
|---|---|---|---|
| count | nunique | unique | |
| iconic taxon | |||
| Actinopterygii | 1 | 1 | [Malabar Headland] |
| Amphibia | 1 | 1 | [Garawarra] |
| Animalia | 5 | 4 | [Tarawi, Kamay Botany Bay, Malabar Headland] |
| Arachnida | 10 | 10 | [Letts Mountain, Deua, Brisbane Water, Tarawi,... |
| Aves | 54 | 32 | [Monga, Lane Cove, Tarawi, Malabar Headland, B... |
| Chromista | 1 | 1 | [Kamay Botany Bay] |
| Fungi | 6 | 5 | [Sydney Harbour, Lane Cove, Tarawi, Blue Mount... |
| Insecta | 75 | 63 | [Deua, Sydney Harbour, Killalea, Garawarra, Ka... |
| Mammalia | 9 | 6 | [South East Forest, Deua, Tarawi, Royal] |
| Mollusca | 2 | 2 | [Malabar Headland] |
| Plantae | 63 | 49 | [Maxwells, Monga, Morton, Lane Cove, Tarawi, R... |
| Reptilia | 18 | 10 | [Deua, Lane Cove, Kamay Botany Bay, Royal, Tar... |
These lines of code will show fotos of the most popular observations of each iconic taxon in each park. I group the data twice, first I do the selection based on the points column for each combination of park and iconic taxon, then I iterate across the parks and join the figures in a container.
selection = (
obs_in_parks
.sort_values('points')
.groupby(['TYPE','NAME','iconic taxon'])
.first()
.groupby(['TYPE','NAME'])
.agg({'figure':'unique'})
)
sections = list()
for idx,row in selection.iterrows():
sectionfigures=" ".join(row['figure'])
sectionname="<figure class='mini'><p class='figsection'>%s <i>%s</i></p></figure>" % idx
sections.append(sectionname + sectionfigures)
allsections="<div class='container'>%s</div>" % ("".join(sections))
display(HTML(allsections))Flora Reserve Letts Mountain

Flora Reserve Maxwells

National Park Blue Mountains






National Park Brisbane Water



National Park Deua




National Park Kamay Botany Bay






National Park Lane Cove




National Park Malabar Headland







National Park Monga


National Park Morton

National Park Royal






National Park South East Forest

National Park Sydney Harbour






Nature Reserve Tarawi








Nature Reserve Windsor Downs


Regional Park Killalea

State Conservation Area Garawarra


The look of the output html code depends on the site’s css style definitions. Look at this file if you want to reuse/adapt my style.
This is still a small number of parks and reserves with iNat observations, I will need to plan my future trips around the state to increase these numbers.
Conclusion
In summary, we:
- Downloaded observations with
pyinaturalist. - Intersected observations with national park boundaries using
geopandas. - Grouped and aggregated the data
Hope you find this useful!
Footnotes
My alter ego in the iNat world↩︎