Aussie, Aussie, Aussie

Python
Plotly
Geopandas
Australia
Author

José R. Ferrer-Paris

Published

May 25, 2025

Modified

February 23, 2026

Sunburst plot of my iNaturalist Observations in Australia

Most of my iNat activity has accumulated in recent years, since I moved to Australia in 2019.

Here I want to zoom in on these observations in Australia to figure out how many species I have recorded in different subdivisions of this huge country. Which states and regional areas have I visited so far? How many observations have I made in each of them?

Overview

Here I reproduce the minimum number of steps required to create an interactive sunburst plot with the plotly library in Python.

  1. Download iNaturalist observations from around the world.
  2. Download spatial data representing Australia and its subdivisions.
  3. Intersect observations with state and regional boundaries.
  4. Generate interactive graph to explore number of observations at different levels.

Tools and Libraries

I will be using a Python environment with a selection of my favourite libraries, as explained here.

We will be using the following libraries in this blog post:

import geopandas as gpd
import zipfile, requests, io
import pandas as pd
from pyinaturalist import get_observations, get_places_by_id
from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt
import plotly.express as px
import numpy as np
from itertools import chain
import mapclassify
import pyprojroot

Declare the root folder for the repository:

repodir = pyprojroot.find_root(pyprojroot.has_dir(".git"))

Step-by-Step Guide

Step 1: Downloading iNaturalist Observations

Let’s begin by fetching iNaturalist observations with pyinaturalist. We need a selection of global observations, so we select the user neomapas1 and see where in the world they have been.

username = 'neomapas'
observations = get_observations(user_id=username, per_page=0)
n_obs = observations['total_results']

Let’s print an overview of these total results:

print("user {} has {} observations in iNaturalist".format(username,n_obs))
user neomapas has 2873 observations in iNaturalist

The maximum number of observations we can download in each query is 200, so we need to use pagination to retrieve all results. For each page we extract a minimal selection of fields that we will use to summarise the data (coordinates, place and species guess), but there are many other fields that could be worth including for more in-depth exploration.

records = list()
j = 1
while len(records) < n_obs:
    print("Requesting observations from user _{}_: page {}, total of {} observations downloaded".format(username, j, min(j*200, n_obs)))
    observations = get_observations(user_id=username, per_page=200, page=j)
    for obs in observations['results']:
        record = {'quality': obs['quality_grade'],
                  'description': obs['description'],
                  'location': obs['place_guess'],
                  'longitude': obs['location'][1],
                  'latitude': obs['location'][0],
                  'species guess': obs['species_guess'],
                  'observed on': obs['observed_on'],
                  }
        if len(obs['observation_photos']) > 0:
            record['url'] = obs['observation_photos'][0]['photo']['url']
            record['attribution'] = obs['observation_photos'][0]['photo']['attribution']
        records.append(record)
    j = j + 1
Requesting observations from user _neomapas_: page 1, total of 200 observations downloaded
Requesting observations from user _neomapas_: page 2, total of 400 observations downloaded
Requesting observations from user _neomapas_: page 3, total of 600 observations downloaded
Requesting observations from user _neomapas_: page 4, total of 800 observations downloaded
Requesting observations from user _neomapas_: page 5, total of 1000 observations downloaded
Requesting observations from user _neomapas_: page 6, total of 1200 observations downloaded
Requesting observations from user _neomapas_: page 7, total of 1400 observations downloaded
Requesting observations from user _neomapas_: page 8, total of 1600 observations downloaded
Requesting observations from user _neomapas_: page 9, total of 1800 observations downloaded
Requesting observations from user _neomapas_: page 10, total of 2000 observations downloaded
Requesting observations from user _neomapas_: page 11, total of 2200 observations downloaded
Requesting observations from user _neomapas_: page 12, total of 2400 observations downloaded
Requesting observations from user _neomapas_: page 13, total of 2600 observations downloaded
Requesting observations from user _neomapas_: page 14, total of 2800 observations downloaded
Requesting observations from user _neomapas_: page 15, total of 2873 observations downloaded
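Note that `obs['location']` can be `None` for observations with hidden or missing coordinates, which would make the indexing above fail. As a defensive sketch of the field extraction (using a hypothetical observation dict shaped like a pyinaturalist result):

```python
# Hypothetical raw observation, shaped like one entry of observations['results']
obs = {
    "quality_grade": "research",
    "description": None,
    "place_guess": "Sydney, NSW",
    "location": [-33.87, 151.21],  # [latitude, longitude]
    "species_guess": "Rainbow Lorikeet",
    "observed_on": "2024-01-15",
    "observation_photos": [],
}

# Guard against observations without coordinates
loc = obs.get("location") or (None, None)
record = {
    "quality": obs["quality_grade"],
    "location": obs["place_guess"],
    "longitude": loc[1],
    "latitude": loc[0],
    "species guess": obs["species_guess"],
}
print(record["latitude"], record["longitude"])  # -33.87 151.21
```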

Now we need to bundle all these records into a data frame with geospatial information using geopandas. We define a data frame with pandas:

inat_obs=pd.DataFrame(records)

And transform the numeric variables latitude and longitude into a geometry with an explicit Coordinate Reference System (CRS):

gs = gpd.points_from_xy(inat_obs.longitude, inat_obs.latitude, crs="EPSG:4326")
inat_obs_xy=gpd.GeoDataFrame(inat_obs, geometry=gs).to_crs(3112)

Step 2: Downloading spatial data for Australia

This is very easy to do thanks to the great features of the geopandas.read_file function and the excellent data service of the Australian Bureau of Statistics.

I am using the Australian Statistical Geography Standard (ASGS) Edition 3 data released in July 2021, which provides a link to a zip file with the spatial vector files:

zip_url='https://www.abs.gov.au/statistics/standards/australian-statistical-geography-standard-asgs-edition-3/jul2021-jun2026/access-and-downloads/digital-boundary-files/SA2_2021_AUST_SHP_GDA2020.zip'

In previous versions that link was all we needed, geopandas did the rest:

SA2 = gpd.read_file(zip_url)

But this stopped working, and I had to switch to a two-step process: first, download and extract the archive to our data folder:

r = requests.get(zip_url)
z = zipfile.ZipFile(io.BytesIO(r.content))
z.extractall(repodir / "data")

Then read the shapefile from the data folder:

SA2 = gpd.read_file(repodir / "data" / "SA2_2021_AUST_GDA2020.shp")

Step 3: Intersect observations

We’ll overlay these observations on administrative boundaries using another geopandas function: sjoin_nearest. This will find the nearest feature, but we will need to use an appropriate projection (here we use the coordinate reference system EPSG:3112):

join_nnb_df = gpd.sjoin_nearest(inat_obs_xy, SA2.to_crs(3112), distance_col="distances", how="left")

This is useful for including observations near the shore of Australia that would not be matched by a direct intersection query, but we need to exclude those that are truly off-shore or in other countries:

obs_in_Australia = join_nnb_df.loc[join_nnb_df.distances < 10000]
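As a toy illustration of this distance filter (hypothetical distances, not real results), the same boolean `.loc` selection works on a plain pandas DataFrame:

```python
import pandas as pd

# Hypothetical distances (in metres) from each observation to its nearest SA2 polygon
toy = pd.DataFrame({
    "species guess": ["koala", "kelp gull", "emu"],
    "distances": [0.0, 25000.0, 130.0],
})

# Keep only observations within 10 km of an Australian boundary
in_australia = toy.loc[toy.distances < 10000]
print(list(in_australia["species guess"]))  # ['koala', 'emu']
```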

Step 4: Generate interactive graphs

Finally, we’ll visualize the complete data using an interactive sunburst plot in plotly.

We start by grouping our data by state name (STE_NAME21), greater capital city name (GCC_NAME21) and three statistical area levels (SA4_NAME21, SA3_NAME21 and SA2_NAME21), and counting the number of records and of unique species guesses and place names.

aggfuncs = {'species guess':['count',pd.Series.nunique],
           'location':['count',pd.Series.nunique]}
obs_by_region=obs_in_Australia.groupby(['AUS_NAME21','STE_NAME21','GCC_NAME21','SA4_NAME21','SA3_NAME21','SA2_NAME21']).agg(aggfuncs)

obs_by_region.columns = [' '.join(col).strip() for col in obs_by_region.columns.values]
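The `agg` call produces a two-level column header, which the line above flattens into single strings. A toy reproduction of this grouping-and-flattening step (hypothetical data):

```python
import pandas as pd

toy = pd.DataFrame({
    "state": ["NSW", "NSW", "VIC"],
    "species guess": ["koala", "koala", "emu"],
})

# Same aggregation pattern as above: count of records and number of unique species
agg = toy.groupby("state").agg({"species guess": ["count", pd.Series.nunique]})

# Join the two header levels into single column names
agg.columns = [" ".join(col).strip() for col in agg.columns.values]
print(list(agg.columns))  # ['species guess count', 'species guess nunique']
```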

Now we create the plotly graph using this object. We visualise the number of unique species per group:

fig = px.sunburst(obs_by_region.reset_index(),
                  path=['AUS_NAME21','STE_NAME21','GCC_NAME21','SA4_NAME21','SA3_NAME21','SA2_NAME21'],
                  values='species guess nunique')
fig.update_layout(width=800, height=800, plot_bgcolor="white")
fig.show()

And that’s it!

Conclusion

In summary, we:

  1. Downloaded observations with pyinaturalist.
  2. Intersected observations with state and regional boundaries using geopandas.
  3. Created insightful graphs with plotly.

Hope you find this useful!

Footnotes

  1. My alter ego in the iNat world↩︎