Look back in time: the history of my iNaturalist observations

Python

plotly

global

temporal

timeline

Author

José R. Ferrer-Paris

Published

April 22, 2025

Modified

May 3, 2026

Today, I want to look back at the timeline of my observations and explore temporal patterns of my observation effort throughout the years.

I will explore how to summarise and analyse global iNaturalist observations using Python.

Overview

I will walk through the following steps:

Download iNaturalist observations from around the world.
Display a timeline of the number of observations accumulated by date.
Filter and group observations by years and months.
Generate interactive graphs to explore my levels of activity per year and month.

Tools and Libraries

I will be using a Python environment with a selection of my favourite libraries, as explained here.

We will be using the following libraries in this blog post:

from pyinaturalist import get_observations
from datetime import datetime, date
import pandas as pd
import plotly.express as px
import numpy as np

pyinaturalist: A Python client for the iNaturalist API, allowing access to observation data.
plotly: A graphing library that makes interactive plots and dashboards with ease.

Step-by-Step Guide

Step 1: Downloading iNaturalist Observations

Let’s begin by fetching all my iNaturalist observations with pyinaturalist. My user name in iNat is neomapas.

username = 'neomapas'

# First we need to figure out how many observations to expect.
# With per_page=0 the query returns no records, only the total count:
observations = get_observations(user_id=username, per_page=0)
n_obs = observations['total_results']

print("User _{}_ has {} observations in iNaturalist".format(username,n_obs))

User _neomapas_ has 3324 observations in iNaturalist

The maximum number of observations we can download in each query is 200, so we need to use pagination to get all results. For each query we will extract a minimal selection of fields that we will use for summarising the data (place, species guess and observation date), but there are many other fields that could be worth including for more in-depth explorations.
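As a quick sanity check on the pagination, the number of requests can be estimated up front. This is only a sketch: `pages_needed` is a hypothetical helper, not part of pyinaturalist, and the default page size of 200 is the per-query limit mentioned above.

```python
import math


def pages_needed(total_results, per_page=200):
    """Number of requests needed to fetch total_results records."""
    if total_results <= 0:
        return 0
    # round up, because the last page is usually only partly filled
    return math.ceil(total_results / per_page)


print(pages_needed(3324))  # 17
```

For the 3324 observations reported above, this gives 17 pages, with the last page holding the remaining 124 records.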

records=list()
j=1
while len(records) < n_obs:
    print("Requesting observations from user _{}_: page {}, total of {} observations downloaded".format(username,j,min(j*200,n_obs)))
    observations = get_observations(user_id=username,per_page=200,page=j)
    if not observations['results']:
        # stop if a page comes back empty, to avoid an endless loop
        break
    for obs in observations['results']:
        record = {
        'location': obs['place_guess'],
        'species guess': obs['species_guess'],
        'year': obs['observed_on_details']['year'],
        'month': obs['observed_on_details']['month'],
        'day': obs['observed_on_details']['day']
        }
        # the observation date may arrive as a datetime, a date or a string;
        # datetime is a subclass of date, so it has to be tested first
        if isinstance(obs['observed_on'], datetime):
            record['observed on']=obs['observed_on'].date()
        elif isinstance(obs['observed_on'], date):
            # a plain date has no .date() method, so keep it as it is
            record['observed on']=obs['observed_on']
        else:
            record['observed on']=datetime.strptime(obs['observed_on'], "%Y-%m-%d").date()
        records.append(record)
    j=j+1

Requesting observations from user _neomapas_: page 1, total of 200 observations downloaded
Requesting observations from user _neomapas_: page 2, total of 400 observations downloaded
Requesting observations from user _neomapas_: page 3, total of 600 observations downloaded
Requesting observations from user _neomapas_: page 4, total of 800 observations downloaded
Requesting observations from user _neomapas_: page 5, total of 1000 observations downloaded
Requesting observations from user _neomapas_: page 6, total of 1200 observations downloaded
Requesting observations from user _neomapas_: page 7, total of 1400 observations downloaded
Requesting observations from user _neomapas_: page 8, total of 1600 observations downloaded
Requesting observations from user _neomapas_: page 9, total of 1800 observations downloaded
Requesting observations from user _neomapas_: page 10, total of 2000 observations downloaded
Requesting observations from user _neomapas_: page 11, total of 2200 observations downloaded
Requesting observations from user _neomapas_: page 12, total of 2400 observations downloaded
Requesting observations from user _neomapas_: page 13, total of 2600 observations downloaded
Requesting observations from user _neomapas_: page 14, total of 2800 observations downloaded
Requesting observations from user _neomapas_: page 15, total of 3000 observations downloaded
Requesting observations from user _neomapas_: page 16, total of 3200 observations downloaded
Requesting observations from user _neomapas_: page 17, total of 3324 observations downloaded
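The date handling inside the loop can also be isolated in a small helper, which makes it easier to test on its own. This is a sketch under stated assumptions: `to_date` is a hypothetical name, and the value is assumed to be a datetime, a date, or a string in YYYY-MM-DD format.

```python
from datetime import date, datetime


def to_date(value):
    """Return a datetime.date from a datetime, a date or a 'YYYY-MM-DD' string."""
    # datetime is a subclass of date, so it has to be tested first
    if isinstance(value, datetime):
        return value.date()
    # a plain date is already what we want
    if isinstance(value, date):
        return value
    # anything else is assumed to be an ISO-style date string
    return datetime.strptime(value, "%Y-%m-%d").date()


print(to_date(datetime(2020, 5, 17, 10, 30)))  # 2020-05-17
print(to_date("2020-05-17"))                   # 2020-05-17
```

Inside the loop, the three branches would then reduce to a single call such as `record['observed on'] = to_date(obs['observed_on'])`.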

Now we need to bundle all these records into a data frame using pandas:

df=pd.DataFrame(records)

Step 2: Timeline of Observations

Here I am using a simple approach to visualise the cumulative number of observations through time: the plotly express function ecdf with the option ecdfnorm=None, which plots raw counts instead of proportions. An additional histogram in the top margin also shows the counts per period.

This is an interactive graph, so you can hover your cursor along the line or histogram and a popup window will show you the values at that location of the graph. Cool!

fig = px.ecdf(df['observed on'],
              ecdfnorm=None,
              marginal="histogram",
              labels={
                 "value": "Date",
                 },
              title="Timeline of number of observations")
fig.update_traces({'name': 'iNat Observations'}, selector={'name': 'observed on'})
fig.show()
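To make clear what the line in this graph represents, here is a minimal sketch of the same running total computed by hand with the standard library. The dates are made up for illustration and `cumulative_counts` is a hypothetical helper; plotly does the equivalent work internally when ecdfnorm=None.

```python
from collections import Counter
from datetime import date
from itertools import accumulate


def cumulative_counts(dates):
    """Return (day, running total) pairs in chronological order."""
    per_day = Counter(dates)   # number of observations on each day
    days = sorted(per_day)     # chronological order
    totals = accumulate(per_day[d] for d in days)
    return list(zip(days, totals))


example = [date(2020, 1, 2), date(2020, 1, 1), date(2020, 1, 2), date(2021, 5, 5)]
for day, total in cumulative_counts(example):
    print(day, total)
```

Each step in the curve corresponds to one day with observations, and the height of the last step equals the total number of records.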

Step 3: Filter and group observations

Now I also want to add information about the different stages in my career, because this is related to my activity in iNaturalist. I consider here my time as a PhD student, as a postdoc, and as an early-career and mid-career researcher, plus a final stage since 2024 as a serious iNat user.

This time I will use the histogram function and show each stage in a different colour:

df['stage']='PhD'
df.loc[df['observed on']>date(2009, 2, 9),'stage']='Postdoc'
df.loc[df['observed on']>date(2013, 9, 15),'stage']='Early career'
df.loc[df['observed on']>date(2019, 6, 1),'stage']='Mid career'
df.loc[df['observed on']>date(2024, 1, 1),'stage']='Serious iNat user'

# The traces are named after each stage, so no renaming is needed here
fig = px.histogram(df,
              x='observed on',
              color="stage",
              labels={
                 "observed on": "Date",
                 },
              title="Highlight different stages")
fig.show()

This is also an interactive graph, so you can zoom in, or click on the legend items to hide, double-click to focus on an item, etc. Cooler!
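The stage assignment relies on a chain of ">" comparisons, where each later assignment overwrites the earlier ones. As an alternative sketch, the same rule can be written as a lookup over sorted cut-off dates. The names `CUTOFFS`, `STAGES` and `stage_for` are hypothetical; the dates and labels are the ones used in the code above.

```python
from bisect import bisect_left
from datetime import date

# Same cut-off dates and labels as in the stage assignment
CUTOFFS = [date(2009, 2, 9), date(2013, 9, 15), date(2019, 6, 1), date(2024, 1, 1)]
STAGES = ['PhD', 'Postdoc', 'Early career', 'Mid career', 'Serious iNat user']


def stage_for(day):
    """Return the career stage for a given observation date."""
    # bisect_left counts the cut-offs strictly earlier than day,
    # which mirrors the ">" comparisons
    return STAGES[bisect_left(CUTOFFS, day)]


print(stage_for(date(2009, 2, 9)))   # PhD
print(stage_for(date(2009, 2, 10)))  # Postdoc
print(stage_for(date(2025, 4, 22)))  # Serious iNat user
```

With the data frame from Step 1, something like `df['stage'] = df['observed on'].map(stage_for)` should produce the same column in one pass.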

Step 4: Interactive plot of activity

Finally, we’ll visualize the complete data using an interactive treemap. This plot will look like a calendar, with nested boxes proportional to the number of observations in each period.

I am using a trick to make the data look like a calendar, but it is not really one. The boxes are not in chronological order. The ordering follows an algorithm that optimises the distribution of area between units, so the units with more observations tend to be in the upper left corner and the ones with fewer observations in the bottom right.

I need to add some additional information to the data frame. First, turn the date from the observed on column into strings representing combinations of years and months, or full dates:

df['year_month']=[x.strftime('%Y-%m') for x in df['observed on']]
df['full_date']=[x.strftime('%Y-%m-%d') for x in df['observed on']]
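To see what these labels look like, here is a small standalone sketch with made-up dates. Because the year comes first and months and days are zero-padded, the labels also sort chronologically as plain text.

```python
from datetime import date

# Made-up dates, deliberately out of order
days = [date(2019, 11, 3), date(2019, 6, 1), date(2020, 2, 29)]

year_month = [d.strftime('%Y-%m') for d in days]
full_date = [d.strftime('%Y-%m-%d') for d in days]

print(year_month)         # ['2019-11', '2019-06', '2020-02']
print(sorted(full_date))  # ['2019-06-01', '2019-11-03', '2020-02-29']
```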

I start by grouping the data by stage, year, combination of year and month, and full date. Then, for each group, I count the number of records and the number of unique species guesses and place names.

aggfuncs = {'species guess':['count',pd.Series.nunique],
           'location':['count',pd.Series.nunique]}

obs_by_date=df.groupby(['stage','year','year_month','full_date']).agg(aggfuncs).reset_index()

obs_by_date.columns = [' '.join(col).strip() for col in obs_by_date.columns.values]
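The effect of the aggregation and of the column renaming is easier to see on a toy table. This is a sketch with made-up records and fewer grouping columns; the string 'nunique' is used in place of pd.Series.nunique, on the assumption that both give the same column label.

```python
import pandas as pd

# Made-up records; the last one has no species guess
toy = pd.DataFrame({
    'year': [2020, 2020, 2020, 2021],
    'year_month': ['2020-01', '2020-01', '2020-03', '2021-07'],
    'species guess': ['Magpie', 'Magpie', 'Kookaburra', None],
    'location': ['Sydney', 'Sydney', 'Sydney', 'Caracas'],
})

aggfuncs = {'species guess': ['count', 'nunique'],
            'location': ['count', 'nunique']}

flat = toy.groupby(['year', 'year_month']).agg(aggfuncs).reset_index()

# The result has two-level column names such as ('species guess', 'count');
# joining the two levels gives flat names such as 'species guess count'
flat.columns = [' '.join(col).strip() for col in flat.columns.values]

print(list(flat.columns))
print(flat)
```

Note that 'count' only counts non-missing values, so in this toy example the record without a species guess contributes 0 to 'species guess count' but 1 to 'location count'.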

For this visualisation I will filter the data to focus on my early career first:

df = obs_by_date.query("stage == 'Early career'")

And this is the code I use for the treemap:

fig = px.treemap(df, 
    path=[px.Constant("My early career observations"), 'year','year_month','full_date'], 
    values='species guess count',
    color='species guess count', 
    hover_data=['location count'],
    color_continuous_scale='RdBu',
    color_continuous_midpoint=
        np.average(df['species guess count'],     
            weights=df['location count']))
fig.update_layout(margin = dict(t=50, l=25, r=25, b=25))
fig.show()

Now, compare that with my mid-career stage, from June 2019 to the start of 2024:

df = obs_by_date.query("stage == 'Mid career'")

fig = px.treemap(df, 
    path=[px.Constant("My mid-career observations"), 'year','year_month','full_date'], 
    values='species guess count',
    color='species guess count', 
    hover_data=['location count'],
    color_continuous_scale='RdBu',
    color_continuous_midpoint=
        np.average(df['species guess count'],     
            weights=df['location count']))
fig.update_layout(margin = dict(t=50, l=25, r=25, b=25))
fig.show()

By now you can guess that this is also an interactive graph. What can you do with it? Click on the boxes representing years or months to zoom into that period. You can see how the dark blue boxes highlight months and days with more observations and the reddish boxes represent days with a single observation or just a few.
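The point where the colours switch from red to blue is set by color_continuous_midpoint, which in the code above is a weighted average: each day's species guess count is weighted by its location count. Here is a minimal sketch of that arithmetic with made-up numbers; `weighted_mean` is a hypothetical helper that computes the same quantity as np.average when weights are given.

```python
def weighted_mean(values, weights):
    """Weighted average: sum(value * weight) / sum(weight)."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ZeroDivisionError("weights sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Three made-up days: two quiet ones and one busy field day
counts = [1, 2, 30]
weights = [1, 2, 30]

print(sum(counts) / len(counts))       # plain mean: 11.0
print(weighted_mean(counts, weights))  # weighted mean: 905 / 33, about 27.4
```

Because busy days carry more weight, the midpoint sits well above the plain mean, so only the busiest days end up on the blue side of the scale.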

The rate at which I add observations is growing more and more each year!

And that’s it!

Conclusion

I find these visualisations useful to highlight activity patterns over the years. In the last plot the dark blue boxes are mostly related to fieldwork or holiday trips where I spent a lot of time with my cameras looking for plants and animals.

But alas, there is a bias here! I still have a lot of photos on my hard drives and memory cards that I haven’t organised yet. Some of the boxes with low numbers in the graph should actually have higher values. I just need to catch up with uploading old observations!

I have recently updated my observations from 2005, 2006, 2010, etc.

So this graph will evolve as I add more and more observations to iNaturalist. Stay tuned!