Bee-utiful world!

Anthophila

Python
global
Apoidea
plotly
Author

José R. Ferrer-Paris

Published

February 14, 2026

Modified

February 24, 2026

My Bee observations in iNaturalist

My daughter loves bees and bees love her. A colony of bees nested on our house when she was about to be born, so we have this family connection with them. Whenever I see a bee I smile… and if I have a camera at hand, I also take a photo to upload to iNat.

Here is a summary of the observations that are currently there, with regular updates coming each time I do a field trip or excursion. This post includes a list of how many species I have recorded in each of the countries I have visited, broken down by families and subfamilies.

To bee or not to bee

Following Laurence Packer's book Bees of the world, I am using the taxon Anthophila as the root for my search. This is ranked as an epifamily in iNat's taxonomy and includes seven families of bees.
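For reference, the seven extant bee families can be kept as a simple static list (names as given in the book; this is just a helper constant, not an API call):

```python
# The seven extant bee families within the epifamily Anthophila
BEE_FAMILIES = [
    "Andrenidae",
    "Apidae",
    "Colletidae",
    "Halictidae",
    "Megachilidae",
    "Melittidae",
    "Stenotritidae",
]
```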

Overview

Tools and Libraries

I will be using a Python environment with a selection of my favourite libraries, as explained here.

We will be using the following libraries in this blog post:

from pyinaturalist import (
    get_observations,
    get_taxa_by_id,
    get_places_by_id,
)
import pandas as pd
from itertools import islice
import plotly.express as px
from IPython.display import display, HTML

Step-by-Step Guide

Step 1: Downloading iNaturalist observations

Let’s begin by fetching iNaturalist observations with pyinaturalist. We want a global selection of observations, so we query the user neomapas and see where in the world they have been.

username = 'neomapas'
taxonid = 630955  # Anthophila
observations = get_observations(user_id=username, taxon_id=taxonid, per_page=0)
n_obs = observations['total_results']

Let’s print an overview of these total results:

print("User {} has {} observations of Anthophila (taxon id {}) in iNaturalist".format(username,n_obs,taxonid))
User neomapas has 29 observations of Anthophila (taxon id 630955) in iNaturalist

The maximum number of observations we can download in each query is 200, so we need to use pagination to get all the results. For each query we extract a selection of fields that we will use to summarise the observation records (coordinates, species guess, quality grade), and at the same time we extract the taxonomic information from each identification.
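The number of pages needed follows directly from the total count (here using the 29 observations reported above):

```python
import math

n_obs = 29      # total_results from the count query above
PER_PAGE = 200  # iNaturalist API maximum page size
n_pages = math.ceil(n_obs / PER_PAGE)
```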

records = list()
taxa = list()
places = list()
msg = "Requesting observations from user _{}_: page {}, total of {} observations downloaded"
j = 1
while len(records) < n_obs:
    print(msg.format(username, j, min(j*200, n_obs)))
    observations = get_observations(
        user_id=username,
        taxon_id=taxonid,
        per_page=200,
        page=j)
    for obs in observations['results']:
        record = {
            'uuid': obs['uuid'],
            'quality': obs['quality_grade'],
            'description': obs['description'],
            'location': obs['place_guess'],
            'longitude': obs['location'][1],
            'latitude': obs['location'][0],
            'species guess': obs['species_guess'],
            'observed on': obs['observed_on'],
            'points': obs['faves_count'] * 10 + obs['comments_count'] + obs['identifications_count'] * 3,
        }
        for pid in obs['place_ids']:
            place_record = {
                'uuid': obs['uuid'],
                'place id': pid
            }
            places.append(place_record)
        for tid in obs['ident_taxon_ids']:
            taxon_record = {
                'uuid': obs['uuid'],
                'taxon id': tid
            }
            taxa.append(taxon_record)
        if len(obs['observation_photos'])>0:
            record['url'] = obs['observation_photos'][0]['photo']['url']
            record['attribution'] = obs['observation_photos'][0]['photo']['attribution']
        records.append(record)
    j=j+1
Requesting observations from user _neomapas_: page 1, total of 29 observations downloaded

This example requires extracting some additional information that is nested within the JSON structure of the API response. I explain some of the details in this post.
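To illustrate the nesting, here is a toy dictionary mimicking the shape of a single observation in the API response (all values made up), with the same extraction logic used in the loop above:

```python
# Hypothetical observation dict with the nested structure of the API response
obs = {
    'uuid': 'abc-123',
    'location': [10.48, -66.90],  # [latitude, longitude]
    'observation_photos': [
        {'photo': {'url': 'https://example.org/bee.jpg',
                   'attribution': '(c) example, CC BY-NC'}}
    ],
}
record = {
    'latitude': obs['location'][0],
    'longitude': obs['location'][1],
}
# photos are optional, so guard before indexing into the nested list
if len(obs['observation_photos']) > 0:
    record['url'] = obs['observation_photos'][0]['photo']['url']
```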

Step 2: Converting records to data frames

We transform these sets of records into three data frames with pandas:

inat_obs=pd.DataFrame(records)
taxa = pd.DataFrame(taxa)
places = pd.DataFrame(places)

Step 3: Adding taxonomic and place information

Some of the steps hereafter require the taxonomic information and the place information for each observation. We will use the get_taxa_by_id and get_places_by_id functions for this, but first we need to prepare the lists of taxon ids and place ids that we want to query.

all_taxa=list(set(taxa['taxon id']))
def chunk(it, size):
    it = iter(it)
    return iter(lambda: tuple(islice(it, size)), ())
for slc in chunk(all_taxa,30):
    taxa_query = get_taxa_by_id(slc, rank_level=[30,29,28,27,26,25,20,10,5])
    for res in taxa_query['results']:
        qry = taxa.loc[taxa['taxon id'] == res['id'],'uuid']
        inat_obs.loc[inat_obs.uuid.isin(qry), res['rank']] = res['name']
all_places=list(set(places['place id']))
response = get_places_by_id(all_places,
                            admin_level=[0,10])
for res in response['results']:
    qry = places.loc[places['place id'] == res['id'],'uuid']
    if res['admin_level'] == 10:
        level='state'
    elif res['admin_level'] == 0:
        level='country'
    inat_obs.loc[inat_obs.uuid.isin(qry), level] = res['name']
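The chunk helper above splits the list of ids into batches the API can handle; on a toy iterable it behaves like this:

```python
from itertools import islice

def chunk(it, size):
    """Yield successive `size`-length tuples from any iterable."""
    it = iter(it)
    # iter(callable, sentinel) keeps calling the lambda until it returns ()
    return iter(lambda: tuple(islice(it, size)), ())

batches = list(chunk(range(7), 3))
# → [(0, 1, 2), (3, 4, 5), (6,)]
```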
Step 4: Summarising observations with treemaps

A small helper function groups the observations by a list of columns, flattens the aggregated column names, and draws an interactive treemap with plotly:

def group_and_plot_data(x,aggfuncs,groupcols):
    gd=x.groupby(groupcols).agg(aggfuncs).reset_index()
    gd.columns = [' '.join(col).strip() for col in gd.columns.values]
    value_col = gd.columns.values[-1]
    fig = px.treemap(gd, 
        path=[px.Constant("Bees obs"),] + groupcols,  
        values=value_col,
        color=value_col, 
        hover_data=[value_col],
        color_continuous_scale='RdBu')
    fig.update_layout(margin = dict(t=5, l=5, r=5, b=5))
    return(fig)
group_columns = ['family','subfamily','tribe','genus','species']
agg_funcs = {'uuid':['count']}
fig1 = group_and_plot_data(
    inat_obs.fillna('-- unassigned --'), 
    agg_funcs, 
    group_columns)
fig1.show()
We repeat the summary by country, state and family:

group_columns = ['country','state','family']
agg_funcs = {'uuid':['count']}
fig2 = group_and_plot_data(
    inat_obs.fillna('-- unassigned --'), 
    agg_funcs, 
    group_columns)
fig2.show()
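The column-flattening line inside group_and_plot_data deals with the MultiIndex columns that pandas produces when aggregating; a toy frame makes the effect visible:

```python
import pandas as pd

df = pd.DataFrame({
    'family': ['Apidae', 'Apidae', 'Halictidae'],
    'uuid': ['a', 'b', 'c'],
})
gd = df.groupby(['family']).agg({'uuid': ['count']}).reset_index()
# agg() returns MultiIndex columns like ('uuid', 'count');
# joining the levels gives flat names that plotly can use directly
gd.columns = [' '.join(col).strip() for col in gd.columns.values]
```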

Step 5: Displaying a sample of observations

Now we combine spatial and taxonomic information to get a wall of pictures showing the most interesting observations for each bee family in each of the countries I have visited.

These lines of code perform a couple of tricks. I group the data twice: first I select one observation (based on the points column) for each combination of country, state and family, then I iterate across the countries and join the figures in a list. I then use the display and HTML functions to render the formatted text strings as HTML elements1 and organise the figures and captions on this webpage.

inat_obs['figure'] = [
    "<figure class='mini'><a href='https://www.inaturalist.org/observations/%s' target=_blank><img src='%s' height=50><figcaption class='mini'>%s: <i>%s</i></figcaption></a></figure>" % (
        record['uuid'],
        record['url'],
        record['family'],
        record['species guess'])
    for idx,record in inat_obs.iterrows() 
]
selection = (
    inat_obs
    .sort_values('points', ascending=False)
    .groupby(['country','state','family'])
    .first()
    .groupby(['country','state'])
    .agg({'figure':'unique'})
)
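The double grouping can be sketched on a toy frame (hypothetical figure strings): sorting by points in descending order means .first() keeps the highest-scoring observation per family, and the second grouping collects those figures into one array per country and state:

```python
import pandas as pd

toy = pd.DataFrame({
    'country': ['Australia'] * 3,
    'state':   ['NSW'] * 3,
    'family':  ['Apidae', 'Apidae', 'Halictidae'],
    'figure':  ['fig_a', 'fig_b', 'fig_c'],
    'points':  [3, 12, 5],
})
best = (
    toy
    .sort_values('points', ascending=False)  # highest-scoring first
    .groupby(['country', 'state', 'family'])
    .first()                                 # one observation per family
    .groupby(['country', 'state'])
    .agg({'figure': 'unique'})               # collect figures per place
)
```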


sections = list()
for idx,row in selection.iterrows():
    sectionfigures="&nbsp;".join(row['figure'])
    sectionname="<figure class='mini'><p class='figsection'>%s<br>%s</p></figure>" % idx
    sections.append(sectionname + sectionfigures)

allsections="<div class='container'>%s</div>" % ("".join(sections))

display(HTML(allsections))

Footnotes

  1. The look of the output HTML code depends on the site’s CSS style definitions. Look at this file if you want to reuse/adapt my style.