Google Photos

Categories: Google Cloud, Photos, magick, Mexico
Author

José R. Ferrer-Paris

Published

July 1, 2024

What I want to do

My aim is to create a local copy of my photos from Google Photos so that I can use them in my Quarto website. I chose to use R for this.

Challenges

This is the kind of thing that works great when it works, but that can enter an infinite loop of trial-and-error if you miss a tiny, vital detail.

The procedure has often been described in detail in older posts, but specific configurations or methods have changed since then. So it is important to understand what is needed and to adapt the steps to the most recent documentation.

Sources

My code is based on several blogs, Medium posts and Stack Overflow posts describing the procedure for R and Python.

Set-up Google authentication

These are the basic steps:

  1. create a project in Google Cloud and open the APIs & Services tab ("APIS y servicios" or equivalent in your language),
  2. enable the Photos Library API (not sure if this is relevant here),
  3. configure a simple consent screen ("Pantalla de consentimiento"); the publishing status can be “Testing”,
  4. create an OAuth 2.0 client ID and download the JSON file,
  5. add GC_PROJECT_EMAIL and GC_PROJECT_CRED_JSON to a .Renviron file.
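The .Renviron file could look something like this (the email address and file path are placeholders, use your own values):

```
GC_PROJECT_EMAIL=someone@example.com
GC_PROJECT_CRED_JSON=/path/to/private/folder/client_secret.json
```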

Steps in R

Load the libraries

library(gargle)
library(dplyr)

Attaching package: 'dplyr'
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union
library(jsonlite)
library(httr)
library(foreach)
library(stringr)
library(magick)
Linking to ImageMagick 6.9.12.93
Enabled features: cairo, fontconfig, freetype, heic, lcms, pango, raw, rsvg, webp
Disabled features: fftw, ghostscript, x11

Read environment variables

Make sure to update the .Renviron file, then you can (re-)load it in the current R session with:

readRenviron("~/.Renviron")
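Sys.getenv returns an empty string for variables that are not set, so a quick sanity check (a minimal sketch, not part of the original workflow) could be:

```r
# Check that both variables are available in this session;
# Sys.getenv() returns "" for unset variables.
for (v in c("GC_PROJECT_EMAIL", "GC_PROJECT_CRED_JSON")) {
  if (Sys.getenv(v) == "")
    warning("variable ", v, " is not set, check your .Renviron file")
}
```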

Read credentials and authenticate

The credentials are in a JSON file in a private folder; the environment variable GC_PROJECT_CRED_JSON contains its location. Now we can check that the file exists, and read it:

cred_json <- Sys.getenv("GC_PROJECT_CRED_JSON")
if (!file.exists(cred_json)) {
  stop("credentials not found, please update Renviron file")
} else {
  clnt <- gargle_oauth_client_from_json(path=cred_json)
}

You can print the client information with:

print(clnt)

Output not shown

Now fetch the token:

tkn <- gargle2.0_token(
  email = Sys.getenv("GC_PROJECT_EMAIL"),
  client = clnt,
  scope = c("https://www.googleapis.com/auth/photoslibrary.readonly",
            "https://www.googleapis.com/auth/photoslibrary.sharing")
)
This step is important!

In an interactive session, this will open a tab/window in the browser to complete authentication and confirm permissions for the app. It might use information in the cache, if available.

If this is run non-interactively, it will try to use the information in the cache, but will fail if this info is stale.
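One way to make non-interactive runs more predictable is to pin gargle's cache location and account before fetching the token. This is an optional sketch; the cache path is just an example:

```r
# Tell gargle where the OAuth token cache lives and which email to use,
# so non-interactive sessions can find the cached token without prompting.
options(
  gargle_oauth_cache = "~/.cache/gargle",  # example location
  gargle_oauth_email = Sys.getenv("GC_PROJECT_EMAIL")
)
```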

Final steps of authentication:

k <- token_fetch(token = tkn)
authorization <- paste("Bearer", k$credentials$access_token)

Album information

Now we can get the album information using the GET function:

getalbum <-
  GET("https://photoslibrary.googleapis.com/v1/albums",
      add_headers(
        'Authorization' = authorization,
        'Accept'  = 'application/json'),
      query = list("pageSize" = 50)) |> 
  content( as = "text", encoding = "UTF-8") |>
  fromJSON() 

Here I use select to show only two columns:

knitr::kable(
    select(
        getalbum$albums, 
        c("title", "mediaItemsCount")))
| title                     | mediaItemsCount |
|---------------------------|----------------:|
| Eventos - RLE             |              15 |
| logos para la web         |               3 |
| Lugares - México          |              40 |
| libros viajeros           |              33 |
| Lugares - España          |              11 |
| Lugares - Europa          |              23 |
| rompecabeza               |               2 |
| Eventos - CEBA LEE        |              22 |
| Lugares - Sur América     |              14 |
| Lugares - Venezuela       |               3 |
| Eventos - Venezuela       |               4 |
| Eventos - Mariposas       |              10 |
| 40 años de Chinco y Betty |              18 |
| Nuestro año sudafricano   |               9 |
| mayo en Ciudad del Cabo   |              14 |
| 30 de abril de 2012       |               1 |
| Verano en Ciudad del Cabo |               6 |
| Drop Box                  |               4 |
| Diciembre2009             |             762 |
| FotosNietos               |             826 |
| Eventos - IVIC            |               7 |

If a query returns multiple pages of results, it is possible to use the nextPageToken to paginate through them:

if (!is.null(getalbum$nextPageToken)) {
  getalbum2 <-
    GET("https://photoslibrary.googleapis.com/v1/albums",
      add_headers(
        'Authorization' = authorization,
        'Accept'  = 'application/json'),
      query = list("pageToken" = getalbum$nextPageToken)) |>
    content(as = "text", encoding = "UTF-8") |>
    fromJSON() 
}
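The same pattern generalises to any number of pages. A small helper (hypothetical, not part of the original code) keeps requesting pages until no nextPageToken is returned:

```r
# Generic pagination: call `fetch_page(token)` repeatedly, collecting the
# items of each page, until the response has no nextPageToken.
fetch_all_pages <- function(fetch_page, item_name = "albums") {
  items <- list()
  token <- NULL
  repeat {
    page <- fetch_page(token)
    items <- c(items, page[[item_name]])
    token <- page$nextPageToken
    if (is.null(token)) break
  }
  items
}
```

With the GET call above wrapped as `fetch_page(token)` (passing the token as the pageToken query parameter), this returns all albums across pages in one list.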

Photos in an album

If we want to pull information from one album:

aID <- filter(getalbum$albums,
    title %in% c("Lugares - México")) |>
    pull(id)

dts <-  POST("https://photoslibrary.googleapis.com/v1/mediaItems:search",
      add_headers(
        'Authorization' = authorization,
        'Accept'  = 'application/json'),
      body = list("albumId" = aID,
                  "pageSize" = 50),
      encode = "json"
      ) |> 
    content( as = "text", encoding = "UTF-8") |>
    fromJSON( flatten = TRUE) |> 
    data.frame()

Let’s take a glimpse at the data frame:

glimpse(dts)
Rows: 40
Columns: 15
$ mediaItems.id                                  <chr> "ADXhca0hTz4W4UNe2bCvXA…
$ mediaItems.description                         <chr> "Desayuno en Oaxaca", "…
$ mediaItems.productUrl                          <chr> "https://photos.google.…
$ mediaItems.baseUrl                             <chr> "https://lh3.googleuser…
$ mediaItems.mimeType                            <chr> "image/jpeg", "image/jp…
$ mediaItems.filename                            <chr> "Desayuno-Oaxaca.jpg", …
$ mediaItems.mediaMetadata.creationTime          <chr> "2005-11-11T09:01:05Z",…
$ mediaItems.mediaMetadata.width                 <chr> "480", "640", "640", "6…
$ mediaItems.mediaMetadata.height                <chr> "640", "480", "480", "4…
$ mediaItems.mediaMetadata.photo.cameraMake      <chr> "KONICA MINOLTA ", "KON…
$ mediaItems.mediaMetadata.photo.cameraModel     <chr> "DiMAGE Z3", "DiMAGE Z3…
$ mediaItems.mediaMetadata.photo.focalLength     <dbl> 5.859375, 5.859375, 69.…
$ mediaItems.mediaMetadata.photo.apertureFNumber <dbl> 6.3, 5.0, 4.5, 8.0, 6.3…
$ mediaItems.mediaMetadata.photo.isoEquivalent   <int> 50, 50, 50, 50, 50, 50,…
$ mediaItems.mediaMetadata.photo.exposureTime    <chr> "0.004999999s", "0.0099…

We downloaded the information for all photos. The baseUrl links are useful during the R session, but they are not suitable for sharing: they are randomised URLs that become defunct after the session is closed.

For example, this will display the image using the baseUrl when rendering this page, but will eventually disappear:

cat(sprintf("<img src='%s'/>", 
    dts[23,"mediaItems.baseUrl"]))

But this link will still be valid:

cat(sprintf("View _%1$s_ in its [Google Photos album](%2$s){target='gphotos'}", 
    dts[23, "mediaItems.description"],
    dts[23,"mediaItems.productUrl"]))

View Estela en Palenque in its Google Photos album

Keeping a persistent version

One way to share the photos is to select existing files, create shareable albums with the API, and retrieve the shareableUrl of the album and its photos. I still haven’t worked out the code for doing that in R.

Another option is to just download the photos in the size needed for the session/website and share the productUrl to link back to the Google photos page for the image.
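The download size is controlled by parameters appended to the baseUrl; the code below uses `=w400-h400-d` to request at most 400 pixels in each dimension plus a download-ready version. Constructing such a URL is just string concatenation (the base URL here is a placeholder):

```r
# Placeholder baseUrl; real values come from the API response.
base_url <- "https://lh3.googleusercontent.com/EXAMPLE"
# Append size parameters: width/height limits plus download flag.
sprintf("%s=w400-h400-d", base_url)
# [1] "https://lh3.googleusercontent.com/EXAMPLE=w400-h400-d"
```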

For example, we can visualise one photo with the image_read function from the magick library, using the baseUrl attribute:

oaxaca <- image_read(dts[1,"mediaItems.baseUrl"])
print(oaxaca)
# A tibble: 1 × 7
  format width height colorspace matte filesize density
  <chr>  <int>  <int> <chr>      <lgl>    <int> <chr>  
1 JPEG     384    512 sRGB       FALSE    91613 72x72  

Or, we can download the image to an accessible folder. First we create the folder:

here::i_am("Rcode/google-photos.qmd")
here() starts at /Users/z3529065/proyectos/personal/explicado
img_folder <- here::here("Rcode","img")
if (!dir.exists(img_folder))
  dir.create(img_folder)

Now we use download.file to trigger the download if the file does not exist yet.

photo <- slice(dts,16)

durl <- sprintf("%s=w400-h400-d", 
    photo$mediaItems.baseUrl)
dfile <- sprintf("%s-%s.jpg", 
    abbreviate(photo$mediaItems.id), 
    str_replace_all(photo$mediaItems.description, "[ ,/]+", "-"))
dpath <- file.path(img_folder, dfile)

if (!file.exists(dpath))
    download.file(url = durl, destfile = dpath)
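The str_replace_all call collapses runs of spaces, commas and slashes in the description into single hyphens, so descriptions become safe file names. The base-R equivalent with gsub shows the effect (the description string here is made up):

```r
# Base-R equivalent of the str_replace_all() call above:
# runs of spaces, commas and slashes become a single hyphen.
gsub("[ ,/]+", "-", "Visita a Palenque, Chiapas/2005")
# [1] "Visita-a-Palenque-Chiapas-2005"
```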

The downloaded image is now available locally:

cat(sprintf("![View _%1$s_ in its [Google Photos album](%2$s){target='gphotos'}](img/%3$s)", 
    photo$mediaItems.description,
    photo$mediaItems.productUrl,
    dfile
    ))

View Visita a Palenque in its Google Photos album

Multiple photos from multiple albums

We can select the ids of multiple albums:

album_info <- getalbum$albums %>% select(id, title)

lugares <- c("Lugares - México", "Lugares - Europa", "Lugares - Sur América", "Eventos - Venezuela")

eventos <- c("Eventos - CEBA LEE", "Eventos - RLE", "Eventos - Venezuela", "Eventos - Mariposas", "Eventos - IVIC")

aIDs <- album_info |> filter(title %in% c(lugares, eventos)) |> pull(id)

And use foreach to loop over the albums and combine the results:

photos <- foreach(aID=aIDs, .combine = "bind_rows") %do% {
  dts <-  POST("https://photoslibrary.googleapis.com/v1/mediaItems:search",
      add_headers(
        'Authorization' = authorization,
        'Accept'  = 'application/json'),
      body = list("albumId" = aID,
                  "pageSize" = 50),
      encode = "json"
      ) |> 
    content( as = "text", encoding = "UTF-8") |>
    fromJSON( flatten = TRUE) |>
    data.frame()
  dts$album <- album_info |> 
    filter(id %in% aID) |> pull(title)
  dts <- dts |> 
    mutate(
      output_file = str_replace_all(mediaItems.description, "[ ,/]+", "-"),
      output_id = abbreviate(mediaItems.id))
  dts 
}

Look how many photos we have now!

glimpse(photos)
Rows: 135
Columns: 18
$ mediaItems.id                                  <chr> "ADXhca1KacJZA43EjYWu9f…
$ mediaItems.description                         <chr> "Visita al IUCN Conserv…
$ mediaItems.productUrl                          <chr> "https://photos.google.…
$ mediaItems.baseUrl                             <chr> "https://lh3.googleuser…
$ mediaItems.mimeType                            <chr> "image/jpeg", "image/jp…
$ mediaItems.filename                            <chr> "F9FC0766-5A4D-4E2F-A92…
$ mediaItems.mediaMetadata.creationTime          <chr> "2018-06-26T06:44:48.29…
$ mediaItems.mediaMetadata.width                 <chr> "1024", "1024", "1024",…
$ mediaItems.mediaMetadata.height                <chr> "768", "768", "768", "1…
$ mediaItems.mediaMetadata.photo.cameraMake      <chr> "Apple", "Apple", "Appl…
$ mediaItems.mediaMetadata.photo.cameraModel     <chr> "iPhone 5s", "iPhone 5s…
$ mediaItems.mediaMetadata.photo.focalLength     <dbl> 2.150000, 4.150000, 4.1…
$ mediaItems.mediaMetadata.photo.apertureFNumber <dbl> 2.4, 2.2, 2.2, 2.2, 2.2…
$ mediaItems.mediaMetadata.photo.isoEquivalent   <int> 50, 50, 40, 40, 100, 40…
$ mediaItems.mediaMetadata.photo.exposureTime    <chr> "0.002183406s", "0.0099…
$ album                                          <chr> "Eventos - RLE", "Event…
$ output_file                                    <chr> "Visita-al-IUCN-Conserv…
$ output_id                                      <chr> "ADX1K", "ADX2S", "ADX3…

We can store this information in an RDS file, but remember that the baseUrl values won’t be valid the next time we need them:

file_name <- here::here("data", "google-photos.rds")
saveRDS(photos, file = file_name)
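Reading the data back in a later session is a one-liner with readRDS; the structure survives intact, but the baseUrl column will need to be refreshed with a new API call. A minimal roundtrip sketch with dummy data:

```r
# Save and re-read a data frame: the structure is preserved exactly,
# but remember that the stored baseUrl values expire between sessions.
df <- data.frame(id = c("a", "b"), title = c("x", "y"))
tmp <- tempfile(fileext = ".rds")
saveRDS(df, file = tmp)
df2 <- readRDS(tmp)
identical(df, df2)
# [1] TRUE
```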

In the loop above we added some extra steps to create local file names (the output_id and output_file columns), so now we can download all the files and re-use them in our website:

for (i in seq_along(photos$mediaItems.id)) {
  photo <- photos %>% slice(i)
  durl <- sprintf("%s=w400-h400-d", photo$mediaItems.baseUrl)
  dfile <- sprintf("%s/%s-%s.jpg",img_folder, photo$output_id, photo$output_file)
  if (!file.exists(dfile))
    download.file(url=durl, destfile=dfile)
}

That’s it!

I think this code is now ready to use and reuse in other Quarto and R projects.

Cheers!