Download data

Martin Westgate & Dax Kellie

2023-11-07

The atlas_ functions are used to return data from the atlas chosen using galah_config(). They are:

The final atlas_ function - atlas_citation - is unusual in that it does not return any new data. Instead it provides a citation for an existing dataset ( downloaded using atlas_occurrences) that has an associated DOI. The other functions are described below.

Record counts

atlas_counts() provides summary counts on records in the specified atlas, without needing to download all the records.

galah_config(atlas = "Australia")
# Total number of records in the ALA
atlas_counts()
## # A tibble: 1 × 1
##       count
##       <int>
## 1 131963195

In addition to the filter arguments, it has an optional group_by argument, which provides counts binned by the requested field.

galah_call() |>
  galah_group_by(kingdom) |>
  atlas_counts()
## # A tibble: 11 × 2
##    kingdom       count
##    <chr>         <int>
##  1 Animalia  101176164
##  2 Plantae    26026027
##  3 Fungi       2285235
##  4 Chromista   1020325
##  5 Protista     352282
##  6 Bacteria     113165
##  7 Eukaryota      8821
##  8 Protozoa       4716
##  9 Archaea        4120
## 10 Virus          2306
## 11 Viroid          103

Species lists

A common use case of atlas data is to identify which species occur in a specified region, time period, or taxonomic group. atlas_species() is similar to search_taxa, in that it returns taxonomic information and unique identifiers in a tibble. It differs in not being able to return information on taxonomic levels other than the species; but also in being more flexible by supporting filtering:

species <- galah_call() |>
  galah_identify("Rodentia") |>
  galah_filter(stateProvince == "Northern Territory") |>
  atlas_species()
  
species |> head()
## # A tibble: 6 × 10
##   kingdom  phylum   class    order  family genus species author species_guid vernacular_name
##   <chr>    <chr>    <chr>    <chr>  <chr>  <chr> <chr>   <chr>  <chr>        <chr>          
## 1 Animalia Chordata Mammalia Roden… Murid… Pseu… Pseudo… (Goul… https://bio… Delicate Mouse 
## 2 Animalia Chordata Mammalia Roden… Murid… Mese… Mesemb… (J.E.… https://bio… Black-footed T…
## 3 Animalia Chordata Mammalia Roden… Murid… Zyzo… Zyzomy… (Thom… https://bio… Common Rock-rat
## 4 Animalia Chordata Mammalia Roden… Murid… Pseu… Pseudo… (Wait… https://bio… Sandy Inland M…
## 5 Animalia Chordata Mammalia Roden… Murid… Melo… Melomy… (Rams… https://bio… Grassland Melo…
## 6 Animalia Chordata Mammalia Roden… Murid… Noto… Notomy… Thoma… https://bio… Spinifex Hoppi…

Occurrence data

To download occurrence data you will need to specify your email in galah_config(). This email must be associated with an active ALA account. See more information in the config section

galah_config(email = "your_email@email.com", atlas = "Australia")

Download occurrence records for Eolophus roseicapilla

occ <- galah_call() |>
  galah_identify("Eolophus roseicapilla") |>
  galah_filter(
    stateProvince == "Australian Capital Territory",
    year >= 2010,
    profile = "ALA"
  ) |>
  galah_select(institutionID, group = "basic") |>
  atlas_occurrences()
## Retrying in 1 seconds.
## Retrying in 2 seconds.
## Retrying in 4 seconds.
occ |> head()
## # A tibble: 6 × 9
##   recordID                    scientificName taxonConceptID decimalLatitude decimalLongitude
##   <chr>                       <chr>          <chr>                    <dbl>            <dbl>
## 1 0000a928-d756-42eb-8058-6f… Eolophus rose… https://biodi…           -35.6             149.
## 2 0001bc78-d2e9-48aa-8b9d-d6… Eolophus rose… https://biodi…           -35.2             149.
## 3 0002064f-08ea-425b-97c5-26… Eolophus rose… https://biodi…           -35.3             149.
## 4 00022dd2-9f85-4802-b837-7f… Eolophus rose… https://biodi…           -35.3             149.
## 5 0002cc35-8d5a-4d20-8012-12… Eolophus rose… https://biodi…           -35.3             149.
## 6 00030a8c-082f-44f0-898a-ad… Eolophus rose… https://biodi…           -35.3             149.
## # ℹ 4 more variables: eventDate <dttm>, occurrenceStatus <chr>, dataResourceName <chr>,
## #   institutionID <lgl>

Media metadata

In addition to text data describing individual occurrences and their attributes, ALA stores images, sounds and videos associated with a given record. Metadata on these records can be downloaded to R using atlas_media() and the same set of filters as the other data download functions.

media_data <- galah_call() |>
  galah_identify("Eolophus roseicapilla") |>
  galah_filter(
    year == 2020,
    cl22 == "Australian Capital Territory") |>
  atlas_media()
## Retrying in 1 seconds.
media_data |> head()
## # A tibble: 6 × 19
##   media_id           recordID scientificName taxonConceptID decimalLatitude decimalLongitude
##   <chr>              <chr>    <chr>          <chr>                    <dbl>            <dbl>
## 1 ff8322d0-f44c-47a… 003a192… Eolophus rose… https://biodi…           -35.3             149.
## 2 c66fc819-7022-44f… 015ee7c… Eolophus rose… https://biodi…           -35.4             149.
## 3 fe6d7b94-9e61-4ac… 05e86b7… Eolophus rose… https://biodi…           -35.4             149.
## 4 2f4d32c0-a084-4bb… 063bb0f… Eolophus rose… https://biodi…           -35.6             149.
## 5 73407414-0707-429… 063bb0f… Eolophus rose… https://biodi…           -35.6             149.
## 6 89171c49-5a64-423… 063bb0f… Eolophus rose… https://biodi…           -35.6             149.
## # ℹ 13 more variables: eventDate <dttm>, occurrenceStatus <chr>, dataResourceName <chr>,
## #   multimedia <chr>, images <chr>, videos <lgl>, sounds <lgl>, creator <chr>,
## #   license <chr>, mimetype <chr>, width <int>, height <int>, image_url <chr>

To actually download the media files to your computer, use [collect_media()].

Taxonomic trees

atlas_taxonomy provides a way to build taxonomic trees from one clade down to another using each service’s internal taxonomy. Specify which taxonomic level your tree will go down to with galah_filter() using the rank argument.

galah_call() |>
  galah_identify("chordata") |>
  galah_filter(rank == class) |>
  atlas_taxonomy()
## # A tibble: 19 × 4
##    name            rank      parent_taxon_concept_id                        taxon_concept_id
##    <chr>           <chr>     <chr>                                          <chr>           
##  1 Chordata        phylum    <NA>                                           https://biodive…
##  2 Cephalochordata subphylum https://biodiversity.org.au/afd/taxa/065f1da4… https://biodive…
##  3 Tunicata        subphylum https://biodiversity.org.au/afd/taxa/065f1da4… https://biodive…
##  4 Appendicularia  class     https://biodiversity.org.au/afd/taxa/1c20ed62… https://biodive…
##  5 Ascidiacea      class     https://biodiversity.org.au/afd/taxa/1c20ed62… https://biodive…
##  6 Thaliacea       class     https://biodiversity.org.au/afd/taxa/1c20ed62… https://biodive…
##  7 Vertebrata      subphylum https://biodiversity.org.au/afd/taxa/065f1da4… https://biodive…
##  8 Agnatha         informal  https://biodiversity.org.au/afd/taxa/5d6076b1… https://biodive…
##  9 Myxini          informal  https://biodiversity.org.au/afd/taxa/66db22c8… https://biodive…
## 10 Petromyzontida  informal  https://biodiversity.org.au/afd/taxa/66db22c8… https://biodive…
## 11 Gnathostomata   informal  https://biodiversity.org.au/afd/taxa/5d6076b1… https://biodive…
## 12 Amphibia        class     https://biodiversity.org.au/afd/taxa/ef5515fd… https://biodive…
## 13 Aves            class     https://biodiversity.org.au/afd/taxa/ef5515fd… https://biodive…
## 14 Mammalia        class     https://biodiversity.org.au/afd/taxa/ef5515fd… https://biodive…
## 15 Pisces          informal  https://biodiversity.org.au/afd/taxa/ef5515fd… https://biodive…
## 16 Actinopterygii  class     https://biodiversity.org.au/afd/taxa/e22efeb4… https://biodive…
## 17 Chondrichthyes  class     https://biodiversity.org.au/afd/taxa/e22efeb4… https://biodive…
## 18 Sarcopterygii   class     https://biodiversity.org.au/afd/taxa/e22efeb4… https://biodive…
## 19 Reptilia        class     https://biodiversity.org.au/afd/taxa/ef5515fd… https://biodive…

Configuring galah

Various aspects of the galah package can be customized.

Email

To download occurrence records, you will need to provide an email address registered with the service that you want to use (e.g. for the ALA you can create an account here). Once an email is registered, it should be stored in the config:

galah_config(email = "myemail@gmail.com")

Setting your directory

By default, galah stores downloads in a temporary folder, meaning that the local files are automatically deleted when the R session is ended. This behaviour can be altered so that downloaded files are preserved by setting the directory to a non-temporary location.

galah_config(directory = "example/dir")

Setting the download reason

ALA requires that you provide a reason when downloading occurrence data (via the galah atlas_occurrences() function). The reason is set as “scientific research” by default, but you can change this using galah_config(). See show_all_reasons() for valid download reasons.

galah_config(download_reason_id = your_reason_id)

Debugging

If things aren’t working as expected, more detail (particularly about web requests and caching behaviour) can be obtained by setting the verbose configuration option:

galah_config(verbose = TRUE)