This short vignette demonstrates how to download and annotate images inside Tweets. Besides
imgrec, we will use rtweet to get Twitter data and the dplyr package for data wrangling.
Before we start, access credentials are required for Twitter and Google Cloud Vision. For this example, all credentials are stored as R environment variables. Check out the rtweet authentication vignette to obtain and use Twitter API access tokens. The authentication for Google Cloud Vision is described in the imgrec intro vignette.
# load libraries
library(imgrec)
library(rtweet)
library(dplyr)

# prepare twitter credentials
app_name <- Sys.getenv('twitter_app_name')
consumer_key <- Sys.getenv('twitter_consumer_key')
consumer_secret <- Sys.getenv('twitter_consumer_secret')
access_token <- Sys.getenv('twitter_access_token')
access_token_secret <- Sys.getenv('twitter_access_secret')

# obtain twitter access token
token <- create_token(app = app_name,
                      consumer_key = consumer_key,
                      consumer_secret = consumer_secret,
                      access_token = access_token,
                      access_secret = access_token_secret,
                      set_renv = TRUE)

# set up authentication for google vision
gvision_init()
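If authentication fails, an unset environment variable is a likely cause. The following is a minimal sketch of a check that uses base R only; `missing_env_vars()` is a hypothetical helper written for this example and is not part of imgrec or rtweet.

```r
# Names of the environment variables used in this vignette
required_vars <- c("twitter_app_name",
                   "twitter_consumer_key",
                   "twitter_consumer_secret",
                   "twitter_access_token",
                   "twitter_access_secret")

# Hypothetical helper: return the names of variables that are unset or empty
missing_env_vars <- function(vars) {
  values <- Sys.getenv(vars, unset = "")
  vars[values == ""]
}

missing <- missing_env_vars(required_vars)
if (length(missing) > 0) {
  message("Unset environment variables: ", paste(missing, collapse = ", "))
}
```

Running this before the authentication calls reports which variables still need to be defined, for example in your `.Renviron` file.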
If you know the status IDs of the Tweets that you would like to obtain, you can use
lookup_tweets(), which takes a vector of status IDs as input and retrieves all corresponding tweets. URLs of images (and videos) are stored in the list column media_url.
We use one of the most-retweeted tweets posted by Barack Obama as an example:
“No one is born hating another person because of the color of his skin or his background or his religion…” (Barack Obama, Twitter Status)
# pass the status ID as a string so that it is not converted to a floating-point number
example <- lookup_tweets("896523232098078720")
example$media_url
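When several Tweets are looked up at once, the list column holds one entry per Tweet. The sketch below shows one way to flatten such a column into a plain vector of image URLs. It uses mock data rather than a live API call: the IDs and URLs are made up, and it assumes that Tweets without media hold `NA` in this column.

```r
# Mock data standing in for the output of lookup_tweets(); the IDs and
# URLs are made up for illustration
tweets <- data.frame(status_id = c("1", "2", "3"), stringsAsFactors = FALSE)
tweets$media_url <- list("https://example.com/a.jpg",
                         NA_character_,
                         c("https://example.com/b.jpg",
                           "https://example.com/c.jpg"))

# Flatten the list column into one character vector and drop missing entries
image_urls <- unlist(tweets$media_url)
image_urls <- image_urls[!is.na(image_urls)]
image_urls
```

A character vector like `image_urls` is a natural candidate for the `images` argument, although the link between each URL and its Tweet is lost after flattening.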
Now, we retrieve and parse annotations for the Tweet image:
results <- get_annotations(images = example$media_url[[1]], # image URL(s) of the first tweet
                           max_res = 10, # max. number of labels
                           mode = "url", # we pass an image url
                           features = 'all') %>%
  parse_annotations()

names(results) # features obtained by Google Cloud Vision
And that’s it! The results are stored in a list object that includes data frames for all annotations retrieved from Google Cloud Vision:
results$labels %>% head()
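Because each element of the list is an ordinary data frame, it can be filtered and sorted like any other. The following sketch keeps only labels above a confidence threshold. It runs on a mock object, not on a real API response: the values are invented, and the column names `description` and `score` are an assumption based on the fields Google Cloud Vision returns for label annotations, so check `names(results$labels)` before adapting it.

```r
# Mock result standing in for parse_annotations() output; the column
# names and values are assumptions, not taken from an actual API response
results_mock <- list(
  labels = data.frame(description = c("Smile", "Window", "Child", "Fun"),
                      score = c(0.97, 0.61, 0.88, 0.74),
                      stringsAsFactors = FALSE)
)

# Keep labels at or above a confidence threshold, highest score first
confident <- results_mock$labels[results_mock$labels$score >= 0.7, ]
confident <- confident[order(confident$score, decreasing = TRUE), ]
confident$description
```

The threshold of 0.7 is arbitrary and only serves to illustrate the filtering step.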