Code Snippet: Automatically hydrating tweets in rtweet while avoiding rate limits
The documentation of the R package rtweet states the following about the function lookup_statuses():
Returns data on up to 90,000 Twitter statuses. To return data on more than 90,000 statuses, users must iterate through status IDs whilst avoiding rate limits, which reset every 15 minutes.
Here is one possible solution to this problem. It assumes a .rds file containing a vector of the status IDs one wishes to hydrate. The code divides the status IDs into chunks of at most 90,000 and hydrates them one at a time, halting the console for 15 minutes in between downloads, which guarantees a rate-limit reset. Each chunk is saved as a .rds file with the signature hydrated_n.rds for download n.
library(rtweet)

# Vector of tweet status IDs to hydrate
status_ids <- readRDS("status_ids.rds")

# Chunk boundaries: every 89,999th index plus the last index, so each
# chunk spans at most 90,000 IDs (adjacent chunks share one boundary ID)
chunk_reference <- c(seq(1, length(status_ids), 89999), length(status_ids))

for (i in 2:length(chunk_reference)) {
  filename <- paste0("hydrated_", i - 1, ".rds")
  print(paste0("Hydrating chunk ", i - 1, " out of ", length(chunk_reference) - 1))
  tweets_dl <- lookup_statuses(status_ids[chunk_reference[i - 1]:chunk_reference[i]])
  print(paste0("Saving chunk ", i - 1, " out of ", length(chunk_reference) - 1))
  saveRDS(tweets_dl, filename)
  if (i < length(chunk_reference)) {
    print("Setting console to sleep for 15 minutes")
    Sys.sleep(15 * 60)  # wait out the 15-minute rate-limit window
  }
}
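
Once all chunks have been downloaded, they can be read back in and combined into a single data frame. The following sketch assumes the hydrated_n.rds files sit in the working directory and that each one holds the data frame returned by lookup_statuses(); the status_id column used for de-duplication is an assumption based on older rtweet versions.

# List all hydrated_n.rds files in the working directory
files <- list.files(pattern = "^hydrated_\\d+\\.rds$")
# Read each chunk and stack the rows into one data frame
all_tweets <- do.call(rbind, lapply(files, readRDS))
# Adjacent chunks share one boundary ID, so drop any duplicated rows
# (status_id is assumed to be the ID column, as in older rtweet versions)
all_tweets <- all_tweets[!duplicated(all_tweets$status_id), ]
saveRDS(all_tweets, "hydrated_all.rds")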