Sections
Text Mining Plan
Twitter API
twitteR package
Obtain Authorization Info from Twitter
Run Search Function to Get Tweets
Convert to DataFrame
[Diagram: text mining plan]
Sources: Emails, Newsgroups, twitteR
Text Mining (tm): Corpus, Transformations, TermDocument Matrix, findFreqTerms, removeSparseTerms, findAssocs, Sentiment (sentiment140), wordcloud, Topic Modeling, Classification
Network Analysis (igraph): igraph Object, Term Adjacency Matrix, Network Graph, Communities
Twitter Search: number of times retweeted, who tweeted
What Kind of Data Can You Get? If an API is public: WYSIWYG (What You See Is What You Get)
What Kind of Data Can You Get?
https://dev.twitter.com/rest/public REST stands for Representational State Transfer. (It is sometimes spelled "ReST".) It relies on a stateless, client-server, cacheable communications protocol -- and in virtually all cases, the HTTP protocol is used.
The Search API https://dev.twitter.com/rest/public/search
How to Build a Query https://dev.twitter.com/rest/public/search
Twitter – JSON Only https://dev.twitter.com/faq/rest-api-v1.1 https://twittercommunity.com/t/deprecation-of-xml-response-type-for-single-tweet-oembed/62013
#RedSox https://twitter.com/search?q=%23redsox&src=typd
Replace "https://twitter.com/search" with "https://api.twitter.com/1.1/search/tweets.json" and you will get:
https://api.twitter.com/1.1/search/tweets.json?q=%23redsox
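The substitution above can be sketched in base R. This is only an illustration of how the search term ends up percent-encoded in the query string, not a description of how twitteR builds its requests; `build_search_url` is a hypothetical helper name, and the only library call is `URLencode` from the `utils` package that ships with R.

```r
# Hypothetical helper: turn a search term into a Search API URL.
build_search_url <- function(term) {
  base <- "https://api.twitter.com/1.1/search/tweets.json"
  # reserved = TRUE percent-encodes reserved characters such as "#" (to "%23")
  query <- utils::URLencode(term, reserved = TRUE)
  paste0(base, "?q=", query)
}

api_url <- build_search_url("#redsox")
print(api_url)
```

The helper only builds the string; it does not send a request.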
twitteR: searchTwitter("#RedSox", 1500) corresponds to https://api.twitter.com/1.1/search/tweets.json?q=%23redsox
searchTwitter("#RedSox", 1500)
Function: searchTwitter
Search Term: "#RedSox"
n: 1500
The result of searchTwitter("#RedSox", 1500) is then passed to twListToDF.
Framework (Source: Hadley Wickham)
Data Structures
Numeric Vector: a <- c(1,2,5.3,6,-2,4)
Character Vector: b <- c("one","two","three")
Matrix: y <- matrix(1:20, nrow=5, ncol=4)
Dataframe:
d <- c(1,2,3,4)
e <- c("red", "white", "red", NA)
f <- c(TRUE,TRUE,TRUE,FALSE)
mydata <- data.frame(d,e,f)
names(mydata) <- c("ID","Color","Passed")
List: w <- list(name="Fred", age=5.3)
Functions: searchTwitter, twListToDF
searchTwitter: issues a search of Twitter
twListToDF: converts the list of tweets into a data.frame
# search twitter
tweet_rstats <- searchTwitter("#rstats", 1500)
length(tweet_rstats)
head(tweet_rstats)
# convert the list of tweets to a data.frame and look at the text column
tweets.df <- twListToDF(tweet_rstats)
tweets.df$text
head(tweets.df$text)
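The list-to-data.frame step can be tried without a Twitter account. The sketch below is not twListToDF itself; it assumes plain named lists in place of tweet objects, uses made-up values, and `records_to_df` is a hypothetical helper. The field names are chosen only to resemble what a tweet might carry.

```r
# Toy records standing in for tweet objects (made-up values, not real tweets).
toy_tweets <- list(
  list(text = "first example", screenName = "user_a", retweetCount = 2),
  list(text = "second example", screenName = "user_b", retweetCount = 0)
)

# Hypothetical helper: one row per record, one column per field.
records_to_df <- function(records) {
  rows <- lapply(records, function(r) as.data.frame(r, stringsAsFactors = FALSE))
  do.call(rbind, rows)
}

toy_df <- records_to_df(toy_tweets)
class(toy_df)   # the list has become a data.frame
nrow(toy_df)    # one row per record
toy_df$text     # a column is an ordinary character vector
```

Once the data is in a data.frame, the usual column access with `$` and functions such as `head` apply, which is what the slide does with `tweets.df$text`.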
searchTwitter https://github.com/geoffjentry/twitteR
[Diagram: Website or API?]
What data is available? WYSIWYG or Subset
Is a Key needed? Public or Private / Key
Format: XML, XML and/or JSON, JSON
Most organizations require you to register. You will then receive an API Key.
https://apps.twitter.com
“Create an Application”
Uh oh. Twitter Wants My Phone Numbers
Notification Settings
Keys
Access Tokens
twitteR and Authorization
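One way to keep the four values out of a script is to read them from environment variables. This is a sketch under assumptions, not part of twitteR: `read_twitter_credentials` is a hypothetical helper, and the environment variable names are invented for this example. Only base R functions (`Sys.getenv`, `setNames`) are called.

```r
# Hypothetical helper: read the four credentials from environment variables.
# The variable names below are assumptions chosen for this sketch.
read_twitter_credentials <- function() {
  vars <- c(api_key             = "TWITTER_API_KEY",
            api_secret          = "TWITTER_API_SECRET",
            access_token        = "TWITTER_ACCESS_TOKEN",
            access_token_secret = "TWITTER_ACCESS_TOKEN_SECRET")
  # unset = NA lets us tell "not set" apart from an empty string
  values <- Sys.getenv(vars, unset = NA)
  absent <- vars[is.na(values)]
  if (length(absent) > 0) {
    stop("Missing environment variables: ", paste(absent, collapse = ", "))
  }
  as.list(setNames(unname(values), names(vars)))
}

# Usage (commented out because it needs real keys and the twitteR package):
# creds <- read_twitter_credentials()
# setup_twitter_oauth(creds$api_key, creds$api_secret,
#                     creds$access_token, creds$access_token_secret)
```

The variables could be set in a `.Renviron` file or in the shell before starting R.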
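Pull Tweets
library(twitteR)
# api_key, api_secret, access_token, access_token_secret: the values from the Keys and Access Tokens pages
setup_twitter_oauth(api_key, api_secret, access_token, access_token_secret)
# search twitter
tweets <- searchTwitter("#redsox", 1000)
class(tweets)
length(tweets)
head(tweets)
# convert the list of tweets to a data.frame and save it
tweets.df <- twListToDF(tweets)
class(tweets.df)
write.csv(tweets.df, "redsox_tweets.csv", row.names=FALSE)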
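Once the tweets are saved, the CSV can be read back and summarized, for example by how often each tweet was retweeted and who tweeted. The sketch below uses a small made-up data.frame and a temporary file so that it runs without Twitter access. The column names `screenName` and `retweetCount` are assumed to match what twListToDF produces; check `names(tweets.df)` on real data.

```r
# Made-up rows standing in for the output of twListToDF (not real tweets).
toy_df <- data.frame(
  text = c("go sox", "game tonight", "what a catch"),
  screenName = c("user_a", "user_b", "user_a"),
  retweetCount = c(3, 0, 12),
  stringsAsFactors = FALSE
)

# Write to a temporary file, the same way the slide writes redsox_tweets.csv.
csv_path <- tempfile(fileext = ".csv")
write.csv(toy_df, csv_path, row.names = FALSE)

# Read the file back and summarize it.
saved <- read.csv(csv_path, stringsAsFactors = FALSE)
by_retweets <- saved[order(saved$retweetCount, decreasing = TRUE), ]
tweets_per_user <- table(saved$screenName)

print(by_retweets$text)   # most retweeted first
print(tweets_per_user)    # number of tweets per screen name
```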