TWITTER 3 DAY /12/14 LING 3820 & 6820 Natural Language Processing Harry Howard Tulane University
Course organization
The syllabus is under construction.
Chapter numbering:
    3.7. How to deal with non-English characters
    4.5. How to create a pattern with Unicode characters
    6. Control
10-Nov-2014, NLP, Prof. Howard, Tulane University
Open Spyder
Twitter Review
logon()
    def logon():
        # Build the OAuth handler that tweepy needs to talk to Twitter.
        import tweepy
        # Replace each placeholder with the credentials of your own Twitter app.
        API_KEY = 'your_info_here'
        API_SECRET = 'your_info_here'
        ACCESS_TOKEN = 'your_info_here'
        ACCESS_TOKEN_SECRET = 'your_info_here'
        key = tweepy.OAuthHandler(API_KEY, API_SECRET)
        key.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)
        return key
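Hard-coding the four credentials makes tweepies.py awkward to share. A minimal sketch of one alternative, assuming the credentials are stored in environment variables. The variable names and the helpers load_credentials and logon_from_env are hypothetical and not part of tweepies.py:

```python
import os

# Hypothetical environment variable names; not part of the original tweepies.py.
REQUIRED = (
    "TWITTER_API_KEY",
    "TWITTER_API_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
)


def load_credentials(env=None):
    """Return the four credentials as a dict, or raise KeyError if any is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise KeyError("missing credentials: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED}


def logon_from_env(env=None):
    """Build the same kind of OAuth handler as logon(), reading credentials from env."""
    import tweepy  # imported here so load_credentials works without tweepy installed

    creds = load_credentials(env)
    key = tweepy.OAuthHandler(creds["TWITTER_API_KEY"], creds["TWITTER_API_SECRET"])
    key.set_access_token(
        creds["TWITTER_ACCESS_TOKEN"], creds["TWITTER_ACCESS_TOKEN_SECRET"]
    )
    return key
```

The handler returned by either function is what gets passed to tweepy.API(...) for the REST calls.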
The other functions of tweepies.py
    stream2screen(num, terms)
    stream2var(num, terms)
    stream2file(num, terms)
    json2screen(num, terms)
    json2screenpretty(num, terms)
    dict2screen(num, terms)
    dict2var(num, terms)
Quiz
Task: can you find a group of words that will distinguish two Twitter topics?
How to do it:
    1. Collect 500+ tweets from each of two trending topics, one variable per topic.
    2. Run each variable through a FreqDist to find frequent words that may be unique to that topic (filter out the stop words first).
    3. Use these keywords in a ConditionalFreqDist to show how well they would identify or classify each topic.
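The quiz steps can be sketched offline. This is only an illustration under stated assumptions: the tweets are invented toy data, the stop list is hand-made, and collections.Counter stands in for NLTK's FreqDist and ConditionalFreqDist so that the sketch runs without NLTK or a Twitter connection. All function names are hypothetical.

```python
from collections import Counter

# Invented toy tweets, standing in for the 500+ collected per topic.
TOPIC_TWEETS = {
    "veterans": [
        "Thank you to all who served on Veterans Day",
        "Veterans Day parade downtown today",
    ],
    "football": [
        "The Saints game today was wild",
        "What a game for the Saints defense",
    ],
}

# A tiny hand-made stop list; NLTK's stopwords corpus would be the fuller choice.
STOP_WORDS = {"the", "a", "to", "all", "who", "on", "for", "was", "what", "you"}


def word_counts(tweets):
    """Count lowercased, whitespace-split words, skipping stop words."""
    counts = Counter()
    for tweet in tweets:
        counts.update(w for w in tweet.lower().split() if w not in STOP_WORDS)
    return counts


def unique_keywords(counts_by_topic, topic, n=2):
    """Return the n most frequent words of `topic` that occur in no other topic."""
    others = set()
    for name, counts in counts_by_topic.items():
        if name != topic:
            others.update(counts)
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts_by_topic[topic].items(), key=lambda wc: (-wc[1], wc[0]))
    return [w for w, _ in ranked if w not in others][:n]


def keyword_table(counts_by_topic, keywords):
    """Per-topic counts of each keyword (the role ConditionalFreqDist plays)."""
    return {
        topic: {w: counts[w] for w in keywords}
        for topic, counts in counts_by_topic.items()
    }


counts_by_topic = {t: word_counts(tw) for t, tw in TOPIC_TWEETS.items()}
keywords = (unique_keywords(counts_by_topic, "veterans")
            + unique_keywords(counts_by_topic, "football"))
table = keyword_table(counts_by_topic, keywords)
for topic in sorted(table):
    print(topic, [(w, table[topic][w]) for w in keywords])
```

A keyword works well for the quiz when its count is high under one topic and zero under the other, which is what the printed table makes visible.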
tweepy's REST API
How to access Twitter's APIs
tweepy gives access to two Twitter APIs: streaming and REST (representational state transfer).
New version of tweepies
New functions:
    timeline(num, userName)
    trends()
    localTrends(WOEID)
    srch(num, query)
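The slides do not show how these functions are written. As a rough sketch only, a timeline-style helper could wrap tweepy's user_timeline call. The name timeline_texts, the extra api parameter, and the FakeAPI and FakeStatus stand-ins are assumptions made so the sketch runs offline; the real tweepies.timeline may work differently.

```python
def timeline_texts(api, num, user_name):
    """Return the text of up to `num` recent tweets by `user_name`.

    `api` is expected to behave like a tweepy.API object, i.e. to offer
    user_timeline(screen_name=..., count=...) returning objects with a .text.
    """
    statuses = api.user_timeline(screen_name=user_name, count=num)
    return [status.text for status in statuses]


class FakeStatus(object):
    """Offline stand-in for a tweepy status; only carries .text."""

    def __init__(self, text):
        self.text = text


class FakeAPI(object):
    """Offline stand-in for tweepy.API, so the sketch runs without credentials."""

    def user_timeline(self, screen_name, count):
        return [FakeStatus("tweet %d from %s" % (i, screen_name))
                for i in range(count)]


print(timeline_texts(FakeAPI(), 2, "someone"))
```

With real credentials, the FakeAPI() argument would be replaced by tweepy.API(logon()).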
Usage
    >>> from tweepies import timeline
    >>> timeline(1,'JustinBieber')
    >>> from tweepies import trends
    >>> world = trends()
    >>> for t in world:
    ...     print t['name'], t['countryCode'], t['woeid']
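The list returned by trends() is what supplies the WOEID that localTrends() needs. A minimal sketch of looking one up, assuming each entry is a dict with the 'name', 'countryCode' and 'woeid' keys used in the loop above. The place names and WOEID values below are invented placeholders, and find_woeids and by_country are hypothetical helpers:

```python
# Invented sample entries shaped like the dicts printed in the loop above.
SAMPLE_LOCATIONS = [
    {"name": "Exampleville", "countryCode": "US", "woeid": 1001},
    {"name": "Sampletown", "countryCode": "US", "woeid": 1002},
    {"name": "Testburg", "countryCode": "DE", "woeid": 2001},
]


def find_woeids(locations, place_name):
    """Return the WOEIDs of all locations whose name matches, ignoring case."""
    wanted = place_name.lower()
    return [loc["woeid"] for loc in locations if loc["name"].lower() == wanted]


def by_country(locations, country_code):
    """Return the names of all locations with the given country code."""
    return [loc["name"] for loc in locations if loc["countryCode"] == country_code]


print(find_woeids(SAMPLE_LOCATIONS, "sampletown"))
print(by_country(SAMPLE_LOCATIONS, "US"))
```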
Usage, cont.
    >>> from tweepies import localTrends
    >>> import pprint
    >>> nola = localTrends(' ')
    >>> pprint.pprint(nola)
    [{u'as_of': u' T19:36:01Z',
      u'created_at': u' T19:29:32Z',
      u'locations': [{u'name': u'New Orleans', u'woeid': }],
      u'trends': [{u'name': u'Veterans Day',
                   u'promoted_content': None,
                   u'query': u'%22Veterans+Day%22',
                   u'url': u' 
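The result is a list holding one dict, with the trends nested under the 'trends' key. A small sketch of pulling out each trend's name and its decoded search query, assuming that structure. The sample below copies only the fields legible in the output above, and trend_queries is a hypothetical helper:

```python
try:
    from urllib.parse import unquote_plus  # Python 3
except ImportError:
    from urllib import unquote_plus  # Python 2

# Cut-down sample with the same nesting as the printed result.
SAMPLE_RESULT = [
    {
        "locations": [{"name": "New Orleans"}],
        "trends": [
            {
                "name": "Veterans Day",
                "promoted_content": None,
                "query": "%22Veterans+Day%22",
            }
        ],
    }
]


def trend_queries(result):
    """Return (name, decoded query) pairs for every trend in the result."""
    trends = result[0]["trends"]
    # unquote_plus turns '+' into a space and '%22' into a double quote.
    return [(t["name"], unquote_plus(t["query"])) for t in trends]


print(trend_queries(SAMPLE_RESULT))
```

The decoded query keeps its double quotes, which ask the search to match the phrase exactly.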
Usage, cont.
    >>> from tweepies import srch
    >>> VD = srch(20, 'Veterans Day')
    >>> pprint.pprint(VD)
Next time
Something else