Examining Hurricane Irma with Twitter Data


Examining Hurricane Irma with Twitter Data and Machine Learning
Gbemisola Oladosu, Dylan Johnson
(Mentors: Dr. Chien-fei Chen, Dr. Xiaojing Xu, Zach McMichael, Julian Ball)

Outline
- Introduction
  - Purpose
  - Research Questions
- Methods
  - Data Collection
  - Text Cleaning
  - Descriptives
  - Bag-of-words
  - Doc2Vec
  - Doc2Vec Models
- Results
- Conclusion
- Future Work

Introduction
- The proportion of Category 4 and 5 hurricanes in the US has increased in the past two decades [1]
- The cost of hurricane damage is increasing [2]
- Larger responses are necessary
- Higher risk for social issues
- Measuring response effectiveness and social issues is important

Introduction (Cont.)
- How did tweet content and purpose change over time?
- Which tweet content categories had the most complaints, messages of appreciation, or requests for help?
- How did people feel about the government?

Methods
- 10,784 Irma-related tweets from September 2017
- Tweets were labeled for content (Code 1) and purpose (Code 2)
- Removed:
  - Duplicates
  - Labeling mistakes

Methods (Cont.)
- Text was cleaned:
  - Non-English
  - Retweets
  - Punctuation
  - Articles
  - URLs
  - Set to lowercase
- Original Text: Senior community finally gets power restored 18 days after Irma http://on.wtsp.com/2xOJoDN pic.twitter.com/Og1dyXLMEa
- Cleaned Text: senior community finally gets power restored days irma
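
The cleaning steps listed on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the `clean_tweet` function, the URL pattern, and the stop-word set are assumptions. The slide names only articles, but its example also drops "18" and "after", so the sketch removes digits and one extra stop word to reproduce that output. Language filtering and retweet removal are left out.

```python
import re
import string

# Assumption: the slide lists only "articles", but its example also drops
# "18" and "after", so digits and one extra stop word are removed here too.
STOP_WORDS = {"a", "an", "the", "after"}

# Matches http(s) links and Twitter media links such as pic.twitter.com/...
URL_PATTERN = re.compile(r"https?://\S+|pic\.twitter\.com/\S+")


def clean_tweet(text):
    """Lowercase a tweet and strip URLs, punctuation, digits, and stop words."""
    text = URL_PATTERN.sub(" ", text)
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    text = re.sub(r"\d+", " ", text)
    return " ".join(tok for tok in text.split() if tok not in STOP_WORDS)


original = (
    "Senior community finally gets power restored 18 days after Irma "
    "http://on.wtsp.com/2xOJoDN pic.twitter.com/Og1dyXLMEa"
)
cleaned = clean_tweet(original)
print(cleaned)
```

URLs are removed before punctuation so that the links are still recognizable when the pattern runs.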

Methods (Cont.)
- Oversampled to 10,000
[Figure: Composition of Tweet Purpose (Code 2)]
[Figure: Composition of Tweet Content (Code 1)]
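
The slides do not say how the oversampling was done. One common approach is random oversampling, where minority-class examples are resampled with replacement until the classes are balanced. The sketch below assumes that approach; `random_oversample` and the toy labels are hypothetical, and it balances to the largest class rather than to a fixed total such as 10,000.

```python
import random
from collections import Counter, defaultdict


def random_oversample(texts, labels, seed=0):
    """Resample minority classes with replacement up to the largest class size."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in zip(texts, labels):
        by_label[label].append(text)

    target = max(len(items) for items in by_label.values())
    out_texts, out_labels = [], []
    for label, items in by_label.items():
        # Keep every original example, then add random duplicates as needed.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


# Toy data (hypothetical): three complaints and one request for help.
texts = ["no power", "roads blocked", "still flooded", "need water"]
labels = ["complaint", "complaint", "complaint", "request"]
balanced_texts, balanced_labels = random_oversample(texts, labels)
print(Counter(balanced_labels))
```

Because duplicates of the same tweet can land in both training and test sets, oversampling is usually applied to the training split only.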

Methods (Cont.)
- Neural networks require numeric inputs
- Bag of Words feature vector
- Does not represent:
  - Relationships between words
  - Word similarity
  - Word order
  (‘Large’ is as far from ‘fat’ as it is from ‘cat’)

Example sentence: The fat cat sat on the mat.

Vocabulary   Vector
The          2
Fat          1
Cat          1
Sat          1
On           1
Mat          1
Rat          0
Blue         0
Ran          0
Large        0
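
The bag-of-words example on this slide can be reproduced with a short Python sketch. The function names are illustrative, and the tokenization (splitting on spaces, stripping basic punctuation) is an assumption. The sketch also shows the two limitations the slide points to: reordering the words gives the same vector, and in a one-hot encoding 'large' is exactly as far from 'fat' as from 'cat'.

```python
import math
from collections import Counter

VOCABULARY = ["the", "fat", "cat", "sat", "on", "mat", "rat", "blue", "ran", "large"]


def bag_of_words(sentence, vocabulary=VOCABULARY):
    """Count how often each vocabulary word occurs in the sentence."""
    tokens = [word.strip(".,!?").lower() for word in sentence.split()]
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


def one_hot(word, vocabulary=VOCABULARY):
    """Represent a single word as a vector with one non-zero entry."""
    return [1 if word == entry else 0 for entry in vocabulary]


vector = bag_of_words("The fat cat sat on the mat.")
shuffled = bag_of_words("The mat sat on the fat cat.")

# Every pair of distinct one-hot vectors is the same distance apart.
large_to_fat = math.dist(one_hot("large"), one_hot("fat"))
large_to_cat = math.dist(one_hot("large"), one_hot("cat"))

print(vector)
print(vector == shuffled, large_to_fat == large_to_cat)
```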

Doc2Vec
- Uses word embeddings (many-dimensional vectors) to represent:
  - Word similarity
  - Word order
  - Relationships between words [3]
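
To illustrate what "word similarity" means for dense embeddings, the sketch below compares hand-picked vectors with cosine similarity. It does not train Doc2Vec: the three-dimensional vectors in `EMBEDDINGS` are hypothetical values chosen for illustration, whereas a real model learns vectors with many more dimensions from the corpus.

```python
import math

# Hypothetical 3-dimensional embeddings, hand-picked for illustration only.
EMBEDDINGS = {
    "large": [0.9, 0.1, 0.0],
    "fat": [0.8, 0.2, 0.1],
    "cat": [0.1, 0.9, 0.3],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


sim_large_fat = cosine_similarity(EMBEDDINGS["large"], EMBEDDINGS["fat"])
sim_large_cat = cosine_similarity(EMBEDDINGS["large"], EMBEDDINGS["cat"])
print(sim_large_fat > sim_large_cat)
```

Unlike one-hot vectors, dense vectors can place 'large' closer to 'fat' than to 'cat'.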

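Methods (Cont.)
- Eight Doc2Vec models were created to predict Code 1 and Code 2

Dependent Variable   Text Type   Sampling      Model Accuracy
Code 1               Original    Normal        17.5%
Code 1               Cleaned     Normal        18.6%
Code 1               Original    Oversampled   38.1%
Code 1               Cleaned     Oversampled   43.3%
Code 2               Original    Normal        11.1%
Code 2               Cleaned     Normal        15.3%
Code 2               Original    Oversampled   20.6%
Code 2               Cleaned     Oversampled   31.6%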

Results (Cont.)
- Only complaints at government
  - Mostly regarding infrastructure and animals
- Tweets of appreciation were related to health, infrastructure, and social issues/crimes
- Requests for help related to animals
[Figure: Code 1 vs. Code 2]
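
A comparison of content (Code 1) against purpose (Code 2) amounts to a cross-tabulation of the two labels. The sketch below shows one way to build it with the standard library. The records in `LABELED_TWEETS` are hypothetical and are not the study's data; the function names are also illustrative.

```python
from collections import Counter

# Hypothetical labeled tweets as (content, purpose) pairs; not the study's data.
LABELED_TWEETS = [
    ("infrastructure", "complaint"),
    ("infrastructure", "complaint"),
    ("infrastructure", "appreciation"),
    ("animals", "request for help"),
    ("animals", "complaint"),
    ("health", "appreciation"),
]


def cross_tabulate(pairs):
    """Count how often each (content, purpose) combination occurs."""
    return Counter(pairs)


def top_content_for(purpose, table):
    """Return the content category with the most tweets of the given purpose."""
    totals = {content: n for (content, p), n in table.items() if p == purpose}
    return max(totals, key=totals.get) if totals else None


table = cross_tabulate(LABELED_TWEETS)
print(top_content_for("complaint", table))
print(top_content_for("request for help", table))
```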

Results (Cont.)
- Tweet volume increased dramatically after landfall
- Tweet composition remained constant
[Figures: Code 1 vs. time]

Results (Cont.)
- Model predictions were not very accurate
- ‘Relief efforts’ was predicted often before the hurricane made landfall
- ‘Recovery info’ and ‘relief efforts’ were predicted disproportionately
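
One simple way to check whether a label is predicted disproportionately is to compare how often the model predicts it with how often it truly occurs. The sketch below is an illustration under that assumption; `prediction_ratio` and the toy labels are hypothetical, and the slides do not state how the authors made this comparison.

```python
from collections import Counter


def prediction_ratio(true_labels, predicted_labels):
    """For each true label, divide its predicted count by its actual count."""
    true_counts = Counter(true_labels)
    predicted_counts = Counter(predicted_labels)
    return {
        label: predicted_counts[label] / count
        for label, count in true_counts.items()
    }


# Toy labels (hypothetical): the model over-predicts "relief efforts".
true_labels = ["relief efforts", "recovery info", "infrastructure", "infrastructure"]
predicted_labels = ["relief efforts", "relief efforts", "relief efforts", "infrastructure"]

ratios = prediction_ratio(true_labels, predicted_labels)
accuracy = sum(t == p for t, p in zip(true_labels, predicted_labels)) / len(true_labels)
print(ratios, accuracy)
```

A ratio well above 1 means the label is predicted more often than it occurs; a ratio near 0 means the model almost never predicts it.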

Conclusion
- Predictions were inaccurate due to a small training set of fewer than 1,400 tweets
- The Code 1 model’s accuracy of 43% and the Code 2 model’s accuracy of 31% suggest more data could produce better results

Conclusion (Cont.)
- Analyzing the results of a more accurate model can allow future researchers to determine the most impactful hurricane problems
- This can inform hurricane response groups on how to address these problems

References
1. Holland, G., & Bruyère, C. L. (2013). Recent intense hurricane response to global climate change. Climate Dynamics, 42(3–4), 617–627. https://doi.org/10.1007/s00382-013-1713-0
2. Hurricane Costs. (2017). Retrieved July 23, 2019, from Noaa.gov website: https://coast.noaa.gov/states/fast-facts/hurricane-costs.html
3. Le, Q., & Mikolov, T. (2014). Distributed Representations of Sentences and Documents. Retrieved from https://arxiv.org/pdf/1405.4053.pdf

Acknowledgements
Special thanks to our mentors: Dr. Chien-fei Chen, Dr. Xiaojing Xu, Zach McMichael, and Julian Ball

Acknowledgements
This work was supported primarily by the ERC Program of the National Science Foundation and DOE under NSF Award Number EEC-1041877 and the CURENT Industry Partnership Program. Other US government and industrial sponsors of CURENT research are also gratefully acknowledged.