Published by Aron Kelley. Modified over 9 years ago.
1
Members: Raghuram Krishnamachari, Manish Maheshwari, Maryam El Kherba
Guided by: Prof. Alan Mislove
2
Flu Prediction / Activity
  CDC Flu Activity Reports
    Influenza-like Illness (ILI) for each region
  Google Flu Trends
    Aggregates search data to estimate flu activity
  Our experiment (Twitter)
    Analyze Twitter data (tweets) to estimate flu activity
3
Google Flu Trends
  CDC's ILI data vs. Google Flu Trends
4
Google Flu Trends Vs Twitter
6
Tweets, Phrases
  "having a cold"             4
  "have a cold"               7
  "feel feverish" "flu"       5
  "headache" "flu"            8
  "sick" "flu"                9
  "flu" "fever"               5
  "came down with the flu"    7
  "chills" "flu"              7
  "catching the flu"          6
  "cough" "flu"               6
  "fatigue" "flu"             8
  "weakness" "flu"            6
  "flu like symptoms"         4
  "runny nose" "flu"          5
  "sore throat" "flu"         7
  "stomach ache" "flu"        6
  "stuffy nose" "flu"         6
  "tiredness" "flu"           4
  "vomiting" "flu"            4
  "watery eyes" "flu"         6
  "body hurts" "flu"          7
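One plausible reading of the entries above is that a tweet counts as a match when it contains every quoted term in an entry, for example both "headache" and "flu". The slides do not say how the terms were combined, so the sketch below is an assumption: `match_all` is a hypothetical helper that chains one case-insensitive, fixed-string `grep` per term.

```shell
# Hypothetical helper (not from the slides): read tweets on stdin and
# keep only the lines that contain every term given as an argument.
match_all() {
  # No terms left: pass the surviving lines through unchanged.
  if [ "$#" -eq 0 ]; then
    cat
    return
  fi
  # Filter on the first term, then recurse on the remaining terms.
  grep -iF -e "$1" | { shift; match_all "$@"; }
}

# Example: only the first sample line contains both terms.
printf '%s\n' \
  'I have a headache and the flu' \
  'flu shot today' \
  'Headache again' | match_all headache flu
```

Chaining filters keeps each term a literal string, so phrases with spaces such as "sore throat" need no regular-expression escaping.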
7
Process
  Filter
    Filter flu tweets from Twitter data
    Store data for each state (FIPS)
  Count
    Count flu tweets (weekly)
    Count total tweets (weekly)
  Plot
    Ratio of flu-related to total tweets
    Compare against Google/CDC
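The ratio step can be sketched with a short awk join. The input layout here is an assumption, not something the slides specify: two hypothetical files, each holding one `<week> <count>` pair per line, one for flu tweets and one for all tweets.

```shell
# Hypothetical helper: print "<week> <flu tweets / total tweets>".
#   $1: file of "<week> <flu tweet count>" lines   (assumed format)
#   $2: file of "<week> <total tweet count>" lines (assumed format)
flu_ratio() {
  awk '
    # First file: remember the flu count for each week.
    FILENAME == ARGV[1] { flu[$1] = $2; next }
    # Second file: emit the ratio, skipping weeks with no tweets at all.
    $2 > 0 { printf "%s %.6f\n", $1, flu[$1] / $2 }
  ' "$1" "$2"
}
```

Weeks that appear only in the totals file come out with a ratio of 0, which keeps the series aligned with the weekly Google and CDC figures it is compared against.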
8
Implementation
  Linux bash shell script
  Filtering
    find fips -name "*.gz" -exec zcat {} \; | grep "$1"
  Counting
    find … -exec zcat {} \; | awk '{ print $3 }' | awk '{ print $3 " " $2 " " $6 }' | sort -k 3n -k 2M -k 1n | uniq -c
  Plotting
    pr -mft -s, dates.txt NJ.tot NY.tot > RE2.tot
    Microsoft Excel
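The counting pipeline's field choices (`$3`, `$2`, `$6`) and sort keys are consistent with timestamps in Twitter's `created_at` layout, such as `Wed Aug 27 13:08:45 +0000 2008`. That layout is an assumption here, since the slides do not show the input. Under it, the second half of the pipeline counts tweets per calendar day, as in this sketch with the hypothetical helper `count_by_day`:

```shell
# Hypothetical helper: read one timestamp per line on stdin, assumed to
# look like "Wed Aug 27 13:08:45 +0000 2008", and count lines per day.
count_by_day() {
  # Keep day, month name and year: "27 Aug 2008".
  awk '{ print $3 " " $2 " " $6 }' |
    # Order by year, then month name (-M, a GNU/BSD sort extension),
    # then day, so identical dates end up adjacent.
    LC_ALL=C sort -k 3n -k 2M -k 1n |
    # Collapse adjacent duplicates into "<count> <day> <month> <year>".
    uniq -c
}

# Example with three sample timestamps covering two distinct days.
printf '%s\n' \
  'Wed Aug 27 13:08:45 +0000 2008' \
  'Wed Aug 27 14:00:00 +0000 2008' \
  'Tue Jan 01 09:00:00 +0000 2008' | count_by_day
```

The weekly figures named in the Process slide would need a further grouping step over these daily counts, which the slides do not show.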
9
Challenges
  Filtering
    Phrases that express flu symptoms
    Processing time
    Segregation based on location
  Counting
    Processing time
    Storage format
  Plotting
    Lack of consistent CDC data
    Handling of large numeric data
10
Future
  Better prediction algorithm
  Live Tweet monitoring
  Flu propagation
  Facebook application