Detecting Influenza Outbreaks by Analyzing Twitter Messages By Aron Culotta Jedsada Chartree 02/28/11.


1 Detecting Influenza Outbreaks by Analyzing Twitter Messages By Aron Culotta Jedsada Chartree 02/28/11

2 Outline Introduction, Motivations, Data, Methodology, Results, Conclusion, References

3 Introduction Growing interest in monitoring disease outbreaks using the Internet. The growth of Twitter.

4 Motivations Developing methods that can reliably track influenza-like illness (ILI) rates in real time.

5 Data Sources: the U.S. Centers for Disease Control and Prevention (CDC) and Twitter data, over a 36-week period from August 29, 2009 to May 8, 2010.

6 Data The ILI rates from the CDC’s weekly tracking statistics (09/05/09 to 05/08/10) The number of Twitter messages collected per week

7 Methodology Gathering the ILI rates and Twitter messages. Finding the correlation between the ILI rates and Twitter messages, where: $P$ = the proportion of the population exhibiting ILI symptoms; $W = \{w_1, \ldots, w_k\}$ = a set of $k$ keywords; $D$ = the document collection; $\beta_1, \beta_2$ = the coefficients; $\epsilon$ = the error term; $Q(W, D)$ = the fraction of documents in $D$ that match $W$, i.e. $|D_W|/|D|$; $\mathrm{logit}(P) = \ln(P/(1-P))$.
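The regression equation on this slide did not survive the transcript. The sketch below assumes the simple linear model has the form logit(P) = β1 · logit(Q(W, D)) + β2 + ε, as described in Culotta (2010), and that a document "matches" W when it contains at least one keyword. The function names and the whitespace tokenizer are illustrative choices, not part of the original work.

```python
import math

def logit(p):
    """logit(p) = ln(p / (1 - p)), defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))

def inv_logit(x):
    """Inverse of logit; maps a real number back to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))

def query_fraction(keywords, documents):
    """Q(W, D): fraction of documents containing at least one keyword.

    Assumption: matching is case-insensitive on whitespace-separated tokens.
    """
    kws = {k.lower() for k in keywords}
    matches = sum(1 for d in documents if kws & set(d.lower().split()))
    return matches / len(documents)

def fit_line(xs, ys):
    """Ordinary least squares for y = b1 * x + b2; returns (b1, b2)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return b1, mean_y - b1 * mean_x

def predict_ili(q, b1, b2):
    """Predicted ILI proportion for a weekly query fraction q."""
    return inv_logit(b1 * logit(q) + b2)
```

To use it, compute one query fraction per week, fit `fit_line` on the logit-transformed weekly pairs (query fraction, CDC ILI rate), and call `predict_ili` on held-out weeks.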

8 Methodology Filtering spurious matches (noise). Table: the number of messages containing the keyword “flu” together with each of several keywords that might lead to spurious correlations.

9 Methodology Filtering spurious matches by supervised learning - Training a document classifier using logistic regression
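As a rough illustration of this step, the sketch below trains a bag-of-words logistic regression classifier with plain stochastic gradient ascent. The features, learning rate, epoch count, and the six toy training messages are all assumptions made for the example; the slide does not specify the feature set or training data.

```python
import math
from collections import Counter

def features(text):
    """Bag-of-words token counts (assumed feature representation)."""
    return Counter(text.lower().split())

def predict_proba(weights, bias, text):
    """p(y = 1 | text): probability that a message reports an ILI symptom."""
    z = bias + sum(weights.get(tok, 0.0) * cnt
                   for tok, cnt in features(text).items())
    return 1.0 / (1.0 + math.exp(-z))

def train(examples, epochs=200, lr=0.5):
    """Fit weights by stochastic gradient ascent on the log-likelihood."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for text, label in examples:
            err = label - predict_proba(weights, bias, text)
            bias += lr * err
            for tok, cnt in features(text).items():
                weights[tok] = weights.get(tok, 0.0) + lr * err * cnt
    return weights, bias

# Hypothetical labeled messages: 1 = reports illness, 0 = spurious match.
TOY_EXAMPLES = [
    ("home sick with the flu", 1),
    ("i think i have the flu", 1),
    ("flu symptoms fever and cough", 1),
    ("get your flu shot today", 0),
    ("swine flu vaccine news", 0),
    ("flu shot clinic open", 0),
]

weights, bias = train(TOY_EXAMPLES)
```

Every toy message contains "flu", so the classifier has to rely on the surrounding words to separate reports of illness from mentions of shots, vaccines, and news.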

10 Methodology Filtering spurious matches by supervised learning - Combining filtering with regression 1. Soft classifier

11 Methodology Filtering spurious matches by supervised learning - Combining filtering with regression 2. Hard classifier Applying both classifiers to the simple linear model.
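The formulas for the two classifiers are missing from the transcript. One reading, following Culotta (2010), is that the soft classifier replaces the count of matching documents with the sum of their classifier probabilities, while the hard classifier counts only the matching documents whose probability exceeds 0.5. The sketch below assumes that reading; `probs` stands for the classifier outputs on the documents that match W, and `total_docs` for |D|.

```python
def soft_fraction(probs, total_docs):
    """Soft classifier: expected fraction of positively classified documents."""
    return sum(probs) / total_docs

def hard_fraction(probs, total_docs, threshold=0.5):
    """Hard classifier: fraction of documents classified positive outright."""
    return sum(1 for p in probs if p > threshold) / total_docs

# Hypothetical classifier outputs for 4 matching messages out of 10 collected.
example_probs = [0.9, 0.6, 0.2, 0.1]
soft_q = soft_fraction(example_probs, total_docs=10)
hard_q = hard_fraction(example_probs, total_docs=10)
```

Either value can then stand in for Q(W, D) in the simple linear model, so the regression itself is unchanged.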

12 Methodology Evaluating false alarms by simulation - Sample 1,000 messages deemed to be spurious. - Sample with replacement an increasing number of the spurious messages and add them to the original message set. - Use the same trained regression models.
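The simulation steps above can be sketched as follows. This is a minimal illustration, not the original experiment: the toy messages, the step sizes, and the use of a raw "flu" keyword fraction (instead of the trained regression models) are assumptions made to keep the example self-contained.

```python
import random

def augment_with_spurious(original, spurious_pool, n_added, rng):
    """Original messages plus n_added spurious ones, sampled with replacement."""
    return list(original) + rng.choices(spurious_pool, k=n_added)

def flu_fraction(messages):
    """Fraction of messages containing the token 'flu' (unfiltered query)."""
    hits = sum(1 for m in messages if "flu" in m.lower().split())
    return hits / len(messages)

def simulate(original, spurious_pool, steps, seed=0):
    """Query fraction after adding an increasing number of spurious messages."""
    rng = random.Random(seed)
    results = []
    for n_added in steps:
        augmented = augment_with_spurious(original, spurious_pool, n_added, rng)
        results.append((n_added, flu_fraction(augmented)))
    return results

# Hypothetical data: one genuine report among four messages, two spurious ones.
original = ["i have the flu", "nice weather", "lunch time", "going home"]
spurious_pool = ["flu shot clinic", "swine flu news"]
results = simulate(original, spurious_pool, steps=[0, 4, 12])
```

In the full evaluation, the augmented set would be passed through each trained model (unfiltered, soft, hard) to see how far the estimated ILI rate drifts as spurious messages accumulate.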

13 Results Fitted and predicted ILI rates using regression over query fractions of Twitter messages

14 Results Fitted and predicted ILI rates using regression over query fractions of Twitter messages

15 Results Correlation results with refinements of the flu query

16 Results Correlation results with refinements of the flu query

17 Results

18 Number of false messages added

19 Conclusion The proposed method can be used to track influenza rates from Twitter messages. The proposed method for evaluating false alarms performs satisfactorily.

20 References Aron Culotta. 2010. Detecting influenza outbreaks by analyzing Twitter messages. Jeremy Ginsberg and others. 2009. Detecting influenza epidemics using search engine query data.

