Published by Corey Gardner. Modified over 8 years ago.
1
Some Final Material
2
GOOGLE FLU TRENDS
3
Sore throat? Sniffles? Google it! Duh!
During flu season, more people enter search queries concerning the flu. Each year, 90 million American adults search the web for information about specific illnesses, which means lots of data. Importance: seasonal influenza causes an estimated 250,000 to 500,000 deaths worldwide each year.
4
Previous Attempts
A Swedish website counted queries in order to track flu activity. There was a strong correlation between the frequency of search terms containing “flu” and “influenza” and virologic surveillance data. These models, however, look at only a very limited number of queries.
5
Google’s Version
Google took 50 million of the most common search queries from 2003 to 2008 and computed a weekly count of each query for each state. The counts were normalized by dividing each one by the total number of searches for that week, giving the fraction of all searches that each query accounts for.
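The normalization step can be illustrated with a minimal sketch. The function name, the query strings, and the counts below are invented for illustration and are not Google's data or code.

```python
def normalize_counts(query_counts, total_searches):
    """Return each query's share of all searches for one week in one state."""
    if total_searches <= 0:
        raise ValueError("total_searches must be positive")
    return {query: count / total_searches
            for query, count in query_counts.items()}


# Hypothetical weekly counts for a single state (made-up numbers).
week_counts = {"flu symptoms": 120, "cough remedy": 30}
fractions = normalize_counts(week_counts, total_searches=60_000)
```

Dividing by the weekly total keeps a week with heavy overall search traffic from looking like a week with heavy flu interest.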
6
Each of the 50 million queries was tested for correlation with CDC data, and the queries were ranked from most to least correlated. The goal is to estimate flu activity based on more than just a few queries.
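As a rough sketch of the ranking step, the snippet below scores each query's weekly series against a CDC series using the Pearson correlation and sorts the queries by score. The choice of Pearson correlation, the helper names, and the toy numbers are assumptions made for illustration; the slides do not say exactly how the correlation was computed.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no signal
    return cov / sqrt(var_x * var_y)


def rank_queries(query_series, cdc_series):
    """Return query names ordered from most to least correlated."""
    scores = {q: pearson(s, cdc_series) for q, s in query_series.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy data: four weeks of CDC activity and three normalized query series.
cdc = [1.0, 2.0, 4.0, 3.0]
queries = {
    "pizza": [0.004, 0.003, 0.001, 0.002],
    "fever": [0.001, 0.003, 0.003, 0.003],
    "flu symptoms": [0.001, 0.002, 0.004, 0.003],
}
ranked = rank_queries(queries, cdc)
```

In this toy data "flu symptoms" tracks the CDC series exactly, "fever" tracks it loosely, and "pizza" moves in the opposite direction.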
7
Google summed the top-ranked queries, adding one query at a time, to see how many queries would yield the most accurate results. The magic number is 45.
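One way to picture the search for the best number of queries is the simplified sketch below: add the ranked series one at a time and keep the count whose running sum correlates best with the CDC series. This is an assumption-laden toy, not Google's procedure; the function names and data are hypothetical, and accuracy is measured here with a plain in-sample correlation.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def best_query_count(ranked_series, cdc_series):
    """Return (n, r): how many top-ranked series to sum, and the fit achieved."""
    best_n, best_r = 0, float("-inf")
    combined = [0.0] * len(cdc_series)
    for n, series in enumerate(ranked_series, start=1):
        # Running sum of the top-n query series.
        combined = [a + b for a, b in zip(combined, series)]
        r = pearson(combined, cdc_series)
        if r > best_r:
            best_n, best_r = n, r
    return best_n, best_r


# Toy series, already ordered from most to least correlated with cdc.
cdc = [1.0, 2.0, 4.0, 3.0]
ranked_series = [
    [1.0, 2.0, 3.0, 3.0],
    [0.0, 0.0, 1.0, 0.0],
    [3.0, 0.0, 0.0, 1.0],
]
n_best, r_best = best_query_count(ranked_series, cdc)
```

Here the sum of the first two series matches the CDC series exactly, while adding the third, poorly correlated series makes the fit worse, which is why more queries is not always better.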
8
Previously unused data from the 2007-2008 flu season served as a test set. The mean correlation was 0.97 (ranging between 0.92 and 0.99).
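A held-out evaluation of this kind could be summarized as in the sketch below. It assumes the mean, minimum, and maximum are taken over a set of regions, which the slide does not state; the region names and numbers are invented and do not reproduce the reported figures.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def summarize_holdout(estimates_by_region, cdc_by_region):
    """Return (mean, min, max) correlation over regions on held-out weeks."""
    rs = [pearson(estimates_by_region[region], cdc_by_region[region])
          for region in cdc_by_region]
    return mean(rs), min(rs), max(rs)


# Hypothetical held-out weeks for two made-up regions.
cdc_by_region = {"A": [1.0, 2.0, 4.0, 3.0], "B": [1.0, 2.0, 4.0, 3.0]}
estimates_by_region = {"A": [1.0, 2.0, 4.0, 3.0], "B": [2.0, 1.0, 4.0, 3.0]}
mean_r, min_r, max_r = summarize_holdout(estimates_by_region, cdc_by_region)
```

The key point is that the test weeks play no part in choosing the queries, so the correlations measure how well the estimates generalize.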
9
Advantages
Google generates accurate estimates faster than the CDC. The CDC takes one to two weeks to process data and generate a flu activity report; Google takes one to two days to generate an estimate. Faster estimates mean that health officials can quickly direct resources to where the need is greatest.
10
Future
Expand Google Flu Trends to predict flu activity across the globe. Challenge: some countries do not have official historical data.
11
Self-Driving Cars
Google “commercial” (video)
Alternative future autonomous “vehicles” (video)
12
Sample Telecommunication Applications
13
Some Applications
Applications:
– Classify a phone line/customer as a business or residential customer. A predictive model will be built for the called customer, who may not be an AT&T customer.
– Classify inbound service by type of use (voice, fax, modem).
– Identify telemarketers.
Uses: marketing, revenue prediction, and assessing the impact of changes (e.g., a do-not-call list for telemarketing).
14
Distribution of Weekday Calls by Hour
15
Comparison of Weekday Calling Patterns
16
Call Durations
17
Market Segments
18
Enterprise Miner Workspace
19
Some Results: Segment 0 and Segment 1
20
More Results: Segment 2 and Segment 3
21
Application 2
Identify how inbound (toll-free) service is used.
– Is an inbound line being used for voice, fax, or data/modem?
– Useful for identifying trends and for prediction. Fax usage has dropped significantly since the last study, most likely due to increased use of the Internet.
– Useful for marketing, for example, of new fax services.
22
Segmentation of Inbound Lines
23
Type of Usage by Segment
24
Distribution of Usage
Fax and modem lines show opposite trends. Fax lines become more common in the low-usage segments, while modem lines become less common in these segments. Fax usage grows to 5% in segment 8, but this contributes very few minutes.
25
Summary Results for AT&T Toll-Free Lines
26
Chronological Comparison