Predicting Prevalence of Influenza-Like Illness From Geo-Tagged Tweets

Slides:



Advertisements
Similar presentations
A brief review of non-neural-network approaches to deep learning
Advertisements

Guiding Semi- Supervision with Constraint-Driven Learning Ming-Wei Chang,Lev Ratinow, Dan Roth.
Entity-Centric Topic-Oriented Opinion Summarization in Twitter Date : 2013/09/03 Author : Xinfan Meng, Furu Wei, Xiaohua, Liu, Ming Zhou, Sujian Li and.
Large-Scale Entity-Based Online Social Network Profile Linkage.
Rapid Object Detection using a Boosted Cascade of Simple Features Paul Viola, Michael Jones Conference on Computer Vision and Pattern Recognition 2001.
Twitter Catches The Flu: Detecting Influenza Epidemics using Twitter Eiji ARAMAKI * Sachiko MASKAWA * Mizuki MORITA ** * The University of Tokyo ** National.
ICDM, Shenzhen, 2014 Flu Gone Viral: Syndromic Surveillance of Flu on Twitter using Temporal Topic Models Liangzhe Chen, K. S. M. Tozammel Hossain, Patrick.
Team : Priya Iyer Vaidy Venkat Sonali Sharma Mentor: Andy Schlaikjer Twist : User Timeline Tweets Classifier.
Tweet Classification for Political Sentiment Analysis Micol Marchetti-Bowick.
Towards Twitter Context Summarization with User Influence Models Yi Chang et al. WSDM 2013 Hyewon Lim 21 June 2013.
Distant Supervision for Emotion Classification in Twitter posts 1/17.
Supervised Learning Techniques over Twitter Data Kleisarchaki Sofia.
BEHAVIORAL PREDICTION OF TWITTER USERS BASED ON TEXTUAL INFORMATION Shiyao Wang.
Political Party, Gender, and Age Classification Based on Political Blogs Michelle Hewlett and Elizabeth Lingg.
1 Learning to Detect Objects in Images via a Sparse, Part-Based Representation S. Agarwal, A. Awan and D. Roth IEEE Transactions on Pattern Analysis and.
Classification with reject option in gene expression data Blaise Hanczar and Edward R Dougherty BIOINFORMATICS Vol. 24 no , pages
Extracting Interest Tags from Twitter User Biographies Ying Ding, Jing Jiang School of Information Systems Singapore Management University AIRS 2014, Kuching,
TTI's Gender Prediction System using Bootstrapping and Identical-Hierarchy Mohammad Golam Sohrab Computational Intelligence Laboratory Toyota.
© 2013 IBM Corporation Efficient Multi-stage Image Classification for Mobile Sensing in Urban Environments Presented by Shashank Mujumdar IBM Research,
Forecasting with Twitter data Presented by : Thusitha Chandrapala MARTA ARIAS, ARGIMIRO ARRATIA, and RAMON XURIGUERA.
Computational Analysis of USA Swimming Data Junfu Xu School of Computer Engineering and Science, Shanghai University.
Towards Detecting Influenza Epidemics by Analyzing Twitter Massages Aron Culotta Jedsada Chartree.
Text Classification using SVM- light DSSI 2008 Jing Jiang.
1 A study on automatically extracted keywords in text categorization Authors:Anette Hulth and Be´ata B. Megyesi From:ACL 2006 Reporter: 陳永祥 Date:2007/10/16.
Multimodal Alignment of Scholarly Documents and Their Presentations Bamdad Bahrani JCDL 2013 Submission Feb 2013.
Event Detection using Customer Care Calls 04/17/2013 IEEE INFOCOM 2013 Yi-Chao Chen 1, Gene Moo Lee 1, Nick Duffield 2, Lili Qiu 1, Jia Wang 2 The University.
Permission-based Malware Detection in Android Devices REU fellow: Nadeen Saleh 1, Faculty mentor: Dr. Wenjia Li 2 Affiliation: 1. Florida Atlantic University,
Page 1 Ming Ji Department of Computer Science University of Illinois at Urbana-Champaign.
Classification and Ranking Approaches to Discriminative Language Modeling for ASR Erinç Dikici, Murat Semerci, Murat Saraçlar, Ethem Alpaydın 報告者:郝柏翰 2013/01/28.
Confidence-Aware Graph Regularization with Heterogeneous Pairwise Features Yuan FangUniversity of Illinois at Urbana-Champaign Bo-June (Paul) HsuMicrosoft.
BING: Binarized Normed Gradients for Objectness Estimation at 300fps
A Novel Local Patch Framework for Fixing Supervised Learning Models Yilei Wang 1, Bingzheng Wei 2, Jun Yan 2, Yang Hu 2, Zhi-Hong Deng 1, Zheng Chen 2.
Detecting Influenza Outbreaks by Analyzing Twitter Messages By Aron Culotta Jedsada Chartree 02/28/11.
Prediction of Influencers from Word Use Chan Shing Hei.
Date : 2013/03/18 Author : Jeffrey Pound, Alexander K. Hudek, Ihab F. Ilyas, Grant Weddell Source : CIKM’12 Speaker : Er-Gang Liu Advisor : Prof. Jia-Ling.
TEXT ANALYTICS - LABS Maha Althobaiti Udo Kruschwitz Massimo Poesio.
Automated Solar Cavity Detection
Recognizing Stances in Online Debates Unsupervised opinion analysis method for debate-side classification. Mine the web to learn associations that are.
Don’t Follow me : Spam Detection in Twitter January 12, 2011 In-seok An SNU Internet Database Lab. Alex Hai Wang The Pensylvania State University International.
 Effective Multi-Label Active Learning for Text Classification Bishan yang, Juan-Tao Sun, Tengjiao Wang, Zheng Chen KDD’ 09 Supervisor: Koh Jia-Ling Presenter:
Max-Confidence Boosting With Uncertainty for Visual tracking WEN GUO, LIANGLIANG CAO, TONY X. HAN, SHUICHENG YAN AND CHANGSHENG XU IEEE TRANSACTIONS ON.
1 Clustering Web Queries John S. Whissell, Charles L.A. Clarke, Azin Ashkan CIKM ’ 09 Speaker: Hsin-Lan, Wang Date: 2010/08/31.
Language Identification and Part-of-Speech Tagging
A Simple Approach for Author Profiling in MapReduce
Experience Report: System Log Analysis for Anomaly Detection
Using Social Media to Enhance Emergency Situation Awareness
Click Through Rate Prediction for Local Search Results
System for Semi-automatic ontology construction
AP Statistics Chapter 14 Section 1.
Epidemic Alerts EECS E6898: TOPICS – INFORMATION PROCESSING: From Data to Solutions Alexander Loh May 5, 2016.
Speaker: Jim-an tsai advisor: professor jia-lin koh
Proportion of Original Tweets
Roberts MG, Cootes TF, Pacheco E, Adams JE
Speaker: Jim-an tsai advisor: professor jia-lin koh
A Large Scale Prediction Engine for App Install Clicks and Conversions
Discriminative Frequent Pattern Analysis for Effective Classification
COSC 4335: Other Classification Techniques
iSRD Spam Review Detection with Imbalanced Data Distributions
Measuring the Latency of Depression Detection in Social Media
Outline Background Motivation Proposed Model Experimental Results
Graph-based Security and Privacy Analytics via Collective Classification with Joint Weight Learning and Propagation Binghui Wang, Jinyuan Jia, and Neil.
Concave Minimization for Support Vector Machine Classifiers
Clinically Significant Information Extraction from Radiology Reports
Jia-Bin Huang Virginia Tech
Semi-Supervised Learning
Introduction to Sentiment Analysis
Title First name, last name, university, city, country
Spam Detection Using Support Vector Machine Presenting By Nan Mya Oo University of Computer Studies Taunggyi.
Austin Karingada, Jacob Handy, Adviser : Dr
Presentation transcript:

Predicting Prevalence of Influenza-Like Illness From Geo-Tagged Tweets AdViser: Jia-Ling,Koh Source: WWW’17 Speaker: Ming-Chieh,Chiang Date: 2018/02/27

OutLine Introduction Method Experiment Conclusion

Motivation

Motivation Modeling disease spread with Twitter data involves several challenges Only relying on text classification can include large errors

GOAL A strong positive linear correlation exists between the number of ILI-related tweets and the number of recorded influenza notifications at state scale

OutLine Introduction Method Experiment Conclusion

Method Obtain a set of labeled training data Apply a semi-supervised cascade learning approach to learning SVM classifiers Distinguish “sick” tweets and “other” tweets

Training SVM classifiers

Training SVM classifiers Two parameters: class weight and C parameter Fix one parameter and vary the other within a wide range of values

Classification Stage

Classification Stage Features: unigram, bigram, trigram Example: “ I got the flu” -> (i, get, flu, i get, get flu, i get flu) Use TF-IDF features to represent tweet data

Modeling Illness Prevalence Fit a linear regression model The number of influenza notifications in each state: Internet access * Twitter penetration rate

OutLine Introduction Method Experiment Conclusion

Experiment CSIRO Geo-tagged tweets within Australia for the entire year of 2015 8,961,932 tweets 225,641unique users

Experiment JSON format Data cleaning and tokenization

Experiment Classifier performance 1027 tweets are found to be ILI-related

Experiment Temporal Analysis

Experiment Spatial Analysis

Experiment Regional Spatial Analysis

Experiment

Experiment

Limitation The scarcity of the tweets The laboratory confirmed influenza notifications are incomplete Twitter is more popular among younger generations Manual checking

OutLine Introduction Method Experiment Conclusion

Conclusion Propose effective modifications to the state-of-art approach in detecting illness-related tweets Twitter data is a reasonable proxy for detecting disease outbreak and possesses strong linear correlation with real-world influenza notification data