
Event Detection using Customer Care Calls 04/17/2013 IEEE INFOCOM 2013 Yi-Chao Chen 1, Gene Moo Lee 1, Nick Duffield 2, Lili Qiu 1, Jia Wang 2 The University of Texas at Austin 1, AT&T Labs – Research 2

Motivation  Customer care calls are a direct channel between the service provider and its customers  They reveal problems observed by customers  They help us understand the impact of network events on customer-perceived performance (regions, services, groups of users, …)

Motivation  Service providers have strong motivation to understand customer care calls  Reduce cost: ~$10 per call  Prioritize and handle anomalies by their impact  Improve customers' impression of the service provider

Problem Formulation  Goal  Automatically detect anomalies using customer care calls  Input  Customer care calls  Call agents label calls using ~10,000 predefined categories  A call can be labeled with multiple categories (issue, customer need, call type, problem resolution)

Problem Formulation  Output  Anomalies: performance problems observed by customers  These differ from traditional network metrics, e.g., for video conferencing, bandwidth does not matter as long as it exceeds some threshold; for online games, a delay increase from 1 ms to 10 ms could be a big problem

Example: Categories of Customer Care Calls

Input: Customer Care Calls  A matrix of call counts over time bins T1, T2, T3, …, TM: entry A(t, n) is the number of calls of category n in time bin t
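For concreteness, a minimal sketch of how such a matrix could be assembled from raw call records; the (time bin, category) record format is an assumption for illustration, not the actual data schema.

```python
import numpy as np

def build_call_matrix(calls, num_bins, num_categories):
    """Count calls of each category in each time bin."""
    A = np.zeros((num_bins, num_categories), dtype=int)
    for t, n in calls:          # t: time-bin index, n: category index
        A[t, n] += 1
    return A

# Toy usage: 4 time bins, 10 categories.
calls = [(0, 3), (0, 3), (1, 7), (2, 3)]
A = build_call_matrix(calls, num_bins=4, num_categories=10)
```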

Example: Anomaly  Example events: a power outage, a service outage due to a DoS attack, low bandwidth due to maintenance  Output: an anomaly indicator vector marking the time bins in which an anomaly occurs

Challenges  Customers respond to an anomaly in different ways  Events may not be detectable from a single category  A single category is vulnerable to outliers  There are thousands of categories

Challenges  Inconsistent classification across agents and across call centers

Our Approach  We cast anomaly detection as an inference problem and solve it with regression: find weights x such that A x ≈ b, where A(t, n) is the number of calls of category n at time t, x(n) is the weight of the n-th category, and b(t) is the anomaly indicator at time t

Our Approach  Training set: historical data containing  A: input time series of customer care calls  b: ground truth of when anomalies take place  solve A x = b for the weights x  Testing set:  A′: the latest customer care call time series  x: the weights learned from the training set  detect anomalies via b′ = A′ x
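A toy sketch of this train/test workflow on synthetic data, using plain least squares; the actual method adds the regularizers described on the following slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Training data: A is (time bins x categories) call counts, b the known
# anomaly indicator. Both are synthetic here.
A = rng.poisson(5.0, size=(200, 50)).astype(float)
x_true = np.zeros(50)
x_true[:3] = 1.0                         # only a few categories matter
b = (A @ x_true > 40).astype(float)      # synthetic ground truth

# Training: solve A x ~ b in the least-squares sense.
x, *_ = np.linalg.lstsq(A, b, rcond=None)

# Testing: apply the learned weights to new counts A' and threshold.
A_new = rng.poisson(5.0, size=(50, 50)).astype(float)
b_pred = (A_new @ x > 0.5).astype(int)   # detected anomaly indicator
```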

Issues  Dynamic x: the relationship between customer care calls and anomalies may change over time  Under-constrained: the number of categories can be larger than the number of training traces  Over-fitting: the weights begin to memorize the training data rather than learning the generic trend  Scalability: there are thousands of categories and thousands of time intervals  Varying customer response time

Our Approach  Pipeline:  Reducing categories: clustering; identifying important categories  Regression: temporal stability; low-rank structure  Combining multiple classifiers

Clustering  Agents usually classify calls based on the textual names of categories  e.g., “Equipment”, “Equipment problem”, and “Troubleshooting-Equipment”  Cluster categories based on the similarity of their textual names, measured by Dice’s coefficient (see the sketch below)
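A small sketch of Dice's coefficient on category names; the character-bigram tokenization is our assumption (the slide only names the measure).

```python
def dice(a: str, b: str) -> float:
    """Dice's coefficient on character bigrams of two strings."""
    bigrams = lambda s: {s[i:i + 2] for i in range(len(s) - 1)}
    x, y = bigrams(a.lower()), bigrams(b.lower())
    return 2 * len(x & y) / (len(x) + len(y)) if x or y else 1.0

print(dice("Equipment", "Equipment problem"))  # high similarity
print(dice("Equipment", "Billing"))            # low similarity
```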

Identify Important Categories  L1-norm regularization  Penalizes all factors in x equally and makes x sparse  Select the categories whose corresponding value in x is nonzero (see the sketch below)
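A minimal sketch of L1-based category selection, assuming scikit-learn's Lasso on synthetic counts; the alpha value is illustrative.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
A = rng.poisson(5.0, size=(200, 1000)).astype(float)
b = (A[:, 0] + A[:, 1] > 14).astype(float)   # only two categories matter

model = Lasso(alpha=0.05).fit(A, b)          # L1 penalty makes coef_ sparse
important = np.flatnonzero(model.coef_)      # indices with nonzero weight
```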

Regression  Impose additional structure to handle the under-constrained setting and over-fitting:  the weight values are stable over time  a small number of factors dominate the anomalies  The objective starts from the fitting error

 Temporal stability  The weight values are stable across consecutive days.

INFOCOM 2013  Low-rank structure  The weight values exhibit low- rank structure due to the temporal stability and the small number of dominant factors that cause the anomalies. 21 Clustering Identifying important categories Temporal stability Low-rank structure Combining multiple classifiers Reducing Categories Regression

INFOCOM 2013  Find the weight X that minimize: 22 Clustering Identifying important categories Temporal stability Low-rank structure Combining multiple classifiers Reducing Categories Regression

Combining Multiple Classifiers  What time scale should be used?  Customers do not respond to an anomaly immediately  The response time may differ by hours  Include calls made in the previous n (1–5) and next m (0–6) hours as additional features (see the sketch below)
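A sketch of both ideas: lagged call counts as extra feature columns, and a simple majority vote over classifier outputs; np.roll's wrap-around at the edges is a simplification of real padding.

```python
import numpy as np

def add_lag_features(A, back=5, ahead=6):
    """Stack time-shifted copies of the call-count matrix as extra columns
    (positive shifts bring in previous hours, negative ones the next hours)."""
    shifts = range(-ahead, back + 1)
    return np.hstack([np.roll(A, s, axis=0) for s in shifts])

def majority_vote(predictions):
    """Combine 0/1 outputs of several classifiers by majority."""
    return (np.mean(predictions, axis=0) >= 0.5).astype(int)
```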

Evaluation  Dataset  Customer care calls: data from a large network service provider in the US during Aug. 2010~ July 2011  Ground-truth Events: all events reported by Call Centers, Network Operation Centers, and etc.  Metrics  Precision: the fraction of claimed anomalies which are real anomalies  Recall: the fraction of real anomalies are claimed  F-score: the harmonic mean of precision and recall 24

Evaluation – Identifying Important Features  L1-norm regularization outperforms rand 2000, PCA, and L2-norm by 454%, 32%, and 10%, respectively

Evaluation – Regression  Fit+Temp+LR outperforms Random and Fit by 823% and 64%, respectively

Evaluation – Number of Classifiers  We use 30 classifiers to trade off benefit and cost

Contribution  We propose customer care calls as a complementary data source to network metrics  A direct measurement of the QoS perceived by customers  We develop a systematic method to automatically detect events using customer care calls  It scales to a large number of features  It is robust to noise

Thank you! IEEE INFOCOM 2013

Backup Slides

Twitter  Leverage Twitter as an external data source  Provides additional features  Helps interpret detected anomalies  Information from a tweet  Timestamp  Text: term frequency–inverse document frequency (TF-IDF)  Hashtags: keywords of the topics, used as features, e.g., #ATTFAIL  Location
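A sketch of turning tweet text into TF-IDF features, assuming scikit-learn; the sample tweets are invented, and the token pattern that preserves hashtags is our choice.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

tweets = ["no service in nyc this morning #ATTFAIL",
          "internet outage bay area again #fail"]

vec = TfidfVectorizer(token_pattern=r"#?\w+")  # keep hashtags as tokens
X = vec.fit_transform(tweets)                  # tweets x vocabulary matrix
```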

Interpreting Anomaly – Location

Describing the Anomaly  Examples  3G network outage. Location: New York, NY; Event summary: service, outage, nyc, calls, ny, morning, service  Outage due to an earthquake. Location: East Coast; Event summary: #earthquake, working, wireless, service, nyc, apparently, new, york  Internet service outage. Location: Bay Area; Event summary: serviceU, bay, outage, service, Internet, area, support, #fail

How to Select Parameters  K-fold cross-validation  Partition the training data into K equal-size parts  In round i, use partition i for testing and the remaining K−1 partitions for training; the process is repeated K times  Average the K results as the evaluation of the selected parameter value (see the sketch below)
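A minimal K-fold loop matching this description; `train` and `evaluate` are hypothetical stand-ins for the detector's fit and scoring steps.

```python
import numpy as np

def kfold_score(A, b, train, evaluate, k=5):
    """Average evaluation score over k cross-validation rounds."""
    folds = np.array_split(np.arange(len(b)), k)
    scores = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        model = train(A[train_idx], b[train_idx])
        scores.append(evaluate(model, A[test_idx], b[test_idx]))
    return np.mean(scores)  # average of the k rounds
```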