1
SCS CMU
Fast Mining of Complex Time-Stamped Events
Joint work by Hanghang Tong, Yasushi Sakurai, Tina Eliassi-Rad, Christos Faloutsos
Speaker: Hanghang Tong
CIKM 2008, Oct. 26-30, 2008, Napa, CA
2
A Motivating Example: Inputs
Time      Event (e.g., Session)   Entity
Oct. 26   Link Analysis           Tom, Bob
Oct. 26   Clustering              Bob, Alan
Oct. 27   Classification          Bob, Alan
Oct. 27   Anomaly Detection       Alan, Beck
Oct. 28   Party                   Beck, Dan
Oct. 29   Web Search              Dan, Jack
Oct. 29   Advertising             Jack, Peter
Oct. 30   Enterprise Search       Jack, Peter
Oct. 31   Q & A                   Peter, Smith
3
A Motivating Example: Outputs
[Figure: time stamps plotted against the 1st and 2nd eigenvectors]
- Time cluster, representative entities: "Tom", "Bob", "Alan"
- Abnormal time, representative entities: "Beck", "Dan"
- Time cluster, representative entities: "Jack", "Peter", "Smith"
4
Problem Definitions (how to understand time in such a complex context)
Given: data sets collected at different time stamps
Find:
- Q1: Time clusters
- Q2: Abnormal time stamps
- Q3: Interpretations
- Q4: The right time granularity
5
Roadmap
- Motivation
- T3: Single Resolution Analysis
- MT3: Multi Resolution Analysis
- Experimental Evaluations
- Conclusion
6
T3: Single Resolution Analysis
Given: the data sets collected at different time stamps
Find:
- (1) Clusters for time stamps
- (2) Abnormal time stamps
- (3) Interpretations
7
How to represent the data sets?
[Table: the same Time / Event (e.g., Session) / Entity input table as on slide 2, from Oct. 26 (Link Analysis: Tom, Bob) to Oct. 31 (Q & A: Peter, Smith)]
8
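A: Graph Representation!
[Figure: graph linking time nodes, event nodes and entity nodes]
- Time nodes: Oct. 26, 2008; Oct. 27, 2008; Oct. 28, 2008; Oct. 29, 2008; Oct. 30, 2008; Oct. 31, 2008
- Event nodes: Link Analysis, Clustering, Classification, Anomaly Detection, Party, Web Search, Advertising, Enterprise Search, Q & A
- Entity nodes: Tom, Bob, Alan, Beck, Dan, Jack, Peter, Smith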
9
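A: Graph Representation!
[Figure: the same graph, extended with attribute nodes]
- Time nodes: Oct. 26, 2008; Oct. 27, 2008; Oct. 28, 2008; Oct. 29, 2008; Oct. 30, 2008; Oct. 31, 2008
- Event nodes: Link Analysis, Clustering, Classification, Anomaly Detection, Party, Web Search, Advertising, Enterprise Search, Q & A
- Entity nodes: Tom, Bob, Alan, Beck, Dan, Jack, Peter, Smith
- Attribute nodes: Prof., CEO, Stu.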
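The graph can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the edge layout (each time stamp linked to its events, each event linked to its entities) is inferred from the input table, the names `RECORDS` and `build_graph` are hypothetical, and the attribute nodes (Prof., CEO, Stu.) are left out because the slides do not say which entity carries which attribute.

```python
from collections import defaultdict

# Rows of the input table: (time stamp, event, entities).
RECORDS = [
    ("Oct. 26", "Link Analysis", ["Tom", "Bob"]),
    ("Oct. 26", "Clustering", ["Bob", "Alan"]),
    ("Oct. 27", "Classification", ["Bob", "Alan"]),
    ("Oct. 27", "Anomaly Detection", ["Alan", "Beck"]),
    ("Oct. 28", "Party", ["Beck", "Dan"]),
    ("Oct. 29", "Web Search", ["Dan", "Jack"]),
    ("Oct. 29", "Advertising", ["Jack", "Peter"]),
    ("Oct. 30", "Enterprise Search", ["Jack", "Peter"]),
    ("Oct. 31", "Q & A", ["Peter", "Smith"]),
]


def build_graph(records):
    """Return an undirected adjacency map: node -> set of neighbours.

    Nodes are tagged tuples such as ("time", "Oct. 26"), so a time stamp,
    an event and an entity can never collide by name.
    Assumed edges: time <-> event and event <-> entity.
    """
    adj = defaultdict(set)
    for time, event, entities in records:
        t, e = ("time", time), ("event", event)
        adj[t].add(e)
        adj[e].add(t)
        for name in entities:
            p = ("entity", name)
            adj[e].add(p)
            adj[p].add(e)
    return dict(adj)


graph = build_graph(RECORDS)
```

With this layout, two time stamps are connected only indirectly, through events that share entities, which is what a proximity measure on the graph can then exploit.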
10
Qs: Given the graph,
- How to cluster time nodes?
- How to spot abnormal time nodes?
- How to interpret?
11
Q1: How to cluster time nodes?
Step 1: Time-to-Time (TT) proximity matrix
[Figure: 6 x 6 TT matrix; rows and columns are Oct. 26 to Oct. 31]
12
Q1: How to cluster time nodes?
Step 2: Cluster time nodes by the TT matrix
- Spectral clustering algorithm (and a lot of others)
[Figure: 6 x 6 TT matrix; rows and columns are Oct. 26 to Oct. 31]
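As a rough illustration of the spectral step, the sketch below splits the time nodes in two using the sign of the second eigenvector of the normalised TT matrix, found by power iteration. It is an assumption-laden toy, not the method used in the talk: `spectral_bisect` is a hypothetical name, the input must be a symmetric, non-negative matrix with positive row sums, only a two-way split is produced (a k-way version would typically cluster several eigenvectors), and the similarity values in the usage lines are made up.

```python
import math


def spectral_bisect(S, iters=500):
    """Two-way spectral split of a symmetric, non-negative similarity matrix.

    Returns one 0/1 label per row, taken from the sign of the second
    eigenvector of D^{-1/2} S D^{-1/2}.
    """
    n = len(S)
    deg = [sum(row) for row in S]
    # Normalise, then shift to (M + I) / 2 so every eigenvalue lies in [0, 1]
    # and power iteration cannot lock onto a large negative eigenvalue.
    M = [[0.5 * S[i][j] / math.sqrt(deg[i] * deg[j]) + (0.5 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    # The leading eigenvector of the normalised matrix is proportional to sqrt(deg).
    total = math.sqrt(sum(deg))
    v1 = [math.sqrt(d) / total for d in deg]
    # Deterministic start vector; it must not be orthogonal to the target eigenvector.
    x = [float(i) for i in range(n)]
    for _ in range(iters):
        # Remove the leading eigenvector, multiply by M, renormalise.
        dot = sum(a * b for a, b in zip(x, v1))
        x = [a - dot * b for a, b in zip(x, v1)]
        x = [sum(M[i][j] * x[j] for j in range(n)) for i in range(n)]
        length = math.sqrt(sum(a * a for a in x))
        x = [a / length for a in x]
    return [0 if a < 0 else 1 for a in x]


# Made-up 4 x 4 similarity matrix with two obvious blocks.
toy_tt = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.9],
    [0.1, 0.1, 0.9, 1.0],
]
labels = spectral_bisect(toy_tt)
```

The first two rows end up with one label and the last two with the other; which group receives label 0 depends on the start vector and carries no meaning.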
13
Q2: How to find abnormal time nodes?
Abnormal time = a time cluster with a single member (singleton)
[Figure: 6 x 6 TT matrix; rows and columns are Oct. 26 to Oct. 31]
Oct. 28 is abnormal!
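The singleton rule is simple enough to state as code. In this sketch the function name `abnormal_times` is hypothetical, and the cluster labels are an assumed assignment that is merely consistent with the slide (Oct. 28 alone in its cluster); the slides do not list the labels explicitly.

```python
from collections import Counter


def abnormal_times(labels):
    """Return the time stamps that are the only member of their cluster.

    labels: dict mapping time stamp -> cluster id.
    """
    sizes = Counter(labels.values())
    return [t for t, c in labels.items() if sizes[c] == 1]


# Assumed cluster labels for the running example.
example_labels = {
    "Oct. 26": 0, "Oct. 27": 0,
    "Oct. 28": 1,
    "Oct. 29": 2, "Oct. 30": 2, "Oct. 31": 2,
}
flagged = abnormal_times(example_labels)
```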
14
Q3: How to interpret?
Step 1: Time-to-People (TP) proximity matrix
[Figure: TP matrix relating the time stamps Oct. 26 to Oct. 31 to the people Tom, Bob, Alan, Beck, Dan, Jack, Peter, Smith; example entries range from .01 to .9]
e.g., we want to use people to interpret a time cluster/anomaly
15
Q3: How to interpret?
Step 2: Time Cluster-to-People (TCP) matrix
[Figure: TP matrix relating the time stamps Oct. 26 to Oct. 31 to the people Tom, Bob, Alan, Beck, Dan, Jack, Peter, Smith, with rows grouped by time cluster]
e.g., we want to use people to interpret a time cluster/anomaly
16
Q3: How to interpret?
Step 2: Time Cluster-to-People (TCP) matrix
[Figure: TCP matrix relating each time cluster to the people Tom, Bob, Alan, Beck, Dan, Jack, Peter, Smith; example entries range from .01 to .9]
e.g., we want to use people to interpret a time cluster/anomaly
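One plausible reading of the step from TP to TCP is that the TP rows of all time stamps in a cluster are averaged. The sketch below implements that reading only as an assumption; the slides do not spell out the aggregation, the name `cluster_to_people` is hypothetical, and the proximity values are made up for illustration.

```python
def cluster_to_people(tp, labels):
    """Average the TP rows of each time cluster.

    tp:     dict time stamp -> dict person -> proximity
    labels: dict time stamp -> cluster id
    Returns dict cluster id -> dict person -> mean proximity.
    A person missing from a row counts as proximity 0 for that row.
    """
    sums, counts = {}, {}
    for t, row in tp.items():
        c = labels[t]
        counts[c] = counts.get(c, 0) + 1
        acc = sums.setdefault(c, {})
        for person, score in row.items():
            acc[person] = acc.get(person, 0.0) + score
    return {c: {p: s / counts[c] for p, s in acc.items()}
            for c, acc in sums.items()}


# Made-up proximities: two time stamps in cluster 0, one in cluster 1.
toy_tp = {
    "t1": {"Bob": 0.9, "Dan": 0.1},
    "t2": {"Bob": 0.7, "Dan": 0.3},
    "t3": {"Bob": 0.0, "Dan": 0.8},
}
toy_labels = {"t1": 0, "t2": 0, "t3": 1}
tcp = cluster_to_people(toy_tp, toy_labels)
```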
17
Q3: How to interpret?
Step 3: Find "unique" entity nodes
[Figure: TCP matrix relating each time cluster to the people Tom, Bob, Alan, Beck, Dan, Jack, Peter, Smith, with the distinctive entries highlighted]
e.g., "Bob is close to the green cluster on average, but far away from both the red and blue clusters"
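The "close to one cluster, far from the others" idea can be turned into a score in several ways. The sketch below uses one simple choice, the proximity to the cluster of interest minus the largest proximity to any other cluster. That scoring rule is an assumption made for illustration, not the criterion from the talk; `representative_entities` is a hypothetical name, and the numbers are made up.

```python
def representative_entities(tcp, cluster, top_k=3):
    """Rank entities by how specific they are to `cluster`.

    tcp: dict cluster id -> dict entity -> proximity.
    Score = proximity to `cluster` minus the highest proximity to any
    other cluster (an assumed, illustrative scoring rule).
    """
    own = tcp[cluster]

    def score(entity):
        others = [row.get(entity, 0.0) for c, row in tcp.items() if c != cluster]
        return own[entity] - (max(others) if others else 0.0)

    return sorted(own, key=score, reverse=True)[:top_k]


# Made-up TCP values for three clusters.
toy_tcp = {
    "green": {"Bob": 0.8, "Alan": 0.5, "Dan": 0.2},
    "red": {"Bob": 0.1, "Alan": 0.4, "Dan": 0.7},
    "blue": {"Bob": 0.05, "Alan": 0.3, "Dan": 0.6},
}
green_reps = representative_entities(toy_tcp, "green", top_k=2)
```

Here Bob ranks first for the green cluster because he is close to it and far from the other two, mirroring the example on the slide.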
18
Summary So Far...
Given the data sets collected at different time stamps, we
- Construct a graph representation
- Get two proximity matrices
- Find time clusters/abnormal time stamps
- Provide the interpretations
Q: How to get the proximity matrices?
19
How to get the proximity matrices (i.e., the TT/TP matrices)?
Proximity is a.k.a. relevance, closeness, "similarity", ...
Example nodes: Oct. 28, 2008, or "John Smith"
20
What is a "good" proximity?
- Multiple connections/paths
- Quality of connection
- Direct & indirect connections
- Length, degree, weight, ...
21
Random walk with restart
[Figure: example graph with nodes 1 to 12]
22
Random walk with restart
[Figure: example graph with nodes 1 to 12, coloured by relevance to the query node, Node 4]
Ranking vector for Node 4 (scores as listed on the slide for Node 1 to Node 12): 0.13, 0.10, 0.13, 0.22, 0.13, 0.05, 0.08, 0.04, 0.03, 0.04, 0.02
- More red, more relevant
- Nearby nodes, higher scores
23
Computing RWR
[Equation figure; labelled parts: ranking vector (n x 1), adjacency matrix (n x n), starting vector (n x 1), restart probability]
A lot of techniques exist to solve this, e.g., the iterative method
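The iterative method can be sketched as follows. Because the equation itself did not survive in the transcript, the update used here is the commonly cited RWR formulation, r = (1 - p) W r + p e, with W the column-normalised adjacency matrix, e the starting (indicator) vector and p the restart probability; treat that, the function name `rwr`, and the default values as assumptions.

```python
def rwr(adj, start, restart=0.15, tol=1e-10, max_iter=1000):
    """Random walk with restart from `start` on an undirected graph.

    adj: dict node -> iterable of neighbours (every node needs at least one).
    Repeats r <- (1 - restart) * W r + restart * e until the L1 change
    between two iterations drops below `tol`.
    """
    nodes = list(adj)
    r = {n: 0.0 for n in nodes}
    r[start] = 1.0
    for _ in range(max_iter):
        nxt = {n: 0.0 for n in nodes}
        for u in nodes:
            # Column normalisation: u spreads its score evenly over its neighbours.
            share = (1.0 - restart) * r[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        nxt[start] += restart  # jump back to the start node
        delta = sum(abs(nxt[n] - r[n]) for n in nodes)
        r = nxt
        if delta < tol:
            break
    return r


# Toy path graph a - b - c, queried from node "a".
toy_adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
scores = rwr(toy_adj, "a", restart=0.5)
```

Running the same routine once per time node, and reading off the scores of the other time nodes or of the people nodes, would give rows of TT and TP matrices; that use of the scores is how the slides motivate proximity, but the exact construction is not shown there.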
24
Roadmap
- Motivation
- T3: Single Resolution Analysis
- MT3: Multi Resolution Analysis
- Experimental Evaluations
- Conclusion
25
MT3: Multiple Resolution Analysis
Given:
- (1) The data sets collected at different time stamps
- (2) Different time resolutions
Find, efficiently and at each of the given resolutions:
- (1) Clusters for time stamps
- (2) Abnormal time stamps
- (3) Interpretations
26
MT3: An Example
Given: [figure]
We want to:
- (At the finest resolution) mine & interpret 'Oct 26', 'Oct 27', 'Oct 28', 'Oct 29', 'Oct 30', 'Oct 31'
- (At the coarser resolution) mine & interpret 'Oct 26-27', 'Oct 28-29', 'Oct 30-31'
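Moving from the finest to a coarser resolution amounts to grouping consecutive time stamps into buckets. The sketch below shows only that bookkeeping; the name `coarsen` and the bucket label format are hypothetical, and it does not attempt the fast update of the proximity matrices that MT3 is about.

```python
def coarsen(time_stamps, length):
    """Map each time stamp to the label of its bucket.

    Buckets hold `length` consecutive time stamps; a shorter final
    bucket is allowed when the count is not a multiple of `length`.
    """
    mapping = {}
    for start in range(0, len(time_stamps), length):
        bucket = time_stamps[start:start + length]
        if len(bucket) == 1:
            label = bucket[0]
        else:
            label = f"{bucket[0]} to {bucket[-1]}"
        for t in bucket:
            mapping[t] = label
    return mapping


days = ["Oct 26", "Oct 27", "Oct 28", "Oct 29", "Oct 30", "Oct 31"]
two_day = coarsen(days, 2)
```

Replacing every time stamp in the input table by its bucket label and rebuilding the graph gives the coarser-resolution input for the naive approach of rerunning the whole analysis.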
27
Outputs
[Figure: results at the finest resolution and at the coarser resolution]
28
MT3: How to (Naïve Solution)
[Diagram: for each resolution, compute the TT and TP matrices, then derive time clusters & anomalies and annotations/interpretations; the second pair of TT/TP matrices is marked with tildes (~)]
29
Challenges
Given the mining results at the finest resolution, how to speed up the analysis at the coarser resolutions?
30
MT3: Observation
A lot of overlap between the two graphs!
[Figure: the graph for the finest resolution next to the graph for the coarser resolution]
31
MT3: Solution
[Diagram: the TT and TP matrices of one resolution feeding the tilde-marked (~) TT and TP matrices of the other]
32
Roadmap
- Motivation
- T3: Single Resolution Analysis
- MT3: Multi Resolution Analysis
- Experimental Evaluations
- Conclusion
33
Data Sets
CIKM: from the CIKM proceedings
- Time: publication year (1993-2007, 15 years)
- Event: paper published (952)
- Entities: author (1895) & session (279)
- Attribute: keyword (158)
DeviceScan: from MIT Reality Mining
- Time: the day the scanning happened (1/1/2004-5/5/2005, 294 days)
- Event: Bluetooth device scanning a person (114,046)
- Entities: device (103) & person (97)
- Attribute: NA
34
T3 on 'CIKM' Data Set
Rep. Authors: James P. Callan, W. Bruce Croft, James Allan, Philip S. Yu, George Karypis, Charles Clarke
Rep. Keywords: Web, Cluster, Classification, XML, Language, Stream

Rep. Authors: Elke Rundensteiner, Daniel Miranker, Andreas Henrich, Il-Yeol Song, Scott B Huffman, Robert J. Hall
Rep. Keywords: Knowledge, System, Unstructured, Rule, Object-oriented, Deductive
35
MT3 on 'DeviceScan' Data Set
- Aggregate by month: Apr. 2004 is an anomaly
- Aggregate by day: time clusters labelled "Work day" and "Semester Break & Holiday"
36
Evaluation on Speed of MT3
[Figure: log time (sec.) vs. aggregation length on the DeviceScan data set, MT3 vs. the naïve solution]
120x speed-up
37
Conclusion
T3: Single Resolution Analysis
- Graph representation
- Using proximity to find time clusters/anomalies
- Provide interpretations
MT3: Multiple Resolution Analysis
- Redundancy among different resolutions
- Up to 2 orders of magnitude speed-up (same quality)
38
Thank you!
htong@cs.cmu.edu