Published by Suzanna Joy Ward. Modified over 9 years ago.
1
CMU SCS Patterns, Anomalies, and Fraud Detection in Large Graphs Christos Faloutsos CMU
2
CMU SCS (c) 2015, C. Faloutsos
Roadmap
Introduction – Motivation
Part#1: Static graphs
–Patterns
–Tools
Part#2: time-evolving graphs
–Patterns
–Tools
Conclusions
Eurlion, Beijing 2015
3
Graphs - why should we care?
Financial (user-account)
Social networks
Computer network security: email/IP traffic and anomaly detection
Recommendation systems
....
Many-to-many db relationship -> graph
5
Motivating problems
P1: patterns? Fraud detection?
P2: patterns in time-evolving graphs / tensors
[Figure: tensor with modes user, account, time; labels: Patterns, anomalies]
7
Part 1: Static graphs
9
Laws and patterns
Q1: Are real graphs random?
A1: NO!!
–Diameter (‘6 degrees’; ‘Kevin Bacon’)
–in- and out-degree distributions
–other (surprising) patterns
So, let’s look at the data
11
Solution# S.1
Power law in the degree distribution [Faloutsos x 3 SIGCOMM99; + Siganos]
[Figure: internet domains; axes: log(rank), log(degree); slope -0.82; labeled points: att.com, ibm.com]
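The rank-degree slope quoted on this slide can be estimated with an ordinary least-squares fit in log-log space. The sketch below is only an illustration under assumptions: the function name `rank_degree_slope` and the synthetic degree list are made up for this example, and a plain least-squares fit is just one simple way to estimate such an exponent, not necessarily the procedure used in the cited paper.

```python
import math

def rank_degree_slope(degrees):
    """Least-squares slope of log(degree) versus log(rank)."""
    # Rank 1 is the highest-degree node; zero-degree nodes are dropped
    # because their logarithm is undefined.
    ranked = sorted((d for d in degrees if d > 0), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(d) for d in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Hypothetical data: degrees that follow degree = 1000 * rank^(-0.82) exactly.
synthetic_degrees = [1000.0 * rank ** -0.82 for rank in range(1, 201)]
slope = rank_degree_slope(synthetic_degrees)
print(round(slope, 2))
```

On real data the points scatter around the line, so the fitted slope is only an estimate.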
12
Solution# S.2: Eigen Exponent E
A2: power law in the eigenvalues of the adjacency matrix (‘eig()’): A x = λ x
[Figure: eigenvalue vs. rank of decreasing eigenvalue; exponent = slope, E = -0.48; May 2001]
14
Solution# S.3: Triangle ‘Laws’
Real social networks have a lot of triangles
–Friends of friends are friends
Any patterns?
–2x the friends, 2x the triangles?
15
Triangle Law: #S.3 [Tsourakakis ICDM 2008]
n friends -> ~n^1.6 triangles
[Figure panels: SN, Reuters, Epinions; X-axis: degree; Y-axis: mean # triangles]
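The degree-versus-triangles plot needs, for each node, its degree and the number of triangles it takes part in. Below is a minimal sketch for small in-memory graphs; the function name and the toy edge list are assumptions for illustration, and this brute-force neighbor-pair check is not the scalable method needed for billion-edge graphs.

```python
from itertools import combinations

def triangles_per_node(edges):
    """Return (adjacency sets, triangle count per node) for an undirected graph."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    counts = {}
    for node, nbrs in adj.items():
        # A triangle through `node` is a pair of its neighbors that are
        # themselves connected.
        counts[node] = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return adj, counts

# Hypothetical toy graph: a 4-clique on nodes 0..3 plus a pendant node 4.
toy_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (0, 4)]
adjacency, triangle_counts = triangles_per_node(toy_edges)
for node in sorted(adjacency):
    print(node, len(adjacency[node]), triangle_counts[node])
```

Averaging the triangle counts over nodes of equal degree gives the points of the degree-triangle plot.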
16
Triangle counting for large graphs?
Anomalous nodes in Twitter (~3 billion edges) [U Kang, Brendan Meeder, +, PAKDD’11]
21
Roadmap
Introduction – Motivation
Part#1: Static graphs
–Patterns (binary; weighted)
–Tools
Part#2: time-evolving graphs
Conclusions
22
Observations on weighted graphs?
A: yes - even more ‘laws’!
M. McGlohon, L. Akoglu, and C. Faloutsos. Weighted Graphs and Disconnected Components: Patterns and a Generator. SIG-KDD 2008
23
Observation W.1: Fortification
Q: How do the weights of nodes relate to degree?
24
Observation W.1: Fortification
Double the checks, double the $?
[Figure: checks of $10, $5 and $7 to ‘Reagan’ and ‘Clinton’]
25
Observation W.1: fortification: Snapshot Power Law
Weight: super-linear on in-degree
exponent ‘iw’: 1.01 < iw < 1.26
Double the checks, double the $?
[Figure: Orgs-Candidates; axes: Edges (# donors), In-weights ($); e.g. John Kerry, $10M received, from 1K donors]
26
MORE Graph Patterns
RTG: A Recursive Realistic Graph Generator using Random Typing. Leman Akoglu and Christos Faloutsos. PKDD’09.
27
MORE Graph Patterns
Mary McGlohon, Leman Akoglu, Christos Faloutsos. Statistical Properties of Social Networks. In “Social Network Data Analytics” (Ed.: Charu Aggarwal)
Deepayan Chakrabarti and Christos Faloutsos. Graph Mining: Laws, Tools, and Case Studies. Oct. 2012, Morgan Claypool.
28
Roadmap
Introduction – Motivation
Part#1: Static graphs
–Patterns
–Tools
  Local
  Spectral
  Label propagation
Part#2: time-evolving graphs
Conclusions
29
Overview of tools
Tool             Binary / weights   Time-evolving   With class-labels
Triangle-degree  B
oddBall          W
eigenSpokes      W
fBox             W
netProbe+        W                                  Y
tensors          W                  Y
30
OddBall: Spotting Anomalies in Weighted Graphs
Leman Akoglu, Mary McGlohon, Christos Faloutsos
Carnegie Mellon University, School of Computer Science
PAKDD 2010, Hyderabad, India
31
Main idea
For each node, extract ‘ego-net’ (= 1-step-away neighbors)
Extract features (#edges, total weight, etc.)
Compare with the rest of the population
32
What is an egonet?
[Figure labels: ego, egonet]
33
Selected Features
N_i: number of neighbors (degree) of ego i
E_i: number of edges in egonet i
W_i: total weight of egonet i
λ_w,i: principal eigenvalue of the weighted adjacency matrix of egonet i
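The first three features can be read directly off the egonet. The following is a small sketch under assumptions: the function name, the dictionary keys and the toy edge lists are invented for this illustration, the graph is treated as undirected without self-loops, and the eigenvalue feature is left out to keep the example short.

```python
def egonet_features(weighted_edges, ego):
    """Compute N (neighbors), E (edges) and W (total weight) of one egonet."""
    adj = {}
    for u, v, w in weighted_edges:
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    # The egonet contains the ego, its 1-step neighbors, and every edge
    # among those nodes.
    members = set(adj.get(ego, {})) | {ego}
    endpoint_hits = 0
    weight_twice = 0.0
    for u in members:
        for v, w in adj.get(u, {}).items():
            if v in members:
                endpoint_hits += 1   # each edge is seen from both endpoints
                weight_twice += w
    return {"N": len(members) - 1, "E": endpoint_hits // 2, "W": weight_twice / 2}

# Hypothetical toy graphs: a star around 'a', then the same star with all
# neighbor-to-neighbor edges added (a clique).
star = [("a", "b", 1.0), ("a", "c", 2.0), ("a", "d", 3.0)]
clique = star + [("b", "c", 1.0), ("b", "d", 1.0), ("c", "d", 1.0)]
star_features = egonet_features(star, "a")
clique_features = egonet_features(clique, "a")
print(star_features, clique_features)
```

With the same number of neighbors N, a star egonet has E = N edges while a clique egonet has N(N+1)/2, so nodes near either extreme stand out when E is compared with N across the population.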
34
Near-Clique/Star
37
Near-Clique/Star
[Figure label: Andrew Lewis (director)]
38
Roadmap
Introduction – Motivation
Part#1: Static graphs
–Patterns
–Tools
  Local
  Spectral (EigenSpokes, CopyCatch)
  Label propagation
Part#2: time-evolving graphs
Conclusions
40
EigenSpokes
B. Aditya Prakash, Mukund Seshadri, Ashwin Sridharan, Sridhar Machiraju and Christos Faloutsos: EigenSpokes: Surprising Patterns and Scalable Community Chipping in Large Graphs, PAKDD 2010, Hyderabad, India, 21-24 June 2010.
41
EigenSpokes
Eigenvectors of adjacency matrix
–equivalent to singular vectors (symmetric, undirected graph)
46
EigenSpokes
EE plot: scatter plot of scores of u1 vs u2
One would expect
–Many points @ origin
–A few scattered ~randomly
[Figure axes: u1 (1st principal component), u2 (2nd principal component)]
47
EigenSpokes
EE plot: scatter plot of scores of u1 vs u2
One would expect
–Many points @ origin
–A few scattered ~randomly
[Figure: EE plot of u1 vs u2, annotated 90°]
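An EE plot pairs each node's score in the first eigenvector with its score in the second. The sketch below builds a hypothetical toy graph of two disconnected cliques and computes both eigenvectors with plain power iteration plus deflation. All names are illustrative, and the power-iteration shortcut assumes the two leading eigenvalues are positive and well separated, which holds for this toy graph but not for arbitrary graphs (a real analysis would use a sparse eigensolver).

```python
import math

def matvec(matrix, vec):
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]

def top_eigenpair(matrix, iters=500):
    """Power iteration; assumes the dominant eigenvalue is positive and simple."""
    vec = [float(i + 1) for i in range(len(matrix))]
    for _ in range(iters):
        nxt = matvec(matrix, vec)
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    value = sum(a * b for a, b in zip(vec, matvec(matrix, vec)))
    return value, vec

def disjoint_cliques(sizes):
    """Adjacency matrix of cliques with the given sizes, no edges between them."""
    n = sum(sizes)
    adj = [[0.0] * n for _ in range(n)]
    start = 0
    for size in sizes:
        for i in range(start, start + size):
            for j in range(start, start + size):
                if i != j:
                    adj[i][j] = 1.0
        start += size
    return adj

adj = disjoint_cliques([5, 4])  # hypothetical toy graph
n = len(adj)
lam1, u1 = top_eigenpair(adj)
# Deflation: subtract the first component so the second becomes dominant.
deflated = [[adj[i][j] - lam1 * u1[i] * u1[j] for j in range(n)] for i in range(n)]
lam2, u2 = top_eigenpair(deflated)
ee_points = list(zip(u1, u2))  # one (u1 score, u2 score) point per node
for point in ee_points:
    print(round(abs(point[0]), 3), round(abs(point[1]), 3))
```

Every node of this toy graph lands on one of the two axes rather than being scattered, because each eigenvector is concentrated on a single clique.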
48
EigenSpokes - pervasiveness
Present in mobile social graph
–across time and space
Patent citation graph
49
EigenSpokes - explanation
Near-cliques, or near-bipartite-cores, loosely connected
52
EigenSpokes - explanation
Near-cliques, or near-bipartite-cores, loosely connected
So what?
–Extract nodes with high scores
–high connectivity
–Good “communities”
[Figure: spy plot of top 20 nodes]
53
Bipartite Communities!
–patents from same inventor(s)
–‘cut-and-paste’ bibliography!
[Figure: magnified bipartite community]
55
Fraud
Given
–Who ‘likes’ what page, and when
Find
–Suspicious users and suspicious products
CopyCatch: Stopping Group Attacks by Spotting Lockstep Behavior in Social Networks. Alex Beutel, Wanhong Xu, Venkatesan Guruswami, Christopher Palow, Christos Faloutsos. WWW, 2013.
59
Graph Patterns and Lockstep Behavior
Our intuition
▪ Lockstep behavior: Same Likes, same time
[Figure: Likes; annotated: Suspicious Lockstep Behavior]
60
MapReduce Overview
▪ Use Hadoop to search for many clusters in parallel:
1. Start with a random seed
2. Update set of Pages and center Like times for each cluster
3. Repeat until convergence
61
Deployment at Facebook
▪ CopyCatch runs regularly (along with many other security mechanisms, and a large Site Integrity team)
[Figure: 3 months of CopyCatch @ Facebook; axes: time, #users caught]
62
Deployment at Facebook
Manually labeled 22 randomly selected clusters from February 2013
Most clusters (77%) come from real but compromised users
[Figure label: Fake acct]
63
Roadmap
Introduction – Motivation
Part#1: Static graphs
–Patterns
–Tools
  Local
  Spectral (EigenSpokes, CopyCatch, fBox)
  Label propagation
Part#2: time-evolving graphs
Conclusions
64
Problem: Social Network Link Fraud
Target: find “stealthy” attackers missed by other algorithms
[Figure: Clique, Bipartite core; graph with 41.7M nodes, 1.5B edges]
65
Problem: Social Network Link Fraud
Neil Shah, Alex Beutel, Brian Gallagher and Christos Faloutsos. Spotting Suspicious Link Behavior with fBox: An Adversarial Perspective. ICDM 2014, Shenzhen, China.
Target: find “stealthy” attackers missed by other algorithms
Takeaway: use reconstruction error between true/latent representation!
67
Roadmap
Introduction – Motivation
Part#1: Static graphs
–Patterns
–Tools
  Local
  Spectral (EigenSpokes, CopyCatch)
  Label propagation (NetProbe, Polonium, Snare)
Part#2: time-evolving graphs
Conclusions
68
E-bay Fraud detection
w/ Polo Chau & Shashank Pandit, CMU [www’07]
71
E-bay Fraud detection - NetProbe
72
Popular press
74
Polonium: Tera-Scale Graph Mining and Inference for Malware Detection
Polo Chau (Machine Learning Dept), Carey Nachenberg (Vice President & Fellow), Jeffrey Wilhelm (Principal Software Engineer), Adam Wright (Software Engineer), Prof. Christos Faloutsos (Computer Science Dept)
PATENT PENDING
SDM 2011, Mesa, Arizona
75
Polonium: The Data
60+ terabytes of data anonymously contributed by participants of worldwide Norton Community Watch program
–50+ million machines
–900+ million executable files
Constructed a machine-file bipartite graph (0.2 TB+)
–1 billion nodes (machines and files)
–37 billion edges
76
Polonium: Key Ideas
Use Belief Propagation to propagate domain knowledge in machine-file graph to detect malware
Use “guilt-by-association” (i.e., homophily)
–E.g., files that appear on machines with many bad files are more likely to be bad
Scalability: handles 37 billion-edge graph
77
Polonium: One-Interaction Results
84.9% True Positive Rate, 1% False Positive Rate
[Figure: ROC curve with ‘Ideal’ marked; axes: True Positive Rate (% of malware correctly identified), False Positive Rate (% of non-malware wrongly labeled as malware)]
79
Network Effect Tools: SNARE
Some accounts are sort-of-suspicious – how to combine weak signals?
[Figure: ‘Before’; account labels: Inventory, Acc. payable, Revenue, L.A.]
82
Network Effect Tools: SNARE
A: Belief Propagation.
[Figure: ‘Before’ and ‘After’]
Mary McGlohon, Stephen Bay, Markus G. Anderle, David M. Steier, Christos Faloutsos: SNARE: a link analytic system for graph labeling and risk detection. KDD 2009: 1265-1274
83
Network Effect Tools: SNARE
Produces improvement over simply using flags
–Up to 6.5 lift
–Improvement especially for low false positive rate
[Figure: Results for accounts data (ROC curve); axes: false positive rate, true positive rate; curves: Ideal, SNARE, Baseline (flags only)]
84
Network Effect Tools: SNARE
Accurate - Produces large improvement over simply using flags
Flexible - Can be applied to other domains
Scalable - One iteration BP runs in linear time (# edges)
Robust - Works on large range of parameters
85
Roadmap
Introduction – Motivation
Part#1: Static graphs
–Patterns
–Tools
  Local
  Spectral (EigenSpokes, CopyCatch)
  Label propagation (NetProbe,…, Unification)
Part#2: time-evolving graphs
Conclusions
86
Unifying Guilt-by-Association Approaches: Theorems and Fast Algorithms
Danai Koutra, U Kang, Hsing-Kuo Kenneth Pao, Tai-You Ke, Duen Horng (Polo) Chau, Christos Faloutsos
ECML PKDD, 5-9 September 2011, Athens, Greece
87
Problem Definition: GBA techniques (Guilt By Association)
Given: Graph; & few labeled nodes
Find: labels of rest (assuming network effects)
89
Are they related?
RWR (Random Walk with Restarts)
–google’s pageRank (‘if my friends are important, I’m important, too’)
SSL (Semi-supervised learning)
–minimize the differences among neighbors
BP (Belief propagation)
–send messages to neighbors, on what you believe about them
YES!
90
Correspondence of Methods
Method  Matrix               Unknown   Known
RWR     [I - c A D^-1]       × x       = (1-c) y
SSL     [I + a (D - A)]      × x       = y
FaBP    [I + a D - c' A]     × b_h     = φ_h
x, b_h: final labels/beliefs; y, φ_h: prior labels/beliefs; A: adjacency matrix; D: diagonal degree matrix (d1, d2, d3, ...)
DETAILS
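Each row of the correspondence table is a linear system, so on a small graph all three methods can be run with one generic solver. The sketch below is an illustration only: the 4-node path graph, the single labeled node, and the parameter values (c, a, and the FaBP coefficients) are arbitrary choices for this example rather than values from the paper, and dense Gaussian elimination stands in for the fast iterative solvers a large graph would need.

```python
def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [float(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

# Hypothetical toy graph: path 0 - 1 - 2 - 3, with only node 0 labeled.
A = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
n = len(A)
deg = [sum(row) for row in A]
prior = [1.0, 0.0, 0.0, 0.0]
c, a = 0.5, 0.5                  # illustrative RWR and SSL parameters
a_fabp, c_prime = 0.04, 0.2      # illustrative FaBP coefficients

eye = lambda i, j: 1.0 if i == j else 0.0
# RWR: [I - c A D^-1] x = (1 - c) y
m_rwr = [[eye(i, j) - c * A[i][j] / deg[j] for j in range(n)] for i in range(n)]
# SSL: [I + a (D - A)] x = y
m_ssl = [[eye(i, j) * (1 + a * deg[i]) - a * A[i][j] for j in range(n)] for i in range(n)]
# FaBP: [I + a D - c' A] b_h = phi_h
m_fabp = [[eye(i, j) * (1 + a_fabp * deg[i]) - c_prime * A[i][j] for j in range(n)]
          for i in range(n)]

x_rwr = solve(m_rwr, [(1 - c) * y for y in prior])
x_ssl = solve(m_ssl, prior)
b_fabp = solve(m_fabp, prior)
for name, scores in (("RWR", x_rwr), ("SSL", x_ssl), ("FaBP", b_fabp)):
    print(name, [round(s, 4) for s in scores])
```

The three systems differ only in their coefficient matrix, so all of them push the single known label out along the edges of the graph.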
91
Results: Scalability
FaBP is linear on the number of edges.
[Figure axes: # of edges, runtime (min)]
DETAILS
93
Summary of Part#1
*many* patterns in real graphs
–Power-laws everywhere
–Long (and growing) list of tools for anomaly/fraud detection
[Figure labels: Patterns, anomalies; fBox, NetProbe, oddBall, …]
95
Part 2: Time evolving graphs
96
Graphs over time -> tensors!
Problem #2:
–Given who calls whom, and when
–Find patterns / anomalies
[Figure labels: smith, johnson]
98
Graphs over time -> tensors!
Problem #2:
–Given who calls whom, and when
–Find patterns / anomalies
[Figure labels: Mon, Tue]
99
Graphs over time -> tensors!
Problem #2:
–Given who calls whom, and when
–Find patterns / anomalies
[Figure: tensor with modes caller, callee, time]
100
Graphs over time -> tensors!
Problem #2’:
–Given customer-account-date
–Find patterns / anomalies
MANY more settings, with >2 ‘modes’
[Figure: tensor with modes customer, account, date]
101
Graphs over time -> tensors!
Problem #2’’:
–Given author-keyword-date
–Find patterns / anomalies
MANY more settings, with >2 ‘modes’
[Figure: tensor with modes author, keyword, date]
102
Graphs over time -> tensors!
Problem #2’’’:
–Given subject – verb – object facts
–Find patterns / anomalies
MANY more settings, with >2 ‘modes’
[Figure: tensor with modes subject, object, verb]
103
Graphs over time -> tensors!
Problem #2’’’’:
–Given
–Find patterns / anomalies
MANY more settings, with >2 ‘modes’ (and 4, 5, etc modes)
[Figure: tensor with modes mode1, mode2, mode3]
104
Answer to all: tensor factorization
Recall: (SVD) matrix factorization: finds blocks
[Figure: N users x M products matrix ~ sum of three blocks: ‘meat-eaters’ / ‘steaks’ + ‘vegetarians’ / ‘plants’ + ‘kids’ / ‘cookies’]
105
Answer to all: tensor factorization
PARAFAC decomposition
[Figure: subject x object x verb tensor = sum of three components: politicians + artists + athletes]
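A PARAFAC component is an outer product of one vector per mode, and a dense block in the tensor shows up as large entries in all three vectors. The sketch below extracts a single component from a sparse 3-mode tensor by alternating updates. It is a simplification under assumptions: only one component is computed (a full PARAFAC fits several at once), the function names are illustrative, and the caller x callee x day data is synthetic, with one planted block and three stray calls.

```python
import math

def unit(vec):
    """Return (vec scaled to length 1, original length)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec], norm

def rank1_parafac(entries, shape, iters=50):
    """Fit one component weight * a (x) b (x) c to a sparse tensor {(i, j, k): value}."""
    a, b, c = ([1.0] * size for size in shape)
    weight = 0.0
    for _step in range(iters):
        # Update each mode's vector while holding the other two fixed.
        new_a = [0.0] * shape[0]
        for (i, j, k), value in entries.items():
            new_a[i] += value * b[j] * c[k]
        a, _norm = unit(new_a)
        new_b = [0.0] * shape[1]
        for (i, j, k), value in entries.items():
            new_b[j] += value * a[i] * c[k]
        b, _norm = unit(new_b)
        new_c = [0.0] * shape[2]
        for (i, j, k), value in entries.items():
            new_c[k] += value * a[i] * b[j]
        c, weight = unit(new_c)
    return weight, a, b, c

# Synthetic data: caller 1 phones callees 2..6 about 200 times on each of days 0..3.
calls = {(1, callee, day): 200.0 for callee in range(2, 7) for day in range(4)}
calls.update({(0, 0, 5): 1.0, (4, 7, 6): 1.0, (3, 1, 4): 1.0})  # stray single calls
weight, caller_f, callee_f, day_f = rank1_parafac(calls, shape=(6, 8, 7))

# Indices with a large loading in each mode identify the block.
block = [sorted(i for i, v in enumerate(f) if v > 0.1) for f in (caller_f, callee_f, day_f)]
print(block)
```

Reading off the large entries of each factor gives the callers, callees and days involved in the block.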
106
Answer: tensor factorization
PARAFAC decomposition
Results for who-calls-whom-when
–4M x 15 days
[Figure: caller x callee x time tensor = sum of components (??)]
107
Anomaly detection in time-evolving graphs
Anomalous communities in phone call data:
–European country, 4M clients, data over 2 weeks
~200 calls to EACH receiver on EACH day!
[Figure: 1 caller, 5 receivers, 4 days of activity]
109
Anomaly detection in time-evolving graphs
Anomalous communities in phone call data:
–European country, 4M clients, data over 2 weeks
~200 calls to EACH receiver on EACH day!
Miguel Araujo, Spiros Papadimitriou, Stephan Günnemann, Christos Faloutsos, Prithwish Basu, Ananthram Swami, Evangelos Papalexakis, Danai Koutra. Com2: Fast Automatic Discovery of Temporal (Comet) Communities. PAKDD 2014, Tainan, Taiwan.
110
Roadmap
Introduction – Motivation
–Why study (big) graphs?
Part#1: Patterns in graphs
Part#2: time-evolving graphs; tensors
Resources and Conclusions
111
Project info: PEGASUS
www.cs.cmu.edu/~pegasus
Results on large graphs: with Pegasus + hadoop + M45
Apache license
Code, papers, manual, video
Prof. U Kang, Prof. Polo Chau
112
Cast
Akoglu, Leman
Chau, Polo
Kang, U
Prakash, Aditya
Koutra, Danai
Beutel, Alex
Papalexakis, Vagelis
Shah, Neil
Lee, Jay Yoon
Araujo, Miguel
113
CONCLUSION#0
Patterns, Anomalies
114
CONCLUSION#1
Many, surprising patterns in real graphs
[Figures: Rank-degree, Degree-#triangles, Degree-weight, …]
115
CONCLUSION#2 - tools
Many, powerful tools
[Figures: fBox, oddBall, tensors, …]
116
Overview of tools
Tool             Binary / weights   Time-evolving   With class-labels
Triangle-degree  B
oddBall          W
eigenSpokes      W
fBox             W
netProbe+        W                                  Y
tensors          W                  Y
117
References
D. Chakrabarti, C. Faloutsos: Graph Mining – Laws, Tools and Case Studies, Morgan Claypool 2012
http://www.morganclaypool.com/doi/abs/10.2200/S00449ED1V01Y201209DMK006
118
Many, powerful tools
[Figures: fBox, oddBall, tensors, …]
Thank you!
christos@cs.cmu.edu
www.cs.cmu.edu/~christos