CMU SCS Patterns, Anomalies, and Fraud Detection in Large Graphs Christos Faloutsos CMU
Roadmap
– Introduction – Motivation
– Part#1: Static graphs
  – Patterns
  – Tools
– Part#2: Time-evolving graphs
  – Patterns
  – Tools
– Conclusions
(c) 2015, C. Faloutsos. Eurlion, Beijing 2015
Graphs: why should we care?
– Financial (user-account)
– Social networks
– Computer network security: IP traffic and anomaly detection
– Recommendation systems
– ...
– Any many-to-many database relationship can be viewed as a graph
Motivating problems
– P1: patterns? Fraud detection?
– P2: patterns in time-evolving graphs / tensors
[Figure: user × account × time tensor; patterns and anomalies]
Part 1: Static graphs
Laws and patterns
Q1: Are real graphs random?
A1: NO!!
– Diameter (‘6 degrees’; ‘Kevin Bacon’)
– In- and out-degree distributions
– Other (surprising) patterns
So, let’s look at the data.
Solution# S.1
Power law in the degree distribution [Faloutsos x 3, SIGCOMM99; + Siganos]
[Figure: log(degree) vs. log(rank) for internet domains; labeled points: att.com, ibm.com]
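One rough way to check the rank-degree power law on your own data is to sort the degrees in decreasing order and fit a line to log(degree) vs. log(rank). The sketch below is only an illustration in plain Python: the function names and the toy edge list are assumptions made here, not material from the talk, and a least-squares fit on a log-log plot is a crude estimator of the exponent.

```python
import math
from collections import Counter

def degrees(edges):
    """Return {node: degree} for an undirected edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)

def rank_degree_slope(edges):
    """Least-squares slope of log(degree) vs. log(rank); rank 1 = highest degree."""
    ds = sorted(degrees(edges).values(), reverse=True)
    xs = [math.log(r) for r in range(1, len(ds) + 1)]
    ys = [math.log(d) for d in ds]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

# Hypothetical toy graph: one hub plus a small triangle among its neighbors.
toy_edges = [(0, i) for i in range(1, 9)] + [(1, 2), (1, 3), (2, 3)]
slope = rank_degree_slope(toy_edges)  # negative: degree falls as rank grows
```

On a real graph, a roughly straight line in this plot is what the slide refers to as a power law; the slope is the rank exponent.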
Solution# S.2: Eigen Exponent E
A2: power law in the eigenvalues of the adjacency matrix (‘eig()’): A x = λ x
[Figure: eigenvalue vs. rank of decreasing eigenvalue, May 2001; E = exponent = slope]
Solution# S.3: Triangle ‘Laws’
Real social networks have a lot of triangles
– Friends of friends are friends
Any patterns?
– 2x the friends, 2x the triangles?
Triangle Law: #S.3 [Tsourakakis ICDM 2008]
n friends -> ~n^1.6 triangles
[Figure panels: SN, Reuters, Epinions; x-axis: degree, y-axis: mean # triangles]
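The quantity on the y-axis can be computed directly on a small graph: for each node, count the pairs of its neighbors that are themselves connected. This is a minimal in-memory sketch, not the method of the cited paper; the function name and the toy edge list are assumptions for illustration, and this approach does not scale to billion-edge graphs.

```python
from collections import defaultdict
from itertools import combinations

def triangles_per_node(edges):
    """Return {node: number of triangles the node participates in}."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    adj = dict(adj)
    tri = {}
    for node, nbrs in adj.items():
        # A triangle through `node` is a pair of its neighbors that share an edge.
        tri[node] = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return tri

# Hypothetical toy graph: a triangle 1-2-3 with a pendant node 4.
counts = triangles_per_node([(1, 2), (2, 3), (1, 3), (3, 4)])
```

Averaging these counts over all nodes of the same degree gives the points of the degree vs. mean-triangles plot.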
Triangle counting for large graphs?
Anomalous nodes in Twitter (~3 billion edges) [U Kang, Brendan Meeder, +, PAKDD’11]
Roadmap
– Introduction – Motivation
– Part#1: Static graphs
  – Patterns (binary; weighted)
  – Tools
– Part#2: Time-evolving graphs
– Conclusions
Observations on weighted graphs?
A: yes, even more ‘laws’!
M. McGlohon, L. Akoglu, and C. Faloutsos. Weighted Graphs and Disconnected Components: Patterns and a Generator. SIG-KDD 2008.
Observation W.1: Fortification
Q: How do the weights of nodes relate to degree? Double the checks, double the $?
Snapshot Power Law: weight is super-linear in in-degree, with exponent ‘iw’: 1.01 < iw < 1.26
E.g., John Kerry: $10M received, from 1K donors
[Figure: Orgs-Candidates; x-axis: edges (# donors), y-axis: in-weights ($); example checks of $10, $5, $7 to ‘Reagan’ and ‘Clinton’]
MORE Graph Patterns
RTG: A Recursive Realistic Graph Generator using Random Typing. Leman Akoglu and Christos Faloutsos. PKDD’09.
MORE Graph Patterns
– Mary McGlohon, Leman Akoglu, Christos Faloutsos. Statistical Properties of Social Networks. In “Social Network Data Analytics” (Ed.: Charu Aggarwal).
– Deepayan Chakrabarti and Christos Faloutsos. Graph Mining: Laws, Tools, and Case Studies. Oct. 2012, Morgan Claypool.
Roadmap
– Introduction – Motivation
– Part#1: Static graphs
  – Patterns
  – Tools: Local; Spectral; Label propagation
– Part#2: Time-evolving graphs
– Conclusions
Overview of tools
Tool             Binary / weights   Time-evolving   With class labels
Triangle-degree  B
oddBall          W
eigenSpokes      W
fBox             W
netProbe+        W                                  Y
tensors          W                  Y
Covered so far (✔): Triangle-degree
OddBall: Spotting Anomalies in Weighted Graphs
Leman Akoglu, Mary McGlohon, Christos Faloutsos. Carnegie Mellon University, School of Computer Science. PAKDD 2010, Hyderabad, India.
Main idea
– For each node, extract its ‘ego-net’ (= 1-step-away neighbors)
– Extract features (# edges, total weight, etc.)
– Compare with the rest of the population
What is an egonet?
[Figure: an ego node and its egonet]
Selected Features
– N_i: number of neighbors (degree) of ego i
– E_i: number of edges in egonet i
– W_i: total weight of egonet i
– λ_w,i: principal eigenvalue of the weighted adjacency matrix of egonet i
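The first three features can be computed with a few lines of plain Python. The sketch below is an illustration under stated assumptions, not the OddBall implementation: the function name and toy data are made up here, the egonet is taken to be the ego, its neighbors, and all edges among them, and node ids are assumed to be comparable (for example, integers). The eigenvalue feature is omitted.

```python
from collections import defaultdict

def egonet_features(weighted_edges):
    """Return {ego: (N_i, E_i, W_i)} for an undirected list of (u, v, weight)."""
    adj = defaultdict(dict)
    for u, v, w in weighted_edges:
        adj[u][v] = w
        adj[v][u] = w
    adj = dict(adj)
    feats = {}
    for ego, nbrs in adj.items():
        members = set(nbrs) | {ego}
        n_edges, total_w = 0, 0.0
        for u in members:
            for v, w in adj[u].items():
                if v in members and u < v:  # count each undirected edge once
                    n_edges += 1
                    total_w += w
        feats[ego] = (len(nbrs), n_edges, total_w)
    return feats

# Hypothetical toy graph: node 0 is connected to 1, 2, 3; nodes 1 and 2 are also linked.
feats = egonet_features([(0, 1, 1.0), (0, 2, 2.0), (0, 3, 1.0), (1, 2, 5.0)])
```

For reference, a pure star has E_i = N_i, while a full clique has E_i = N_i (N_i + 1) / 2, so plotting E_i against N_i separates the two extremes named on the following slides.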
Near-Clique/Star
[Figure; labeled example: Andrew Lewis (director)]
Roadmap
– Introduction – Motivation
– Part#1: Static graphs
  – Patterns
  – Tools: Local; Spectral (EigenSpokes, CopyCatch); Label propagation
– Part#2: Time-evolving graphs
– Conclusions
Overview of tools: covered so far (✔): Triangle-degree, oddBall
EigenSpokes
B. Aditya Prakash, Mukund Seshadri, Ashwin Sridharan, Sridhar Machiraju and Christos Faloutsos: EigenSpokes: Surprising Patterns and Scalable Community Chipping in Large Graphs. PAKDD 2010, Hyderabad, India, June.
EigenSpokes (details)
Eigenvectors of the adjacency matrix are equivalent to its singular vectors (symmetric matrix, undirected graph)
[Figure: N × N adjacency matrix]
EigenSpokes
EE plot: scatter plot of scores of u1 vs. u2 (1st vs. 2nd principal component)
One would expect
– Many points at the origin
– A few scattered ~randomly
[Figure: u1 vs. u2 scatter plot, annotated with a 90° angle]
EigenSpokes: pervasiveness
– Present in mobile social graph, across time and space
– Patent citation graph
EigenSpokes: explanation
Near-cliques, or near-bipartite-cores, loosely connected
So what?
– Extract nodes with high scores
– High connectivity
– Good “communities”
[Figure: spy plot of top 20 nodes]
Bipartite Communities!
– Patents from same inventor(s)
– ‘Cut-and-paste’ bibliography!
[Figure: magnified bipartite community]
Fraud
Given
– Who ‘likes’ what page, and when
Find
– Suspicious users and suspicious products
CopyCatch: Stopping Group Attacks by Spotting Lockstep Behavior in Social Networks. Alex Beutel, Wanhong Xu, Venkatesan Guruswami, Christopher Palow, Christos Faloutsos. WWW, 2013.
Our intuition
– Lockstep behavior: same Likes, same time
[Figure: Graph Patterns and Lockstep Behavior; Likes, with suspicious lockstep behavior highlighted]
MapReduce Overview
Use Hadoop to search for many clusters in parallel:
1. Start with random seeds
2. Update the set of Pages and the center Like times for each cluster
3. Repeat until convergence
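The three steps can be illustrated on a single machine for one cluster. The sketch below is a heavily simplified toy, not the CopyCatch algorithm: it keeps the set of Pages fixed and only alternates between picking the users that fit a time window and re-centering the Like times. All names, the window parameter `dt`, and the toy data are assumptions introduced for this example.

```python
def lockstep_users(likes, centers, dt):
    """Users who liked every page in `centers` within dt of that page's center time.

    likes: {(user, page): like_time}; centers: {page: center_time}.
    """
    users = {u for (u, _page) in likes}
    return {
        u for u in users
        if all((u, p) in likes and abs(likes[(u, p)] - c) <= dt
               for p, c in centers.items())
    }

def update_centers(likes, users, centers):
    """Move each page's center to the mean Like time of the current users."""
    return {p: sum(likes[(u, p)] for u in users) / len(users) for p in centers}

def find_cluster(likes, seed_centers, dt, max_iter=20):
    """Alternate between the two updates until nothing changes (toy version)."""
    centers = dict(seed_centers)
    users = lockstep_users(likes, centers, dt)
    for _ in range(max_iter):
        if not users:
            break
        new_centers = update_centers(likes, users, centers)
        new_users = lockstep_users(likes, new_centers, dt)
        if new_users == users and new_centers == centers:
            break
        users, centers = new_users, new_centers
    return users, centers

# Hypothetical toy data: a, b, c like both pages at nearly the same times.
toy_likes = {
    ("a", "p1"): 10, ("a", "p2"): 20,
    ("b", "p1"): 11, ("b", "p2"): 21,
    ("c", "p1"): 12, ("c", "p2"): 19,
    ("d", "p1"): 50, ("d", "p2"): 20,
    ("e", "p1"): 10,
}
group, final_centers = find_cluster(toy_likes, {"p1": 10, "p2": 20}, dt=2)
```

In the setting described on the slide, many such searches start from different random seeds and run in parallel on Hadoop; that part is not modeled here.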
Deployment at Facebook
CopyCatch runs regularly (along with many other security mechanisms, and a large Site Integrity team)
[Figure: # users caught vs. time; 3 months of Facebook]
Deployment at Facebook
– Manually labeled 22 randomly selected clusters from February 2013
– Most clusters (77%) come from real but compromised users
[Figure legend includes: Fake acct]
Roadmap
– Introduction – Motivation
– Part#1: Static graphs
  – Patterns
  – Tools: Local; Spectral (EigenSpokes, CopyCatch, fBox); Label propagation
– Part#2: Time-evolving graphs
– Conclusions
Problem: Social Network Link Fraud
Target: find “stealthy” attackers missed by other algorithms
Takeaway: use reconstruction error between true/latent representation!
[Figure: clique and bipartite-core attacks; graph with 41.7M nodes, 1.5B edges]
Neil Shah, Alex Beutel, Brian Gallagher and Christos Faloutsos. Spotting Suspicious Link Behavior with fBox: An Adversarial Perspective. ICDM 2014, Shenzhen, China.
Overview of tools: covered so far (✔): Triangle-degree, oddBall, eigenSpokes, fBox
Roadmap
– Introduction – Motivation
– Part#1: Static graphs
  – Patterns
  – Tools: Local; Spectral (EigenSpokes, CopyCatch); Label propagation (NetProbe, Polonium, Snare)
– Part#2: Time-evolving graphs
– Conclusions
E-bay Fraud detection: NetProbe
w/ Polo Chau & Shashank Pandit, CMU [WWW’07]
Popular press
Polonium: Tera-Scale Graph Mining and Inference for Malware Detection
SDM 2011, Mesa, Arizona. PATENT PENDING
– Polo Chau, Machine Learning Dept
– Carey Nachenberg, Vice President & Fellow
– Jeffrey Wilhelm, Principal Software Engineer
– Adam Wright, Software Engineer
– Prof. Christos Faloutsos, Computer Science Dept
Polonium: The Data
– 60+ terabytes of data anonymously contributed by participants of the worldwide Norton Community Watch program
– 50+ million machines
– 900+ million executable files
– Constructed a machine-file bipartite graph (0.2 TB+)
  – 1 billion nodes (machines and files)
  – 37 billion edges
Polonium: Key Ideas
– Use Belief Propagation to propagate domain knowledge in the machine-file graph to detect malware
– Use “guilt-by-association” (i.e., homophily)
  – E.g., files that appear on machines with many bad files are more likely to be bad
– Scalability: handles a 37-billion-edge graph
Polonium: One-Interaction Results
84.9% True Positive Rate, 1% False Positive Rate
[Figure: ROC curve; y-axis: True Positive Rate (% of malware correctly identified); x-axis: False Positive Rate (% of non-malware wrongly labeled as malware); ideal point marked]
Network Effect Tools: SNARE
Some accounts are sort-of-suspicious: how to combine weak signals?
A: Belief Propagation.
[Figure: account graph before and after propagation; node labels include Inventory, Acc. payable, Revenue, L.A.]
Mary McGlohon, Stephen Bay, Markus G. Anderle, David M. Steier, Christos Faloutsos: SNARE: a link analytic system for graph labeling and risk detection. KDD 2009.
Network Effect Tools: SNARE
Produces improvement over simply using flags
– Up to 6.5 lift
– Improvement especially for low false positive rate
[Figure: results for accounts data (ROC curve), true positive rate vs. false positive rate; curves: Ideal, SNARE, Baseline (flags only)]
Network Effect Tools: SNARE
– Accurate: produces large improvement over simply using flags
– Flexible: can be applied to other domains
– Scalable: one iteration of BP runs in time linear in the number of edges
– Robust: works on a large range of parameters
Roadmap
– Introduction – Motivation
– Part#1: Static graphs
  – Patterns
  – Tools: Local; Spectral (EigenSpokes, CopyCatch); Label propagation (NetProbe, …, Unification)
– Part#2: Time-evolving graphs
– Conclusions
Unifying Guilt-by-Association Approaches: Theorems and Fast Algorithms
Danai Koutra, U Kang, Hsing-Kuo Kenneth Pao, Tai-You Ke, Duen Horng (Polo) Chau, Christos Faloutsos. ECML PKDD, 5-9 September 2011, Athens, Greece.
Problem Definition: GBA (Guilt By Association) techniques
Given: a graph, and a few labeled nodes
Find: labels of the rest (assuming network effects)
Are they related?
– RWR (Random Walk with Restarts)
  – Google’s PageRank (‘if my friends are important, I’m important, too’)
– SSL (Semi-supervised learning)
  – Minimize the differences among neighbors
– BP (Belief propagation)
  – Send messages to neighbors, on what you believe about them
YES!
Correspondence of Methods (details)
Method   Matrix                   Unknown      Known
RWR      [I - c A D^-1]     ×     x        =   (1 - c) y
SSL      [I + a (D - A)]    ×     x        =   y
FaBP     [I + a D - c′ A]   ×     b_h      =   φ_h
Legend: A = adjacency matrix; D = diagonal matrix with entries d1, d2, d3, ...; unknown = final labels/beliefs; known = prior labels/beliefs
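Each row of the table is a sparse linear system, so any standard solver applies. As a small illustration, the SSL row [I + a (D - A)] x = y can be solved with Jacobi iteration, since the matrix is strictly diagonally dominant. This is a sketch in plain Python under stated assumptions: the function name, the default parameters, and the toy path graph are chosen here for illustration and are not taken from the paper.

```python
def ssl_labels(adj, prior, a=1.0, iters=200):
    """Solve [I + a (D - A)] x = y by Jacobi iteration.

    adj: {node: set of neighbors} for an undirected graph.
    prior: {node: y_i}; nodes missing from `prior` are unlabeled (y_i = 0).
    """
    x = {n: prior.get(n, 0.0) for n in adj}
    for _ in range(iters):
        # Row i of the system reads: (1 + a * d_i) * x_i - a * sum_j A_ij x_j = y_i
        x = {
            n: (prior.get(n, 0.0) + a * sum(x[m] for m in adj[n]))
               / (1.0 + a * len(adj[n]))
            for n in adj
        }
    return x

# Hypothetical toy graph: path 0 - 1 - 2 - 3, one positive and one negative seed.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
scores = ssl_labels(path, {0: 1.0, 3: -1.0})
```

On this path the exact solution is x = (4/7, 1/7, -1/7, -4/7): each unlabeled node leans toward the nearer seed. Each sweep touches every edge once, which is consistent with the linear scalability claimed for this family of methods.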
Results: Scalability (details)
FaBP is linear in the number of edges.
[Figure: runtime (min) vs. # of edges]
Overview of tools: covered so far (✔): Triangle-degree, oddBall, eigenSpokes, fBox, netProbe+
Summary of Part#1
*Many* patterns in real graphs
– Power-laws everywhere
– Long (and growing) list of tools for anomaly/fraud detection: fBox, NetProbe, oddBall, …
[Figure: patterns and anomalies]
Part 2: Time-evolving graphs
Graphs over time -> tensors!
Problem #2:
– Given who calls whom, and when
– Find patterns / anomalies
[Figure: caller × callee matrices (e.g., smith, johnson) for Mon, Tue, ... stacked into a caller × callee × time tensor]
Graphs over time -> tensors!
MANY more settings, with >2 ‘modes’ (and 4, 5, etc. modes); in each, find patterns / anomalies:
– Problem #2’: given customer-account-date
– Problem #2’’: given author-keyword-date
– Problem #2’’’: given subject-verb-object facts
– Problem #2’’’’: generic case, with axes mode1, mode2, mode3
Answer to all: tensor factorization
Recall: (SVD) matrix factorization finds blocks
[Figure: N users × M products matrix ~ sum of three blocks: ‘meat-eaters’ × ‘steaks’ + ‘vegetarians’ × ‘plants’ + ‘kids’ × ‘cookies’]
Answer to all: tensor factorization
PARAFAC decomposition
[Figure: subject × object × verb tensor = sum of three components: politicians + artists + athletes]
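In symbols, the picture on this slide corresponds to the standard PARAFAC (CP) model. The notation below is an assumption introduced here for illustration: a three-mode tensor X with R components, where the vectors a_r, b_r, c_r hold the scores of component r along the subject, object, and verb modes.

```latex
\mathcal{X} \;\approx\; \sum_{r=1}^{R} \mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r
\qquad\Longleftrightarrow\qquad
x_{ijk} \;\approx\; \sum_{r=1}^{R} a_{ir}\, b_{jr}\, c_{kr}
```

With R = 3, each rank-one term plays the role of one block in the figure (politicians, artists, athletes), in the same way that each SVD component captures one block of a matrix.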
Answer: tensor factorization
PARAFAC decomposition
Results for who-calls-whom-when
– 4M x 15 days
[Figure: caller × callee × time tensor = sum of components (??)]
Anomaly detection in time-evolving graphs
Anomalous communities in phone call data:
– European country, 4M clients, data over 2 weeks
– 1 caller, 5 receivers, 4 days of activity
– ~200 calls to EACH receiver on EACH day!
Miguel Araujo, Spiros Papadimitriou, Stephan Günnemann, Christos Faloutsos, Prithwish Basu, Ananthram Swami, Evangelos Papalexakis, Danai Koutra. Com2: Fast Automatic Discovery of Temporal (Comet) Communities. PAKDD 2014, Tainan, Taiwan.
Roadmap
– Introduction – Motivation
  – Why study (big) graphs?
– Part#1: Patterns in graphs
– Part#2: Time-evolving graphs; tensors
– Resources and Conclusions
Project info: PEGASUS
– Results on large graphs: with Pegasus + hadoop + M45
– Apache license
– Code, papers, manual, video
– Prof. U Kang, Prof. Polo Chau
Cast
Akoglu, Leman; Chau, Polo; Kang, U; Prakash, Aditya; Koutra, Danai; Beutel, Alex; Papalexakis, Vagelis; Shah, Neil; Lee, Jay Yoon; Araujo, Miguel
CONCLUSION#0: Patterns; Anomalies
CONCLUSION#1: Many, surprising patterns in real graphs
– Rank-degree
– Degree-#triangles
– Degree-weight
– …
CONCLUSION#2 - tools: Many, powerful tools
– fBox, oddBall, tensors, …
Overview of tools: Triangle-degree (B), oddBall (W), eigenSpokes (W), fBox (W), netProbe+ (W, with class labels), tensors (W, time-evolving)
References
D. Chakrabarti, C. Faloutsos: Graph Mining – Laws, Tools and Case Studies, Morgan Claypool. ED1V01Y201209DMK006
Thank you!