Patterns, Anomalies, and Fraud Detection in Large Graphs. Christos Faloutsos, CMU SCS.

1 CMU SCS Patterns, Anomalies, and Fraud Detection in Large Graphs Christos Faloutsos CMU

2 Roadmap: Introduction (Motivation); Part #1: Static graphs (Patterns; Tools); Part #2: Time-evolving graphs (Patterns; Tools); Conclusions. CMU SCS. (c) 2015, C. Faloutsos. Eurlion, Beijing 2015.

3 Graphs: why should we care? Financial (user-account); social networks; computer network security (email/IP traffic and anomaly detection); recommendation systems; ... Any many-to-many db relationship -> graph.

4 Motivating problems. P1: patterns? Fraud detection? P2: patterns in time-evolving graphs / tensors. [Figure: tensor with modes user, account, time]

5 Motivating problems. P1: patterns? Fraud detection? P2: patterns in time-evolving graphs / tensors. [Figure: tensor with modes user, account, time; labels: Patterns, anomalies]

6 Roadmap: Introduction (Motivation); Part #1: Static graphs (Patterns; Tools); Part #2: Time-evolving graphs (Patterns; Tools); Conclusions.

7 Part 1: Static graphs

8 Laws and patterns. Q1: Are real graphs random?

9 Laws and patterns. Q1: Are real graphs random? A1: NO! Diameter ('6 degrees'; 'Kevin Bacon'); in- and out-degree distributions; other (surprising) patterns. So, let's look at the data.

10 Solution S.1: Power law in the degree distribution [Faloutsos x 3, SIGCOMM99]. [Figure: log(degree) vs. log(rank) for internet domains; att.com and ibm.com labeled]

11 Solution S.1: Power law in the degree distribution [Faloutsos x 3, SIGCOMM99; + Siganos]. [Figure: log(degree) vs. log(rank) for internet domains, slope -0.82; att.com and ibm.com labeled]
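The rank-degree power law can be checked by fitting a straight line in log-log space. The sketch below is only an illustration, not the authors' code: the function name `rank_degree_slope` and the synthetic degree sequence are assumptions made up here, and a plain least-squares fit is used for simplicity.

```python
import numpy as np

def rank_degree_slope(degrees):
    """Least-squares slope of log(degree) against log(rank)."""
    deg = np.sort(np.asarray(degrees, dtype=float))[::-1]  # rank 1 = largest degree
    deg = deg[deg > 0]                                      # log needs positive values
    rank = np.arange(1, len(deg) + 1)
    slope, _intercept = np.polyfit(np.log(rank), np.log(deg), 1)
    return slope

# Synthetic degrees that follow degree = 1000 * rank^(-0.82) exactly.
ranks = np.arange(1, 501)
synthetic_degrees = 1000.0 * ranks ** -0.82
slope = rank_degree_slope(synthetic_degrees)
print(round(slope, 2))
```

On real data the points will not sit exactly on a line, so the fitted slope is only a summary of the trend.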

12 Solution S.2: Eigen Exponent E. A2: power law in the eigenvalues of the adjacency matrix ('eig()'), A x = λ x. [Figure: eigenvalue vs. rank of decreasing eigenvalue, May 2001; exponent = slope, E = -0.48]

13 Solution S.3: Triangle 'Laws'. Real social networks have a lot of triangles.

14 Solution S.3: Triangle 'Laws'. Real social networks have a lot of triangles: friends of friends are friends. Any patterns? 2x the friends, 2x the triangles?

15 Triangle Law S.3 [Tsourakakis, ICDM 2008]. n friends -> ~n^1.6 triangles. [Figure: panels SN, Reuters, Epinions; X-axis: degree; Y-axis: mean # triangles]
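To relate degree to triangle count, one needs the number of triangles through each node. A minimal sketch, assuming a small simple undirected graph stored as a dense 0/1 adjacency matrix (the helper name `triangles_per_node` and the toy graph are made up for illustration; this dense approach would not scale to the large graphs discussed in the talk):

```python
import numpy as np

def triangles_per_node(A):
    """Triangles through each node of a simple undirected graph: diag(A^3) / 2."""
    A = np.asarray(A, dtype=float)
    return np.diag(A @ A @ A) / 2.0

# Toy graph: a 4-clique on nodes 0..3 plus a pendant node 4 attached to node 0.
A = np.zeros((5, 5))
for i in range(4):
    for j in range(i + 1, 4):
        A[i, j] = A[j, i] = 1.0
A[0, 4] = A[4, 0] = 1.0

degree = A.sum(axis=1)           # number of friends per node
tri = triangles_per_node(A)      # triangles each node participates in
total_triangles = tri.sum() / 3  # each triangle is counted once per corner
```

Plotting `tri` against `degree` on log-log axes for a real graph is what the triangle law describes.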

16 Triangle counting for large graphs? Anomalous nodes in Twitter (~3 billion edges) [U Kang, Brendan Meeder, +, PAKDD'11]

21 Roadmap: Introduction (Motivation); Part #1: Static graphs (Patterns: binary, weighted; Tools); Part #2: Time-evolving graphs; Conclusions.

22 Observations on weighted graphs? A: yes, even more 'laws'! M. McGlohon, L. Akoglu, and C. Faloutsos. Weighted Graphs and Disconnected Components: Patterns and a Generator. SIG-KDD 2008.

23 Observation W.1: Fortification. Q: How do the weights of nodes relate to degree?

24 Observation W.1: Fortification. Double the checks, double the $? [Figure: checks of $10, $5, $7 to 'Reagan' and 'Clinton']

25 Observation W.1: Fortification (Snapshot Power Law). Weight is super-linear in in-degree; exponent 'iw': 1.01 < iw < 1.26. Double the checks, double the $? [Figure: Orgs-Candidates; X-axis: edges (# donors); Y-axis: in-weights ($); e.g. John Kerry, $10M received from 1K donors]

26 MORE Graph Patterns. RTG: A Recursive Realistic Graph Generator using Random Typing. Leman Akoglu and Christos Faloutsos. PKDD'09.

27 MORE Graph Patterns. Mary McGlohon, Leman Akoglu, Christos Faloutsos. Statistical Properties of Social Networks. In "Social Network Data Analytics" (Ed.: Charu Aggarwal). Deepayan Chakrabarti and Christos Faloutsos. Graph Mining: Laws, Tools, and Case Studies. Morgan Claypool, Oct. 2012.

28 Roadmap: Introduction (Motivation); Part #1: Static graphs (Patterns; Tools: local, spectral, label propagation); Part #2: Time-evolving graphs; Conclusions.

29 Overview of tools (covered so far: Triangle-degree)
Tool | Binary (B) / weights (W) | Time-evolving | With class labels
Triangle-degree | B | |
oddBall | W | |
eigenSpokes | W | |
fBox | W | |
netProbe+ | W | | Y
tensors | W | Y |

30 OddBall: Spotting Anomalies in Weighted Graphs. Leman Akoglu, Mary McGlohon, Christos Faloutsos. Carnegie Mellon University, School of Computer Science. PAKDD 2010, Hyderabad, India.

31 Main idea. For each node, extract its 'ego-net' (= 1-step-away neighbors); extract features (# edges, total weight, etc.); compare with the rest of the population.

32 What is an egonet? [Figure: an ego node and its egonet]

33 Selected features. N_i: number of neighbors (degree) of ego i. E_i: number of edges in egonet i. W_i: total weight of egonet i. λ_w,i: principal eigenvalue of the weighted adjacency matrix of egonet i.
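The four egonet features can be computed directly from a weighted adjacency matrix. The following is a minimal sketch, assuming a small symmetric matrix with no self-loops; the function name `egonet_features` and the toy star graph are illustrative assumptions, not the OddBall implementation.

```python
import numpy as np

def egonet_features(W, i):
    """Return (N_i, E_i, W_i, lambda_w_i) for the egonet of node i.

    Assumes W is a symmetric weighted adjacency matrix without self-loops.
    """
    W = np.asarray(W, dtype=float)
    nbrs = np.flatnonzero(W[i])            # 1-step-away neighbors of the ego
    nodes = np.concatenate(([i], nbrs))    # ego plus its neighbors
    sub = W[np.ix_(nodes, nodes)]          # induced subgraph = egonet
    upper = np.triu(sub, k=1)              # count each undirected edge once
    n_i = len(nbrs)
    e_i = int(np.count_nonzero(upper))
    w_i = float(upper.sum())
    lam = float(np.linalg.eigvalsh(sub)[-1])  # largest eigenvalue
    return n_i, e_i, w_i, lam

# Toy graph: a star with center 0 and three leaves, all weights 1.
W = np.zeros((4, 4))
W[0, 1:] = W[1:, 0] = 1.0
center = egonet_features(W, 0)
leaf = egonet_features(W, 1)
```

For intuition on the near-clique/star contrast: a star egonet with N neighbors has E = N edges, while a clique egonet has E = N(N+1)/2, so where a node falls between these extremes is one way to compare it with the rest of the population.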

34 Near-Clique/Star

37 Near-Clique/Star. [Figure label: Andrew Lewis (director)]

38 Roadmap: Introduction (Motivation); Part #1: Static graphs (Patterns; Tools: local; spectral (EigenSpokes, CopyCatch); label propagation); Part #2: Time-evolving graphs; Conclusions.

39 Overview of tools (table as on slide 29). Covered so far: Triangle-degree, oddBall.

40 EigenSpokes. B. Aditya Prakash, Mukund Seshadri, Ashwin Sridharan, Sridhar Machiraju and Christos Faloutsos: EigenSpokes: Surprising Patterns and Scalable Community Chipping in Large Graphs. PAKDD 2010, Hyderabad, India, 21-24 June 2010.

41 EigenSpokes. Eigenvectors of the adjacency matrix: equivalent to singular vectors (symmetric, undirected graph). [Figure: N x N adjacency matrix; details]

46 EigenSpokes. EE plot: scatter plot of scores of u1 vs. u2 (u1: 1st principal component; u2: 2nd principal component). One would expect: many points at the origin; a few scattered ~randomly.

47 EigenSpokes. EE plot: scatter plot of scores of u1 vs. u2. One would expect: many points at the origin; a few scattered ~randomly. [Figure: u1 vs. u2, annotated 90°]

48 EigenSpokes: pervasiveness. Present in mobile social graph, across time and space; patent citation graph.

49 EigenSpokes: explanation. Near-cliques, or near-bipartite-cores, loosely connected.

52 EigenSpokes: explanation. Near-cliques, or near-bipartite-cores, loosely connected. So what? Extract nodes with high scores -> high connectivity -> good "communities". [Figure: spy plot of top 20 nodes]
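The "extract nodes with high scores" step can be illustrated on a toy graph. The sketch below is an assumption-laden illustration rather than the EigenSpokes algorithm itself: it builds two cliques joined by a single edge, takes the top two eigenvectors of the adjacency matrix with a dense solver, and picks the nodes with the largest absolute scores on each. The helper names and the graph are made up here.

```python
import numpy as np

def top_two_eigenvectors(A):
    """Eigenvectors for the two largest eigenvalues of a symmetric matrix."""
    _vals, vecs = np.linalg.eigh(np.asarray(A, dtype=float))  # ascending order
    return vecs[:, -1], vecs[:, -2]

def clique(n):
    return np.ones((n, n)) - np.eye(n)

# Two near-cliques, loosely connected: K5 on nodes 0..4, K4 on nodes 5..8,
# joined by the single edge (4, 5).
A = np.zeros((9, 9))
A[:5, :5] = clique(5)
A[5:, 5:] = clique(4)
A[4, 5] = A[5, 4] = 1.0

u1, u2 = top_two_eigenvectors(A)

# Nodes with the highest |score| on each eigenvector form one community each.
community_1 = set(np.argsort(-np.abs(u1))[:5].tolist())
community_2 = set(np.argsort(-np.abs(u2))[:4].tolist())
```

A scatter plot of `u1` against `u2` for this graph would put each clique close to its own axis, which is the spoke pattern of the EE plot.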

53 Bipartite communities! Patents from same inventor(s); 'cut-and-paste' bibliography! [Figure: magnified bipartite community]

54 Roadmap: Introduction (Motivation); Part #1: Static graphs (Patterns; Tools: local; spectral (EigenSpokes, CopyCatch); label propagation); Part #2: Time-evolving graphs; Conclusions.

55 Fraud. Given: who 'likes' what page, and when. Find: suspicious users and suspicious products. CopyCatch: Stopping Group Attacks by Spotting Lockstep Behavior in Social Networks. Alex Beutel, Wanhong Xu, Venkatesan Guruswami, Christopher Palow, Christos Faloutsos. WWW 2013.

57 Our intuition. Lockstep behavior: same Likes, same time. Graph patterns and lockstep behavior.

59 Our intuition. Lockstep behavior: same Likes, same time. Graph patterns and lockstep behavior: suspicious lockstep behavior.

60 MapReduce overview. Use Hadoop to search for many clusters in parallel: 1. Start with a random seed. 2. Update the set of Pages and the center Like times for each cluster. 3. Repeat until convergence.

61 Deployment at Facebook. CopyCatch runs regularly (along with many other security mechanisms, and a large Site Integrity team). [Figure: 3 months of CopyCatch @ Facebook; # users caught vs. time]

62 Deployment at Facebook. Manually labeled 22 randomly selected clusters from February 2013. Most clusters (77%) come from real but compromised users. [Figure label: Fake acct]

63 Roadmap: Introduction (Motivation); Part #1: Static graphs (Patterns; Tools: local; spectral (EigenSpokes, CopyCatch, fBox); label propagation); Part #2: Time-evolving graphs; Conclusions.

64 Problem: Social Network Link Fraud. Target: find "stealthy" attackers missed by other algorithms. [Figure: clique and bipartite core; graph with 41.7M nodes, 1.5B edges]

65 Problem: Social Network Link Fraud. Target: find "stealthy" attackers missed by other algorithms. Takeaway: use reconstruction error between true/latent representation! Neil Shah, Alex Beutel, Brian Gallagher and Christos Faloutsos. Spotting Suspicious Link Behavior with fBox: An Adversarial Perspective. ICDM 2014, Shenzhen, China.
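The reconstruction-error idea can be sketched with a truncated SVD. This is a simplified illustration under stated assumptions, not the published fBox algorithm: the function name `reconstructed_degree_ratio`, the choice k = 1, and the toy block matrix are all made up here. For each row it compares the squared norm of the row's projection onto the top-k singular directions with the row's true squared norm (which equals the out-degree for a 0/1 matrix), so the ratio lies between 0 and 1.

```python
import numpy as np

def reconstructed_degree_ratio(A, k):
    """Fraction of each row's squared norm captured by the top-k SVD components."""
    A = np.asarray(A, dtype=float)
    U, s, _Vt = np.linalg.svd(A, full_matrices=False)
    proj = U[:, :k] * s[:k]                 # rows of U_k * Sigma_k
    recon = (proj ** 2).sum(axis=1)         # reconstructed (latent) squared norm
    true = (A ** 2).sum(axis=1)             # true squared norm (= degree if binary)
    return recon / np.maximum(true, 1e-12)  # guard against empty rows

# Toy data: a large 10 x 10 block that dominates the spectrum, plus a small
# dense 3 x 3 block standing in for a "stealthy" group.
A = np.zeros((13, 13))
A[:10, :10] = 1.0
A[10:, 10:] = 1.0

ratio = reconstructed_degree_ratio(A, k=1)
suspicious = np.flatnonzero(ratio < 0.5)   # rows poorly explained by the top component
```

Rows whose links are almost invisible in the low-rank representation are the ones flagged; the 0.5 cut-off is an arbitrary choice for this toy example.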

66 Overview of tools (table as on slide 29). Covered so far: Triangle-degree, oddBall, eigenSpokes, fBox.

67 Roadmap: Introduction (Motivation); Part #1: Static graphs (Patterns; Tools: local; spectral (EigenSpokes, CopyCatch); label propagation (NetProbe, Polonium, SNARE)); Part #2: Time-evolving graphs; Conclusions.

68 E-bay fraud detection, w/ Polo Chau & Shashank Pandit, CMU [WWW'07]

69 E-bay fraud detection

71 E-bay fraud detection: NetProbe

72 Popular press

73 Roadmap: Introduction (Motivation); Part #1: Static graphs (Patterns; Tools: local; spectral (EigenSpokes, CopyCatch); label propagation (NetProbe, Polonium, SNARE)); Part #2: Time-evolving graphs; Conclusions.

74 Polonium: Tera-Scale Graph Mining and Inference for Malware Detection. Polo Chau (Machine Learning Dept), Carey Nachenberg (Vice President & Fellow), Jeffrey Wilhelm (Principal Software Engineer), Adam Wright (Software Engineer), Prof. Christos Faloutsos (Computer Science Dept). Patent pending. SDM 2011, Mesa, Arizona.

75 Polonium: the data. 60+ terabytes of data anonymously contributed by participants of the worldwide Norton Community Watch program: 50+ million machines; 900+ million executable files. Constructed a machine-file bipartite graph (0.2 TB+): 1 billion nodes (machines and files); 37 billion edges.

76 Polonium: key ideas. Use Belief Propagation to propagate domain knowledge in the machine-file graph to detect malware. Use "guilt-by-association" (i.e., homophily), e.g., files that appear on machines with many bad files are more likely to be bad. Scalability: handles a 37-billion-edge graph.

77 Polonium: One-Interaction Results. 84.9% true positive rate at 1% false positive rate. True positive rate: % of malware correctly identified. False positive rate: % of non-malware wrongly labeled as malware. [Figure: ROC curve, true positive rate vs. false positive rate; 'Ideal' marked]
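The two rates follow directly from their definitions. A small worked example, with made-up labels (1 = malware, 0 = non-malware) and a hypothetical helper `tpr_fpr`; it assumes both classes are present, otherwise a denominator would be zero.

```python
def tpr_fpr(y_true, y_pred):
    """True positive rate and false positive rate for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)          # malware, flagged
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)      # malware, missed
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)      # clean, flagged
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)  # clean, passed
    return tp / (tp + fn), fp / (fp + tn)

# Made-up example: 4 malware files and 6 clean files.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
tpr, fpr = tpr_fpr(y_true, y_pred)   # 3 of 4 malware caught, 1 of 6 clean files flagged
```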

78 Roadmap: Introduction (Motivation); Part #1: Static graphs (Patterns; Tools: local; spectral (EigenSpokes, CopyCatch); label propagation (NetProbe, Polonium, SNARE)); Part #2: Time-evolving graphs; Conclusions.

79 Network Effect Tools: SNARE. Some accounts are sort-of-suspicious: how to combine weak signals? [Figure: 'Before'; labels: Inventory, Acc. payable, Revenue, L.A.]

81 Network Effect Tools: SNARE. A: Belief Propagation. [Figure: 'Before']

82 Network Effect Tools: SNARE. A: Belief Propagation. [Figure: 'Before' and 'After'] Mary McGlohon, Stephen Bay, Markus G. Anderle, David M. Steier, Christos Faloutsos: SNARE: a link analytic system for graph labeling and risk detection. KDD 2009: 1265-1274.

83 Network Effect Tools: SNARE. Produces improvement over simply using flags: up to 6.5 lift; improvement especially for low false positive rate. [Figure: results for accounts data (ROC curve), true positive rate vs. false positive rate; curves: Ideal, SNARE, Baseline (flags only)]

84 Network Effect Tools: SNARE. Accurate: produces large improvement over simply using flags. Flexible: can be applied to other domains. Scalable: one iteration of BP runs in linear time (# edges). Robust: works on a large range of parameters.

85 Roadmap: Introduction (Motivation); Part #1: Static graphs (Patterns; Tools: local; spectral (EigenSpokes, CopyCatch); label propagation (NetProbe, ..., Unification)); Part #2: Time-evolving graphs; Conclusions.

86 Unifying Guilt-by-Association Approaches: Theorems and Fast Algorithms. Danai Koutra, U Kang, Hsing-Kuo Kenneth Pao, Tai-You Ke, Duen Horng (Polo) Chau, Christos Faloutsos. ECML PKDD, 5-9 September 2011, Athens, Greece.

87 Problem definition: GBA (Guilt By Association) techniques. Given: graph, and a few labeled nodes. Find: labels of the rest (assuming network effects).

88 Are they related? RWR (Random Walk with Restarts): Google's PageRank ('if my friends are important, I'm important, too'). SSL (Semi-supervised learning): minimize the differences among neighbors. BP (Belief propagation): send messages to neighbors, on what you believe about them.

89 Are they related? RWR, SSL, BP: YES!

90 Correspondence of methods (details)
Method | Matrix | Unknown | Known
RWR | $[I - c\,A D^{-1}]$ | $\times\ x$ | $= (1 - c)\,y$
SSL | $[I + a\,(D - A)]$ | $\times\ x$ | $= y$
FaBP | $[I + a\,D - c'\,A]$ | $\times\ b_h$ | $= \phi_h$
Here $A$ is the 0/1 adjacency matrix, $D$ is the diagonal matrix with entries $d_1, d_2, d_3, \ldots$, the unknowns $x$ and $b_h$ are the final labels/beliefs (the '?' entries), and the known vectors $y$ and $\phi_h$ are the prior labels/beliefs.
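Since all three methods reduce to a linear system in the same graph matrices, they can be solved with the same few lines. The sketch below is an illustration under assumptions: it uses a dense solver on a toy 5-node path graph, the parameter values (c = 0.85, a = 0.5 for SSL, a = 0.04 and c' = 0.2 for FaBP) are arbitrary choices made here, and the function names are hypothetical.

```python
import numpy as np

def rwr(A, y, c):
    """Solve [I - c A D^-1] x = (1 - c) y."""
    n = len(A)
    D_inv = np.diag(1.0 / A.sum(axis=0))
    return np.linalg.solve(np.eye(n) - c * A @ D_inv, (1 - c) * y)

def ssl(A, y, a):
    """Solve [I + a (D - A)] x = y."""
    n = len(A)
    D = np.diag(A.sum(axis=1))
    return np.linalg.solve(np.eye(n) + a * (D - A), y)

def fabp(A, phi, a, c_prime):
    """Solve [I + a D - c' A] b = phi."""
    n = len(A)
    D = np.diag(A.sum(axis=1))
    return np.linalg.solve(np.eye(n) + a * D - c_prime * A, phi)

# Toy graph: path 0 - 1 - 2 - 3 - 4.
A = np.zeros((5, 5))
for i in range(4):
    A[i, i + 1] = A[i + 1, i] = 1.0

# RWR restarting from node 0; SSL and FaBP with node 0 labeled +1, node 4 labeled -1.
x_rwr = rwr(A, np.array([1.0, 0, 0, 0, 0]), c=0.85)
prior = np.array([1.0, 0, 0, 0, -1.0])
x_ssl = ssl(A, prior, a=0.5)
b_fabp = fabp(A, prior, a=0.04, c_prime=0.2)
```

On this symmetric example both SSL and FaBP give the middle node a score of zero and push nodes 1 and 3 toward the label of their nearer labeled neighbor, which is the guilt-by-association behavior the slides describe. For large graphs a sparse or iterative solver would replace `np.linalg.solve`.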

91 Results: scalability (details). FaBP is linear in the number of edges. [Figure: runtime (min) vs. # of edges]

92 Overview of tools (table as on slide 29). Covered so far: Triangle-degree, oddBall, eigenSpokes, fBox, netProbe+.

93 Summary of Part #1. *Many* patterns in real graphs: power laws everywhere. Long (and growing) list of tools for anomaly/fraud detection: fBox, NetProbe, oddBall, ...

94 Roadmap: Introduction (Motivation); Part #1: Static graphs (Patterns; Tools: local; spectral (EigenSpokes, CopyCatch); label propagation (NetProbe, Polonium, SNARE)); Part #2: Time-evolving graphs; Conclusions.

95 Part 2: Time-evolving graphs

96 Graphs over time -> tensors! Problem #2: given who calls whom, and when, find patterns / anomalies. [Figure: who-calls-whom matrix, e.g. smith, johnson]

98 Graphs over time -> tensors! Problem #2: given who calls whom, and when, find patterns / anomalies. [Figure: one matrix per day: Mon, Tue, ...]

99 Graphs over time -> tensors! Problem #2: given who calls whom, and when, find patterns / anomalies. [Figure: tensor with modes caller, callee, time]

100 Graphs over time -> tensors! Problem #2': given customer-account-date, find patterns / anomalies. MANY more settings, with >2 'modes'. [Figure: tensor with modes customer, account, date]

101 Graphs over time -> tensors! Problem #2'': given author-keyword-date, find patterns / anomalies. MANY more settings, with >2 'modes'. [Figure: tensor with modes author, keyword, date]

102 Graphs over time -> tensors! Problem #2''': given subject-verb-object facts, find patterns / anomalies. MANY more settings, with >2 'modes'. [Figure: tensor with modes subject, object, verb]

103 Graphs over time -> tensors! Problem #2'''': given [left blank on the slide; axes labeled mode1, mode2, mode3], find patterns / anomalies. MANY more settings, with >2 'modes' (and 4, 5, etc. modes).

104 Answer to all: tensor factorization. Recall: (SVD) matrix factorization finds blocks. [Figure: N users x M products matrix ~ 'meat-eaters' x 'steaks' + 'vegetarians' x 'plants' + 'kids' x 'cookies']
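The claim that SVD finds blocks can be checked on a tiny user-product matrix. This is a toy sketch with made-up block sizes: three all-ones blocks standing in for 'meat-eaters' x 'steaks', 'vegetarians' x 'plants', and 'kids' x 'cookies'.

```python
import numpy as np

# 9 users x 7 products, with three disjoint blocks of ones.
M = np.zeros((9, 7))
M[0:4, 0:3] = 1.0   # 'meat-eaters' x 'steaks'
M[4:7, 3:5] = 1.0   # 'vegetarians' x 'plants'
M[7:9, 5:7] = 1.0   # 'kids' x 'cookies'

U, s, Vt = np.linalg.svd(M, full_matrices=False)
rank = int((s > 1e-9).sum())               # number of non-negligible components

# Sum of the three rank-1 components (one per block).
M3 = (U[:, :3] * s[:3]) @ Vt[:3]

# Users with a non-negligible score on the first component: the largest block.
first_block_users = np.flatnonzero(np.abs(U[:, 0]) > 1e-9)
```

Each of the three components lights up exactly one group of users and one group of products, and an all-ones block of size m x n contributes the singular value sqrt(m n).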

105 Answer to all: tensor factorization. PARAFAC decomposition. [Figure: subject x object x verb tensor = sum of components: politicians + artists + athletes]

106 Answer: tensor factorization. PARAFAC decomposition. Results for who-calls-whom-when: 4M x 15 days. [Figure: caller x callee x time tensor = sum of components, marked '??']
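A PARAFAC (CP) decomposition of a small caller x callee x time tensor can be sketched with alternating least squares. This is a minimal illustration, not the code behind the reported results: `cp_als` is a hypothetical helper written here, the tensor is tiny and dense, and the planted block (one caller, five receivers, four days, 200 calls each) is synthetic data chosen for the example.

```python
import numpy as np

def cp_als(X, rank, n_iter=50, seed=0):
    """Rank-`rank` CP/PARAFAC factors (A, B, C) of a 3-mode tensor via ALS."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A = rng.random((I, rank))
    B = rng.random((J, rank))
    C = rng.random((K, rank))
    for _ in range(n_iter):
        # Each update is the least-squares solution for one factor, others fixed.
        A = np.einsum('ijk,jr,kr->ir', X, B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = np.einsum('ijk,ir,kr->jr', X, A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = np.einsum('ijk,ir,jr->kr', X, A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C

def active(v):
    """Indices with a non-negligible entry in a factor column."""
    v = np.abs(v)
    return np.flatnonzero(v > 1e-6 * v.max()).tolist()

# Synthetic who-calls-whom-when tensor: 20 callers x 20 callees x 15 days.
X = np.zeros((20, 20, 15))
for callee in [5, 6, 7, 8, 9]:
    for day in [2, 3, 4, 5]:
        X[3, callee, day] = 200.0    # caller 3 calls each receiver 200 times a day

A, B, C = cp_als(X, rank=1)
X_hat = np.einsum('ir,jr,kr->ijk', A, B, C)
rel_err = np.linalg.norm(X - X_hat) / np.linalg.norm(X)

callers, callees, days = active(A[:, 0]), active(B[:, 0]), active(C[:, 0])
```

The non-zero entries of the three factor columns name the caller, the receivers, and the days of the planted block, which is how a component of the decomposition is read as a community. With more components or noisy data, ALS can stall in local optima, so results may depend on the initialization.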

107 Anomaly detection in time-evolving graphs. Anomalous communities in phone call data: European country, 4M clients, data over 2 weeks. 1 caller, 5 receivers, 4 days of activity: ~200 calls to EACH receiver on EACH day!

109 Anomaly detection in time-evolving graphs. Anomalous communities in phone call data: European country, 4M clients, data over 2 weeks. ~200 calls to EACH receiver on EACH day! Miguel Araujo, Spiros Papadimitriou, Stephan Günnemann, Christos Faloutsos, Prithwish Basu, Ananthram Swami, Evangelos Papalexakis, Danai Koutra. Com2: Fast Automatic Discovery of Temporal (Comet) Communities. PAKDD 2014, Tainan, Taiwan.

110 Roadmap: Introduction (Motivation: why study (big) graphs?); Part #1: Patterns in graphs; Part #2: Time-evolving graphs, tensors; Resources and Conclusions.

111 Project info: PEGASUS. www.cs.cmu.edu/~pegasus. Results on large graphs: with Pegasus + Hadoop + M45. Apache license. Code, papers, manual, video. Prof. U Kang, Prof. Polo Chau.

112 Cast: Akoglu, Leman; Chau, Polo; Kang, U; Prakash, Aditya; Koutra, Danai; Beutel, Alex; Papalexakis, Vagelis; Shah, Neil; Lee, Jay Yoon; Araujo, Miguel.

113 CONCLUSION #0: Patterns, Anomalies

114 CONCLUSION #1: Many, surprising patterns in real graphs: rank-degree, degree-#triangles, degree-weight, ...

115 CONCLUSION #2 (tools): Many, powerful tools: fBox, oddBall, tensors, ...

116 Overview of tools (table as on slide 29).

117 References. D. Chakrabarti, C. Faloutsos: Graph Mining: Laws, Tools and Case Studies. Morgan Claypool, 2012. http://www.morganclaypool.com/doi/abs/10.2200/S00449ED1V01Y201209DMK006

118 Many, powerful tools: fBox, oddBall, tensors, ... Thank you! christos@cs.cmu.edu www.cs.cmu.edu/~christos

