CMU SCS Big (graph) data analytics Christos Faloutsos CMU.

Presentation transcript:

CMU SCS Big (graph) data analytics Christos Faloutsos CMU

Outline (CMU visit '14, C. Faloutsos): Problem definition / Motivation; Anomaly detection; Time series analysis; Conclusions

Motivation. Data mining: ~ find patterns (rules, outliers). What do real graphs look like? Anomalies? Time series / Monitoring: PA, NY, …

Graphs - why should we care? Internet Map [lumeta.com]; Food Web [Martinez ’91]; ~1B users; $10-$100B revenue

Outline: Problem definition / Motivation; Anomaly/fraud detection (financial fraud, eBay fraud); Time series analysis; Conclusions

Network Effect Tools: SNARE
Some accounts are sort-of-suspicious. How to combine weak signals?
A: Belief Propagation. [Figure: account network, before and after propagation]
Mary McGlohon, Stephen Bay, Markus G. Anderle, David M. Steier, Christos Faloutsos: SNARE: a link analytic system for graph labeling and risk detection. KDD 2009.

Network Effect Tools: SNARE. Produces improvement over simply using flags: up to 6.5 lift; improvement especially for low false positive rate. [Figure: Results for accounts data (ROC curve); axes: false positive rate, true positive rate; curves: Ideal, SNARE, Baseline (flags only)]
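To make the ROC comparison concrete, here is a minimal sketch of how such a curve can be computed from risk scores and known labels. The function names (`roc_points`, `tpr_at`) and the toy data are hypothetical, chosen only for illustration; they are not the accounts data from the talk, and the sketch assumes at least one positive and one negative label.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs, one per threshold.

    scores: risk score per account (higher = more suspicious).
    labels: 1 for a known-bad account, 0 otherwise.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    # Walk the accounts from most to least suspicious.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last item in a run of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / negatives, tp / positives))
    return points


def tpr_at(points, max_fpr):
    """Best true positive rate reachable without exceeding max_fpr."""
    return max(tpr for fpr, tpr in points if fpr <= max_fpr)


# Hypothetical toy data: 3 known-bad accounts out of 8.
labels = [1, 1, 0, 1, 0, 0, 0, 0]
flags_only = [1, 0, 1, 0, 0, 0, 1, 0]                    # binary flags used as scores
propagated = [0.9, 0.8, 0.3, 0.7, 0.2, 0.1, 0.4, 0.05]   # made-up propagated scores

for name, scores in (("flags only", flags_only), ("propagated", propagated)):
    print(name, tpr_at(roc_points(scores, labels), max_fpr=0.2))
```

Comparing `tpr_at` for two scorings at a small `max_fpr` is one way to look at the low-false-positive region that the slide highlights.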

Network Effect Tools: SNARE. Accurate: produces large improvement over simply using flags. Flexible: can be applied to other domains. Scalable: one iteration of BP runs in time linear in the number of edges. Robust: works over a large range of parameters.
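As a rough illustration of how belief propagation can combine weak flags over a network, the sketch below runs loopy BP with two states on a small undirected graph. It is not the SNARE implementation: the function name, the `homophily` parameter, and the toy graph are assumptions made for this example. This naive version also recomputes the product of incoming messages for every edge, so it does not achieve the linear per-iteration cost mentioned on the slide; caching that product per node is the usual way to get there.

```python
from collections import defaultdict


def belief_propagation(edges, priors, homophily=0.9, iterations=10):
    """Loopy BP for two states (0 = ok, 1 = risky) on an undirected graph.

    edges: list of (u, v) pairs.
    priors: dict node -> prior P(risky), e.g. derived from flags;
            nodes without an entry get 0.5.
    homophily: assumed probability that neighbours share a state.
    """
    # Edge potential psi[a][b]: compatibility of neighbouring states a and b.
    psi = [[homophily, 1 - homophily], [1 - homophily, homophily]]
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    phi = {n: (1 - priors.get(n, 0.5), priors.get(n, 0.5)) for n in neighbours}
    # msg[(u, v)] is the message from u to v, one value per state of v.
    msg = {(u, v): (0.5, 0.5) for u in neighbours for v in neighbours[u]}

    for _ in range(iterations):
        new_msg = {}
        for (u, v) in msg:
            out = []
            for b in (0, 1):
                total = 0.0
                for a in (0, 1):
                    incoming = 1.0
                    for w in neighbours[u]:
                        if w != v:  # exclude the recipient's own message
                            incoming *= msg[(w, u)][a]
                    total += phi[u][a] * psi[a][b] * incoming
                out.append(total)
            z = out[0] + out[1]
            new_msg[(u, v)] = (out[0] / z, out[1] / z)  # normalise
        msg = new_msg

    beliefs = {}
    for n in neighbours:
        b0, b1 = phi[n]
        for w in neighbours[n]:
            b0 *= msg[(w, n)][0]
            b1 *= msg[(w, n)][1]
        beliefs[n] = b1 / (b0 + b1)  # posterior P(risky)
    return beliefs


# Hypothetical chain a - b - c - d where only "a" carries a flag.
beliefs = belief_propagation([("a", "b"), ("b", "c"), ("c", "d")], {"a": 0.9})
for node in sorted(beliefs):
    print(node, round(beliefs[node], 3))
```

On this chain the suspicion attached to the flagged node decays with distance, which is the "network effect" the slides refer to.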

Outline: Problem definition / Motivation; Anomaly/fraud detection (financial fraud, eBay fraud); Time series analysis; Conclusions

eBay fraud detection. Detects ‘non-delivery’ fraud: the seller takes the money and disappears. Shashank Pandit, Duen Horng Chau, Samuel Wang, and Christos Faloutsos: NetProbe: A Fast and Scalable System for Fraud Detection in Online Auction Networks. WWW ’07.

eBay fraud detection - NetProbe

App-store fraud. Leman Akoglu, Rishi Chandy, Christos Faloutsos: Opinion Fraud Detection in Online Reviews using Network Effects. ICWSM ’13.

Problem. Given: a user-product review network and the review sign (+/-). Classify objects into type-specific classes: users as ‘honest’ / ‘fraudster’; products as ‘good’ / ‘bad’; reviews as ‘genuine’ / ‘fake’. No side data (e.g., timestamp, review text)!

Formulation: BP. [Figure: user-product review network before and after propagation; table with columns User / Product and rows honest / bad, honest / good; edge signs – and +]
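One way to read this formulation is that the compatibility between a user's class and a product's class depends on the sign of the review linking them. The sketch below computes a single user-to-product message under that reading. The table values, the noise level `EPS`, and the function name are hypothetical: the slide only shows the honest-user rows, and the fraudster rows here are an assumption made for illustration.

```python
EPS = 0.1  # hypothetical noise level, not taken from the slides

# COMPAT[sign][user_state][product_state]
# user states: 0 = honest, 1 = fraudster; product states: 0 = good, 1 = bad.
# Honest rows follow the slide (a '+' from an honest user suggests 'good',
# a '-' suggests 'bad'); the fraudster rows are assumed to be the reverse.
COMPAT = {
    "+": [[1 - EPS, EPS], [EPS, 1 - EPS]],
    "-": [[EPS, 1 - EPS], [1 - EPS, EPS]],
}


def message_to_product(user_belief, sign):
    """Normalised message (good, bad) a user sends to a product it reviewed.

    user_belief: [P(honest), P(fraudster)] for the user.
    sign: '+' or '-' for the review.
    """
    table = COMPAT[sign]
    raw = [sum(user_belief[u] * table[u][p] for u in (0, 1)) for p in (0, 1)]
    total = raw[0] + raw[1]
    return [raw[0] / total, raw[1] / total]


# A mostly honest user: a positive review supports 'good', a negative one 'bad'.
print(message_to_product([0.9, 0.1], "+"))
print(message_to_product([0.9, 0.1], "-"))
# A likely fraudster: under the assumed table, a positive review hints at 'bad'.
print(message_to_product([0.2, 0.8], "+"))
```

A full run would iterate such messages in both directions (users to products and back) until the beliefs settle, then read off user, product, and review classes.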

Top scorers. [Figure: users vs. products; +: positive (4-5) rating; o: negative (1-2) rating]

‘Fraud-bot’ member reviews: Same developer! Duplicated text! Same day activity!

Outline: Problem definition / Motivation; Anomaly/fraud detection; Time series, monitoring / forecasting; Conclusions

‘Tycho’ - epidemics analysis (Prof. Yasuko Matsubara). 50 states x 46 diseases

‘Tycho’ - epidemics analysis (Prof. Yasuko Matsubara). Flu? Measles? August? No periodicity?

‘Tycho’ - epidemics analysis. Prof. Yasuko Matsubara; from U. Pitt (epidemiology dept.). Yasuko Matsubara, Yasushi Sakurai, Willem van Panhuis, and Christos Faloutsos: FUNNEL: Automatic Mining of Spatially Coevolving Epidemics. KDD 2014, New York City, NY, USA, August 2014.

Open research questions: Patterns/anomalies for time-evolving graphs (call graph, 3M people x 6 months); Spot fraudsters in social networks (e.g., Twitter ‘$10 -> 1000 followers’)

Contact info: GHC 8019; Ph#: x8.1457