Published by Bethany Gibbs. Modified over 9 years ago.
1
CMU SCS: Big (graph) data analytics. Christos Faloutsos, CMU
2
CMU SCS IC '14, C. Faloutsos. CONGRATULATIONS!
3
Outline: Q+A; Problem definition / Motivation; Graphs, tensors and brains; Anomaly detection; Conclusions
4
Q+A: Are you recruiting? How many? How many do you have? How frequently do you meet them? What is your advising style? How do you feel about summer internships?
5
Q+A (with answers)
Are you recruiting? How many? 1 or 2
How many do you have? 6 (+5 postdocs)
How frequently do you meet them? 1/week
What is your advising style? results
How do you feel about summer internships? Yes/Maybe (FB, MSR, IBM, ++)
6
Outline: Q+A; Problem definition / Motivation; Graphs, tensors and brains; Anomaly detection; Conclusions
7
Motivation. Data mining: ~ find patterns (rules, outliers). What do real graphs look like? Anomalies? Time series / Monitoring: Measles @ PA, NY, …
8
Graphs - why should we care?
9
Graphs - why should we care? Internet Map [lumeta.com]; Food Web [Martinez ’91]; ~1B users, $10-$100B revenue
10
Outline: Q+A; Problem definition / Motivation; Graphs, tensors and brains; Anomaly detection; Conclusions
11
NELL & concepts (= groups). Predicates (subject, verb, object) in knowledge base, e.g. “Barack Obama is the president of U.S.”, “Eric Clapton plays guitar”. NELL (Never Ending Language Learner) data: (26M) (48M), nonzeros = 144M. Tom Mitchell (CMU/CS-MLD), Vagelis Papalexakis (CMU-CS)
12
Answer: tensor factorization. Recall: (SVD) matrix factorization finds blocks. N users x M products ~ (‘meat-eaters’ x ‘steaks’) + (‘vegetarians’ x ‘plants’) + (‘kids’ x ‘cookies’)
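The "SVD finds blocks" idea can be illustrated with a minimal sketch. The toy user-by-product matrix below is a hypothetical example, not data from the talk: it contains two disjoint blocks of different sizes, and the nonzero entries of each leading singular-vector pair recover one block. Real data is noisy, so the exact-zero thresholding used here is an assumption that only suits this clean toy case.

```python
import numpy as np

# Hypothetical toy data: rows are users, columns are products.
# Users 0-2 buy products 0-1 (think 'meat-eaters' x 'steaks'),
# users 3-4 buy products 2-3 (think 'vegetarians' x 'plants').
A = np.zeros((5, 4))
A[0:3, 0:2] = 1.0
A[3:5, 2:4] = 1.0

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Number of non-negligible singular values = number of blocks here.
rank = int(np.sum(s > 1e-9))

# For each component, the non-negligible entries of the left and right
# singular vectors give the users and products of one block.
blocks = []
for k in range(rank):
    users = np.flatnonzero(np.abs(U[:, k]) > 1e-6).tolist()
    products = np.flatnonzero(np.abs(Vt[k, :]) > 1e-6).tolist()
    blocks.append((users, products))
    print(f"block {k}: users {users}, products {products}")
```

The two blocks were given different sizes on purpose: with equal singular values the singular vectors are not unique and could mix the blocks.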
13
Answer: tensor factorization. PARAFAC decomposition: (subject x object x verb) tensor = ‘politicians’ + ‘artists’ + ‘athletes’
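For reference, the standard form of the PARAFAC (CP) decomposition pictured on this slide can be written as below. The symbols are notation assumed here, not taken from the slide: X is the subject x object x verb tensor, R is the number of components (concepts), and a_r, b_r, c_r are the subject, object and verb factor vectors of component r.

```latex
\mathcal{X} \approx \sum_{r=1}^{R} \mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r,
\qquad
x_{ijk} \approx \sum_{r=1}^{R} a_{ir}\, b_{jr}\, c_{kr}
```

Each rank-one term plays the role of one "block" in the matrix case: a group of subjects, a group of objects and a group of verbs that co-occur.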
14
Answer: tensor factorization. PARAFAC decomposition: results for who-calls-whom-when, 4M x 15 days. Tensor modes: caller, callee, time. Components: ??
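A minimal sketch of how such a decomposition can be computed, using plain alternating least squares (ALS) in NumPy. This is a textbook dense ALS, not the scalable method used for the 4M-caller data in the talk; the function names, the toy caller x callee x time tensor and the iteration count are all assumptions made for illustration.

```python
import numpy as np

def khatri_rao(B, C):
    """Column-wise Kronecker product; row j*K + k holds B[j, :] * C[k, :]."""
    J, R = B.shape
    K, _ = C.shape
    return (B[:, None, :] * C[None, :, :]).reshape(J * K, R)

def reconstruct(A, B, C):
    """Sum of rank-one terms a_r o b_r o c_r."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)

def cp_als(X, rank, n_iter=200, seed=0):
    """Fit a rank-`rank` PARAFAC model to a dense 3-mode tensor by ALS."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A = rng.standard_normal((I, rank))
    B = rng.standard_normal((J, rank))
    C = rng.standard_normal((K, rank))
    # Unfoldings whose column order matches khatri_rao above.
    X0 = X.reshape(I, J * K)
    X1 = np.transpose(X, (1, 0, 2)).reshape(J, I * K)
    X2 = np.transpose(X, (2, 0, 1)).reshape(K, I * J)
    for _ in range(n_iter):
        # Each step is a linear least-squares solve with the other two fixed.
        A = X0 @ khatri_rao(B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = X1 @ khatri_rao(A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = X2 @ khatri_rao(A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C

# Hypothetical toy tensor: two calling groups active on different days.
X = np.zeros((6, 5, 4))
X[0:3, 0:2, 0:2] = 1.0   # callers 0-2 call callees 0-1 on days 0-1
X[3:6, 2:5, 2:4] = 1.0   # callers 3-5 call callees 2-4 on days 2-3

A, B, C = cp_als(X, rank=2)
rel_err = np.linalg.norm(X - reconstruct(A, B, C)) / np.linalg.norm(X)
print(f"relative reconstruction error: {rel_err:.2e}")
```

The toy tensor has exactly two components, so a converged run is expected to fit it closely. ALS does not guarantee convergence from every starting point, which is one reason production code typically adds restarts and a stopping criterion.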
15
Concept Discovery in Knowledge Base
16
Concept Discovery in Knowledge Base. NP1: Internet, file, data. NP2: Protocol, software, suite
17
Neuro-semantics. Brain Scan Data*: 9 persons, 60 nouns. Questions: 218 questions (‘is it alive?’, ‘can you eat it?’). *Mitchell et al., Predicting human brain activity associated with the meanings of nouns, Science, 2008. Data @ www.cs.cmu.edu/afs/cs/project/theo-73/www/science2008/data.html
18
Neuro-semantics. Brain Scan Data*: 9 persons, 60 nouns. Questions: 218 questions (‘is it alive?’, ‘can you eat it?’). Patterns?
19
Neuro-semantics. Brain Scan Data*: 9 persons, 60 nouns (airplane, dog, …). Questions: 218 questions (‘is it alive?’, ‘can you eat it?’, …). Modes: persons, nouns, questions, voxels. Patterns?
20
Neuro-semantics
21
Neuro-semantics. Small items -> Premotor cortex
22
Neuro-semantics. Small items -> Premotor cortex. Evangelos Papalexakis, Tom Mitchell, Nicholas Sidiropoulos, Christos Faloutsos, Partha Pratim Talukdar, Brian Murphy, Turbo-SMT: Accelerating Coupled Sparse Matrix-Tensor Factorizations by 200x, SDM 2014
23
Scalability. Google: > 450,000 processors in clusters of ~2000 processors each [Barroso+, “Web Search for a Planet: The Google Cluster Architecture”, IEEE Micro 2003]. Yahoo: 5 PB of data [Fayyad, KDD’07]. Google-NY, Aug’14: ‘graph with 1T edges, 300B nodes’. Problem: machine failures, on a daily basis. How to parallelize data mining tasks, then? A: map/reduce, with hadoop as the open-source clone: http://hadoop.apache.org/
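To make the map/reduce answer concrete, here is a minimal single-machine sketch of the programming model applied to a simple graph task, counting node degrees from an edge list. It does not use the Hadoop API; the function names and the toy edge list are assumptions for illustration. On a real cluster the map and reduce calls would run on different machines, and the framework would handle the shuffle step and re-run tasks lost to machine failures.

```python
from collections import defaultdict

def map_edge(edge):
    """Map step: emit (node, 1) for both endpoints of an edge."""
    src, dst = edge
    yield (src, 1)
    yield (dst, 1)

def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_degree(node, counts):
    """Reduce step: sum the counts for one node."""
    return node, sum(counts)

def degree_count(edges):
    mapped = (pair for edge in edges for pair in map_edge(edge))
    return dict(reduce_degree(node, counts)
                for node, counts in shuffle(mapped).items())

# Hypothetical toy graph.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
degrees = degree_count(edges)
print(degrees)
```

Because each map call looks at one edge and each reduce call at one node, the work splits cleanly across machines, which is what makes the model tolerant of individual failures.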
24
Outline: Q+A; Problem definition / Motivation; Graphs, tensors and brains; Anomaly/fraud detection; Conclusions
25
App-store fraud. Opinion Fraud Detection in Online Reviews using Network Effects, Leman Akoglu, Rishi Chandy, Christos Faloutsos, ICWSM’13. (NSF grant, with Alex Beutel)
26
Problem. Given: user-product review network; review sign (+/-). Classify objects into type-specific classes: users: ‘honest’ / ‘fraudster’; products: ‘good’ / ‘bad’; reviews: ‘genuine’ / ‘fake’. No side data! (e.g., timestamp, review text)
27
Formulation: BP (belief propagation). Figure labels: User, Product; honest/bad, honest/good; review signs - and +; Before, After
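The BP formulation can be sketched as loopy belief propagation on the signed user-product graph. The sketch below is a simplified illustration, not the authors' implementation: the compatibility values in `PSI`, the parameter `EPS`, the uniform priors, the fixed iteration count and the toy review list are all assumptions. The idea encoded in `PSI` is that an honest user tends to give + to good products and - to bad ones, while a fraudster tends to do the opposite.

```python
import numpy as np

EPS = 0.1  # assumed noise level, chosen for illustration only

# PSI[sign][user_state, product_state]
# user_state: 0 = honest, 1 = fraudster; product_state: 0 = good, 1 = bad
PSI = {
    "+": np.array([[1 - EPS, EPS], [2 * EPS, 1 - 2 * EPS]]),
    "-": np.array([[EPS, 1 - EPS], [1 - 2 * EPS, 2 * EPS]]),
}

def normalize(v):
    return v / v.sum()

def run_bp(reviews, n_iter=30):
    """reviews: list of (user, product, sign), one review per user-product pair."""
    to_prod = {(u, p): np.full(2, 0.5) for u, p, _ in reviews}
    to_user = {(p, u): np.full(2, 0.5) for u, p, _ in reviews}
    for _ in range(n_iter):
        new_to_prod, new_to_user = {}, {}
        for u, p, sign in reviews:
            # User -> product: combine what u hears from its other products.
            cavity_u = np.ones(2)
            for (q, v), msg in to_user.items():
                if v == u and q != p:
                    cavity_u = cavity_u * msg
            new_to_prod[(u, p)] = normalize(PSI[sign].T @ cavity_u)
            # Product -> user: combine what p hears from its other users.
            cavity_p = np.ones(2)
            for (v, q), msg in to_prod.items():
                if q == p and v != u:
                    cavity_p = cavity_p * msg
            new_to_user[(p, u)] = normalize(PSI[sign] @ cavity_p)
        to_prod, to_user = new_to_prod, new_to_user
    # Beliefs: product of all incoming messages (uniform priors drop out).
    user_belief, product_belief = {}, {}
    for (p, u), msg in to_user.items():
        user_belief[u] = user_belief.get(u, np.ones(2)) * msg
    for (u, p), msg in to_prod.items():
        product_belief[p] = product_belief.get(p, np.ones(2)) * msg
    return ({u: normalize(b) for u, b in user_belief.items()},
            {p: normalize(b) for p, b in product_belief.items()})

# Hypothetical toy network: three users agree, one user contradicts them.
reviews = [
    ("h1", "G", "+"), ("h1", "B", "-"),
    ("h2", "G", "+"), ("h2", "B", "-"),
    ("h3", "G", "+"), ("h3", "B", "-"),
    ("f", "G", "-"), ("f", "B", "+"),
]
user_belief, product_belief = run_bp(reviews)
for user, belief in sorted(user_belief.items()):
    print(user, "P(honest) =", round(float(belief[0]), 3))
```

Only the graph structure and the review signs are used, matching the "no side data" setting of the problem statement. The review classes (genuine / fake) are not computed in this sketch.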
28
Top scorers. Users x Products plot; legend: + positive (4-5) rating, o negative (1-2) rating
30
‘Fraud-bot’ member reviews: Same developer! Duplicated text! Same day activity!
31
Outline: Q+A; Problem definition / Motivation; Graphs, tensors and brains; Anomaly/fraud detection; Time series, monitoring / forecasting; Conclusions
32
‘Tycho’: epidemics analysis. Yasuko Matsubara. 50 states x 46 diseases
33
‘Tycho’: epidemics analysis. Prof. Yasuko Matsubara
34
‘Tycho’: epidemics analysis. Prof. Yasuko Matsubara. Flu? Measles? August? No periodicity?
39
‘Tycho’: epidemics analysis. Prof. Yasuko Matsubara. Data: https://www.tycho.pitt.edu/resources.php from U. Pitt (epidemiology dept.). Yasuko Matsubara, Yasushi Sakurai, Willem van Panhuis, and Christos Faloutsos, FUNNEL: Automatic Mining of Spatially Coevolving Epidemics, KDD 2014, New York City, NY, USA, Aug. 24-27, 2014.
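The periodicity question raised on the earlier Tycho slides can be illustrated with a small spectral check. This is a generic FFT-based sketch, not the FUNNEL method from the paper; the function name and the synthetic weekly series (a clean yearly cycle, not real Tycho data) are assumptions for illustration.

```python
import numpy as np

def dominant_period(series):
    """Return the period (in samples) of the strongest nonzero frequency."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()                      # remove the constant level
    power = np.abs(np.fft.rfft(x)) ** 2   # power spectrum
    freqs = np.fft.rfftfreq(len(x), d=1.0)
    k = 1 + int(np.argmax(power[1:]))     # skip the zero-frequency bin
    return 1.0 / freqs[k]

# Synthetic weekly case counts with a yearly (52-week) cycle over 10 years.
weeks = np.arange(52 * 10)
cases = 100 + 80 * np.sin(2 * np.pi * weeks / 52)

period = dominant_period(cases)
print(f"dominant period: {period:.1f} weeks")
```

A disease with no seasonality would show no sharp peak in the spectrum, so in practice the peak height relative to the rest of the spectrum matters as much as its location.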
40
Open research questions: Patterns/anomalies for time-evolving graphs (call graph, 3M people x 6 mo); Spot fraudsters in soc-net (e.g., Twitter ‘$10 -> 1000 followers’); How is the human brain wired?
41
Contact info: www.cs.cmu.edu/~christos; GHC 8019; Ph#: x8.1457; www.cs.cmu.edu/~christos/TALKS/14-09-ic/. FYI: Course: 15-826, Tu-Th 3:00-4:20. And, again, WELCOME!