1
Non-Negative Residual Matrix Factorization with Application to Graph Anomaly Detection
Hanghang Tong and Ching-Yung Lin, IBM Research
SIAM-DM 2011, Mesa AZ, USA, April 28-30, 2011
© 2011 IBM Corporation
2
Large Graphs are Everywhere!
Examples: Internet Map [Koren 2009], Food Web [2007], Protein Network [Salthe 2004], Social Network [Newman 2005], Web Graph, Terrorist Network [Krebs 2002]
Q: How to find patterns (e.g., community, anomaly, etc.)?
3
A Typical Procedure: Matrix Tool for Finding Graph Patterns
Graph -> Adj. Matrix A
A = F x G + R, where F and G are low-rank matrices and R is the residual matrix
4
A Typical Procedure: Matrix Tool for Finding Graph Patterns (An Illustrative Example)
Graph -> Adj. Matrix A
A = F x G + R
- Low-rank matrices F x G: community
- Residual matrix R: anomalies
5
A Typical Procedure: An Example
Improve Interpretation by Non-negativity
Graph -> Adjacency Matrix A
A = F x G + R (F x G: community; R: anomalies)
- Non-negative Matrix Factorization: F >= 0; G >= 0 (for community detection)
- Non-negative Residual Matrix Factorization: R(i,j) >= 0 for A(i,j) > 0 (for anomaly detection) [This Paper]
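The residual constraint on this slide can be checked mechanically. The following is a minimal sketch, not the authors' code: it stores the graph as a dictionary of observed edges and tests whether R(i,j) = A(i,j) - F(i,:) G(:,j) stays non-negative on every edge. The function names, the dictionary layout, and the toy numbers are all assumptions made for illustration.

```python
def residual_on_edges(edges, F, G):
    """Return R(i,j) = A(i,j) - F(i,:) . G(:,j) for every observed edge.

    edges: dict mapping (i, j) -> A(i,j) > 0   (hypothetical sparse layout)
    F: list of n rows, each of length r; G: list of r rows, each of length m.
    """
    rank = len(G)
    return {
        (i, j): a - sum(F[i][k] * G[k][j] for k in range(rank))
        for (i, j), a in edges.items()
    }


def satisfies_nrmf_constraint(edges, F, G, tol=1e-12):
    """True if the residual is non-negative wherever A(i,j) > 0."""
    return all(r >= -tol for r in residual_on_edges(edges, F, G).values())


# Toy 2 x 2 weighted graph and a rank-1 factorization (made-up values).
edges = {(0, 0): 2.0, (0, 1): 2.0, (1, 0): 1.0, (1, 1): 3.0}
F_ok = [[1.0], [0.5]]
F_bad = [[1.0], [1.0]]   # over-estimates edge (1, 0), so R(1,0) < 0
G = [[2.0, 2.0]]

R = residual_on_edges(edges, F_ok, G)
```

With `F_ok` the only non-zero residual sits on edge (1, 1), which is the kind of entry the slide labels an anomaly; `F_bad` violates the constraint because it predicts more weight on edge (1, 0) than the graph contains.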
6
Anomaly Detection on Graphs
- Social Networks: 'popularity contest'
- Computer Networks: spammer, port scanner, vulnerable machines, etc.
- Financial Transaction Networks: fraudulent transaction (e.g., money-laundering ring), scammer
- Criminal Networks: new criminal trend
- Tele-communication Networks: tele-marketer
Key Observation: Abnormal Behavior -> Actual Activities
7
Challenges and Core Ideas
- Challenge 1: Lack of 'ground truth'
- Core Idea 1: Use the residual graph to improve the usability of anomaly detection results (which, in turn, is achieved by non-negative residual matrix factorization methods)
- Challenge 2: Large data
- Core Idea 2: A carefully designed method that scales linearly with respect to the size of the graph
8
Optimization Formulation: General Case
[Equation not preserved in the transcript. Annotations: Weighted Frobenius Form; Weight; Common in Any Matrix Factorization]
9
Optimization Formulation: General Case
[Equation not preserved in the transcript. Annotations: Weighted Frobenius Form; Weight; Common in Any Matrix Factorization; Non-negative residual (Unique in This Paper)]
10
Optimization Formulation: 0/1 Weight Matrix (Major Focus of the Paper)
[Equation not preserved in the transcript. Annotations: 0/1 weight; Common in Any Matrix Factorization; Non-negative residual (Unique in This Paper)]
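The equations on the formulation slides did not survive extraction. The block below is a hedged reconstruction from the surviving annotations (weighted Frobenius form, non-negative residual, 0/1 weight) and from the earlier definitions A = F x G + R and R(i,j) >= 0 for A(i,j) > 0. The weight symbol W and the reading of the 0/1 weights as an indicator of observed edges are assumptions, not something the slides state.

```latex
% General case: weighted Frobenius loss with a non-negative residual.
\min_{F,\,G} \; \sum_{i,j} W(i,j)\,\bigl(A(i,j) - F(i,:)\,G(:,j)\bigr)^2
\quad \text{s.t.} \quad
R(i,j) = A(i,j) - F(i,:)\,G(:,j) \;\ge\; 0 \;\; \text{for all } (i,j) \text{ with } A(i,j) > 0 .

% 0/1 weights (assumed: W(i,j) = 1 if A(i,j) > 0, and 0 otherwise),
% so that only observed edges enter the loss.
\min_{F,\,G} \; \sum_{(i,j):\,A(i,j) > 0} \bigl(A(i,j) - F(i,:)\,G(:,j)\bigr)^2
\quad \text{s.t.} \quad
F(i,:)\,G(:,j) \;\le\; A(i,j) \;\; \text{for all } (i,j) \text{ with } A(i,j) > 0 .
```

Under that reading, both the loss and the constraints touch only the observed edges, which would be consistent with the later claim of complexity linear in the number of edges.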
11
Optimization Formulation with 0/1 Weight Matrix
NrMF with 0/1 Weight Matrix
Q: How to find 'optimal' F and G?
- D1: Quality; C1: non-convexity of the optimization objective
- D2: Scalability; C2: large size of the graph
12
Optimization Method: Batch Mode
- The objective is not convex wrt F and G jointly, but it is convex if either F or G is fixed.
- Basic Idea 1: Alternating (argmin over G subject to the constraints, with F fixed) [equations not preserved in the transcript]
- Basic Idea 2: Separation (for each j, a standard quadratic programming problem)
- Overall Complexity: Polynomial
Can we do better?
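The per-column subproblem that the batch mode solves can be written out as a sketch. It assumes the 0/1-weight reading in which only observed edges (A(i,j) > 0) enter the loss; the exact form on the original slide is not recoverable from the transcript.

```latex
% With F fixed, the objective separates over the columns of G.
% For each column j, solve a convex QP in the r unknowns G(:,j):
\min_{G(:,j)} \; \sum_{i:\,A(i,j) > 0} \bigl(A(i,j) - F(i,:)\,G(:,j)\bigr)^2
\quad \text{s.t.} \quad
F(i,:)\,G(:,j) \;\le\; A(i,j) \;\; \text{for all } i \text{ with } A(i,j) > 0 .
```

The loss is a convex quadratic in G(:,j) and the constraints are linear, which is why each subproblem is a standard quadratic program. The symmetric step fixes G and solves one such problem per row of F.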
13
Optimization Method: Incremental Mode
- Basic Idea 1: Recursive
- Basic Idea 2: Alternating
- Basic Idea 3: Separation (a QP for a single variable with boundary constraints, which can be solved in constant time)
Procedure: Adjacency Matrix A -> Initialize: R = A -> [Rank-1 Approximation -> Update Residual Matrix R], do r times -> Output Final Residual Matrix
Overall Complexity: Linear wrt # of edges
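The procedure on this slide can be sketched in a few lines of plain Python. This is an illustration under stated assumptions rather than the authors' implementation: only observed edges enter the loss (the 0/1-weight reading), the rank-1 step starts from an all-ones vector, and a fixed number of alternating sweeps stands in for a convergence test. The names `update_side` and `nrmf_incremental` are hypothetical.

```python
from math import inf


def update_side(residual, other, size, axis):
    """Re-fit one factor vector while the other one is held fixed.

    axis=0 updates f (indexed by i); axis=1 updates g (indexed by j).
    Each entry is a single-variable QP with bound constraints, solved by
    clipping the unconstrained optimum into [low, up].
    """
    num, den = [0.0] * size, [0.0] * size
    low, up = [-inf] * size, [inf] * size
    for (i, j), r in residual.items():
        k, o = (i, other[j]) if axis == 0 else (j, other[i])
        num[k] += r * o
        den[k] += o * o
        # The constraint x * o <= r is an upper or a lower bound on x.
        if o > 0:
            up[k] = min(up[k], r / o)
        elif o < 0:
            low[k] = max(low[k], r / o)
    out = []
    for k in range(size):
        q = num[k] / den[k] if den[k] > 0 else 0.0  # unconstrained optimum
        out.append(min(max(q, low[k]), up[k]))
    return out


def nrmf_incremental(edges, n_rows, n_cols, rank, sweeps=20):
    """Initialize R = A, then peel off `rank` rank-1 terms, keeping R >= 0."""
    residual = dict(edges)
    factors = []
    for _ in range(rank):
        g = [1.0] * n_cols                      # assumed initialization
        for _ in range(sweeps):                 # alternate between f and g
            f = update_side(residual, g, n_rows, 0)
            g = update_side(residual, f, n_cols, 1)
        for (i, j) in residual:
            # max(0.0, ...) only guards against floating-point round-off.
            residual[(i, j)] = max(0.0, residual[(i, j)] - f[i] * g[j])
        factors.append((f, g))
    return factors, residual


# Toy weighted graph (made-up data): a dense block plus one heavy extra edge.
edges = {(0, 0): 3.0, (0, 1): 3.0, (1, 0): 3.0, (1, 1): 3.0,
         (2, 2): 1.0, (0, 2): 5.0}
factors, R = nrmf_incremental(edges, n_rows=3, n_cols=3, rank=2)
top_edge = max(R, key=R.get)
```

Each sweep visits every edge a constant number of times, so one rank-1 step costs time linear in the number of edges, matching the complexity stated on the slide. On this toy input the heavy edge (0, 2) keeps the largest residual, so it would be the edge flagged for inspection.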
14
Experimental Evaluation
[Figure, Effectiveness: accuracy by anomaly type]
[Figure, Efficiency: wall-clock time vs. # of edges]
15
Experimental Evaluation
[Figure, Effectiveness: accuracy by anomaly type]
[Figures, Efficiency: time vs. # of edges, # of type-2 nodes, and # of type-1 nodes]
16
Batch Method vs. Incremental Method
[Figure: log wall-clock time (sec.) per data set, Incremental Method vs. Batch Method]
17
Conclusion
- Problem Formulation: Non-negative Residual Matrix Factorization, a new matrix factorization for interpretable graph anomaly detection
- Optimization Methods
  - Batch: straightforward, polynomial time complexity
  - Incremental: linear time complexity
- Future Work
  - Other interpretable properties (sparseness) for anomaly detection
  - Matrix factorization with total non-negativity
18
Thank you! htong@us.ibm.com (We are hiring at IBM Research!)
19
Visual Comparison
20
[Figure: labels low, q, up]