1 Communication-Efficient Online Detection of Network-Wide Anomalies Ling Huang* XuanLong Nguyen* Minos Garofalakis § Joe Hellerstein* Michael Jordan*

Slides:



Advertisements
Similar presentations
Data Mining Feature Selection. Data reduction: Obtain a reduced representation of the data set that is much smaller in volume but yet produces the same.
Advertisements

Detectability of Traffic Anomalies in Two Adjacent Networks Augustin Soule, Haakon Ringberg, Fernando Silveira, Jennifer Rexford, Christophe Diot.
Sensitivity of PCA for Traffic Anomaly Detection Evaluating the robustness of current best practices Haakon Ringberg 1, Augustin Soule 2, Jennifer Rexford.
Robust Moving Object Detection & Categorization using self- improving classifiers Omar Javed, Saad Ali & Mubarak Shah.
Models and Security Requirements for IDS. Overview The system and attack model Security requirements for IDS –Sensitivity –Detection Analysis methodology.
“Real-time” Transient Detection Algorithms Dr. Kang Hyeun Ji, Thomas Herring MIT.
Kuang-Hao Liu et al Presented by Xin Che 11/18/09.
Probabilistic Aggregation in Distributed Networks Ling Huang, Ben Zhao, Anthony Joseph and John Kubiatowicz {hling, ravenben, adj,
1 In-Network PCA and Anomaly Detection Ling Huang* XuanLong Nguyen* Minos Garofalakis § Michael Jordan* Anthony Joseph* Nina Taft § *UC Berkeley § Intel.
Jan 13, 2006Lahore University of Management Sciences1 Protection Routing in an MPLS Network using Bandwidth Sharing with Primary Paths Zartash Afzal Uzmi.
Traffic Engineering With Traditional IP Routing Protocols
Distributed Regression: an Efficient Framework for Modeling Sensor Network Data Carlos Guestrin Peter Bodik Romain Thibaux Mark Paskin Samuel Madden.
Computer Graphics Recitation 6. 2 Last week - eigendecomposition A We want to learn how the transformation A works:
1 A New Paradigm For Distributed Monitoring Ling Huang, Minos Garofalakis, Nina Taft and Anthony Joseph {minos.garofalakis,
Reverse Hashing for High-speed Network Monitoring: Algorithms, Evaluation, and Applications Robert Schweller 1, Zhichun Li 1, Yan Chen 1, Yan Gao 1, Ashish.
3D Geometry for Computer Graphics
Chapter 10: Stream-based Data Management Title: Design, Implementation, and Evaluation of the Linear Road Benchmark on the Stream Processing Core Authors:
Multi-Scale Analysis for Network Traffic Prediction and Anomaly Detection Ling Huang Joint work with Anthony Joseph and Nina Taft January, 2005.
Approximate data collection in sensor networks the appeal of probabilistic models David Chu Amol Deshpande Joe Hellerstein Wei Hong ICDE 2006 Atlanta,
1 Toward Sophisticated Detection With Distributed Triggers Ling Huang* Minos Garofalakis § Joe Hellerstein* Anthony Joseph* Nina Taft § *UC Berkeley §
1 Distributed Online Simultaneous Fault Detection for Multiple Sensors Ram Rajagopal, Xuanlong Nguyen, Sinem Ergen, Pravin Varaiya EECS, University of.
Probabilistic Data Aggregation Ling Huang, Ben Zhao, Anthony Joseph Sahara Retreat January, 2004.
Cumulative Violation For any window size  t  Communication-Efficient Tracking for Distributed Cumulative Triggers Ling Huang* Minos Garofalakis.
Measurement and Monitoring Nick Feamster Georgia Tech.
EL 933 Final Project Presentation Combining Filtering and Statistical Methods for Anomaly Detection Augustin Soule Kav´e SalamatianNina Taft.
1 D-Trigger: A General Framework for Efficient Online Detection Ling Huang University of California, Berkeley.
Network Anomography Yin Zhang, Zihui Ge, Albert Greenberg, Matthew Roughan Internet Measurement Conference 2005 Berkeley, CA, USA Presented by Huizhong.
Privacy Preservation for Data Streams Feifei Li, Boston University Joint work with: Jimeng Sun (CMU), Spiros Papadimitriou, George A. Mihaila and Ioana.
1 Fault Tolerance in Collaborative Sensor Networks for Target Detection IEEE TRANSACTIONS ON COMPUTERS, VOL. 53, NO. 3, MARCH 2004.
Tomo-gravity Yin ZhangMatthew Roughan Nick DuffieldAlbert Greenberg “A Northern NJ Research Lab” ACM.
On Anomalous Hot Spot Discovery in Graph Streams
1 CE 530 Molecular Simulation Lecture 7 David A. Kofke Department of Chemical Engineering SUNY Buffalo
NUS CS5247 A dimensionality reduction approach to modeling protein flexibility By, By Miguel L. Teodoro, George N. Phillips J* and Lydia E. Kavraki Rice.
Masquerade Detection Mark Stamp 1Masquerade Detection.
Gaussian process modelling
Phoenix: A Weight-Based Network Coordinate System Using Matrix Factorization Yang Chen Department of Computer Science Duke University
Feature extraction 1.Introduction 2.T-test 3.Signal Noise Ratio (SNR) 4.Linear Correlation Coefficient (LCC) 5.Principle component analysis (PCA) 6.Linear.
InteMon: Intelligent monitoring system for large clusters Evan Hoke, Jimeng Sun and Christos Faloutsos.
Scalable and Efficient Data Streaming Algorithms for Detecting Common Content in Internet Traffic Minho Sung Networking & Telecommunications Group College.
EMIS 8381 – Spring Netflix and Your Next Movie Night Nonlinear Programming Ron Andrews EMIS 8381.
A Framework for Elastic Execution of Existing MPI Programs Aarthi Raveendran Graduate Student Department Of CSE 1.
1 Impact of IT Monoculture on Behavioral End Host Intrusion Detection Dhiman Barman, UC Riverside/Juniper Jaideep Chandrashekar, Intel Research Nina Taft,
Network Anomography Yin Zhang – University of Texas at Austin Zihui Ge and Albert Greenberg – AT&T Labs Matthew Roughan – University of Adelaide IMC 2005.
1 Distributed Detection of Network-Wide Traffic Anomalies Ling Huang* XuanLong Nguyen* Minos Garofalakis § Joe Hellerstein* Michael Jordan* Anthony Joseph*
A Passive Approach to Sensor Network Localization Rahul Biswas and Sebastian Thrun International Conference on Intelligent Robots and Systems 2004 Presented.
Limitations of Cotemporary Classification Algorithms Major limitations of classification algorithms like Adaboost, SVMs, or Naïve Bayes include, Requirement.
Intradomain Traffic Engineering By Behzad Akbari These slides are based in part upon slides of J. Rexford (Princeton university)
1 D-Trigger: A General Framework for Efficient Online Detection Ling Huang* XuanLong Nguyen* Minos Garofalakis ◊ Joe Hellerstein* Michael Jordan* Anthony.
Mining Anomalies in Network-Wide Flow Data Anukool Lakhina with Mark Crovella and Christophe Diot NANOG35, Oct 23-25, 2005.
Mining Anomalies Using Traffic Feature Distributions Anukool Lakhina Mark Crovella Christophe Diot in ACM SIGCOMM 2005 Presented by: Sailesh Kumar.
Network Computing Laboratory 1 Vivaldi: A Decentralized Network Coordinate System Authors: Frank Dabek, Russ Cox, Frans Kaashoek, Robert Morris MIT Published.
DISTIN: Distributed Inference and Optimization in WSNs A Message-Passing Perspective SCOM Team
EE515/IS523: Security 101: Think Like an Adversary Evading Anomarly Detection through Variance Injection Attacks on PCA Benjamin I.P. Rubinstein, Blaine.
ApproxHadoop Bringing Approximations to MapReduce Frameworks
Sensitivity of PCA for Traffic Anomaly Detection Evaluating the robustness of current best practices Haakon Ringberg 1, Augustin Soule 2, Jennifer Rexford.
Network Anomography Yin Zhang Joint work with Zihui Ge, Albert Greenberg, Matthew Roughan Internet Measurement.
3D Geometry for Computer Graphics Class 3. 2 Last week - eigendecomposition A We want to learn how the transformation A works:
BlinkDB: Queries with Bounded Errors and Bounded Response Times on Very Large Data Authored by Sameer Agarwal, et. al. Presented by Atul Sandur.
Continuous Monitoring of Distributed Data Streams over a Time-based Sliding Window MADALGO – Center for Massive Data Algorithmics, a Center of the Danish.
Communication-Efficient Online Detection of Network-Wide Anomalies Ling Huang* XuanLong Nguyen* Minos Garofalakis § Joe Hellerstein* Michael Jordan* Anthony.
SketchVisor: Robust Network Measurement for Software Packet Processing
Experience Report: System Log Analysis for Anomaly Detection
Principal Component Analysis (PCA)
NOX: Towards an Operating System for Networks
CHAPTER 3 Architectures for Distributed Systems
DDoS Attack Detection under SDN Context
Design open relay based DNS blacklist system
By: Ran Ben Basat, Technion, Israel
Feature Selection Methods
Lu Tang , Qun Huang, Patrick P. C. Lee
Presentation transcript:

1 Communication-Efficient Online Detection of Network-Wide Anomalies Ling Huang* XuanLong Nguyen* Minos Garofalakis § Joe Hellerstein* Michael Jordan* Anthony Joseph* Nina Taft § *UC Berkeley § Intel Research

Today: Distributed Monitoring & Centralized Computation  Stream-based data collection  Periodically evaluate detection function over collected data  Doesn’t scale well in network size or timescale Our contribution: Decentralized Detection  Continuously evaluate detection function in a decentr. way  Low-overhead, rapid response, accurate and scalable  Detection accuracy controllable by a “tuning knob” Provable guarantees on detection error (false alarm rate) Flexible tradeoff between overhead and accuracy Towards Decentralized Detection

Detection Problems in Enterprise Network Do machines in my network participate in a Botnet to attack other machines? victim command & control Send at rate.95*allowed to victim X V Operation Center Coordinated Detection! Data: Byte rate on a network link. CAN’T AFFORD TO DO THE DETECTION CONTINUOUSLY ! For efficient and scalable detection, push data processing to the edge of network! Approximate but accurate detection!

Detection of Network-wide Anomalies A volume anomaly is a sudden change in an Origin-Destination flow (i.e., point to point traffic) Given link traffic measurements, detect the volume anomalies H1H1 H2H2 The backbone network Regional network 1 Regional network 2

An Illustration Observed network link data = aggregate of application-level flows Each link is a dimension Anomalies in (unobserved) flow data Finding anomalies in high-dimensional, noisy data is difficult !

The Subspace Method (Lakhina’04) An approach to separate normal from anomalous traffic based on Principal Component Analysis (PCA) Normal Subspace : space spanned by the top k principal components Anomalous Subspace : space spanned by the remaining components Then, decompose traffic on all links by projecting onto and to obtain: Traffic vector of all links at a particular point in time Normal traffic vector Residual traffic vector [3] Lakhina et al. Diagnosing Network-Wide Traffic Anomalies. In ACM SIGCOMM (2004). [4] Zhang et al. Network Anomography, In IMC (2005). Say: An centralied approach based on PCA: Lakhina 2004 develop PCA: High dim projected down to low dim, looked at the residual traffic. For detail, see their paper.

Detection Illustration Value of over time (all traffic) over time Value of No detail, high level intution. Red dots: anomalies Blue curve: traffic data

m (timestep) n (nodeID) Operation center The Centralized Algorithm The Network Eigen values Eigen vectors Data matrix Dat 1) Each link produces a column of m data over time. 2) n links produces a row data y at each time instance. The detection is: Dat 1 (t)Dat 2 (t)Dat 3 (t)Dat n (t) Periodically Doesn’t scale well to large network or to smaller timescales  The number of monitoring devices may grow to thousands  The anomalies may occur on second or sub-second time scales

PCA-Based Detection Our In-Network Detection Framework data 1 (t) data 2 (t) data n (t) Perturbation Analysis Adjust Filter Parameters original monitored time series filtered_data 1 (t) filtered_data 2 (t) filtered_data n (t) Coordinator Alarms user inputs: detection error Distr. Monitors

The Protocol At Monitors Monitor i updates information if are the filtering parameters can be based on any prediction model built on historical data  e.g., the average of last 5 signal values observed locally  Simple but enough to achieve 10x data reduction

The Communication and Error Tradeoff data(t) Dat= filtered_data(t) PCA on Dat Full Info. PCA on Approximate Info. Difference? The bigger the filtering parameter  i, the less the communication overhead, but the more the detection error! The coordinator computes a set of good  1, …,  n to manage this difference.

Parameter Design and Error Control (I) Users specify an upper bound on false alarm rate, then we determine the filtering parameters  ’s Data vs. Model Monte Carlo and fast binary search Stochastic Matrix Perturbation Theory An algorithmic analysis for the tradeoff between detection accuracy and data communication cost  monitor parameters determined from the target accuracy  technical tool relies on stochastic matrix perturbation theory Eigen error: L 2 norm of the difference between the approximate eigenvalues and the actual ones ActualApproximate

Parameter Design and Error Control (II) Detection Error   Eigen-Error   Mont Carol simulation to find the mapping from  to   For the given  using fast binary search to find an  Eigen-Error   Filtering parameters  ’s  

Evaluation Given user-specified false alarm rate, evaluate the actual detection accuracy and communication overhead Experiment setup  Abilene backbone network data  Traffic matrices of size 1008 X 41  Set uniform slack for all monitors

Performance  Missed DetectionsFalse AlarmsData Reduction Week 1Week 2Week 1Week 2Week 1Week %70% %76% %79% Data Used: Abilene traffic matrix, 2 weeks, 41 links. error tolerance = upper bound on error

Summary A communication-efficient framework that  detects anomalies at desired accuracy level  with minimal communication cost A distributed protocol for data processing  local monitors decide when to update data to coordinator  coordinator makes global decision and feedback to monitors An algorithmic framework to guide the tradeoff between communication overhead and detection accuracy

Reference [Lakhina’04] Diagnosing Network-Wide Traffic Anomalies. A. Lakhina, M. Crovella and C. Diot. In SIGCOMM '04. [Huang’06] In-Network PCA and Anomaly Detection. L. Huang, X. Nguyen, M. Garofalakis, M. Jordan, A. Joseph and N. Taft. In NIPS 19, [Huang’07] Communication-Efficient Online Detection of Network-Wide Anomalies. L. Huang, X. Nguyen, M. Garofalakis, J. Hellerstein, M. Jordan, A. Joseph and N. Taft. To appear in INFOCOM'07. Questions

18 Backup Slides

Operation Center Traditional Distributed Monitoring Large-scale network monitoring and detection systems  Distributed and collaborative monitoring boxes  Continuously generating time series data Existing research focuses on data streaming  Centrally collect, store and aggregate network state  Well suited to answering approximate queries and continuously recording system state  Incur high overhead! Monitor 1 Monitor 2 Monitor 3 Local Network 2 Local Network 3 Local Network 1 Bandwidth Bottleneck! Overloaded!

Our Distributed Processing Approach A coordinator  Is aggregation, correlation and detection center A set of distributed monitors  Each produces a time series signals  Processes data locally, only sends needed info. to coordinator  No communication among monitors  Coordinator tells monitors the level of accuracy for signal updates Filters x “push” Filters x adjust Hosts/Monitors Coordinator Detection Accuracy

Traffic on Link 1 Traffic on Link 2 Principal Component Analysis (PCA) y Anomalous traffic usually results in a large value of : principal components : minor components Principal components are top eigenvectors of covariance matrix. They form the subspace projection matrices C no and C ab