1 D-Trigger: A General Framework for Efficient Online Detection Ling Huang* XuanLong Nguyen* Minos Garofalakis ◊ Joe Hellerstein* Michael Jordan* Anthony.

1 D-Trigger: A General Framework for Efficient Online Detection Ling Huang* XuanLong Nguyen* Minos Garofalakis ◊ Joe Hellerstein* Michael Jordan* Anthony Joseph* Nina Taft § *UC Berkeley § Intel Research ◊ Yahoo! Research

2 Overview slide DC spec compiler logical config physical config Policy-aware Switching Layer monitoring data Ruby on Rails interpreter VM monitor local OS functions Trace/data collection web svc APIs Web 2.0 apps per node SW stack AWE drivers policy verification D-Trigger Trace/data collection D-Trigger monitoring data

3 RIOT Framework RADLab Integrated Observation via Tracing Trace Data Instrumented Calls libAsync Applications Distributed Reporting RoRlibLog Distributed Replay Policy Verification Performance XMLdb Queries Instrumented VMs Troubleshooting Augmented Logging Distributed Triggering Real Time Alarms / Actuation libAsynclibLog Instrumented VMs Instrumented Calls RoR Applications Augmented Logging Trace Data Distributed Triggering Distributed Reporting Distributed Replay Performance XMLdb Queries Real Time Alarms / Actuation Troubleshooting Policy Verification

4 Outline Motivation & Introduction Our Decentralized Detection Framework Efficient Detection of Cumulative Violations –Problem definition –Decentralized Detection Summary & Future work

5 Operation Center Traditional Distributed Monitoring Large-scale network monitoring and detection systems –Distributed and collaborative monitoring boxes Netflow modules, Snort sensors, firewalls, … –Continuously generating time series data Existing research focuses on data streaming –Centrally collect, store and aggregate network state –Well suited to answering approximate queries and continuously recording system state –Incur high overhead! Monitor 1 Monitor 2 Monitor 3 Local Network 2 Local Network 3 Local Network 1 Bandwidth Bottleneck! Overloaded!

6 Do machines in my network participate in a Botnet to attack other machines? victim command & control Send at rate.95*allowed to victim X V Operation Center Coordinated Detection! Data: Byte rate on a network link. CAN’T AFFORD TO DO THE DETECTION CONTINUOUSLY ! For efficient and scalable detection, push data processing to the edge of network! Approximate but accurate detection! Detection Problems in Enterprise Network

7 Moving Towards Decentralized Detection Today: Distributed Monitoring & Centralized Computation –Stream-based data collection (push) –Evaluates sophisticated detection function over periodical monitoring data –Doesn’t scale well in numbers of nodes or to smaller timescales – high bandwidth and/or central overhead Tomorrow: Distributed Monitoring & Decentralized Computation –Evaluates sophisticated detection function over continuous monitoring data in a decentralized way –Provides low-overhead, rapid response, high accuracy, and scalability

8 Decentralized detection system –Distributed information processing & central decision making –Detects violations/anomalies on micro/small timescale –Open platform to support a wide-range of applications General detection functions –SUM, MIN, MAX, PCA, SVM, TOP-K,…… Operation on general time series –Detection accuracy controllable by a “tuning knob” Provide user-controllable tradeoff between accuracy and overhead –Communication-efficient Minimize communication at given detection accuracy Research Goals and Achievements

9 Outline Motivation & Introduction Our Decentralized Detection Framework Efficient Detection of Cumulative Violations –Problem Definition –Decentralized Detection Summary & Future work

10 The System Setup A set of distributed monitors –Each produces a time series signals Dat i (t) –Send minimal information to coordinator –No communication among monitors A coordinator X –Is aggregation, correlation and detection center –Performs detection –X tells monitors the level of accuracy for signal updates

11 New Ideas for Efficiency and Scalability Local Processing – push tracking capabilities to edges –Monitors do ‘continuous’ monitoring and local processing –Use filtering scheme to decide when to push information to operation center (i.e. only ‘when needed’). Dealing with approximations – algorithms resident at operation center for doing accurate detection in the face of limited data/information A framework for putting the above together in a single system –Implemented as protocols that govern actions between monitors and operation center, and manage adaptivity.

12 Local Processing by Filtering Filters x “push” Filters x adjust Hosts/Monitors Operation Center Don’t send all the data! Don’t need most of them anyway –Our applications are anomaly detectors –Most of the traffic is normal so don’t need to send. Ideally hosts should only be sending rare events Detection Accuracy

13 Our Decentralized Detection Framework Dat 1 (t) Dat 2 (t) Dat n (t) Detection Function Perturbation Analysis Feedback: Filter Sizes original monitored time series Coordinator Alarms user inputs: detection error Distr. Monitors Mod 1 (t) Mod 2 (t) Mod n (t) Linear Function Queue Overflow Detection Classificati- on Function Outlier Detection

15 Efficient Detection of Distributed Cumulative Violations Detection of volume-based anomalies; trigger an alarm when overflows (ICDCS’07)

16 Problem Setup Constraints on aggregate conditions of subset of nodes –Accrue volume penalty when value exceeds threshold C –Fire trigger whenever volume penalty exceeds error tolerance  Aggregate function –Current work supports simple queries Focus on SUM and AVG here Extending to MIN, MAX, TOP-K as ongoing work –Future work to support general and complex functions Sum

17 Applications Distributed SUM exceeding threshold –Hot-spot detection for server farm Trigger an alarm if any set of 20 servers have workload more than 80% of the workload of the total 100 servers –Botnet detection for enterprise network Trigger an alarm if the packet rate from all hosts in my network to a destination exceeds 200 MB/second Cumulative (persistent) violation –Sever load are spiky, and it makes more sense to look at persistent overload over time –Network traffic is shaped and routed according to queueing model –In environment monitoring, radiation exposure are accumulated over time

18 Problem Statement User Inputs: – Constraint violation threshold: C – Tolerable error zone around constraint:  – Tolerable false alarm rate:  – Tolerable missed detection rate:  GOAL: fire trigger whenever penalty exceeds error tolerance –with required accuracy level –AND with minimum communication overhead (monitor updates)

19 Let V(t,  ) be size of penalty, at time t, over past window  Instantaneous violation [3][4] Fixed-window violation Cumulative violation 4 Three Types of Violations for a any  in [1, t] for a user given fixed  >  <  >  Key point: One needs to find a special  * which maximize V(t,  ) 1 5 t C C+  >  <  > 

20 Detection of Cumulative Violation Key insight: Cumulative trigger is equivalent to a queue overflow problem The centralized queuing model Dat 1 (t) Dat 2 (t) Dat n (t)

22 Aggreg./ Queueing Our In-Network Detection Framework Dat 1 (t) Dat 2 (t) Dat n (t) Constraint Checking Adjust Filter Slacks original monitored time series Mod 1 (t) Mod 2 (t) Mod n (t) Coordinator Alarms user inputs: C,  Distr. Monitors

23 Mod 1 (t) Mod 2 (t) Mod n (t) Dat 1 (t) Dat 2 (t) Dat n (t) The Distributed Queuing Model Distributed queuing model for cumulative triggers (b) Queue-based filtering (a) Distributed queuing model under-estimate over-estimate  1,...,  n : monitor queue size;   coordinator queue size Number of TCP Requests Dat 1 (t) Dat 2 (t) Dat n (t)

24 Queuing Analysis: The Model Each input is decomposed into two parts  Continuous enqueuing with rate  Discrete enqueuing/dequeuing with size How is the detection behavior of solution model different from centralized model? (a) The centralized idea model(b) The Distributed solution model

25 Queuing Analysis: Missed Detection The centralized model overflows … The solution model does not overflow! Queueing model and assumptions! Tell people there is one equation.

26 Queuing Analysis: False Alarm The solution model overflows! The centralized model does not overflow …

27 Experimental Setup Collecting trace data from 200 Planetlab nodes using SNORT sensors Evaluation carried out for using time series of “number of active TCP connections” on 1-minute interval Randomly selecting 40 data streams, each with 4000 data points Threshold C = 90 percentile of the aggregate data

28 Results: Model Validation Desired vs. achieved detection performance  missed detection rate  false alarm rate Achieved   and  are always less than desired  and  indicating that analytical model find upper bounds on the detection performance. 0.10.010.0070.020.010 0.10.020.0000.020.008 0.10.020.0000.040.011 0.20.010.0000.020.016 0.20.020.0000.020.013 0.20.020.0000.040.020   DesiredAchievedDesiredAchieved

29 Results: The Tradeoff Parameters design and tradeoff between false alarm, missed detection and communication overhead Error tolerance  = 0.2C Overhead = # of messages sent / total # of monitoring epochs False Alarm Rate Missed Det. Rate False Alarm Rate Missed Det. Rate

31 Summary A communication-efficient framework that –detects anomalies at desired accuracy level –with minimal communication cost A distributed protocol for data processing –local monitors decide when to update data to coordinator –coordinator makes global decision and feedback to monitors An algorithmic framework to guide the tradeoff between detection accuracy and overhead –Detection accuracy is always provable mathematically

32 Future Work Efficient detection framework for distributed systems –Extension to more sophisticate classification and detection functions –Development better prediction algorithms –Exploration of spatial correlation among data streams between monitors –Detection on hierarchical topology Going beyond detection –Fault identification, diagnosis, etc.

33 http://radlab.cs.berkeley.edu/wiki/Dtrigger Questions

34 Backup Slides

35 The Components of D-Triggers Decentralized Detection Linear Functions, Fixed Thresholds Detection of Cumulative Violations (ICDCS’07) Detection of Inst. Violations (MINENET’06) Sophisticated Detection Detection of Network- Wide Anomalies (NIPS’06, INFOCOM’07) Online Continuous Classification (Ongoing …) Detection of Inst. Violations (MINENET’06) Detection of Cumulative Violations (ICDCS’07) Detection of Network- Wide Anomalies (NIPS’06, INFOCOM’07) Online Continuous Classification (Ongoing …) Principal Component Analysis Support Vector Machine SUM + Top-K Queueing Model

36 Let start the analysis with all monitors having same local queue size  easy for analysis, applicable to the non-uniform case We want  as large as possible to reduce the communication overhead However, large  brings large bursts into the system, which requires a large  to absorb the burst Values of  and  are constrained by the error  Using queuing theory, we can analyze the overflow probability of the queue, thus determining the values of  and  Queuing Analysis: The Intuition

37 Key Contributions (1/2) Designed decentralized detection systems that scale well –You don’t need all the data! –Can do detection with 80+% less than others –Enable the detection on very small time scales Provable mathematical guarantees on errors –Detection accuracy is always provable mathematically

38 Key Contributions (2/2) General Framework for continuous online detection –For wide-range of applications –Framework is broad: Distributed information processing –Filtering, prediction, adaptive learning, … Central decision making –Cumulative Sum, PCA and Top-k, SVM Classification, … Constraint definition –Fixed value, threshold function, advanced statistics, … Adaptive system –Algorithms guide the tradeoff between comm. overhead and detection accuracy

1 D-Trigger: A General Framework for Efficient Online Detection Ling Huang* XuanLong Nguyen* Minos Garofalakis ◊ Joe Hellerstein* Michael Jordan* Anthony.

Similar presentations

Presentation on theme: "1 D-Trigger: A General Framework for Efficient Online Detection Ling Huang* XuanLong Nguyen* Minos Garofalakis ◊ Joe Hellerstein* Michael Jordan* Anthony."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

1 D-Trigger: A General Framework for Efficient Online Detection Ling Huang* XuanLong Nguyen* Minos Garofalakis ◊ Joe Hellerstein* Michael Jordan* Anthony.

Similar presentations

Presentation on theme: "1 D-Trigger: A General Framework for Efficient Online Detection Ling Huang* XuanLong Nguyen* Minos Garofalakis ◊ Joe Hellerstein* Michael Jordan* Anthony."— Presentation transcript:

Similar presentations

About project

Feedback