1
Supporting a Real-time Distributed Intrusion Detection Application on GATES
Qian Zhu, Liang Chen, and Gagan Agrawal
Department of Computer Science and Engineering, The Ohio State University
Euro-Par 2006 Conference, Aug 30th, Dresden, Germany
2
Roadmap
Introduction
Anomaly Detection Algorithm
Overview of GATES
Distributed Anomaly Detection Algorithm
Experiments
Conclusion
3
Introduction
Growing rate of interconnections among computer systems
Network security challenge
Intrusion prevention techniques: user authentication, avoiding programming errors, information protection
Intrusion detection to protect the system
4
Introduction
Intrusion detection techniques
Anomaly detection: detect intrusions by determining whether a record deviates from an established normal behavior profile
Misuse detection: detect intrusions by comparing records against patterns of known intrusions
6
Anomaly Detection Algorithm
Many anomaly detection algorithms train models over clean data
Drawbacks:
Clean data is NOT always easy to obtain
Training over noisy data has serious consequences
It is difficult to train the model “online” since clean data must be guaranteed
7
Anomaly Detection Algorithm
An approach from Eskin (ICML 2000): detecting intrusions without clean data
Assumption: the number of normal elements is significantly larger than the number of intrusion elements
8
Anomaly Detection Algorithm
Explaining anomalies by a mixture model
Modeling probability distributions
D: the data set
M_t: the set of normal data at time t
A_t: the set of anomalous data at time t
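For reference, the mixture model can be sketched in the notation of Eskin's ICML 2000 paper. The symbols λ (the prior probability that an element is anomalous) and P_{M_t}, P_{A_t} (the probability distributions fitted to the normal and anomalous sets) are taken from that paper and are assumptions here, since they do not appear in the slide text:

```latex
\begin{align}
D &= (1-\lambda)\,M + \lambda\,A \\
LL_t(D) &= |M_t|\log(1-\lambda) + \sum_{x_i \in M_t} \log P_{M_t}(x_i)
         + |A_t|\log\lambda + \sum_{x_j \in A_t} \log P_{A_t}(x_j)
\end{align}
```

The first line says each element of D is drawn from the normal distribution M with probability 1 − λ and from the anomaly distribution A with probability λ; the second is the log-likelihood of the data under a given split into M_t and A_t.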
9
Anomaly Detection Algorithm
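Detecting anomalies
Compare the log-likelihood LL_t at time t with the previous value LL_{t-1}
IF (LL_t - LL_{t-1}) > c, the record is flagged as an anomaly
ELSE the record is kept as normal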
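The thresholding rule can be illustrated with a minimal one-dimensional sketch. It assumes, purely for illustration, a single Gaussian for the normal set and a uniform distribution for the anomaly set; the names `detect`, `log_likelihood`, `lam`, and the default values are hypothetical and are not taken from the presentation:

```python
import math

def gaussian_logpdf(x, mean, var):
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)

def log_likelihood(normal, anomalies, lam, lo, hi):
    """Log-likelihood of the split: Gaussian fitted to `normal`,
    uniform on [lo, hi] for `anomalies` (both are illustrative choices)."""
    n = len(normal)
    mean = sum(normal) / n
    var = max(sum((x - mean) ** 2 for x in normal) / n, 1e-9)
    ll = n * math.log(1 - lam)
    ll += sum(gaussian_logpdf(x, mean, var) for x in normal)
    ll += len(anomalies) * (math.log(lam) - math.log(hi - lo))
    return ll

def detect(data, lam=0.05, c=1.0):
    """Tentatively move each record to the anomaly set and keep the move
    only if the log-likelihood improves by more than the threshold c."""
    lo, hi = min(data), max(data)
    if hi == lo:
        return []
    normal, anomalies = list(data), []
    ll_prev = log_likelihood(normal, anomalies, lam, lo, hi)
    for x in data:
        if len(normal) <= 2:
            break  # keep enough points to fit the Gaussian
        trial_normal = list(normal)
        trial_normal.remove(x)
        ll_new = log_likelihood(trial_normal, anomalies + [x], lam, lo, hi)
        if ll_new - ll_prev > c:
            normal, anomalies = trial_normal, anomalies + [x]
            ll_prev = ll_new
    return anomalies

records = [1.0, 1.1, 0.9, 1.05, 0.95, 1.0, 1.02, 0.98, 50.0, 1.01]
flagged = detect(records)
```

Refitting the normal model for every record is what makes the method computation intensive on a single node.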
10
Anomaly Detection Algorithm
Problems
Computation intensive
Processing data on one single node
Real-time constraint
Fast detection
Need for self-adaptation
12
Overview of GATES
GATES (Grid-based AdapTive Execution on Stream) is a middleware that supports distributed data stream processing
[Figure labels: Internet, Globus-OGSA, GATES, Applications, Web service]
13
Overview of GATES
An application built on GATES is:
Automatically distributed to proper computing nodes
Automatically self-adaptive to a varying environment, without the developer implementing specific algorithms or multiple versions
The self-adaptation algorithm aims for the highest level of accuracy while meeting the real-time constraint
14
Overview of GATES
Break the task down into several sub-tasks so that the sub-tasks form a pipeline
Implement each sub-task in Java
Write an XML configuration file so that the sub-tasks are deployed automatically, i.e., specify how many stages the pipeline has and where the code for each sub-task resides
Launch the application by running a Java program (StreamClient.class) provided by GATES
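The idea of a pipeline of sub-tasks wired together by a configuration can be sketched in a language-neutral way. The following Python sketch is not the GATES API: GATES stages are written in Java and configured in XML, and every name here (`STAGES`, `PIPELINE_CONFIG`, `launch`) is hypothetical, with a plain list standing in for the configuration file:

```python
def producer(_):
    """First stage: emits a small stream of integers."""
    yield from range(10)

def collector(stream):
    """Middle stage: keeps only the even values."""
    for x in stream:
        if x % 2 == 0:
            yield x

def combiner(stream):
    """Last stage: reduces the stream to a single sum."""
    yield sum(stream)

# Registry mapping stage names to their code (hypothetical).
STAGES = {"producer": producer, "collector": collector, "combiner": combiner}

# Stand-in for the configuration: stage order defines the pipeline.
PIPELINE_CONFIG = ["producer", "collector", "combiner"]

def launch(config):
    """Chain the configured stages, feeding each one the previous output."""
    stream = None
    for name in config:
        stream = STAGES[name](stream)
    return list(stream)

result = launch(PIPELINE_CONFIG)
```

Keeping the stage order in configuration rather than in code is what lets a middleware decide where each stage runs.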
16
Distributed Anomaly Detection Algorithm
Network data come in streams
How to maintain an accurate model for the data?
Incremental maintenance of a data model over a data stream
The maintenance has to be quick for fast streams and robust for noisy data
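As a small illustration of incremental model maintenance, the sketch below updates a running mean and variance one record at a time using Welford's method. It is a simplified stand-in for the model actually maintained in the presentation, and the class name `RunningStats` is hypothetical:

```python
class RunningStats:
    """Mean and population variance maintained in O(1) per record."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self):
        return self.m2 / self.n if self.n else 0.0

stats = RunningStats()
for value in [2, 4, 4, 4, 5, 5, 7, 9]:
    stats.update(value)
```

Each update touches only three numbers, so the cost per record does not grow with the length of the stream.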
17
Distributed Anomaly Detection Algorithm
18
Distributed Anomaly Detection Algorithm
Producer: generates data streams
Collector: generates a local model (GMM) and sends it, together with sample data, to the next stage; performs anomaly detection based on the global model (GMM)
Combiner: combines local models into a global model and sends the global model back to the Collector
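The slides do not say how the Combiner merges local models. One simple option, shown here only as an assumption, is to pool the components of all local mixtures and rescale each weight by the share of samples its Collector contributed. The function name and the tuple layout are hypothetical:

```python
def combine_local_models(local_models):
    """Pool local 1-D Gaussian mixtures into one global mixture.

    local_models: list of (n_samples, components), where components is a
    list of (weight, mean, variance) tuples whose weights sum to 1.
    Returns a list of (weight, mean, variance) whose weights sum to 1.
    """
    total = sum(n for n, _ in local_models)
    global_components = []
    for n, components in local_models:
        for weight, mean, var in components:
            # A Collector that saw more samples gets proportionally more weight.
            global_components.append((weight * n / total, mean, var))
    return global_components

local_a = (100, [(1.0, 0.0, 1.0)])
local_b = (300, [(0.5, 5.0, 2.0), (0.5, 9.0, 1.0)])
global_model = combine_local_models([local_a, local_b])
```

A real Combiner might instead merge similar components or re-run EM on the forwarded sample data; pooling is just the simplest scheme that yields a valid mixture.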
19
Distributed Anomaly Detection Algorithm
Adjustable parameters
The sampling rate on the Collector stage
The convergence threshold for the EM algorithm
Fix one of them while the other is adjusted by GATES
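To make the idea of a middleware-adjusted parameter concrete, here is a minimal feedback sketch for the sampling rate. It is not the GATES self-adaptation algorithm; it assumes a hypothetical signal (`queue_len`, the number of records waiting at a stage) and hypothetical bounds and step size:

```python
def adjust_sampling_rate(rate, queue_len, target_len,
                         step=0.02, lo=0.10, hi=0.40):
    """Lower the sampling rate when the stage falls behind, raise it when
    there is spare capacity, and clamp the result to [lo, hi]."""
    if queue_len > target_len:
        rate -= step   # falling behind: sample less, trading accuracy for speed
    elif queue_len < target_len:
        rate += step   # spare capacity: sample more to improve accuracy
    return min(hi, max(lo, rate))

slower = adjust_sampling_rate(0.20, queue_len=150, target_len=100)
faster = adjust_sampling_rate(0.20, queue_len=20, target_len=100)
```

The same loop could adjust the EM convergence threshold instead, with the sampling rate held fixed.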
21
Experiments
Data set (KDD Cup 99)
# of records: 335,892
% of normal data: 91%
# of attributes: 41
# of intrusion types: 22
Note: only 10 attributes (7 continuous and 3 categorical) out of 41 were used for the algorithm
22
Experiments
Adjustable EM threshold vs. fixed sampling rate
Producing rate varies over 100k/sec, 80k/sec, 50k/sec, 30k/sec, and 10k/sec
Sampling rate varies over 40%, 20%, 16%, 13%, and 10%
24
Experiments
25
Experiments
26
Experiments Logistic Regression
input variables: continuous, categorical, or both
response variables: 0/1 value
Use the three categorical attributes for logistic regression
Combine the results for final detection
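A minimal sketch of logistic regression on a categorical attribute follows, using one-hot encoding and plain stochastic gradient descent. The attribute (`protocol`), the toy records, and the hyperparameters are all hypothetical and are not drawn from the experiments in the presentation:

```python
import math

CATEGORIES = ["tcp", "udp", "icmp"]  # hypothetical categorical attribute

def one_hot(value, categories=CATEGORIES):
    return [1.0 if value == c else 0.0 for c in categories]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(rows, labels, epochs=200, lr=0.5):
    """Fit weights and bias by stochastic gradient descent on the log loss."""
    w = [0.0] * len(rows[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            p = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))
            err = p - y  # gradient of the log loss with respect to z
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def predict(w, b, x):
    """Return the 0/1 response: 1 means the record is flagged."""
    z = b + sum(wi * xi for wi, xi in zip(w, x))
    return 1 if sigmoid(z) >= 0.5 else 0

# Toy data: in this made-up sample only "icmp" records are intrusions.
protocols = ["tcp", "udp", "icmp", "tcp", "icmp"]
labels = [0, 0, 1, 0, 1]
w, b = train([one_hot(p) for p in protocols], labels)
```

One-hot encoding is what lets a linear model consume a categorical attribute without imposing an artificial ordering on its values.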
27
Experiments Detection performance improved by using Logistic Regression
28
Experiments
Adjustable sampling rate vs. fixed EM threshold
Producing rate varies over 100k/sec, 80k/sec, 50k/sec, 30k/sec, and 10k/sec
EM threshold varies from , to
29
Experiments
31
Conclusion
Converted the Eskin anomaly detection algorithm into a distributed version and deployed the application on GATES
GATES can effectively adjust the tradeoff between meeting the real-time constraint and achieving the highest accuracy (95.36% vs %)
32
Thank you!