Published by Μνημοσύνη Παπάζογλου. Modified over 5 years ago.
1
Statistical-based IDS: background introduction
2
Statistical IDS background
- Why we do this project
- Attack introduction
- IDS architecture
- Data description
- Feature extraction
- Statistical method introduction
- Result analysis
3
Project goals
The Internet carries various network attacks, including denial-of-service (DoS) attacks and port scans.
Related work
- Overall traffic detection
- Flow-level detection
Our goals
- Detect both attacks at the same time
- Differentiate DoS attacks from port scans
4
Attack introduction: TCP SYN flooding
- An important form of DoS attack
- Exploits TCP's three-way handshake mechanism and its limited capacity for maintaining half-open connections
- Feature: spoofed source IP address
- Recent variant: reflected SYN/ACK flooding attacks
5
Attack introduction: Port scan
- Horizontal scan
- Vertical scan
- Block scan
Feature: real source IP address
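The three scan types can be told apart by what a single source touches. In the usual terminology, a horizontal scan probes one port across many hosts, a vertical scan probes many ports on one host, and a block scan does both. The sketch below illustrates that distinction only; the function name `classify_scans`, the `(src, dst_ip, dst_port)` tuple layout, and the threshold `min_targets` are assumptions made for illustration, not part of the project.

```python
from collections import defaultdict


def classify_scans(probes, min_targets=3):
    """Label each source IP by the shape of its probing.

    probes: iterable of (src_ip, dst_ip, dst_port) tuples.
    min_targets: hypothetical threshold for "many" hosts or ports.
    """
    hosts = defaultdict(set)  # src_ip -> distinct destination hosts
    ports = defaultdict(set)  # src_ip -> distinct destination ports
    for src, dst_ip, dst_port in probes:
        hosts[src].add(dst_ip)
        ports[src].add(dst_port)

    labels = {}
    for src in hosts:
        many_hosts = len(hosts[src]) >= min_targets
        many_ports = len(ports[src]) >= min_targets
        if many_hosts and many_ports:
            labels[src] = "block"
        elif many_hosts:
            labels[src] = "horizontal"
        elif many_ports:
            labels[src] = "vertical"
        else:
            labels[src] = "none"
    return labels
```

Because port scans use a real source IP address, grouping by source is meaningful here; the same grouping would not work for SYN floods with spoofed sources.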
6
Statistical IDS architecture
- Learning part
- Detection part
7
Data description: DARPA98 data
- The first standard corpus for evaluating network intrusion detection systems
- From the Information Systems Technology Group (IST) of MIT Lincoln Laboratory, under Defense Advanced Research Projects Agency (DARPA ITO) and Air Force Research Laboratory (AFRL/SNHS) sponsorship
- Seven weeks of training data
- Two weeks of detection data
8
Data description: DARPA98 data format
> : S ACK : (0) win 512 <mss 1460>
- Time stamp:
- Source IP address + port:
- Destination IP address + port:
- TCP flag: S (may be others: R, F, P)
- ACK flag: ACK
- Other parts of the packet header: : (0) win 512 <mss 1460>
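The fields listed above could be pulled out of a text record with a small parser. The sketch below assumes a tcpdump-style line of the form `timestamp src_ip.port > dst_ip.port: flags rest`; that layout, the `Packet` type, the `parse_line` name, and the sample line in the usage note are all assumptions for illustration, since the slide's own sample record does not preserve the time stamp or addresses.

```python
import re
from typing import NamedTuple, Optional


class Packet(NamedTuple):
    timestamp: str
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    flags: str      # TCP flag letters such as S, F, R, P
    has_ack: bool   # True if the remainder mentions an ack field
    rest: str       # other parts of the packet header


# Assumed layout: "<timestamp> <ip>.<port> > <ip>.<port>: <flags> <rest>"
LINE_RE = re.compile(
    r"^(?P<ts>\S+)\s+"
    r"(?P<src>\d+\.\d+\.\d+\.\d+)\.(?P<sport>\d+)\s+>\s+"
    r"(?P<dst>\d+\.\d+\.\d+\.\d+)\.(?P<dport>\d+):\s+"
    r"(?P<flags>[SFRP.]+)\s+(?P<rest>.*)$"
)


def parse_line(line: str) -> Optional[Packet]:
    """Return a Packet for a matching line, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    rest = m.group("rest")
    return Packet(
        timestamp=m.group("ts"),
        src_ip=m.group("src"),
        src_port=int(m.group("sport")),
        dst_ip=m.group("dst"),
        dst_port=int(m.group("dport")),
        flags=m.group("flags"),
        has_ack=re.search(r"\back\b", rest, re.IGNORECASE) is not None,
        rest=rest,
    )
```

For instance, `parse_line("10:00:00.000001 192.0.2.10.1045 > 192.0.2.20.80: S 100:100(0) ack 200 win 512 <mss 1460>")` on this made-up line yields a `Packet` with `flags == "S"` and `has_ack` set; lines that do not fit the assumed layout return `None`.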
9
Feature extraction
Calculate the metrics over every 5 minutes of traffic. For example:
- SYN-SYN_ACK pair
- SYN-FIN + SYN-RSTactive pair
- traffic volume
- SYN packet volume
- ……
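A minimal sketch of per-window feature extraction follows. It bins packets into 5-minute windows and counts a few of the metrics named above. The input layout `(timestamp_seconds, flags, has_ack)`, the function name `extract_features`, and the reading of the SYN-SYN_ACK pair as the difference between SYN and SYN/ACK counts are assumptions; the SYN-FIN + SYN-RSTactive pair is not reproduced because the slides do not define it.

```python
from collections import defaultdict

WINDOW_SECONDS = 5 * 60  # 5-minute windows, as on the slide


def extract_features(packets):
    """Count per-window metrics.

    packets: iterable of (timestamp_seconds, flags, has_ack), where flags
    is a string of TCP flag letters (S, F, R, P) and has_ack is a bool.
    Returns {window_index: {metric_name: count}}.
    """
    windows = defaultdict(
        lambda: {"packets": 0, "syn": 0, "syn_ack": 0, "fin": 0, "rst": 0}
    )
    for ts, flags, has_ack in packets:
        w = windows[int(ts // WINDOW_SECONDS)]
        w["packets"] += 1          # traffic volume (in packets)
        if "S" in flags and has_ack:
            w["syn_ack"] += 1      # SYN/ACK packets
        elif "S" in flags:
            w["syn"] += 1          # SYN packet volume
        if "F" in flags:
            w["fin"] += 1
        if "R" in flags:
            w["rst"] += 1
    for w in windows.values():
        # Assumed interpretation of the SYN-SYN_ACK pair: unanswered SYNs.
        w["syn_minus_syn_ack"] = w["syn"] - w["syn_ack"]
    return dict(windows)
```

A large positive `syn_minus_syn_ack` in one window would be the kind of value a later statistical stage could inspect.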
10
Statistical method: Statistical-based IDS
Goal: use statistical metrics and algorithms to differentiate anomalous traffic from benign traffic, and to differentiate different types of attacks.
- Advantage: can detect unknown attacks
- Disadvantage: false positives and false negatives
11
Hidden Markov Model (HMM)
The HMM is a very useful statistical learning model. It has been applied successfully in speech recognition.
- Advantages
  1. Analyzes sequence data (represented using observation probabilities and transition probabilities)
  2. Supports both unsupervised and supervised training
  3. High accuracy
- Disadvantage: comparatively long training time
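To make the roles of the observation and transition probabilities concrete, here is a sketch of the standard forward algorithm, which computes the likelihood of an observation sequence under a given HMM. It is a generic textbook routine, not the project's implementation; the function name `forward_likelihood` and the list-of-lists parameter layout are assumptions.

```python
def forward_likelihood(obs, start_p, trans_p, emit_p):
    """Likelihood of an observation sequence under an HMM.

    obs:     list of observation symbol indices
    start_p: start_p[s] = probability of starting in state s
    trans_p: trans_p[p][s] = transition probability from state p to s
    emit_p:  emit_p[s][o] = probability that state s emits symbol o
    """
    if not obs:
        raise ValueError("observation sequence must not be empty")
    n = len(start_p)
    # alpha[s]: probability of the observations so far, ending in state s
    alpha = [start_p[s] * emit_p[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[p] * trans_p[p][s] for p in range(n)) * emit_p[s][o]
            for s in range(n)
        ]
    return sum(alpha)
```

In a detection setting, one plausible use is to compare the likelihood of a sequence of per-window observations under models trained on different traffic classes. For long sequences the products underflow, so a practical version would work with scaled or log probabilities.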
12
Double Gaussian model
Introduction
- Two Gaussian distribution models are used to represent two classes of behavior
- Compute two probabilities for the current behavior, one under each class's Gaussian parameters
- Compare them: the current behavior is assigned to the class with the larger probability
Training period
- Estimate the Gaussian parameters for the two classes
Detection period
- Use the two-class Gaussian parameters to compute the probabilities and compare them
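The training and detection periods described above can be sketched for a single scalar metric. The sketch assumes one-dimensional features, labeled training samples for each class, and maximum-likelihood estimates of mean and variance; the names `fit_gaussian`, `log_pdf`, and `classify` and the class labels are illustrative, not taken from the project.

```python
import math
from statistics import mean, pvariance


def fit_gaussian(samples):
    """Training period: estimate (mean, variance) for one class."""
    mu = mean(samples)
    var = pvariance(samples, mu)
    return mu, max(var, 1e-9)  # floor the variance to avoid division by zero


def log_pdf(x, mu, var):
    """Log of the Gaussian density at x."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)


def classify(x, normal_params, attack_params):
    """Detection period: pick the class with the larger probability."""
    if log_pdf(x, *normal_params) >= log_pdf(x, *attack_params):
        return "normal"
    return "attack"
```

Comparing log densities gives the same decision as comparing the densities themselves and avoids underflow for values far from either mean. Each window is classified on its own, which is why the model carries no sequence information.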
13
Double Gaussian model
Advantages
- Simple, easy to understand
- Fast
Disadvantage
- No sequence characteristics
14
Result analysis
Evaluation
- Important quantitative analysis: false positives + false negatives
- Look at the metric values and find the reasons
- Repeat the experiments
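The false positive and false negative counts can be computed from per-window ground truth and detector output. The sketch below assumes both are given as parallel lists of booleans (True meaning attack); the function name `error_counts` and the returned keys are illustrative.

```python
def error_counts(truth, predicted):
    """Count false positives and false negatives.

    truth, predicted: equal-length sequences of booleans,
    where True means "attack" and False means "benign".
    """
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")
    fp = sum(1 for t, p in zip(truth, predicted) if p and not t)
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)
    negatives = sum(1 for t in truth if not t)
    positives = len(truth) - negatives
    return {
        "fp": fp,
        "fn": fn,
        # Rates are reported as 0.0 when the corresponding class is absent.
        "fp_rate": fp / negatives if negatives else 0.0,
        "fn_rate": fn / positives if positives else 0.0,
    }
```

Windows that show up as false positives or false negatives are natural starting points for looking at the metric values and finding the reasons.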