Statistical based IDS background introduction

Statistical IDS background
- Why do we do this project
- Attack introduction
- IDS architecture
- Data description
- Feature extraction
- Statistical method introduction
- Result analysis

Project goals
The Internet is subject to various network attacks, including denial-of-service (DoS) attacks and port scans.
Related work
- Overall traffic detection
- Flow-level detection
Our goals
- Detect both attacks at the same time
- Differentiate DoS attacks from port scans

Attack introduction: TCP SYN flooding
- An important form of DoS attack
- Exploits TCP's three-way handshake mechanism and its limitation in maintaining half-open connections
- Feature: spoofed source IP address
- Recent variant: reflected SYN/ACK flooding attacks

Attack introduction: Port scan
- Horizontal scan
- Vertical scan
- Block scan
Feature: real source IP address
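The three scan types can be told apart by how many distinct destination hosts and ports a single source touches. The sketch below uses the usual definitions (horizontal: one port across many hosts; vertical: many ports on one host; block: many ports across many hosts). The function name and both thresholds are illustrative assumptions and do not come from the slides.

```python
def classify_scan(probes, ip_threshold=3, port_threshold=3):
    """Label the probes sent by one source IP.

    probes: iterable of (dst_ip, dst_port) pairs from a single source.
    ip_threshold / port_threshold: assumed cut-offs for "many" hosts/ports.
    """
    probes = list(probes)  # allow generators; we iterate twice
    distinct_ips = {ip for ip, _ in probes}
    distinct_ports = {port for _, port in probes}

    many_ips = len(distinct_ips) >= ip_threshold
    many_ports = len(distinct_ports) >= port_threshold

    if many_ips and many_ports:
        return "block"       # many ports on many hosts
    if many_ips:
        return "horizontal"  # same port(s) across many hosts
    if many_ports:
        return "vertical"    # many ports on few hosts
    return "none"
```

Because a port scanner needs to see the replies, its source address is real, so grouping probes by source IP is meaningful here; the same grouping would not work for spoofed SYN floods.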

Statistical IDS architecture
- Learning part
- Detection part

Data description: DARPA98 data
- The first standard corpus for the evaluation of network intrusion detection systems
- From the Information Systems Technology Group (IST) of MIT Lincoln Laboratory, under Defense Advanced Research Projects Agency (DARPA ITO) and Air Force Research Laboratory (AFRL/SNHS) sponsorship
- Seven weeks of training data
- Two weeks of detection data

Data description: DARPA98 data format
> : S ACK : (0) win 512 <mss 1460>
- Time stamp:
- Source IP address + port:
- Destination IP address + port:
- TCP flag: S (may be others: R, F, P)
- ACK flag: ACK
- Other part of packet header: : (0) win 512 <mss 1460>
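A record with these fields could be parsed roughly as follows. The slide's example line has lost its timestamp and address values, so the `SAMPLE` line, its exact layout, and the documentation-range addresses in it are made-up assumptions for illustration only; the real DARPA98 dump may differ in detail.

```python
import re

# Hypothetical record in a tcpdump-like text layout (NOT taken from the data set).
SAMPLE = "12:00:00.000001 192.0.2.10.1024 > 198.51.100.5.80: S 100:100(0) ack 1 win 512 <mss 1460>"

LINE_RE = re.compile(
    r"^(?P<ts>\S+)\s+"           # time stamp
    r"(?P<src>\S+)\s+>\s+"       # source IP address + port
    r"(?P<dst>[^\s:]+):\s+"      # destination IP address + port
    r"(?P<flags>[SFRP.]+)\s+"    # TCP flags: S, F, R, P ('.' means none)
    r"(?P<rest>.*)$"             # remainder of the packet header
)


def split_host_port(endpoint):
    """Split 'a.b.c.d.port' at the last dot."""
    host, _, port = endpoint.rpartition(".")
    return host, int(port) if port.isdigit() else port


def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    src_ip, src_port = split_host_port(match["src"])
    dst_ip, dst_port = split_host_port(match["dst"])
    return {
        "timestamp": match["ts"],
        "src_ip": src_ip, "src_port": src_port,
        "dst_ip": dst_ip, "dst_port": dst_port,
        "flags": match["flags"],
        "ack": re.search(r"\back\b", match["rest"], re.IGNORECASE) is not None,
        "rest": match["rest"],
    }
```

Calling `parse_line(SAMPLE)` yields one dictionary per packet, which is a convenient input for the per-interval metrics computed during feature extraction.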

Feature extraction
Calculate the metrics over every 5-minute interval of traffic. For example:
- SYN-SYN_ACK pair
- SYN-FIN + SYN-RSTactive pair
- Traffic volume
- SYN packet volume
- ……
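A minimal sketch of the per-interval counting is shown below. It assumes packets arrive as `(timestamp_seconds, flags, has_ack)` tuples, and it reads the "SYN-SYN_ACK pair" metric as the difference between SYN and SYN/ACK counts; both the tuple layout and that reading are assumptions, since the slides do not define the metrics precisely.

```python
from collections import Counter, defaultdict

WINDOW_SECONDS = 5 * 60  # 5-minute intervals, as on the slide


def window_metrics(packets):
    """Aggregate simple TCP counters per 5-minute window.

    packets: iterable of (timestamp_seconds, flags, has_ack), where flags is
    a string such as "S", "F", "R" or "P" and has_ack is a bool.
    Returns {window_index: {metric_name: value}}.
    """
    windows = defaultdict(Counter)
    for ts, flags, has_ack in packets:
        counts = windows[int(ts // WINDOW_SECONDS)]
        counts["packets"] += 1          # traffic volume (in packets)
        if "S" in flags:
            # SYN without ACK opens a connection; SYN with ACK answers one.
            counts["syn_ack" if has_ack else "syn"] += 1
        if "F" in flags:
            counts["fin"] += 1
        if "R" in flags:
            counts["rst"] += 1

    result = {}
    for index, counts in sorted(windows.items()):
        result[index] = {
            "packets": counts["packets"],
            "syn": counts["syn"],
            "syn_ack": counts["syn_ack"],
            "fin": counts["fin"],
            "rst": counts["rst"],
            # Assumed reading of the SYN-SYN_ACK metric: unanswered SYNs.
            "syn_minus_syn_ack": counts["syn"] - counts["syn_ack"],
        }
    return result
```

A large positive `syn_minus_syn_ack` in a window means many connection attempts went unanswered, which is the kind of imbalance both SYN floods and port scans tend to produce.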

Statistical method: Statistical based IDS
Goals: Use statistical metrics and algorithms to distinguish anomalous traffic from benign traffic, and to distinguish different types of attacks.
- Advantage: can detect unknown attacks
- Disadvantage: false positives and false negatives

Hidden Markov Model (HMM)
HMM is a very useful statistical learning model. It has been successfully applied to speech recognition.
- Advantage
1. Analyzes sequence data (represented by observation probabilities and transition probabilities)
2. Supports both unsupervised and supervised training
3. High accuracy
- Disadvantage
Comparatively long training time
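To illustrate how observation and transition probabilities combine over a sequence, here is a small sketch of the standard forward algorithm, which computes the likelihood of an observation sequence under a given HMM. The slides do not say which HMM procedure the project uses, so this is only a generic illustration; the function and parameter names are assumptions.

```python
def forward_likelihood(observations, start_prob, trans_prob, emit_prob):
    """Likelihood of an observation sequence under a discrete HMM.

    observations: list of observation symbol indices
    start_prob[s]: probability of starting in state s
    trans_prob[p][s]: probability of moving from state p to state s
    emit_prob[s][o]: probability of observing symbol o in state s
    """
    if not observations:
        return 1.0  # the empty sequence has probability 1

    n_states = len(start_prob)
    # alpha[s] = P(observations so far, current state = s)
    alpha = [start_prob[s] * emit_prob[s][observations[0]] for s in range(n_states)]
    for obs in observations[1:]:
        alpha = [
            sum(alpha[p] * trans_prob[p][s] for p in range(n_states)) * emit_prob[s][obs]
            for s in range(n_states)
        ]
    return sum(alpha)
```

In a detector, one could train one model per traffic class and assign a sequence of per-interval observations to the class whose model gives the higher likelihood; for long sequences, log probabilities or scaling would be needed to avoid numerical underflow.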

Double Gaussian model
Introduction
- Two Gaussian distribution models are used to represent two classes of behavior
- Compute the probability of the current behavior under each class's Gaussian parameters
- Compare them: the current behavior is assigned to the class with the larger probability
Training period
- Estimate the Gaussian parameters for the two classes
Detection period
- Use the two-class Gaussian parameters to compute the probabilities and compare them
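The training and detection periods can be sketched for a single metric as follows. This is a minimal one-dimensional version under assumed class labels ("benign" and "attack") and assumed function names; the slides do not state how many metrics are modeled or how they are combined. Log densities are compared instead of raw densities, which gives the same decision while avoiding underflow.

```python
import math


def fit_gaussian(values):
    """Training: estimate mean and variance of one class from its samples."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return mean, max(variance, 1e-9)  # floor avoids division by zero


def log_density(x, mean, variance):
    """Log of the Gaussian probability density at x."""
    return -0.5 * math.log(2 * math.pi * variance) - (x - mean) ** 2 / (2 * variance)


def train(benign_values, attack_values):
    """Training period: get the two-class Gaussian parameters."""
    return {
        "benign": fit_gaussian(benign_values),
        "attack": fit_gaussian(attack_values),
    }


def classify(x, params):
    """Detection period: pick the class with the larger density at x."""
    return max(params, key=lambda label: log_density(x, *params[label]))
```

For example, after `params = train([10, 12, 11, 9, 13], [100, 110, 95, 105, 90])`, a new interval with metric value 11 is assigned to the benign class and a value of 102 to the attack class. Each interval is judged on its own, which is why this model carries no sequence information.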

Double Gaussian model
Advantage
- Simple, easy to understand
- Fast
Disadvantage
- No sequence characteristic

Result analysis
Evaluation
- Important quantitative analysis: false positives + false negatives
- Look at the metric values and find the reasons behind the errors
- Repeat the experiments
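Counting false positives and false negatives is straightforward once each interval has a predicted and a true label. The sketch below assumes both are given as equal-length lists of booleans (`True` meaning "attack"); the function name and the returned keys are illustrative assumptions.

```python
def error_rates(predicted, actual):
    """Count false positives/negatives and their rates.

    predicted, actual: equal-length sequences of booleans, True = attack.
    """
    pairs = list(zip(predicted, actual))
    false_pos = sum(1 for p, a in pairs if p and not a)  # alarm on benign traffic
    false_neg = sum(1 for p, a in pairs if a and not p)  # missed attack
    negatives = sum(1 for _, a in pairs if not a)
    positives = sum(1 for _, a in pairs if a)
    return {
        "false_positives": false_pos,
        "false_negatives": false_neg,
        "fp_rate": false_pos / negatives if negatives else 0.0,
        "fn_rate": false_neg / positives if positives else 0.0,
    }
```

Listing which intervals were misclassified, next to their metric values, is a natural starting point for the second step of finding the reasons.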
