1
Statistical-based IDS: background introduction
2
Statistical IDS background
- Why do we do this project
- Attack introduction
- IDS architecture
- Data description
- Feature extraction
- Statistical method introduction
- Result analysis
3
Project goals
Related work
- The Internet faces various network attacks, including denial-of-service (DoS) attacks and port scans
- Overall traffic detection
- Flow-level detection
Our goals
- Detect both attacks at the same time
- Differentiate DoS attacks from port scans
4
Attack introduction
TCP SYN flooding
- An important form of DoS attack
- Exploits TCP's three-way handshake and the limited number of half-open connections a host can maintain
- Feature: spoofed source IP addresses
- Recent variant: reflected SYN/ACK flooding attacks
5
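Attack introduction
Port scan
- Horizontal scan (one port probed across many hosts)
- Vertical scan (many ports probed on one host)
- Block scan (many ports probed across many hosts)
Feature: real source IP address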
6
Statistical IDS architecture
- Learning part
- Detection part
7
Data description
DARPA98 data
- The first standard corpus for evaluating network intrusion detection systems
- From the Information Systems Technology Group (IST) of MIT Lincoln Laboratory
- Sponsored by the Defense Advanced Research Projects Agency (DARPA ITO) and the Air Force Research Laboratory (AFRL/SNHS)
- Seven weeks of training data
- Two weeks of detection data
8
Data description
DARPA98 data format
897048008.080700 172.16.114.169.1024 > 195.73.151.50.25: S ACK 1055330111:1055330111(0) win 512
- Timestamp: 897048008.080700
- Source IP address + port: 172.16.114.169.1024
- Destination IP address + port: 195.73.151.50.25
- TCP flag: S (others may appear: R, F, P)
- ACK flag: ACK
- Rest of the packet header: 1055330111:1055330111(0) win 512
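The field breakdown above can be sketched as a small parser. This is a minimal sketch, assuming every record follows the single-line layout of the example on the slide; the names `Packet`, `LINE_RE`, and `parse_line` are hypothetical and not part of the DARPA98 tooling.

```python
import re
from typing import NamedTuple, Optional


class Packet(NamedTuple):
    timestamp: float
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    flags: str   # TCP flag letters such as S, R, F, P
    ack: bool    # True when the ACK marker follows the flags


# Assumed layout: "<ts> <src_ip>.<sport> > <dst_ip>.<dport>: <flags> [ACK] ..."
LINE_RE = re.compile(
    r"^(?P<ts>\d+\.\d+)\s+"
    r"(?P<src>\d+\.\d+\.\d+\.\d+)\.(?P<sport>\d+)\s+>\s+"
    r"(?P<dst>\d+\.\d+\.\d+\.\d+)\.(?P<dport>\d+):\s+"
    r"(?P<flags>[SFRP.]+)"
    r"(?P<ack>\s+ACK)?"
)


def parse_line(line: str) -> Optional[Packet]:
    """Return a Packet, or None when the line does not match the assumed layout."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return Packet(
        timestamp=float(m["ts"]),
        src_ip=m["src"],
        src_port=int(m["sport"]),
        dst_ip=m["dst"],
        dst_port=int(m["dport"]),
        flags=m["flags"],
        ack=m["ack"] is not None,
    )


sample = ("897048008.080700 172.16.114.169.1024 > 195.73.151.50.25: "
          "S ACK 1055330111:1055330111(0) win 512")
packet = parse_line(sample)
```

Lines that do not match return `None` rather than raising, so a caller can skip non-TCP or malformed records.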
9
Feature extraction
Calculate the metrics over every 5 minutes of traffic
Metrics, for example:
- SYN-SYN_ACK pair
- SYN-FIN + SYN-RST active pair
- Traffic volume
- SYN packet volume
- ……
Good Luck
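One way to compute per-window metrics is sketched below. It assumes packets arrive as `(timestamp, flags, ack)` tuples, and it interprets the SYN-SYN_ACK metric as the count of SYNs minus the count of SYN/ACKs in the window; that interpretation, the function name `window_metrics`, and the dictionary keys are assumptions for illustration, not definitions taken from the slides.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

WINDOW_SECONDS = 300  # 5-minute windows, as stated on the slide


def window_metrics(
    packets: Iterable[Tuple[float, str, bool]]
) -> Dict[int, Dict[str, int]]:
    """Group packets into 5-minute windows and count a few simple metrics."""
    counts = defaultdict(lambda: {"syn": 0, "syn_ack": 0, "total": 0})
    for ts, flags, ack in packets:
        window = int(ts // WINDOW_SECONDS)  # index of the 5-minute window
        c = counts[window]
        c["total"] += 1                     # traffic volume in packets
        if "S" in flags:
            if ack:
                c["syn_ack"] += 1           # SYN/ACK reply
            else:
                c["syn"] += 1               # initial SYN
    # Assumed metric: SYNs not matched by a SYN/ACK in the same window.
    return {
        w: dict(c, unanswered_syn=c["syn"] - c["syn_ack"])
        for w, c in counts.items()
    }


example = [(0.0, "S", False), (1.0, "S", True), (2.0, "S", False), (301.0, "F", False)]
metrics = window_metrics(example)
```

A large `unanswered_syn` value in a window would be consistent with SYN flooding, though the threshold is left to the statistical model.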
10
Statistical method
Statistical-based IDS
Goals: use statistical metrics and algorithms to differentiate anomalous traffic from benign traffic, and to differentiate between types of attacks.
- Advantage: can detect unknown attacks
- Disadvantage: false positives and false negatives
11
Hidden Markov Model (HMM)
The HMM is a widely used statistical learning model. It has been applied successfully to speech recognition.
- Advantages
  1. Analyzes sequence data (represented by observation probabilities and transition probabilities)
  2. Supports both unsupervised and supervised training
  3. High accuracy
- Disadvantage
  Comparatively long training time
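To show how observation and transition probabilities combine on sequence data, here is a minimal sketch of the standard forward algorithm, which computes the likelihood of an observation sequence under a given HMM. The two-state model and all probability values below are made up for illustration; the slides do not specify the model used in the project.

```python
from typing import List, Sequence


def forward_likelihood(
    obs: Sequence[int],
    start: List[float],
    trans: List[List[float]],
    emit: List[List[float]],
) -> float:
    """Likelihood of obs under an HMM (forward algorithm, no scaling)."""
    if not obs:
        return 1.0
    n = len(start)
    # alpha[s] = P(observations so far, current hidden state = s)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][o]
            for s in range(n)
        ]
    return sum(alpha)


# Illustrative values only: state 0 = benign, state 1 = attack;
# symbol 0 = low SYN volume, symbol 1 = high SYN volume.
START = [0.9, 0.1]
TRANS = [[0.9, 0.1], [0.2, 0.8]]   # transition probabilities
EMIT = [[0.8, 0.2], [0.1, 0.9]]    # observation probabilities

likelihood_quiet = forward_likelihood([0, 0, 0], START, TRANS, EMIT)
likelihood_burst = forward_likelihood([1, 1, 1], START, TRANS, EMIT)
```

For long sequences the unscaled products underflow, so a practical version would rescale `alpha` at each step or work in log space.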
12
Double Gaussian model
Introduction
- Two Gaussian distributions represent two classes of behavior
- Compute the probability of the current behavior under each class's Gaussian parameters
- Compare them; the current behavior is assigned to the class with the larger probability
Training period
- Estimate the Gaussian parameters for each class
Detection period
- Use the two-class Gaussian parameters to compute the probabilities and compare them
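The training and detection periods can be sketched for a single metric as follows. This is a minimal one-dimensional sketch, assuming each class has at least two training samples with non-zero variance; the function names and sample values are hypothetical. It compares log densities instead of raw probabilities, which gives the same decision while avoiding numeric underflow.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def fit_gaussian(samples: Sequence[float]) -> Tuple[float, float]:
    """Training period: estimate (mean, standard deviation) for one class."""
    return mean(samples), stdev(samples)


def log_density(x: float, mu: float, sigma: float) -> float:
    """Log of the Gaussian probability density at x."""
    z = (x - mu) / sigma
    return -math.log(sigma) - 0.5 * math.log(2 * math.pi) - 0.5 * z * z


def classify(x: float, benign: Tuple[float, float], attack: Tuple[float, float]) -> str:
    """Detection period: pick the class with the larger density."""
    if log_density(x, *attack) > log_density(x, *benign):
        return "attack"
    return "benign"


# Made-up training values for one metric (e.g. SYN packets per window).
benign_params = fit_gaussian([10, 12, 11, 9, 13])
attack_params = fit_gaussian([200, 220, 180, 210, 190])

label_low = classify(11, benign_params, attack_params)
label_high = classify(205, benign_params, attack_params)
```

Each window is classified independently, so the order of windows plays no role in the decision.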
13
Double Gaussian model
Advantages
- Simple, easy to understand
- Fast
Disadvantage
- Does not capture sequence characteristics
14
Result analysis
Evaluation
- Key quantitative measures: false positives and false negatives
- Look at the metric values and find the reasons for errors
- Repeat the experiments
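The two quantitative measures can be computed as rates once each window has a ground-truth label and a detector decision. This is a minimal sketch; the function name `error_rates` and the boolean encoding (True = attack) are assumptions.

```python
from typing import Sequence, Tuple


def error_rates(truth: Sequence[bool], predicted: Sequence[bool]) -> Tuple[float, float]:
    """Return (false positive rate, false negative rate); True means attack."""
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)
    negatives = sum(1 for t in truth if not t)   # benign windows
    positives = len(truth) - negatives           # attack windows
    fp_rate = fp / negatives if negatives else 0.0
    fn_rate = fn / positives if positives else 0.0
    return fp_rate, fn_rate


truth = [True, True, False, False, False]
predicted = [True, False, True, False, False]
fp_rate, fn_rate = error_rates(truth, predicted)
```

Reporting both rates together matters because a detector can trivially drive one to zero by always answering "attack" or always answering "benign".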