Statistical-based IDS: background introduction
Statistical IDS background
- Why do we do this project
- Attack introduction
- IDS architecture
- Data description
- Feature extraction
- Statistical method introduction
- Result analysis
Project goals
Related work
- The Internet sees various network attacks, including denial-of-service (DoS) attacks and port scans
- Overall traffic detection
- Flow-level detection
Our goals
- Detect both attacks at the same time
- Differentiate DoS attacks from port scans
Attack introduction: TCP SYN flooding
- An important form of DoS attack
- Exploits TCP's three-way handshake mechanism and its limited capacity for maintaining half-open connections
- Feature: spoofed source IP address
- Recent variant: reflected SYN/ACK flooding attacks
Attack introduction: Port scan
- Horizontal scan
- Vertical scan
- Block scan
Feature: real source IP address
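The three scan types can be told apart by how many distinct hosts and ports one source probes: a horizontal scan hits the same port on many hosts, a vertical scan hits many ports on one host, and a block scan does both. The sketch below is only an illustration of that idea; the function name `classify_scan`, the `(dst_ip, dst_port)` record layout, and the `threshold` value are assumptions, not part of the project.

```python
def classify_scan(probes, threshold=3):
    """Label the probes sent by ONE source IP as a scan type.

    `probes` is an iterable of (dst_ip, dst_port) pairs (assumed layout).
    `threshold` is a hypothetical cut-off for "many" hosts or ports.
    """
    probes = list(probes)  # allow generators; we iterate twice
    hosts = {ip for ip, _ in probes}
    ports = {port for _, port in probes}
    many_hosts = len(hosts) >= threshold
    many_ports = len(ports) >= threshold
    if many_hosts and many_ports:
        return "block"       # many ports on many hosts
    if many_hosts:
        return "horizontal"  # same port(s) across many hosts
    if many_ports:
        return "vertical"    # many ports on one or few hosts
    return "none"
```

In practice the threshold would have to be tuned on the training data rather than fixed in advance.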
Statistical IDS architecture
- Learning part
- Detection part
Data description: DARPA98 data
- The first standard corpus for evaluating network intrusion detection systems
- From the Information Systems Technology Group (IST) of MIT Lincoln Laboratory
- Sponsored by the Defense Advanced Research Projects Agency (DARPA ITO) and the Air Force Research Laboratory (AFRL/SNHS)
- Seven weeks of training data
- Two weeks of detection data
Data description: DARPA98 data format
> : S ACK : (0) win
- Time stamp
- Source IP address + port
- Destination IP address + port
- TCP flag: S (may also be R, F, or P)
- ACK flag: ACK
- Other part of packet header: : (0) win 512
Feature extraction
Calculate the metrics over every 5 minutes of traffic
Metrics, for example:
- SYN-SYN_ACK pair
- SYN-FIN + SYN-RST active pair
- Traffic volume
- SYN packet volume
- ……
Good Luck
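As a rough illustration of per-interval feature extraction, the sketch below groups packets into 5-minute windows and counts a few of the metrics named above. The `(timestamp, flags)` record layout, the function name `extract_features`, and the reading of the SYN-SYN_ACK metric as "SYN count minus SYN/ACK count" are all assumptions made for the example, not the project's actual definitions.

```python
from collections import Counter, defaultdict

WINDOW_SECONDS = 5 * 60  # the 5-minute interval used for the metrics


def extract_features(packets):
    """Count simple per-window metrics.

    `packets` is an iterable of (timestamp_seconds, flags) tuples, where
    `flags` is a set of TCP flag names such as {"SYN"} or {"SYN", "ACK"}.
    This layout is assumed for illustration only.
    """
    windows = defaultdict(Counter)
    for timestamp, flags in packets:
        counts = windows[int(timestamp // WINDOW_SECONDS)]
        counts["total"] += 1
        if "SYN" in flags and "ACK" in flags:
            counts["syn_ack"] += 1
        elif "SYN" in flags:
            counts["syn"] += 1

    features = {}
    for index, c in sorted(windows.items()):
        features[index] = {
            "traffic_volume": c["total"],
            "syn_volume": c["syn"],
            # Assumed reading of the SYN-SYN_ACK metric: unanswered SYNs.
            "syn_minus_syn_ack": c["syn"] - c["syn_ack"],
        }
    return features
```

A large positive `syn_minus_syn_ack` in one window would suggest many handshakes that were never answered, which is the pattern a SYN flood or a scan tends to leave.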
Statistical method: statistical-based IDS
Goals: use statistical metrics and algorithms to differentiate anomalous traffic from benign traffic, and to differentiate between types of attacks.
- Advantage: can detect unknown attacks
- Disadvantage: false positives and false negatives
Hidden Markov Model (HMM)
The HMM is a widely used statistical learning model. It has been applied successfully to speech recognition.
Advantages
1. Analyzes sequence data (represented by observation probabilities and transition probabilities)
2. Supports both unsupervised and supervised training
3. High accuracy
Disadvantage
- Comparatively long training time
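To show how observation and transition probabilities combine over a sequence, here is a minimal sketch of the standard forward algorithm, which computes the likelihood of an observation sequence under a given HMM. The two hidden states, the "low"/"high" observation symbols, and every probability value are made-up toy numbers, not parameters learned from the DARPA98 data.

```python
def forward_likelihood(observations, start_prob, trans_prob, emit_prob):
    """Return P(observations | model) using the forward algorithm."""
    states = list(start_prob)
    # Initialisation: start in state s and emit the first observation.
    alpha = {s: start_prob[s] * emit_prob[s][observations[0]] for s in states}
    # Recursion: move to state s from any previous state p, then emit obs.
    for obs in observations[1:]:
        alpha = {
            s: emit_prob[s][obs] * sum(alpha[p] * trans_prob[p][s] for p in states)
            for s in states
        }
    # Termination: sum over all possible final states.
    return sum(alpha.values())


# Toy model (all values hypothetical).
START = {"benign": 0.9, "attack": 0.1}
TRANS = {
    "benign": {"benign": 0.9, "attack": 0.1},
    "attack": {"benign": 0.2, "attack": 0.8},
}
EMIT = {
    "benign": {"low": 0.8, "high": 0.2},
    "attack": {"low": 0.1, "high": 0.9},
}

likelihood = forward_likelihood(["low", "high", "high"], START, TRANS, EMIT)
```

One way such a likelihood could be used for detection is to flag sequences that a model trained on benign traffic finds very unlikely; whether the project does this is not stated in the slides.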
Double Gaussian model
Introduction
- Two Gaussian distributions are used to represent two classes of behavior
- Compute two probabilities for the current behavior, one from each class's Gaussian parameters
- Compare them: the current behavior is assigned to the class with the larger probability
Training period
- Estimate the Gaussian parameters of the two classes
Detection period
- Use the two classes' Gaussian parameters to compute the probabilities and compare them
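The training and detection periods described above can be sketched for a single metric as follows. This is a minimal one-dimensional illustration, assuming each class is summarised by a mean and a standard deviation; the function names and the sample values are hypothetical. It compares log densities instead of raw densities, which gives the same decision but avoids numerical underflow.

```python
import math
from statistics import mean, pstdev


def fit_gaussian(samples):
    """Training period: estimate (mean, standard deviation) for one class."""
    return mean(samples), pstdev(samples)


def log_density(x, mu, sigma):
    """Log of the Gaussian probability density at x."""
    return (
        -((x - mu) ** 2) / (2 * sigma ** 2)
        - math.log(sigma)
        - 0.5 * math.log(2 * math.pi)
    )


def classify(x, params):
    """Detection period: pick the class whose Gaussian gives x the larger density."""
    return max(params, key=lambda label: log_density(x, *params[label]))


# Hypothetical per-window SYN counts for the two classes.
params = {
    "benign": fit_gaussian([10, 12, 11, 9, 13]),
    "attack": fit_gaussian([95, 100, 110, 105, 90]),
}
label = classify(98, params)
```

Each class needs at least two distinct training values here, since a zero standard deviation would make the density undefined.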
Double Gaussian model
Advantages
- Simple, easy to understand
- Fast
Disadvantage
- Does not capture sequence characteristics
Result analysis
Evaluation
- Key quantitative measures: false positives and false negatives
- Examine the metric values and find the reasons behind them
- Repeat the experiments
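For the quantitative part of the evaluation, the false positive and false negative rates can be computed from per-interval decisions and ground-truth labels. The sketch below assumes both are given as equal-length lists of booleans (True meaning "attack"); the function name `error_rates` and this input layout are assumptions for illustration.

```python
def error_rates(predicted, actual):
    """Return (false_positive_rate, false_negative_rate).

    `predicted` and `actual` are equal-length sequences of booleans,
    where True marks an interval as an attack (assumed encoding).
    """
    pairs = list(zip(predicted, actual))
    false_pos = sum(1 for p, a in pairs if p and not a)
    false_neg = sum(1 for p, a in pairs if not p and a)
    negatives = sum(1 for _, a in pairs if not a)
    positives = sum(1 for _, a in pairs if a)
    # Guard against division by zero when a class is absent.
    fp_rate = false_pos / negatives if negatives else 0.0
    fn_rate = false_neg / positives if positives else 0.0
    return fp_rate, fn_rate
```

Reporting both rates together matters because a detector can trivially drive one of them to zero at the expense of the other.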