
1 Benchmarking Anomaly-based Detection Systems. Ashish Gupta, Network Security, May 2004

2 Overview
– The motivation for this paper: the Waldo example
– The approach
– Structure in data
– Generating the data and anomalies
– Injecting anomalies
– Results
  – Training and testing: the method
  – Scoring
  – Presentation
  – The ROC curves: somewhat obvious

3 Motivation Does anomaly detection depend on the regularity/randomness of the data?

4 Where’s Waldo!

5 [Waldo image slide]

6 [Waldo image slide]
7 The aim
Hypothesis:
– Differences in data regularity affect anomaly detection
– Different environments → different regularity
Regularity:
– Highly redundant or random?
– Example of an environment's effect:
  010101010101010101010101
  or
  0100011000101000100100101

8 Consequences
– One IDS: different false-alarm rates in different environments
– Need a custom system/training for each environment?
– Temporal effects: regularity may vary over time

9 Structure in data Measuring randomness

10 010101010101010101010101 Or 0100011000101000100100101 Measuring Randomness Relative EntropySequential Dependence + Conditional Relative Entropy

11 The benchmark datasets
Three types:
– Training data (the background data)
– Anomalies
– Testing data (background + anomalies)
Generating the sequences (see the sketch below):
– 5 sets, each set → 11 files (of increasing regularity)
– Each set → a different alphabet size
– Alphabet size → decides complexity
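The slide does not spell out how the sequences are produced; one plausible sketch is a noisy deterministic cycle, where a `regularity` knob sets the fraction of deterministic transitions. The mechanism, function name, and parameters are assumptions, not the paper's procedure.

```python
import random

def generate(alphabet, regularity, length, seed=0):
    """Background data with tunable regularity: with probability
    `regularity` follow a fixed deterministic cycle through the
    alphabet, otherwise pick a symbol uniformly at random."""
    rng = random.Random(seed)
    nxt = {s: alphabet[(i + 1) % len(alphabet)]
           for i, s in enumerate(alphabet)}       # fixed successor table
    out = [alphabet[0]]
    for _ in range(length - 1):
        if rng.random() < regularity:
            out.append(nxt[out[-1]])              # regular transition
        else:
            out.append(rng.choice(alphabet))      # random transition
    return "".join(out)

# One "set": 11 files of increasing regularity over a fixed alphabet
files = [generate("AB", i / 10, 10_000, seed=i) for i in range(11)]
```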

12 Anomaly generation
What's a surprise?
– Different from the expected probability
Types:
– Juxtapositional: different arrangements of data
  001001001001001001111
– Temporal: unexpected periodicities
– Other types?

13 Types in this paper (the spaced-out symbols mark the anomaly):
– Foreign symbol: AAABABBBABAB C BBABABBA
– Foreign n-gram: AAABABAABAABAAABB B BA
– Rare n-gram: AABBBABBBABBBABBBABBBABB AA
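As a quick illustration of this taxonomy, a sketch that buckets a test n-gram into the three types against training data; the rarity cutoff `rare_count` is an assumption, not the paper's definition.

```python
from collections import Counter

def classify(gram, training_data, rare_count=1):
    """Place an n-gram into the slide's three-way taxonomy."""
    symbols = set(training_data)
    n = len(gram)
    grams = Counter(training_data[i:i + n]
                    for i in range(len(training_data) - n + 1))
    if any(s not in symbols for s in gram):
        return "foreign symbol"    # e.g. a 'C' never seen in training
    if gram not in grams:
        return "foreign n-gram"    # known symbols, never-seen arrangement
    if grams[gram] <= rare_count:
        return "rare n-gram"       # seen, but very infrequently
    return "normal"

print(classify("AAC", "AAABABBBABABBBABABBA"))  # foreign symbol
```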

14 Injecting anomalies
– Make sure anomalies account for no more than 0.24% of the test data
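A minimal injection sketch, assuming anomalies are planted at random positions by overwriting the background and that the 0.24% cap applies to the fraction of anomalous symbols; overlapping injections may slightly reduce the effective count.

```python
import random

def inject(background, anomaly, max_rate=0.0024, seed=0):
    """Plant copies of an anomalous subsequence into the background,
    keeping anomalous symbols at no more than max_rate of the total.
    Returns the test data and per-symbol ground-truth labels."""
    rng = random.Random(seed)
    count = max(1, int(len(background) * max_rate) // len(anomaly))
    data, truth = list(background), [0] * len(background)
    for _ in range(count):
        i = rng.randrange(len(data) - len(anomaly))
        data[i:i + len(anomaly)] = anomaly              # overwrite in place
        truth[i:i + len(anomaly)] = [1] * len(anomaly)  # mark ground truth
    return "".join(data), truth
```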

15 The experiments
– The hypothesis is true

16 The hypothesis:
– The nature of the "normal" background noise affects signal detection
The anomaly detector:
– Detects anomalous subsequences
– Learning phase → n-gram probability table
– Unexpected event → anomaly!
– The anomaly threshold decides the level of surprise
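A toy version of such a detector, sketched under assumptions: the surprise scale (0 for the most common training n-gram, 1 for one never seen) is my own mapping onto the slides' 0-to-1 threshold, not the paper's exact formula.

```python
from collections import Counter

class NGramDetector:
    """Sliding-window n-gram detector in the spirit of slides 16-17."""

    def __init__(self, n=3):
        self.n = n

    def train(self, data):
        """Learning phase: build the n-gram probability table."""
        grams = (data[i:i + self.n] for i in range(len(data) - self.n + 1))
        counts = Counter(grams)
        total = sum(counts.values())
        self.table = {g: c / total for g, c in counts.items()}
        self.max_p = max(self.table.values())

    def surprise(self, gram):
        # 0 for the most common training n-gram, 1 for an unseen one
        return 1.0 - self.table.get(gram, 0.0) / self.max_p

    def detect(self, data, threshold):
        """Indices of n-grams whose surprise exceeds the threshold."""
        return [i for i in range(len(data) - self.n + 1)
                if self.surprise(data[i:i + self.n]) > threshold]

det = NGramDetector(n=3)
det.train("AAABABABBABAABBBABAB" * 50)
print(det.detect("ABABAACBABA", threshold=0.9))  # flags the windows with 'C'
```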

17 Example of anomaly detection

  n-gram  probability
  AAA     0.12
  AAB     0.13
  ABA     0.20
  BAA     0.17
  BBB     0.15
  BBA     0.12

AAC → ANOMALY! (not in the table)

18 Scoring
Event outcomes:
– Hits
– Misses
– False alarms
Threshold:
– Decides the level of surprise
– 0 → completely unsurprising, 1 → astonishing
– Needs to be calibrated

19 Presentation of results
Presents two aspects:
– % correct detections
– % false detections
The detector operates through a range of sensitivities:
– Higher sensitivity → more hits, but also more false alarms
– Need the right sensitivity
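Sweeping the threshold and scoring each setting yields the ROC points the next slide plots. This sketch builds on the NGramDetector and inject() sketches above and assumes per-symbol ground-truth labels with at least one anomalous and one normal symbol.

```python
def roc_points(detector, test_data, truth, thresholds):
    """One (false-alarm rate, hit rate) point per sensitivity setting."""
    pos = sum(truth)
    neg = len(truth) - pos
    points = []
    for t in thresholds:
        flagged = set()
        for i in detector.detect(test_data, t):
            flagged.update(range(i, i + detector.n))  # flag the whole window
        hits = sum(truth[i] for i in flagged)
        points.append(((len(flagged) - hits) / neg, hits / pos))
    return points

# e.g. curve = roc_points(det, test, truth, [i / 10 for i in range(11)])
```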

20 [Figure: the ROC curves]

21 Interpretation None of the curves overlap → regularity affects detection!

22 What does this mean?
– Detection metrics are data dependent
– Cannot say: "My XYZ product will flag 75% of anomalies with a 10% false-alarm rate!"
– "Sir, are you sure?"

23 Real-world data Regularity index of system calls for different users

24 Is this surprising? What about network traffic?

25 Conclusions
– Data structure → anomaly detection effectiveness
– Evaluation is data dependent

26 Conclusions
– A change in regularity → a different system, or changed parameters

27 Quirks?
– Assumes rather naïve detection systems
  – "Simple retraining will not suffice"; an intelligent detector could take this into account
– What really is an anomaly?
  – If the data is highly irregular, won't randomness produce some anomalies by itself?
– Anomaly is a relative term
  – Here the anomalies are generated independently of the background

