1 Impact of IT Monoculture on Behavioral End Host Intrusion Detection. Dhiman Barman (UC Riverside/Juniper), Jaideep Chandrashekar (Intel Research), Nina Taft (Intel Research), Michalis Faloutsos (UC Riverside/stopthehacker.com), Ling Huang (Intel Research), Frederic Giroire (INRIA)
2 Problem: How should we configure behavioral HIDS across an enterprise? Enterprise laptops run HIDS Each device can have its own threshold Key question: does “one size fit all”? Users Firewall Enterprise Internet SysAdmin Server HIDS = Host Intrusion Detection Systems
3 Motivation: so far, monoculture! Why? We polled sysadmins: "easier to manage"; no method for setting thresholds otherwise; results are harder to interpret without a monoculture. Term: monoculture = homogeneous
4 Contributions We challenge the practice of monoculture Measure enterprise behavior: 350 laptops We observe that User behavior is diverse Diversity is better than monoculture in HIDS We propose a new approach: partial diversity A little diversity goes a long way!
5 Roadmap What you would expect…
6 Our data collection User traffic: 350 laptops of enterprise employees 5 weeks in Q1 of 2007 Collected all packet headers Collection tool runs on laptop Malicious traffic: Collected traces from machines with known botnets on them
7 Measured key detection features We study features used in real systems Selection of features is an orthogonal question
8 Threat Models #1: Attacker knows nothing about user behavior #2: Attacker monitors user behavior and builds histograms of behavior for typical HIDS feature Attacker cannot know the instantaneous value of a feature, only its histogram Attacker selects volume of malicious traffic to “hide” inside normal traffic
9 Defining the optimization goal Far from obvious: FN (False Negatives) vs FP (False Positives), that is, failing to detect vs false alarms. Our Utility provides a flexible definition. Sysadmins need to decide this. User i has threshold Ti; w is the relative importance of FN versus FP
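The utility formula itself did not survive on this slide, so the following is only a minimal sketch of one plausible form: a weighted sum of the two error rates, subtracted from 1. The function names, the weighted-sum form, and the rule "raise an alarm when the feature value exceeds the threshold" are assumptions made for illustration, not the authors' definition.

```python
def rates_at_threshold(benign, attacked, threshold):
    """Return (FN rate, FP rate) for one user at one threshold.

    Assumption: an alarm fires when a window's feature value exceeds the threshold.
    benign   -- feature values from windows with normal traffic only
    attacked -- feature values from windows with malicious traffic added
    """
    fp = sum(v > threshold for v in benign) / len(benign)        # false alarms
    fn = sum(v <= threshold for v in attacked) / len(attacked)   # missed attacks
    return fn, fp


def utility(fn_rate, fp_rate, w):
    """Assumed form: U = 1 - (w * FN + (1 - w) * FP), with w in [0, 1]."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    return 1.0 - (w * fn_rate + (1.0 - w) * fp_rate)
```

Under this assumed form, w close to 1 penalizes missed detections most, while w close to 0 penalizes false alarms most.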
10 Results, at last…
11 User behavior varies a lot! Focus on the tail behavior of users 99%, 99.9% Spans 4 orders of magnitude
12 What about other features? All features vary a lot!
13 Different users could detect different types of attacks Is the feature activity correlated? Not necessarily Conclusion: All users are important Synthesizing alarms is non-trivial Some users are "light" in terms of the maximum number of UDP connections, but "heavy" in TCP connections
14 An uber-policy for enterprise diversity We propose a tunable policy Monoculture: one threshold for all Full diversity: one threshold per user Partial diversity: one threshold per group We use 8 groups Partial diversity subsumes the other two a key question: grouping users
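One way to see why partial diversity subsumes the other two policies is to treat every policy as a partition of the users into groups: one group gives monoculture, one group per user gives full diversity. The sketch below assumes each group's threshold is a percentile of that group's pooled samples; the pooling rule, the nearest-rank percentile, and the names `thresholds_for_policy` and `samples` are illustrative assumptions, not taken from the slides.

```python
import math


def percentile(values, q):
    """Nearest-rank percentile: smallest value with at least q% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100.0))
    return ordered[rank - 1]


def thresholds_for_policy(samples_by_user, groups, q=99.0):
    """Assign every user the q-th percentile of their group's pooled samples (assumed rule)."""
    thresholds = {}
    for group in groups:
        pooled = [v for user in group for v in samples_by_user[user]]
        t = percentile(pooled, q)
        for user in group:
            thresholds[user] = t
    return thresholds


# Hypothetical toy data: three users with very different activity levels.
samples = {
    "alice": [1, 2, 3, 4],
    "bob": [10, 20, 30, 40],
    "carol": [100, 200, 300, 400],
}
users = list(samples)
monoculture = [users]                      # one group for everyone
full_diversity = [[u] for u in users]      # one group per user
partial = [["alice", "bob"], ["carol"]]    # something in between
```

With the toy data, the monoculture threshold is driven entirely by the heaviest user, which is the situation the talk argues against.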
15 Partial Diversity: grouping Our goal here: show that there exists a grouping with good results for diversity. k-means clustering did not work well: the distribution is skewed, with a wide and continuous spread. Heuristic: follow the nature of the distribution: the top 15% split into 4 subgroups, the bottom 85% split into 4 subgroups. Experimented with 2, 3, 5, 8 groups. We show only the 8-group case (best results)
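The grouping heuristic above can be sketched as follows. The slide gives the 15%/85% split and the four subgroups per side, but not how each side is subdivided; equal-size subgroups by rank are an assumption here, as are the function names and the use of a single per-user tail value as the sort key.

```python
def split_evenly(items, parts):
    """Split a list into `parts` contiguous chunks whose sizes differ by at most one."""
    size, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def partial_diversity_groups(tail_by_user, top_fraction=0.15, subgroups=4):
    """Rank users by tail value, then split the bottom and top blocks into subgroups.

    tail_by_user -- hypothetical mapping from user id to a tail statistic
                    (for example a 99th-percentile feature value)
    Returns 2 * subgroups lists of user ids, lightest users first.
    """
    ranked = sorted(tail_by_user, key=tail_by_user.get)
    n_top = max(1, round(len(ranked) * top_fraction))
    bottom, top = ranked[:-n_top], ranked[-n_top:]
    return split_evenly(bottom, subgroups) + split_evenly(top, subgroups)
```

With very few users some subgroups come out empty; a real deployment would need to handle that case.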
16 Evaluation approach Train using real data. Test with malicious traces superimposed. Evaluation method: train on the previous week to obtain thresholds, then apply those thresholds to the current week. Interesting: weekly thresholds vary! A 99th-percentile threshold from the previous week does not guarantee a 1% false positive rate this week
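The train-then-apply step, and the reason a 99th-percentile threshold does not carry its 1% false positive rate over to the next week, can be illustrated with a small sketch. The nearest-rank percentile, the alarm rule (value strictly above the threshold), and the synthetic weeks are assumptions chosen for illustration; they are not the authors' data or code.

```python
import math


def percentile(values, q):
    """Nearest-rank percentile: smallest value with at least q% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100.0))
    return ordered[rank - 1]


def realized_fp_rate(train_week, test_week, q=99.0):
    """Learn a threshold on last week's benign values, then count alarms on this week's."""
    threshold = percentile(train_week, q)
    alarms = sum(v > threshold for v in test_week)
    return threshold, alarms / len(test_week)


# Synthetic example: this week the user is slightly more active than last week.
last_week = list(range(1, 101))           # values 1..100
this_week = [v + 5 for v in last_week]    # values 6..105
threshold, fp = realized_fp_rate(last_week, this_week)
```

In this toy case the threshold learned on last week's values is 99, and six of this week's hundred values exceed it, so the realized false positive rate is 6% rather than the nominal 1%.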
17 Diversity is good Partial diversity is almost as good as full diversity! For w= 0.4, recall:
18 What if w varies? Still good.
19 Limiting the attacker’s opportunity: measuring the stealth traffic Naïve attacker will be detected Clever attacker will be “limited”
20 Conclusions Time to revisit the question of diversity. Diversity can offer benefits. We propose Partial Diversity: striking the balance in a tunable way. Our work is a first step in providing a framework to compare initial techniques for establishing thresholds
21 Future Work Fine-tune the different parts: user grouping in the partial diversity approach; utility function for users and network. Select and use multiple features together. Deploy the approach in a real network