High-Performance Network Anomaly/Intrusion Detection & Mitigation System (HPNAIDM) Yan Chen Lab for Internet & Security Technology (LIST) Department of.

Presentation transcript:

High-Performance Network Anomaly/Intrusion Detection & Mitigation System (HPNAIDM) Yan Chen Lab for Internet & Security Technology (LIST) Department of Electrical Engineering and Computer Science Northwestern University

The Spread of Sapphire/Slammer Worms

Current Intrusion Detection Systems (IDS)
Mostly host-based and not scalable to high-speed networks
 –Slammer worm infected 75,000 machines in <10 mins
 –Host-based schemes are inefficient and user dependent
  »Have to install IDS on all user machines!
Mostly simple signature-based
 –Cannot recognize unknown anomalies/intrusions
 –New viruses/worms, polymorphism

Current Intrusion Detection Systems (II)
Statistical detection
 –Unscalable for flow-level detection
  »IDS vulnerable to DoS attacks
 –Overall traffic based: inaccurate, high false positives
Cannot differentiate malicious events from unintentional anomalies
 –Anomalies can be caused by network element faults
 –E.g., router misconfiguration, link failures, etc.

High-Performance Network Anomaly/Intrusion Detection and Mitigation System (HPNAIDM)
Attached to a router/switch as a black box
Edge network detection particularly powerful
[Figure: three deployment diagrams, panels (a) to (c), captioned "Original configuration", "Monitor each port separately" and "Monitor aggregated traffic from all ports". Components shown: Internet, router, switch, LANs, scan port, splitter, HPNAIDM system.]

Features of HPNAIDM
Online traffic recording [ACM SIGCOMM IMC 2004, IEEE INFOCOM 2006]
 –Reversible sketch for data streaming computation
 –Record millions of flows (GB traffic) in a few hundred KB
 –Small number of memory accesses per packet
 –Scalable to large key space size (2^32 or 2^64)
Online sketch-based flow-level anomaly detection [IEEE ICDCS 2006] [IEEE CG&A, Security Visualization 06]
 –Adaptively learn the traffic pattern changes
 –As a first step, detect TCP SYN flooding, horizontal and vertical scans even when mixed
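To make the idea of recording many flows in a small, fixed amount of memory concrete, here is a minimal Python sketch of a count-min style sketch. It is not the reversible sketch from the cited papers: the class name `SimpleSketch`, the salted BLAKE2b hashing, and the default 4 x 1024 sizing are all assumptions chosen for illustration.

```python
import hashlib


class SimpleSketch:
    """Count-min style sketch: `rows` hash rows of `width` counters each."""

    def __init__(self, rows=4, width=1024):
        self.rows = rows
        self.width = width
        self.table = [[0] * width for _ in range(rows)]

    def _bucket(self, row, key):
        # One hash function per row, derived by salting BLAKE2b with the row index.
        digest = hashlib.blake2b(key, digest_size=8, salt=bytes([row])).digest()
        return int.from_bytes(digest, "big") % self.width

    def update(self, key, value=1):
        # Exactly one counter is touched per row, so the per-packet cost is fixed.
        for r in range(self.rows):
            self.table[r][self._bucket(r, key)] += value

    def estimate(self, key):
        # With non-negative updates, collisions only inflate counters,
        # so the minimum over the rows is the tightest available estimate.
        return min(self.table[r][self._bucket(r, key)] for r in range(self.rows))


sketch = SimpleSketch()
for _ in range(5):
    sketch.update(b"10.0.0.1:80")  # hypothetical flow key, e.g. five SYN packets
print(sketch.estimate(b"10.0.0.1:80"))
```

Memory use depends only on `rows * width`, not on the number of distinct flow keys, which is the property the slide relies on. Recovering the heavy keys from the counters is the part the reversible sketch adds and is not attempted here.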

Features of HPNAIDM (II)
Integrated approach for false positive reduction
Polymorphic worm detection (Hamsa) [IEEE Symposium on Security and Privacy 2006]
Network element fault Diagnostics with Operational Determinism (ODD) [ACM SIGMETRICS 2006, poster paper]
HPNAIDM: first flow-level intrusion detection that can sustain tens of Gbps bandwidth, even for worst-case traffic of 40-byte packet streams
Patents have been filed or are currently being filed for most technologies.

HPNAIDM Architecture
[Figure: architecture diagram. Parts: Part I, sketch-based monitoring & detection; Part II, per-flow monitoring & detection. Modules: reversible sketch monitoring, sketch-based statistical anomaly detection (SSAD), filtering, per-flow monitoring, signature-based detection, polymorphic worm detection (Hamsa), network fault diagnosis (ODD). Labeled flows: streaming packet data, local sketch records (sent out for aggregation), remote aggregated sketch records, normal flows, suspicious flows, keys of normal flows, keys of suspicious flows, intrusion or anomaly alarms. Legend: data path, control path, modules on the critical path, modules on the non-critical path.]

Research methodology Combination of theory, synthetic/real trace driven simulation, and real-world implementation and deployment

Hamsa: Fast Signature Generation for Zero-day Polymorphic Worms with Provable Attack Resilience Zhichun Li, Manan Sanghi, Yan Chen, Ming-Yang Kao and Brian Chavez Northwestern University

Desired Requirements for Polymorphic Worm Signature Generation
Network based, no host-level info
Noise tolerant
 –Most network flow classifiers suffer false positives.
 –Even host-based IDSes, such as honeynets, can be injected with noise.
Attack resilience
 –Attackers always try to evade the IDS
Efficient signature matching for high-speed links
No existing work satisfies these requirements!

Outline Motivation Hamsa Design Model-based Signature Generation Evaluation Related Work Conclusion

Choice of Signatures
Two classes of signatures
 –Content based
  »Token: a substring with reasonable coverage of the suspicious traffic
  »Signatures: conjunction and/or sequence of tokens
 –Behavior based
Our choice: content based
 –Fast signature matching: ASIC-based approaches can achieve 6 to 8 Gb/s
 –Generic: does not depend on any protocol or server

Unique Invariants of Worms
Protocol frame
 –Makes the server branch down the code path to the vulnerable part, which is usually infrequently used
 –Code-Red II: ‘.ida?’ or ‘.idq?’
Control data: leading to control flow hijacking
 –Hard-coded value to overwrite a jump target or a function call
 –Example: ATPhttpd exploit, wu-ftp exploit
Worm executable payload
 –CLET polymorphic engine: ‘0\x8b’, ‘\xff\xff\xff’ and ‘t\x07\xeb’
Possible to have worms with no such invariants, but very hard

Hamsa Architecture

Hamsa Design
Key idea: model the uniqueness of worm invariants
 –Greedy algorithm for finding token conjunction signatures
Highly accurate while much faster
 –Both analytically and experimentally
 –Compared with the latest work, Polygraph
 –Suffix array based token extraction
  »Uses less than 20% of the space and is at least 20 times faster
Provable attack resilience guarantee
 –Propose an adversary model
Noise tolerant

Hamsa Signature Generator Core part: Model-based Greedy Signature Generation Iterative approach for multiple worms Signature refinement for better specificity –False positive is worse than false negative

Outline Motivation Hamsa Design Model-based Signature Generation Evaluation Related Work Conclusion

Problem Formulation
Noisy Token Multiset Signature Generation Problem:
INPUT: suspicious pool M and normal traffic pool N; value ρ < 1.
OUTPUT: a multiset-of-tokens signature S = {(t_1, n_1), ..., (t_k, n_k)} that maximizes the coverage in the suspicious pool while keeping the false positive in the normal pool below ρ.
Without noise, a polynomial-time algorithm exists
With noise, the problem is NP-hard
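As an illustration of what a token multiset signature means operationally, the following Python sketch checks whether a payload contains each token at least the required number of times, and measures the match rate over a pool. The helper names `matches` and `match_rate` and the sample payloads are hypothetical; the signature itself is the Code-Red II signature reported on the results slide of this deck.

```python
def matches(signature, payload):
    """True if every token t occurs at least n times in the payload.

    bytes.count counts non-overlapping occurrences, which is the
    (assumed) counting rule used in this sketch.
    """
    return all(payload.count(tok) >= n for tok, n in signature.items())


def match_rate(signature, pool):
    """Fraction of flows in the pool that the signature matches."""
    return sum(matches(signature, p) for p in pool) / len(pool)


CODE_RED_SIG = {b".ida?": 1, b"%u780": 1, b" HTTP/1.0\r\n": 1, b"GET /": 1, b"%u": 2}

# Hypothetical pools: one worm-like flow plus one noise flow, and two benign flows.
suspicious_pool = [
    b"GET /default.ida?XXXX%u9090%u7801 HTTP/1.0\r\n",
    b"GET /index.html HTTP/1.0\r\n",
]
normal_pool = [
    b"GET /index.html HTTP/1.0\r\n",
    b"POST /login HTTP/1.0\r\n",
]

coverage = match_rate(CODE_RED_SIG, suspicious_pool)       # coverage in the suspicious pool
false_positive = match_rate(CODE_RED_SIG, normal_pool)     # false positive in the normal pool
print(coverage, false_positive)
```

In the problem statement's terms, `coverage` is the quantity to maximize and `false_positive` is the quantity that must stay below ρ.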

Model Uniqueness of Invariants
Let the worm have a set of invariants; determine their order by:
t_1: the token with minimum false positive in normal traffic. u(1) is the upper bound of the false positive of t_1
t_2: the token with minimum joint false positive with t_1. FP({t_1, t_2}) is bounded by u(2)
t_i: the token with minimum joint false positive with {t_1, t_2, ..., t_{i-1}}. FP({t_1, t_2, ..., t_i}) is bounded by u(i)
The total number of tokens is bounded by k*

Signature Generation Algorithm
[Figure, step 1: tokens extracted from the suspicious pool and ordered by coverage, each labeled (COV, FP): (82%, 50%), (70%, 5%), (67%, 30%), (62%, 15%), (50%, 25%), (41%, 55%), (36%, 41%), (12%, 9%). With u(1) = 10%, one token is selected as t_1.]

Signature Generation Algorithm
[Figure, step 2: the step-1 list, followed by the tokens ordered by joint coverage with t_1, each labeled (COV, FP): (69%, 4.8%), (68%, 4.5%), (67%, 1%), (40%, 2.5%), (35%, 12%), (31%, 9%), (10%, 0.5%). With u(2) = 2%, one token is selected as t_2 and added to the signature.]
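The greedy selection walked through on these two slides can be sketched in Python as follows. This is a simplified illustration, not the authors' implementation: the function name, the toy pools, the tie-breaking rule (first token wins) and the stopping rule (return as soon as the joint false positive drops to ρ, give up if no token fits a bound) are assumptions.

```python
def greedy_signature(tokens, suspicious, normal, u_bounds, rho):
    """Pick tokens greedily: at step i take the token with the highest joint
    coverage among those whose joint false positive is at most u(i)."""
    signature = []
    sus, nor = list(suspicious), list(normal)  # flows still matched by the signature
    for bound in u_bounds:                     # at most k* = len(u_bounds) tokens
        best = None
        for tok in tokens:
            if tok in signature:
                continue
            cov = sum(tok in s for s in sus) / len(suspicious)
            fp = sum(tok in n for n in nor) / len(normal)
            if fp <= bound and (best is None or cov > best[1]):
                best = (tok, cov, fp)
        if best is None:
            return None                        # no token satisfies this u-bound
        signature.append(best[0])
        sus = [s for s in sus if best[0] in s]
        nor = [n for n in nor if best[0] in n]
        if best[2] <= rho:
            return signature                   # joint false positive is low enough
    return None


# Hypothetical toy data: two worm-like flows plus one noise flow.
TOKENS = [b"GET /", b".ida?", b"%u780"]
SUSPICIOUS = [b"GET /a.ida?XX%u7801", b"GET /b.ida?YY%u7801", b"GET /index.html"]
NORMAL = [b"GET /index.html", b"GET /img.png", b"POST /form",
          b"GET /about", b"GET /help.ida?topic"]

print(greedy_signature(TOKENS, SUSPICIOUS, NORMAL, u_bounds=[0.3, 0.1], rho=0.05))
```

In this toy run the highest-coverage token, `GET /`, is rejected at step 1 because its false positive exceeds u(1), mirroring how the (82%, 50%) token on the slide is skipped under u(1) = 10%.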

Algorithm Analysis
Runtime analysis: O(T*(|M|+|N|))
Provable attack resilience guarantee
 –Analytically bound the worst an attacker can do!
 –False negative: [bound formula not preserved in the transcript]
 –Example: k* = 5, u(1) = 0.2, u(2) = 0.08, u(3) = 0.04, u(4) = 0.02 and u(5) = 0.01
 –The better the flow classifier, the lower the false negatives
Noise ratio | FP upper bound | FN upper bound
5% | 1% | 1.84%
10% | 1% | 3.89%
20% | 1% | 8.75%

Attack Resilience Assumptions
Common assumptions for any signature generation system
 1. The attacker cannot control which worm samples are encountered by Hamsa
 2. The attacker cannot control which worm samples encountered will be classified as worm samples by the flow classifier
Unique assumptions for token-based schemes
 1. The attacker cannot change the frequency of tokens in normal traffic
 2. The attacker cannot control which normal samples encountered are classified as worm samples by the worm flow classifier

Improvements to the Basic Approach
Generalizing signature generation
 –Provide flexibility in the tradeoff between signature coverage and false positives
 –Define a scoring function, score(cov, fp, …), to evaluate the goodness of a signature
Iteratively use the single-worm detector to detect multiple worms
 –At the first iteration, the algorithm finds the signature for the most popular worm in the suspicious pool.
 –All other worms and normal traffic are treated as noise.

Outline Motivation Hamsa Design Model-based Signature Generation Evaluation Related Work Conclusion

Experiment Methodology
Experimental setup:
 –Suspicious pool:
  »Three pseudo-polymorphic worms based on real exploits (Code-Red II, Apache-Knacker and ATPhttpd)
  »Two polymorphic engines from the Internet (CLET and TAPiON)
 –Normal pool: 2-hour departmental HTTP trace (326MB)
Signature evaluation:
 –False negative: 5000 generated worm samples per worm
 –False positive:
  »4-day departmental HTTP trace (12.6 GB)
  »3.7GB of web crawling, including .mp3, .rm, .ppt, .pdf, .swf, etc.
  »/usr/bin of Linux Fedora Core 4

Results on Signature Quality
Single worm with noise
 –Suspicious pool size: 100 and 200 samples
 –Noise ratio: 0%, 10%, 30%, 50%, 70%
 –Noise samples randomly picked from the normal pool
 –We always get the signatures and accuracy in the table, except in the cases on the next slide
Worms | Training FN | Training FP | Evaluation FN | Evaluation FP | Binary evaluation FP | Signature
Code-Red II | [not preserved] | [not preserved] | [not preserved] | [not preserved] | [not preserved] | {'.ida?': 1, '%u780': 1, ' HTTP/1.0\r\n': 1, 'GET /': 1, '%u': 2}
CLET | 0 | 0.109% | [not preserved] | [not preserved] | 0.268% | {'0\x8b': 1, '\xff\xff\xff': 1, 't\x07\xeb': 1}

Results on Signature Quality (II)
Suspicious pool with high noise ratio:
 –For noise ratios of 50% and 70%, we sometimes produce two signatures: one is the true worm signature, the other comes solely from noise.
 –The false positives of these noise signatures have to be very small:
  »Mean: 0.09%
  »Maximum: 0.7%
Multiple worms with noise give similar results

Speed Results
Implementation with a hybrid of C++/Python
 –500 samples with 20% noise and a 326MB normal traffic pool: 15 seconds on a Xeon 2.8 GHz, 50MB memory consumption
Speed comparison with Polygraph
 –Asymptotic runtime: O(T) vs. O(|M|^2); as |M| increases, T does not increase as fast as |M|!
 –Experimental: 64 to 361 times faster (Polygraph vs. ours, both in Python)
[Table: speed-up ratio by suspicious pool size |M| and noise ratio (20%, 30%, 40%, 50%). Only a fragment of the cell values survives in the transcript: 15074(64) (70) (361).]

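Related works
[Table comparing Hamsa, Polygraph, CFG, PADS, Nemean, COVERS and Malware Detection. Cells that spanned several systems appear only once in the transcript, so the values are listed per row in their original left-to-right order.]
Network or host based: Network, Host
Content or behavior based: Content based, Behavior based, Content based, Behavior based
Noise tolerance: Yes, Yes (slow), Yes, No, Yes
Multi worms in one protocol: Yes, Yes (slow), Yes, No, Yes
On-line sig matching: Fast, Slow, Fast, Slow
Generality: General purpose, Protocol specific, Server specific, General purpose
Provable atk resilience: Yes, No
Information exploited: [symbols not preserved in the transcript]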

Conclusion
Network-based signature generation and matching are important but challenging
Hamsa: automated signature generation
 –Fast
 –Noise tolerant
 –Provable attack resilience
 –Capable of detecting multiple worms in a single application protocol
Proposed a model to describe the worm invariants

Backup Slides

Motivation: Desired requirements for polymorphic worm signature generation
Network-based signature generation
 –Worms spread at exponential speed, so detecting them at an early stage is crucial… However
  »At their early stage there are limited worm samples.
 –The high-speed network router may see more worm samples… But
  »Need to keep up with the network speed!
  »Can only use network-level information

Token-fit Attack Can Make Polygraph Fail
Polygraph: hierarchical clustering to find signatures with the smallest false positives
The attacker can potentially obtain the token distribution of the noise in the suspicious pool
The attacker can then make the worm samples look more like noise traffic
 –Different worm samples encode different noise tokens
Our approach can still work!

Token-fit attack could make Polygraph fail
[Figure: noise samples N1, N2, N3 and worm samples W1, W2, W3 are merged into Merge Candidate 1, Merge Candidate 2 and Merge Candidate 3, which CANNOT merge further. NO true signature found!]

Generalizing Signature Generation with noise
BEST Signature = Balanced Signature
 –Balance the sensitivity with the specificity
 –But how?
Create notation: scoring function score(cov, fp, …) to evaluate the goodness of a signature
 –Currently used: [formula not preserved in the transcript]
  »Intuition: it is better to reduce the coverage by 1/a if the false positive becomes 10 times smaller.
  »Add some weight to the length of the signature (LEN) to break ties between signatures with the same coverage and false positive
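The slide's actual scoring formula is not present in the transcript. The Python sketch below shows one hypothetical form that is consistent with the stated intuition: a base-10 logarithm of the false positive, so that a 10 times smaller false positive is worth one point, which equals 1/a of coverage when coverage is weighted by a. The exact shape and the default values of `a`, `b` and `delta` are assumptions, not the authors' parameters.

```python
import math


def score(cov, fp, length, a=20.0, b=0.01, delta=1e-6):
    """Hypothetical balanced score: higher is better.

    -log10(delta + fp): +1 for each 10x reduction in false positive
                        (delta avoids log(0) when fp is exactly 0).
    a * cov:            1/a of coverage is worth the same +1.
    b * length:         small weight on signature length, used only as a tie-breaker.
    """
    return -math.log10(delta + fp) + a * cov + b * length


# Two hypothetical candidates: (coverage, false positive, number of tokens).
wide = score(0.70, 0.05, 3)     # more coverage, more false positives
narrow = score(0.67, 0.005, 3)  # 3% less coverage, 10x fewer false positives
print(wide, narrow)
```

With `a = 20`, giving up 3% of coverage (less than 1/a = 5%) for a tenfold drop in false positive raises the score, so the second candidate is preferred.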

Extension to multiple worms
Iteratively use the single-worm detector to detect multiple worms
 –At the first iteration, the algorithm finds the signature for the most popular worm in the suspicious pool. All other worms and normal traffic are treated as noise.
 –Although the analysis for a single worm also applies to multiple worms, the bounds are not very promising. Reason: high noise ratio

Experiment: Sample requirement
Coincidental-pattern attack [Polygraph]
Results
 –For the three pseudo worms, 10 samples give good results.
 –CLET and TAPiON need at least 50 samples
Conclusion
 –For better signatures, to be conservative, at least 100+ samples are needed
Requires scalable and fast signature generation!

Implementation details
Token extraction: extract a set of tokens with minimum length l_min and minimum coverage COV_min.
 –Polygraph uses a suffix tree based approach: 20n space and time consuming.
 –Our approach: enhanced suffix array, 4n space and much faster (at least 20 times)!
Calculate the false positive when checking U-bounds
 –Again a suffix array based approach, but for a 300MB normal pool, a 1.2GB suffix array is still large!
 –Optimization: using MMAP, memory usage: 150 ~ 250MB

Token Extraction
Extract a set of tokens with minimum length l_min and coverage COV_min. For each token, output the frequency vector.
Polygraph uses a suffix tree based approach: 20n space and time consuming.
Our approach:
 –Enhanced suffix array: 4n space
 –Much faster, at least 50 (UPDATE) times!
 –Can also be applied to Polygraph.
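To show what token extraction produces, here is a deliberately naive Python sketch that enumerates substrings directly. It does not use a suffix tree or an enhanced suffix array, so it has none of the space or speed properties discussed on the slide, and it returns only per-token coverage rather than a frequency vector. The function name and sample pool are hypothetical.

```python
def extract_tokens(pool, min_len, min_cov):
    """Return {token: coverage} for every substring of length >= min_len
    that appears in at least a min_cov fraction of the flows in the pool."""
    flow_counts = {}
    for sample in pool:
        seen = set()  # count each substring at most once per flow
        for i in range(len(sample)):
            for j in range(i + min_len, len(sample) + 1):
                seen.add(sample[i:j])
        for sub in seen:
            flow_counts[sub] = flow_counts.get(sub, 0) + 1
    needed = min_cov * len(pool)
    return {tok: n / len(pool) for tok, n in flow_counts.items() if n >= needed}


# Hypothetical suspicious pool: two worm-like flows and one noise flow.
pool = [b"xx.ida?yy", b"zz.ida?ww", b"hello"]
tokens = extract_tokens(pool, min_len=3, min_cov=0.6)
print(sorted(tokens))
```

The quadratic number of substrings per flow is exactly the cost that an index structure such as a suffix array is meant to avoid; this version is only suitable for tiny inputs.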

Calculate the false positive
We need the false positive to check the U-bounds
Again a suffix array based approach, but for a 300MB normal pool, a 1.2GB suffix array is still large!
Improvements
 –Caching
 –MMAP suffix array. True memory usage: 150 ~ 250MB.
 –2-level normal pool
 –Hardware-based fast string matching
 –Compress the normal pool and run string matching algorithms directly over compressed strings

Experiment: Attacks
We propose a new attack: token-fit.
 –The attacker may study the noise inside the suspicious pool
 –Create a worm sample W_i that shares more tokens with some normal traffic noise sample N_i
 –This stalls the hierarchical clustering used in [Polygraph]
 –BUT we can still generate the correct signature!

Experiment: U-bound evaluation
To be conservative we chose k* = 15.
 –Even if we assume every token has a 70% false positive, their conjunction still has only a 0.5% false positive. In practice, very few tokens exceed 70% false positive.
Define u(1) and u_r to generate the U-bounds
 –We tested: u(1) = [0.02, 0.04, 0.06, 0.08, 0.10, 0.20, 0.30, 0.40, 0.5] and u_r = [0.20, 0.40, 0.60, 0.8]. The minimum (u(1), u_r) that works for all our worms was (0.08, 0.20)
 –In practice, we use the conservative value (0.15, 0.5)
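The 0.5% figure can be checked with a few lines of Python, assuming the tokens produce false positives independently so that the conjunction's false positive is the product of the individual ones. The second function is an assumption about how u(1) and u_r generate the remaining bounds: a geometric sequence u(i) = u(1) * u_r^(i-1). The transcript does not preserve the actual generation rule, so that function is only one plausible reading.

```python
K_STAR = 15

# Fifteen tokens, each with a 70% false positive, assumed independent.
worst_case_fp = 0.7 ** K_STAR
print(f"{worst_case_fp:.4%}")


def u_bounds(u1, ur, k):
    """Hypothetical geometric U-bounds: u(i) = u1 * ur**(i - 1) for i = 1..k."""
    return [u1 * ur ** i for i in range(k)]


# The conservative setting quoted on the slide, under the geometric assumption.
print(u_bounds(0.15, 0.5, 4))
```

The product is roughly 0.0047, which rounds to the 0.5% quoted on the slide.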