Towards Scalable and Robust Distributed Intrusion Alert Fusion with Good Load Balancing Zhichun Li, Yan Chen and Aaron Beach Lab for Internet & Security.

Presentation transcript:

Towards Scalable and Robust Distributed Intrusion Alert Fusion with Good Load Balancing Zhichun Li, Yan Chen and Aaron Beach Lab for Internet & Security Technology (LIST) Northwestern University

2 The Spread of CodeRed

3 Distributed IDSes Distributed Intrusion Detection Systems (IDSes) –Crucial to identify large-scale attacks early –Robust to various scan techniques –Locate the attackers/zombies when spoofed –E.g., Symantec has 20,000 sensors in 180 countries General architecture –IDS nodes »Generate the alarms »Heterogeneous: host- or network-based –Sensor fusion centers (SFCs) »Fuse the alarms »A subset of IDSes or dedicated hosts

4 Desired Features of DIDS Infrastructure Scalability –15 million daily intrusion alerts reported to DShield Route only related alarms to the same SFC –Over 18,000 vulnerabilities found [CERT] –17,500 Win32 threats and their variants [Symantec] –Hierarchical fusion cannot scale w/ diverse alerts Distributed queries over multiple SFCs Good load balancing Attack resiliency

5 Outline Motivation CDDHT Design Features of CDDHT Evaluation Related Work Conclusion

6 Cyber Disease Distributed Hash Tables (CDDHT) General intrusion alert fusion framework that can plug in any alert generation or alert fusion algorithm Part of the Router-based Anomaly/Intrusion Detection and Mitigation (RAIDM) system in LIST –High-speed network measurement with reversible sketches [IMC 2004, INFOCOM 2006] –Online flow-level anomaly/intrusion detection [IEEE ICDCS 2006] [IEEE CG&A, Security Visualization 06] –Router-based polymorphic worm signature generation [IEEE Symposium on Security and Privacy 2006]

7 CDDHT Design Leverage DHT systems –O(log(n)) hops distance where n is the # of nodes –O(log(n)) maintenance overhead for routing –Guaranteed success for deterministic routing –Fault-tolerant, robust, and DoS attack resilient –Becoming increasingly popular for serious use »E.g., the eMule P2P system uses Kademlia Primitives of CDDHT –Put (disease key, symptom report) –Summary report = Get (disease key)

8 Architecture of CDDHT –IDS node ID: "0" + sha2(IP of the IDS) –IDS + SFC node ID: "1" + sha2(IP of the IDS) [Figure: IDS nodes and IDS + SFC nodes on the Internet, showing DIDS coverage and the points where attacks are injected]
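
The node ID scheme on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes "sha2" means SHA-256, that the prefix is prepended as a character to the hex digest, and that IPs are given as dotted-quad strings. The function name `node_id` and the example address are hypothetical.

```python
import hashlib

def node_id(ip: str, is_sfc: bool) -> str:
    """Build a prefixed node ID: "1" for IDS + SFC nodes, "0" for plain IDS nodes.

    Assumption: "sha2" on the slide is taken to be SHA-256, and the prefix is
    prepended to the hex digest as a single character.
    """
    digest = hashlib.sha256(ip.encode("ascii")).hexdigest()
    prefix = "1" if is_sfc else "0"
    return prefix + digest

# Hypothetical example address (documentation range).
ids_only = node_id("192.0.2.10", is_sfc=False)
ids_sfc = node_id("192.0.2.10", is_sfc=True)
```

With a prefix like this, all SFC-capable nodes fall in one half of the ID space, so a key that starts with "1" would map only to nodes able to fuse alarms.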

9 Disease Key Design Challenge: fuse the vast, diverse symptoms from heterogeneous IDSes with different views –Key generation in a decentralized and deterministic manner Key idea: generate the disease keys which capture the uniqueness of certain attacks Focus on popular types of attacks Improve with features –Load balancing –Attack resilience

10 The Disease Key Currently, model four types of attacks Extensible design
Intrusion       ID    Characterization field(s)                                              Length
DoS attack      000   Victim IP (subnet)                                                     35 bits
Scans           001   0 (for vertical & block scan), Source IP                               36 bits
                      1 (for horizontal & coordinated scan), Dest port #,
                        then Src IP (horizontal scan) or 0 (coordinated scan)                52 bits
Viruses/Worms   010   0 (for known), Worm ID (32 bit)                                        36 bits
                      1 (for unknown), Dst port #                                            20 bits
Botnets         011   00 (for DDNS entry), Botnet ID (32 bit)                                37 bits
                      01 (for URL entry), Botnet ID (32 bit)                                 37 bits
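
The key layout on this slide can be sketched as bit strings. This is a minimal illustration under stated assumptions: ports are 16 bits, IPv4 addresses are 32 bits, the coordinated-scan key pads the source-IP field with zeros, and the 32-bit worm and botnet IDs come from a truncated SHA-256 (the slides say only "hash"). All function names are hypothetical, and subnet masking of the victim IP is not shown.

```python
import hashlib
import ipaddress

def bits(value: int, width: int) -> str:
    """Render a non-negative integer as a fixed-width bit string."""
    if value >> width:
        raise ValueError(f"{value} does not fit in {width} bits")
    return format(value, f"0{width}b")

def ip_bits(ip: str) -> str:
    return bits(int(ipaddress.IPv4Address(ip)), 32)

def hash32(text: str) -> int:
    # Assumption: the slides do not name the hash; truncated SHA-256 is used here.
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest()[:4], "big")

def dos_key(victim_ip: str) -> str:
    return "000" + ip_bits(victim_ip)                         # 35 bits

def vertical_or_block_scan_key(src_ip: str) -> str:
    return "001" + "0" + ip_bits(src_ip)                      # 36 bits

def horizontal_scan_key(dst_port: int, src_ip: str) -> str:
    return "001" + "1" + bits(dst_port, 16) + ip_bits(src_ip)  # 52 bits

def coordinated_scan_key(dst_port: int) -> str:
    # Assumption: the source-IP field is zero-filled for coordinated scans.
    return "001" + "1" + bits(dst_port, 16) + bits(0, 32)      # 52 bits

def known_worm_key(worm_name: str) -> str:
    return "010" + "0" + bits(hash32(worm_name), 32)          # 36 bits

def unknown_worm_key(dst_port: int) -> str:
    return "010" + "1" + bits(dst_port, 16)                   # 20 bits

def botnet_key(entry: str, is_url: bool) -> str:
    flag = "01" if is_url else "00"                           # 00 = DDNS, 01 = URL
    return "011" + flag + bits(hash32(entry), 32)             # 37 bits
```

Because every field is derived only from the alert itself, two IDSes that observe the same attack would compute the same key without coordinating.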

11 Port Scan Disease Key Design Vertical scan and block scan –Source IP Horizontal scan and Coordinated scan –Scan port –Horizontal: + Source IP

12 Viruses/Worms and Botnets Disease Key Design Viruses/Worms –Known worms: hash of the worm name –Unknown worms: worm scan port # Botnets –Assume botnets use centralized C&C –IRC based bots: dynamic DNS –Web based bots: URL –Botnet ID = hash of the DDNS or URL

13 Outline Motivation CDDHT Design Features of CDDHT Evaluation Related Work Conclusion

14 Load Balancing Challenges to load balancing –Large key space in DHT –Highly skewed alert distribution [Figure panels: number of ports picked; number of subnets picked]

15 Load Balancing II Proactive balancing with stable hot spots –Reduce key space of port # to 7 bits –64 buckets for the 64 most popular port #s –Remaining 64 buckets randomly assigned to other port #s Balancing load of the key space –Node migration –Virtual node –Load-aware bootstrap Balancing load of single hot key –IDS alarm rate limiting Aggregation tree for large-scale attacks –Alarms received by the final SFC are bounded by O(log(n))
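
The 7-bit port encoding can be sketched as a bucket function: up to 64 popular ports get dedicated buckets 0 to 63, and every other port is spread over buckets 64 to 127. This is an illustration only. The slides say the remaining buckets are "randomly assigned"; a deterministic hash is assumed here so that every IDS maps a port to the same bucket. `make_port_bucketer` and the sample port list are hypothetical.

```python
import hashlib

def make_port_bucketer(popular_ports):
    """Return a function mapping a 16-bit port to a 7-bit bucket (0..127)."""
    if len(popular_ports) > 64:
        raise ValueError("at most 64 popular ports get dedicated buckets")
    # Buckets 0..63: one dedicated bucket per popular port.
    dedicated = {port: index for index, port in enumerate(popular_ports)}

    def bucket(port: int) -> int:
        if port in dedicated:
            return dedicated[port]
        # Buckets 64..127: assumption, a hash stands in for "random" assignment
        # so the mapping is identical on every node.
        digest = hashlib.sha256(port.to_bytes(2, "big")).digest()
        return 64 + digest[0] % 64

    return bucket

# Hypothetical list of hot ports; the real list would come from measured alerts.
bucket = make_port_bucketer([80, 445, 135])
```

Giving each hot port its own bucket keeps two hot ports from landing on the same key, while the long tail of rarely scanned ports shares the other half of the space.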

16 Attack Resilience DoS resilience comparison with hierarchical model –Proved the average number of alerts unreachable to their corresponding SFCs given one node loss »Hierarchical DIDS: O(log (n)) »CDDHT: O(1) More in the paper –Authenticity of alarms –Dealing with compromised nodes

17 Outline Motivation CDDHT Design Features of CDDHT Evaluation Related Work Conclusion

18 Methodology Implementation –Preliminary CDDHT system based on Chord simulator –Event-driven simulation »Each alarm is an event with a timestamp from certain IDSes Datasets –DShield firewall logs (Jan. 2004) –Results from each day's data are similar –Use January 2, 2004 as illustration »25 million scan logs from 1,417 providers »Randomly choose 10% to be SFCs [Table: # of scans per scan type (Vertical, Horizontal, Block, Coordinated)]

19 Evaluation Metrics Fusion effectiveness –100% due to deterministic routing of CDDHT Load balancing –Consider number of alerts received at each SFC –Maximum vs. mean ratio (MMR) –Coefficient of variation (CV)
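
The two load-balancing metrics can be computed from the per-SFC alert counts as sketched below. The slides do not say whether CV uses the population or sample standard deviation; the population form is assumed here, and the function names are hypothetical.

```python
import statistics

def max_mean_ratio(loads):
    """MMR: the heaviest SFC load divided by the mean load (1.0 is perfectly even)."""
    return max(loads) / statistics.fmean(loads)

def coefficient_of_variation(loads):
    """CV: standard deviation of the loads divided by their mean (0.0 is perfectly even).

    Assumption: population standard deviation.
    """
    return statistics.pstdev(loads) / statistics.fmean(loads)

# Hypothetical alert counts at four SFCs.
even_loads = [10, 10, 10, 10]
skewed_loads = [0, 0, 0, 40]
```

For `even_loads` both metrics take their minimum values (MMR 1.0, CV 0.0); for `skewed_loads`, where one SFC receives everything, MMR equals the number of SFCs (4.0).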

20 Proactive Balancing with Stable Hot Ports Proactive load balancing can reduce CV by 60% and reduce MMR by 40%

21 The Load Variation Comparison Between Hierarchical Scheme and CDDHT Median, 10th and 90th percentile of 10 runs CDDHT with proactive balancing (PB) and virtual nodes (VN) Compared with hierarchical schemes, CDDHT reduces the MMR by a factor of 5.5 and CV by a factor of 5.2 [Figure: two panels, each comparing Hierarchical, CDDHT, CDDHT w/ PB, and CDDHT w/ PB+VN]

22 Outline Motivation CDDHT Design Features of CDDHT Evaluation Related Work Conclusion

23 Related Works
Compared models: CDDHT; Centralized/Hierarchical Model; Publish/Subscribe Model; P2P Querying
Failure/attack resilience: High, Low, High
Fusion overhead: Low, High, Low
Query overhead: Low, High
WormShield uses DHT specifically to find popular content fingerprints as worm signatures, but does not work for polymorphic worms

24 Conclusion The large number and diversity of alerts from many distributed IDSes call for efficient fusion of these alerts CDDHT: Cyber Disease DHT –Efficiently routes alarms of different intrusions to different SFCs –Highly scalable and robust –Good load balancing –High attack resilience Future work –Disease keys for more types of attacks and querying of CDDHT

Backup Slides

26 Introduction to DHT DHT (Distributed Hash Table): An infrastructure that enables the distribution of an ordinary hash table onto a set of cooperating nodes
Key      Object
0x2535   "Apple"
0x2353   "Banana"
0x3978   "Peach"
0x9123   "Strawberry"
0x7234   "Grape"
0x5942   "Watermelon"
Nodes: node A, node B, node C, node D. Each node only stores part of the hash table
Basic operations –Put(Key, Object): use the Key to find the corresponding node via DHT routing and store the Object on that node –Object = Get(Key): use the Key to find the corresponding node via DHT routing and retrieve the Object from that node
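
The Put/Get behaviour can be sketched with a toy, single-process table in which each key belongs to the first node clockwise from the key's hash on a ring. This is an illustration only: a real DHT such as Chord finds that node by multi-hop routing between machines, whereas the sketch looks it up directly. `ToyDHT`, `ring_hash`, and the use of truncated SHA-256 are assumptions.

```python
import bisect
import hashlib

def ring_hash(text: str) -> int:
    """Place a node name or key on a 64-bit ring (assumption: truncated SHA-256)."""
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest()[:8], "big")

class ToyDHT:
    def __init__(self, node_names):
        # Sort nodes by ring position so the successor of a key can be found by bisection.
        self.ring = sorted((ring_hash(name), name) for name in node_names)
        self.positions = [position for position, _ in self.ring]
        # Each node holds only its own part of the table.
        self.store = {name: {} for name in node_names}

    def responsible_node(self, key: str) -> str:
        # First node at or after the key's position, wrapping around the ring.
        index = bisect.bisect_left(self.positions, ring_hash(key)) % len(self.ring)
        return self.ring[index][1]

    def put(self, key: str, obj) -> None:
        self.store[self.responsible_node(key)][key] = obj

    def get(self, key: str):
        return self.store[self.responsible_node(key)].get(key)

dht = ToyDHT(["node A", "node B", "node C", "node D"])
dht.put("0x2535", "Apple")
dht.put("0x2353", "Banana")
```

Calling `dht.get("0x2535")` returns the stored object because Put and Get resolve the same key to the same node.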

27 Introduction to DHT II Different DHT systems –Chord –CAN –Pastry –Tapestry –Kademlia »Kademlia has been used in the eMule P2P software DHT routing –Distributed and deterministic routing –The max hops to find the node corresponding to a key is bounded by O(log(n)) [Figure: Chord DHT routing]

28 DoS Attack Disease Key Design Most DoS attacks target specific IP addresses (the server) or the subnet (bandwidth-consuming attacks) But the victim IP (subnet) can be the destination or the source (in backscatter) All other fields can vary

29 Related Works Centralized/Hierarchical Model Publish/subscribe Model –O(n^2) communication vs. O(n) P2P Query –Scalability with frequent fusion

30 Attack Resilience DoS resilience comparison with hierarchical model –Proved the average number of disconnected nodes given one node loss »in a k-way hierarchical DIDS is O(log(n)) »but in the DHT-based design is O(1) Authenticity of alarms –Validate the source subnets of IDSes using Whois and BGP tables –Use PKI to verify the messages sent by IDSes/SFCs

31 Attack Resilience II Dealing with compromised nodes –IDS nodes »Vote on the importance of the results by # of IDSes and IP coverage »Probability-based verification for alarm aggregation –SFC nodes »The "trust but verify" principle »Envision a centralized authority that randomly checks the fusion results of the SFCs

32 Proactive Balancing with Stable Hot Ports Using a 7-bit encoding can reduce MMR by 60% and reduce CV by 40%

33 Dynamics of Load Variation over Time MMR for CDDHT is much smaller and smoother CV also improves