Streaming Algorithms for Robust, Real-Time Detection of DDoS Attacks: S. Ganguly, M. Garofalakis, R. Rastogi, K. Sabnani

Presentation transcript:

Streaming Algorithms for Robust, Real-Time Detection of DDoS Attacks
S. Ganguly (Indian Inst. of Tech., India), M. Garofalakis (Yahoo! Research, USA), R. Rastogi (Bell Labs, India), K. Sabnani (Bell Labs, USA)
ICDCS’07: 27th International Conference on Distributed Computing Systems

Introduction
– Distributed Denial-of-Service (DDoS): an attack that directs hundreds or even thousands of “zombie” hosts against a single victim

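Introduction (cont.)
– TCP-SYN flooding attack
[Figure: the TCP three-way handshake (1. SYN, 2. SYN-Ack, 3. Ack). The attacker sends SYNs from fake IP addresses, so the final Ack never arrives; the victim's table of pending connections (columns: IP, time, TTL) fills up until the victim runs out of memory and crashes.]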

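Problem Formulation
– A stream of flow updates: (source, dest, ±1)
– In the handshake (1. SYN, 2. SYN-Ack, 3. Ack), a SYN from u to v generates the update (u, v, +1) and the completing Ack generates (u, v, -1)
– Bad guy: a source u with Occur(u, v, +1) > Occur(u, v, -1), i.e. more opened than completed connections to v
– Distinct source frequency: $f_v$ = # of bad guys to v
– Goal: continuously track the top-k destinations by distinct source frequency over the stream of flow updates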
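As a point of reference, the quantity being tracked can be computed exactly when memory is not a concern. The sketch below is only such a baseline, not the streaming algorithm from the talk; the function name `top_k_distinct_sources` and the sample updates are made up for illustration.

```python
from collections import defaultdict

def top_k_distinct_sources(updates, k):
    """Exact baseline: one net counter per (source, dest) pair.

    Assumption: memory is unbounded, which is exactly what the
    streaming setting rules out.
    """
    net = defaultdict(int)
    for u, v, delta in updates:
        net[(u, v)] += delta          # +1 for a SYN, -1 for the completing Ack

    freq = defaultdict(int)           # f_v = number of "bad guys" for destination v
    for (u, v), count in net.items():
        if count > 0:                 # more +1 than -1 updates: half-open connection
            freq[v] += 1

    # Sort by frequency (descending), breaking ties by destination name.
    return sorted(freq.items(), key=lambda kv: (-kv[1], kv[0]))[:k]

# Hypothetical updates: sources a and b never complete their handshake with v.
updates = [
    ("a", "v", +1), ("b", "v", +1), ("c", "v", +1),
    ("a", "w", +1), ("a", "w", -1),
    ("b", "w", +1),
    ("c", "v", -1),
]
print(top_k_distinct_sources(updates, 2))
```

Because this keeps a counter for every pair ever seen, it needs space proportional to the number of distinct pairs, which is what the sketch-based approach avoids.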

Main idea of the solution: Sampling
– Directly sample from the stream?
  – For estimating the counts of an item: OK
  – For counting the number of distinct items: NO
– Instead, construct a synopsis of the stream and then sample from the synopsis
– Example: the stream a, a, a, a, a, a, a, a, a, a, b has the synopsis (a, 10), (b, 1)
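The difference between the two sampling targets can be made concrete with the slide's own example stream. The snippet below is a small illustration, with an exact `Counter` standing in for the synopsis (an assumption made only for this example); it compares how likely the rare item b is to be picked in each case.

```python
from collections import Counter
from fractions import Fraction

stream = ["a"] * 10 + ["b"]

# Synopsis of distinct items with their counts: (a, 10), (b, 1).
synopsis = Counter(stream)

# Sampling one element uniformly from the raw stream is biased towards
# frequent items: b is picked with probability 1/11.
p_b_stream = Fraction(stream.count("b"), len(stream))

# Sampling one entry uniformly from the synopsis treats every distinct
# item equally: b is picked with probability 1/2.
p_b_synopsis = Fraction(1, len(synopsis))

print(sorted(synopsis.items()), p_b_stream, p_b_synopsis)
```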

Distinct-Count Sketch: structure
– Domain of IP addresses: $[m] = \{0, \dots, m-1\}$
– (source, dest) pairs: $[m^2]$
– First-level hash function $h: [m^2] \to \{0, \dots, \Theta(\log m)\}$ with $\Pr[h(x) = l] = 1/2^{l+1}$
  – 1/2 of the distinct values in $[m^2]$ map to bucket 0
  – 1/4 of the distinct values in $[m^2]$ map to bucket 1
  – 1/8 of the distinct values in $[m^2]$ map to bucket 2
– Second-level hash functions $g_i: [m^2] \to [s]$, uniform
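One common way to obtain a hash with this geometric distribution is to hash the pair to a uniform bit string and count its trailing zero bits. The sketch below assumes that construction; the slides do not say how h is implemented, and `first_level`, the use of BLAKE2b, and the cap `max_level` are illustrative choices.

```python
import hashlib

def first_level(u, v, max_level):
    """Map (u, v) to a level l with Pr[l] = 1/2^(l+1) for l < max_level.

    Assumption: a 64-bit BLAKE2b digest behaves like a uniform random
    bit string. The leftover probability mass lands on max_level.
    """
    digest = hashlib.blake2b(f"{u},{v}".encode(), digest_size=8).digest()
    x = int.from_bytes(digest, "big")
    level = 0
    # Count trailing zero bits of x, stopping at the last bucket.
    while level < max_level and (x >> level) & 1 == 0:
        level += 1
    return level

levels = [first_level(u, v, 8) for u in range(200) for v in range(100)]
print(levels.count(0) / len(levels), levels.count(1) / len(levels))
```

Roughly half of the distinct pairs should land in bucket 0 and a quarter in bucket 1, matching the fractions on the slide.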

Distinct-Count Sketch: structure (cont.)
[Figure: first-level buckets 0 … Θ(log m); the bucket b = h(u, v) holds r hash tables, each with s second-level buckets selected by $g_1(u, v), g_2(u, v), \dots, g_r(u, v)$; every second-level bucket stores a count signature with locations 0, 1, …, 2 log m.]
– Total element count (location 0): the total number of tuples hashed into the bucket
– Bit location counts (locations 1 to 2 log m): for each bit position j, the total number of tuples hashed into the bucket with $\mathrm{BIT}_j(u, v) = 1$, where $\mathrm{BIT}_j(u, v)$ is bit j of the binary representation of (u, v)
– $\chi[i, j, k, l]$: the i-th first-level bucket, the j-th hash table, the k-th second-level bucket, the l-th count-signature location
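To make the count signature concrete, the snippet below builds the signature that a single tuple contributes to its bucket. It assumes the pair is packed as `u * m + v` and that bit 1 is the least significant bit; neither convention is stated on the slides, and the helper names are hypothetical.

```python
def pair_key(u, v, m):
    """Pack (u, v) into one integer in [m^2] (assumed encoding: u * m + v)."""
    return u * m + v

def count_signature(u, v, m):
    """Signature contributed by one tuple: [total, bit_1, ..., bit_{2 log m}]."""
    nbits = 2 * (m - 1).bit_length()      # 2 log m bits cover every pair
    key = pair_key(u, v, m)
    # Location 0 counts the tuple itself; location l counts it only if
    # bit l of the packed pair is set.
    return [1] + [(key >> (l - 1)) & 1 for l in range(1, nbits + 1)]

# With m = 4, the pair (2, 2) packs to 10 = 0b1010.
print(count_signature(2, 2, 4))
```

A bucket's stored signature is the element-wise sum of the signatures of all tuples hashed into it.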

Distinct-Count Sketch: maintenance
– For each incoming update/tuple (u, v, ±1), update its corresponding count signatures
– For all j = 1 to r
  – $\chi[h(u, v), j, g_j(u, v), 0] = \chi[h(u, v), j, g_j(u, v), 0] \pm 1$
  – For each l = 1 to 2 log m
    – If $\mathrm{BIT}_l(u, v) = 1$: $\chi[h(u, v), j, g_j(u, v), l] = \chi[h(u, v), j, g_j(u, v), l] \pm 1$
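The maintenance loop translates fairly directly into code. The class below is a minimal sketch under several assumptions that are not from the slides: hash functions are derived from BLAKE2b, pairs are packed as `u * m + v`, the number of first-level buckets is set to 2 log m + 1, and indices are 0-based.

```python
import hashlib

def _hash64(*parts):
    """Deterministic 64-bit hash of the given parts (illustrative helper)."""
    data = ",".join(str(p) for p in parts).encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")

class DistinctCountSketch:
    def __init__(self, m, r, s):
        self.m, self.r, self.s = m, r, s
        self.nbits = 2 * (m - 1).bit_length()       # 2 log m bit locations
        self.levels = self.nbits + 1                # assumed Theta(log m) buckets
        # chi[i][j][k][l]: first-level bucket i, hash table j,
        # second-level bucket k, count-signature location l.
        self.chi = [[[[0] * (self.nbits + 1) for _ in range(s)]
                     for _ in range(r)] for _ in range(self.levels)]

    def h(self, u, v):
        """First-level hash: trailing zero bits, capped at the last bucket."""
        x = _hash64("h", u, v)
        level = 0
        while level < self.levels - 1 and (x >> level) & 1 == 0:
            level += 1
        return level

    def g(self, j, u, v):
        """j-th second-level hash, uniform over the s buckets."""
        return _hash64("g", j, u, v) % self.s

    def update(self, u, v, delta):
        """Apply a flow update (u, v, delta) with delta in {+1, -1}."""
        key = u * self.m + v
        b = self.h(u, v)
        for j in range(self.r):
            sig = self.chi[b][j][self.g(j, u, v)]
            sig[0] += delta                         # total element count
            for l in range(1, self.nbits + 1):
                if (key >> (l - 1)) & 1:            # BIT_l(u, v) = 1
                    sig[l] += delta

sk = DistinctCountSketch(m=16, r=3, s=8)
sk.update(3, 5, +1)
```

Since every counter is updated additively, a later `sk.update(3, 5, -1)` cancels the insertion exactly, which is what lets the sketch handle the -1 updates in the stream.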

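Top-k Frequency Estimation
– Generate a distinct sample (dSample) from the distinct-count sketch
– Scan the first-level buckets b in decreasing order while |dSample| < (1+ε)s/16 and b ≥ 0
– Check the count signature of each second-level bucket $\chi[b, j, k, \cdot]$
  – A non-empty bucket holds a single (u, v) pair if, for all l = 1 to 2 log m, either $\chi[b, j, k, l] = \chi[b, j, k, 0]$ or $\chi[b, j, k, l] = 0$
  – In that case bit l of (u, v) is 1 exactly when $\chi[b, j, k, l] = \chi[b, j, k, 0]$; add the recovered (u, v) to dSample
  – Otherwise the bucket contains a collision and is skipped
[Figure: the sketch layout (first-level buckets 0 … Θ(log m), r hash tables with s second-level buckets each); one count signature decodes bit by bit to (u, v) = 1010, another is marked as a collision.]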
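The singleton check and the bit-by-bit recovery of (u, v) can be sketched as a single function over one count signature. As before, the packing `u * m + v`, the bit order (location 1 is the least significant bit), and the name `decode_singleton` are assumptions for illustration; the check also assumes every pair in the bucket has a positive net count.

```python
def decode_singleton(sig, m):
    """Return the (u, v) pair stored in a count signature, or None.

    sig[0] is the total element count, sig[l] the count for bit l.
    Returns None for an empty bucket or a detected collision.
    """
    total = sig[0]
    if total <= 0:
        return None                       # empty bucket: nothing to recover
    key = 0
    for l in range(1, len(sig)):
        if sig[l] == total:
            key |= 1 << (l - 1)           # every tuple has bit l set
        elif sig[l] != 0:
            return None                   # tuples disagree on bit l: collision
    return divmod(key, m)                 # undo the packing key = u * m + v

# Three copies of the pair packed as 1010 (m = 4) decode cleanly ...
print(decode_singleton([3, 0, 3, 0, 3], 4))
# ... while 1010 and 0101 sharing a bucket are reported as a collision.
print(decode_singleton([2, 1, 1, 1, 1], 4))
```

Two distinct pairs differ in at least one bit, so that bit's count falls strictly between 0 and the total, which is why the test catches collisions.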

Top-k Frequency Estimation (cont.)
– After obtaining the dSample, e.g. (a, v), (u, v), (m, v), (a, w), (b, w), (c, w), (d, w), …
– Count the sampled sources per destination: $f_w$ in dSample = 4, $f_v$ in dSample = 3, …
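The counting step on this slide amounts to grouping the sampled pairs by destination. A minimal sketch, using the listed pairs as input (the function name `top_k_from_sample` is hypothetical, and this covers only the in-sample counts shown on the slide):

```python
from collections import Counter

def top_k_from_sample(d_sample, k):
    """Rank destinations by the number of distinct sampled sources."""
    # set() guards against the same pair being recovered from several tables.
    counts = Counter(v for _, v in set(d_sample))
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:k]

d_sample = [("a", "v"), ("u", "v"), ("m", "v"),
            ("a", "w"), ("b", "w"), ("c", "w"), ("d", "w")]
print(top_k_from_sample(d_sample, 2))
```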

Error guarantee
– Input: flow-update stream, k, error ε, and confidence δ
– Output: a continuously tracked list L of k destination IP addresses such that, with probability at least 1-δ:
  – 1. Any destination address v in L has frequency $f_v \ge (1-\varepsilon) f_{v_k}$
  – 2. For any destination address v in L, …
– n = the upper bound on the number of update tuples in the streams

Conclusion
– The approach seems to combine the FM sketch and the Count-Min sketch to reduce collisions, and then uses bit operations to identify the destination addresses