A Resource-minimalist Flow Size Histogram Estimator

Presentation transcript:

A Resource-minimalist Flow Size Histogram Estimator
Bruno Ribeiro, Don Towsley (UMass Amherst)
Tao Ye (Sprint)

Flow size histogram
  Internet core router: TCP flows
  Flow size: e.g., number of packets in a TCP flow
  Flow size histogram is used for:
    Traffic profiling
    Anomaly detection
  Histogram is hard to obtain: hundreds of millions of TCP flows/hour (OC-48 router)
Estimating flow size histograms
  Random packet sampling is inaccurate [Ribeiro et al. 2006]
  Flow sampling: needs more memory, and an accurate tail needs packet sampling
  Current data streaming methods have slow estimators
Bruno Ribeiro, Tao Ye, Don Towsley, "A Resource-minimalist flow size histogram estimator"

Outline
  Related work
  Our resource-minimalist approach
  Experiment
  Conclusions

Related work [Kumar et al. 2004]
  Sketch phase (router): a universal hash function maps each packet to a counter; flows that hash to the same counter collide
  Estimation phase (powerful backend server): recovers the flow size histogram from counters affected by hash collisions
  Complexity: O((maximum flow size)^3)
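The sketch phase described on this slide can be illustrated with a minimal Python sketch. It is not the authors' or Kumar et al.'s implementation: the names `counter_index`, `sketch`, and `counter_value_histogram` are hypothetical, and BLAKE2b stands in for the universal hash function, which the slide does not specify.

```python
import hashlib
from collections import Counter


def counter_index(flow_id, num_counters):
    """Map a flow identifier to a counter index.

    Assumption: BLAKE2b is a stand-in for the universal hash function.
    """
    digest = hashlib.blake2b(flow_id.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big") % num_counters


def sketch(packets, num_counters):
    """Sketch phase: one counter increment per arriving packet."""
    counters = [0] * num_counters
    for flow_id in packets:
        counters[counter_index(flow_id, num_counters)] += 1
    return counters


def counter_value_histogram(counters):
    """Histogram of non-zero counter values, the input to the estimation phase."""
    return Counter(value for value in counters if value > 0)
```

Flows that share a counter are summed together, so the counter-value histogram differs from the true flow size histogram. Undoing that distortion is the job of the estimation phase, which is where the cubic cost arises.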

Resource-minimalist Approach
  Insight: Don't need to count every flow size
  Idea: Group large flow sizes into bins
    Fine-grained flow histogram: < k packets
    Coarse-grained flow histogram: > k packets
  Approach: Probabilistic counting
    Reduces counters to 6 bits
    Requires: Low collision probability (e.g., counters/flow = 2/1)
    Result: O(k^3 + log(W)) estimator, e.g., k = 16 and W = 10^7
  Problem: Low collision probability → more memory (2 counters/flow)
  Approach: Counter folding
    Negligible increase in estimator error
    Requires one extra bit/counter
    Result: Reduces number of counters by half

Group large flow sizes & Probabilistic counting [Morris 78]
  Counter increments (probabilistic): below k, each arrived packet increments the counter; at counter value k the increment probability is p = 1/m_1, at k+1 it is p = 1/m_2, and so on
  With m_a = 2^a, a 6-bit counter bins flow sizes up to W = 10^14
  On average:
    Counter value k → flow sizes = [k, k+m_1-1]
    Counter value k+1 → flow sizes = [k+m_1, k+m_1+m_2-1]
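The binning rule on this slide can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, bin widths follow m_a = 2^a with a = 1 for counter value k, and the counter is assumed to saturate at its largest 6-bit value.

```python
import random


def bin_width(value, k):
    """Width m_a of the bin for a counter value >= k, assuming m_a = 2**a."""
    return 2 ** (value - k + 1)


def bin_range(value, k):
    """Average range of flow sizes represented by a counter value."""
    if value < k:
        return (value, value)  # exact counting below k
    j = value - k
    low = k + 2 ** (j + 1) - 2  # k + m_1 + ... + m_j
    return (low, low + bin_width(value, k) - 1)


def increment(value, k, bits=6, rng=random):
    """Counter value after one more packet of the flow arrives."""
    if value >= 2 ** bits - 1:
        return value  # assumption: the counter saturates
    if value < k:
        return value + 1  # deterministic below k
    if rng.random() < 1.0 / bin_width(value, k):
        return value + 1  # probabilistic at and above k
    return value
```

With k = 16, counter value 63 (the largest 6-bit value) starts its bin at 16 + 2^48 - 2, which is above 10^14, consistent with the slide's claim about W.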

Counter folding: Detecting some collisions
  Maximum hash value = M; the table has M/2 counters
  If hash(packet) < M/2 → red
  Otherwise (hash(packet) mod M/2) → blue
  Detectable collision: blue-red (1 bit required)
  Undetectable collision: same color
  [Figure: flows 7, 8 and 9 mapped onto M/2 counters, showing a detectable and an undetectable collision]
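A small Python sketch of the folding rule shows which collisions the extra bit can reveal. The helper names `fold` and `collision_kind` are hypothetical, and hash values are assumed to lie in the range 0 to M-1 with M even.

```python
def fold(h, hash_range):
    """Map a hash value in [0, hash_range) to (counter index, color)."""
    half = hash_range // 2
    if h < half:
        return (h, "red")
    return (h % half, "blue")


def collision_kind(h1, h2, hash_range):
    """Classify the collision between two flows given their hash values."""
    index1, color1 = fold(h1, hash_range)
    index2, color2 = fold(h2, hash_range)
    if index1 != index2:
        return "none"
    # Same counter: the color bit tells the flows apart only if they differ.
    return "undetectable" if color1 == color2 else "detectable"
```

For example, with M = 8 the hash values 1 and 5 share counter 1 but have different colors, so that collision is detectable. Two flows that both hash to 1 would have collided in the unfolded table as well.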

Counter folding
  Collision policy:
    "Red flow cannot increment blue counter"
    "Blue flow overwrites red counter"
    Counters equal to 0 are red
  [Figure: flows, counters, and counter colors (the extra bit)]
  Result, e.g., with 1 counter/flow:
    All red counters are also blue counters = 0
    Virtually expands the hash table by ≈ 50% (virtual 2 counters/flow)
    Blue counters evict red counters
    Flow sampling effect: discards 15% of flows at random
  Folding: interesting fact
    Number of foldings ≫ 1
    Policy: evict newest flow (color = flow ID)
    → Flow sampling
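The collision policy on this slide can be sketched as a small Python class. This is one reading of the slide rather than the authors' implementation: the class name `FoldedCounters` is hypothetical, hash values are assumed to lie in 0 to M-1, "overwrites" is interpreted as resetting the counter to 1, and plain increments are used instead of probabilistic counting to keep the example short.

```python
RED, BLUE = 0, 1


class FoldedCounters:
    """M/2 counters, each with one extra color bit (hypothetical sketch)."""

    def __init__(self, hash_range):
        if hash_range <= 0 or hash_range % 2:
            raise ValueError("hash_range must be a positive even number")
        self.half = hash_range // 2
        self.values = [0] * self.half
        self.colors = [RED] * self.half  # counters equal to 0 are red

    def locate(self, h):
        """Return (counter index, color) for a hash value in [0, hash_range)."""
        if h < self.half:
            return h, RED
        return h % self.half, BLUE

    def update(self, h):
        """Process one packet whose flow hashes to h."""
        index, color = self.locate(h)
        if color == self.colors[index]:
            self.values[index] += 1  # same color: count the packet
        elif color == BLUE:
            # Blue flow overwrites red counter (assumed: restart at 1).
            self.values[index] = 1
            self.colors[index] = BLUE
        # Otherwise a red flow hit a blue counter: it cannot increment it.
```

Under this reading, a red flow whose counter is taken over by a blue flow is dropped from the sketch, which is the source of the flow sampling effect noted on the slide.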

Experiment
  Evaluated with simulations
  Our worst result with Internet core traces:
    9.5 million flows
    8 MB of memory
    k = 16, W = 10^14
  Same accuracy without counter folding requires 13 MB of memory

Conclusions
  Insights
    Group large flow sizes using probabilistic counters
    Counter folding
    Fast quasi-random sampling
  Our estimator
    Time complexity
      Sketch phase: universal hash cost, two additions, one subtraction
      Estimation phase: O(k^3 + log(W))
    Space complexity
      ≈ 1/4 of the memory usage of [Kumar et al. 2004]