Qun Huang, Patrick P. C. Lee, Yungang Bao

Slides:



Advertisements
Similar presentations
Rectangle-Efficient Aggregation in Spatial Data Streams Srikanta Tirthapura David Woodruff Iowa State IBM Almaden.
Advertisements

Author: Chengchen, Bin Liu Publisher: International Conference on Computational Science and Engineering Presenter: Yun-Yan Chang Date: 2012/04/18 1.
New Directions in Traffic Measurement and Accounting Cristian Estan – UCSD George Varghese - UCSD Reviewed by Michela Becchi Discussion Leaders Andrew.
Active Learning for Streaming Networked Data Zhilin Yang, Jie Tang, Yutao Zhang Computer Science Department, Tsinghua University.
Fast Algorithms For Hierarchical Range Histogram Constructions
Summarizing Distributed Data Ke Yi HKUST += ?. Small summaries for BIG data  Allow approximate computation with guarantees and small space – save space,
OpenSketch Slides courtesy of Minlan Yu 1. Management = Measurement + Control Traffic engineering – Identify large traffic aggregates, traffic changes.
Significance: gives for the first time exact inference results in closed-form Efficiency is cubic in the number of variables Derived Stable-Jacobi approximate.
Streaming Algorithms for Robust, Real- Time Detection of DDoS Attacks S. Ganguly, M. Garofalakis, R. Rastogi, K. Sabnani Krishan Sabnani Bell Labs Research.
1 Reversible Sketches for Efficient and Accurate Change Detection over Network Data Streams Robert Schweller Ashish Gupta Elliot Parsons Yan Chen Computer.
Reverse Hashing for High-speed Network Monitoring: Algorithms, Evaluation, and Applications Robert Schweller 1, Zhichun Li 1, Yan Chen 1, Yan Gao 1, Ashish.
Reverse Hashing for Sketch Based Change Detection in High Speed Networks Ashish Gupta Elliot Parsons with Robert Schweller, Theory Group Advisor: Yan Chen.
Machine Learning CMPT 726 Simon Fraser University
Dream Slides Courtesy of Minlan Yu (USC) 1. Challenges in Flow-based Measurement 2 Controller Configure resources1Fetch statistics2(Re)Configure resources1.
CS 580S Sensor Networks and Systems Professor Kyoung Don Kang Lecture 7 February 13, 2006.
Lecture II-2: Probability Review
BUFFALO: Bloom Filter Forwarding Architecture for Large Organizations Minlan Yu Princeton University Joint work with Alex Fabrikant,
Hash, Don’t Cache: Fast Packet Forwarding for Enterprise Edge Routers Minlan Yu Princeton University Joint work with Jennifer.
SIGCOMM 2002 New Directions in Traffic Measurement and Accounting Focusing on the Elephants, Ignoring the Mice Cristian Estan and George Varghese University.
Approximate Frequency Counts over Data Streams Loo Kin Kong 4 th Oct., 2002.
Learning Objectives Copyright © 2002 South-Western/Thomson Learning Sample Size Determination CHAPTER thirteen.
Statistics for Managers Using Microsoft Excel, 5e © 2008 Pearson Prentice-Hall, Inc.Chap 8-1 Statistics for Managers Using Microsoft® Excel 5th Edition.
© 2010 AT&T Intellectual Property. All rights reserved. AT&T, the AT&T logo and all other AT&T marks contained herein are trademarks of AT&T Intellectual.
Introduction to Software Development. Systems Life Cycle Analysis  Collect and examine data  Analyze current system and data flow Design  Plan your.
Resource/Accuracy Tradeoffs in Software-Defined Measurement Masoud Moshref, Minlan Yu, Ramesh Govindan HotSDN’13.
1 LD-Sketch: A Distributed Sketching Design for Accurate and Scalable Anomaly Detection in Network Data Streams Qun Huang and Patrick P. C. Lee The Chinese.
Data Stream Algorithms Ke Yi Hong Kong University of Science and Technology.
SCREAM: Sketch Resource Allocation for Software-defined Measurement Masoud Moshref, Minlan Yu, Ramesh Govindan, Amin Vahdat (CoNEXT’15)
Image Processing Architecture, © Oleh TretiakPage 1Lecture 5 ECEC 453 Image Processing Architecture Lecture 5, 1/22/2004 Rate-Distortion Theory,
Discrete Methods in Mathematical Informatics Kunihiko Sadakane The University of Tokyo
New Algorithms for Heavy Hitters in Data Streams David Woodruff IBM Almaden Joint works with Arnab Bhattacharyya, Vladimir Braverman, Stephen R. Chestnut,
Re-evaluating Measurement Algorithms in Software Omid Alipourfard, Masoud Moshref, Minlan Yu {alipourf, moshrefj,
SketchVisor: Robust Network Measurement for Software Packet Processing
Estimating standard error using bootstrap
SDN challenges Deployment challenges
Jennifer Rexford Princeton University
Frequency Counts over Data Streams
Making inferences from collected data involve two possible tasks:
Deep Feedforward Networks
J. Randall Brown Milton E Harvey Ceyhun Ozgur
A DFA with Extended Character-Set for Fast Deep Packet Inspection
The Variable-Increment Counting Bloom Filter
Hypothesis Testing and Confidence Intervals (Part 1): Using the Standard Normal Lecture 8 Justin Kern October 10 and 12, 2017.
Streaming & sampling.
Performance Measures II
Chapter 8: Inference for Proportions
Maintaining Data Integrity in Programmable Logic in Atmospheric Environments through Error Detection Joel Seely Technical Marketing Manager Military &
SoC and FPGA Oriented High-quality Stereo Vision System
A Framework for Automatic Resource and Accuracy Management in A Cloud Environment Smita Vijayakumar.
Query-Friendly Compression of Graph Streams
Optimal Elephant Flow Detection Presented by: Gil Einziger,
Bloom Filters Very fast set membership. Is x in S? False Positive
SCREAM: Sketch Resource Allocation for Software-defined Measurement
Elastic Sketch: Adaptive and Fast Network-wide Measurements
Operating Systems.
Where did we stop? The Bayes decision rule guarantees an optimal classification… … But it requires the knowledge of P(ci|x) (or p(x|ci) and P(ci)) We.
Approximate Frequency Counts over Data Streams
Elastic Sketch: Adaptive and Fast Network-wide Measurements
Memento: Making Sliding Windows Efficient for Heavy Hitters
This Lecture Substitution model
Lecture Slides Elementary Statistics Twelfth Edition
Statistical Model A statistical model for some data is a set of distributions, one of which corresponds to the true unknown distribution that produced.
Lu Tang , Qun Huang, Patrick P. C. Lee
Statistical Model A statistical model for some data is a set of distributions, one of which corresponds to the true unknown distribution that produced.
Toward Self-Driving Networks
Toward Self-Driving Networks
NitroSketch: Robust and General Sketch-based Monitoring in Software Switches Alan (Zaoxing) Liu Joint work with Ran Ben-Basat, Gil Einziger, Yaron Kassner,
Author: Ramana Rao Kompella, Kirill Levchenko, Alex C
2019/11/12 Efficient Measurement on Programmable Switches Using Probabilistic Recirculation Presenter:Hung-Yen Wang Authors:Ran Ben Basat, Xiaoqi Chen,
(Learned) Frequency Estimation Algorithms
Presentation transcript:

Qun Huang, Patrick P. C. Lee, Yungang Bao SketchLearn: Relieving User Burdens in Approximate Measurement with Automated Statistical Inference Qun Huang, Patrick P. C. Lee, Yungang Bao

Requirements for Measurement R1: Small memory usage Hardware switch: at most 50-100 MB R2: Fast packet processing 10 Gbps link -> 14.88Mpps (64-byte packet) -> 200 CPU cycles (3 GHz) R3: Fast response R4: Generality One architecture for multiple tasks One architecture for software and hardware

Approximate Measurement Underlying techniques Sampling Top-k counting Sketch Trade-offs: accuracy VS computation/storage/communication No one considers user burdens

User Burden 1: Errors as User Input Administrator Sketch developer Can I use sketch to monitor frequent items? Of course! Give me your expected bias 𝜖 and failure probability 𝛿

User Burden 1: Errors as User Input Administrator Sketch developer Can I use sketch to monitor frequent items? Of course! Give me your expected bias 𝜖 and error probability 𝛿 User burden: hard to parameterize “best” errors

User Burden 2: Prior Knowledge of Query Administrator Sketch developer I want all frequent items larger than 1% of total Here you are! Hmm … 1% is too large, how about 0.1% Sorry, current configuration is for threshold 1% or larger. Errors for 0.1% may be very large!

User Burden 2: Prior Knowledge of Query Administrator Sketch developer I want all frequent items larger than 1% of total Here you are! Hmm … 1% is too large, how about 0.1% Sorry, current configuration is for threshold 1% or larger. Errors for 0.1% may be unbounded! User burden: one configuration binds to query parameters Absolute threshold: 10^6, 10^7… Relative threshold: 1%, 5%, 10% … Top-k: top 50, top 100, top 200 …

Benchmark Results In heavy hitter detection, accuracy drops for smaller thresholds

User Burden 3: Complicated Formulas Administrator Sketch developer Can I use sketch to monitor frequent items? Of course! Allocate ⌈ log 2 1 𝛿 ⌉ rows and ⌈ 𝑈 2𝜖 ⌉ counters in each row! Relative error is smaller than 𝜖 with a probability at least 1−𝛿

User Burden 3: Complicated Formulas Administrator Sketch developer Can I use sketch to monitor frequent items? Of course! Allocate ⌈ log 2 1 𝛿 ⌉ rows and ⌈ 𝑈 2𝜖 ⌉ counters in each row! Relative error is smaller than 𝜖 with a probability at least 1−𝛿 User burden: parameter is complicated

Benchmark Results Significant differences between theoretical and empirical results

User Burden 4: Fixed Flowkey Definition Administrator Sketch developer I want top-100 (src IP, dst IP) pairs Here you are! Hmm … How about top-100 (src IP, src port) pairs Sorry, current configuration is for (src IP, dst IP) pairs only!

User Burden 4: Fixed Flowkey Definition Administrator Sketch developer I want top-100 (src IP, dst IP) pairs Here you are! Hmm … How about top-100 (src IP, src port) pairs Sorry, current configuration is for (src IP, dst IP) pairs only! User burden: one configuration binds to one flow definition

User Burden 5: No Error Measures Administrator Sketch developer What’s the error of 192.168.4.12 ? Hmm … <5% with prob >99%, you specified it… How about 192.168.4.13 ? It seems to have smaller error … Sorry, the error estimate is specified by you! I cannot estimate for a particular run!

User Burden 5: No Error Measures Administrator Sketch developer What’s the error of 192.168.4.12 ? Hmm … <5% with prob >99%, you specified it… How about 192.168.4.13 ? It seems to have smaller error … Sorry, the error estimate is specified by you! I cannot estimate for a particular run! User burden: cannot measure the actual extent of errors

Root Cause Analysis Errors are caused by resource conflicts Need to careful configuration to mitigate resource conflicts Less hash collisions More collisions The more buckets, the less conflicts (errors) in each bucket

Resource conflicts (Errors) Root Cause Analysis Resource conflicts depend on many issues A good configuration should take all of them into account Binds to many issues Complicated Calculations Flow definitions Query parameters Prior errors Configured Algorithm Query parameters Good Configuration Flow definitions Resource conflicts (Errors)

Naive Approach Automatically adjust configuration Hardness Predict prior errors/query parameters/flow definition is hard History cannot reflect future Adjusting configuration introduces latency

Our Work SketchLearn: Sketch-based Measurement System that Mitigates User Burdens Performance Catch up with underlying packet forwarding speed Resource efficiency Consume only limited resources Accuracy Preserve high accuracy of sketches Generality Support multiple measurement tasks Simplicity Address all user burdens

High-Level Idea Allow “imperfect” configuration Not pursue a perfect configuration to mitigate resource conflicts Characterize inherent statistical properties of resource conflicts Sketch Statistical Modeling Resource Conflict Model Network Queries

Design Features Feature 1: Multi-level sketch for per-bit tracking

Design Features Feature 2: Parameter-free model inference

Design Features Feature 3: Separation of large and small flows

Design Features Feature 4: Attachment of error measures with individual flows

Design Features Feature 5: Network-wide query

Multi-level Sketch L: # of bits considered Data structure: L+1 levels (from 0 to L), each is a counter matrix Level 0 is always updated Level k is updated iff k-th bit in a flowkey is 1 All levels share the same hash functions

Model Theory Theory needs to address two factors Hash collisions: number of flows, byte counts in each matrix bucket Bit-level flowkeys: probability that k-th bit is 1, denoted by p[k] Let V i,j 𝑘 be counter value of bucket (i,j) in level k Consider random variables R i,j k = V i,j 𝑘 V i,j 0 Theorem: if no large flows in bucket, R i,j k follows Gaussian distribution with mean p[k]

Statistical Model Inference Goal Extract all large flows Guarantee remaining flows in sketches are small Estimate Gaussian distributions for each level Challenge No guidelines to distinguish large and small flows Hard to directly decompose Solution: self-adaptive algorithm

Algorithm Start with imperfect input distributions Start with an initial threshold that distinguishes large and small flows Threshold! Try to extract some large flows based on current imperfect distributions Remove large flows and refine distributions Adjust threshold

Large Flow Extraction Intuition: a large flow (i) results in extreme (large or small) R i,j k , or (ii) at least deviates R i,j k from its expectation p[k] significantly Update Counter (i,j) at all levels Recovery Keys larger than 15 30 Level 0: total counts Key 0101: size 20 Counter<15 && Total-counter>=15 Level 1 Counter>=15 && Total-counter<15 20 Level 2 1 Key 0001: size 10 Counter<15 && Total-counter>=15 Level 3 Counter>=15 && Total-counter<15 30 Level 4 1

Large Flow Extraction Key idea: examine R i,j k and its difference from p[k], then Step 1: Compute probability that k-th bit is 1 Step 2: Determine k-th bit of a large flow (assuming it exists) Step 3: Estimate frequency Step 4: Associate its error probability Step 5: Verify constructed large flows

Large Flow Extraction Example

Guarantee Correctness Lemma: For each bucket, flows exceeding half of total must be extracted Update Counter (i,j) at all levels Recovery Keys larger than 15 30 Level 0: total counts Key 0101: size 20 Counter<15 && Total-counter>=15 Level 1 Counter>=15 && Total-counter<15 20 Level 2 1 Key 0001: size 10 Counter<15 && Total-counter>=15 Level 3 Counter>=15 && Total-counter<15 30 Level 4 1 Theorem: w is sketch width, flows exceeding 1/w of total must be extracted Empirical results: flows smaller than 1/w are also extracted with high probability E.g., >99% flows exceeding 1/2w are extracted

Supported Network Queries Per-flow byte count Heavy hitter detection Heavy changer detection Cardinality estimation Flow size distribution estimation Entropy estimation

Extended Query Model Allow query for arbitrary flowkeys Only use corresponding levels Estimate actual query errors Use attached errors for each large flow Use Gaussian distributions

How to Address User Burdens Need user-specified errors Guarantee the correctness of model inference Need query parameters Robust to very small flows, no thresholds needed Hard to compute configurations Simple parameters computed based on memory budget or minimum requirement Self-adaptive algorithm, no further configurations One configuration for one flowkey definition Extended query model supports arbitrafy flowkeys Fail to estimate actual errors Extended query model attaches errors to query results

Implementation Challenge: updating L levels is time consuming Solution: parallel updating L levels are independent Software Based on OpenVSwitch + DPDK Parallelism with SIMD Hardware Based on P4 programmable switches Parallelism with P4 pipeline stages

Resource Usage Memory CPU cycles

Performance

Generality 6 measurement tasks In each task, compare with state-of-the-arts

Fitting Theorem and Robustness

Other Experiments Support multiple flowkey definitions Attach effective error measures Support network-wide measurement tasks

Conclusion Analyze 5 user burdens in existing approximate measurement SketchLearn architecture Multi-level data structure design Theory: counters follow Gaussian distributions when no large flows Self-adaptive model inference algorithm Proof of convergence and accuracy Extended query models OVS-DPDK and P4 implementation Evaluations

Future Consumptions for implicit resources SIMD registers P4 pipelines Sketch + True network controls