Resource/Accuracy Tradeoffs in Software-Defined Measurement Masoud Moshref, Minlan Yu, Ramesh Govindan HotSDN’13
Outline: Motivation, Tradeoff, Case study, Discussion
Motivation: Management policies need measurement
  Traffic Engineering: Find elephant flows to route [Hedera]; estimate rack-to-rack traffic matrix [MicroTE]
  Accounting: Tiered pricing based on network usage
  Troubleshooting: Find network bottlenecks of an application; incast problem
Software Defined Measurement
  [Figure: a controller running a Traffic Engineering application on top of the switches]
  1. (Re)Configure resources
  2. Fetch statistics
Challenges
  Different measurements: Resource usage depends on measurement
  Limited resources: Limited CPU/memory in switches; limited control network bandwidth
  Complexity of different primitives in switches: Counting by prefix matching (TCAM); counting based on hashes; programming
Using Different Primitives for Measurement
  Comparing primitives in performing a measurement task
  Example: Finding the heavy hitter IP when traffic from a /24 goes beyond a threshold
Counting TCAM matches
  [Figure: TCAM filters and their counters; the controller fetches counters and installs filters]
  1. Monitor the /24
  2. When it goes beyond the threshold, iteratively search (e.g. binary search)
  Large time-scale measurement: Measure-install loop latency; controller link bandwidth
  Slowly varying traffic
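The two steps above can be sketched as a controller-side drill-down. This is a minimal illustration, not the authors' algorithm: `prefix_volume` and `find_heavy_hitters` are hypothetical names, the traffic is a static dictionary standing in for TCAM counters, and each pass of the loop stands for one measure-install round.

```python
import ipaddress

def prefix_volume(traffic, network):
    # Stand-in for fetching one TCAM counter: bytes from sources inside `network`.
    return sum(size for ip, size in traffic.items()
               if ipaddress.ip_address(ip) in network)

def find_heavy_hitters(traffic, root, threshold):
    # Start with a single filter on the root prefix (e.g. a /24).
    heavy, frontier, rounds = [], [ipaddress.ip_network(root)], 0
    while frontier:
        rounds += 1  # one fetch-and-install round trip to the switch
        next_frontier = []
        for net in frontier:
            if prefix_volume(traffic, net) <= threshold:
                continue  # not beyond the threshold: stop monitoring this prefix
            if net.prefixlen == net.max_prefixlen:
                heavy.append(str(net.network_address))  # single IP found
            else:
                # Split into the two half-size prefixes and monitor both.
                next_frontier.extend(net.subnets(prefixlen_diff=1))
        frontier = next_frontier
    return heavy, rounds

traffic = {"10.0.0.5": 100, "10.0.0.77": 7, "10.0.0.200": 3}
print(find_heavy_hitters(traffic, "10.0.0.0/24", threshold=50))
```

Going from a /24 to a single address takes one round per prefix length, which is why the loop latency matters and why the approach suits slowly varying traffic.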
Sketches using hash-based counters
  [Figure: packets (source, size) hashed into counters at the switch; the controller reads the counters]
  1. Fetch counters
  2. Approximate traffic for each IP
  Uses cheaper resources (SRAM)
  Not dependent on traffic history
  Use for varying traffic
  Still, history may help at large time scales to optimize parameters
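One common hash-based counter structure is the Count-Min sketch. The slide does not name a specific sketch here, so the class below, its `width` and `depth` parameters, and the choice of BLAKE2 as the hash are all assumptions made for illustration.

```python
import hashlib

class CountMinSketch:
    def __init__(self, width=256, depth=4):
        # depth rows of width counters each; assumed sizes, depth must be < 256.
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # One independent-looking hash per row, derived by salting with the row number.
        digest = hashlib.blake2b(key.encode(), digest_size=8,
                                 salt=bytes([row])).digest()
        return int.from_bytes(digest, "big") % self.width

    def add(self, key, size=1):
        # Switch side: every packet updates one counter in each row.
        for r in range(self.depth):
            self.rows[r][self._index(r, key)] += size

    def estimate(self, key):
        # Controller side: collisions only inflate counters, so take the minimum.
        return min(self.rows[r][self._index(r, key)] for r in range(self.depth))

sketch = CountMinSketch()
for src, size in [("10.0.0.5", 2), ("10.0.0.5", 3), ("10.0.0.9", 1)]:
    sketch.add(src, size)
print(sketch.estimate("10.0.0.5"))  # never below the true total of 5
```

The estimate can overshoot but never undershoot the true count, and it needs no knowledge of earlier traffic, only a choice of `width` and `depth`.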
Programming switches
  Use more complex data structures/commands
  Heap for finding large flows
  Uses CPU resources: can it keep up with line rate?
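As a rough illustration of the heap idea, the sketch below keeps per-flow byte totals and uses Python's `heapq` to select the largest flows. The function name `top_k_flows` and the (flow, size) packet format are hypothetical; a real switch program would have to do this within its CPU budget.

```python
import heapq
from collections import Counter

def top_k_flows(packets, k):
    # Accumulate bytes per flow, as a switch CPU might do per packet.
    volume = Counter()
    for flow, size in packets:
        volume[flow] += size
    # Heap-based selection of the k largest flows by byte count.
    return heapq.nlargest(k, volume.items(), key=lambda item: item[1])

packets = [("a", 5), ("b", 1), ("a", 2), ("c", 6)]
print(top_k_flows(packets, 2))
```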
Summary of tradeoffs of primitives
  Criteria                     Prefix matching   Hash-based counting   Programming
  Memory usage                 TCAM              SRAM
  Bandwidth usage              Medium            Large                 Small
  Time-scale                   Large             Small
  Traffic history sensitivity  Yes               No
  Concerns: Accuracy vs Resource
Case Study: Hierarchical Heavy Hitter Detection
Hierarchical Heavy Hitters
  [Figure: prefix tree with nodes ***, 0**, 1**, 00*, 01*, ...; Threshold=10]
  Longest IP prefixes that contribute a large amount of traffic after excluding any HHH descendants in the prefix tree
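The definition above can be made concrete with exact counts on a small binary prefix tree. This is a sketch of the definition only, not a switch-side algorithm; the function name `find_hhh`, the 3-bit addresses, and the leaf counts are made up for illustration, and a prefix is treated as heavy when its remaining traffic is at least the threshold.

```python
def find_hhh(leaf_counts, threshold, bits):
    hhhs = []

    def residual(prefix):
        # Traffic under `prefix` that is not already claimed by an HHH descendant.
        if len(prefix) == bits:
            total = leaf_counts.get(prefix, 0)
        else:
            total = residual(prefix + "0") + residual(prefix + "1")
        if total >= threshold:
            # Report the prefix and exclude its traffic from all ancestors.
            hhhs.append(prefix + "*" * (bits - len(prefix)))
            return 0
        return total

    residual("")
    return hhhs

counts = {"000": 12, "001": 3, "010": 4, "011": 4,
          "100": 1, "101": 2, "110": 5, "111": 6}
print(find_hhh(counts, threshold=10, bits=3))  # ['000', '0**', '11*']
```

Here 000 is heavy on its own, 0** becomes heavy only from the leftover traffic of 001, 010 and 011 (3 + 4 + 4 = 11), and 1** is not reported because most of its traffic is already covered by 11*.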
Algorithms for single switch HHH detection
  Prefix-based counting
    Proposed an iterative algorithm at the controller
    Given TCAM capacity, monitor the prefixes that maximize accuracy
  Hash-based counting (approximate, then traverse tree)
    Use a Hierarchical Count-Min sketch
    Approximate traffic in all levels using sketches
    Efficiently traverse the IP tree at the controller to find HHHs
Evaluation
  [Figure: accuracy and bandwidth for SRAM counters vs TCAM counters]
  Normalize switch cost: 80 SRAM, 1 TCAM
  Sketches have better accuracy for small thresholds (longer prefixes, higher variability)
  Max-Cover saves bandwidth for a large number of counters and large thresholds
Future work
  [Figure: controller with Traffic Engineering, Accounting and Anomaly Detection applications on top of a Software Defined Measurement layer]
  North-bound: Use joint information; guarantee accuracy
  South-bound: Select primitive; assign switch resources; run global measurement
Evaluation: Recall 0.5s
  The accuracy of both methods decreases for smaller time-scales
  Fewer counters per second
  Still the same trend as at the 5s time-scale