Enabling a “RISC” Approach for Software-Defined Monitoring using Universal Streaming. Vyas Sekar, Zaoxing Liu, Greg Vorsanger, Vladimir Braverman
Network Management: Many Monitoring Requirements. Applications on the SDN controller (OpenDaylight etc.): traffic engineering, analyzing new user apps, anomaly detection, network forensics, worm detection, accounting, botnet analysis, ... Metrics they need: “Heavy-hitters”, “Flow size distribution”, “SuperSpreaders”, “Entropy”, “Traffic Changes”.
Traditional: Packet Sampling. Sample packets at random and aggregate them into flows (flow = packets with the same pattern: source and destination address and ports). Flow reports hold (FlowId, Counter) entries. Estimate from the reports: FSD, entropy, heavy hitters, changes, SuperSpreaders, ... Not good for fine-grained analysis. Extensive literature on limitations for many tasks!
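To make the sampling pipeline concrete, here is a minimal sketch of random packet sampling aggregated into per-flow counters. The function names, the 4-tuple flow key, and the synthetic traffic are assumptions for illustration, not taken from the slides.

```python
import random
from collections import Counter


def sample_flow_report(packets, rate, seed=0):
    """Sample each packet with probability `rate`; count samples per flow."""
    rng = random.Random(seed)  # seeded so the illustration is repeatable
    report = Counter()
    for flow_id in packets:
        if rng.random() < rate:
            report[flow_id] += 1
    return report


def scale_up(report, rate):
    """Estimate true flow sizes by dividing sampled counts by the rate."""
    return {flow_id: count / rate for flow_id, count in report.items()}


# Hypothetical traffic: one large flow and 50 single-packet flows.
# Flow key = (source address, destination address, source port, dest port).
big = ("10.0.0.1", "10.0.0.2", 1234, 80)
small = [("10.0.1.%d" % i, "10.0.0.2", 4000 + i, 80) for i in range(50)]
packets = [big] * 1000 + small

report = sample_flow_report(packets, rate=0.01)
estimates = scale_up(report, rate=0.01)
```

At a 1% rate most single-packet flows never show up in the report, which is one reason sampled flow reports are a poor fit for fine-grained metrics such as the flow size distribution.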
Application-Specific Sketches. Each application-level metric (heavy hitter, entropy, superspreader, ...) gets its own pipeline: traffic, packet processing, counter data structures, application-level metric. Monitoring (on router): Bloom filter, Count-Min sketch, reversible sketch, etc. Computation (off router). Recent example: OpenSketch [NSDI’13]. Complexity: need per-metric implementation. Trend: many more applications appear!
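As one example of the counter data structures named on this slide, here is a minimal Count-Min sketch. It is a sketch under assumptions: the class name, the default width and depth, and the use of SHA-256 as a stand-in for a proper hash family are illustrative choices, not OpenSketch's implementation.

```python
import hashlib


class CountMinSketch:
    """Approximate per-key counters in depth x width cells."""

    def __init__(self, width=256, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # SHA-256 keyed by row number stands in for independent hashes.
        digest = hashlib.sha256(f"{row}:{key}".encode()).digest()
        return int.from_bytes(digest[:8], "little") % self.width

    def update(self, key, count=1):
        for row in range(self.depth):
            self.rows[row][self._index(row, key)] += count

    def estimate(self, key):
        # Collisions only ever add to a cell, so the minimum over rows
        # is an upper bound on the true count.
        return min(self.rows[row][self._index(row, key)]
                   for row in range(self.depth))


cms = CountMinSketch()
for flow, size in [("flowA", 40), ("flowB", 7), ("flowC", 1)]:
    cms.update(flow, size)
```

A structure like this answers per-flow count queries only; metrics such as entropy or superspreaders need different sketches, which is the per-metric complexity the slide points at.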
Holy Grail of Flow Monitoring? A single pipeline (traffic, packet processing, counter data structures, application-level metric) that supports many applications and gives results with high accuracy.
Our Solution: Universal Monitoring. Recent theory advances: Universal Streaming. UnivMon data plane: traffic, packet processing, universal sketch. UnivMon control plane: application-specific computation for App 1 ... App n. One sketch does it ALL.
Theory of Universal Streaming. References: (1) Vladimir Braverman, Rafail Ostrovsky: Zero-one frequency laws. STOC. (2) Generalizing the Layering Method of Indyk and Woodruff: Recursive Sketches for Frequency-Based Vectors on Streams. APPROX-RANDOM. ... Setting: a stream of length m with n unique items has a frequency vector (f_1, ..., f_n). A ‘universal’ sketch of the stream gives an estimated G-sum, the sum of g(f_i) over all items i.
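For reference, the quantity being estimated can be computed exactly when the whole stream fits in memory. The sketch below assumes G-sum means the sum of g(f_i) over the item frequencies f_i, and uses a hypothetical 9-packet stream whose item counts match the counters shown on the later slides; the function name `g_sum` is also an assumption.

```python
import math
from collections import Counter


def g_sum(stream, g):
    """Exact G-sum: apply g to every item's frequency and add the results."""
    freq = Counter(stream)
    return sum(g(f) for f in freq.values())


# Hypothetical stream: item 1 appears 4 times, items 3 and 5 twice, item 2 once.
stream = [1, 1, 1, 1, 3, 3, 5, 5, 2]
m = len(stream)

length = g_sum(stream, lambda f: f)            # g(x) = x gives the stream length
f2 = g_sum(stream, lambda f: f * f)            # g(x) = x^2 gives the second moment
flog = g_sum(stream, lambda f: f * math.log2(f))
entropy = math.log2(m) - flog / m              # entropy follows from g(x) = x log x
```

Different choices of g turn the same frequency vector into different metrics, which is why a single sketch that estimates G-sum for many g can serve many applications.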
Universal Sketch Data Structure. Generate k = log(n) pairwise independent zero-one hash functions H_1 ... H_k. Level 0 sees the whole stream; level j sees only the items x with H_1(x) = ... = H_j(x) = 1. Similar to a counting Bloom filter. Example: H_1(1)=1, H_1(5)=1, H_1(2)=1; H_2(5)=1, H_2(2)=1; H_3(2)=1. An L2 heavy hitter algorithm (Count-Sketch, Pick-and-drop, etc.) runs on every level in parallel. Heavy hitters as (item, count) pairs: level 0: (1,4), (3,2), (5,2); level 1: (1,4), (5,2), (2,1); level 2: (5,2), (2,1); ...; level log(n): (2,1).
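The layered structure can be sketched as follows. This is a simplified illustration under stated assumptions: exact `Counter` objects stand in for the per-level heavy hitter algorithm (Count-Sketch, Pick-and-drop), SHA-256 stands in for the pairwise independent zero-one hashes, and the names `LayeredCounters`, `zero_one_hash`, and `top_k` are hypothetical.

```python
import hashlib
import math
from collections import Counter


def zero_one_hash(level, item):
    """Stand-in for H_level: map an item to 0 or 1."""
    digest = hashlib.sha256(f"{level}:{item}".encode()).digest()
    return digest[0] & 1


class LayeredCounters:
    """Levels 0..log2(n); level j keeps items with H_1 = ... = H_j = 1."""

    def __init__(self, n, top_k=3):
        self.num_levels = max(1, math.ceil(math.log2(n)))
        self.top_k = top_k
        self.levels = [Counter() for _ in range(self.num_levels + 1)]

    def update(self, item, count=1):
        self.levels[0][item] += count          # level 0 sees every packet
        for j in range(1, self.num_levels + 1):
            if zero_one_hash(j, item) == 0:    # dropped at level j and above
                break
            self.levels[j][item] += count

    def heavy_hitters(self, j):
        """Largest counters at level j (exact here, approximate in practice)."""
        return dict(self.levels[j].most_common(self.top_k))


sketch = LayeredCounters(n=8)
for item in [1, 1, 1, 1, 3, 3, 5, 5, 2]:
    sketch.update(item)
```

Each level samples roughly half of the distinct items of the level below it, so items that are too small to be heavy hitters in the full stream can still surface as heavy hitters in a deeper, sparser level.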
Estimating G-sum. Take the heavy hitter counters from the universal sketch at each level and apply an arbitrary g(): level 0: (1,g(4)), (3,g(2)), (5,g(2)); level 1: (1,g(4)), (5,g(2)), (2,g(1)); level 2: (5,g(2)), (2,g(1)); ...; level log(n) (here 3): (2,g(1)). Recursive step, from the top level down: Y_{i-1} = 2Y_i + new counters - repeated counters. Sums of the g()s: Y_3 = g(1); Y_2 = g(1) + g(2); Y_1 = g(1) + g(2) + g(4); Y_0 = 2g(1) + 2g(2) + g(4), which is the estimated G-sum.
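The recursion on this slide can be replayed with the slide's own counters. In this sketch, "repeated" is read as a heavy hitter that is also hashed into the next level and "new" as one that is not; that reading, and the names `estimate_g_sum`, `levels`, and `survivors`, are assumptions chosen because they reproduce the Y values shown.

```python
def estimate_g_sum(levels, survivors, g):
    """Apply Y_{i-1} = 2*Y_i + new - repeated from the top level down.

    levels[i]:    {item: count} heavy hitters reported at level i
    survivors[i]: items that are hashed into level i + 1
    """
    y = sum(g(count) for count in levels[-1].values())
    history = [y]
    for i in range(len(levels) - 2, -1, -1):
        new = sum(g(c) for item, c in levels[i].items()
                  if item not in survivors[i])
        repeated = sum(g(c) for item, c in levels[i].items()
                       if item in survivors[i])
        y = 2 * y + new - repeated
        history.append(y)
    return y, history[::-1]  # history[i] is Y_i


# Counters and hash outcomes copied from the slides.
levels = [
    {1: 4, 3: 2, 5: 2},   # level 0
    {1: 4, 5: 2, 2: 1},   # level 1
    {5: 2, 2: 1},         # level 2
    {2: 1},               # level 3
]
survivors = [{1, 5, 2}, {5, 2}, {2}]  # H_1 = 1, H_2 = 1, H_3 = 1

y0, ys = estimate_g_sum(levels, survivors, lambda x: x)
y0_sq, ys_sq = estimate_g_sum(levels, survivors, lambda x: x * x)
```

With g(x) = x the recursion gives Y_3 = 1, Y_2 = 3, Y_1 = 7, Y_0 = 10, matching 2g(1) + 2g(2) + g(4). Item 2 is counted twice in Y_0 because it is not a heavy hitter at level 0, so its contribution is only recovered, scaled by sampling, from the deeper levels.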
Putting it together: UnivMon. Universal sketch; offline recursive computation.
Preliminary Evaluation. Comparison with custom sketches via OpenSketch. [Results figure not recoverable; one entry is marked N/A.]
Future Directions. Distributed universal streaming. Multidimensional data. Dynamically change monitoring scope. Feasibility of hardware implementations?
Conclusions. Network management needs many traffic metrics. Today’s solutions offer undesirable extremes: generic but low fidelity (e.g., sampling), or high fidelity but high complexity (e.g., application-specific sketches). Holy grail: Universal Monitoring. Decouple the monitoring control and data planes, like SDN! This work: Universal Monitoring can be viable via universal sketches. Several open questions remain, e.g. dynamic, multidimensional, distributed, and hardware viability.