FlowRadar: A Better NetFlow For Data Centers

Presentation transcript:

FlowRadar: A Better NetFlow for Data Centers
Yuliang Li, Rui Miao, Changhoon Kim, Minlan Yu

Flow coverage in data centers
Traffic monitoring needs to cover all the flows, for example to catch transient loops/blackholes and to support fine-grained traffic analysis.

Temporal coverage in data centers
Traffic monitoring needs millisecond-level flow information, for example to measure loss rates at short time scales and to detect attacks (e.g., DoS) in a timely way.

Key insight: division of labor
Goal: report counters for all flows at a fine-grained time granularity.
- Mirroring puts the overhead on the collector, which has limited bandwidth and storage, so it needs sampling.
- NetFlow puts the overhead on the switches, which have limited per-packet processing time and limited memory (tens of MB).

Key insight: division of labor
FlowRadar splits the work between the two: it keeps memory usage in the switch small and uses a fixed number of per-packet operations there, avoiding both Mirroring's collector overhead and NetFlow's switch overhead.

FlowRadar architecture
Each switch maintains a fast and memory-efficient data structure holding half-baked per-flow counters and sends periodic reports. The collector correlates this network-wide information to extract per-flow counters and analyze them.

Challenge: handling collisions?
Handling hash collisions is hard:
- A large hash table means high memory usage.
- Linked lists or cuckoo hashing need multiple, non-constant memory accesses.
(Figure: a new flow d collides with flows a, b, c in a hash table of per-flow PacketCounts.)

Switch embraces collisions!
Rather than resolving collisions with a large table or chaining, embrace them: this takes less memory and a constant number of memory accesses per packet.

Switch embraces collisions!
Embrace the collision: XOR together all the flows that hash into a cell. This takes less memory and a constant number of accesses. The counting table is based on the Invertible Bloom Lookup Table (arXiv 2011); S(x) denotes the number of packets of flow x. Example with flows a, b, c across four cells:

    Cell         1     2          3          4
    FlowXor      a     a⊕b        b⊕c        c
    FlowCount    1     2          2          1
    PacketCount  S(a)  S(a)+S(b)  S(b)+S(c)  S(c)

Switch embraces collisions!
Per-packet update of the encoded flowset:
1. Check and update the flow filter, a Bloom filter that identifies new flows.
2. Update the counting table: a packet from a new flow d updates all fields of each of its cells (XORs d into FlowXor, increments FlowCount, increments PacketCount); subsequent packets of d increment only PacketCount.
Counting table based on the Invertible Bloom Lookup Table (arXiv 2011). A sketch of this update path follows.
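A minimal Python sketch of the encoded flowset, assuming integer flow IDs (so they can be XORed) and simulating independent hash functions by salting md5. All names and sizes here are illustrative assumptions, not the paper's P4 implementation.

    import hashlib

    class EncodedFlowset:
        def __init__(self, num_filter_bits=1024, num_cells=64, k=3):
            self.num_filter_bits = num_filter_bits
            self.num_cells = num_cells
            self.k = k
            self.filter_bits = [False] * num_filter_bits  # flow filter (Bloom filter)
            self.flow_xor = [0] * num_cells      # XOR of the flow IDs in each cell
            self.flow_count = [0] * num_cells    # number of flows hashed to each cell
            self.packet_count = [0] * num_cells  # number of packets hashed to each cell

        def _hash(self, salt, flow, n):
            digest = hashlib.md5(f"{salt}:{flow}".encode()).digest()
            return int.from_bytes(digest[:4], "big") % n

        def filter_positions(self, flow):
            return [self._hash(f"f{i}", flow, self.num_filter_bits) for i in range(self.k)]

        def table_cells(self, flow):
            # Partition the table into k sub-tables so each flow touches k distinct cells.
            sub = self.num_cells // self.k
            return [i * sub + self._hash(f"t{i}", flow, sub) for i in range(self.k)]

        def update(self, flow):
            # 1. Check and update the flow filter.
            positions = self.filter_positions(flow)
            is_new = not all(self.filter_bits[p] for p in positions)
            for p in positions:
                self.filter_bits[p] = True
            # 2. Update the counting table.
            for c in self.table_cells(flow):
                if is_new:                    # first packet of a new flow:
                    self.flow_xor[c] ^= flow  # fold the flow ID into FlowXor
                    self.flow_count[c] += 1   # and bump FlowCount
                self.packet_count[c] += 1     # every packet bumps PacketCount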

Easy to implement in merchant silicon
Switch data plane: fixed operations in hardware and small memory (2.36 MB for 100K flows).
Switch control plane: fetches the small encoded flowset every 10 ms.
We implemented it using the P4 language.

FlowRadar architecture
Each switch maintains the encoded flowset (flow filter plus counting table with FlowXor, FlowCount, and PacketCount fields) holding half-baked per-flow counters, and reports it periodically. The collector correlates the network-wide information to extract per-flow counters in two stages: Stage 1, SingleDecode; Stage 2, network-wide decode.

Stage 1: SingleDecode
Input: a single encoded flowset. Output: per-flow counters (flow, #packets).

Stage 1: SingleDecode
Find a pure cell, i.e., a cell holding exactly one flow (FlowCount = 1). Its FlowXor is that flow's ID and its PacketCount is that flow's counter. Record the flow, then remove it from all of its cells.
Example: with cells (FlowXor: a, a⊕b, b⊕c⊕d, c⊕d; FlowCount: 1, 2, 3, 2; PacketCount: 5, 12, 13, 6), the first cell is pure, so flow a with 5 packets is decoded, and each of a's cells has FlowCount decremented by 1 and PacketCount decremented by 5.

Stage 1: SingleDecode
Removing a decoded flow creates more pure cells (here, the cell left with FlowXor = b, FlowCount = 1, PacketCount = 7). Iterate until no pure cells remain. A sketch follows.
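A minimal sketch of SingleDecode over one EncodedFlowset from the earlier sketch: repeatedly take a pure cell, record its flow and counter, and subtract the flow from all of its cells, which can turn further cells pure.

    def single_decode(fs):
        decoded = {}  # flow ID -> number of packets
        while True:
            pure = next((c for c in range(fs.num_cells) if fs.flow_count[c] == 1), None)
            if pure is None:
                break  # no pure cells left: either done, or stalled
            flow, pkts = fs.flow_xor[pure], fs.packet_count[pure]
            decoded[flow] = pkts
            for c in fs.table_cells(flow):  # remove the flow from all its cells
                fs.flow_xor[c] ^= flow
                fs.flow_count[c] -= 1
                fs.packet_count[c] -= pkts
        return decoded

    # Tiny usage example:
    fs = EncodedFlowset()
    for flow, pkts in [(1, 5), (2, 7), (3, 4)]:
        for _ in range(pkts):
            fs.update(flow)
    print(single_decode(fs))  # expected: {1: 5, 2: 7, 3: 4}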

Stage 1: SingleDecode
SingleDecode can stall: after decoding a (5 packets) and b (7 packets), a cell with FlowXor = c⊕d, FlowCount = 2, PacketCount = 6 remains and no cell is pure. We want to leverage network-wide information to decode more flows.

FlowRadar architecture
Stage 2 is a network-wide decode: the collector combines the encoded flowsets reported by all switches.

Key insight: overlapping sets of flows
The sets of flows overlap across hops, and we can use that redundancy to decode more flows:
- Use different hash functions across hops.
- Provision memory based on avg(#flows), not max(#flows): SingleDecode covers the normal case, and network-wide decoding covers bursts of flows.
In the slide's example, two switches carrying the same four flows a, b, c, d need only 5 cells in total to decode all 4 flows; the collector can leverage flowsets from all switches to decode more.

Challenge 1: sets of flows are not fully overlapped
- Flows from one switch may go to different next hops.
- One switch may receive flows from multiple previous hops.
(Figure: flows a, b, c, d, e splitting and merging across switches.)

Challenge 1 solution: use the flow filter to check
Alternate SingleDecode with cross-switch removal: a flow decoded at one switch (say, from the set {a, b, c, d}) is removed from another switch's flowset (say, {a, c, d, e}) only if that switch's flow filter contains it. This generalizes to the whole network, requires no routing information, and allows incremental deployment. A sketch follows.
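A sketch of this network-wide FlowDecode step, reusing the field and helper names assumed in the earlier sketches. Only FlowXor and FlowCount are updated during cross-switch removal; PacketCount is left alone, since counters differ across hops and are recovered separately by CounterDecode.

    def in_flow_filter(fs, flow):
        return all(fs.filter_bits[p] for p in fs.filter_positions(flow))

    def remove_flow(fs, flow):
        for c in fs.table_cells(flow):
            fs.flow_xor[c] ^= flow
            fs.flow_count[c] -= 1

    def flow_decode(flowsets):
        decoded = set()
        removed = [set() for _ in flowsets]  # flows already removed, per switch
        progress = True
        while progress:
            progress = False
            # Local peeling on FlowXor/FlowCount only.
            for i, fs in enumerate(flowsets):
                for c in range(fs.num_cells):
                    if fs.flow_count[c] == 1:  # pure cell
                        flow = fs.flow_xor[c]
                        decoded.add(flow)
                        remove_flow(fs, flow)
                        removed[i].add(flow)
                        progress = True
            # Cross-switch removal: the flow filter tells us whether a decoded
            # flow traversed a switch, so no routing information is needed.
            for flow in list(decoded):
                for i, fs in enumerate(flowsets):
                    if flow not in removed[i] and in_flow_filter(fs, flow):
                        remove_flow(fs, flow)
                        removed[i].add(flow)
                        progress = True
        return decoded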

Challenge 2: counters differ across hops
The counter of a flow may be different across hops:
- Some packets may get lost.
- Some packets may still be on the fly between switches.

Challenge 2 solution: solve linear equations
FlowDecode gives the full list of flows. Combining it with each switch's counting table, we construct and solve a linear equation system per switch: each cell contributes one equation, its PacketCount being the sum of the counters of the flows that hash into it. Solving it yields the per-flow counters at that switch (e.g., a: 5, b: 7, c: 4, d: 2 packets). The solver is sped up by using the counters' properties to stop earlier. A sketch follows.
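An illustrative CounterDecode that uses numpy's generic least-squares solver in place of the paper's optimized solver (an assumption of this sketch), with the field names from the earlier sketches:

    import numpy as np

    def counter_decode(fs, flows):
        flows = sorted(flows)
        A = np.zeros((fs.num_cells, len(flows)))  # cell-by-flow incidence matrix
        for j, flow in enumerate(flows):
            for c in fs.table_cells(flow):
                A[c, j] = 1.0  # flow j contributes to cell c's packet sum
        b = np.asarray(fs.packet_count, dtype=float)  # observed per-cell sums
        x, *_ = np.linalg.lstsq(A, b, rcond=None)     # least-squares solve of A x = b
        return {flow: int(round(float(cnt))) for flow, cnt in zip(flows, x)}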

FlowRadar architecture
Stage 2, the network-wide decode, thus has two substages: Stage 2.1, FlowDecode; Stage 2.2, CounterDecode.

Evaluations
- Memory efficiency
- Bandwidth usage
- SingleDecode vs. network-wide decode

Evaluation setup
- Simulation of a k=8 FatTree (80 switches, 128 hosts) in ns-3.
- Memory is configured based on avg(#flows); when a burst of flows happens, network-wide decode is used. The worst case is all switches being pushed to max(#flows).
- Traffic: each switch carries the same number of flows, and thus uses the same amount of memory.
- Each switch reports its flowset every 10 ms.

Memory efficiency
(Figure, log scale: FlowRadar uses 24.8 MB, compared against an impractical baseline that allocates one cell per flow.)

Other results
- Bandwidth usage: only 0.52%, based on the topology and traffic of Facebook data centers (SIGCOMM'15).
- NetDecode improvement over SingleDecode: SingleDecode handles 100K flows in 10 ms; with the same memory, NetDecode decodes 26.8% more flows, taking around 3 seconds.

FlowRadar analyzer
Beyond the two decode stages, the collector can run analysis applications on the decoded per-flow counters.

Analysis applications
Flow coverage: transient loop/blackhole detection, errors in match-action tables, fine-grained traffic analysis.
Temporal coverage: short-time-scale per-flow loss rates, ECMP load imbalance, timely attack detection.

Per-flow loss map: better temporal coverage
FlowRadar detects loss faster than NetFlow. In the slide's example, within one 10 ms report switch 1 has seen 15 packets of a flow while switch 2 has seen only 14, so FlowRadar flags the loss immediately; NetFlow detects it only when the final counters (35 vs. 34) are exported. A sketch follows.
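A toy sketch of the per-flow loss map, assuming the decoded per-flow counters of an upstream and a downstream switch are given as plain dicts:

    def loss_map(upstream, downstream):
        # Any positive upstream/downstream difference in an interval is loss
        # (or packets still on the fly, which the next interval resolves).
        return {flow: sent - downstream.get(flow, 0)
                for flow, sent in upstream.items()
                if sent > downstream.get(flow, 0)}

    # Mirroring the slide's example: after one 10 ms report FlowRadar already
    # sees 15 vs. 14 packets and flags one lost packet.
    print(loss_map({"f1": 15}, {"f1": 14}))  # {'f1': 1}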

Conclusion
FlowRadar reports counters for all flows at a fine-grained time granularity by fully leveraging the capabilities of both the switches and the collector:
- Switch: fixed per-packet processing time, memory-efficient encoded flowset.
- Collector: network-wide decoding.
This places FlowRadar between Mirroring (high collector overhead) and NetFlow (high switch overhead).