
1 Revisiting the Case for a Minimalist Approach for Network Flow Monitoring. Vyas Sekar, Michael K. Reiter, Hui Zhang

2 Many Monitoring Applications: traffic engineering, analysis of new user applications, anomaly detection, network forensics, worm detection, accounting, botnet analysis, ...

3 Need to estimate different metrics: each of these applications relies on a different metric, e.g., "heavy hitters", "degree histogram", "entropy", "changes", "superspreaders", "flow size distribution".

4 How are these metrics estimated? Traffic -> Packet Processing -> Counter Data Structures (Monitoring, on the router) -> Application-Level Metrics (Computation, off the router).

5 Today's solution: Packet Sampling. Sample packets uniformly on the router and maintain FlowId -> Pkt/Byte counts (Monitoring, on the router); compute application-level metrics on the sampled flows off the router (Computation). A flow = packets with the same source/destination addresses and ports. Estimation is inaccurate for fine-grained analysis; there is extensive literature on its limitations for many tasks.
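To make the baseline concrete, here is a minimal Python sketch of uniform packet sampling aggregated into flow records. The packet representation (a dict with hypothetical "flow_id" and "size" fields) and the sampling rate are illustrative assumptions, not part of the original slides.

```python
import random
from collections import defaultdict

def packet_sampling(packets, p=0.01):
    """NetFlow-style baseline: keep each packet independently with probability p,
    and aggregate the kept packets into flow records keyed by the 5-tuple flow id."""
    flow_table = defaultdict(lambda: {"pkts": 0, "bytes": 0})
    for pkt in packets:  # pkt is assumed to look like {"flow_id": ..., "size": ...}
        if random.random() < p:
            rec = flow_table[pkt["flow_id"]]
            rec["pkts"] += 1
            rec["bytes"] += pkt["size"]
    return flow_table
```

Because large flows contribute many packets, the surviving records are biased toward heavy flows, which is one reason fine-grained metrics such as the flow size distribution are hard to recover from packet-sampled data.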

6 Trend: Shift to application-specific monitoring. Each metric (flow size distribution, entropy, superspreaders, ...) gets its own packet processing and counter data structures on the router and its own off-router estimation. Complexity: need a per-metric implementation. Early commitment: applications are a moving target.

7 Full matrix of algorithms. For each algorithm: packet processing, counter data structure, number of instances, and estimation task.
Packet Sampling: sample each packet with probability p and update the flow counter; (flow, counter) table; 1 instance; generic.
Flow Sampling: if hash(flow id) < p, update the flow table; (flow, counter) table; 1 instance; generic.
Sample and Hold: if the flow is in the table, update it, else create a new entry with probability p; (flow, counter) table; 1 instance; generic.
FSD: hash(flow id) into [1, n] and update that counter; counter array [1..n]; 1 instance; task: number of flows with size i.
Sketch: H hash functions, each into [1..k]; update position h_i(packet fields) in each row; counter matrix [H, k]; instances per key (src, dst); task: find sources/destinations with a large shift in volume.
Entropy: if the key is in the multimap, update it, else create a new entry with some probability; multimap of (key, value); instances per key (src, dst, sport, dport, src-dst); task: estimate -\sum_i p_i \log p_i.
Heavy Hitter: use many Sample-and-Hold instances; (key, counter) table; instances per key (src, dst, 5-tuple, src-dst, sport, dport); task: find the top-k items.
SuperSpreader: if hash(src, dst) < p, add to a list; two hash tables; 1 instance; task: find sources contacting more than k destinations.
Degree Histogram: complex packet processing and data structures; 1 instance; task: find the number of sources with degree between 2^i and 2^(i+1).
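As one concrete instance of the "counter matrix [H, k]" structure in the sketch row, here is a minimal count-min-style sketch in Python; the class name and parameters are illustrative, and a change-detection deployment would typically build one sketch per measurement interval and compare them, which is not shown here.

```python
import hashlib

class CountMinSketch:
    """H hash functions, each indexing a row of k counters (a counter matrix [H, k])."""
    def __init__(self, num_hashes=4, width=1024):
        self.h, self.k = num_hashes, width
        self.counters = [[0] * width for _ in range(num_hashes)]

    def _index(self, row, key):
        digest = hashlib.sha1(f"{row}:{key}".encode()).hexdigest()
        return int(digest, 16) % self.k

    def update(self, key, count=1):
        for row in range(self.h):
            self.counters[row][self._index(row, key)] += count

    def estimate(self, key):
        # Point query: the minimum over the rows upper-bounds the true count.
        return min(self.counters[row][self._index(row, key)]
                   for row in range(self.h))
```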

8 Today's solution: Packet Sampling. Sample packets at random and aggregate them into flows (FlowId -> Counter); a flow = packets with the same source and destination addresses and ports. The resulting flow reports are used to estimate FSD, entropy, heavy hitters, changes, superspreaders, and so on. Not good for fine-grained analysis; extensive literature on its limitations for many tasks.

9 Trend: Shift to application-specific monitoring. Metric-specific collection and estimation for each task: Flow Size Distribution, Entropy, Heavy Hitters, Change Detection, Super Spreaders, Outdegree Histogram [Sigmetrics 04] [IMC 06] [Sigmetrics 06] [IMC 08] [NDSS 05] [IMC 05] [IwQoS 07] [Sigcomm 02] [IMC 04] [IMC 03] [LATIN 05].

10 Good estimation accuracy, but... Complexity: need a per-metric implementation. Early commitment: applications are a moving target, yet vendors and operators must commit to fixed capabilities. Is there a simpler and more general alternative?

11 What do we ideally want? Traffic -> Packet Processing -> Counter Data Structures (Monitoring, on the router) -> Application-Specific Metrics (Computation, off the router). Goals: simple, high accuracy, support many applications.

12 Outline: Motivation; A Minimalist Alternative; Evaluation; Summary and discussion

13 Requirements (across applications such as anomaly detection, worm detection, accounting, and botnet analysis): 1. Simple router implementation 2. General across applications 3. Enable drill-down capabilities 4. Network-wide views

14 How do we meet these requirements? 1. Simple router implementation 2. General across applications 3. Enable drill-down capabilities 4. Network-wide views -> Delay binding to specific applications

15 What does it mean to delay binding? In the pipeline Traffic -> Packet Processing -> Counter Data Structures (Monitoring, on the router) -> Application-Level Metrics (Computation, off the router): instead of splitting resources across applications, aggregate them into generic primitives, and keep the on-router stage as "generic" as possible.

16 Decouple Collection and Computation. One set of generic collection primitives feeds the estimation of Flow Size Distribution, Entropy, Heavy Hitters, Change Detection, Super Spreaders, and Outdegree Histogram. Late binding to applications, low complexity. Intuition: instead of splitting resources across applications, aggregate them and give them to generic primitives.

17 What Generic Primitives? Two broad classes of monitoring tasks: 1. Communication structure, e.g., who talked to whom? -> Flow Sampling [Hohn, Veitch IMC '03]. 2. Volume structure, e.g., how much traffic? -> Sample and Hold [Estan, Varghese SIGCOMM '02].

18 Flow Sampling. Compute Hash(5-tuple); if the hash is below r, update the flow's entry (FlowId -> Pkt/Byte counts). A flow = packets with the same source/destination addresses and ports. Picks flows at random, not biased by flow size; good for "communication" patterns.

19 Sample and Hold. If the flow is already in the table, update it; otherwise sample the packet with probability p and, if sampled, create a new entry (FlowId -> Pkt/Byte counts). Gives accurate counts of large flows; good for "volume" queries.

20 Flow sampling (worked example). Hash the flow id, taken from the packet header fields (source and destination IP addresses and ports), into [0, Max]; log and update the flow in the flow memory (flow, packet counter) only if the hash falls in the chosen range, e.g., [3, 10]. Picks flows at random, not biased by flow size; good for "communication" patterns.
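A minimal Python sketch of hash-based flow sampling as described on this slide; the packet representation, hash function, and field names are assumptions made for illustration.

```python
import hashlib

HASH_MAX = 2**32

def flow_sample_packet(pkt, flow_table, sampling_frac=0.1):
    """Hash the flow id into [0, HASH_MAX); if it falls below the sampling
    fraction, the flow is selected and every one of its packets is counted."""
    flow_id = (pkt["src_ip"], pkt["dst_ip"], pkt["src_port"], pkt["dst_port"], pkt["proto"])
    h = int(hashlib.sha1(str(flow_id).encode()).hexdigest(), 16) % HASH_MAX
    if h < sampling_frac * HASH_MAX:
        pkts, nbytes = flow_table.get(flow_id, (0, 0))
        flow_table[flow_id] = (pkts + 1, nbytes + pkt["size"])
```

Because the decision depends only on the flow id, either all packets of a flow are logged or none are, so the selection is not biased by flow size.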

21 Sample and Hold (worked example). Algorithm: if the flow is already logged in the flow memory (flow, #pkts), update its counter; otherwise sample the packet with probability p and, if sampled, create a new counter. Gives accurate counts of large flows; good for "volume" queries.
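A corresponding minimal sketch of Sample and Hold, following the algorithm on the slide; the per-packet probability and the table layout are illustrative.

```python
import random

def sample_and_hold_packet(flow_id, flow_table, p=0.001):
    """If the flow is already logged, count this packet exactly; otherwise
    admit the flow with probability p. Large flows send many packets, so they
    are caught early and their subsequent counts are exact."""
    if flow_id in flow_table:
        flow_table[flow_id] += 1
    elif random.random() < p:
        flow_table[flow_id] = 1
```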

22 How do we meet these requirements? 1. Simple router implementation 2. General across applications 3. Enable drill-down capabilities 4. Network-wide views. Delay binding to specific applications; generic primitives = FS, SH. Retain NetFlow's operational model.

23 Retain NetFlow-like operation. Application-specific: each task (Flow Size Distribution, Outdegree Histogram, ...) produces only summary statistics, so it is difficult to do further analysis, e.g., why is X high? Minimalist: FS + SH produce flow reports, so we can go back and analyze, e.g., estimate new metrics.

24 Retain NetFlow operational model. Application-specific: FSD, Degree Histogram, and Entropy each yield only summary statistics; difficult to do further analysis, e.g., why is X high? Minimalist: FS+SH flow reports let us estimate new metrics (FSD, Entropy, Degree, ...).

25 How do we meet these requirements? 1. Simple router implementation and 2. General across applications: delay binding to specific applications; generic primitives = FS, SH. 3. Enable drill-down capabilities: retain NetFlow's operational model, i.e., keep flow reports. 4. Network-wide views: network-wide resource management.

26 Network-Wide Sample-and-Hold. Repeating Sample-and-Hold at every router along a path wastes resources; instead, do it once per path.

27 Network-Wide Flow Sampling. Use cSamp [NSDI'08] to configure the flow sampling capabilities: hash-based coordination -> routers monitor non-overlapping sets of flows; network-wide optimization -> operator goals, e.g., per-path guarantees.
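A rough sketch of the hash-based coordination idea: the per-path flow-sampling load is carved into disjoint hash sub-ranges, one per router on the path, so no flow is monitored twice. In cSamp the per-router fractions come from a network-wide optimization; here they are simply split evenly, which is an assumption for illustration.

```python
def assign_hash_ranges(path_routers, per_path_frac):
    """Assign each router on the path a disjoint sub-range of the hash space
    [0, 1); together the sub-ranges cover per_path_frac of the flows on the path."""
    share = per_path_frac / len(path_routers)
    ranges, start = {}, 0.0
    for router in path_routers:
        ranges[router] = (start, start + share)
        start += share
    return ranges

# Example: sample 30% of the flows on path R1 -> R2 -> R3, spread over 3 routers.
print(assign_hash_ranges(["R1", "R2", "R3"], 0.3))
# approximately {'R1': (0.0, 0.1), 'R2': (0.1, 0.2), 'R3': (0.2, 0.3)}
```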

28 Putting the pieces together: the "Minimalist" proposal. Traffic feeds two flow tables of FlowId -> Pkt/Byte counts:
Flow Sampling: h = Hash(flowid); if h is in FS_Range(path), create/update the flow's entry.
Sample & Hold: if Ingress(path): if the flow is in the table, update it; if it is new, create it with probability SH_p(path).
FS_Range(path) and SH_p(path) are configuration parameters, e.g., set via network-wide optimization using cSamp+.
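Putting the two primitives into one per-packet routine, following the pseudocode above; the hash normalization, table layouts, and everything other than the FS_Range/SH_p parameter names are assumptions of this sketch.

```python
import hashlib
import random

def process_packet(flow_id, pkt_size, fs_table, sh_table, fs_range, sh_p, is_ingress):
    """fs_range: (lo, hi) hash sub-range assigned to this router for this packet's
    path; sh_p: per-path sample-and-hold probability; is_ingress: run SH only at
    the path's ingress so it is done once per path."""
    h = int(hashlib.sha1(str(flow_id).encode()).hexdigest(), 16) / 2**160  # in [0, 1)

    # Flow sampling: log the flow if its hash lands in this router's range.
    lo, hi = fs_range
    if lo <= h < hi:
        pkts, nbytes = fs_table.get(flow_id, (0, 0))
        fs_table[flow_id] = (pkts + 1, nbytes + pkt_size)

    # Sample and hold: only at the ingress of the path.
    if is_ingress:
        if flow_id in sh_table:
            sh_table[flow_id] += 1
        elif random.random() < sh_p:
            sh_table[flow_id] = 1
```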

29 Putting the pieces together: 1. Simple router implementation 2. General across applications 3. Enable drill-down capabilities 4. Network-wide views. Generate flow reports with Flow Sampling and Sample-and-Hold; use cSamp+ to configure these primitives.

30 What do we ideally want? Traffic -> Packet Processing -> Counter Data Structures (Monitoring, on the router) -> Application-Specific Metrics (Computation, off the router). Simple ✔ Support many applications ✔ High accuracy ?

31 Outline: Motivation; A Minimalist Alternative; Evaluation (compare FS+SH vs. application-specific); Summary and discussion

32 Assumptions in resource normalization:
Hardware requirements are similar: both need per-packet array/key-value updates; more than packet sampling, but within router capabilities.
Processing costs: the online cost is lower for the minimalist approach (no per-application instances); the offline cost is higher (but can be reduced if necessary).
Reporting bandwidth: higher for the minimalist approach, but < 1% of network capacity.
Memory for counters: the bottleneck is SRAM (flow headers can be offloaded to DRAM); we conservatively assume a 4X higher per-counter cost.

33 Head-to-Head Comparison. Application-specific: each task in the application portfolio (Flow Size Distribution, ..., Outdegree Histogram) has its own collect and estimate stages. Minimalist: a single FS+SH collection stage feeds the estimation of every task. Both are given the same total resources. Relative accuracy difference = (Accuracy(Minimalist) - Accuracy(AppSpecific)) / Accuracy(AppSpecific).

34 Head-to-Head Comparison (normalized SRAM). Application-specific: per-task collection and estimation over the application portfolio. Minimalist: FS+SH collection, then estimate FSD, Entropy, Degree, and the other metrics. The SRAM budget is normalized so that both get the same memory. Relative accuracy difference = (Accuracy(Minimalist) - Accuracy(AppSpecific)) / Accuracy(AppSpecific).
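For clarity, the comparison metric as a small Python helper; the example numbers are made up.

```python
def relative_accuracy_difference(acc_minimalist, acc_app_specific):
    """(Accuracy(Minimalist) - Accuracy(AppSpecific)) / Accuracy(AppSpecific);
    positive values mean the minimalist approach is at least as accurate."""
    return (acc_minimalist - acc_app_specific) / acc_app_specific

# Example: minimalist accuracy 0.95 vs. app-specific 0.90 -> about +0.056.
print(round(relative_accuracy_difference(0.95, 0.90), 3))
```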

35 Resource split between FS and SH. Methodology: run the application-specific algorithms with their recommended parameters (details in the paper), measure their memory use, then run FS+SH with the aggregate memory normalized by the 4X per-counter cost (i.e., 1/4 of the aggregate). Packet trace from CAIDA; results are consistent over other traces. In the plot, + means the minimalist approach is better and - means worse. We pick an 80-20 split as a good operating point; the relative difference is positive for most applications.

36 Varying the application portfolio: minimalist vs. application-specific under the same resources (y-axis: relative accuracy difference; + means good, - means bad). With more tasks, or when some tasks are resource-intensive, the minimalist approach does better across the entire portfolio: a "sharing" effect across estimation tasks. Packet trace from CAIDA; consistent over other traces.

37 Network-Wide View. Flow-level traces from Internet2; the application-specific approach is configured per PoP; we measure its resource consumption, normalize it, and give the same budget to network-wide FS+SH. Lower is better.

Estimation task (error metric)   | Application-Specific | Uncoordinated FS+SH | Coordinated FS+SH
FSD (WMRD)                       | 0.16                 | 0.19                | 0.02
Heavy Hitter (miss rate)         | 0.02                 | 0.30                | 0.04
Entropy (relative error)         | not available        | 0.03                | 0.02
SuperSpreader (miss rate)        | 0.02                 | 0.04                | 0.01
Degree Histogram (JS divergence) | 0.15                 | 0.03                | 0.02

1. Application-specific collection is configured per ingress, so it cannot provide different network-wide views (e.g., per OD-pair). 2. Coordination gives better performance and operational simplicity; uncoordinated FS+SH introduces some biases due to duplicate reports.

38 Conclusions and discussion. Even a simple "minimalist" approach might work. Key: focus on the portfolio rather than individual tasks. Proposal: FS + SH (complementary primitives) with cSamp-like network-wide management. Implications for device vendors and operators: late binding, lower complexity; this is a quest for feasibility, not optimality. Open questions: better primitives, combinations, or estimation procedures? Is this sufficient?

39 Why is heavy hitter detection bad for FS/SH? Heavy hitters are defined over different flow keys: 5-tuple, src, dst, sport, dport, src-dst. SH in the minimalist approach runs a single instance over the 5-tuple, while SH in the application-specific approach runs a per-key instance. Some information is lost in projecting from the 5-tuple, a tradeoff between processing cost and accuracy.
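To illustrate the "information loss in projecting from the 5-tuple", here is a small sketch that projects 5-tuple Sample-and-Hold records onto a coarser key (destination address by default) before finding heavy hitters offline; the record layout is an assumption of this sketch.

```python
from collections import Counter

# Positions of the 5-tuple fields in each flow key.
SRC, DST, SPORT, DPORT, PROTO = range(5)

def project_counts(flow_records, key_fields=(DST,)):
    """flow_records: {(src, dst, sport, dport, proto): pkt_count}.
    Aggregate the sampled counts onto the chosen key fields, e.g., per destination.
    Running one 5-tuple instance and projecting offline is cheaper than running a
    dedicated per-key SH instance, but loses some accuracy, as the slide notes."""
    counts = Counter()
    for flow_key, pkts in flow_records.items():
        counts[tuple(flow_key[i] for i in key_fields)] += pkts
    return counts
```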

40 Why is the 4X factor conservative? The 4X reflects key-value storage being more expensive than a plain counter array. Some application-specific algorithms also need key-value structures, but we assume they do not incur this overhead. A software-only implementation (using google-sparsehash) is within the 4X factor. Better hardware counters would help further: space-efficient counters, e.g., [Sigmetrics '06], and new algorithms, e.g., Counter Braids [Sigmetrics '08].

41 What about programmable routers? Complementary: we do not preclude them. They can be thought of as providing the "primitives", but they still do not answer the "configuration" question, and in some cases raise performance concerns.

