1 Counter Braids: a novel counter architecture for network measurement
2 Data Measurement: Background Accurate data measurement is needed in network and database systems. Internet Backbones – Accounting/Billing by ISPs: Usage-based pricing – Traffic engineering – Network diagnostics and forensics: Intrusion detection, denial-of-service attacks – Products: NetFlow (Cisco), cflowd (Juniper), NetStream (Huawei) – Ref. Lin, Xu, Kumar, Sung, Ramabhadran, Varghese, Estan, Varghese, Shah, Iyer, Prabhakar, McKeown, etc. Database Systems – Sketches, synopses of data streams – Usage and access logs – Ref. Cormode, Muthukrishnan, Babcock, Babu, Motwani, Widom, Charikar, Chen, Farach-Colton, Alon, etc Data Centers, Cloud Computing – Monitoring network usage, diagnostics, real-time load balancing, network planning – E.g. Amazon, Facebook, Google, Microsoft, Yahoo!
3 The Problem A flow is a sequence of packets with common properties, e.g. the same source-destination address, a common IP header 5-tuple, or a common prefix in the routing table – an email, a web download, a video or voice stream, ... Problem: measure the number of bytes and packets sent by each flow in a measurement interval. [Figure: packets arriving on a link and the corresponding per-flow packet counters]
4 The Constraints Large number of active flows in a measurement interval – several million over a 5-min interval on backbone links (CAIDA) – Problem: need lots of counters, i.e. a large memory. Very high link speeds require fast memory updates – e.g. the per-packet update time on a 40 Gbps link is roughly 15 nanoseconds – Problem: need memory with fast access times. Memories are either large (DRAM) or fast (SRAM); we can't get both – DRAMs have access times of 50-60 ns – SRAMs have access times of 4.5-7 ns, but only around 50-60 Mb of capacity (Micron Tech.)
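As a rough sanity check on the 15 ns figure (assuming back-to-back packets of about 75 bytes on the wire, an assumption not stated on the slide):

$$\frac{75 \text{ bytes} \times 8 \text{ bits/byte}}{40 \times 10^{9} \text{ bits/s}} = 15 \text{ ns per packet}.$$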
5 Building Counter Arrays: a naïve approach Allocate one counter per flow – when a packet arrives, look up its flow-id in memory and update the corresponding counter. Problems – too much space is taken up by counters, especially because not all flows are large – the flow-to-counter association must be performed for every packet, i.e. we need a perfect hash function. [Figure: five flows, each mapped to its own exact counter]
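For concreteness, a minimal Python sketch of this naïve per-flow counting (the flow-ids and packet sizes are made up for illustration):

```python
from collections import defaultdict

# Naive approach: one exact counter per flow, keyed by flow-id.
counters = defaultdict(int)

def update(flow_id, nbytes):
    """Called once per packet: look up the flow's counter and add to it."""
    counters[flow_id] += nbytes

# Example: three packets from flow "A", one from flow "B".
for fid, size in [("A", 1500), ("A", 40), ("B", 576), ("A", 40)]:
    update(fid, size)

print(dict(counters))  # {'A': 1580, 'B': 576}
```

Every packet pays one exact lookup and every flow pays one full-width counter, which is exactly the cost this slide objects to.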
6 Previous Approaches 1. Hybrid SRAM-DRAM architecture (Shah et al. '01, Ramabhadran and Varghese '03, Zhao et al. '06). Pros: obtains exact counts; reduces the total memory cost of the counters. Cons: – need to know the flow-to-counter association – high-complexity counter management. [Figure: small counters in fast memory (SRAM), full counters in slow memory (DRAM), with a counter-management scheme between them]
7 Previous Approaches 2. Approximate counting (Estan and Varghese '02): measures data from large flows (elephants) and ignores small flows (mice) – exploits the Pareto distribution of network traffic (Fang and Peterson '99, Feldmann et al. '00). Pros: – strikes a neat trade-off: accuracy for simplicity – the flow-to-counter association becomes manageable. Cons: flow counts are approximate. [Figure: approximate large-flow detection in fast memory (SRAM)]
8 Summary Exact counting methods – Hybrid architecture, complex Approximate methods – Focus on large flows, inaccurate Problems to address – Get rid of flow-to-counter association problem – Save counter space (reduce memory size)
9 Getting Rid of Flow-to-Counter Association Use multiple hash functions – a single hash function leads to collisions – however, one can use two or more hash functions and use the redundancy to recover the flow sizes. [Figure: flows hashed into counters with two hash functions; colliding flows can still be recovered from the redundant counts] Need an efficient decoding algorithm for solving c = Mf – c: counter values, M: adjacency matrix, f: flow sizes; e.g.
M =
0 1 1 0 0
1 1 0 0 0
0 0 1 1 0
1 0 1 0 0
0 0 0 1 1
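A sketch of the update (encoding) side in Python, under assumed parameters (m = 8 counters, k = 2 hash functions; the salted-SHA-1 hashing is purely illustrative, not the authors' construction):

```python
import hashlib

m, k = 8, 2
counters = [0] * m

def hashes(flow_id):
    """The k counter indices that a flow-id maps to."""
    return [int(hashlib.sha1(f"{s}:{flow_id}".encode()).hexdigest(), 16) % m
            for s in range(k)]

def on_packet(flow_id, inc=1):
    # Every packet increments all k counters its flow hashes to.
    for a in hashes(flow_id):
        counters[a] += inc

# Example traffic with true flow sizes f = {"A": 2, "B": 24, "C": 3}.
for fid, size in {"A": 2, "B": 24, "C": 3}.items():
    for _ in range(size):
        on_packet(fid)

print(counters)  # c = M f for the incidence matrix M of these hashes
```

Decoding then amounts to inverting c = Mf given the sparse matrix M, which is what the rest of the talk is about.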
10 Saving Counter Space "Braid" the counters – share the more significant bits. [Figure: the same counter values (2, 24, 3, 1, 3) stored with and without braiding of their more significant bits]
11 Updating Counter Braids [Figure: a packet arrival increments the braided counters of its flow (example counter values 2, 24, 3, 1, 3)]
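A heavily hedged Python sketch of a two-layer braided update in the spirit of these slides: shallow first-layer counters carry over into a few deep second-layer counters when they overflow. The layer sizes, depths, hashing scheme, and exact carry rule are illustrative assumptions, not the hardware design from the talk:

```python
import hashlib

M1, DEPTH1 = 16, 4   # many shallow first-layer counters (4 bits each)
M2, DEPTH2 = 4, 16   # few deep second-layer counters
K = 2                # hash functions per flow

layer1 = [0] * M1
layer2 = [0] * M2
status = [False] * M1          # status bit: this first-layer counter overflowed

def h(key, m, salt):
    return int(hashlib.sha1(f"{salt}:{key}".encode()).hexdigest(), 16) % m

def increment(flow_id):
    # Each packet increments the flow's K first-layer counters.
    for s in range(K):
        a = h(flow_id, M1, s)
        layer1[a] += 1
        if layer1[a] >= 2 ** DEPTH1:       # first-layer counter overflows:
            layer1[a] -= 2 ** DEPTH1       # keep only the low-order bits,
            status[a] = True               # mark the overflow,
            layer2[h(a, M2, "L2")] += 1    # carry one unit into layer 2.
```

The point of the braid is that the expensive high-order bits are shared across counters instead of being provisioned per flow.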
12 The Overall Architecture [Figure: mouse traps – many shallow counters; elephant traps – few, deep counters; a status bit indicates overflowed flows]
13 System Diagram [Figure: online – count using CBs, producing the counter values in the CB structure; offline – the decoder takes the counter values and the list of flow labels and outputs the list of flow sizes, e.g. 2, 24, 3, 1, 3]
14 Will Describe Decoding algorithms Sizing of Counter Braids – Comparison with Linear Programming Decoder
15 Decoding Algorithms 1. Typical set decoder 2. Message-passing decoder
16 Typical set decoder Assumptions on the flow size distribution P: 1. at most power-law tails; 2. decreasing digit entropy: writing f_i in digits f_i(l), the entropy of f_i(l) is decreasing in l. Both are satisfied by real traffic distributions. Typical set decoder: let f' = (f_1', ..., f_n') and let P' be its empirical distribution. If f' is the unique vector that solves c = Mf with D(P' || P) < δ (for a small constant δ), then output f'; otherwise, output an error. Theorem (Asymptotic Optimality; Lu, Montanari, Prabhakar '07). For any rate r > H(P) there exists a sequence of reliable sparse Counter Braids with asymptotic rate r. Hence, with the typical set decoder, Counter Braids is optimal. However, the typical set decoder is too complex to implement.
17 Message Passing Decoder (Lu et al. '08) A linear-complexity decoding algorithm. In each layer, the decoder solves the set of linear equations c = Mf. [Figure: two-layer braid – shallow first-layer counters, with the carry-over going to second-layer counters; the second-layer "flows" exist only conceptually]
18 A Practical Algorithm: Count-Min Count-Min (Cormode and Muthukrishnan '05) – Algorithm: estimate each flow's size as the minimum of the counters it hits – the flow sizes in the example below would be estimated as 34, 34, 32. Pros – fast and easy to implement. Cons – the decision is based on local information, so lots of counters are needed for accurate estimation – the error magnitude is unknown; in fact, we don't even know whether there is an error. [Figure: three flows hashed into counters holding 1, 1, 32, 34, 32]
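A minimal Count-Min sketch in Python for reference (the width, depth, and salted-hash construction are illustrative choices, not tuned parameters):

```python
import hashlib

class CountMin:
    """d rows of w counters; each flow updates one counter per row;
    its estimate is the minimum of those d counters (never an underestimate)."""
    def __init__(self, width=1024, depth=3):
        self.w, self.d = width, depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, key, row):
        return int(hashlib.sha1(f"{row}:{key}".encode()).hexdigest(), 16) % self.w

    def update(self, key, inc=1):
        for r in range(self.d):
            self.rows[r][self._index(key, r)] += inc

    def estimate(self, key):
        return min(self.rows[r][self._index(key, r)] for r in range(self.d))

cm = CountMin()
for fid, size in [("A", 2), ("B", 24), ("C", 3)]:
    cm.update(fid, size)
print([cm.estimate(f) for f in "ABC"])  # exact here unless the hashes collide
```

Collisions only ever add to a counter, which is why Count-Min overestimates; the next slides reinterpret exactly this computation as one iteration of message passing.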
19 Count-Min as Message Passing [Figure: iteration 0 initializes all flow estimates to 0; iteration 1 reproduces the Count-Min estimates (34, 34, 32), which are upper bounds on the true flow sizes]
20 Message Passing continues [Figure: the next iteration refines the estimates on the same example]
21 Message Passing continues [Figure: iterations 2-4 on the same example; the even-iteration estimates (e.g. 1, 1, 32) are lower bounds on the true flow sizes]
22 Message Passing Decoder
Initialize: μ_ia(0) = 0 for all i and all a.
Iterations, for t = 1 to T:
  ν_ai(t) = max{ c_a − Σ_{j≠i} μ_ja(t−1), 1 }
  μ_ia(t) = min_{b≠a} ν_bi(t) if t is odd
          = max_{b≠a} ν_bi(t) if t is even
Final estimate:
  f̂_i(T) = min_a ν_ai(T) if T is odd
          = max_a ν_ai(T) if T is even
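A compact Python sketch of this decoder under the notation above (μ: flow-to-counter messages, ν: counter-to-flow messages); the graph is given by adjacency lists, every flow is assumed to hash to at least two counters, and the small example at the end is made up for illustration:

```python
def mp_decode(flows, counters, c, T=20):
    """flows[i]: counters flow i hashes to; counters[a]: flows hitting counter a;
    c[a]: value of counter a. Returns estimated flow sizes after T iterations."""
    n, m = len(flows), len(counters)
    mu = {(i, a): 0 for i in range(n) for a in flows[i]}      # mu_ia(0) = 0
    nu = {}
    for t in range(1, T + 1):
        # counter-to-flow: nu_ai = max(c_a - sum of the other incoming mu, 1)
        for a in range(m):
            total = sum(mu[(j, a)] for j in counters[a])
            for i in counters[a]:
                nu[(a, i)] = max(c[a] - (total - mu[(i, a)]), 1)
        # flow-to-counter: min of the other incoming nu if t is odd, max if even
        pick = min if t % 2 == 1 else max
        for i in range(n):
            for a in flows[i]:
                mu[(i, a)] = pick(nu[(b, i)] for b in flows[i] if b != a)
    final = min if T % 2 == 1 else max
    return [final(nu[(a, i)] for a in flows[i]) for i in range(n)]

# Example: 3 flows of sizes 2, 24, 3; 4 counters; 2 hashes per flow.
flows = [[0, 1], [1, 2], [2, 3]]
counters = [[0], [0, 1], [1, 2], [2]]
c = [2, 2 + 24, 24 + 3, 3]
print(mp_decode(flows, counters, c))  # -> [2, 24, 3]
```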
23 Anti-monotonicity Property Lemma (Anti-monotonicity; Lu et al. '08). If μ and μ' are such that for every i and a, μ_ia(t−1) ≤ μ'_ia(t−1) ≤ f_i, then μ_ia(t) ≥ μ'_ia(t) ≥ f_i. Consequently, since f̂(0) = 0, f̂(2t) ≤ f component-wise and f̂(2t) is component-wise non-decreasing; similarly, f̂(2t+1) ≥ f and is component-wise non-increasing. Corollary: incorrect flow size estimates can be identified as those whose upper bounds differ from their lower bounds. When does the gap close? [Figure: upper- and lower-bound estimates vs. flow index, sandwiching the true flow sizes]
24 Convergence: Sizing of Counter Braids
25 Convergence Definitions: n flows, m counters, k hash functions. Error measure: P_err, the fraction of flows decoded incorrectly. Number of counters per flow: β = m/n, where m is the number of counters and n is the number of flows.
26 Threshold: β* = 0.72 (number of flows = 10,000) [Figure: P_err vs. iteration number for several values of β, with Count-Min's error shown for reference] β = 0.71: 21% of flows are not decoded, but we know exactly which flows are in error. β = 0.725: converges in 25 iterations. β = 0.74: converges faster.
27 Threshold Let λ = nk/m be the average degree of a counter node, let ε = P(f_i > 1), and let g_λ(x) be the density-evolution map derived on the following slides. Let λ* = sup{ λ ∈ ℝ : x = g_λ(x) has no solution for any x ∈ (0,1] }. Theorem (The Threshold; Lu et al. '08). β* = k/λ* is such that, in the large-n limit: (i) if β > β*, P_err → 0; the lower bounds and upper bounds converge to the true flow sizes. (ii) if β ≤ β*, P_err > 0; there exists a positive proportion of flows whose lower bounds differ from their upper bounds, i.e. some flows are incorrectly decoded.
28 Density Evolution Lemma (Locally Tree-like Property; Modern Coding Theory, Richardson and Urbanke). Consider running the algorithm for a finite number of iterations. The error probability, averaged over the class of random graphs, converges as the system size grows to the error probability on a finite tree, where messages are independent of each other. Flow nodes have degree k; counter nodes have random degrees with a Poisson distribution of mean nk/m.
29 Density Evolution Error probability of a counter-to-flow message: y_t = 1 − (1 − x_t)^(j−1), where x_t is the error probability of an incoming flow-to-counter message and j is the degree of the counter node, which has a Poisson distribution with mean nk/m.
30 Density Evolution Error probability of a counter-to-flow message, averaged over j ~ Poisson(λ): y_t = 1 − E[(1 − x_t)^(j−1)] = 1 − e^(−λ x_t), where the expectation is evaluated with the moment generating function of the Poisson distribution.
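Spelling out the averaging step (assuming, as is standard for Poisson degree distributions, that the residual degree j − 1 seen along a randomly chosen edge is also Poisson(λ)):

$$\mathbb{E}\big[(1-x)^{\,j-1}\big] = \sum_{r \ge 0} e^{-\lambda}\frac{\lambda^{r}}{r!}\,(1-x)^{r} = e^{-\lambda}\,e^{\lambda(1-x)} = e^{-\lambda x},$$

which gives $y_t = 1 - e^{-\lambda x_t}$.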
31 Density Evolution Error probability of a counter-to-flow message: y_t = 1 − e^(−λ x_t). Error probability of a flow-to-counter message: x_t = y_t^(k−1) if t is odd, and x_t = ε · y_t^(k−1) if t is even, where k is the number of hash functions and ε = P(f > 1).
32 Density Evolution Substituting y_t = 1 − e^(−λ x_t) into the flow-to-counter recursion and composing one odd and one even iteration, we obtain a one-dimensional recursion x_(t+2) = g_λ(x_t). Now look for the fixed points of g_λ(x) = x.
33-36 Computation of Threshold Above the threshold (β > β*), the decoding error goes to 0. Below the threshold (β < β*), the decoding error stays positive. The threshold β = β* is reached when g(x) is tangent to the diagonal. [Figure: g(x) vs. x for β > β*, β = β*, and β < β*, together with the fraction of incorrectly decoded flows vs. iteration number]
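A numerical sketch of this threshold computation, using the density-evolution recursion as reconstructed on the preceding slides (y = 1 − e^(−λx), with the factor ε = P(f > 1) applied on the even, lower-bound half-step). The recursion details and the parameters k, ε, and the bisection bracket are assumptions for illustration; the paper's exact map may differ:

```python
import math

def decodes(beta, k=3, eps=0.5, iters=2000, tol=1e-9):
    """True if density evolution drives the message error to ~0 at beta = m/n."""
    lam = k / beta                 # average counter degree lambda = nk/m
    x = 1.0                        # worst-case initial flow-message error
    for t in range(1, iters + 1):
        y = 1.0 - math.exp(-lam * x)        # counter-to-flow error
        x = y ** (k - 1)                    # flow-to-counter error ...
        if t % 2 == 0:
            x *= eps                        # ... with factor eps on even steps
        if x < tol:
            return True
    return False

def threshold(k=3, eps=0.5, lo=0.1, hi=2.0):
    """Bisect for the smallest beta at which decoding succeeds (lo fails, hi works)."""
    for _ in range(40):
        mid = 0.5 * (lo + hi)
        lo, hi = (lo, mid) if decodes(mid, k, eps) else (mid, hi)
    return hi

print(round(threshold(), 3))  # estimated beta* for the assumed k and eps
```

Above the returned value the iteration collapses to zero error; just below it, the map acquires a nonzero fixed point and the error stalls, which is the tangency condition on this slide.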
37 Comparison with Linear Programming Decoder
38 Relation to Compressed Sensing Compressed sensing – storing sparse vectors using random linear transformations (Baraniuk, Candes, Donoho, Gilbert, Hassibi, Indyk, Milenkovic, Muthukrishnan, Romberg, Strauss, Tanner, Tao, Tropp, Vershynin, Wainwright, et al.). Problem statement: minimize ||f||_1 subject to c = Mf – LP decoding has worst-case cubic complexity. We consider sparse non-negative solutions and compare the Linear Programming (LP) decoder with the Message Passing (MP) decoder – given a sparse vector f with a proportion α of non-zero entries, let m = n β*(α) be the number of equations needed for each algorithm to recover f as n → ∞, with high probability over the space of random matrices and the space of vectors f.
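A small sketch of the LP decoder on this problem using SciPy (the sizes, sparsity, and the random 0/1 matrix with two ones per column are illustrative choices; for non-negative f, minimizing ||f||_1 is just minimizing the sum of the entries):

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n, m, k = 60, 30, 2                     # flows, counters, hashes per flow

M = np.zeros((m, n))
for i in range(n):
    M[rng.choice(m, size=k, replace=False), i] = 1   # flow i hits k counters

f_true = np.zeros(n)
support = rng.choice(n, size=6, replace=False)       # a few non-zero flows
f_true[support] = rng.integers(1, 30, size=6)
counts = M @ f_true                                  # c = M f

# minimize ||f||_1 = sum(f) subject to M f = c, f >= 0
res = linprog(c=np.ones(n), A_eq=M, b_eq=counts, bounds=[(0, None)] * n)
print(np.allclose(res.x, f_true, atol=1e-6))         # often True at this m/n
```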
39 Comparison (Lu, Montanari, Prabhakar '08) Theorem (LP Threshold; Donoho, Tanner '05). Consider an m x n matrix M with columns drawn independently from a multivariate normal distribution on R^d with a nonsingular covariance matrix. For α ∈ [0,1], the LP threshold is numerically obtainable and plotted in the graph, with a known asymptotic form as α → 0. Theorem (MP Threshold; Lu, Montanari, Prabhakar '08). Consider sparse matrices with given degree distributions. For any α ∈ [0,1], β* ≤ 1/2, and a sharper expression holds for α ≤ 1/8. [Figure: β* vs. α for the LP decoder and our (MP) algorithm]
40 Implementation Flow label collection (Lu, Prabhakar '09): flow labels are collected and stored in DRAM, with a small fraction (e.g. 0.5%) missing; the resulting problem is equivalent to decoding with noise. FPGA implementation (Lu, Luo, Prabhakar, submitted): simple to implement – 2512 slice flip-flops and 5342 4-input LUTs on NetFPGA, with 2.8 MB of memory – 65-nm ASIC synthesis: 0.0487 mm² for logic and 15.66 mm² for memory. Parallel architecture to achieve high throughput: 84 Gbps at a 125 MHz clock.
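One way to read the throughput figure (assuming the pipeline handles one minimum-size Ethernet packet per clock cycle, i.e. 64 bytes of frame plus 20 bytes of preamble and inter-frame gap; this interpretation is not stated on the slide):

$$84 \text{ bytes/cycle} \times 8 \text{ bits/byte} \times 125 \text{ MHz} = 84 \text{ Gbit/s}.$$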
41 Trace Simulation CAIDA trace (2 hours of data) simulation. Measurement interval: 5 min. Number of flows: ~0.9 million.
SRAM space (MB): Counter Braids 2.9†; Hash table 28.4*; Hybrid 18.9*; Approximate 0.144#
Results: Counter Braids 99.535% exact, 0.465% with known error; Hash table exact; Hybrid exact; Approximate 0.9% error for the largest 3000 flows
† Includes Bloom filters for flow label collection and status bits for the first-layer counters.
* Hash table and hybrid assume 1.5x over-provisioning using d-left hash tables.
# Multi-stage filters assume 4000 x 4 x 3 bytes plus 3000 x 32 bytes of flow memory (100,000 flows, largest ~3000).
Decoding complexity: 1 million flows take 15 seconds to decode on a 2.6 GHz machine.