Author: Sriram Ramabhadran, George Varghese Publisher: SIGMETRICS’03 Presenter: Yun-Yan Chang Date: 2010/12/29 1.


1
Author: Sriram Ramabhadran, George Varghese
Publisher: SIGMETRICS’03
Presenter: Yun-Yan Chang
Date: 2010/12/29

2
• Introduction
• Previous works
• Scheme
  ◦ LR(T)
  ◦ Aggregated bitmap
• Implementation
• Conclusion

3
• Removes the bottleneck of [1] by proposing a counter management algorithm (CMA) called LR(T) (Largest Recent with threshold T), which avoids sorting by keeping only a bitmap that tracks the counters that are larger than threshold T.

4
• D. Shah, S. Iyer, B. Prabhakar, and N. McKeown
  ◦ Maintaining statistics counters in router line cards
• Proposes a hybrid architecture in which DRAM stores the statistics counters and a small amount of SRAM enables counter updates at line rate.
• Proposes a CMA called LCF (Largest Counter First), which picks the counter with the largest value to be updated to DRAM.

5
• Architecture
  ◦ SRAM stores N counters of size m < M bits.
  ◦ DRAM stores N counters of size M bits.
• The SRAM counters hold recent updates and are periodically transferred to the corresponding DRAM counters.
Figure 1. Statistics counter architecture

6
• Largest Counter First (LCF)
  ◦ An algorithm that minimizes the size of the SRAM.
    • Selects the largest counter.
    • If multiple counters have the same value, picks one arbitrarily.
    • Updates the value of the corresponding counter in the DRAM and resets the counter in the SRAM.
  ◦ Bottleneck:
    • Sort: finding the largest counter
    • Difficult to implement at high speed
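
The LCF step above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation from [1]: the function name `lcf_flush` and the list-based `sram` and `dram` arrays are assumptions made for the example. The linear scan for the maximum stands in for the sort that the slide identifies as the bottleneck.

```python
def lcf_flush(sram, dram):
    """Perform one DRAM update under LCF and return the flushed index.

    sram and dram are hypothetical lists holding the N small and N
    full-width counters; they are illustration-only names.
    """
    # Linear scan for the largest counter. Ties go to the lowest index,
    # which is one valid way to "pick one arbitrarily".
    j = max(range(len(sram)), key=lambda k: sram[k])
    dram[j] += sram[j]  # fold the recent count into the DRAM counter
    sram[j] = 0         # reset the SRAM counter
    return j


sram = [3, 7, 2]
dram = [0, 0, 0]
flushed = lcf_flush(sram, dram)
print(flushed, sram, dram)
```

Finding the maximum costs time proportional to N on every DRAM update in this sketch, which is the cost LR(T) is designed to avoid.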

7
• Algorithm description
  ◦ Let $j^*$ be the counter with the largest value among the counters incremented in the last cycle of b updates to SRAM.
  ◦ If $c_{j^*} \ge T$, LR(T) updates counter $j^*$ to DRAM.
  ◦ If $c_{j^*} < T$, LR(T) updates any counter with value at least T to DRAM.
  ◦ If no such counter exists, LR(T) updates counter $j^*$ to DRAM.
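
The selection rule above can be written as a short Python sketch. The names `lr_t_select` and `run_cycle` and the list-based counters are assumptions for illustration; the linear search for a counter with value at least T is a stand-in for the bitmap `find()` operation that the scheme uses to avoid scanning.

```python
def lr_t_select(sram, recent, T):
    """Return the index of the counter LR(T) would update to DRAM.

    sram:   list of SRAM counter values (hypothetical representation)
    recent: indices incremented in the last cycle of b updates
    T:      threshold
    """
    # j* is the largest counter among those touched in the last cycle.
    j_star = max(recent, key=lambda j: sram[j])
    if sram[j_star] >= T:
        return j_star
    # Otherwise any counter with value at least T will do.
    # A real design would ask the bitmap for one instead of scanning.
    for j, value in enumerate(sram):
        if value >= T:
            return j
    # No counter has reached T, so fall back to j*.
    return j_star


def run_cycle(sram, dram, updates, T):
    """Apply one cycle of increments, then one DRAM update chosen by LR(T)."""
    for j in updates:
        sram[j] += 1
    j = lr_t_select(sram, updates, T)
    dram[j] += sram[j]
    sram[j] = 0
    return j
```

For example, with `sram = [5, 1, 2, 0]`, `recent = [1, 2]` and `T = 4`, the largest recent counter is index 2 with value 2, which is below the threshold, so the sketch selects index 0, the only counter with value at least 4.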

8
• Proof:
  ◦ Threshold T = 0 allows a simple implementation, while T = b is optimal and minimizes the SRAM size requirement.
  ◦ LR(0)
    • Only remembers the last b updates to SRAM when determining which counter to update to DRAM.
    • Let [symbol missing in source] be the maximum value a counter can reach under LR(0).
    • Theorem 1: [statement missing in source]
    • Implies an SRAM counter of size at least [expression missing in source]

9
  ◦ LR(b)
    • The threshold increases from 0 to b.
    • b: the time between DRAM accesses
    • Let [symbol missing in source] be the maximum value a counter can reach under LR(0).
    • Theorem 2: [statement missing in source]
    • Implies any counter is at most $(b - 1)(N - 1)$.
    • The value of a counter cannot be larger than $(b - 1) + \log_d (N - 1)$, where [definition of $d$ missing in source]

10
• To minimize the required storage
  ◦ Consider a fixed universe U of N elements labelled 1, 2, …, N.
  ◦ Use a bitmap $b_1 b_2 \ldots b_N$ to record which elements are contained in a set S.
    • $b_i$ is set to 1 if element i ∈ S, and to 0 otherwise.
• Implemented functions:
  ◦ add(i): adds element i to set S
  ◦ delete(i): deletes element i from set S
  ◦ test(i): tests whether element i belongs to set S
  ◦ find(): returns any element i that belongs to set S
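
The four operations above can be illustrated with a flat, non-aggregated bitmap. This is a minimal sketch under assumptions: the class name `BitmapSet` is invented for the example, a Python integer stands in for the N-bit bitmap, and `find()` returns the lowest-numbered member, which is one valid choice of "any element".

```python
class BitmapSet:
    """Set over the universe {1, ..., n}, stored as one n-bit integer."""

    def __init__(self, n):
        self.n = n
        self.bits = 0  # bit (i - 1) is 1 exactly when element i is in S

    def add(self, i):
        self.bits |= 1 << (i - 1)

    def delete(self, i):
        self.bits &= ~(1 << (i - 1))

    def test(self, i):
        return bool((self.bits >> (i - 1)) & 1)

    def find(self):
        """Return some element of S, or None if S is empty."""
        if self.bits == 0:
            return None
        lowest = self.bits & -self.bits  # isolate the lowest set bit
        return lowest.bit_length()       # bit (i - 1) maps back to element i


s = BitmapSet(8)
s.add(3)
s.add(6)
print(s.test(3), s.find())
```

On real hardware a flat bitmap makes `find()` expensive for large N, because locating a set bit may require reading many words. That cost motivates the aggregated structure.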

11
Figure 2: Aggregated bitmap for N = 128 elements and word size W = 16.

12
• Each group of W bits in the bitmap is aggregated to form a single node.
  ◦ N: the number of bits in the aggregated bitmap
  ◦ W: the word size (N and W must be powers of 2)
• Total: [count missing in source] nodes
• Total memory: [expression missing in source; only the factor W survives]

13
• Each internal node in the tree contains two fields called lcount and rcount.
  ◦ lcount is the number of 1s present in its left child
  ◦ rcount is the number of 1s present in its right child
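
One way to picture the lcount and rcount fields is the following Python sketch of a binary tree over N/W leaf words. It rests on assumptions that go beyond the slides: the class name `AggregatedBitmap`, the heap-style array layout, and the reading of each count as covering the whole left or right subtree are illustration choices, and the membership check before each update is a software convenience rather than part of the described hardware.

```python
class AggregatedBitmap:
    """Set over {1, ..., n}: W-bit leaf words under a binary count tree."""

    def __init__(self, n, w):
        self.w = w
        self.leaves = n // w             # number of leaf words (power of 2)
        self.words = [0] * self.leaves   # leaf bitmaps
        self.lcount = [0] * self.leaves  # internal nodes use indices 1..leaves-1
        self.rcount = [0] * self.leaves
        self.size = 0                    # total number of 1s

    def _walk(self, leaf, delta):
        # Top-down pass from the root, adjusting one count per level.
        node, lo, half = 1, 0, self.leaves // 2
        while node < self.leaves:
            if leaf < lo + half:
                self.lcount[node] += delta
                node = 2 * node
            else:
                self.rcount[node] += delta
                node = 2 * node + 1
                lo += half
            half //= 2

    def test(self, i):
        leaf, bit = divmod(i - 1, self.w)
        return bool((self.words[leaf] >> bit) & 1)

    def add(self, i):
        if not self.test(i):
            leaf, bit = divmod(i - 1, self.w)
            self._walk(leaf, +1)
            self.words[leaf] |= 1 << bit
            self.size += 1

    def delete(self, i):
        if self.test(i):
            leaf, bit = divmod(i - 1, self.w)
            self._walk(leaf, -1)
            self.words[leaf] &= ~(1 << bit)
            self.size -= 1

    def find(self):
        """Return some element of the set, or None if it is empty."""
        if self.size == 0:
            return None
        node = 1
        while node < self.leaves:
            # Follow a child whose subtree is known to contain a 1.
            node = 2 * node if self.lcount[node] > 0 else 2 * node + 1
        leaf = node - self.leaves
        word = self.words[leaf]
        bit = (word & -word).bit_length() - 1  # lowest set bit in the word
        return leaf * self.w + bit + 1
```

With N = 128 and W = 16 as in Figure 2, `find()` visits three internal levels and one leaf word instead of scanning all eight words.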

14
• Pipelined implementation
  ◦ Each operation proceeds top-down, starting at the root and moving from one level to the next.
  ◦ At each level of the tree, there is potentially a memory read followed by a memory write.
  ◦ Storing each level of the tree in a different memory bank permits simultaneous access to all levels of the tree.

15
• To implement LR(T), it is necessary to keep track of two things:
  ◦ The largest value among all counters updated in the last cycle of b updates, along with the corresponding counter $j^*$.
  ◦ All counters above the threshold T.
• Memory accesses for counter operations and bitmap operations proceed in parallel.

16
• Every cycle of b updates involves b SRAM update operations and one DRAM update operation.
  ◦ SRAM update operation
    • Two accesses to update the SRAM counter
    • Two accesses for add
  ◦ DRAM update operation
    • Two accesses to read and reset the SRAM counter
    • Four accesses for delete and find
    • Two DRAM accesses to update the DRAM counter
Figure 3: Timing diagram for SRAM and DRAM updates for two successive cycles of b counter updates.
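
The per-cycle access counts listed above can be tallied with a small helper. This is a rough sketch under assumptions: the function name `accesses_per_cycle` is invented, and the tally attributes the add, delete and find accesses to the bitmap memory and the remaining SRAM accesses to the counter memory, which is an interpretation of the slide rather than something it states.

```python
def accesses_per_cycle(b):
    """Tally memory accesses in one cycle of b counter updates.

    Each cycle has b SRAM update operations and one DRAM update operation.
    """
    return {
        # 2 per SRAM update, plus 2 to read and reset during the DRAM update
        "sram_counter": 2 * b + 2,
        # 2 per add, plus 4 for delete and find during the DRAM update
        "bitmap": 2 * b + 4,
        # 2 DRAM accesses to update the DRAM counter
        "dram": 2,
    }


print(accesses_per_cycle(4))
```

Because counter and bitmap accesses proceed in parallel, the larger of the two SRAM-side tallies is the one that has to fit within a cycle under this reading.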

17
• For a reference system of a million 64-bit counters and a line rate of 10 Gbps with 10 counter updates per packet
Table 1: Cost-benefit comparison for different schemes.


