Author: Sriram Ramabhadran, George Varghese
Publisher: SIGMETRICS’03
Presenter: Yun-Yan Chang
Date: 2010/12/29



• Introduction
• Previous works
• Scheme
  ◦ LR(T)
  ◦ Aggregated bitmap
• Implementation
• Conclusion

• Removes the bottleneck of [1] by proposing a counter management algorithm (CMA) called LR(T) (Largest Recent with threshold T), which avoids sorting by keeping only a bitmap that tracks the counters that have reached the threshold T.

• D. Shah, S. Iyer, B. Prabhakar, and N. McKeown
  ◦ Maintaining statistics counters in router line cards
    ▪ Proposes a hybrid architecture in which DRAM stores the statistics counters and a small amount of SRAM enables counter updates at line rate.
    ▪ Proposes a CMA called LCF (Largest Counter First), which picks the counter with the largest value to be updated to DRAM.

• Architecture
  ◦ SRAM stores N counters of size m < M bits.
  ◦ DRAM stores N counters of size M bits.
• The SRAM counters hold recent updates and are periodically transferred to the corresponding DRAM counters.
Figure 1. Statistics counter architecture

• Largest Counter First (LCF)
  ◦ An algorithm that minimizes the required SRAM size.
    ▪ Selects the largest counter.
    ▪ If multiple counters have the same value, picks one arbitrarily.
    ▪ Updates the value of the corresponding counter in DRAM and resets the counter to zero in SRAM.
  ◦ Bottleneck:
    ▪ Sorting: finding the largest counter
    ▪ Difficult to implement at high speed

• Algorithm description
  ◦ Let j* be the counter with the largest value among the counters incremented in the last cycle of b updates to SRAM.
  ◦ If the value of counter c_j* ≥ T, LR(T) updates counter j* to DRAM.
  ◦ If c_j* < T, LR(T) updates any counter with value at least T to DRAM.
  ◦ If no such counter exists, LR(T) updates counter j* to DRAM.
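The three cases above can be sketched as a small selection function. This is only an illustration of the decision rule, not the hardware design from the paper: the name `lr_select`, the list-based counters, and the linear scan for a counter at or above T are assumptions made for readability (the slides replace that scan with a bitmap).

```python
def lr_select(sram, recent, T):
    """Pick the SRAM counter that LR(T) would flush to DRAM.

    sram   -- list of current SRAM counter values (hypothetical layout)
    recent -- indices of counters incremented in the last cycle of b updates
    T      -- threshold
    """
    # j*: the largest counter among those incremented in the last cycle.
    j_star = max(recent, key=lambda j: sram[j])

    # Case 1: j* has reached the threshold, so flush it.
    if sram[j_star] >= T:
        return j_star

    # Case 2: otherwise flush any counter with value at least T.
    # (A linear scan stands in for the bitmap's find operation.)
    for j, value in enumerate(sram):
        if value >= T:
            return j

    # Case 3: no counter has reached T, so fall back to j*.
    return j_star
```

With T = 0 the first case always applies, so the function only ever looks at the counters touched in the last cycle.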

• Proof:
  ◦ Threshold T = 0 allows a simple implementation, while T = b is optimal and minimizes the SRAM size requirement.
  ◦ LR(0)
    ▪ Only remembers the last b updates to SRAM when determining which counter to update to DRAM.
    ▪ Let […] be the maximum value a counter can reach under LR(0).
    ▪ Theorem 1: […]
    ▪ Implies SRAM counters of size at least […]

  ◦ LR(b)
    ▪ The threshold increases from 0 to b.
    ▪ b: the time between successive DRAM accesses
    ▪ Let […] be the maximum value a counter can reach under LR(0).
    ▪ Theorem 2: […]
    ▪ Implies any counter is at most (b − 1)(N − 1).
    ▪ The value of a counter cannot be larger than (b − 1) + log_d(N − 1), where […]

• To minimize the required storage
  ◦ Consider a fixed universe U of N elements labelled 1, 2, …, N.
  ◦ Use a bitmap b_1 b_2 … b_N to record which elements are contained in the set S.
    ▪ b_i is set to 1 if element i ∈ S, otherwise it is set to 0.
• Implement the functions:
  ◦ add(i): adds element i to set S
  ◦ delete(i): deletes element i from set S
  ◦ test(i): tests whether element i belongs to set S
  ◦ find(): returns any element i that belongs to set S
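A minimal sketch of this set interface over a flat bitmap may help fix the semantics. The class name `BitmapSet`, the use of a Python integer as the bit array, the choice to return the lowest-numbered member from find(), and returning `None` for an empty set are all assumptions for illustration; the slides only require find() to return any member.

```python
class BitmapSet:
    """Set over the universe 1..n, stored as one bit per element."""

    def __init__(self, n):
        self.n = n
        self.bits = 0  # bit (i - 1) holds b_i

    def add(self, i):
        self.bits |= 1 << (i - 1)

    def delete(self, i):
        self.bits &= ~(1 << (i - 1))

    def test(self, i):
        return bool((self.bits >> (i - 1)) & 1)

    def find(self):
        # Return some member of the set, or None if it is empty
        # (the empty-set behaviour is an assumption of this sketch).
        if self.bits == 0:
            return None
        lowest = self.bits & -self.bits   # isolate the lowest set bit
        return lowest.bit_length()        # bit k (0-based) is element k + 1
```

On a flat bitmap, find() has to scan for a set bit, which costs time proportional to N in the worst case when only one word can be read at a time.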

Figure 2: Aggregated bitmap for N = 128 elements and W = 16 word size.

• Each group of W bits in the bitmap is aggregated to form a single node.
  ◦ N: the number of bits in the aggregated bitmap
  ◦ W: the word size (N and W must be powers of 2)
• Total: […] nodes
• Total memory: […] W

• Each internal node in the tree contains two fields called lcount and rcount.
  ◦ lcount is the number of 1s present in its left child
  ◦ rcount is the number of 1s present in its right child
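The tree described above can be sketched as follows, under several assumptions that are not in the slides: a binary tree over the N/W leaf words stored in heap order, lcount and rcount counting the 1s in the whole left and right subtree, and an update that checks the leaf before adjusting the counts (the pipelined design works strictly top-down). The class name `AggregatedBitmap` and the helper `_path` are hypothetical.

```python
class AggregatedBitmap:
    """Bitmap of n bits grouped into w-bit leaf words under a binary tree."""

    def __init__(self, n, w):
        self.w = w
        self.leaves = n // w
        # This sketch needs a power-of-two number of leaf words.
        assert n % w == 0 and self.leaves & (self.leaves - 1) == 0
        self.words = [0] * self.leaves
        # Internal nodes in heap order: node k has children 2k+1 and 2k+2.
        self.lcount = [0] * (self.leaves - 1)
        self.rcount = [0] * (self.leaves - 1)

    def _path(self, leaf):
        """Yield (internal node, went_left) pairs from the root to a leaf."""
        lo, hi, node = 0, self.leaves, 0
        while hi - lo > 1:
            mid = (lo + hi) // 2
            left = leaf < mid
            yield node, left
            node = 2 * node + (1 if left else 2)
            lo, hi = (lo, mid) if left else (mid, hi)

    def _update(self, i, delta):
        leaf, bit = divmod(i - 1, self.w)
        present = bool((self.words[leaf] >> bit) & 1)
        if (delta == 1) == present:
            return  # adding a member or deleting a non-member changes nothing
        self.words[leaf] ^= 1 << bit
        for node, left in self._path(leaf):
            if left:
                self.lcount[node] += delta
            else:
                self.rcount[node] += delta

    def add(self, i):
        self._update(i, +1)

    def delete(self, i):
        self._update(i, -1)

    def test(self, i):
        leaf, bit = divmod(i - 1, self.w)
        return bool((self.words[leaf] >> bit) & 1)

    def find(self):
        """Follow non-zero counts from the root down to a non-empty word."""
        lo, hi, node = 0, self.leaves, 0
        while hi - lo > 1:
            mid = (lo + hi) // 2
            if self.lcount[node] > 0:
                node, hi = 2 * node + 1, mid
            elif self.rcount[node] > 0:
                node, lo = 2 * node + 2, mid
            else:
                return None  # empty set (assumed behaviour)
        word = self.words[lo]
        if word == 0:
            return None
        bit = (word & -word).bit_length() - 1
        return lo * self.w + bit + 1
```

Because find() only follows a branch whose count is non-zero, it reads one node per level and one leaf word instead of scanning the whole bitmap.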

• Pipelined implementation
  ◦ Each operation proceeds top-down, starting at the root and moving from one level to the next.
  ◦ At each level of the tree, there is potentially a memory read followed by a memory write.
  ◦ Storing each level of the tree in a different memory bank permits simultaneous access to all levels of the tree.

• To implement LR(T), it is necessary to keep track of two things:
  ◦ The largest value among all counters updated in the last cycle of b updates, along with the corresponding counter j*.
  ◦ All counters above the threshold T.
• Memory accesses for counter operations and bitmap operations proceed in parallel.
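A rough software model of this bookkeeping is sketched below. It is an assumption-laden illustration, not the pipelined hardware: a Python set stands in for the aggregated bitmap, "above the threshold" is read as "at least T" to match the algorithm description, and the names `LRTState`, `increment`, and `_flush` are hypothetical. For T = 0 the set is never consulted, because j* always qualifies.

```python
class LRTState:
    """Tracks j* for the current cycle and the counters that reached T."""

    def __init__(self, n, T, b):
        self.sram = [0] * n   # small counters holding recent updates
        self.dram = [0] * n   # full-size counters
        self.T, self.b = T, b
        self.above = set()    # stand-in for the bitmap of counters >= T
        self.j_star = None    # largest counter updated in this cycle
        self.pending = 0      # updates seen in this cycle

    def increment(self, j):
        self.sram[j] += 1
        if self.sram[j] >= self.T:
            self.above.add(j)                      # bitmap add
        if self.j_star is None or self.sram[j] > self.sram[self.j_star]:
            self.j_star = j
        self.pending += 1
        if self.pending == self.b:                 # one DRAM update per cycle
            self._flush()

    def _flush(self):
        j = self.j_star
        if self.sram[j] < self.T and self.above:
            j = next(iter(self.above))             # bitmap find
        self.dram[j] += self.sram[j]               # update DRAM counter
        self.sram[j] = 0                           # reset SRAM counter
        self.above.discard(j)                      # bitmap delete
        self.j_star, self.pending = None, 0
```

For example, with `b = 4` and `T = 2`, incrementing counters 0, 0, 1, 1 flushes counter 0 and leaves counter 1 recorded in the set; a following cycle that touches four other counters once each then flushes counter 1 rather than that cycle's j*.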

• Every cycle of b updates involves b SRAM update operations and one DRAM update operation.
  ◦ SRAM update operation
    ▪ Two accesses to update the SRAM counter
    ▪ Two accesses for add
  ◦ DRAM update operation
    ▪ Two accesses to read and reset the SRAM counter
    ▪ Four accesses for delete and find
    ▪ Two DRAM accesses to update the DRAM counter
Figure 3: Timing diagram for SRAM and DRAM updates for two successive cycles of b counter updates.

• Reference system: one million 64-bit counters and a line rate of 10 Gbps, with 10 counter updates per packet.
Table 1: Cost-benefit comparison for different schemes.