Low Power TCAMs For Very Large Forwarding Tables Authors: Wencheng Lu and Sartaj Sahni Presenter: Yi-Sheng, Lin ( 林意勝 ) Date: May. 13, 2008 Publisher/Conf. : INFOCOM 2008 Dept. of Computer Science and Information Engineering National Cheng Kung University, Taiwan R.O.C.
Outline 1. Introduction 2. Background and Related Work 3. Subtree Split 4. Postorder Split 5. Simple TCAM with Wide SRAM 6. Experimental Results 7. Conclusion
Introduction Several strategies have been proposed to reduce TCAM power significantly by searching only a selected portion of the TCAM on each lookup. This work improves upon the router-table partitioning algorithms of [1] and [6], and shows how to couple TCAMs with wide SRAMs so as to search forwarding tables much larger than the TCAM, with no loss in lookup time and with reduced power. [1] G. N. F. Zane and A. Basu, “CoolCAMs: Power-Efficient TCAMs for Forwarding Engines,” in INFOCOM. [6] H. Lu, “Improved Trie Partitioning for Cooler TCAMs,” in IASTED International Conference on Advances in Computer Science and Technology, 2004.
Background and Related Work(1/3)
Background and Related Work(2/3) Subtree split (1-1)
Background and Related Work(3/3) Postorder split (many-1)
Subtree Split(1/2) We need to partition a 1-bit trie into the smallest number of subtrees, each small enough to fit in one TCAM bucket.
Subtree Split(2/2) count(x) : the number of prefixes stored in the subtree rooted at node x.
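To make count(x) concrete, the following is a minimal sketch that builds a 1-bit trie from prefix bit strings, computes count(x), and carves subtrees bottom-up so that no carved piece holds more than b prefixes. The names `Node`, `build_trie`, and `greedy_subtree_split` are hypothetical, the bucket capacity b is an assumed parameter, and the carving rule is a simple greedy stand-in, not the algorithm from the paper (it also ignores any covering prefixes a real bucket may need).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    is_prefix: bool = False          # True if a forwarding-table prefix ends here
    left: Optional["Node"] = None    # child reached by bit 0
    right: Optional["Node"] = None   # child reached by bit 1


def build_trie(prefixes: List[str]) -> Node:
    """Insert each prefix (a string of '0'/'1' characters) into a 1-bit trie."""
    root = Node()
    for bits in prefixes:
        node = root
        for bit in bits:
            if bit == "0":
                if node.left is None:
                    node.left = Node()
                node = node.left
            else:
                if node.right is None:
                    node.right = Node()
                node = node.right
        node.is_prefix = True
    return root


def count(x: Optional[Node]) -> int:
    """count(x): number of prefixes stored in the subtree rooted at x."""
    if x is None:
        return 0
    return int(x.is_prefix) + count(x.left) + count(x.right)


def greedy_subtree_split(root: Node, b: int) -> List[int]:
    """Illustrative greedy carving (assumes b >= 1); returns carved bucket sizes."""
    buckets: List[int] = []

    def visit(x: Optional[Node]) -> int:
        # Returns the number of not-yet-carved prefixes under x (always <= b).
        if x is None:
            return 0
        l = visit(x.left)
        r = visit(x.right)
        own = int(x.is_prefix)
        for _ in range(2):           # at most both children need carving
            if own + l + r <= b:
                break
            if l >= r:               # carve the heavier child first
                buckets.append(l)
                l = 0
            else:
                buckets.append(r)
                r = 0
        return own + l + r

    rest = visit(root)
    if rest:
        buckets.append(rest)         # whatever remains at the root is one bucket
    return buckets
```

For example, `greedy_subtree_split(build_trie(["0", "00", "01", "1", "10", "11", "110"]), 3)` carves three pieces, none larger than 3 prefixes. The greedy rule does not guarantee the smallest number of subtrees in general.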
Postorder Split Let the number of forwarding-table prefixes already in bucket B be s, and let b be the bucket capacity. Carve from the remaining trie T a feasible subtree that is as large as possible without exceeding b − s prefixes, and pack this subtree into B.
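The packing step can be sketched as follows. This is a brute-force illustration under stated assumptions, not the postorder traversal used in the paper: prefixes are bit strings, a trie node is identified by its bit string, "feasible" is taken to mean only "fits in the remaining room b − s", covering prefixes are ignored, and the function name `pack_buckets` is hypothetical.

```python
from typing import List


def pack_buckets(prefixes: List[str], b: int) -> List[List[str]]:
    """Fill buckets of capacity b (b >= 1) by repeatedly carving the largest
    remaining subtree that still fits into the current bucket."""
    remaining = set(prefixes)
    buckets: List[List[str]] = []
    while remaining:
        bucket: List[str] = []                 # bucket B; s == len(bucket)
        while remaining and len(bucket) < b:
            room = b - len(bucket)             # b - s
            # Every node of the remaining trie is a leading substring of some prefix.
            roots = {p[:i] for p in remaining for i in range(len(p) + 1)}
            best: List[str] = []
            for root in sorted(roots):
                members = [p for p in remaining if p.startswith(root)]
                if len(best) < len(members) <= room:
                    best = members             # larger subtree that still fits
            bucket.extend(sorted(best))
            remaining -= set(best)
        buckets.append(bucket)
    return buckets
```

Because a longest remaining prefix always forms a subtree of size 1, the inner loop always makes progress, so every bucket except possibly the last ends up holding exactly b prefixes. One bucket may therefore hold several carved subtrees, which matches the many-1 label on the earlier slide.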
Simple TCAM with Wide SRAM(1/4) In the simple TCAM organization, each word of the SRAM is used to store only a next hop. Since a next hop requires only a small number of bits (e.g., 10 bits), most of a wide SRAM word is left unused.
Simple TCAM with Wide SRAM(2/4) [Figure: SRAM word layout with fields: suffix count, length, next hop]
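One way to picture how such a word might be used: the TCAM entry matches the leading bits of the address, and the SRAM word holds the remaining suffixes with their next hops. The sketch below is an assumption-laden illustration, not the paper's procedure: the word is modelled as a Python list of (suffix, next hop) pairs rather than packed bit fields, and `lookup_word` and `matched_len` are hypothetical names.

```python
from typing import List, Optional, Tuple


def lookup_word(word: List[Tuple[str, str]], addr_bits: str,
                matched_len: int) -> Optional[str]:
    """Return the next hop of the longest suffix in `word` that matches the
    address bits following the part already matched by the TCAM entry."""
    rest = addr_bits[matched_len:]
    best: Optional[Tuple[str, str]] = None
    for suffix, next_hop in word:
        if rest.startswith(suffix):
            if best is None or len(suffix) > len(best[0]):
                best = (suffix, next_hop)   # keep the longest match so far
    return best[1] if best is not None else None


# Word for a subtree rooted at prefix "1": suffixes are relative to that root.
word = [("", "A"), ("0", "B"), ("1", "C"), ("10", "D")]
```

Here `lookup_word(word, "1101", 1)` compares the bits after the first one, "101", against the stored suffixes and selects the longest match, "10".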
Simple TCAM with Wide SRAM(3/4) ST(x).numP : the number of prefixes in ST(x). ST(x).numB : the number of bits needed to store the suffix lengths, suffixes and next hops for these prefixes of ST(x). l and r : x’s two children.
Simple TCAM with Wide SRAM(4/4) Let u be the number of bits allocated to the suffix count field, and let v be the sum of the number of bits allocated to a length field and a next-hop field.
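With u and v defined, ST(x).numP and ST(x).numB can be computed bottom-up from x's children l and r. The sketch below is a reconstruction from the definitions on these slides, not a formula quoted from the paper. It assumes one SRAM word holds a single u-bit suffix count plus, for each prefix, v bits (length and next hop) and the suffix bits themselves. Under that assumption each child contributes its numB minus its own u-bit count field, plus one extra suffix bit per prefix. The function names and the values of u and v in the example are hypothetical.

```python
from typing import Set, Tuple


def st_stats(prefixes: Set[str], x: str, u: int, v: int) -> Tuple[int, int]:
    """Return (numP, numB) for ST(x), the subtree rooted at bit string x."""
    if not any(p.startswith(x) for p in prefixes):
        return 0, 0                          # empty subtree
    num_p = 1 if x in prefixes else 0
    num_b = u + (v if x in prefixes else 0)  # x itself has a zero-length suffix
    for child in (x + "0", x + "1"):         # l and r
        cp, cb = st_stats(prefixes, child, u, v)
        if cp:
            num_p += cp
            num_b += cb - u + cp             # drop child's count field, add 1 bit per suffix
    return num_p, num_b


def st_stats_direct(prefixes: Set[str], x: str, u: int, v: int) -> Tuple[int, int]:
    """Same quantities computed directly from the definition, for checking."""
    members = [p for p in prefixes if p.startswith(x)]
    if not members:
        return 0, 0
    return len(members), u + sum(v + len(p) - len(x) for p in members)
```

A subtree ST(x) can be stored in one SRAM word only if its numB does not exceed the word width, so these two quantities are what a partitioning pass would need to test at each node.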
2-Level TCAM with Wide SRAM(1/3) fixed variant Recording the number of prefixes in each bucket. Inefficient DTCAM space utilization.
2-Level TCAM with Wide SRAM(2/3)
2-Level TCAM with Wide SRAM(3/3)
Experimental Results(1/2)
Experimental Results(2/2)
Conclusion Improving upon existing trie partitioning algorithms for TCAMs. A novel way to combine TCAMs and SRAMs so as to achieve a significant reduction in power and TCAM size. This is done without any increase in the number of TCAM searches and SRAM accesses required by a table lookup!