1
Prefix CAM
Jason Klaus, Duncan Elliott
Confidential
2
Outline
Routing Lookup Tables
Ternary CAM
Objective
Prefix Representation
Previous Work
Aside: Binary CAM
Innovation
Generating Enable Signal
Drawbacks
Simulated Results
Summary
References
2/24/2019 Confidential
3
Routing Lookup Tables
Routing lookup tables map incoming IP addresses to outgoing ports using address prefix rules
Ex: *.* to port 1
Ex: * to port 2
Ex: *.* to port 3
More specific (longer) prefixes are given priority over less specific (shorter) prefixes
Ex: routed to port 2, not port 3
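The longest-prefix rule above can be sketched in software as follows. This is a minimal illustration, not the hardware method described in these slides. The example addresses on the slide did not survive extraction, so the prefixes, the `ROUTES` table, and the `lookup` helper below are hypothetical.

```python
import ipaddress

# Hypothetical routing table of (prefix, outgoing port) pairs.
# These prefixes are illustrative only; they are not the slide's examples.
ROUTES = [
    (ipaddress.ip_network("10.0.0.0/8"), 3),
    (ipaddress.ip_network("10.1.0.0/16"), 1),
    (ipaddress.ip_network("10.1.2.0/24"), 2),
]


def lookup(address, routes=ROUTES):
    """Return the port of the longest prefix containing address, or None."""
    addr = ipaddress.ip_address(address)
    best = None
    for network, port in routes:
        if addr in network:
            # Keep the most specific (longest) matching prefix seen so far.
            if best is None or network.prefixlen > best[0].prefixlen:
                best = (network, port)
    return None if best is None else best[1]


# 10.1.2.7 falls inside all three prefixes; the /24 is the longest match.
print(lookup("10.1.2.7"))
```

A linear scan like this is only meant to show the priority rule; the slides are concerned with doing the same comparison across all entries in parallel.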
4
Routing Lookup Tables (cont.)
As Internet link speeds continue to increase, so too does the demand for fast, low-power routing lookup tables
Most current solutions involve Ternary Content Addressable Memory (TCAM)
5
Ternary CAM
The TCAM stores and searches all prefixes in parallel for the longest match to a given IP address query
Each stored prefix has an associated match line
The match line is charged before/during each query
The query IP address is bitwise compared with the prefix
Any mismatching bits discharge the match line
The longest prefix with its match line still high is the best match
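A behavioral sketch of the match-line idea described above, assuming entries are written as strings of `0`, `1`, and `*`. The 8-bit width and the names `match_line` and `tcam_search` are assumptions made for brevity. A real TCAM evaluates all rows at once in hardware, and this model only picks the hit with the most specified bits rather than modeling any particular priority circuit.

```python
def match_line(stored, query):
    """True if the match line stays high: no specified bit mismatches."""
    return all(s == "*" or s == q for s, q in zip(stored, query))


def tcam_search(entries, query):
    """Return the matching entry with the most specified bits, or None."""
    hits = [e for e in entries if match_line(e, query)]
    # Longest prefix = fewest wildcard positions among the surviving rows.
    return max(hits, key=lambda e: len(e) - e.count("*"), default=None)


# Hypothetical 8-bit table, for illustration only.
ENTRIES = ["1010****", "101001**", "0*******"]

print(tcam_search(ENTRIES, "10100110"))  # both 1010**** and 101001** match
```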
6
Objective
Design a specialized content addressable memory (CAM) used only for IP address prefixes
Called a prefix CAM (PCAM)
Reduce the number of transistors required to store and search each IP address prefix without degrading performance or increasing dynamic power consumption
This reduces manufacturing costs and static power consumption
7
Prefix Representation
TCAM solves a more general problem, requiring two bits of SRAM for every bit of IP address prefix
Each bit is stored as either a 0, 1, or * (either 0 or 1)
A 32-bit IP address (IPv4) prefix can be uniquely and optimally represented using only 33 bits (n+1 bits)
Store the prefix bits followed by a 1, and pad with 0s to 33 bits
Ex: * becomes (in binary)
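The n+1 bit representation can be sketched as follows. The slide's own example did not survive extraction, so the sample prefix and the function names `encode_prefix` and `decode_prefix` are hypothetical, and an 8-bit width is used in the demonstration only to keep the output short.

```python
def encode_prefix(bits, width=32):
    """Encode a prefix (string of 0/1, at most width long) into width+1 bits."""
    if len(bits) > width or set(bits) - {"0", "1"}:
        raise ValueError("prefix must be at most `width` binary digits")
    # Prefix bits, then a marker 1, then 0 padding up to width+1 bits.
    return bits + "1" + "0" * (width - len(bits))


def decode_prefix(word):
    """Recover the prefix bits by dropping the last 1 and the padding after it."""
    return word[: word.rindex("1")]


print(encode_prefix("1010", width=8))  # 101010000
print(decode_prefix("101010000"))      # 1010
```

Because the marker is always the last 1 in the word, every prefix length from 0 to n maps to a distinct word, which is why n+1 bits are enough.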
8
Previous Work
Akhbarizadeh, Nourani, Vijayasarathi and Balsara use a similar 33-bit representation for their PCAM design
They used logic equation optimization techniques to achieve a low transistor count
396 transistors per prefix, or 12 per bit (396 / 33)
Standard TCAM designs require 16 transistors per bit
9
Previous Work (cont.)
Unfortunately there are some drawbacks to this PCAM design compared to TCAM
The match line is loaded with far more transistors
Bit comparisons sometimes drive more than one transistor, up to three in some cases
As a result this PCAM suffers degraded performance and increased dynamic power consumption versus a comparable TCAM
10
Aside: Binary CAM
A binary CAM (BCAM) is similar to a TCAM except it does not support wildcards in the mask
The query either entirely matches or it doesn’t
Each cell compares a single bit, discharging the match line if they differ
This simplified cell requires only 9 transistors per bit as opposed to 16 transistors per TCAM cell
11
Innovation
Add a single transistor to a BCAM cell which acts as an enable
If high, mismatches discharge the match line
If low, mismatches do not discharge the match line
This transistor creates minimal additional match line loading to preserve performance
Enable does not change based on search data, reducing dynamic power consumption
12
Innovation (cont.)
Store prefixes as previously mentioned, with significant bits followed by a 1 and 0 padded
Scanning from end to start, the first 1 encountered indicates all remaining bits must be matched
If a cell stores a 1 then it should enable all cells before it for matching
If a cell stores a 0 then it should pass on the enable state it received
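The enable rule above can be modeled behaviorally as a logical OR rippling from the last stored bit toward the first. This is a sketch under assumptions: the last stored bit is treated as feeding only the enable chain, the 8-bit word is a hypothetical example, and the names `enables` and `pcam_match` are illustrative rather than taken from the design.

```python
def enables(word):
    """enable[i] is True when any stored bit after position i is a 1."""
    n = len(word) - 1            # compare cells; the last bit only feeds enable
    enable = [False] * n
    carry = word[-1] == "1"      # enable_in seen by the last compare cell
    for i in range(n - 1, -1, -1):
        enable[i] = carry
        # A stored 1 enables every cell before it; a stored 0 passes the state on.
        carry = carry or word[i] == "1"
    return enable


def pcam_match(word, query):
    """Match line stays high unless an enabled cell sees a mismatch."""
    return all(not en or w == q for en, w, q in zip(enables(word), word, query))


WORD = "101010000"               # prefix 1010 stored in a 9-bit word (hypothetical)
print(enables(WORD))
print(pcam_match(WORD, "10101111"), pcam_match(WORD, "10111111"))
```

Only the cells ahead of the marker 1 end up enabled, so the marker and the padding never discharge the match line.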
13
Basic Binary CAM Cell
Schematic labels: Search Line, Bit Line, Word Line, Data, Match Line
14
CAM With Enable
Schematic labels: Search Line, Bit Line, Word Line, Data, Match Line, Enable
15
CAM With Cascaded Enable
Schematic labels: Search Line, Bit Line, Word Line, Data, Match Line, Enable, Logical OR
Better implementation to come
16
CAM Cell Cascaded
17
Generating Enable Signal
Use a pull-up transistor combined with a transmission gate
This requires 3 additional transistors per cell
Total of ~13 transistors per prefix bit, or more precisely 419 transistors per prefix
Not exactly 13 transistors per cell, since the enable_out of the first cell is not needed, and the enable_in for the last cell is stored in an SRAM cell
Schematic labels: enable_in, enable_out, data
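The slide does not give a breakdown of the 419 figure. One accounting that reproduces it is sketched below, assuming 32 compare cells, removal of the 3 enable-generation transistors from the first cell, and a standard 6-transistor SRAM cell holding the 33rd bit. Those three assumptions are not stated in the slides, so treat this as a plausibility check rather than the authors' count.

```python
CELLS = 32             # assumed: one compare cell per IPv4 address bit
BCAM_CELL = 9          # from the slides: transistors per binary CAM cell
ENABLE_TRANSISTOR = 1  # from the slides: single enable transistor per cell
ENABLE_LOGIC = 3       # from the slides: pull-up plus transmission gate
SRAM_CELL = 6          # assumed: 6T SRAM cell storing the 33rd bit

per_cell = BCAM_CELL + ENABLE_TRANSISTOR + ENABLE_LOGIC

# The first cell needs no enable_out; the last enable_in comes from plain SRAM.
total = CELLS * per_cell - ENABLE_LOGIC + SRAM_CELL

print(per_cell, total)  # 13 419
```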
18
Drawbacks
Enable signals take time to ripple through cells from back to front, especially since the delay of chained transmission gates grows much faster than linearly (roughly quadratically) with chain length
PCAM requires a delay before searches after one or more consecutive writes
Typical routing applications require 1 update per 100k searches
Could replace some transmission gates with a full 6-transistor implementation to break the chain
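A first-order RC (Elmore) model, which is an assumption here and not something the slides use, gives a rough feel for why long transmission-gate chains are slow and why breaking the chain helps. The unit resistance and capacitance, the zero buffer delay, and the function names are all hypothetical.

```python
def elmore_delay(n, r=1.0, c=1.0):
    """Elmore delay at the far end of a uniform n-stage RC ladder."""
    # Node k sees resistance k*r back to the driver, so the sum is r*c*(1+2+...+n).
    return r * c * n * (n + 1) / 2


def segmented_delay(n, segment, r=1.0, c=1.0, buffer_delay=0.0):
    """Delay when the chain is split into restored segments of a given length."""
    full, rest = divmod(n, segment)
    # Each full segment is followed by a restoring stage that adds buffer_delay.
    return full * (elmore_delay(segment, r, c) + buffer_delay) + elmore_delay(rest, r, c)


print(elmore_delay(32))        # one unbroken 32-stage chain
print(segmented_delay(32, 4))  # same chain broken every 4 stages
```

In this simplified model the unbroken chain scales with the square of its length, while the segmented chain scales roughly linearly, at the cost of the extra transistors in each restoring stage.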
19
Simulated Results
Schematic simulation of a typical TCAM, the previous PCAM, and this proposed PCAM was performed in 90nm ST digital process
Typical power consumption was measured as the average power consumed by a single row of cells storing and searching a selection of prefixes and addresses

Design           Worst Case Mismatch Time   Typical Power Consumption
TCAM             264 ps                     114 μW
Akhbarizadeh     507 ps                     67 μW
Proposed PCAM    345 ps                     56 μW
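The simulated figures above can be put on a common scale by normalizing to the TCAM row. The numbers are copied from the slide; the `RESULTS` dictionary and the `relative_to` helper are just illustrative names.

```python
# design: (worst-case mismatch time in ps, typical power in uW), from the slide
RESULTS = {
    "TCAM": (264, 114),
    "Akhbarizadeh": (507, 67),
    "Proposed PCAM": (345, 56),
}


def relative_to(baseline, results=RESULTS):
    """Return each design's (time, power) as a ratio of the baseline design."""
    base_time, base_power = results[baseline]
    return {
        name: (time / base_time, power / base_power)
        for name, (time, power) in results.items()
    }


for name, (time_ratio, power_ratio) in relative_to("TCAM").items():
    print(f"{name}: {time_ratio:.2f}x mismatch time, {power_ratio:.2f}x power")
```

On these numbers the proposed PCAM is roughly 1.3 times slower than the TCAM in the worst case while drawing about half the power, and the earlier PCAM is nearly twice as slow as the TCAM.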
20
Summary
Compared to TCAM designs, the proposed PCAM design requires fewer transistors and reduces total power consumption while maintaining comparable performance for IP address prefix matching
The only previous PCAM design requires slightly fewer transistors but degrades performance significantly and consumes more power
21
Extensions
Faster than TCAM version, +3 transistors
Beyond 32 bits
Match line provides Wired-AND for any heterogeneous mix of CAM cells connected
Source and Destination addresses in separate clusters
Flags for QoS, etc., as pure TCAM
22
References
Akhbarizadeh, M.J.; Nourani, M.; Vijayasarathi, D.S.; Balsara, P.T. PCAM: A Ternary CAM Optimized for Longest Prefix Matching Tasks. Computer Design: VLSI in Computers and Processors, ICCD Proceedings, IEEE International Conference on, 11-13 Oct. 2004, pp. 6-11