Published by Cody Nelson. Modified over 9 years ago.
1
CA-RAM: A High-Performance Memory Substrate for Search-Intensive Applications
Sangyeun Cho, J. R. Martin, R. Xu, M. H. Hammoud and R. Melhem
Dept. of Computer Science, University of Pittsburgh
2
ISPASS 2007
Search ops in applications
Search (or lookup) operations represent an important common function
Network packet processing
  For each arriving packet, determine the output port
  Given packet information, find a matching classification rule
  Each lookup can incur many memory accesses
Speech recognition
  Searching (e.g., dictionary lookup) takes up ~24% of CPU cycles
Forthcoming RMS (Recognition, Mining, and Synthesis) apps
3
Search performance and power
Search performance must match increasing line speeds
  For OC-768, up to 104M packets must be processed per second
  Network traffic has doubled every year [McKeown03]
  Routing tables (~200K prefixes in a core router) are growing [RIS]
  IPv6
Power and thermal issues are already a critical limiting factor in network processing device design [McKeown03]
Search in battery-operated devices should be energy-efficient
Conventional search solutions
  Software methods (tries, hash table, …)
  Hardware methods (CAM, TCAM, …)
4
IP lookup using a trie
Consider an IP address: 0 1 0 0 0 1 1 0
Software approach is “flexible”
high memory capacity requirement
high memory bandwidth requirement
not SCALABLE
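As a rough illustration of the software approach, the sketch below builds a binary trie and walks it one address bit at a time, remembering the longest prefix seen so far. The `BinaryTrie` class, the prefix set, and the port numbers are made up for this example and are not taken from the talk. Each bit consumed costs one node visit, which hints at why a lookup can need many memory accesses.

```python
class BinaryTrie:
    """Binary trie for longest-prefix matching (illustrative sketch only)."""

    def __init__(self):
        # Each node is a dict: children under '0'/'1', optional 'port' payload.
        self.root = {}

    def insert(self, prefix, port):
        node = self.root
        for bit in prefix:
            node = node.setdefault(bit, {})
        node["port"] = port

    def lookup(self, address):
        """Return (port, nodes_visited) for the longest matching prefix."""
        node = self.root
        best = node.get("port")
        visited = 1
        for bit in address:
            node = node.get(bit)
            if node is None:
                break
            visited += 1
            if "port" in node:
                best = node["port"]  # a longer match overrides a shorter one
        return best, visited


# Hypothetical prefixes; the port is simply the position in this list.
trie = BinaryTrie()
for port, prefix in enumerate(["0", "0100", "01000", "0110", "10", "1101"]):
    trie.insert(prefix, port)

port, visited = trie.lookup("01000110")
print(port, visited)
```

For the address 0 1 0 0 0 1 1 0, the walk passes the prefixes 0*, 0100* and 01000* before falling off the trie, so the longest match needs several node visits rather than one.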
5
IP lookup using TCAM
Consider an IP address: 0 1 0 0 0 1 1 0
Stored prefixes: 110100*, 110101*, 110111*, 01000*, 01100*, 01101*, 11011*, 0100*, 0110*, 1101*, 10*, 0*
sort before storing
choose the first among the matched
high bandwidth, constant time lookup
TCAMs are relatively small, expensive
power consumption very high
not SCALABLE
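The TCAM behaviour described on this slide can be emulated in a few lines. This is a software sketch only: a real TCAM compares all entries at once, while the loop below merely models the priority selection of the first match. The function names `tcam_build` and `tcam_lookup` are hypothetical.

```python
def tcam_build(prefixes):
    """Sort prefixes longest-first (the 'sort before storing' step)."""
    return sorted(prefixes, key=len, reverse=True)


def tcam_lookup(table, address):
    """Return (entry_index, prefix) of the first matching entry, or None."""
    # Hardware matches every entry in parallel; this loop only emulates
    # choosing the first (and therefore longest) match.
    for index, prefix in enumerate(table):
        if address.startswith(prefix):
            return index, prefix
    return None


# Prefixes from the slide, with the trailing '*' dropped.
prefixes = ["0", "10", "0100", "0110", "1101", "01000", "01100", "01101",
            "11011", "110100", "110101", "110111"]
table = tcam_build(prefixes)
print(tcam_lookup(table, "01000110"))
```

Because the table is sorted by prefix length, the first match is also the longest match, so no further comparison between candidates is needed.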
6
CA-RAM – a hybrid approach
Can we do better than the existing conventional schemes?
  CAM-like search performance
  RAM-like cost and power
CA-RAM combines hashing w/ hardware parallel matching
CA-RAM design goals
  High lookup performance
  Low power consumption
  Smaller chip area per stored datum
  Straightforward system-level integration
7
Talk roadmap
What is CA-RAM?
Prototype design
Case study 1: IP lookup
Case study 2: Trigram lookup for speech recognition
8
CA-RAM – Content Addressable RAM
Separate match logic and memory
  Match logic for a single row, not every row
  Allows the use of dense RAM technology
  Enables highly reconfigurable match logic
Keep keys sorted in each row, not in entire array
[Figure: match logic and memory cells, conventional CAM/TCAM vs. CA-RAM]
9
Very simple, yet efficient
Use hashing to store keys in a particular row
To look up, hash the search key and retrieve one row
Perform matching on entire row in parallel
Achieve full content addressability w/o paying overhead!
[Figure: search key, index generator, rows of keys (Key i1, Key i2, …; Key j1, Key j2, …), match processors 1, 2, …]
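The lookup flow on this slide can be modelled in software as follows. This is a toy sketch, not the authors' design: the `CARAM` class, the modulo hash used as the index generator, and the integer keys are all assumptions made for illustration, and bucket overflow is deliberately left unhandled.

```python
class CARAM:
    """Toy software model of a CA-RAM: hashed rows plus whole-row matching."""

    def __init__(self, num_rows, keys_per_row):
        self.num_rows = num_rows
        self.keys_per_row = keys_per_row
        self.rows = [[] for _ in range(num_rows)]

    def index(self, key):
        # Stand-in for the index generator; the modulo hash is an assumption.
        return key % self.num_rows

    def insert(self, key, data):
        row = self.rows[self.index(key)]
        if len(row) >= self.keys_per_row:
            raise OverflowError("bucket overflow; not handled in this sketch")
        row.append((key, data))

    def lookup(self, key):
        row = self.rows[self.index(key)]  # a single row (memory) access
        # The hardware compares every key in the row in parallel; the list
        # comprehension is a sequential stand-in for that step.
        matches = [data for stored, data in row if stored == key]
        return matches[0] if matches else None


ram = CARAM(num_rows=8, keys_per_row=4)
for key, data in [(3, "a"), (11, "b"), (19, "c")]:  # all hash to row 3
    ram.insert(key, data)
print(ram.lookup(11), ram.lookup(27))
```

All three keys land in the same row, yet each lookup still touches only that one row, which is the property the slide highlights.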
10
Pipelined CA-RAM operation
Step 1: Index generation
Step 2: Memory access
Step 3: Key matching
Step 4: Result forwarding
[Figure: search key, index generator, index, rows of keys, match processors 1 to 3, result]
11
Dealing w/ bucket overflows
Careful design of hash function
Increase bucket size
Reduce load factor (α); α = # of occupied entries / # of total entries
Use “chaining”; store overflows in subsequent rows
  Multiple accesses per lookup
Use a small overflow CAM, accessed in parallel
  Similar to popular “victim caching”
Use two-level hashing and employ multiple CA-RAM banks
…
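To make the chaining option concrete, here is a minimal sketch in which an overflowing key spills into the next row and a lookup counts how many rows it has to read. The helper names `chain_insert` and `chain_lookup`, the modulo hash, and the tiny table size are assumptions for illustration; deletions are not supported, which is what allows a lookup to stop at the first row that is not full.

```python
def chain_insert(rows, capacity, key):
    """Store key in its home row, spilling to subsequent rows when full."""
    n = len(rows)
    home = key % n  # assumed hash function
    for step in range(n):
        row = rows[(home + step) % n]
        if len(row) < capacity:
            row.append(key)
            return step + 1  # number of rows touched
    raise OverflowError("table full")


def chain_lookup(rows, capacity, key):
    """Return (found, row_accesses)."""
    n = len(rows)
    home = key % n
    for step in range(n):
        row = rows[(home + step) % n]
        if key in row:
            return True, step + 1
        if len(row) < capacity:  # this row never overflowed, so stop here
            return False, step + 1
    return False, n


rows = [[] for _ in range(4)]
capacity = 2
keys = [0, 4, 8, 1, 5]
for key in keys:
    chain_insert(rows, capacity, key)

load_factor = sum(len(row) for row in rows) / (len(rows) * capacity)
accesses = [chain_lookup(rows, capacity, key)[1] for key in keys]
print(load_factor, sum(accesses) / len(accesses))
```

In this small run, two of the five keys are spilled, so the average number of row accesses per successful lookup rises above one. Lowering the load factor or enlarging the rows reduces such spills.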
12
CA-RAM reconfig. opportunities
Reconfigurable match logic allows:
  Adapting key size to apps
    Same hardware to support multiple apps or standards
  …
13
Adapting key size
Adapting key size is straightforward
Will benefit supporting multiple apps/standards
Select key bits for matching
[Figure: reconfigurable match logic applied to rows of keys (Key i1, Key i2, Key i3; Key j1, Key j2, Key j3), producing match information]
14
CA-RAM reconfig. opportunities
Reconfigurable match logic allows:
  Adapting key size to apps
    Same hardware to support multiple apps or standards
  Binary and ternary matching
    Some apps require ternary matching, some don’t
  …
15
Supporting binary/ternary matching
Developed configurable comparator
T-matching requires 2 bits / 1 symbol
Supporting different types of matching in different bit positions feasible
Consider mask bits or not
[Figure: reconfigurable match logic compares the search key against stored keys (Key i1, Key i2; Key j1, Key j2) and masks (Mask i1, Mask j1), producing match information]
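A minimal sketch of ternary matching with a stored key plus a mask, which is one way to read "2 bits / 1 symbol": one bit for the value and one bit saying whether that position matters. The function names and the 8-bit width are assumptions for illustration and do not describe the actual comparator circuit.

```python
def ternary_match(search_key, stored_key, care_mask):
    """True if the keys agree on every bit position where care_mask is 1."""
    return ((search_key ^ stored_key) & care_mask) == 0


def prefix_entry(prefix, width=8):
    """Encode a prefix string such as '01000' as (stored_key, care_mask)."""
    stored_key = int(prefix.ljust(width, "0"), 2)
    care_mask = ((1 << len(prefix)) - 1) << (width - len(prefix))
    return stored_key, care_mask


search = 0b01000110
print(ternary_match(search, *prefix_entry("01000")))  # prefix covers the address
print(ternary_match(search, *prefix_entry("0110")))   # differs in a cared-for bit
print(ternary_match(search, search, 0xFF))            # binary match: all bits cared for
```

Binary matching falls out as the special case where every mask bit is set, so one comparator can serve both modes by choosing whether to consult the mask.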
16
CA-RAM reconfig. opportunities
Reconfigurable match logic allows:
  Adapting key size to apps
    Same hardware to support multiple apps or standards
  Binary and ternary matching
    Some apps require ternary matching, some don’t
  Storing data and keys in a CA-RAM module
    Cuts # of memory accesses for a lookup by half
  …
17
Simult. key matching & data access
Data access follows TCAM lookup
CA-RAM supports data embedding
Cuts memory traffic & latency by half
Match key & bypass data
[Figure: reconfigurable match logic compares the search key against stored keys with embedded data (Data i1, Data j1), producing match result & data]
18
CA-RAM reconfig. opportunities
Reconfigurable match logic allows:
  Adapting key size to apps
    Same hardware to support multiple apps or standards
  Binary and ternary matching
    Some apps require ternary matching, some don’t
  Storing data and keys in a CA-RAM module
    Cuts # of memory accesses for IP lookup by half
  Providing range checking capabilities
    Beneficial for rule-based packet filtering
  …
19
Supporting range checking
(Range checking causes troubles)
(Entries must be expanded)
CA-RAM can support range checking efficiently
Match key & check range
[Figure: reconfigurable match logic compares the search key against stored keys and ranges (Key i1, Range i1; Key j1, Range j1), producing match information]
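The sketch below assumes the parenthetical notes refer to the usual practice of expanding a range into several prefix entries when only prefix (ternary) matching is available. It contrasts that expansion with a single stored range checked directly. The function names and the example ranges are hypothetical.

```python
def range_to_prefixes(lo, hi, width):
    """Split the inclusive range [lo, hi] into the minimal covering prefixes."""
    prefixes = []
    while lo <= hi:
        # Largest power-of-two block that is aligned at lo ...
        size = lo & -lo if lo else 1 << width
        # ... shrunk until it also fits inside the remaining range.
        while size > hi - lo + 1:
            size >>= 1
        bits = width - size.bit_length() + 1  # number of fixed leading bits
        if bits:
            prefixes.append(format(lo >> (width - bits), "0{}b".format(bits)))
        else:
            prefixes.append("")  # empty prefix matches every value
        lo += size
    return prefixes


def range_match(value, lo, hi):
    """Direct range check: one stored entry, one comparison pair."""
    return lo <= value <= hi


print(range_to_prefixes(3, 12, 4))
print(len(range_to_prefixes(1024, 65535, 16)))
print(range_match(2000, 1024, 65535))
```

A 16-bit range such as 1024 to 65535 needs several prefix entries under expansion, whereas a match processor that can compare against bounds stores it as one entry.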
20
CA-RAM-based memory subsystem
21
Prototype implementation
We implemented a prototype CA-RAM slice design (w/ a degree of reconfigurability) and evaluated its power and area advantages over state-of-the-art TCAMs
We used a standard cell (0.16 µm) based ASIC design flow

Step                      # cells    Area, µm²    Delay, ns
Expand search key           3,804       66,228       (0.89)
Calculate match vector      5,252       10,591         0.95
Decode match vector           899        1,970         1.91
Extract result              6,037       21,775         1.99
Total                      15,992      100,564         4.85
22
Area and power: CA-RAM vs. TCAM
[Figure: cell area (µm²) @130nm CMOS; power (W) for 4.5Mb @143MHz]
CA-RAM area advantage 4.5x~11x
CA-RAM power advantage 4x~14x
23
Performance: CA-RAM vs. (T)CAM
24
Case study 1: IP lookup
25
Problem description
Given
  A set of prefixes (each prefix is associated with an output port number)
  IP address
Find a prefix that matches the input IP address and return the output port number associated with it
  In the presence of multiple matching prefixes, choose the longest
Procedure
  Find a good hash function to distribute prefixes
  Determine CA-RAM organization
26
Data set and hashing method
IP core router’s table having 186,760 entries
Bit selection scheme [Zane et al. ‘03]
  98% of prefixes are at least 16 bits long
  Select hash bits from the first 16 bits (low-order bits)
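A minimal sketch of bit-selection hashing for IP lookup, reading the slide as "take the low-order bits of the first 16 address bits as the row index". The choice of 11 index bits, the dictionary used as the row store, and the example prefixes and port names are assumptions for illustration. Prefixes shorter than 16 bits would need separate handling, which is not shown.

```python
def row_index(addr32, index_bits):
    """Bit-selection hash: low-order `index_bits` of the first 16 address bits."""
    first16 = addr32 >> 16
    return first16 & ((1 << index_bits) - 1)


def lpm_insert(table, index_bits, prefix, length, port):
    """Store a prefix (given as a 32-bit value with trailing zeros) in its row."""
    assert length >= 16, "shorter prefixes need separate handling (not shown)"
    table.setdefault(row_index(prefix, index_bits), []).append((prefix, length, port))


def lpm_lookup(table, index_bits, addr32):
    """Fetch one row and pick the longest matching prefix in it."""
    best = None
    for prefix, length, port in table.get(row_index(addr32, index_bits), []):
        mask = (0xFFFFFFFF << (32 - length)) & 0xFFFFFFFF
        if (addr32 & mask) == prefix and (best is None or length > best[0]):
            best = (length, port)
    return best[1] if best else None


INDEX_BITS = 11  # assumed, e.g. for a 2,048-row array
table = {}
lpm_insert(table, INDEX_BITS, 0xC0A80000, 16, "port-A")  # 192.168.0.0/16
lpm_insert(table, INDEX_BITS, 0xC0A80100, 24, "port-B")  # 192.168.1.0/24

print(lpm_lookup(table, INDEX_BITS, 0xC0A8014D))  # 192.168.1.77
print(lpm_lookup(table, INDEX_BITS, 0xC0A8024D))  # 192.168.2.77
```

Because both prefixes share their first 16 bits, they hash to the same row, and the longest-match choice is made among the entries of that single row.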
27
Shaping CA-RAM
Consider multiple design points: Design A, Design B, Design C, Design D, Design E, Design F
[Figure: design points with 2,048 rows (32 entries) and 4,096 rows (64 entries); load factors shown: α = 0.47, α = 0.40, α = 0.36, α = 0.24, α = 0.36]
28
Performance
[Figure: spilled entries and average memory access latency (AMAL) under “uniform” and “skewed” traffic, for α = 0.47, α = 0.40, α = 0.36, α = 0.24, α = 0.36]
With a properly chosen α, CA-RAM achieves near-constant AMAL
29
Area and power
CA-RAM advantageous over TCAM
[Figure: relative area or power; Design B]
30
Case study 2: Trigram lookup in speech recognition
31
Problem, data set, and hashing
Problem
  Look up a trigram in the trigram database
Data set
  A subset of the Sphinx trigram database
  We picked entries having 13~16 characters
  Still 5,385,231 entries or 86MB
Hashing
  DJB, an efficient string hash function (used in Sphinx)
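The slide names "DJB" without giving the exact variant, so the sketch below assumes the widely known djb2 form (start at 5381, multiply by 33, add each byte) truncated to 32 bits. The `trigram_row` helper, the row count, and the sample trigram are hypothetical.

```python
def djb2(text):
    """djb2 string hash (h * 33 + byte), truncated to 32 bits."""
    h = 5381
    for byte in text.encode("utf-8"):
        h = (h * 33 + byte) & 0xFFFFFFFF
    return h


def trigram_row(trigram, num_rows):
    """Map a trigram string to a row index (assumed modulo reduction)."""
    return djb2(trigram) % num_rows


print(djb2("a"))
print(trigram_row("how are you", 4096))
```

The multiply-and-add loop touches each character once, which keeps index generation cheap relative to the memory access that follows.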
32
Result
33
Data distribution
34
Area comparison
[Figure: relative area, CAM vs. CA-RAM]
35
CA-RAM conclusions
Compared w/ software methods
  Fewer memory accesses; higher lookup performance
Compared w/ CAM or TCAM
  Higher density, matching that of DRAM, which allows a large lookup table
  Competitive performance
  Low power – a critical advantage for cost-effective system design
Reconfigurable
  Can accommodate apps having different key/record sizes, binary vs. ternary searching requirements, range checking, …
  Can adopt new standards much more easily, e.g., IPv6
Two case studies show the efficacy of the CA-RAM approach
  3~5× improvement in area and power, compared with CAM/TCAM
36
CA-RAM: A High-Performance Memory Substrate for Search-Intensive Applications
Questions?