An Efficient IP Lookup Architecture with Fast Update Using Single-Match TCAMs Author: Jinsoo Kim, Junghwan Kim Publisher: WWIC 2008 Presenter: Chen-Yu Chaug Date: 2008/11/12
Outline Introduction Related work Proposed IP Lookup Architecture IP Lookup and Update Algorithms Performance Evaluation Conclusion
Introduction(1/4) Most IP lookup schemes can be classified into software approaches based on tries and hardware approaches based on TCAM (Ternary Content Addressable Memory). Trie-based IP lookup schemes usually require several memory accesses. In contrast, a TCAM can perform a lookup operation in a single cycle owing to its parallel access characteristics. Therefore, TCAMs have received much attention in recent years.
Introduction(2/4) Because an IP lookup operation may produce several matches, the best match, i.e., the longest matching prefix (LMP), must be determined. To determine the LMP, all prefixes in a TCAM need to be ordered by some criterion such as length. Under this ordering, a priority encoder can select the LMP as the uppermost location among all matched prefixes.
Introduction(3/4) Most TCAM-based lookup schemes need several movements of prefix entries for a single update because the ordering must be maintained in the TCAMs. Therefore, frequent updates may consume many computation cycles in the IP lookup engine and degrade lookup performance.
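The role of the ordering can be illustrated with a minimal software sketch. It is not a hardware model: the TCAM is assumed to be a Python list of bit-string prefixes ending in `*`, and the names `matches` and `priority_encode` are hypothetical helpers introduced only for this illustration.

```python
from typing import List, Optional

def matches(prefix: str, addr: str) -> bool:
    """A ternary entry such as '10*' matches when its fixed bits lead the address."""
    return addr.startswith(prefix.rstrip("*"))

def priority_encode(tcam: List[str], addr: str) -> Optional[int]:
    """Return the uppermost (lowest-index) matching location, or None if nothing matches."""
    for location, entry in enumerate(tcam):
        if matches(entry, addr):
            return location
    return None

# Assumed layout: entries sorted by decreasing prefix length,
# so the uppermost match is also the longest match.
tcam = ["10100*", "1011*", "100*", "10*", "0*"]
hit = priority_encode(tcam, "10100111")  # both '10100*' and '10*' match
```

Here the address matches two entries, and the ordering alone makes the encoder return the longer one.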
Introduction(4/4) In this paper, we present a new architecture that provides fast updates by using single-match TCAMs. Our algorithms guarantee that each single-match TCAM generates at most one match for a given destination address. This eliminates both the ordering constraint and the priority encoder in a single-match TCAM, which makes updates fast.
Ordering constraint Prefix-length ordering (PLO) constraint: two prefixes of the same length don't need to be in any specific order. Algorithms: L-algorithm, PLO_OPT. Chain-ancestor ordering (CAO) constraint: there is an ordering constraint between two prefixes if and only if one is a prefix of the other. Algorithm: CAO_OPT.
L-algorithm Two prefixes of the same length can be in any order. The L-algorithm can create an empty space in a TCAM in no more than L memory shifts (recall that L = 32).
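A minimal sketch of why at most one move per prefix length is enough, under assumptions not stated on the slide: the TCAM image is a Python list sorted by decreasing prefix length with the free slots (`None`) at the bottom, and `plen` and `l_insert` are hypothetical names.

```python
from typing import List, Optional

def plen(prefix: str) -> int:
    """Length of a bit-string prefix such as '1011*'."""
    return len(prefix.rstrip("*"))

def l_insert(slots: List[Optional[str]], new: str) -> int:
    """Insert `new` into a length-ordered TCAM image; return the number of entry moves.

    Assumes `slots` holds prefixes sorted by decreasing length, followed by at
    least one free (None) slot.
    """
    hole = slots.index(None)  # first free slot, just below the shortest prefix
    moves = 0
    i = hole - 1
    while i >= 0 and plen(slots[i]) < plen(new):
        # Find the topmost entry of the length group that slots[i] belongs to.
        top = i
        while top > 0 and plen(slots[top - 1]) == plen(slots[i]):
            top -= 1
        # Moving that single entry into the hole shifts the whole group down by one.
        slots[hole] = slots[top]
        slots[top] = None
        hole = top
        moves += 1
        i = top - 1
    slots[hole] = new
    return moves

slots = ["10100*", "1011*", "1110*", "100*", "110*", "10*", "11*", "0*", None, None]
moves = l_insert(slots, "0101*")
```

Each shorter length group contributes exactly one move, so the count is bounded by the number of distinct prefix lengths, which is at most L.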
PLO_OPT The basic idea of the PLO_OPT algorithm is to keep all the unused entries in the center of the TCAM. PLO_OPT brings down the worst-case number of memory operations per update to L/2.
Better Algorithm? [Figure: prefixes P1, P2, P3, P4 with the maximal chain marked.] P2 has no ordering constraint with P3 or P4. The PLO constraint is more restrictive than needed.
CAO_OPT(1/2) The CAO_OPT algorithm also keeps the free space pool in the center of the TCAM. Basic idea: for every prefix, the longest chain that this prefix belongs to should be split around the free space pool as equally as possible.
CAO_OPT(2/2) P1 P2 P3 P4 P4 < P3 < P1, P2 < P1
Summary of Simulation Results Algorithms: L-algo, PLO_OPT, CAO_OPT. Rows: Average, Standard deviation, Worst Case 21123. MAE-East trace: Entries 43344, Insertions 34204, Deletions 9140.
Conventional TCAM-Based Architecture Conventional TCAM-based IP lookup architecture in PLO (prefix length order). [Figure labels: Priority Encoder, Location, Prefix, Next-hop.]
Design of the Proposed Architecture(1/4) The maximum number of matched entries in a TCAM depends on the maximum depth of levels of the prefix search trie. The maximum length of any chain currently does not exceed 7, so there can be at most 7 matches. If the forwarding table is partitioned into several TCAMs so that there is no ancestor-descendant relation in each partitioned TCAM, then it is guaranteed that there exists at most one match in each TCAM.
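The partitioning property can be checked with a short sketch. Everything here is an illustrative assumption: prefixes are bit strings ending in `*`, addresses are shortened to 8 bits so all of them can be enumerated, and `related`, `is_single_match`, and `match_count` are hypothetical helper names.

```python
from itertools import combinations
from typing import List

def bits(prefix: str) -> str:
    return prefix.rstrip("*")

def related(p: str, q: str) -> bool:
    """True if one prefix is an ancestor of (or equal to) the other."""
    return bits(p).startswith(bits(q)) or bits(q).startswith(bits(p))

def is_single_match(tcam: List[str]) -> bool:
    """A partition qualifies when no two of its prefixes are ancestor and descendant."""
    return not any(related(p, q) for p, q in combinations(tcam, 2))

def match_count(tcam: List[str], addr: str) -> int:
    return sum(addr.startswith(bits(p)) for p in tcam)

tcam_a = ["10100*", "1011*", "1110*", "00*"]  # no ancestor-descendant pair
tcam_b = ["10*", "11*", "0*"]                 # no ancestor-descendant pair
mixed = ["10100*", "10*"]                     # 10* is an ancestor of 10100*

all_addrs = [format(a, "08b") for a in range(256)]  # 8-bit addresses for brevity
worst_a = max(match_count(tcam_a, a) for a in all_addrs)
worst_b = max(match_count(tcam_b, a) for a in all_addrs)
```

With the two clean partitions no address matches more than once, whereas `mixed` yields two matches for an address such as `10100000`.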
Design of the Proposed Architecture(2/4) Proposed IP Lookup Architecture
Design of the Proposed Architecture(3/4) Each of TCAM 0 to TCAM 6 should contain a set of mutually disjoint prefixes, so a lookup for a given IP address yields no more than one match in each TCAM. Consequently, the single-match TCAMs need no priority encoder logic. The selection logic selects the longest among those matches by using the length data and sends out the corresponding output port number.
Design of the Proposed Architecture(4/4) If there is no suitable single-match TCAM for a newly inserted prefix, the prefix is assigned to the conventional TCAM. Ex: There are only two single-match TCAMs, and two disjoint prefixes p1 = 10100* and p2 = 1011* are stored in different single-match TCAMs. Then there is no way to insert a new prefix p8 = 10* into either single-match TCAM without moving an existing prefix.
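A minimal sketch of the selection step follows. The TCAM contents and their assignment to two TCAMs are assumptions chosen for illustration, the conventional TCAM's result is left out for brevity, and `lookup_single_match` and `select` are hypothetical names.

```python
from typing import List, Optional, Tuple

Entry = Tuple[str, int]   # (prefix, output port)
Result = Tuple[int, int]  # (matched prefix length, output port)

def lookup_single_match(tcam: List[Entry], addr: str) -> Optional[Result]:
    """Return the single match of a single-match TCAM, or None if there is no match."""
    hits = [(len(p.rstrip("*")), port) for p, port in tcam
            if addr.startswith(p.rstrip("*"))]
    assert len(hits) <= 1, "a single-match TCAM must not produce two matches"
    return hits[0] if hits else None

def select(results: List[Optional[Result]]) -> Optional[int]:
    """Pick the output port of the longest match among the per-TCAM results."""
    found = [r for r in results if r is not None]
    if not found:
        return None
    return max(found, key=lambda r: r[0])[1]

# Hypothetical assignment of prefixes to two single-match TCAMs.
tcams = [
    [("10100*", 10), ("00*", 15)],
    [("10*", 16), ("0*", 18)],
]
port = select([lookup_single_match(t, "10100111") for t in tcams])
```

No ordering inside a TCAM is consulted; only the length data attached to each match decides the result.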
Search algorithm Ex: [Figure: single-match TCAMs 0 to 6 feeding the selection logic. Stored entries, written as prefix, output port: P1(10100*),10; P2(1011*),11; P3(1110*),11; P4(010*),13; P5(100*),14; P6(110*),14; P7(00*),15; P8(10*),16; P9(11*),17; P10(0*),18. The lookup matches P1 (length 5) and P8 (length 2), and the selection logic outputs 10.]
Insertion algorithm(1/3) Ex: insert 101*. [Figure: entries P1 to P10 in single-match TCAMs 0 to 6, plus TCAM 7 as the conventional TCAM. Labels: Available[0] ← false, Available[0] ← true, random insertion.]
Insertion algorithm(2/3) Ex: insert 1*. [Figure: entries P1 to P10 in single-match TCAMs 0 to 6, plus TCAM 7 as the conventional TCAM. Labels: Available[0] ← false, insert to TCAM 7.]
Insertion algorithm(3/3)
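The insertion examples can be sketched roughly as follows. This is a reading of the slides rather than the authors' pseudocode: a single-match TCAM is taken to be unavailable when it already holds an ancestor or descendant of the new prefix, the choice among available TCAMs is random, TCAM capacity is ignored, and the conventional TCAM is simply kept sorted by decreasing length. The names `related` and `insert` are hypothetical.

```python
import random
from typing import List, Optional

def related(p: str, q: str) -> bool:
    """True if one prefix is an ancestor of (or equal to) the other."""
    a, b = p.rstrip("*"), q.rstrip("*")
    return a.startswith(b) or b.startswith(a)

def insert(single_match: List[List[str]], conventional: List[str],
           p: str, rng=random) -> Optional[int]:
    """Insert p; return the index of the chosen single-match TCAM, or None
    if p had to go to the conventional TCAM."""
    available = [not any(related(p, q) for q in tcam) for tcam in single_match]
    candidates = [i for i, ok in enumerate(available) if ok]
    if candidates:
        i = rng.choice(candidates)  # random assignment among available TCAMs
        single_match[i].append(p)   # no ordering to maintain
        return i
    # Fallback (assumed policy): conventional TCAM kept in prefix-length order.
    conventional.append(p)
    conventional.sort(key=lambda q: len(q.rstrip("*")), reverse=True)
    return None

# The two-TCAM example: p1 = 10100* and p2 = 1011* sit in different TCAMs.
single_match = [["10100*"], ["1011*"]]
conventional: List[str] = []
where_10 = insert(single_match, conventional, "10*")  # ancestor of both
where_00 = insert(single_match, conventional, "00*")  # unrelated to both
```

In this run `10*` conflicts with both single-match TCAMs and falls back to the conventional TCAM, while `00*` can go to either single-match TCAM.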
Deletion algorithm To delete a prefix p, first determine which TCAM contains it (line 1 of the algorithm). Then delete the prefix from that TCAM (line 2). The function delete_from() behaves differently depending on whether it operates on the conventional TCAM or a single-match TCAM.
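A hedged sketch of the two deletion steps. The slide does not say how delete_from() differs between the two TCAM types, so the behavior below is purely an assumption: an unordered single-match TCAM fills the hole with its last entry (at most one movement), while the ordered conventional TCAM is compacted naively. The `ordered` flag and the returned movement count are also illustrative.

```python
from typing import List

def delete_from(tcam: List[str], p: str, ordered: bool) -> int:
    """Delete p from one TCAM and return the number of entry movements (assumed model)."""
    i = tcam.index(p)
    if ordered:
        # Assumption: naive compaction, every entry below the hole shifts up.
        moves = len(tcam) - 1 - i
        del tcam[i]
        return moves
    # Assumption: no ordering, so the last entry can fill the hole.
    last = tcam.pop()
    if i < len(tcam):
        tcam[i] = last
        return 1
    return 0

def delete(single_match: List[List[str]], conventional: List[str], p: str) -> int:
    # Step 1: determine which TCAM contains p. Step 2: delete it from that TCAM.
    for tcam in single_match:
        if p in tcam:
            return delete_from(tcam, p, ordered=False)
    if p in conventional:
        return delete_from(conventional, p, ordered=True)
    raise KeyError(p)

single_match = [["10100*", "00*"], ["1011*"]]
conventional = ["101*", "10*"]
moves_single = delete(single_match, conventional, "10100*")
moves_conventional = delete(single_match, conventional, "101*")
```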
Simulation Environment(1/2) In our simulation we used routing tables from Route Views[9].
Simulation Environment(2/2) Comparison of Memory Movements
Simulation Results(1/3) Memory Movements per Update
Simulation Results(2/3) The Number of Prefixes in Each TCAM 0.33%
Simulation Results(3/3) Insertions and Deletions 0.19% and 0.18%
Discussion The update performance depends on two factors: the number of updates in the conventional TCAM, and the number of deletions in the single-match TCAMs. The simulation results show that the number of updates in the conventional TCAM is quite small.
Conclusion Novel assignment strategies for prefix insertion should be developed and evaluated in future research. The design of hardware that eliminates memory movements also remains for future work.