1 Dynamic Pipelining: Making IP-Lookup Truly Scalable Jahangir Hasan, T. N. Vijaykumar School of Electrical and Computer Engineering, Purdue University SIGCOMM '05 Presenter: Rung-Bo-Su, 10/26/05
2 0. Abstract An IP-lookup scheme must address five challenges of scalability, namely: – routing-table size, – lookup throughput, – implementation cost, – power dissipation, – and routing-table update cost.
3 Outline 1. Introduction 2. Background 3. Pipelined and Scalable IP-Lookup 4. Brief Review of TCAM-based Schemes 5. Methodology 6. Experimental Results 7. Conclusions
4 1. Introduction Fiber optics is enabling ever-higher line-rates. Two major problems for IP-lookup: – First, a lookup must complete every 2 ns per packet (for a 160 Gbps line-rate and a minimum packet size of 40 bytes). – Second, routing tables hold a large number of prefixes.
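To make the 2 ns figure concrete, here is the arithmetic behind it as a small Python sketch; the 160 Gbps line-rate and 40-byte minimum packet size are the values quoted on this slide.

```python
# Packet inter-arrival time at full line-rate for minimum-sized packets.
# Values are the ones quoted on the slide: 160 Gbps line-rate, 40-byte packets.

LINE_RATE_BPS = 160e9      # 160 Gbps
MIN_PACKET_BYTES = 40      # minimum packet size assumed on the slide

packet_bits = MIN_PACKET_BYTES * 8
inter_arrival_s = packet_bits / LINE_RATE_BPS

print(f"{inter_arrival_s * 1e9:.1f} ns per packet")  # -> 2.0 ns
```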
5 1. Introduction Key component: – the routing-table memory, used to search through the prefixes and locate the one that matches the incoming packet.
6 1. Introduction Five key scaling requirements: – memory required, – throughput that keeps up with the ever-increasing line-rates, – power dissipation low enough to keep the complexity of heat removal and the cost of cooling reasonable, – route-update cost, – implementation cost and complexity.
7 1. Introduction Two categories of IP-lookup schemes: – trie-based schemes – TCAMs.
8 1. Introduction Tries scale well in power, but they do not scale well in throughput if they are not pipelined. The two approaches for pipelining tries are: – Hardware-level pipelining (HLP) – Data-structure-level pipelining (DLP) To solve DLP's problems, we propose scalable dynamic pipelining (SDP).
9 2. Background Requirements: – (1) To avoid denial-of-service attacks and instabilities in the network, handle minimum-sized packets streaming in at full line-rate. – (2) Provide enough memory for the routing table. – (3) Choose the prefix with the longest match.
10 2. Background Trie-Based IP-lookup Schemes
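The trie figure itself is not reproduced in this transcript, so below is a minimal, illustrative 1-bit (unibit) trie in Python showing the longest-prefix-match operation that all trie-based schemes implement; the node layout and names are hypothetical, not taken from the paper.

```python
class TrieNode:
    """One node of a 1-bit (unibit) trie: two children and an optional next hop."""
    def __init__(self):
        self.child = [None, None]   # child[0] for bit '0', child[1] for bit '1'
        self.next_hop = None        # set if a prefix ends at this node

def insert(root, prefix_bits, next_hop):
    """Insert a prefix given as a string of '0'/'1' bits."""
    node = root
    for b in prefix_bits:
        i = int(b)
        if node.child[i] is None:
            node.child[i] = TrieNode()
        node = node.child[i]
    node.next_hop = next_hop

def longest_prefix_match(root, addr_bits):
    """Walk the trie along the address bits, remembering the last next hop seen."""
    node, best = root, root.next_hop
    for b in addr_bits:
        node = node.child[int(b)]
        if node is None:
            break
        if node.next_hop is not None:
            best = node.next_hop
    return best

# Example using the prefix set that appears later on slide 23.
root = TrieNode()
for p in ["", "0", "00", "000", "1", "10", "100", "1010"]:
    insert(root, p, "hop_" + (p or "default"))
print(longest_prefix_match(root, "1011"))  # -> hop_10 (longest match is 10*)
```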
11 2. Background Multiple-bit Stride Tries (striding)
12 2. Background The Need for Pipelined Tries – One memory access may take longer than the packet inter-arrival time. – The problem is aggravated by tries, which perform multiple memory accesses for one lookup.
13 3. Pipelined and Scalable IP-Lookup The observation that pipelining can be used to solve the scalability problem of IP-lookup is not new. – Hardware-level pipelined (HLP) scheme. – Data-structure-level pipelined (DLP) scheme.
14 3. Pipelined and Scalable IP-Lookup Hardware-Level Pipelining – k is the number of levels in the multi-bit trie. – d is the total delay of one memory access. – One lookup must complete every t seconds. – HLP hardware-level pipelines the entire memory holding the trie into k*d/t stages.
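As a quick sanity check of the k*d/t expression, a small Python sketch with illustrative numbers; only t = 2 ns comes from the earlier slide, while k and d are assumed values.

```python
import math

k = 8          # hypothetical number of levels in the multi-bit trie
d_ns = 4.0     # hypothetical total delay of one memory access (ns)
t_ns = 2.0     # required lookup period from the introduction (ns)

# HLP pipelines the memory holding the trie into k*d/t stages so that
# one lookup can be issued every t seconds.
stages = math.ceil(k * d_ns / t_ns)
print(stages)  # -> 16 hardware pipeline stages
```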
15 3. Pipelined and Scalable IP-Lookup Hardware-Level Pipelining
16 3. Pipelined and Scalable IP-Lookup Hardware-Level Pipelining [Figure: X and Y address bits, 2^X and 2^Y decode, decoder, memory array, access, multiplexer, output]
17 3. Pipelined and Scalable IP-Lookup Hardware-Level Pipelining [Figure: the same memory access path (decoder, memory array, access, multiplexer), shown pipelined]
18 3. Pipelined and Scalable IP-Lookup Data-Structure-Level Pipelining – Places each level of the trie in a different memory, so that each memory is accessed only once per packet lookup. – Because it does not rely on expensive memory technologies or deep hardware pipelining, it scales well in power and implementation cost.
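Below is a minimal software sketch of the DLP idea, assuming a 1-bit trie whose level i is held in its own stage memory (modeled here as a dict keyed by the first i address bits); the encoding is illustrative, not the paper's.

```python
# Data-structure-level pipelining sketch: level i of a 1-bit trie is stored in
# stages[i], keyed by the first i bits of the address. Each entry holds
# [next_hop_or_None, has_children]. A lookup visits each stage memory once.

def build_stages(prefixes, width):
    """prefixes: dict mapping bit-string prefix -> next hop."""
    stages = [dict() for _ in range(width + 1)]
    for p, hop in prefixes.items():
        stages[len(p)].setdefault(p, [None, False])
        stages[len(p)][p][0] = hop
        # Mark every ancestor so the lookup knows the walk continues below it.
        for i in range(len(p)):
            stages[i].setdefault(p[:i], [None, False])
            stages[i][p[:i]][1] = True
    return stages

def lookup(stages, addr_bits):
    best = None
    for i in range(len(stages)):          # one access per pipeline stage
        entry = stages[i].get(addr_bits[:i])
        if entry is None:
            break
        if entry[0] is not None:
            best = entry[0]
        if not entry[1]:
            break
    return best

stages = build_stages({"": "default", "10": "A", "1010": "B"}, width=4)
print(lookup(stages, "1011"))  # -> A (longest match is 10*)
```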
19 3. Pipelined and Scalable IP-Lookup
20 3. Pipelined and Scalable IP-Lookup Three remaining challenges: – scalability in memory size, – in route-update cost, – and in lookup throughput.
21 3. Pipelined and Scalable IP-Lookup DLP's Scalability Problems in Memory Size – Each memory stage must be sized to be sufficient for any prefix distribution. – For the prefix distribution shown in Figure 4, its worst-case memory size would be no better than that of a single, non-pipelined memory.
22 3. Pipelined and Scalable IP-Lookup DLP's Scalability Problems in Route-update Cost – Multibit trie: a single route-update may touch arbitrarily many nodes. – Tree Bitmap: almost doubles the size of each trie node.
23 3. Pipelined and Scalable IP-Lookup DLP's Non-Scalability in Throughput; Scalable Dynamic Pipelining (SDP). Example prefix set: *, 0*, 00*, 000*, 1*, 10*, 100*, 1010*
24 3. Pipelined and Scalable IP-Lookup DELETE:
25 3. Pipelined and Scalable IP-Lookup Jump Nodes: – A node striding over k bits must have an array of 2^k pointers. – Often there may be only one child, and the remaining pointers are null.
26 3. Pipelined and Scalable IP-Lookup Jump Nodes:
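The two jump-node slides give only the motivation (long single-child chains waste 2^k-pointer arrays), so the sketch below shows one plausible reading as path compression: a jump node stores the bit string it skips instead of a pointer array. The structure and names are hypothetical, not the paper's exact layout.

```python
class JumpNode:
    """Replaces a chain of single-child nodes: the lookup must match
    'skip_bits' exactly before continuing at 'child'."""
    def __init__(self, skip_bits, child):
        self.skip_bits = skip_bits
        self.child = child

class Node:
    def __init__(self, next_hop=None):
        self.child = [None, None]     # may point to a Node or a JumpNode
        self.next_hop = next_hop

def walk(node, addr_bits, best=None):
    """Longest-prefix match over a trie that may contain jump nodes."""
    while node is not None:
        if isinstance(node, JumpNode):
            if not addr_bits.startswith(node.skip_bits):
                break                           # mismatch inside the skipped run
            addr_bits = addr_bits[len(node.skip_bits):]
            node = node.child
            continue
        if node.next_hop is not None:
            best = node.next_hop
        if not addr_bits:
            break
        node, addr_bits = node.child[int(addr_bits[0])], addr_bits[1:]
    return best

# Hypothetical example: prefix 1010* reached through a jump node that skips "010".
leaf = Node(next_hop="hop_1010")
root = Node(next_hop="hop_default")
root.child[1] = JumpNode("010", leaf)
print(walk(root, "10101"))  # -> hop_1010
print(walk(root, "1100"))   # -> hop_default (jump-node bits do not match)
```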
27 3. Pipelined and Scalable IP-Lookup Per-Stage Memory Bound (a) binary search tree with N leaves; (b) memory size of a trie with jump-nodes for the worst-case prefix distribution of Figure 4, compared to the size of a 1-bit trie
28 3. Pipelined and Scalable IP-Lookup Per-Stage Memory Bound (c) the space taken at various levels by a trie with jump-nodes, for various prefix distributions
29 3. Pipelined and Scalable IP-Lookup System Architecture – Shadow trie: a copy of the trie containing all the required auxiliary information. It is accessed only during construction or update of the trie, so it can use slow and cheap memory (DRAM). Modifications access only the shadow trie, and IP-lookups access only the SDP trie.
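A very small sketch of the separation described on this slide, with all class and method names hypothetical: route-updates are computed against the shadow copy (which could sit in DRAM), and only the resulting writes are pushed into the stage memories that IP-lookups read.

```python
class LookupPipeline:
    """Fast-path stage memories read by IP-lookups (one dict per stage here)."""
    def __init__(self, num_stages):
        self.stage = [dict() for _ in range(num_stages)]

    def write(self, stage_idx, key, value):
        self.stage[stage_idx][key] = value    # writes dispatched by route-updates

class ShadowTrie:
    """Slow-path copy holding auxiliary information; only updates touch it."""
    def __init__(self, pipeline):
        self.nodes = {}                       # full per-prefix bookkeeping
        self.pipeline = pipeline

    def add_route(self, prefix, next_hop):
        # 1) Modify the shadow structure (auxiliary info lives only here).
        self.nodes[prefix] = {"next_hop": next_hop}
        # 2) Dispatch the corresponding write to the lookup pipeline.
        self.pipeline.write(len(prefix), prefix, next_hop)

pipe = LookupPipeline(num_stages=33)          # levels 0..32 for IPv4
shadow = ShadowTrie(pipe)
shadow.add_route("1010", "hop_B")
print(pipe.stage[4]["1010"])                  # lookups read only the pipeline
```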
30 3. Pipelined and Scalable IP-Lookup This ensures that no read operation can encounter the data structure in an inconsistent or erroneous state.
31 3. Pipelined and Scalable IP-Lookup Optimum Cost Incremental Route-updates
32 3. Pipelined and Scalable IP-Lookup Memory Management Overhead; Scalability in Lookup Rate
33 4. Brief Review of TCAM-based Schemes Content Addressable Memory (CAM): – Compares all memory locations against the input key to find matching entries. Ternary Content Addressable Memory (TCAM): – Supports wildcard bits in the entries. – Finds the longest matching prefix in one operation.
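A functional Python model of the TCAM behaviour described here; a real TCAM compares all entries in parallel in one cycle, so the loop below is only a stand-in, and the entry format is illustrative.

```python
# Each TCAM entry is (value, mask, prefix_len, next_hop): a mask bit of 1 means
# "care", 0 means wildcard, which is how a prefix of length L is stored.

def make_entry(prefix_bits, width, next_hop):
    value = int(prefix_bits, 2) << (width - len(prefix_bits)) if prefix_bits else 0
    mask = ((1 << len(prefix_bits)) - 1) << (width - len(prefix_bits))
    return (value, mask, len(prefix_bits), next_hop)

def tcam_lookup(entries, key):
    """Return the next hop of the longest matching prefix."""
    best_len, best_hop = -1, None
    for value, mask, plen, hop in entries:     # hardware does this in parallel
        if (key & mask) == (value & mask) and plen > best_len:
            best_len, best_hop = plen, hop
    return best_hop

W = 8  # address width for the toy example
table = [make_entry(p, W, "hop_" + (p or "default"))
         for p in ["", "10", "1010"]]
print(tcam_lookup(table, 0b10110000))  # -> hop_10
```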
34 4. Brief Review of TCAM-based Schemes Because a single access activates all memory locations, as opposed to just one, a TCAM dissipates far more power than a RAM. TCAMs are pipelined at the hardware level. TCAMs do not scale well in power and implementation cost at high line-rates.
35 5. Methodology We utilize CACTI 3.2, a tool that accurately models SRAM and CAM structures, targeting 100 nm CMOS technology.
36 6. Experimental Results (a) Worst-case per-stage memory versus trie-levels for DLP
37 6. Experimental Results (b) Worst-case total memory versus trie-levels for HLP
38 6. Experimental Results (c) A comparison of total worst-case memory versus routing-table size for various IP-lookup schemes.
39 6. Experimental Results Comparison of power dissipation versus line-rate for various schemes with table sizes of (a) 250,000 (b) 500,000 (c) 1 million prefixes.
40 6. Experimental Results Comparison of chip area versus line-rate for various schemes with table sizes of (a) 250,000 (b) 500,000 (c) 1 million prefixes.
41 6. Experimental Results Summary of Results – HLP does not scale well in total memory size, power dissipation, route-update cost, and implementation cost. – DLP does not scale well in total memory size, lookup throughput, and route-update cost. – TCAMs do not scale well in implementation cost and power dissipation.
42 7. Conclusions We propose scalable dynamic pipelining (SDP). Three key innovations: – We prove a worst-case per-stage memory bound that is significantly tighter than those of previous schemes. – The incremental route-update cost is the optimum. – SDP combines scalability at the data-structure level and the hardware level.
43 7. Conclusions SDP naturally scales in power and implementation cost. Using detailed hardware simulation, we show that SDP is the only scheme that achieves all five scalability requirements.