Dynamic Pipelining: Making IP-Lookup Truly Scalable
Jahangir Hasan, T. N. Vijaykumar
Presented by Sailesh Kumar
2 - Sailesh Kumar - 2/22/2016

A Simple Router
(Figure: arriving packets enter VOQs, pass through IP lookup, then the crossbar)
- The routing table contains (prefix, destination) pairs
- IP lookup finds the destination with the longest matching prefix
- At OC-768 speeds and beyond, a lookup must complete within a few nanoseconds (2 ns per packet at 160 Gbps), so IP lookup can become a bottleneck
This Paper's Contribution
This paper presents an IP-lookup ASIC architecture that addresses five scalability challenges:
- Memory size: grows slowly with the number of prefixes
- Lookup throughput: keeps up with line rate
- Implementation cost: complexity, chip area, etc.
- Power dissipation: grows slowly with the number of prefixes and line rate
- Routing-table update cost: O(1)
No existing lookup architecture effectively addresses all five challenges.
Previous Work
- Several IP-lookup schemes have been proposed
- Memory access time exceeds packet inter-arrival time, so pipelining must be used
- Several papers have proposed pipelining
(Flattened table: rows TCAMs; HLP [Varghese et al., ISCA'03]; DLP [Basu, Narlikar, Infocom'05]; this paper — against columns Space, Throughput, Updates, Power, Area, marking which criteria each scheme satisfies)
IP Address Lookup
- Routing tables at router input ports contain (prefix, next hop) pairs
- The address in the packet is compared to the stored prefixes, starting from the left
- The prefix matching the largest number of address bits is the desired match
- The packet is forwarded to the specified next hop

Routing table (prefix → next hop): 01* → 5, 110* → 3, 1011* → 5, 0001* → 0, 10* → 7, 0001 0* → 1, 0011 00* → 2, 1011 001* → 3, 1011 010* → 5, 0101 1* → 7, 0100 1100* → 4, 1011 0011* → 8, 1001 1000* → 10, 0101 1001* → 9, 0100 110* → 6

Address to look up: 1011 0010 1000
(Taken from CSE 577 lecture notes)
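The slide's matching rule can be sketched directly. This is a minimal linear-scan version (function and variable names are illustrative, not from the paper); tries, introduced next, make the same search fast:

```python
# Linear-scan longest-prefix match: keep the longest stored prefix that
# matches the leading bits of the address. Prefixes are bit-strings.
def longest_prefix_match(table, addr_bits):
    best_prefix, best_hop = None, None
    for prefix, next_hop in table.items():
        if addr_bits.startswith(prefix):
            if best_prefix is None or len(prefix) > len(best_prefix):
                best_prefix, best_hop = prefix, next_hop
    return best_prefix, best_hop

# The routing table from the slide (spaces removed from the prefixes).
table = {
    "01": 5, "110": 3, "1011": 5, "0001": 0, "10": 7,
    "00010": 1, "001100": 2, "1011001": 3, "1011010": 5,
    "01011": 7, "01001100": 4, "10110011": 8, "10011000": 10,
    "01011001": 9, "0100110": 6,
}
print(longest_prefix_match(table, "101100101000"))  # -> ('1011001', 3)
```

For address 1011 0010 1000, prefixes 10*, 1011*, and 1011 001* all match; the last is longest, giving next hop 3.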
Address Lookup Using Tries
- Prefixes are stored in "alphabetical order" in a tree
- Prefixes are "spelled out" by following a path from the top (green dots mark prefix ends)
- To find the best prefix, spell out the address in the tree
- The last green dot passed marks the longest matching prefix
(Figure: binary trie for the routing table; looking up address 1011 0010 1000 yields next hop 3)
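A minimal sketch of the 1-bit trie the slide describes (dict-based nodes are an illustrative choice, not the paper's encoding). The lookup remembers the last prefix end passed, exactly as the "last green dot" rule says:

```python
# A 1-bit (binary) trie: one child per bit, next_hop set at prefix ends.
class TrieNode:
    def __init__(self):
        self.children = {}    # "0"/"1" -> TrieNode
        self.next_hop = None  # set at nodes that end a prefix ("green dots")

def insert(root, prefix, hop):
    node = root
    for bit in prefix:
        node = node.children.setdefault(bit, TrieNode())
    node.next_hop = hop

def lookup(root, addr_bits):
    node, best = root, None
    for bit in addr_bits:
        if node.next_hop is not None:
            best = node.next_hop        # last prefix end seen so far
        node = node.children.get(bit)
        if node is None:                # ran off the trie
            break
    else:
        if node.next_hop is not None:   # address exhausted at a prefix end
            best = node.next_hop
    return best

# A few entries from the slide's table suffice to show the behavior.
root = TrieNode()
for p, h in {"01": 5, "10": 7, "1011": 5, "1011001": 3, "10110011": 8}.items():
    insert(root, p, h)
print(lookup(root, "101100101000"))  # -> 3
```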
Leaf Pushing
Routing table: 0* → P1, 1* → P2, 101* → P3
- Without leaf pushing, every internal node might need to store next-hop information
- Leaf pushing copies each prefix's next hop down to the leaves (here, P2 is pushed to all leaves under 1*), so the longest match is always found at a leaf; with proper encoding it also reduces node size
- It complicates updates, since changing one prefix may require updating many leaves
(Figure: the 1-bit trie before and after pushing P2 to the leaves)
Multibit Trie
- Match several bits in one step instead of a single bit (equivalent to collapsing sub-trees of the binary trie into single nodes)
- Each node may be associated with several prefixes
- A stride of s reduces the trie depth by a factor of s
(Figure: stride-3 multibit trie; address 101 100 101 000 is consumed three bits at a time)
Controlled Prefix Expansion
Routing table: 0* → P1, 1* → P2, 101* → P3
Expanded to stride-2 boundaries: 00* → P1, 01* → P1, 10* → P2, 11* → P2, 1010* → P3, 1011* → P3
- Controlled prefix expansion aligns prefixes to stride boundaries
- In the worst case, it causes non-deterministic increases in routing-table size
- There are schemes that use variable strides to improve the average case, but the worst case remains the same
(Figure: the stride-2 multibit trie after expansion)
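The expansion step on this slide can be sketched in a few lines (a simplified illustration, not the paper's code): each prefix is padded to the next multiple of the stride by enumerating all completions of the missing bits. Expanding shorter prefixes first lets longer, more specific ones take precedence:

```python
from itertools import product

def expand(table, stride):
    """Pad each bit-string prefix out to the next stride boundary."""
    expanded = {}
    # Shorter prefixes first, so a longer original wins any overlap.
    for prefix, hop in sorted(table.items(), key=lambda kv: len(kv[0])):
        target = -(-len(prefix) // stride) * stride   # round up to stride
        for tail in product("01", repeat=target - len(prefix)):
            expanded[prefix + "".join(tail)] = hop
    return expanded

print(sorted(expand({"0": "P1", "1": "P2", "101": "P3"}, 2).items()))
# -> [('00', 'P1'), ('01', 'P1'), ('10', 'P2'),
#     ('1010', 'P3'), ('1011', 'P3'), ('11', 'P2')]
```

Note how 101* becomes two stride-2-aligned prefixes (1010*, 1011*): this duplication is exactly the non-deterministic table growth the slide warns about.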
Need for Pipelined Tries
- Tomorrow's routers will run at 160 Gbps, i.e., one packet every 2 ns
- That allows at most one memory access per 2 ns (possibly fewer)
- Moreover, there may be millions of prefixes, so worst-case memory requirements are very high, and large memories are slow
- This calls for an architecture that uses multiple smaller memories and accesses them in a pipelined manner
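The 2 ns budget follows from a back-of-the-envelope calculation with minimum-size 40-byte packets (the 40-byte figure is the standard worst-case assumption, not stated on the slide):

```python
# Packet inter-arrival time at line rate for minimum-size packets:
# 40 bytes = 320 bits; 320 bits / 160 Gbps = 2 ns.
line_rate_bps = 160e9
min_packet_bits = 40 * 8
inter_arrival_ns = min_packet_bits / line_rate_bps * 1e9
print(inter_arrival_ns)  # -> 2.0
```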
Pipelined Trie-based IP Lookup
- Each trie level is placed in a different pipeline stage, overlapping multiple packets
- The tree data structure keeps prefixes in the leaves (leaf pushing)
- The IP address is processed level by level to find the longest match
(Figure: a leaf-pushed 1-bit trie with prefixes P1–P7; e.g., P4 = 10010*)
Closest Previous Work: Data-Structure-Level Pipelining (DLP)
- DLP maps each trie level to a pipeline stage, but the mapping is static
- Updates change the prefix distribution, yet the mapping persists
- In the worst case, any one stage can hold all the prefixes, so every stage needs a large worst-case memory
- There is no bound on worst-case update cost; it could be made O(1) using Tree Bitmap, but the constant is huge: 1852 memory accesses per update [SIGCOMM Comp. Comm. Review '04]
(Figure taken from Hasan et al.: prefixes 0*, 00*, 000*, … mapped level by level to stages)
Memory Bound per Stage
- The figure (taken from Hasan et al.) shows the worst-case prefix distribution for 1 million prefixes, each 32 bits long
- In this case the largest stage needs 5 MB, and the total memory is 80 MB, versus only 6 MB for the prefixes themselves
- Moreover, a 5 MB memory cannot be accessed faster than roughly once every 6 ns
Hardware-Level Pipelining (HLP)
- HLP pipelines the memory accesses at the hardware level: multiple words of memory are read together in a pipelined manner
- Throughput is then limited only by the memory-array access time, so such memories can improve IP-lookup throughput
- As such it is not scalable: a higher degree of pipelining leads to prohibitive chip area and power dissipation
(Figure taken from Sherwood et al.)
Key Idea
- HLP does not scale well in chip area and power
- DLP scales well in power but not in memory size (due to the static level-to-stage mapping) or throughput (one stage cannot go faster than about 6 ns)
- Combine the two into Scalable Dynamic Pipelining (SDP): use DLP with a better mapping so that each stage is smaller, and use HLP within every stage to accelerate it further
Key Idea: Use Dynamic Mapping
- Map node height to stage, instead of level to stage
- Height changes with updates and captures the distribution of prefixes below a node; hence the name dynamic pipelining
- However, the worst-case memory requirement remains the same, e.g., when all prefixes are 32 bits long
(Figure taken from Hasan et al.)
Key Idea: Use Jump Nodes
- Jump nodes reduce the worst-case memory requirement
- They also restore the relation between height and prefix distribution
- One might argue that jump nodes would reduce DLP's memory requirements too. No, and we will soon see why!
(Figure taken from Hasan et al.: prefixes 1* → P4 and 1010* → P5; the single-child chain becomes a jump node)
Another Example of Jump Nodes
(Figure: a trie after leaf pushing, then after adding jump nodes "Jump 100" and "Jump 11")
Note that this trie will need more than one node operation per table update, contrary to what the paper claims!
Tries with Jump Nodes: Key Properties
1. Number of leaves = number of prefixes: no replication, avoiding the inflation of prefix expansion and leaf pushing
2. Updates do not propagate to subtrees: no replication
3. Every internal node has exactly 2 children: jump nodes collapse away single-child chains
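A jump-node trie is essentially a path-compressed (Patricia-style) binary trie. The recursive construction below is a sketch under the assumption that the prefix set is already leaf-pushed, i.e., pairwise disjoint (no prefix is a prefix of another); the tuple encoding is illustrative, not the paper's node format:

```python
import os

def build(prefixes):
    """Build a jump-node trie from disjoint (leaf-pushed) bit-string prefixes."""
    if len(prefixes) == 1:
        return ("leaf", prefixes[0])          # property (1): one leaf per prefix
    lcp = os.path.commonprefix(prefixes)      # shared single-child chain, if any
    zeros = [p[len(lcp) + 1:] for p in prefixes if p[len(lcp)] == "0"]
    ones  = [p[len(lcp) + 1:] for p in prefixes if p[len(lcp)] == "1"]
    node = ("node", build(zeros), build(ones))  # property (3): binary fanout
    # A non-empty common prefix would be a chain of single-child nodes;
    # collapse it into one jump node instead.
    return ("jump", lcp, node) if lcp else node

def count_leaves(t):
    if t[0] == "leaf":
        return 1
    if t[0] == "jump":
        return count_leaves(t[2])
    return count_leaves(t[1]) + count_leaves(t[2])

prefixes = ["00010", "001100", "01", "10011000", "1011001"]
trie = build(prefixes)
print(count_leaves(trie) == len(prefixes))  # -> True  (property 1 holds)
```

Because the maximality of the common prefix guarantees both the 0-branch and the 1-branch are non-empty, every internal node built this way really has two children, which is property (3).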
Total versus Per-Stage Memory
- Jump nodes bound the total trie size by 2N nodes
- Would DLP plus jump nodes give small per-stage memory? No: DLP's mapping is still static, so the worst-case per-stage memory remains large
- The total is bounded, but not the per-stage size
(Figure taken from Hasan et al.: the top log2(N) trie levels versus the remaining W − log2(N) levels)
SDP's Per-Stage Memory Bound
Proposition: map all nodes of height h to pipeline stage (W − h).
Result: size of the k-th stage = min( N/(W−k), 2^k ) nodes.
Key Observation #1
A node of height h has at least h prefixes in its subtree:
- There is at least one path of length h from the node down to some leaf, with h − 1 nodes along it
- Each node on the path leads to at least one additional leaf
- So the path accounts for (h − 1) + 1 = h leaves, i.e., h prefixes
(Figure taken from Hasan et al.)
Key Observation #2
There are no more than N/h nodes of height h, for any prefix distribution:
- Suppose there were more than N/h nodes of height h
- Each accounts for at least h prefixes (observation #1), and the subtrees of distinct height-h nodes are disjoint
- The total number of prefixes would then exceed N; by contradiction, observation #2 holds
Main Result of the Proposition
- Map all nodes of height h to pipeline stage (W − h)
- The k-th stage then holds nodes of height W − k, so by observation #2 it has at most N/(W−k) nodes
- The 1-bit trie has binary fanout, so the k-th stage also has at most 2^k nodes
- Hence the size of the k-th stage is min( N/(W−k), 2^k ) nodes
- This yields about 20 MB for 1 million prefixes, 4x better than DLP
(Figure taken from Hasan et al.: static pipelining (DLP) versus dynamic pipelining (SDP))
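The bound is easy to evaluate numerically for the slide's parameters (N = 1 million prefixes, W = 32-bit addresses); the bytes-per-node figure is my assumption for relating node counts to the ~20 MB quoted:

```python
# Per-stage node bound: size_k = min(N/(W-k), 2^k), for k = 0 .. W-1.
N, W = 10**6, 32
stage_nodes = [min(N // (W - k), 2**k) for k in range(W)]
total = sum(stage_nodes)
print(total)  # ~3.45 million nodes; at a few bytes per node, roughly 20 MB
```

The early stages are limited by binary fanout (2^k) and the late stages by observation #2 (N/(W−k)); the crossover is around stage 16 for these parameters.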
Optimum Incremental Updates
- One update can change the height, and hence the stage, of many nodes; must all affected nodes migrate, making updates inefficient?
- No: only a node's ancestors can change height, and each ancestor sits in a different stage
- So an update needs at most one node-write per stage, i.e., a single write bubble
- Updating SDP is thus not just O(1) but exactly 1 write bubble
(Figure taken from Hasan et al.)
Incremental Updates
(Figure: an example trie with nodes 1–17 mapped across pipeline stages Pipe 0 through Pipe 5)
Incremental Updates (continued)
(Figure: the same trie after an update; node 7 becomes a jump node and the stage contents change)
The implementation complexity may be fairly high, because jump nodes (e.g., for node 7) may need to be computed on the fly.
Efficient Memory Management
- Tree Bitmap with segmented hole compaction requires multiple memory accesses per update
- A multibit trie with variable strides requires even more complex memory management
- SDP uses no variable striding or compression, so all nodes are the same size: there is no fragmentation or compaction upon updates, and memory management is trivial
Scaling SDP for Throughput
- Each SDP stage can be further pipelined in hardware
- HLP [ISCA'03] pipelined only in hardware, without DLP, which becomes too deep at high line rates
- Combining HLP with SDP keeps the hardware pipeline feasibly deep while matching future line rates
(Figure taken from Hasan et al.: per-stage sizes N/(W−k) and 2^k, with the number of HLP stages per SDP stage)
Experiments
(Figures taken from Hasan et al.)
Discussion / Questions