CAMP: Fast and Efficient IP Lookup Architecture Sailesh Kumar, Michela Becchi, Patrick Crowley, Jonathan Turner Washington University in St. Louis.

1 CAMP: Fast and Efficient IP Lookup Architecture
Sailesh Kumar, Michela Becchi, Patrick Crowley, Jonathan Turner
Washington University in St. Louis

2 Michela Becchi - 10/20/2015
Context
• Trie-based IP lookup
• Circular pipeline architectures

6 Context
• Trie-based IP lookup
• Circular pipeline architectures

Prefix dataset:
  0*     P1
  000*   P2
  0010*  P3
  0011*  P4
  011*   P5
  10*    P6
  11*    P7
  110*   P8

IP address: 111010…

[Figure: binary trie built from the prefix dataset, with its levels assigned to Stage 1 through Stage 4, next to a circular pipeline of stages 1, 2, 3, 4]
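As an illustration of the lookup that the trie performs, here is a minimal sketch of longest-prefix matching on a binary trie, using the prefix dataset from the slide. The names `Node`, `build_trie`, and `longest_prefix_match` are hypothetical and are not taken from the presentation; in a pipelined design each iteration of the lookup loop would correspond to one stage visit.

```python
# Prefix dataset from the Context slide: bit-string prefix -> next-hop label.
PREFIXES = {
    "0": "P1", "000": "P2", "0010": "P3", "0011": "P4",
    "011": "P5", "10": "P6", "11": "P7", "110": "P8",
}


class Node:
    """One trie node (hypothetical layout, for illustration only)."""

    def __init__(self):
        self.child = {}       # "0" or "1" -> Node
        self.next_hop = None  # label of the prefix ending here, if any


def build_trie(prefixes):
    """Insert every prefix bit by bit, creating nodes on demand."""
    root = Node()
    for bits, hop in prefixes.items():
        node = root
        for b in bits:
            node = node.child.setdefault(b, Node())
        node.next_hop = hop
    return root


def longest_prefix_match(root, addr_bits):
    """Walk the trie along the address bits, remembering the last prefix seen."""
    node, best = root, None
    for b in addr_bits:
        node = node.child.get(b)
        if node is None:
            break
        if node.next_hop is not None:
            best = node.next_hop
    return best


trie = build_trie(PREFIXES)
print(longest_prefix_match(trie, "111010"))  # P7 (11* matches, 110* does not)
```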

7 CAMP: Circular Adaptive and Monotonic Pipeline
• Problems:
  » Optimize the global memory requirement
  » Avoid bottleneck stages
  » Make per-stage utilization uniform
• Idea:
  » Exploit a circular pipeline:
    - Each stage can be a potential entry/exit point
    - Wrap-around is possible
  » Split the trie into sub-trees and map each of them independently to the pipeline

8 CAMP (cont’d)
• Implications:
  » Pros:
    - Flexibility: the maximum prefix length is decoupled from the pipeline depth
    - Upgradeability: memory bank updates involve only partial remapping
  » Cons:
    - A stage can simultaneously be an entry point for one request and a transition stage for another
      · This gives rise to conflicts
      · A scheduling mechanism is required
      · Efficiency may degrade

9 Trie splitting
• Define an initial stride x
• Use a direct index table with 2^x entries for the first x levels
• Expand short prefixes to length x
• Map the sub-trees

E.g.: initial stride x=2

[Figure: a trie holding prefixes P1 to P8, split at x=2 into a direct index table plus Subtree 1, Subtree 2, and Subtree 3]
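The splitting steps above can be sketched as follows. This is an assumption-laden illustration, not the authors' implementation: it reuses the prefix dataset from the Context slide (not the trie drawn on this slide), the function names `split_trie` and `lookup` are hypothetical, and each sub-tree is kept as a plain dictionary of prefix suffixes rather than as real trie nodes.

```python
# Prefix dataset from the Context slide (assumed here for illustration).
PREFIXES = {
    "0": "P1", "000": "P2", "0010": "P3", "0011": "P4",
    "011": "P5", "10": "P6", "11": "P7", "110": "P8",
}


def split_trie(prefixes, x):
    """Build a direct index table with 2**x entries.

    Prefixes of length <= x are expanded to length x and stored as the
    entry's default next hop; longer prefixes go into the entry's sub-tree,
    keyed by the bits that follow the first x bits.
    """
    table = [{"default": None, "default_len": -1, "subtree": {}}
             for _ in range(1 << x)]
    for bits, hop in prefixes.items():
        if len(bits) <= x:
            pad = x - len(bits)
            base = (int(bits, 2) << pad) if bits else 0
            for i in range(1 << pad):
                entry = table[base + i]
                # The longest original prefix wins when expansions collide.
                if len(bits) > entry["default_len"]:
                    entry["default"], entry["default_len"] = hop, len(bits)
        else:
            table[int(bits[:x], 2)]["subtree"][bits[x:]] = hop
    return table


def lookup(table, x, addr_bits):
    """Index with the first x bits, then longest-match inside the sub-tree."""
    entry = table[int(addr_bits[:x], 2)]
    best, rest = entry["default"], addr_bits[x:]
    for n in range(1, len(rest) + 1):
        if rest[:n] in entry["subtree"]:
            best = entry["subtree"][rest[:n]]
    return best


table = split_trie(PREFIXES, x=2)
print(lookup(table, 2, "111010"))  # P7
```

Only the entries with a non-empty sub-tree need to be mapped to pipeline stages; the remaining entries are resolved by the index table alone.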

10 Dealing with conflicts
• Idea: use a request queue in front of each stage
• Intuition: without request queues,
  » a request may wait up to n cycles before entering the pipeline
  » a waiting request causes all subsequent requests to wait as well, even if they do not compete for the same stages
• Issue: ordering
  » Reordering is limited to requests with different entry stages (addressed to different destinations)
  » An optional output reorder buffer can be used
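To make the queueing idea concrete, here is a small cycle-level simulation sketch. It is a simplified model under assumptions that are not in the slides: every request visits the same number of stages (`hops`), in-flight requests always have priority over new ones, and the dispatcher stops feeding as soon as the target queue is full. The function name `simulate` and all parameter names are hypothetical.

```python
import itertools
import random
from collections import deque


def simulate(entries, n_stages, hops, queue_cap, cycles):
    """Return (dispatches per cycle, fraction of busy stage slots).

    entries: iterator yielding the entry stage of each successive request
             (an endless backlog is assumed).
    """
    queues = [deque() for _ in range(n_stages)]
    slots = [0] * n_stages  # remaining hops of the request at each stage, 0 = idle
    pending = None          # request at the head of the input, not yet queued
    dispatched = busy = 0
    for _ in range(cycles):
        # In-flight requests move one stage forward, wrapping around the ring.
        nxt = [0] * n_stages
        for s, rem in enumerate(slots):
            if rem > 1:
                nxt[(s + 1) % n_stages] = rem - 1
        # Feed the queues in arrival order; a full queue blocks the input.
        while True:
            if pending is None:
                pending = next(entries)
            if len(queues[pending]) >= queue_cap:
                break
            queues[pending].append(pending)
            pending = None
        # A stage with no in-flight request admits the head of its queue.
        for s in range(n_stages):
            if nxt[s] == 0 and queues[s]:
                queues[s].popleft()
                nxt[s] = hops
                dispatched += 1
        busy += sum(1 for rem in nxt if rem)
        slots = nxt
    return dispatched / cycles, busy / (cycles * n_stages)


rng = random.Random(0)
random_entries = (rng.randrange(8) for _ in itertools.count())
for cap in (1, 4, 32):
    lpc, util = simulate(random_entries, n_stages=8, hops=4,
                         queue_cap=cap, cycles=2000)
    print(f"queue_cap={cap:2d}  dispatch rate={lpc:.2f}  utilization={util:.2f}")
```

The per-stage queues let a request for a free stage bypass one that is still waiting for a busy stage, which is the effect the slide describes. Because requests from different queues can overtake one another, an output reorder buffer would be needed if global ordering matters.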

11 Pipeline efficiency
• Metrics:
  » Pipeline utilization: fraction of time the pipeline is busy, given a continuous backlog of requests
  » Lookups per cycle (LPC): average request dispatching rate
• Linear pipeline:
  » LPC = 1
  » Pipeline utilization is generally low
    - Stage utilization is not uniform
• CAMP pipeline:
  » High pipeline utilization
    - Uniform stage utilization
  » LPC close to 1 when
    - each request traverses the complete pipeline
    - # pipeline stages = # trie levels
  » LPC > 1 when
    - most requests do not make a complete circle around the pipeline
    - # pipeline stages > # trie levels
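The two CAMP cases above follow from a simple counting argument, sketched here with notation that is not in the slides: N is the number of pipeline stages, \bar{k} is the average number of stages a request visits, and U is the utilization taken as the average fraction of stages busy per cycle (an assumed reading of the slide's definition). In steady state, each dispatched request consumes \bar{k} stage slots and N slots are available per cycle, so:

```latex
\mathrm{LPC}\cdot\bar{k} = U \cdot N
\quad\Longrightarrow\quad
\mathrm{LPC} = \frac{U\,N}{\bar{k}} \;\le\; \frac{N}{\bar{k}}
```

When every request traverses the whole pipeline (\bar{k} = N), LPC equals U and cannot exceed 1. When requests visit fewer stages than the pipeline has (\bar{k} < N), the bound N/\bar{k} is larger than 1.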

12 Pipeline efficiency – all stages traversed
• Setup:
  » 24 stages, all traversed by each packet
  » Packet bursts: sequences of packets to the same entry point
• Results:
  » Long bursts result in high utilization and LPC
  » For all burst sizes, enough queuing (32) guarantees an LPC of 0.8

13 Pipeline efficiency – LPC > 1
• Setup:
  » 32 stages, rightmost 24 bits, tree bitmap of stride 3
  » Average prefix length 24
• Results:
  » LPC between 3 and 5
  » Long bursts result in lower utilization and LPC

14 Nodes-to-stages mapping
• Objectives:
  » Uniform distribution of nodes to stages
    - Minimize the size of the biggest stage
  » Correct operation of the circular pipeline
    - Avoid multiple loops around the pipeline
  » Simplified update operation
    - Avoid skipping levels

15 Nodes-to-stages mapping (cont’d)
• Problem formulation (constrained graph coloring):
  » Given:
    - A list of sub-trees
    - A list of colors represented by numbers
  » Color the nodes so that:
    - Every color is used nearly equally
    - The colors follow a monotonic ordering without gaps when a sub-tree is traversed from root to leaves
• Algorithm (min-max coloring heuristic):
  » Color the sub-trees in decreasing order of size
  » At each step:
    - Try all possible colors on the root (the rest of the sub-tree is colored accordingly)
    - Pick the local optimum

16 Min-max coloring heuristic - example
[Figure: sub-trees T1 to T4; the first sub-tree is colored level by level from the root: 1 / 2 2 / 3 3 3 3 / 4 4 4 4 4]

Nodes per color:
         Present   If 1 on    If 2 on    If 3 on    If 4 on
         coloring  new root   new root   new root   new root
Color 1     1
Color 2     2
Color 3     4
Color 4     5

17 Min-max coloring heuristic - example
[Figure: sub-trees T1 to T4; the first sub-tree is colored level by level from the root: 1 / 2 2 / 3 3 3 3 / 4 4 4 4 4]

Nodes per color:
         Present   If 1 on    If 2 on    If 3 on    If 4 on
         coloring  new root   new root   new root   new root
Color 1     1         2          5          3          2
Color 2     2         3          3          6          4
Color 3     4         6          5          5          8
Color 4     5         9          7          6          6

18 Min-max coloring heuristic - example
[Figure: sub-trees T1 to T4; the second sub-tree is now colored with 3 on its root: 3 / 4 / 1 1 / 2 2 2 2]

Nodes per color:
         Present   If 1 on    If 2 on    If 3 on    If 4 on
         coloring  new root   new root   new root   new root
Color 1     1         2          5          3          2
Color 2     2         3          3          6          4
Color 3     4         6          5          5          8
Color 4     5         9          7          6          6

19 Min-max coloring heuristic - example
[Figure: sub-trees T1 to T4; two sub-trees are colored, the third is about to be colored]

Nodes per color:
         Present   If 1 on    If 2 on    If 3 on    If 4 on
         coloring  new root   new root   new root   new root
Color 1     3         4          5          4          5
Color 2     6         8          7          8          7
Color 3     5         6          7          6          7
Color 4     6         8          7          8          7

20 Min-max coloring heuristic - example
[Figure: sub-trees T1 to T4; the third sub-tree is now colored with 2 on its root: 2 / 3 3 / 4 / 1 1]

Nodes per color:
         Present   If 1 on    If 2 on    If 3 on    If 4 on
         coloring  new root   new root   new root   new root
Color 1     3         4          5          4          5
Color 2     6         8          7          8          7
Color 3     5         6          7          6          7
Color 4     6         8          7          8          7

21 Min-max coloring heuristic - example
[Figure: sub-trees T1 to T4; the last, single-node sub-tree is colored 1]

Nodes per color:
         Present   If 1 on    If 2 on    If 3 on    If 4 on
         coloring  new root   new root   new root   new root
Color 1     5
Color 2     7
Color 3     7
Color 4     7
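The example can be replayed with a short sketch of the heuristic. Several things here are assumptions rather than material from the slides: each sub-tree is reduced to its number of nodes per level, those level sizes are inferred from the differences between the table columns, colors wrap around after the last one, ties go to the lowest color, and the sub-tree names "A" to "D" and the function name `min_max_coloring` are hypothetical.

```python
def min_max_coloring(subtrees, n_colors):
    """Greedy min-max coloring of sub-trees onto n_colors stages.

    subtrees: dict name -> list of node counts per level (root level first).
    Returns (root color per sub-tree, nodes per color), colors numbered from 1.
    """
    usage = [0] * n_colors
    roots = {}
    # Largest sub-trees first; sorted() is stable, so ties keep input order.
    order = sorted(subtrees, key=lambda name: sum(subtrees[name]), reverse=True)
    for name in order:
        levels = subtrees[name]
        if len(levels) > n_colors:
            raise ValueError("sub-tree deeper than the pipeline would loop twice")
        best = None
        for root in range(n_colors):
            trial = usage[:]
            # Level d of the sub-tree gets the d-th color after the root's color.
            for depth, count in enumerate(levels):
                trial[(root + depth) % n_colors] += count
            if best is None or max(trial) < max(best[1]):
                best = (root, trial)
        roots[name] = best[0] + 1
        usage = best[1]
    return roots, usage


# Level sizes inferred from the example tables (an assumption).
example = {"A": [1, 2, 4, 5], "B": [1, 1, 2, 4], "C": [1, 2, 1, 2], "D": [1]}
roots, usage = min_max_coloring(example, n_colors=4)
print(roots)  # {'A': 1, 'B': 3, 'C': 2, 'D': 1}
print(usage)  # [6, 7, 7, 7]
```

Under these assumptions the intermediate states match the "Present coloring" columns above: 1, 2, 4, 5 after the first sub-tree, 3, 6, 5, 6 after the second, and 5, 7, 7, 7 after the third.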

22 Evaluation settings
• Trends in BGP tables:
  » The number of prefixes is increasing
  » Most prefixes are shorter than 26 bits (~24 bits)
  » Route updates can concentrate in a short period of time; however, they rarely change the shape of the trie
• 50 BGP tables containing from 50K to 135K prefixes

23 Memory requirements
[Figure: panels for level-based mapping, height-based mapping, and CAMP]
• Balanced distribution across stages
• Reduced total memory requirements
  » Memory overhead: 2.4% w/ initial stride 8, 0.02% w/ initial stride 12, 0.01% w/ initial stride 16

24 Updates
• Techniques for handling updates:
  » Single updates are inserted as “bubbles” in the pipeline
  » Rebalancing is computed offline and involves only a subset of tries
• Scenario:
  » Migration between different BGP tables
  » Imbalance leads to a 4% increase in the occupancy of the largest stage

25 Summary
• Analysis of a circular pipeline architecture for trie-based IP lookup
• Goals:
  » Minimize the memory requirement
  » Maximize pipeline utilization
  » Handle updates efficiently
• Design:
  » Decoupling the number of stages from the maximum prefix length
  » LPC analysis
  » Nodes-to-stages mapping heuristic
• Evaluation:
  » On real BGP tables
  » Good memory utilization and the ability to sustain a 40 Gbps line rate with small memory banks

26 Thank you!

27 Addressing the worst case
• Observations:
  » We addressed practical datasets
  » Worst-case tries may have long and skinny sections that are difficult to split
• Idea: adaptive CAMP
  » Split the trie into “parent” and “child” sub-tries
  » Map the parent sub-trie into the pipeline
  » Use more pipeline stages to mitigate the effect of multiple loops around the pipeline

