Presentation is loading. Please wait.

Presentation is loading. Please wait.

Routing Convergence and the Impact of Scale Dan Massey Colorado State University.

Similar presentations


Presentation on theme: "Routing Convergence and the Impact of Scale Dan Massey Colorado State University."— Presentation transcript:

1 Routing Convergence and the Impact of Scale Dan Massey Colorado State University

2 26 October 052massey@cs.colostate.edu Internet Routing and BGP l Internet divided into Autonomous Systems n Large-scale implies maintaining entire topology at a router is not feasible. l BGP is the inter-AS routing protocol. n Router stores the AS path to a destination. n Path allows router to apply policies l How quickly does BGP converge after a change? n Can BGP continue to scale with more growth? n Do we need BGP changes or a new protocol?

3 26 October 053massey@cs.colostate.edu (B A) (C B A) (E D B A) (H G F E A) BGP Path Exploration H BZ D A E C dest. IG Obsolete paths: (C B A), (E D B A) If Z knew [B A] failed, it couldve avoided the obsolete paths Zs Candidate paths: () (C B A) (E D B A) (I H G F A) () (E D B A) (I H G F A) () (I H G F A) F ( )

4 26 October 054massey@cs.colostate.edu Path Exploration and Policy l Internet does not select the shortest path n Policies limit the number of potential paths. n Especially at high level tiers. l Example: Due to routing policy, AS-X (lower tier) sees more alternate paths than AS-Z (tier-1). n Via multiple providers n Via peers Z X P2 Y W P1

5 26 October 055massey@cs.colostate.edu Impact of Topology Growth l Denser connectivity => more alternate paths l Impact depends on policies and tier n Lower tier nodes see more slow convergence MRAI off MRAI on Jan 2, 2004Dec 2, 2004 Beacon prefix 198.32.7.0/24 RV peer ( AS# )#updates#paths#updates#paths 1239 (tier1)444374 12216288711 2914 (tier1)10662797 35571021919839

6 26 October 056massey@cs.colostate.edu Convergence Improvements l MRAI Timer (Deployed Now) n Require minimum time between updates n Typically 30 seconds l Assertion Checking (Proposed in INFOCOM 02) n Signal policy or topological failure in some cases n Discard routes that include failed subpath l Ghost Flushing (Proposed in INFOCOM 03) n When the MRAI timer delays an update, send a withdrawal l Attach Failure Notification (INFOCOM05, CompNet05) n Explicitly list the cause of the failure

7 26 October 057massey@cs.colostate.edu MRAI Rate-Limiting Timer Minimum Route Advertisement Interval (MRAI) timer: Within M=30 seconds, at most one announcement from A to B P1P1 P2P2 P P P As path changes: Msgs from A to B: P1P1 time=0time=30 time=60 P4P4 P b. delay convergence a. suppress transient changes Impact:

8 26 October 058massey@cs.colostate.edu MRAI and Ghost Flushing MRAI prevents removal of stale information Suppose P1 to P5 are increasingly worse Neighbor believes P1 still available until time 30 P1P1 P2P2 P P P As path changes: Msgs from A to B: P1P1 time=0time=30 time=60 P4P4 P w Ghost Flushing: if change to longer path and MRAI applies, send a withdraw w

9 26 October 059massey@cs.colostate.edu Root Cause Notification l The node who detects the failure attaches root cause to msg l Other nodes copy the root cause to outgoing messages (B A) (C B A) (E D B A) (H G F E A) H BZ D A E C IG Zs Candidate paths: F () ( C B A ) ( E D B A ) (I H G F A) ( ), [B A] failure the first msg is enough for Z to remove all the obsolete paths

10 26 October 0510massey@cs.colostate.edu Ghost Flushing Assertion BGP Root Cause Notification Fail-down Simulation Results Fail-down: destination becomes unreachable

11 26 October 0511massey@cs.colostate.edu Ghost Flushing Assertion BGP Root Cause Notification Implication: more redundancy means faster T long convergence Fail-over Simulation Results Fail-over: nodes switch to worse paths

12 26 October 0512massey@cs.colostate.edu Conclusions? (Not Yet!) l Root Cause Approach is Clear Winner n But several non-trivial deployment problems n Not immediately clear we could standardize it. l Ghost-Flushing Does Well in Fail-down n Easily incrementally deployed n But may not work well in Fail-over l MRAI Timer Only n Leaves us with current convergence problems n And the network is getting larger…. n And other complications in large systems….

13 26 October 0513massey@cs.colostate.edu Damping Analysis simulation calculation no damping Convergence Updates Trigger Damping Policies! (could fix if we damped the RCN rather than just updates)

14 26 October 0514massey@cs.colostate.edu But What About Packets? Improving packet delivery is the ultimate goal Ghost Flushing Assertion BGP Root Cause Notification

15 26 October 0515massey@cs.colostate.edu Conclusions l Root Cause Approach Adds Many Benefits n Convergence, dampening, packet delivery, diagnosis,…. l New Routing Designs Should Include RCN n Should be a required part of new routing protocols l Can RCN Be Added to BGP? n Not clear given existing complications n To be continued in IRTF Routing Research Group –Encourage interested researchers to join


Download ppt "Routing Convergence and the Impact of Scale Dan Massey Colorado State University."

Similar presentations


Ads by Google