Published by Baldwin Jordan. Modified over 8 years ago.
1
Routing Convergence
Dan Massey, Colorado State University
2
Acknowledgements
7 November 05, massey@cs.colostate.edu
- Recent convergence improvements and analysis:
  - AT&T Research: Dan Pei
  - U. Minnesota: Jaideep Chandrashekar
- Preliminary data results:
  - UCLA: Rafti Izhak-Ratzin and Lixia Zhang
  - U. Arizona: Beichuan Zhang
  - AT&T Research: Dan Pei
- Any good ideas/interesting results are due to the above
  - Bad ideas/questionable results are due to myself :)
3
Is There a Convergence Problem?
- The ACM/IEEE research community has answered yes…
  - Many measurement studies:
    - "Delayed Internet Routing Convergence", Labovitz, Ahuja, et al., SIGCOMM 01, and many more since then, including more data later in this talk.
  - Many approaches for improving convergence:
    - "Limiting Path Exploration in Path Vector Protocols", Chandrashekar et al., and "Avoiding transient loops during IGP convergence", P. Francois et al., both in INFOCOM 2005
    - And earlier INFOCOM, SIGCOMM, CCR, Computer Networks, etc.
  - Convergence interacts with other events:
    - Route flap dampening (Mao et al. in SIGCOMM, Zhang et al. in ICDCS)
    - Packet delivery (Pei et al. in DSN)
- But little (at best) impact on deployed protocols…
4
Convergence and Growth
- Denser connectivity => more alternate paths
- Impact depends on policies and tier
  - Lower tier nodes see more slow convergence
[Figure legend: MRAI off, MRAI on]
Table: Beacon prefix 198.32.7.0/24
Columns: RV peer (AS#) | Jan 2, 2004: #updates, #paths | Dec 2, 2004: #updates, #paths
Rows:
  1239 (tier1)444374
  12216288711
  2914 (tier1)10662797
  35571021919839
5
Different Events Cause Different Behaviors
- Fail-down has distinctly longer convergence
- New route, change to shorter route, and fail-over to longer route give similar results
- Note: this is preliminary data; overlaps are potential classification problems…
6
Different Sites See Different Behaviors
- Tier-1 prefix viewed from lower tiers: 18% of fail-down events take only 30 seconds
- Tier-1 prefix viewed from other Tier-1s: 40% of fail-down events take only 30 seconds
7
Convergence Side Effects
[Figure legend: simulation, calculation, no damping]
Convergence updates trigger damping policies!
8
Some Research Community Proposals
- MRAI Timer (deployed now)
  - Require a minimum time between updates
  - Can adjust time values (Griffin)
- Assertion Checking (Pei et al., INFOCOM 02)
  - Signal policy or topological failure in some cases
  - Discard routes that include a failed subpath
- Ghost Flushing (Bremler-Barr et al., INFOCOM 03)
  - When the MRAI timer delays an update, send a withdrawal
- Attach Failure Notification (INFOCOM05, CompNet05)
  - Explicitly list the cause of the failure
9
MRAI Rate-Limiting Timer
Minimum Route Advertisement Interval (MRAI) timer: within M = 30 seconds, at most one announcement from A to B
A's path changes: P1, P2, P3, P4, P5
Msgs from A to B: P1 (time=0), P4 (time=30), P5 (time=60)
Impact (you know this…, but compare with Ghost Flushing on the next slide):
a. suppress transient changes
b. delay convergence
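The timeline on this slide can be reproduced with a small simulation. This is a simplified sketch, not router code: `mrai_messages` is a hypothetical helper, a single per-neighbor timer is assumed, and the change times 5, 12, 20 and 40 are made up (the slide gives only the send times 0, 30 and 60).

```python
def mrai_messages(changes, mrai=30):
    """Return the (send_time, path) announcements A sends to B.

    changes: list of (time, path) best-path changes at A.
    mrai:    minimum seconds between two announcements to B.
    """
    sent = []
    expiry = None    # time at which the running MRAI timer expires
    pending = None   # latest path held back while the timer runs

    for t, path in sorted(changes, key=lambda c: c[0]):
        # Handle every timer expiry that happens at or before this change.
        while expiry is not None and expiry <= t:
            if pending is not None:
                sent.append((expiry, pending))   # release the held path
                pending = None
                expiry += mrai                   # timer restarts
            else:
                expiry = None                    # timer goes idle
        if expiry is None:
            sent.append((t, path))               # no timer running: send now
            expiry = t + mrai
        else:
            pending = path                       # overwrite older held path

    if pending is not None:
        sent.append((expiry, pending))           # final release
    return sent


# Assumed change times; P2 and P3 are never announced (transient changes
# suppressed), while P4 and P5 are delayed until the timer expires.
changes = [(0, "P1"), (5, "P2"), (12, "P3"), (20, "P4"), (40, "P5")]
print(mrai_messages(changes))   # [(0, 'P1'), (30, 'P4'), (60, 'P5')]
```

Both impacts show up in the output: only three of the five changes reach B, and B learns about P4 ten seconds after A switched to it.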
10
MRAI and Ghost Flushing
- MRAI prevents removal of stale information
- Suppose P1 to P5 are increasingly worse
- Neighbor believes P1 is still available until time 30
A's path changes: P1, P2, P3, P4, P5
Msgs from A to B: P1 (time=0), P4 (time=30), P5 (time=60), plus withdrawals (w)
Ghost Flushing: if the path changes to a longer one and MRAI delays the announcement, send a withdrawal (w)
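The Ghost Flushing rule can be added to the same kind of timer model. The sketch below is an illustration under assumptions, not the published algorithm in full: `ghost_flushing_messages` is a hypothetical helper, "worse" is modeled as a longer AS path than the one B currently holds, withdrawals are assumed to bypass the MRAI timer, and the change times and paths are made up.

```python
def ghost_flushing_messages(changes, mrai=30):
    """Return (time, kind, path) messages from A to B.

    kind is "announce" or "withdraw"; path is None for withdrawals.
    """
    sent = []
    expiry = None       # expiry time of the running MRAI timer
    pending = None      # latest path delayed by the timer
    advertised = None   # path B currently holds (None after a withdrawal)

    for t, path in sorted(changes, key=lambda c: c[0]):
        while expiry is not None and expiry <= t:
            if pending is not None:
                sent.append((expiry, "announce", pending))
                advertised, pending = pending, None
                expiry += mrai
            else:
                expiry = None
        if expiry is None:
            sent.append((t, "announce", path))
            advertised = path
            expiry = t + mrai
        else:
            pending = path
            # Ghost Flushing: the announcement is delayed and the new path
            # is longer than what B holds, so flush B's stale route now.
            if advertised is not None and len(path) > len(advertised):
                sent.append((t, "withdraw", None))
                advertised = None

    if pending is not None:
        sent.append((expiry, "announce", pending))
    return sent


# Hypothetical paths of increasing length standing in for P1..P5.
P1, P2, P3 = ("A", "D"), ("A", "R1", "D"), ("A", "R1", "R2", "D")
P4 = ("A", "R1", "R2", "R3", "D")
P5 = ("A", "R1", "R2", "R3", "R4", "D")
changes = [(0, P1), (5, P2), (12, P3), (20, P4), (40, P5)]
for time, kind, path in ghost_flushing_messages(changes):
    print(time, kind, path)
```

With these inputs B stops believing in P1 at time 5 instead of time 30, at the cost of two extra withdrawal messages.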
11
Root Cause Notification
- The node that detects the failure attaches the root cause to the message
- Other nodes copy the root cause to outgoing messages
[Figure: topology with nodes A, B, C, D, E, F, G, H, I, Z; labeled paths (B A), (C B A), (E D B A), (H G F E A)]
Z's candidate paths: ( ), ( C B A ), ( E D B A ), (I H G F A)
Message: ( ), [B A] failure
The first message is enough for Z to remove all the obsolete paths
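The pruning step at Z can be sketched in a few lines. The candidate set below is an assumed reading of the figure (including a direct path via B), and `uses_link` and `prune_on_root_cause` are hypothetical helper names; a real design would also have to handle policy changes and concurrent failures, which this sketch ignores.

```python
def uses_link(path, link):
    """True if two adjacent nodes on the path form the given link."""
    a, b = link
    return any({x, y} == {a, b} for x, y in zip(path, path[1:]))


def prune_on_root_cause(candidates, failed_link):
    """Drop every candidate path that traverses the failed link."""
    return [p for p in candidates if not uses_link(p, failed_link)]


# Assumed candidate paths at Z toward A, one per neighbor.
candidates = [
    ("B", "A"),
    ("C", "B", "A"),
    ("E", "D", "B", "A"),
    ("I", "H", "G", "F", "A"),
]

# One message carrying root cause [B A] removes three paths at once.
print(prune_on_root_cause(candidates, ("B", "A")))
# [('I', 'H', 'G', 'F', 'A')]
```

Without the root cause, Z would have to learn about each obsolete path separately as its neighbors withdraw or replace them.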
12
Fail-down Simulation Results
Fail-down: destination becomes unreachable
[Figure: curves for Ghost Flushing, Assertion, BGP, Root Cause Notification]
13
Fail-over Simulation Results
Fail-over: nodes switch to worse paths
[Figure: curves for Ghost Flushing, Assertion, BGP, Root Cause Notification]
Implication: more redundancy means faster T_long convergence
14
But What About Packets?
Don't forget about packet delivery.
[Figure: curves for Ghost Flushing, Assertion, BGP, Root Cause Notification]
15
Applicability of Approaches
- Widely varying degrees of practicality in research solutions
  - Ghost Flushing: local change that can be incrementally deployed
  - RCN: makes many simplifications that don't hold in practice
  - Many other approaches not covered here…
16
So Does Convergence Matter (to RRG)?
- Claim 1: Internet routing convergence is an academic problem with little relevance.
  - Suggests the research community should redirect research activities elsewhere…
- Claim 2: Routing convergence improvements are relevant and needed.
  - RRG suggests adopting some approaches from the research community?
  - RRG states why current approaches are not sufficient
- Informal poll of BGP convergence researchers shows strong interest in the above…