DSN 2003
A Study of Packet Delivery Performance during Routing Convergence
Dan Pei, Lan Wang, Lixia Zhang (UCLA); Dan Massey (USC/ISI); S. Felix Wu (UC Davis)
DSN /24/2003 2/14
Packet Delivery during Routing Convergence
- Failures do occur in the Internet
  - 20% of intra-ISP links have an MTTF < 1 day [Diot:IMW02]
  - 40% of inter-ISP routes have an MTT-Change < 1 day [Labovitz:FTCS-29]
- Routing convergence after a failure takes time
  - IS-IS (intra-ISP protocol): 5+ seconds [Diot:IMW02]
  - BGP (inter-ISP protocol): 3+ minutes [Labovitz:Sigcomm00]
- Packets can be delivered during convergence
[Figure: example topology with nodes A, B, C, D, E, F, G]
Goal of this paper
- How can packet delivery be maximized during routing convergence?
- What is the impact of topological connectivity?
- Protocols studied: RIP, Distributed Bellman-Ford (DBF), BGP
- Previous work focused on preventing loops and on minimizing convergence time and routing overhead
- This problem becomes more important with:
  - Larger Internet topology [Huston01] --> higher frequency of component failures
  - Richer connectivity [Huston01] --> potentially helps by providing more alternate paths
  - Higher bandwidth --> more packets sent during convergence
Outline for the rest of the talk
- Introduction of RIP, DBF and BGP
- Simulation results and lessons learned
- Conclusion
Protocols Examined (I): RIP and DBF
- Both RIP and DBF exchange distance information
  - 30-second refresh interval
  - Damping timer to space out two triggered updates: 1~5 seconds
  - Poison reverse: B sends an infinite distance to A (B's next hop toward D)
- RIP: keeps the shortest path only
  - B's route to D: Nexthop=A, Dist=4
- Distributed Bellman-Ford (DBF): keeps distance info from all neighbors
  - B's route to D: Nexthop=A, Dist=4; Alternate Nexthop=C, Dist=4
[Figure: topology with nodes A, B, C, D, E, F, annotated with distances to D (D:1, D:3, D:2, D:3, D: infinity)]
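The difference between the two protocols can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the simulator used in the study: the function names are hypothetical, each hop is assumed to cost 1, and `INF = 16` is RIP's conventional infinity metric. The advertisements mirror the slide, where A and C each advertise distance 3 to D.

```python
INF = 16  # RIP's conventional "infinity" metric

def rip_route(adverts):
    """RIP-style state: keep only the best (next hop, distance) pair."""
    nexthop = min(adverts, key=lambda n: (adverts[n], n))
    dist = min(adverts[nexthop] + 1, INF)  # assumed hop cost of 1
    return None if dist >= INF else (nexthop, dist)

def rip_after_failure(route, failed_neighbor):
    """Losing the next hop leaves no route until a new update arrives."""
    if route is not None and route[0] == failed_neighbor:
        return None
    return route

def dbf_table(adverts):
    """DBF-style state: remember the distance learned via every neighbor."""
    return {n: min(d + 1, INF) for n, d in adverts.items()}

def dbf_best(table):
    """Pick the closest neighbor among those with a finite distance."""
    usable = {n: d for n, d in table.items() if d < INF}
    if not usable:
        return None
    nexthop = min(usable, key=lambda n: (usable[n], n))
    return (nexthop, usable[nexthop])

def dbf_after_failure(table, failed_neighbor):
    """Drop the failed neighbor and fall back to a stored alternate, if any."""
    remaining = {n: d for n, d in table.items() if n != failed_neighbor}
    return dbf_best(remaining)

def poison_reverse(route, to_neighbor):
    """Distance advertised to a neighbor: infinity if it is our next hop."""
    if route is None or route[0] == to_neighbor:
        return INF
    return route[1]

# B hears distance 3 to D from both A and C, as on the slide.
adverts_at_b = {"A": 3, "C": 3}
rip = rip_route(adverts_at_b)
dbf = dbf_table(adverts_at_b)
print("RIP:", rip, "-> after link B-A fails:", rip_after_failure(rip, "A"))
print("DBF:", dbf_best(dbf), "-> after link B-A fails:", dbf_after_failure(dbf, "A"))
print("B advertises to A:", poison_reverse(rip, "A"))
```

In this sketch the RIP-style node is left without a route when its next hop disappears, while the DBF-style node switches to the stored alternate immediately. Whether that alternate is actually usable is a separate question.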
Protocols Examined (II): BGP
- BGP is similar to DBF, but each route includes the entire path
- BGP: damping timer of 25~35 seconds
- BGP’: damping timer of 1~5 seconds
[Figure: topology with nodes A, B, C, D, E, F; B's route to D shown with "Route via A =" and "Route via C =" (path values not preserved)]
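A path-vector route table can be sketched as follows. This is a simplified illustration, not the BGP decision process: route selection is reduced to shortest path length, the function names are hypothetical, and the advertised paths are made up for the example because the slide's path values were not preserved.

```python
def usable_paths(self_id, adverts):
    """Drop any advertised path that already contains this node (loop check)."""
    return {
        nbr: (self_id,) + tuple(path)
        for nbr, path in adverts.items()
        if self_id not in path
    }

def best_path(self_id, adverts):
    """Simplified selection: shortest path wins, ties broken by neighbor name."""
    paths = usable_paths(self_id, adverts)
    if not paths:
        return None
    nbr = min(paths, key=lambda n: (len(paths[n]), n))
    return paths[nbr]

# Hypothetical advertisements received by B for destination D.
adverts_at_b = {
    "A": ("A", "E", "D"),
    "C": ("C", "B", "A", "E", "D"),  # runs back through B, so B rejects it
}

print("Best path:", best_path("B", adverts_at_b))

# If the session to A is lost, the only remaining advertisement loops through B.
without_a = {n: p for n, p in adverts_at_b.items() if n != "A"}
print("After losing A:", best_path("B", without_a))
```

Carrying the whole path lets B discard C's route as soon as it sees itself in it, which a distance-only protocol cannot do.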
Outline for the rest of the talk
- Introduction of RIP, DBF and BGP
- Simulation results and lessons learned
- Conclusion
Simulations Conducted
- 7 by 7 mesh topologies similar to those in [Baran64]
- Simulated node degrees range from 3 to 16
- Traffic rate: 20 pkts/second
- Measured: packet loss, loops, path convergence time, throughput, and end-to-end (e2e) delay
Packet Losses (I): Observation
- Packet losses of DBF, BGP’ and BGP decrease to zero at degree 6.
- Richer connectivity helps RIP little.
[Figure: packet loss vs. node degree; curves: RIP; DBF, BGP’ and BGP]
Packet Loss (II): Lessons Learned
- Keeping alternate paths: RIP (shortest path only) vs. DBF, BGP (alternates kept)
- Connectivity matters
  - No alternative is immediately available when connectivity is poor and poison reverse is in use
  - An alternative is more likely with richer connectivity
[Figure: two example topologies, each with nodes A, B, C, D, E, F]
Packet Loss (III): Is an alternate path valid?
- Valid alternate paths: those not using the failed link
- Poison reverse and BGP's path information are not enough! [Pei:Infocom2002]
- Richer connectivity -->
  - reduces the impact of any single link
  - better availability of a valid (but possibly suboptimal) path
[Figure: topology with nodes A, B, C, D, E, F, U, V, W, X; labels "C2" and "D:" (values not preserved)]
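The validity condition above can be written as a small check. This is a minimal sketch with hypothetical function names and made-up paths. It assumes the checker already knows which link failed, which a real router may not.

```python
def uses_link(path, link):
    """True if the path traverses the (undirected) link anywhere along it."""
    a, b = link
    return any({u, v} == {a, b} for u, v in zip(path, path[1:]))

def valid_alternates(paths, failed_link):
    """Keep only the candidate paths that avoid the failed link."""
    return [p for p in paths if not uses_link(p, failed_link)]

# Hypothetical candidate paths from B to D, and a hypothetical failed link.
candidates = [
    ("B", "A", "E", "D"),
    ("B", "C", "F", "E", "D"),  # loop-free, but still crosses E-D
    ("B", "C", "F", "D"),
]
failed = ("E", "D")

print(valid_alternates(candidates, failed))
```

The second candidate is loop-free from B's point of view, so a loop check alone would accept it, yet it still depends on the failed link.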
Transient Loops (I): Observation
- BGP has the most loops!
- RIP has no loops
- Richer connectivity reduces the chance of looping.
[Figure: losses due to loops vs. node degree; curves: DBF, BGP’, BGP]
Transient Loops (II): Msg Propagation
- The damping timer slows message propagation, causing looping (figure callout: "30 seconds!")
- Richer connectivity can reduce the chance of looping
- More details in: “A Study of Transient Loops in BGP”
[Figure: topology with nodes A, B, C, D, E, F, U, V, W, X, Y; labels "D:" (values not preserved)]
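The effect of the damping timer on propagation can be bounded with simple arithmetic. The sketch below is a rough model under loud assumptions, not a result from the study: every hop may hold an update for anywhere between zero and the maximum timer value, link latency is a hypothetical parameter, and the hop count is made up. The timer maxima (35 seconds for BGP, 5 seconds for BGP’) and the 20 pkts/second rate are the values given on the earlier slides.

```python
def propagation_delay_bounds(hops, timer_max, per_hop_latency=0.0):
    """(best, worst) time for an update to cross `hops` routers.

    Assumption: each router forwards the update either immediately or
    after holding it for up to `timer_max` seconds.
    """
    best = hops * per_hop_latency
    worst = hops * (per_hop_latency + timer_max)
    return best, worst

def packets_sent(seconds, rate_pps=20):
    """Packets a source emits in `seconds` at `rate_pps` packets/second."""
    return seconds * rate_pps

HOPS = 3  # hypothetical path length
for name, timer_max in (("BGP", 35), ("BGP'", 5)):
    best, worst = propagation_delay_bounds(HOPS, timer_max)
    print(name, "worst-case delay:", worst, "s;",
          "packets sent meanwhile:", packets_sent(worst))
```

Under these assumptions the worst-case window in which stale state can persist grows linearly with the timer, which is consistent with BGP showing more looping than BGP’.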
Conclusion
- A network's ultimate goal is to deliver happy packets, so routing protocols should:
  - Maximize packet delivery during convergence
  - Achieve a good balance between packet delivery AND loop prevention, routing convergence time and routing overhead
  - Utilize the connectivity redundancy
- Future work
  - Apply insights to BGP
  - Study link-state protocols and e2e TCP performance
  - Larger topologies, multiple source/destination (S/D) pairs, multiple failures
Questions?
Instantaneous Throughput
[Figure: throughput (pkts/second) vs. time; curves: RIP, DBF, BGP’, BGP]
Packet Delay During Convergence
Forwarding Path Convergence Time
- BGP: no loss at degree 6 or higher
- Should we still tune the MRAI timer to minimize convergence time (at the risk of increasing overhead)?
- Time until there are no more routing messages: BGP: 70, BGP’: 10
- Time until the forwarding path from S to D stabilizes: BGP: 13, BGP’: 2
[Figure: convergence time vs. node degree]