On Understanding of Transient Interdomain Routing Failures Feng Wang, Lixin Gao, Jia Wang, and Jian Qiu Department of Electrical and Computer Engineering.


1 On Understanding of Transient Interdomain Routing Failures Feng Wang, Lixin Gao, Jia Wang, and Jian Qiu Department of Electrical and Computer Engineering University of Massachusetts, Amherst MA 01002 AT&T Labs-research 180 Park Ave, Florham Park NJ 07869

2 Outline What are transient routing failures? When can transient routing failures occur? How long can transient routing failures last? Measurement results

3 Internet Routing Autonomous systems (ASes) –Internet Service Providers (ISPs) –Companies –Universities Intradomain Routing Protocols –Static Routing, OSPF, IS-IS Interdomain Routing Protocol –Border Gateway Protocol (BGP)

4 Long Convergence Delay Long convergence delay (Labovitz et al., TON 2001) –Bringing a route back (T_up): < shortest path length × MRAI –Disconnecting a route (T_down): < longest path length × MRAI Fail-over: rerouting from Path A to Path B –During the time needed to discover Path B, routers might experience transient routing failures, i.e., no route is available
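The two bounds on this slide can be made concrete with a small calculation. The following is a minimal sketch, reading each bound as a path length (in AS hops) multiplied by the MRAI value. The function name, the example path lengths, and the 30-second default are illustrative assumptions, not values taken from the measurements in this talk.

```python
def convergence_bounds(shortest_path_len, longest_path_len, mrai_seconds=30.0):
    """Return (t_up_bound, t_down_bound) in seconds.

    t_up_bound:   bound on the time to bring a route back,
                  shortest path length x MRAI.
    t_down_bound: bound on the time to disconnect a route,
                  longest path length x MRAI.
    Path lengths are counted in AS hops (an assumption of this sketch).
    """
    if shortest_path_len > longest_path_len:
        raise ValueError("shortest path cannot be longer than longest path")
    return shortest_path_len * mrai_seconds, longest_path_len * mrai_seconds


# Hypothetical topology: shortest path of 3 hops, longest path of 7 hops.
t_up, t_down = convergence_bounds(3, 7)
print(f"T_up bound: {t_up} s, T_down bound: {t_down} s")
```

With these hypothetical inputs the withdrawal bound is more than twice the announcement bound, which is why fail-over events leave room for a window with no route.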

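5 An Example of Transient Routing Failure [Figure: AS0, AS1, AS2, and AS3 with destination d. Legend: traffic on data plane, BGP update, BGP routing table. BGP messages shown: W:20 and A:10. Routing table entries shown: 10, 20, 120, 210. One node is marked as losing reachability.]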

6 Our Contributions Identify transient routing failures –Sufficient conditions Bound transient routing failure duration

7 Outline What are transient routing failures? When can transient routing failures occur? How long can transient routing failures last? Measurement results

8 When Can Transient Routing Failures Occur? Two sufficient conditions under which a node must experience a transient routing failure (transient routing failure for sure). One sufficient condition under which a node may experience a transient routing failure (potential transient routing failure). [Figure: nodes 0, 1, 2, and 3 with paths 10, 20, 210, and 310 and withdrawal messages w.]

9 When Can Transient Routing Failures Occur? (contd.) [Figure: nodes 0, 1, 2, and 3 with paths 10, 20, 210, 310, and 320, a withdrawal message w, and an announcement A.]

10 Outline What are transient routing failures? When can transient routing failures occur? How long can transient routing failures last? Measurement results

11 How Long Do Transient Routing Failures Last? [Figure: nodes 0, 1, and 2 with destination d; paths 10, 120, and 210; BGP messages W:20 and A:10, with the announcement delayed by the MRAI timer.]

12 MRAI Timers Minimum Route Advertisement Interval (MRAI) timer –Minimum amount of time that must elapse between routing updates –Applied to BGP announcement or withdrawal Default MRAI value –eBGP session: 30 seconds –iBGP session: 5 seconds
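The rate-limiting behaviour described on this slide can be sketched as a small state machine. This is a simplified illustration under stated assumptions: one timer per session, no jitter, and a newer pending update replacing an older one. The class and method names are hypothetical and do not correspond to any particular router implementation.

```python
class MraiTimer:
    """Toy per-session MRAI timer: at most one update per interval."""

    def __init__(self, interval_seconds):
        self.interval = interval_seconds
        self.last_sent = None   # time the last update went out
        self.pending = None     # update held back by the timer

    def submit(self, update, now):
        """Offer an update at time `now`.

        Returns the update if it can be sent immediately, otherwise
        holds it (replacing any older pending update) and returns None.
        """
        if self.last_sent is None or now - self.last_sent >= self.interval:
            self.last_sent = now
            return update
        self.pending = update
        return None

    def expire(self, now):
        """Release the pending update once the interval has elapsed."""
        if self.pending is not None and now - self.last_sent >= self.interval:
            update, self.pending = self.pending, None
            self.last_sent = now
            return update
        return None


timer = MraiTimer(30)                 # eBGP default from the slide
print(timer.submit("A:10", now=0))    # sent at once
print(timer.submit("W:20", now=5))    # held back by the timer
print(timer.expire(now=30))           # released when the timer expires
```

In this sketch the second update waits 25 seconds before it leaves the router, which is the kind of delay during which a neighbor may have no usable route.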

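13 Upper Bound for Transient Routing Failure Duration Transient routing failure duration ≤ min(d_u + d_[?]u) × MRAI, where the two d terms are distances involving node u (part of their subscripts was lost in extraction). [Figure: destination 0, node u, and node v, with the distances d labeled along the paths between them.]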

14 Transient Failures in a Typical BGP System A typical BGP system means that every router in the system applies common routing policies. Routing policies are guided by commercial relationships between ASes: –Customer-to-provider –Peer-to-peer Common routing policies: –Import policies are guided by the prefer-customer routing policies. –Export policies are guided by the no-valley routing policies.
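The two common policies named on this slide can be sketched in a few lines. This is a minimal illustration, assuming the usual reading of the rules: under no-valley export, a route learned from a customer may be exported to any neighbor, while a route learned from a peer or provider may be exported only to customers; under prefer-customer import, a customer-learned route wins, with AS path length as a tie-breaker. The constants, function names, and the tie-breaker are assumptions of this sketch, and locally originated routes are omitted.

```python
CUSTOMER, PEER, PROVIDER = "customer", "peer", "provider"


def may_export(learned_from, export_to):
    """No-valley export rule.

    learned_from: relationship of the neighbor the route came from.
    export_to:    relationship of the neighbor we would send it to.
    """
    return learned_from == CUSTOMER or export_to == CUSTOMER


def best_route(routes):
    """Prefer-customer import rule.

    routes: list of (learned_from, as_path) tuples.
    Customer-learned routes rank first; shorter AS paths break ties.
    """
    return min(routes, key=lambda r: (r[0] != CUSTOMER, len(r[1])))


candidates = [(PEER, [2, 0]), (CUSTOMER, [3, 1, 0])]
chosen = best_route(candidates)
print(chosen)                                   # customer route despite longer path
print(may_export(chosen[0], PROVIDER))          # customer routes go to providers
print(may_export(PEER, PEER))                   # peer routes are not sent to peers
```

Because the export rule hides peer and provider routes from other peers and providers, a node can be left without an alternate route even when one exists elsewhere in the topology.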

15 Occurrence of Transient failures in a typical BGP system In a typical BGP system, transient failures are prevalent. –Tier-1 ASes can experience transient routing failures, where alternate routes come from their edge routers. –Non tier-1 ASes can experience transient routing failures, where alternate routes are obtained from other ASes.

16 Outline What are transient routing failures? When can transient routing failures occur? How long can transient routing failures last? Measurement results

17 Measuring Transient Failures within a Tier-1 AS BGP updates, BGP tables, and router configuration files were collected during July 2004. [Figure panels: percentage of transient failures among all routing failures that last less than 30 seconds; cumulative distribution of transient failure duration.]

18 Measuring Transient Failures (contd.) Transient failures in tier-2 ASes, measured using Oregon RouteViews BGP updates (July 2004)

19 Popularity of Prefixes Experiencing Transient Failures We aggregate the Netflow data collected in the tier-1 AS during the week of 1/2/2005 to 1/8/2005. Transient routing failures can affect both popular and unpopular prefixes. [Figure: fraction of transient routing failures by prefix popularity.]

20 Conclusions Transient routing failures are prevalent in the Internet and can last for a significant period of time. The majority of transient failures occur under the commonly applied routing policy setting. Both popular and unpopular prefixes can experience transient failures.

21 Thanks

