1/21 Evaluating Potential Routing Diversity for Internet Failure Recovery *Chengchen Hu, + Kai Chen, + Yan Chen, *Bin Liu *Tsinghua University, + Northwestern.


1 1/21 Evaluating Potential Routing Diversity for Internet Failure Recovery *Chengchen Hu, + Kai Chen, + Yan Chen, *Bin Liu *Tsinghua University, + Northwestern University

2 2/21 Internet Failures
- Failure is part of everyday life in IP networks
  - e.g., 675,000 excavation accidents in 2004 [Common Ground Alliance]
  - Network cable cuts every few days …
- Real-world emergencies or disasters can lead to substantial Internet disruption
  - Earthquakes
  - Storms
  - Terrorist incident: 9.11 event
  - …

3 3/21 Example: Taiwan earthquake incident
- Large earthquakes hit south of Taiwan on 26 December 2006
- Only two of nine cross-sea cables were not affected
- Abundant physical-level connectivity remained, but it took ISPs too long to find and use it.
Page 3 figures cited from "Aftershocks from the Taiwan Earthquakes: Shaking up Internet transit in Asia", NANOG42

4 4/21 How reliable is the Internet?
- The Internet is not as reliable as people expect! [Wu, CoNEXT’07]
  - 32% of ASes are vulnerable to a single critical customer-provider link cut
  - 93.7% of a Tier-1 ISP’s single-homed customers are lost from the peered ISP due to Tier-1 depeering
- Our question: can we find more resources to increase Internet reliability, especially when an Internet emergency happens?

5 5/21 Basic Idea
- Two places where we can find more routing diversity:
  - Internet eXchange Points (IXPs)
    - Co-location facilities where multiple ASes exchange their traffic
    - Participant ASes in an IXP may not be connected via BGP
  - Internet valley-free routing policy
    - AS relationships: customer-provider, peering, sibling
    - Peering relaxation (PR): allow one AS to carry traffic from its peer to its own provider
    - Mentioned in [Wu, CoNEXT’07], but without evaluation
- Our main focus:
  - How much can we gain from these two potential resources, i.e., IXP and PR?
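To make the valley-free rule and its relaxation concrete, here is a minimal Python sketch of a path checker. It is an illustration under stated assumptions, not the authors' evaluation code: the link labels (`c2p`, `p2p`, `p2c`, `s2s`), the function name `is_valley_free`, and the reading of PR as "one uphill hop is allowed immediately after a peer link, at most once per path" are all assumptions made for this example.

```python
from typing import Optional, Sequence

# Hypothetical labels for the relationship crossed at each hop, in travel order.
UP, PEER, DOWN, SIBLING = "c2p", "p2p", "p2c", "s2s"


def is_valley_free(links: Sequence[str], allow_pr: bool = False) -> bool:
    """Check a path given as a sequence of per-hop AS relationships.

    Standard valley-free shape: zero or more uphill (customer-to-provider)
    hops, at most one peer hop, then zero or more downhill hops.
    With allow_pr=True (an assumed reading of peering relaxation), one
    uphill hop directly after a peer hop is tolerated, once per path.
    """
    descending = False          # True after a peer or downhill hop
    pr_used = False             # the relaxation may be used only once
    prev: Optional[str] = None  # last non-sibling relationship seen
    for rel in links:
        if rel == SIBLING:
            continue            # sibling hops do not change the phase
        if rel == UP:
            if descending:
                if allow_pr and prev == PEER and not pr_used:
                    pr_used = True
                    descending = False   # traffic is climbing again
                else:
                    return False         # a valley: down (or peer) then up
        elif rel == PEER:
            if descending:
                return False             # peer hop after the peak
            descending = True
        elif rel == DOWN:
            descending = True
        else:
            raise ValueError(f"unknown relationship: {rel}")
        prev = rel
    return True
```

For example, `is_valley_free([PEER, UP])` is rejected under the standard policy but accepted with `allow_pr=True`, which is the extra routing diversity that PR is meant to unlock.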

6 6/21 Dataset for Evaluation
- Most complete AS topology graph
  - BGP data
    - Route Views, RIPE/RIS, Abilene, CERNET BGP View
  - P2P traceroute
    - Traceroute data from 992,000 IPs in over 3,700 ASes
  - In total, 120K AS links with AS relationships
  - http://aqualab.cs.northwestern.edu/projects/SidewalkEnds.html [Chen et al, CoNEXT’09]
- IXP data
  - PCH + Peeringdb + Euro-IX (~200 IXPs)
  - 3468 participant ASes

7 7/21 Failure Models
- Tier-1 depeering
  - Real example: Cogent and Level3 depeering
- Tier-1 provider-customer link teardown
  - Reported in NANOG forum
- Mixed types of link breakdown
  - 9.11 event, Taiwan earthquakes, 2003 Northeast blackout

8 8/21 Evaluation Metrics
- Recovery Ratio
  - # of recovered AS pairs versus total # of affected AS pairs
- Path Diversity
  - # of increased link-disjoint AS paths between affected AS pairs
- Shifted Path
  - # of link-disjoint AS paths shifted onto a normal link after we use IXP or PR resources
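The recovery ratio can be illustrated with a small Python sketch. This is a simplification under assumptions, not the evaluation used in the talk: it treats the AS graph as undirected and counts a pair as connected whenever any path exists, ignoring routing policy. The names `connected_pairs` and `recovery_ratio` are hypothetical.

```python
from collections import deque
from itertools import combinations


def connected_pairs(nodes, links):
    """Return the set of unordered node pairs joined by some path."""
    adj = {n: set() for n in nodes}
    for a, b in links:
        adj[a].add(b)
        adj[b].add(a)
    seen, pairs = set(), set()
    for start in nodes:
        if start in seen:
            continue
        # Breadth-first search collects one connected component.
        component, queue = [start], deque([start])
        seen.add(start)
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    component.append(v)
                    queue.append(v)
        pairs.update(frozenset(p) for p in combinations(component, 2))
    return pairs


def recovery_ratio(nodes, links, failed, extra):
    """Recovered AS pairs divided by affected AS pairs.

    links  : links present before the failure
    failed : links removed by the failure
    extra  : links made usable afterwards (e.g. via IXP or PR)
    """
    links = {frozenset(l) for l in links}
    failed = {frozenset(l) for l in failed}
    extra = {frozenset(l) for l in extra}
    before = connected_pairs(nodes, links)
    after_failure = connected_pairs(nodes, links - failed)
    affected = before - after_failure
    if not affected:
        return 0.0  # nothing was lost, so nothing to recover
    after_recovery = connected_pairs(nodes, (links - failed) | extra)
    return len(affected & after_recovery) / len(affected)
```

As a toy case, take four ASes A, B, C, D with links A-B, B-C, B-D. If B-C and B-D fail, five pairs are affected; adding a single extra link A-C recovers two of them (A-C and B-C), giving a ratio of 0.4.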

9 9/21 Results: Tier-1 Depeering
- 36 experiments for 9 Tier-1 ASes
- Recovery ratio: most of the lost AS pairs can be recovered

10 10/21 Results: Tier-1 Depeering
- Path diversity: multiple AS paths between lost AS pairs

11 11/21 Results: Tier-1 Depeering
- Shifted path
  - On average, 3.75 ~ 17.2 for all 36 experiments
  - Moderate traffic load shifted onto the unaffected links

12 12/21 Economic model
- B pays A for recovery
- Business model
  - Risk alliance (like airlines): price is determined beforehand
  - Pay by bandwidth and duration, or by bits (95th percentile)
[Figure: A and B connected as peers, as provider-customer (P-C), and through an IXP]

13 13/21 Communication channel
- Search for peers
  - Have direct connections to peers
- Search for co-located ASes in the same IXP
  - ASes are connected by switches in modern IXPs
  - Messages are broadcast with the help of the switches
- Message confidentiality with public-key cryptography

14 14/21 Automatic communication
- Query message (failed AS)
  - Who is connected to specific destination ASes?
- Reply message (surviving AS)
  - I can provide BW1 bandwidth to the destination AS
- ACK (failed AS)
  - I would like to buy BW2 (<= BW1)
- Set up BGP sessions
- Withdraw BGP sessions
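The query / reply / ACK exchange above could be modeled roughly as follows. This Python sketch is only an illustration: the message classes, their field names, and the helper functions `answer` and `acknowledge` are hypothetical, and no wire format, broadcast mechanism, or encryption is implied by the slides beyond what is listed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    """Sent by the failed AS: who can reach these destination ASes?"""
    failed_as: int
    destinations: tuple


@dataclass(frozen=True)
class Reply:
    """Sent by a surviving AS: it can provide bw1 towards one destination."""
    helper_as: int
    destination: int
    bw1: float


@dataclass(frozen=True)
class Ack:
    """Sent by the failed AS: it wants to buy bw2 (<= bw1) from the helper."""
    failed_as: int
    helper_as: int
    destination: int
    bw2: float


def answer(query, helper_as, spare):
    """Surviving AS side: one Reply per queried destination with spare bandwidth.

    spare maps destination AS -> bandwidth the helper is willing to offer.
    """
    return [Reply(helper_as, d, spare[d])
            for d in query.destinations if spare.get(d, 0) > 0]


def acknowledge(query, reply, demand):
    """Failed AS side: request no more than is offered and no more than needed."""
    bw2 = min(demand[reply.destination], reply.bw1)
    return Ack(query.failed_as, reply.helper_as, reply.destination, bw2)
```

Setting up and later withdrawing the BGP sessions would follow the ACK; that step is outside the scope of this sketch.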

15 15/21 Check available connectivity & bandwidth
- Connectivity
  - traceroute
- Available bandwidth
  - Maximum capacity is already known
  - Estimate the amount that has been used
    - Y. Zhang, M. Roughan, N. Duffield, and A. Greenberg, “Fast Accurate Computation of Large-Scale IP Traffic Matrices from Link Loads,” ACM SIGMETRICS, 2003.
  - Subtract the used amount from the maximum capacity

16 16/21 Optimal selection of helper ISPs
- From a single victim ISP perspective
  - Buy transit from a minimal number of ASes
  - Recover all the (prioritized) traffic
  - Least cost

17 17/21 Selection heuristic Lost connectivity to {Di}, with bandwidth demand {Bi} is how much bandwidth AS j could provide to Di;

18 18/21 Selection heuristic Lost connectivity to {Di}, with bandwidth demand {Bi} Score each (helper) AS j with Select the AS with largest score (select the one with lowest price if same score) 3 2.3 5 2.1

19 19/21 Selection heuristic Update Lost connectivity to {Di}, with bandwidth demand {Bi} updated

20 20/21 Selection heuristic rescore and select Lost connectivity to {Di}, with bandwidth demand {Bi} 1 0.3 0.1 0

21 21/21 Summary
- First work to evaluate the potential routing diversity via IXP and PR with the most complete AS topology graph.
- 40%-80% of affected AS pairs can be recovered via IXP and PR with multiple paths and moderate shifted paths.
- Points out a new avenue for Internet failure recovery.
- Possible and practical mechanisms to utilize potential routing diversity.
- Look forward to feedback and collaborations from IXPs/ISPs!

22 22/21 Thank you! Q&A

23 23/21 Backup

24 24/21 Failure Models
- Tier-1 depeering
  - Real example: Cogent and Level3 depeering
- Tier-1 provider-customer link teardown
  - Reported in NANOG forum
- Mixed types of link breakdown
  - 9.11 event, Taiwan earthquakes, 2003 Northeast blackout

25 25/21 Results: Tier-1 provider-customer links teardown
- Recovery ratio
- Path diversity
  - 4.64 for 10 Tier-1 provider-customer links teardown
  - 4.54 for 20 Tier-1 provider-customer links teardown
- Shifted path
  - The average numbers of shifted paths when 10, 20 and 30 links are damaged are 3.4, 4.0 and 4.2, respectively.

26 26/21 Results: Mixed types of links breakdown
- Taiwan earthquake, 9 big victim ASes
- Recovery ratio

27 27/21 Results: Mixed types of links breakdown
- Path diversity

28 28/21 Results: Mixed types of links breakdown
- Shifted path

29 29/21 System framework
- Adding an Emergency Recovery (ER) module in a router’s control plane
- Setting up the communications between ER and the Intra-TE Resource Management modules.

30 30/21 Building communication channel
- An example

31 31/21 Optimal selection of ISPs to help
- From global view
  - Min. shifted paths or tuned AS-links
    - s.t. recover all the (prioritized) traffic we could, or
  - Max. recovery ratio
    - s.t. shifted paths or tuned AS-links
- From a single ISP
  - Min. cost for the ISP
    - s.t. recover all the (prioritized) traffic we could, or
  - Max. recovery ratio
    - s.t. cost for the ISP

32 32/21 Selection heuristic
- Lost connectivity to { Di }, with bandwidth demand { Bi }
- is how much bandwidth AS j could provide to Di ;
- Score each (helper) AS j with
- Select the helper AS with largest score (select the one with lowest price if same score)
- Update { Di } by deleting the recovered AS
- Update { Bi } by subtracting the recovered bandwidth
- Rescore and select the next helper AS
- Iterate until all are recovered
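The iterative selection above can be sketched as a greedy loop in Python. One important caveat: the scoring formula did not survive in this transcript, so the score used here (the total bandwidth a helper could recover, summing min(offer, remaining demand) over destinations) is an assumption chosen for illustration, not the authors' formula. The function name `select_helpers` and the dictionary layout are likewise hypothetical.

```python
def select_helpers(demand, offers, price):
    """Greedy helper selection under an ASSUMED score.

    demand : {destination: B_i}, bandwidth still needed per destination
    offers : {helper: {destination: bandwidth helper j can provide to D_i}}
    price  : {helper: cost}, used only to break score ties
    Returns (chosen helpers in order, unmet demand).
    """
    remaining = dict(demand)
    candidates = set(offers)
    chosen = []

    def score(j):
        # Assumed score: total bandwidth helper j would actually recover.
        return sum(min(offers[j].get(d, 0), need)
                   for d, need in remaining.items())

    while remaining and candidates:
        # Largest score wins; the lowest price breaks ties.
        best = max(sorted(candidates), key=lambda j: (score(j), -price[j]))
        if score(best) <= 0:
            break  # no remaining helper can recover anything
        chosen.append(best)
        candidates.remove(best)
        for d in list(remaining):
            # Subtract the recovered bandwidth; drop fully recovered destinations.
            remaining[d] -= min(offers[best].get(d, 0), remaining[d])
            if remaining[d] <= 0:
                del remaining[d]
    return chosen, remaining
```

With demand `{"D1": 10, "D2": 5}`, offers `X: {D1: 10}`, `Y: {D1: 4, D2: 5}`, `Z: {D2: 5}` and prices `X=3, Y=2, Z=1`, the first round picks X (score 10). In the second round Y and Z tie at 5, so the cheaper Z is chosen and all demand is met.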

