BGP Inefficiencies Supplemental slides 02/14/2007 Aditya Akella.

BGP Complexity
BGP is a very complicated protocol
–Too many knobs
–Needs to accommodate (sub-optimal) ISP policies
–Requires complex, human configuration
For all its complexity, BGP offers no guarantees
–Performance??
–Reliability??
–Correctness??
–Reachability??
All of BGP's complexity begets… Headache!

BGP Pitfalls and Problems
Pitfalls and problems
–Misconfiguration
–Convergence
–Performance
–Reliability
–Stability
–Security
–And the list goes on…

Favorite Scapegoat!
[Figure labels: BGP; Networking community]

Misconfiguration [Mahajan02Sigcomm]
Origin misconfiguration: accidentally inject routes for prefixes into global BGP tables
Type | Old route | New route
Self deaggregation (failure to aggregate) | a.b.0.0/16 X Y Z | a.b.c.0/24 X Y Z
Related origin (likely connected to the network; human error) | a.b.0.0/16 X Y Z | a.b.0.0/16 X Y; a.b.0.0/16 X Y Z O; a.b.c.0/24 X Y; a.b.c.0/24 X Y Z O
Foreign origin (address space hijack!) | a.b.0.0/16 X Y Z | a.b.0.0/16 X Y O; a.b.c.0/24 X Y O; e.f.g.h/i X Y O
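A minimal sketch of how the three origin categories in the table could be told apart, assuming the origin AS is the last entry of the AS path. The function name, the "same origin" fallback, and the use of 10.1.0.0/16 in place of a.b.0.0/16 are illustrative choices, not taken from [Mahajan02Sigcomm].

```python
import ipaddress

def classify_origin_change(old_prefix, old_path, new_prefix, new_path):
    """Compare a newly announced route against the existing one.

    Paths are lists of AS names; the origin AS is assumed to be the last entry.
    """
    old_net = ipaddress.ip_network(old_prefix)
    new_net = ipaddress.ip_network(new_prefix)
    old_origin, new_origin = old_path[-1], new_path[-1]

    if new_origin == old_origin:
        # Same origin announcing a more-specific piece of its own block.
        if new_net != old_net and new_net.subnet_of(old_net):
            return "self deaggregation"
        return "same origin"  # illustrative fallback, not a category in the table
    # The new origin was already on the old path, or the old origin is on the new one.
    if new_origin in old_path or old_origin in new_path:
        return "related origin"
    return "foreign origin"

old = ("10.1.0.0/16", ["X", "Y", "Z"])
print(classify_origin_change(*old, "10.1.2.0/24", ["X", "Y", "Z"]))
print(classify_origin_change(*old, "10.1.0.0/16", ["X", "Y", "Z", "O"]))
print(classify_origin_change(*old, "10.1.2.0/24", ["X", "Y", "O"]))
```

The three calls correspond to one row from each category of the table.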

Misconfiguration
Export misconfiguration: export route to a peer in violation of policy
Export | Policy violation
Provider → AS → Provider | Route exported to provider was imported from a provider
Provider → AS → Peer | Route exported to peer was imported from a provider
Peer → AS → Provider | Route exported to provider was imported from a peer
Peer → AS → Peer | Route exported to peer was imported from a peer
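The four violations listed above share one pattern: a route learned from a provider or peer is re-exported to a provider or peer. A small sketch of that check follows; the relationship labels and function name are assumptions made for illustration.

```python
# Neighbor relationships for which re-export is restricted.
NON_CUSTOMER = {"provider", "peer"}

def is_export_violation(imported_from: str, exported_to: str) -> bool:
    """True when a route imported from a provider/peer is exported to a provider/peer."""
    return imported_from in NON_CUSTOMER and exported_to in NON_CUSTOMER

for src in ("provider", "peer", "customer"):
    for dst in ("provider", "peer", "customer"):
        print(f"{src} -> AS -> {dst}: violation={is_export_violation(src, dst)}")
```

Routes imported from a customer, and routes exported to a customer, never trip the check, which matches the absence of those combinations from the list.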

Interesting Observations
Origin misconfig
–72% of new routes may be misconfig
–11-13% of misconfig incidents affect connectivity
  Pings and checks
–Self de-aggregation is the main cause
Export misconfig
–Up to 500 misconfiguration incidents per day
–All forms are prevalent, although provider-AS-provider is more likely

Effects and Causes
Effects
–Routing load
–Connectivity disruption
–Extra traffic
–Policy violation
Causes (origin misconfig)
–Router vendor software bugs: announce and withdraw routes on reboot
–Reliance on upstream filtering
–New configuration not saved to stable storage (separate command and no autosave!)
–Hijacks of address space
–Forgetting to install a filter
–Human operators and poor interfaces
Export misconfig
[Figure: topology with ASes P1, P2, A, and C]
–Intended policy: provide transit to C through link A-C
–Configured policy: export all routes originated by C to P1 and P2
–Correct policy: export only when the AS path is “C”
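The difference between the configured and the correct export policy can be shown with two filters applied at A. This is a sketch under assumptions: AS paths are lists ending at the origin, and the path ["P2", "C"] (C's route arriving at A by way of P2) is a hypothetical input chosen to expose the difference.

```python
def configured_filter(as_path):
    """Configured policy: export every route originated by C."""
    return bool(as_path) and as_path[-1] == "C"

def correct_filter(as_path):
    """Correct policy: export only when the AS path is exactly "C"."""
    return as_path == ["C"]

direct = ["C"]           # learned over the A-C link
via_p2 = ["P2", "C"]     # hypothetical: C's route learned from P2

for path in (direct, via_p2):
    print(path, "configured:", configured_filter(path), "correct:", correct_filter(path))
```

Under the configured filter, A would export the route learned from P2 to P1, which is the provider-AS-provider violation; the correct filter exports only the direct route.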

BGP Convergence [Labovitz00Sigcomm]
Conventional beliefs
–Path vector converges faster than traditional DV (eliminates the count-to-infinity problem)
–Internet path restoration takes on the order of tens of seconds
Convergence
–Recovery after a fault may take as long as ten minutes
–A single routing fault could result in multiple announcements and withdrawals
–Loss and RTT around times of faults are much worse
Upon route withdrawal, explore paths of increasing length
–In the worst case, could explore n! paths
–Depends on which messages are processed and when
A minimum interval between update messages could reduce the number of messages
–Forces all outstanding messages to be processed
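To give a feel for the factorial growth, the sketch below enumerates the loop-free candidate paths between two ASes in a full mesh, shortest first. It is only a counting illustration under an assumed full-mesh topology, not a simulation of BGP; how many of these a router actually explores depends on message ordering, as noted above.

```python
from itertools import permutations

def candidate_paths(n):
    """All simple paths from AS 1 to AS n in a full mesh of n ASes, shortest first."""
    middle = range(2, n)  # ASes that may appear between source and destination
    paths = []
    for k in range(len(middle) + 1):
        for via in permutations(middle, k):
            paths.append((1, *via, n))
    return paths

for n in range(2, 9):
    print(n, len(candidate_paths(n)))
```

The count grows roughly like (n-2)! times a constant, so even a modest mesh offers a very large number of increasingly long alternatives.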

End-to-End Routing Behavior [Paxson96Sigcomm]
Large-scale routing behavior as seen by end-hosts, based on analysis of traceroutes
Pathologies: persistent routing loops, routing failures, and long connectivity outages
Stability: 9% of routes changed every tens of minutes, 30% about every 6 hours, and 68% every few days
Symmetry: more than half of the paths probed were asymmetric at the router level

Inefficiencies in BGP & Internet Routing
Route convergence and oscillations
Poor reliability
–No way to exploit redundancy in Internet paths
Inefficiency: sub-optimal RTTs and throughputs
–What are some of the causes?
  Policies in routing: inter-domain and intra-domain
  Lack of direct routes, “sparseness” of the Internet graph

Inefficiency of Routes [Spring03Sigcomm]
Three classes of reasons for poor performance (“inflation”)
–Intra-domain topology and policy
  Topology: no direct link between all cities
  Routing policy: “shortest paths” may be avoided due to engineering
–ISP peering
  Peering topology: limited peering between ISPs
  Peering policy: hot-potato routing or early-exit routing
–Inter-domain
  Topology: AS graph is sparse
  Inter-domain policies: policies are policies

Path Inflation Summary

Internet Bottlenecks
[Figure labels: high-speed “core”; big, fat pipe(s); slow, flaky home connection; 100Mbps home connection]
Last-mile, slow access links limit transfer bandwidth
Most bottlenecks are last-mile
As access technology improves… non-access or wide-area bottlenecks?

Wide-Area Bottlenecks
[Figure: wide-area Internet / high-speed “core” made up of Sprint, ATT, UUNet, and many small, very small, and tiny ISPs; an unconstrained TCP flow crosses it, with the link with the least available bandwidth marked]
Not the “traditional” bottlenecks → may not be congested
Wide-area bottleneck → where an unconstrained TCP flow sees delays and losses

Measurement Tool: BFind
[Figure: path from source to dest]
Ideally… monitor queues, identify where queues build up → bottleneck
But no control over destination
Emulate the whole process from the source!

Measurement Tool: BFind
[Figure: source sends a rate-controlled UDP stream to dest; rounds of traceroutes monitor links for queueing and report to the UDP process]
Round 1: 1Mbps. No queueing! Rate for round 2: 1+δ Mbps
Round 2: No queueing! Rate for round 3: 1+2δ Mbps
Round j: Queueing on #2! Flag #2, keep current rate for round j+1 → force queueing
If #2 flagged too many times → quit. Identify #2 as bottleneck
BFind functions like TCP: gradually increases send rate until it hits a bottleneck
Can identify key properties of the bottleneck
–Location, latency, available bandwidth (== send rate of BFind before quitting)
–Single-ended control
Quits after 180s and before send rate hits 50Mbps
BFind validation: wide-area experiments and simulations
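The round structure described above can be sketched as a small control loop. Everything numeric here other than the 1 Mbps start and the 50 Mbps cap is an assumption: the step size, the flag threshold, and the use of a round count in place of the 180 s limit are placeholders, and `make_path` is a toy stand-in for the traceroute-based queueing detection.

```python
def bfind(observe_queueing, start_mbps=1.0, step_mbps=1.0, max_flags=3,
          max_rate_mbps=50.0, max_rounds=180):
    """Raise the send rate until some hop is flagged for queueing often enough.

    observe_queueing(rate) returns the hop number showing queueing, or None.
    Returns (bottleneck_hop or None, send rate when the loop stopped).
    """
    rate = start_mbps
    flags = {}
    for _ in range(max_rounds):          # assumed stand-in for the 180 s limit
        if rate >= max_rate_mbps:        # stop before exceeding the rate cap
            break
        hop = observe_queueing(rate)
        if hop is None:
            rate += step_mbps            # no queueing: probe harder next round
            continue
        flags[hop] = flags.get(hop, 0) + 1   # queueing: flag hop, keep the rate
        if flags[hop] >= max_flags:
            return hop, rate
    return None, rate

def make_path(available_mbps):
    """Toy path: a hop queues once the send rate exceeds its available bandwidth."""
    def observe(rate):
        for hop, avail in enumerate(available_mbps, start=1):
            if rate > avail:
                return hop
        return None
    return observe

print(bfind(make_path([100.0, 4.5, 30.0])))
```

On the toy path, hop 2 is reported along with the send rate at which it was repeatedly flagged, mirroring how the slide equates available bandwidth with the send rate before quitting.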

Results: Location
Intra-ISP links (49% of bottlenecks)
Tier | %bottlenecks | %all links
Tier 4 | 3% | 1%
Tier 3 | 9% | 8%
Tier 2 | 12% | 13%
Tier 1 | 25% | 63%
Inter-ISP links (51% of bottlenecks)
Tiers | %bottlenecks | %all links
Tier 4 – 4, 3, 2, 1 | 14% | 1%
Tier 3 – 3, 2, 1 | 17% | 3%
Tier 2 – 2, 1 | 12% | 4%
Tier 1 – 1 | 8% | 6%
[Figure: a path with two peering links and four intra-ISP links]
–Peering link: one of the two peering links with 50% chance; probability of being the bottleneck = 0.25
–Intra-ISP link: one of the four non-peering links with 50% chance; probability of being the bottleneck = 0.125
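The two probabilities follow from splitting each class's 50% share evenly across the links in that class. A short check of the arithmetic, assuming (as the slide's example does) that links within a class are equally likely to be the bottleneck:

```python
def per_link_probability(class_share, links_in_class):
    """Chance that one specific link is the bottleneck, given its class's share."""
    return class_share / links_in_class

peering = per_link_probability(0.5, 2)    # two peering links on the path
intra_isp = per_link_probability(0.5, 4)  # four non-peering links on the path
print(peering, intra_isp, peering / intra_isp)
```

So although the two classes account for about the same share of bottlenecks, an individual peering link in this example is twice as likely to be the bottleneck as an individual intra-ISP link.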

Results: Available Bandwidth
Intra-ISP links
–Tier-1 ISPs are the best
–Tier-3 ISPs have slightly higher available bandwidth than tier-2
Inter-ISP links
–Tier-1 to tier-1 peering is the best
–Peering involving tiers 2 and 3 is similar

Performance: End-to-End Perspective
From an end-to-end view…
–Is there a way of extracting better performance?
  Is there scope?
  How do we realize this?
Scope: Savage99, CMU multihoming work
Reality: UW’s “Detour” system, MIT’s RON, Akamai’s SureRoute, CMU’s route control implementation

Quantifying Performance Loss [Savage99Sigcomm]
Measure round trip time (RTT) and loss rate between pairs of hosts
Alternate path characteristics
–30-55% of hosts had lower latency
–10% of alternate routes have 50% lower latency
–75-85% have lower loss rates

Bandwidth Estimation
RTT & loss for multi-hop path
–RTT by addition
–Loss either worst or combination of hops – why?
  Large number of flows → combination of probabilities
  Small number of flows → worst hop
Bandwidth calculation
–TCP bandwidth is based primarily on loss and RTT
70-80% of paths have better bandwidth
10-20% of paths have 3x improvement
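A sketch of the composition rules above for a multi-hop alternate path. The slide does not say which TCP model is used; the bandwidth function below uses the well-known Mathis et al. approximation, rate ≈ (MSS/RTT)·sqrt(3/2)/sqrt(p), purely as an assumed stand-in, and all function names are illustrative.

```python
import math

def path_rtt(hop_rtts):
    """RTT by addition."""
    return sum(hop_rtts)

def loss_combined(hop_losses):
    """Independent per-hop losses: 1 - product of (1 - p_i)."""
    return 1.0 - math.prod(1.0 - p for p in hop_losses)

def loss_worst(hop_losses):
    """Pessimistic-hop view: the path loses as much as its worst hop."""
    return max(hop_losses)

def tcp_bandwidth(mss_bytes, rtt_s, loss):
    """Assumed model (Mathis et al.): bytes/s ~ (MSS / RTT) * sqrt(1.5 / p)."""
    return (mss_bytes / rtt_s) * math.sqrt(1.5 / loss)

rtts, losses = [0.02, 0.03], [0.01, 0.02]
rtt = path_rtt(rtts)
for name, p in (("combined", loss_combined(losses)), ("worst", loss_worst(losses))):
    print(name, round(p, 4), round(tcp_bandwidth(1460, rtt, p)), "bytes/s")
```

Because the combined loss is never smaller than the worst-hop loss, the combined rule always yields the more conservative bandwidth estimate.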

Possible Sources of Alternate Paths
A few really good or bad ASes
–No, benefit of top ten hosts not great
Better congestion or better propagation delay?
–How to measure?
  Propagation = 5th percentile of delays
–Both contribute to improvement of performance
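One way to apply the 5th-percentile rule to a set of RTT samples is sketched below. The nearest-rank percentile and the choice to treat "mean minus propagation" as the congestion component are assumptions for illustration; the slide only specifies the 5th percentile as the propagation estimate.

```python
import math

def percentile_nearest_rank(samples, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(len(ordered) * pct / 100))
    return ordered[rank - 1]

def split_delay(rtt_samples):
    """Return (propagation estimate, assumed congestion component) in the samples' units."""
    propagation = percentile_nearest_rank(rtt_samples, 5)
    mean = sum(rtt_samples) / len(rtt_samples)
    return propagation, mean - propagation

samples_ms = [10] * 19 + [50]   # hypothetical RTT samples with one congested probe
print(split_delay(samples_ms))
```

Comparing the two components between the default path and an alternate path indicates whether the alternate wins on propagation delay, on congestion, or on both.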

Overlay Networks
Basic idea:
–Treat multiple hops through IP network as one hop in overlay network
–Run routing protocol on overlay nodes
Why?
–For performance – like the Savage 99 paper showed
–For efficiency – can make core routers very simple
  E.g. CSFQ
  Also aids deployment, e.g. active networks
–For functionality – can provide new features such as multicast, active processing

Future of Overlay
Application-specific overlays
–Why should overlay nodes only do routing?
Caching
–Intercept requests and create responses
Transcoding
–Changing content of packets to match available bandwidth
Peer-to-peer applications

Overlay Challenges
“Routers” no longer have complete knowledge about the links they are responsible for
How do you build an efficient overlay?
–Probably don’t want all N^2 links – which links to create?
–Without direct knowledge of the underlying topology, how to know what’s nearby and what is efficient?
Do we need overlays for performance?

Number of Route Choices
Flexible control of end-to-end path → many route choices
BGP: one path via each ISP → choices linked to #ISPs
Few more route choices…?
[Figure labels: multiple candidate paths; single path; multiple BGP paths]

Route Selection Mechanism
BGP: simple, coarse metrics such as least AS hops, policy
Overlays: complex, performance-oriented selection
“Multihoming route control”: sophisticated (smart) selection among multiple BGP routes
[Figure labels: best performing path; least AS hops; policy compliant; current best performing BGP path]
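The contrast between the coarse BGP metric and performance-oriented route control can be sketched with a multihomed site choosing among one BGP route per ISP. The AS paths and RTT figures are made up for illustration, and real BGP applies further steps (local preference and others) before AS-path length; only the AS-hop comparison named on the slide is modeled here.

```python
from typing import NamedTuple, Tuple

class Route(NamedTuple):
    isp: str
    as_path: Tuple[str, ...]
    rtt_ms: float  # measured end-to-end RTT over this route (hypothetical values)

def bgp_choice(routes):
    """Coarse metric: fewest AS hops."""
    return min(routes, key=lambda r: len(r.as_path))

def route_control_choice(routes):
    """Multihoming route control: best measured performance among the BGP routes."""
    return min(routes, key=lambda r: r.rtt_ms)

routes = [
    Route("Sprint", ("Sprint", "AS-D"), 80.0),
    Route("ATT", ("ATT", "AS-T", "AS-D"), 45.0),
    Route("Genuity", ("Genuity", "AS-U", "AS-D"), 60.0),
]
print("BGP picks:", bgp_choice(routes).isp)
print("Route control picks:", route_control_choice(routes).isp)
```

Both selectors choose only among the routes the ISPs already offer, which is what distinguishes route control from an overlay that can construct additional paths.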

Overlay Routing vs. Multihoming Route Control
Aspect | Route Control | Overlay Routing
Cost | Sprint $$, Genuity $$, ATT $$: connectivity fees | Overlay provider $$, ATT $$: connectivity fees + overlay fee
Operational issues | /18 netblock, announce /20 sub-blocks to ISPs; if all multihomed ends do this: routing table expansion | Overlay node forces intermediate ISP to provide transit: bad interactions with policies
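The routing table expansion from announcing /20 sub-blocks of a /18 can be counted directly with Python's `ipaddress` module. The 10.0.0.0/18 block is a placeholder, since the slide gives no concrete prefix.

```python
import ipaddress

def deaggregate(netblock, new_prefix):
    """Split a netblock into equal sub-blocks of the given prefix length."""
    return list(ipaddress.ip_network(netblock).subnets(new_prefix=new_prefix))

sub_blocks = deaggregate("10.0.0.0/18", 20)   # placeholder /18
for block in sub_blocks:
    print(block)
print(f"1 aggregate route becomes {len(sub_blocks)} routes")
```

Each multihomed site that does this replaces one table entry with four, which is the expansion the operational-issues row warns about.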