Understanding and Limiting BGP Instabilities Zhi-Li Zhang Jaideep Chandrashekar Kuai Xu



BGP: Internet Glue. A “path-vector” routing protocol. It allows networks to tell other networks about the destinations that they are “responsible” for and how to reach them, using “route advertisements”, also called “NLRI” (“network-layer reachability information”).

BGP: Internet Glue (cont’d). Policy-based: allows ISPs to richly express their routing policy, both in selecting outbound paths and in announcing internal routes. A relatively “simple” protocol, but configuration is complex, and the entire world can see, and be impacted by, misconfigurations.

ASes & AS Numbers (ASNs). An autonomous system is an independent routing domain that has been assigned an Autonomous System Number (ASN). Currently over 15,000 are in use; a reserved range of ASNs is “private”. Examples:
AS 57: U of Minnesota GigaPoP
AS 217: U of Minnesota
AS 701: UUNET
AS 1239: Sprint
ASNs represent atoms of BGP routing policy.

[Figure: Internet Connectivity of University of Minnesota. Networks shown: AS 1 (Genuity), AS 57 (UMN GigaPoP), AS 7911 (Wiltel), Internet2, AS 217 (UMN), AS 1998 (State of Minnesota).]

Architecture of Internet Routing. Within an AS (AS 1, AS 2), routing uses an IGP (Interior Gateway Protocol), which is metric based: OSPF, IS-IS, RIP. Between ASes, routing uses an EGP (Exterior Gateway Protocol), which is policy based: BGP.

Simplified BGP Operations. AS1 and AS2 establish a BGP session on TCP port 179 and exchange all active routes. After that they exchange incremental updates: while the connection is ALIVE, they exchange route UPDATE messages.

Types of BGP Messages.
Open: establishes a peering session.
Keep Alive: handshake at regular intervals.
Notification: shuts down a peering session.
Update: announces new routes or withdraws previously announced routes. An announcement carries a prefix plus attribute values; a withdrawal carries the prefix only.

BGP Attributes
Value  Code                      Reference
1      ORIGIN                    [RFC1771]
2      AS_PATH                   [RFC1771]
3      NEXT_HOP                  [RFC1771]
4      MULTI_EXIT_DISC           [RFC1771]
5      LOCAL_PREF                [RFC1771]
6      ATOMIC_AGGREGATE          [RFC1771]
7      AGGREGATOR                [RFC1771]
8      COMMUNITY                 [RFC1997]
9      ORIGINATOR_ID             [RFC2796]
10     CLUSTER_LIST              [RFC2796]
11     DPA                       [Chen]
12     ADVERTISER                [RFC1863]
13     RCID_PATH / CLUSTER_ID    [RFC1863]
14     MP_REACH_NLRI             [RFC2283]
15     MP_UNREACH_NLRI           [RFC2283]
16     EXTENDED COMMUNITIES      [Rosen]
A further value is reserved for development. Not all attributes need to be present in every announcement.

Two Types of BGP Neighbor Relationships. An external neighbor (eBGP) is in a different Autonomous System. An internal neighbor (iBGP) is in the same Autonomous System. iBGP is routed (using the IGP!).

iBGP Peers Must be Fully Meshed. iBGP neighbors do not announce routes received via iBGP to other iBGP neighbors, so an eBGP update must be relayed as iBGP updates to every other BGP speaker in the AS. iBGP is needed to avoid routing loops within an AS. Injecting external routes into the IGP does not scale and causes BGP policy information to be lost. BGP does not provide “shortest path” routing.
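The full-mesh requirement can be illustrated with a small sketch. The function names and the string labels "ebgp" and "ibgp" are hypothetical, chosen only for this example; the rule itself is the one stated on the slide, and the session count is simply the number of router pairs.

```python
def ibgp_sessions(n_routers: int) -> int:
    """Number of iBGP sessions needed to fully mesh n BGP speakers in one AS."""
    return n_routers * (n_routers - 1) // 2


def may_advertise(learned_via: str, send_to: str) -> bool:
    """Apply the slide's rule: a route learned via iBGP is not passed on
    to another iBGP neighbor. Both arguments are 'ebgp' or 'ibgp'
    (labels assumed for this sketch)."""
    return not (learned_via == "ibgp" and send_to == "ibgp")


for n in (4, 10, 50):
    print(n, "routers need", ibgp_sessions(n), "iBGP sessions")
```

Because a route learned over iBGP is never relayed to another iBGP peer, every speaker has to hear it directly from the router that learned it over eBGP, which is why the mesh must be complete.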

AS PATH Attribute. [Figure: a /16 prefix originated at AT&T Research is announced outward, and the AS Path grows at each hop. Networks shown include AT&T, AS 1239 (Sprint), AS 1755 (Ebone), AS 3549 (Global Crossing), AS 1129 (Global Access), and the RIPE NCC RIS project.]

Inter-domain Loop Prevention. BGP at AS YYY will never accept a route with an ASPATH containing YYY. [Figure: a /16 route whose ASPATH already contains the receiving AS is marked “Don’t Accept!”]
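The loop-prevention rule above amounts to a membership test on the AS path. The following is a minimal sketch; the function names and the AS numbers in the example are illustrative, not taken from any BGP implementation.

```python
from typing import List, Sequence


def accept_route(local_asn: int, as_path: Sequence[int]) -> bool:
    """Reject any route whose AS path already contains the local ASN."""
    return local_asn not in as_path


def export_path(local_asn: int, as_path: Sequence[int]) -> List[int]:
    """Prepend the local ASN before announcing the route to an eBGP neighbor."""
    return [local_asn, *as_path]


# Illustrative AS numbers: AS 217 re-announces a path that passed through AS 1.
path = export_path(217, [57, 1])
print(path, accept_route(1, path), accept_route(1239, path))
```

Because each AS prepends its own number on export, a route that comes back to an AS it has already traversed is always detectable from the path alone.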

BGP Best Path Selection.
1. Ignore if exit point unreachable
2. Highest local preference
3. Lowest AS path length
4. Lowest origin type
5. Lowest MED (with same next hop AS)
6. Lowest IGP cost to next hop
7. Lowest router ID of BGP speaker
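The ordered tie-breaking above can be sketched as a lexicographic sort key. This is a simplified illustration, not a router's actual decision process: the `Route` fields and the default values are assumptions, and MED is compared across all candidates here, whereas the rule above only compares MED among routes with the same next-hop AS.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class Route:
    as_path: Tuple[int, ...]
    local_pref: int = 100      # default value assumed for this sketch
    origin: int = 0            # 0 = IGP, 1 = EGP, 2 = INCOMPLETE
    med: int = 0
    igp_cost: int = 0
    router_id: int = 0
    reachable: bool = True     # is the exit point (next hop) reachable?


def preference_key(r: Route) -> tuple:
    """Smaller key = more preferred; local preference is negated so higher wins."""
    return (-r.local_pref, len(r.as_path), r.origin, r.med, r.igp_cost, r.router_id)


def best_path(routes: Iterable[Route]) -> Optional[Route]:
    """Drop unreachable routes, then pick the most preferred remaining one."""
    usable = [r for r in routes if r.reachable]
    return min(usable, key=preference_key) if usable else None


long_but_preferred = Route(as_path=(1, 2, 3), local_pref=200)
short = Route(as_path=(1,))
print(best_path([short, long_but_preferred]))
```

The example shows why BGP is not “shortest path” routing: the longer path wins because local preference is checked before AS path length.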

In a nutshell: BGP = Path Vector Protocol + Policies. The path vector protocol is very simple: distribute reachability, prevent loops. All the complexity is introduced by locally administered policies, which determine which paths are selected and which neighbors they are exported to.

Path Exploration and Slow Convergence

What is Path Exploration? When a link fails (or is repaired), routers “go through” a sequence of paths before selecting a “converged” path. This results from dependencies in the advertised “path vectors”: a router’s best path is an extension of a neighbor’s best path, which in turn extends a best path from one of its own neighbors, and so on.

What is Path Exploration? (cont’d) When a link fails, a set of dependent paths becomes invalid (or obsolete). These paths are removed from the system one by one: a router selects an invalid path and propagates it, receives a withdrawal for it, selects its next best path (possibly also invalid), receives another withdrawal, and repeats until no invalid paths remain.
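The select, propagate, withdraw cycle described above can be sketched as a loop over a router's ranked candidate paths. This is a toy model under stated assumptions: the candidate paths and their ranking are hypothetical, and the set of invalid paths is computed up front, whereas a real router only learns it one withdrawal at a time.

```python
from typing import List, Optional, Sequence, Set, Tuple

Path = Tuple[int, ...]


def uses_edge(path: Path, edge: Tuple[int, int]) -> bool:
    """True if two adjacent ASes on the path form the given edge (either direction)."""
    return any({x, y} == set(edge) for x, y in zip(path, path[1:]))


def explore(ranked: Sequence[Path], invalid: Set[Path]) -> Tuple[List[Path], Optional[Path]]:
    """Walk the ranked candidates; each invalid one is selected, propagated,
    and later withdrawn before the router moves on to the next."""
    explored: List[Path] = []
    for path in ranked:
        if path in invalid:
            explored.append(path)
            continue
        return explored, path          # first valid path: converged
    return explored, None              # destination unreachable


# Hypothetical candidates at one router, most preferred first.
ranked = [(4, 2, 1, 0), (5, 2, 1, 0), (6, 3, 1, 0)]
failed_edge = (1, 2)
invalid = {p for p in ranked if uses_edge(p, failed_edge)}
print(explore(ranked, invalid))
```

Each entry in the explored list corresponds to at least one spurious announcement and one withdrawal, which is where the extra convergence delay comes from.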

Path Exploration Example. [Figure: the network in a steady state, followed by a withdrawal (W) propagating after a failure.]

Path Exploration Example (cont’d). Two of the remaining paths both contain the “problem edge”, so additional messages are needed to force 7 to flush the “bad paths”. The number of “spurious messages” increases with the “richness” of connectivity.

Impact of Path Exploration. In general, convergence time is O(LΔ), where L is the length of the longest simple path in the network and Δ is the time between successive announcements. From measurements: up to 15 minutes to converge (after a link failure).

Impact of Path Exploration (cont’d). Path exploration delays a router from picking valid, alternate paths, because it first has to go through all the invalid paths. The result is large-scale packet loss in a short duration, since core routers process millions of packets a second. In the absence of path exploration, convergence time is Ω(Dh), where D is the “diameter” of the network (D << L) and h is the message processing time at a node.

Causes for Path Exploration. Invalid paths are selected, propagated, then withdrawn. Routers waste time processing “stale information”, which delays convergence to valid, perhaps less preferred, alternate paths. Key issue: how to distinguish invalid paths from valid paths. This is difficult in BGP because AS Paths are high level and abstract.

AS PATHS: High Level Connectivity. AS 217 and AS 3 receive the same AS PATH, yet the underlying physical paths are disjoint. [Figure: AS 81, AS 217, AS 1239 and AS 3.]

Naïve Solutions Fail. TAG withdrawals: when a router generates a withdrawal, it tags it with the cause/location, e.g. WDRAW: (2,1) failed.

Naïve Solutions Fail (cont’d). AS Paths do not describe (or reflect) internal AS topology. When an internal edge fails, which AS Path is affected: [1 0] or [2 1 0]?

Naïve Solutions Fail (cont’d). The link between 3.2 and 6.1 fails. 6.1 generates a withdrawal and tags it with the failed edge. Should 6.3 remove all paths containing that edge?

EPIC: A Simple Solution. Exploit Path dependencies to Invalidate Paths. To avoid path exploration: when a link fails, a set of dependent paths becomes invalid, and all the dependent paths must be removed from the system. Dependent paths cannot be described using only AS Paths, so AS Paths are annotated with additional information (forward edge sequence numbers). These can capture path dependencies and can distinguish valid and invalid paths.

Forward Edge Sequence Numbers. When an AS Path is being advertised to an external AS neighbor, include the fesn of the “forward” external edge (the edge from AS X to AS Y). fesn = edge identifier + sequence number.

Forward Edge Sequence Numbers (cont’d). An fesn is defined per destination, for every AS-AS edge. When AS X sends a route to AS Y, the fesn (X:Y, n) is attached. If the route already has a previously attached fesn, the new fesn is prepended to it, forming the fesnList. [Figure: AS X attaches (X:Y, n) on the edge to AS Y and (X:Z, m) on the edge to AS Z; AS W receives (X:Y, n) via AS Y.]

fesn Management. When a link fails, its fesn does not change; the same value is carried in withdrawals. When the link is repaired, AS X increments the sequence number, and subsequent route announcements carry the “updated” fesn. So a larger fesn always corresponds to “newer” information.
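The attach, prepend, and increment rules from the slides above can be sketched as follows. The `FesnTable` class and its method names are hypothetical, and the table tracks a single destination only; the initial sequence number of 0 is an assumption, since the slides do not state one.

```python
from typing import Dict, List, Tuple

Fesn = Tuple[Tuple[int, int], int]    # ((from_as, to_as), sequence_number)


class FesnTable:
    """Sequence numbers one AS keeps for its outgoing edges, for one destination."""

    def __init__(self, local_as: int) -> None:
        self.local_as = local_as
        self.seq: Dict[int, int] = {}

    def current(self, neighbor_as: int) -> Fesn:
        """fesn of the forward edge to a neighbor (starts at 0 in this sketch)."""
        return ((self.local_as, neighbor_as), self.seq.setdefault(neighbor_as, 0))

    def on_repair(self, neighbor_as: int) -> None:
        """A repaired link gets a larger sequence number; a failure changes nothing."""
        self.seq[neighbor_as] = self.seq.get(neighbor_as, 0) + 1

    def export(self, fesn_list: List[Fesn], neighbor_as: int) -> List[Fesn]:
        """Prepend this AS's forward-edge fesn to the fesnList of a route."""
        return [self.current(neighbor_as), *fesn_list]


as1 = FesnTable(local_as=1)
as1.seq[2] = 7                                    # edge (1:2) currently at 7
print(as1.export([((0, 1), 7)], neighbor_as=2))   # route learned from AS 0
as1.on_repair(2)
print(as1.export([((0, 1), 7)], neighbor_as=2))   # after a repair of (1:2)
```

Since only repairs bump the counter, a withdrawal sent after a failure names exactly the fesn that the now-invalid routes still carry.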

fesnList Propagation. Routes shown in the figure:
[0] {(0:1, 7)}
[1 0] {(0:1, 7)(1:2, 7)}
[1 0] {(0:1, 7)(1:3, 3)}
[4 2 1 0] {(0:1, 7)(1:2, 7)(2:4, 14)(4:7, 11)}
The two [1 0] routes have the same AS Path but distinct fesnLists.

fesnList Propagation (cont’d). Routing table at AS 7, after the routes are processed at all nodes:
AS Path    fesnList
4 2 1 0    (4:7, 11) (2:4, 14) (1:2, 7) (0:1, 7)
5 2 1 0    (5:7, 10) (2:5, 14) (1:2, 7) (0:1, 7)
6 3 1 0    (6:7, 3) (3:6, 7) (1:3, 7) (0:1, 7)

Invalidating Paths upon Failure. When a router generates a withdrawal, the fesnList of the withdrawn route (the “path stem”) is attached to the withdrawal. When a router receives a withdrawal, it:
1. Invalidates all routes containing the fesnList.
2. Selects a new best path.
3. If the best path has changed, sends the new best route to its neighbors, with the withdrawal piggybacked.
4. If there is no valid path, forwards only the withdrawal.

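Invalidating Paths: Example. AS 7 receives the withdrawal W: {(1:2, 7), (0:1, 7)}. Its routing table:
AS Path    fesnList
4 2 1 0    (4:7, 11) (2:4, 14) (1:2, 7) (0:1, 7)
5 2 1 0    (5:7, 10) (2:5, 14) (1:2, 7) (0:1, 7)
6 3 1 0    (6:7, 3) (3:6, 7) (1:3, 7) (0:1, 7)
The first two routes contain the withdrawn stem and are invalidated; AS 7 then announces the path 7 6 3 1 0.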
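The example above can be replayed with a short sketch. It makes several simplifying assumptions that are not stated in the slides: AS path 4210 is read as the sequence 4, 2, 1, 0; a route “contains” the withdrawn stem when the stem matches the origin end of its fesnList exactly; sequence numbers are compared for equality only; and the table's insertion order stands in for route preference. All function names are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Fesn = Tuple[Tuple[int, int], int]    # ((from_as, to_as), sequence_number)
AsPath = Tuple[int, ...]


def contains_stem(fesn_list: Sequence[Fesn], stem: Sequence[Fesn]) -> bool:
    """True when `stem` is the tail (origin end) of `fesn_list`."""
    n = len(stem)
    return 0 < n <= len(fesn_list) and list(fesn_list[-n:]) == list(stem)


def process_withdrawal(table: Dict[AsPath, List[Fesn]],
                       stem: Sequence[Fesn]) -> Optional[AsPath]:
    """Steps 1 and 2: drop every route containing the stem, then return the
    first remaining route (insertion order is used as preference here)."""
    for path in [p for p, fl in table.items() if contains_stem(fl, stem)]:
        del table[path]
    return next(iter(table), None)


# Routing table at AS 7, fesnLists written most recent edge first.
table = {
    (4, 2, 1, 0): [((4, 7), 11), ((2, 4), 14), ((1, 2), 7), ((0, 1), 7)],
    (5, 2, 1, 0): [((5, 7), 10), ((2, 5), 14), ((1, 2), 7), ((0, 1), 7)],
    (6, 3, 1, 0): [((6, 7), 3), ((3, 6), 7), ((1, 3), 7), ((0, 1), 7)],
}
withdrawal = [((1, 2), 7), ((0, 1), 7)]
print(process_withdrawal(table, withdrawal))
```

A single withdrawal removes both routes through edge (1:2) at once, so the router moves directly to the valid path via AS 6 instead of trying the second invalid path first.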

Handling Link Repairs. When the link from AS X to AS Y is repaired:
1. AS X increments the fesn for the edge.
2. AS X generates a new route announcement to send to AS Y (reflecting the updated fesn).
3. At AS Y, the route is installed into the routing table, and a subsequent route update may be generated.
4. After all updates have been processed, every fesnList containing (X:Y, n) will reflect the updated value.

What about Multiple Edges? Each edge is associated with a minor fesn, in contrast with the major fesn for the “logical” AS-AS edge. All edges between two ASes share the same major fesn but have distinct minor fesns. A minor fesn is incremented with its corresponding edge; the major fesn is incremented only if all edges are affected.

Minor fesns. Minor fesns are only used between adjacent ASes. All routers in AS 6 include the minor fesn in route updates. When the updates are exported externally (to AS 7), the minor fesn is removed. [Figure: multiple edges between two ASes, with a common major fesn and distinct minor fesns.]

fesn: Key Properties. The sequence number is monotonic: new events will have higher values. This imposes a partial ordering on the fesnLists, so old information can be easily detected and discarded. It also allows a compact, correct description of invalid paths, i.e. the fesnList in a withdrawal captures all obsolete paths.

EPIC Properties. No router will select an invalid path after receiving any update triggered by a single failure event. No router will select an invalid path after receiving at least one update triggered by each of a set of multiple failure events. EPIC achieves optimal bounds for a path vector protocol. Routers may still explore paths, but these paths are all valid.

EPIC Performance (vs BGP)
                       BGP                    EPIC
Fail Down  Time        (L-2)Δ                 (D-1)h
           Messages    (L-2)(|E|-1)           |E|-1
Fail Over  Time        (L’+D’-1)Δ             D’(h+Δ)
           Messages    (L’+D’-1)(|E’|-1)      (|E’|-1)D’(h+Δ)/Δ
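To get a feel for the gap between the two columns, the expressions above can be evaluated for a small network. The parameter values below are hypothetical and chosen only for illustration; they are not measurements from the talk. Primed quantities (L’, D’, |E’|) are passed through the same parameter names.

```python
from typing import Dict


def fail_down(L: int, D: int, E: int, delta: float, h: float) -> Dict[str, Dict[str, float]]:
    """Fail-down bounds: time in seconds and message counts for BGP and EPIC."""
    return {
        "BGP": {"time": (L - 2) * delta, "messages": (L - 2) * (E - 1)},
        "EPIC": {"time": (D - 1) * h, "messages": E - 1},
    }


def fail_over(L: int, D: int, E: int, delta: float, h: float) -> Dict[str, Dict[str, float]]:
    """Fail-over bounds; L, D, E stand for the primed quantities L', D', |E'|."""
    return {
        "BGP": {"time": (L + D - 1) * delta, "messages": (L + D - 1) * (E - 1)},
        "EPIC": {"time": D * (h + delta), "messages": (E - 1) * D * (h + delta) / delta},
    }


# Hypothetical network: longest simple path 12, diameter 4, 20 edges,
# 30 s between announcements, 0.5 s processing time per message.
for name, bounds in fail_down(L=12, D=4, E=20, delta=30.0, h=0.5).items():
    print("fail down", name, bounds)
for name, bounds in fail_over(L=10, D=3, E=15, delta=30.0, h=0.5).items():
    print("fail over", name, bounds)
```

In the fail-down case the BGP time scales with the announcement interval Δ and the longest path L, while the EPIC time scales with the per-message processing time h and the diameter D, which is where most of the improvement comes from.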

Root Cause Analysis of BGP Events

BGP Routing Dynamics. BGP routing instabilities: BGP routing suffers from many problems, e.g., misconfigurations, link failures, policy changes, slow convergence, etc. BGP update streams are visible from all BGP-monitoring vantage points. Open research problems: What are the common characteristics of BGP dynamics? What are the primary causes of BGP routing dynamics? How can BGP dynamics be visualized?

BGP Routing Update (per second). [Figure: time vs. number of BGP updates at the prefix level. View: UMN. Time: 2003/12/07 – 2003/12/14. Annotated regions: BGP Update Burst, BGP Update Noise.]

BGP Routing Update (per second) (cont.). [Figure: time vs. number of BGP updates at the AS level. View: UMN. Time: 2003/12/07 – 2003/12/14. Annotated regions: BGP Update Burst, BGP Update Noise.]

Modeling BGP Routing Dynamics. Modeling BGP dynamics on all prefixes/ASes is challenging: there are ~120,000 prefixes and ~16,000 ASes, the data form a high-dimensional time series, and BGP updates are temporally and spatially correlated.