Hot Potatoes Heat Up BGP Routing Jennifer Rexford AT&T Labs—Research Joint work with Renata Teixeira, Aman Shaikh, and Timothy Griffin

2 Outline
• Internet routing
  - Interdomain and intradomain routing
  - Coupling due to hot-potato routing
• Measuring hot-potato routing
  - Measuring the two routing protocols
  - Correlating the two data streams
• Performance evaluation
  - Characterization on AT&T’s network
  - Implications on network practices
• Conclusions and future directions

3 Autonomous Systems
[Figure: a client and a web server in different autonomous systems. AS path: 6, 5, 4, 3, 2, 1]

4 Interdomain Routing (BGP)
• Border Gateway Protocol (BGP)
  - IP prefix: block of destination IP addresses
  - AS path: sequence of ASes along the path
• Policy configuration by the operator
  - Path selection: which of the paths to use?
  - Path export: which neighbors to tell?
[Figure callouts: “I can reach /24”, “I can reach /24 via AS 1”]

5 Intradomain Routing (IGP)
• Interior Gateway Protocol (OSPF and IS-IS)
  - Shortest path routing based on link weights
  - Routers flood link-state information to each other
  - Routers compute “next hop” to reach other routers
• Weights configuration by the operator
  - Simple heuristics: link capacity or physical distance
  - Traffic engineering: tuning link weights to the traffic
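The "shortest path routing based on link weights" step can be sketched with Dijkstra's algorithm. This is a minimal illustration, not how any particular router implements OSPF or IS-IS. The four-router topology and its weights are hypothetical, and the function name `next_hops` is made up for this example.

```python
import heapq

def next_hops(graph, source):
    """Return {destination: (cost, next_hop)} for shortest paths from source.

    graph maps each router to {neighbor: link_weight}; weights are assumed
    to be non-negative, as in link-state IGPs.
    """
    best = {source: (0, None)}
    # Heap entries: (cost so far, router, first hop used to leave source).
    heap = [(0, source, None)]
    while heap:
        cost, node, first_hop = heapq.heappop(heap)
        if cost > best[node][0]:
            continue  # stale entry: a cheaper path was already found
        for neighbor, weight in graph[node].items():
            new_cost = cost + weight
            hop = neighbor if node == source else first_hop
            if neighbor not in best or new_cost < best[neighbor][0]:
                best[neighbor] = (new_cost, hop)
                heapq.heappush(heap, (new_cost, neighbor, hop))
    del best[source]
    return best

# Hypothetical topology with symmetric link weights.
topology = {
    "M": {"X": 5, "Y": 4},
    "X": {"M": 5, "Z": 2},
    "Y": {"M": 4, "Z": 6},
    "Z": {"X": 2, "Y": 6},
}
routes = next_hops(topology, "M")
```

In this toy topology M reaches Z through X, because 5 + 2 is cheaper than 4 + 6. Changing a single weight can flip that choice, which is the lever operators tune for traffic engineering.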

6 Two-Level Internet Routing
• Hierarchical routing
  - Intra-domain: metric based
  - Inter-domain: reachability and policy
• Design principles
  - Scalability
  - Isolation
  - Simplicity of reasoning
Autonomous system (AS) = network with unified administrative routing policy (e.g., AT&T, Sprint, UCSD)
[Figure: AS 1, AS 2, and AS 3, labeled with intra-domain routing (IGP) and inter-domain routing (BGP)]

7 Motivation
Hot-potato routing = an ISP’s policy of routing to the closest exit point when there is more than one route to the destination
Triggers inside the ISP network: failure, planned maintenance, traffic engineering
Routes to thousands of destinations switch exit point!
Consequences:
• Router CPU overload
• Transient forwarding instability
• Traffic shift
• Inter-domain routing changes
[Figure labels: packet to dst, dst, ISP network, routers X, Y, Z]

8 BGP Decision Process
1. Ignore if exit point unreachable
2. Highest local preference
3. Lowest AS path length
4. Lowest origin type
5. Lowest MED (with same next hop AS)
6. Lowest IGP cost to next hop (hot potato)
7. Lowest router ID of BGP speaker
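The decision steps on this slide can be sketched as a sequence of filters. This is a simplified illustration that follows only the steps listed here; real implementations apply additional vendor-specific rules. The `Route` fields, the AS number 65001, and the IGP costs are all assumptions chosen for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Route:
    exit_point: str   # egress router (the BGP next hop)
    local_pref: int
    as_path_len: int
    origin: int       # lower is preferred
    med: int
    next_hop_as: int
    router_id: int

def keep_min(candidates, key):
    """Keep only the candidates that minimise key."""
    lowest = min(key(r) for r in candidates)
    return [r for r in candidates if key(r) == lowest]

def best_route(routes, igp_cost):
    """Pick one route; igp_cost maps exit point -> IGP distance.

    An exit point missing from igp_cost is treated as unreachable.
    """
    cands = [r for r in routes if r.exit_point in igp_cost]
    if not cands:
        return None
    cands = keep_min(cands, lambda r: -r.local_pref)   # highest local pref
    cands = keep_min(cands, lambda r: r.as_path_len)
    cands = keep_min(cands, lambda r: r.origin)
    # MED is only comparable between routes learned from the same next-hop AS.
    cands = [r for r in cands
             if not any(o.next_hop_as == r.next_hop_as and o.med < r.med
                        for o in cands)]
    cands = keep_min(cands, lambda r: igp_cost[r.exit_point])  # hot potato
    cands = keep_min(cands, lambda r: r.router_id)
    return cands[0]

# Two routes that tie on every BGP attribute, so IGP cost decides.
via_x = Route("X", 100, 3, 0, 0, 65001, 1)
via_y = Route("Y", 100, 3, 0, 0, 65001, 2)
before = best_route([via_x, via_y], {"X": 9, "Y": 10})
after = best_route([via_x, via_y], {"X": 11, "Y": 10})
```

Here `before` exits at X and `after` exits at Y, even though no BGP attribute changed. An intradomain cost change alone moved the egress point, which is the hot-potato coupling the talk measures.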

9 Outline
• Internet routing
  - Interdomain and intradomain routing
  - Coupling due to hot-potato routing
• Measuring hot-potato routing
  - Measuring the two routing protocols
  - Correlating the two data streams
• Performance evaluation
  - Characterization on AT&T’s network
  - Implications on network practices
• Conclusions and future directions

10 Why is This So Darn Hard?
• Noisy signals
  - Single event can cause multiple IGP messages
  - Large amount of background BGP updates
  - Multiple messages for single BGP routing change
• Protocol implementation
  - Routing protocols provide limited information
  - High complexity of BGP due to configurable policies
  - Many vendor-specific details, such as timers
• Monitoring limitations
  - Cannot collect data from every vantage point
  - Delays in delivering data to the collection machine
  - Time synchronization across multiple collectors

11 Our Approach
• Collect measurements of both protocols
  - BGP monitor and OSPF monitor
• Correlate the two streams of data
  - Match BGP updates with OSPF events
• Analyze the protocol interaction
[Figure: monitor M and routers X, Y, Z in the AT&T backbone; labels: OSPF messages, BGP updates]

12 Challenges
• Lack of information on routing messages
  - Routing protocols are designed to determine a path between two hosts, not to give the reason for a change
Example 1: BGP update caused by OSPF
  BGP: announcement: dst, X
  OSPF: CHG: X, 8
[Figure: routers X, Y, Z, monitor M, destination dst; route labels: dst, Y and dst, X]

13 Challenges
• Lack of information on routing messages
  - Routing protocols are designed to determine a path between two hosts, not to give the reason for a change
Example 2: BGP update NOT caused by OSPF
  BGP: announcement: dst, X
[Figure: routers X, Y, Z, monitor M, destination dst; cost labels 9 and 10; route labels: dst, Y and dst, X]

14 Challenges
• Lack of information on routing messages
  - Routing protocols are designed to determine a path between two hosts, not to give the reason for a change
Example 2: BGP update NOT caused by OSPF
  BGP: announcement: dst, X
  OSPF: CHG: X, 8
[Figure: routers X, Y, Z, monitor M, destination dst; cost labels 9, 10, and 8; route labels: dst, Y and dst, X]

15 Heuristic for Matching
• Classify BGP updates by possible OSPF causes
• Transform stream of OSPF messages into routing changes
• Match BGP updates with OSPF events that happen close in time
[Figure: timeline with the stream of OSPF messages (link failure, refresh, weight change), the derived routing changes (chg cost, del, chg cost), and the stream of BGP updates]
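The "close in time" matching step can be sketched as a time-window join. This is not the authors' implementation. The tuple layouts, the function name `match_updates`, and the 60-second default for `max_lag` are assumptions for this example.

```python
import bisect

def match_updates(bgp_updates, ospf_changes, max_lag=60.0):
    """Pair each BGP update with the latest compatible OSPF routing change.

    bgp_updates:  iterable of (time, prefix, possible_causes), where
                  possible_causes is a set of (kind, router) pairs.
    ospf_changes: iterable of (time, kind, router), kind in ADD/DEL/CHG.
    A change is compatible if it is among the update's possible causes and
    happened at most max_lag seconds before the update.
    """
    ospf_sorted = sorted(ospf_changes)
    times = [change[0] for change in ospf_sorted]
    matches = []
    for t_bgp, prefix, causes in bgp_updates:
        lo = bisect.bisect_left(times, t_bgp - max_lag)
        hi = bisect.bisect_right(times, t_bgp)
        hit = None
        # Walk the window from newest to oldest and take the first cause.
        for t_ospf, kind, router in reversed(ospf_sorted[lo:hi]):
            if (kind, router) in causes:
                hit = (t_ospf, kind, router)
                break
        matches.append((prefix, t_bgp, hit))
    return matches

# Hypothetical streams: one OSPF cost change and three BGP updates.
ospf = [(100.0, "CHG", "X")]
bgp = [
    (130.0, "dst1", {("CHG", "X"), ("ADD", "X")}),  # in window, compatible
    (400.0, "dst2", {("CHG", "X")}),                # too late
    (120.0, "dst3", {("DEL", "Y")}),                # in window, wrong cause
]
result = match_updates(bgp, ospf)
```

Only `dst1` is attributed to the OSPF change. A wider window catches slower routers but raises the risk of false matches against background BGP noise.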

16 Pre-processing OSPF LSAs
• Transform OSPF messages into routing changes from a router’s perspective
[Figure: router M with routers X, Y, Z. LSA labels: weight change, weight change, delete, add. M’s distances go from (X 5, Y 4) to (X 5, Y 7), then (Y 7), then (X 5, Y 7). Other numeric labels: 10, 1, 2, 1]
OSPF routing changes: CHG Y, 7; DEL X; ADD X, 5
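Turning LSAs into routing changes "from a router's perspective" can be sketched as a diff between successive distance tables. The snapshot values below are one reading of the slide's figure (X 5, Y 4, then Y rising to 7, X deleted, X added back). The function name `routing_changes` and the tuple format are assumptions.

```python
def routing_changes(before, after):
    """Diff two {router: cost} tables into ADD / DEL / CHG events."""
    changes = []
    for router in sorted(before.keys() | after.keys()):
        if router not in after:
            changes.append(("DEL", router, None))           # became unreachable
        elif router not in before:
            changes.append(("ADD", router, after[router]))  # newly reachable
        elif before[router] != after[router]:
            changes.append(("CHG", router, after[router]))  # cost changed
    return changes

# Distance tables at monitor M after each batch of LSAs (assumed values).
snapshots = [
    {"X": 5, "Y": 4},
    {"X": 5, "Y": 7},
    {"Y": 7},
    {"X": 5, "Y": 7},
]
events = [routing_changes(a, b) for a, b in zip(snapshots, snapshots[1:])]
```

Several LSAs that leave the distance table unchanged produce no event at all. That filtering is what removes refreshes and redundant floods from the OSPF stream.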

17 Classifying BGP Updates
BGP update from Z:
• Announcement of dst, X
• Withdrawal of dst, Y
• Replacement of route to dst: different route through Y
Candidate OSPF causes: ADD X? DEL Y? ADD X? CHG X or CHG Y?
[Figure: routers X, Y, Z, monitor M, destination dst]

18 Classifying BGP Updates
Cases:
• Route via X is better
• Route via X is worse
• Routes are equally good
Candidate OSPF causes: ADD X? DEL Y? ADD X? CHG X, CHG Y?
[Figure: routers X, Y, Z, monitor M, destination dst]

19 Outline
• Internet routing
  - Interdomain and intradomain routing
  - Coupling due to hot-potato routing
• Measuring hot-potato routing
  - Measuring the two routing protocols
  - Correlating the two data streams
• Performance evaluation
  - Characterization on AT&T’s network
  - Implications on network practices
• Conclusions and future directions

20 Time Lag
[Plot: OSPF-triggered BGP updates for June 25th, 2003. x-axis: time BGP – time OSPF (seconds); y-axis: cumulative % of BGP updates]
• ~15% of OSPF-triggered BGP updates in a day
• Most OSPF-triggered BGP updates lag from 10 to 60 seconds

21 Results for June 2003
• High variability according to location and day
  - Impact on external BGP measurements and customers

  location         min    max      days > 10%
  close to peers   0%     3.76%    0
  between peers    0%     25.87%   5

• One LSA can have a big impact

  location         no impact   prefixes impacted
  close to peers   97.53%      less than 1%
  between peers    97.17%      55%

22 BGP Updates Over Prefixes
[Plot: cumulative % of BGP updates vs. % of prefixes. Curves: non-OSPF triggered, all, OSPF-triggered]
• OSPF-triggered BGP updates affect ~50% of prefixes uniformly
• Prefixes with only one exit point
• Contrast with non-OSPF-triggered BGP updates

23 Operational Implications
• Forwarding plane convergence
  - Accuracy of active measurements
• Router proximity to exit points
  - Likelihood of hot-potato routing changes
• Cost in/out of links during maintenance
  - Avoid triggering BGP routing changes
• More complexity with route reflectors
  - Longer delays and more BGP messages

24 Forwarding Convergence
[Figure: routers R1 and R2 and destination dst]
• R2 starts using R1 to reach dst
• Scan process runs in R2
• R1’s scan process can take up to 60 seconds to run
• Packets to dst may be caught in a loop for 60 seconds!

25 Measurement Accuracy
• Measurements of customer experience
  - Probe machines have just one exit point!
[Figure: routers R1 and R2, probe machines W1 and W2, destination dst; label: loop to reach dst]

26 What to do?
• Increase estimate for forwarding convergence
  - For destinations/customers with multiple exit points
• Extensions to measurement infrastructure
  - Multiple access links for a probe machine
  - Multiple probe machines with same address
• Better BGP implementation on the router
  - Decrease scan timer (maybe too much overhead?)
  - Event-driven IGP/BGP implementation

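27 Avoid Equal-distance Exits
[Figure: two topologies in which router Z reaches dst through exit points X and Y; one surviving link-cost label: 10]
• Small changes will make Z switch exit points to dst
• More robust to intra-domain routing changes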
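One way to look for near-equal-distance exits is to compute, for each router, the gap between its two closest exit points. This is an illustrative sketch rather than a method from the talk. The costs, the threshold, and the function name `fragile_routers` are hypothetical.

```python
def fragile_routers(exit_costs, min_margin):
    """Return {router: margin} for routers whose two closest exits differ
    by less than min_margin in IGP cost.

    exit_costs maps router -> {exit_point: igp_cost}.
    """
    fragile = {}
    for router, costs in exit_costs.items():
        ranked = sorted(costs.values())
        if len(ranked) >= 2 and ranked[1] - ranked[0] < min_margin:
            fragile[router] = ranked[1] - ranked[0]
    return fragile

# Hypothetical IGP distances from two routers to exit points X and Y.
exit_costs = {
    "Z": {"X": 10, "Y": 10},   # tie: any small change flips the exit
    "M": {"X": 5, "Y": 12},    # clear winner
}
flagged = fragile_routers(exit_costs, min_margin=3)
```

Z is flagged with a margin of 0, while M has a margin of 7 and is left alone. A router with a small margin switches exit points on small weight changes, so widening the margin makes it more robust to intradomain routing changes.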

28 Careful Cost in/out Links
[Figure: router Z with exit points X and Y toward dst]
• Traffic is more predictable
• Faster convergence
• Less impact on neighbors

29 iBGP Route Reflectors
[Figure: routers X, Y, Z, W and destination dst. Route-table labels: (dst Y, 18; dst W, ), (dst Y, 21; dst W, 20), Announcement X, (dst X, 19; dst W, 20)]
Scalability trade-off: less BGP state vs. more BGP updates from Z and longer convergence delay

30 Ongoing Work
• Reduction of false matches
  - Compare with conservative analysis (lower bound)
  - Cluster BGP updates and IGP LSAs in time
• Black-box testing of the routers
  - Scan timer and its effects (forwarding loops)
  - Vendor interactions (with Cisco)
• Impact of the IGP-triggered BGP updates
  - Changes in the flow of traffic
  - Externally visible BGP routing changes
• Modeling the protocol interaction
  - Understand impact of router location

31 Future Directions
• Improving isolation (cooling those potatoes!)
  - Operational practices: preventing interactions
  - Protocol design: stronger decoupling
  - Network design: internal topology/architecture
• Extending our monitoring architecture
  - Data from multiple vantage points
  - Real-time correlation of data streams
  - Automatic generation of alarms/reports
• Better route monitoring
  - Router support for special monitoring sessions
  - Protocol extensions to help in troubleshooting
  - Diagnose problems, and read the router’s mind!

32 Exporting Routing Instability
[Figure: two scenarios with routers X, Y, Z and destination dst; labels: “Announcement” and “No change => no announcement”]