Root cause analysis of BGP routing dynamics Matt Caesar, Lakshmi Subramanian, Randy H. Katz.

Slides:



Advertisements
Similar presentations
ABSTRACT Due to the Internets sheer size, complexity, and various routing policies, it is difficult if not impossible to locate the causes of large volumes.
Advertisements

Multihoming and Multi-path Routing
EIGRP Explanation & Configuration By Bill Reed. We Will Examine the features of EIGRP Discuss why EIGRP is known as a Hybrid Routing Protocol Identify.
Comparing IPv4 and IPv6 from the perspective of BGP dynamic activity Geoff Huston APNIC February 2012.
Delayed Internet Routing Convergence due to Flap Dampening Z. Morley Mao Ramesh Govindan, Randy Katz, George Varghese
Part IV: BGP Routing Instability. March 8, BGP routing updates  Route updates at prefix level  No activity in “steady state”  Routing messages.
Network Layer: Internet-Wide Routing & BGP Dina Katabi & Sam Madden.
Dongkee LEE 1 Understanding BGP Misconfiguration Ratul Mahajan, David Wetherall, Tom Anderson.
Consensus Routing: The Internet as a Distributed System John P. John, Ethan Katz-Bassett, Arvind Krishnamurthy, and Thomas Anderson Presented.
1 BGP Anomaly Detection in an ISP Jian Wu (U. Michigan) Z. Morley Mao (U. Michigan) Jennifer Rexford (Princeton) Jia Wang (AT&T Labs)
1 Interdomain Routing Protocols. 2 Autonomous Systems An autonomous system (AS) is a region of the Internet that is administered by a single entity and.
1 Measurement of Highly Active Prefixes in BGP Ricardo V. Oliveira, Rafit Izhak-Ratzin, Beichuan Zhang, Lixia Zhang GLOBECOM’05.
Part II: Inter-domain Routing Policies. March 8, What is routing policy? ISP1 ISP4ISP3 Cust1Cust2 ISP2 traffic Connectivity DOES NOT imply reachability!
BGP update profiles and the implications for secure BGP update validation processing Geoff Huston Swinburne University of Technology PAM April 2007.
A Study of Multiple IP Link Failure Fang Yu
HLP: A Next Generation Interdomain Routing Protocol Lakshminarayanan Subramanian* Matthew Caesar* Cheng Tien Ee*, Mark Handley° Morley Maoª, Scott Shenker*
1 Finding a Needle in a Haystack: Pinpointing Significant BGP Routing Changes in an IP Network Jian Wu (University of Michigan) Z. Morley Mao (University.
1 BGP Security -- Zhen Wu. 2 Schedule Tuesday –BGP Background –" Detection of Invalid Routing Announcement in the Internet" –Open Discussions Thursday.
CS Summer 2003 Quiz 1 Q1) Answer the following: List one protocol that is commonly used for intra AS routing? List one protocol that is used for.
Improving BGP Convergence Through Consistency Assertions Dan Pei, Lan Wang, Lixia Zhang UCLA Xiaoliang Zhao, Daniel Massey, Allison Mankin, USC/ISI S.
Internet Routing Instability Labovitz et al. Sigcomm 1997 Largely adopted from Ion Stoica’s slide at UCB.
BGP: Inter-Domain Routing Protocol Noah Treuhaft U.C. Berkeley.
Dynamics of Hot-Potato Routing in IP Networks Renata Teixeira (UC San Diego) with Aman Shaikh (AT&T), Tim Griffin(Intel),
Next steps in interdomain routing research (why measurements are not enough to decide about it) Steve Uhlig Université catholique de Louvain, Belgium
Learning-Based Anomaly Detection in BGP Updates Jian Zhang Jennifer Rexford Joan Feigenbaum.
Inherently Safe Backup Routing with BGP Lixin Gao (U. Mass Amherst) Timothy Griffin (AT&T Research) Jennifer Rexford (AT&T Research)
March 22, 2002 Simple Protocols, Complex Behavior (Simple Components, Complex Systems) Lixia Zhang UCLA Computer Science Department.
Graphs and Topology Yao Zhao. Background of Graph A graph is a pair G =(V,E) –Undirected graph and directed graph –Weighted graph and unweighted graph.
2003/11/051 The Temporal and Topological Characteristics of BGP Path Changes Di-Fa Chang Ramesh Govindan John Heidemann USC/Information Sciences Institute.
Scalable Construction of Resilient Overlays using Topology Information Mukund Seshadri Dr. Randy Katz.
Towards a Logic for Wide- Area Internet Routing Nick Feamster Hari Balakrishnan.
EQ-BGP: an efficient inter- domain QoS routing protocol Andrzej Bęben Institute of Telecommunications Warsaw University of Technology,
1 Computer Communication & Networks Lecture 22 Network Layer: Delivery, Forwarding, Routing (contd.)
McGraw-Hill©The McGraw-Hill Companies, Inc., 2000 Chapter 14 Routing Protocols RIP, OSPF, BGP.
Information-Centric Networks04a-1 Week 4 / Paper 1 Open issues in Interdomain Routing: a survey –Marcelo Yannuzzi, Xavier Masip-Bruin, Olivier Bonaventure.
9/15/2015CS622 - MIRO Presentation1 Wen Xu and Jennifer Rexford Department of Computer Science Princeton University Chuck Short CS622 Dr. C. Edward Chow.
On the Interaction between Dynamic Routing in the Native and Overlay Layers INFOCOM 2006 Srinivasan Seetharaman Mostafa Ammar College of Computing Georgia.
Understanding and Limiting BGP Instabilities Zhi-Li Zhang Jaideep Chandrashekar Kuai Xu
Jennifer Rexford Fall 2014 (TTh 3:00-4:20 in CS 105) COS 561: Advanced Computer Networks BGP.
McGraw-Hill©The McGraw-Hill Companies, Inc., 2000 Chapter 13 Routing Protocols (RIP, OSPF, BGP)
BGP in 2011 Geoff Huston APNIC. Conventional (Historical) BGP Wisdom IAB Workshop on Inter-Domain routing in October 2006 – RFC 4984: “routing scalability.
BGP topics to be discussed in the next few weeks: –Excessive route update –Routing instability –BGP policy issues –BGP route slow convergence problem –Interaction.
A Measurement Study on the Impact of Routing Events on End-to-End Internet Path Performance Feng Wang 1, Zhuoqing Morley Mao 2 Jia Wang 3, Lixin Gao 1,
T. S. Eugene Ngeugeneng at cs.rice.edu Rice University1 COMP/ELEC 429/556 Introduction to Computer Networks Inter-domain routing Some slides used with.
Detection of Routing Loops and Analysis of Its Causes Sue Moon Dept. of Computer Science KAIST Joint work with Urs Hengartner, Ashwin Sridharan, Richard.
1 Quantifying Path Exploration in the Internet Ricardo Oliveira, Rafit Izhak-Ratzin, Lixia Zhang, UCLA Beichuan Zhang, UArizona Dan Pei, AT&T Labs -- Research.
TCOM 509 – Internet Protocols (TCP/IP) Lecture 06_a Routing Protocols: RIP, OSPF, BGP Instructor: Dr. Li-Chuan Chen Date: 10/06/2003 Based in part upon.
1 A Framework for Measuring and Predicting the Impact of Routing Changes Ying Zhang Z. Morley Mao Jia Wang.
By, Matt Guidry Yashas Shankar.  Analyze BGP beacons which are announced and withdrawn, usually within two hour intervals.  The withdraws have an effect.
Detecting Selective Dropping Attacks in BGP Mooi Chuah Kun Huang November 2006.
R-BGP: Staying Connected in a Connected World Nate Kushman Srikanth Kandula, Dina Katabi, and Bruce Maggs.
02/01/2006USC/ISI1 Updates on Routing Experiments Cyber DEfense Technology Experimental Research (DETER) Network Evaluation Methods for Internet Security.
Eliminating Packet Loss Caused by BGP Convergence Nate Kushman Srikanth Kandula, Dina Katabi, and Bruce Maggs.
An internet is a combination of networks connected by routers. When a datagram goes from a source to a destination, it will probably pass through many.
HLP: A Next Generation Interdomain Routing Protocol Lakshminarayanan Subramanian, Matthew Caesar, Cheng Tien Ee, Mark Handley, Morley Mao, Scott Shenker,
1 Effective Diagnosis of Routing Disruptions from End Systems Ying Zhang Z. Morley Mao Ming Zhang.
1 Investigating occurrence of duplicate updates in BGP announcements Jong Han Park 1, Dan Jen 1, Mohit Lad 2, Shane Amante 3, Danny McPherson 4, Lixia.
1 On the Impact of Route Monitor Selection Ying Zhang* Zheng Zhang # Z. Morley Mao* Y. Charlie Hu # Bruce M. Maggs ^ University of Michigan* Purdue University.
Jian Wu (University of Michigan)
COMP 3270 Computer Networks
Geoff Huston APNIC 7th caida/wide measurement workshop Nov
RFC 1058 & RFC 2453 Routing Information Protocol
COS 561: Advanced Computer Networks
COS 561: Advanced Computer Networks
BGP Policies Jennifer Rexford
BGP Interactions Jennifer Rexford
COS 561: Advanced Computer Networks
BGP Instability Jennifer Rexford
Stable and Practical AS Relationship Inference with ProbLink
Geoff Huston APNIC 7th caida/wide measurement workshop Nov
Presentation transcript:

Root cause analysis of BGP routing dynamics Matt Caesar, Lakshmi Subramanian, Randy H. Katz

Motivation Interdomain routing suffers from many problems –Instability –Slow convergence after changes –Misconfigurations Poor visibility into dynamics –What is the spectrum of causes of route changes? –What are the primary causes of instability? –How does BGP respond to a routing change? Incomplete model  incomplete solutions

How can we improve routing? BGP Health monitoring system –Collect routes from routers –Infer properties of network elements –Redistribute information Achieves greater visibility Our focus: how to do inference

Inference problem Terminology: –Event: an activity that generates routing updates –Suspect location set: set of AS’s and links where event could have occurred –Suspect cause set: set of types of events that could have occurred Problem: Given route updates observed at multiple vantage points, determine the suspect set = suspect cause set, suspect location set} of routing events that trigger each update

Correlating Observations: Main idea Quiescent: Assume bursts of updates to a single prefix are correlated Turbulent: Assume updates to many prefixes are correlated Assuming correlated observations are independent worsens precision, but assuming independent observations are correlated worsens accuracy Activity  Time 

Our approach Refined suspect set for minor event Refinement across multiple views Inferences, updates from other views Correlate updates based on time Link analysis of topology Potential link/AS causing major event Detection Algorithm QuiescentTurbulent Split updates into groups Continuous updates Burst of updates for one prefix Inference rules from a single view Why continuous flaps occur? Suspect set Updates from a single view

Turbulent Inference: Example Scenario #1: –~1000 prefixes updated that used (A,C) –  (A,C) suspect Scenario #2: –~4000 prefixes updated that used (A,B), ~2000 that used (B,E) –  (A,B) suspect Scenario #3: –~2000 prefixes updated that used (A,B), ~2000 that used (B,E) –  (B,E) suspect

TurbulentInfer: Issues Effects of simultaneously occurring independent events are overshadowed Two large events simultaneously occurring Transition from Quiescent to Turbulent periods

Quiescent Inference: Example REROUTE –Silence, route change, silence Potential causes (yellow): –Link/router repair –MED/LocalPref decrease –Hold-down expired Potential causes (blue): –Link/router failure –MED/LocalPref increase –Hold-down triggered Previous path Final path “worsened” “improved”

QuiescentInfer: Inference across views Event did not occur here

QuiescentInfer: Issues Simultaneous events: –Eg. Flap in one view, Reroute in another –Eg 2. Advertisement in one view, Withdrawal in another Community attribute changes –Community change can trigger reroute several hops away Multiple peering links

Validation Well-known historical events –UUNET (10/3/02), AT&T (8/28/02) routing difficulties –Internet worms BGP beacons View at the origin View at the origin results BGP Beacon results

Results 70% of updates can be pinpointed to a single inter-AS link (pair of AS’s) –More precise inference for more major events Few AS’s, links causing majority of updates

Future work Investigate continuously flapping prefixes Apply statistical inference techniques Placement of views Alarms/Triggers to detect unhealthy behavior Real time analysis –