Internet Routing Instability Craig Labovitz, G. Robert Malan, Farnam Jahanian University of Michigan Presented By Krishnanand M Kamath.



Cause and Effect
Routing instability: the rapid change of network reachability and topology information
Causes
- Router configuration errors
- Transient physical and data link problems
- Problems with leased lines, router failures, high levels of congestion
- Software configuration errors
Effects
- Many, ranging from degraded performance to loss of connectivity

Effects
- Increased network latency and time to convergence
- Dropped and out-of-order delivery of packets
- Miserable end-to-end performance
- Loss of connectivity in national networks
Route flap storm (routers with a route caching architecture and low-end CPUs)
- The probability of a cache miss increases, causing severe CPU load and memory problems
- Packet processing is delayed, including Keep-Alive packets
- Peers flag the router as down and transmit updates
- The "down" router reinitiates its peering sessions
- Large state dumps are transmitted
- Yet more routers fail: a route flap storm

Solutions
Route Aggregation
- Reduces the overall number of networks visible in the core
- Requires cooperation between service providers
- Complicated by redundant connectivity to the Internet (multi-homing)
Route Dampening Algorithms
- Not a panacea: legitimate announcements may be delayed
Overall
- Multi-homing is exhibiting linear growth
- Internet topology is growing increasingly less hierarchical
- Topological complexity is increasing
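
Why dampening can delay legitimate announcements is easier to see in a small sketch. The class below is a simplified penalty-and-decay model in the general style of route flap damping, not any vendor's implementation. The class name `FlapDamper` and all parameter values (penalty per flap, suppress and reuse thresholds, half-life) are illustrative assumptions.

```python
class FlapDamper:
    """Simplified per-route flap damping (illustrative parameters, not a standard)."""

    def __init__(self, penalty_per_flap=1000.0, suppress=2000.0,
                 reuse=750.0, half_life=900.0):
        self.penalty_per_flap = penalty_per_flap
        self.suppress = suppress      # penalty at or above this: route is suppressed
        self.reuse = reuse            # penalty below this: route is usable again
        self.half_life = half_life    # seconds for the penalty to halve
        self.penalty = 0.0
        self.last_time = 0.0
        self.suppressed = False

    def _decay(self, now):
        # Exponential decay of the accumulated penalty since the last event.
        elapsed = now - self.last_time
        self.penalty *= 0.5 ** (elapsed / self.half_life)
        self.last_time = now

    def flap(self, now):
        """Record one flap (withdrawal or changed announcement) at time `now`."""
        self._decay(now)
        self.penalty += self.penalty_per_flap
        if self.penalty >= self.suppress:
            self.suppressed = True

    def usable(self, now):
        """Return True if announcements for the route are accepted at `now`."""
        self._decay(now)
        if self.suppressed and self.penalty < self.reuse:
            self.suppressed = False
        return not self.suppressed
```

With these assumed values, three flaps in quick succession suppress the route, and a legitimate announcement arriving several minutes later is still ignored until the penalty has decayed below the reuse threshold.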

Recall Updates
Announcements
- New route
- New policy decision for an existing route
Withdrawals
- Explicit: associated with a withdrawal message
- Implicit: an existing route is replaced by the announcement of a new route

Types of Updates
Inter-domain routing updates
- Forwarding instability: legitimate topological changes that affect the paths on which data will be forwarded between ASes
- Routing policy fluctuation: changes in routing policy information that may not affect forwarding paths between ASes
- Pathological updates: redundant BGP information that reflects neither routing nor forwarding instability

Major Results
- The number of BGP updates is one or more orders of magnitude larger than expected
- Routing information is dominated by pathological updates
- Instability and redundant updates exhibit a periodicity of 30 and 60 seconds
- Instability and redundant updates show a correlation with network usage
- Instability is not dominated by a small set of ASes or routes
- Even discounting policy fluctuation and pathological behavior, there remains a significant level of Internet forwarding instability
- The work led to specific architectural and protocol implementation changes in commercial Internet routers through collaboration with vendors

Taxonomy
Data analyzed: sequences of BGP updates for each (prefix, peer) tuple
Events identified
- WADiff: a route is explicitly withdrawn as it becomes unreachable and is later replaced with an alternative route to the same destination. The alternative route differs in its ASPATH or nexthop attribute information. (Forwarding instability)
- AADiff: a route is implicitly withdrawn and replaced by an alternative route as the original route becomes unreachable, or a preferred alternative path becomes available. (Forwarding instability)

Taxonomy (contd.)
Events identified (contd.)
- WADup: a route is explicitly withdrawn and then re-announced as reachable. This may reflect transient topological failure, or it may represent a pathological oscillation. (Forwarding instability or pathological behavior)
- AADup: a route is implicitly withdrawn and replaced with a duplicate of the original route. A duplicate route is defined as a subsequent route announcement that does not differ in nexthop or ASPATH attribute information. (Pathological behavior or routing policy fluctuation)
- WWDup: the repeated transmission of BGP withdrawals for a prefix that is currently unreachable. (Pathological behavior)
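
The five event types can be read as transitions of a small state machine kept per (prefix, peer) tuple. The following is a minimal sketch under that reading, not the authors' analysis code. The names `Route` and `PeerPrefixState`, and the extra labels `NEW` (first announcement) and `W` (first explicit withdrawal), are assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Route:
    """Only the attributes this taxonomy compares (hypothetical minimal model)."""
    as_path: Tuple[int, ...]
    next_hop: str


class PeerPrefixState:
    """Labels each update seen for one (prefix, peer) tuple."""

    def __init__(self):
        self.current: Optional[Route] = None         # route currently announced
        self.last_withdrawn: Optional[Route] = None  # route removed by the last explicit withdrawal
        self.withdrawn = False                       # True once an explicit withdrawal was seen

    def announce(self, route: Route) -> str:
        if self.current is not None:
            # Implicit withdrawal: the new announcement replaces the old route.
            label = "AADup" if route == self.current else "AADiff"
        elif self.withdrawn:
            # Announcement following an explicit withdrawal.
            label = "WADup" if route == self.last_withdrawn else "WADiff"
        else:
            label = "NEW"  # assumption: first announcement, outside the taxonomy
        self.current = route
        self.withdrawn = False
        return label

    def withdraw(self) -> str:
        if self.current is None:
            return "WWDup"  # withdrawal for a prefix that is already unreachable
        self.last_withdrawn = self.current
        self.current = None
        self.withdrawn = True
        return "W"  # assumption: plain explicit withdrawal, outside the taxonomy
```

For example, announcing the same route twice yields `AADup` on the second update, and two withdrawals in a row yield `W` followed by `WWDup`.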

Methodology
- Data collected: BGP routing messages
- Time period: over the course of 9 months starting January 1996
- Where: five of the major U.S. network exchange points
- Tool: Unix-based route servers, Multi-threaded Routing Toolkit (MRT)

Gross Observations
We expect
- Instability proportional to the number of globally visible addresses and the total number of available paths
We observe
- For 45,000 prefixes and 1,500 paths: 3 to 6 million updates per day

Pathological Behavior
Disturbing behaviors
- Most BGP updates are entirely pathological (WWDup)
- A single service provider can have a disproportionate effect on global routing
- Causal relationship between the manufacturer of a router and the level of pathological behavior
- Routing updates have a regular, specific periodicity of either 30 or 60 seconds
- Pathological behavior persists for under five minutes

Origins of Pathologies
Stateless BGP
- Withdrawals are sent for every explicitly and implicitly withdrawn prefix; no state is kept on the information advertised to peers
Plausible explanations
- CSU timer problems
- Unjittered 30-second interval timer, leading to self-synchronization
- Misconfigured interaction of IGP/BGP protocols
- Router vendor software bugs
- Unconstrained routing policies
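
The unjittered-timer explanation can be made concrete with a small sketch. A fixed 30-second interval keeps routers that once fired together firing together indefinitely. Shortening each interval by a random factor lets them drift apart. The function name `next_fire` and the 25 percent jitter range are assumptions chosen for illustration, not values taken from the slides.

```python
import random


def next_fire(now, interval=30.0, jitter=0.25, rng=random):
    """Return the next timer expiry after `now`.

    The interval is shortened by a uniformly random fraction of up to
    `jitter`, so with the defaults it lies between 22.5 and 30 seconds.
    With jitter=0.0 the timer is the fixed, unjittered 30-second timer.
    """
    return now + interval * (1.0 - jitter * rng.random())
```

Two routers calling `next_fire(t, jitter=0.0)` from the same start time expire at identical instants forever, which is the self-synchronization case. With the default jitter, their expiry times diverge after the first interval.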

Analysis of Instability
Instability is defined as the sum of AADiff, WADiff and WADup updates
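
As a small worked example of this definition, the sketch below counts instability from a sequence of event labels. The function name `instability_count` and the input format (a list of label strings) are assumptions for illustration.

```python
from collections import Counter

# Event types counted as instability in this analysis.
INSTABILITY_LABELS = ("AADiff", "WADiff", "WADup")


def instability_count(labels):
    """Sum the AADiff, WADiff and WADup events in an iterable of labels."""
    counts = Counter(labels)
    return sum(counts[label] for label in INSTABILITY_LABELS)
```

Events such as `AADup` and `WWDup` are left out of the sum, since they are treated as pathological or policy-related rather than as instability.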

Fine-grained Instability Statistics
- There is no correlation between the size of an AS and its proportion of the instability statistics

Fine-grained Instability Statistics
- No single AS or prefix consistently dominates the instability statistics
- Instability is evenly distributed across routes

Temporal Properties of Instability
Plausible causes for the periodicity
- Routing software timers, self-synchronization, and routing loops
- CSU handshaking timeouts
- Flaw in the routing protocol

Origins of Internet Routing Instability Craig Labovitz, G. Robert Malan, Farnam Jahanian University of Michigan

Introduction
We observed
- Several orders of magnitude more routing updates than expected
- A large number of duplicate routing messages
- Unexpected frequency components between instability events
We extend the earlier analysis by
- Identifying the origins of many of the pathological behaviors
- Measuring the impact of the specific commercial router software changes suggested
- Proposing additional router software changes that can decrease the updates exchanged by an additional 30 percent or more

Major Results
- The volume of inter-domain routing updates has decreased by an order of magnitude since April
- The majority of BGP messages consist of redundant announcements
- A growing proportion of instability stems from specific changes in Internet architecture coupled with limitations in router software and algorithms
- Instability is not disproportionately dominated by prefixes of specific lengths
- Persistently oscillating routes dominate the BGP traffic generated by a few Internet providers
- Experimentally confirmed a number of the origins of pathological routing behavior postulated in the earlier work

Analysis of Gross Trends
Note
- Dramatic decrease in the number of withdrawals
- The number of announcements has doubled over the 28-month period
- The growth of BGP announcements is disproportionate to any corresponding increase in the number of routing table entries

Taxonomy
Analyze sequences of BGP updates for each (prefix, peer) tuple and identify the events:
- AADup: a route is implicitly withdrawn and replaced with a duplicate of the original route. A duplicate route is defined as a subsequent route announcement that does not differ in any BGP path attribute information.
- AADiff: a route is implicitly withdrawn and replaced by an alternative route as the original route becomes unreachable, or a preferred alternative path becomes available.
- Tup and Tdown: fluctuation in the reachability of a given prefix
  - Tup: a currently unreachable prefix is announced as reachable and transitions up
  - Tdown: an announced route is withdrawn and transitions down
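
This revised taxonomy compares all path attributes, so a classifier can also report which attributes changed. The sketch below is one possible reading of the definitions, not the authors' tooling. The function name `classify`, the dictionary representation of path attributes, and the fallback label `Other` are assumptions.

```python
def classify(previous, update):
    """Label one update for a (prefix, peer) tuple.

    previous: dict of path attributes of the current route, or None if the
              prefix is currently unreachable.
    update:   dict of path attributes for an announcement, or None for an
              explicit withdrawal.
    Returns (label, set of attribute names that changed).
    """
    if update is None:
        if previous is None:
            return ("Other", set())   # assumption: not one of the four categories
        return ("Tdown", set())       # announced route withdrawn
    if previous is None:
        return ("Tup", set())         # unreachable prefix announced reachable
    changed = {
        name
        for name in previous.keys() | update.keys()
        if previous.get(name) != update.get(name)
    }
    if not changed:
        return ("AADup", changed)     # no path attribute differs
    return ("AADiff", changed)
```

For instance, an announcement that differs from the current route only in its MED value is labelled `AADiff` with `{"med"}` as the changed set, which allows AADiffs to be broken down by attribute.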

Analysis of Update Categories
AADup behavior stems from:
1. Non-transitive attribute filtering
2. The combination of the BGP minimum advertisement timer with stateless BGP
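
One way to picture the stateless-BGP cause is to compare it with a sender that does remember what it last told a peer. The sketch below is a hypothetical filter, not a description of any router's software. The class name `AdvertisementFilter` and its method are assumptions.

```python
class AdvertisementFilter:
    """Remembers the last route sent to one peer for each prefix."""

    def __init__(self):
        self.sent = {}  # prefix -> route last advertised to the peer

    def should_send(self, prefix, route):
        """Decide whether an update is worth sending.

        route is None for a withdrawal. Returns False when the update would
        only repeat what the peer already knows.
        """
        if route is None:
            if prefix not in self.sent:
                return False          # nothing advertised: withdrawal is redundant
            del self.sent[prefix]
            return True
        if self.sent.get(prefix) == route:
            return False              # same route again: duplicate announcement
        self.sent[prefix] = route
        return True
```

If a route changes and then changes back before the advertisement timer expires, a stateless sender re-announces the unchanged route when the timer fires, while a filter like this one would suppress it.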

Analysis of AADiffs
Note
- Low percentage of ASPath AADiffs
- Growth in the number of origin AADiffs is related to architecture and policy issues
- Growth in the number of community AADiffs reflects the attribute's recent adoption by many ISPs
- Oscillations in MED are due to the IBGP mapped MED policy at two service providers

IBGP Mapped MED

Frequency
Recall
- Frequency is defined as the inverse of the inter-arrival time between routing updates
- The predominant frequencies have a 30-second and 60-second periodicity
Cause
- The frequency components stem from a fixed minimum BGP advertisement timer used by at least one router vendor
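
The definition of frequency as the inverse of inter-arrival time can be sketched as a simple histogram over update timestamps. The function name `interarrival_histogram`, the timestamp format (seconds as numbers), and the one-second bin width are assumptions for illustration.

```python
from collections import Counter


def interarrival_histogram(timestamps, bin_seconds=1.0):
    """Count inter-arrival gaps between consecutive updates, rounded to bins.

    timestamps: arrival times in seconds for one (prefix, peer) tuple.
    Returns a Counter mapping gap length in seconds to its number of occurrences.
    """
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return Counter(round(gap / bin_seconds) * bin_seconds for gap in gaps)
```

A peak at a gap of 30 seconds corresponds to a frequency of 1/30 Hz. Updates arriving at 0, 30, 60, 120 and 150 seconds, for example, produce three gaps of 30 seconds and one of 60.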

Prefix Length Statistics

Conclusions
- The volume of routing update messages decreased by an order of magnitude as a result of specific software changes on the majority of core Internet backbone routers
- The software changes successfully suppressed the generation of pathological withdrawals
- Proposed new software changes that may reduce instability levels by an additional thirty percent
- Instability is well distributed across both autonomous system and prefix space
- No single service provider or set of network destinations appears to be at fault