Update Damping in BGP Geoff Huston Chief Scientist, APNIC.

Slides:



Advertisements
Similar presentations
BGP01 An Examination of the Internets BGP Table Behaviour in 2001 Geoff Huston Telstra.
Advertisements

Allocation vs Announcement A comparison of RIR IPv4 Allocation Records with Global Routing Announcements Geoff Huston February 2004 (Supported by APNIC)
Measuring IPv6 Deployment Geoff Huston George Michaelson
Routing Table Status Report Geoff Huston February 2005 APNIC.
Routing Table Status Report November 2005 Geoff Huston APNIC.
BGP Status Update Geoff Huston September What Happening (AS4637) Date.
Routing Convergence and the Impact of Scale Dan Massey Colorado State University.
Understanding the Impact of Route Reflection in Internal BGP Ph.D. Final Defense presented by Jong Han (Jonathan) Park July 15 th,
Comparing IPv4 and IPv6 from the perspective of BGP dynamic activity Geoff Huston APNIC February 2012.
Delayed Internet Routing Convergence due to Flap Dampening Z. Morley Mao Ramesh Govindan, Randy Katz, George Varghese
BGP trends of an AS Looking under the hood and diagnosing the noise. Stephan Millet Network Engineer Telstra.
Part IV: BGP Routing Instability. March 8, BGP routing updates  Route updates at prefix level  No activity in “steady state”  Routing messages.
© J. Liebeherr, All rights reserved 1 Border Gateway Protocol This lecture is largely based on a BGP tutorial by T. Griffin from AT&T Research.
2006 – (Almost another) BGP Year in Review A BRIEF update to the 2005 report 18 October 2006 IAB Routing Workshop Geoff Huston APNIC.
1 BGP Anomaly Detection in an ISP Jian Wu (U. Michigan) Z. Morley Mao (U. Michigan) Jennifer Rexford (Princeton) Jia Wang (AT&T Labs)
1 Interdomain Routing Protocols. 2 Autonomous Systems An autonomous system (AS) is a region of the Internet that is administered by a single entity and.
1 Measurement of Highly Active Prefixes in BGP Ricardo V. Oliveira, Rafit Izhak-Ratzin, Beichuan Zhang, Lixia Zhang GLOBECOM’05.
BGP update profiles and the implications for secure BGP update validation processing Geoff Huston Swinburne University of Technology PAM April 2007.
BGP in 2009 Geoff Huston APNIC May Conventional BGP Wisdom IAB Workshop on Inter-Domain routing in October 2006 – RFC 4984: “routing scalability.
An Operational Perspective on BGP Security Geoff Huston GROW WG IETF 63 August 2005.
Delayed Internet Routing Convergence Craig Labovitz, Abha Ahuja, Abhijit Bose, Farham Jahanian Presented By Harpal Singh Bassali.
Bgpmon real-time collection and distribution of BGP updates Dave Matthews, Yan Chen, Dan Massey Department of Computer Science Colorado State University.
Allocations vs Announcements A comparison of RIR IPv4 Allocation Records with Global Routing Announcements Geoff Huston May 2004 (Activity supported by.
CRIO: Scaling IP Routing with the Core Router-Integrated Overlay Xinyang (Joy) Zhang Paul Francis Jia Wang Kaoru Yoshida.
Advertising Equal Cost Multi-Path Routes in BGP Manav Bhatia Samsung India Software Operations, Bangalore – India July 17, th IETF - Vienna draft-ecmp-routes-in-bgp-00.txt.
Interconnectivity Density Compare number of AS’s to average AS path length A uniform density model would predict an increasing AS Path length (“Radius”)
IPv4 Address Lifetime Expectancy Revisited - Revisited Geoff Huston November 2003 Presentation to the IEPG Research activity supported by APNIC The Regional.
CAIA Seminar – 18 August 2007 – Taming BGP An incremental approach to improving the dynamic properties of BGP Geoff Huston.
BGP in 2011 Geoff Huston APNIC. Conventional (Historical) BGP Wisdom IAB Workshop on Inter-Domain routing in October 2006 – RFC 4984: “routing scalability.
© 2001, Cisco Systems, Inc. A_BGP_Confed BGP Confederations.
Copyright 2012 Kenneth M. Chipps Ph.D. Cisco CCNA Exploration CCNA 2 Routing Protocols and Concepts BGP Last Update
Inter-Domain Routing Trends Geoff Huston APNIC March 2007.
Border Gateway Protocol (BGP) W.lilakiatsakun. BGP Basics (1) BGP is the protocol which is used to make core routing decisions on the Internet It involves.
1 Quantifying Path Exploration in the Internet Ricardo Oliveira, Rafit Izhak-Ratzin, Lixia Zhang, UCLA Beichuan Zhang, UArizona Dan Pei, AT&T Labs -- Research.
By, Matt Guidry Yashas Shankar.  Analyze BGP beacons which are announced and withdrawn, usually within two hour intervals.  The withdraws have an effect.
02/01/2006USC/ISI1 Updates on Routing Experiments Cyber DEfense Technology Experimental Research (DETER) Network Evaluation Methods for Internet Security.
Routing Table Status Report Geoff Huston November 2004 APNIC.
AS Numbers NANOG 35 Geoff Huston APNIC. Current AS Number Status.
1 Auto-Detecting Hijacked Prefixes? Routing SIG 7 Sep 2005 APNIC20, Hanoi, Vietnam Geoff Huston.
© 2005 Cisco Systems, Inc. All rights reserved. BGP v3.2—7-1 Optimizing BGP Scalability Using BGP Route Dampening.
© 2005 Cisco Systems, Inc. All rights reserved. BGP v3.2—7-1 Optimizing BGP Scalability Improving BGP Convergence.
Routing Table Status Report Geoff Huston August 2004 APNIC.
Routing Table Status Report August 2005 Geoff Huston.
Tracking the Internet’s BGP Table Geoff Huston Telstra December 2000.
1 Investigating occurrence of duplicate updates in BGP announcements Jong Han Park 1, Dan Jen 1, Mohit Lad 2, Shane Amante 3, Danny McPherson 4, Lixia.
Auto-Detecting Hijacked Prefixes?
Auto-Detecting Hijacked Prefixes?
More Specific Announcements in BGP
Jian Wu (University of Michigan)
Routing 2015 Scaling BGP Geoff Huston APNIC May 2016.
Border Gateway Protocol
Measuring BGP Geoff Huston.
COS 561: Advanced Computer Networks
Interdomain routing V. Arun
BGP update profiles and the implications for secure BGP update validation processing Geoff Huston PAM April 2007.
Routing and Addressing in 2017
More Specific Announcements in BGP
Geoff Huston APNIC 7th caida/wide measurement workshop Nov
COS 561: Advanced Computer Networks
Geoff Huston September 2002
COS 561: Advanced Computer Networks
COS 561: Advanced Computer Networks
Use of Simplex Satellite Configurations to support Internet Traffic
BGP Interactions Jennifer Rexford
2005 – A BGP Year in Review February 2006 Geoff Huston
Routing Experiments Chen-Nee Chuah, Sonia Fahmy, Denys Ma,
BGP Instability Jennifer Rexford
Computer Networks Protocols
BGP: 2008 Geoff Huston APNIC.
Geoff Huston APNIC 7th caida/wide measurement workshop Nov
Presentation transcript:

Update Damping in BGP Geoff Huston Chief Scientist, APNIC

BGP Growth: Table Size

BGP Growth: Updates (05 – 06)

Limits to Growth? Are there practical limits to the size of the routed network ? limits to routing database size ? limits in routing update processing load ? practical bounds for time to reach converged routing states ?

Current Understandings The protocol message peak rate is increasing faster than the number of routed entries BGP is a chatty protocol Dense interconnection implies higher levels of path exploration to stabilize on best available paths Some concern that BGP has some practical limits in terms of size and convergence times within the bounds of currently deployed routing machinery Some further concern that these limits may be achieved in the near term future

Profiling BGP Load Use a BGP monitor connected to DFZ update feeds Quagga Log all updates Process logs and generate daily profile

Update Distribution by Prefix

Update Distribution by Origin AS

Previous Analysis of BGP Update Profile Update load profile and convergence times do not appear to be precisely aligned to routing table size The BGP load profile is heavily skewed, with a small number of route objects, and a small number of origin ASs, contributing a disproportionate amount to the routing update load Background load appears to be heavily related to close-to-collector routing events that affect large numbers of routed objects Intense load appears to be related to close-to-origin routing events that affect small numbers of routed objects with each event As the network grows the highly active component of route load does not appear to grow proportionally

Whats the cause here? BGP Updates recorded at AS2.0, June 28 – July 12 AS21452

Whats the cause here? BGP Updates recorded at AS2.0, June 28 – July 12 AS21452 This daily cycle of updates with a weekend profile is a characteristic signature of the origin AS performing some form of load-based routing

Poor Traffic Engineering? An increasing trend to multi-home an AS with multiple transit providers Spread traffic across the multiple transit paths by selectively altering advertisements The use of load monitors and BGP control systems to automate the process Poor tuning (or no tuning!) of the automated traffic engineering process produces extremely unstable BGP outcomes! AS1 AS2 AS3

BGP Update Load Profile It appears that the majority of the BGP load is caused by a very small number of unstable origination configurations, possibly driven by automated systems with limited or no feedback control This problem is getting larger over time The related protocol update load consumes routing resources, but does not change the base information state – it generally oscillates across a small set of states that do not imply local forwarding change

Mitigating BGP Update Loads Current set of deployed tools to mitigate BGP update overheads: 1. Minimum Route Advertisement Interval Timer (MRAI) 2. Withdrawal MRAI Timer 3. Route Flap Damping 4. Output Queue Compression

1. MRAI Timer Optional timer in BGP ON in Ciscos (30 seconds) OFF in Junipers (0 secconds) Suppress the advertisement of successive updates to a peer for a given prefix until the timer expires Commonly implemented as suppress ALL updates to a peer until a per-peer MRAI timer expires

2. Withdrawal MRAI TIMER Variant on MRAI where withdrawals are also time limited in the same way as updates

3. Route Flap Damping RFD attempts to apply a heuristic to identify noisy prefixes and apply a longer term suppression to update propagation Uses the concept of a penalty score applied to a prefix learned from a peer Each update and withdrawal adds to the score The score decays exponentially over time If the score exceeds a suppress threshold the route is damped Damping remains in place until the score drops below the release threshold Damping is applied to the adj-rib-in

4. Output Queue Compression BGP is a rate-throttled protocol (due to TCP transport) A process-loaded BGP peer applies back pressure to the other side of the BGP session by shutting down the advertised TCP recv window The local BGP process may then perform queue compression on the output queue for that peer, removing queued updates that refer to the same prefix Apply queue compression when this queue forms Close TCP window when this queue forms

Some Observations RFD – long term suppression Route Flap damping extends convergence times by hours with no real benefit offset MRAI – short term suppression MRAI variations in the network make path exploration noisier Even with piecemeal MRAI deployment we still have a significant routing load attributable to Path Exploration Output Queue Compression Rarely triggered in todays network!

CodeDescription AA+Announcement of an already announced prefix with a longer AS Path (update to longer path) AA-Announcement of an announced prefix with a shorter AS Path (update to shorter path) AA0Announcement of an announced prefix with a different path of the same length (update to a different AS Path of same length) AA*Announcement of an announced prefix with the same path but different attributes (update of attributes) AAAnnouncement of an announced prefix with no change in path or attributes (possible BGP error or data collection error) WA+Announcement of a withdrawn prefix, with longer AS Path WA-Announcement of a withdrawn prefix, with shorter AS Path WA 0 Announcement of a withdrawn prefix, with different AS Path of the same length WA*Announcement of a withdrawn prefix with the same AS Path, but different attributes WAAnnouncement of a withdrawn prefix with the same AS Path and same attributes AWWithdrawal of an announced prefix WWWithdrawal of a withdrawn prefix (possible BGP error or a data collection error) BGP Update Types Announced-to-Announced Updates Withdrawn-to-Announced Updates Announced-to-Withdrawn Withdrawn-to-Withdrawn

CodeCount AA+607,093 AA-555,609 AA0594,029 AA*782,404 AA195,707 WA+238,141 WA-190,328 WA 0 51,780 WA*30,797 WA77,440 AW627,538 WW0 BGP Path Exploration? April 2007 BGP Update Profile Totals of each type of prefix updates, using a recording of all BGP updates as heard by AS2.0 for the month of April 2007

BGP Update Profile Path Exploration Candidates

Time Distribution of Updates 24 hour cycles? Elapsed time between received updates for the same prefix - hours

Time Distribution of Updates Route Flap Damping? Elapsed time between received updates for the same prefix - minutes

Time Distribution of Updates MRAI Timer Elapsed time between received updates for the same prefix - seconds

Update Sequence Length Distribution A sequence is a set of updates for the same prefix that are separated by an interval <= the sequence timer (35 seconds)

Path Exploration Damping (PED) A prevalent form of path hunting is the update sequence of increasing AS path length, followed by a withdrawal, all closely coupled in time {AA+, AA0, AA} *, AW The AA+, AA0 and AA updates are intermediate noise updates in this case representing transient routing states Can these updates be locally suppressed for a short interval to see if they are path of a BGP Path Exploration activity? The suppression would hold the update in the local output queue for a fixed time interval (in which case the update is released) or the update is further updated by queuing a subsequent update (or withdrawal) for the same prefix

Path Exploration Damping Apply a 35 second MRAI timer to AA+, AA0 and AA updates queued to eBGP peers No MRAI timer applied to all other updates and all withdrawals 35 seconds is used to compensate for MRAI- filtered update sequences that use 30 second interval

PED Results on BGP data Path Exploration Damping applied to BGP updates recorded at AS2.0, June 28 – July 12

PED Results on BGP data Path Exploration Damping applied to BGP updates recorded at AS2.0, June 28 – July 12

PED Results on BGP data 21% of all updates in the collection period wouldve been eliminated by Path Exploration Damping Average update rate for the month would fall from 1.60 prefix updates per second to 1.22 prefix updates per second Average peak update rates fall from 355 to 290 updates per second

Summary Much of the update processing load in BGP is in processing non-informative intermediate states caused by BGP Path Exploration Existing approaches to suppress this processing load appear to be too coarse to be very effective Some significant leverage in further reducing BGP peak load rates can be obtained by applying a more selective algorithm to the MRAI approach in BGP, attempting to isolate Path Exploration updates by the use of local heuristics

Thank You Questions?