1 End-to-End Routing Behavior in the Internet Internet Routing Instability Presented by Carlos Flores Gaurav Jain May 31st CS 6390 Advanced Computer Networks Dr. Ravi Prakash
2 Topics of Presentation I. Introduction II. Routing Behavior in the Internet III. Routing Instability IV. Conclusions
3 Introduction Purpose of Studies Analyze the routing behavior in the Internet for pathological conditions, routing stability and routing symmetry for end-to-end measurements. Analyze BGP routing messages to examine Internet routing instability.
4 Routing Behavior Main questions * What pathologies and failures occur in routing? * Are routes stable or unstable? * Are routes symmetric or asymmetric? Terms AS - Autonomous System. A set of routers and hosts under a single administrative authority. BGP - Border Gateway Protocol. Protocol used to exchange routing information between ASes. Flapping - Frequent change of routes between ASes.
5 Methodology Number of Internet sites: 37 Tools: - Traceroute - NPD (Network Probe Daemon) - npd_control program. Time: D1 dataset - collected Nov - Dec ‘94. D2 dataset - collected Nov - Dec ‘95. Size: D Measurements. D2 - 37,097 Measurements.
6 Routing Pathologies 1) Routing loops: A) Forwarding Loops: Packets forwarded by a router return to that router. B) Information Loops: Router acts on connectivity info. derived from information it itself propagated earlier. C) Traceroute Loops: Measurement reports the same sequence of routers multiple times. Results: D traceroute loops (0.13%) D traceroute loops (0.16%) Loop duration: 1) < 3 hours 2) > half a day
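The traceroute-loop definition above (the same sequence of routers reported multiple times) can be checked mechanically. The following is a minimal sketch, not the tooling used in the study: `find_traceroute_loop`, the `min_repeats` threshold, and the router names are all hypothetical, and a trace is assumed to be a plain list of router names in hop order.

```python
from typing import List, Optional, Tuple


def find_traceroute_loop(
    hops: List[str], min_repeats: int = 2
) -> Optional[Tuple[int, List[str]]]:
    """Return (start_index, cycle) for the first router sequence that repeats
    back-to-back at least `min_repeats` times, or None if there is none."""
    if min_repeats < 2:
        raise ValueError("min_repeats must be at least 2")
    n = len(hops)
    for start in range(n):
        # Largest cycle that still fits `min_repeats` times after `start`.
        max_size = (n - start) // min_repeats
        # Assumption: cycles of a single router are skipped, so one router
        # answering two consecutive probes is not treated as a loop here.
        for size in range(2, max_size + 1):
            cycle = hops[start:start + size]
            if all(
                hops[start + r * size:start + (r + 1) * size] == cycle
                for r in range(1, min_repeats)
            ):
                return start, cycle
    return None


# Hypothetical trace: after the gateway, probes bounce between r1 and r2.
trace = ["gw", "r1", "r2", "r1", "r2", "r1", "r2"]
print(find_traceroute_loop(trace))
```

Raising `min_repeats` makes the check stricter, which helps separate a real loop from a route change that happened while the trace was running.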
7 Routing Pathologies 2) Erroneous Routing: D1 - 1 packet routed to Israel instead of London! Correct routing cannot be safely assumed. 3) Connectivity Altered Midstream Results: Routes lost or altered: D traces D traces Conclusion: Recovery time is bimodal: 1) <= 1 second 2) On the order of 1 minute.
8 Routing Pathologies 4) Fluttering: Rapidly oscillating routing. D2 - Very little fluttering observed. Problems: - Unstable network paths - Can occur in only one direction (asymmetry) - Round-trip time difficult to estimate. Advantages: Balances network load. 5) Infrastructure failure: “host unreachable” reported deep inside the network. Results:D % Availability D % Availability
9 Routing Pathologies 6) Unreachable due to too many hops. * Hop count is not always proportional to geographic distance: A) End-to-end route of 1500 km: 3 hops. B) End-to-end route of 3 km: 11 hops. * Operational diameter of the Internet has grown beyond the default value of 30 hops. * Larger initial TTL value needed.
10 Routing Pathologies 7) Temporary Outages. Sequence of consecutive traceroute packets lost.
11 Routing Pathologies 8) Time of day patterns. Temporary outages D2 - Minimum: 0.4%. Outages between 01: :00 hrs. Maximum: 8.0%. Outages between 15: :00 hrs. Infrastructure failure Minimum: 1.2%. 09: :00 hrs. Maximum: 9.3%. 15: :00 hrs.
12 Routing Pathologies Summary
Pathology               Probability       Trend   Notes
Persistent loops        0.13 – 0.16%              Some lasted hours.
Erroneous routing       %                         No instances in D2
Mid-stream change       0.16% | 0.44%     Worse   Rapidly varying routes
Infrastructure failure  0.21% | 0.48%     Worse   No dominant link
Outage >= 30 secs       0.96% | 2.2%      Worse   Duration exponentially distributed
Total pathologies       1.5% | 3.3%       Worse
13 Routing Symmetry Goal: Assess the degree to which routes are symmetric or asymmetric. Effects of network asymmetries: Complicate network measurements, troubleshooting, accounting and routers’ anticipatory flow state. Sources: - Asymmetric link costs (bandwidth, payment scheme). - Configuration errors and inconsistencies. - “Hot potato” and “cold potato” routing.
14 Routing Symmetry Analysis D2: 49% of measurements showed an asymmetric path visiting at least one different city. Size of asymmetries: - Majority of asymmetries confined to a single hop (only one city or AS different).
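One way to picture the symmetry check is to compare the cities (or ASes) seen in each direction of a virtual path. This is a minimal sketch under stated assumptions: the function names and city lists are hypothetical, each path is assumed to be already reduced to city or AS names, and the comparison ignores the order in which stops are visited.

```python
from typing import Sequence, Set


def differing_stops(forward: Sequence[str], reverse: Sequence[str]) -> Set[str]:
    """Cities (or ASes) visited in one direction but not in the other."""
    return set(forward) ^ set(reverse)


def is_asymmetric(forward: Sequence[str], reverse: Sequence[str]) -> bool:
    """A path counts as asymmetric if at least one stop differs."""
    return bool(differing_stops(forward, reverse))


# Hypothetical pair of measurements for the same two endpoints.
forward = ["Dallas", "Chicago", "New York"]
reverse = ["New York", "Washington", "Dallas"]
print(is_asymmetric(forward, reverse), sorted(differing_stops(forward, reverse)))
```

The size of the returned set gives a rough measure of how large the asymmetry is, in the spirit of the "confined to a single hop" observation.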
15 End-to-End Routing Stability Objective Do routes change often or are routes stable over time? * Views of routing stability: A) Prevalence – Likelihood of observing the same route in the future. B) Persistence – How long a route will remain the same. * Levels of route granularity: - Internet granularity (host granularity) - City granularity - AS granularity
16 End-to-End Routing Stability * Routing Prevalence - Host granularity: For half of virtual paths measured, same route observed 82% or more of the time. Internet paths strongly dominated by a single route. -City granularity: 97% -AS granularity: 100% * Internet paths very strongly dominated by same set of cities and same AS’s, but significant site-to-site variation.
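Prevalence at different granularities can be illustrated with a small calculation: count how often the most common route appears, then repeat after collapsing routers into cities or ASes. This is a rough sketch, not the estimator used in the study; `collapse`, `dominant_prevalence`, the router names, and the `city_of` mapping are hypothetical.

```python
from collections import Counter
from itertools import groupby
from typing import Dict, Iterable, Sequence, Tuple


def collapse(route: Sequence[str], location_of: Dict[str, str]) -> Tuple[str, ...]:
    """Map each router to its city or AS and merge consecutive repeats."""
    return tuple(key for key, _ in groupby(location_of[hop] for hop in route))


def dominant_prevalence(
    routes: Iterable[Sequence[str]],
) -> Tuple[Tuple[str, ...], float]:
    """Return the most frequently observed route and its share of observations."""
    counts = Counter(tuple(route) for route in routes)
    route, hits = counts.most_common(1)[0]
    return route, hits / sum(counts.values())


# Hypothetical observations of one virtual path at host granularity.
observed = [("a1", "b1", "c1")] * 4 + [("a1", "b2", "c1")]
city_of = {"a1": "A", "b1": "B", "b2": "B", "c1": "C"}

print(dominant_prevalence(observed))
print(dominant_prevalence(collapse(r, city_of) for r in observed))
```

Because routers `b1` and `b2` sit in the same hypothetical city, the two host-level routes merge into one at city granularity, which is why prevalence rises as granularity gets coarser.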
17 End-to-End Routing Stability * Routing Persistence How long is a route likely to endure before changing? Rapid Route Alternation: No high-frequency routing oscillation for measurements less than 1 hour apart. Medium Scale Route Alternation: Observations of virtual paths spaced 1 hour apart are not likely to show a route change. Large Scale Route Alternation: 90% chance of observing a route with a duration of at least a week.
18 End-to-End Routing Stability * Summary of routing persistence: - Route changes occur over a wide range of time scales (seconds to days) - 2/3 of Internet paths have stable routes lasting from days to weeks.
Time Scale        %     Notes
10’s of minutes   9%    Mainly route changes inside the network
Hours             4%    Usually intra-network changes
6+ hours          19%   Intra-network changes
Days              68%   50% less than a week, 50% more than a week
19 Internet Routing Instability Analysis based on data collected from BGP routing messages (interdomain routing). What is Routing Instability? Rapid change of network reachability and topology information. Origins: Router configuration errors. Physical and data link problems. Software bugs.
20 Internet Routing Instability Effects: Increased packet loss. Delayed network convergence. Resource overhead (memory, CPU) within the Internet infrastructure. Terminology: Prefixes: Blocks of destination IP addresses. ASPATH: List of AS numbers in a particular route.
21 Internet Routing Instability Routing forms: 1.Announcements. 2.Withdrawals. Types of interdomain routing updates: Forwarding instability. Routing policy fluctuation. Redundant pathological updates. Instability: Forwarding Instability + Routing policy fluctuation.
22 Internet Routing Instability Methodology * Time of study: 9 months * Data: Logged BGP routing messages at 5 major U.S. network exchange points. * Purpose: - Analyze the BGP data in an attempt to characterize and understand the origins and operational impact of routing instability.
23 Internet Routing Instability Analysis of pathological routing information * Update categories: A = Announcement W = Withdrawal - WADiff: route withdrawn and replaced with an alternative route. - AADiff: route implicitly withdrawn and replaced by a preferred alternative path. - WADup: route explicitly withdrawn and then reannounced as reachable. - AADup: route implicitly withdrawn and replaced with a duplicate of original. - WWDup: repeated transmission of BGP withdrawals for a prefix currently unreachable.
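The five categories above amount to a small state machine per prefix: remember the last announced route and whether the prefix is currently withdrawn, then label each new update by comparing it with that state. The sketch below is illustrative only and is not the analysis software from the study. `UpdateClassifier`, its method names, and the tuple representation of a route are assumptions, and it tracks a single peer.

```python
from typing import Dict, Hashable, Optional, Tuple


class UpdateClassifier:
    """Label BGP updates for one peer using the slide's five categories."""

    def __init__(self) -> None:
        # prefix -> (last announced route, is the prefix currently withdrawn?)
        self._state: Dict[str, Tuple[Optional[Hashable], bool]] = {}

    def classify(
        self, kind: str, prefix: str, route: Optional[Hashable] = None
    ) -> Optional[str]:
        seen = prefix in self._state
        last_route, withdrawn = self._state.get(prefix, (None, True))

        if kind == "W":
            self._state[prefix] = (last_route, True)
            # Withdrawal of a prefix that is already withdrawn is redundant.
            # Assumption: the first update ever seen for a prefix is unlabeled.
            return "WWDup" if seen and withdrawn else None

        if kind != "A":
            raise ValueError("kind must be 'A' or 'W'")

        self._state[prefix] = (route, False)
        if last_route is None:
            return None  # first announcement, nothing to compare against
        same = route == last_route
        if withdrawn:
            return "WADup" if same else "WADiff"
        return "AADup" if same else "AADiff"


# Hypothetical update stream for one prefix; routes are AS-path tuples.
clf = UpdateClassifier()
stream = [
    ("A", (1, 2)), ("A", (1, 2)), ("A", (1, 3)), ("W", None),
    ("W", None), ("A", (1, 3)), ("W", None), ("A", (1, 4)),
]
print([clf.classify(kind, "192.0.2.0/24", route) for kind, route in stream])
```

Here a route is compared as a whole value. A fuller treatment would decide which attributes (ASPATH, next hop, and so on) make two announcements count as duplicates.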
24 Internet Routing Instability Analysis of pathological routing information * Update categories: [Figure: the update categories WADup, WWDup, AADiff, WADiff and AADup divided between “Instability” and “Pathological behavior”, with shares labeled 5% and 95%.]
25 Internet Routing Instability Results 1) BGP updates dominated by WWDup. 2) AADup and WADup consistently dominate the remaining categories. 3) Only a small portion of BGP updates contribute to AADiff and WADiff.
26 Internet Routing Instability Results * All pathological routing incidents caused by small service providers. * Some WWDups caused by a vendor’s router implementation decision. * Instability: AADiff + WADiff + WADup. * Trends: Peaks of updates in the afternoons. Little instability on weekends. * Routing instability closely related to bandwidth usage and packet loss.
27 Internet Routing Instability Results * Plot of time of day vs. number of updates --> bell-shaped curve (peak in the afternoon). * Weekends --> less instability. * Rigorous approach to identify instability frequency - peaks at 24 hrs. and 7 days. * Within a day, periodicity observed at 30 s. and 60 s. * NO SINGLE ROUTE DOMINATES INSTABILITY. * NO SINGLE AS DOMINATES INSTABILITY.
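Short-period patterns such as the 30 s and 60 s periodicity become visible when the gaps between consecutive updates are binned and counted. The sketch below shows the idea only and is not the method used in the study: `interarrival_histogram`, the 5-second bin width, and the sample timestamps are hypothetical.

```python
from collections import Counter
from typing import Iterable


def interarrival_histogram(timestamps: Iterable[float], bin_seconds: float = 5) -> Counter:
    """Count gaps between consecutive updates, keyed by the bin's lower edge."""
    ordered = sorted(timestamps)
    gaps = (later - earlier for earlier, later in zip(ordered, ordered[1:]))
    return Counter(int(gap // bin_seconds) * bin_seconds for gap in gaps)


# Hypothetical arrival times (in seconds) of updates for one prefix.
arrivals = [0, 30, 60, 90, 150, 152]
histogram = interarrival_histogram(arrivals)
print(histogram.most_common())
```

A bin that stands out far above its neighbors suggests a timer-driven process rather than updates triggered by independent network events.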
28 Internet Routing Instability Possible origins of routing pathologies * Stateless BGP implementations. * Each withdrawal induces some short-lived pathological network oscillation. * Oscillations due to misconfigured CSUs. * Jittered timer to coalesce multiple routing updates. * Unjittered timers in periodic message model. * Improper configuration of the interaction between interior gateway protocols and BGP.
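The difference between jittered and unjittered timers can be shown with a few lines of code. This is a minimal sketch, not a router implementation: `next_interval` and `firing_times` are hypothetical names, the 30-second base interval is an assumption chosen to match the periodicity reported earlier, and the 0.75 to 1.0 jitter factor is an assumption taken from the range the BGP specification suggests for its timers.

```python
import random
from typing import List, Optional


def next_interval(
    base_seconds: float = 30.0,
    jitter: bool = True,
    rng: Optional[random.Random] = None,
) -> float:
    """Seconds until the timer fires again, optionally shortened at random."""
    if not jitter:
        return base_seconds
    source = rng if rng is not None else random
    # Assumed jitter range: a uniform factor between 0.75 and 1.0.
    return base_seconds * source.uniform(0.75, 1.0)


def firing_times(count: int, **timer_options) -> List[float]:
    """Absolute firing times of `count` consecutive timer expirations."""
    now, times = 0.0, []
    for _ in range(count):
        now += next_interval(**timer_options)
        times.append(now)
    return times


# Unjittered routers started together fire at identical instants forever.
print(firing_times(3, jitter=False))
# Jittered routers drift apart; the seeds stand in for two different routers.
print(firing_times(3, rng=random.Random(1)))
print(firing_times(3, rng=random.Random(2)))
```

With jitter disabled every router that starts at the same moment keeps sending in lockstep, which is one way fixed-period bursts of updates can arise.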
29 Internet Routing Instability Results... * 99% of routing information is pathological (redundant) and may not reflect real network topological changes. * Although redundant updates are quickly discarded by routers, they consume router resources, and high rates of them (300 updates per second) can crash a router. * Forwarding instability highly present: * 3-10% of routes have 1 or more WADiff per day. * 5-20% of routes have 1 or more AADiff per day. * 10-50% of routes have 1 or more WADup per day.
30 Conclusions * No “typical” Internet site or path. * Likelihood of encountering a major routing pathology more than doubled from * Internet paths are heavily dominated by a prevalent route, but route persistence shows wide variation in time scale (seconds to days). * 2/3 of Internet paths have routes persisting for days or weeks.
31 Conclusions * Internet routing instability still poorly understood. * By 1995, half of virtual paths differed by at least one city between the two directions of the path. * How can we make it better?