Internet Routing (COS 598A) Today: Intradomain Topology Jennifer Rexford Tuesdays/Thursdays 11:00am-12:20pm
Outline Router architecture –Line cards –Switching fabric –Router processor Network topology –From hub-and-spoke to backbones –Customer connecting to providers Measuring the topology –Traceroute probes from many vantage points –Associating an IP address with an AS Discussion of the papers
What is a Router? A computer with… –Multiple interfaces –Implementing routing protocols –Packet forwarding Wide range of variations of routers –Small LinkSys device in a home network –Linux-based PC running router software –Million-dollar high-end routers with large chassis … and links –Serial line –Ethernet –Packet-over-SONET
Network Components Links –Fibers –Coaxial cable Interfaces –Ethernet card –Wireless card Switches/routers –Large router –Telephone switch
Inside a High-End Router [Figure: line cards, switching fabric, and processor]
Router Components: Line Cards Interfacing –Physical link –Switching fabric Packet handling –Buffer management –Link scheduling –Packet filtering (ACLs) –Packet forwarding (FIB) –Rate-limiting –Packet marking –Measurement [Figure: line card with receive and transmit paths, to/from the link and to/from the switch, and a FIB]
Router Components: Switching Fabric Deliver packet inside the router –From incoming interface to outgoing interface –A small network in and of itself Must operate very quickly –Multiple packets going to same outgoing interface –Switch scheduling to match inputs to outputs Implementation techniques –Bus, crossbar, interconnection network, … –Running at a faster speed (e.g., 2X) than links –Dividing variable-length packets into cells
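Switch scheduling can be pictured as a matching problem solved once per time slot: each output can accept at most one packet, so inputs that want the same output must take turns. The sketch below is a minimal greedy matcher, not the scheduler of any particular router; the function name `schedule_slot` and the port numbers are made up for illustration.

```python
def schedule_slot(requests):
    """Greedily match input ports to output ports for one time slot.

    requests maps an input port to the list of output ports for which it
    has packets queued. Each input and each output is used at most once.
    """
    matched = {}   # input port -> output port chosen for this slot
    taken = set()  # output ports already claimed in this slot
    for inp in sorted(requests):
        for out in requests[inp]:
            if out not in taken:
                matched[inp] = out
                taken.add(out)
                break
    return matched


# Three inputs all have traffic for output 2; input 1 also has traffic for 3.
print(schedule_slot({0: [2], 1: [2, 3], 2: [2]}))  # {0: 2, 1: 3}
```

Input 2 stays unmatched in this slot and must wait, which is one reason a fabric may run faster than the links it serves.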
Router Components: Router Processor So-called “Loopback” interface –IP address of the CPU on the router Control-plane software –Implementation of the routing protocols –Creation of forwarding table for the line cards Interface to network administrators –Command-line interface for configuration –Transmission of measurement statistics Handling of special data packets –Packets with IP options enabled –Packets with expired Time-To-Live field
Network Topology
Hub-and-Spoke Topology Single hub node –Common in enterprise networks –Main location and satellite sites –Simple design and trivial routing Problems –Single point of failure –Bandwidth limitations –High delay between sites –Costs to backhaul to hub
Simple Alternatives to Hub-and-Spoke Dual hub-and-spoke –Higher reliability –Higher cost –Good building block Levels of hierarchy –Reduce backhaul cost –Aggregate the bandwidth –Shorter site-to-site delay …
Backbone Networks Backbone networks –Multiple Points-of-Presence (PoPs) –Lots of communication between PoPs –Need to accommodate diverse traffic demands –Need to limit propagation delay
Abilene Internet2 Backbone
Points-of-Presence (PoPs) Inter-PoP links –Long distances –High bandwidth Intra-PoP links –Short cables between racks or floors –Aggregated bandwidth Links to other networks –Wide range of media and bandwidth Intra-PoP Other networks Inter-PoP
Deciding Where to Locate Nodes and Links Placing Points-of-Presence (PoPs) –Large population of potential customers –Other providers or exchange points –Cost and availability of real-estate –Mostly in major metropolitan areas Placing links between PoPs –Already fiber in the ground –Needed to limit propagation delay –Needed to handle the traffic load
Customer Connecting to a Provider [Figure: four ways for a customer to attach to a provider: 1 access link, 2 access links, 2 access routers, 2 access PoPs]
Multi-Homing: Two or More Providers Motivations for multi-homing –Extra reliability, survive single ISP failure –Financial leverage through competition –Better performance by selecting better path –Gaming the 95th-percentile billing model [Figure: customer connected to Provider 1 and Provider 2]
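To make the billing point concrete, here is a minimal sketch assuming the provider bills on the nearest-rank 95th percentile of periodic rate samples, so the top 5% of samples are ignored. The traffic numbers and the function name `percentile_95` are hypothetical.

```python
def percentile_95(samples_mbps):
    """Nearest-rank 95th percentile of a list of rate samples."""
    ordered = sorted(samples_mbps)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n), 1-based rank
    return ordered[rank - 1]


# Hypothetical month of 100 samples: 10 Mbps baseline plus a 100 Mbps burst
# that lasts 10 sample intervals, all sent through one provider.
one_provider = [10] * 90 + [100] * 10

# Same total traffic, but the burst is split so that each provider sees
# only 5 burst intervals, which fall inside its ignored top 5%.
provider_1 = [10] * 90 + [100] * 5 + [0] * 5
provider_2 = [0] * 95 + [100] * 5

print(percentile_95(one_provider))                          # 100
print(percentile_95(provider_1), percentile_95(provider_2)) # 10 0
```

With a single provider the burst sets the billed rate; spread across two providers, neither bill reflects it.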
Measuring the Topology
Motivation for Measuring the Topology Business analysis –Comparisons with competitors –Selecting a provider or peer Scientific curiosity –Treating data networks like an organism –Understand structure and evolution of Internet Input to research studies –Network design, routing protocols, … Interesting research problem in its own right –How to measure/infer the topology
Basic Idea: Measure from Many Angles [Figure: probes sent from Source 1 and Source 2]
Where to Get Sources and Destinations? Source machines –Get accounts in many places (good to have a lot of friends) –Use an infrastructure like PlanetLab (good to have friends who have lots of friends) –Use public traceroute servers (nicely) Destination addresses –Walk through the IP address space (one or a few IP addresses per prefix) –Learn destination prefixes from public BGP tables
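Choosing one (or a few) addresses per prefix can be sketched with Python's standard `ipaddress` module. The prefixes below are documentation ranges used as stand-ins for prefixes learned from a BGP table, and `probe_targets` is a hypothetical helper name.

```python
import ipaddress


def probe_targets(prefixes, per_prefix=1):
    """Return the first `per_prefix` usable host addresses of each prefix."""
    targets = []
    for text in prefixes:
        network = ipaddress.ip_network(text)
        # zip stops after per_prefix addresses, so large prefixes are not
        # enumerated in full.
        for _, address in zip(range(per_prefix), network.hosts()):
            targets.append(str(address))
    return targets


print(probe_targets(["192.0.2.0/24", "198.51.100.0/25"]))
# ['192.0.2.1', '198.51.100.1']
```

Picking the first host address is only one possible choice; a real study might prefer addresses known to respond.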
Traceroute: Measuring the Forwarding Path Time-To-Live field in IP packet header –Source sends a packet with a TTL of n –Each router along the path decrements the TTL –“Time exceeded” sent back to the source when TTL reaches 0 Traceroute tool exploits this TTL behavior –Send packets with TTL=1, 2, 3, … and record source of “time exceeded” message [Figure: probes with TTL=1 and TTL=2 sent from source toward destination, each answered with “time exceeded”]
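The TTL mechanism can be illustrated with a small simulation. This is not a real traceroute (which needs raw sockets and ICMP handling); the path is a made-up list of addresses, and `forward`, `traceroute`, and the `silent` parameter are hypothetical names.

```python
def forward(path, ttl, silent=()):
    """Simulate one probe along `path` (routers in order, destination last).

    Returns (responder, reached_destination). The responder is None when
    the router at which the TTL expires does not answer.
    """
    for router in path[:-1]:
        ttl -= 1  # each router decrements the TTL
        if ttl == 0:
            # The TTL expired here: the router reports "time exceeded",
            # unless it is one of the routers that stays silent.
            return (None if router in silent else router), False
    return path[-1], True


def traceroute(path, max_ttl=30, silent=()):
    """Send probes with TTL=1, 2, 3, ... and record who answers each one."""
    hops = []
    for ttl in range(1, max_ttl + 1):
        responder, done = forward(path, ttl, silent)
        hops.append((ttl, responder or "*"))
        if done:
            break
    return hops


path = ["10.0.0.1", "10.0.1.1", "10.0.2.1", "192.0.2.80"]
print(traceroute(path, silent={"10.0.1.1"}))
# [(1, '10.0.0.1'), (2, '*'), (3, '10.0.2.1'), (4, '192.0.2.80')]
```

The "*" entry shows how a router that does not send “time exceeded” leaves a gap in the measured path.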
Example Traceroute Output (Berkeley to CNN)
Columns: hop number, IP address, DNS name. DNS names in hop order:
–inr-daedalus-0.CS.Berkeley.EDU
–soda-cr-1-1-soda-br-6-2
–vlan242.inr-202-doecev.Berkeley.EDU
–gigE6-0-0.inr-666-doecev.Berkeley.EDU
–qsv-juniper--ucb-gw.calren2.net
–POS1-0.hsipaccess1.SanJose1.Level3.net
–? (no name resolution)
–pos8-0.hsa2.Atlanta2.Level3.net
–pop2-atm-P0-2.atdn.net
–? (no name resolution)
–pop1-atl-P4-0.atdn.net
–www4.cnn.com
A “* *” entry in the output marks no response from the router
Problems with Traceroute Missing responses –Routers might not send “Time-Exceeded” –Firewalls may drop the probe packets –“Time-Exceeded” reply may be dropped Misleading responses –Probes taken while the path is changing –Name not in DNS, or DNS entry misconfigured Mapping IP addresses –Mapping interfaces to a common router –Mapping interface/router to Autonomous System Angry operators who think this is an attack
Map Traceroute Hops to ASes Traceroute output: (hop number, IP) AS path: AS25 (Berkeley), AS11423 (Calren), AS3356 (Level3), AS1668 (AOL), AS5662 (CNN) Need accurate IP-to-AS mappings (for network equipment).
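Turning a hop list into an AS path amounts to looking up each hop and collapsing consecutive hops in the same AS. The sketch below uses the AS numbers from the slide, but the hop addresses and the `ip_to_as` table are invented, and skipping unmapped hops is a simplifying assumption.

```python
def as_path_from_hops(hop_ips, ip_to_as):
    """Collapse a traceroute hop list into an AS-level path."""
    path = []
    for ip in hop_ips:
        asn = ip_to_as.get(ip)  # None for "*" or for an unmapped address
        if asn is None:
            continue            # simplification: skip hops we cannot map
        if not path or path[-1] != asn:
            path.append(asn)    # record only changes of AS
    return path


# Hypothetical addresses mapped to the ASes named on the slide.
ip_to_as = {
    "10.1.0.1": 25, "10.1.0.2": 25,   # Berkeley
    "10.2.0.1": 11423,                # Calren
    "10.3.0.1": 3356, "10.3.0.2": 3356,  # Level3
    "10.4.0.1": 1668,                 # AOL
    "10.5.0.1": 5662,                 # CNN
}
hops = ["10.1.0.1", "10.1.0.2", "10.2.0.1", "10.3.0.1", "*",
        "10.3.0.2", "10.4.0.1", "10.5.0.1"]
print(as_path_from_hops(hops, ip_to_as))  # [25, 11423, 3356, 1668, 5662]
```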
Candidate Ways to Get IP-to-AS Mapping Routing address registry –Voluntary public registry such as whois.radb.net –Used by prtraceroute and “NANOG traceroute” –Incomplete and quite out-of-date (mergers, acquisitions, delegation to customers) Origin AS in BGP paths –Public BGP routing tables such as RouteViews –Used to translate traceroute data to an AS graph –Incomplete and inaccurate… but usually right (Multiple Origin ASes (MOAS), no mapping, wrong mapping)
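The origin-AS approach can be sketched as follows: take the last AS on each BGP path as the origin of the prefix, then map an address to the origin of its longest matching prefix. The routes below are invented (documentation prefixes and documentation AS numbers), and `build_ip_to_as` and `lookup` are hypothetical names. The linear scan is for clarity only.

```python
import ipaddress


def build_ip_to_as(bgp_routes):
    """bgp_routes: iterable of (prefix, as_path). Returns {network: origins}."""
    origins = {}
    for prefix, as_path in bgp_routes:
        network = ipaddress.ip_network(prefix)
        # The origin AS is the last AS on the path.
        origins.setdefault(network, set()).add(as_path[-1])
    return origins


def lookup(origins, address):
    """Origin AS set of the longest prefix covering `address` (empty if none)."""
    ip = ipaddress.ip_address(address)
    matches = [net for net in origins if ip in net]
    if not matches:
        return set()  # the "no mapping" case
    best = max(matches, key=lambda net: net.prefixlen)
    return origins[best]


routes = [
    ("192.0.2.0/24", [64500, 64496]),
    ("192.0.2.128/25", [64500, 64497]),    # more specific prefix
    ("198.51.100.0/24", [64500, 64498]),
    ("198.51.100.0/24", [64501, 64499]),   # second origin: a MOAS prefix
]
table = build_ip_to_as(routes)
print(lookup(table, "192.0.2.200"))   # {64497}
print(lookup(table, "203.0.113.1"))   # set()
```

A lookup that returns more than one AS corresponds to the MOAS case, and an empty set to the "no mapping" case.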
Example: BGP Table (“show ip bgp” at RouteViews) [Table: candidate routes for two prefixes; columns Network, Next Hop, Metric, LocPrf, Weight, Path; “*>” marks the best route] AS 80 is General Electric, AS 701 is UUNET, AS 7018 is AT&T, AS 3786 is DACOM (Korea), AS 1221 is Telstra
Refining Initial IP-to-AS Mapping Start with initial IP-to-AS mapping –Mapping from BGP tables is usually correct –Good starting point for computing the mapping Collect many BGP and traceroute paths –Signaling and forwarding AS path usually match –Good way to identify mistakes in IP-to-AS map Successively refine the IP-to-AS mapping –Find add/change/delete that makes big difference –Base these “edits” on operational realities
Extra AS due to Internet eXchange Points IXP: shared place where providers meet –E.g., Mae-East, Mae-West, PAIX –Large number of fan-in and fan-out ASes Ignore extra traceroute AS hop with high fan-in and fan-out [Figure: traceroute AS path vs. BGP AS path among ASes A through G]
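A rough sketch of the fan-in/fan-out idea: for every AS that shows up in a traceroute AS path but not in the matching BGP AS path, count its distinct neighbors on each side, and treat ASes with many of both as likely IXPs. The paths, the AS name "X", and the threshold of 2 are all invented for illustration; a real threshold would be much larger.

```python
from collections import defaultdict


def likely_ixps(pairs, threshold=2):
    """pairs: list of (traceroute_as_path, bgp_as_path)."""
    fan_in = defaultdict(set)   # extra AS -> ASes seen just before it
    fan_out = defaultdict(set)  # extra AS -> ASes seen just after it
    for trace, bgp in pairs:
        for i in range(1, len(trace) - 1):
            asn = trace[i]
            if asn not in bgp:  # extra hop seen only by traceroute
                fan_in[asn].add(trace[i - 1])
                fan_out[asn].add(trace[i + 1])
    return {asn for asn in fan_in
            if len(fan_in[asn]) >= threshold
            and len(fan_out[asn]) >= threshold}


def drop_ixps(trace, ixps):
    """Remove likely IXP hops from a traceroute AS path."""
    return [asn for asn in trace if asn not in ixps]


pairs = [
    (["A", "X", "F"], ["A", "F"]),
    (["B", "X", "G"], ["B", "G"]),
    (["C", "D", "E"], ["C", "E"]),
]
ixps = likely_ixps(pairs)
print(ixps)                               # {'X'}
print(drop_ixps(["A", "X", "F"], ixps))   # ['A', 'F']
```

Here D is also an extra hop, but with a single neighbor on each side it is not flagged.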
Extra AS due to Sibling ASes Sibling: one organization with multiple ASes –E.g., Sprint AS 1239 and AS 1791 –One AS numbers its equipment with addresses of another Merge sibling ASes that “belong together” and treat them as one AS [Figure: traceroute AS path vs. BGP AS path among ASes A through H]
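Merging siblings can be sketched as rewriting every AS to a representative of its sibling group and then collapsing repeats. The Sprint pair comes from the slide; the example paths and the choice of the smallest AS number as representative are assumptions made for illustration.

```python
def merge_siblings(path, sibling_groups):
    """Rewrite sibling ASes to one representative and collapse repeats."""
    canonical = {}
    for group in sibling_groups:
        representative = min(group)  # arbitrary but deterministic choice
        for asn in group:
            canonical[asn] = representative
    merged = []
    for asn in path:
        asn = canonical.get(asn, asn)
        if not merged or merged[-1] != asn:
            merged.append(asn)
    return merged


siblings = [{1239, 1791}]              # Sprint, per the slide
traceroute_path = [7018, 1791, 1239, 3356]  # hypothetical path
bgp_path = [7018, 1239, 3356]               # hypothetical path
print(merge_siblings(traceroute_path, siblings))  # [7018, 1239, 3356]
print(merge_siblings(traceroute_path, siblings)
      == merge_siblings(bgp_path, siblings))      # True
```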
Unannounced Infrastructure Addresses C does not announce part of its address space in BGP (e.g., a /24) Fix the IP-to-AS map to associate the /24 with C [Figure: ASes A, B, and C, with a /24 and a /8 address block]
Improving the IP-to-AS Mapping Algorithm for modifying the IP-to-AS map –Small number of rules for modifying the map –Making small changes that make a big difference Results of the algorithm –Changes about 2.9% of mappings –Much better agreement (95%) with BGP AS paths Validation –AT&T router configuration data –Whois queries to verify sibling ASes –List of known Internet eXchange Points
Exploring the Remaining Mismatches Route aggregation –Traceroute AS path longer in 20% of mismatches –Different paths for destinations in same prefix –Example: BGP path B C, traceroute path B C D Interface numbering at AS boundaries –Boundary links numbered from one AS –Verified cases where AT&T (AS 7018) is involved –Example: BGP path B C D, traceroute path B D
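One simple way to sort mismatches into the two patterns above is to check whether one path is a subsequence of the other. This is a sketch under that assumption, not the procedure used in the study; the labels and function names are made up.

```python
def is_subsequence(short, long):
    """True if `short` can be obtained from `long` by deleting elements."""
    remaining = iter(long)
    return all(item in remaining for item in short)


def classify(trace, bgp):
    """Compare a traceroute AS path with the BGP AS path."""
    if trace == bgp:
        return "match"
    if len(trace) > len(bgp) and is_subsequence(bgp, trace):
        return "extra AS hop"    # traceroute path is longer
    if len(trace) < len(bgp) and is_subsequence(trace, bgp):
        return "missing AS hop"  # traceroute path is shorter
    return "other"


print(classify(["B", "C", "D"], ["B", "C"]))  # extra AS hop
print(classify(["B", "D"], ["B", "C", "D"]))  # missing AS hop
```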
Discussion of the Two Papers Measuring ISP topologies with RocketFuel –Measure judiciously –First view of ISP topologies –PoP structure, inter-PoP graphs, peering, … –Good? Bad? What areas for future work? First-principles of router-level topology –Explain the high variability in router degree –Technological limits on switching capacity –Many low-speed links at edge, few large in core –High variability at edge due to economics –Good? Bad? What areas for future work?
Some Project Ideas Accuracy of router-level mapping –Apply traceroute to map out the Abilene network –Use PlanetLab nodes for many vantage points –Verify against the actual topology of the network Influence of inaccuracy in router-level maps –Characterize the types of inaccuracy that arise –Determine the influence on key graph metrics –Identify ways to limit the effects of inaccuracy Design better router support for measurement –To support topology discovery, troubleshooting, … –Be cognizant of need to be efficient, not used for attacks, not reveal too-sensitive information, etc.
Reading for Thursday: AS-Level Topology Two papers, and one video –“Toward capturing representative AS-level Internet topologies” –“Interconnection, peering, and settlements” –NANOG video on evolution of Internet peering One-page review of first paper (hard-copy) –Brief summary of the paper –Reasons to accept the paper –Reasons to reject the paper –Three suggestions for future research directions Optional reading –Should computer scientists experiment more?