Rensselaer Polytechnic Institute
1 Today’s Big Picture
[Figure: large ISPs, small ISPs, dial-up ISPs, access networks, and stub networks]
A large number of diverse networks.
2 Internet AS Map
[Figure: Internet AS map; source: caida.org]
3 Autonomous System (AS)
- The Internet is not a single network
- It is a collection of networks controlled by different administrations
- An autonomous system is a network under a single administrative control
- An AS owns an IP prefix
- Every AS has a unique AS number
- ASes need to inter-network themselves to form a single virtual global network
- This requires a common protocol for communication
4 Who Speaks Inter-AS Routing?
[Figure: AS1 and AS2 with routers R1, R2, R3; legend: border router, internal router; BGP between the ASes]
- Two types of routers: border routers (edge) and internal routers (core)
- Two border routers of different ASes will have a BGP session
5 Intra-AS vs Inter-AS
- An AS is a routing domain
- Within an AS:
  - Can run a link-state routing protocol
  - Routers trust other routers
  - Scale of the network is relatively small
- Between ASes:
  - Lack of information about other ASes’ networks (link-state not possible)
  - Crossing trust boundaries
  - A link-state protocol will not scale
  - Routing protocol based on route propagation
6 Autonomous Systems (ASes)
An autonomous system is an autonomous routing domain that has been assigned an Autonomous System Number (ASN). All parts within an AS remain connected.
RFC 1930 (Guidelines for creation, selection, and registration of an Autonomous System): “… the administration of an AS appears to other ASes to have a single coherent interior routing plan and presents a consistent picture of what networks are reachable through it.”
7 IP Address Allocation and Assignment: Internet Registries
[Figure: IANA at the top; ARIN, APNIC, and RIPE below it]
- ARIN, APNIC, and RIPE allocate to national and local registries and ISPs
- Addresses are assigned to customers by ISPs
- RFC: Internet Registry IP Allocation Guidelines
- RFC: Address Allocation for Private Internets
- RFC: An Architecture for IP Address Allocation with CIDR
8 AS Numbers (ASNs)
- ASNs are 16-bit values
- … through … are “private”
- Examples:
  - Genuity: 1
  - MIT: 3
  - Harvard: 11
  - UC San Diego: 7377
  - AT&T: 7018, 6341, 5074, …
  - UUNET: 701, 702, 284, 12199, …
  - Sprint: 1239, 1240, 6211, 6242, …
  - …
- ASNs represent units of routing policy
- Currently over 11,000 in use
9 Nontransit vs. Transit ASes
[Figure: NET A connected to ISP 1 and ISP 2]
- A nontransit AS might be a corporate or campus network; it could be a “content provider”
- Traffic NEVER flows from ISP 1 through NET A to ISP 2
- Internet service providers (ISPs) have transit networks
10 Selective Transit
[Figure: NET A connected to NET B, NET C, and NET D]
- NET A provides transit between NET B and NET C, and between NET D and NET C
- NET A DOES NOT provide transit between NET D and NET B
- Most transit ASes allow only selective transit: a key impact of commercialization
11 Customers and Providers
[Figure: providers and customers, with IP traffic between them]
- Customer pays provider for access to the Internet
12 Customer-Provider Hierarchy
[Figure: hierarchy of providers and customers; IP traffic]
13 The Peering Relationship
[Figure: peer, customer, and provider links; legend: traffic allowed, traffic NOT allowed]
- Peers provide transit between their respective customers
- Peers do not provide transit between peers
- Peers (often) do not exchange $$$
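The transit rules on the customer/provider and peering slides can be written as a small export check. This is a minimal sketch, not taken from the slides: `Rel` and `should_export` are hypothetical names, and treating provider-learned routes the same way as peer-learned routes is an assumption based on the usual customer/provider convention.

```python
from enum import Enum


class Rel(Enum):
    """Business relationship of a neighbor, seen from our own AS."""
    CUSTOMER = "customer"
    PEER = "peer"
    PROVIDER = "provider"


def should_export(learned_from: Rel, send_to: Rel) -> bool:
    """Decide whether a route learned from one neighbor is exported to another.

    Customer routes are exported to everyone (the customer pays for reachability).
    Peer and provider routes are exported only to customers, so the AS never
    provides transit between two peers or two providers (assumed convention).
    """
    if learned_from is Rel.CUSTOMER:
        return True
    return send_to is Rel.CUSTOMER


if __name__ == "__main__":
    for src in Rel:
        for dst in Rel:
            print(f"learned from {src.value:8} -> export to {dst.value:8}: "
                  f"{should_export(src, dst)}")
```

A route only attracts traffic if it is exported, so withholding a peer-learned route from another peer is what enforces "no transit between peers".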
14 BGP-4
- BGP = Border Gateway Protocol
- Is a policy-based routing protocol
- Is the de facto EGP of today’s global Internet
- Relatively simple protocol, but configuration is complex and the entire world can see, and be impacted by, your mistakes
- … : BGP-1 [RFC 1105], replacement for EGP (1984, RFC 904)
- 1990 : BGP-2 [RFC 1163]
- 1991 : BGP-3 [RFC 1267]
- 1995 : BGP-4 [RFC 1771], support for Classless Interdomain Routing (CIDR)
15 BGP Operations (Simplified)
[Figure: BGP session between AS1 and AS2]
- Establish session on TCP port 179
- Exchange all active routes
- Exchange incremental updates
- While the connection is ALIVE, exchange route UPDATE messages
16 Four Types of BGP Messages
- Open: establishes a peering session
- Keep Alive: handshake at regular intervals
- Notification: shuts down a peering session
- Update: announces new routes or withdraws previously announced routes
announcement = prefix + attribute values
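To show how the four message types are told apart on the wire, the sketch below parses the fixed BGP message header. The 19-byte layout (16-byte marker, 2-byte length, 1-byte type), the 19 to 4096 byte length bounds, and the type codes are taken from the BGP-4 specification (RFC 4271), not from the slides; `parse_header` is a hypothetical helper name.

```python
import struct

# Type codes as defined by the BGP-4 specification.
BGP_TYPES = {1: "OPEN", 2: "UPDATE", 3: "NOTIFICATION", 4: "KEEPALIVE"}
HEADER_LEN = 19        # 16-byte marker + 2-byte length + 1-byte type
MAX_MESSAGE_LEN = 4096


def parse_header(data: bytes):
    """Return (total_length, type_name) for the BGP message at the start of data."""
    if len(data) < HEADER_LEN:
        raise ValueError("need at least 19 bytes for a BGP header")
    # "!" selects network byte order with no padding between fields.
    _marker, length, msg_type = struct.unpack("!16sHB", data[:HEADER_LEN])
    if not HEADER_LEN <= length <= MAX_MESSAGE_LEN:
        raise ValueError(f"bad message length {length}")
    return length, BGP_TYPES.get(msg_type, "UNKNOWN")


if __name__ == "__main__":
    # A KEEPALIVE consists of the header alone: all-ones marker, length 19, type 4.
    keepalive = b"\xff" * 16 + struct.pack("!HB", 19, 4)
    print(parse_header(keepalive))
```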
17 What is Routing Policy
- Policy refers to arbitrary preference among a menu of available routes (based upon routes’ attributes)
- Public description of the relationship between external BGP peers
- Can also describe internal BGP peer relationships
- E.g.: Who are my BGP peers?
- What routes are:
  - Originated by a peer
  - Imported from each peer
  - Exported to each peer
  - Preferred when multiple routes exist
- What to do if no route exists?
18 Routing Policy Example
- AS1 originates prefix “d”
- AS1 exports “d” to AS2, AS2 imports
- AS2 exports “d” to AS3, AS3 imports
- AS3 exports “d” to AS5, AS5 imports
19 Routing Policy Example (cont)
- AS5 also imports “d” from AS4
- Which route does it prefer?
- Does it matter?
- Consider the case where:
  - AS3 = Commercial Internet
  - AS4 = Internet2
20 Import and Export Policies
- Inbound filtering controls outbound traffic
  - Filters route updates received from other peers
  - Filtering based on IP prefixes, AS_PATH, community
- Outbound filtering controls inbound traffic
  - Forwarding a route means others may choose to reach the prefix through you
  - Not forwarding a route means others must use another router to reach the prefix
- Attribute manipulation
  - Import: LOCAL_PREF (manipulate trust)
  - Export: AS_PATH and MEDs
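An import policy of the kind described on this slide can be sketched as a function that either rejects a received route or returns it with a rewritten LOCAL_PREF. Everything named here is illustrative: the `Route` record, `import_policy`, the default LOCAL_PREF of 100, the /24 length limit, and the documentation prefixes are assumptions, not values from the slides.

```python
from dataclasses import dataclass, replace
from ipaddress import ip_network
from typing import Optional, Tuple


@dataclass(frozen=True)
class Route:
    prefix: str                 # e.g. "192.0.2.0/24"
    as_path: Tuple[int, ...]    # nearest AS first
    local_pref: int = 100       # assumed default


def import_policy(route: Route,
                  max_prefix_len: int = 24,
                  blocked_asns: frozenset = frozenset(),
                  local_pref: int = 100) -> Optional[Route]:
    """Filter a received route, then tweak its attributes.

    Returns None when the route is rejected, otherwise a copy with
    LOCAL_PREF set according to how much the neighbor is trusted.
    """
    # Filter on IP prefix: drop anything more specific than the limit.
    if ip_network(route.prefix).prefixlen > max_prefix_len:
        return None
    # Filter on AS_PATH: drop routes that traverse an unwanted AS.
    if blocked_asns & set(route.as_path):
        return None
    # Attribute manipulation on import: set LOCAL_PREF.
    return replace(route, local_pref=local_pref)


if __name__ == "__main__":
    ok = Route("192.0.2.0/24", (2, 1))
    too_specific = Route("198.51.100.0/25", (2, 5))
    print(import_policy(ok, local_pref=200))
    print(import_policy(too_specific))
    print(import_policy(ok, blocked_asns=frozenset({1})))
```

Because rejected routes never reach best-route selection, this inbound filter determines which exits outbound traffic can use.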
21 Attributes are Used to Select Best Routes
[Figure: four competing routes to the same /24 prefix, each labeled “pick me!”]
Given multiple routes to the same prefix, a BGP speaker must pick at most one best route. (Note: it could reject them all!)
22 BGP Policy Knob: Attributes
Value  Code                       Reference
1      ORIGIN                     [RFC1771]
2      AS_PATH                    [RFC1771]
3      NEXT_HOP                   [RFC1771]
4      MULTI_EXIT_DISC            [RFC1771]
5      LOCAL_PREF                 [RFC1771]
6      ATOMIC_AGGREGATE           [RFC1771]
7      AGGREGATOR                 [RFC1771]
8      COMMUNITY                  [RFC1997]
9      ORIGINATOR_ID              [RFC2796]
10     CLUSTER_LIST               [RFC2796]
11     DPA                        [Chen]
12     ADVERTISER                 [RFC1863]
13     RCID_PATH / CLUSTER_ID     [RFC1863]
14     MP_REACH_NLRI              [RFC2283]
15     MP_UNREACH_NLRI            [RFC2283]
16     EXTENDED COMMUNITIES       [Rosen]
…      reserved for development
From IANA.
- We will cover a subset of these attributes
- Not all attributes need to be present in every announcement
23 BGP Route Processing
[Figure: processing pipeline]
Receive BGP updates -> apply import policies -> best route selection (based on attribute values) -> best route table -> apply export policies -> transmit BGP updates
- Apply policy = filter routes & tweak attributes (on both import and export)
- Install forwarding entries for best routes in the IP forwarding table
24 Import and Export Policies
[Figure: inbound routes and outbound traffic on one side; outbound routes and inbound traffic on the other]
- For inbound traffic:
  - Filter outbound routes
  - Tweak attributes on outbound routes in the hope of influencing your neighbor’s best route selection
- For outbound traffic:
  - Filter inbound routes
  - Tweak attributes on inbound routes to influence best route selection
In general, an AS has more control over outbound traffic.
25 Policy Implementation Flow
[Figure: Incoming, Adj RIB In, Main BGP RIB, Adj RIB Out, Outgoing; Main RIB/FIB; IGPs; Static & HW Info]
26 Conceptual Model of BGP Operation
- RIB: Routing Information Base
- Adj-RIB-In: prefixes learned from neighbors. There are as many Adj-RIB-Ins as there are peers
- Loc-RIB: prefixes selected for local use after analyzing the Adj-RIB-Ins. This RIB is advertised internally
- Adj-RIB-Out: stores prefixes advertised to a particular neighbor. There are as many Adj-RIB-Outs as there are neighbors
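The three kinds of RIB can be modeled with plain dictionaries. The sketch below is illustrative only: `BgpSpeaker` and its methods are hypothetical names, best-route selection is reduced to "shortest AS path" as a stand-in for the real decision process, and not advertising a route back to the peer it was learned from is an assumed simplification.

```python
from collections import defaultdict


class BgpSpeaker:
    def __init__(self):
        # One Adj-RIB-In per peer: peer -> {prefix: as_path}
        self.adj_rib_in = defaultdict(dict)
        # Loc-RIB: prefix -> (peer the best route came from, as_path)
        self.loc_rib = {}

    def learn(self, peer, prefix, as_path):
        """Store an announcement in the peer's Adj-RIB-In and re-run selection."""
        self.adj_rib_in[peer][prefix] = tuple(as_path)
        self._reselect(prefix)

    def withdraw(self, peer, prefix):
        """Remove a previously announced route and re-run selection."""
        self.adj_rib_in[peer].pop(prefix, None)
        self._reselect(prefix)

    def _reselect(self, prefix):
        candidates = [(peer, rib[prefix])
                      for peer, rib in self.adj_rib_in.items() if prefix in rib]
        if candidates:
            # Stand-in selection rule: shortest AS path wins.
            self.loc_rib[prefix] = min(candidates, key=lambda c: len(c[1]))
        else:
            self.loc_rib.pop(prefix, None)

    def adj_rib_out(self, neighbor):
        """Routes we would advertise to one neighbor (one Adj-RIB-Out each)."""
        return {prefix: path
                for prefix, (peer, path) in self.loc_rib.items()
                if peer != neighbor}


if __name__ == "__main__":
    s = BgpSpeaker()
    s.learn("peerA", "192.0.2.0/24", (2, 1))
    s.learn("peerB", "192.0.2.0/24", (3, 2, 1))
    print(s.loc_rib)                 # best route came from peerA
    s.withdraw("peerA", "192.0.2.0/24")
    print(s.loc_rib)                 # falls back to the alternate from peerB
```

Keeping the alternate route from `peerB` in its Adj-RIB-In is what lets the speaker fall back immediately after the withdrawal.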
27 Path Attributes: ORIGIN
- ORIGIN:
  - Describes how a prefix came to BGP at the origin AS
  - Prefixes are learned from a source and “injected” into BGP:
    - Directly connected interfaces, manually configured static routes, dynamic IGP or EGP
- Values:
  - IGP (EGP): prefix learned from IGP (EGP)
  - INCOMPLETE: static routes
28 Path Attributes: AS-PATH
[Figure: AS1 (100), AS2 (200), AS3 (15); a /16 prefix]
- List of ASes through which the prefix announcement has passed. Each AS on the path adds its ASN to the AS-PATH
- E.g.: … /16 originates at AS1 and is advertised to AS3 via AS2
- E.g.: AS-SEQUENCE: “…”
- Used for loop detection and path selection
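The two uses of AS-PATH, building the path as an announcement propagates and rejecting loops on receipt, can be sketched as follows. The ASNs 100, 200, and 15 are the ones shown in the figure; the function names `advertise` and `accept` and the `prepend` parameter are hypothetical.

```python
def advertise(as_path, my_asn, prepend=1):
    """Return the AS path to send to an external neighbor.

    The sending AS puts its own ASN at the front. prepend > 1 repeats it,
    which makes the path look longer to neighbors (path padding).
    """
    return (my_asn,) * prepend + tuple(as_path)


def accept(as_path, my_asn):
    """Loop detection: reject any path that already contains our own ASN."""
    return my_asn not in as_path


if __name__ == "__main__":
    at_as2 = advertise((), 100)        # origin AS (100) announces its prefix
    at_as3 = advertise(at_as2, 200)    # AS 200 passes the route on
    print(at_as3)                      # path as seen by AS 15
    print(accept(at_as3, 15))          # AS 15 is not on the path: accepted
    print(accept(at_as3, 100))         # back at the origin: rejected as a loop
```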
29 Traffic Often Follows ASPATH
[Figure: a chain of ASes including AS 2, AS 3, and AS 4; a /16 announcement with its ASPATH, and an IP packet sent toward a destination in that prefix]
30 … But It Might Not
[Figure: the same chain of ASes, with a /16 announcement (ASPATH = 1) and a more specific /25 announcement; an IP packet sent toward a destination covered by both]
- AS 2 filters all subnets with masks longer than /…
- From AS 4, it may look like this packet will take path 3 2 1, but it actually takes path 3 2 5
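The reason the packet can diverge from the advertised ASPATH is that forwarding uses the longest matching prefix at each hop. A minimal sketch of that lookup follows; the addresses and the table contents are hypothetical stand-ins for the /16 and /25 in the figure, and `longest_match` is an illustrative name.

```python
from ipaddress import ip_address, ip_network


def longest_match(table, dest):
    """Return the next hop of the most specific prefix containing dest, or None."""
    addr = ip_address(dest)
    matches = [(ip_network(prefix), next_hop)
               for prefix, next_hop in table.items()
               if addr in ip_network(prefix)]
    if not matches:
        return None
    # The entry with the longest prefix length is the most specific one.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


if __name__ == "__main__":
    # Hypothetical forwarding table at AS 2: it knows both the /16 and the /25.
    table_at_as2 = {
        "10.1.0.0/16": "AS 1",
        "10.1.1.0/25": "AS 5",
    }
    print(longest_match(table_at_as2, "10.1.1.5"))   # inside the /25
    print(longest_match(table_at_as2, "10.1.2.5"))   # only inside the /16
```

An upstream AS that never received the /25 sees only the /16 and its ASPATH, so it cannot tell that AS 2 will hand the packet to a different neighbor.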
31 Shorter AS-PATH Doesn’t Mean Shorter # Hops
[Figure: AS 1, AS 2, AS 3, AS 4]
BGP says that path 4 1 is better than path … Duh!
32 ASPATH Padding: Shed Inbound Traffic
[Figure: a customer announces its /24 to its provider over a primary link and a backup link, with different ASPATH values on each; AS 1 sends traffic toward the customer]
Padding will (usually) force inbound traffic from AS 1 to take the primary link.
33 Padding May Not Shut Off All Traffic
[Figure: a customer with a primary link to provider AS 2 and a backup link to provider AS 3, announcing its /24 on both]
- AS 3 will send traffic on the “backup” link because it prefers customer routes, and local preference is considered before ASPATH length!
- Padding in this way is often used as a form of load balancing
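The ordering "local preference before ASPATH length" can be made concrete with a two-key comparison. This is a sketch of only those two steps of route selection, not the full decision process; the `Route` record, the LOCAL_PREF values 200 and 100, and the customer ASN 7 are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Route:
    via: str                    # which neighbor the route was learned from
    as_path: Tuple[int, ...]
    local_pref: int


def best_route(routes: Sequence[Route]) -> Optional[Route]:
    """Pick the route with the highest LOCAL_PREF; break ties on shorter AS path."""
    if not routes:
        return None
    return max(routes, key=lambda r: (r.local_pref, -len(r.as_path)))


if __name__ == "__main__":
    # View from AS 3: the customer (hypothetical ASN 7) pads the backup link.
    padded_customer = Route("customer (backup link)", (7, 7, 7), local_pref=200)
    via_provider = Route("AS 2", (2, 7), local_pref=100)
    print(best_route([padded_customer, via_provider]).via)
```

The padded route still wins because the AS path length is only compared among routes with equal LOCAL_PREF.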
34 Deaggregation + Multihoming
[Figure: a multihomed customer and two providers (AS 1, AS 2, AS 3); a /8 block and a more specific /16]
- If AS 1 does not announce the more specific prefix, then most traffic to AS 2 will go through AS 3 because it is a longer match
- AS 2 is “punching a hole” in the CIDR block of AS 1 => subverts CIDR
35 BGP Table Growth
[Figure: BGP table growth plot. Thanks: Geoff Huston]
36 Large BGP Tables Considered Harmful
- Routing tables must store best routes and alternate routes
- The burden can be large for routers with many alternate routes (route reflectors, for example)
- Routers have been known to die
- Increases CPU load, especially during session reset
37 ASNs Growth
[Figure: ASN growth plot. From: Geoff Huston]
38 Dealing with ASN Growth…
- Make ASNs larger than 16 bits
  - How about 32 bits?
  - See Internet Draft: “BGP support for four-octet AS number space” (draft-ietf-idr-as4bytes-03.txt)
  - Requires protocol change and wide deployment
- Change the way ASNs are used
  - Allow multihomed, non-transit networks to use private ASNs
  - Uses ASE (AS number Substitution on Egress)
  - See Internet Draft: “Autonomous System Number Substitution on Egress” (draft-jhaas-ase-00.txt)
  - Works at the edge, requires protocol change (for loop prevention)
39 Daily Update Count
[Figure: daily BGP update count plot]
40 A Few Bad Apples …
[Figure: plot against percent of BGP table prefixes. Thanks to Madanlal Musuvathi for this plot. Data source: RIPE NCC]
- Typically, 80% of the updates are for less than 5% of the prefixes
- Most prefixes are stable most of the time
- On this day, about 83% of the prefixes were not updated
41 Route Flap Dampening (RFC 2439)
- Routes are given a penalty for changing
- If the penalty exceeds the suppress limit, the route is dampened
- When the route is not changing, its penalty decays exponentially
- If the penalty goes below the reuse limit, the route is announced again
- Can dramatically reduce the number of BGP updates
- Requires additional router resources
- Applied on eBGP inbound only
42 Route Flap Dampening Example
[Figure: penalty over time for a flapping route]
- Route dampened for nearly 1 hour
- Penalty for each flap = 1000
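The penalty mechanics can be sketched numerically. Only the per-flap penalty of 1000 comes from the slide; the suppress limit (2000), reuse limit (750), and half-life (15 minutes) are assumed values chosen for illustration, and all function names are hypothetical. Exponential decay is modeled as halving the penalty once per half-life.

```python
import math

PENALTY_PER_FLAP = 1000.0   # from the slide
SUPPRESS_LIMIT = 2000.0     # assumption
REUSE_LIMIT = 750.0         # assumption
HALF_LIFE_MIN = 15.0        # assumption, in minutes


def decayed(penalty, minutes, half_life=HALF_LIFE_MIN):
    """Penalty remaining after `minutes` without a flap."""
    return penalty * 0.5 ** (minutes / half_life)


def penalty_after_flaps(flap_times):
    """Penalty right after the last flap; flap_times are minutes, ascending."""
    penalty, last = 0.0, None
    for t in flap_times:
        if last is not None:
            penalty = decayed(penalty, t - last)
        penalty += PENALTY_PER_FLAP
        last = t
    return penalty


def minutes_until_reuse(penalty, reuse=REUSE_LIMIT, half_life=HALF_LIFE_MIN):
    """Quiet time needed for the penalty to decay down to the reuse limit."""
    if penalty <= reuse:
        return 0.0
    return half_life * math.log2(penalty / reuse)


if __name__ == "__main__":
    p = penalty_after_flaps([0, 1, 2, 3])      # four flaps, one minute apart
    print(f"penalty {p:.0f}, suppressed: {p > SUPPRESS_LIMIT}")
    print(f"quiet minutes until reuse: {minutes_until_reuse(p):.1f}")
```

With these assumed parameters, a short burst of flaps keeps a route suppressed for several half-lives, which is how a route can stay dampened for a long period after it has stopped changing.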
43 How Long Does BGP Take to Adapt to Changes?
[Figure: convergence time plot. From: Abha Ahuja and Craig Labovitz]
44 Two Main Factors in Delayed Convergence
- Rate limiting timer slows everything down
- BGP can explore many alternate paths before giving up or arriving at a new path
  - No global knowledge in vectoring protocols
45 Implementation Does Matter!
[Figure: plot with two regions labeled “stateless withdraws widely deployed” and “stateful withdraws widely deployed”]