Published by Dominic Payne. Modified over 9 years ago.
2
IP Routing IMA Minneapolis January, 2004 Timothy G. Griffin Intel Research tim.griffin@intel.com http://www.cambridge.intel-research.net/~tgriffin
3
Common View of the Telco Network Brick
4
Common View of the IP Network (Layer 3)
5
What This Course Is About
6
Forwarding vs. Routing
Forwarding: Use of local information (tables, data structures) to determine treatment of data traffic
–traffic classification
–treatment based on classification
–can drop traffic
–or decide where to send it next, and how to send it
Routing: Use of global information to populate forwarding data structures in multiple network nodes
–goal is normally to optimize something given the state of the network
–difficult to fully automate; often requires intricate configuration
–is partially automated with dynamic routing protocols
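To make the forwarding half concrete, here is a minimal sketch of a purely local lookup. It assumes longest-prefix matching, which is the usual rule for IP forwarding but is not spelled out on the slide. The table contents and next-hop names are hypothetical.

```python
import ipaddress

# Hypothetical forwarding table: prefix -> next hop (names are illustrative only).
FORWARDING_TABLE = {
    ipaddress.ip_network("0.0.0.0/0"): "upstream",
    ipaddress.ip_network("130.132.0.0/16"): "campus-core",
    ipaddress.ip_network("130.132.8.0/24"): "lab-switch",
}


def lookup(destination):
    """Return the next hop for the longest matching prefix, or None to drop."""
    addr = ipaddress.ip_address(destination)
    matches = [net for net in FORWARDING_TABLE if addr in net]
    if not matches:
        return None
    # The most specific (longest) prefix wins.
    best = max(matches, key=lambda net: net.prefixlen)
    return FORWARDING_TABLE[best]
```

Routing, in this picture, is whatever process fills in FORWARDING_TABLE on every node. The lookup itself never consults anything beyond the local table.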
7
What You Should Take Away Heterogeneity –Many technologies –Many networks –Many “routing policies” IP Routing is evolving –New protocols and extensions –New router knobs –New ways of using existing technologies
8
Anthropology of Routing Protocol Standards Vendors Operator Forums Academic Research IEEE, IETF, … NANOG, RIPE, … Books, Training, … Cisco, Juniper … Sun, Microsoft, … EE CS Maths Could add regulation, markets, ….
9
8 Routing can happen at any level Physical Network DataLink Transport Application Session Presentation Physical Network DataLink Transport Application Session Presentation data sentdata received
10
9 IP is a Network Layer Protocol Physical 1 Network DataLink 1 Transport Application Session Presentation Network Physical 1 DataLink 1 Physical 2 DataLink 2 Router Physical 2 Network DataLink 2 Transport Application Session Presentation Medium 1Medium 2 Separate physical networks glued together into one logical network
11
All Hail the IP Datagram!
1981, RFC 791

HEADER
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version|  IHL  | Service Type  |          Total Length         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         Identification        |Flags|     Fragment Offset     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Time to Live  |   Protocol    |        Header Checksum        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Source Address                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Destination Address                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Options                    |    Padding    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

DATA
... up to 65,515 octets of data ...

shaded fields little-used today
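As a rough illustration of the header layout, the fixed 20-byte part can be decoded with Python's standard struct module. This is a sketch under simple assumptions: options are ignored, the checksum is not verified, and the sample packet values are invented.

```python
import socket
import struct


def parse_ipv4_header(raw):
    """Decode the fixed 20-byte part of an IPv4 header (RFC 791 layout)."""
    (ver_ihl, tos, total_length, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "version": ver_ihl >> 4,
        "ihl": ver_ihl & 0x0F,               # header length in 32-bit words
        "service_type": tos,
        "total_length": total_length,
        "identification": ident,
        "flags": flags_frag >> 13,           # top 3 bits
        "fragment_offset": flags_frag & 0x1FFF,
        "ttl": ttl,
        "protocol": proto,
        "header_checksum": checksum,
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
    }


# A hand-built sample header (all values made up for illustration).
sample = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 1, 0x4000, 64, 6, 0,
                     socket.inet_aton("192.0.2.1"),
                     socket.inet_aton("192.0.2.2"))
header = parse_ipv4_header(sample)
```

The 16-bit Total Length field caps a datagram at 65,535 octets, which is where the 65,515-octet data limit comes from once the minimum 20-octet header is subtracted.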
12
11 IP Hour Glass IP Networking Technologies Networking Applications FrameATM DWDMSONET email Web file transfer Ethernet FDDI Multimedia X.25 Remote Access Voice VPN Minimalist network layer TCP e-stuff
13
Routers Talking to Routers Routing info Routing computation is distributed among routers within a routing domain Computation of best next hop based on routing information is the most CPU/memory intensive task on a router Routing messages are usually not routed, but exchanged via layer 2 between physically adjacent routers (internal BGP and multi-hop external BGP are exceptions)
14
Architecture of Dynamic Routing
IGP = Interior Gateway Protocol. Metric based: OSPF, IS-IS, RIP, EIGRP (Cisco)
EGP = Exterior Gateway Protocol. Policy based: BGP
The Routing Domain of BGP is the entire Internet
[Figure: AS 1 and AS 2 connected by BGP, with OSPF and EIGRP running inside the ASes]
15
Technology of Distributed Routing

Link State
–Topology information is flooded within the routing domain
–Best end-to-end paths are computed locally at each router
–Best end-to-end paths determine next-hops
–Based on minimizing some notion of distance
–Works only if policy is shared and uniform
–Examples: OSPF, IS-IS

Vectoring
–Each router knows little about network topology
–Only best next-hops are chosen by each router for each destination network
–Best end-to-end paths result from composition of all next-hop choices
–Does not require any notion of distance
–Does not require uniform policies at all routers
–Examples: RIP, BGP
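The contrast can be sketched on a toy topology. In the link-state function the router sees every link and runs Dijkstra locally. In the vectoring function each node only ever reads its neighbors' current tables. The topology and costs are hypothetical, and the vectoring half uses distances (RIP-style), which is only one instance of vectoring.

```python
import heapq

# Hypothetical topology: undirected links with costs.
LINKS = {("A", "B"): 1, ("B", "C"): 2, ("A", "C"): 5, ("C", "D"): 1}


def neighbors(node):
    """Yield (neighbor, link cost) pairs for a node."""
    for (u, v), cost in LINKS.items():
        if u == node:
            yield v, cost
        elif v == node:
            yield u, cost


def link_state_next_hops(source):
    """Dijkstra over the full flooded topology: dest -> (cost, next hop)."""
    best = {}
    queue = [(0, source, None)]
    while queue:
        cost, node, first_hop = heapq.heappop(queue)
        if node in best:
            continue
        best[node] = (cost, first_hop)
        for nbr, link_cost in neighbors(node):
            if nbr not in best:
                hop = nbr if node == source else first_hop
                heapq.heappush(queue, (cost + link_cost, nbr, hop))
    del best[source]
    return best


def distance_vector_tables(rounds=10):
    """Each node keeps only dest -> (cost, next hop), learned from neighbors."""
    nodes = {n for link in LINKS for n in link}
    tables = {n: {n: (0, n)} for n in nodes}
    for _ in range(rounds):
        snapshot = {n: dict(t) for n, t in tables.items()}
        for node in nodes:
            for nbr, link_cost in neighbors(node):
                for dest, (cost, _) in snapshot[nbr].items():
                    candidate = cost + link_cost
                    current = tables[node].get(dest)
                    if current is None or candidate < current[0]:
                        tables[node][dest] = (candidate, nbr)
    return tables
```

On this topology both approaches agree on A's next hops, but only the link-state router could draw the map.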
16
The Gang of Four Link StateVectoring EGP IGP BGP RIP IS-IS OSPF
17
Autonomous Routing Domains (ARDs) A collection of physical networks glued together using IP, that have a unified administrative routing policy. Campus networks Corporate networks ISP Internal networks …
18
Autonomous System (AS) Numbers
16-bit values. 64512 through 65535 are “private”.
Genuity: 1
MIT: 3
JANET: 786
UC San Diego: 7377
AT&T: 7018, 6341, 5074, …
UUNET: 701, 702, 284, 12199, …
Sprint: 1239, 1240, 6211, 6242, …
…
ASNs represent units of routing policy. Currently over 16,000 in use.
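A small sketch of the numeric ranges just described, assuming 16-bit ASNs as on the slide. The function names are made up for illustration.

```python
# 64512 through 65535 are "private", per the slide.
PRIVATE_ASN_RANGE = range(64512, 65536)


def is_valid_16bit_asn(asn):
    """True if the value fits in 16 bits."""
    return 0 <= asn <= 0xFFFF


def is_private_asn(asn):
    """True if the ASN falls in the private range."""
    return asn in PRIVATE_ASN_RANGE
```

For instance, AS 65000, used in a later backup-link example, falls in the private range, while 7018 (AT&T) does not.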
19
Autonomous Routing Domains Don’t Always Need BGP or an ASN Qwest Yale University Nail up default routes 0.0.0.0/0 pointing to Qwest Nail up routes 130.132.0.0/16 pointing to Yale 130.132.0.0/16 Static routing is the most common way of connecting an autonomous routing domain to the Internet. This helps explain why BGP is a mystery to many …
20
ASNs Can Be “Shared” (RFC 2270)
ASN 7046 is assigned to UUNet. It is used by customers single-homed to UUNet but needing BGP for some reason (load balancing, etc.) [RFC 2270]
[Figure: AS 701 (UUNet) with customers Crestar Bank, NJIT, and Hood College, each using AS 7046; prefix 128.235.0.0/16 shown]
21
ARD != AS Most ARDs have no ASN (statically routed at Internet edge) Some unrelated ARDs share the same ASN (RFC 2270) Some ARDs are implemented with multiple ASNs (example: Worldcom, AT&T, …) ASes are an implementation detail of Interdomain routing
22
How Many ASNs? Thanks to Geoff Huston. http://bgp.potaroo.net on October 24, 2003 15,981
23
How many prefixes? Thanks to Geoff Huston. http://bgp.potaroo.net on October 24, 2003 154,894 Note: numbers actually depend on point of view… 29% 23% Address space covered
24
A Bit of OGI’s AS Neighborhood AS 2914 Verio AS 11964 OGI 128.223.0.0/16 AS 7018 AT&T AS 1239 Sprint AS 6366 Portland State U AS 11995 Oregon Health Sciences U AS 3356 Level 3 Sources: ARIN, Route Views, RIPE AS 3356 Level 3 AS 14262 Portland Regional Education Network AS 7774 U of Alaska AS 3807 U of Montana AS 101 U of Washington
25
wiscnet.net GO BUCKY!
26
Partial View of cs.wisc.edu Neighborhood AS 2381 WiscNet AS 209 Qwest AS 59 UW Academic Computing 128.105.0.0/16 AS 3549 Global Crossing AS 1 Genuity AS 3136 UW Madison AS 7050 UW Milwaukee 129.89.0.0/16130.47.0.0/16
27
26 Policy : Transit vs. Nontransit AS 701 AS144 AS 701 A nontransit AS allows only traffic originating from AS or traffic with destination within AS IP traffic UUnet Bell Labs AT&T CBB A transit AS allows traffic with neither source nor destination within AS to flow across the network
28
Customers and Providers Customer pays provider for access to the Internet provider customer IP traffic provider customer
29
The “Peering” Relationship peer customerprovider Peers provide transit between their respective customers Peers do not provide transit between peers Peers (often) do not exchange $$$ traffic allowed traffic NOT allowed
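The transit rules on this slide can be sketched as a single export check. The relationship labels are hypothetical names, and the treatment of provider-learned routes is an assumption borrowed from the "commandments" slide later in the deck rather than from this slide.

```python
def should_export(learned_from, export_to):
    """Decide whether a route may be announced to a neighbor.

    learned_from: 'customer', 'peer', 'provider', or 'self' (own prefixes).
    export_to:    'customer', 'peer', or 'provider'.
    """
    # Customer routes and our own prefixes go to everyone.
    if learned_from in ("customer", "self"):
        return True
    # Peer and provider routes are only passed down to customers.
    return export_to == "customer"
```

So a customer's route is announced to a peer, which is how peers reach each other's customers, while a route learned from one peer is never announced to another.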
30
Peering Provides Shortcuts Peering also allows connectivity between the customers of “Tier 1” providers. peer customerprovider
31
Peering Wars
Peer
–Reduces upstream transit costs
–Can increase end-to-end performance
–May be the only way to connect your customers to some part of the Internet (“Tier 1”)
Don’t Peer
–You would rather have customers
–Peers are usually your competition
–Peering relationships may require periodic renegotiation
Peering struggles are by far the most contentious issues in the ISP world! Peering agreements are often confidential.
32
31 Policy-Based vs. Distance-Based Routing? ISP1 ISP2 ISP3 Cust1 Cust2 Cust3 Host 1 Host 2 Minimizing “hop count” can violate commercial relationships that constrain inter- domain routing. YES NO
33
32 Why not minimize “AS hop count”? Regional ISP1 Regional ISP2 Regional ISP3 Cust1 Cust3 Cust2 National ISP1 National ISP2 YESNO
34
BGP-4
BGP = Border Gateway Protocol
–Is a policy-based routing protocol
–Is the de facto EGP of today’s global Internet
–Relatively simple protocol, but configuration is complex and the entire world can see, and be impacted by, your mistakes.
1989 : BGP-1 [RFC 1105]
–Replacement for EGP (1984, RFC 904)
1990 : BGP-2 [RFC 1163]
1991 : BGP-3 [RFC 1267]
1995 : BGP-4 [RFC 1771]
–Support for Classless Interdomain Routing (CIDR)
35
Four Types of BGP Messages
Open : Establishes a peering session.
Keep Alive : Handshake at regular intervals.
Notification : Shuts down a peering session.
Update : Announces new routes or withdraws previously announced routes.
announcement = prefix + attribute values
36
BGP Attributes

Value  Code                      Reference
-----  ------------------------  ---------
1      ORIGIN                    [RFC1771]
2      AS_PATH                   [RFC1771]
3      NEXT_HOP                  [RFC1771]
4      MULTI_EXIT_DISC           [RFC1771]
5      LOCAL_PREF                [RFC1771]
6      ATOMIC_AGGREGATE          [RFC1771]
7      AGGREGATOR                [RFC1771]
8      COMMUNITY                 [RFC1997]
9      ORIGINATOR_ID             [RFC2796]
10     CLUSTER_LIST              [RFC2796]
11     DPA                       [Chen]
12     ADVERTISER                [RFC1863]
13     RCID_PATH / CLUSTER_ID    [RFC1863]
14     MP_REACH_NLRI             [RFC2283]
15     MP_UNREACH_NLRI           [RFC2283]
16     EXTENDED COMMUNITIES      [Rosen]
...
255    reserved for development

From IANA: http://www.iana.org/assignments/bgp-parameters
Most important attributes
Not all attributes need to be present in every announcement
37
Attributes are Used to Select Best Routes 192.0.2.0/24 pick me! 192.0.2.0/24 pick me! 192.0.2.0/24 pick me! 192.0.2.0/24 pick me! Given multiple routes to the same prefix, a BGP speaker must pick at most one best route (Note: it could reject them all!)
38
37 BGP Route Processing Best Route Selection Apply Import Policies Best Route Table Apply Export Policies Install forwarding Entries for best Routes. Receive BGP Updates Best Routes Transmit BGP Updates Apply Policy = filter routes & tweak attributes Based on Attribute Values IP Forwarding Table Apply Policy = filter routes & tweak attributes Open ended programming. Constrained only by vendor configuration language
39
Route Selection Summary
Enforce relationships
–Highest Local Preference
Traffic engineering
–Shortest ASPATH
–Lowest MED
–i-BGP < e-BGP
–Lowest IGP cost to BGP egress
Throw up hands and break ties
–Lowest router ID
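One way to read the summary is as a lexicographic comparison applied in the order listed. The sketch below is a simplification, not a vendor's actual decision process: the Route fields are hypothetical names, "i-BGP < e-BGP" is read as preferring eBGP-learned routes, and MED is compared across all candidates even though real routers normally compare it only between routes from the same neighboring AS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    local_pref: int
    as_path: tuple
    med: int
    learned_via_ebgp: bool
    igp_cost: int
    router_id: str


def selection_key(route):
    """Smaller key means more preferred; fields follow the slide's order."""
    return (
        -route.local_pref,                       # highest local preference
        len(route.as_path),                      # shortest ASPATH
        route.med,                               # lowest MED (simplified)
        0 if route.learned_via_ebgp else 1,      # eBGP preferred over iBGP
        route.igp_cost,                          # lowest IGP cost to egress
        tuple(int(x) for x in route.router_id.split(".")),  # lowest router ID
    )


def best_route(candidates):
    """Return the single best route, or None if there are no candidates."""
    return min(candidates, key=selection_key) if candidates else None
```

Because local preference comes first in the key, a long path with high local preference beats a short path with low local preference. Several later slides turn on exactly that ordering.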
40
39 ASPATH Attribute AS7018 135.207.0.0/16 AS Path = 6341 AS 1239 Sprint AS 1755 Ebone AT&T AS 3549 Global Crossing 135.207.0.0/16 AS Path = 7018 6341 135.207.0.0/16 AS Path = 3549 7018 6341 AS 6341 135.207.0.0/16 AT&T Research Prefix Originated AS 12654 RIPE NCC RIS project AS 1129 Global Access 135.207.0.0/16 AS Path = 7018 6341 135.207.0.0/16 AS Path = 1239 7018 6341 135.207.0.0/16 AS Path = 1755 1239 7018 6341 135.207.0.0/16 AS Path = 1129 1755 1239 7018 6341
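The way the AS path grows hop by hop can be sketched in a few lines. The function names are invented. The loop check in accept (reject a path that already contains your own ASN) is standard BGP behaviour but is not shown on the slide.

```python
def announce(as_path, sender_asn):
    """Return the AS path as seen by the neighbor: sender prepends its ASN."""
    return (sender_asn,) + tuple(as_path)


def accept(as_path, receiver_asn):
    """Loop prevention: drop any route whose path already contains our ASN."""
    return receiver_asn not in as_path


# Reproduce one chain from the slide: 6341 -> 7018 -> 1239 -> 1755 -> 1129.
path = ()
for asn in (6341, 7018, 1239, 1755, 1129):
    path = announce(path, asn)
```

After the loop, path holds 1129 1755 1239 7018 6341, the longest path shown on the slide.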
41
In fairness: could you do this “right” and still scale? Exporting internal state would dramatically increase global instability and amount of routing state Shorter Doesn’t Always Mean Shorter AS 4 AS 3 AS 2 AS 1 Mr. BGP says that path 4 1 is better than path 3 2 1 Duh!
42
BGP Routing Tables
Use “whois” queries to associate an ASN with “owner” (for example, http://www.arin.net/whois/arinwhois.html)
7018 = AT&T Worldnet, 701 = UUNET, 3561 = Cable & Wireless, …

show ip bgp
BGP table version is 111849680, local router ID is 203.62.248.4
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop        Metric LocPrf Weight Path
...
*>i192.35.25.0      134.159.0.1                50      0 16779 1 701 703 i
*>i192.35.29.0      166.49.251.25              50      0 5727 7018 14541 i
*>i192.35.35.0      134.159.0.1                50      0 16779 1 701 1744 i
*>i192.35.37.0      134.159.0.1                50      0 16779 1 3561 i
*>i192.35.39.0      134.159.0.3                50      0 16779 1 701 80 i
*>i192.35.44.0      166.49.251.25              50      0 5727 7018 1785 i
*>i192.35.48.0      203.62.248.34              55      0 16779 209 7843 225 225 225 225 225 i
*>i192.35.49.0      203.62.248.34              55      0 16779 209 7843 225 225 225 225 225 i
*>i192.35.50.0      203.62.248.34              55      0 16779 3549 714 714 714 i
*>i192.35.51.0/25   203.62.248.34              55      0 16779 3549 14744 14744 14744 14744 14744 14744 14744 14744 i
...

Thanks to Geoff Huston. http://www.telstra.net/ops on July 6, 2001
43
AS Graphs Depend on Point of View peer customer provider 54 2 13 6 54 2 6 13 546 13 54 2 6 13 2
44
AS Graphs Can Be Fun The subgraph showing all ASes that have more than 100 neighbors in full graph of 11,158 nodes. July 6, 2001. Point of view: AT&T route-server
45
AS Graphs Do Not Show “Topology”! The AS graph may look like this. Reality may be closer to this… BGP was designed to throw away information!
46
So Many Choices Which route should Frank pick to 13.13.0.0/16? AS 1 AS 2 AS 4 AS 3 13.13.0.0/16 Frank’s Internet Barn peer customerprovider
47
46 LOCAL PREFERENCE AS 1 AS 2 AS 4 AS 3 13.13.0.0/16 local pref = 80 local pref = 100 local pref = 90 Higher Local preference values are more preferred Local preference used ONLY in iBGP
48
47 Implementing Backup Links with Local Preference (Outbound Traffic) Forces outbound traffic to take primary link, unless link is down. AS 1 primary link backup link Set Local Pref = 100 for all routes from AS 1 AS 65000 Set Local Pref = 50 for all routes from AS 1 We’ll talk about inbound traffic soon …
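The effect of the two import settings can be sketched as follows. The link names and the helper function are hypothetical. The local preference values are the ones on the slide.

```python
# Local preference assigned on import, per link (values from the slide).
IMPORT_LOCAL_PREF = {"primary": 100, "backup": 50}


def outbound_link(links_up):
    """Pick the exit link whose routes carry the highest local preference."""
    if not links_up:
        return None
    return max(links_up, key=lambda link: IMPORT_LOCAL_PREF[link])
```

While both links are up, routes learned over the primary link always win. The backup link is used only when the primary's routes disappear.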
49
48 Multihomed Backups (Outbound Traffic) Forces outbound traffic to take primary link, unless link is down. AS 1 primary link backup link Set Local Pref = 100 for all routes from AS 1 AS 2 Set Local Pref = 50 for all routes from AS 3 AS 3 provider
50
49 Shedding Inbound Traffic with ASPATH Prepending Prepending will (usually) force inbound traffic from AS 1 to take primary link AS 1 192.0.2.0/24 ASPATH = 2 2 2 customer AS 2 provider 192.0.2.0/24 backupprimary 192.0.2.0/24 ASPATH = 2 Yes, this is a Glorious Hack …
51
50 … But Padding Does Not Always Work AS 1 192.0.2.0/24 ASPATH = 2 2 2 2 2 2 2 2 2 2 2 2 2 2 customer AS 2 provider 192.0.2.0/24 ASPATH = 2 AS 3 provider AS 3 will send traffic on “backup” link because it prefers customer routes and local preference is considered before ASPATH length! Padding in this way is often used as a form of load balancing backupprimary
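A sketch of why prepending works in the single-provider case but not here. It compares routes on local preference first and AS path length second. The local preference values are assumptions: equal values at AS 1, and at AS 3 a customer value of 100 against a peer value of 90 for the route heard via AS 1.

```python
def preference_key(route):
    """Lower key wins: local preference first, then AS path length."""
    local_pref, as_path = route
    return (-local_pref, len(as_path))


def pick(routes):
    """Return the name of the preferred route from a name -> route mapping."""
    return min(routes, key=lambda name: preference_key(routes[name]))


# At AS 1 both announcements get the same local preference,
# so the prepended (longer) path on the backup link loses.
at_as1 = {"primary": (100, (2,)), "backup": (100, (2, 2, 2))}

# At AS 3 the padded route is a customer route; the alternative via AS 1
# is assumed to be a peer route with lower local preference.
at_as3 = {"backup": (100, (2,) * 14), "via AS 1": (90, (1, 2))}
```

No amount of padding helps at AS 3, because path length is never consulted once local preference has decided.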
52
51 COMMUNITY Attribute to the Rescue! AS 1 customer AS 2 provider 192.0.2.0/24 ASPATH = 2 AS 3 provider backupprimary 192.0.2.0/24 ASPATH = 2 COMMUNITY = 3:70 Customer import policy at AS 3: If 3:90 in COMMUNITY then set local preference to 90 If 3:80 in COMMUNITY then set local preference to 80 If 3:70 in COMMUNITY then set local preference to 70 AS 3: normal customer local pref is 100, peer local pref is 90
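The import policy at AS 3 amounts to a small lookup. The sketch below uses the community values and local preferences from the slide. The function name and the first-match-wins ordering are assumptions.

```python
# Community -> local preference, in the order the rules appear on the slide.
COMMUNITY_TO_LOCAL_PREF = {"3:90": 90, "3:80": 80, "3:70": 70}
DEFAULT_CUSTOMER_LOCAL_PREF = 100
PEER_LOCAL_PREF = 90


def customer_import_local_pref(communities):
    """Return the local preference AS 3 assigns to a customer route."""
    for community, pref in COMMUNITY_TO_LOCAL_PREF.items():
        if community in communities:
            return pref
    return DEFAULT_CUSTOMER_LOCAL_PREF
```

Tagging the backup announcement with 3:70 drops it below the peer value of 90, so AS 3 now prefers the route it hears from elsewhere and the backup link carries traffic only when nothing better exists.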
53
Don’t celebrate just yet… customer peering provider/customer Provider B (Tier 1) Provider A (Tier 1) Provider C (Tier 2) Now, customer wants a backup link to C…. provider/customer
54
Customer installs a “backup link” … customer Provider B (Tier 1) Provider A (Tier 1) Provider C (Tier 2) customer sends “lower my preference” Community value primary backup
55
Disaster Strikes! customer Provider B (Tier 1) Provider A (Tier 1) Provider C (Tier 2) primary backup customer is happy that backup was installed …
56
The primary link is repaired, and something odd occurs… customer Provider B (Tier 1) Provider A (Tier 1) Provider C (Tier 2) primary backup YIKES --- routing DOES NOT return to normal!!!
57
WAIT! It Gets Better… A P B B B C B D P = primaryB = backup
58
OOOOOPS! A P B B B C B D Suppose A, B, C all break ties in the same direction (clockwise or counter-clockwise) No solution = Protocol Divergence
59
What the heck is going on?
There is no guarantee that a BGP configuration has a unique routing solution.
–When multiple solutions exist, the (unpredictable) order of updates will determine which one wins.
There is no guarantee that a BGP configuration has any solution!
–And checking configurations is NP-complete [GW1999]
Complex policies (weights, communities setting preferences, and so on) increase chances of routing anomalies.
–… yet this is the current trend!
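The "no solution" case can be checked by brute force on a tiny example. The sketch below is a three-node configuration in the style of [GW1999], not necessarily the topology drawn on the preceding slides. Each node has a direct route to the destination but prefers the route through its clockwise neighbor, which is usable only while that neighbor is on its own direct route. All names are hypothetical.

```python
from itertools import product

NODES = (1, 2, 3)
NEXT = {1: 2, 2: 3, 3: 1}   # clockwise neighbor of each node


def best_choice(node, choices):
    """Prefer 'via' (through the clockwise neighbor) whenever it is usable."""
    return "via" if choices[NEXT[node]] == "direct" else "direct"


def is_stable(choices):
    """A state is stable if no node would change its route."""
    return all(choices[n] == best_choice(n, choices) for n in NODES)


def stable_solutions():
    """Enumerate every assignment and keep the stable ones."""
    states = (dict(zip(NODES, combo))
              for combo in product(("direct", "via"), repeat=len(NODES)))
    return [s for s in states if is_stable(s)]


def simulate(rounds):
    """Synchronous updates starting from everyone on the direct route."""
    choices = {n: "direct" for n in NODES}
    history = [tuple(choices[n] for n in NODES)]
    for _ in range(rounds):
        choices = {n: best_choice(n, choices) for n in NODES}
        history.append(tuple(choices[n] for n in NODES))
    return history
```

Each node wants the opposite of what its clockwise neighbor chose, and around an odd cycle that is impossible. stable_solutions() is therefore empty, and simulate flips between all-direct and all-via forever.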
60
Are we too complacent?
If the provider/customer digraph is acyclic and every AS obeys the commandments
–Thou shalt prefer customer routes over all others
–Thou shalt use provider routes only as a last resort
–Thou shalt not provide transit between peers or providers
then the BGP configuration is robust. [see Gao-Griffin-Rexford, INFOCOM 2001]
61
Worrisome trends…
Some Autonomous Routing Domains (ARDs) are implemented with multiple ASNs (example: MCI, InterNap, AT&T)
–Such “sibling” ASes are not confined to “customer/provider, peer/peer” relationships
–ASNs are becoming just an implementation detail.
Some ASes participate in different roles in different parts of the world (Sprint, for example).
–I don’t think we understand this.
We all know about MED…
–But MED oscillation is not a feature interaction problem (MEDs and Route Reflection), but rather a manifestation of BGP’s general principle: the more complex the policies, the more likely that bad things happen. MED just makes it easy to write very complex policies…
Communities are being used for clever interdomain signaling.
–Nobody has read “Inherently Safe Backup Routing with BGP” Gao, Griffin, Rexford. INFOCOM 2001
–“te” communities and extended communities…
62
Let’s look at “te” communities…
See: A survey of the utilization of the BGP community attribute. Bruno Quoitin and Olivier Bonaventure. http://www.infonet.fundp.ac.be/doc/tr/Infonet-TR-2002-02.html
AS13129, Global Access Telecommunications, Inc (Frankfurt). Accepted on inbound routes:
n = 0: do not announce prefix
1 <= n <= 3: prepend n times to announcement
13129:101n - do not announce/prepend to Sprint (AS1239)
13129:102n - do not announce/prepend to Cogent (AS16631)
13129:103n - do not announce/prepend to Abovenet (AS6461)
13129:111n - do not announce/prepend to DE-CIX
13129:112n - do not announce/prepend to INXS
13129:113n - do not announce/prepend to SFINX
13129:114n - do not announce/prepend to LINX
13129:115n - do not announce/prepend to AMS-IX
13129:116n - do not announce/prepend to IX-HH
13129:117n - do not announce/prepend to NYIIX
13129:191n - do not announce/prepend to DTAG (AS3320)
13129:192n - do not announce/prepend to DFN (AS680)
13129:1990 - do not announce to the RIPE RIS project (AS12654)
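To show how much policy is packed into these values, here is a hedged sketch of a decoder for the AS13129 scheme listed above. The table is copied from the slide. The function name, return format, and handling of unlisted values are assumptions.

```python
# First three digits of the community value -> target, as listed on the slide.
TE_TARGETS = {
    "101": "Sprint (AS1239)", "102": "Cogent (AS16631)",
    "103": "Abovenet (AS6461)", "111": "DE-CIX", "112": "INXS",
    "113": "SFINX", "114": "LINX", "115": "AMS-IX", "116": "IX-HH",
    "117": "NYIIX", "191": "DTAG (AS3320)", "192": "DFN (AS680)",
    "199": "RIPE RIS project (AS12654)",
}


def decode_te_community(community):
    """Return (action, target, n) for a 13129:xxxn community, else None."""
    asn, _, value = community.partition(":")
    if asn != "13129" or len(value) != 4 or not value.isdigit():
        return None
    target = TE_TARGETS.get(value[:3])
    n = int(value[3])
    if target is None or n > 3:
        return None
    if n == 0:
        return ("do not announce", target, 0)
    if value[:3] == "199":
        return None  # the slide lists only 13129:1990 for the RIS project
    return ("prepend", target, n)
```

A customer that attaches 13129:1012 is effectively asking AS13129 to prepend twice when announcing the route to Sprint. That is policy in one AS being steered from another.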
63
Some AS286 Communities
From KPN Eurorings Backbone:
remarks: +----------------------------------------------------------
remarks: | COMMUNITIES - ROUTE ORIGIN - NOT SETTABLE BY CUSTOMERS
remarks: |
remarks: | 286:286 European customer routes
remarks: | 286:999 US customer routes (received from Qwest)
remarks: | 286:888 European or US peer routes
remarks: |
remarks: | 286:3000 + countrycode Country where route is received
remarks: | countrycode E.164 international dial prefix
remarks: |
remarks: | EXAMPLES
remarks: |
remarks: | 286:286 286:3031 Customer in Amsterdam
remarks: | 286:286 286:3032 Customer in Brussels
remarks: | 286:888 286:3044 Peer in London
remarks: |
remarks: +----------------------------------------------------------
Comment: Aren’t we happy that RPSL has “remarks”!
64
Need “Semantics of Interdomain Routing” Distinct from mechanism of finding routings (protocols). –Don’t start with “new algorithms”!!! BGP policy languages/usage have evolved organically --- lack of design. –Too closely tied to mechanism of BGP –RPSL doesn’t even begin to address the issues… What do we want to be true? What do we mean by “autonomy” How much “expressive power” is really required?
65
References
[VGE1996, VGE2000] Persistent Route Oscillations in Inter-Domain Routing. Kannan Varadhan, Ramesh Govindan, and Deborah Estrin. Computer Networks, Jan. 2000. (Also USC Tech Report, Feb. 1996)
[GW1999] An Analysis of BGP Convergence Properties. Timothy G. Griffin, Gordon Wilfong. SIGCOMM 1999
[GSW1999] Policy Disputes in Path Vector Protocols. Timothy G. Griffin, F. Bruce Shepherd, Gordon Wilfong. ICNP 1999
[GW2001] A Safe Path Vector Protocol. Timothy G. Griffin, Gordon Wilfong. INFOCOM 2001
[GR2000] Stable Internet Routing without Global Coordination. Lixin Gao, Jennifer Rexford. SIGMETRICS 2000
[GGR2001] Inherently Safe Backup Routing with BGP. Lixin Gao, Timothy G. Griffin, Jennifer Rexford. INFOCOM 2001
[GW2002a] On the Correctness of IBGP Configurations. Griffin and Wilfong. SIGCOMM 2002
[GW2002b] An Analysis of the MED Oscillation Problem. Griffin and Wilfong. ICNP 2002
66
Pointers
Interdomain routing links
–http://www.cambridge.intel-research.net/~tgriffin/interdomain/
These slides
–http://www.cambridge.intel-research.net/~tgriffin/talks_tutorials