Download presentation
Presentation is loading. Please wait.
Published byGordon Rogers Modified over 8 years ago
1
Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin
2
Best Effort Connectivity This is the fundamental service provided by Internet Service Providers (ISPs) All other IP services depend on connectivity: DNS, email, VPNs, Web Hosting, … IP traffic 135.207.49.8192.0.2.153
3
3 Routing vs. Forwarding R R R A B C D R1 R2 R3 R4 R5 E NetNxt Hop R4 R3 R4 Direct R4 NetNxt Hop A B C D E default R2 Direct R5 R2 NetNxt Hop A B C D E default R1 Direct R3 R1 R3 R1 Default to upstream router A B C D E default Forwarding: determine next hop Routing: establish end-to-end paths Forwarding always works Routing can be badly broken
4
4 How Are Forwarding Tables Populated to implement Routing? StaticallyDynamically Routers exchange network reachability information using ROUTING PROTOCOLS. Routers use this to compute best routes Administrator manually configures forwarding table entries In practice : a mix of these. Static routing mostly at the “edge” + More control + Not restricted to destination-based forwarding - Doesn’t scale - Slow to adapt to network failures + Can rapidly adapt to changes in network topology + Can be made to scale well - Complex distributed algorithms - Consume CPU, Bandwidth, Memory - Debugging can be difficult - Current protocols are destination-based
5
Routers Talking to Routers Routing info Routing computation is distributed among routers within a routing domain Computation of best next hop based on routing information is the most CPU/memory intensive task on a router Routing messages are usually not routed, but exchanged via layer 2 between physically adjacent routers (internal BGP and multi-hop external BGP are exceptions)
6
Before We Go Any Further … IP ROUTING PROTOCOLS DO NOT DYNAMICALLY ROUTE AROUND NETWORK CONGESTION IP traffic can be very bursty Dynamic adjustments in routing typically operate more slowly than fluctuations in traffic load Dynamically adapting routing to account for traffic load can lead to wild, unstable oscillations of routing system
7
Autonomous Routing Domains A collection of physical networks glued together using IP, that have a unified administrative routing policy. Campus networks Corporate networks ISP Internal networks …
8
Autonomous Systems (ASes) An autonomous system is an autonomous routing domain that has been assigned an Autonomous System Number (ASN). RFC 1930: Guidelines for creation, selection, and registration of an Autonomous System … the administration of an AS appears to other ASes to have a single coherent interior routing plan and presents a consistent picture of what networks are reachable through it.
9
AS Numbers (ASNs) ASNs are 16 bit values. 64512 through 65535 are “private” Genuity: 1 MIT: 3 Harvard: 11 UC San Diego: 7377 AT&T: 7018, 6341, 5074, … UUNET: 701, 702, 284, 12199, … Sprint: 1239, 1240, 6211, 6242, … … ASNs represent units of routing policy Currently over 11,000 in use.
10
Architecture of Dynamic Routing AS 1 AS 2 BGP EGP = Exterior Gateway Protocol IGP = Interior Gateway Protocol Metric based: OSPF, IS-IS, RIP, EIGRP (cisco) Policy based: BGP The Routing Domain of BGP is the entire Internet OSPF EIGRP
11
Topology information is flooded within the routing domain Best end-to-end paths are computed locally at each router. Best end-to-end paths determine next-hops. Based on minimizing some notion of distance Works only if policy is shared and uniform Examples: OSPF, IS-IS Each router knows little about network topology Only best next-hops are chosen by each router for each destination network. Best end-to-end paths result from composition of all next- hop choices Does not require any notion of distance Does not require uniform policies at all routers Examples: RIP, BGP Link StateVectoring Technology of Distributed Routing
12
The Gang of Four Link StateVectoring EGP IGP BGP RIP IS-IS OSPF
13
13 Many Routing Processes Can Run on a Single Router Forwarding Table OSPF Domain RIP Domain BGP OS kernel OSPF Process OSPF Routing tables RIP Process RIP Routing tables BGP Process BGP Routing tables Forwarding Table Manager
14
14 IPv4 Addresses are 32 Bit Values 1111111100010001 1000011100000000 255 0 13417 255.17.134.0 Dotted quad notation IPv6 addresses have 128 bits
15
15 Classful Addresses 0nnnnnnn 10nnnnnnnnnnnnnn 110nnnnn hhhhhhhh n = network address bit h = host identifier bit Class A Class C Class B Leads to a rigid, flat, inefficient use of address space …
16
16 RFC 1519: Classless Inter-Domain Routing (CIDR) IP Address : 12.4.0.0 IP Mask: 255.254.0.0 0000110000000100 00000000 1111111111111110 00000000 Address Mask for hostsNetwork Prefix Use two 32 bit numbers to represent a network. Network number = IP address + Mask Usually written as 12.4.0.0/15
17
Which IP Addresses are Covered by a Prefix? 0000110000000100 00000000 1111111111111110 00000000 12.4.0.0/15 0000110000000101 0000100100010000 0000110000000111 0000100100010000 12.5.9.16 12.7.9.16 12.5.9.16 is covered by prefix 12.4.0.0/15 12.7.9.16 is not covered by prefix 12.4.0.0/15
18
18 CIDR = Hierarchy in Addressing 12.0.0.0/8 12.0.0.0/16 12.254.0.0/16 12.1.0.0/16 12.2.0.0/16 12.3.0.0/16 :::::: 12.253.0.0/16 12.3.0.0/24 12.3.1.0/24 :::: 12.3.254.0/24 12.253.0.0/19 12.253.32.0/19 12.253.64.0/19 12.253.96.0/19 12.253.128.0/19 12.253.160.0/19 12.253.192.0/19 ::::::
19
Classless Forwarding Destination =12.5.9.16 ------------------------------- payload PrefixInterfaceNext Hop 12.0.0.0/810.14.22.19ATM 5/0/8 12.4.0.0/15 12.5.8.0/23attached Ethernet 0/1/3 Serial 1/0/7 10.1.3.77 IP Forwarding Table 0.0.0.0/010.14.11.33ATM 5/0/9 even better OK better best!
20
IP Address Allocation and Assignment: Internet Registries IANA www.iana.org RFC 2050 - Internet Registry IP Allocation Guidelines RFC 1918 - Address Allocation for Private Internets RFC 1518 - An Architecture for IP Address Allocation with CIDR ARIN www.arin.org APNIC www.apnic.org RIPE www.ripe.org Allocate to National and local registries and ISPs Addresses assigned to customers by ISPs
21
21 Nontransit vs. Transit ASes ISP 1 ISP 2 Nontransit AS might be a corporate or campus network. Could be a “content provider” NET A Traffic NEVER flows from ISP 1 through NET A to ISP 2 (At least not intentionally!) IP traffic Internet Service providers (often) have transit networks
22
22 Selective Transit NET B NET C NET A provides transit between NET B and NET C and between NET D and NET C NET A NET D NET A DOES NOT provide transit Between NET D and NET B Most transit networks transit in a selective manner… IP traffic
23
Customers and Providers Customer pays provider for access to the Internet provider customer IP traffic provider customer
24
Customers Don’t Always Need BGP provider customer Nail up default routes 0.0.0.0/0 pointing to provider. Nail up routes 192.0.2.0/24 pointing to customer 192.0.2.0/24 Static routing is the most common way of connecting an autonomous routing domain to the Internet. This helps explain why BGP is a mystery to many …
25
Customer-Provider Hierarchy IP traffic provider customer
26
The Peering Relationship peer customerprovider Peers provide transit between their respective customers Peers do not provide transit between peers Peers (often) do not exchange $$$ traffic allowed traffic NOT allowed
27
Peering Provides Shortcuts Peering also allows connectivity between the customers of “Tier 1” providers. peer customerprovider
28
Peering Wars Reduces upstream transit costs Can increase end-to-end performance May be the only way to connect your customers to some part of the Internet (“Tier 1”) You would rather have customers Peers are usually your competition Peering relationships may require periodic renegotiation Peering struggles are by far the most contentious issues in the ISP world! Peering agreements are often confidential. PeerDon’t Peer
29
29 BGP-4 BGP = Border Gateway Protocol Is a Policy-Based routing protocol Is the de facto EGP of today’s global Internet Relatively simple protocol, but configuration is complex and the entire world can see, and be impacted by, your mistakes. 1989 : BGP-1 [RFC 1105] –Replacement for EGP (1984, RFC 904) 1990 : BGP-2 [RFC 1163] 1991 : BGP-3 [RFC 1267] 1995 : BGP-4 [RFC 1771] –Support for Classless Interdomain Routing (CIDR)
30
30 BGP Operations (Simplified) Establish session on TCP port 179 Exchange all active routes Exchange incremental updates AS1 AS2 While connection is ALIVE exchange route UPDATE messages BGP session
31
31 Four Types of BGP Messages Open : Establish a peering session. Keep Alive : Handshake at regular intervals. Notification : Shuts down a peering session. Update : Announcing new routes or withdrawing previously announced routes. announcement = prefix + attributes values
32
BGP Attributes Value Code Reference ----- --------------------------------- --------- 1 ORIGIN [RFC1771] 2 AS_PATH [RFC1771] 3 NEXT_HOP [RFC1771] 4 MULTI_EXIT_DISC [RFC1771] 5 LOCAL_PREF [RFC1771] 6 ATOMIC_AGGREGATE [RFC1771] 7 AGGREGATOR [RFC1771] 8 COMMUNITY [RFC1997] 9 ORIGINATOR_ID [RFC2796] 10 CLUSTER_LIST [RFC2796] 11 DPA [Chen] 12 ADVERTISER [RFC1863] 13 RCID_PATH / CLUSTER_ID [RFC1863] 14 MP_REACH_NLRI [RFC2283] 15 MP_UNREACH_NLRI [RFC2283] 16 EXTENDED COMMUNITIES [Rosen]... 255 reserved for development From IANA: http://www.iana.org/assignments/bgp-parameters This tutorial will cover these attributes Not all attributes need to be present in every announcement
33
Attributes are Used to Select Best Routes 192.0.2.0/24 pick me! 192.0.2.0/24 pick me! 192.0.2.0/24 pick me! 192.0.2.0/24 pick me! Given multiple routes to the same prefix, a BGP speaker must pick at most one best route (Note: it could reject them all!)
34
34 Two Types of BGP Neighbor Relationships External Neighbor (eBGP) in a different Autonomous Systems Internal Neighbor (iBGP) in the same Autonomous System AS1 AS2 eBGP iBGP iBGP is routed (using IGP!)
35
35 iBGP Peers Must be Fully Meshed iBGP neighbors do not announce routes received via iBGP to other iBGP neighbors. eBGP update iBGP updates iBGP is needed to avoid routing loops within an AS Injecting external routes into IGP does not scale and causes BGP policy information to be lost BGP does not provide “shortest path” routing Is iBGP an IGP? NO!
36
36 BGP Next Hop Attribute Every time a route announcement crosses an AS boundary, the Next Hop attribute is changed to the IP address of the border router that announced the route. AS 6431 AT&T Research 135.207.0.0/16 Next Hop = 12.125.133.90 AS 7018 AT&T AS 12654 RIPE NCC RIS project 12.125.133.90 135.207.0.0/16 Next Hop = 12.127.0.121 12.127.0.121
37
Forwarding Table Join EGP with IGP For Connectivity AS 1AS 2 192.0.2.1 135.207.0.0/16 10.10.10.10 EGP 192.0.2.1135.207.0.0/16 destinationnext hop 10.10.10.10192.0.2.0/30 destinationnext hop 135.207.0.0/16 Next Hop = 192.0.2.1 192.0.2.0/30 135.207.0.0/16 destinationnext hop 10.10.10.10 + 192.0.2.0/3010.10.10.10
38
Forwarding Table Next Hop Often Rewritten to Loopback AS 1AS 2 192.0.2.1 135.207.0.0/16 10.10.10.10 EGP 127.22.33.44135.207.0.0/16 destinationnext hop 10.10.10.10127.22.33.44 destinationnext hop 135.207.0.0/16 destinationnext hop 10.10.10.10 + 127.22.33.4410.10.10.10 127.22.33.44 135.207.0.0/16 Next Hop = 192.0.2.1 135.207.0.0/16 Next Hop = 127.22.33.44
39
Implementing Customer/Provider and Peer/Peer relationships Enforce transit relationships –Outbound route filtering Enforce order of route preference –provider < peer < customer Two parts:
40
Import Routes From peer From peer From provider From provider From customer From customer provider routecustomer routepeer routeISP route
41
Export Routes To peer To peer To customer To customer To provider From provider provider routecustomer routepeer routeISP route filters block
42
42 How Can Routes be Colored? BGP Communities! A community value is 32 bits By convention, first 16 bits is ASN indicating who is giving it an interpretation community number Very powerful BECAUSE it has no (predefined) meaning Community Attribute = a list of community values. (So one route can belong to multiple communities) RFC 1997 (August 1996) Used for signally within and between ASes Two reserved communities no_advertise 0xFFFFFF02: don’t pass to BGP neighbors no_export = 0xFFFFFF01: don’t export out of AS
43
Communities Example 1:100 –Customer routes 1:200 –Peer routes 1:300 –Provider Routes To Customers –1:100, 1:200, 1:300 To Peers –1:100 To Providers –1:100 AS 1 ImportExport
44
192.0.2.0/24 Accidental or malicious announcement of your prefix can blackhole your destinations in large part of the Internet peer customerprovider Need Filter Here! legitimate not legitimate Blackholes
45
Mars Attacks! 0.0.0.0/0: default 10.0.0.0/8: private 172.16.0.0/12: private 192.168.0.0/16: private 127.0.0.0/8: loopbacks 128.0.0.0/16: IANA reserved 192.0.2.0/24: test networks 224.0.0.0/3: classes D and E ….. Martian list often includes
46
Import Routes (Revisited) From peer From peer From provider From provider From customer From customer provider routecustomer routepeer routeISP route filters martians xxxxxx Customer address filters cccccc potential backhole Martian
47
47 So Many Choices Which route should Frank pick to 13.13.0.0./16? AS 1 AS 2 AS 4 AS 3 13.13.0.0/16 Frank’s Internet Barn peer customerprovider
48
48 BGP Route Processing Best Route Selection Apply Import Policies Best Route Table Apply Export Policies Install forwarding Entries for best Routes. Receive BGP Updates Best Routes Transmit BGP Updates Apply Policy = filter routes & tweak attributes Based on Attribute Values IP Forwarding Table Apply Policy = filter routes & tweak attributes Open ended programming. Constrained only by vendor configuration language
49
Tweak Tweak Tweak For inbound traffic –Filter outbound routes –Tweak attributes on outbound routes in the hope of influencing your neighbor’s best route selection For outbound traffic –Filter inbound routes –Tweak attributes on inbound routes to influence best route selection outbound routes inbound routes inbound traffic outbound traffic In general, an AS has more control over outbound traffic
50
Route Selection Summary Highest Local Preference Shortest ASPATH Lowest MED i-BGP < e-BGP Lowest IGP cost to BGP egress Lowest router ID traffic engineering Enforce relationships Throw up hands and break ties
51
51 Back to Frank … AS 1 AS 2 AS 4 AS 3 13.13.0.0/16 peer customerprovider local pref = 80 local pref = 100 local pref = 90 Higher Local preference values are more preferred Local preference only used in iBGP
52
52 Implementing Backup Links with Local Preference (Outbound Traffic) Forces outbound traffic to take primary link, unless link is down. AS 1 primary link backup link Set Local Pref = 100 for all routes from AS 1 AS 65000 Set Local Pref = 50 for all routes from AS 1 We’ll talk about inbound traffic soon …
53
53 Multihomed Backups (Outbound Traffic) Forces outbound traffic to take primary link, unless link is down. AS 1 primary link backup link Set Local Pref = 100 for all routes from AS 1 AS 2 Set Local Pref = 50 for all routes from AS 3 AS 3 provider
54
54 ASPATH Attribute AS7018 135.207.0.0/16 AS Path = 6341 AS 1239 Sprint AS 1755 Ebone AT&T AS 3549 Global Crossing 135.207.0.0/16 AS Path = 7018 6341 135.207.0.0/16 AS Path = 3549 7018 6341 AS 6341 135.207.0.0/16 AT&T Research Prefix Originated AS 12654 RIPE NCC RIS project AS 1129 Global Access 135.207.0.0/16 AS Path = 7018 6341 135.207.0.0/16 AS Path = 1239 7018 6341 135.207.0.0/16 AS Path = 1755 1239 7018 6341 135.207.0.0/16 AS Path = 1129 1755 1239 7018 6341
55
55 Interdomain Loop Prevention BGP at AS YYY will never accept a route with ASPATH containing YYY. AS 7018 12.22.0.0/16 ASPATH = 1 333 7018 877 Don’t Accept! AS 1
56
Traffic Often Follows ASPATH AS 4AS 3 AS 2 AS 1 135.207.0.0/16 ASPATH = 3 2 1 IP Packet Dest = 135.207.44.66
57
… But It Might Not AS 4AS 3 AS 2 AS 1 135.207.0.0/16 ASPATH = 3 2 1 IP Packet Dest = 135.207.44.66 AS 5 135.207.44.0/25 ASPATH = 5 135.207.44.0/25 AS 2 filters all subnets with masks longer than /24 135.207.0.0/16 ASPATH = 1 From AS 4, it may look like this packet will take path 3 2 1, but it actually takes path 3 2 5
58
In fairness: could you do this “right” and still scale? Exporting internal state would dramatically increase global instability and amount of routing state Shorter Doesn’t Always Mean Shorter AS 4 AS 3 AS 2 AS 1 Mr. BGP says that path 4 1 is better than path 3 2 1 Duh!
59
59 Shedding Inbound Traffic with ASPATH Padding Hack Padding will (usually) force inbound traffic from AS 1 to take primary link AS 1 192.0.2.0/24 ASPATH = 2 2 2 customer AS 2 provider 192.0.2.0/24 backupprimary 192.0.2.0/24 ASPATH = 2
60
60 Padding May Not Shut Off All Traffic AS 1 192.0.2.0/24 ASPATH = 2 2 2 2 2 2 2 2 2 2 2 2 2 2 customer AS 2 provider 192.0.2.0/24 ASPATH = 2 AS 3 provider AS 3 will send traffic on “backup” link because it prefers customer routes and local preference is considered before ASPATH length! Padding in this way is often used as a form of load balancing backupprimary
61
61 COMMUNITY Attribute to the Rescue! AS 1 customer AS 2 provider 192.0.2.0/24 ASPATH = 2 AS 3 provider backupprimary 192.0.2.0/24 ASPATH = 2 COMMUNITY = 3:70 Customer import policy at AS 3: If 3:90 in COMMUNITY then set local preference to 90 If 3:80 in COMMUNITY then set local preference to 80 If 3:70 in COMMUNITY then set local preference to 70 AS 3: normal customer local pref is 100, peer local pref is 90
62
62 Hot Potato Routing: Go for the Closest Egress Point 192.44.78.0/24 15 56 IGP distances egress 1 egress 2 This Router has two BGP routes to 192.44.78.0/24. Hot potato: get traffic off of your network as Soon as possible. Go for egress 1!
63
63 Getting Burned by the Hot Potato 15 56 17 2865 High bandwidth Provider backbone Low bandwidth customer backbone Heavy Content Web Farm Many customers want their provider to carry the bits! tiny http request huge http reply SFFNYC San Diego
64
64 Cold Potato Routing with MEDs (Multi-Exit Discriminator Attribute) 15 56 17 2865 Heavy Content Web Farm 192.44.78.0/24 MED = 15 192.44.78.0/24 MED = 56 This means that MEDs must be considered BEFORE IGP distance! Prefer lower MED values Note1 : some providers will not listen to MEDs Note2 : MEDs need not be tied to IGP distance
65
Route Selection Summary Highest Local Preference Shortest ASPATH Lowest MED i-BGP < e-BGP Lowest IGP cost to BGP egress Lowest router ID traffic engineering Enforce relationships Throw up hands and break ties This is somewhat simplified. Hey, what happened to ORIGIN??
66
Policies Can Interact Strangely (“Route Pinning” Example) backup Disaster strikes primary link and the backup takes over Primary link is restored but some traffic remains pinned to backup 1 2 3 4 Install backup link using community customer
67
News At 11:00 BGP is not guaranteed to converge on a stable routing. Policy interactions could lead to “livelock” protocol oscillations. See “Persistent Route Oscillations in Inter-domain Routing” by K. Varadhan, R. Govindan, and D. Estrin. ISI report, 1996 Corollary: BGP is not guaranteed to recover from network failures.
68
What Problem is BGP solving? Underlying problem Shortest Paths Distributed means of computing a solution. X? RIP, OSPF, IS-IS BGP aid in the design of policy analysis algorithms and heuristics aid in the analysis and design of BGP and extensions help explain some BGP routing anomalies provide a fun way of thinking about the protocol X could A Wee Bit of Theory
69
Separate dynamic and static semantics SPVP = Simple Path Vector Protocol = a distributed algorithm for solving SPP BGP SPVP Booo Hooo, Many, many complications... BGP Policies Stable Paths Problem (SPP) static semantics dynamic semantics See [Griffin, Shepherd, Wilfong]
70
1 An instance of the Stable Paths Problem (SPP) 2 5 5 2 1 0 0 2 1 0 2 0 1 3 0 1 0 3 0 4 2 0 4 3 0 3 4 2 1 A graph of nodes and edges, Node 0, called the origin, For each non-zero node, a set or permitted paths to the origin. This set always contains the “null path”. A ranking of permitted paths at each node. Null path is always least preferred. (Not shown in diagram) When modeling BGP : nodes represent BGP speaking routers, and 0 represents a node originating some address block most preferred … least preferred (not null) Yes, the translation gets messy!
71
5 5 2 1 0 1 A Solution to a Stable Paths Problem 2 0 2 1 0 2 0 1 3 0 1 0 3 0 4 2 0 4 3 0 3 4 2 1 node u’s assigned path is either the null path or is a path uwP, where wP is assigned to node w and {u,w} is an edge in the graph, each node is assigned the highest ranked path among those consistent with the paths assigned to its neighbors. A Solution need not represent a shortest path tree, or a spanning tree. A solution is an assignment of permitted paths to each node such that
72
An SPP may have multiple solutions First solution 102 1 2 0 1 0 102102 2 1 0 2 0 1 2 0 1 0 2 1 0 2 0 1 2 0 1 0 2 1 0 2 0 Second solution DISAGREE
73
BAD GADGET : No Solution 2 0 3 1 2 1 0 2 0 1 3 0 1 0 3 2 0 3 0 4 3
74
SURPRISE : Beware of Backup Policies 2 0 3 1 2 1 0 2 0 1 3 0 1 0 3 4 2 0 3 0 4 4 0 4 2 0 4 3 0 Becomes a BAD GADGET if link (4, 0) goes down. BGP is not robust : it is not guaranteed to recover from network failures.
75
PRECARIOUS 1 0 2 1 2 0 1 0 2 1 0 2 0 3 4 5 6 5 3 1 0 5 6 3 1 2 0 5 3 1 2 0 6 3 1 0 6 4 3 1 2 0 6 3 1 2 0 4 3 1 0 4 5 3 1 2 0 4 3 1 2 0 3 1 0 3 1 2 0 As with DISAGREE, this part has two distinct solutions This part has a solution only when node 1 is assigned the direct path (1 0). Has a solution, but can get “trapped”
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.