Presentation is loading. Please wait.

Presentation is loading. Please wait.

Univ. of TehranComputer Network1 Computer Networks Computer Networks (Graduate level) University of Tehran Dept. of EE and Computer Engineering By: Dr.

Similar presentations


Presentation on theme: "Univ. of TehranComputer Network1 Computer Networks Computer Networks (Graduate level) University of Tehran Dept. of EE and Computer Engineering By: Dr."— Presentation transcript:

1 Univ. of TehranComputer Network1 Computer Networks Computer Networks (Graduate level) University of Tehran Dept. of EE and Computer Engineering By: Dr. Nasser Yazdani Lecture 7: Inter-domain Routing

2 Univ. of TehranComputer Network2 Inter-Domain Routing Border Gateway Protocol (BGP) Assigned reading [LAB00] Delayed Internet Routing Convergence Sources RFC1771: main BGP RFC RFC1772-3-4: application, experiences, and analysis of BGP RFC1965: AS confederations for BGP Christian Huitema: “Routing in the Internet”, chapters 8 and 9. John Stewart III: “BGP4 - Inter-domain routing in the Internet”

3 Univ. of TehranComputer Network3 Outline External BGP (E-BGP) Internal BGP (I-BGP) Multi-Homing Stability Issues

4 Univ. of TehranComputer Network4 Internet’s Area Hierarchy What is an Autonomous System (AS)? A set of routers under a single technical administration, using an interior gateway protocol (IGP) and common metrics to route packets within the AS and using an exterior gateway protocol (EGP) to route packets to other AS’s Sometimes AS’s use multiple IGPs and metrics, but appear as single AS’s to other AS’s Each AS assigned unique ID AS’s peer at network exchange routing information.

5 Univ. of TehranComputer Network5 Example 12 3 1.1 1.2 2.1 2.2 3.1 3.2 2.2.1 4 4.1 4.2 5 5.1 5.2 EGP IGP EGP IGP EGP

6 Univ. of TehranComputer Network6 History Mid-80s: EGP Reachability protocol (no shortest path) Did not accommodate cycles (tree topology) Evolved when all networks connected to NSF backbone Result: BGP introduced as routing protocol Latest version = BGP 4 BGP-4 supports CIDR Primary objective: connectivity not performance

7 Univ. of TehranComputer Network7 Choices Link state or distance vector? No universal metric – policy decisions Problems with distance-vector: Bellman-Ford algorithm may not converge Problems with link state: Metric used by routers not the same – loops LS database too large – entire Internet May expose policies to other AS’s

8 Univ. of TehranComputer Network8 Solution: Distance Vector with Path Each routing update carries the entire path Loops are detected as follows: When AS gets route check if AS already is in path If yes, reject route If no, add self and (possibly) advertise route further Advantage: Metrics are local - AS chooses path, protocol ensures no loops

9 Univ. of TehranComputer Network9 Interconnecting BGP Peers BGP uses TCP to connect peers Advantages: Simplifies BGP No need for periodic refresh - routes are valid until withdrawn, or the connection is lost Incremental updates Disadvantages Congestion control on a routing protocol? Poor interaction during high load

10 Univ. of TehranComputer Network10 Hop-by-hop Model BGP advertises to neighbors only those routes that it uses Consistent with the hop-by-hop Internet paradigm e.g., AS1 cannot tell AS2 to route to other AS’s in a manner different than what AS2 has chosen (need source routing for that)

11 Univ. of TehranComputer Network11 AS Categories Stub: an AS that has only a single connection to one other AS - carries only local traffic. Multi-homed: an AS that has connections to more than one AS, but does not carry transit traffic Transit: an AS that has connections to more than one AS, and carries both transit and local traffic (under certain policy restrictions)

12 Univ. of TehranComputer Network12 AS Categories AS1 AS3 AS2 AS1 AS2 AS3AS1 AS2 Stub Multi-homed Transit

13 Univ. of TehranComputer Network13 Policy with BGP BGP provides capability for enforcing various policies Policies are not part of BGP: they are provided to BGP as configuration information BGP enforces policies by choosing paths from multiple alternatives and controlling advertisement to other AS’s

14 Univ. of TehranComputer Network14 Examples of BGP Policies A multi-homed AS refuses to act as transit Limit path advertisement A multi-homed AS can become transit for some AS’s Only advertise paths to some AS’s An AS can favor or disfavor certain AS’s for traffic transit from itself

15 Univ. of TehranComputer Network15 Routing Information Bases (RIB) Routes are stored in RIBs Adj-RIBs-In: routing info that has been learned from other routers (unprocessed routing info) Loc-RIB: local routing information selected from Adj-RIBs-In (routes selected locally) Adj-RIBs-Out: info to be advertised to peers (routes to be advertised)

16 Univ. of TehranComputer Network16 BGP Common Header Length (2 bytes)Type (1 byte) 0 123 Marker (security and message delineation) 16 bytes Types: OPEN, UPDATE, NOTIFICATION, KEEPALIVE

17 Univ. of TehranComputer Network17 Optional parameters BGP OPEN message LengthType: open 0 123 Marker (security and message delineation) version My autonomous systemHold time BGP identifier Parameter length My AS: id assigned to that AS Hold timer: max interval between KEEPALIVE or UPDATE messages interval implies no keep_alive. BGP ID: IP address of one interface (same for all messages)

18 Univ. of TehranComputer Network18 BGP UPDATE message LengthType: update 0 123 Marker (security and message delineation)..routes len Withdrawn routes (variable) Path attribute len Path attributes (variable) Network layer reachability information (NLRI) (variable) Many prefixes may be included in UPDATE, but must share same attributes. UPDATE message may report multiple withdrawn routes.... Withdrawn..

19 Univ. of TehranComputer Network19 BGP UPDATE Message List of withdrawn routes Network layer reachability information List of reachable prefixes Path attributes Origin Path Metrics All prefixes advertised in a message have same path attributes

20 Univ. of TehranComputer Network20 NLRI Network Level Reachability Information list of IP address prefixes encoded as follows: Length (1 byte) Prefix (variable)

21 Univ. of TehranComputer Network21 Path attributes Attribute type (2 bytes)Attribute length (1-2 bytes) Attribute Value (variable length) Type-Length-Value encoding Attribute flags (1 byte)Attribute type code (1 byte) Attribute type field Flags: optional, v.s. well-known transitive, partial, extended length

22 Univ. of TehranComputer Network22 Data BGP NOTIFICATION message Length Type: NOTIFICATION 0 123 Marker (security and message delineation) Error code Error sub-code Used for error notification TCP connection is closed immediately after notification

23 Univ. of TehranComputer Network23 BGP KEEPALIVE message Length Type: KEEPALIVE 0 123 Marker (security and message delineation) Sent periodically to peers to ensure connectivity. If hold_time is zero, messages are not sent.. Sent in place of an UPDATE message

24 Univ. of TehranComputer Network24 Path Selection Criteria Information based on path attributes Attributes + external (policy) information Examples: Hop count Policy considerations Preference for AS Presence or absence of certain AS Path origin Link dynamics

25 Univ. of TehranComputer Network25 Route Selection Summary Highest Local Preference Shortest ASPATH Lowest MED i-BGP < e-BGP Lowest IGP cost to BGP egress Lowest router ID traffic engineering Enforce relationships Throw up hands and break ties

26 26 Back to Frank … AS 1 AS 2 AS 4 AS 3 13.13.0.0/16 peer customerprovider local pref = 80 local pref = 100 local pref = 90 Higher Local preference values are more preferred Local preference only used in iBGP

27 27 Implementing Backup Links with Local Preference (Outbound Traffic) Forces outbound traffic to take primary link, unless link is down. AS 1 primary link backup link Set Local Pref = 100 for all routes from AS 1 AS 65000 Set Local Pref = 50 for all routes from AS 1 We’ll talk about inbound traffic soon …

28 28 Multihomed Backups (Outbound Traffic) Forces outbound traffic to take primary link, unless link is down. AS 1 primary link backup link Set Local Pref = 100 for all routes from AS 1 AS 2 Set Local Pref = 50 for all routes from AS 3 AS 3 provider

29 29 ASPATH Attribute AS7018 135.207.0.0/16 AS Path = 6341 AS 1239 Sprint AS 1755 Ebone AT&T AS 3549 Global Crossing 135.207.0.0/16 AS Path = 7018 6341 135.207.0.0/16 AS Path = 3549 7018 6341 AS 6341 135.207.0.0/16 AT&T Research Prefix Originated AS 12654 RIPE NCC RIS project AS 1129 Global Access 135.207.0.0/16 AS Path = 7018 6341 135.207.0.0/16 AS Path = 1239 7018 6341 135.207.0.0/16 AS Path = 1755 1239 7018 6341 135.207.0.0/16 AS Path = 1129 1755 1239 7018 6341

30 30 COMMUNITY Attribute to the Rescue! AS 1 customer AS 2 provider 192.0.2.0/24 ASPATH = 2 AS 3 provider backupprimary 192.0.2.0/24 ASPATH = 2 COMMUNITY = 3:70 Customer import policy at AS 3: If 3:90 in COMMUNITY then set local preference to 90 If 3:80 in COMMUNITY then set local preference to 80 If 3:70 in COMMUNITY then set local preference to 70 AS 3: normal customer local pref is 100, peer local pref is 90

31 31 Hot Potato Routing: Go for the Closest Egress Point 192.44.78.0/24 15 56 IGP distances egress 1 egress 2 This Router has two BGP routes to 192.44.78.0/24. Hot potato: get traffic off of your network as Soon as possible. Go for egress 1!

32 32 Getting Burned by the Hot Potato 15 56 17 2865 High bandwidth Provider backbone Low bandwidth customer backbone Heavy Content Web Farm Many customers want their provider to carry the bits! tiny http request huge http reply SFFNYC San Diego

33 33 Cold Potato Routing with MEDs (Multi-Exit Discriminator Attribute) 15 56 17 2865 Heavy Content Web Farm 192.44.78.0/24 MED = 15 192.44.78.0/24 MED = 56 This means that MEDs must be considered BEFORE IGP distance! Prefer lower MED values Note1 : some providers will not listen to MEDs Note2 : MEDs need not be tied to IGP distance

34 Univ. of TehranComputer Network34 Route Selection Summary Highest Local Preference Shortest ASPATH Lowest MED i-BGP < e-BGP Lowest IGP cost to BGP egress Lowest router ID traffic engineering Enforce relationships Throw up hands and break ties This is somewhat simplified. Hey, what happened to ORIGIN??

35 Univ. of TehranComputer Network35 Policies Can Interact Strangely (“Route Pinning” Example) backup Disaster strikes primary link and the backup takes over Primary link is restored but some traffic remains pinned to backup 1 2 3 4 Install backup link using community customer

36 Univ. of TehranComputer Network36 Path Attributes Categories (recall flags): well-known mandatory (passed on) well-known discretionary (passed on) optional transitive (passed on) optional non-transitive (if unrecognized, not passed on) Optional attributes allow for BGP extensions

37 Univ. of TehranComputer Network37 Path attribute message format (repeated) O T P E Attribute flags Attribute type code 0 O: optional or well-known T: transitive or local P: partially evaluated E: length in 1 or 2 bytes Origin AS_path Next hop etc.

38 Univ. of TehranComputer Network38 ORIGIN path attribute Well-known, mandatory attribute. Describes how a prefix was generated at the origin AS. Possible values: IGP: prefix learned from IGP EGP: prefix learned through EGP INCOMPLETE: none of the above (often seen for static routes)

39 Univ. of TehranComputer Network39 AS_PATH attribute Well-known, mandatory attribute. Important components: list of traversed AS’s If forwarding to internal peer: do not modify AS_PATH attribute If forwarding to external peer: prepend self into the path

40 Univ. of TehranComputer Network40 Next hop path attribute Well-known, mandatory attribute NEXT_HOP: IP address of border router to be used as next hop Usually, next hop is the router sending the UPDATE message Useful when some routers do not speak BGP

41 Univ. of TehranComputer Network41 Example of NEXT_HOP A (BGP) B (BGP) C (no BGP) 138.39.0.0/16 UPDATE MSG through BGP Traffic to 138.39.0.0/16

42 Univ. of TehranComputer Network42 LOCAL PREF Local (within an AS) mechanism to provide relative priority among BGP routers R1R2 R3R4 I-BGP AS 256 AS 300 Local Pref = 500Local Pref =800 AS 100 R5 AS 200

43 Univ. of TehranComputer Network43 AS_PATH List of traversed AS’s AS 500 AS 300 AS 200AS 100 180.10.0.0/16 300 200 100 170.10.0.0/16 300 200 170.10.0.0/16180.10.0.0/16

44 Univ. of TehranComputer Network44 CIDR and BGP AS X 197.8.2.0/24 AS Y 197.8.3.0/24 AS T (provider) 197.8.0.0/23 AS Z What should T announce to Z?

45 Univ. of TehranComputer Network45 Options Advertise all paths: Path 1: through T can reach 197.8.0.0/23 Path 2: through T can reach 197.8.2.0/24 Path 3: through T can reach 197.8.3.0/24 But this does not reduce routing tables! We would like to advertise: Path 1: through T can reach 197.8.0.0/22

46 Univ. of TehranComputer Network46 Sets and Sequences Problem: what do we list in the route? List T: omitting information not acceptable, may lead to loops List T, X, Y: misleading, appears as 3-hop path Solution: restructure AS Path attribute as: Path: (Sequence (T), Set (X, Y)) If Z wants to advertise path: Path: (Sequence (Z, T), Set (X, Y)) In practice used only if paths in set have same attributes

47 Univ. of TehranComputer Network47 Multi-Exit Discriminator (MED) Hint to external neighbors about the preferred path into an AS Non-transitive attribute (we will see later why) Different AS choose different scales Used when two AS’s connect to each other in more than one place

48 Univ. of TehranComputer Network48 MED Hint to R1 to use R3 over R4 link Cannot compare AS40’s values to AS30’s R1R2 R3R4 AS 30 AS 40 180.10.0.0 MED = 120 180.10.0.0 MED = 200 AS 10 180.10.0.0 MED = 50

49 Univ. of TehranComputer Network49 MED MED is typically used in provider/subscriber scenarios It can lead to unfairness if used between ISP because it may force one ISP to carry more traffic: SF NY ISP1 ignores MED from ISP2 ISP2 obeys MED from ISP1 ISP2 ends up carrying traffic most of the way ISP1 ISP2

50 Univ. of TehranComputer Network50 Other Attributes ORIGIN Source of route (IGP, EGP, other) NEXT_HOP Address of next hop router to use Used to direct traffic to non-BGP router Check out http://www.cisco.com for full explanationhttp://www.cisco.com

51 Univ. of TehranComputer Network51 Decision Process Processing order of attributes: Select route with highest LOCAL-PREF Select route with shortest AS-PATH Apply MED (if routes learned from same neighbor)

52 Univ. of TehranComputer Network52 Outline External BGP (E-BGP) Internal BGP (I-BGP) Multi-Homing Stability Issues

53 Univ. of TehranComputer Network53 Internal vs. External BGP R3R4 R1 R2 E-BGP BGP can be used by R3 and R4 to learn routes How do R1 and R2 learn routes? Option 1: Inject routes in IGP Only works for small routing tables Option 2: Use I-BGP AS1 AS2

54 Univ. of TehranComputer Network54 I-BGP

55 Univ. of TehranComputer Network55 Internal BGP (I-BGP) Same messages as E-BGP Different rules about re-advertising prefixes: Prefix learned from E-BGP can be advertised to I-BGP neighbor and vice-versa, but Prefix learned from one I-BGP neighbor cannot be advertised to another I-BGP neighbor Reason: no AS PATH within the same AS and thus danger of looping.

56 Univ. of TehranComputer Network56 Internal BGP (I-BGP) R3R4 R1 R2 E-BGP I-BGP R3 can tell R1 and R2 prefixes from R4 R3 can tell R4 prefixes from R1 and R2 R3 cannot tell R2 prefixes from R1 R2 can only find these prefixes through a direct connection to R1 Result: I-BGP routers must be fully connected (via TCP)! contrast with E-BGP sessions that map to physical links AS1 AS2

57 Univ. of TehranComputer Network57 Link Failures Two types of link failures: Failure on an E-BGP link Failure on an I-BGP Link These failures are treated completely different in BGP Why?

58 Univ. of TehranComputer Network58 Failure on an E-BGP Link AS1 R1 AS2 R2 Physical link E-BGP session 138.39.1.1/30138.39.1.2/30 If the link R1-R2 goes down The TCP connection breaks BGP routes are removed This is the desired behavior

59 Univ. of TehranComputer Network59 Failure on an I-BGP Link R1 R2 R3 Physical link I-BGP connection 138.39.1.1/30 138.39.1.2/30 If link R1-R2 goes down, R1 and R2 should still be able to exchange traffic The indirect path through R3 must be used Thus, E-BGP and I-BGP must use different conventions with respect to TCP endpoints

60 Univ. of TehranComputer Network60 Outline External BGP (E-BGP) Internal BGP (I-BGP) Multi-Homing Stability Issues

61 Univ. of TehranComputer Network61 Multi-homing With multi-homing, a single network has more than one connection to the Internet. Improves reliability and performance: Can accommodate link failure Bandwidth is sum of links to Internet Challenges Getting policy right (MED, etc..) Addressing

62 Univ. of TehranComputer Network62 Multi-homing to a Single Provider Case 1 ISP Customer R1 R2 Easy solution: Use IMUX or Multi-link PPP Hard solution: Use BGP Makes assumptions about traffic (same amount of prefixes can be reached from both links)

63 Univ. of TehranComputer Network63 Multi-homing to a single provider: Case 2 ISP Customer R1 R2 If multiple prefixes, may use MED good if traffic load from prefixes is equal If single prefix, load may be unequal break-down prefix and advertise different prefixes over different links R3 138.39/16204.70/16

64 Univ. of TehranComputer Network64 Multi-homing to a single provider: Case 3 ISP Customer R1R2 For ISP-> customer traffic, same as before: use MED good if traffic load to prefixes is equal For customer -> ISP traffic: R3 alternates links multiple default routes R3 138.39/16204.70/16

65 Univ. of TehranComputer Network65 Multi-homing to a single provider: Case 4 ISP Customer R1R2 Most reliable approach no equipment sharing Customer -> ISP: same as case 2 ISP -> customer: same as case 3 R3 138.39/16204.70/16 R4

66 Univ. of TehranComputer Network66 Multi-homing to Multiple Providers Major issues: Addressing Aggregation Customer address space: Delegated by ISP1 Delegated by ISP2 Delegated by ISP1 and ISP2 Obtained independently Advantage and disadvantage? ISP1ISP2 ISP3 Customer

67 Univ. of TehranComputer Network67 Address Space from one ISP Customer uses address space from one, I.e ISP1 ISP1 advertises /16 aggregate Customer advertises /24 route to ISP2 ISP2 relays route to ISP1 and ISP3 ISP2-3 use /24 route ISP1 routes directly Problems with traffic load? 138.39/16 138.39.1/24 ISP1ISP2 ISP3 Customer

68 Univ. of TehranComputer Network68 Pitfalls ISP1 aggregates to a /19 at border router to reduce internal tables. ISP1 still announces /16. ISP1 hears /24 from ISP2. ISP1 routes packets for customer to ISP2! Workaround: ISP1 must inject /24 into I-BGP. 138.39.0/19 138.39/16 ISP1ISP2 ISP3 Customer 138.39.1/24

69 Univ. of TehranComputer Network69 Address Space from Both ISPs ISP1 and ISP2 continue to announce aggregates Load sharing depends on traffic to two prefixes Lack of reliability: if ISP1 link goes down, part of customer becomes inaccessible. Customer may announce prefixes to both ISPs, but still problems with longest match as in case 1. 138.39.1/24 204.70.1/24 ISP1ISP2 ISP3 Customer

70 Univ. of TehranComputer Network70 Address Space Obtained Independently Offers the most control, but at the cost of aggregation. Still need to control paths suppose ISP1 large, ISP2-3 small customer advertises long path to ISP1, but local-pref attribute used to override ISP3 learns shorter path from ISP2 ISP1ISP2 ISP3 Customer

71 Univ. of TehranComputer Network71 Outline External BGP (e-BGP) Internal BGP (i-BGP) Multi-Homing Stability Issues

72 Univ. of TehranComputer Network72 Signs of Routing Instability Record of BGP messages at major exchanges Discovered orders of magnitude larger than expected updates Bulk were duplicate withdrawals Stateless implementation of BGP – did not keep track of information passed to peers Impact of few implementations Strong frequency (30/60 sec) components Interaction with other local routing/links etc.

73 Univ. of TehranComputer Network73 Route Flap Storm Overloaded routers fail to send Keep_Alive message and marked as down I-BGP peers find alternate paths Overloaded router re-establishes peering session Must send large updates Increased load causes more routers to fail!

74 Univ. of TehranComputer Network74 Route Flap Dampening Routers now give higher priority to BGP/Keep_Alive to avoid problem Associate a penalty with each route Increase when route flaps Exponentially decay penalty with time When penalty reaches threshold, suppress route

75 Univ. of TehranComputer Network75 BGP Limitations: Oscillations AS 0 AS 2 AS 1 R (*R,1R,2R) (0R,*R,2R)(0R,1R,*R)

76 Univ. of TehranComputer Network76 BGP Limitations: Oscillations AS 0 AS 2 AS 1 R (-,*1R,2R) (*0R,-,2R) (*0R,1R,-) W W W    (*R,1R,2R) (0R,*R,2R) (0R,1R,*R)

77 Univ. of TehranComputer Network77 BGP Limitations: Oscillations AS 0 AS 2 AS 1 R (-,*1R,2R) (-,-,*2R) (01R,*1R,-) 01R    (-,*1R,2R) (*0R,-,2R) (*0R,1R,-)

78 Univ. of TehranComputer Network78 BGP Limitations: Oscillations AS 0 AS 2 AS 1 R (-,-,*2R) (*01R,10R,-) 10R    (-,*1R,2R) (-,-,*2R) (01R,*1R,-)

79 Univ. of TehranComputer Network79 BGP Limitations: Oscillations AS 0 AS 2 AS 1 R (-,-,-) (-,-,*20R) (*01R,10R,-) 20R    (-,-,*2R) (*01R,10R,-)

80 Univ. of TehranComputer Network80 BGP Limitations: Oscillations AS 0 AS 2 AS 1 R (-,*12R,-) (-,-,*20R) (*01R,-,-) 12R    (-,-,-) (-,-,*20R)(*01R,10R,-)

81 Univ. of TehranComputer Network81 BGP Limitations: Oscillations AS 0 AS 2 AS 1 R (-,*12R,21R) (-,-,-)(*01R,-,-) 21R (-,*12R,-) (-,-,*20R)(*01R,-,-)  

82 Univ. of TehranComputer Network82 BGP Oscillations Can possible explore every possible path through network  (n-1)! Combinations Limit between update messages (MinRouteAdver) reduces exploration Forces router to process all outstanding messages Typical Internet failover times New/shorter link  60 seconds Results in simple replacement at nodes Down link  180 seconds Results in search of possible options Longer link  120 seconds Results in replacement or search based on length

83 Univ. of TehranComputer Network83 Problems Routing table size Need an entry for all paths to all networks Required memory= O((N + M*A) * K) N: number of networks M: mean AS distance (in terms of hops) A: number of AS’s K: number of BGP peers

84 Univ. of TehranComputer Network84 Routing Table Size Mean AS DistanceNumber of AS’s 2,100559 4,00010100 10,00015300 BGP Peers/Net 3 6 10 100,000203,00020 NetworksMemory 27,000 108,000 490,000 1,040,000 Problem reduced with CIDR

85 Univ. of TehranComputer Network85 Next Lecture: TCP Transport layer issues Assigned reading [S+99] The End-to-End Effects of Internet Path Selection [Tsu88] The Landmark Hierarchy: A New Hierarchy for Routing in Very Large Networks


Download ppt "Univ. of TehranComputer Network1 Computer Networks Computer Networks (Graduate level) University of Tehran Dept. of EE and Computer Engineering By: Dr."

Similar presentations


Ads by Google