Nick Feamster Georgia Tech


Advanced Routing Nick Feamster Georgia Tech

Tutorial Outline Topology BGP IS-IS Business relationships BGP/MPLS VPNs

Internet Routing Overview Autonomous Systems (ASes) Abilene Comcast Georgia Tech AT&T Cogent Today: Intradomain (i.e., “intra-AS”) routing Monday: Interdomain routing

Today: Routing Inside an AS Intra-AS topology Nodes and edges Example: Abilene Intradomain routing protocols Distance Vector Split-horizon/Poison-reverse Example: RIP Link State Example: OSPF, ISIS

Topology Design Where to place “nodes”? Typically in dense population centers Close to other providers (easier interconnection) Close to other customers (cheaper backhaul) Note: A “node” may in fact be a group of routers, located in a single city. Called a “Point-of-Presence” (PoP) Where to place “edges”? Often constrained by location of fiber

Node Clusters: Point-of-Presence (PoP) A “cluster” of routers in a single physical location Inter-PoP links: long distances, high bandwidth Intra-PoP links: cables between racks or floors, aggregated bandwidth

Example: Abilene Network Topology

Another Example Backbone

Problem: Routing Routing: the process by which nodes discover where to forward traffic so that it reaches a certain node Within an AS: there are two “styles” Distance vector: iterative, asynchronous, distributed Link State: global information, centralized algorithm

Forwarding vs. Routing Forwarding: data plane. Directing a data packet to an outgoing link; individual router using a forwarding table. Routing: control plane. Computing paths the packets will follow; routers talking amongst themselves; individual router creating a forwarding table.

Distance-Vector Routing [Figure: three-node topology x, y, z with the distance table held at each node] Routers send routing table copies to neighbors Routers compute costs to destination based on shortest available path Based on Bellman-Ford Algorithm: $d_x(y) = \min_v \{ c(x,v) + d_v(y) \}$ Solution to this equation is x’s forwarding table

Distance Vector Algorithm Iterative, asynchronous: each local iteration caused by: Local link cost change Distance vector update message from neighbor Distributed: Each node notifies neighbors only when its DV changes Neighbors then notify their neighbors if necessary Each node loops: wait for (change in local link cost or message from neighbor); recompute estimates; if DV to any destination has changed, notify neighbors
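The loop above can be sketched in Python. This is a minimal sketch, not the slides' own code: it runs the nodes in synchronous rounds rather than asynchronously, and the function name `dv_converge` and the link costs (x-y = 1, y-z = 2, x-z = 5) are assumptions chosen for illustration.

```python
import math

INF = math.inf

def dv_converge(nodes, links):
    """Run synchronous distance-vector rounds until no vector changes.

    links maps a (u, v) pair to the cost of the bidirectional link u-v.
    Returns (dist, rounds), where dist[x][y] is x's estimated cost to y.
    """
    cost = {u: {v: INF for v in nodes} for u in nodes}
    for (u, v), c in links.items():
        cost[u][v] = cost[v][u] = c

    # Initially each node knows only itself and its directly attached links.
    dist = {u: {v: 0 if u == v else cost[u][v] for v in nodes} for u in nodes}

    rounds = 0
    changed = True
    while changed:
        changed = False
        rounds += 1
        # Snapshot of the vectors every node advertised at the start of the round.
        heard = {u: dict(dist[u]) for u in nodes}
        for x in nodes:
            neighbors = [v for v in nodes if v != x and cost[x][v] < INF]
            for y in nodes:
                if y == x:
                    continue
                # Bellman-Ford update: d_x(y) = min over neighbors v of c(x,v) + d_v(y)
                best = min((cost[x][v] + heard[v][y] for v in neighbors), default=INF)
                if best != dist[x][y]:
                    dist[x][y] = best
                    changed = True
    return dist, rounds

# Hypothetical three-node example: x-y costs 1, y-z costs 2, x-z costs 5.
dist, rounds = dv_converge(["x", "y", "z"],
                           {("x", "y"): 1, ("y", "z"): 2, ("x", "z"): 5})
print(dist["x"]["z"])  # x's cost to z after convergence (via y)
```

In a real protocol each node would run this update only when a link cost changes or a neighbor's vector arrives; the synchronous rounds are a simplification that keeps the sketch short.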

Good News Travels Quickly [Figure: distance tables for x, y, and z after a link cost decrease] When costs decrease, network converges quickly

Problem: Bad News Travels Slowly [Figure: distance tables for x, y, and z after a link cost increases to 60] Note also that there is a forwarding loop between y and z.

This continues… [Figure: distance tables for x, y, and z as the estimates keep growing] Question: How long does this continue? Answer: Until z’s path cost to x via y is greater than 50.

“Solution”: Poison Reverse [Figure: distance tables for x, y, and z, with z’s advertised cost to x marked X (infinite) toward y] If z routes through y to get to x, z advertises infinite cost for x to y Does poison reverse always work?
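Poison reverse only changes what a node advertises, not how it computes routes. A minimal sketch follows, assuming a routing table that maps each destination to a (cost, next hop) pair; the `advertise` helper and the table contents are hypothetical.

```python
import math

INF = math.inf

def advertise(table, neighbor):
    """Build the distance vector a node sends to `neighbor`.

    `table` maps destination -> (cost, next_hop). Any route whose next hop
    is the neighbor itself is advertised back to it with infinite cost.
    """
    vector = {}
    for dest, (cost, next_hop) in table.items():
        vector[dest] = INF if next_hop == neighbor else cost
    return vector

# Hypothetical table at z: z reaches x through y at cost 3.
z_table = {"x": (3, "y"), "y": (2, "y")}

to_y = advertise(z_table, "y")  # x is poisoned toward y
to_x = advertise(z_table, "x")  # x sees z's real costs
```

Because y never hears a finite cost to x from z, y cannot choose z as its next hop toward x after the x-y link degrades, which removes the two-node loop.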

Does Poison Reverse Always Work? [Figure: topology with an additional node w]

Routing Information Protocol (RIP) Distance vector protocol Nodes send distance vectors every 30 seconds … or, when an update causes a change in routing Link costs in RIP All links have cost 1 Valid distances of 1 through 15 … with 16 representing infinity Small “infinity” means a smaller “counting to infinity” problem
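To see why a small infinity bounds the damage, consider two routers that each believe the other still has a path to a lost destination. The sketch below is an illustration only (it is not RIP's message handling, and the function names are made up): each exchange adds one hop until the metric saturates at 16.

```python
RIP_INFINITY = 16  # metric value that RIP treats as "unreachable"

def metric_via_neighbor(advertised):
    """Metric for a route learned from a neighbor: one more hop, capped at infinity."""
    return min(advertised + 1, RIP_INFINITY)

def count_to_infinity(start_metric):
    """Count the exchanges before a stale route bounced between two routers
    is finally recognized as unreachable."""
    metric, exchanges = start_metric, 0
    while metric < RIP_INFINITY:
        metric = metric_via_neighbor(metric)
        exchanges += 1
    return exchanges

exchanges = count_to_infinity(2)
```

With updates every 30 seconds and no triggered updates, each exchange in this model could take up to one update interval, so the cap on the metric also caps how long the loop can persist.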

Link-State Routing Keep track of the state of incident links Whether the link is up or down The cost on the link Broadcast the link state Every router has a complete view of the graph Compute Dijkstra’s algorithm Examples: Open Shortest Path First (OSPF) Intermediate System – Intermediate System (IS-IS)

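Link-State Routing Idea: distribute a network map. Each node performs shortest path (SPF) computation between itself and all other nodes. Initialization step (at node u): set D(v) = c(u,v) for each immediate neighbor v, else infinite; the set of finished nodes N starts as {u}; link costs c(u,v) have been flooded to every node. Iteration: pick the node w not in N with the smallest D(w), add w to N, then for each neighbor v of w that is not in N set D(v) = min( D(w) + c(w,v), D(v) )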
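The SPF computation that each router runs on its copy of the map can be sketched with a priority queue. This is a generic Dijkstra sketch; the graph representation (a dict of neighbor-to-cost dicts) and the example costs are assumptions.

```python
import heapq

def shortest_paths(graph, source):
    """Dijkstra's algorithm over graph[u] = {v: cost}.

    Returns (dist, prev): least cost to each reachable node and the
    predecessor of each node on its shortest path from `source`.
    """
    dist = {source: 0}
    prev = {}
    finished = set()           # the set N of nodes whose distance is final
    heap = [(0, source)]
    while heap:
        d, w = heapq.heappop(heap)
        if w in finished:
            continue           # stale heap entry
        finished.add(w)
        for v, cost in graph[w].items():
            candidate = d + cost          # D(w) + c(w,v)
            if candidate < dist.get(v, float("inf")):
                dist[v] = candidate
                prev[v] = w
                heapq.heappush(heap, (candidate, v))
    return dist, prev

# Hypothetical map: x-y costs 1, y-z costs 2, x-z costs 5.
graph = {
    "x": {"y": 1, "z": 5},
    "y": {"x": 1, "z": 2},
    "z": {"x": 5, "y": 2},
}
dist, prev = shortest_paths(graph, "x")
```

The forwarding table entry for each destination follows from `prev` by walking back to the first hop after the source.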

Detecting Topology Changes Beaconing Periodic “hello” messages in both directions Detect a failure after a few missed “hellos” Performance trade-offs Detection speed Overhead on link bandwidth and CPU Likelihood of false detection

Broadcasting the Link State Flooding Node sends link-state information out its links The next node sends out all of its links except the one where the information arrived [Figure: four panels (a) to (d) showing a link-state message spreading among nodes X, A, B, C, D]

Broadcasting the Link State Reliable flooding Ensure all nodes receive the latest link-state information Challenges Packet loss Out-of-order arrival Solutions Acknowledgments and retransmissions Sequence numbers Time-to-live for each packet
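The role of sequence numbers in flooding can be sketched as follows. This is a simplified model under stated assumptions: no packet loss, no acknowledgments or time-to-live, and a hypothetical five-node topology (the links among X, A, B, C, D are made up for illustration).

```python
from collections import deque

def flood(topology, databases, origin, seq):
    """Flood a link-state advertisement (origin, seq) through the network.

    A node accepts and re-floods the advertisement only if its sequence
    number is newer than what the node already stores for that origin.
    Returns the number of link transmissions used.
    """
    databases[origin][origin] = seq
    queue = deque([(origin, None)])   # (node holding the LSA, neighbor it came from)
    transmissions = 0
    while queue:
        node, arrived_from = queue.popleft()
        for neighbor in topology[node]:
            if neighbor == arrived_from:
                continue              # never send back out the arrival link
            transmissions += 1
            if databases[neighbor].get(origin, -1) < seq:
                databases[neighbor][origin] = seq
                queue.append((neighbor, node))
    return transmissions

# Hypothetical topology: a ring X-A-B-D-C-X.
topology = {
    "X": ["A", "C"], "A": ["X", "B"], "B": ["A", "D"],
    "C": ["X", "D"], "D": ["B", "C"],
}
databases = {node: {} for node in topology}
first = flood(topology, databases, "X", seq=1)
repeat = flood(topology, databases, "X", seq=1)  # duplicates die at the first hop
```

The second call shows how the sequence number stops duplicates: neighbors already hold sequence number 1 for X, so they discard the copy instead of forwarding it.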

When to Initiate Flooding Topology change Link or node failure Link or node recovery Configuration change Link cost change Periodically Refresh the link-state information Typically (say) 30 minutes Corrects for possible corruption of the data

Scaling Link-State Routing Message overhead Suppose a link fails. How many LSAs will be flooded to each router in the network? Two routers send LSA to A adjacent routers Each of A routers sends to A adjacent routers … Suppose a router fails. How many LSAs will be generated? Each of A adjacent routers originates an LSA …

Scaling Link-State Routing Two scaling problems Message overhead: Flooding link-state packets Computation: Running Dijkstra’s shortest-path algorithm Introducing hierarchy through “areas” [Figure: Area 0 with an area border router]

Link-State vs. Distance-Vector Convergence: DV has count-to-infinity; DV often converges slowly (minutes); DV has timing dependences; Link-state: O(n^2) algorithm requires O(nE) messages. Robustness: Route calculations a bit more robust under link-state; DV algorithms can advertise incorrect least-cost paths; in DV, errors can propagate (nodes use each other’s tables). Bandwidth consumption for messages: Messages flooded in link state

Open Shortest Path First (OSPF) Key Feature: hierarchy Network’s routers divided into areas Backbone area is area 0 Area 0 routers perform SPF computation All inter-area traffic travels through Area 0 routers (“border routers”)

Another Example: IS-IS Originally: ISO Connectionless Network Protocol CLNP: ISO equivalent to IP for datagram delivery services ISO 10589 or RFC 1142 Later: Integrated or Dual IS-IS (RFC 1195) IS-IS adapted for IP Doesn’t use IP to carry routing messages OSPF more widely used in enterprise, IS-IS in large service providers

Hierarchical Routing in IS-IS [Figure: Area 49.001 and Area 49.0002, each running Level-1 routing, joined by a backbone running Level-2 routing] Like OSPF, 2-level routing hierarchy Within an area: level-1 Between areas: level-2 Level 1-2 Routers: Level-2 routers may also participate in L1 routing

ISIS on the Wire…

IS-IS Configuration on Abilene (atlang)
ISO address configured on loopback interface:
lo0 {
    unit 0 {
        ….
        family iso {
            address 49.0000.0000.0000.0014.00;
        }
Only Level 2 IS-IS in Abilene:
isis {
    level 2 wide-metrics-only;
    /* OC192 to WASHng */
    interface so-0/0/0.0 {
        level 2 metric 846;
        level 1 disable;

Interdomain Routing Today’s interdomain routing protocol: BGP BGP route attributes Usage Problems Business relationships See http://nms.lcs.mit.edu/~feamster/papers/dissertation.pdf (Chapter 2.1-2.3) for good coverage of this topic.

Internet Routing The Internet Abilene Georgia Tech Comcast AT&T Cogent Large-scale: Thousands of autonomous networks Self-interest: Independent economic and performance objectives But, must cooperate for global connectivity

Internet Business Model (Simplified) Routes toward a destination via a provider: pay to use. Via a peer: free to use. Via a customer: get paid to use. Preferences implemented with local preference manipulation. Customer/Provider: One AS pays another for reachability to some set of destinations. “Settlement-free” Peering: Bartering. Two ASes exchange routes with one another.

Relationship #1: Customer-Provider Filtering: Routes from customer: to everyone. Routes from provider: only to customers. [Figure: advertisements and traffic between providers and a customer; traffic flows from other destinations to the customer, and from the customer to other destinations]

Relationship #2: Peering Filtering: Routes from peer: only to customers. No routes from other peers or providers. [Figure: advertisements and traffic between two peers and their customers]
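The two filtering rules combine into one export decision. A minimal sketch, assuming each neighbor is labeled with one of three relationship strings; the function name `should_export` is hypothetical.

```python
def should_export(learned_from, export_to):
    """Decide whether a route learned from one neighbor is advertised to another.

    Both arguments are the neighbor's relationship to this AS:
    "customer", "peer", or "provider".
    """
    if learned_from == "customer":
        return True                    # customer routes go to everyone
    return export_to == "customer"     # peer and provider routes go only to customers

decisions = {
    (src, dst): should_export(src, dst)
    for src in ("customer", "peer", "provider")
    for dst in ("customer", "peer", "provider")
}
```

The effect is that an AS carries transit traffic only when a customer is at one end of the path, which matches who pays whom in the business model above.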

The Business Game and Depeering Cooperative competition (brinksmanship) Much more desirable to have your peer’s customers Much nicer to get paid for transit Peering “tiffs” are relatively common 31 Jul 2005: Level 3 Notifies Cogent of intent to disconnect. 16 Aug 2005: Cogent begins massive sales effort and mentions a 15 Sept. expected depeering date. 31 Aug 2005: Level 3 Notifies Cogent again of intent to disconnect (according to Level 3) 5 Oct 2005 9:50 UTC: Level 3 disconnects Cogent. Mass hysteria ensues up to, and including policymakers in Washington, D.C. 7 Oct 2005: Level 3 reconnects Cogent During the “outage”, Level 3 and Cogent’s singly homed customers could not reach each other. (~ 4% of the Internet’s prefixes were isolated from each other)

Depeering Continued Resolution… …but not before an attempt to steal customers! As of 5:30 am EDT, October 5th, Level(3) terminated peering with Cogent without cause (as permitted under its peering agreement with Cogent) even though both Cogent and Level(3) remained in full compliance with the previously existing interconnection agreement. Cogent has left the peering circuits open in the hope that Level(3) will change its mind and allow traffic to be exchanged between our networks. We are extending a special offering to single homed Level 3 customers. Cogent will offer any Level 3 customer, who is single homed to the Level 3 network on the date of this notice, one year of full Internet transit free of charge at the same bandwidth currently being supplied by Level 3. Cogent will provide this connectivity in over 1,000 locations throughout North America and Europe.

Internet Routing Protocol: BGP [Figure: Autonomous Systems (ASes) connected by BGP sessions, with a route advertisement flowing one way and traffic flowing the other]
Route advertisement:
Destination       Next-hop         AS Path
130.207.0.0/16    192.5.89.89      10578..2637
130.207.0.0/16    66.250.252.44    174… 2637
Destination-based routing. Routing tables look like a set of possible routes and a ranking over these routes.

Two Flavors of BGP External BGP (eBGP): exchanging routes between ASes Internal BGP (iBGP): disseminating routes to external destinations among the routers within an AS Question: What’s the difference between IGP and iBGP?

Example BGP Routing Table
The full routing table:
> show ip bgp
   Network          Next Hop       Metric LocPrf Weight Path
*>i3.0.0.0          4.79.2.1            0    110      0 3356 701 703 80 i
*>i4.0.0.0          4.79.2.1            0    110      0 3356 i
*>i4.21.254.0/23    208.30.223.5       49    110      0 1239 1299 10355 10355 i
* i4.23.84.0/22     208.30.223.5      112    110      0 1239 6461 20171 i
Specific entry (can do longest prefix lookup); the output shows the prefix, the AS path, and the next-hop:
> show ip bgp 130.207.7.237
BGP routing table entry for 130.207.0.0/16
Paths: (1 available, best #1, table Default-IP-Routing-Table)
  Not advertised to any peer
  10578 11537 10490 2637
    192.5.89.89 from 18.168.0.27 (66.250.252.45)
      Origin IGP, metric 0, localpref 150, valid, internal, best
      Community: 10578:700 11537:950
      Last update: Sat Jan 14 04:45:09 2006
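The longest prefix lookup mentioned above can be sketched with Python's standard `ipaddress` module. The prefix list is taken from the table, with one assumption: 4.0.0.0 is shown without a mask, and the sketch treats it as the classful 4.0.0.0/8. The helper name is hypothetical.

```python
import ipaddress

def longest_prefix_match(prefixes, address):
    """Return the most specific prefix containing `address`, or None."""
    addr = ipaddress.ip_address(address)
    matches = [net for net in map(ipaddress.ip_network, prefixes) if addr in net]
    if not matches:
        return None
    return str(max(matches, key=lambda net: net.prefixlen))

# Assumption: the maskless entry 4.0.0.0 is read as 4.0.0.0/8.
prefixes = ["4.0.0.0/8", "4.21.254.0/23", "4.23.84.0/22", "130.207.0.0/16"]

specific = longest_prefix_match(prefixes, "4.21.255.7")     # inside the /23
covering = longest_prefix_match(prefixes, "4.1.1.1")        # only the /8 matches
gatech = longest_prefix_match(prefixes, "130.207.7.237")
missing = longest_prefix_match(prefixes, "8.8.8.8")
```

A linear scan is enough for a sketch; routers use tries or hardware lookups to do the same thing at line rate.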

Routing Attributes and Route Selection BGP routes have the following attributes, on which the route selection process is based:
1. Local preference: numerical value assigned by routing policy. Higher values are more preferred.
2. AS path length: number of AS-level hops in the path
3. Multiple exit discriminator (“MED”): allows one AS to specify that one exit point is more preferred than another. Lower values are more preferred.
4. eBGP over iBGP
5. Shortest IGP path cost to next hop: implements “hot potato” routing
6. Router ID tiebreak: arbitrary tiebreak, since only a single “best” route can be selected
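The selection order can be expressed as a single sort key. This sketch is deliberately simplified and several parts are assumptions: the `Route` fields are hypothetical names, MED is compared across all routes (real BGP normally compares MED only among routes from the same neighboring AS), and the lowest router ID wins the final tiebreak.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Route:
    local_pref: int
    as_path: tuple
    med: int
    is_ebgp: bool
    igp_cost: int
    router_id: int

def preference_key(route):
    """Smaller tuples are preferred; fields follow the selection order."""
    return (
        -route.local_pref,      # higher local preference first
        len(route.as_path),     # then shorter AS path
        route.med,              # then lower MED (simplified, see lead-in)
        not route.is_ebgp,      # then eBGP over iBGP
        route.igp_cost,         # then hot potato: closest exit
        route.router_id,        # then arbitrary tiebreak (assumed: lowest wins)
    )

def best_route(routes):
    return min(routes, key=preference_key)

short_path = Route(100, (3356,), 0, True, 10, 1)
preferred = Route(150, (10578, 11537, 10490, 2637), 0, False, 50, 2)
near_exit = Route(100, (1239,), 0, True, 5, 3)

winner = best_route([short_path, preferred, near_exit])
runner_up = best_route([short_path, near_exit])
```

Local preference overrides everything else here, which is why the route with the four-hop AS path wins the first comparison.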

Other BGP Attributes [Figure labels: Next-hop: 192.5.89.89; Next-hop: 4.79.2.1; iBGP session between 4.79.2.2 and 4.79.2.1] Next-hop: IP address to send packets en route to destination. (Question: How to ensure that the next-hop IP address is reachable?) Community value: Semantically meaningless. Used for passing around “signals” and labelling routes. More in a bit.

Local Preference [Figure: the primary path to the destination has the higher local pref; the backup path has the lower local pref] Control over outbound traffic Not transitive across ASes Coarse hammer to implement route preference Useful for preferring routes from one AS over another (e.g., primary-backup semantics)

Communities and Local Preference [Figure: primary and backup links toward the destination, with the backup link tagged with a “Backup” community] Customer expresses to provider that a link is a backup Affords some control over inbound traffic More on multihoming, traffic engineering in Lecture 7

AS Path Length Among routes with highest local preference, select route with shortest AS path length Shortest AS path != shortest path, for any interpretation of “shortest path”

AS Path Length Hack: Prepending [Figure: AS 1 advertises AS Path: “1” to AS 2 and AS Path: “1 1” to AS 3; traffic comes from D] Attempt to control inbound traffic Make AS path length look artificially longer How well does this work in practice vs. e.g., hacks on longest-prefix match?
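A short sketch of prepending, using the AS numbers from the slide. The helper name `advertise_path` is hypothetical, and the remote AS is assumed to choose purely on AS path length (equal local preference everywhere).

```python
def advertise_path(own_as, received_path=(), prepend=0):
    """AS path sent to a neighbor: own AS number (repeated when prepending),
    followed by the path on which the route was learned."""
    return (own_as,) * (1 + prepend) + tuple(received_path)

# AS 1 originates the route and prepends once toward AS 3 only.
to_as2 = advertise_path(1)              # AS Path: "1"
to_as3 = advertise_path(1, prepend=1)   # AS Path: "1 1"

# AS 2 and AS 3 each pass the route on to a remote AS.
heard_by_remote = [advertise_path(2, to_as2), advertise_path(3, to_as3)]
chosen = min(heard_by_remote, key=len)  # shortest AS path wins
```

The hack only works when the remote AS reaches the AS-path step of route selection; a higher local preference on the prepended route would still override it.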

Multiple Exit Discriminator (MED) [Figure: traffic toward the destination can enter at San Francisco or New York; the two advertisements carry MED: 20 and MED: 10; Los Angeles is also shown] Mechanism for AS to control how traffic enters, given multiple possible entry points.

Hot-Potato Routing Prefer route with shorter IGP path cost to next-hop Idea: traffic leaves AS as quickly as possible Common practice: Set IGP weights in accordance with propagation delay (e.g., miles, etc.) [Figure: traffic toward the destination with exits at New York and Atlanta, IGP costs 10 and 5, and Washington, DC]

Problems with Hot-Potato Routing Small changes in IGP weights can cause large traffic shifts Question: Cost of sub-optimal exit vs. cost of large traffic shifts [Figure: traffic toward the destination with exits at San Fran and New York, IGP costs 11, 10, and 5, and LA]

MPLS Overview Main idea: Virtual circuit Packets forwarded based only on circuit identifier [Figure: Source 1 and Source 2 sending to the same Destination over different paths] Router can forward traffic to the same destination on different interfaces/paths.

Circuit Abstraction: Label Swapping [Figure: label-swapping table with columns Tag, Out, New; a packet labeled A leaves on interface 2 with new label D] Label-switched paths (LSPs): Paths are “named” by the label at the path’s entry point At each hop, label determines: Outgoing interface New label to attach Label distribution protocol: responsible for disseminating signalling information
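Label swapping amounts to one table lookup per hop. In this sketch the router names, the second hop's entry, and the table layout are assumptions; only the first entry (tag A, out interface 2, new label D) mirrors the slide.

```python
def walk_lsp(tables, first_router, label):
    """Follow a label-switched path hop by hop.

    tables[router][in_label] = (out_interface, new_label, next_router).
    Returns the list of (router, in_label, out_interface, new_label) hops.
    """
    hops = []
    router = first_router
    while router in tables and label in tables[router]:
        out_interface, new_label, next_router = tables[router][label]
        hops.append((router, label, out_interface, new_label))
        router, label = next_router, new_label
    return hops

# Hypothetical two-hop path; R3 is the exit and holds no swap entry.
tables = {
    "R1": {"A": (2, "D", "R2")},
    "R2": {"D": (1, "F", "R3")},
}
hops = walk_lsp(tables, "R1", "A")
```

No router on the path looks at the destination IP address; the label at the entry point fully determines the route, which is why the path can be "named" by that label.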

Layer 3 Virtual Private Networks Private communications over a public network A set of sites that are allowed to communicate with each other Defined by a set of administrative policies that determine both connectivity and QoS among sites and are established by VPN customers One way to implement: BGP/MPLS VPN mechanisms (RFC 2547)

Building Private Networks Separate physical network Good security properties Expensive! Secure VPNs Encryption of entire network stack between endpoints Layer 2 Tunneling Protocol (L2TP) “PPP over IP” No encryption Layer 3 VPNs Privacy and interconnectivity (not confidentiality, integrity, etc.)

Layer 2 vs. Layer 3 VPNs Layer 2 VPNs can carry traffic for many different protocols, whereas Layer 3 is “IP only” More complicated to provision a Layer 2 VPN Layer 3 VPNs: potentially more flexibility, fewer configuration headaches

Layer 3 BGP/MPLS VPNs [Figure: three sites each for VPN A and VPN B, with customer-edge (CE) routers and prefixes 10.1/16, 10.2/16, 10.3/16, 10.4/16, attached to provider-edge routers PE1, PE2, PE3 around core routers P1, P2, P3] BGP to exchange routes MPLS to forward traffic Isolation: Multiple logical networks over a single, shared physical infrastructure Tunneling: Keeping routes out of the core

High-Level Overview of Operation IP packets arrive at PE Destination IP address is looked up in forwarding table Datagram sent to customer’s network using tunneling (i.e., an MPLS label-switched path)

BGP/MPLS VPN key components Forwarding in the core: MPLS Distributing routes between PEs: BGP Isolation: Keeping different VPNs from routing traffic over one another Constrained distribution of routing information Multiple “virtual” forwarding tables Unique addresses: VPN-IPv4 address extension

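Layer 3 VPNs “Vanilla” Layer 3 VPNs: All customer routes in the core [Figure: Site 1 and Site 2 connected across a core, with IBGP and EBGP sessions] BGP/MPLS VPNs: BGP between PEs; MPLS in the core [Figure: Site 1 and Site 2 attached to PE routers at the edge of an MPLS core of P routers, with LDP between routers]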

Problems Introduced by Layer 3 VPNs Overlapping address space in forwarding table. Solution: Virtual routing and forwarding table (“VRF”). Overlapping address space in BGP routes. Solution: “Route distinguisher”: 8-byte VPN-specific identifier prepended to each IP address. Typically, one route distinguisher per VPN. New VPN-IP address family. Routes carried with multi-protocol BGP. Filtering routes from routes not at that site. Route target: basically a special BGP community value
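A sketch of how the route distinguisher disambiguates overlapping prefixes. It assumes the type-0 encoding (2-byte type, 2-byte AS number, 4-byte assigned number), and the RD values 100:110 and 100:120 are illustrative; the helper name is hypothetical.

```python
import ipaddress
import struct

def vpn_ipv4(rd_asn, rd_number, prefix):
    """Return (12-byte VPN-IPv4 address, prefix length) for an IPv4 prefix.

    Assumes a type-0 route distinguisher: type field 0, then a 2-byte AS
    number and a 4-byte assigned number, 8 bytes in total.
    """
    network = ipaddress.ip_network(prefix)
    rd = struct.pack("!HHI", 0, rd_asn, rd_number)
    return rd + network.network_address.packed, network.prefixlen

# Two customers using the same private prefix.
customer_a = vpn_ipv4(100, 110, "10.0.1.0/24")
customer_b = vpn_ipv4(100, 120, "10.0.1.0/24")
```

The two results differ only in the distinguisher bytes, so BGP can carry both routes at once even though the IPv4 prefixes are identical.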

Virtual Routing and Forwarding Separate tables per customer at each router [Figure: Customer 1 and Customer 2 both use 10.0.1.0/24; Customer 1’s routes carry RD: Green and Customer 2’s routes carry RD: Blue]

Routing: Constraining Distribution Performed by service provider using route filtering based on the BGP Extended Community attribute: BGP Community is attached by ingress PE; route filtering based on BGP Community is performed by egress PE. [Figure labels: Site 1 with 10.0.1.0/24 attached to A via static route, RIP, etc.; BGP advertisement with RD:10.0.1.0/24, Route target: Green, Next-hop: A; Site 2; Site 3]

BGP/MPLS VPN Routing in Cisco IOS (Customer A and Customer B)
ip vrf Customer_A
 rd 100:110
 route-target export 100:1000
 route-target import 100:1000
!
ip vrf Customer_B
 rd 100:120
 route-target export 100:2000
 route-target import 100:2000
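The import side of the configuration above can be modeled in a few lines. This is a sketch of the filtering idea only, not of IOS behavior; the data layout and the advertised prefixes are assumptions, while the route-target values come from the configuration.

```python
def import_routes(vrfs, advertisements):
    """Place each advertised route into every VRF that imports one of its route targets.

    vrfs[name] = {"export": set_of_targets, "import": set_of_targets}
    advertisements = list of (origin_vrf, prefix, route_targets)
    """
    tables = {name: [] for name in vrfs}
    for origin, prefix, route_targets in advertisements:
        for name, vrf in vrfs.items():
            if vrf["import"] & set(route_targets):
                tables[name].append((origin, prefix))
    return tables

vrfs = {
    "Customer_A": {"export": {"100:1000"}, "import": {"100:1000"}},
    "Customer_B": {"export": {"100:2000"}, "import": {"100:2000"}},
}
# Hypothetical routes; both customers reuse 10.0.1.0/24.
advertisements = [
    ("Customer_A", "10.0.1.0/24", vrfs["Customer_A"]["export"]),
    ("Customer_B", "10.0.1.0/24", vrfs["Customer_B"]["export"]),
    ("Customer_A", "10.0.2.0/24", vrfs["Customer_A"]["export"]),
]
tables = import_routes(vrfs, advertisements)
```

Each VRF ends up holding only its own customer's routes, so the overlapping 10.0.1.0/24 prefixes never meet in one table.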

Forwarding PE and P routers have BGP next-hop reachability through the backbone IGP. Labels are distributed through LDP (hop-by-hop) corresponding to BGP next-hops. Two-label stack is used for packet forwarding. Top label (Label 1) indicates next-hop (interior label); corresponds to LSP of BGP next-hop (PE). Second-level label (Label 2) indicates outgoing interface or VRF (exterior label); corresponds to VRF/interface at exit. Packet layout: Layer 2 Header | Label 1 | Label 2 | IP Datagram

Forwarding in BGP/MPLS VPNs Step 1: Packet arrives at incoming interface; site VRF determines BGP next-hop and Label #2. Packet: Label 2 | IP Datagram. Step 2: BGP next-hop lookup, add corresponding LSP (also at site VRF). Packet: Label 1 | Label 2 | IP Datagram.
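The two steps can be sketched as two table lookups at the ingress PE. All names, label values, and table layouts below are hypothetical, and the VRF lookup uses an exact prefix match for brevity where a real PE would do a longest-prefix match.

```python
def ingress_encapsulate(vrf, lsp_labels, dest_prefix, datagram):
    """Build the two-label stack at the ingress PE.

    vrf[dest_prefix] = (bgp_next_hop, vpn_label)   # Step 1: site VRF lookup
    lsp_labels[bgp_next_hop] = tunnel_label        # Step 2: LSP toward that PE
    """
    bgp_next_hop, vpn_label = vrf[dest_prefix]     # Label 2, learned via BGP
    tunnel_label = lsp_labels[bgp_next_hop]        # Label 1, learned via LDP
    return [tunnel_label, vpn_label, datagram]

def core_swap(packet, new_tunnel_label):
    """A P router swaps only the top label and never inspects Label 2."""
    return [new_tunnel_label] + packet[1:]

# Hypothetical state at the ingress PE for one customer site.
site_vrf = {"10.0.1.0/24": ("PE2", 37)}
lsp_labels = {"PE2": 1001}

packet = ingress_encapsulate(site_vrf, lsp_labels, "10.0.1.0/24", "IP datagram")
in_core = core_swap(packet, 2002)
```

Because P routers act only on Label 1, they need state for the PEs but none for customer routes, which is the "keeping routes out of the core" property.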

Scalability Problems Lots of customers leads to explosion of routing tables How to ensure that no single router needs to carry state for all customers?

Other Uses for MPLS/Tunneling Reducing state in network core Internal routers no longer need paths for every destination Traffic engineering Can shift traffic based on virtual circuits, not just destination prefixes

Open Research Questions Static configuration analysis for enforcing isolation and other security policies Easier, in some sense, since security (reachability) policies are likely easier to encode