Exterior Gateway Protocols: EGP, BGP-4, CIDR. Shivkumar Kalyanaraman, Rensselaer Polytechnic Institute.

Presentation transcript:

Slide 1: Exterior Gateway Protocols: EGP, BGP-4, CIDR
Shivkumar Kalyanaraman, Rensselaer Polytechnic Institute
Based in part upon slides of Tim Griffin (AT&T), Ion Stoica (UCB), J. Kurose (U Mass), Noel Chiappa (MIT)

Slide 2: Overview
- Cores, peers, and the limit of default routes
- Autonomous systems & EGP
- BGP4
- CIDR: reducing router table sizes
- Refs: Chap 10, 14, 15. Books: "Routing in the Internet" by Huitema; "Interconnections" by Perlman; "BGP4" by Stewart; Sam Halabi and Danny McPherson, "Internet Routing Architectures"
- Reading: Geoff Huston, "Commentary on Inter-domain Routing in the Internet"
- Reference: BGP-4 standards document (in TXT)
- Reading: Norton, "Internet Service Providers and Peering"
- Reading: Labovitz et al., "Delayed Internet Routing Convergence"
- Reference: Paxson, "End-to-End Routing Behavior in the Internet"
- Reading: Interdomain Routing: Additional Notes (in PDF | in MS Word)
- Reference site: Griffin, "Interdomain Routing Links"

Slide 3: History: Default Routes: Limits
- Default routes => partial information
- Routers/hosts with default routes rely on other routers to complete the picture.
- In general, routing "signposts" should be:
  - Consistent, i.e., if a packet is sent off in one direction, then another direction should not be more optimal.
  - Complete, i.e., should be able to reach all destinations.

Slide 4: Core
- A small set of routers that have consistent and complete information about all destinations.
- Outlying routers can have partial information, provided they point default routes to the core.
- Partial info allows site administrators to make local routing changes independently.
[Figure: CORE connected to sites S1, S2, ..., Sm]

Slide 5: Peer Backbones
- Initially NSFNET had only one connection to ARPANET (a router in Pittsburgh) => only one route between the two.
- Addition of multiple interconnections => multiple possible routes => need for dynamic routing
- Single core replaced by a network of peer backbones => more scalable
- Today there are over 30 backbones!
- Routing protocol at cores/peers: GGP -> EGP -> BGP-4

Slide 6: Exterior Gateway Protocol (EGP)
- A mechanism that allows non-core routers to learn external routes from core routers so that they can choose optimal backbone routes
- A mechanism for non-core routers to inform core routers about hidden networks (internal routes)
- Each Autonomous System (AS) has the responsibility of advertising reachability info to other ASes.
- One or more routers may be designated per AS.
- It is important that reachability info propagates to core routers.

Slide 7: Purpose of EGP
[Figure: routers R1, R2, R3 in AS1 and AS2 (legend: border router, internal router) running EGP. Announcement: "you can reach net A via me". Traffic to A follows the table at R1: dest A, next hop R2.]
Share connectivity information across ASes

Slide 8: EGP Operation
- Neighbor Acquisition: reliable 2-way handshake
- Neighbor Reachability:
  - Hellos: j out of m hellos OK => neighbor UP
  - k out of n hellos NOT OK => neighbor DOWN
- Updates/Queries:
  - EGP is an incremental protocol. New info => send updates
  - Each router can query neighbors as well
- Reachability advertised; metrics ignored
- Requires a tree topology of ASes to avoid loops (e.g., see next slide)
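The "j out of m" / "k out of n" reachability rule can be sketched as a sliding window over recent hello results. This is only one plausible reading of the slide, not EGP's specified state machine: the class name `NeighborMonitor`, the window handling, and the sample thresholds are all assumptions made for illustration.

```python
from collections import deque


class NeighborMonitor:
    """Hypothetical j-of-m / k-of-n neighbor reachability tracker."""

    def __init__(self, j, m, k, n):
        self.j, self.m, self.k, self.n = j, m, k, n
        # Keep only as much history as the larger window needs.
        self.history = deque(maxlen=max(m, n))
        self.up = False

    def record_hello(self, ok):
        """Record one hello result and return the neighbor state (True = UP)."""
        self.history.append(ok)
        recent = list(self.history)
        last_m = recent[-self.m:]
        last_n = recent[-self.n:]
        if not self.up and sum(last_m) >= self.j:
            self.up = True       # j of the last m hellos were OK
        elif self.up and last_n.count(False) >= self.k:
            self.up = False      # k of the last n hellos were NOT OK
        return self.up


monitor = NeighborMonitor(j=3, m=4, k=2, n=4)
states = [monitor.record_hello(ok) for ok in (True, True, True, False, False, True)]
print(states)
```

Using two separate thresholds gives hysteresis: a single lost hello does not immediately take the neighbor down, and a single good hello does not immediately bring it back.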

Slide 9: Why EGP Requires a Tree Structure (figure)

Slide 10: EGP Weaknesses
- EGP does not interpret the distance metrics in routing update messages => cannot compute the shorter of two routes
- As a result, it restricts the topology to a tree structure, with the core as the root
- Rapid growth => many networks may be temporarily unreachable
- Only one path to a destination => no load sharing
- Need a new protocol => BGP-4

Slide 11: Today’s Big Picture
[Figure: large ISPs, small ISPs, dial-up ISPs, access networks, and stub networks]
Large number of diverse networks

Slide 12: Internet AS Map: caida.org

Slide 13: Autonomous System (AS)
- The Internet is not a single network
- Collection of networks controlled by different administrations
- An autonomous system is a network under a single administrative control
- An AS owns an IP prefix
- Every AS has a unique AS number
- ASes need to inter-network themselves to form a single virtual global network
- Need a common protocol for communication

Slide 14: Intra-AS and Inter-AS Routing
[Figure: three ASes A, B, and C with routers labeled a to d; gateways A.a, A.c, B.a, and C.b. Gateway A.c runs both inter-AS and intra-AS routing on top of its network, link, and physical layers.]
Gateways:
- perform inter-AS routing amongst themselves
- perform intra-AS routing with other routers in their AS

Slide 15: Who Speaks Inter-AS Routing?
[Figure: routers R1, R2, R3 in AS1 and AS2 (legend: border router, internal router) with a BGP session between the ASes]
- Two types of routers: border router (edge), internal router (core)
- Two border routers of different ASes will have a BGP session

Slide 16: Intra-AS vs Inter-AS
- An AS is a routing domain
- Within an AS:
  - Can run a link-state routing protocol
  - Trust other routers
  - Scale of network is relatively small
- Between ASes:
  - Lack of information about other ASes' networks (link-state not possible)
  - Crossing trust boundaries
  - Link-state protocol will not scale
  - Routing protocol based on route propagation

Slide 17: Autonomous Systems (ASes)
- An autonomous system is an autonomous routing domain that has been assigned an Autonomous System Number (ASN).
- All parts within an AS remain connected.
RFC 1930: Guidelines for creation, selection, and registration of an Autonomous System
"… the administration of an AS appears to other ASes to have a single coherent interior routing plan and presents a consistent picture of what networks are reachable through it."

Slide 18: IP Address Allocation and Assignment: Internet Registries
[Figure: IANA at the top; ARIN, APNIC, and RIPE below it]
- Registries allocate to national and local registries and ISPs
- Addresses are assigned to customers by ISPs
- RFC: Internet Registry IP Allocation Guidelines
- RFC: Address Allocation for Private Internets
- RFC: An Architecture for IP Address Allocation with CIDR

Slide 19: AS Numbers (ASNs)
- ASNs are 16-bit values; … through … are "private"
- Examples: Genuity: 1; MIT: 3; Harvard: 11; UC San Diego: 7377; AT&T: 7018, 6341, 5074, …; UUNET: 701, 702, 284, 12199, …; Sprint: 1239, 1240, 6211, 6242, …
- ASNs represent units of routing policy
- Currently over 11,000 in use.

Slide 20: Nontransit vs. Transit ASes
[Figure: NET A connected to both ISP 1 and ISP 2]
- A nontransit AS might be a corporate or campus network. It could be a "content provider".
- Traffic NEVER flows from ISP 1 through NET A to ISP 2
- Internet service providers (ISPs) have transit networks

Slide 21: Selective Transit
[Figure: NET A connected to NET B, NET C, and NET D]
- NET A provides transit between NET B and NET C, and between NET D and NET C
- NET A DOES NOT provide transit between NET D and NET B
- Most transit ASes allow only selective transit: a key impact of commercialization

Slide 22: Customers and Providers
[Figure: provider-customer links carrying IP traffic]
Customer pays provider for access to the Internet

Slide 23: Customer-Provider Hierarchy
[Figure: hierarchy of provider-customer links carrying IP traffic]

Slide 24: The Peering Relationship
[Figure legend: peer-peer and provider-customer links; traffic allowed vs. traffic NOT allowed]
- Peers provide transit between their respective customers
- Peers do not provide transit between peers
- Peers (often) do not exchange $$$
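The customer, provider, and peer relationships on the preceding slides are commonly summarized as a simple export rule. The sketch below is an illustration under assumptions, not a rule stated verbatim in the slides: the function name `should_export` and the relationship labels are hypothetical, and the treatment of provider-learned routes is the conventional one rather than something the deck spells out.

```python
def should_export(learned_from, export_to):
    """Decide whether a route learned from one kind of neighbor is exported to another.

    Both arguments are one of "self", "customer", "peer", "provider".
    Assumed rule: own and customer routes go to everyone; routes learned
    from peers or providers go only to customers.
    """
    if learned_from in ("self", "customer"):
        return True
    return export_to == "customer"


# Peers provide transit between their respective customers ...
print(should_export("customer", "peer"))   # True
# ... but not between peers.
print(should_export("peer", "peer"))       # False
```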

Slide 25: Peering Wars
Peer:
- Reduces upstream transit costs
- Can increase end-to-end performance
- May be the only way to connect your customers to some part of the Internet ("Tier 1")
Don't Peer:
- You would rather have customers
- Peers are usually your competition
- Peering relationships may require periodic renegotiation
Peering struggles are by far the most contentious issues in the ISP world! Peering agreements are often confidential.

Slide 26: Requirements for Inter-AS Routing
- Should scale for the size of the global Internet.
  - Focus on reachability, not optimality
  - Use address aggregation techniques to minimize core routing table sizes and associated control traffic
  - At the same time, it should allow flexibility in topological structure (e.g., don't restrict to trees)
- Allow policy-based routing between autonomous systems
  - Policy refers to arbitrary preference among a menu of available routes (based upon routes' attributes)
  - Fully distributed routing (as opposed to a signaled approach) is the only possibility.
  - Extensible to meet the demands for newer policies.

Slide 27: Recall: Distributed Routing Techniques
Link State:
- Topology information is flooded within the routing domain
- Best end-to-end paths are computed locally at each router.
- Best end-to-end paths determine next-hops.
- Based on minimizing some notion of distance
- Works only if policy is shared and uniform
- Examples: OSPF, IS-IS
Vectoring:
- Each router knows little about network topology
- Only best next-hops are chosen by each router for each destination network.
- Best end-to-end paths result from composition of all next-hop choices
- Does not require any notion of distance
- Does not require uniform policies at all routers
- Examples: RIP, BGP

Slide 28: BGP-4
- BGP = Border Gateway Protocol
- Is a policy-based routing protocol
- Is the de facto EGP of today's global Internet
- Relatively simple protocol, but configuration is complex and the entire world can see, and be impacted by, your mistakes
- BGP-1 [RFC 1105]: replacement for EGP (1984, RFC 904)
- 1990: BGP-2 [RFC 1163]
- 1991: BGP-3 [RFC 1267]
- 1995: BGP-4 [RFC 1771]: support for Classless Interdomain Routing (CIDR)

Slide 29: BGP Operations (Simplified)
[Figure: BGP session between AS1 and AS2]
1. Establish session on TCP port 179
2. Exchange all active routes
3. Exchange incremental updates: while the connection is ALIVE, exchange route UPDATE messages

Slide 30: Four Types of BGP Messages
- Open: establishes a peering session.
- Keep Alive: handshake at regular intervals.
- Notification: shuts down a peering session.
- Update: announces new routes or withdraws previously announced routes.
announcement = prefix + attribute values
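To make the four message types concrete, here is a minimal sketch of the fixed BGP message header as defined in the BGP-4 specification: a 16-octet marker, a 2-octet length, and a 1-octet type. The helper names `build_message` and `parse_header` are hypothetical, and the sketch does no validation beyond unpacking the fields.

```python
import struct
from enum import IntEnum


class BgpMessageType(IntEnum):
    OPEN = 1
    UPDATE = 2
    NOTIFICATION = 3
    KEEPALIVE = 4


HEADER_LEN = 19  # 16-octet marker + 2-octet length + 1-octet type


def build_message(msg_type, body=b""):
    """Prefix a message body with the fixed BGP header."""
    marker = b"\xff" * 16
    return marker + struct.pack("!HB", HEADER_LEN + len(body), msg_type) + body


def parse_header(data):
    """Return (message type, total length) from the first 19 octets."""
    _marker, length, msg_type = struct.unpack("!16sHB", data[:HEADER_LEN])
    return BgpMessageType(msg_type), length


keepalive = build_message(BgpMessageType.KEEPALIVE)
print(len(keepalive), parse_header(keepalive))
```

A KEEPALIVE carries no body, so it is just the 19-octet header.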

Slide 31: Border Gateway Protocol (BGP)
- Allows multiple cores and arbitrary topologies of AS interconnection.
- Uses a path-vector concept, which enables loop prevention in complex topologies
- At the AS level, the shortest path may not be preferred for policy, security, or cost reasons.
- Different routers have different preferences (policy) => as a packet goes through the network it will encounter different policies
  - Bellman-Ford/Dijkstra don't work!
- BGP allows attributes for ASes and paths, which could include policies (policy-based routing).

Slide 32: BGP (Cont'd)
- When a BGP speaker A advertises to its peer B that it has a path to IP prefix C, B can be certain that A is actively using that AS-path to reach that destination
- BGP uses TCP between 2 peers (reliability)
  - Exchange entire BGP table first (50K+ routes!)
  - Later exchange only incremental updates
- Application (BGP)-level keepalive messages
  - Hold timer (at least 3 sec), locally configured
- Interior and exterior peers: need to exchange reachability information among interior peers before updating the intra-AS forwarding table.

Slide 33: Two Types of BGP Neighbor Relationships
- External neighbor (eBGP): in a different Autonomous System
- Internal neighbor (iBGP): in the same Autonomous System
- iBGP is routed (using IGP!)
[Figure: AS1 and AS2 joined by an eBGP session, with iBGP sessions inside an AS]

Slide 34: I-BGP and E-BGP
[Figure: routers R1 to R5 across AS1, AS2, and AS3 (legend: border router, internal router); networks A and B; "announce B" propagates over E-BGP, I-BGP, and the IGP]
IGP: Interior Gateway Protocol. Examples: IS-IS, OSPF

Slide 35: I-BGP
- Why is the IGP (OSPF, IS-IS) not used?
  - In large ASes the full route table is very large (100K routes!)
  - Routes change frequently
  - Tremendous amount of control traffic
  - Not to mention the Dijkstra computation being invoked for any change…
  - BGP policy information may be lost
- I-BGP: within an AS
  - Same protocol/state machines as E-BGP
  - But different rules about advertising prefixes
  - A prefix learned from an I-BGP neighbor cannot be advertised to another I-BGP neighbor, to avoid looping => need a full I-BGP mesh!
  - AS-PATH cannot be used internally. Why?

Slide 36: IBGP vs EBGP
- I-BGP nodes: typically ABRs, or other nodes where default routes terminate
- I-BGP peering sessions between every pair of routers within an AS: full mesh.
[Figure: routers A, B, C, D in AS1; legend: physical link, IBGP session]
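The cost of a full mesh is easy to quantify: with n I-BGP routers, every pair needs a session, so there are n(n-1)/2 sessions. A small sketch, using the four routers A to D from the figure (the function name `ibgp_full_mesh` is made up for illustration):

```python
from itertools import combinations


def ibgp_full_mesh(routers):
    """Return every unordered pair of routers, i.e. one I-BGP session per pair."""
    return list(combinations(routers, 2))


sessions = ibgp_full_mesh(["A", "B", "C", "D"])
print(len(sessions), sessions)

# The session count grows quadratically with the number of routers.
for n in (4, 10, 100):
    print(n, "routers ->", n * (n - 1) // 2, "sessions")
```

This quadratic growth is the motivation for scaling techniques such as route reflection and confederations.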

Slide 37: iBGP Peers: Fully Meshed
- iBGP is needed to avoid routing loops within an AS
- Full mesh =>
  - Independent of physical connectivity.
  - A single link may see the same update multiple times!
- iBGP neighbors do not announce routes received via iBGP to other iBGP neighbors.
- Is iBGP an IGP? NO!
  - It is a set of neighbor relationships to transfer BGP info
[Figure: an eBGP update arrives at one router and is passed on as iBGP updates]

Slide 38: IBGP Scaling: Route Reflection
- Add hierarchy to I-BGP
- Route reflector: a router whose BGP implementation supports the re-advertisement of routes between I-BGP neighbors
- Route reflector client: a router that depends on a route reflector to re-advertise its routes to the entire AS and learns routes from the route reflector

Slide 39: Route Reflection
[Figure: route reflectors RR1, RR2, RR3 and clients RR-C1 to RR-C4 in AS1; external router ER in AS2; EBGP and IBGP sessions; two /16 prefixes]

Slide 40: AS Confederations
- Divide and conquer: divides a large AS into sub-ASes
[Figure: an AS containing routers R1 and R2, partitioned into sub-ASes]

Slide 41: CIDR
- Shortage of class Bs => give out a set of class Cs instead of one class B address
- Problem: every class C network needs a routing entry!
- Solution: Classless Inter-Domain Routing (CIDR).
  - Also called "supernetting"
- Key: allocate addresses such that they can be summarized, i.e., contiguously.
  - Share the same higher-order bits (i.e., prefix)
- Routing tables and protocols must be capable of carrying a subnet mask. Notation: …/23
- When an IP address matches multiple entries, choose the one with the longest mask ("longest-prefix match")
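Supernetting can be tried out with Python's standard `ipaddress` module. In this sketch the 192.168.x.0/24 blocks are illustrative values chosen here, not addresses from the slides: four contiguous class-C-sized networks that share their high-order 22 bits collapse into one routing entry, while non-contiguous ones do not.

```python
import ipaddress

# Four contiguous /24 networks (illustrative addresses).
contiguous = [ipaddress.ip_network(f"192.168.{i}.0/24") for i in range(4)]
summary = list(ipaddress.collapse_addresses(contiguous))
print(summary)  # one /22 entry instead of four /24 entries

# Networks that are not contiguous cannot be summarized into one prefix.
scattered = [
    ipaddress.ip_network("192.168.0.0/24"),
    ipaddress.ip_network("192.168.2.0/24"),
]
print(list(ipaddress.collapse_addresses(scattered)))
```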

Slide 42: RFC 1519: Classless Inter-Domain Routing (CIDR)
[Figure: an IP address and its IP mask shown bit by bit; the mask splits the address into a network prefix and the bits used for hosts]
- Pre-CIDR: network ID ended on an 8-, 16-, or 24-bit boundary
- CIDR: network ID can end at any bit boundary
- Usually written as …/15, a.k.a. "supernetting"

Slide 43: Understanding Prefixes and Masks (Recap)
[Figure: two addresses compared bit by bit against a /15 prefix; one is covered by the prefix, the other is not]
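Whether an address is covered by a prefix is a comparison of the leading bits selected by the mask. The sketch below uses the standard `ipaddress` module; the 10.x addresses are made-up examples, since the addresses from the original figure did not survive in the transcript.

```python
import ipaddress


def is_covered(address, prefix):
    """True if the address falls inside the given CIDR prefix."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(prefix)


prefix = "10.4.0.0/15"  # illustrative /15: covers 10.4.0.0 through 10.5.255.255
print(ipaddress.ip_network(prefix).netmask)  # 255.254.0.0
print(is_covered("10.5.9.16", prefix))       # True: first 15 bits match
print(is_covered("10.7.9.16", prefix))       # False: differs within the first 15 bits
```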

Slide 44: Inter-domain Routing Without and With CIDR
[Figure, without CIDR: the service provider announces many individual prefixes into the global Internet routing mesh]
[Figure, with CIDR: the service provider announces a single /16 into the global Internet routing mesh]

Slide 45: Longest Prefix Match (Classless) Forwarding
[Figure: a packet's destination address is looked up in an IP forwarding table with columns Prefix, Interface, and Next Hop (interfaces include ATM 5/0/9, Ethernet 0/1/3, and serial ports; one /23 entry is attached). Several prefixes match and are labeled OK, better, even better, and best! as the match gets longer.]
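A longest-prefix-match lookup can be sketched with a linear scan over the table. Real routers use tries or hardware lookup structures; this version only shows the selection rule. The table contents and the name `longest_prefix_match` are hypothetical, not taken from the figure.

```python
import ipaddress


def longest_prefix_match(table, destination):
    """Return the next hop of the most specific matching prefix, or None."""
    dest = ipaddress.ip_address(destination)
    best_net, best_hop = None, None
    for prefix, next_hop in table.items():
        net = ipaddress.ip_network(prefix)
        if dest in net and (best_net is None or net.prefixlen > best_net.prefixlen):
            best_net, best_hop = net, next_hop
    return best_hop


# Hypothetical forwarding table: prefix -> next hop / interface label.
forwarding_table = {
    "0.0.0.0/0": "default-gw",
    "10.0.0.0/8": "hop-A",
    "10.1.0.0/16": "hop-B",
    "10.1.2.0/23": "attached",
}

print(longest_prefix_match(forwarding_table, "10.1.3.7"))   # /23 is the best match
print(longest_prefix_match(forwarding_table, "10.2.0.1"))   # only /8 and /0 match
print(longest_prefix_match(forwarding_table, "192.0.2.1"))  # falls back to the default
```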

Slide 46: What is Routing Policy
- Policy refers to arbitrary preference among a menu of available routes (based upon routes' attributes)
- Public description of the relationship between external BGP peers
- Can also describe internal BGP peer relationships
- E.g.: Who are my BGP peers?
- What routes are:
  - Originated by a peer
  - Imported from each peer
  - Exported to each peer
  - Preferred when multiple routes exist
- What to do if no route exists?

Slide 47: Routing Policy Example
- AS1 originates prefix "d"
- AS1 exports "d" to AS2, AS2 imports
- AS2 exports "d" to AS3, AS3 imports
- AS3 exports "d" to AS5, AS5 imports

Slide 48: Routing Policy Example (cont)
- AS5 also imports "d" from AS4
- Which route does it prefer?
  - Does it matter?
  - Consider the case where:
    - AS3 = Commercial Internet
    - AS4 = Internet2

Slide 49: Import and Export Policies
- Inbound filtering controls outbound traffic
  - filters route updates received from other peers
  - filtering based on IP prefixes, AS_PATH, community
- Outbound filtering controls inbound traffic
  - forwarding a route means others may choose to reach the prefix through you
  - not forwarding a route means others must use another router to reach the prefix
- Attribute manipulation
  - Import: LOCAL_PREF (manipulate trust)
  - Export: AS_PATH and MEDs

Slide 50: Attributes are Used to Select Best Routes
[Figure: four announcements for the same /24 prefix arrive at a BGP speaker, each saying "pick me!"]
Given multiple routes to the same prefix, a BGP speaker must pick at most one best route (Note: it could reject them all!)

Slide 51: BGP Policy Knob: Attributes
Value  Code                      Reference
1      ORIGIN                    [RFC1771]
2      AS_PATH                   [RFC1771]
3      NEXT_HOP                  [RFC1771]
4      MULTI_EXIT_DISC           [RFC1771]
5      LOCAL_PREF                [RFC1771]
6      ATOMIC_AGGREGATE          [RFC1771]
7      AGGREGATOR                [RFC1771]
8      COMMUNITY                 [RFC1997]
9      ORIGINATOR_ID             [RFC2796]
10     CLUSTER_LIST              [RFC2796]
11     DPA                       [Chen]
12     ADVERTISER                [RFC1863]
13     RCID_PATH / CLUSTER_ID    [RFC1863]
14     MP_REACH_NLRI             [RFC2283]
15     MP_UNREACH_NLRI           [RFC2283]
16     EXTENDED COMMUNITIES      [Rosen]
…      reserved for development
Source: IANA.
- We will cover a subset of these attributes
- Not all attributes need to be present in every announcement

Slide 52: BGP Route Processing
1. Receive BGP updates
2. Apply import policies (apply policy = filter routes & tweak attributes)
3. Best route selection, based on attribute values
4. Best routes go into the best route table; install forwarding entries for best routes in the IP forwarding table
5. Apply export policies (apply policy = filter routes & tweak attributes)
6. Transmit BGP updates

Slide 53: Import and Export Policies
- For inbound traffic
  - Filter outbound routes
  - Tweak attributes on outbound routes in the hope of influencing your neighbor's best route selection
- For outbound traffic
  - Filter inbound routes
  - Tweak attributes on inbound routes to influence best route selection
- In general, an AS has more control over outbound traffic
[Figure: an AS with inbound routes and outbound traffic on one side, outbound routes and inbound traffic on the other]

Slide 54: Policy Implementation Flow
[Figure: Incoming -> Adj-RIB-In -> Main BGP RIB -> Adj-RIB-Out -> Outgoing; the main BGP RIB, IGPs, and static & HW info feed the main RIB/FIB]

Slide 55: Conceptual Model of BGP Operation
- RIB: Routing Information Base
- Adj-RIB-In: prefixes learned from neighbors. There are as many Adj-RIB-Ins as there are peers
- Loc-RIB: prefixes selected for local use after analyzing the Adj-RIB-Ins. This RIB is advertised internally.
- Adj-RIB-Out: stores prefixes advertised to a particular neighbor. There are as many Adj-RIB-Outs as there are neighbors

Slide 56: UPDATE Message in BGP
- Primary message between two BGP speakers.
- Used to advertise/withdraw IP prefixes (NLRI)
- Path attributes field: unique to BGP
  - Apply to all prefixes specified in the NLRI field
  - Optional vs well-known; transitive vs non-transitive
Message layout, in order:
- Withdrawn Routes Length (2 octets)
- Withdrawn Routes (variable length)
- Total Path Attributes Length (2 octets)
- Path Attributes (variable length)
- Network Layer Reachability Info (NLRI: variable length)
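The field layout above is enough to split an UPDATE body into its three variable-length parts. The following is a sketch, not a full parser: it assumes IPv4, does no error checking, leaves path attributes undecoded, and relies on the BGP-4 prefix encoding (one length octet followed by just enough octets to hold the prefix bits). The function names are invented for illustration.

```python
import ipaddress
import struct


def split_update(body):
    """Split an UPDATE body (without the 19-octet header) into its three parts."""
    (withdrawn_len,) = struct.unpack_from("!H", body, 0)
    withdrawn = body[2:2 + withdrawn_len]
    (attrs_len,) = struct.unpack_from("!H", body, 2 + withdrawn_len)
    attrs_start = 4 + withdrawn_len
    attrs = body[attrs_start:attrs_start + attrs_len]
    nlri = body[attrs_start + attrs_len:]
    return withdrawn, attrs, nlri


def decode_prefixes(data):
    """Decode a sequence of (length, prefix) entries into CIDR strings."""
    prefixes, i = [], 0
    while i < len(data):
        bits = data[i]
        nbytes = (bits + 7) // 8  # octets needed to hold `bits` bits
        raw = data[i + 1:i + 1 + nbytes].ljust(4, b"\x00")
        prefixes.append(f"{ipaddress.IPv4Address(raw)}/{bits}")
        i += 1 + nbytes
    return prefixes


# A withdrawal-only UPDATE: withdraw 192.168.5.0/24, no attributes, no NLRI.
withdrawn_routes = bytes([24, 192, 168, 5])
body = struct.pack("!H", len(withdrawn_routes)) + withdrawn_routes + struct.pack("!H", 0)

withdrawn, attrs, nlri = split_update(body)
print(decode_prefixes(withdrawn), attrs, nlri)
```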

Slide 57: Path Attributes: ORIGIN
- Describes how a prefix came to BGP at the origin AS
- Prefixes are learned from a source and "injected" into BGP:
  - Directly connected interfaces, manually configured static routes, dynamic IGP or EGP
- Values:
  - IGP (EGP): prefix learnt from IGP (EGP)
  - INCOMPLETE: static routes

Slide 58: Path Attributes: AS-PATH
- List of ASes through which the prefix announcement has passed. Each AS on the path adds its ASN to the AS-PATH
- E.g.: a /16 prefix originates at AS1 and is advertised to AS3 via AS2.
- E.g.: AS-SEQUENCE: "…"
- Used for loop detection and path selection
[Figure: AS1 (100), AS2 (200), AS3 (15); the /16 prefix originates at AS1]
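The two uses of AS-PATH named above can be sketched in a few lines. The ASNs 100, 200, and 15 are the ones shown in the figure; the function names and the convention of prepending the newest ASN at the front of the sequence are assumptions of this sketch.

```python
def advertise(as_path, own_asn):
    """An AS prepends its own ASN before announcing the route to a neighbor AS."""
    return (own_asn,) + tuple(as_path)


def accept(as_path, own_asn):
    """Loop detection: reject any route whose AS-PATH already contains our ASN."""
    return own_asn not in as_path


# AS1 (ASN 100) originates the prefix; AS2 (ASN 200) passes it on to AS3 (ASN 15).
path_from_as1 = advertise((), 100)
path_from_as2 = advertise(path_from_as1, 200)
print(path_from_as2)

print(accept(path_from_as2, 15))   # AS3 accepts: 15 is not on the path
print(accept(path_from_as2, 100))  # AS1 would reject: the route has looped back
```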

Slide 59: Traffic Often Follows ASPATH
[Figure: chain of AS 4, AS 3, AS 2, AS 1; a /16 prefix is announced along the chain with its ASPATH, and an IP packet destined for that prefix travels along the same ASes]

Slide 60: … But It Might Not
[Figure: the same chain plus AS 5. AS 1 announces the /16 (ASPATH = 1); AS 5 announces a more specific /25 inside it.]
- AS 2 filters all subnets with masks longer than /…
- From AS 4, it may look like this packet will take path 3 2 1, but it actually takes path 3 2 5

Slide 61: Shorter AS-PATH Doesn't Mean Shorter # Hops
[Figure: AS 1, AS 2, AS 3, AS 4]
BGP says that path 4 1 is better than path … Duh!

Slide 62: Path Attributes: NEXT-HOP
- Next-hop: the node to which packets must be sent for the IP prefixes. It may not be the same as the peer.
- E.g.: UPDATE for …, NEXT-HOP = …
[Figure: BGP speakers and a node that is not a BGP speaker]

Slide 63: Recursive Lookup
- If routes (prefixes) are learnt through iBGP, NEXT-HOP is the iBGP router which originated the route.
- Note: the iBGP peer might be several IP-level hops away, as determined by the IGP
- Hence the BGP NEXT-HOP is not the same as the IP next-hop
- BGP therefore checks if the "NEXT-HOP" is reachable through its IGP.
  - If so, it installs the IGP next-hop for the prefix
- This process is known as "recursive lookup". The lookup is done in the control plane (not the data plane) before populating the forwarding table.
- Example in next slide
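Recursive lookup can be sketched as a join of two tables: the BGP table maps a prefix to a BGP NEXT-HOP address, and the IGP table says how to reach that address. All addresses and names below are invented for illustration, and a real router would also handle tie-breaking, route changes, and multi-level recursion.

```python
import ipaddress


def lookup(table, address):
    """Longest-prefix-match lookup; returns the table value or None."""
    addr = ipaddress.ip_address(address)
    matches = [
        (ipaddress.ip_network(prefix), value)
        for prefix, value in table.items()
        if addr in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def build_forwarding_table(bgp_routes, igp_routes):
    """Resolve each BGP NEXT-HOP through the IGP (done in the control plane)."""
    fib = dict(igp_routes)
    for prefix, bgp_next_hop in bgp_routes.items():
        igp_next_hop = lookup(igp_routes, bgp_next_hop)
        if igp_next_hop is not None:  # install only if NEXT-HOP is reachable
            fib[prefix] = igp_next_hop
    return fib


bgp_routes = {"172.16.0.0/16": "10.0.0.2", "172.17.0.0/16": "10.9.9.9"}
igp_routes = {"10.0.0.0/30": "192.168.1.1"}

fib = build_forwarding_table(bgp_routes, igp_routes)
print(fib)
```

The second BGP route is not installed because its NEXT-HOP has no IGP route.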

Slide 64: Join EGP with IGP for Connectivity
[Figure: AS 1 and AS 2. A /16 prefix is learned via EGP with a next hop on a /30 link; the IGP table gives the next hop toward that /30. Joining the two destination/next-hop tables yields the forwarding table entry for the /16.]

Slide 65: Load-Balancing Knobs in BGP
- LOCAL-PREF: outbound traffic, local preference (box-level knob)
- MED: inbound traffic, typically from the same ISP (link-level knob)
[Figure: AS1 and AS2, annotated with Local Preference and MED]

Slide 66: Path Attribute: LOCAL-PREF
- Locally configured indication of which path is preferred to exit the AS in order to reach a certain network. Default value = 100. Higher is better.

Slide 67: Attributes: MULTI-EXIT Discriminator
- Also called METRIC or MED attribute. Lower is better
- AS1: multihomed customer.
- AS2 (provider) includes MED to AS1
- AS1 chooses which link (NEXTHOP) to use
- E.g.: traffic to AS3 can go through Link1, and AS2 through Link2
[Figure: AS1, AS2, AS3, AS4; Link A and Link B]

Slide 68: Hot Potato Routing: Closest Egress Point
[Figure: a router with IGP distances to egress 1 and egress 2]
- This router has two BGP routes to the same /24 prefix.
- Hot potato: get traffic off of your network as soon as possible. Go for egress 1!

Slide 69: Getting Burned by the Hot Potato
[Figure: a high-bandwidth provider backbone and a low-bandwidth customer backbone with a heavy-content web farm; locations SFF, NYC, San Diego; a tiny HTTP request triggers a huge HTTP reply]
Many customers want their provider to carry the bits!

Slide 70: Cold Potato Routing with MEDs (Multi-Exit Discriminator Attribute)
[Figure: heavy-content web farm announcing a /24 at two exits with different MEDs (one is MED = 56)]
- Prefer lower MED values
- This means that MEDs must be considered BEFORE IGP distance!
- Note 1: some providers will not listen to MEDs
- Note 2: MEDs need not be tied to IGP distance
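The ordering rules scattered across these slides (higher LOCAL-PREF first, local preference before AS-PATH length, lower MED before IGP distance) can be combined into one comparison key. This is a deliberately simplified sketch: real BGP implementations have more tie-breaking steps and normally compare MEDs only between routes from the same neighbor AS. The `Route` class and its field names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Route:
    name: str
    as_path: Tuple[int, ...]
    local_pref: int = 100   # default value from the LOCAL-PREF slide
    med: int = 0
    igp_distance: int = 0


def preference_key(route):
    """Smaller key = more preferred."""
    return (-route.local_pref, len(route.as_path), route.med, route.igp_distance)


def best_route(routes):
    """Pick at most one best route; None if every route was filtered out."""
    return min(routes, key=preference_key) if routes else None


# Hot potato would pick the closer egress, but a lower MED wins first.
near = Route("egress-1", as_path=(2,), med=56, igp_distance=5)
far = Route("egress-2", as_path=(2,), med=10, igp_distance=50)
print(best_route([near, far]).name)

# Local preference is considered before AS-PATH length.
short = Route("via-peer", as_path=(2,), local_pref=90)
padded = Route("via-customer", as_path=(2, 2, 2), local_pref=110)
print(best_route([short, padded]).name)
```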

MEDs Can Export Internal Instability
[Figure: the Heavy Content Web Farm's /24 is announced with MEDs (one shown as MED = 56); a change in the announced MED shows up as a FLAP.]

ASPATH Padding: Shed inbound traffic
Padding will (usually) force inbound traffic from AS 1 to take the primary link.
[Figure: customer AS 2 announces its /24 to its provider over a primary link with ASPATH = 2 and over a backup link with a padded ASPATH.]

Padding May Not Shut Off All Traffic
[Figure: customer AS 2 announces its /24 with ASPATH = 2 on the primary link and with a padded ASPATH on the backup link to provider AS 3.]
• AS 3 will send traffic on the “backup” link because it prefers customer routes, and local preference is considered before ASPATH length!
• Padding in this way is often used as a form of load balancing
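A small sketch shows why padding cannot override local preference. The route records, the LOCAL-PREF values (200 for customer routes, 100 otherwise), and the padded path are assumptions chosen for illustration only.

```python
def best_by_aspath_only(routes):
    """What padding hopes for: the shortest ASPATH wins."""
    return min(routes, key=lambda r: len(r["as_path"]))

def best_with_local_pref(routes):
    """Closer to BGP: highest LOCAL-PREF first, ASPATH length only as a tie-break."""
    return min(routes, key=lambda r: (-r["local_pref"], len(r["as_path"])))

# Hypothetical view from AS 3: the padded route arrives directly from its
# customer AS 2, the unpadded one arrives via AS 1.
routes_at_as3 = [
    {"via": "backup link (customer AS 2)", "as_path": (2, 2, 2), "local_pref": 200},
    {"via": "AS 1 (primary link)", "as_path": (1, 2), "local_pref": 100},
]
```

Comparing ASPATH length alone selects the route via AS 1, but once LOCAL-PREF is compared first the padded customer route over the backup link wins.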

Deaggregation + Multihoming
[Figure: AS 1, AS 2 and AS 3 (customer and providers), with a /8 block and a /16 carved out of it.]
• If AS 1 does not announce the more specific prefix, then most traffic to AS 2 will go through AS 3 because it is a longer match.
• AS 2 is “punching a hole” in the CIDR block of AS 1 => subverts CIDR

CIDR at Work, No Load Balancing
[Figure: an AS with /16 prefixes connects over Link A and Link B to ISP1 and ISP2, which announce /11 and /10 aggregates to ISP3.]
Table at ISP3:
Prefix    Next Hop    ORIGIN AS
/11       ISP1
/10       ISP2

CIDR Subverted for Load Balancing
[Figure: the same topology as before, but the AS now also announces /24 more-specifics alongside its /16 prefixes over Link A and Link B.]
Table at ISP3:
Prefix    Next Hop    ORIGIN AS
/11       ISP1
/10       ISP2
/24       ISP1        AS1
/24       ISP2        AS1
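Deaggregation works because forwarding uses longest-prefix match. The sketch below uses Python's standard `ipaddress` module; the prefixes are hypothetical stand-ins (the original addresses did not survive in the slide text), and `lookup` is an illustrative helper, not a router API.

```python
import ipaddress

def lookup(table, destination):
    """Longest-prefix match: return (prefix, next_hop) of the most specific covering entry."""
    ip = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), next_hop)
        for prefix, next_hop in table.items()
        if ip in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    net, next_hop = max(matches, key=lambda m: m[0].prefixlen)
    return str(net), next_hop

# Hypothetical table at ISP3 with aggregates only.
aggregates_only = {"10.32.0.0/11": "ISP1", "10.64.0.0/10": "ISP2"}

# Same table after AS1 announces /24 more-specifics through the "other" ISP.
with_specifics = dict(aggregates_only)
with_specifics["10.40.5.0/24"] = "ISP2"   # punches a hole in ISP1's /11
with_specifics["10.80.7.0/24"] = "ISP1"   # punches a hole in ISP2's /10
```

A destination inside one of the /24s now follows the more specific entry, while the rest of the aggregate is unaffected. The price is two extra entries in every table that carries them.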

How Can Routes be Colored? BGP Communities
• A community value is 32 bits.
• By convention, the first 16 bits are the ASN of the AS that gives the value its interpretation; the remaining 16 bits are the community number.
• Community attribute = a list of community values. (So one route can belong to multiple communities.)
• RFC 1997 (August 1996)
• Used within and between ASes
• The set of ASes must agree on how to interpret the community value
• Very powerful BECAUSE it has no (predefined) meaning
• Two reserved communities:
  • no_advertise = 0xFFFFFF02: don’t pass to BGP neighbors
  • no_export = 0xFFFFFF01: don’t export out of AS
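The 16-bit ASN plus 16-bit number convention is easy to express with bit operations. The helper names below are hypothetical; the two reserved values are the ones given on the slide.

```python
NO_EXPORT = 0xFFFFFF01      # don't export out of the AS
NO_ADVERTISE = 0xFFFFFF02   # don't pass to BGP neighbors

def encode_community(asn, number):
    """Pack ASN (high 16 bits) and community number (low 16 bits) into one 32-bit value."""
    if not (0 <= asn <= 0xFFFF and 0 <= number <= 0xFFFF):
        raise ValueError("both halves must fit in 16 bits")
    return (asn << 16) | number

def decode_community(value):
    """Split a 32-bit community value into (asn, number)."""
    return value >> 16, value & 0xFFFF

def format_community(value):
    """Render in the usual ASN:number notation, e.g. 1:100."""
    asn, number = decode_community(value)
    return f"{asn}:{number}"
```

For example, `encode_community(1, 100)` yields 65636, which `format_community` renders as `1:100`.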

Communities Example (AS 1)
Import:
• 1:100: customer routes
• 1:200: peer routes
• 1:300: provider routes
Export:
• To customers: 1:100, 1:200, 1:300
• To peers: 1:100
• To providers: 1:100
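The import/export scheme for AS 1 can be written as two lookup tables. This is a sketch only: the function names and the representation of communities as strings in a set are assumptions, while the tag values come from the slide.

```python
# Community attached when a route is learnt from each kind of neighbor.
IMPORT_TAG = {"customer": "1:100", "peer": "1:200", "provider": "1:300"}

# Communities that may be exported to each kind of neighbor.
EXPORT_ALLOWED = {
    "customer": {"1:100", "1:200", "1:300"},
    "peer": {"1:100"},
    "provider": {"1:100"},
}

def tag_on_import(communities, learned_from):
    """Return the route's communities plus the tag for the neighbor type it came from."""
    return set(communities) | {IMPORT_TAG[learned_from]}

def may_export(communities, send_to):
    """A route is exported if it carries at least one community allowed for that neighbor type."""
    return bool(set(communities) & EXPORT_ALLOWED[send_to])
```

With this policy, customers receive everything, while peers and providers only receive customer routes, so AS 1 never provides transit between two of its peers or providers.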

BGP Route Selection Process
Series of tie-breaker decisions...
• If NEXTHOP is inaccessible, do not consider the route.
• Prefer the largest LOCAL-PREF.
• If LOCAL-PREF is the same, prefer the shortest AS-PATH.
• If all paths are external, prefer the lowest ORIGIN code (IGP < EGP < INCOMPLETE).
• If ORIGIN codes are the same, prefer the lowest MED.
• If MED is the same and routes were learned from both EBGP and IBGP, prefer paths learnt from EBGP.
• If still tied, prefer the min-cost NEXT-HOP (lowest IGP cost).
• Final tie-break: prefer the route with the lowest BGP router ID (IP address).

Route Selection Summary
• Enforce relationships: Highest Local Preference
• Traffic engineering: Shortest ASPATH, Lowest MED, i-BGP < e-BGP, Lowest IGP cost to BGP egress
• Throw up hands and break ties: Lowest router ID
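The summary order maps naturally onto a sort key. The sketch below is a simplification under stated assumptions: the `Route` record and its field names are hypothetical, the ORIGIN step is left out as in the summary, and MED is compared across all routes (real implementations normally compare MED only between routes from the same neighboring AS).

```python
from dataclasses import dataclass
from ipaddress import ip_address

@dataclass(frozen=True)
class Route:
    next_hop_reachable: bool
    local_pref: int
    as_path: tuple
    med: int
    learned_via_ebgp: bool
    igp_cost: int
    router_id: str

def selection_key(r):
    """Smaller tuple wins; each element is one step of the summary."""
    return (
        -r.local_pref,                  # highest local preference
        len(r.as_path),                 # shortest ASPATH
        r.med,                          # lowest MED
        0 if r.learned_via_ebgp else 1, # e-BGP preferred over i-BGP
        r.igp_cost,                     # lowest IGP cost to the BGP egress
        int(ip_address(r.router_id)),   # lowest router ID (numeric comparison)
    )

def best_route(routes):
    """Drop routes with an unreachable NEXT-HOP, then pick the minimum key."""
    candidates = [r for r in routes if r.next_hop_reachable]
    return min(candidates, key=selection_key) if candidates else None
```

Converting the router ID to an integer avoids the string-ordering trap in which "10.0.0.10" would sort before "10.0.0.2".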

Caveat
• BGP is not guaranteed to converge on a stable routing. Policy interactions could lead to “livelock” protocol oscillations.
• See “Persistent Route Oscillations in Inter-domain Routing” by K. Varadhan, R. Govindan, and D. Estrin. ISI report, 1996.
• Corollary: BGP is not guaranteed to recover from network failures.

BGP Table Growth
[Figure. Thanks: Geoff Huston.]

Large BGP Tables Considered Harmful
• Routing tables must store best routes and alternate routes
• Burden can be large for routers with many alternate routes (route reflectors, for example)
• Routers have been known to die
• Increases CPU load, especially during session reset

ASNs Growth
[Figure. From: Geoff Huston.]

Dealing with ASN growth…
• Make ASNs larger than 16 bits
  • How about 32 bits?
  • See Internet Draft: “BGP support for four-octet AS number space” (draft-ietf-idr-as4bytes-03.txt)
  • Requires protocol change and wide deployment
• Change the way ASNs are used
  • Allow multihomed, non-transit networks to use private ASNs
  • Uses ASE (AS number Substitution on Egress)
  • See Internet Draft: “Autonomous System Number Substitution on Egress” (draft-jhaas-ase-00.txt)
  • Works at edge, requires protocol change (for loop prevention)

Daily Update Count
[Figure]

A Few Bad Apples …
[Figure: share of updates plotted against percent of BGP table prefixes. Thanks to Madanlal Musuvathi for this plot. Data source: RIPE NCC.]
• Typically, 80% of the updates are for less than 5% of the prefixes.
• Most prefixes are stable most of the time.
• On this day, about 83% of the prefixes were not updated.

Squashing Updates
• Rate limiting on sending updates
  • Send a batch of updates every MinRouteAdvertisementInterval seconds (+/- random fuzz)
  • Default value is 30 seconds
  • A router can change its mind about best routes many times within this interval without telling neighbors
• Route Flap Dampening
  • Punish routes for misbehaving
  • Effective in dampening oscillations inherent in the vectoring approach
  • Must be turned on with configuration
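The rate-limiting idea can be sketched as a batcher that remembers only the latest best route per prefix and flushes on a timer. The class and method names are hypothetical, time is passed in explicitly to keep the example deterministic, and the random fuzz is omitted.

```python
class UpdateBatcher:
    """Hold back best-route changes and send at most one batch per interval."""

    def __init__(self, interval=30.0):
        self.interval = interval   # plays the role of MinRouteAdvertisementInterval
        self.pending = {}          # prefix -> latest best route not yet sent
        self.advertised = {}       # prefix -> route the neighbor currently knows
        self.next_send = interval

    def best_route_changed(self, prefix, route):
        # A later change simply overwrites an earlier one for the same prefix.
        self.pending[prefix] = route

    def tick(self, now):
        """Return the batch to send at time `now` (empty if the timer has not expired)."""
        if now < self.next_send:
            return []
        updates = [(p, r) for p, r in self.pending.items()
                   if self.advertised.get(p) != r]
        self.advertised.update(updates)
        self.pending.clear()
        self.next_send = now + self.interval
        return updates
```

If a prefix changes three times within one interval, the neighbor sees only the final choice; if it changes and then reverts to what was already advertised, the neighbor sees nothing at all.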

Route Flap Dampening (RFC 2439)
• Routes are given a penalty for changing.
• If the penalty exceeds the suppress limit, the route is dampened.
• When the route is not changing, its penalty decays exponentially.
• If the penalty goes below the reuse limit, then the route is announced again.
• Can dramatically reduce the number of BGP updates
• Requires additional router resources
• Applied on eBGP inbound only

Route Flap Dampening Example
[Figure: penalty over time; penalty for each flap = 1000; the route is dampened for nearly 1 hour.]
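The penalty mechanism can be sketched as a small state machine with exponential decay. The per-flap penalty of 1000 is taken from the example; the suppress limit (2000), reuse limit (750) and half-life (900 seconds) are assumed values for illustration, and the class name is hypothetical.

```python
class Dampener:
    """Per-route flap penalty with exponential decay (times in seconds)."""

    def __init__(self, flap_penalty=1000.0, suppress_limit=2000.0,
                 reuse_limit=750.0, half_life=900.0):
        self.flap_penalty = flap_penalty
        self.suppress_limit = suppress_limit
        self.reuse_limit = reuse_limit
        self.half_life = half_life
        self.penalty = 0.0
        self.last_update = 0.0
        self.suppressed = False

    def _decay(self, now):
        # The penalty halves every `half_life` seconds without flaps.
        elapsed = now - self.last_update
        self.penalty *= 0.5 ** (elapsed / self.half_life)
        self.last_update = now

    def flap(self, now):
        """Record one route change at time `now`."""
        self._decay(now)
        self.penalty += self.flap_penalty
        if self.penalty > self.suppress_limit:
            self.suppressed = True

    def is_suppressed(self, now):
        """Decay the penalty to `now` and release the route once below the reuse limit."""
        self._decay(now)
        if self.suppressed and self.penalty < self.reuse_limit:
            self.suppressed = False
        return self.suppressed
```

Under these assumed parameters, three flaps one minute apart push the penalty past the suppress limit, and the route is reused only after the penalty has decayed through roughly two half-lives.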

How Long Does BGP Take to Adapt to Changes?
[Figure. From: Abha Ahuja and Craig Labovitz.]

Two Main Factors in Delayed Convergence
• Rate limiting timer slows everything down
• BGP can explore many alternate paths before giving up or arriving at a new path
  • No global knowledge in vectoring protocols

Implementation Does Matter!
[Figure: two cases compared, stateless withdraws widely deployed versus stateful withdraws widely deployed.]

What is RPSL? Why?
• Object oriented language (developed by RIPE 181)
  • Structured objects
• Describes things interesting to routing policy
  • Routes, ASNs, peer relationships, etc.
• Allows consistent configuration between BGP peers
• Expertise is encoded in the tools that generate the policy rather than in the engineer configuring the peering session
• Automatic, manageable solution for filter generation
For more info: RFC “Routing Policy Specification Language (RPSL)”

Summary
• BGP is a fairly simple protocol …
• … but it is not easy to configure
• BGP is running on more than 100K routers, making it one of the world’s largest and most visible distributed systems
• Global dynamics and scaling principles are still not well understood
• Traffic engineering was hacked in as an afterthought…