Computer Networks Winter 2002 / 2003 Chapter 4 NETWORK LAYER Dynamic PowerPoint Slides Only Distributed Computing Group Computer Networks Winter 2002 / 2003
Remember: Count to Infinity Problem a b c c: 3 c: 4 c: 5 c: 6 c: 7 c: 8 Distributed Computing Group Computer Networks R. Wattenhofer
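The counting on this slide (c: 3, c: 4, up to c: 8) can be reproduced with a minimal sketch. It assumes a line topology a, b, c in which the link between b and c fails, after which a and b alternately adopt each other's stale distance plus one. The function name and the cap of 16 (the value RIP uses for "infinity") are assumptions for illustration, not part of the slides.

```python
def count_to_infinity(rounds, infinity=16):
    """Simulate a and b bouncing a stale route to c back and forth."""
    # Distances to c before the failure: b is adjacent to c, a reaches c via b.
    dist = {"a": 2, "b": 1}
    history = []
    turn = "b"  # the b-c link fails, so b falls back on a's stale advertisement
    for _ in range(rounds):
        other = "a" if turn == "b" else "b"
        # Plain distance vector: trust the neighbor's distance and add one hop.
        dist[turn] = min(dist[other] + 1, infinity)
        history.append((turn, dist[turn]))
        turn = other
    return history


print(count_to_infinity(6))
# [('b', 3), ('a', 4), ('b', 5), ('a', 6), ('b', 7), ('a', 8)]
```

Without the cap the two nodes would keep incrementing forever, which is the behavior the path information in BGP is meant to avoid.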
BGP does not count to infinity
[Figure: chain a, b, c, d, e, Zurich]
Table at b: Destination Zurich, Dir c, Dst 4, Path cdeZ
Table at a: Destination Zurich, Dir b, Dst 5, Path bcdeZ

BGP does not count to infinity
[Figure: same chain and tables; message “withdraw Zurich”]

BGP Basics Continued
[Figure: same chain and tables; message “announce bcdeZ”]

BGP Basics Continued
[Figure: node f added; message “announce bfeZ”; timer label 30s]
Table at b: Destination Zurich, Dir c, Dst 4, Path cdeZ (backup); Dir f, Dst 3, Path feZ (active)
Table at a: Destination Zurich, Dir b, Dst 4, Path bfeZ

BGP Basics Continued
[Figure: same topology and tables as on the previous slide; message “announce bcdeZ”]
BGP (Border Gateway Protocol)
BGP is the Internet's de facto standard path vector protocol.
Receive a BGP update (announce or withdrawal) from a neighbor.
Update the routing table.
Does the update affect the active route? (Loop detection, policy, etc.)
If yes, send an update to all neighbors that are allowed by policy.
MinRouteAdver: at most 1 announce per neighbor per 30+jitter seconds.
Store the active routes of the neighbors.
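The update-processing steps above can be sketched for a single destination as follows. This is a simplified illustration, not the BGP decision process: the class and method names are hypothetical, route selection is reduced to "shortest stored path" (real BGP applies policy first), and the MinRouteAdver timer is not modeled.

```python
class PathVectorNode:
    """Toy path vector node for one destination (hypothetical helper)."""

    def __init__(self, name):
        self.name = name
        self.neighbor_routes = {}  # neighbor -> path (tuple of node names)
        self.active = None         # selected path, starting with this node

    def receive(self, neighbor, path):
        """Process an announce (path is a tuple) or a withdrawal (path is None).

        Returns True if the active route changed, i.e. if the neighbors
        would have to be sent an update.
        """
        if path is None or self.name in path:
            # Withdrawal, or loop detected: our own name is already on the path.
            self.neighbor_routes.pop(neighbor, None)
        else:
            self.neighbor_routes[neighbor] = path
        # Assumption: prefer the shortest stored path.
        best = min(self.neighbor_routes.values(), key=len, default=None)
        new_active = None if best is None else (self.name,) + best
        changed = new_active != self.active
        self.active = new_active
        return changed


a = PathVectorNode("a")
print(a.receive("b", ("b", "c", "d", "e", "Z")), a.active)
print(a.receive("b", None), a.active)
```

Because every announcement carries the full path, a node can discard any route that already contains itself, which is what rules out counting to infinity.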
Internet Architecture BGP
Destination Dir Dst Path: Zurich c 4 cdeZ
In practice: 172.30.160/19 R1 1239 1 3561
Further topics: iBGP, Route flap dampening, Multipath, Soft configuration, …
Internet inter-AS routing: BGP
BGP messages are exchanged using TCP.
BGP messages:
OPEN: opens TCP connection to peer and authenticates sender
UPDATE: advertises new path (or withdraws old)
KEEPALIVE: keeps connection alive in absence of UPDATEs; also ACKs OPEN request
NOTIFICATION: reports errors in previous msg; also used to close connection
Policy:
Even if two BGP routers are connected, they may not announce all their routes or use all the routes of the other.
Example: if AS A does not want to route traffic of AS B, then A should simply not announce anything to B.
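The policy example can be illustrated with a small export filter. This is only a sketch under stated assumptions: the function name `announcements` and the `deny_export` parameter are hypothetical, and real routers express such policy through per-neighbor configuration rather than a single function.

```python
def announcements(active_routes, neighbors, deny_export=frozenset()):
    """Return the destinations this AS would announce to each neighbor.

    Neighbors listed in deny_export receive no announcements at all, which
    mirrors the rule "A should simply not announce anything to B".
    """
    return {
        n: ([] if n in deny_export else sorted(active_routes))
        for n in neighbors
    }


routes_of_A = {"Zurich": "cdeZ"}  # destination -> path, as in the earlier table
print(announcements(routes_of_A, ["B", "C"], deny_export={"B"}))
# {'B': [], 'C': ['Zurich']}
```

Since B never learns a route through A, B has no reason to send traffic to A.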
Robustness of BGP
[Figure: nodes a, b, c and destination d]
We are interested in routes to destination d. Nodes a, b, c all have the policy to prefer a 2-hop route through their clockwise neighbor over a direct 1-hop route to destination d.
Note: a starts with acd, then b picks bd (because via a we would have bacd, which is 3 hops, too long…), c picks cbd, but then a has to change its decision and goes direct, which in turn makes b change… etc.
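The oscillation described on this slide can be reproduced with a short simulation. It is a sketch under assumptions: the clockwise order (a prefers going via c, b via a, c via b) is inferred from the routes acd, bad, cbd mentioned in the note, nodes are activated round-robin, and all names are hypothetical.

```python
from itertools import product

# Assumed clockwise neighbor of each node, inferred from the routes acd, bad, cbd.
CLOCKWISE = {"a": "c", "b": "a", "c": "b"}


def best_route(node, routes):
    """Prefer the 2-hop route via the clockwise neighbor if that neighbor
    currently uses (and therefore announces) its direct route to d."""
    nxt = CLOCKWISE[node]
    if routes[nxt] == nxt + "d":
        return node + nxt + "d"
    return node + "d"


def simulate(steps):
    """Activate a, b, c in turn; return the routing state after each step."""
    routes = {n: n + "d" for n in "abc"}  # everyone starts with the direct route
    trace = []
    for i in range(steps):
        node = "abc"[i % 3]
        routes[node] = best_route(node, routes)
        trace.append(dict(routes))
    return trace


def stable_states():
    """Enumerate all direct/2-hop combinations and keep those in which no
    node wants to change its route."""
    found = []
    for choice in product([False, True], repeat=3):
        routes = {n: (n + CLOCKWISE[n] + "d" if two_hop else n + "d")
                  for n, two_hop in zip("abc", choice)}
        if all(routes[n] == best_route(n, routes) for n in "abc"):
            found.append(routes)
    return found


for state in simulate(9):
    print(state)
print(stable_states())  # [] : no assignment satisfies all three policies
```

The state after step 9 equals the state after step 3, so the routes cycle forever; the empty list from `stable_states()` shows that, under these policies, there is no stable assignment to converge to.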
BGP Update Traffic (Mae-East)
Cisco bug “withdraw loop” is fixed with IOS release.
Internet Evolution: NSFNet (1995)
[Figure: NSFNet Backbone connected via Hello/EGP to Regional networks, which connect Campus networks]

Internet Evolution: Today
[Figure: autonomous systems AS1 to AS8 interconnected via BGP]
Experimental Setup
Analyzed secondary paths of 20x20 AS pairs:
Inject and monitor BGP faults.
Survey providers on policies.
BGP Convergence Times
[Figure: plot of convergence times; only the axis tick 180 survives in the extraction]
BGP Convergence Results
If a link comes up, the convergence time is on the order of the time to forward a message on the shortest path.
If a link goes down, the convergence time is on the order of the time to forward a message on the longest path.
Intuition for Slow Convergence
[Animation: destination p is attached to a; a is connected to b, c, d, e, f. Table entries are written neighbor:path.]
Initially: a stores p:p. b stores a:p. c stores a:p and b:ap. d stores a:p and c:ap. e stores a:p and d:ap. f stores a:p and e:ap.
0 s: a loses its route to p and sends a withdrawal (W) to b, c, d, e, f.
Example at d: d receives W from a, falls back on its stored entry c:ap, and announces dcap.
0.1 s: Messages in flight: W, cbap, dcap, edap.
0.2 s: a and b have no route left (-). Remaining entries: b:ap, c:ap, d:ap, e:ap, plus the newly learned c:bap, d:cap, e:dap.
30 s (!): The next round of announcements has to wait for the MinRouteAdver timer. Remaining entries: c:bap, d:cap, e:dap. Messages in flight: W, dcbap, edcap.
60 s: Remaining entries: d:cbap, e:dcap. Messages in flight: W, edcbap.
90 s: Remaining entry: e:dcbap. Message in flight: W.
Convergence in the time to forward a message on the longest path.
The longest path…
[Figure: graph with nodes p, a, g, e, b, h, f, d, c, j, i]

… is NP-complete (APX)
[Figure: same graph]
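To make the hardness concrete, here is a brute-force sketch that finds a longest simple path by trying every extension. The graph below is a hypothetical four-node example, not the one from the figure, and the function name is an assumption. The search visits every simple path, so its running time grows exponentially with the number of nodes.

```python
def longest_simple_path(graph, start, path=None):
    """Return one longest simple path (list of nodes) beginning at start.

    graph maps each node to a list of its neighbors.
    """
    path = (path or []) + [start]
    best = path
    for nxt in graph.get(start, ()):
        if nxt not in path:  # simple path: never revisit a node
            candidate = longest_simple_path(graph, nxt, path)
            if len(candidate) > len(best):
                best = candidate
    return best


# Hypothetical example: p hangs off a; a, b, c form a triangle.
graph = {
    "p": ["a"],
    "a": ["p", "b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b"],
}
print(longest_simple_path(graph, "p"))
# ['p', 'a', 'b', 'c']
```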
Example of BGP Convergence
Time      BGP Message/Event
10:40:30  2129 withdraws p
10:41:08  2117 announces 5696 2129 p
10:41:32  2117 announces 1 5696 2129 p
10:41:50  2117 announces 2041 3508 3508 4540 7037 1239 5696 2129 p
10:42:17  2117 announces 1 2041 3508 3508 4540 7037 1239 5696 2129 p
10:43:05  2117 announces 2041 3508 3508 4540 7037 1239 6113 5696 2129 p
10:43:35  2117 announces 1 2041 3508 3508 4540 7037 1239 6113 5696 2129 p
10:43:59  2117 withdraws p
Remember the Example
[Figure: topology p, a, b, c, d, e, f; withdrawal W from a; paths explored in turn: edap, edcap, edcbap; final W]
What might help?
Idea: Attach a “cause tag” to the withdrawal message identifying the failed link/node (for a given prefix).
It can be shown that a cause tag reduces the convergence time to the time to forward a message on the shortest path.
Problems:
Since BGP is widely deployed, it cannot be changed easily.
ISPs (ASes) don't like the world to know that it is their link that is not stable, and cause tags do exactly that.
Race conditions make the cause tag protocol intricate.
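The effect of a cause tag on a node's stored routes can be sketched as follows. This is an illustration under assumptions, not the BGP-CT protocol itself: the function names are hypothetical, paths are written as strings of single-letter node names as on the slides, and the failed link is treated as undirected.

```python
def uses_link(path, link):
    """True if the path traverses the given link (in either direction)."""
    u, v = link
    return any({x, y} == {u, v} for x, y in zip(path, path[1:]))


def apply_withdrawal(routes, neighbor, cause=None):
    """Return the stored routes that survive a withdrawal from neighbor.

    routes maps neighbor -> announced path. Without a cause tag only the
    withdrawing neighbor's route is dropped. With a cause tag, every stored
    route that crosses the failed link is dropped as well.
    """
    remaining = {n: p for n, p in routes.items() if n != neighbor}
    if cause is not None:
        remaining = {n: p for n, p in remaining.items()
                     if not uses_link(p, cause)}
    return remaining


c_routes = {"a": "ap", "b": "bap"}  # node c's table from the earlier example
print(apply_withdrawal(c_routes, "a"))                    # {'b': 'bap'}
print(apply_withdrawal(c_routes, "a", cause=("a", "p")))  # {}
```

Without the tag, c falls back on the stale route via b and announces it; with the tag W(ap), c can tell immediately that the backup is invalid too.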
Example with BGP-CT (Cause Tags)
[Animation: destination p is attached to a; a is connected to b, c, d, e, f. Table entries are written neighbor:path.]
Initially: a stores p:p. b stores a:p. c stores a:p and b:ap. d stores a:p and c:ap. e stores a:p and d:ap. f stores a:p and e:ap.
0 s: a loses its route to p.
0.1 s: a sends W(ap), a withdrawal carrying the cause tag ap, to b, c, d, e, f.
Convergence Time using Cause Tags
[Figure: nodes b, c, p, x, e, f]
Convergence in the time to forward a message on the new shortest path (instead of the longest).