Seamless BGP Migration with Router Grafting
Eric Keller, Jennifer Rexford (Princeton University); Kobus van der Merwe (AT&T Research)
NSDI 2010

Presentation transcript:


Dealing with Change
Networks need to be highly reliable
  – To avoid service disruptions
Operators need to deal with change
  – Install, maintain, upgrade, or decommission equipment
  – Deploy new services
  – Manage resource usage (CPU, bandwidth)
But… change causes disruption
  – Forcing a tradeoff

Why is Change so Hard?
Root cause is the monolithic view of a router (hardware, software, and links as one entity)
Revisit the design to make dealing with change easier

Our Approach: Grafting
In nature: take from one, merge into another
  – Plants, skin, tissue
Router Grafting
  – To break the monolithic view
  – Focus on moving a link (and the corresponding BGP session)

Why Move Links?

Planned Maintenance
Shut down router to…
  – Replace power supply
  – Upgrade to new model
  – Contract network
Add router to…
  – Expand network

Planned Maintenance
Could migrate links to other routers
  – Away from router being shut down, or
  – To router being added (or brought back up)

Customer Requests a Feature
Network has a mixture of routers from different vendors
  * Rehome customer to router with needed feature

Traffic Management
Typical traffic engineering:
  * Adjust routing protocol parameters based on traffic
[Figure label: congested link]

Traffic Management
Instead…
  * Rehome customer to change traffic matrix

Understanding the Disruption (today)
1) Reconfigure old router, remove old link
2) Add new link, configure new router
3) Establish new BGP session (exchange routes)
Downtime (minutes)
[Figure labels: delete neighbor, add neighbor, BGP updates]

Router Grafting: Breaking up the router
Router Grafting enables breaking apart a router (splitting/merging).
[Figure labels: send state, move link]

Not Just State Transfer
The topology changes (need to re-run decision processes)
[Figure labels: migrate session; AS100, AS200, AS300, AS400]

Goals
Routing and forwarding should not be disrupted
  – Data packets are not dropped
  – Routing protocol adjacencies do not go down
  – All route announcements are received
Change should be transparent
  – Neighboring routers/operators should not be involved
  – Redesign the routers, not the protocols

Challenge: Protocol Layers
Migrate link, migrate state
  – BGP: exchange routes
  – TCP: deliver reliable stream
  – IP: send packets
  – Physical link
[Figure: protocol-layer diagram (BGP, TCP, IP, physical link) for routers A, B, and C]

Physical Link
[Figure: protocol-layer diagram (BGP, TCP, IP, physical link) for routers A, B, and C]

Physical Link
Unplugging cable would be disruptive
Links are not physical wires
  – Switchover in nanoseconds
[Figure labels: remote end-point, migrate-from, migrate-to]

IP
[Figure: protocol-layer diagram (BGP, TCP, IP, physical link) for routers A, B, and C]

Changing IP Address
IP address is an identifier in BGP
Changing it would require neighbor to reconfigure
  – Not transparent
  – Also has impact on TCP (later)
[Figure labels: remote end-point, migrate-from, migrate-to]

Re-assign IP Address
IP address not used for global reachability
  – Can move with BGP session
  – Neighbor doesn’t have to reconfigure
[Figure labels: remote end-point, migrate-from, migrate-to]

TCP
[Figure: protocol-layer diagram (BGP, TCP, IP, physical link) for routers A, B, and C]

Dealing with TCP
TCP sessions are long running in BGP
  – Killing a session implicitly signals that the router is down
BGP and TCP extensions as a workaround (not supported on all routers)

Migrating TCP Transparently
Capitalize on IP address not changing
  – To keep it completely transparent
Transfer the TCP session state
  – Sequence numbers
  – Packet input/output queue (packets not read/ack’d)
[Figure labels: app, OS; send(), recv(); TCP(data, seq, …), ack, TCP(data’, seq’)]
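As a rough illustration of the state listed on this slide, the sketch below models a TCP session snapshot that could be serialized on the migrate-from router and restored on the migrate-to router. It is only a user-space model: the class `TcpSessionState` and the helpers `export_state` and `import_state` are hypothetical names, and the slides do not show the interface of the actual kernel-level mechanism.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class TcpSessionState:
    """Hypothetical snapshot of the TCP state a migration would carry."""
    local_addr: str    # unchanged by migration, since the IP address moves too
    local_port: int
    remote_addr: str
    remote_port: int
    snd_nxt: int       # next sequence number to send
    rcv_nxt: int       # next sequence number expected from the peer
    # Payloads are hex strings so the snapshot stays JSON-serializable.
    send_queue: list = field(default_factory=list)  # sent but not yet ack'd
    recv_queue: list = field(default_factory=list)  # received but not yet read


def export_state(state: TcpSessionState) -> str:
    """Serialize the snapshot on the migrate-from side."""
    return json.dumps(asdict(state))


def import_state(blob: str) -> TcpSessionState:
    """Rebuild the snapshot on the migrate-to side."""
    return TcpSessionState(**json.loads(blob))


# Usage: round-trip a snapshot of a BGP session (BGP uses TCP port 179).
snapshot = TcpSessionState("192.0.2.1", 179, "192.0.2.2", 50000,
                           snd_nxt=1000, rcv_nxt=2000,
                           send_queue=["ffff0013"])
restored = import_state(export_state(snapshot))
```

Because the addresses and sequence numbers are restored unchanged, the remote end-point sees the same connection before and after the move, which is the transparency the slide asks for.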

BGP
[Figure: protocol-layer diagram (BGP, TCP, IP, physical link) for routers A, B, and C]

BGP: What (not) to Migrate
Requirements
  – Want data packets to be delivered
  – Want routing adjacencies to remain up
Need
  – Configuration
  – Routing information
Do not need (but can have)
  – State machine
  – Statistics
  – Timers
Keeps code modifications to a minimum

Routing Information
Could involve remote end-point
  – Similar exchange as with a new BGP session
  – Migrate-to router sends entire state to remote end-point
  – Ask remote end-point to re-send all routes it advertised
Disruptive
  – Makes remote end-point do significant work
[Figure labels: remote end-point, migrate-from, migrate-to, exchange routes]

Routing Information (optimization)
Migrate-from router sends the migrate-to router:
The routes it learned
  – Instead of making remote end-point re-announce
The routes it advertised
  – So able to send just an incremental update
[Figure labels: remote end-point, migrate-from, migrate-to, incremental update, send routes advertised/learned]
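The incremental update mentioned on this slide can be pictured as a diff between what the migrate-from router had advertised and what the migrate-to router would now advertise. The sketch below is one simple way to compute such a diff, not the prototype's code: the function name `incremental_update` and the representation of a route as a tuple of AS numbers are assumptions made for illustration.

```python
def incremental_update(advertised_by_old, best_at_new):
    """Diff two prefix -> route mappings (hypothetical representation).

    advertised_by_old: routes the migrate-from router had sent to the
                       remote end-point.
    best_at_new:       routes the migrate-to router would advertise after
                       re-running its decision process.
    Returns (announce, withdraw): only the changes the remote end-point
    needs to hear, rather than the full table.
    """
    # Announce prefixes that are new or whose route differs.
    announce = {
        prefix: route
        for prefix, route in best_at_new.items()
        if advertised_by_old.get(prefix) != route
    }
    # Withdraw prefixes the new router no longer has a route for.
    withdraw = sorted(p for p in advertised_by_old if p not in best_at_new)
    return announce, withdraw


# Usage: one unchanged route, one changed path, one withdrawal, one new prefix.
old = {
    "10.0.0.0/8": (100, 200),
    "192.0.2.0/24": (100,),
    "198.51.100.0/24": (100, 300),
}
new = {
    "10.0.0.0/8": (100, 200),
    "192.0.2.0/24": (400, 100),
    "203.0.113.0/24": (400,),
}
announce, withdraw = incremental_update(old, new)
```

The unchanged prefix produces no message at all, which is why this exchange is cheaper for the remote end-point than establishing a fresh session.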

Migration in the Background
Migration takes a while
  – A lot of routing state to transfer
  – A lot of processing is needed
Routing changes can happen at any time
Disruptive if not done in the background
[Figure labels: remote end-point, migrate-to, migrate-from]

While exporting routing state
In-memory: p1, p2, p3, p4
Dump: p1, p2
BGP is incremental, append update
[Figure labels: remote end-point, migrate-to, migrate-from]

While moving TCP session and link
TCP will retransmit
[Figure labels: remote end-point, migrate-to, migrate-from]

While importing routing state
In-memory: p1, p2
Dump: p1, p2, p3, p4
BGP is incremental, ignore dump file
[Figure labels: remote end-point, migrate-to, migrate-from]
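One plausible reading of the export and import slides is sketched below: a routing change that arrives after the dump was taken is appended to the dump as an ordinary update, and on import a dump entry is skipped when a fresher live update for the same prefix has already been processed. The slides give only the one-line rules, so the function names, the list-of-pairs log format, and the use of `None` for a withdrawal are all assumptions.

```python
def apply_update(rib, prefix, route):
    """Apply one BGP-style incremental update; route=None means withdraw."""
    if route is None:
        rib.pop(prefix, None)
    else:
        rib[prefix] = route


def export_with_appended_updates(snapshot, later_updates):
    """Export side: dump the snapshot, then append changes seen afterwards."""
    log = list(snapshot.items())
    log.extend(later_updates)  # incremental protocol: later entries override
    return log


def import_ignoring_stale_entries(rib, dump_log, live_prefixes):
    """Import side: replay the dump, skipping prefixes updated live."""
    for prefix, route in dump_log:
        if prefix in live_prefixes:
            continue  # a fresher update was already applied; ignore the dump
        apply_update(rib, prefix, route)
    return rib


# Usage: p3 is announced and p2 withdrawn after the dump was taken, and a
# live update for p1 reaches the migrate-to router while it is importing.
log = export_with_appended_updates({"p1": "r1", "p2": "r2"},
                                   [("p3", "r3"), ("p2", None)])
rib = import_ignoring_stale_entries({"p1": "r1-new"}, log, {"p1"})
```

Replaying the log in order works only because each update fully replaces the previous state for its prefix, which is the property the slides refer to as BGP being incremental.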

Special Case: Cluster Router
Don’t need to re-run decision processes
Links ‘migrated’ internally
[Figure labels: switching fabric, blade, line card; A, B, C, D]

Prototype
Added grafting into Quagga
  – Import/export routes, new ‘inactive’ state
  – Routing data and decision process well separated
Graft daemon to control process
SockMi for TCP migration
[Figure labels: Modified Quagga, graft daemon, Handler, Comm, Linux kernel, SockMi.ko, Graftable Router; click, click.ko, Linux kernel, Emulated link migration; Quagga, Linux kernel, Unmod. Router]

Evaluation
Impact on migrating routers
Disruption to network operation
Overhead on rest of the network

Impact on Migrating Routers
How long migration takes
  – Includes export, transmit, import, lookup, decision
  – CPU utilization roughly 25%
Between routers: 0.9s (20k), 6.9s (200k)
Between blades: 0.3s (20k), 3.1s (200k)

Disruption to Network Operation
Data traffic affected by not having a link
  – Nanoseconds
Routing protocols affected by unresponsiveness
  – Set old router to “inactive”, migrate link, migrate TCP, set new router to “active”
  – Milliseconds
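The four-step switchover on this slide can be sketched as a short ordered procedure. This is a toy model under stated assumptions: the `Router` class, its `has_link` and `has_tcp` flags, and the `graft` function are hypothetical, and the real steps involve the link layer and the kernel rather than Python attributes.

```python
from enum import Enum


class SessionState(Enum):
    ACTIVE = "active"
    INACTIVE = "inactive"


class Router:
    """Hypothetical stand-in for one side of the migration."""

    def __init__(self, name, state, has_link=False, has_tcp=False):
        self.name = name
        self.state = state
        self.has_link = has_link
        self.has_tcp = has_tcp


def graft(old, new):
    """Run the switchover in the order given on the slide; return a step log."""
    steps = []
    old.state = SessionState.INACTIVE        # 1) stop the old router acting on the session
    steps.append("old inactive")
    old.has_link, new.has_link = False, True  # 2) move the link
    steps.append("link moved")
    old.has_tcp, new.has_tcp = False, True    # 3) move the TCP session state
    steps.append("tcp moved")
    new.state = SessionState.ACTIVE           # 4) new router resumes the session
    steps.append("new active")
    return steps


# Usage: move a session from the migrate-from to the migrate-to router.
migrate_from = Router("migrate-from", SessionState.ACTIVE, has_link=True, has_tcp=True)
migrate_to = Router("migrate-to", SessionState.INACTIVE)
steps = graft(migrate_from, migrate_to)
```

Between steps 1 and 4 neither router answers on the session; this is the window of unresponsiveness that the slide reports as milliseconds.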

Conclusions and Future Work
Enables moving a single link/session with…
  – Minimal code change
  – No impact on data traffic
  – No visible impact on routing protocol adjacencies
  – Minimal overhead on rest of network
Future work
  – Explore applications
  – Generalize grafting (multiple sessions, different protocols, other resources)

Questions?
Contact info: