Rethinking Routers in the Age of Virtualization
Jennifer Rexford, Princeton University
http://www.cs.princeton.edu/~jrex/virtual.html

Traditional View of a Router
A big, physical device…
– Processors
– Multiple links
– Switching fabric
… that directs Internet traffic
– Connects to other routers
– Computes routes
– Forwards packets

Times Are Changing

Backbone Links are Virtual
Flexible underlying transport network
– Layer-3 links are multi-hop paths at layer 2
[Figure: example topology with Chicago, New York, and Washington D.C.]

Routing Separate From Forwarding
Separation of functionality
– Control plane: computes paths
– Forwarding plane: forwards packets
[Figure: router with a processor (control plane), switching fabric, and line cards (data plane)]

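To make the split concrete, here is a minimal Python sketch (illustrative, not from the talk): a control-plane routine runs Dijkstra over a toy topology and installs next hops into a forwarding table, and a data-plane routine answers each packet with nothing more than a table lookup. The topology, names, and packet format are made up.

```python
import heapq

# Toy topology: link costs between routers (illustrative only).
LINKS = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "C": 1, "D": 3},
    "C": {"A": 5, "B": 1, "D": 1},
    "D": {"B": 3, "C": 1},
}

def control_plane_compute(source):
    """Control plane: run Dijkstra and return a forwarding table mapping
    destination -> next hop (the first hop on the shortest path)."""
    dist, prev = {source: 0}, {}
    pq = [(0, source)]
    while pq:
        cost, node = heapq.heappop(pq)
        if cost > dist.get(node, float("inf")):
            continue
        for neigh, w in LINKS[node].items():
            if cost + w < dist.get(neigh, float("inf")):
                dist[neigh], prev[neigh] = cost + w, node
                heapq.heappush(pq, (cost + w, neigh))
    fib = {}
    for dest in dist:
        if dest == source:
            continue
        hop = dest
        while prev[hop] != source:   # walk back to find the first hop
            hop = prev[hop]
        fib[dest] = hop
    return fib

def data_plane_forward(fib, packet):
    """Data plane: a pure table lookup per packet, no route computation."""
    return fib.get(packet["dst"], "drop")

fib = control_plane_compute("A")              # slow path: compute routes
print(data_plane_forward(fib, {"dst": "D"}))  # fast path: lookup -> 'B'
```

The point of the split is that the per-packet path touches only the table, never the route computation.
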
Multiple Virtual Routers
Multiple virtual routers on the same physical router
– Virtual Private Networks (VPNs)
– Router consolidation for a smaller footprint
[Figure: one switching fabric (data plane) shared by several control planes]

Capitalizing on Virtualization
Simplify network management
– Hide planned changes in the physical topology
Improve router reliability
– Survive bugs in complex routing software
Deploy new value-added services
– Customized protocols in virtual networks
Enable new network business models
– Separate service providers from the infrastructure
What should the router “hypervisor” look like?

VROOM: Virtual Routers On the Move
With Yi Wang, Eric Keller, Brian Biskeborn, and Kobus van der Merwe

The Two Notions of “Router”
IP-layer logical functionality, and the physical equipment
[Figure: the logical (IP layer) view and the physical view]

Tight Coupling of Physical & Logical
The root of many network-management challenges (and “point solutions”)
[Figure: the logical (IP layer) view and the physical view]

VROOM: Breaking the Coupling
Re-mapping a logical node to another physical node
– VROOM enables this re-mapping of logical to physical through virtual router migration
[Figure: the logical (IP layer) view and the physical view]

Case 1: Planned Maintenance
NO reconfiguration of VRs, NO reconvergence
[Figure: animation of virtual router VR-1 moving between physical nodes A and B]

Case 2: Service Deployment/Evolution
– Move a (logical) router to more powerful hardware
– VROOM guarantees seamless service to existing customers during the migration

Case 3: Power Savings
– Electricity bills of hundreds of millions of dollars per year
– Contract and expand the physical network according to the traffic volume

Virtual Router Migration: Challenges
1. Migrate an entire virtual router instance
– All control-plane processes & data-plane states
2. Minimize disruption
– Data plane: millions of packets/sec on a 10 Gbps link
– Control plane: less strict (routing messages can be retransmitted)
3. Link migration

VROOM Architecture
[Figure: key components include dynamic interface binding and a data-plane hypervisor]

VROOM’s Migration Process
Key idea: separate the migration of the control and data planes
1. Migrate the control plane
2. Clone the data plane
3. Migrate the links

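The three steps could be sequenced by a small migration controller along these lines; the class and method names are hypothetical, not VROOM's actual interfaces, and each step is reduced to a placeholder.

```python
class VirtualRouterMigration:
    """Illustrative orchestration of the three-step migration (hypothetical
    API; the real system drives OpenVZ, the data-plane hypervisor, and the
    link layer)."""

    def __init__(self, vr, src, dst):
        self.vr, self.src, self.dst = vr, src, dst

    def migrate_control_plane(self):
        # Step 1: move the VM holding the routing processes (iterative
        # pre-copy, then a brief stall-and-copy) from src to dst.
        print(f"migrating control plane of {self.vr}: {self.src} -> {self.dst}")

    def clone_data_plane(self):
        # Step 2: repopulate a fresh FIB on dst from the migrated routes,
        # while the old data plane on src keeps forwarding.
        print(f"cloning data plane of {self.vr} on {self.dst}")

    def migrate_links(self):
        # Step 3: move each link to the new physical node, one at a time,
        # while both data planes are active.
        print(f"migrating links of {self.vr} to {self.dst}")

    def run(self):
        self.migrate_control_plane()
        self.clone_data_plane()
        self.migrate_links()

VirtualRouterMigration("VR-1", "router-A", "router-B").run()
```
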
Control-Plane Migration
Leverage virtual server migration techniques
Router image
– Binaries, configuration files, etc.
Memory
– 1st stage: iterative pre-copy
– 2nd stage: stall-and-copy (when the control plane is “frozen”)
[Figure: physical routers A and B, with the control plane (CP) and data plane (DP)]

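A sketch of the two-stage memory copy, assuming a page-level dirty-tracking model that is purely illustrative (the prototype relies on OpenVZ's live-migration support rather than code like this):

```python
def migrate_memory(pages, dirtied_per_round, max_rounds=5):
    """Illustrative two-stage memory migration.

    pages: ids of all memory pages of the control-plane VM.
    dirtied_per_round: stand-in for dirty-page tracking; returns the pages
    dirtied while the previous round of copying was in progress.
    """
    # Stage 1: iterative pre-copy -- copy everything, then keep re-copying
    # whatever was dirtied in the meantime, while the router keeps running.
    to_copy = set(pages)
    for round_no in range(max_rounds):
        copied = len(to_copy)
        to_copy = dirtied_per_round(round_no)
        print(f"pre-copy round {round_no}: copied {copied} pages")
        if not to_copy:
            break

    # Stage 2: stall-and-copy -- freeze the control plane briefly, copy the
    # remaining dirty pages, then resume it on the destination.
    print(f"stall-and-copy: control plane frozen, copying {len(to_copy)} pages")

# Toy dirty-page model: fewer pages get dirtied each round.
migrate_memory(range(1000), lambda r: set(range(100 >> r)))
```
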
Data-Plane Cloning
Clone the data plane by repopulation
– Enable migration across different data planes
– Avoid copying duplicate information
[Figure: physical routers A and B with the control plane (CP), the old data plane (DP-old), and the new data plane (DP-new)]

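Repopulation means rebuilding the new FIB from the routes the control plane already holds, rather than copying the old data plane's internal state. A toy version with made-up route structures:

```python
def clone_data_plane(rib):
    """Rebuild ('repopulate') a fresh FIB from the RIB instead of copying the
    old data plane's internal state, so the old and new data planes can be
    entirely different implementations (toy structures, not VROOM's real ones)."""
    fib = {}
    for prefix, route in rib.items():
        # Keep only what forwarding needs: prefix -> (next hop, interface).
        fib[prefix] = (route["next_hop"], route["out_if"])
    return fib

rib = {
    "12.0.0.0/8":   {"next_hop": "10.0.0.2", "out_if": "if2", "as_path": [7018, 701]},
    "192.0.2.0/24": {"next_hop": "10.0.1.2", "out_if": "if1", "as_path": [3356]},
}
print(clone_data_plane(rib))
```
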
Remote Control Plane
Data-plane cloning takes time
– Installing 250k routes takes over 20 seconds
The control plane & old data plane need to be kept “online”
Solution: redirect routing messages through tunnels
[Figure: physical routers A and B with the control plane (CP), the old data plane (DP-old), and the new data plane (DP-new)]

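While DP-new is being populated, routing messages that still arrive at the old node must reach the control plane that has already moved. A hedged sketch of that redirection, with a hypothetical Tunnel helper standing in for the real tunnels:

```python
class Tunnel:
    """Stand-in for the tunnel between the old physical node and the node now
    hosting the control plane (hypothetical helper, not VROOM code)."""
    def __init__(self, local, remote):
        self.local, self.remote = local, remote
    def send(self, payload):
        print(f"tunnel {self.local}->{self.remote}: {len(payload)} bytes")

def redirect_routing_messages(messages, tunnel):
    # The old node keeps receiving OSPF/BGP messages from its neighbors while
    # DP-new is still being populated; forward them, unparsed, to the remote
    # control plane.
    for msg in messages:
        tunnel.send(msg)

tunnel = Tunnel("router-A", "router-B")
redirect_routing_messages([b"ospf-hello", b"bgp-update"], tunnel)
```
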
Double Data Planes
At the end of data-plane cloning, both data planes are ready to forward traffic
[Figure: the control plane (CP) with both DP-old and DP-new]

Asynchronous Link Migration
With the double data planes, links can be migrated independently
[Figure: neighbors A and B, the control plane (CP), DP-old, and DP-new]

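Because both data planes stay active, each link can be re-homed on its own schedule; an illustrative loop, where move_link is a stand-in for the real layer-2 operation:

```python
def migrate_links(links, move_link):
    """Migrate links one at a time ('asynchronously'): at every point each
    neighbor is attached to exactly one working data plane, so no coordinated
    cut-over is needed."""
    for link in links:
        move_link(link, "DP-old", "DP-new")
        print(f"link to {link} now attached to DP-new")

migrate_links(["neighbor-A", "neighbor-B"],
              lambda link, old, new: None)   # no-op stand-in for re-homing
```
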
Prototype Implementation
Virtualized operating system
– OpenVZ, supports VM migration
Routing protocols
– Quagga software suite
Packet forwarding
– NetFPGA hardware
Router hypervisor
– Our extensions for repopulating the data plane, remote control plane, double data planes, …

Experimental Results
Data plane: NetFPGA
– No packet loss or extra delay
Control plane: Quagga routing software
– All routing-protocol adjacencies stay up
– Core router migration (intradomain only): with an unplanned link failure injected at another router, at most one retransmission of an OSPF message
– Edge router migration (intra- and interdomain): control-plane downtime of 3.56 seconds, within reasonable keep-alive timer intervals

Conclusions on VROOM
Useful network-management primitive
– Breaks the tight coupling between physical and logical
– Simplifies management, enables new applications
Evaluation of prototype
– No disruption in packet forwarding
– No noticeable disruption in routing protocols
Ongoing work
– Migration scheduling as an optimization problem
– Extensions to the hypervisor for other applications

VERB: Virtually Eliminating Router Bugs
With Eric Keller, Minlan Yu, and Matt Caesar

Router Bugs Are Important
Routing software is complicated
– Leads to programming errors (a.k.a. “bugs”)
– Recent string of high-profile outages
Bugs are different from traditional failures
– Byzantine failures: the router doesn’t simply crash
– They violate the protocol and cause cascading outages
The problem is getting worse
– Software is getting more complicated
– Other kinds of outages are becoming less common
– Vendors are allowing third-party software

Exploit Software and Data Diversity
Many sources of diversity
– Diverse code (Quagga, XORP, BIRD)
– Diverse protocols (OSPF and IS-IS)
– Diverse environment (timing, ordering, memory)
Reasonable overhead
– Extra processor blade for hardware reliability
– Multi-core processors, separate route servers, …
Special properties of routing software
– Clear interfaces to the data plane and other routers
– Limited dependence on past history

Handling Bugs at Run Time
Diverse replication
– Run multiple control planes in parallel
– Vote on routing messages and the forwarding table
[Figure: hypervisor with a replica manager, an update voter, and a FIB voter; each replica runs its own protocol daemon and RIB, and a single forwarding table (FIB) serves interfaces IF 1 and IF 2]

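A structural sketch of the idea in Python: a replica manager runs several diverse routing processes, feeds each one the same inputs, and installs only what a majority agrees on. StubReplica stands in for a Quagga/XORP/BIRD instance, and all class and method names are hypothetical.

```python
class StubReplica:
    """Stand-in for a routing daemon (Quagga, XORP, or BIRD in the prototype)."""
    def __init__(self, next_hop):
        self.next_hop = next_hop
    def deliver(self, raw_bytes):
        pass                          # a real daemon would parse and update its RIB
    def route_for(self, prefix):
        return self.next_hop

class BugTolerantRouter:
    """Replica manager plus voters, reduced to their wiring."""
    def __init__(self, replica_factories):
        self.replicas = [make() for make in replica_factories]
        self.fib = {}

    def on_routing_message(self, raw_bytes):
        # Replicate the unparsed message to every replica.
        for r in self.replicas:
            r.deliver(raw_bytes)

    def on_fib_proposals(self, prefix):
        # Ask each replica for its route and install only the majority answer.
        votes = [r.route_for(prefix) for r in self.replicas]
        best = max(set(votes), key=votes.count)
        if votes.count(best) > len(votes) // 2:
            self.fib[prefix] = best

router = BugTolerantRouter([lambda: StubReplica("IF 2"),
                            lambda: StubReplica("IF 2"),
                            lambda: StubReplica("IF 1")])   # one buggy replica
router.on_routing_message(b"opaque BGP update")
router.on_fib_proposals("12.0.0.0/8")
print(router.fib)                     # -> {'12.0.0.0/8': 'IF 2'}: the bug is outvoted
```
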
Replicating Incoming Routing Messages
No need for protocol parsing – operates at the socket level
[Figure: an incoming update for 12.0.0.0/8 is replicated to every protocol-daemon replica]

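Since the hypervisor never parses protocol messages, replication can happen at the socket layer: each byte string received from a neighbor is handed verbatim to every replica. A minimal sketch, with in-memory queues standing in for the replicas' sockets:

```python
import queue

def replicate_incoming(raw_bytes, replica_sockets):
    """Hand the unparsed routing message to every replica. No protocol parsing:
    the hypervisor treats the data as an opaque byte string."""
    for sock in replica_sockets:
        sock.put(raw_bytes)

# Three queues stand in for the sockets of three protocol-daemon replicas.
replica_sockets = [queue.Queue() for _ in range(3)]
replicate_incoming(b"opaque BGP UPDATE for 12.0.0.0/8", replica_sockets)
print([s.qsize() for s in replica_sockets])   # -> [1, 1, 1]
```
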
Voting: Updates to Forwarding Table
Transparent by intercepting calls to “Netlink”
[Figure: the replicas’ proposed FIB entry for 12.0.0.0/8 (via IF 2) is voted on before being installed in the FIB]

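In the prototype the replicas' FIB writes are captured by intercepting Netlink; the sketch below skips the interception and shows only the voting step, with a made-up update format:

```python
class FIBVoter:
    """Buffer the route each replica tries to install and write to the real FIB
    only when a majority of replicas agree (illustrative logic only; the real
    system sees these proposals as intercepted Netlink route messages)."""

    def __init__(self, num_replicas, fib):
        self.num_replicas = num_replicas
        self.fib = fib
        self.pending = {}             # prefix -> {replica_id: proposed route}

    def on_replica_update(self, replica_id, prefix, route):
        votes = self.pending.setdefault(prefix, {})
        votes[replica_id] = route
        agreeing = [r for r in votes.values() if r == route]
        if len(agreeing) > self.num_replicas // 2:
            self.fib[prefix] = route  # majority reached: install the entry
            self.pending.pop(prefix, None)

fib = {}
voter = FIBVoter(3, fib)
voter.on_replica_update(0, "12.0.0.0/8", "IF 2")
voter.on_replica_update(1, "12.0.0.0/8", "IF 2")   # majority -> installed
print(fib)                                          # {'12.0.0.0/8': 'IF 2'}
```
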
Voting: Control-Plane Messages
Transparent by intercepting socket system calls
[Figure: the same voter architecture, applied to the replicas’ routing messages]

Simple Voting and Recovery
Tolerate transient periods of disagreement
– During routing-protocol convergence (tens of seconds)
Several different voting mechanisms
– Master-slave vs. wait-for-consensus
Small, trusted software component
– No parsing; treats data as opaque strings
– Just 514 lines of code in our implementation
Recovery
– Kill the faulty instance and invoke a new one

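Both voting strategies reduce to a simple policy over the replicas' answers; an illustrative comparison (the real voter also handles timeouts and replica recovery):

```python
def master_slave(proposals):
    """Master-slave: act immediately on the master's (replica 0's) answer; the
    other replicas are only used to detect a faulty master and recover later."""
    return proposals[0]

def wait_for_consensus(proposals):
    """Wait-for-consensus: wait for every replica, then act only on a majority
    answer. Higher delay, stronger masking of a buggy replica."""
    best = max(set(proposals), key=proposals.count)
    return best if proposals.count(best) > len(proposals) // 2 else None

proposals = ["IF 2", "IF 2", "IF 1"]       # replica 0 happens to be correct here
print(master_slave(proposals))             # 'IF 2' right away
print(wait_for_consensus(proposals))       # 'IF 2' after all replicas answer
```

Master-slave keeps the update path fast; wait-for-consensus masks a buggy replica at the cost of waiting for the slowest one.
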
Conclusion on Bug-Tolerant Router
Seriousness of routing-software bugs
– Cause serious outages, misbehavior, and vulnerabilities
– Violate protocol semantics, so not handled by traditional failure detection and recovery
Software and data diversity
– Effective, with reasonable overhead
Design and prototype of a bug-tolerant router
– Works with Quagga, XORP, and BIRD software
– Low overhead and a small trusted code base

Conclusions for the Talk
Router virtualization is exciting
– Enables a wide variety of new networking techniques
– … for network management & service deployment
– … and even rethinking the Internet architecture
Fascinating space of open questions
– Other possible applications of router virtualization?
– What is the right interface to router hardware?
– What is the right programming environment for customized protocols on virtual networks?
http://www.cs.princeton.edu/~jrex/virtual.html