Virtualizing the Data Plane Through Source Code Merging
Eric Keller, Evan Green
Princeton University
PRESTO /22/08
2 Zooming In On a Virtual Router
Custom functionality
–Custom user environment on each node (for controlling virtual router)
–Specify single node’s packet handling
Isolated from others sharing same node
–Allocated share of resources (e.g. CPU, memory, bandwidth)
–Protected from faults in others (e.g. another virtual router crashing)
Highest performance possible
[Figure labels: User Control Environment; Config/Query interface; elements A1 to A5; "From devices"; "To devices"]
3 General Virtualization
Isolation: Namespace, Resource, Performance
Full/Para Virtualization
–Separate operating system
–By emulating underlying hardware
–Each OS has own network stack
Container Virtualization
–Separate data structures
–By modifying kernel
–Shared network stack, each container can configure
4 Packet Processing Configurability
Goal: run custom code for packet processing
Requires: isolation
–Namespace, resource, performance
Could run each instance of custom code in a VM
–Isolation provided by virtual machine
Lighter-weight solution
–Provide appearance of multiple instances
–While still providing isolation
Click as platform for this lightweight solution
5 Click Background: Overview
Software architecture for building configurable routers
–Widely used, commercially and in research
–Easy to use, flexible, high performance
Routers assembled from packet processing modules (Elements)
–Simple and Complex
Processing specified as directed graph
Includes a scheduler
–Schedules tasks (a series of elements)
[Figure: example graph FromDevice(eth0) -> Counter -> Discard]
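The element-graph idea on this slide can be illustrated with a small sketch. This is not Click itself (Click elements are C++ classes); it is a Python model, under the assumption of single-output push processing, of the FromDevice(eth0) -> Counter -> Discard example. The `receive` method and the `count` and `dropped` fields are hypothetical names introduced for illustration.

```python
class Element:
    """A packet-processing module with at most one downstream element."""

    def __init__(self):
        self.next = None

    def connect(self, downstream):
        # Wire this element's output to `downstream`; return it so calls chain.
        self.next = downstream
        return downstream

    def push(self, packet):
        # Default behaviour: hand the packet to the next element, if any.
        if self.next is not None:
            self.next.push(packet)


class FromDevice(Element):
    """Stand-in for a device source; here packets are injected by hand."""

    def __init__(self, devname):
        super().__init__()
        self.devname = devname

    def receive(self, packets):
        # Hypothetical helper: push each packet into the graph.
        for packet in packets:
            self.push(packet)


class Counter(Element):
    """Counts packets and passes them on unchanged."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def push(self, packet):
        self.count += 1
        super().push(packet)


class Discard(Element):
    """Sink: drops every packet it receives."""

    def __init__(self):
        super().__init__()
        self.dropped = 0

    def push(self, packet):
        self.dropped += 1


def build_example():
    # Directed graph: FromDevice(eth0) -> Counter -> Discard
    src, counter, sink = FromDevice("eth0"), Counter(), Discard()
    src.connect(counter).connect(sink)
    return src, counter, sink
```

Calling `build_example()` and then `src.receive([...])` walks each packet through the chain, which is the sense in which processing is "specified as a directed graph".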
6 Lightweight Virtualization
Source Code Merging: Combine graphs
–Each virtual router specifies custom graph
–Can target hardware or software
Add extra packet processing (e.g. mux/demux)
–Needed to direct packets to the correct graph
Add resource accounting
[Figure: Graph 1 and Graph 2 are combined into a master graph; inside the master graph, Graph 1 and Graph 2 sit between an input port and an output port]
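A minimal sketch of the merging step, assuming each virtual router's graph is given as a plain dictionary rather than real Click configuration source. The dictionary layout, the `vrN/` prefix scheme, and the `Demux`/`Mux` element names are all hypothetical; the slides do not say how the prototype represents or renames graphs.

```python
def merge_graphs(graphs):
    """Combine per-virtual-router graphs into one master graph.

    graphs maps a virtual-router id to a dict with keys:
      "elements": {element_name: element_class}
      "edges":    [(src_name, dst_name), ...]
      "input":    name of the element that receives packets
      "output":   name of the element that emits packets
    """
    # Hypothetical shared elements that steer packets to and from each graph.
    master = {"elements": {"demux": "Demux", "mux": "Mux"}, "edges": []}

    for vr_id, graph in sorted(graphs.items()):
        prefix = f"vr{vr_id}/"  # static renaming keeps namespaces disjoint

        for name, cls in graph["elements"].items():
            master["elements"][prefix + name] = cls

        for src, dst in graph["edges"]:
            master["edges"].append((prefix + src, prefix + dst))

        # Extra packet processing: connect the shared demux/mux to this graph.
        master["edges"].append(("demux", prefix + graph["input"]))
        master["edges"].append((prefix + graph["output"], "mux"))

    return master
```

Two routers that both name an element `c` end up with `vr1/c` and `vr2/c` in the master graph, so neither can collide with or refer to the other's elements by name.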
7 Prototype Implementation
Linux-VServer + Click + NetFPGA (future)
[Figure labels: Click; Coordinating Process; Install/Query (three interfaces); Click on NetFPGA]
8 Resource Accounting with VServer
Purpose of Resource Accounting
–Provides performance isolation between virtual routers
VServer’s Token Bucket Extension to Linux Scheduler
–Controls eligibility of processes/threads to run
Integration with Click
–Unified accounting for packet processing and control
–Each Click configuration assigned to a thread
–Each thread associated with a VServer context
–~10% overhead for 10 virtual routers vs. an unshared node
9 Isolation Properties
Performance Isolation
–Associate each graph with virtual container
–Assume library of “safe” elements that execute within a bounded amount of time
Namespace Isolation
–Coordinating process statically renames
Resource Isolation
–Memory: assume library of “safe” elements that do not access memory outside of element
–Devices: Coordinating process adds mux/demux elements
Next: examine relaxing these assumptions to allow custom elements
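One way the "library of safe elements" assumption could be checked before a graph is merged is sketched below. The contents of `SAFE_LIBRARY`, the graph representation, and the `admit_graph` function are hypothetical; the slides only state that safe elements are assumed and that unknown elements are run inside a container.

```python
# Hypothetical library of element classes trusted to run in bounded time
# and to stay within their own memory. Membership here is illustrative only.
SAFE_LIBRARY = {"FromDevice", "ToDevice", "Counter", "Discard", "RadixIPLookup"}


def find_unsafe_elements(graph, safe_library):
    """Return the sorted names of elements whose class is not in the library.

    graph maps element_name -> element_class.
    """
    return sorted(name for name, cls in graph.items() if cls not in safe_library)


def admit_graph(graph, safe_library):
    """Decide how a virtual router's graph should be run.

    Returns ("merge", []) when every element is from the safe library, or
    ("container", unsafe_names) when any element is unknown and the graph
    should therefore be executed inside a container instead.
    """
    unsafe = find_unsafe_elements(graph, safe_library)
    if unsafe:
        return ("container", unsafe)
    return ("merge", [])
```

A graph built only from library elements is admitted for merging; a graph containing even one unknown element class is flagged, along with the names of the offending elements.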
10 Problem 1: Unyielding Threads
Linux kernel threads are cooperative (i.e. must yield)
–Token scheduler controls when a thread is eligible to start
Single long task can cause short-term disruptions
–Affecting delay and jitter on other virtual networks
Token bucket does not go negative
–Long term, a virtual network can get more than its share
[Figure: token bucket of size S; tokens added at rate A; tokens consumed at 1 per scheduler tick; minimum tokens to execute M]
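The effect described on this slide can be reproduced with a toy simulation. It is a simplified model, not VServer's scheduler: a single always-runnable thread, one scheduling decision per tick, and a task that cannot be preempted once started. The parameter names mirror the figure (rate A, minimum M, size S); `task_len` and the function itself are assumptions made for illustration.

```python
def simulate_share(task_len, ticks, rate, min_tokens, size):
    """Return the fraction of ticks a cooperative thread actually runs.

    task_len:   ticks a task runs before it yields (it cannot be preempted)
    rate:       tokens added per tick (A)
    min_tokens: tokens required before a new task may start (M)
    size:       bucket capacity (S)
    """
    tokens = 0.0
    remaining = 0  # ticks left in the task currently running
    ran = 0

    for _ in range(ticks):
        # Refill, capped at the bucket size.
        tokens = min(size, tokens + rate)

        # The bucket only gates the *start* of a task.
        if remaining == 0 and tokens >= min_tokens:
            remaining = task_len

        if remaining > 0:
            ran += 1
            remaining -= 1
            # One token per tick, but the bucket never goes negative,
            # so a long task is not charged for everything it uses.
            tokens = max(0.0, tokens - 1.0)

    return ran / ticks
```

With `rate=0.25`, a thread whose tasks last one tick runs a quarter of the time, as intended. With `task_len=20` and the same bucket, the thread runs for most of the ticks, because the clamp at zero forgives the debt it accumulates while refusing to yield.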
11 Problem 1: Unyielding Threads (soln.)
Determine graph’s execution time
–Standard N-port router example: ~5400 cycles (1.8 µs)
–RadixIPLookup (167k entries): ~1000 cycles
Option 1: Break up graph
Option 2: Execute inside of container
[Figure: two chains of elem1, elem2, elem3; labels "From Kern" and "To User"]
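A rough sketch of how Option 1 might work: convert cycle counts to time, then greedily cut a chain of elements into tasks that each fit a cycle budget, so the thread can yield between tasks. The 3 GHz clock is inferred from the slide's own numbers (5400 cycles in 1.8 µs) rather than stated. The per-element costs, the budget, and the greedy policy are hypothetical; the slides do not describe how the graph would be broken up.

```python
def cycles_to_us(cycles, clock_hz):
    """Convert a cycle count to microseconds at the given clock rate."""
    return cycles / clock_hz * 1e6


def split_chain(chain, budget):
    """Greedily split a linear chain of elements into tasks.

    chain:  list of (element_name, cost_in_cycles) in processing order
    budget: maximum cycles a task should run before yielding

    An element that alone exceeds the budget still gets its own task,
    since this sketch cannot split inside an element.
    """
    tasks, current, used = [], [], 0

    for name, cost in chain:
        # Close the current task if adding this element would overrun it.
        if current and used + cost > budget:
            tasks.append(current)
            current, used = [], 0
        current.append(name)
        used += cost

    if current:
        tasks.append(current)
    return tasks
```

For example, `split_chain([("elem1", 2000), ("elem2", 2000), ("elem3", 1400)], 3500)` puts `elem1` in one task and `elem2`, `elem3` in a second, giving the scheduler a yield point in the middle of the 5400-cycle chain.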
12 Problem 2: Custom Elements in C++
Elements have access to global state
–Kernel state/functions
–Click global state
Could… (and we did)
–Pre-compile in user mode
–Pre-compile with restricted header files
Not perfect:
–With C++, you can manipulate pointers
Instead, custom elements are unknown (“unsafe”)
–For absolute safety, execute in container
13 Future Work
Safety
–Modify source code to add checks (e.g. CCured)
–Run-time monitoring
–Explore alternative tradeoff points
Add support for specialized devices (FPGAs)
–Click to FPGA
–Partitioning graph across FPGA and Software
–Specification of elements
Language to target either HW or SW
14 Conclusion
Goal: Enable custom data planes per virtual network
Built prototype system for virtual Click in kernel
–Merging Click graphs of different virtual routers
–Adding elements to mux/demux packets to the correct graph
–Unified resource accounting with Linux-VServer
Discussed issues of safety
–Performance Isolation: Unyielding threads
–Resource Isolation: Pointers
Using source code
–Enables a lightweight virtualization mechanism
–Opens up compile-time solutions to safety
15 Questions