PlanetLab Architecture Larry Peterson Princeton University
Issues
Multiple VM Types
–Linux vservers, Xen domains
Federation
–EU, Japan, China
Resource Allocation
–Policy, markets
Infrastructure Services
–Delegation
Need to define the PlanetLab Architecture

Key Architectural Ideas
Distributed virtualization
–slice = set of virtual machines
Unbundled management
–infrastructure services run in their own slice
Chain of responsibility
–account for behavior of third-party software
–manage trust relationships
Trust Relationships
Sites: Princeton, Berkeley, Washington, MIT, Brown, CMU, NYU, ETH, Harvard, HP Labs, Intel, NEC Labs, Purdue, UCSD, SICS, Cambridge, Cornell, …
Slices: princeton_codeen, nyu_d, cornell_beehive, att_mcash, cmu_esm, harvard_ice, hplabs_donutlab, idsl_psepr, irb_phi, paris6_landmarks, mit_dht, mcgill_card, huji_ender, arizona_stork, ucb_bamboo, ucsd_share, umd_scriptroute, …
N x N relationships versus a Trusted Intermediary (PLC)
Principals
Node Owners
–host one or more nodes (retain ultimate control)
–select an MA and approve one or more SAs
Service Providers (Developers)
–implement and deploy network services
–responsible for the service’s behavior
Management Authority (MA)
–installs and maintains software on nodes
–creates VMs and monitors their behavior
Slice Authority (SA)
–registers service providers
–creates slices and binds them to the responsible provider
Trust Relationships
Principals: Owner, MA, SA, Provider
(1) Owner trusts MA to map network activity to responsible slice
(2) Owner trusts SA to map slice to responsible providers
(3) Provider trusts SA to create VMs on its behalf
(4) Provider trusts MA to provide working VMs & not falsely accuse it
(5) SA trusts provider to deploy responsible services
(6) MA trusts owner to keep nodes physically secure
Architectural Elements
[Diagram components: MA, node database, NM + VMM, Node Owner, VMs, SCS, SA, slice database, Service Provider]
Narrow Waist
Name space for slices
Node Manager Interface
rspec = < vm_type = linux_vserver, cpu_share = 32, mem_limit = 128MB, disk_quota = 5GB, base_rate = 1Kbps, burst_rate = 100Mbps, sustained_rate = 1.5Mbps >
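An rspec of this shape can be treated as a typed record. The following is a minimal sketch, assuming the field names shown on the slide; the `RSpec` class, the `parse_rspec` helper, and the unit conventions (decimal for rates, binary for sizes) are illustrative assumptions, not part of the Node Manager interface.

```python
from dataclasses import dataclass

# Unit multipliers are assumptions made for this sketch.
_RATE_UNITS = {"Kbps": 1_000, "Mbps": 1_000_000}
_SIZE_UNITS = {"MB": 2**20, "GB": 2**30}


def _parse_quantity(text, units):
    """Convert a string such as '1.5Mbps' into a number of base units."""
    for suffix, factor in units.items():
        if text.endswith(suffix):
            return float(text[: -len(suffix)]) * factor
    raise ValueError(f"unknown unit in {text!r}")


@dataclass(frozen=True)
class RSpec:
    vm_type: str
    cpu_share: int
    mem_limit: float        # bytes
    disk_quota: float       # bytes
    base_rate: float        # bits per second
    burst_rate: float       # bits per second
    sustained_rate: float   # bits per second


def parse_rspec(text):
    """Parse '< key = value, ... >' into an RSpec (hypothetical helper)."""
    body = text.strip().lstrip("<").rstrip(">")
    fields = {}
    for item in body.split(","):
        key, value = item.split("=")
        fields[key.strip()] = value.strip()
    return RSpec(
        vm_type=fields["vm_type"],
        cpu_share=int(fields["cpu_share"]),
        mem_limit=_parse_quantity(fields["mem_limit"], _SIZE_UNITS),
        disk_quota=_parse_quantity(fields["disk_quota"], _SIZE_UNITS),
        base_rate=_parse_quantity(fields["base_rate"], _RATE_UNITS),
        burst_rate=_parse_quantity(fields["burst_rate"], _RATE_UNITS),
        sustained_rate=_parse_quantity(fields["sustained_rate"], _RATE_UNITS),
    )
```

Normalizing units at parse time means later comparisons (for example, checking a request against a node's remaining capacity) operate on plain numbers.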
Node Boot/Install Process
Participants: Node (Boot Manager), PLC Boot Server
1. Node boots from BootCD (Linux loaded)
2. Hardware initialized
3. Read network config. from floppy
4. Contact PLC (MA)
5. PLC: send boot manager
6. Execute boot mgr
7. Node key read into memory from floppy
8. Invoke Boot API
9. PLC: verify node key, send current node state
10. If state = “install”, run installer
11. Update node state via Boot API
12. PLC: verify node key, change state to “boot”
13. Chain-boot node (no restart)
14. Node booted
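The key-check and state-transition part of the boot sequence (steps 8 to 13) can be sketched as a small state machine. This is a toy in-memory model under stated assumptions: the `BootServer` class, its method names, and the byte-string node key are hypothetical stand-ins, not the actual Boot API.

```python
import hmac


class BootServer:
    """Toy stand-in for the PLC boot server (hypothetical, in-memory)."""

    def __init__(self):
        self._keys = {}    # node_id -> node key (bytes)
        self._state = {}   # node_id -> "install" or "boot"

    def register(self, node_id, key, state="install"):
        self._keys[node_id] = key
        self._state[node_id] = state

    def _verify(self, node_id, key):
        expected = self._keys.get(node_id)
        # Constant-time comparison of the presented key with the stored one.
        if expected is None or not hmac.compare_digest(expected, key):
            raise PermissionError("node key rejected")

    def get_state(self, node_id, key):
        """Step 9: verify node key, send current node state."""
        self._verify(node_id, key)
        return self._state[node_id]

    def set_state(self, node_id, key, state):
        """Step 12: verify node key, change state."""
        self._verify(node_id, key)
        self._state[node_id] = state


def run_boot_manager(server, node_id, key, log):
    """Node-side logic for steps 8 to 13; 'log' records the actions taken."""
    state = server.get_state(node_id, key)        # steps 8-9
    if state == "install":                        # step 10
        log.append("run installer")
        server.set_state(node_id, key, "boot")    # steps 11-12
    log.append("chain-boot")                      # step 13
    return server.get_state(node_id, key)
```

A node registered in the "install" state runs the installer once; on later boots the server reports "boot" and the node goes straight to the chain-boot step.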
PlanetFlow
Logs every outbound IP flow on every node
–accesses ulogd via Proper
–retrieves packet headers, timestamps, context ids (batched)
–used to audit traffic
Aggregated and archived at PLC
Chain of Responsibility
Network Activity → Slice → Responsible Users & PI
Join Request: PI submits Consortium paperwork and requests to join
PI Activated: PLC verifies PI, activates account, enables site (logged)
User Activated: Users create accounts with keys, PI activates accounts (logged)
Slice Created: PI creates slice and assigns users to it (logged)
Nodes Added to Slices: Users add nodes to their slice (logged)
Slice Traffic Logged: Experiments run on nodes and generate traffic (logged by Netflow)
Traffic Logs Centrally Stored: PLC periodically pulls traffic logs from nodes
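The audit path from network activity to slice to responsible users and PI amounts to two lookups: one over data the MA holds, one over data the SA holds. A minimal sketch, assuming a flow record carries a node name and a context id; the `Flow` record, the two dictionaries, and their layout are hypothetical and only illustrate the chain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    node: str          # node that emitted the traffic
    context_id: int    # VM context recorded with the flow
    dst: str           # destination address


def responsible_parties(flow, context_to_slice, slice_db):
    """Resolve a logged flow to (slice, users, PI).

    context_to_slice: MA-side map of (node, context_id) -> slice name.
    slice_db:         SA-side map of slice name -> {"users": [...], "pi": ...}.
    Both structures are assumptions made for this sketch.
    """
    slice_name = context_to_slice[(flow.node, flow.context_id)]
    record = slice_db[slice_name]
    return slice_name, list(record["users"]), record["pi"]
```

Keeping the two maps separate mirrors the division of trust: the owner relies on the MA for the first lookup and on the SA for the second.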
Slice Creation
Components: PLC (SA); node with VMM, NM, VM …
PI: SliceCreate( ), SliceUsersAdd( )
User: SliceNodesAdd( ), SliceAttributeSet( ), SliceInstantiate( )
SliceGetAll( ): slices.xml
Slice Creation
Components: PLC (SA); node with VMM, NM, VM …
PI: SliceCreate( ), SliceUsersAdd( )
User: SliceAttributeSet( ), SliceGetTicket( )
(distribute ticket to slice creation service)
SliverCreate(ticket)
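The ticket-based path can be illustrated with a token that a node checks before creating a sliver. This is a sketch only: it assumes a ticket binds a slice name to an rspec and that the node can authenticate it. An HMAC over a shared key is used here purely for brevity and is not a claim about how PLC tickets are actually protected; the function names and signatures are hypothetical.

```python
import hashlib
import hmac
import json


def slice_get_ticket(sa_key, slice_name, rspec):
    """SA side: issue a ticket binding a slice name to an rspec."""
    payload = json.dumps({"slice": slice_name, "rspec": rspec}, sort_keys=True)
    tag = hmac.new(sa_key, payload.encode(), hashlib.sha256).hexdigest()
    return {"payload": payload, "tag": tag}


def sliver_create(sa_key, ticket, slivers):
    """Node side: authenticate the ticket, then record the sliver."""
    expected = hmac.new(sa_key, ticket["payload"].encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, ticket["tag"]):
        raise PermissionError("ticket not issued by the trusted SA")
    claim = json.loads(ticket["payload"])
    slivers[claim["slice"]] = claim["rspec"]
    return claim["slice"]
```

Because the node validates the ticket itself, the party that delivers it (here, a slice creation service) does not need to be trusted to state the slice's resources honestly.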
Brokerage Service
Components: PLC (SA); node with VMM, NM, VM …
PI: SliceCreate( ), SliceUsersAdd( )
Broker: SliceAttributeSet( ), SliceGetTicket( )
(distribute ticket to brokerage service)
rcap = PoolCreate(ticket)

Brokerage Service (cont)
Components: PLC (SA); node with VMM, NM, VM …
User to Broker: BuyResources( )
(broker contacts relevant nodes)
PoolSplit(rcap, slice, rspec)
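The pool operations can be sketched as node-side bookkeeping: creating a pool returns a capability, and splitting moves resources from the pool to a slice. This is a toy model under assumptions: the `NodePools` class is hypothetical, `pool_create` takes a plain resource dictionary rather than a ticket, and resources are simple integer counters.

```python
import secrets


class NodePools:
    """Hypothetical per-node record of resource pools and sliver grants."""

    def __init__(self):
        self._pools = {}    # rcap -> remaining resources in the pool
        self.slivers = {}   # slice name -> resources granted so far

    def pool_create(self, resources):
        """Create a pool and return an unguessable capability (rcap)."""
        rcap = secrets.token_hex(16)
        self._pools[rcap] = dict(resources)
        return rcap

    def pool_split(self, rcap, slice_name, rspec):
        """Move the resources named in rspec from the pool to a slice."""
        pool = self._pools[rcap]   # KeyError if the capability is unknown
        # Check everything first so a failed split leaves the pool unchanged.
        for name, amount in rspec.items():
            if pool.get(name, 0) < amount:
                raise ValueError(f"pool has insufficient {name}")
        grant = self.slivers.setdefault(slice_name, {})
        for name, amount in rspec.items():
            pool[name] -= amount
            grant[name] = grant.get(name, 0) + amount
        return dict(pool)
```

In this model a broker that holds the rcap can subdivide its pool among buyers without further involvement from the SA.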
Policy Proposals
Suspend a site’s slices while its nodes are down
Resource allocation to
–brokerage services
–long-running services
Encourage measurement experiments via ScriptRoute
–lower scheduling latency for select slices
Distinguish PL versus non-PL traffic
–remove per-node burst limits
–replace with sustained rate caps
–limit slices to 5GB/day to non-PL destinations (with exceptions)
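The proposed daily cap could be checked against aggregated flow logs roughly as follows. This is a sketch under assumptions: flows are given as (slice, destination, bytes) tuples for a single day, PlanetLab destinations are known as a set of addresses, "5GB" is read as decimal gigabytes, and the "with exceptions" clause is not modeled.

```python
DAILY_CAP_BYTES = 5 * 10**9   # "5GB/day"; decimal gigabytes assumed


def slices_over_cap(flows, planetlab_addrs, cap=DAILY_CAP_BYTES):
    """Return the slices whose traffic to non-PL destinations exceeds the cap.

    flows: iterable of (slice_name, dst_addr, nbytes) covering one day.
    planetlab_addrs: set of addresses that count as PL destinations.
    """
    totals = {}
    for slice_name, dst, nbytes in flows:
        if dst in planetlab_addrs:
            continue   # PL-to-PL traffic is not counted against the cap
        totals[slice_name] = totals.get(slice_name, 0) + nbytes
    return sorted(name for name, total in totals.items() if total > cap)
```

For example, a slice that sends 4 GB and then 2 GB to two outside hosts in one day would be reported, while a slice that sends 9 GB to another PlanetLab node would not.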