Global Overlay Network: PlanetLab Claudio E. Righetti 6 October 2005 (some slides taken from Larry Peterson)

“PlanetLab: An Overlay Testbed for Broad-Coverage Services”, Bavier, Bowman, Chun, Culler, Peterson, Roscoe, Wawrzoniak. ACM SIGCOMM Computer Communications Review, Volume 33, Number 3, July 2003
“Overcoming the Internet Impasse through Virtualization”, Anderson, Peterson, Shenker, Turner. IEEE Computer, April 2005
“Towards a Comprehensive PlanetLab Architecture”, Larry Peterson, Andy Bavier, Marc Fiuczynski, Steve Muir, and Timothy Roscoe, June 2005

Overview
1. What is PlanetLab?
2. Architecture
– Local: Nodes
– Global: Network
3. Details
– Virtual Machines
– Maintenance

What Is PlanetLab?
– Geographically distributed overlay network
– Testbed for broad-coverage network services

PlanetLab Goal “…to support seamless migration of an application from an early prototype, through multiple design iterations, to a popular service that continues to evolve.”

Priorities
Diversity of Network
– Geographic
– Links: edge sites, co-location and routing centers, homes (DSL, cable modem)
Flexibility
– Allow experimenters maximal control over PlanetLab nodes
– Securely and fairly

Architecture Overview
– Slice: a horizontal cut of global PlanetLab resources
– Service: a set of distributed and cooperating programs delivering some higher-level functionality
– Each service runs in a slice of PlanetLab

Services Run in Slices
[Figure sequence: PlanetLab nodes host virtual machines; Service/Slice A, Service/Slice B, and Service/Slice C each run in their own set of VMs spanning the nodes]

“… to view a slice as a network of Virtual Machines, with a set of local resources bound to each VM.”

Virtual Machine Monitor (VMM)
– Multiple VMs run on each PlanetLab node
– The VMM arbitrates the node’s resources among them

PlanetLab Architecture
Node-level
– several virtual machines on each node, each running a different service
– resources distributed fairly
– services are isolated from each other
Network-level
– node managers, agents, brokers, and service managers provide the interface and maintain PlanetLab

Per-Node View
[Figure: the Virtual Machine Monitor (VMM) hosts the Node Manager, a Local Admin VM, and service VMs 1 … n]

Node Architecture Goals
– Provide a virtual machine for each service running on a node
– Isolate virtual machines
– Allow maximal control over virtual machines
– Fair allocation of resources: network, CPU, memory, disk

PlanetLab’s design philosophy
– Application Programming Interface: used by typical services
– Protection Interface: implemented by the VMM
– PlanetLab node virtualization mechanisms are characterized by where these two interfaces are drawn

One Extreme: Software Runtimes (e.g., Java Virtual Machine, MS CLR)
– very high-level API
– depend on the OS to provide protection and resource allocation
– not flexible

Other Extreme: Complete Virtual Machine (e.g., VMware)
– very low-level API (hardware): maximum flexibility
– excellent protection
– high CPU/memory overhead: cannot share common resources (OS, common filesystem) among virtual machines
– a high-end commercial server hosts only tens of VMs

Mainstream Operating System
– API and protection at the same level (system calls)
– simple implementation (e.g., slice = process group)
– efficient use of resources (shared memory, common OS)
– bad protection and isolation
– maximum control and security?

PlanetLab Virtualization: VServers
– kernel patch to a mainstream OS (Linux)
– gives the appearance of a separate kernel for each virtual machine
– root privileges restricted to activities that do not affect other vservers
– some modifications: resource-control (e.g., file handles, port numbers) and protection facilities added

PlanetLab Network Architecture
Node Manager (one per node)
– creates slices for service managers when they present valid tickets
– allocates resources for vservers
Resource Monitor (one per node)
– tracks the node’s available resources
– tells agents about available resources

PlanetLab Network Architecture
Agent (centralized)
– tracks nodes’ free resources
– advertises resources to resource brokers
– issues tickets to resource brokers; tickets may be redeemed with node managers to obtain the resource

PlanetLab Network Architecture
Resource Broker (per service)
– obtains tickets from agents on behalf of service managers
Service Manager (per service)
– obtains tickets from the broker
– redeems tickets with node managers to acquire resources
– if resources can be acquired, starts the service

Obtaining a Slice
[Figure sequence: each node’s Resource Monitor reports its free resources to the Agent; the Agent issues tickets to the Broker; the Service Manager obtains a ticket from the Broker and redeems it with the Node Manager on each chosen node to obtain its slice]
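As a rough illustration of the exchange in the figure, the following Python sketch models the flow; the classes, method names, and ticket/rspec shapes are hypothetical, not the actual PlanetLab interfaces.

# Hedged sketch of the ticket flow above. All names and data shapes are
# illustrative, not the real PlanetLab interfaces.

class Agent:
    """Centralized agent: learns free resources from Resource Monitors and issues tickets."""
    def __init__(self):
        self.free = {}                                   # node -> advertised free resources

    def advertise(self, node, resources):                # called on behalf of a node's Resource Monitor
        self.free[node] = dict(resources)

    def issue_ticket(self, node, rspec):                 # called by a resource broker
        avail = self.free.get(node, {})
        if all(avail.get(k, 0) >= v for k, v in rspec.items()):
            return {"node": node, "rspec": rspec}        # unsigned ticket, for illustration only
        return None


class NodeManager:
    """Per-node manager: redeems a valid ticket by creating a vserver with those resources."""
    def redeem(self, ticket):
        print("creating vserver on", ticket["node"], "with", ticket["rspec"])
        return True


# Service-manager side: a broker obtains a ticket, the service manager redeems it.
agent = Agent()
agent.advertise("planetlab1.example.org", {"cpu_share": 64, "mem_mb": 512})
ticket = agent.issue_ticket("planetlab1.example.org", {"cpu_share": 32, "mem_mb": 128})
if ticket is not None:
    NodeManager().redeem(ticket)                         # the service starts once resources are acquired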

PlanetLab Today

PlanetLab Today
Global distributed systems infrastructure
– platform for long-running services
– testbed for network experiments
583 nodes around the world
– 30 countries
– 250+ institutions (universities, research labs, gov’t)
Standard PC servers
– 150–200 users per server
– 30–40 active per hour, 5–10 at any given time
– memory, CPU both heavily over-utilised

Node Software
Linux Fedora Core 2
– kernel being upgraded to FC4
– always up to date with security-related patches
VServer patches provide security
– each user gets own VM (‘slice’)
– limited root capabilities
CKRM/VServer patches provide resource management
– proportional-share CPU scheduling
– hierarchical token bucket controls network Tx bandwidth
– physical memory limits
– disk quotas

Issues
– Multiple VM types: Linux vservers, Xen domains
– Federation: EU, Japan, China
– Resource allocation: policy, markets
– Infrastructure services: delegation
Need to define the PlanetLab Architecture

Key Architectural Ideas
– Distributed virtualization: slice = set of virtual machines
– Unbundled management: infrastructure services run in their own slice
– Chain of responsibility: account for the behavior of third-party software; manage trust relationships

N x N Trust Relationships
[Figure: sites (Princeton, Berkeley, Washington, MIT, Brown, CMU, NYU, ETH, Harvard, HP Labs, Intel, NEC Labs, Purdue, UCSD, SICS, Cambridge, Cornell, …) on one side and slices (princeton_codeen, nyu_d, cornell_beehive, att_mcash, cmu_esm, harvard_ice, hplabs_donutlab, idsl_psepr, irb_phi, paris6_landmarks, mit_dht, mcgill_card, huji_ender, arizona_stork, ucb_bamboo, ucsd_share, umd_scriptroute, …) on the other, with PLC as a Trusted Intermediary replacing N x N pairwise trust relationships]

Principals
Node Owners
– host one or more nodes (retain ultimate control)
– select an MA and approve of one or more SAs
Service Providers (Developers)
– implement and deploy network services
– responsible for the service’s behavior
Management Authority (MA)
– installs and maintains software on nodes
– creates VMs and monitors their behavior
Slice Authority (SA)
– registers service providers
– creates slices and binds them to the responsible provider

Trust Relationships
[Figure: Owner, MA, SA, and Provider linked by the following trust relationships]
(1) Owner trusts MA to map network activity to the responsible slice
(2) Owner trusts SA to map each slice to its responsible providers
(3) Provider trusts SA to create VMs on its behalf
(4) Provider trusts MA to provide working VMs and not falsely accuse it
(5) SA trusts Provider to deploy responsible services
(6) MA trusts Owner to keep nodes physically secure

Architectural Elements
[Figure: the MA maintains a node database and supplies the NM + VMM running on each Node Owner’s node; the SA maintains a slice database; an SCS (slice creation service) VM on the node creates Service Providers’ VMs on the SA’s behalf]

Narrow Waist
– Name space for slices
– Node Manager Interface
rspec = < vm_type = linux_vserver, cpu_share = 32, mem_limit = 128MB, disk_quota = 5GB, base_rate = 1Kbps, burst_rate = 100Mbps, sustained_rate = 1.5Mbps >
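To make the interface concrete, here is a hedged Python sketch of a slice-creation service handing such an rspec to a node’s Node Manager over XML-RPC. The SliverCreate(ticket) call mirrors the later slides; the endpoint URL, port, and ticket structure are assumptions for illustration only.

# Hedged sketch: passing an rspec to a node's Node Manager. The method name
# SliverCreate appears on later slides; the URL, port, and ticket structure
# below are assumptions, not the documented NM API.
import xmlrpc.client

rspec = {
    "vm_type": "linux_vserver",
    "cpu_share": 32,
    "mem_limit": "128MB",
    "disk_quota": "5GB",
    "base_rate": "1Kbps",
    "burst_rate": "100Mbps",
    "sustained_rate": "1.5Mbps",
}

nm = xmlrpc.client.ServerProxy("https://planetlab1.example.org:812/")   # hypothetical endpoint
ticket = {"slice_name": "example_slice", "rspec": rspec}                # unsigned, illustrative
nm.SliverCreate(ticket)                                                 # create the sliver described by the rspec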

Node Boot/Install Process (node and PLC boot server)
1. Node boots from BootCD (Linux loaded)
2. Hardware initialized
3. Network config. read from floppy
4. Node contacts PLC (MA)
5. PLC boot server sends the boot manager
6. Node executes the boot manager
7. Node key read into memory from floppy
8. Boot manager invokes the Boot API
9. PLC verifies the node key and sends the current node state
10. If state = “install”, run the installer
11. Node state updated via the Boot API
12. PLC verifies the node key and changes state to “boot”
13. Chain-boot the node (no restart)
14. Node booted

PlanetFlow
Logs every outbound IP flow on every node
– accesses ulogd via Proper
– retrieves packet headers, timestamps, context ids (batched)
– used to audit traffic
Aggregated and archived at PLC

Chain of Responsibility
– Join Request: PI submits Consortium paperwork and requests to join
– PI Activated: PLC verifies the PI, activates the account, enables the site (logged)
– User Activated: users create accounts with keys, PI activates the accounts (logged)
– Slice Created: PI creates a slice and assigns users to it (logged)
– Nodes Added to Slices: users add nodes to their slice (logged)
– Slice Traffic Logged: experiments run on nodes and generate traffic (logged by Netflow)
– Traffic Logs Centrally Stored: PLC periodically pulls traffic logs from nodes
Network activity can thus be traced to a slice, and from the slice to the responsible users and PI.

Slice Creation
[Figure: the PI calls SliceCreate() and SliceUsersAdd() at PLC (the slice authority); a user calls SliceNodesAdd(), SliceAttributeSet(), and SliceInstantiate(); each node’s NM fetches the resulting slices.xml via SliceGetAll() and has the VMM create the slice’s VM]

Slice Creation
[Figure: as above, the PI calls SliceCreate() and SliceUsersAdd() and a user calls SliceAttributeSet(); the user then calls SliceGetTicket() and distributes the ticket to a slice creation service, which calls SliverCreate(ticket) on each node’s NM to create the VM]
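A hedged Python sketch of this call sequence from a client’s point of view. The method names are the ones shown on the slides; the PLC endpoint, authentication structure, and argument shapes are assumptions.

# Hedged sketch of the slice-creation calls above, issued to the PLC slice
# authority over XML-RPC. Method names follow the slides; the endpoint URL,
# auth structure, and argument shapes are assumptions.
import xmlrpc.client

plc = xmlrpc.client.ServerProxy("https://www.planet-lab.org/PLCAPI/")      # assumed endpoint
auth = {"Username": "pi@example.edu", "AuthString": "not-a-real-secret"}   # placeholder credentials

plc.SliceCreate(auth, "example_slice")
plc.SliceUsersAdd(auth, "example_slice", ["user@example.edu"])
plc.SliceNodesAdd(auth, "example_slice", ["planetlab1.example.org"])
plc.SliceAttributeSet(auth, "example_slice", "cpu_share", "32")

# Variant 1: PLC instantiates the slice itself; node managers pick it up via
# SliceGetAll() / slices.xml.
plc.SliceInstantiate(auth, "example_slice")

# Variant 2: fetch a ticket and hand it to a slice creation service, which
# calls SliverCreate(ticket) on each node's Node Manager.
ticket = plc.SliceGetTicket(auth, "example_slice")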

Brokerage Service
[Figure: the PI calls SliceCreate() and SliceUsersAdd(); SliceAttributeSet() and SliceGetTicket() are called for the broker’s slice, and the ticket is distributed to the brokerage service, which calls rcap = PoolCreate(ticket) on each node’s NM to set aside a pool of resources]

Brokerage Service (cont)
[Figure: a user calls BuyResources() at the Broker; the Broker contacts the relevant nodes and calls PoolSplit(rcap, slice, rspec) on each NM to carve the purchased resources out of its pool and assign them to the user’s slice VM]

VIRTUAL MACHINES

PlanetLab Virtual Machines: VServers
Extend the idea of chroot(2)
– new vserver created by a system call
– descendant processes inherit the vserver
– unique filesystem, SYSV IPC, UID/GID space
– limited root privilege: cannot control the host node
– entering a vserver is irreversible

Scalability
Reduce disk footprint using copy-on-write
– immutable flag provides file-level CoW
– vservers share a 508 MB base filesystem; each additional vserver takes 29 MB
Increase limits on kernel resources (e.g., file descriptors)
– is the kernel designed to handle this? (inefficient data structures?)
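A quick back-of-the-envelope check of those numbers (Python):

# Disk footprint for N vservers: one shared 508 MB base filesystem plus 29 MB
# per vserver (copy-on-write), compared with N full copies of the base image.
def footprint_cow_mb(n, base_mb=508, per_vserver_mb=29):
    return base_mb + n * per_vserver_mb

def footprint_full_mb(n, base_mb=508):
    return n * base_mb

for n in (10, 100, 1000):
    print(f"{n:5d} vservers: {footprint_cow_mb(n):7d} MB with CoW vs {footprint_full_mb(n):7d} MB with full copies")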

Protected Raw Sockets
Services may need low-level network access
– cannot allow them access to other services’ packets
Provide “protected” raw sockets
– TCP/UDP sockets bound to a local port
– incoming packets delivered only to the service with the corresponding port registered
– outgoing packets scanned to prevent spoofing
ICMP also supported
– a 16-bit identifier placed in the ICMP header
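For a feel of what a service using a protected ICMP raw socket does, here is a hedged Python sketch: it builds an echo request whose 16-bit identifier is the field a node can use to demultiplex replies to the right slice. This is standard raw-socket code (and needs raw-socket privilege); nothing here is a PlanetLab-specific API.

# Hedged sketch: an ICMP echo request carrying a 16-bit identifier, the field
# a node can key on to deliver replies to the registered service.
import os
import socket
import struct

def checksum(data):
    # RFC 1071 Internet checksum over 16-bit words.
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    total = (total >> 16) + (total & 0xFFFF)
    total += total >> 16
    return ~total & 0xFFFF

def icmp_echo_request(identifier, seq=1, payload=b"planetlab"):
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, seq)      # type 8 = echo request, checksum 0
    csum = checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, csum, identifier, seq) + payload

ident = os.getpid() & 0xFFFF                                       # 16-bit identifier for this service
sock = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_ICMP)  # needs raw-socket privilege
sock.sendto(icmp_echo_request(ident), ("192.0.2.1", 0))            # 192.0.2.1 is a documentation address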

Resource Limits
Node-wide cap on outgoing network bandwidth
– protect the world from PlanetLab services
Isolation between vservers: two approaches
– Fairness: each of N vservers gets 1/N of the resources during contention
– Guarantees: each slice reserves a certain amount of resources (e.g., 1 Mbps bandwidth, 10 Mcps CPU); left-over resources distributed fairly
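A small worked example of that policy in Python (illustrative only, not PlanetLab’s actual scheduler): reservations are honoured first, then the left-over capacity is split evenly.

# Worked example of "guarantees first, leftover shared fairly".
def allocate(total_kbps, reservations):
    """reservations: sliver name -> guaranteed kbps (0 for purely best-effort slivers)."""
    guaranteed = sum(reservations.values())
    leftover = max(total_kbps - guaranteed, 0)
    fair_share = leftover / len(reservations)
    return {name: reserved + fair_share for name, reserved in reservations.items()}

# A 10 Mbps node-wide cap, one sliver with a 1 Mbps guarantee, three best-effort slivers:
print(allocate(10_000, {"slice_a": 1_000, "slice_b": 0, "slice_c": 0, "slice_d": 0}))
# -> slice_a gets 1000 + 2250 = 3250 kbps, the other slivers 2250 kbps each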

Linux and CPU Resource Management
– The Linux scheduler provides fairness by process, not by vserver: a vserver with many processes hogs the CPU
– No current way for the scheduler to provide guaranteed slices of CPU time

MANAGEMENT SERVICES

PlanetLab Network Management
1. PlanetLab nodes boot a small Linux OS from CD and run on a RAM disk
2. The node contacts a bootserver
3. The bootserver sends a (signed) startup script telling the node to boot normally, write a new filesystem, or start sshd for remote PlanetLab admin login
Nodes can also be remotely power-cycled

Dynamic Slice Creation
1. The Node Manager verifies tickets from the service manager
2. Creates a new vserver
3. Creates an account on the node and on the vserver

User Logs in to a PlanetLab Node
/bin/vsh immediately:
1. switches to the account’s associated vserver
2. chroot()s to the associated root directory
3. relinquishes true root privileges
4. switches UID/GID to the account on the vserver
The transition to the vserver is transparent: it appears the user just logged into the PlanetLab node directly.
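In ordinary POSIX terms, steps 2–4 look roughly like the Python sketch below. It is illustrative only: the real /bin/vsh also switches the process into the vserver’s security context via the vserver kernel interface, which has no portable equivalent here, and the paths and IDs shown are hypothetical.

# Hedged sketch of steps 2-4 using plain POSIX calls (illustrative only).
import os

def enter_vserver(root_dir, uid, gid, shell="/bin/sh"):
    os.chroot(root_dir)            # 2. confine the session to the vserver's root directory
    os.chdir("/")
    os.setgid(gid)                 # drop the group first ...
    os.setuid(uid)                 # 3./4. ... then the user: true root privileges are relinquished
    os.execv(shell, [shell])       # the user appears to have logged into the node directly

# enter_vserver("/vservers/example_slice", uid=10001, gid=10001)   # hypothetical path and IDs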