1
PlanetLab: Catalyzing Network Innovation
October 2, 2007
Larry Peterson, Princeton University
Timothy Roscoe, Intel Research at Berkeley
2
Challenges

Security
– known vulnerabilities lurking in the Internet
  - DDoS, worms, malware
– addressing security comes at a significant cost
  - federal government spent $5.4B in 2004
  - estimated $50-100B spent worldwide on security in 2004

Reliability
– e-commerce increasingly depends on a fragile Internet
  - much less reliable than the phone network (three vs. five 9's)
  - risks in using the Internet for mission-critical operations
  - barrier to ubiquitous VoIP
– an issue of ease-of-use for everyday users
3
Challenges (cont)

Scale & Diversity
– the whole world is becoming networked
  - sensors, consumer electronic devices, embedded processors
– assumptions about edge devices (hosts) no longer hold
  - connectivity, power, capacity, mobility, …

Performance
– scientists have significant bandwidth requirements
  - each e-science community covets its own wavelength(s)
– purpose-built solutions are not cost-effective
  - being on the "commodity path" makes an effort sustainable
4
Two Paths

Incremental
– apply point solutions to the current architecture

Clean-Slate
– replace the Internet with a new network architecture

We can't be sure the first path will fail, but…
– point solutions result in increased complexity
  - making the network harder to manage
  - making the network more vulnerable to attacks
  - making the network more hostile to new applications
– architectural limits may lead to a dead end
5
Architectural Limits

Minimize trust assumptions
– the Internet originally viewed network traffic as fundamentally cooperative, but should view it as adversarial

Enable competition
– the Internet was originally developed independent of any commercial considerations, but today the network architecture must take competition and economic incentives into account

Allow for edge diversity
– the Internet originally assumed host computers were connected to the edges of the network, but host-centric assumptions are not appropriate in a world with an increasing number of sensors and mobile devices
6
Limits (cont)

Design for network transparency
– the Internet originally did not expose information about its internal configuration, but there is value to both users and network administrators in making the network more transparent

Enable new network services
– the Internet originally provided only a best-effort packet delivery service, but there is value in making processing capability and storage capacity available in the middle of the network

Integrate with optical transport
– the Internet originally drew a sharp line between the network and the underlying transport facility, but allowing bandwidth aggregation and traffic engineering to be first-class abstractions has the potential to improve efficiency and performance
7
Barriers to the Second Path

The Internet has become ossified
– no competitive advantage to architectural change
– no obvious deployment path

Inadequate validation of potential solutions
– simulation models are too simplistic
– little or no real-world experimental evaluation

The testbed dilemma
– production testbeds: real users, but only incremental change
– research testbeds: radical change, but no real users
8
Recommendation

It is time for the research community, federal government, and commercial sector to jointly pursue the second path. This involves experimentally validating new network architecture(s), and doing so in a sustainable way that fosters widespread deployment.
9
Approaches

Revisiting the definition & placement of function
– naming, addressing, and location
– routing, forwarding, and addressing
– management, control, and data planes
– end hosts, routers, and operators

Designing with new constraints in mind
– selfish and adversarial participants
– mobile hosts and disconnected operation
– large numbers of small, low-power devices
– ease of network management
10
Deployment Story

Old model
– global uptake of new technology
– does not work due to ossification

New model
– incremental deployment via user opt-in
– lowering the barrier to entry makes deployment plausible

The process by which we define the new architecture
– purists: settle on a single common architecture
  - virtualization is a means
– pluralists: a multiplicity of continually evolving elements
  - virtualization is an end

What architecture do we deploy?
– research happens…
11
Validation
[Diagram: Analysis (models) → Simulation / Emulation (code) → Experiment at Scale with Real Users (results) → Deployment (measurements), with a gap before deployment]
12
PlanetLab

What is PlanetLab? An open, shared testbed for developing, deploying, and accessing planetary-scale services.

What would you do if you had Akamai's infrastructure?
13
PlanetLab Motivation

– A new class of applications is emerging that spreads over a sizable fraction of the web
– Architectural components are starting to emerge
– The next Internet will be created as an overlay on the current one
– It will be defined by services, not transport
– There is NO vehicle to try out the next n great ideas in this area
14
PlanetLab Guidelines (1)

A thousand viewpoints on "the cloud" are what matters
– not the thousand servers
– not the routers, per se
– not the pipes
15
PlanetLab Guidelines (2)

…and you must have the vantage points of the crossroads
– co-location centers, peering points, etc.
16
PlanetLab Guidelines (3)

Each service needs an overlay covering many points
– logically isolated

Many concurrent services and applications
– must be able to slice nodes => a VM per service
– each service has a slice across a large subset of nodes

Must be able to run each service/app over a long period to build a meaningful workload
– traffic capture/generation must be part of the facility

Consensus on "a node" is more important than "which node"
17
PlanetLab Guidelines (4)

The testbed as a whole must be up a lot
– global remote administration and management
– internal redundancy

Each service will require its own management capability

Testbed nodes cannot "bring down" their site
– not on the forwarding path

The relationship to firewalls and proxies is key
18
PlanetLab Guidelines (5)

Storage has to be a part of it
– edge nodes have significant capacity

Needs a basic, well-managed capability
19
PlanetLab

Initial core team:
– Intel Research: David Culler, Timothy Roscoe, Brent Chun, Mic Bowman
– Princeton: Larry Peterson, Mike Wawrzoniak
– University of Washington: Tom Anderson, Steven Gribble
20
PlanetLab

1000+ machines spanning 500 sites and 40 countries

Supports distributed virtualization
– each of 600+ network services runs in its own slice
21
Requirements

1) It must provide a global platform that supports both short-term experiments and long-running services.
– services must be isolated from each other
– multiple services must run concurrently
– must support real client workloads
22
Requirements

2) It must be available now, even though no one knows for sure what "it" is.
– deploy what we have today, and evolve over time
– make the system as familiar as possible (e.g., Linux)
– accommodate third-party management services
23
Requirements

3) We must convince sites to host nodes running code written by unknown researchers from other organizations.
– protect the Internet from PlanetLab traffic
– must get the trust relationships right
24
Requirements

4) Sustaining growth depends on support for site autonomy and decentralized control.
– sites have final say over the nodes they host
– must minimize (eliminate) centralized control
25
Requirements

5) It must scale to support many users with minimal resources available.
– expect the under-provisioned state to be the norm
– shortage of logical resources too (e.g., IP addresses)
26
Design Challenges

– Minimize centralized control without violating trust assumptions.
– Balance the need for isolation with the reality of scarce resources.
– Maintain a stable and usable system while continuously evolving it.
27
Key Architectural Ideas

Distributed virtualization
– slice = a set of virtual machines

Unbundled management
– infrastructure services run in their own slices

Chain of responsibility
– account for the behavior of third-party software
– manage trust relationships
28
PlanetLab Implementation: Research Issues

– Sliceability: distributed virtualization
– Isolation and resource control
– Security and integrity: exposed machines
– Management of a very large, widely dispersed system
– Instrumentation and measurement
– Building blocks and primitives
29
Slice-ability

Each service runs in a slice of PlanetLab
– a distributed set of resources (a network of virtual machines)
– allows services to run continuously

A VM monitor on each node enforces slices
– limits the fraction of node resources consumed
– limits the portion of name spaces consumed

Issue: global resource discovery
– how do applications specify their requirements?
– how do we map these requirements onto a set of nodes? (see the sketch below)
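The deck leaves resource discovery as an open issue. As a purely illustrative sketch (the requirement fields, node attributes, and selection policy below are all invented, not part of any PlanetLab interface), a discovery service might map a slice's requirements onto nodes like this:

```python
# Hypothetical sketch of mapping a slice's resource requirements onto nodes.
# None of these structures come from the real PlanetLab API; they only
# illustrate the "requirements -> set of nodes" problem named on the slide.

REQUIREMENTS = {
    "min_free_mem_mb": 256,   # per-VM memory the service needs
    "min_bw_kbps": 1000,      # sustained bandwidth per node
    "node_count": 50,         # desired size of the slice
}

def select_nodes(candidates, req):
    """Pick up to node_count nodes that satisfy the per-node requirements."""
    eligible = [n for n in candidates
                if n["free_mem_mb"] >= req["min_free_mem_mb"]
                and n["avail_bw_kbps"] >= req["min_bw_kbps"]]
    # Prefer the least-loaded nodes so the slice starts with headroom.
    eligible.sort(key=lambda n: n["load"])
    return eligible[:req["node_count"]]

if __name__ == "__main__":
    candidates = [
        {"hostname": "node1.example.edu", "free_mem_mb": 512,
         "avail_bw_kbps": 5000, "load": 0.3},
        {"hostname": "node2.example.edu", "free_mem_mb": 128,
         "avail_bw_kbps": 8000, "load": 0.1},
    ]
    for node in select_nodes(candidates, REQUIREMENTS):
        print(node["hostname"])
```

The slide's second question (freshness of node state) is why such a policy would likely run against recently reported measurements rather than a static database.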
30
Slices
33
User Opt-in
[Diagram: a client opts in to a service by naming it in the URL; fetching http://coblitz.org/www.princeton.edu/podcast.mp4 routes the request through the overlay service rather than directly to the origin server]
34
Per-Node View
[Diagram: a Virtual Machine Monitor (VMM) hosting a Node Manager, a Local Admin VM, and service VMs VM 1, VM 2, … VM n]
35
Global View
[Diagram: slices spanning virtual machines on nodes across many sites, coordinated by PLC]
36
Exploit Layer 2 Circuits
– deployed in NLR & Internet2 (aka VINI)
37
Circuits (cont)
– supports arbitrary virtual topologies
38
Circuits (cont)
– exposes (and can inject) network failures
39
Circuits (cont)
– participates in Internet routing via BGP
40
Distributed Control of Resources

At least two interested parties
– service producers (researchers)
  - decide how their services are deployed over available nodes
– service consumers (users)
  - decide what services run on their nodes

At least two contributing factors
– fair slice allocation policy
  - both local and global components (see above)
– knowledge about node state
  - freshest at the node itself
41
Unbundled Management

Partition management into orthogonal services
– resource discovery
– monitoring node health
– topology management
– managing user accounts and credentials
– software distribution

Issues
– management services run in their own slices
– allow competing alternatives
– engineer for innovation (define minimal interfaces)
42
Application-Centric Interfaces

Inherent problems
– a stable platform versus research into platforms
– writing applications for temporary testbeds
– integrating testbeds with desktop machines

Approach
– adopt a popular API (Linux) and evolve the implementation
– eventually separate the isolation and application interfaces
– provide a generic "shim" library for desktops
43
Virtual Machines

Security
– prevent unauthorized access to state

Familiar API
– forcing users to accept a new API is death

Isolation
– contain resource consumption

Performance
– don't want to be apologetic
44
Virtualization
[Diagram: a Virtual Machine Monitor hosting a Node Manager, an Owner VM, and service VMs VM 1, VM 2, … VM n, with auditing, monitoring, brokerage, and provisioning services alongside]

The VMM is a Linux kernel (Fedora Core)
+ Vservers (namespace isolation)
+ schedulers (performance isolation)
+ VNET (network virtualization)
45
Resource Allocation

Decouple slice creation and resource allocation
– each slice is given a "fair share" (1/Nth) by default when created
– slices acquire/release additional resources over time
  - including resource guarantees

Protect against thrashing and over-use
– link bandwidth
  - upper bound on the sustained rate (protects campus bandwidth)
– memory
  - kill the largest user of physical memory when swap reaches 85% (see the sketch below)
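A minimal sketch of the memory policy above, in the spirit of the swap monitor mentioned later in this deck (pl_mom/swapmon.py); the psutil dependency, the per-slice process accounting, and the reset hook are assumptions for illustration, not the production implementation:

```python
# Illustrative sketch of the "kill the largest memory user at 85% swap"
# policy; the production version lives in pl_mom/swapmon.py and differs.
import psutil  # assumed third-party dependency, not part of PlanetLab

SWAP_THRESHOLD = 0.85  # fraction of swap in use that triggers the reaper

def largest_memory_slice(slice_pids):
    """Return the slice using the most physical memory.

    slice_pids maps a slice name to the PIDs of its processes.
    """
    usage = {}
    for name, pids in slice_pids.items():
        rss = 0
        for pid in pids:
            try:
                rss += psutil.Process(pid).memory_info().rss
            except psutil.NoSuchProcess:
                continue  # process exited between listing and sampling
        usage[name] = rss
    return max(usage, key=usage.get)

def check_swap(slice_pids, reset_slice):
    swap = psutil.swap_memory()
    if swap.total and swap.used / swap.total >= SWAP_THRESHOLD:
        # Reap the worst offender rather than letting the node thrash.
        reset_slice(largest_memory_slice(slice_pids))
```

Killing the single largest consumer, rather than rationing everyone, matches the deck's "best effort + overload protection" stance on scarce resources.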
46
PlanetLab: Confluence of Technologies

– cluster-based management
– overlay and P2P networks
– virtual machines and sandboxing
– service composition frameworks
– Internet measurement
– packet processors
– colo services
– web services

The time is now.
47
Usage Stats

– Users: 2500+
– Slices: 600+
– Long-running services: ~20
  - content distribution, scalable large file transfer, multicast, pub-sub, routing overlays, anycast, …
– Bytes per day: 4 TB
  - 1 Gbps peak rates are not uncommon
– Unique IP addresses per day: 1M
48
Validation
[Diagram, repeated from earlier: Analysis (models) → Simulation / Emulation (code) → Experiment at Scale with Real Users (results) → Deployment (measurements), with a gap before deployment]
49
Deployment Gap
[Diagram: a maturity-vs-time pipeline: Analysis (MatLab) → Controlled Experiment (EmuLab) → Deployment Study (PlanetLab) → Pilot Demonstration (PL Gold) → Commercial Adoption, crossing from ideas to implementation reality, user & network reality, and economic reality]
50
PlanetLab

Emerging applications
– content distribution
– peer-to-peer networks
– global storage
– mobility services
– etc.

A vibrant research community is embarking on a new direction, and none of them can try out their ideas.
51
Trust Relationships
[Diagram: without an intermediary, N sites (Princeton, Berkeley, Washington, MIT, Brown, CMU, NYU, ETH, Harvard, HP Labs, Intel, NEC Labs, Purdue, UCSD, SICS, Cambridge, Cornell, …) and N slices (princeton_codeen, nyu_d, cornell_beehive, att_mcash, cmu_esm, harvard_ice, hplabs_donutlab, idsl_psepr, irb_phi, paris6_landmarks, mit_dht, mcgill_card, huji_ender, arizona_stork, ucb_bamboo, ucsd_share, umd_scriptroute, …) would need N x N trust relationships; a trusted intermediary (PLC) reduces this]
52
Principals

Node Owners
– host one or more nodes (retain ultimate control)
– select an MA and approve one or more SAs

Service Providers (Developers)
– implement and deploy network services
– are responsible for each service's behavior

Management Authority (MA)
– installs and maintains software on nodes
– creates VMs and monitors their behavior

Slice Authority (SA)
– registers service providers
– creates slices and binds them to the responsible provider
53
Trust Relationships

(1) The owner trusts the MA to map network activity to the responsible slice.
(2) The owner trusts the SA to map each slice to its responsible providers.
(3) The provider trusts the SA to create VMs on its behalf.
(4) The provider trusts the MA to provide working VMs and not falsely accuse it.
(5) The SA trusts the provider to deploy responsible services.
(6) The MA trusts the owner to keep nodes physically secure.
54
Architectural Elements
[Diagram: the MA maintains a node database and runs the NM + VMM on each owner's node; the SA maintains a slice database and creates VMs for each service provider via the slice creation service (SCS)]
55
Slice Creation
[Diagram: a PI calls SliceCreate( ) and SliceUsersAdd( ) at PLC (acting as SA); a user or agent calls GetTicket( ) and redeems the ticket with the slice creation service (plc.scs) on each node, which calls CreateVM(slice) on the local node manager inside the VMM]
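As an illustrative client-side sketch of this flow (the call names come from the slide, but the endpoint URLs, authentication structure, argument shapes, and port are invented, not the documented PLC interface):

```python
# Hypothetical client-side view of the slice-creation flow on the slide.
# The method names (SliceCreate, SliceUsersAdd, GetTicket) are from the
# deck; the URLs, auth structure, and argument shapes are invented here.
import xmlrpc.client

plc = xmlrpc.client.ServerProxy("https://plc.example.org/API/")
auth = {"Username": "pi@example.edu", "AuthString": "secret"}  # assumed shape

# The PI registers the slice and its users with the slice authority (PLC).
plc.SliceCreate(auth, "example_slice")
plc.SliceUsersAdd(auth, "example_slice", ["alice@example.edu"])

# A user or agent obtains a ticket for the slice...
ticket = plc.GetTicket(auth, "example_slice")

# ...and redeems it with the slice creation service (plc.scs) on each node,
# which asks the local node manager to run CreateVM(slice).
for hostname in ("node1.example.edu", "node2.example.edu"):
    scs = xmlrpc.client.ServerProxy("https://%s:812/" % hostname)  # port assumed
    scs.RedeemTicket(ticket)
```

The ticket indirection is what lets slice creation be delegated: the central authority signs intent once, and each node validates it locally.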
56
Brokerage Service
[Diagram: a user calls BuyResources( ) at a broker; the broker contacts the relevant nodes and calls Bind(slice, pool) at PLC (SA), binding the slice to a resource pool in each node's VMM]
57
PlanetLab: Two Perspectives

– A useful research platform
– A prototype of a new network architecture
58
What are people doing in/on/with/around PlanetLab?

1. Network measurement
2. Application-level multicast
3. Distributed Hash Tables
4. Storage
5. Resource Allocation
6. Distributed Query Processing
7. Content Distribution Networks
8. Management and Monitoring
9. Overlay Networks
10. Virtualization and Isolation
11. Router Design
12. Testbed Federation
13. …
59
Lessons Learned

Trust relationships
– owners, operators, developers

Virtualization
– scalability is critical
– the control plane and node OS are orthogonal
– least privilege in support of management functionality

Decentralized control
– owner autonomy
– delegation

Resource allocation
– decouple slice creation and resource allocation
– best effort + overload protection

Evolve based on experience
– support users quickly
60
Conclusions

– Innovation can come from anywhere
– Much of the Internet's success can be traced to its support for innovation "at the edges"
– There is currently a high barrier to entry for innovating "throughout the net"
– One answer is a network substrate that supports "on demand, customizable networks"
  - enables research
  - supports continual innovation and evolution
61
PlanetLab Software Overview
Mark Huang
mlhuang@cs.princeton.edu
62
Node Software

Boot
– Boot CD
– Boot Manager

Virtualization
– Linux kernel
– VServer
– VNET

Node Management
– Node Manager
– NodeUpdate
– PlanetLabConf

Slice Management
– Slice Creation Service
– Proper

Monitoring
– PlanetFlow
– pl_mom
63
PLC Software

Database server
– pl_db

PLCAPI server
– plc_api

Web server
– website PHP
– scripts

Boot server
– PlanetLabConf scripts

PlanetFlow archive

Mail, Support (RT), DNS, Monitor, Build, CVS, QA
64
Boot Manager

– bootmanager/source/
  - main BootManager class, authentication, utility functions, configuration, etc.
– bootmanager/source/steps/
  - individual "steps" of the install/boot process
– bootmanager/support-files/
  - bootstrap tarball generation
  - legacy support for old Boot CDs
65
Virtualization

Linux kernel
– Fedora Core 8 kernel
  - VServer patch

VServer
– util-vserver/
  - userspace VServer management utilities and libraries

VNET
– Linux kernel module
– intercepts bind() and other socket calls
– intercepts and marks all IP packets
– implements TUN/TAP and proxy socket extensions
66
Node Management

Node Manager (pl_nm)
– sidewinder/
  - thin XML-RPC shim around VServer (or other VMM) syscalls and other knobs
– util-python/
  - miscellaneous Python utility functions
– util-vserver/python/
  - Python bindings for VServer syscalls

Node Update
– NodeUpdate/
  - wrapper around yum for keeping node RPMs up to date

PlanetLabConf
– PlanetLabConf/
  - pull-based configuration file distribution service (see the sketch below)
  - most files are dynamically generated on a per-node or per-node-group basis
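A minimal sketch of the pull model described above, assuming a hypothetical URL scheme and file mapping; the real PlanetLabConf generates most files per node or per node group on the server side, and its actual interface differs:

```python
# Illustrative pull-based config distribution loop; the URL layout and
# file mapping are invented, not PlanetLabConf's actual interface.
import time
import urllib.request

BOOT_SERVER = "https://boot.example.org/PlanetLabConf"  # assumed URL
NODE = "node1.example.edu"
FILES = {"/etc/resolv.conf": "resolv.conf.php"}  # assumed dest -> source map
POLL_INTERVAL = 3600  # nodes pull periodically; the server never pushes

def pull_once():
    for dest, source in FILES.items():
        # The server generates the file for this specific node on demand.
        url = "%s/%s?node=%s" % (BOOT_SERVER, source, NODE)
        body = urllib.request.urlopen(url).read()
        with open(dest, "wb") as f:  # runs as root on the node
            f.write(body)

while True:
    pull_once()
    time.sleep(POLL_INTERVAL)
```

Pulling rather than pushing keeps nodes behind firewalls and NATs manageable, consistent with the deck's remote-administration guideline.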
67
Slice Management

Slice Creation Service (pl_conf)
– sidewinder/
  - runs in a slice
  - periodically downloads slices.xml from the boot server
  - local XML-RPC API for delegated slice creation and query

Proper
– proper/
  - simple local interface for executing privileged operations
  - bind mount(), privileged-port bind(), root read()
68
Administration and Monitoring

PlanetFlow (pl_netflow)
– netflow/
  - MySQL schema and initialization/maintenance scripts
– netflow/html/
  - PHP frontend
– netflow/pfgrep/
  - console frontend
– ulogd/
  - packet header collection, aggregation, and insertion

PlanetLab Monitor (pl_mom)
– pl_mom/swapmon.py
  - swap space monitor and slice reaper
– pl_mom/bwmon.py
  - average daily bandwidth monitor (see the sketch below)
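A toy sketch in the spirit of the average-daily-bandwidth monitor (pl_mom/bwmon.py); the cap, the counter source, and the throttling hook are all assumptions, not the real tool's behavior:

```python
# Toy version of an average-daily-bandwidth check; the limit and the
# counter source below are invented for illustration.
DAILY_LIMIT_BYTES = 10 * 1024**3  # assumed 10 GB/day cap per slice

def over_limit(slice_bytes_today):
    """Return slices whose traffic so far today exceeds the cap.

    slice_bytes_today maps slice name -> bytes sent since midnight,
    e.g. read from per-VM accounting counters.
    """
    return [name for name, sent in slice_bytes_today.items()
            if sent > DAILY_LIMIT_BYTES]

# A monitor would then throttle (rather than kill) offending slices,
# matching the "upper bound on sustained rate" policy earlier in the deck.
for slice_name in over_limit({"example_slice": 12 * 1024**3}):
    print("throttling", slice_name)
```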
69
Database and API

Database
– pl_db/
  - PostgreSQL schema generated from XML

PLCAPI
– plc_api/specification/
  - XML specification of API functions
– plc_api/PLC/
  - mod_python implementation
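To illustrate the "PostgreSQL schema generated from XML" idea, here is a toy generator; the XML vocabulary below is invented and does not match pl_db's actual format:

```python
# Toy XML -> PostgreSQL DDL generator in the spirit of pl_db; the XML
# vocabulary here is invented and does not match pl_db's real schema files.
import xml.etree.ElementTree as ET

SPEC = """
<table name="slices">
  <column name="slice_id" type="serial" primary="true"/>
  <column name="name" type="text"/>
  <column name="expires" type="timestamp"/>
</table>
"""

def to_ddl(xml_text):
    table = ET.fromstring(xml_text)
    cols = []
    for col in table.findall("column"):
        line = "%s %s" % (col.get("name"), col.get("type"))
        if col.get("primary") == "true":
            line += " PRIMARY KEY"
        cols.append(line)
    return "CREATE TABLE %s (\n  %s\n);" % (table.get("name"),
                                            ",\n  ".join(cols))

print(to_ddl(SPEC))
```

Generating DDL from one declarative description keeps the database schema and the API specification from drifting apart.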
70
Web Server

PHP, static, and generated content
– plc_www/includes/new_plc_api.php
  - auto-generated PHP binding to the PLCAPI
– plc_www/db/
  - secure portion of the website
– plc_www/generated/
  - generated include files
– plc/scripts/
  - miscellaneous scripts
71
Boot Server

Secure software distribution
– authenticated, encrypted with SSL
– /var/www/html/boot/
  - default location for the Boot Manager
– /var/www/html/install-rpms/
  - default /etc/yum.conf location for RPM updates
– /var/www/html/PlanetLabConf/
  - server-side component
  - mostly PHP