1 Implementing the Emulab-PlanetLab Portal: Experiences and Lessons Learned
Kirk Webb, Mike Hibler, Robert Ricci, Austin Clements, Jay Lepreau
University of Utah
December 5, 2004, 1st WORLDS

2 What is Emulab?
Software to control network testbeds
– Instantiates user-requested topologies on available resources
– Most popular UI is a fancy Web interface; also an XML-RPC interface (see the sketch below)
Emulab "Classic"
– ~200 PCs in a densely connected cluster
– Dozens of experiments "swap" in and out each day
Extended to the wide area in late 2002
– RON testbed and Emulab's own wide-area nodes
Now: a testbed with diverse resources
– Physical, virtual, and simulated nodes and links
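
As a concrete illustration of the programmatic interface, here is a minimal Python sketch of driving Emulab over XML-RPC. The endpoint URL, method name, and argument layout are assumptions for illustration only; the authoritative API and its authentication requirements are documented by Emulab itself.

```python
# Minimal sketch of driving Emulab programmatically over XML-RPC.
# The server URL, method names, and argument format are assumptions
# for illustration; consult the Emulab documentation for the real API.
import xmlrpc.client

EMULAB_RPC_URL = "https://boss.emulab.net:3069/usr/testbed"  # hypothetical endpoint

def swap_in(server, project, experiment):
    """Ask the testbed to instantiate (swap in) an existing experiment."""
    args = {"proj": project, "exp": experiment, "direction": "in"}
    # Emulab-style calls typically take (api_version, argument_dict).
    return server.experiment.swapexp(0.1, args)

if __name__ == "__main__":
    server = xmlrpc.client.ServerProxy(EMULAB_RPC_URL)
    print(swap_in(server, "myproj", "myexp"))
```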

3 Why Create "The Portal"?
Diversify Emulab with new resources
Explore challenges of integrating with other testbed environments
Provide PlanetLab users with a powerful but easy-to-use interface

4 K.I.S.S.

5 Decisions, Decisions

6 Emulab-PlanetLab Portal Features
Emulab provides all elements of the PlanetLab infrastructure service taxonomy, plus more

7 Portal Features (cont'd)
Monitors sensors to ascertain node characteristics
Three selection methods: manual, link-centric, and node-centric
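
A node-centric selection pass might look roughly like the sketch below. The sensor fields, thresholds, and weights are invented for illustration and are not the portal's actual schema.

```python
# Illustrative sketch of node-centric selection: pick the N "best" nodes
# according to recently observed sensor data. Field names and thresholds
# are invented for illustration, not the portal's actual schema.
from dataclasses import dataclass

@dataclass
class NodeStatus:
    node_id: str        # persistent node identifier
    cpu_load: float     # 1-minute load average
    free_disk_mb: int
    alive: bool         # last liveness probe succeeded

def pick_nodes(statuses, wanted):
    usable = [s for s in statuses if s.alive and s.free_disk_mb > 500]
    # Prefer lightly loaded nodes with plenty of free disk.
    usable.sort(key=lambda s: (s.cpu_load, -s.free_disk_mb))
    return [s.node_id for s in usable[:wanted]]

statuses = [
    NodeStatus("plab-nyc-1", 0.4, 12000, True),
    NodeStatus("plab-sfo-2", 2.3, 800, True),
    NodeStatus("plab-ber-3", 0.1, 300, True),    # too little disk
    NodeStatus("plab-lon-4", 0.7, 9000, False),  # failed liveness probe
]
print(pick_nodes(statuses, 2))   # ['plab-nyc-1', 'plab-sfo-2']
```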

8 Portal Features (cont'd)
Watchdog process per virtual node
Software upgrades and account updates
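
In the spirit of the per-node watchdog described above, a minimal sketch of such a loop follows; the check-in protocol, action names, and interval are invented.

```python
# Sketch of a per-node watchdog loop: periodically check in with the
# central site and apply pending software and account updates.
# The check-in mechanism and interval are invented for illustration.
import time
import logging

CHECKIN_INTERVAL = 600  # seconds; hypothetical value

def fetch_pending_actions():
    """Ask the central site what this node should do (stub)."""
    return []   # e.g. ["update_accounts", "upgrade_software"]

def apply(action):
    logging.info("applying action: %s", action)
    # ... run the corresponding update script here ...

def watchdog_loop():
    while True:
        try:
            for action in fetch_pending_actions():
                apply(action)
        except Exception:
            # The watchdog itself must survive transient failures.
            logging.exception("check-in failed; will retry")
        time.sleep(CHECKIN_INTERVAL)

if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    watchdog_loop()
```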

9 Portal Features (cont'd)
"Reboot" a single virtual node, or all of them
Soon: wide-area event system for control

10 Challenges and Lessons
Different use models
State management
Interface evolution
Failure

11 Different Use Models
Emulab: rapid-cycle experiments (mostly)
PlanetLab: long-running services (mostly)
Average Emulab experiment duration: five hours
Building fast/synchronous on delayed/asynchronous?
Delayed, asynchronous interfaces force fast synchronous clients to waste resources (see the sketch below)
Exposing lower-level API primitives allows a wider range of service models
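
The resource waste is easiest to see in code: a synchronous client layered on an asynchronous allocation interface has little choice but to poll. The function names below are invented stand-ins, not the actual Emulab or PlanetLab calls.

```python
# Sketch of the mismatch: a client that needs synchronous "nodes are ready"
# semantics layered on an interface that only allocates asynchronously,
# so the client must poll. Function names are invented for illustration.
import time

def request_nodes_async(count):
    """Submit a resource request; returns immediately with a ticket (stub)."""
    return {"ticket": 42, "requested": count}

def query_status(ticket):
    """Ask whether the request has been satisfied yet (stub)."""
    return "pending"   # or "ready" / "failed"

def request_nodes_sync(count, timeout=1800):
    """Synchronous wrapper: burn time polling until the async side is done."""
    ticket = request_nodes_async(count)["ticket"]
    deadline = time.time() + timeout
    delay = 5
    while time.time() < deadline:
        status = query_status(ticket)
        if status in ("ready", "failed"):
            return status
        time.sleep(delay)              # wasted wall-clock time and connections
        delay = min(delay * 2, 120)    # back off to limit load on the server
    return "timeout"
```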

12 State Management
Locations of data are spread out
Data coupling issues
– Identity crisis!
– Balance between coherency and overhead (an age-old problem)
Persistent and reliable node identifiers are a must
Should not assume long-term state synchronization
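
One way to act on these lessons is to key all local records on a persistent node identifier and reconcile periodically against the remote site, rather than assume the two databases stay synchronized. The sketch below is illustrative; the field names are invented.

```python
# Sketch of keying local state on a persistent node identifier and
# periodically reconciling against the remote site instead of assuming
# the two databases stay synchronized. Field names are illustrative.

def reconcile(local_nodes, remote_nodes):
    """
    local_nodes, remote_nodes: dicts mapping persistent node_id -> record.
    Returns (to_add, to_remove, to_update) from the local database's view.
    """
    local_ids, remote_ids = set(local_nodes), set(remote_nodes)
    to_add = remote_ids - local_ids          # newly appeared nodes
    to_remove = local_ids - remote_ids       # nodes the remote side dropped
    to_update = {nid for nid in local_ids & remote_ids
                 if local_nodes[nid] != remote_nodes[nid]}
    return to_add, to_remove, to_update

local = {"node-17": {"ip": "10.0.0.5"}, "node-23": {"ip": "10.0.0.9"}}
remote = {"node-17": {"ip": "10.0.1.5"}, "node-42": {"ip": "10.0.0.7"}}
print(reconcile(local, remote))
# ({'node-42'}, {'node-23'}, {'node-17'})
```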

13 Interface Evolution
Research infrastructures evolve rapidly
Tension between PlanetLab goals:
– "Evolving Architecture" → change
– "Unbundled Management" → many services, many players
Internally, use the same API that you export
Embrace the inevitable: changing APIs. Make that code modular
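
A common way to make changing-API code modular is to funnel every call to the external testbed through one thin adapter, so an upstream change is absorbed in a single place. The remote method names below are hypothetical.

```python
# Sketch of isolating a fast-changing external API behind one thin module.
# Everything else in the system imports this class, so an upstream API
# change is absorbed in exactly one place. Names are illustrative.
import xmlrpc.client

class SliceService:
    """The only code in the system allowed to talk to the remote API."""

    def __init__(self, url):
        self._rpc = xmlrpc.client.ServerProxy(url)

    def create_slice(self, name, nodes):
        # If the upstream call signature changes, only this method changes.
        return self._rpc.CreateSliver(name, nodes)    # hypothetical remote method

    def destroy_slice(self, name):
        return self._rpc.DestroySliver(name)          # hypothetical remote method
```

The same module can back the portal's own web interface, which is one reading of "use the same API that you export": internal callers exercise exactly the code paths external users depend on.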

14 Failure
All node "liveness" metrics are unreliable
– Trumpet, Ganglia, Emulab Watchdog, …
Anything can fail
– Disk space, fds, PLC, …
Only execution of the application itself indicates node "liveness"!
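
Taking the last point literally leads to probes that exercise the deployed application end to end rather than trusting a generic monitor. The host, port, and probe protocol below are invented for illustration.

```python
# Sketch of an application-level liveness check: rather than trusting a
# generic monitoring metric, perform a tiny end-to-end operation against
# the deployed service itself. Host, port, and expected reply are invented.
import socket

def app_is_alive(host, port, probe=b"PING\n", expect=b"PONG", timeout=5):
    """Return True only if the application answers its own protocol."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(probe)
            reply = sock.recv(len(expect))
        return reply == expect
    except OSError:
        return False   # connection refused, timeout, DNS failure, ...

if __name__ == "__main__":
    print(app_is_alive("plab-nyc-1.example.org", 12345))
```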

15 Conclusions
Hard to keep it working
Will people build large systems on other parties' constantly-changing research systems?