Integrated Scientific Workflow Management for the Emulab Network Testbed
Eric Eide, Leigh Stoller, Tim Stack, Juliana Freire, and Jay Lepreau

Presentation transcript:

Integrated Scientific Workflow Management for the Emulab Network Testbed
Eric Eide, Leigh Stoller, Tim Stack, Juliana Freire, and Jay Lepreau
University of Utah, School of Computing
USENIX 2006 / June 3, 2006

2 This Talk in One Slide
Current network testbeds
  …manage the “laboratory”
  …not the experimentation process.
  → A big problem for large-scale activities!
Evolve Emulab for experiments based on scientific workflows
Big mutual benefits: testbed ↔ workflow
Work in progress

3 Example: UAV Simulation
A distributed, real-time application
Evaluate improvements to real-time middleware
  vs. CPU load
  vs. network load
4 research groups
  x 19 experiments
  x 56 metrics
[Diagram: UAV, Receiver, ATR; arrows labeled “images” and “alerts”]

4 Use Emulab
[Diagram: Concept, Experiment, Emulate; steps labeled write “ns” file and “swap in”]

5 Problems Solved
I get machines!
  328 PCs, and more
  Time- & space-shared
  Loads OS and software
I get network!
  Config. topology & quality
I get to collaborate!
  Available to researchers and educators worldwide
  File storage, …

6 Problems Not Solved
“Now what?”
Getting off the ground
  Run all my software
  Add instrumentation
  Collect all my data
  Analyze it
Scaling up
  19 configurations
  Automation
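
The “scaling up” and “automation” points can be illustrated with a small parameter sweep. This is only a sketch: the parameter names `cpu_load` and `network_load_mbps` and their values are hypothetical, chosen to echo the “vs. CPU load / vs. network load” comparison from the UAV example, and the grid below is not the talk's actual 19 configurations.

```python
from itertools import product


def sweep(grid: dict[str, list[int]]) -> list[dict[str, int]]:
    """Expand a parameter grid into one configuration per combination."""
    names = list(grid)
    combos = product(*(grid[name] for name in names))
    return [dict(zip(names, values)) for values in combos]


# Hypothetical parameters and values, for illustration only.
grid = {"cpu_load": [0, 50, 100], "network_load_mbps": [0, 10]}
configs = sweep(grid)

for config in configs:
    # A real driver would start one experiment run per configuration here.
    print(config)
```

Generating the configurations from one grid, rather than writing each by hand, keeps the set of runs reproducible and easy to extend.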

7 More Problems Not Solved
“How did I get here?”
Over the short term…
  “Where are the results I got last week?”
  “How did I get those results anyway?”
  “What if…?”
…and the long term
  Reproducing results
  Reusing artifacts

8 Idea: Scientific Workflow
Managing activities, inputs, and outputs is the job of a scientific workflow system
Our approach: evolve Emulab with integrated support for scientific workflows
  Build on existing abstractions & mechanisms
  Resource focus → user & task focus
  Users work “within” and “across” experiments

9 Contributions
Address demand + opportunity
  Users need to manage large-scale complexity
  A symbiotic combination: leverage and impact
Advance the applicability of testbeds
  Not just Emulab: e.g., PlanetLab and DETER
Advance scientific workflow systems
  Exploit testbed capabilities, e.g., “total control”
  Address testbed requirements, e.g., flexible use

10 Issue: Encapsulation
Current “experiment” model is not fully encapsulating
  Topology + static events
  Need everything else!
Challenge: specification
  Complete and precise…
  …w/o huge user burden
Approach: be automatic
  E.g., track files used
  Snapshot, archive, restore
  User can refine “extent”
[Diagram labels: ns file, OSes, packages, my software, inputs, outputs, NFS monitors, packet monitors, AJAX GUI, Subversion repo., datapository (DB), research filesystems]
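
A minimal sketch of the “track files used” and “snapshot” ideas, assuming a content-hash manifest over a directory. The talk does not say how Emulab records file usage; the function names `snapshot` and `diff` and the file names in `demo` are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def snapshot(root: Path) -> dict[str, str]:
    """Map each file under root (by relative path) to the SHA-256 of its contents."""
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def diff(before: dict[str, str], after: dict[str, str]) -> dict[str, list[str]]:
    """Report files added, removed, or modified between two snapshots."""
    common = before.keys() & after.keys()
    return {
        "added": sorted(after.keys() - before.keys()),
        "removed": sorted(before.keys() - after.keys()),
        "modified": sorted(p for p in common if before[p] != after[p]),
    }


def demo() -> dict[str, list[str]]:
    """Snapshot a scratch directory, change it, and report the differences."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "topology.ns").write_text("# placeholder experiment definition\n")
        (root / "input.dat").write_text("1 2 3\n")
        before = snapshot(root)
        (root / "input.dat").write_text("1 2 3 4\n")  # an input changes
        (root / "output.log").write_text("done\n")    # a run produces output
        return diff(before, snapshot(root))
```

Comparing manifests taken before and after a run identifies which inputs changed and which outputs appeared, without the user having to list them by hand.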

11 Issue: Definition vs. Execution
Current “experiment” has multiple roles
  Definition
  The thing that you run
Challenge: representing relationships
  Multiple runs of one setup
  Similar configurations
Approach: a new model of experimentation
  Separate the roles
  Evolve the new abstractions

12 New Model
Template
Swapin
Experiment
Activity
Record
[Diagram labels: n = 2, n = 4]
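
One plausible reading of this slide is that a single template (the definition) is executed several times with different parameter values, and each execution leaves a record. The sketch below assumes that reading; the class shapes, the `execute` helper, and the template name `uav-sim` are hypothetical, and only the parameter values n = 2 and n = 4 come from the slide.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Template:
    """A reusable definition with default parameter values (hypothetical shape)."""
    name: str
    defaults: dict[str, int]


@dataclass
class Record:
    """What one execution leaves behind: the parameters used and any outputs."""
    template: str
    params: dict[str, int]
    outputs: dict[str, float] = field(default_factory=dict)


def execute(template: Template, **overrides: int) -> Record:
    """Bind parameters to a template and return the record of that execution."""
    unknown = set(overrides) - set(template.defaults)
    if unknown:
        raise KeyError(f"unknown parameters: {sorted(unknown)}")
    params = {**template.defaults, **overrides}
    # Placeholder for the real work: swap in, run the activity, collect data.
    return Record(template=template.name, params=params)


uav = Template(name="uav-sim", defaults={"n": 2})
records = [execute(uav, n=2), execute(uav, n=4)]
```

Keeping the definition separate from its executions is what lets multiple runs of one setup be related to each other afterwards.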

13 Issue: History
Research and educational plans are dynamic
  By design & by discovery
Challenge: safe exploration
  Fork
  Back up
Approach: keep history & support temporal navigation
  Keep template revisions
  Track provenance
  Locate, repeat, and reuse
[Diagram labels: rev 1.1; bigger nets; add params; oops: need new measurements; what about loss?]
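
The “fork”, “back up”, and “track provenance” points can be sketched as a tree of revisions in which every revision remembers its parent. This is an illustration under assumptions: the `Revision` class, all revision identifiers other than 1.1, and the shape of the tree are hypothetical; only the notes are taken from the slide's diagram labels.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Revision:
    """One template revision; parent links form the history tree."""
    rev_id: str
    note: str
    parent: Optional["Revision"] = None

    def fork(self, rev_id: str, note: str) -> "Revision":
        """Create a child revision without modifying this one."""
        return Revision(rev_id, note, parent=self)

    def lineage(self) -> list[str]:
        """Return revision ids from the root down to this revision."""
        chain = []
        node: Optional[Revision] = self
        while node is not None:
            chain.append(node.rev_id)
            node = node.parent
        return chain[::-1]


root = Revision("1.1", "initial template")
bigger = root.fork("1.2", "bigger nets")
params = bigger.fork("1.3", "add params")
# "Back up" to the root and explore a different question on a separate branch.
loss = root.fork("1.1.1", "what about loss?")
```

Because revisions are immutable and only ever added, exploring a new branch cannot disturb earlier results, and `lineage()` answers “how did I get here?” for any revision.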

14 Implementation in Progress
Definition
Execution & History
Data Analysis

15 Conclusion
Large and powerful testbeds
  …enable complex and large-scale activities
  …lead to complex and large-scale workflow management problems
Integrated workflow management can leverage the strengths of testbeds
  Systems approach, and systems challenges
  → Better testbeds and workflow systems

Thanks!

17 Extra Slides After This Point