The ADAMANT Project: Linking Scientific Workflows and Networks
"Adaptive Data-Aware Multi-Domain Application Network Topologies"
Ilia Baldine, Charles Schmitt, University of North Carolina at Chapel Hill / RENCI
Jeff Chase, Duke University
Ewa Deelman, University of Southern California
Funded by NSF under the Campus Cyberinfrastructure – Network Infrastructure and Engineering (CC-NIE) program

The Problem
- Scientific data is being collected at an ever-increasing rate
  – The "old days": big, focused experiments (LHC, LIGO, etc.) and big data archives (SDSS, 2MASS, etc.)
  – Today: "cheap" DNA sequencers, and an increasing number of them in individual laboratories
- The complexity of the computational problems is ever increasing
- Local compute resources are often not enough (too small, limited availability)
- The computing infrastructure keeps changing: hardware, software, but also computational models

Computational workflow – managing application complexity
- Helps express multi-step computations in a declarative way
- Can support automation, minimize human involvement
  – Makes analyses easier to run
- Can be high-level and portable across execution platforms
- Keeps track of provenance to support reproducibility
- Fosters collaboration: code and data sharing
- Gives the opportunity to manage resources underneath
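To illustrate the declarative, data-driven style this slide describes, here is a minimal, library-free Python sketch. The task and file names are invented for the example (they are not from the slides): each task declares its inputs and outputs, and the dependency edges are inferred from the data flow rather than written by hand.

```python
# Minimal sketch: tasks declare executables, inputs, and outputs; the
# dependency graph is derived from the data flow. Names are illustrative only.
tasks = {
    "align": {"exe": "align_reads",   "inputs": ["sample.fastq", "ref.fa"],     "outputs": ["sample.bam"]},
    "sort":  {"exe": "sort_bam",      "inputs": ["sample.bam"],                 "outputs": ["sample.sorted.bam"]},
    "call":  {"exe": "call_variants", "inputs": ["sample.sorted.bam", "ref.fa"],"outputs": ["sample.vcf"]},
}

def dependencies(tasks):
    """Task B depends on task A whenever B consumes a file that A produces."""
    producers = {f: name for name, t in tasks.items() for f in t["outputs"]}
    return sorted(
        (producers[f], name)
        for name, t in tasks.items()
        for f in t["inputs"]
        if f in producers and producers[f] != name
    )

print(dependencies(tasks))  # [('align', 'sort'), ('sort', 'call')]
```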

Large-Scale, Data-Intensive Workflows
- Montage Galactic Plane Workflow
  – 18 million input images (~2.5 TB)
  – 900 output images (2.5 GB each, 2.4 TB total)
  – 10.5 million tasks (34,000 CPU hours)
- An analysis is composed of a number of related workflows – an ensemble
- Smart data/network provisioning is important
Image credit: John Good (Caltech)

CyberShake PSHA Workflow (Southern California Earthquake Center)
- Builders ask seismologists: "What will the peak ground motion be at my new building in the next 50 years?"
- Seismologists answer this question using Probabilistic Seismic Hazard Analysis (PSHA)
- 239 workflows – each site in the input map corresponds to one workflow
- Each workflow has:
  – 820,000 tasks
  – MPI codes: ~12,000 CPU hours; post-processing: 2,000 CPU hours
  – Data footprint: ~800 GB
- Coordination between resources is needed

Environment: How to manage complex workloads?
[Diagram: a work definition on a local resource, alongside data storage and distributed compute resources – campus cluster, XSEDE, Open Science Grid, Amazon cloud]

Use Given Resources
[Diagram: the work definition is expressed as a WORKFLOW; a workflow management system on the local resource moves work and data across the campus cluster, FutureGrid, XSEDE, Open Science Grid, Amazon cloud, and data storage]

Workflow Management
- You may want to use different resources within a workflow or over time
- Need a high-level workflow specification
- Need a planning capability to map from the high-level specification to an executable workflow
- Need to manage the task dependencies
- Need to manage the execution of tasks on the remote resources
- Need to provide scalability, performance, reliability
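To make the "manage the task dependencies" point concrete, here is a hedged sketch of the core release loop a workflow engine runs: a task becomes ready only once all of its parents have finished. Real systems (DAGMan under Pegasus, for example) add retries, throttling, and actual remote submission; the submit step below is just a placeholder print.

```python
from collections import defaultdict, deque

# Hedged sketch of dependency-driven task release; the edges reuse the tiny
# example workflow from the earlier sketch (parent -> child).
edges = [("align", "sort"), ("sort", "call")]

children = defaultdict(list)
indegree = defaultdict(int)
tasks = {t for edge in edges for t in edge}
for parent, child in edges:
    children[parent].append(child)
    indegree[child] += 1

ready = deque(t for t in tasks if indegree[t] == 0)
while ready:
    task = ready.popleft()
    print(f"submitting {task} to a remote resource")  # placeholder for real job submission
    for child in children[task]:
        indegree[child] -= 1
        if indegree[child] == 0:
            ready.append(child)
```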

Pegasus Workflow Management System (est. 2001)
- Pegasus makes use of available resources, but cannot control them
- A collaboration between USC and the Condor Team at UW Madison (includes DAGMan)
- Maps a resource-independent "abstract" workflow onto resources and executes the "concrete" workflow
- Used by a number of applications in a variety of domains
- Provides reliability – can retry computations from the point of failure
- Provides scalability – can handle large data and many computations (kbytes-TB of data, tasks)
- Infers data transfers, restructures workflows for performance
- Automatically captures provenance information
- Can run on resources distributed among institutions: laptop, campus cluster, grid, cloud
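For a feel of what an "abstract" workflow looks like in practice: these slides predate it, but current Pegasus (5.x) releases ship a Python API roughly along the lines below. This is a hedged sketch; class and method names should be checked against the Pegasus documentation, and the job and file names are invented for the example.

```python
from Pegasus.api import Workflow, Job, File

# Abstract (resource-independent) workflow: no sites, hosts, or paths appear here.
raw = File("sample.fastq")
bam = File("sample.bam")
vcf = File("sample.vcf")

wf = Workflow("adamant-demo")

align = Job("align_reads").add_args("-i", raw, "-o", bam).add_inputs(raw).add_outputs(bam)
call = Job("call_variants").add_args("-i", bam, "-o", vcf).add_inputs(bam).add_outputs(vcf)

# Dependencies are inferred from the shared files; Pegasus later "plans" this
# into a concrete workflow with data stage-in/stage-out and site-specific jobs.
wf.add_jobs(align, call)
wf.write("workflow.yml")
```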

A way to make it work better
[Diagram: Pegasus WMS on the local resource sends resource requests to a resource provisioner, which assembles a virtual resource pool (compute, data, networks) from grids and clouds; work and data then flow from the work definition through Pegasus WMS into the provisioned pool and data storage]

Open Resource Control Architecture (ORCA)
- ORCA is a "wrapper" for off-the-shelf clouds, circuit networks, etc., enabling federated orchestration:
  + Resource brokering
  + VM image distribution
  + Topology embedding
  + Stitching
  + Authorization
- Deploys a dynamic collection of controllers
- Controllers receive user requests and provision resources
Jeff Chase, Duke University
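To give a feel for what a workflow system might hand to such a provisioner, here is a purely illustrative request description in Python. ORCA/ExoGENI requests are actually expressed in NDL-OWL (built with tools such as Flukes or via GENI APIs); the dictionary below is not that format and all names are made up. It only sketches what a request has to say: which nodes, at which sites, and which links to stitch between them.

```python
# Illustrative only: not ORCA's real request format (that is NDL-OWL).
# The fields sketch the information a provisioning request carries.
slice_request = {
    "slice_name": "adamant-demo",
    "nodes": [
        {"name": "submit-node", "site": "rack-a", "image": "pegasus-condor", "count": 1},
        {"name": "workers",     "site": "rack-b", "image": "pegasus-condor", "count": 8},
    ],
    "links": [
        # point-to-point circuit to be stitched between the two groups of VMs
        {"endpoints": ("submit-node", "workers"), "bandwidth_mbps": 1000},
    ],
}

def summarize(req):
    vms = sum(node["count"] for node in req["nodes"])
    print(f'{req["slice_name"]}: {vms} VMs, {len(req["links"])} provisioned link(s)')

summarize(slice_request)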

Pegasus and ORCA: initial implementation

What we would like to do: Expand to workflow ensembles

What is missing
- Tools and systems that can integrate the operation of workflow-driven science applications on top of dynamic infrastructures that link campus, institutional, and national resources
- Tools to manage workflow ensembles
- Need to:
  – orchestrate the infrastructure in response to the application
  – monitor various workflow steps and ensemble elements
  – expand and shrink resource pools in response to application performance demands
  – integrate data movement/storage decisions with workflows/resource provisioning to optimize performance
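The "expand and shrink resource pools" item is the heart of the adaptation loop, so a hedged sketch of that feedback loop follows. The three helpers are placeholders with hard-coded values, not real Pegasus or ORCA APIs; an actual implementation would query the workflow monitor for backlog and send grow/shrink requests to the provisioner.

```python
# Hedged sketch of the adaptation loop: size the worker pool to the task backlog.
TARGET_TASKS_PER_WORKER = 4

def queued_tasks():
    # placeholder: would query workflow/ensemble monitoring (e.g. job queue depth)
    return 37

def current_workers():
    # placeholder: would query the provisioner for the current pool size
    return 4

def resize_pool(n_workers):
    # placeholder: would send a grow/shrink request to the resource provisioner
    print(f"requesting a pool of {n_workers} workers")

def adapt_once():
    backlog = queued_tasks()
    wanted = max(1, -(-backlog // TARGET_TASKS_PER_WORKER))  # ceiling division
    if wanted != current_workers():
        resize_pool(wanted)

adapt_once()  # with the placeholder values: backlog of 37 -> request 10 workers
```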

Summary: ADAMANT will
- Focus on data-intensive applications: astronomy, bioinformatics, earth science
- Interleave workload management with resource provisioning
  – Emphasis on storage and network provisioning
- Monitor the execution and adapt resource provisioning and workload scheduling
- Experiment on ExoGENI