1
Accelerating Scientific Exploration Using Workflow Automation Systems
Terence Critchlow (LLNL), Ilkay Altintas (SDSC), Scott Klasky (ORNL), Mladen Vouk (NCSU), Steve Parker (Univ. of Utah), Bertram Ludaescher (UC Davis)
SIAM CSE Conference, February 2007
UCRL-PRES-228193
2
What is a "scientific workflow"?
Definition: A workflow is a predefined sequence of actions which performs a specific task.
• A scientific workflow is any workflow performed in order to accomplish a larger scientific goal
• Can be arbitrarily complex
  - Conditionals, loops / iterations, parallel execution
  - Human interactions
3
Scientific workflows exist in all domains
[Figures: Promoter Identification; ROADNet. Workflow courtesy of A. Rajasekar, SDSC]
4
If we can automate a workflow, application scientists can spend more time doing science
5
An executable workflow is defined within a tool in a way that allows the task to be run
• There are many workflow engines available
  - http://kepler-project.org/
• A "Director" is responsible for task scheduling
• An "Actor" is a single task that the workflow needs to schedule
• I/O Ports connect actors
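The Director / Actor / port vocabulary can be illustrated with a small sketch. This is not Kepler's API: the `Actor` and `Director` classes, their fields, and the three example tasks are all hypothetical, and the scheduling rule (fire an actor once every upstream actor has produced a result) is just one simple policy a director could use.

```python
from collections import deque


class Actor:
    """A single task; `inputs` names the upstream actors wired to its input ports."""

    def __init__(self, name, func, inputs=()):
        self.name = name
        self.func = func
        self.inputs = list(inputs)


class Director:
    """Schedules actors: an actor fires once all of its inputs are available."""

    def __init__(self, actors):
        self.actors = {a.name: a for a in actors}

    def run(self):
        results, order = {}, []
        pending = deque(self.actors.values())
        stalled = 0  # consecutive actors that could not fire
        while pending:
            actor = pending.popleft()
            if all(src in results for src in actor.inputs):
                args = [results[src] for src in actor.inputs]
                results[actor.name] = actor.func(*args)
                order.append(actor.name)
                stalled = 0
            else:
                pending.append(actor)
                stalled += 1
                if stalled >= len(pending):
                    raise RuntimeError("inputs can never be satisfied")
        return results, order


# Hypothetical three-actor workflow, deliberately listed out of order.
workflow = Director([
    Actor("plot", lambda data: f"image of {data}", inputs=["convert"]),
    Actor("convert", lambda raw: raw.upper(), inputs=["read"]),
    Actor("read", lambda: "raw output"),
])
results, order = workflow.run()
print(order)
print(results["plot"])
```

The actors are declared in an arbitrary order on purpose: the director, not the author of the workflow, decides when each task runs.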
6
Creating an executable workflow requires precisely defining what needs to be done
• Submit a batch job to supercomputer
• When the job starts running
  - Track progress of simulation
  - Move output files to an archive
  - Move output files to analysis machine
• Clean up
Overall architect (& prototypical user): Scott Klasky (ORNL)
WF design & implementation: Norbert Podhorszki (UC Davis)
[Figure: workflow diagram with stages "Submit job", "Monitor progress and do analysis", "Cleanup"]
Figure annotations:
• Execution Log (=> Data Provenance)
• Splitting output enables parallel processing of same data
• Each actor executes in parallel as long as it has needed inputs
7
Creating an executable workflow requires precisely defining what needs to be done
Overall architect (& prototypical user): Scott Klasky (ORNL)
WF design & implementation: Norbert Podhorszki (UC Davis)
[Figure: workflow diagram] Figure annotations:
• Wait for files to appear
• Convert files to new data format
• Send files to archive
• Generate image
• Configure parameters based on user and machine
Image generated using SCIRun (Univ of Utah)
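The "wait for files, convert, send to archive" chain can be sketched in a few lines. This is an illustration under stated assumptions, not the workflow shown on the slide: the directory layout, the `*.dat` pattern, and the helper names `wait_for_files`, `convert`, and `archive` are hypothetical, and both the conversion and the archive transfer are replaced by local file copies.

```python
import shutil
import tempfile
import time
from pathlib import Path


def wait_for_files(directory, pattern, seen, timeout=5.0, poll=0.1):
    """Poll `directory` until files matching `pattern` appear that are not in `seen`."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        new = sorted(p for p in Path(directory).glob(pattern) if p.name not in seen)
        if new:
            seen.update(p.name for p in new)
            return new
        time.sleep(poll)
    return []  # nothing new arrived before the timeout


def convert(path, out_dir):
    """Stand-in for a format conversion: copies the content under a new suffix."""
    target = Path(out_dir) / (path.stem + ".converted")
    target.write_text(path.read_text())
    return target


def archive(path, archive_dir):
    """Stand-in for sending a file to an archive: a local copy."""
    return Path(shutil.copy2(path, archive_dir))


with tempfile.TemporaryDirectory() as root:
    sim, work, arch = (Path(root) / d for d in ("sim", "work", "archive"))
    for d in (sim, work, arch):
        d.mkdir()
    (sim / "step_0001.dat").write_text("timestep 1")  # pretend the simulation wrote output

    seen = set()
    archived = []
    for raw in wait_for_files(sim, "*.dat", seen):
        archived.append(archive(convert(raw, work), arch).name)

print(archived)
```

Keeping the `seen` set outside the function lets the same watcher be called repeatedly while a simulation is still producing output, without reprocessing earlier files.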
8
Now that I have an executable workflow, so what?
• Instead of performing the task by hand each time, you are able to update the workflow parameters, start workflow executing, and do other things
9
Now that I have an executable workflow, so what?
• Instead of performing the task by hand each time, you are able to update the workflow parameters, start workflow executing, and do other things
• Mundane data management tasks are taken care of
  - Monitoring for files
  - File transfer with automatic restart on failure
  - Automatic generation of images
Actors can be reused across workflows
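"File transfer with automatic restart on failure" amounts to a retry loop around the transfer step. The sketch below assumes a hypothetical `transfer` callable that raises `OSError` (or a subclass such as `ConnectionError`) when the link drops; the attempt limit and delay are illustrative values, and the deck does not say how the real actor handles retries.

```python
import time


def transfer_with_restart(transfer, max_attempts=3, delay=0.0):
    """Call `transfer()` until it succeeds or `max_attempts` is reached."""
    for attempt in range(1, max_attempts + 1):
        try:
            return transfer(), attempt
        except OSError:
            if attempt == max_attempts:
                raise  # give up and surface the last error
            time.sleep(delay)


# Hypothetical flaky transfer: fails twice, then succeeds.
calls = {"n": 0}


def flaky_transfer():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("link dropped")
    return "output.dat archived"


result, attempts = transfer_with_restart(flaky_transfer)
print(result, "after", attempts, "attempts")
```

Because the retry logic lives in a wrapper rather than in the transfer itself, the same wrapper can be reused by any workflow that moves files.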
10
Now that I have an executable workflow, so what?
• Instead of performing the task by hand each time, you are able to update the workflow parameters, start workflow executing, and do other things
• Mundane data management tasks are taken care of
• Workflow executes in parallel
Logging, archiving, and image generation proceed in parallel without additional coding
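In a workflow engine this parallelism comes from the dataflow graph itself. Written by hand, the same fan-out might look like the sketch below, which assumes three hypothetical step functions that each consume the same output file and uses Python's standard `concurrent.futures` thread pool.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins for the three independent steps.
def log_step(name):
    return f"logged {name}"


def archive_step(name):
    return f"archived {name}"


def render_step(name):
    return f"rendered image for {name}"


output_file = "step_0001.dat"  # hypothetical simulation output

# Each step needs only the output file, so all three can run concurrently.
with ThreadPoolExecutor(max_workers=3) as pool:
    futures = [pool.submit(step, output_file)
               for step in (log_step, archive_step, render_step)]
    results = [f.result() for f in futures]

print(results)
```

The point of the slide is that a workflow tool spares the scientist from writing and maintaining this plumbing.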
11
Now that I have an executable workflow, so what?
• Instead of performing the task by hand each time, you are able to update the workflow parameters, start workflow executing, and do other things
• Mundane data management tasks are taken care of
• Workflow executes in parallel
• Provenance tracking
Log files both reflect the current status of the simulation run and provide a permanent record of execution
Improved provenance tracking is a major focus of ongoing work.
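One way to picture an execution log that doubles as a provenance record is a wrapper that writes one entry per step. The record fields and the `run_logged` helper below are assumptions made for illustration; the deck does not describe the actual log format.

```python
import json
import time


def run_logged(name, func, log, *args):
    """Run one step and append a provenance record: what ran, on what, and the outcome."""
    record = {"step": name, "inputs": [repr(a) for a in args], "started": time.time()}
    try:
        result = func(*args)
        record["status"] = "ok"
        record["output"] = repr(result)
        return result
    except Exception as exc:
        record["status"] = "failed"
        record["error"] = str(exc)
        raise
    finally:
        # Runs on success and on failure, so every step leaves a record.
        record["finished"] = time.time()
        log.append(json.dumps(record))


log = []  # in practice this could be an append-only file
converted = run_logged("convert", str.upper, log, "raw output")
image = run_logged("plot", lambda d: f"image of {d}", log, converted)
statuses = [json.loads(line)["status"] for line in log]
print(statuses)
```

Reading the log while the workflow runs shows current status; keeping it afterwards gives the permanent record of how each result was produced.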
12
Scientific workflow automation has potential to reduce the data management burden
• As experimental and simulation data grows, managing the data efficiently becomes increasingly important
• Scientific workflow technology removes much of the mundane data management burden, freeing scientists to do science
"The CIPRES project has as a key goal the creation of software infrastructure that allows developers in the community to easily contribute new software tools,... The modular nature of Kepler met our requirements, as it is a Java platform that allows users to construct linear, looping, and complex workflows from just the kinds of components the CIPRES community is developing. By adopting this tool, we were able to focus on developing appropriate framework and registry tools for our community, and use the friendly Kepler user application interface as an entrée to our services. We are very excited about the progress we have made, and think the tool will be revolutionary for our user base."
- Mark A. Miller, PI, NSF CIPRES project, 2006
13
This work was performed under the auspices of the U.S. Department of Energy by University of California Lawrence Livermore National Laboratory under contract No. W-7405-ENG-48.