Towards Self-Describing Workflows for Climate Models
Kathy Saint – UCAR
Ufuk Utku Turuncoglu – ITU
Sylvia Murphy – NCAR
Cecelia DeLuca – NCAR



Outline
–Motivation
–Application
–Implementation
–Collecting Provenance
–Future Steps
–Analysis of Kepler

Motivation
Problems in a typical Earth System Modeling Application
–Changing the science in complex Earth system models can involve numerous parameter changes that are hard to record and track
–HPC is complex and involves many technologies, each with its own learning curve
–Reproducibility is becoming increasingly important
–It is not easy to share information (configuration parameters, results, post-processing scripts)

Motivation (cont.)
Approach:
–The user can create a different case with only minor changes in the workflow
–The workflow layer can hide the details of different technologies, such as the computing environment, the model, and the post-processing tools
–Users can query the collected, standardized provenance information to compare, debug, or reproduce results
–Users can share information easily: they can run the same case with different input and parameters
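A minimal sketch of the "compare" use of provenance mentioned above, assuming each case's configuration has been recorded as a flat dictionary. The parameter names (resolution, run_days, compiler) and the function diff_parameters are hypothetical and are not taken from the presentation or from CCSM.

```python
def diff_parameters(case_a, case_b):
    """Return {name: (value_in_a, value_in_b)} for every parameter that differs.

    A parameter that is absent from one case is reported with None on that side.
    """
    missing = object()  # sentinel so that a stored None is not mistaken for "absent"
    changed = {}
    for name in sorted(set(case_a) | set(case_b)):
        a = case_a.get(name, missing)
        b = case_b.get(name, missing)
        if a != b:
            changed[name] = (
                None if a is missing else a,
                None if b is missing else b,
            )
    return changed


# Hypothetical recorded configurations for two cases.
baseline = {"resolution": "low", "run_days": 5, "compiler": "A"}
# A "different case with only minor changes": one parameter is overridden.
variant = dict(baseline, run_days=10)

print(diff_parameters(baseline, variant))  # {'run_days': (5, 10)}
```

Recording parameters in a standardized form is what makes this kind of comparison a simple query rather than a manual inspection of run scripts.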

Components of Workflow Environment
The workflow encapsulates the technical details of the compute platform and allows the user to focus on the science of the model.

Conceptual Workflow
Workflow includes uploading source code; creating, building and running case; and collecting provenance data.
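The conceptual workflow can be sketched as an ordered list of steps that share one state object. This is an illustration only: the step functions are stubs with hypothetical names, and it does not reproduce the Kepler implementation described in the presentation.

```python
def upload_source(state):
    state["source"] = "uploaded"


def create_case(state):
    state["case"] = "created"


def build_case(state):
    state["case"] = "built"


def run_case(state):
    state["case"] = "finished"


def collect_provenance(state):
    # Record which steps ran before provenance collection started.
    state["provenance"] = {"steps": list(state["history"])}


# Steps in the order named on the slide.
STEPS = [
    ("upload source code", upload_source),
    ("create case", create_case),
    ("build case", build_case),
    ("run case", run_case),
    ("collect provenance", collect_provenance),
]


def run_workflow(steps):
    """Run each step in order and keep a history of completed step names."""
    state = {"history": []}
    for name, action in steps:
        action(state)
        state["history"].append(name)
    return state


final_state = run_workflow(STEPS)
print(final_state["history"])
```

Keeping the steps in a plain list means a variant workflow is a small edit to that list, while each step hides its own platform details.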

Implementation
The implementation can be mapped back directly to the conceptual workflow.

Collecting Provenance
Provenance is defined as structured information that keeps track of the origin and derivation of the workflow.
The basic types of provenance information:
–System (system environment, OS, CPU architecture, compiler versions, etc.)
–Data (history or lineage of data, data flows, inputs and outputs, data transformations)
–Process (statistics about the workflow run, transferred data size, elapsed time, etc.)
–Workflow (version, modifications, etc.)
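One way to picture the four provenance types is as a single record with one section per type. The sketch below is an assumption for illustration, not the schema used by the authors: the class ProvenanceRecord, its field contents, and the helper names are hypothetical, and the system section is filled using only Python's standard platform module.

```python
import json
import platform
import time
from dataclasses import asdict, dataclass, field


@dataclass
class ProvenanceRecord:
    """One section per provenance type listed above (hypothetical layout)."""
    system: dict = field(default_factory=dict)
    data: dict = field(default_factory=dict)
    process: dict = field(default_factory=dict)
    workflow: dict = field(default_factory=dict)


def collect_system():
    """System provenance: OS, CPU architecture, and compiler information."""
    return {
        "os": platform.system(),
        "os_release": platform.release(),
        "cpu_architecture": platform.machine(),
        "python_version": platform.python_version(),
        "python_compiler": platform.python_compiler(),
    }


def timed(func, *args):
    """Run func(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


record = ProvenanceRecord(system=collect_system())

# A stand-in computation takes the place of a model run.
result, elapsed = timed(sum, range(1000))
record.data = {"inputs": ["range(1000)"], "outputs": [result]}
record.process = {"elapsed_seconds": elapsed}
record.workflow = {"version": "0.1", "modifications": []}

print(json.dumps(asdict(record), indent=2))
```

Serializing the record to a standard format such as JSON is what would let different runs be queried and compared later.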

Collecting Provenance
CCSM is a multi-component model, which makes it complicated to collect provenance information.
–pymake: provided by ORNL and NCSU [2,7,10]
–tgwrapper.pl: uses the SoftEnv [9] and Modules [8] applications

Future Steps
Integration with Web Services
–Move logic from the Kepler platform to the Web Server platform
–Simplifies the client, so the user doesn’t have to build a custom Kepler with custom actors
–Takes advantage of existing actors for communicating with SOAP services
  WebServices – for handling simple message types
  WSWithComplexType – for handling complex message types
–An extension of the ESMF Web Services
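To illustrate the difference between simple and complex SOAP message types, the sketch below builds two SOAP 1.1 request envelopes with Python's standard xml.etree.ElementTree. It is not the ESMF Web Services interface and does not involve Kepler actors: the application namespace, the operation names (GetStatus, SubmitCase), and the field names are all hypothetical.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:climate-workflow"  # hypothetical application namespace


def _fill(parent, value):
    """Write value under parent: a dict becomes nested elements, anything else text."""
    if isinstance(value, dict):
        for key, child in value.items():
            _fill(ET.SubElement(parent, f"{{{APP_NS}}}{key}"), child)
    else:
        parent.text = str(value)


def build_envelope(operation, payload):
    """Return a SOAP envelope (as a string) wrapping one request element."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = ET.SubElement(body, f"{{{APP_NS}}}{operation}")
    _fill(request, payload)
    return ET.tostring(envelope, encoding="unicode")


# Simple message type: a flat request carrying a single string value.
simple_message = build_envelope("GetStatus", {"caseId": "case001"})

# Complex message type: a request carrying a nested structure.
complex_message = build_envelope(
    "SubmitCase",
    {"caseId": "case001", "parameters": {"resolution": "low", "run_days": 10}},
)

print(simple_message)
print(complex_message)
```

The nested parameters element is the kind of structure that needs complex-type handling on the client side, whereas the flat request carries only a single scalar value.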

Future Steps
An idea of what the new, simplified workflow will look like, utilizing web service actors.

Analysis of Kepler
Pros
–Ease of Use
–Customization
Cons
–WSWithComplexType limited & hard to debug
Suggestions
–Better discussion boards (searchable)

References
[1] Altintas, I., Berkley, C., Jaeger, E., Jones, M., Ludäscher, B., Mock, S., 2004, Kepler: An Extensible System for Design and Execution of Scientific Workflows, 16th Intl. Conf. on Scientific and Statistical Database Management (SSDBM'04), June 2004, Santorini Island, Greece.
[2] Altintas, I., Chin, G., Crawl, D., Critchlow, T., Koop, D., Ligon, J., Ludaescher, B., Mouallem, P., Nagappan, M., Podhorszki, N., Silva, C., Vouk, M., 2007, Provenance in Kepler-based Scientific Workflow Systems. Microsoft e-Science Workshop, poster.
[3] Barton, T., Basney, J., Freeman, T., Scavo, T., Siebenlist, F., Welch, V., Ananthakrishnan, R., Baker, B., Goode, M., and Keahey, K. 2006, Identity Federation and Attribute-based Authorization through the Globus Toolkit, Shibboleth, Gridshib, and MyProxy. 5th Annual PKI R&D Workshop, April
[4] Catlett, C. et al. "TeraGrid: Analysis of Organization, System Architecture, and Middleware Enabling New Types of Applications," HPC and Grids in Action, Ed. Lucio Grandinetti, IOS Press 'Advances in Parallel Computing' series, Amsterdam,
[5] Furlani J. L., "Modules: Providing a Flexible User Environment", Proceedings of the Fifth Large Installation Systems Administration Conference (LISA V), pp , San Diego, CA, September 30 - October 3,
[6] Hill, C., C. DeLuca, V. Balaji, M. Suarez, and A. da Silva, (2004). Architecture of the Earth System Modeling Framework. Computing in Science and Engineering, Volume 6, Number 1.
[7] Klasky, S.; Barreto, R.; Kahn, A.; Parashar, M.; Podhorszki, N.; Parker, S.; Silver, D.; Vouk, M. A., "Collaborative visualization spaces for petascale simulations," Collaborative Technologies and Systems, CTS International Symposium on, vol., no., pp , May 2008
[8] Modules,
[9] SoftEnv,
[10] Vouk, M., Altintas, I., Klasky, S., Ludaescher, B., Silva, C., 2008, On SDM Provenance Framework, SDM Provenance White Paper, V3