CIPRes in Kepler: An integrative workflow package for streamlining phylogenetic data analyses

Presentation transcript:

biology.sdsc.edu
CIPRes in Kepler: An integrative workflow package for streamlining phylogenetic data analyses
Zhijie Guan (1), Alex Borchers (1), Timothy McPhillips (2), Shirley Cohen (3), Mark A. Miller (1), Ilkay Altintas (1)
(1) San Diego Supercomputer Center, UCSD
(2) University of California, Davis
(3) University of Pennsylvania

What is a Scientific Workflow?
- Combination of
  - data integration, analysis, and visualization steps
  - into a larger, automated "scientific process"
- Mission of scientific workflow systems
  - Promote "scientific discovery" by providing tools and methods to generate scientific workflows
  - Create an extensible and customizable graphical user interface for scientists from different scientific domains
  - Support computational experiment creation, execution, sharing, reuse, and provenance
  - Design frameworks that define efficient ways to connect to existing data and to integrate heterogeneous data from multiple resources
  - Make technology useful through the user's monitor!

Promoter Identification Workflow
Source: Matt Coleman (LLNL)

A Workflow for Phylogeny Analysis

Kepler is a Scientific Workflow System
- ... and a cross-project collaboration
- June 2, 2006: Beta release
- Builds upon the open-source Ptolemy II framework
- Ptolemy II: A software system used for prototyping engineering systems
- KEPLER: A platform to design and execute scientific workflows
- KEPLER = "Ptolemy II + X" for scientific workflows

Some Kepler Contributors
- Ptolemy II
- Resurgence
- Griddles
- SRB
- LOOKING
- SKIDL
- NLADR
Contributor names and funding info are at the Kepler website!
Other contributors:
- Chesire (UK Text Mining Center)
- DART (Great Barrier Reef, Australia)
- National Digital Archives + UCSD-TV (US)
- …

A co-development in KEPLER: GEON Dataset Generation & Registration
Diagram labels:
- SQL database access (JDBC)
- % Makefile $> ant run

Phylogeny Analysis Workflows
Diagram labels:
- Local Disk
- Multiple Sequence Alignment
- Phylogeny Analysis
- Tree Visualization

Kepler Workflow: Actors
Actor-Oriented Design
- Actor
  - Encapsulation of parameterized actions
  - Interface defined by ports and parameters
- Port
  - Communication between input and output data
  - The place where data get in/out
- Model of computation
  - Flow of control
  - Sequential / parallel execution
  - Implementation is a framework
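The three ideas on this slide (actor, port, model of computation) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Kepler's or Ptolemy II's actual API: the class names `Port`, `Actor`, and `SequentialDirector`, and the toy `scale` and `shift` actors, are all hypothetical.

```python
from collections import deque


class Port:
    """A named token queue; stands in for an actor's input or output port."""

    def __init__(self, name):
        self.name = name
        self.tokens = deque()
        self.sinks = []  # downstream ports reached through a channel

    def connect(self, other):
        # A "channel" here is just a reference to the receiving port.
        self.sinks.append(other)

    def send(self, token):
        for sink in self.sinks:
            sink.tokens.append(token)


class Actor:
    """Encapsulates one parameterized action behind ports and parameters."""

    def __init__(self, name, action, **parameters):
        self.name = name
        self.action = action
        self.parameters = parameters
        self.input = Port(f"{name}.in")
        self.output = Port(f"{name}.out")

    def fire(self):
        # Consume one input token, apply the action, emit the result.
        token = self.input.tokens.popleft()
        self.output.send(self.action(token, **self.parameters))


class SequentialDirector:
    """Hypothetical model of computation: fire actors in order while any has input."""

    def __init__(self, actors):
        self.actors = actors

    def run(self):
        progressed = True
        while progressed:
            progressed = False
            for actor in self.actors:
                if actor.input.tokens:
                    actor.fire()
                    progressed = True


# Two toy actors wired together; the actors know nothing about scheduling.
scale = Actor("scale", lambda x, factor: x * factor, factor=3)
shift = Actor("shift", lambda x, offset: x + offset, offset=1)
sink = Port("result")
scale.output.connect(shift.input)
shift.output.connect(sink)

scale.input.tokens.extend([1, 2])
SequentialDirector([scale, shift]).run()
results = list(sink.tokens)
print(results)
```

The point of the separation is that the actors only declare ports and parameters, while the director decides the flow of control, so the same actors could in principle be driven by a different (for example parallel) director.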

CIPRes Workflow: Actors
Input Port:
- Nexus File Content
Output Ports:
- Data Matrix
- Tree
- Taxa Info
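As a rough illustration of what such an actor does, the sketch below splits Nexus file content into the three outputs named on the slide. It is not the CIPRes actor's implementation: the function `split_nexus`, the port names, and the block-to-port mapping are assumptions, and the regular expression ignores Nexus comments and quoted tokens that a real parser must handle.

```python
import re

# Matches "begin <name>; ... end;" blocks, case-insensitively.
BLOCK_RE = re.compile(r"begin\s+(\w+)\s*;(.*?)\bend\s*;", re.IGNORECASE | re.DOTALL)

# Hypothetical mapping from Nexus block names to the slide's output ports.
PORT_FOR_BLOCK = {
    "taxa": "taxa_info",
    "characters": "data_matrix",
    "data": "data_matrix",
    "trees": "tree",
}


def split_nexus(content):
    """Return a dict with one entry per output port (None if the block is absent)."""
    if not content.lstrip().upper().startswith("#NEXUS"):
        raise ValueError("not a Nexus file: missing #NEXUS header")
    ports = {"taxa_info": None, "data_matrix": None, "tree": None}
    for name, body in BLOCK_RE.findall(content):
        port = PORT_FOR_BLOCK.get(name.lower())
        if port is not None:
            ports[port] = body.strip()
    return ports


SAMPLE = """#NEXUS
begin taxa;
  dimensions ntax=2;
  taxlabels A B;
end;
begin characters;
  dimensions nchar=4;
  format datatype=dna;
  matrix
    A ACGT
    B ACGA
  ;
end;
begin trees;
  tree t1 = (A,B);
end;
"""

ports = split_nexus(SAMPLE)
print(ports["tree"])
```

Downstream actors can then subscribe only to the output they need, for example a tree viewer reading just the `tree` entry.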

Some actors in place for…
- Generic Web Service Client and Web Service Harvester
- Customizable RDBMS query and update
- Command Line wrapper tools (local, ssh, scp, ftp, etc.)
- Some Grid actors: Globus Job Runner, GridFTP-based file access, Proxy Certificate Generator
- SRB support
- Native R and Matlab support
- Interaction with Nimrod and APST
- Communication with ORBs through actors and services
- Imaging, Gridding, Vis Support
- Textual and Graphical Output
- …more generic and domain-oriented actors…
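A local command-line wrapper is the simplest of these to picture: take input, run an external program, and pass its output on. The following is a minimal sketch using Python's standard `subprocess` module, not Kepler's actual wrapper actor. The helper name `run_command` is hypothetical, and a Python one-liner stands in for the external tool so that no real ClustalW or PAUP options have to be assumed.

```python
import subprocess
import sys


def run_command(argv, stdin_text="", timeout=60):
    """Run a local program, feed it stdin_text, and return its standard output.

    Raises subprocess.CalledProcessError if the program exits non-zero.
    """
    completed = subprocess.run(
        argv,
        input=stdin_text,
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return completed.stdout


# Stand-in for an external tool: upper-cases whatever arrives on stdin.
fake_tool = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]
output = run_command(fake_tool, stdin_text="acgt\n")
print(output.strip())
```

Passing the command as an argument list rather than a shell string avoids quoting problems when file names from an upstream step contain spaces.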

CIPRes Workflow
Diagram callouts:
- Choose the input file
- Run ClustalW
- Get the subset of the aligned sequences
- Run PAUP for Tree Inference
- Read the tree
- Parse the tree
- Display the tree
- Channel: Convey the data
- GUIGen: Parameter Setting
- Actor:
- Results:

CIPRes Workflows: Demo
- Read Sequences
- Multiple Sequence Alignment
- Display the Alignment
- Matrix Alignment
- Tree Inference
- Consensus Tree
- Tree Visualization

Summary
- Kepler is good at:
  - Integrating data, programs, and computing resources
  - Capturing your ideas and realizing them
  - Supporting computational experiment creation, execution, sharing, and reuse
  - Quickly prototyping scientific workflows
  - Building streamlining applications
- Visual programming language
  - Don't write your application, "draw"/compose it
- The Cipres-Kepler package can be used to build scientific workflows for phylogenetic data analyses

Future Work
- Cipres-Kepler can help you
- There is (always) a lot more to work on:
  - More actors for phylogeny analyses
  - Automatically generating actors based on CORBA services
  - Database (TreeBase) support to store large amounts of data
  - More computing power for large dataset processing
- We need your collaboration:
  - Sharing experiences
  - Teaching each other the domain knowledge
  - Locating a specific problem and solving it

Questions?
Zhijie Guan
Cipres-Kepler Release: ftp://ftp.sdsc.edu/outgoing/borchers/cipresReleases/ /cipresKepler_Dist.tgz