Grid Rapid Application Virtualization Interface (gRAVI) - Service Oriented Science Ravi K Madduri, Argonne National Laboratory/ University of Chicago Joshua.

Slides:



Advertisements
Similar presentations
Open Grid Forum 19 January 31, 2007 Chapel Hill, NC Stephen Langella Ohio State University Grid Authentication and Authorization with.
Advertisements

Abstraction Layers Why do we need them? –Protection against change Where in the hourglass do we put them? –Computer Scientist perspective Expose low-level.
Creating and Sharing Re-usable Workflows in Cardiovascular Research: Lessons learned using Taverna Ravi Madduri University.
CVRG Presenter Disclosure Information Joel Saltz MD, PhD Director Comprehensive Informatics Center Emory University Translational Research Informatics.
CVRG Presenter Disclosure Information Tahsin Kurc, PhD Center for Comprehensive Informatics Emory University CardioVascular Research Grid Core Infrastructure.
Ian Foster Computation Institute Argonne National Lab & University of Chicago Services for Science.
Principles of Personalisation of Service Discovery Electronics and Computer Science, University of Southampton myGrid UK e-Science Project Juri Papay,
Hydra Partners Meeting March 2012 Bill Branan DuraCloud Technical Lead.
Ian Foster Computation Institute Argonne National Lab & University of Chicago Education in the Science 2.0 Era.
Building Scientific Workflows with Taverna and BPEL: a Comparative Study in caGrid Wei Tan 1, Paolo Missier 2, Ravi Madduri 1, Ian Foster 1 1 University.
The Grid Background and Architecture. 1. Keys to success for IT technologies Infrastructure Open Standards.
GeWorkbench caGrid TeraGrid Integration Scott Oster Ohio State University – Dept. of Biomedical Informatics Christine Hung Columbia University – JCSB/C2B2.
CaGrid Service Metadata Scott Oster - Ohio State
CompuNet Grid Computing Milena Natanov Keren Kotlovsky Project Supervisor: Zvika Berkovich Lab Chief Engineer: Dr. Ilana David Spring, /
Mike Smorul Saurabh Channan Digital Preservation and Archiving at the Institute for Advanced Computer Studies University of Maryland, College Park.
Passage Three Introduction to Microsoft SQL Server 2000.
Web-based Portal for Discovery, Retrieval and Visualization of Earth Science Datasets in Grid Environment Zhenping (Jane) Liu.
Plan Introduction What is Cloud Computing?
Building Data-intensive Pipelines Ravi K Madduri Argonne National Lab University of Chicago.
Technical Introduction to caGrid Service Development caGrid 1.3 Justin Permar caGrid Knowledge Center
OpenMDR: Generating Semantically Annotated Grid Services Rakesh Dhaval Shannon Hastings.
By Mihir Joshi Nikhil Dixit Limaye Pallavi Bhide Payal Godse.
State of Service Oriented Science Tools Open Source Grid Cluster Conference Oakland.
Cancer Bioinformatics Grid (caBIG) CANS 2006 Chicago, Illinois Shannon Hastings Department of Biomedical Informatics Ohio State University.
Department of Biomedical Informatics Service Oriented Bioscience Cluster at OSC Umit V. Catalyurek Associate Professor Dept. of Biomedical Informatics.
GT Components. Globus Toolkit A “toolkit” of services and packages for creating the basic grid computing infrastructure Higher level tools added to this.
material assembled from the web pages at
BIRN Update Carl Kesselman Professor of Industrial and Systems Engineering Information Sciences Institute Fellow Viterbi School of Engineering University.
Using the Open Metadata Registry (openMDR) to create Data Sharing Interfaces October 14 th, 2010 David Ervin & Rakesh Dhaval, Center for IT Innovations.
Taverna and my Grid Open Workflow for Life Sciences Tom Oinn
Large Scale Sky Computing Applications with Nimbus Pierre Riteau Université de Rennes 1, IRISA INRIA Rennes – Bretagne Atlantique Rennes, France
Building and Running caGrid Workflows in Taverna 1 Computation Institute, University of Chicago and Argonne National Laboratory, Chicago, IL, USA 2 Mathematics.
CaBIG Workflow University of Chicago, USA University of Manchester, UK.
Middleware Support for Virtual Organizations Internet 2 Fall 2006 Member Meeting Chicago, Illinois Stephen Langella Department of.
The Future of the iPlant Cyberinfrastructure: Coming Attractions.
Shannon Hastings Multiscale Computing Laboratory Department of Biomedical Informatics.
Grid Trust Service (GTS). Problem How does the grid clients/services know which CA certificates to trust? Should I trust this CA?
Ashish Sharma, Tony Pan, Barla Cambazoglu, Joel Saltz Ohio State University, Columbus, OH (ashish, tpan, October 10, 2007 caBIG In Vivo.
Grid Execution Management for Legacy Code Applications Grid Enabling Legacy Code Applications Tamas Kiss Centre for Parallel.
Introduce Grid Service Authoring Toolkit Shannon Hastings, Scott Oster, Stephen Langella, David Ervin Ohio State University Software Research Institute.
NA-MIC National Alliance for Medical Image Computing UCSD: Engineering Core 2 Portal and Grid Infrastructure.
Grid Services I - Concepts
Technology behind using Taverna in caGrid caGrid user meeting Stian Soiland-Reyes, myGrid University of Manchester, UK
Presented by Scientific Annotation Middleware Software infrastructure to support rich scientific records and the processes that produce them Jens Schwidder.
GO-ESSP Workshop, LLNL, Livermore, CA, Jun 19-21, 2006, Center for ATmosphere sciences and Earthquake Researches Construction of e-science Environment.
CaGrid Overview and Core Services caGrid Knowledge Center February 2011.
GRID Overview Internet2 Member Meeting Spring 2003 Sandra Redman Information Technology and Systems Center and Information Technology Research Center National.
ACGT: Open Grid Services for Improving Medical Knowledge Discovery Stelios G. Sfakianakis, FORTH.
Presented by Jens Schwidder Tara D. Gibson James D. Myers Computing & Computational Sciences Directorate Oak Ridge National Laboratory Scientific Annotation.
Grid Execution Management for Legacy Code Applications Grid Enabling Legacy Applications.
Cole David Ronnie Julio. Introduction Globus is A community of users and developers who collaborate on the use and development of open source software,
1 Service Creation, Advertisement and Discovery Including caCORE SDK and ISO21090 William Stephens Operations Manager caGrid Knowledge Center February.
A Technical Overview Bill Branan DuraCloud Technical Lead.
In Vivo Imaging Middleware and Applications RSNA 2007 Berkant Barla Cambazoglu The Ohio State University Department of Biomedical Informatics.
Securing the Grid & other Middleware Challenges Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer.
Globus.org/genomics Globus Galaxies Science Gateways as a Service Ravi K Madduri, University of Chicago and Argonne National Laboratory
CEDPS Services Area Update CEDPS Face-to-Face Meeting ANL October 2007.
CaGrid 1.0 Security Infrastructure Stephen Langella, Scott Oster, Shannon Hastings, David Ervin, Joshua Phillips, Vinay Kumar, Tahsin Kurc, Joel Saltz.
Holding slide prior to starting show. Lessons Learned from the GECEM Portal David Walker Cardiff University
Ian Foster Computation Institute Argonne National Lab & University of Chicago Application Hosting Services — Enabling Science 2.0 —
Grid Execution Management for Legacy Code Architecture Exposing legacy applications as Grid services: the GEMLCA approach Centre.
Tony Pan, Stephen Langella, Shannon Hastings, Scott Oster, Ashish Sharma, Metin Gurcan, Tahsin Kurc, Joel Saltz Department of Biomedical Informatics The.
TOWARDS AN ARCHITECTURE FOR NATIONAL DATA SERVICES Ian Foster Director, Computation Institute Argonne National Laboratory & The University of
Cancer Bioinformatics Grid (caBIG) CANS 2006 Chicago, Illinois
Enhancements to Galaxy for delivering on NIH Commons
Ravi K Madduri, Argonne National Laboratory/ University of Chicago
Tools and Services Workshop
Joslynn Lee – Data Science Educator
Study course: “Computing clusters, grids and clouds” Andrey Y. Shevel
Presentation transcript:

Grid Rapid Application Virtualization Interface (gRAVI) - Service Oriented Science Ravi K Madduri, Argonne National Laboratory/ University of Chicago Joshua Boverhof, Lawrence Berkeley Laboratory Kyle Chard, University of Wellington, NZ Dinanath Sulakhe, University of Chicago Cem Onyukusel, CMU Ian Foster, University of Chicago/ANL

Agenda l What is Service Oriented Science ? u Software as a service u Examples from industry (google, salesforce.com etc) u Examples of SoS from Scientific communities l Why are we excited about this ? u How this helps Science l Sum of parts greater u Sustainable infrastructure u Virtualization l Different aspects of SoS u Authoring u Deploying and Securing u Discovery u Provisioning on demand u Composition l How does it help you ? u End users perspective – Brian’s talk u Bio workflow l Conclusions and Further resources u Q & A

Examples of SaaS from Industry l Salesforce.com l Google Calendar, Google docs, Google spell check etc.. l Amazon S3, EC2 l Flickr l Facebook l Microsoft Live l And on and on….

Service-Oriented Science l People create services (data or functions) … l which I discover (& decide whether to use) … l & compose to create a new function... l & then publish as a new service.  I find “someone else” to host services, so I don’t have to become an expert in operating services & computers!  I hope that this “someone else” can manage security, reliability, scalability, … !! “Service-Oriented Science”, Science, 2005

Creating Services People create services (data, code, instr.) … which I discover (& decide whether to use) … & compose to create a new function... & then publish as a new service.  I find “someone else” to host services, so I don’t have to become an expert in operating services & computers!  I hope that this “someone else” can manage security, reliability, scalability, … !! “Service-Oriented Science”, Science, 2005

Creating Services Introduce + gRAVI Shannon Hastings Scott Oster David Ervin Stephen Langella Joshua Boverhof Kyle Chard Ravi Madduri

Introduce Overview l A framework which enables fast and easy creation of Globus based grid services l Provide easy to use graphical service authoring tool. l Hide all “grid-ness” from the developer l Utilize best practice layered grid service architecture l Integration with other core grid services and architecture components u GAARDS Security Infrastructure (Dorian, GridGrouper, CSM) u Globus Index Service u Global Model Exchange (GME) u Cancer Data Standards Repository l Extension Framework for integrating with other architecture components

Inside the Introduce created service l Services have many moving and configurable parts which support features such as: u Advertisement u Discovery u Invocation u Security (Authentication/Authorization) u Stateful Resources l The Introduce Toolkit can keep all these features in sync as the developer creates and modifies the grid service

Introduce Features l Supports modification of operations u Adding operations u Removing Operations u Updating Operations u Importing Operations l Graphical Configuration u Advertisement u Security u Service Metadata Specification u Service Metadata Editing u Service Configuration Properties l Auto Generates Code for Service l Auto generates a client API for service. l Graphical Deployment of Service u Globus u Tomcat u JBoss

Appln Service Create Index service Store Repository Service Advertize Discover Invoke; get results Introduce Container Transfer GAR Deploy gRAVI l Grid Remote Application Virtualization Interface l Builds on Introduce u Define service u Create skeleton u Discover types u Add operations u Configure security l Wrap arbitrary executables

Discovering Services People create services (data or functions) … which I discover (& decide whether to use) … & compose to create a new function... & then publish as a new service.  I find “someone else” to host services, so I don’t have to become an expert in operating services & computers!  I hope that this “someone else” can manage security, reliability, scalability, … !! “Service-Oriented Science”, Science, 2005

Discovering Services WS-MDS+ Taverna Laura Pearlman Mike Darcy myGrid Team

 The ultimate arbiter?  Types, ontologies  Can I use it?  Billions of services Discovering Services Assume success Semantics Permissions Reputation A B

Discovery (1): Registries Globus

Composing Services People create services (data or functions) … which I discover (& decide whether to use) … & compose to create a new function... & then publish as a new service.  I find “someone else” to host services, so I don’t have to become an expert in operating services & computers!  I hope that this “someone else” can manage security, reliability, scalability, … !! “Service-Oriented Science”, Science, 2005

Taverna caGrid Scavenger with semantic/metadata based caGrid service query A sample caGrid workflow

Hosting Services People create services (data or functions) … which I discover (& decide whether to use) … & compose to create a new function... & then publish as a new service.  I find “someone else” to host services, so I don’t have to become an expert in operating services & computers!  I hope that this “someone else” can manage security, reliability, scalability, … !! “Service-Oriented Science”, Science, 2005

Provisioning Services WS-GRAM + VWS Martin Feller Stuart Martin Kate Keahey Tim Freeman Joshua Boverhof

Provisioning using WS-GRAM l gRAVI uses JSDL for Application Description l Generates a method on the generated service called GRAM l Generates implementation that creates a GramJob from Application Description l Uses the bootstrapped community credential to run the application as a grid job l Used widely in realizing the usecases from caBIG community

Repository Service Using VWS/Cloud Computing l gRAVI service u Wrap the application as Service u Create the GAR and put in repository l Provision resources u Use clouds (VMM) l start service u Transfer gar, deploy l Index Service u Grid service registers itself nimbusstratus Index service Transfer, deploy Register service Create VM provision discovery portal GAR

Cloud Computing

SoS Deployments

cancer Biomedical Informatics Grid (caBIG™) l A national, NCI-funded program with participants from cancer centers and research institutions u Improve cancer research and accelerate efforts to find a cure for cancer u Make cancer research data and tools (maintained at different sites) more efficiently accessible to researchers u Create scalable, federated, actively managed biomedical informatics network that will connect members of the cancer research enterprise u Informatics support for basic, clinical, and translational research u “National Standards, Local Management” * l caBIG community u 50+ Cancer Centers, 30+ Organizations, over 900 participants

Example Scenario Location A Microarray, Protein, Image data Location B Microarray, Protein, Image data Location C Microarray, Protein, Image data Location C Image Analysis Location D Image Analysis Microarray and protein databases at other institutions caGrid Service Interfaces caGrid Environment Registered Object Definitions Advertisement Log on, Grid credentials Query and Analysis Workflow Discovery

Users Experience l Early Adopters u Love it ! u 45 services from 50+ cancer centers (and growing) l Some services are lot more popular than others l Bio-informatics group at Argonne – Details in subsequent slides u APS plans to pursue this model – Brian Tieman will talk more about this l End users u caBIG hack-a-thon participants were happy about workflows u Cancer Centers l Not too happy right now l Would like to see some RoI

Integration of caBIG-geWorkbench-Teragrid

caGrid Security (GTS, Grid Grouper, Dorian, CDS) php?title=GAARDS:Main

Transposon Workflow Details l Transposons are small mutant sequences of DNA that move between positions within the genome of a cell l Gene functions can be identified by inserting mutant sequences into a genome and analysing where the sequence resides (between genes or on a gene) and how it affects its phenotype l BLAST is used to compare the genome after insertion against the original genome u This identifies where the transposon resides

Realizing the workflow Format DB BLAST Create Report Create a fasta file representing the Genome sequences Compare these sequences against the original genome Find Transposo n Find sequences that have the given transposon Neighbouring Genes Neighbouring Genes BLAST Misses BLAST Misses BLAST Hits BLAST Hits Compile a report summarising where the transposon was inserted and results of the BLAST search Loop Until there are no misses or all genomes have been searched Genome Sequences

Center for Enabling Distributed Petascale Science Workflow Automation at DOE Facilities Automation Reproducibility Security Reusability Storage Metadata Analysis Visualization Advanced Photon Source

Lessons Learned l Nice higher level abstraction u Suitable for a subset of scientific computing usecases l Sustainable infrastructure l Work well with hospital/cancer center/APS IT infrastructure l Workflows l Scalability l Provenance l No Vendor lock-in (if you are careful)

Further Resources l Further Details on gRAVI u l Further Details on Introduce u l Details on Taverna + gRAVI u rvices_in_Taverna rvices_in_Taverna l gRAVI users mailing list u