GGF Summer School, 24th July 2004, Italy
Part 2: Architecture overview
Professor Carole Goble, University of Manchester


In a nutshell
Bioinformatics toolkit
Open (Web) Services – myGrid components and external domain services
  – Publication, discovery, interoperation, composition, decommissioning of myGrid services
  – No control or influence over domain service providers
Metadata driven
  – LSIDs, common information model, ontologies, Semantic Web technologies
Open extensible architecture
  – Assemble your own components
  – Designed to work together
  – Loosely coupled
[Diagram labels: Freefluo WfEE, Taverna WfDE, View, UDDI registry, Event Notification, mIR, Pedro, Semantic Discovery, Feta, Info. Model, Soaplab, Gowlab, Gateway & CHEF Portal, LSID, Haystack, Provenance Browser]

Key Characteristics
Data intensive, upstream analysis
Pipelines: experiments as workflows (chiefly)
Ad hoc, exploratory, investigative workflows for individuals from no particular a priori community
Openness: the services are not ours
Low activation energy, incremental take-on
Foundations for sharing knowledge and sharing experimental objects
Multiple stakeholders
Collection of components for assembly

Openness
  – open source
  – open world of services
  – open extensible technology
  – open to wider eScience context
  – open to user feedback
  – open to third party metadata

Platform
Standards based (Web) Service Oriented Architecture
  – Publication, discovery, interoperation, composition, decommissioning of myGrid services
  – Web services communication fabric
  – XML document types
  – LSIDs for identifying resources
Implemented in Java using Axis and Tomcat
  – WS-I -> OGSA / WSRF
Metadata driven
  – RDF-coded metadata
  – OWL-coded ontologies
  – Common information model

Stakeholders
[Diagram labels, myGrid users: biologists, IS specialists, infrequent, problem specific, bioinformaticians, tool builders, service provider, systems administrators, bioinformatics tool builders, annotators]
Middleware for: Tool Developers, Bioinformaticians, Service Providers
Biologists are indirectly supported by the portals and apps these develop.

Collections of Tasks
[Diagram labels, tasks: Finding, Description, Service Discovery, Enactment, Building, Workflow, Provenance, Storage, Data Management, Querying, Domain Tasks]
[Diagram labels, roles: Service Providers, Bioinformaticians, Scientists, Annotation providers]

Experimental entities

Investigation = set of experiments + metadata
Experimental design components
Experimental instances that are records of enacted experiments
Experimental glue that groups and links design and instance components
Life Science IDs, URIs, RDF
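
The slide names Life Science Identifiers (LSIDs) as the way experimental entities are identified. As a minimal sketch, assuming the usual URN layout `urn:lsid:<authority>:<namespace>:<object>[:<revision>]`, an LSID can be split into its parts as below. The `LSID` type, the `parse_lsid` helper, and the `example.org` identifiers are illustrative names, not part of myGrid.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    """Parts of a Life Science Identifier (illustrative type, not a myGrid class)."""
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None


def parse_lsid(text: str) -> LSID:
    """Split 'urn:lsid:authority:namespace:object[:revision]' into its parts."""
    parts = text.split(":")
    # The 'urn:lsid' prefix is matched case-insensitively.
    if len(parts) not in (5, 6) or parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {text!r}")
    authority, namespace, object_id = parts[2:5]
    if not (authority and namespace and object_id):
        raise ValueError(f"LSID has an empty component: {text!r}")
    # The revision is optional; an empty trailing field is treated as absent.
    revision = parts[5] if len(parts) == 6 and parts[5] else None
    return LSID(authority, namespace, object_id, revision)


# Hypothetical identifiers for a workflow design and one enacted run of it.
design = parse_lsid("urn:lsid:example.org:workflow:wbs-pipeline")
run = parse_lsid("urn:lsid:example.org:experiment:42:2")
```

Keeping the revision as a separate field lets a design component and its later revisions share an authority, namespace, and object id while remaining distinguishable.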

myGrid Service Stack
[Diagram labels]
Users: Bioinformaticians, Tool Providers, Service Providers
Layers: Applications, Core services, External services
Fabric: Web Service (Grid Service) communication fabric
Components: AMBIT Text Extraction Service, Provenance Mgt, Event Notification Service, e-Science Mediator, Feta Service & WF Discovery, Information Repository, Ontology Mgt, Metadata Store, Taverna Workbench, Haystack, Native Web Services, SoapLab, Web Portal, Legacy apps, UDDI Registries, Ontologies, FreeFluo Workflow Enactment Engine, OGSA-DQP Distributed Query Processor, Views, GowLab, LSID Launch pad, LSID Authority

Service stack
[Diagram labels]
Layers: Apps, Core services, External services
Fabric: Web Service (Grid Service) communication fabric
Apps: Taverna workbench, Web Portal, LSID Launch Pad, Haystack
Core services: e-Science Mediator, e-Science process patterns, Service & workflow discovery, Metadata management, Data management, e-Science event bus, Workflow enactment
External services: AMBIT Text Extraction Service, Native Web Services, SoapLab, GowLab, Legacy apps, Websites

20,000 feet
[Diagram labels: Freefluo Workflow Engine, LSID Authority, UDDI, mIR metadata Store Service, Provenance and Data browser, Haystack or Portal, Web services, local tools, User interaction etc., Taverna Workbench, View Service, Semantic Discovery & Registration, Event Notification Service, mIR data]

e-Science Mediator
1. Application-oriented: directly supports the e-Scientist by:
  – Providing pre-configured e-Science process templates (i.e. system-level workflows)
  – Helping to capture and maintain context information (via the information model) that is relevant to the interpretation and sharing of the results of e-Science experiments
  – Facilitating personalisation and collaboration
2. Middleware-oriented: contributes to the synergy between myGrid services by:
  – Acting as a sink for e-Science events initiated by myGrid components
  – Interpreting the intercepted events and triggering interactions with other related components entailed by the semantics of those events
  – Compensating for possible impedance mismatches with other services, both in terms of data types and interaction protocols

Supporting the e-scientist
Recurring use-cases can be captured
Then corresponding process templates can be authored
e-Science mediator makes processes available to the user
[Diagram] Find Workflow use-case:
  – Find an interesting workflow for experiment
  – Examine and modify if necessary
  – Store to personal repository for later re-use
[Diagram] Find Workflow process:
  – Create exp. context for this user
  – Launch semantic search facility
  – Launch workflow editor for selected WF
  – Enable MIR browser for storage with context

E-Science process templates maintained by the mediator can drive the GUI generation and interaction with the user
[Diagram labels: E-Science Mediator, GUI]

Mediating between services
Example: mediation during a workflow execution
[Diagram participants: E-Science Mediator, MIR, WF Enactor, Notification Service]
1: Execution started
2: Establish experiment/user context
[*]3: Intermediate process completed
[*]4: Link process trace to context
[*]5: Store intermediate process trace
6: Workflow completed
7: Get WF results
8: Store WF results
9: Notify WF completion to subscribers
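
The numbered interaction on this slide can be read as a small event-handling loop. The sketch below is an illustration under stated assumptions, not the myGrid implementation: `Repository` stands in for the MIR, `Notifier` for the notification service, and every class and method name is hypothetical. Comments map each handler back to the step numbers on the slide.

```python
class Repository:
    """In-memory stand-in for the MIR (hypothetical interface)."""

    def __init__(self):
        self.contexts = {}

    def open_context(self, user, experiment):
        key = (user, experiment)
        self.contexts.setdefault(key, {"traces": [], "results": None})
        return key

    def store_trace(self, key, trace):
        self.contexts[key]["traces"].append(trace)

    def store_results(self, key, results):
        self.contexts[key]["results"] = results


class Notifier:
    """Stand-in for the notification service (hypothetical interface)."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def notify(self, message):
        for callback in self.subscribers:
            callback(message)


class Mediator:
    """Receives enactor events and coordinates the repository and notifier."""

    def __init__(self, repository, notifier):
        self.repository = repository
        self.notifier = notifier
        self._context_by_run = {}

    def execution_started(self, run_id, user, experiment):
        # Steps 1-2: on 'execution started', establish the experiment/user context.
        self._context_by_run[run_id] = self.repository.open_context(user, experiment)

    def process_completed(self, run_id, trace):
        # Steps 3-5 (repeated): link each intermediate trace to the context and store it.
        self.repository.store_trace(self._context_by_run[run_id], trace)

    def workflow_completed(self, run_id, fetch_results):
        # Steps 6-9: fetch the results, store them, then notify subscribers.
        key = self._context_by_run.pop(run_id)
        self.repository.store_results(key, fetch_results())
        self.notifier.notify(f"workflow {run_id} completed")


repo, notifier = Repository(), Notifier()
received = []
notifier.subscribe(received.append)
mediator = Mediator(repo, notifier)
mediator.execution_started("run-1", "alice", "exp-1")
mediator.process_completed("run-1", "blast finished")
mediator.workflow_completed("run-1", lambda: {"hits": 3})
```

Routing every event through one object is what lets the enactor, repository, and notification service stay unaware of each other, which is the loose coupling the architecture slides emphasise.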

Simplified Architecture
[Diagram labels]
Client side: GUI (e-science workbench), E-Science Mediator client-stubs, client-side e-science process logic
The Grid: E-Science Mediator Service, server-side e-science process logic, MIR, Service Registry, WF Enactor, Notification Service
Context preserved via myGrid Information Model

Event Notification Service
Publish/subscribe model
  – Topic based (cf. JMS topics, CORBA channels)
  – Hierarchic topics
  – Persistent event storage
  – Subscription leases
  – Federation for scalability & reliability
  – Event filtering
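
Several of the features listed on this slide (hierarchic topics, subscription leases, event filtering, stored events) can be illustrated with a small in-process broker. This is a sketch under assumptions, not the myGrid service: the `EventBroker` class, the `/`-separated topic syntax, and the in-memory `event_log` (standing in for persistent storage) are all hypothetical, and federation is not modelled.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, List, Optional


@dataclass
class Subscription:
    topic: str
    callback: Callable[[str, Any], None]
    expires_at: float
    event_filter: Optional[Callable[[Any], bool]] = None


class EventBroker:
    """Topic-based publish/subscribe with leases and filters (illustrative only)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        self._subscriptions: List[Subscription] = []
        self.event_log: List[tuple] = []  # stand-in for persistent event storage

    def subscribe(self, topic, callback, lease_seconds, event_filter=None):
        sub = Subscription(topic, callback, self._clock() + lease_seconds, event_filter)
        self._subscriptions.append(sub)
        return sub

    def renew(self, sub, lease_seconds):
        # A subscriber keeps its subscription alive by renewing the lease.
        sub.expires_at = self._clock() + lease_seconds

    def publish(self, topic, event):
        self.event_log.append((topic, event))
        now = self._clock()
        # Subscriptions whose lease has run out are dropped before delivery.
        self._subscriptions = [s for s in self._subscriptions if s.expires_at > now]
        for sub in list(self._subscriptions):
            if self._covers(sub.topic, topic) and (
                sub.event_filter is None or sub.event_filter(event)
            ):
                sub.callback(topic, event)

    @staticmethod
    def _covers(subscribed: str, published: str) -> bool:
        # Hierarchic topics: 'workflow' also covers 'workflow/completed'.
        return published == subscribed or published.startswith(subscribed + "/")
```

A subscriber to `workflow` with a ten-second lease receives `workflow/completed` events until the lease lapses, unless it calls `renew` first. Passing a fake `clock` makes lease expiry easy to exercise in tests.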

Portal toolkit for bioinformaticians
Target application
  – Williams-Beuren Syndrome
  – Fixed set of workflows
Extra myGrid portlets
  – Configurable
  – Workflow enactment
  – Workflow scheduling
  – Completion notification
  – Results browsing
Based on CHEF & Jetspeed-1
  – Portlets for team collaboration
[Diagram labels: Portlet Container, Interface, Portlet]

Text Services
[Diagram labels]
Participants: User Client, Workflow Server (Workflow Enactment), Medline Server (Sheffield)
Workflow steps: Initial Workflow, Extract PubMed Id, Get Medline Abstract, Cluster Abstracts, Get Related Abstracts
Data exchanged: Swissprot/Blast record, XScufl workflow definition + parameters, PubMed Ids, Medline Abstracts, Term-annotated Medline abstracts, Clustered PubMed Ids + titles
Medline: pre-processed offline to extract biomedical terms + indexed
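
The steps named on this slide (extract PubMed ids, fetch the Medline abstracts, cluster them) compose naturally as a pipeline. The sketch below only illustrates that composition. The sample record, the in-memory `medline` table, and grouping by shared term are assumptions made for the example; the slide does not say how the Sheffield service clusters abstracts.

```python
import re
from collections import defaultdict


def extract_pubmed_ids(record: str) -> list:
    """Pull PubMed ids out of 'PubMed=<digits>' cross-references in a record."""
    return re.findall(r"PubMed=(\d+)", record)


def get_abstracts(pubmed_ids, medline):
    """Look up term-annotated abstracts; ids missing from the table are skipped."""
    return {pid: medline[pid] for pid in pubmed_ids if pid in medline}


def cluster_by_term(abstracts):
    """Group PubMed ids by annotated term (a stand-in for the real clustering)."""
    clusters = defaultdict(list)
    for pid, entry in abstracts.items():
        for term in entry["terms"]:
            clusters[term].append(pid)
    return dict(clusters)


# Made-up sample data: two cross-reference lines and a tiny pre-indexed table.
record = "RX   PubMed=111;\nRX   PubMed=222;\nRX   PubMed=333;"
medline = {
    "111": {"title": "Abstract A", "terms": ["elastin"]},
    "222": {"title": "Abstract B", "terms": ["elastin", "kinase"]},
}

clusters = cluster_by_term(get_abstracts(extract_pubmed_ids(record), medline))
```

Each function maps onto one workflow step, so in a workflow system each could be replaced by a call to a remote service without changing the overall shape of the pipeline.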

History
Pre-Prototype
  – Experimental
  – Web-based
  – Requirements gathering
  – Architectural workout
Prototype 1
  – All services represented
  – NetBeans workbench
  – API-based integration
  – Info Repository oriented
  – XML-based process provenance
  – Workflow enactment engine
Prototype 2
  – Second generation services
  – Reworked information model
  – Open information management
  – Life Science Identifiers
  – RDF based provenance
  – Taverna workbench
  – Web-based portal
Milestones
  – Demo at ISMB 2003
  – Full paper and demo at ISMB 2004
  – GSK deployment
  – Real biology

Two+ Paths
Core functionality
  – Services: Soaplab and Gowlab
  – Workflow enactment engine: Freefluo
  – Workflow workbench: Taverna
  – Data integration: OGSA-DQP
  – Information model & management
  – Mediator
Innovative work
  – Service and workflow registration
  – Semantic discovery
  – Provenance management
  – Text mining
In between
  – Event notification

myGrid People
Core: Matthew Addis, Nedim Alpdemir, Tim Carver, Rich Cawley, Neil Davis, Alvaro Fernandes, Justin Ferris, Robert Gaizauskas, Kevin Glover, Carole Goble, Chris Greenhalgh, Mark Greenwood, Yikun Guo, Ananth Krishna, Peter Li, Phillip Lord, Darren Marvin, Simon Miles, Luc Moreau, Arijit Mukherjee, Tom Oinn, Juri Papay, Savas Parastatidis, Norman Paton, Terry Payne, Matthew Pocock, Milena Radenkovic, Stefan Rennick-Egglestone, Peter Rice, Martin Senger, Nick Sharman, Robert Stevens, Victor Tan, Anil Wipat, Paul Watson and Chris Wroe.
Users: Simon Pearce and Claire Jennings, Institute of Human Genetics, School of Clinical Medical Sciences, University of Newcastle, UK; Hannah Tipney, May Tassabehji, Andy Brass, St Mary's Hospital, Manchester, UK; Steve Kemp, Liverpool, UK
Postgraduates: Martin Szomszor, Duncan Hull, Jun Zhao, Pinar Alper, John Dickman, Keith Flanagan, Antoon Goderis, Tracy Craddock, Alastair Hampshire
Industrial: Dennis Quan, Sean Martin, Michael Niemi, Syd Chapman (IBM); Robin McEntire (GSK)
Collaborators: Keith Decker

Collaboration

Publications
R. Stevens, H.J. Tipney, C. Wroe, T. Oinn, M. Senger, P. Lord, C.A. Goble, A. Brass and M. Tassabehji. Exploring Williams-Beuren Syndrome Using myGrid. To appear in Proceedings of 12th International Conference on Intelligent Systems in Molecular Biology, 31st Jul-4th Aug 2004, Glasgow, UK.
C.A. Goble, S. Pettifer, R. Stevens and C. Greenhalgh. Knowledge Integration: In silico Experiments in Bioinformatics. In The Grid: Blueprint for a New Computing Infrastructure, Second Edition, eds. Ian Foster and Carl Kesselman, Morgan Kaufmann, November 2003.
R. Stevens, A. Robinson, and C.A. Goble. myGrid: Personalised Bioinformatics on the Information Grid. In Proceedings of 11th International Conference on Intelligent Systems in Molecular Biology, 29th June–3rd July 2003, Brisbane, Australia, published Bioinformatics Vol. 19 Suppl , pp
