Published by Jasper Moody. Modified over 9 years ago.
1
GGF Summer School 24th July 2004, Italy
Part 2: Architecture overview
Professor Carole Goble, University of Manchester
http://www.mygrid.org.uk
2
In a nutshell
Bioinformatics toolkit
Open (Web) Services
–myGrid components and external domain services
–Publication, discovery, interoperation, composition, decommissioning of myGrid services
–No control or influence over domain service providers
Metadata Driven
–LSIDs, Common information model, Ontologies, Semantic Web technologies
Open extensible architecture
–Assemble your own components
–Designed to work together
–Loosely coupled
[Diagram labels: Freefluo WfEE, Taverna WfDE, View, UDDI registry, Event Notification, mIR, Pedro, Semantic Discovery, Feta, Info. Model, Soaplab, Gowlab, Gateway & CHEF Portal, LSID, Haystack, Provenance Browser]
3
Key Characteristics
Data intensive, upstream analysis
Pipelines: experiments as workflows (chiefly)
Ad hoc, exploratory, investigative workflows for individuals from no particular a priori community
Openness: the services are not ours
Low activation energy, incremental take-on
Foundations for sharing knowledge and sharing experimental objects
Multiple stakeholders
Collection of components for assembly
4
Openness
–open source
–open world of services
–open extensible technology
–open to wider eScience context
–open to user feedback
–open to third party metadata
5
Platform
Standards based (Web) Service Oriented Architecture
–Publication, discovery, interoperation, composition, decommissioning of myGrid services
–Web services communication fabric
–XML document types
–LSIDs for identifying resources
Implemented in Java using Axis and Tomcat
–WS-I -> OGSA / WSRF
Metadata driven
–RDF-coded metadata
–OWL-coded ontologies
–Common information model
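To make "LSIDs for identifying resources" concrete, here is a minimal Python sketch that splits a Life Science Identifier into its parts. It assumes the usual LSID URN layout, urn:lsid:authority:namespace:object with an optional trailing revision. The function name, the tuple type, and the example identifiers are illustrative only and are not taken from the myGrid code base (which, per the slide, is written in Java).

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    """Components of a Life Science Identifier (hypothetical helper type)."""
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None


def parse_lsid(text: str) -> LSID:
    """Split an LSID URN into its components; raise ValueError if malformed."""
    parts = text.split(":")
    if len(parts) not in (5, 6):
        raise ValueError(f"expected 5 or 6 colon-separated fields: {text!r}")
    # The "urn:lsid" prefix is compared case-insensitively.
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID URN: {text!r}")
    if any(field == "" for field in parts[2:5]):
        raise ValueError(f"authority, namespace and object must be non-empty: {text!r}")
    # The revision is optional; an empty trailing field is treated as absent.
    revision = parts[5] if len(parts) == 6 and parts[5] else None
    return LSID(parts[2], parts[3], parts[4], revision)
```

For example, parse_lsid("urn:lsid:example.org:workflow:42:1") yields the authority "example.org", namespace "workflow", object "42" and revision "1". The authority and namespace here are made up for illustration.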
6
Stakeholders
[Diagram labels: myGrid users, biologists, IS specialists, infrequent, problem specific, bioinformaticians, tool builders, service provider, systems administrators, bioinformatics tool builders, annotators]
Middleware for
–Tool Developers
–Bioinformaticians
–Service Providers
Biologists are indirectly supported by the portals and apps these develop.
7
Collections of Tasks
[Diagram labels: Finding, Description, Service Discovery, Enactment, Building, Workflow, Provenance, Storage, Data Management, Querying, Domain Tasks, Service Providers, Bioinformaticians, Scientists, Annotation providers]
8
Experimental entities
9
Investigation = set of experiments + metadata
Experimental design components
Experimental instances that are records of enacted experiments
Experimental glue that groups and links design and instance components
Life Science IDs, URIs, RDF
10
myGrid Service Stack
[Diagram: Applications, Core services and External services, joined by a Web Service (Grid Service) communication fabric and used by Bioinformaticians, Tool Providers and Service Providers. Labels: Taverna Workbench, Haystack, Web Portal, LSID Launch pad, Views, e-Science Mediator, Feta Service & WF Discovery, Information Repository, Metadata Store, Ontology Mgt, Provenance Mgt, Event Notification Service, FreeFluo Workflow Enactment Engine, OGSA-DQP Distributed Query Processor, LSID Authority, UDDI Registries, Ontologies, AMBIT Text Extraction Service, Native Web Services, SoapLab, GowLab, Legacy apps]
11
Service stack
[Diagram: Apps, Core services and External services over a Web Service (Grid Service) communication fabric. Labels: Taverna workbench, Web Portal, LSID Launch Pad, Haystack, e-Science Mediator, e-Science process patterns, Service & workflow discovery, Metadata management, Data management, e-Science event bus, Workflow enactment, AMBIT Text Extraction Service, Native Web Services, SoapLab, GowLab, Legacy apps, Websites]
12
20,000 feet
[Diagram labels: Taverna Workbench, Freefluo Workflow Engine, Web services, local tools, User interaction etc., LSID Authority, UDDI, View Service, Semantic Discovery & Registration, Event Notification Service, mIR data, mIR metadata Store, Service Provenance and Data browser, Haystack or Portal]
13
e-Science Mediator
1. Application-oriented: directly supports the e-Scientist by:
–Providing pre-configured e-Science process templates (i.e. system-level workflows)
–Helping to capture and maintain context information (via the information model) that is relevant to interpreting and sharing the results of e-Science experiments
–Facilitating personalisation and collaboration
2. Middleware-oriented: contributes to the synergy between myGrid services by:
–Acting as a sink for e-Science events initiated by myGrid components
–Interpreting the intercepted events and triggering the interactions with other related components that the semantics of those events entail
–Compensating for possible impedance mismatches with other services, both in data types and in interaction protocols
14
Supporting the e-scientist
Recurring use-cases can be captured
Then corresponding process templates can be authored
e-Science mediator makes processes available to the user
[Diagram: the "Find Workflow" use-case alongside the "Find Workflow" process. Labels: Find an interesting workflow for experiment, Examine and modify if necessary, Store to personal repository for later re-use, Create exp. context for this user, Launch semantic search facility, Launch workflow editor for selected WF, Enable MIR browser for storage with context]
15
E-Science process templates maintained by the mediator can drive GUI generation and interaction with the user
[Diagram labels: E-Science Mediator, GUI]
16
Mediating between services
Example: mediation during a workflow execution
Participants: WF Enactor, E-Science Mediator, MIR, Notification Service
1: Execution started
2: Establish experiment/user context
[*]3: Intermediate process completed
[*]4: Link process trace to context
[*]5: Store intermediate process trace
6: Workflow completed
7: Get WF results
8: Store WF results
9: Notify WF completion to subscribers
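The nine-step exchange can be sketched in Python as a mediator that sinks enactor events, stores traces and results against an experiment context, and announces completion. This is a minimal sketch under stated assumptions: the class names, method names, the topic string and the in-memory repository are all hypothetical stand-ins, not the myGrid mediator, MIR or notification service interfaces. The slide does not say which component sends each message, so the mapping of steps to methods is an interpretation.

```python
from collections import defaultdict


class InMemoryRepository:
    """Hypothetical stand-in for the MIR: items are grouped by context."""

    def __init__(self):
        self.items = defaultdict(list)

    def store(self, context, kind, payload):
        self.items[context].append((kind, payload))


class Mediator:
    """Sinks workflow events and triggers the follow-up interactions."""

    def __init__(self, repository, notify):
        self.repository = repository
        self.notify = notify  # assumed callable taking (topic, message)
        self.context = None

    def on_execution_started(self, user, experiment):
        # Steps 1-2: the enactor reports a start; establish the context.
        self.context = (user, experiment)

    def on_process_completed(self, trace):
        # Steps 3-5 (repeated): link the trace to the context and store it.
        self.repository.store(self.context, "trace", trace)

    def on_workflow_completed(self, fetch_results):
        # Steps 6-8: fetch the workflow results and store them.
        results = fetch_results()
        self.repository.store(self.context, "results", results)
        # Step 9: tell subscribers that the workflow has finished.
        self.notify("workflow/completed", {"context": self.context})
```

A caller would construct the mediator with a repository and a notification callback, then invoke the three handlers in the order the enactor raises the events. Keeping the context inside the mediator is one possible reading of "Establish experiment/user context"; a real deployment would need one context per running workflow.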
17
Simplified Architecture
Context preserved via myGrid Information Model
[Diagram: Client Side and The Grid. Client side: GUI (e-science workbench), E-Science Mediator client-stubs, client-side e-science process logic. Grid side: E-Science Mediator Service, server-side e-science process logic, MIR, Service Registry, WF Enactor, Notification Service]
18
Event notification Service
Publish/subscribe model
–Topic based (cf. JMS topics, CORBA channels)
–Hierarchic topics
–Persistent event storage
–Subscription leases
–Federation for scalability & reliability
–Event filtering
http://cvs.mygrid.org.uk/notification-stable/downloads
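Three of the listed features (hierarchic topics, subscription leases and event filtering) can be illustrated with a small in-process Python sketch. It is not the myGrid notification service API: the class, its methods, and the "/"-separated topic syntax are assumptions made for illustration, and persistent storage and federation are left out.

```python
import time


class NotificationService:
    """Toy topic-based publish/subscribe broker (illustrative only)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so leases can be tested
        self._subs = []      # (topic_prefix, callback, predicate, expires_at)

    def subscribe(self, topic, callback, lease_seconds, predicate=None):
        """Subscribe to a topic and all of its sub-topics for a limited time."""
        expires_at = self._clock() + lease_seconds
        self._subs.append((tuple(topic.split("/")), callback, predicate, expires_at))

    def publish(self, topic, event):
        """Deliver an event to matching live subscriptions; return the count."""
        now = self._clock()
        # Subscription leases: drop anything whose lease has run out.
        self._subs = [sub for sub in self._subs if sub[3] > now]
        path = tuple(topic.split("/"))
        delivered = 0
        for prefix, callback, predicate, _ in self._subs:
            # Hierarchic topics: a subscription matches its own topic
            # and every topic beneath it.
            if path[:len(prefix)] != prefix:
                continue
            # Event filtering: an optional predicate can veto delivery.
            if predicate is not None and not predicate(event):
                continue
            callback(topic, event)
            delivered += 1
        return delivered
```

A subscriber to "workflow" would receive events published on "workflow/completed", while a subscriber to "workflow/completed" with a predicate receives only the events that the predicate accepts. Renewing a lease is not modelled; in this sketch a client would simply subscribe again.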
19
Portal toolkit for bioinformaticians
Target application
–Williams-Beuren Syndrome
–Fixed set of workflows
Extra myGrid portlets
–Configurable
–Workflow enactment
–Workflow scheduling
–Completion notification
–Results browsing
Based on CHEF & Jetspeed-1
–Portlets for team collaboration
[Diagram labels: Portlet Container, Interface, Portlet]
20
Text Services
Medline: pre-processed offline to extract biomedical terms + indexed
[Diagram: User Client, Workflow Server (Workflow Enactment) and Medline Server (Sheffield). Workflow steps: Initial Workflow, Extract PubMed Id, Get Medline Abstract, Get Related Abstracts, Cluster Abstracts. Data exchanged: Swissprot/Blast record, XScufl workflow definition + parameters, PubMed Ids, Medline Abstracts, Term-annotated Medline abstracts, Clustered PubMed Ids + titles]
21
History
Pre-Prototype, Prototype 1
–Experimental
–Web-based
–Requirements gathering
–Architectural workout
–All services represented
–NetBeans workbench
–API-based integration
–Info Repository oriented
–XML-based process provenance
–Workflow enactment engine
Prototype 2
–Second generation services
–Reworked information model
–Open information management
–Life Science Identifiers
–RDF based provenance
–Taverna workbench
–Web-based portal
Demo at ISMB 2003
Full paper and demo at ISMB 2004
GSK deployment
Real biology
22
Two+ Paths
Core functionality
–Services: Soaplab and Gowlab
–Workflow enactment engine: Freefluo
–Workflow workbench: Taverna
–Data integration: OGSA-DQP
–Information model & management
–Mediator
Innovative work
–Service and workflow registration
–Semantic discovery
–Provenance management
–Text mining
In between
–Event notification
23
myGrid People
Core: Matthew Addis, Nedim Alpdemir, Tim Carver, Rich Cawley, Neil Davis, Alvaro Fernandes, Justin Ferris, Robert Gaizauskas, Kevin Glover, Carole Goble, Chris Greenhalgh, Mark Greenwood, Yikun Guo, Ananth Krishna, Peter Li, Phillip Lord, Darren Marvin, Simon Miles, Luc Moreau, Arijit Mukherjee, Tom Oinn, Juri Papay, Savas Parastatidis, Norman Paton, Terry Payne, Matthew Pocock, Milena Radenkovic, Stefan Rennick-Egglestone, Peter Rice, Martin Senger, Nick Sharman, Robert Stevens, Victor Tan, Anil Wipat, Paul Watson and Chris Wroe.
Users: Simon Pearce and Claire Jennings, Institute of Human Genetics, School of Clinical Medical Sciences, University of Newcastle, UK; Hannah Tipney, May Tassabehji, Andy Brass, St Mary’s Hospital, Manchester, UK; Steve Kemp, Liverpool, UK
Postgraduates: Martin Szomszor, Duncan Hull, Jun Zhao, Pinar Alper, John Dickman, Keith Flanagan, Antoon Goderis, Tracy Craddock, Alastair Hampshire
Industrial: Dennis Quan, Sean Martin, Michael Niemi, Syd Chapman (IBM); Robin McEntire (GSK)
Collaborators: Keith Decker
24
Collaboration
http://www.accessgrid.org
25
Publications
R. Stevens, H.J. Tipney, C. Wroe, T. Oinn, M. Senger, P. Lord, C.A. Goble, A. Brass and M. Tassabehji. Exploring Williams-Beuren Syndrome Using myGrid. To appear in Proceedings of 12th International Conference on Intelligent Systems in Molecular Biology, 31st Jul-4th Aug 2004, Glasgow, UK.
C.A. Goble, S. Pettifer, R. Stevens and C. Greenhalgh. Knowledge Integration: In silico Experiments in Bioinformatics. In The Grid: Blueprint for a New Computing Infrastructure, Second Edition, eds. Ian Foster and Carl Kesselman, Morgan Kaufmann, November 2003.
R. Stevens, A. Robinson and C.A. Goble. myGrid: Personalised Bioinformatics on the Information Grid. In Proceedings of 11th International Conference on Intelligent Systems in Molecular Biology, 29th June-3rd July 2003, Brisbane, Australia. Published in Bioinformatics Vol. 19 Suppl. 1, 2003, pp. 302-304.
26
http://www.mygrid.org.uk