Published by Emily Payne. Modified over 9 years ago.
1
CHESS seminar July 2005
Promoting reuse and repurposing on the Semantic Grid
Antoon Goderis
University of Manchester, UK
CHESS seminar, 19 July 2005
2
Talk plan
  The grid
  The semantic grid
  Reuse and repurposing
  7 bottlenecks to repurposing
  Semantics to the rescue
3
The Grid
  1. Pervasive and dependable computing utility
  2. A distributed computing infrastructure for advanced science and engineering
  3. Coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organisations
4
Science in the 21st century
  Huge quantities of data
  Huge number of data collection devices
  Analysis is the bottleneck
  Global distributed science
    – Collaboration and sharing the norm
  In silico experiments
    – Build, reuse, repurpose on-line concurrent processes (workflows)
  114 genomes, 735 in progress
5
Grid application evolution
  Large scale data, large number of machines, expensive computation, simple semantics, small numbers of people
  Smaller scale data, less computationally intensive, complex heterogeneous applications, complex semantics, many people
  Application domains shown: High Energy Physics, Functional Genomics, Oceanography, Biodiversity, Earth Science, Neuroscience
6
The Semantic Grid
  The Grid has been about large scale computation
  But the applications are also about collaboration
  A gap between grid computing endeavours and the vision of Grid computing
  To support the full richness of the vision we need both grid and semantic web (technologies)
  Knowledge explicitly asserted & explicitly used
7
[Figure: diagram with axes "Richer semantics" and "More computation", showing the Classical Web, Classical Grid, Semantic Web and Semantic Grid. Source: Norman Paton]
8
Semantics in Grid workflows
  Classification and discovery of computational and data resources; provenance trails
  Declarative specification of services, workflows and their requirements; problem solving selection
  Job control, distributed execution models, semantic integration, resource brokering, resource scheduling
  Encoding performance metrics, service state, event notification topics, access rights to databases, personal profiles and security groupings; charging infrastructure
9
Talk plan
  The grid
  The semantic grid
  Reuse and repurposing
  7 bottlenecks to repurposing
  Semantics to the rescue
10
From building workflows to recycling them
  Reuse of workflows
    – Best practice
    – Training
    – Peer review
  Repurposing
    – Adapt and extend useful fragments
    – Build on best practice
    – Across groups / communities
11
Analyze This
12
Analyze This
  x #scientists x #workflows x #versions x #runs
13
Bridging user information need and workflow descriptions
14
Bridging user information need and workflow descriptions
  Network effects!
15
Reuse and repurposing
  A user will reuse a workflow or workflow fragment that fits their purpose and could be customised with different parameter settings or data inputs to solve their particular scientific problem.
16
Reuse and repurposing
  A user will reuse a workflow or workflow fragment that fits their purpose and could be customised with different parameter settings or data inputs to solve their particular scientific problem.
    – A piece of an experimental description that is a coherent sub-workflow that makes sense to a domain specialist (in Ptolemy, a composite actor)
    – A snippet of workflow code + annotation
17
Reuse and repurposing
  A user will reuse a workflow or workflow fragment that fits their purpose and could be customised with different parameter settings or data inputs to solve their particular scientific problem.
  A user will repurpose a workflow or workflow fragment by
    1. finding one that is close enough to be the basis of a new workflow for a different purpose and
    2. making small changes to its structure to fit it to its new purpose.
  Aiming for automated discovery of ranked fragments
18
7 bottlenecks to workflow repurposing
  1. Lack of a comprehensive discovery model
  2. Process knowledge acquisition bottleneck
  3. Lack of workflow fragment rankings
  4. Workflow interoperability
  5. Restrictions on service availability
  6. Rigidity of service and workflow definitions
  7. Intellectual property rights on workflows
  Collect enough workflows
  Make workflows usable
19
A comprehensive discovery model
  A user will repurpose a workflow or workflow fragment by
    1. finding one that is close enough to be the basis of a new workflow for a different purpose and
    2. making small changes to its structure to fit it to its new purpose.
  Based on semantic annotation, find a set of workflows, which people can then edit
    – For scientists: data flow based queries in their jargon, largely abstracting from control
    – For developers: control flow based queries, largely abstracting from data
20
Kepler
  http://kepler.ecoinformatics.org/
  Courtesy Bertram Ludaescher
21
A comprehensive discovery model
  Scientist queries
    – Find all processes where sequence alignment is followed by visualisation
    – Given a set of data points, services, or fragments, have these been connected up in an existing base of workflows? Alternatives?
    – Show me the provenance of this workflow
  Developer queries
    – How have people applied this dataflow execution model (eg in Ptolemy, an SDF Director)?
    – How can it be combined with other execution models?
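The first scientist query, "sequence alignment is followed by visualisation", can be read as a reachability question over annotated data flow. The sketch below is illustrative only: the workflow base, the step names and the task labels are invented for the example, and a real system would draw the annotations from an ontology rather than plain strings.

```python
from collections import deque

# Hypothetical workflow base: each workflow maps step names to a task
# annotation ("tasks") and to the steps consuming their output ("edges").
WORKFLOWS = {
    "wf_annotate": {
        "tasks": {"fetch": "retrieval", "align": "sequence_alignment",
                  "filter": "filtering", "plot": "visualisation"},
        "edges": {"fetch": ["align"], "align": ["filter"], "filter": ["plot"]},
    },
    "wf_report": {
        "tasks": {"plot": "visualisation", "align": "sequence_alignment"},
        "edges": {"plot": ["align"]},
    },
}

def downstream(edges, start):
    """Return every step reachable from start along data-flow edges."""
    seen, queue = set(), deque(edges.get(start, []))
    while queue:
        step = queue.popleft()
        if step not in seen:
            seen.add(step)
            queue.extend(edges.get(step, []))
    return seen

def followed_by(workflows, first_task, later_task):
    """Names of workflows in which a first_task step feeds, directly or
    indirectly, a later_task step."""
    hits = []
    for name, wf in workflows.items():
        tasks, edges = wf["tasks"], wf["edges"]
        for step, task in tasks.items():
            if task == first_task and any(
                tasks[s] == later_task for s in downstream(edges, step)
            ):
                hits.append(name)
                break
    return hits

result = followed_by(WORKFLOWS, "sequence_alignment", "visualisation")
print(result)  # ['wf_annotate']
```

Only `wf_annotate` matches, because in `wf_report` the visualisation step comes before the alignment step. The query abstracts from control flow entirely, which is the view the slides propose for scientists.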
22
A comprehensive discovery model
  Challenges
    – Libraries of (scientific) task based patterns
        Eg task semantics of gene annotation pipelines classified in OWL
    – Libraries of design patterns for distributed behaviour
        Identify how people build concurrent systems; how they choose (combinations of) execution semantics
        A good start: workflow patterns for Petri Nets
          – Eg synchronizing merge and multi-merge
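The two patterns named above differ in how often the task after the merge point runs. The sketch below is a simplified reading of them, not a Petri net encoding: it assumes the set of activated branches is known up front, and the function names and event format are invented for illustration.

```python
def multi_merge(completions):
    """Multi-merge: no synchronisation, so the downstream task fires once
    for every branch completion."""
    return [("fire", branch) for branch in completions]

def synchronizing_merge(activated, completions):
    """Synchronizing merge: the downstream task fires once, when the last
    activated branch has completed."""
    pending = set(activated)
    firings = []
    for branch in completions:
        pending.discard(branch)
        if not pending and not firings:
            firings.append(("fire", branch))
    return firings

# Two branches, A and B, were activated and complete in that order.
print(multi_merge(["A", "B"]))                      # [('fire', 'A'), ('fire', 'B')]
print(synchronizing_merge({"A", "B"}, ["A", "B"]))  # [('fire', 'B')]
```

When only one branch is activated the two functions behave the same, which is why telling the patterns apart requires knowing which branches are active.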
23
Workflow fragment rankings
  A user will repurpose a workflow or workflow fragment by
    1. finding one that is close enough to be the basis of a new workflow for a different purpose and
    2. making small changes to its structure to fit it to its new purpose.
  We need metrics for processes
    – For scientists: ranking scientific relevance
    – For developers:
        compare processes based on the same execution semantics
        compare different execution semantics
  Challenge: defining the metrics, and combining them into rankings
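One simple way to combine several metrics into a ranking is a weighted sum. The sketch below assumes each metric is already normalised to the range 0 to 1; the metric names, weights and fragment scores are invented for illustration and are not taken from the talk, which leaves the choice of metrics open.

```python
def rank_fragments(fragments, weights):
    """Order fragments by a weighted sum of their metric scores, best first.
    A metric missing from a fragment counts as 0."""
    def score(metrics):
        return sum(w * metrics.get(name, 0.0) for name, w in weights.items())
    return sorted(fragments, key=lambda f: score(f["metrics"]), reverse=True)

# Hypothetical metrics: how well the fragment's task matches the query,
# and how few structural edits it would need to fit the new purpose.
weights = {"task_similarity": 0.7, "edit_closeness": 0.3}
fragments = [
    {"name": "frag_a", "metrics": {"task_similarity": 0.9, "edit_closeness": 0.2}},
    {"name": "frag_b", "metrics": {"task_similarity": 0.5, "edit_closeness": 0.9}},
    {"name": "frag_c", "metrics": {"task_similarity": 0.8, "edit_closeness": 0.8}},
]

ranking = [f["name"] for f in rank_fragments(fragments, weights)]
print(ranking)  # ['frag_c', 'frag_a', 'frag_b']
```

A weighted sum is only one possible combination rule. Its weights encode who is asking: a scientist might weight task relevance heavily, a developer structural closeness.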
24
Workflow interoperability
  A user will repurpose a workflow or workflow fragment by
    1. finding one that is close enough to be the basis of a new workflow for a different purpose and
    2. making small changes to its structure to fit it to its new purpose.
  Workflows take a long time to build and get very large
  The nice thing about standards…
  Different workflow systems, different (implicit) semantics
  Import workflows across workflow environments
    1. Manually redo it in your own
    2. Wrapping
    3. Auto-rewrite to new environment eg
25
Workflow interoperability
  To inform interoperation, we need a layer of abstraction that captures behavioural semantics
  Many non-standardised formalisms out there
    – Functional languages - one paradigm fits all?
    – Petri nets
    – Process algebras
    – Finite State Machines
    – All (hierarchical-) combinations of these
  Challenge:
    – Behavioural design patterns to compare formalism classes, eg PN and SDF Director
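As one concrete example of what behavioural semantics can pin down, synchronous dataflow (SDF) fixes how many tokens each actor produces and consumes per firing, so a schedule can be computed before execution. The sketch below covers only the simplest case, a single edge between two actors, and solves its balance equation q_src * produce = q_dst * consume. The function name and rates are illustrative and are not part of any Ptolemy API.

```python
from math import gcd

def sdf_repetitions(produce, consume):
    """Smallest positive firing counts (q_src, q_dst) such that
    q_src * produce == q_dst * consume for one edge src -> dst."""
    if produce <= 0 or consume <= 0:
        raise ValueError("token rates must be positive")
    g = gcd(produce, consume)
    return consume // g, produce // g

# The source emits 2 tokens per firing; the sink consumes 3 per firing.
q_src, q_dst = sdf_repetitions(produce=2, consume=3)
print(q_src, q_dst)  # 3 2
```

Firing the source three times and the sink twice leaves the edge's buffer as it started. An execution model that gives no such static guarantee needs a different kind of description, which is the comparison problem the slide raises.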
26
Conclusions
  Grid = Semantic Grid
  Reuse <> repurposing
  Task and behavioural semantics both needed for repurposing
  Design patterns for distributed processes: a long road ahead
    – Task semantics
    – Behavioural semantics
27
EPSRC funded UK eScience Program Pilot Project
Many slides taken from Carole Goble
28
Core
  Matthew Addis, Nedim Alpdemir, Tim Carver, Rich Cawley, Neil Davis, Alvaro Fernandes, Justin Ferris, Robert Gaizaukaus, Kevin Glover, Carole Goble, Chris Greenhalgh, Mark Greenwood, Yikun Guo, Jan Humble, Ananth Krishna, Peter Li, Phillip Lord, Darren Marvin, Simon Miles, Luc Moreau, Arijit Mukherjee, Tom Oinn, Juri Papay, Savas Parastatidis, Norman Paton, Terry Payne, Matthew Pocock, Milena Radenkovic, Stefan Rennick-Egglestone, Peter Rice, Ian Roberts, Martin Senger, Nick Sharman, Robert Stevens, Victor Tan, Anil Wipat, Paul Watson, Jimi Worthington and Chris Wroe.
Users
  Simon Pearce and Claire Jennings, Institute of Human Genetics, School of Clinical Medical Sciences, University of Newcastle, UK
  Hannah Tipney, May Tassabehji, Andy Brass, St Mary’s Hospital, Manchester, UK
  Steve Kemp, Liverpool, UK
Postgraduates
  Martin Szomszor, Duncan Hull, Jun Zhao, Pinar Alper, Keith Flanagan, Antoon Goderis, Tracy Craddock, Alastair Hampshire
Industrial
  Dennis Quan, Sean Martin, Michael Niemi, Syd Chapman (IBM)
  Robin McEntire (GSK)
Collaborators
  Keith Decker
29
References
  Publications on
    – Home page: www.cs.man.ac.uk/~goderisa
    – myGrid site: www.mygrid.org.uk