Published by Andra Nichols. Modified over 9 years ago.
1
The Semantic Web, Service Oriented Architectures, the myGrid Experience Carole Goble http://www.mygrid.org.uk
2
Roadmap The problem myGrid Semantic Service / Workflow Discovery Provenance and metadata modelling Semantic Web is Semantic Glue
3
EPSRC funded UK e-Science Program Pilot Project Thanks to the other members of the Taverna project, http://taverna.sf.net
4
1. Identify new, overlapping sequence of interest
2. Characterise the new sequence at nucleotide and amino acid level
Cutting and pasting between numerous web-based services, e.g. BLAST, InterProScan etc.
12181 acatttctac caacagtgga tgaggttgtt ggtctatgtt ctcaccaaat ttggtgttgt
12241 cagtctttta aattttaacc tttagagaag agtcatacag tcaatagcct tttttagctt
12301 gaccatccta atagatacac agtggtgtct cactgtgatt ttaatttgca ttttcctgct
12361 gactaattat gttgagcttg ttaccattta gacaacttca ttagagaagt gtctaatatt
12421 taggtgactt gcctgttttt ttttaattgg gatcttaatt tttttaaatt attgatttgt
12481 aggagctatt tatatattct ggatacaagt tctttatcag atacacagtt tgtgactatt
12541 ttcttataag tctgtggttt ttatattaat gtttttattg atgactgttt tttacaattg
12601 tggttaagta tacatgacat aaaacggatt atcttaacca ttttaaaatg taaaattcga
12661 tggcattaag tacatccaca atattgtgca actatcacca ctatcatact ccaaaagggc
12721 atccaatacc cattaagctg tcactcccca atctcccatt ttcccacccc tgacaatcaa
12781 taacccattt tctgtctcta tggatttgcc tgttctggat attcatatta atagaatcaa
5
Middleware for Life Science solutions Interoperation of services and data sources Repeat Reuse and Share Provenance Manage results My tools, my resources
6
Middleware for Life Science
8
Taverna Workflow Workbench OGSA-Distributed Query Processing Results management LSID mIR e-Science coordination e-Science mediator e-Science process patterns e-Science events Notification service Architectural framework myGrid information model Metadata & provenance management using semantics KAVE Legacy integration Publication and Discovery using semantics Feta Pedro Ontology Portal & Application tools
10
How to select among 3000+ services? Mostly inputs & outputs are “string” Domain specific descriptions of capabilities Selection is part of workflow assembly by bioinformaticians Selection of alternates for failure also generally user defined, and usually replicas, but need not be. First, find your service
11
Which means describe your service… Publish and find services (and workflows) with description using an ontology Define domain types for objects passed around workflow Define a set of dimensions with which service capabilities can be defined GRIMOIRES / WebDAV directory Tied to BioMOBY Central
13
Semantic discovery Publish and find services (and workflows) with description using an ontology (in OWL/RDF) Define domain types for objects passed around and a set of dimensions with which service capabilities can be defined using processor abstraction Bootstrapping descriptions Mining and maintaining descriptions The Expert Annotator GRIMOIRES / WebDAV directory Tie into BioMOBY Central http://phoebus.cs.man.ac.uk:8100/feta-beta/mygrid/descriptions/ Phillip Lord, Pinar Alper, Chris Wroe, and Carole Goble. Feta: A light-weight architecture for user oriented semantic service discovery, in Proc. of 2nd European Semantic Web Conference, Crete, June 2005.
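A minimal sketch of what discovery by semantic type could look like, assuming a toy is-a hierarchy and three hypothetical service records. The term names, service names, and the `find_services` function are illustrative assumptions; they are not taken from Feta or from the myGrid ontology, which is expressed in OWL/RDF rather than a Python dictionary.

```python
# Toy is-a hierarchy (child -> parent). All terms are illustrative
# assumptions, not entries from the myGrid ontology.
IS_A = {
    "protein_sequence": "sequence",
    "nucleotide_sequence": "sequence",
    "sequence": "bioinformatics_data",
    "blast_report": "bioinformatics_data",
}


def ancestors(term):
    """Return the term itself followed by all of its ancestors."""
    chain = [term]
    while term in IS_A:
        term = IS_A[term]
        chain.append(term)
    return chain


# Hypothetical service descriptions. The syntactic type of every input
# would be "string"; the semantic type is what makes selection possible.
SERVICES = [
    {"name": "blastn_search", "input": "nucleotide_sequence",
     "output": "blast_report", "task": "similarity_search"},
    {"name": "generic_seq_stats", "input": "sequence",
     "output": "bioinformatics_data", "task": "statistics"},
    {"name": "protein_domain_scan", "input": "protein_sequence",
     "output": "bioinformatics_data", "task": "domain_detection"},
]


def find_services(input_type, task=None):
    """Names of services whose declared input is input_type or a more
    general type, optionally restricted to one task."""
    accepted = set(ancestors(input_type))
    return [
        s["name"]
        for s in SERVICES
        if s["input"] in accepted and (task is None or s["task"] == task)
    ]


matches = find_services("nucleotide_sequence")
similarity_only = find_services("nucleotide_sequence", task="similarity_search")
```

Here a nucleotide sequence matches both the BLAST-style service and the service that accepts any sequence, while the protein-only service is excluded. The result is a candidate list for a person to choose from, in keeping with the point that selection is done by bioinformaticians during workflow assembly.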
14
OWL-S OWL-WS WSMO http://www.swsi.org/ WSDL-S
15
Web Interface Processor API Processor API Generic Schema for Service (part of Information model) Specific Application Ontology e.g. caCORE Semantic Web Services Layered model Wroe C, Goble CA, Greenwood M, Lord P, Miles S, Papay J, Payne T, Moreau L. Automating Experiments Using Semantic Data on a Bioinformatics Grid, in IEEE Intelligent Systems, Jan/Feb 2004. We don’t describe WSDL, we describe operations and processors. We are classifying for people not machines, so don’t be too clever!
16
Operation name, description task method resource application Service name description author organisation Parameter name, description semantic type format transport type collection type collection format WSDL based Web service WSDL based operation Soaplab service bioMoby service workflow hasInput hasOutput Local Java code subclass
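The service schema on this slide can be sketched as three record types. This is a rough reading of the diagram labels, not the actual myGrid information model: the field names follow the slide, `hasInput` and `hasOutput` are modelled as lists on the operation, and only the two WSDL subclasses are shown because the slide text does not make clear where Soaplab, bioMoby, workflow, and local Java code sit in the hierarchy. The example service and the `input_types` helper are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Parameter:
    name: str
    description: str = ""
    semantic_type: str = ""
    format: str = ""
    transport_type: str = ""
    collection_type: str = ""
    collection_format: str = ""


@dataclass
class Operation:
    name: str
    description: str = ""
    task: str = ""
    method: str = ""
    resource: str = ""
    application: str = ""
    inputs: List[Parameter] = field(default_factory=list)   # hasInput
    outputs: List[Parameter] = field(default_factory=list)  # hasOutput


@dataclass
class Service:
    name: str
    description: str = ""
    author: str = ""
    organisation: str = ""
    operations: List[Operation] = field(default_factory=list)


class WSDLService(Service):
    """WSDL based Web service (subclass of Service)."""


class WSDLOperation(Operation):
    """WSDL based operation (subclass of Operation)."""


def input_types(service):
    """Collect the semantic types of all inputs across a service's operations."""
    return [p.semantic_type for op in service.operations for p in op.inputs]


# Hypothetical example record; none of these values come from the slides.
example = WSDLService(
    name="example_blast_service",
    operations=[
        WSDLOperation(
            name="search",
            task="similarity_search",
            inputs=[Parameter(name="query",
                              semantic_type="nucleotide_sequence",
                              format="FASTA")],
            outputs=[Parameter(name="report", semantic_type="blast_report")],
        )
    ],
)
```

Keeping the description at the level of operations and parameters, rather than WSDL documents, matches the earlier remark that the descriptions are about operations and processors.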
17
Semantic Web Services Semantic Descriptions for Discovery Automated Discovery services or workflows Knowledge assisted brokering & match making Guided instantiation and substitution Composition Automated Composition Self organising SOA Guided workflow assembly Composition (workflow) verification and validation
18
Semantics-enabled Problem Solving Task configuration Workflow construction Workflow Advisor Semantic service discovery EDSO task ontology
19
Observations Technical and Abstraction mismatches –Man vs Machine. Manual vs Automation. Service vs Domain Semantics. Basic errors in modelling. –Web services in the wild suck. Not everything is a Web Service. Legacy –Services, middleware, content and practice. Practicality mismatches –Automated or assisted discovery desirable, likely, popular –Automated composition undesirable, unlikely, unpopular Capturing and Curating Content –Annotation is hard. Building the Ontology is hard. QA is hard. Keeping the annotation up to date is hard. The Expert annotator; Altruism for Reuse. Quality Control; Hendler’s Principle –A little semantics goes a long way! Too complicated to use. Tools!!
20
Sharing takes effort. Unanticipated reuse by people you don’t know in automated workflows. The metadata needed pays off but it’s challenging and costly to obtain. Automated, service providers, network effects Quality control. Misuse. Inappropriate use. Competitive advantage, Intellectual property. Workflow design - local or licensed services
21
The devil is in the detail Experiment provenance Simple workflow Descriptions in biological language Workflows for automagical execution – implicit iteration, generous typing … Debugging and rerunning provenance logs Simple classifications of services Expressive ontologies to match up services automatically Descriptions for automatic service execution and fault management
22
Courtesy Jim Myers, NCSA e-Scientific method in vivo in vitro in silico Discovery Electronic Notebook Scientific Provenance Engineering Provenance Authorization Project Organization Logging Curation Scientific Content
23
Taverna workflow workbench in myGrid http://taverna.sourceforge.net http://www.mygrid.org.uk
24
Provenance in myGrid The process The data derivation path The ownership The evidence of knowledge a1 E1:S1 X1 E1:S2 Y1Z1 Manchester university “how the Y1 was produced using a1”
25
Provenance graph representation Identity for the node: URI –Uniform Resource Identifier –A generalisation of URL An RDF (Resource Description Framework) graph: derivedFrom inputOf Ontologies –Telling what they are isA hasFeature Each URI is associated with: –A set of provenance statements –An RDF provenance graph
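As a rough illustration of the graph representation, provenance can be held as subject, predicate, object statements keyed by URI, with the derivation path recovered by following `derivedFrom` links. This sketch uses plain Python rather than an RDF library; the `ProvenanceGraph` class and the `urn:example:` identifiers are assumptions made for the example, while the property names `derivedFrom`, `inputOf`, and `isA` come from the slide.

```python
from collections import defaultdict


class ProvenanceGraph:
    """A tiny triple store: each URI maps to its provenance statements."""

    def __init__(self):
        self._out = defaultdict(list)  # subject -> [(predicate, object)]

    def add(self, subject, predicate, obj):
        self._out[subject].append((predicate, obj))

    def statements(self, uri):
        """The set of provenance statements attached to one URI."""
        return list(self._out.get(uri, []))

    def derived_from(self, uri):
        """All URIs reachable from uri by following derivedFrom links."""
        seen = []
        stack = [uri]
        while stack:
            node = stack.pop()
            for predicate, obj in self._out.get(node, []):
                if predicate == "derivedFrom" and obj not in seen:
                    seen.append(obj)
                    stack.append(obj)
        return seen


# Hypothetical identifiers, loosely modelled on "how Y1 was produced using a1".
g = ProvenanceGraph()
g.add("urn:example:a1", "inputOf", "urn:example:invocation1")
g.add("urn:example:x1", "derivedFrom", "urn:example:a1")
g.add("urn:example:y1", "derivedFrom", "urn:example:x1")
g.add("urn:example:y1", "isA", "Blast_report")

lineage = g.derived_from("urn:example:y1")
```

The lineage of `urn:example:y1` includes both the intermediate result and the original input, which is the kind of question ("how was Y1 produced using a1") the provenance graph is meant to answer. Ontology statements such as `isA` sit in the same graph as the derivation links, so typing and lineage can be queried together.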
26
urn:data:f2 urn:data1 urn:data2 urn:compareinvocation3 urn:data12 Blast_report [input] [output] [input] [distantlyDerivedFrom] SwissProt_seq [instanceOf] Sequence_hit [hasHits] urn:hit2…. urn:hit1… urn:hit50….. [instanceOf] [similar_sequence_to] Data generated by services/workflows Concepts [ ] [performsTask] Find similar sequence [contains] Services urn:data:3 urn:hit8…. urn:hit5… urn:hit10….. [contains] [instanceOf] urn:BlastNInvocation3 urn:invocation5 urn:data:f1 [output] New sequence Missed sequence [hasName] literals DatumCollection [type] LSDatum [type] Properties [instanceOf] [output] [directlyDerivedFrom] Resource Description Framework
27
Provenance Flexible and extensible schema Data fusion and aggregation across provenance metadata Reasoning and querying over descriptions Transparent description
29
myGrid Provenance example
30
Annotate Anything People, meetings, discussions, conference talks Scientific publications, recommendations, quality comments Events, notifications, logs Services and resources Schemas and catalogue entries Models, codes, builds, workflows, Data files and data streams Sensors and sensor data … DFDL, JSDL, SAML, WSDL, WSRF, DL*, ML* as RDF? If you are using a controlled vocabulary, then let’s use a standard controlled vocabulary language.
31
Seamark Demonstration: Identification of new drug candidates for BRKCB-1 Courtesy Joanne Luciano
32
Observations Flexible metadata description for data Multi tiered model for different perspectives –Machine vs Person; The ontologies for people discovery are not good enough for knowledge aggregation Make the semantics invisible Provenance aggregation Identity crisis Exposing knowledge means knowledge exposure. –Reluctance to give up knowledge assets. Vulnerability. Knowledge is power. Incentive models. IPR. Privacy. Capturing the Semantic Content explicitly. –Acquiring ontology annotations; Hard to describe policies. Vagueness and trivia. Trying to capture people-focused provenance. Hendler principle A little semantics goes a long way.
33
Data mining Knowledge Discovery Smart search Social networking Smart portals Agents Information Integration and aggregation Use of Semantic Web Technologies A Semantic Web of Life Science