Published by Daniel Parrish. Modified over 9 years ago.
1
myGrid/Taverna Provenance
Daniele Turi, University of Manchester
OMII f2f Meeting, London, 19-20/4/06
3
Components
Identifiers
– LSIDs
Data
– JDBC data store
Metadata
– RDF Provenance Plugin
Browsing
– Provenance Browser Plugin
Security
– Under development
4
LSID
5
LSID: Life Science Identifier
URN specification in progress
5 part identifier (with optional version id)
– urn:lsid:www.mygrid.org.uk:lsdocument:X1234
– urn:lsid:ncbi.nlm.nih.gov.lsid.biopathways.org:genbank_gi:7717376
protocol for retrieving data and metadata about an object
commitment by the provider to always return the same data for an ID
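The identifier layout above can be illustrated with a small parser. This is a minimal sketch, assuming the five parts are the literal prefixes urn and lsid followed by authority, namespace and object id, with the version id as an optional sixth field; the names Lsid and parse_lsid are hypothetical and are not part of any myGrid or Taverna API.

```python
from typing import NamedTuple, Optional


class Lsid(NamedTuple):
    """Parsed form of an LSID (hypothetical helper type)."""
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None


def parse_lsid(text: str) -> Lsid:
    """Split 'urn:lsid:authority:namespace:object[:revision]' into its fields."""
    parts = text.split(":")
    # Five fields are mandatory; a sixth (the version id) is optional.
    if len(parts) not in (5, 6):
        raise ValueError(f"not an LSID: {text!r}")
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {text!r}")
    if any(part == "" for part in parts[2:]):
        raise ValueError(f"empty field in LSID: {text!r}")
    return Lsid(*parts[2:])


if __name__ == "__main__":
    print(parse_lsid("urn:lsid:www.mygrid.org.uk:lsdocument:X1234"))
```

The parser only checks the shape of the identifier; resolving it to data or metadata is a separate step handled by an LSID resolver.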
6
LSID (ctd)
Issue
– LSID Authorities
Resolution
– LSID Resolvers
Examples
– myGrid
– Long Term Ecological Research Network
– BioPathways Consortium
7
LSID (ctd 2)
abstraction
lightweight
independent from actual storage implementation
– database
– file system
– application
both for private and public data sources
8
Data
9
Data Storage (current)
Taverna can persist inputs, outputs and intermediate results in an SQL database via JDBC
Optional; enabled by configuring a Baclava Data Store
Allows the LSIDs of data items to be resolved against the actual data
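The idea of resolving an LSID against stored data can be sketched as a key-value table. This is only an illustration: Python's sqlite3 stands in for the JDBC-backed store, and the table name data_item, its columns, and the functions open_store, persist and resolve are all hypothetical rather than the Baclava Data Store schema.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a store with one row per data item, keyed by LSID."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS data_item ("
        "lsid TEXT PRIMARY KEY, content BLOB NOT NULL)"
    )
    return conn


def persist(conn, lsid, content):
    """Store a data item. A plain INSERT is used so that an existing LSID
    can never be silently rebound to different data."""
    with conn:
        conn.execute(
            "INSERT INTO data_item (lsid, content) VALUES (?, ?)",
            (lsid, content),
        )


def resolve(conn, lsid):
    """Return the bytes stored for an LSID, or None if it is unknown."""
    row = conn.execute(
        "SELECT content FROM data_item WHERE lsid = ?", (lsid,)
    ).fetchone()
    return None if row is None else row[0]


if __name__ == "__main__":
    store = open_store()
    persist(store, "urn:lsid:www.mygrid.org.uk:lsdocument:X1234", b"ACGT")
    print(resolve(store, "urn:lsid:www.mygrid.org.uk:lsdocument:X1234"))
```

Rejecting a second insert for the same key mirrors the LSID commitment that an identifier always returns the same data.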
10
Data Storage (future)
Domain-specific databases
– use outside myGrid
Develop:
– Taverna processor for JDBC/OGSA-DAI
– associated interface (cf. BioMart)
Users will be able to study the contents of an existing database and:
– write queries that extract data from the database, where the query may be parameterised with values passed in from the workflow;
– write requests that insert data from the workflow into a named table in the database.
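The two planned operations, parameterised extraction and insertion into a named table, might look roughly like the following. This is a sketch under stated assumptions: sqlite3 replaces JDBC/OGSA-DAI, the gene table is invented for the demonstration, and run_query and insert_row are hypothetical names, not the planned Taverna processor.

```python
import re
import sqlite3

# Table and column names cannot be bound as SQL parameters, so they are
# checked against a conservative identifier pattern before interpolation.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def run_query(conn, sql, params):
    """Run a query whose placeholders are filled with workflow values."""
    return conn.execute(sql, params).fetchall()


def insert_row(conn, table, row):
    """Insert one workflow result (a dict of column -> value) into `table`."""
    if not row:
        raise ValueError("row must contain at least one column")
    for name in [table, *row]:
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    columns = ", ".join(row)
    marks = ", ".join("?" for _ in row)
    with conn:
        conn.execute(
            f"INSERT INTO {table} ({columns}) VALUES ({marks})",
            tuple(row.values()),
        )


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE gene (name TEXT, length INTEGER)")
    insert_row(db, "gene", {"name": "abc1", "length": 1200})
    insert_row(db, "gene", {"name": "xyz2", "length": 300})
    print(run_query(db, "SELECT name FROM gene WHERE length >= :min_len",
                    {"min_len": 1000}))
```

Values always travel as bound parameters; only the table and column names are interpolated, and only after validation.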
11
Metadata
12
Metadata Generation
Taverna Provenance Plugin
Listen to Taverna events
– WorkflowEventListener
Faithfully record them as ontological instance data
– RDF graphs (one for each Taverna run)
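The listen-and-record pattern can be sketched in a few lines. Everything here is an assumption made for illustration: the classes WorkflowRun and ProvenanceRecorder, the handle method, the event names and the predicate eventType are hypothetical and do not reproduce Taverna's WorkflowEventListener interface or the provenance ontology.

```python
from collections import defaultdict


class ProvenanceRecorder:
    """Listener that stores each event as triples, one graph per run."""

    def __init__(self):
        self.graphs = defaultdict(set)     # run id -> set of triples
        self._counter = defaultdict(int)   # run id -> events seen so far

    def handle(self, run_id, kind, details):
        self._counter[run_id] += 1
        event = f"{run_id}#event{self._counter[run_id]}"
        graph = self.graphs[run_id]
        # Record the event faithfully: its kind plus every detail it carries.
        graph.add((event, "eventType", kind))
        for key, value in details.items():
            graph.add((event, key, value))


class WorkflowRun:
    """Stand-in event source that notifies its listeners."""

    def __init__(self, run_id, listeners):
        self.run_id = run_id
        self.listeners = list(listeners)

    def fire(self, kind, **details):
        for listener in self.listeners:
            listener.handle(self.run_id, kind, details)


if __name__ == "__main__":
    recorder = ProvenanceRecorder()
    run = WorkflowRun("run-1", [recorder])
    run.fire("processCompleted", processor="blast")
    print(sorted(recorder.graphs["run-1"]))
```

Keeping one set of triples per run id corresponds to the slide's "one graph for each Taverna run".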
13
Metadata
Representation
Ontology (Schema)
Storage
Query
Browsing
14
Representation
RDF
– triples (subject, predicate, object)
– URIs (hence easy data integration)
– semantic web language
– XML serialization
– flexible, powerful
– sets of triples give rise to graphs
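A set of triples can be read as a labelled graph, and two sets that mention the same URI merge by plain set union. The sketch below shows both points; the urn:example: identifiers and the direction assumed for the launchedBy and belongsTo edges are illustrative assumptions, and neighbours is a hypothetical helper.

```python
def neighbours(graph, subject):
    """Return the (predicate, object) pairs leaving `subject` in a triple set."""
    return {(p, o) for s, p, o in graph if s == subject}


# Two independently produced triple sets that happen to share a URI.
run_graph = {
    ("urn:example:run:1", "launchedBy", "urn:example:person:4"),
}
people_graph = {
    ("urn:example:person:4", "belongsTo", "urn:example:org:HY7"),
}

# Integration is set union: the shared URI joins the two graphs.
merged = run_graph | people_graph

if __name__ == "__main__":
    for _, person in neighbours(merged, "urn:example:run:1"):
        print(person, neighbours(merged, person))
```

No schema alignment is needed for the merge because both sets name the person with the same URI.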
15
Workflow Run
[Diagram: RDF graph of a workflow run. Nodes: urn:lsid:..:wfInstance:8, urn:lsid:…:workflow:6, urn:lsid:…:person:4, urn:lsid:…:org:HY7, urn:lsid:…:processRun:84, urn:lsid:…:processRun:51. Edge labels: runs, launchedBy, belongsTo, executed.]
16
Schema
Ontology
– RDF Schema: taxonomic inferences
– also available as OWL: opens it up to complex reasoning
18
Typed Workflow Run
[Diagram: RDF graph of a workflow run shown together with the Provenance Ontology. Ontology classes: WorkflowRun, Workflow, ProcessRun, Experimenter, Organization. Instance nodes: urn:lsid:..:wfInstance:8, urn:lsid:…:workflow:6, urn:lsid:…:person:4, urn:lsid:…:org:HY7, urn:lsid:…:processRun:84, urn:lsid:…:processRun:51. Edge labels: runs, launchedBy, belongsTo, executed.]
19
Storage
Named RDF graphs
– retrieve whole graphs (e.g. workflows)
– implementation in NG4J (Jena + MySQL)
– scalability issues
Sesame2 native store
– scalable
– Java 5
20
Query
RDF query languages
– TriQL, SeRQL, SPARQL
– query languages for named RDF graphs
Ontology inspection/reasoning
Canned Queries
– workflows with failed processes
– input/output of past process runs
– workflows with data changed by user
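A canned query such as "workflows with failed processes" amounts to scanning each named graph for a matching pattern. The sketch below does this in plain Python rather than in TriQL, SeRQL or SPARQL; the dictionary of named graphs, the status predicate, the value failed and the function name are all hypothetical and are not taken from the provenance ontology.

```python
def runs_with_failed_processes(named_graphs):
    """Return the names of graphs containing at least one failed process.

    `named_graphs` maps a graph name (one per workflow run) to a set of
    (subject, predicate, object) triples.
    """
    return sorted(
        name
        for name, triples in named_graphs.items()
        if any(p == "status" and o == "failed" for _, p, o in triples)
    )


if __name__ == "__main__":
    graphs = {
        "run-1": {("proc-84", "status", "completed")},
        "run-2": {("proc-51", "status", "failed"),
                  ("proc-52", "status", "completed")},
    }
    print(runs_with_failed_processes(graphs))
```

Because each run lives in its own named graph, the answer is a list of graph names, and the whole graph for a matching run can then be retrieved in one step.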
22
Browsing
23
Provenance Browsing
Provenance Browser Plugin
– reusing Taverna GUI components
Matthew Gamble
25
Analysis
26
Provenance Analysis
Comparison
Aggregation
etc.
– see work by Jun Zhao
27
Security
29
User sends LSID ref and credentials to the Access Point
Access Point returns data and metadata or denies access as follows:
– credentials are passed to a User Directory
– User Directory passes the corresponding user to the Authorization Authority
– Authorization Authority returns the user attributes in the form of a (possibly signed) SAML assertion
– this assertion, together with the LSID and its corresponding metadata, is passed to the Policy Enforcement Point (PEP)
– PEP uses these three inputs to form an XACML request that is passed to a Policy Decision Point (PDP) that is preloaded with an XACML Policy Set
– PDP evaluates the request against its policy set and returns an XACML response to PEP
– PEP decodes the response and either allows data/metadata to be returned to the user or denies access.
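The request flow above can be traced with a small sketch. It is an illustration only: plain dictionaries stand in for the SAML assertion and the XACML request and response, the PDP is any callable returning "Permit" or "Deny", and every name (access_point, the directory dictionaries, the sample token) is hypothetical.

```python
def access_point(lsid, credentials, users, attributes, metadata, data, pdp):
    """Return (data, metadata) for `lsid`, or None when access is denied."""
    # User Directory: map the credentials to a user.
    user = users.get(credentials)
    if user is None:
        return None
    # Authorization Authority: attributes for that user (assertion stand-in).
    assertion = {"subject": user, "attributes": attributes.get(user, {})}
    # PEP: combine assertion, LSID and metadata into one request.
    request = {
        "subject": assertion,
        "resource": {"lsid": lsid, "metadata": metadata.get(lsid, {})},
        "action": "read",
    }
    # PDP: evaluate the request against its policy set.
    decision = pdp(request)
    # PEP: enforce the decision; anything other than Permit is a denial.
    if decision != "Permit":
        return None
    return data.get(lsid), metadata.get(lsid)


if __name__ == "__main__":
    lsid = "urn:lsid:www.mygrid.org.uk:lsdocument:X1234"
    result = access_point(
        lsid,
        "token-abc",
        users={"token-abc": "alice"},
        attributes={"alice": {"role": "supervisor"}},
        metadata={lsid: {"owner": "bob"}},
        data={lsid: b"ACGT"},
        pdp=lambda req: "Permit",
    )
    print(result)
```

Passing the PDP in as a parameter keeps enforcement (the PEP) separate from the policy decision, which matches the split described on the slide.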
30
myGrid XACML Policy
Scenario
– supervisors can access all workflows in the organization
– students can access only their own workflows
– blacklisted users cannot access anything
See policySet.xml on myGrid wiki
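The three scenario rules can be written as a single decision function. This is a sketch of the scenario only and does not reflect the contents of policySet.xml; the attribute names (id, role, organization, blacklisted, owner), the choice to let the blacklist override the other rules, and the default of denying when no rule applies are all assumptions.

```python
def decide(user, workflow):
    """Return "Permit" or "Deny" for `user` reading `workflow`.

    Both arguments are plain dicts standing in for XACML attributes.
    """
    # Blacklisted users cannot access anything (checked first, so it
    # overrides any rule that would otherwise permit).
    if user.get("blacklisted"):
        return "Deny"
    # Supervisors can access all workflows in their own organization.
    org = user.get("organization")
    if (user.get("role") == "supervisor" and org is not None
            and org == workflow.get("organization")):
        return "Permit"
    # Students can access only their own workflows.
    owner = workflow.get("owner")
    if (user.get("role") == "student" and owner is not None
            and owner == user.get("id")):
        return "Permit"
    # Assumed default: deny when no rule applies.
    return "Deny"


if __name__ == "__main__":
    workflow = {"owner": "stu1", "organization": "HY7"}
    print(decide({"id": "sup1", "role": "supervisor",
                  "organization": "HY7"}, workflow))
    print(decide({"id": "stu2", "role": "student",
                  "organization": "HY7"}, workflow))
```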