myGrid/Taverna Provenance Daniele Turi University of Manchester OMII f2f Meeting, London, 19-20/4/06.

1 myGrid/Taverna Provenance Daniele Turi University of Manchester OMII f2f Meeting, London, 19-20/4/06

2

3 Components Identifiers –LSIDs Data –JDBC data store Metadata –RDF Provenance Plugin Browsing –Provenance Browser Plugin Security –Under development

4 LSID

5 LSID: Life Science Identifier URN specification in progress 5 part identifier (with optional version id) –urn:lsid:www.mygrid.org.uk:lsdocument:X1234 –urn:lsid:ncbi.nlm.nih.gov.lsid.biopathways.org:genbank_gi:7717376 protocol for retrieving data and metadata about an object commitment by the provider to always return the same data for an ID
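A minimal sketch of how the five-part identifier on this slide could be split into its fields, with the optional version as a sixth part. The `LSID` tuple and `parse_lsid` function are hypothetical names for illustration and are not part of any myGrid or LSID library; validation is deliberately loose.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    """Fields that follow the fixed 'urn:lsid' prefix (names are illustrative)."""
    authority: str
    namespace: str
    object_id: str
    version: Optional[str] = None


def parse_lsid(text: str) -> LSID:
    """Split 'urn:lsid:authority:namespace:object[:version]' into its fields."""
    parts = text.split(":")
    if len(parts) not in (5, 6):
        raise ValueError(f"expected 5 parts plus an optional version: {text!r}")
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID URN: {text!r}")
    if any(part == "" for part in parts[2:]):
        raise ValueError(f"empty field in LSID: {text!r}")
    return LSID(*parts[2:])


doc = parse_lsid("urn:lsid:www.mygrid.org.uk:lsdocument:X1234")
print(doc.authority, doc.namespace, doc.object_id, doc.version)
```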

6 LSID (ctd) Issue – LSID Authorities Resolution – LSID Resolvers Examples – myGrid – Long Term Ecological Research Network – BioPathways Consortium

7 LSID (ctd 2) abstraction lightweight independent from actual storage implementation – database – file system – application both for private and public data sources

8 Data

9 Data Storage (current) Taverna can persist inputs, outputs and intermediate results in an SQL database via JDBC Optional and can be done by configuring a Baclava Data Store Allows the LSIDs of data items to be resolved against the actual data

10 Data Storage (future) Domain-specific databases –use outside myGrid Develop: –Taverna processor for JDBC/OGSA-DAI –associated interface (cf. BioMart) Users will be able to study the contents of an existing database and: –write queries that extract data from the database, where the query may be parameterised with values passed in from the workflow; –write requests that insert data from the workflow into a named table in the database.
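The two user operations listed on this slide (a query parameterised by a workflow value, and an insert of a workflow result into a named table) can be sketched as follows. This is only an illustration: it uses Python's built-in `sqlite3` in place of JDBC/OGSA-DAI, and the `sequences` and `workflow_results` tables, their columns, and the function names are all assumptions rather than anything from myGrid.

```python
import sqlite3
from typing import List


def open_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with hypothetical tables and sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sequences (accession TEXT PRIMARY KEY, length INTEGER)")
    conn.execute("CREATE TABLE workflow_results (lsid TEXT, value TEXT)")
    conn.executemany(
        "INSERT INTO sequences VALUES (?, ?)",
        [("A1", 120), ("B2", 80), ("C3", 300)],
    )
    conn.commit()
    return conn


def sequences_longer_than(conn: sqlite3.Connection, min_length: int) -> List[str]:
    """Extract data with a query whose parameter comes from the workflow."""
    cursor = conn.execute(
        "SELECT accession FROM sequences WHERE length >= ? ORDER BY accession",
        (min_length,),  # bound parameter, never string-concatenated into the SQL
    )
    return [row[0] for row in cursor]


def store_result(conn: sqlite3.Connection, lsid: str, value: str) -> None:
    """Insert a value produced by the workflow into a named table."""
    conn.execute(
        "INSERT INTO workflow_results (lsid, value) VALUES (?, ?)", (lsid, value)
    )
    conn.commit()
```

Binding values through `?` placeholders keeps workflow-supplied values out of the SQL text itself, which is the same reason one would use prepared statements on the JDBC side.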

11 Metadata

12 Metadata Generation Taverna Provenance Plugin Listen to Taverna Events –WorkflowEventListener Faithfully record them as ontological instance data –RDF graphs (one for each Taverna run)
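A rough sketch of the listen-and-record idea, assuming a listener that receives callbacks and appends triples to one graph per run. The `ProvenanceRecorder` class and its method names are hypothetical and do not reproduce Taverna's Java `WorkflowEventListener` interface; the predicate names are taken from the workflow-run slide, and their direction is an assumption.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class ProvenanceRecorder:
    """Hypothetical listener that keeps one set of triples (one graph) per run."""

    def __init__(self) -> None:
        self.graphs: Dict[str, Set[Triple]] = defaultdict(set)

    def workflow_launched(self, run: str, workflow: str, user: str) -> None:
        """Record which workflow the run executes and who launched it."""
        self.graphs[run].add((run, "runs", workflow))
        self.graphs[run].add((run, "launchedBy", user))

    def process_completed(self, run: str, process_run: str) -> None:
        """Record that a process run was executed as part of the run."""
        self.graphs[run].add((run, "executed", process_run))


recorder = ProvenanceRecorder()
recorder.workflow_launched("run:1", "workflow:A", "person:x")  # illustrative ids
recorder.process_completed("run:1", "processRun:1")
print(sorted(recorder.graphs["run:1"]))
```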

13 Metadata Representation Ontology (Schema) Storage Query Browsing

14 Representation RDF –triples: subject, predicate, object –URIs (hence easy data integration) –semantic web language –XML serialization –flexible, powerful –sets of triples give rise to graphs

15 Workflow Run urn:lsid:..:wfInstance:8 runs launchedBy belongsTo urn:lsid:…:org:HY7 urn:lsid:…:person:4 urn:lsid:…:workflow:6 urn:lsid:…:processRun:84 urn:lsid:…:processRun:51 executed
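The run pictured on this slide can be written down as a set of triples, which makes the "sets of triples form a graph" point concrete. In this sketch the `example.org` authority is a made-up stand-in for the elided part of each LSID, and the direction of every edge is an assumption read off the slide labels. Because nodes are identified by URIs, merging two graphs is just set union.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical authority; the slide elides this part of each identifier.
BASE = "urn:lsid:example.org"
RUN = f"{BASE}:wfInstance:8"
WORKFLOW = f"{BASE}:workflow:6"
PERSON = f"{BASE}:person:4"
ORG = f"{BASE}:org:HY7"

run_graph: Set[Triple] = {
    (RUN, "runs", WORKFLOW),
    (RUN, "launchedBy", PERSON),
    (RUN, "executed", f"{BASE}:processRun:84"),
    (RUN, "executed", f"{BASE}:processRun:51"),
}
people_graph: Set[Triple] = {(PERSON, "belongsTo", ORG)}


def outgoing(triples: Set[Triple]) -> Dict[str, List[Tuple[str, str]]]:
    """Index triples by subject: each subject maps to its sorted outgoing edges."""
    index: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for subject, predicate, obj in triples:
        index[subject].append((predicate, obj))
    return {subject: sorted(edges) for subject, edges in index.items()}


# Shared URIs mean the two graphs join on PERSON without any key mapping.
merged = run_graph | people_graph
for predicate, obj in outgoing(merged)[RUN]:
    print(predicate, obj)
```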

16 Schema Ontology –RDF schema Taxonomic inferences –also available as OWL opens it up to complex reasoning
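As a small illustration of the taxonomic inference an RDF schema allows, the sketch below follows subclass links transitively to find every type an instance belongs to. The class hierarchy is invented for the example (only `Experimenter` appears in the slides; `Person` and `Agent` are hypothetical superclasses), and a real deployment would use an RDFS reasoner rather than hand-written code.

```python
from typing import Dict, Set

# Hypothetical hierarchy: class -> direct superclasses.
SUBCLASS_OF: Dict[str, Set[str]] = {
    "Experimenter": {"Person"},
    "Person": {"Agent"},
}


def superclasses(cls: str, subclass_of: Dict[str, Set[str]]) -> Set[str]:
    """Return all direct and indirect superclasses of cls."""
    seen: Set[str] = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in seen:  # also guards against cycles
                seen.add(parent)
                stack.append(parent)
    return seen


def inferred_types(asserted_type: str, subclass_of: Dict[str, Set[str]]) -> Set[str]:
    """An instance of a class is also an instance of every superclass."""
    return {asserted_type} | superclasses(asserted_type, subclass_of)


print(sorted(inferred_types("Experimenter", SUBCLASS_OF)))
```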

17

18 Typed Workflow Run urn:lsid:..:wfInstance:8 runs launchedBy Experimenter belongsTo Organization urn:lsid:…:org:HY7 ProcessRun WorkflowRun Workflow Provenance Ontology runs launchedBy belongsTo executed urn:lsid:…:person:4 urn:lsid:…:workflow:6 urn:lsid:…:processRun:84 urn:lsid:…:processRun:51 executed

19 Storage Named RDF graphs –retrieve whole graphs (eg workflows) –implementation in NG4J (Jena + MySQL) –scalability issues Sesame2 native store –scalable –Java 5

20 Query RDF query languages –TriQL, SeRQL, SPARQL query languages for named RDF graphs Ontology inspection/reasoning Canned Queries –workflows with failed processes –input/output of past process runs –workflows with data changed by user
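To show what a canned query such as "workflows with failed processes" computes, here is a plain-Python sketch over named graphs held as sets of triples. It stands in for a TriQL, SeRQL or SPARQL query and is not one; the `status` predicate, the `failed` value and the graph names are assumptions made for the example.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]


def workflows_with_failed_processes(named_graphs: Dict[str, Set[Triple]]) -> List[str]:
    """Return the names of graphs in which an executed process run has failed."""
    matches = []
    for name, triples in named_graphs.items():
        # Process runs marked as failed (hypothetical 'status' predicate).
        failed = {s for (s, p, o) in triples if p == "status" and o == "failed"}
        # Keep the graph if any of those runs was executed within it.
        if any(p == "executed" and o in failed for (_, p, o) in triples):
            matches.append(name)
    return sorted(matches)


graphs = {
    "run:1": {("run:1", "executed", "pr:1"), ("pr:1", "status", "failed")},
    "run:2": {("run:2", "executed", "pr:2"), ("pr:2", "status", "completed")},
}
print(workflows_with_failed_processes(graphs))
```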

21

22 Browsing

23 Provenance Browsing Provenance Browser Plugin –reusing Taverna GUI components Matthew Gamble

24

25 Analysis

26 Provenance Analysis Comparison Aggregation etc –see work by Jun Zhao

27 Security

28

29 User sends LSID ref and credentials to the Access Point Access Point returns data and metadata or denies access as follows: –credentials are passed to a User Directory –User Directory passes the corresponding user to the Authorization Authority –Authorization Authority returns the user attributes in the form of a (possibly signed) SAML assertion –this assertion, together with the LSID and its corresponding metadata, is passed to the Policy Enforcement Point (PEP) –PEP uses these three inputs to form an XACML request that is passed to a Policy Decision Point (PDP) that is preloaded with an XACML Policy Set –PDP evaluates the request against its policy set and returns an XACML response to PEP –PEP decodes the response and either allows data/metadata to be returned to the user or denies access.
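The sequence of hand-offs described on this slide can be sketched as a single function that wires the components together. Everything here is a simplification under stated assumptions: plain dictionaries stand in for the SAML assertion and the XACML request, the component callables and the `access_point` signature are hypothetical, and only the "Permit" decision value is borrowed from XACML.

```python
from typing import Any, Callable, Dict, Optional, Tuple


def access_point(
    lsid: str,
    credentials: str,
    *,
    user_directory: Callable[[str], Optional[str]],
    authorization_authority: Callable[[str], Dict[str, Any]],
    metadata_store: Dict[str, Dict[str, Any]],
    data_store: Dict[str, Any],
    pdp: Callable[[Dict[str, Any]], str],
) -> Optional[Tuple[Any, Dict[str, Any]]]:
    """Return (data, metadata) if access is permitted, otherwise None."""
    # Credentials go to the User Directory, which identifies the user.
    user = user_directory(credentials)
    if user is None:
        return None
    # The Authorization Authority returns user attributes (stand-in for SAML).
    assertion = authorization_authority(user)
    metadata = metadata_store.get(lsid, {})
    # The PEP combines assertion, LSID and metadata into one request
    # (stand-in for an XACML request) and asks the PDP for a decision.
    request = {"subject": assertion, "resource": lsid, "metadata": metadata}
    decision = pdp(request)
    # The PEP enforces the decision.
    if decision == "Permit" and lsid in data_store:
        return data_store[lsid], metadata
    return None
```

Passing the components in as callables keeps the enforcement logic separate from the decision logic, which mirrors the PEP/PDP split on the slide.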

30 myGrid XACML Policy Scenario –supervisors can access all workflows in the organization –students can access only their own workflows –blacklisted users cannot access anything See policySet.xml on myGrid wiki
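The three rules of this scenario can be expressed as a small decision function. This is not the content of policySet.xml, which is not reproduced here; it is a sketch that assumes the blacklist rule overrides the others and that "in the organization" means the user and the workflow share an organization. The function and parameter names are hypothetical.

```python
from typing import Set


def decide(
    user: str,
    role: str,
    user_org: str,
    workflow_owner: str,
    workflow_org: str,
    blacklist: Set[str],
) -> str:
    """Return 'Permit' or 'Deny' for a request to access one workflow."""
    # Assumed ordering: a blacklisted user is denied regardless of role.
    if user in blacklist:
        return "Deny"
    # Supervisors can access all workflows in their organization.
    if role == "supervisor" and user_org == workflow_org:
        return "Permit"
    # Students can access only their own workflows.
    if role == "student" and user == workflow_owner:
        return "Permit"
    return "Deny"


print(decide("carol", "supervisor", "org:HY7", "dave", "org:HY7", set()))
```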

