Provenance of scientific information as experienced in DRIVER. 6th e-Infrastructure Concertation Event, Lyon, 24th November 2008. Wolfram Horstmann, Bielefeld.


1 Provenance of scientific information as experienced in DRIVER
6th e-Infrastructure Concertation Event, Lyon, 24th November 2008
Wolfram Horstmann, Bielefeld University / DRIVER

2 Notions of Provenance
Where do data objects* originate from?
  – Scientific work (examples)
      Instrumentation techniques
        – Manufacturers of hardware and software
      Methodologies
        – Processes, e.g. gene sequencing
  – Technical/local (examples)
      (Web) identifiers
      Database, repository name
* Primary data, documents, metadata …

3 Why Provenance?
Quoting / citing / referencing as a global scientific principle
  – "Reproducible research"
Giving credit to authors / creators in distributed environments
Original location / context has to be known
Experienced in grid environments [1]

4 Provenance & Interoperability
Re-use / sharing: "Addressing/Accessing"
  – Common view, common use
  – Unidirectional: no change of data objects!
Federation: "Discovering in Context"
  – Remote representation of distributed data objects (DOs)
Aggregation: "Contextualizing"
  – Add unchanged object in a context
Processing / annotation: "Changing"
  – Uni- vs. bidirectional: change of DOs and remote representation vs. back-storage (e.g. CVS)

5 Scenarios in DRIVER

6 Digital Scientific Data

7 Digital Object Collections (figure)

8 Digital Object Repositories (figure)

9 Digital Information Space

10 Conventional Web Data

11 "Simple" Applications

12 Metadata Infrastructure

13 Basic Provenance Settings
Indicate production situation
  – Metadata: author, instrumentation etc.
Remote representation
  – Indicate place of origin in remote systems
Metadata as digital objects / first order citizens
  – Allow lineage representation
Credits in remote environments / versioning

14 Orders of Provenance
1st order: metadata
  – Provenance attached to data
  – Minimal "knowledge" required in application
  – Allows remote handling of data objects
  – Requires metadata infrastructure
  – Metadata introduce two objects: requires linkage
2nd order: context / compounds
  – Expresses multiple relations between objects
  – May introduce semantic model
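
The two orders can be pictured as a small data model. The following is only a sketch under stated assumptions, not DRIVER's actual model: the class names (DataObject, MetadataRecord, Context), the field names and the helper origin_of are all hypothetical. It illustrates why first-order provenance introduces a linkage requirement (the metadata record is a second object that must point at the data) and how second-order provenance adds named relations between objects.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataObject:
    """A primary data object, identified by a web identifier (hypothetical model)."""
    uri: str


@dataclass(frozen=True)
class MetadataRecord:
    """First-order provenance: a separate object describing a data object.

    Because the record is an object of its own, it needs an explicit
    link (`describes`) back to the data object it is about.
    """
    uri: str
    describes: str        # URI of the described DataObject
    author: str
    instrumentation: str
    repository: str       # place of origin


@dataclass
class Context:
    """Second-order provenance: named relations between objects."""
    relations: list = field(default_factory=list)

    def relate(self, subject: str, predicate: str, obj: str) -> None:
        self.relations.append((subject, predicate, obj))

    def related(self, subject: str, predicate: str) -> list:
        return [o for s, p, o in self.relations if s == subject and p == predicate]


def origin_of(data: DataObject, records: list) -> list:
    """Resolve the linkage: repositories named by records that describe `data`."""
    return [r.repository for r in records if r.describes == data.uri]
```

A record whose `describes` field does not match any known data object is exactly the "linkage problem": the provenance statement exists but cannot be attached to anything.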

15 Provenance in DRIVER #1
Simple objects: OAI-PMH [2]
  – 1st order provenance
      Metadata: minimum OAI-DC
  – 2nd order provenance
      DRIVER: explicit identifiers for repositories
      OAI-PMH: inline representation ("about")
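
The inline "about" representation can be read with a few lines of standard-library Python. This is a minimal sketch, assuming a record that follows the OAI-PMH provenance guidelines [2]; the sample record, its example.org URLs and the function name origin_descriptions are invented for illustration and are not taken from DRIVER.

```python
import xml.etree.ElementTree as ET

# Namespaces of OAI-PMH 2.0 and of the provenance container.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "prov": "http://www.openarchives.org/OAI/2.0/provenance",
}

# Hypothetical re-exposed record; all identifiers are made up.
SAMPLE_RECORD = """\
<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header>
    <identifier>oai:aggregator.example.org:123</identifier>
    <datestamp>2008-11-20</datestamp>
  </header>
  <metadata/>
  <about>
    <provenance xmlns="http://www.openarchives.org/OAI/2.0/provenance">
      <originDescription harvestDate="2008-11-19T10:00:00Z" altered="false">
        <baseURL>http://repository.example.org/oai</baseURL>
        <identifier>oai:repository.example.org:abc</identifier>
        <datestamp>2008-10-01</datestamp>
        <metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
      </originDescription>
    </provenance>
  </about>
</record>
"""


def origin_descriptions(record_xml: str) -> list:
    """Return every originDescription in the record, outermost first."""
    record = ET.fromstring(record_xml)
    origins = []
    # './/' also picks up nested originDescription elements (repeated harvests).
    for od in record.iterfind(".//prov:originDescription", NS):
        origins.append({
            "base_url": od.findtext("prov:baseURL", namespaces=NS),
            "identifier": od.findtext("prov:identifier", namespaces=NS),
            "datestamp": od.findtext("prov:datestamp", namespaces=NS),
            "harvest_date": od.get("harvestDate"),
            "altered": od.get("altered") == "true",
        })
    return origins
```

The baseURL and identifier pair is what lets a remote system point back to the place of origin.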

16 Semantic/Compound Data

17 "Semantic" Applications

18 Provenance in DRIVER #2
"Enhanced Publications"
  – Research project in DRIVER-II
  – Representation of data / document packages
  – Use of OAI-ORE

19 Provenance in OAI-ORE
OAI-ORE: Object Re-Use and Exchange [3]
  – Uses Resource Maps < Named Graphs
  – Uses "lineage" to represent explicit provenance
  – Future: explicit provenance model [6]?
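
How "lineage" expresses provenance can be sketched with plain triples. The predicates below (aggregates, proxyFor, proxyIn, lineage) come from the ORE vocabulary; everything else is an assumption made for illustration: the example.org URIs, the tuple representation in place of an RDF library, and the helper names follow_lineage and aggregation_of.

```python
ORE = "http://www.openarchives.org/ore/terms/"

# Hypothetical URIs, for illustration only.
RESOURCE = "http://repository.example.org/data/42"
AGG_ORIGIN = "http://repository.example.org/aggregation/1"
AGG_REMOTE = "http://aggregator.example.org/aggregation/9"
PROXY_ORIGIN = "http://repository.example.org/proxy/1"
PROXY_REMOTE = "http://aggregator.example.org/proxy/9"

# The same resource is aggregated twice; each proxy stands for the
# resource in the context of one aggregation.
TRIPLES = [
    (AGG_ORIGIN, ORE + "aggregates", RESOURCE),
    (AGG_REMOTE, ORE + "aggregates", RESOURCE),
    (PROXY_ORIGIN, ORE + "proxyFor", RESOURCE),
    (PROXY_ORIGIN, ORE + "proxyIn", AGG_ORIGIN),
    (PROXY_REMOTE, ORE + "proxyFor", RESOURCE),
    (PROXY_REMOTE, ORE + "proxyIn", AGG_REMOTE),
    # The remote proxy records where the resource was discovered.
    (PROXY_REMOTE, ORE + "lineage", PROXY_ORIGIN),
]


def follow_lineage(triples, proxy):
    """Return the chain of proxies from `proxy` back to the earliest known one."""
    lineage = {s: o for s, p, o in triples if p == ORE + "lineage"}
    chain = [proxy]
    seen = {proxy}
    while chain[-1] in lineage:
        nxt = lineage[chain[-1]]
        if nxt in seen:  # guard against cyclic statements
            break
        chain.append(nxt)
        seen.add(nxt)
    return chain


def aggregation_of(triples, proxy):
    """Return the aggregation a proxy belongs to, or None if unknown."""
    for s, p, o in triples:
        if s == proxy and p == ORE + "proxyIn":
            return o
    return None
```

Following the chain from the remote proxy and asking for the aggregation of its last element yields the context in which the resource was originally found.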

20 Summary
Provenance is essential for …
  – Indicating origin in distributed data spaces
      Accessing / addressing
      Federation / aggregation
      Processing / annotation
  – Document and data citation / trace-back
  – 1st order: describing data > metadata
  – 2nd order: describing context > semantic data

21 Lessons learnt in DRIVER
Use web-enabled identification (URI/UDDI etc.)
  – "Dark" databases don't interoperate
1st order provenance at place of origin
  – Requires metadata to describe origin
  – Enables a metadata infrastructure
  – Introduces linkage problem
2nd order provenance in contexts
  – Requires data provider identification in federators / aggregators in order to link back
  – May require semantic model for context
  – Would benefit from a semantic infrastructure
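
Linking back from an aggregator to the data provider can be sketched as follows. The GetRecord request form (verb, identifier, metadataPrefix) is standard OAI-PMH; the registry contents, the repository identifier "repo:example" and the function names are assumptions for illustration and do not describe DRIVER's actual registry.

```python
from urllib.parse import urlencode

# Hypothetical registry: repository identifier -> OAI-PMH base URL.
REGISTRY = {"repo:example": "http://repository.example.org/oai"}


def get_record_url(base_url: str, identifier: str, metadata_prefix: str = "oai_dc") -> str:
    """Build an OAI-PMH GetRecord request for a record at its place of origin."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


def link_back(repository_id: str, identifier: str, registry: dict = REGISTRY) -> str:
    """Resolve a harvested record to a request against its data provider.

    A provider missing from the registry cannot be linked back to,
    which is the situation of a "dark" database.
    """
    try:
        base_url = registry[repository_id]
    except KeyError:
        raise LookupError(f"unknown data provider: {repository_id!r}") from None
    return get_record_url(base_url, identifier)
```

The sketch makes the dependency explicit: without a provider identifier stored alongside each harvested record, the aggregator has no key to look up and the link back is lost.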

22 Resources
[1] On provenance in the eScience / grid environment
  – http://www.sigmod.org/sigmod/record/issues/0509/p31-special-sw-section-5.pdf
  – In GLITE: http://www.cesnet.cz/doc/techzpravy/2007/glite-job-provenance/
  – http://twiki.ipaw.info/bin/view/Challenge
[2] On provenance in OAI-PMH
  – http://www.openarchives.org/OAI/2.0/guidelines-provenance.htm
[3] On provenance in OAI-ORE (referred to as ore:lineage)
  – http://www.openarchives.org/ore/meetings/Soton/ore_beyond_basics.pdf (general)
  – http://www.openarchives.org/ore/1.0/vocabulary (definition)
[4] Named Graphs, Provenance and Trust (Carroll et al.)
  – http://www4.wiwiss.fu-berlin.de/bizer/SWTSGuide/carroll-ISWC2004.pdf
[5] W3C: On provenance in RDF
  – http://www.w3.org/2001/12/attributions/
[6] Open Provenance Model
  – http://eprints.ecs.soton.ac.uk/14979/1/opm.pdf
[7] DRIVER: Digital Repository Infrastructure for European Research
  – http://www.driver-community.eu

