1
Provenance of scientific information as experienced in DRIVER
6th e-Infrastructure Concertation Event, Lyon, 24th November 2008
Wolfram Horstmann, Bielefeld University / DRIVER
2
Notions of Provenance
Where do data objects* originate from?
 – Scientific work, examples:
    – Instrumentation techniques: manufacturers of hardware and software
    – Methodologies: processes, e.g. gene sequencing
 – Technical/local, examples:
    – (Web) identifiers
    – Database or repository name
* Primary data, documents, metadata …
3
Why Provenance?
Quoting / citing / referencing as a global scientific principle
 – “Reproducible research”
Giving credit to authors / creators in distributed environments
The original location / context has to be known
Experienced in grid environments [1]
4
Provenance & Interoperability
Re-use / sharing: “Addressing/Accessing”
 – Common view, common use
 – Unidirectional: no change of data objects!
Federation: “Discovering in Context”
 – Remote representation of distributed DOs
Aggregation: “Contextualizing”
 – Add an unchanged object to a context
Processing / annotation: “Changing”
 – Uni- vs. bidirectional: change of DOs and their remote representation vs. back-storage (e.g. CVS)
5
Scenarios in DRIVER
6
Digital Scientific Data
7
Digital Object Collections
8
Digital Object Repositories
9
Digital Information Space
10
Conventional Web Data
11
“Simple” Applications
12
Metadata Infrastructure
13
Basic Provenance Settings
Indicate the production situation
 – Metadata: author, instrumentation etc.
Remote representation
 – Indicate the place of origin in remote systems
Metadata as digital objects / first-order citizens
 – Allow lineage representation
Credits in remote environments / versioning
14
Orders of Provenance
1st order: metadata
 – Provenance attached to data
 – Minimal “knowledge” required in the application
 – Allows remote handling of data objects
 – Requires a metadata infrastructure
 – Metadata introduce two objects (data and metadata): requires linkage
2nd order: context / compounds
 – Expresses multiple relations between objects
 – May introduce a semantic model
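The two orders can be illustrated with a minimal sketch. The class names (`DataObject`, `MetadataRecord`, `Relation`), the predicates and the example.org URIs are hypothetical and not part of DRIVER; the point is only that 1st order provenance needs a link from a metadata record to its data object, while 2nd order provenance is a set of relations around the object.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataObject:
    uri: str  # web identifier of the data object


@dataclass(frozen=True)
class MetadataRecord:
    uri: str        # the metadata record is an identifiable object itself
    describes: str  # linkage: URI of the data object it describes
    creator: str
    instrument: str


@dataclass(frozen=True)
class Relation:
    subject: str
    predicate: str
    obj: str


def first_order(records, data_uri):
    """1st order: metadata records linked to the given data object."""
    return [r for r in records if r.describes == data_uri]


def second_order(relations, uri):
    """2nd order: all relations the object takes part in (its context)."""
    return [r for r in relations if uri in (r.subject, r.obj)]


# Hypothetical example data
data = DataObject("http://example.org/data/seq-1")
records = [
    MetadataRecord("http://example.org/md/seq-1", data.uri,
                   "A. Author", "Sequencer X"),
]
relations = [
    Relation("http://example.org/paper/7", "cites", data.uri),
    Relation(data.uri, "isPartOf", "http://example.org/collection/genes"),
]

print(first_order(records, data.uri)[0].creator)
print(len(second_order(relations, data.uri)))
```

Keeping `describes` as an explicit field makes the linkage requirement visible: without it, the two objects cannot be brought back together in a remote system.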
15
Provenance in DRIVER #1
Simple objects: OAI-PMH [2]
 – 1st order provenance
    – Metadata: minimum OAI-DC
 – 2nd order provenance
    – DRIVER: explicit identifiers for repositories
    – OAI-PMH: inline representation (“about”)
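As a sketch of the inline “about” representation, the snippet below reads the origin of a re-exposed record from a provenance container as described in the OAI-PMH provenance guidelines [2]. The sample record, its identifiers and the function name `origin_of` are made up for illustration; this is not DRIVER's actual harvesting code.

```python
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "prov": "http://www.openarchives.org/OAI/2.0/provenance",
}

# Hypothetical record as an aggregator might re-expose it
RECORD = """
<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header>
    <identifier>oai:aggregator.example.org:123</identifier>
    <datestamp>2008-11-01</datestamp>
  </header>
  <about>
    <provenance xmlns="http://www.openarchives.org/OAI/2.0/provenance">
      <originDescription harvestDate="2008-10-30T12:00:00Z" altered="false">
        <baseURL>http://repository.example.org/oai</baseURL>
        <identifier>oai:repository.example.org:abc</identifier>
        <datestamp>2008-10-01</datestamp>
        <metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
      </originDescription>
    </provenance>
  </about>
</record>
"""


def origin_of(record_xml):
    """Return the origin description of a record, or None if it has none."""
    root = ET.fromstring(record_xml)
    origin = root.find("oai:about/prov:provenance/prov:originDescription", NS)
    if origin is None:
        return None
    return {
        "baseURL": origin.findtext("prov:baseURL", namespaces=NS),
        "identifier": origin.findtext("prov:identifier", namespaces=NS),
        "harvestDate": origin.get("harvestDate"),
        "altered": origin.get("altered") == "true",
    }


print(origin_of(RECORD))
```

The `baseURL` and `identifier` pair is what allows a service to point back to the repository of origin.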
16
Semantic/Compound Data
17
“Semantic” Applications
18
Provenance in DRIVER #2
“Enhanced Publications”
 – Research project in DRIVER-II
 – Representation of data / document packages
 – Use of OAI-ORE
19
Provenance in OAI-ORE
OAI-ORE: Object Re-Use and Exchange [3]
 – Uses Resource Maps (a kind of Named Graph)
 – Uses “lineage” to represent explicit provenance
 – Future: explicit provenance model [6]?
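A rough sketch of how “lineage” can be followed, using plain tuples instead of an RDF library. It assumes our reading of the ORE vocabulary: `ore:lineage` links a proxy for a resource in one aggregation to a proxy for the same resource in the aggregation it was taken from. All example.org URIs and the helper names are hypothetical.

```python
ORE = "http://www.openarchives.org/ore/terms/"

# Hypothetical triples: the same article aggregated in A, then re-used in B
TRIPLES = [
    ("http://example.org/proxy/a1", ORE + "proxyFor", "http://example.org/article.pdf"),
    ("http://example.org/proxy/a1", ORE + "proxyIn", "http://example.org/aggregation/A"),
    ("http://example.org/proxy/b1", ORE + "proxyFor", "http://example.org/article.pdf"),
    ("http://example.org/proxy/b1", ORE + "proxyIn", "http://example.org/aggregation/B"),
    ("http://example.org/proxy/b1", ORE + "lineage", "http://example.org/proxy/a1"),
]


def objects(triples, subject, predicate):
    """Return all objects of triples matching subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def lineage_chain(triples, proxy):
    """List aggregations from the given proxy back to the earliest known one."""
    chain, seen = [], set()
    while proxy is not None and proxy not in seen:  # guard against cycles
        seen.add(proxy)
        chain.extend(objects(triples, proxy, ORE + "proxyIn"))
        earlier = objects(triples, proxy, ORE + "lineage")
        proxy = earlier[0] if earlier else None
    return chain


print(lineage_chain(TRIPLES, "http://example.org/proxy/b1"))
```

Starting from the proxy in aggregation B, the chain ends at aggregation A, which is the trace-back that explicit provenance is meant to provide.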
20
Summary
Provenance is essential for …
 – Indicating origin in distributed data spaces
    – Accessing / addressing
    – Federation / aggregation
    – Processing / annotation
 – Document and data citation / trace-back
 – 1st order: describing data > metadata
 – 2nd order: describing context > semantic data
21
Lessons learnt in DRIVER
Use web-enabled identification (URI/UDDI etc.)
 – “Dark” databases don’t interoperate
1st order provenance at the place of origin
 – Requires metadata to describe the origin
 – Enables a metadata infrastructure
 – Introduces a linkage problem
2nd order provenance in contexts
 – Requires data provider identification in federators / aggregators in order to link back
 – May require a semantic model for the context
 – Would benefit from a semantic infrastructure
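Linking back from an aggregator can be sketched as follows, assuming the aggregator keeps a registry of data providers keyed by an explicit repository identifier. The registry contents, the identifier `repo-1` and the function `link_back` are hypothetical; only the `GetRecord` request parameters come from OAI-PMH.

```python
from urllib.parse import urlencode

# Hypothetical registry: repository identifier -> OAI-PMH base URL
REPOSITORIES = {
    "repo-1": "http://repository.example.org/oai",
}


def link_back(repository_id, original_identifier, metadata_prefix="oai_dc"):
    """Build a GetRecord URL at the place of origin, or None if unknown."""
    base_url = REPOSITORIES.get(repository_id)
    if base_url is None:
        # A provider that is not identified cannot be linked back to
        return None
    query = urlencode({
        "verb": "GetRecord",
        "identifier": original_identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


print(link_back("repo-1", "oai:repository.example.org:abc"))
```

An unknown repository identifier returns `None`, which mirrors the lesson above: without data provider identification the aggregator has nothing to link back to.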
22
Resources
[1] On provenance in the eScience / grid environment
 – http://www.sigmod.org/sigmod/record/issues/0509/p31-special-sw-section-5.pdf
 – In gLite: http://www.cesnet.cz/doc/techzpravy/2007/glite-job-provenance/
 – http://twiki.ipaw.info/bin/view/Challenge
[2] On provenance in OAI-PMH
 – http://www.openarchives.org/OAI/2.0/guidelines-provenance.htm
[3] On provenance in OAI-ORE (referred to as ore:lineage)
 – http://www.openarchives.org/ore/meetings/Soton/ore_beyond_basics.pdf (general)
 – http://www.openarchives.org/ore/1.0/vocabulary (definition)
[4] Named Graphs, Provenance and Trust (Carroll et al.)
 – http://www4.wiwiss.fu-berlin.de/bizer/SWTSGuide/carroll-ISWC2004.pdf
[5] W3C: On provenance in RDF
 – http://www.w3.org/2001/12/attributions/
[6] Open Provenance Model
 – http://eprints.ecs.soton.ac.uk/14979/1/opm.pdf
[7] DRIVER: Digital Repository Infrastructure for European Research
 – http://www.driver-community.eu