1
“provenance” DATA TRACK
Chair: Krystyna Marek
Rapporteur: Wolfram Horstmann
6th e-Infrastructure Concertation, Lyon, 24 Nov 2008
2
Motivation
Last two meetings were on standards
It was proposed to have a more focussed discussion
  – Focus on practice and interoperability rather than standards
Select an arbitrary but important topic
3
Notions of Provenance
Where do data objects* originate from?
  – Scientific Work -- examples
      Instrumentation techniques
        – Manufacturers of hard- and software
      Methodologies
        – Processes, e.g. gene sequencing
  – Technical/Local -- examples
      (web)-identifiers
      Database, repository name
* Primary data, documents, metadata …
4
Why Provenance?
Quoting / Citing / Referencing as global scientific principle
  – “Reproducible research”
Giving credit to authors / creators in distributed environments
Original location / context has to be known
Experienced in Grid-Environments [1]
5
Provenance & Interoperability
Re-Use / Sharing: “Addressing/Accessing”
  – Common view, common use
  – Unidirectional: No change of data objects!
Federation: “Discovering in Context”
  – Remote representation of distributed data objects (DOs)
Aggregation: “Contextualizing”
  – Add unchanged object in a context
Processing/Annotation: “Changing”
  – Uni- vs. Bidirectional: Change of DOs and remote representation vs. back-storage (e.g. CVS)
6
IVOA
Astronomy area: Repositories use OAI-PMH to provide general Provenance as a kind of metadata
  – “Observation data model”
  – History of data (process “lineage”)
      Processing
      Configuration: telescope, camera
      Ambient conditions: temperature etc.
  – Versioning is included (also algorithms etc.)
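The kind of information listed on this slide (processing history, instrument configuration, ambient conditions, versioning) could be held in a record like the one sketched below. This is only an illustration: the class and field names (`ObservationProvenance`, `ProcessingStep`, `algorithm_version`) and all values are assumptions made for this example and are not taken from the IVOA observation data model.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class ProcessingStep:
    """One step in the history ("lineage") of a data object."""
    name: str
    algorithm_version: str  # versioning also covers the algorithms used


@dataclass
class ObservationProvenance:
    """Hypothetical provenance record; field names are illustrative only."""
    configuration: Dict[str, str]          # e.g. telescope, camera
    ambient_conditions: Dict[str, float]   # e.g. temperature
    lineage: List[ProcessingStep] = field(default_factory=list)

    def record(self, name: str, algorithm_version: str) -> None:
        """Append a processing step to the lineage, in order of execution."""
        self.lineage.append(ProcessingStep(name, algorithm_version))

    def history(self) -> List[str]:
        """Return the lineage as 'step@version' strings."""
        return [f"{s.name}@{s.algorithm_version}" for s in self.lineage]


# Made-up example values.
prov = ObservationProvenance(
    configuration={"telescope": "example-telescope", "camera": "example-camera"},
    ambient_conditions={"temperature_c": -5.0},
)
prov.record("bias-subtraction", "1.0")
prov.record("flat-fielding", "2.1")
print(prov.history())
```

Keeping the lineage as an ordered, append-only list means the history can be read back in the order the steps were applied.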
7
MetaFor
Data from numerical models
Descriptive information from model
Models are often transformed
Database / Registry for models in distributed repositories
8
D4Science
Framework for
More than simple import framework
Graphs representing provenance information
  – Thematic: fishing site / statistic /
9
DRIVER
Focus on document repositories
  – Some 100 …
Simple Provenance
  – OAI-PMH
Further (2nd order) Provenance
  – OAI-PMH (“about”): repository identifiers
  – Enhanced Publications >> OAI-ORE
Semantic Model (named graphs) representing packages of documents and data objects
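As a rough illustration of provenance carried in the OAI-PMH “about” container, the sketch below reads the origin of a harvested record from a provenance block. The element names follow the OAI-PMH provenance schema as I understand it; the sample payload (URL, identifier, dates) and the helper name `origin_of` are made up for this example and do not describe DRIVER's actual implementation.

```python
import xml.etree.ElementTree as ET

# Namespace of the OAI-PMH provenance schema.
PROV_NS = {"p": "http://www.openarchives.org/OAI/2.0/provenance"}

# Illustrative <about> payload; URL, identifier and dates are invented.
SAMPLE_ABOUT = """
<about>
  <provenance xmlns="http://www.openarchives.org/OAI/2.0/provenance">
    <originDescription harvestDate="2008-11-01T00:00:00Z" altered="false">
      <baseURL>http://repository.example.org/oai</baseURL>
      <identifier>oai:repository.example.org:1234</identifier>
      <datestamp>2008-10-15</datestamp>
      <metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
    </originDescription>
  </provenance>
</about>
"""


def origin_of(about_xml: str) -> dict:
    """Extract the originating repository and record identifier, if present."""
    root = ET.fromstring(about_xml.strip())
    origin = root.find("p:provenance/p:originDescription", PROV_NS)
    if origin is None:
        return {}
    return {
        "base_url": origin.findtext("p:baseURL", namespaces=PROV_NS),
        "identifier": origin.findtext("p:identifier", namespaces=PROV_NS),
        "datestamp": origin.findtext("p:datestamp", namespaces=PROV_NS),
        "harvest_date": origin.get("harvestDate"),
        "altered": origin.get("altered"),
    }


print(origin_of(SAMPLE_ABOUT))
```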
10
Solutions
Provenance
  – Registries for curator, publisher etc.
  – Resolving over registry
Diversity of approaches
  – CIDOC-CRM, OPM, EuroStats,
  – Languages: RDF / OAI-ORE
11
Differentiations
Expertise from Data-Centers as opposed to Data-Providers
  – Infrastructures should provide functions to add provenance information (but do not)
  – e.g. EGEE provides an additional module for recording provenance data
12
Hot topics
Propagating provenance: versioning
Disambiguation / Deduplication
  – different identical objects
Who provides the data?
  – Each processing step should provide at least some metadata
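One possible reading of “different identical objects” is the same content held under different identifiers in different repositories. Under that assumption, a minimal deduplication sketch groups objects by a checksum of their content. The function name `find_duplicates` and the sample identifiers are hypothetical, and this approach only catches byte-identical copies; re-encoded or reformatted versions would not match.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List, Tuple


def find_duplicates(objects: List[Tuple[str, bytes]]) -> Dict[str, List[str]]:
    """Group identifiers whose content has the same SHA-256 digest.

    Returns only digests shared by more than one identifier.
    """
    by_digest: Dict[str, List[str]] = defaultdict(list)
    for identifier, content in objects:
        by_digest[hashlib.sha256(content).hexdigest()].append(identifier)
    return {d: ids for d, ids in by_digest.items() if len(ids) > 1}


# Made-up holdings: two repositories store the same bytes under different IDs.
holdings = [
    ("repo-a:42", b"same bytes"),
    ("repo-b:7", b"same bytes"),
    ("repo-c:1", b"other bytes"),
]
duplicates = find_duplicates(holdings)
for digest, ids in duplicates.items():
    print(digest[:12], ids)
```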
13
Recommendations for Infrastructure
Standards for Provenance: Non-existing?
  – Each processing step should provide at least some metadata
  – Look deeper into specific implementations in subject communities
Technical point to point organisation
  – Bilateral
Programming a meeting
  – 24/25th ESA: earth science meeting?