“provenance” DATA TRACK Chair : Krystyna Marek Rapporteur: Wolfram Horstmann 6th e-Infrastructure Concertation Lyon 24 Nov 2008.

1 “provenance” DATA TRACK Chair : Krystyna Marek Rapporteur: Wolfram Horstmann 6th e-Infrastructure Concertation Lyon 24 Nov 2008

2 Motivation
Last two meetings were on standards
It was proposed to have a more focussed discussion
  – Focus on practice and interoperability rather than standards
Select an arbitrary but important topic

3 Notions of Provenance
Where do data objects* originate from?
  – Scientific work (examples)
      Instrumentation techniques
        – Manufacturers of hardware and software
      Methodologies
        – Processes, e.g. gene sequencing
  – Technical/local (examples)
      (Web) identifiers
      Database, repository name
* Primary data, documents, metadata …

4 Why Provenance?
Quoting / citing / referencing as a global scientific principle
  – "Reproducible research"
Giving credit to authors / creators in distributed environments
Original location / context has to be known
Experienced in Grid environments [1]

5 Provenance & Interoperability
Re-Use / Sharing: “Addressing/Accessing”
  – Common view, common use
  – Unidirectional: no change of data objects!
Federation: “Discovering in Context”
  – Remote representation of distributed data objects (DOs)
Aggregation: “Contextualizing”
  – Add an unchanged object to a context
Processing/Annotation: “Changing”
  – Uni- vs. bidirectional: change of DOs and their remote representation vs. back-storage (e.g. CVS)

6 IVOA
Astronomy area: repositories use OAI-PMH to provide general provenance as a kind of metadata
  – "Observation data model"
  – History of data (process "lineage")
      Processing
      Configuration: telescope, camera
      Ambient conditions: temperature etc.
  – Versioning is included (also algorithms etc.)
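The lineage idea on this slide can be illustrated with a small record type. This is a minimal sketch only, not the IVOA observation data model: the class names, field names, and sample values (`ProcessingStep`, `ObservationRecord`, `obs-0001`, `T1`, `C1`) are hypothetical and chosen just to mirror the bullets above (processing, configuration, ambient conditions, versioning).

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ProcessingStep:
    """One step in the history ("lineage") of a data object (hypothetical type)."""
    name: str
    algorithm_version: str  # versioning also covers the algorithm used
    configuration: Dict[str, str] = field(default_factory=dict)        # e.g. telescope, camera
    ambient_conditions: Dict[str, float] = field(default_factory=dict)  # e.g. temperature


@dataclass
class ObservationRecord:
    """A data object together with its version number and lineage (hypothetical type)."""
    identifier: str
    version: int = 1
    lineage: List[ProcessingStep] = field(default_factory=list)

    def derive(self, step: ProcessingStep) -> "ObservationRecord":
        """Return a new version whose lineage is extended by `step`.

        The original record is left untouched, so earlier versions stay citable.
        """
        return ObservationRecord(self.identifier, self.version + 1, self.lineage + [step])


# Illustrative values only.
raw = ObservationRecord("obs-0001")
calibrated = raw.derive(
    ProcessingStep(
        name="calibration",
        algorithm_version="1.2",
        configuration={"telescope": "T1", "camera": "C1"},
        ambient_conditions={"temperature_c": -5.0},
    )
)
```

Returning a new record from `derive` instead of mutating the old one is a design choice of this sketch: it keeps every version available for citation.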

7 MetaFor
Data from numerical models
Descriptive information from model
Models are often transformed
Database / registry for models in distributed repositories

8 D4Science
Framework for
More than simple import framework
Graphs representing provenance information
  – Thematic: fishing site / statistic /

9 DRIVER
Focus on document repositories
  – Some 100 …
Simple provenance
  – OAI-PMH
Further (2nd order) provenance
  – OAI-PMH ("about"): repository identifiers
  – Enhanced Publications >> OAI-ORE
Semantic model (named graphs) representing packages of documents and data objects
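As an illustration of second-order provenance carried in an OAI-PMH "about" container, the sketch below reads the origin chain from a provenance block that follows the OAI provenance schema (namespace `http://www.openarchives.org/OAI/2.0/provenance`, element `originDescription`). The sample record, the repository URLs, and the function name `origin_chain` are assumptions made for this example. It does not show how DRIVER itself processes these containers.

```python
import xml.etree.ElementTree as ET

PROV_URI = "http://www.openarchives.org/OAI/2.0/provenance"
PROV_NS = {"prov": PROV_URI}

# Hypothetical "about" container: an aggregator re-exposes a record that it
# harvested from a repository, which had itself harvested it from elsewhere.
SAMPLE_ABOUT = """
<about xmlns="http://www.openarchives.org/OAI/2.0/">
  <provenance xmlns="http://www.openarchives.org/OAI/2.0/provenance">
    <originDescription harvestDate="2008-11-01T00:00:00Z" altered="true">
      <baseURL>http://aggregator.example.org/oai</baseURL>
      <identifier>oai:aggregator.example.org:42</identifier>
      <datestamp>2008-10-20</datestamp>
      <metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
      <originDescription harvestDate="2008-10-19T00:00:00Z" altered="false">
        <baseURL>http://repository.example.org/oai</baseURL>
        <identifier>oai:repository.example.org:1234</identifier>
        <datestamp>2008-10-15</datestamp>
        <metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
      </originDescription>
    </originDescription>
  </provenance>
</about>
"""


def origin_chain(about_xml: str) -> list:
    """Return the origin descriptions, most recent harvest first."""
    root = ET.fromstring(about_xml)
    chain = []
    # iter() walks in document order, so outer descriptions come before nested ones.
    for origin in root.iter("{%s}originDescription" % PROV_URI):
        chain.append({
            "baseURL": origin.findtext("prov:baseURL", namespaces=PROV_NS),
            "identifier": origin.findtext("prov:identifier", namespaces=PROV_NS),
            "harvestDate": origin.get("harvestDate"),
            "altered": origin.get("altered") == "true",
        })
    return chain


chain = origin_chain(SAMPLE_ABOUT)
```

The last entry of `chain` is the earliest known origin of the record; the `altered` flag records whether a harvester changed the metadata before re-exposing it.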

10 Solutions
Provenance
  – Registries for curator, publisher etc.
  – Resolving over registry
Diversity of approaches
  – CIDOC-CRM, OPM, EuroStats
  – Languages: RDF / OAI-ORE

11 Differentiations
Expertise from data centers as opposed to data providers
  – Infrastructures should provide functions to add provenance information (but do not)
  – e.g. EGEE provides an additional module for recording provenance data

12 Hot topics
Propagating provenance: versioning
Disambiguation / Deduplication
  – different identical objects
Who provides the data?
  – Each processing step should provide at least some metadata
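One possible reading of "different identical objects" is that the same content is held under different identifiers in different repositories. Under that assumption, a minimal deduplication sketch groups identifiers by a content fingerprint. The identifiers, payloads, and function names below are hypothetical, and byte-level hashing is only one of several possible notions of "identical".

```python
import hashlib
from collections import defaultdict


def content_fingerprint(payload: bytes) -> str:
    """Fingerprint of the object's bytes; byte-identical objects collide."""
    return hashlib.sha256(payload).hexdigest()


def group_duplicates(objects: dict) -> list:
    """Map {identifier: payload} to lists of identifiers sharing identical content."""
    groups = defaultdict(list)
    for identifier, payload in objects.items():
        groups[content_fingerprint(payload)].append(identifier)
    # Keep only fingerprints that occur under more than one identifier.
    return [sorted(ids) for ids in groups.values() if len(ids) > 1]


# Illustrative holdings of two hypothetical repositories.
holdings = {
    "oai:repo-a.example.org:1": b"spectrum data",
    "oai:repo-b.example.org:77": b"spectrum data",
    "oai:repo-a.example.org:2": b"other data",
}
duplicates = group_duplicates(holdings)
```

Objects that differ only in encoding or formatting would not be matched by this approach; disambiguating those needs the provenance metadata itself.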

13 Recommendations for Infrastructure
Standards for provenance: non-existent?
  – Each processing step should provide at least some metadata
  – Look deeper into specific implementations in subject communities
Technical point-to-point organisation
  – Bilateral
Programming a meeting
  – 24/25th ESA: earth science meeting?
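The recommendation that each processing step should provide at least some metadata could be approximated as follows. This is a sketch under assumptions: the decorator `records_provenance`, the step names, and the entry fields are invented for illustration and do not correspond to any infrastructure named in the slides.

```python
import functools
from datetime import datetime, timezone


def records_provenance(step_name: str, version: str):
    """Wrap a processing function so that it appends a provenance entry."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(data, provenance):
            result = func(data)
            entry = {
                "step": step_name,
                "version": version,
                "timestamp": datetime.now(timezone.utc).isoformat(),
            }
            # Return a new list so earlier provenance trails are not mutated.
            return result, provenance + [entry]
        return wrapper
    return decorator


@records_provenance("scale", "0.1")
def scale(values):
    return [v * 2 for v in values]


@records_provenance("offset", "0.1")
def offset(values):
    return [v + 1 for v in values]


data, trail = scale([1, 2, 3], [])
data, trail = offset(data, trail)
```

After both calls, `trail` holds one entry per step in execution order, which is the minimum a downstream consumer would need to reconstruct how `data` was produced.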

