1
A Digital Provenance Ontology
CRM Digital: A Digital Provenance Ontology
TPDL 2011
Martin Doerr
Center for Cultural Informatics, Institute of Computer Science, Foundation for Research and Technology - Hellas
Berlin, Germany, September 25, 2011
2
Outline
Requirements
Competitors
CRM and Provenance
Data example
About Provenance-based reasoning
Conclusions
3
Digital Provenance Metadata Requirements
Scientific data are empirical or synthetic.
Scientific data cannot be understood without knowledge about the meaning of the data and the ways and circumstances of their creation.
We use metadata to assess:
  meaning (view, experimental setup, instrument settings)
  relevance (depicted things, their status, their conditions)
  quality (calibration, tolerances, errors, “artifacts”)
  possibilities of improvement and reprocessing
Metadata must cover the whole life-cycle: from generation to use, permanent storage and reuse.
4
Requirements
Acquisition: reliable registration of the process and context conditions
  The experimental setup and environment (geometry, light sources, tools, obstacles, sources of noise/reflections etc.)
  Capture device type and identity (individual behavior!)
  Hierarchical model: inherit metadata common to a series of “shots”
  The identity of the measured or depicted object: import identifiers, metadata
  Identity of location: GPS data import?
5
Requirements
Processing: reliable registration of parameters
  Workflow logs, reliable identification of outputs with inputs
  Input files (URIs!), output files (URIs!), formats, warning and error reports
  S/W identifiers and parameters, manual adjustments!
  Process types for reasoning
  Reliable linking with captured data
Use and reuse: parts, wholes and annotation
  Composition of final products, information packages (SIP, DIP, AIP)
  Composition of aggregates, selection of versions or parts for permanent storage, reuse or transfer between labs, to and from Digital Libraries
  Migration to other formats (compatibility and obsolescence)
  Authenticity, rights
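As a minimal sketch of the processing requirements above, one log record per executed step could tie output URIs to input URIs, together with the software identifier, its parameters and any manual adjustments. The class `ProcessingLogEntry`, its field names, the helper `inputs_of` and all example URIs are hypothetical illustrations; they are not part of CRMdig or of any tool named in the talk.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ProcessingLogEntry:
    """One executed processing step (hypothetical record layout)."""
    process_type: str                 # type of process, usable for reasoning
    software_id: str                  # identifier of the software that ran
    inputs: List[str]                 # URIs of input files
    outputs: List[str]                # URIs of output files
    parameters: Dict[str, str] = field(default_factory=dict)
    manual_adjustments: List[str] = field(default_factory=list)
    warnings: List[str] = field(default_factory=list)


def inputs_of(log: List[ProcessingLogEntry], output_uri: str) -> List[str]:
    """Return the input URIs of the step that produced output_uri."""
    for entry in log:
        if output_uri in entry.outputs:
            return entry.inputs
    return []


# Made-up example values, for illustration only.
log = [
    ProcessingLogEntry(
        process_type="mesh-creation",
        software_id="example-mesher-1.0",
        inputs=["urn:example:scan-data-1"],
        outputs=["urn:example:mesh-1"],
        parameters={"resolution": "high"},
        manual_adjustments=["removed stray points by hand"],
    )
]

print(inputs_of(log, "urn:example:mesh-1"))
```

Keeping inputs and outputs as URIs in the same record is what makes the later "identification of outputs with inputs" a simple lookup rather than a guess from file names.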
6
Competitors
There is no standard format for provenance data. Instead there are:
  Too many application-oriented, partial, overspecialized solutions
  Several stand-alone models, overgeneralizations
  No integrated ontology of activity context
Competitors: “Open Provenance Model”, “Provenance Vocabulary”, “Provenir”, “PREMIS”
  No notion of acquisition (measurement, observation) or place
  They confuse agentive role with the substance of actors, machines, S/W, context
  No notion of temporal indeterminacy
W3C Provenance WG precondition: no use of a larger reference ontology => a dogmatic reinvention of the wheel... “antimodularity of ontologies?”
7
Competitors “Provenance Vocabulary”
8
Competitors: “Provenir”
9
CRM and Provenance
The Idea:
  First conceived by Stephen Stead for CHI, San Francisco 2007
  Scientific data and metadata are historical records!
  Scientific observation and machine-supported processing are initiated by, on behalf of and controlled by human activity in physical space-time, not in cyber-space!
  Things, data, people, times and places are causally related by events. Other relations are either deductions from events or found by observation events.
CRM Digital: Specialize the CIDOC CRM (ISO 21127)!
  Will allow for rich integrated reasoning (Christ – Ascension – Ivory panel)
Innovations:
  The Digital Measurement Event transfers from the physical to the digital world.
  Machines “act” due to human initiative and responsibility. Humans use machines. No non-human actors!
10
3D Model Creation as Meetings
[Figure: space-time diagram with axes t and S. Labels: Museum, IT-Lab, museum object, operator, scanner, scan-data, 1st Computer, 2nd Computer, mesh-data, 3D model; coherence volumes of acquisition, mesh-creation and rendering.]
11
CRM Digital 2.5
12
CRM Digital 2.5: Digital Events
http://www.ics.forth
[Diagram: class hierarchy. Classes shown: E7 Activity, E65 Creation, E11 Modification, E16 Measurement, D7 Digital Machine Event, D10 Software Execution, D12 Data Transfer Event, D11 Digital Measurement Event, D3 Formal Derivation, D27 Calibration Process, D2 Digitization Process.]
13
CRM Digital 2.5: Digital Things
[Diagram: class hierarchy. Classes shown: E70 Thing, E73 Information Object, E54 Dimension, E22 Man-Made Object, E84 Information Carrier, D1 Digital Object, D9 Data Object, D14 Software, D35 Area, D8 Digital Device, D13 Digital Information Carrier.]
14
CRM Digital 2.5: Digitization
Digitization = feature transfer from physical to digital
[Diagram. Classes shown: E1 CRM Entity, E16 Measurement, E65 Creation, E11 Modification, E18 Physical Thing, E24 Physical Man-Made Thing, E28 Conceptual Object, E54 Dimension, D1 Digital Object, D2 Digitization Process, D9 Data Object, D11 Digital Measurement Event, D13 Digital Information Carrier.]
Properties shown:
  P31 has modified (was modified by)
  P39 measured (was measured by)
  P40 observed dimension (was observed in)
  P94 has created (was created by)
  L1 digitized (was digitized by)
  L19 stores (is stored on)
  L20 has created (was created by)
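To make the digitization pattern concrete, here is a small sketch that stores one digitization as plain (subject, property, object) tuples and follows "L20 has created" backwards and "L1 digitized" forwards to get from a data object to the physical thing it records. The property labels are taken from the slide; which class sits at which end of each property is my reading of the flattened diagram, and every `ex:` identifier as well as the helper functions are made up for illustration.

```python
# Triples are plain (subject, property, object) tuples.
# All "ex:" identifiers are hypothetical.
triples = [
    ("ex:scan_event_1", "rdf:type", "D2 Digitization Process"),
    ("ex:scan_event_1", "L1 digitized", "ex:museum_object_1"),
    ("ex:scan_event_1", "L20 has created", "ex:scan_data_1"),
    ("ex:museum_object_1", "rdf:type", "E18 Physical Thing"),
    ("ex:scan_data_1", "rdf:type", "D9 Data Object"),
    ("ex:disk_1", "rdf:type", "D13 Digital Information Carrier"),
    ("ex:disk_1", "L19 stores", "ex:scan_data_1"),
]


def objects(subject, prop):
    """All objects o such that (subject, prop, o) is stored."""
    return [o for s, p, o in triples if s == subject and p == prop]


def subjects(prop, obj):
    """All subjects s such that (s, prop, obj) is stored."""
    return [s for s, p, o in triples if p == prop and o == obj]


def digitized_things_of(data_object):
    """Go from a data object to the thing(s) digitized by its creating event."""
    things = []
    for event in subjects("L20 has created", data_object):
        things.extend(objects(event, "L1 digitized"))
    return things


print(digitized_things_of("ex:scan_data_1"))
print(subjects("L19 stores", "ex:scan_data_1"))
```

The point of the sketch is that the link between data and physical object is never stored directly; it is always recovered through the event.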
15
CRM Digital 2.5: Software Execution
Formal Derivation = feature transfer from digital to digital
[Diagram. Classes shown: D1 Digital Object, D7 Digital Machine Event, D10 Software Execution, D3 Formal Derivation, D8 Digital Device, D13 Digital Information Carrier, E55 Type.]
Properties shown:
  L10 had input (was input of)
  L11 had output (was output of)
  L12 happened on device (was device for)
  L2 used as source (was source for)
  L18 has modified (was modified by)
  L13 used parameters (parameters for)
  L21 used as derivation source (was derivation source for)
  L22 created derivative (was derivative created by)
  P2 has type (is type of)
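The derivation properties lend themselves to a simple backward trace. The sketch below records each formal derivation with its "L21 used as derivation source" and "L22 created derivative" values and walks from a derivative back to the earliest known source. It assumes, for simplicity, that each derivative has exactly one recorded source; the `ex:` identifiers and the function `derivation_chain` are hypothetical.

```python
SOURCE = "L21 used as derivation source"
DERIVATIVE = "L22 created derivative"

# Each dict stands for one D3 Formal Derivation event (made-up identifiers).
derivations = [
    {"event": "ex:derivation_1", SOURCE: "ex:raw_scan", DERIVATIVE: "ex:aligned_scan"},
    {"event": "ex:derivation_2", SOURCE: "ex:aligned_scan", DERIVATIVE: "ex:mesh"},
]


def derivation_chain(data_object):
    """List data_object followed by its sources, nearest first."""
    chain = [data_object]
    current = data_object
    while True:
        # Find the derivation event that created the current object, if any.
        step = next((d for d in derivations if d[DERIVATIVE] == current), None)
        # Stop at an underived object, or if the records loop back on themselves.
        if step is None or step[SOURCE] in chain:
            break
        current = step[SOURCE]
        chain.append(current)
    return chain


print(derivation_chain("ex:mesh"))
```

A real repository would allow several sources per derivation, which turns the chain into a graph search; the single-source case is enough to show the principle.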
16
CRM Digital 2.5: Data Transfer Event
Unreliable transfer
[Diagram. Classes shown: D1 Digital Object, D7 Digital Machine Event, D12 Data Transfer Event, D8 Digital Device, D13 Digital Information Carrier.]
Properties shown:
  L10 had input (was input of)
  L11 had output (was output of)
  L18 has modified (was modified by)
  L14 transferred (was transferred by)
  L12 happened on device (was device for)
  L15 has sender (was sender for)
  L16 has receiver (was receiver for)
17
Applications
European IP CASPAR
  European Space Agency: satellite data
  IRCAM: digital media performances
  FORTH: art object digitization
  FORTH/Metaware: integrating digital rights with the provenance model
European IP 3D-COFORM
  3D model acquisition by camera, manual or by camera array. Up to files per object.
  3D model acquisition by laser scan
  Mesh processing, rendering
  Synthetic models and scene compositions
  Provenance-based reasoning
  Scalable repositories, representative amounts of data
18
3D Acquisition Example
3D Reconstruction from Photographs – The Gipsmuseum Campaign
Sven Havemann, CGV, TU Graz, June 30, 2009
Worst case for metadata capture: a complex manual process
19
Acquisition Workflow Hierarchy
[Diagram: instantiation example for D2 Digitization Process, nested by "has part".]
  Data Acquisition Event DAE1 has part: Calibration Event CE1, Object Acquisition Event OAE1, Digital Documentation Event DDE1
  Object Acquisition Event OAE1 has part: Calibration Event CE2, Sequence Event SE1, Digital Documentation Event DDE2
  Sequence Event SE1 has part: Calibration Event CE3, Capturing Event CapE1 (several capturing events are shown)
20
Modelling the Acquisition Process (AP)
Register:
  Who, when, where
  Equipment identifiers, equipment models, firmware
  Setup geometry and conditions
Assumptions: worst case, a completely manual process!
  Set of objects captured under common conditions
  Each object captured by a sequence of “shots”
Metadata are stored in “historical order” (like workflow logs)
  step by step as executed, not as planned!
  concatenated by referring to identifiers of previously existing or created entities and initialized events
  = robust against exceptions in the planned workflow
Avoid redundancy of information
  Hold common information as high as possible in a hierarchy of nested activities
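The rule of holding common information as high as possible in the hierarchy can be sketched as a lookup that climbs the "forms part of" chain until it finds a value. This is only an illustration under assumptions: the class `Event`, its `lookup` method, the metadata keys and the example values are hypothetical and do not come from the 3D-COFORM implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Event:
    """A recorded activity; parent models the 'forms part of' link."""
    label: str
    parent: Optional["Event"] = None
    metadata: Dict[str, str] = field(default_factory=dict)

    def lookup(self, key: str) -> Optional[str]:
        """Return the nearest value for key, searching this event, then its super-events."""
        event: Optional[Event] = self
        while event is not None:
            if key in event.metadata:
                return event.metadata[key]
            event = event.parent
        return None


# Common conditions are registered once, on the campaign-level event.
campaign = Event("data acquisition", metadata={"operator": "operator-1",
                                               "device": "scanner-1"})
object_acq = Event("object acquisition", parent=campaign,
                   metadata={"object": "object-1"})
# A single shot only records what is specific to it.
shot = Event("capture 1_0", parent=object_acq,
             metadata={"start": "16:07:54"})

print(shot.lookup("operator"))
print(shot.lookup("object"))
print(shot.lookup("start"))
```

Because each shot stores only its own differences, an exception in the planned workflow (for example a changed device for one shot) is recorded by overriding the key on that shot alone.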
21
3D Acquisition Example: The Kazafani Boat
The Kazafani Boat
  Found in 1963 during a salvage excavation in the now Turkish-occupied part of Cyprus (inaccessible and destroyed site)
  Tomb from the 12th century B.C.
  Unique object, hand-made pottery
  40x20.5x23 cm, canoe boat shape
  Permanently exhibited at the Nicosia Museum
Workflow
  3D scanning – NextEngine
  3D model creation – MeshLab
  Rapid prototyping
  Testing glue, stabilizers, colours
  Print final replica
  Colour final replica
22
Data Acquisition Event - Schema
Persons (“operators”)
Person: uuid:aeac e0-a c9a66 (E21 Person)
  P131 is identified by: D21 Person Name
    L51 has first name: Martin (Literal E62 String)
    L52 has last name: Doerr (Literal E62 String)
  P107 is current or former member: (E40 Legal Body)
    L62 in the role of: (E55 Type)
23
Data Acquisition Event - Schema
Legal Bodies & Places
Legal Body: (E40 Legal Body)
  L4 has preferred label: STARC-The Cyprus Institute, Nicosia, Cyprus (Literal E62 String)
  no address:
    P74 has current or former residence: (E53 Place)
      L4 has preferred label: Nicosia (Literal E62 String)
      P3 has note: Cyprus (Literal E62 String)
  exact address and the city where it is located:
    P74 has current or former residence: uuid: dbae7cd0-e371-11e c9a66 (E53 Place)
      L4 has preferred label: 15 Kypranoros Street (Literal E62 String)
      P89 falls within: (E53 Place)
      P3 has note: 15 Kypranoros Street, Nicosia 1061, Cyprus (Literal E62 String)
  just the address without details for city:
      L4 has preferred label: 15 Kypranoros Street (Literal E62 String)
      P3 has note: 15 Kypranoros Street, Nicosia 1061, Cyprus (Literal E62 String)
24
Data Acquisition Event
Data Acquisition Event: uuid:354c91e0-b3fa-11de-98c6-0002a5d5c30a (D2 Digitization Process)
  L4 has preferred label: 2010 Laser scanning in Arch. Museum of Nicosia (Literal E62 String)
  P2 has type: (E55 Type)
  P2 has type: (E55 Type)
  P3 has note: “evening sun shines through the west window” (Literal E62 String)
  SUPER-EVENTS:
    P9 forms part of: uuid:07f05f40-b415-11de-9d a5d5c30c (E7 Activity) (Project)
  WHEN:
    L31 has starting date-time: T08:00:00Z (xs:dateTime E61 Time Primitive)
    L32 has ending date-time: T18:00:00Z (xs:dateTime E61 Time Primitive)
  WHERE:
    P7 took place at: (E53 Place)
  WHO:
    L29 has responsible organisation: (E40 Legal Body)
    L30 has operator: uuid:aeac e0-a c9a66 (E21 Person)
25
Data Acquisition Event
Data Acquisition Event: uuid:354c91e0-b3fa-11de-98c6-0002a5d5c30a (D2 Digitization Process)
  WITH WHAT (camera):
    L12 happened on device: (D8 Digital Device)
      L59 has serial number: E (Literal E62 String)
      L4 has preferred label: “Next Engine Desktop 3D scanner” (Literal E62 String) (= Model)
      P2 has type: (E55 Type)
      L33 has maker: (E39 Actor)
      P3 has note: Next Engine Desktop 3D scanner, Multi stripe laser (Literal E62 String)
      L23 used software or firmware: (D14 Software)
  WITH WHAT (additional devices):
    P16 used specific object: (E22 Man Made Object)
      L59 has serial number: (Literal E62 String)
      L4 has preferred label: SONY PLFE 40 Projector (Literal E62 String) (= Model)
      P2 has type: (E55 Type)
      L33 has maker: (E39 Actor)
    P16 used specific object: (E22 Man Made Object)
26
Object Acquisition Event
Object Acquisition Event: uuid:07f05f40-b415-11de-9d a5d5c30b (D2 Digitization Process)
  L4 has preferred label: 2010 Laser scanning of Kazafani Boat in Archaeological Museum of Nicosia (Literal E62 String)
  P2 has type: (E55 Type)
  L10 had input: uuid:3d066a90-9cb1-11e0-aa c9a66 (D9 Data Object) (calibration file)
  L10 had input: uuid: cce-11e0-aa c9a66 (D9 Data Object) (configuration file)
  SUPER-EVENTS:
    P9 forms part of: uuid:354c91e0-b3fa-11de-98c6-0002a5d5c30a (D2 Digitization Process) (Data Acquisition Event)
27
Object Acquisition Event
Object Acquisition Event: uuid:07f05f40-b415-11de-9d a5d5c30b (D2 Digitization Process)
  WHAT (acquired object):
    L1 digitized: uuid:e4761f00-0ce7-11e0-81e c9a66 (E22 Man-Made Object)
      P1 is identified by: (E42 Identifier) (all “known” URIs)
      L4 has preferred label: Kazafani Boat, vase, (Literal E62 String)
      L53 is not uniquely identified by: Kazafani Boat (Literal E62 String)
      L53 is not uniquely identified by: Bronze Age model of a boat (Literal E62 String)
      L55 has inventory no: (Literal E62 String)
      P2 has type: (E55 Type) (vase)
      P3 has note: “Deep hollow hull with in-curving flat-topped gunwale ….” (Literal E62 String)
      P50 has current keeper: uuid:6f2972e6-ad9e-4a72-930d-263f01e75d8c (E40 Legal Body) (Archaeological Museum of Nicosia)
28
Calibration Event
Calibration Event: uuid:07f05f40-b415-11de-9d a5d5c30c (D2 Digitization Process)
  P2 has type: (E55 Type)
  L1 digitized: (E18 Physical Thing)
    L4 has preferred label: block of barium sulfate (10x10x1cm) (Literal E62 String)
    P2 has type: (E55 Type) (color chart, ruler, greyscale)
  WHEN:
    L31 has starting date-time: T16:04:34Z (xs:dateTime E61 Time Primitive)
    L32 has ending date-time: T16:04:34Z (xs:dateTime E61 Time Primitive)
  OUTPUT:
    L20 has created: uuid:07f05f40-b415-11de-9d a5d5c31c (D9 Data Object)
29
Capturing Event
Capturing Event: uuid:07f05f40-b415-11de-9d a5d5c30n (D2 Digitization Process)
  L4 has preferred label: Capture 1_0 for Boat (Literal E62 String)
  P2 has type: (E55 Type)
  SUPER-EVENTS:
    P9 forms part of: uuid:07f05f40-b415-11de-9d a5d5c30b (D2 Digitization Process) (Object Acquisition Event)
  WHEN:
    L31 has starting date-time: T16:07:54Z (xs:dateTime E61 Time Primitive)
    L32 has ending date-time: T16:07:54Z (xs:dateTime E61 Time Primitive)
30
Capturing Event
Capturing Event: uuid:07f05f40-b415-11de-9d a5d5c30n (D2 Digitization Process) (cont’d)
  OUTPUT:
    L20 has created: uuid:07f05f40-b415-11de-9d a5d5c31g (D9 Data Object) (image file, zip file ...)
      L4 has preferred label: 1_0.ply (Literal E62 String)
      P2 has type: (E55 Type)
      P2 has type: (E55 Type)
      P43 has dimension: (E54 Dimension)
        P2 has type: (E55 Type)
        P90 has value: (xs:integer E60 Number)
        P91 has unit: (E58 Measurement Unit)
        P2 has type: (E55 Type)
        P90 has value: (xs:integer E60 Number)
        P91 has unit: (E58 Measurement Unit)
31
ARC 3D web service component
Perform the 3D reconstruction of an artefact from images retrieved from the RI.
For an input sequence of images, ARC 3D produces a calibration matrix and a depth map for each image identified as usable for the reconstruction.
This output data is then ingested into the RI so that it can be retrieved and loaded into MeshLab to perform the final reconstruction step (integration of the depth maps).
[Diagram: Images -> ARC 3D component -> used images + depth map for each used image + metadata]
32
ARC 3D Process Event
Process Event: uuid:2f7d22db-1d89-11e0-ac c9a66 (D3 Formal Derivation)
  L4 has preferred label: Processing of Ivory Panel raw data with Arc3D (Literal E62 String)
  P2 has type: (E55 Type)
  P2 has type: (E55 Type)
  WHO:
    L29 has responsible organisation: (E40 Legal Body)
    L30 has operator: uuid:2f7d22d3-1d89-11e0-ac c9a66 (E21 Person)
  WHEN:
    L31 has starting date-time: T08:00:00Z (xs:dateTime E61 Time Primitive)
    L32 has ending date-time: T10:00:00Z (xs:dateTime E61 Time Primitive)
  WHERE:
    P7 took place at: (D23 Room)
33
ARC 3D Process Event
Process Event: uuid:354c91e0-b3fa-11de-98c6-0002a5d5c50c (D3 Formal Derivation) (cont’d)
  WITH WHAT (Software):
    L2 used as source: (D14 Software)
      L4 has preferred label: ARC3D (Literal E62 String)
      P2 has type: (E55 Type)
      P2 has type: (E55 Type)
      L33 has maker: (E39 Actor)
        L4 has preferred label: KULeuven PSI VISICS (Literal E62 String)
34
ARC 3D Process Event
Process Event: uuid:354c91e0-b3fa-11de-98c6-0002a5d5c50c (D3 Formal Derivation) (cont’d)
  WHAT (Input):
    L21 used as derivation source: uuid:2f7d22d2-1d89-11e0-ac c9a66 (D9 Data Object)
      L4 has preferred label: A dome-out.zip (Literal E62 String)
  WHAT (Derivative output):
    L22 created derivative: uuid:2f7d22dc-1d89-11e0-ac c9a (D9 Data Object)
      L4 has preferred label: Arc3D-A _dmy.v3d (Literal E62 String)
      P2 has type: (E55 Type)
      P2 has type: (E55 Type) (calibration files, depth map files, CUN file and respective images)
35
Reasoning: A Coherent Semantic Net
[Diagram: semantic network linking object metadata (features, history), acquisition metadata (who, when, where, what, using what; with subevents), raw data objects, processing metadata (who, when, how, what, using what params), params, meshes and models, together with devices and device models, software and algorithms, and shared who, when and where nodes.]
36
Good Reasons for Reasoning (3D-COFORM)
The integrated semantic network of provenance metadata supports data consistency, interpretation, reuse and preservation.
Management:
  garbage collection of all reproducible intermediate results (classify software!)
  export packages: collect all acquisition data and parameters used for one model
Preservation:
  monitor obsolescence of all processing tools and format viewers necessary to interpret or reprocess certain data
Property propagation to subevents and derivatives (economy & consistency):
  For instance, “Which object does my mesh represent?” results in long query paths:
  “Jesus Christ” . forms part of: “Ascension” . is carried by: “Ivory Panel” . was digitized by: “MiniDomeEvent” . has created: “A model v1.zip” . used as derivation source: ...
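The long query path quoted above can be read as a sequence of property hops through the network. The following sketch, with a hand-written edge table and a hypothetical helper `follow`, shows how such a path is evaluated step by step. The node and property labels are copied from the slide; the data structure itself is an assumption and not how the 3D-COFORM repository stores its metadata.

```python
# (node, property) -> list of target nodes; a toy stand-in for the semantic network.
edges = {
    ("Jesus Christ", "forms part of"): ["Ascension"],
    ("Ascension", "is carried by"): ["Ivory Panel"],
    ("Ivory Panel", "was digitized by"): ["MiniDomeEvent"],
    ("MiniDomeEvent", "has created"): ["A model v1.zip"],
}


def follow(start, path):
    """Follow a list of properties from start and return all nodes reached."""
    frontier = [start]
    for prop in path:
        # Replace every node in the frontier by its targets under this property.
        frontier = [target
                    for node in frontier
                    for target in edges.get((node, prop), [])]
    return frontier


path = ["forms part of", "is carried by", "was digitized by", "has created"]
print(follow("Jesus Christ", path))
```

Every additional derivation step lengthens the path by one hop, which is why propagating selected properties to subevents and derivatives saves both query effort and redundant storage.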
37
3D-COFORM: Concatenated Metadata
[Diagram: graph of the concatenated metadata records of the Ivory Panel example. Legend: Man_Made_Object, Digitization_Process, Formal_Derivation, Sub-events, Data_Object. Edge labels: digitized, has_created, forms_part_of, used_as_derivation_source, created_derivative. Nodes include event records (for example 1IvoryPanel_ObjAcqEvent.rdf, 2IvoryPanel_DocEvent.rdf, 4Ivory_Arc3DProcEvent.rdf, 5Ivory_MeshLabProcEvent.rdf) and data objects (for example 2009CA5306_0.tif, A dome-out.zip, Arc3D-A _dmy.v3d).]
38
Conclusions
CRMdig provides a good high-level model for the empirical provenance of digital data, open for further specialization and integrated with arbitrary context representations.
CRM-CRMdig outperforms competitors in expressive power and integration potential.
Future work: theory of property propagation to subevents and derivatives
Needed: theories of feature conservation by kinds of derivation processes
Links: