An Ontological Approach to Digital Preservation Metadata. Martin Doerr, Foundation for Research and Technology - Hellas, Institute of Computer Science (ICS-FORTH). May 23, 2009.


An Ontological Approach to Digital Preservation Metadata
Martin Doerr
Center for Cultural Informatics, Institute of Computer Science, Foundation for Research and Technology - Hellas (ICS-FORTH)
Prague, Czechia, May 23, 2009

ICS-FORTH May 23,  Cultural and scientific data cannot be understood without knowledge about the meaning of the data and the ways and circumstances of their creation  We use Metadata to assess  encoding (used formats, tools, instruments)  meaning (context of creation, experimental setups, background knowledge, etc. ),  relevance (described things, their status, their conditions),  quality (credibility, authenticity, calibration, tolerances, possible errors),  possibilities of Improvement and Reprocessing.  From generation to use, permanent storage, reuse (life-cycle)  No standards yet! Digital Preservation Metadata

ICS-FORTH May 23,  Required: Reliable interoperable registration of the creation and modification processes and contextual conditions – “provenance metadata”, through time.  Solution: a common core ontology to explain the meaning of various data structures describing highly specialized processes.  Idea:  Metadata and scientific data and are historical records!  Tool-mediated creation and machine-supported processing is initiated, on behalf of and controlled by human activity.  Things, data, people, times and places are causally related by events.  Other relations are either deductions from events or found by observation events.  The CIDOC CRM (ISO21127) can be used as core model! Digital Preservation Metadata

ICS-FORTH May 23,  Three applications so far:  For A completely CRM-based model for provenance (scientific workflow) metadata for generating RTI images. (combines up to 2000 individual shots).  For the European Integrated Project CASPAR on Digital Preservation: — Could explicate OAIS PDI Type “Provenance Information” and authenticity as a queries to the CRM.  European IP 3D-COFORM: Digital Provenance of 3D Models.  We have added 10 classes and some properties under the CRM:  Relation of human action and machine action.  Digitization as a measurement and information object creation  Formal derivation: feature preservation between input and output The CRM Digital Extended applications – Digital Provenance

The CRM Digital: Digital Events
[Class hierarchy diagram.]
- CRM classes: E5 Event, E7 Activity, E11 Modification, E16 Measurement, E65 Creation.
- CRMdig classes: C2 Digitization Process, C3 Formal Derivation, C7 Digital Machine Event, C10 Software Execution, C11 Digital Measurement Event, C12 Data Transfer Event.

The CRM Digital: Digital Things
[Class hierarchy diagram.]
- CRM classes: E22 Man-Made Object, E54 Dimension, E70 Thing, E73 Information Object, E84 Information Carrier.
- CRMdig classes: C1 Digital Object, C8 Digital Device, C9 Data Object, C13 Digital Information Carrier.

The CRM Digital: Human creation by machine events
[Class and property diagram; one link is labeled "deduction".]
- Classes: E4 Period, E5 Event, E7 Activity, E19 Physical Object, E22 Man-Made Object, E28 Conceptual Object, E65 Creation, E70 Thing, E73 Information Object, C1 Digital Object, C7 Digital Machine Event, C8 Digital Device.
- Properties:
  - P8 took place on or within (witnessed)
  - P9 consists of (forms part of)
  - P16 used specific object (was used for)
  - P94 has created (was created by)
  - S10 had input (was input of)
  - S11 had output (was output of)
  - S12 happened on device (was device for)

The CRM Digital: Software Execution
[Class and property diagram.]
- Classes: C1 Digital Object, C3 Formal Derivation, C7 Digital Machine Event, C8 Digital Device, C10 Software Execution.
- Properties:
  - S2 used as source (was source for)
  - S10 had input (was input of)
  - S11 had output (was output of)
  - S12 happened on device (was device for)
  - S13 used parameters (parameters for)

The CRM Digital: Digital Measurement (Activity view)
[Class and property diagram.]
- Classes: E7 Activity, E11 Modification, E16 Measurement, E54 Dimension, E55 Type, E65 Creation, C7 Digital Machine Event, C9 Data Object, C11 Digital Measurement Event.
- Properties:
  - P40 observed dimension (was observed in)
  - P125 used object of type (was type of object used in)
  - S15 measured thing of type (was type of thing measured by)
  - S20 has created (was created by)

The CRM Digital: Digitization = feature transfer physical-digital
[Class and property diagram.]
- Classes: E11 Modification, E18 Physical Thing, E24 Physical Man-Made Thing, E28 Conceptual Object, E65 Creation, E70 Thing, E73 Information Object, C1 Digital Object (also labeled "C1 Data Object" in the diagram), C2 Digitization Process, C11 Digital Measurement Event, C13 Digital Information Carrier.
- Properties:
  - P31 has modified (was modified by)
  - P94 has created (was created by)
  - P128 carries (is carried by)
  - S1 digitized (was digitized by)
  - S15 measured thing of type (was type of thing measured by)
  - S18 has modified (was modified by)
  - S19 stores (is stored on)
  - S20 has created (was created by)

The CRM Digital: Unreliable transfer
[Class and property diagram.]
- Classes: E11 Modification, E84 Information Carrier, C1 Digital Object, C7 Digital Machine Event, C8 Digital Device, C12 Data Transfer Event.
- Properties:
  - P31 has modified (was modified by)
  - S10 had input (was input of)
  - S11 had output (was output of)
  - S12 happened on device (was device for)
  - S14 transferred (was transferred by)
  - S15 has sender (was sender for)
  - S16 has receiver (was receiver for)

Preservation Metadata: history of physical objects
[Instance diagram of a custody history.]
- E10 Transfer of Custody events:
  - "The custody passing to Theo's widow"
  - "The custody passing to Johanna's son"
  - "The custody passing to the van Gogh Foundation"
- E39 Actors: Theo van Gogh, Johanna van Gogh-Bonger, Vincent Willem van Gogh, Vincent van Gogh Foundation.
- Object in custody: E73 Information Object.
- Properties:
  - P28 custody surrendered by (surrendered custody through)
  - P29 custody received by (received custody through)
  - P30 transferred custody of (custody transferred through)
  - P50 has current keeper (is current keeper of)
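The custody history on this slide can be read as a chain of E10 Transfer of Custody events. The following is a minimal sketch in Python, not part of the original slides. It assumes that the diagram's arrows connect each event to the actors its label suggests; the `Transfer` class and both function names are hypothetical helpers, not CRM vocabulary. It checks that the chain is unbroken and derives the current keeper (P50).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    """One E10 Transfer of Custody event (hypothetical helper class)."""
    label: str
    surrendered_by: str  # P28 custody surrendered by
    received_by: str     # P29 custody received by


# Actor names come from the slide; the pairing of actors to events is an
# assumption inferred from the event labels.
CHAIN = [
    Transfer("The custody passing to Theo's widow",
             "Theo van Gogh", "Johanna van Gogh-Bonger"),
    Transfer("The custody passing to Johanna's son",
             "Johanna van Gogh-Bonger", "Vincent Willem van Gogh"),
    Transfer("The custody passing to the van Gogh Foundation",
             "Vincent Willem van Gogh", "Vincent van Gogh Foundation"),
]


def chain_is_unbroken(chain):
    """True if every transfer is surrendered by the previous receiver."""
    return all(prev.received_by == nxt.surrendered_by
               for prev, nxt in zip(chain, chain[1:]))


def current_keeper(chain):
    """The last receiver plays the role of P50 has current keeper."""
    return chain[-1].received_by if chain else None
```

A gap in the chain (for example, a missing middle transfer) makes `chain_is_unbroken` return False, which is the kind of completeness check that reasoning about custody relies on.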

Preservation Metadata: creation of born-digital objects
[Instance diagram.]
- E65 Creation "The conception of ADT"
  - P94 has created (was created by): E28 Conceptual Object "ADT"
  - P14 carried out by (performed): E39 Actor "the creator of ADT music"
    - P14.1 in the role of: E55 Type "Composer"
    - P131 is identified by (identifies): E82 Actor Appellation "Georges Aperghis"
  - P14 carried out by (performed): E39 Actor "the creator of ADT libretto"
    - P14.1 in the role of: E55 Type "Writer"
    - P131 is identified by (identifies): E82 Actor Appellation "Peter Szendy"

Preservation Metadata: transformation of digital objects
[Instance diagram of two successive derivations.]
- C1 Digital Objects: Crete.jpg, Crete.png, CreteSmall.png.
- C3 Formal Derivation events: "JPG2PNG conversion", "Reduce png resolution".
- E29 Design or Procedure: "JPG2PNG Algorithm X".
- E28 Conceptual Object: "Adobe Photoshop CS2".
- E55 Types: JPG, PNG, JPG2PNG, Software.
- C1 Digital Object used as parameters: color depth = 24, resolution = 600, compression level = 5.
- Properties:
  - P2 has type (is type of)
  - P16 used specific object (was used for)
  - P32 used general technique (was technique of)
  - P33 used specific technique (was used by)
  - P94 has created (was created by)
  - S2 used as source (was source for)
  - S13 used parameters (parameters for)
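To illustrate how such a derivation history can be traversed, here is a minimal sketch in Python that stores part of the slide's example as (subject, property, object) triples and walks back from a derived file to its sources. The wiring of the arrows (which event used and created which file) is an assumption inferred from the labels, and the `lineage` function is a hypothetical helper, not part of CRMdig.

```python
# Triples for the two C3 Formal Derivation events on the slide.
# Which event is linked to which file is an assumption.
TRIPLES = [
    ("JPG2PNG conversion", "S2 used as source", "Crete.jpg"),
    ("JPG2PNG conversion", "P94 has created", "Crete.png"),
    ("Reduce png resolution", "S2 used as source", "Crete.png"),
    ("Reduce png resolution", "P94 has created", "CreteSmall.png"),
]


def lineage(obj, triples):
    """Return (event, source) steps leading back from obj to its origin."""
    steps = []
    seen = set()
    current = obj
    while current is not None and current not in seen:
        seen.add(current)  # guard against cyclic (malformed) records
        # Find the event that created the current object, if any.
        event = next((s for s, p, o in triples
                      if p == "P94 has created" and o == current), None)
        if event is None:
            break
        # Find the object that event used as its source.
        source = next((o for s, p, o in triples
                       if s == event and p == "S2 used as source"), None)
        steps.append((event, source))
        current = source
    return steps
```

Calling `lineage("CreteSmall.png", TRIPLES)` yields the resolution reduction first and the format conversion second, ending at Crete.jpg, for which no creating event is recorded.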

ICS-FORTH May 23,  Authenticity can be defined on Object History: Given: Man-Made Object O1, “was present at” Event E1 (typically creation or publication) Man-Made Object O2, “was present at” Event E2 (typically ingestion or validation) Information Object X1 “is carried by” O1 (historical carrier) Information Object X2 “is carried by” O2 (current carrier) O2 is “authentic” if O2 = O1, or X1 = X2  Reasoning on completeness/security of curation and carrier transfer chain and/or comparison of multiple assumed current carriers. Preservation Metadata Authenticity

The Open Provenance Model
- An annotated causality graph defined as a record of a past (or current) execution.
- Three node types:
  - Artifact: immutable piece of state, which may have a physical embodiment in a physical object, or a digital representation in a computer system.
  - Process: action or series of actions performed on or caused by artifacts, and resulting in new artifacts.
  - Agent: contextual entity acting as a catalyst of a process, enabling, facilitating, controlling or affecting its execution.
- Nodes can be annotated with properties.
- Processes operate in one or more Roles (R).

ICS-FORTH May 23,  Nodes are connected by edges  used(R)  wasGeneratedBy(R)  wasControlledBy(R)  wasTriggeredBy  wasDerivedFrom Ag P A P P PP A A A used(R) wasGeneratedBy(R) wasControlledBy(R) wasTriggeredBy wasDerivedFrom The Open Provenance Model

ICS-FORTH May 23,  Does not distinguish between material and immaterial objects  Does not explicitly model the concept of an Event, a concept of prominent importance.  Without the notion of event and also of physical objects that are carriers (devices) of information, it is not possible for example, to describe adequately the conditions under which a photograph was taken  the way OPM treats Processes resembles events, however the corresponding ontological structure of OPM is not rich enough.  provenance information recorded according to CRMdig can be mapped to an OPM-based view, but not the other way around The Open Provenance Model

The CIDOC CRM: Conclusions
- The CIDOC model and a suitable extension allow for representing all provenance-related preservation metadata.
- Specific tools need additional models of their specific parameter sets; these do not influence the integration of, and reasoning on, the provenance chain.
- There is no competing generic model that consistently describes material and digital objects and their related history.
- The relationship between human and machine action still needs refinement: using OWL we can avoid the ambiguity of multiple IsA.