Steps Towards a Theory of Information Preservation Giorgos Flouris, Carlo Meghini Istituto di Scienza e Tecnologie dell’ Informazione (ISTI) CNR, Pisa,

Presentation transcript:

Steps Towards a Theory of Information Preservation
Giorgos Flouris, Carlo Meghini
Istituto di Scienza e Tecnologie dell'Informazione (ISTI), CNR, Pisa, Italy
Invited Talk (PresDB-07)

Slide 2. Introduction (23/03/2007, Giorgos Flouris, PresDB-07)
Preservation:
  - A very important, difficult and interesting problem
  - The need for preservation is self-evident
Notes on this work:
  - Ongoing work for CASPAR (suggestions welcome)
  - About digital objects (not about databases, but it can be applied to databases)
  - The focus of this work is not to perform preservation, but to describe formally what it means to perform preservation

Slide 3. Purpose
We are trying to come up with a formal, mathematical, logic-based description of preservation as a scientific discipline, in order to derive a methodology that rests on solid ground (we will then try to apply this methodology to CASPAR).

Slide 4. The Need for a Theory of Information Preservation
Why is such a theory important?
  - A formal, theoretical, mathematical framework allows the proof of impossibility and existence results
  - It allows us to ground existing (and future) methods upon a common formalism for comparison
  - It provides a set of formal desirable properties for existing and future preservation methods
  - It allows proving that a preservation method works well (or does not work well)
Where practitioners believe, a theory can prove.

Slide 5. Preservation Types
[Diagram: a producer and a consumer, separated in time, communicate the concept "the first letter of the English alphabet", written as the symbol "A".]
  - Levels shown: Knowledge Level, Symbol Level, KR Level
  - Producer: understands concept, writes symbol, writes bits
  - Consumer: reads bits, reads symbol, understands concept
  - Preservation types shown: Information Preservation, Data (or Object) Preservation, Bit Preservation

Slide 6. Preservation Types: Example
City       Temperature  Date
Athens     12           08/03/07
Pisa       11           06/03/07
Edinburgh  8            11/03/07
  - Bit Preservation: the database is not corrupt (error correction techniques, backups, refreshment of media)
  - Data Preservation: the database can be opened (preserve the format specification)
  - Information Preservation: the database can be understood (temperatures in Celsius, dates in dd/mm/yy)
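The information-preservation point of this example can be sketched in a few lines of Python. This is only an illustration, not part of the talk: the `UCK` dictionary and the `interpret` function are hypothetical names, and the dictionary stands in for the background knowledge (Celsius, dd/mm/yy) that the slide says must be preserved for the table to be understood.

```python
from datetime import date, datetime

# Hypothetical stand-in for the background knowledge named on the slide:
# temperatures are in Celsius and dates are written dd/mm/yy.
UCK = {"temperature_unit": "Celsius", "date_format": "%d/%m/%y"}

# The rows of the example table, stored as uninterpreted symbols.
ROWS = [
    ("Athens", "12", "08/03/07"),
    ("Pisa", "11", "06/03/07"),
    ("Edinburgh", "8", "11/03/07"),
]

def interpret(row, uck):
    """Attach meaning to one row using the conventions recorded in uck."""
    city, temperature, day = row
    return {
        "city": city,
        "temperature": (int(temperature), uck["temperature_unit"]),
        "date": datetime.strptime(day, uck["date_format"]).date(),
    }

# Reading with the producer's conventions.
athens = interpret(ROWS[0], UCK)

# Reading the same symbols with a different (assumed) convention, mm/dd/yy,
# silently yields a different date: the bits and format survive, the meaning does not.
misread = interpret(ROWS[0], {**UCK, "date_format": "%m/%d/%y"})
```

Under these assumptions the Athens row is read as 8 March 2007 with the producer's conventions and as 3 August 2007 with the swapped date format, which is the kind of misunderstanding information preservation is meant to prevent.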

Slide 7. Statics: Digital Object and UCK
A digital object depends on external information:
  - Bit format (ASCII codes, integer representation, …)
  - Symbols' format (23/03/07 or 03/23/07)
  - Background knowledge (what is the meaning of 23/03/07)
A digital object is attached to a single Underlying Community Knowledge (UCK) that contains this information.
Therefore:
  - A digital object carries no meaning by itself
  - Its meaning (semantics) is derived from the attached UCK

Slide 8. Statics: Schematically

Slide 9. Information to be Preserved: Questions and Answers
Digital object: a set of questions and answers
  - Not all information in a digital object needs to be preserved
  - Example: a document (content, format, fonts, pagination)
The exact information to be preserved depends on:
  - Type of digital object
  - Producer's intentions
  - Digital object's intended reader (Designated Community)
  - Legal issues
  - Practical considerations
  - …

Slide 10. Statics: Information Preservation Structure (IPS)
IPS = UCK + Digital Object
  - UCK: <L, T>
  - Digital Object: <Q, ans>
L is further broken down:
  - L = <V, V_I, P, P_C, ⊧>
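As a rough illustration of "IPS = UCK + Digital Object", the structure can be written down as plain data types. This is a sketch under assumptions: it reads the slide's labels L, T, Q and ans as a logic, a theory, a set of questions and their answers, and all class and field names below are hypothetical. The further breakdown of L is not modelled.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet

@dataclass(frozen=True)
class UCK:
    """Underlying Community Knowledge: a logic plus a theory (assumed reading of L, T)."""
    logic: str               # placeholder name for the logic L
    theory: FrozenSet[str]   # the theory T, kept as opaque statements

@dataclass(frozen=True)
class DigitalObject:
    """A set of questions with their answers (assumed reading of Q, ans)."""
    answers: Dict[str, str]  # maps each question in Q to its answer

    @property
    def questions(self) -> FrozenSet[str]:
        return frozenset(self.answers)

@dataclass(frozen=True)
class IPS:
    """Information Preservation Structure: a UCK together with a digital object."""
    uck: UCK
    digital_object: DigitalObject

# Illustrative instance built from the temperature example of the talk.
uck = UCK(
    logic="hypothetical-logic",
    theory=frozenset({"temperatures are in Celsius", "dates are written dd/mm/yy"}),
)
obj = DigitalObject(answers={"What was the temperature in Pisa on 06/03/07?": "11"})
ips = IPS(uck=uck, digital_object=obj)
```

Keeping the UCK and the digital object as separate fields mirrors the point that the object carries no meaning by itself; the answer "11" only makes sense next to the theory it is attached to.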

Slide 11. IPS and Preservation Models
Preservation models provide a methodological framework for determining the content of an IPS
  - OAIS (ISO standard 14721:2003)
Representation Information (UCK)
  - Structural Information
  - Semantic Information
Preservation Description Information (questions and answers)
  - Provenance
  - Reference
  - Context
  - Fixity
Digital object's content (questions and answers)

Slide 12. Purpose of Preservation

Slide 13. Preservation and Change
UCK evolves
  - If digital objects remained the same, they would either be unreadable or carry the wrong meaning
Thus we need a methodology that will indicate the appropriate changes to all digital objects attached to a UCK, as a function of:
  - The old digital object
  - The old UCK (producer's UCK)
  - The new UCK (consumer's UCK)
  - The UCK evolution specification

Slide 14. Belief Change, Ontology Evolution and Information Preservation (1)
Initial thought: use well-established methods from belief change (belief revision) and ontology evolution
Not possible, in general:
  - The UCK may be a logic not supported by the above fields
  - Changes may affect the logic itself
  - Changes may be of infinite nature
  - Input/output may be different
Example: Roman to Arabic numerals
  - III → 3
  - IV → 4
  - …
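The Roman-to-Arabic example shows why a change "of infinite nature" cannot be listed pair by pair but can still be given a finite description as a program. The sketch below is an illustration only; `roman_to_arabic` is a hypothetical helper, it uses the usual subtractive reading of numerals, and it does not validate malformed input.

```python
# Values of the individual Roman symbols.
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

def roman_to_arabic(numeral: str) -> int:
    """Map a Roman numeral to its integer value.

    A symbol that is smaller than the symbol following it is subtracted
    (IV = 4); every other symbol is added (VI = 6).
    """
    total = 0
    for i, symbol in enumerate(numeral):
        value = ROMAN_VALUES[symbol]
        following = ROMAN_VALUES[numeral[i + 1]] if i + 1 < len(numeral) else 0
        total += -value if value < following else value
    return total

# The correspondences given on the slide, plus the year of the talk.
examples = {n: roman_to_arabic(n) for n in ("III", "IV", "MMVII")}
```

A few lines of code stand in for a table of correspondences with no fixed end, which is the sense in which a mapping needs a finite representation.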

Slide 15. Belief Change, Ontology Evolution and Information Preservation (2)
However, it is possible under some assumptions:
  - The logic does not change
  - The logic in UCK is supported
  - Old UCK and digital object are known, evolution is known
  - Change can be finitely described using standard models
Example from astronomy:
  - Pluto was a Planet
  - Planet definition changed recently (24/08/06, Prague)
  - Pluto reclassified as a Dwarf Planet
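The Pluto example fits the favourable case: the old knowledge is known and the change has a finite description. The following sketch makes that concrete under strong simplifications. Theories are plain sets of (entity, class) facts, the evolution is a pair of fact sets to remove and add, and `evolve` is simple set arithmetic rather than a belief revision operator; all names are hypothetical.

```python
# Old knowledge (producer's side), as a finite set of classification facts.
OLD_THEORY = frozenset({("Pluto", "Planet"), ("Earth", "Planet")})

# Finite description of the change: which facts are withdrawn and which are added.
EVOLUTION = {
    "remove": frozenset({("Pluto", "Planet")}),
    "add": frozenset({("Pluto", "DwarfPlanet")}),
}

def evolve(theory, evolution):
    """Apply a finitely described change to a theory (plain set difference and union)."""
    return (theory - evolution["remove"]) | evolution["add"]

def classes_of(theory, entity):
    """Answer the question 'how is this entity classified?' against a theory."""
    return {cls for (name, cls) in theory if name == entity}

NEW_THEORY = evolve(OLD_THEORY, EVOLUTION)

before = classes_of(OLD_THEORY, "Pluto")
after = classes_of(NEW_THEORY, "Pluto")
```

The same question about Pluto receives different answers before and after the change, while the answer for Earth is untouched. How a digital object written under the old knowledge should itself be changed is not addressed by this sketch.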

Slide 16. Dynamics: Schematically (Idealized Case)
[Diagram labels: Producer, Consumer]

Slide 17. Dynamics: Schematically (General Case)
[Diagram labels: Producer, Consumer, Expanded]
Various levels of preservation: complete, essential, modulo logical equivalence, indirect, approximate, partial, …

Slide 18. Dynamics: IPS Evolution Structure (IPSES)
[Diagram: three IPS structures labelled Producer, Expanded and Consumer, each showing a UCK (L, T), a Digital Object (Q, ans) and the components of L (V, V_I, P, P_C, ⊧); the diagram also shows a ⊇ relation and is labelled IPSES.]
  - The mapping needs a finite representation: Turing Machines
  - The definition of IPSES is incomplete
  - We need a way to compute the green arrow from the information given (old digital object, producer's UCK, consumer's UCK, IPSES)

Slide 19. Putting it All Together: General Ideas
What is preservation?
  - Preservation is the process of retaining the meaning of a digital object unaltered for readers with different background, software, hardware, etc.
What are the preservation types?
  - Bit Preservation: bits are not corrupt
  - Data Preservation: the bits' format is understood/read
  - Information Preservation: the information is understood

Slide 20. Putting it All Together: Statics
What is a digital object?
  - A digital object is a sequence of bits (no meaning)
What gives meaning to a digital object?
  - The underlying (often implicit) format, knowledge, symbols' meaning, etc., represented by the UCK
What should be preserved?
  - A set of questions and their answers
How do we determine the content of an IPS?
  - Preservation models can help

Slide 21. Putting it All Together: Dynamics (General)
Why is preservation needed?
  - The underlying knowledge (UCK) evolves; if digital objects remained the same, they would either not be understood or be misunderstood
When is preservation achieved?
  - When digital objects retain their meaning
Can other research fields help?
  - Belief Revision and Ontology Evolution, but only partially

Slide 22. Putting it All Together: Dynamics (IPSES)
How can we describe UCK evolution?
  - Using an expanded UCK, plus a mapping and a number of correspondences between the UCKs
Is preservation always possible?
  - No; there are various levels of preservation
How should digital objects evolve?
  - Open question; a function of the old digital object, the two UCKs and the UCK evolution information (IPSES)

Slide 23. Future Work
Calculate the evolution of the digital object as a function of:
  - Old digital object
  - Producer's UCK
  - Consumer's UCK
  - IPSES (evolution information)
Ongoing work: refinements might be required
Extensive testing of the theory (real-world examples)
Tie the theory to structures that are more useful in practice

Slide 24. Acknowledgements
This work was carried out during Giorgos Flouris' tenure of an ERCIM "Alain Bensoussan" Fellowship Programme.
This work was partially supported by the EU project CASPAR (FP IST ).

Slide 25. BACKUP SLIDES

Slide 26. Preservation Types Revisited
[Diagram: the producer writes the symbol "Z" for the concept "the last letter of the English alphabet"; the consumer reads the symbol "Z" as "the 6th letter of the Greek alphabet".]
  - Levels shown: Knowledge Level, Symbol Level, KR Level
  - Producer: understands concept, writes symbol, writes bits
  - Consumer: reads bits, reads symbol, understands concept
  - Preservation types shown: Information Preservation, Data (or Object) Preservation, Bit Preservation

Slide 27. Preservation Types: Joke Analogy
In order to laugh at a joke, you must:
  - Hear the joke (bit preservation): the sound waves should reach your ears; if you are in another room, you won't laugh at the joke
  - Understand the joke (data preservation): you should understand the language; if I tell a joke in Greek, you won't laugh at the joke
  - Understand the context of the joke (information preservation): you should understand what the joke is about; if I tell a joke about the political situation in Greece, you won't laugh at the joke

Slide 28. Statics: Underlying Community Knowledge (UCK)
UCK: a logical formalism, plus a logical theory
Because logics are:
  - Formal
  - Able to express knowledge
  - Suitable to capture question-answering (using inference)
  - A well-studied, mature, well-established field with rich results
  - Suitable for building theories that express background knowledge
We don't embrace any particular logic

Slide 29. Contents of a UCK
[Diagram labels: UCK; Common Knowledge; Producer (Knowledge P1, Knowledge P2, Knowledge P3); Intended Consumer (Knowledge C1, Knowledge C2); Digital Object]

Slide 30. Dynamics: Notes on IPSES
IPS Evolution Structure (IPSES):
  - IPSES = UCK + mapping
Exact specification of the change (no side-effects)
  - Usually change is partially specified (has side-effects)
  - Determining side-effects is orthogonal to preservation
Change may be infinite (finite representation needed)
  - Example: Roman and Arabic numerals
  - Need Turing Machines to represent the mapping