What is a text within the Digital Humanities, or some of them at least? Manfred Thaller, Universität zu Köln Digital Humanities 2012, July 20 th 2012.

Slides:



Advertisements
Similar presentations
PC in TB Manfred Thaller PLANETS TB meeting, DenHaag, Sept 28th. '06.
Advertisements

PC/4 Manfred Thaller PLANETS TB meeting, DenHaag, Sept 29th. '06.
Characterisation Adrian Brown The National Archives, UK.
Lecture 2: Basic Information Theory TSBK01 Image Coding and Data Compression Jörgen Ahlberg Div. of Sensor Technology Swedish Defence Research Agency (FOI)
CHAPTER 2 GC101 Program’s algorithm 1. COMMUNICATING WITH A COMPUTER  Programming languages bridge the gap between human thought processes and computer.
Chapter 7 User-Defined Methods. Chapter Objectives  Understand how methods are used in Java programming  Learn about standard (predefined) methods and.
CS292 Computational Vision and Language Pattern Recognition and Classification.
Introduction to Gröbner Bases for Geometric Modeling Geometric & Solid Modeling 1989 Christoph M. Hoffmann.
CS 454 Theory of Computation Sonoma State University, Fall 2011 Instructor: B. (Ravi) Ravikumar Office: 116 I Darwin Hall Original slides by Vahid and.
Describing Syntax and Semantics
Chapter 8 Arrays and Strings
Foundations This chapter lays down the fundamental ideas and choices on which our approach is based. First, it identifies the needs of architects in the.
55:148 Digital Image Processing Chapter 11 3D Vision, Geometry Topics: Basics of projective geometry Points and hyperplanes in projective space Homography.
Management Information Systems Lection 06 Archiving information CLARK UNIVERSITY College of Professional and Continuing Education (COPACE)
Introduction to Database Systems 1.  Assignments – 3 – 9%  Marked Lab – 5 – 10% + 2% (Bonus)  Marked Quiz – 3 – 6%  Mid term exams – 2 – (30%) 15%
The PLANETS-Ontology in the context of the PLANETS-Testbed and the XCL-Software.
An Information Theory based Modeling of DSMLs Zekai Demirezen 1, Barrett Bryant 1, Murat M. Tanik 2 1 Department of Computer and Information Sciences,
Unit 2: Engineering Design Process
Towers of Hanoi. Introduction This problem is discussed in many maths texts, And in computer science an AI as an illustration of recursion and problem.
Electromagnetism Lecture#1 [Introduction] Instructor: Engr. Muhammad Mateen Yaqoob.
Logics for Data and Knowledge Representation
Chapter 1 Introduction Dr. Frank Lee. 1.1 Why Study Compiler? To write more efficient code in a high-level language To provide solid foundation in parsing.
Foundations of Computer Science Computing …it is all about Data Representation, Storage, Processing, and Communication of Data 10/4/20151CS 112 – Foundations.
Digital Image Processing CCS331 Basic Transformations 1.
CHATS IN THE CLASSROOM: EVALUATIONS FROM THE PERSPECTIVES OF STUDENTS AND TUTORS AT CHEMNITZ UNIVERSITY OF TECHNOLOGY, COMMUNICATION ON TECHNOLOGY AND.
The XCL Languages Digital Preservation – The Planets Way Dresden, April 23 rd 2010 Manfred Thaller, Universität zu Köln.
© 2005 Prentice Hall, Decision Support Systems and Intelligent Systems, 7th Edition, Turban, Aronson, and Liang 5-1 Chapter 5 Business Intelligence: Data.
1 What is HTML? Standardized codes Web pages SGML Descriptive markup Tags.
Discrete Mathematics 이재원 School of Information Technology
The Beauty and Joy of Computing Lecture #3 : Creativity & Abstraction UC Berkeley EECS Lecturer Gerald Friedland.
EXtensible Characterisation Languages (XCL) Manfred Thaller, (University at Cologne) DPP meeting, Glasgow, Nov. 23 rd 2006.
The Balance Between Theoretical and Practical Work Within Electrical and Computer Engineering Courses Dr. Bahawodin Baha March Development Partnerships.
1 CS 502: Computing Methods for Digital Libraries Lecture 19 Interoperability Z39.50.
Visual Information Systems Recognition and Classification.
1 Digital Preservation Testbed Database Preservation Issues Remco Verdegem Bern, 9 April 2003.
1 A Historical Perspective on Conceptual Modelling (Based on an article and presentation by Janis Bubenko jr., Royal Institute of Technology, Sweden. June.
CSA2050 Introduction to Computational Linguistics Lecture 1 Overview.
Chapter 5: Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization DECISION SUPPORT SYSTEMS AND BUSINESS.
CSA2050 Introduction to Computational Linguistics Lecture 1 What is Computational Linguistics?
SOFTWARE DESIGN. INTRODUCTION There are 3 distinct types of activities in design 1.External design 2.Architectural design 3.Detailed design Architectural.
CONTENTS Area of study 2 Key knowledge Key skills.
3.2 Semantics. 2 Semantics Attribute Grammars The Meanings of Programs: Semantics Sebesta Chapter 3.
Description of Bibliographic Items. Review Encoding = Markup. The library cataloging “markup” language is MARC. Unlike HTML, MARC tags have meaning (i.e.,
1 / 48 Formal a Language Theory and Describing Semantics Principles of Programming Languages 4.
Mestrado em Ciência de Computadores Mestrado Integrado em Engenharia de Redes e Sistemas Informáticos VC 15/16 – TP14 Pattern Recognition Miguel Tavares.
Syntax and Grammars.
Cooperative Computing & Communication Laboratory A Survey on Transformation Tools for Model-Based User Interface Development Robbie Schäfer – Paderborn.
XCL-Tools in relation to Significant characteristics in Planets Manfred Thaller Universität zu* Köln *University at not of Cologne.
COMP135/COMP535 Digital Multimedia, 2nd edition Nigel Chapman & Jenny Chapman Chapter 2 Lecture 2 – Digital Representations.
College of Computer Science, SCU Computer English Lecture 1 Computer Science Yang Ning 1/46.
Softwaretechnologie für Fortgeschrittene Teil Thaller Stunde VI: Information revisited … Köln 9. Januar 2014.
1 Prints: The Language of Industry. Learning Objectives Identify the importance of prints. Discuss historical processes and technologies related to prints.
Chapter 2 Data in Science. Section 1: Tools and Models.
CS223: Software Engineering
Connecting Architecture Reconstruction Frameworks Ivan Bowman, Michael Godfrey, Ric Holt Software Architecture Group University of Waterloo CoSET ‘99 May.
Auszug aus: What is a text within the Digital Humanities, or some of them at least? Manfred Thaller, Universität zu Köln Digital Humanities 2012, July.
Convergences between modern languages and language(s) of schooling – Sweden –
What We Need to Know to Help Students’ Scores Improve What skills the test measures How the test relates to my curriculum What skills my students already.
Technical Reports ELEC422 Design II. Objectives To gain experience in the process of generating disseminating and sharing of technical knowledge in electrical.
Computer Systems Architecture Edited by Original lecture by Ian Sunley Areas: Computer users Basic topics What is a computer?
Describing Syntax and Semantics
CSCI-235 Micro-Computer Applications
Revision Lecture
ELECTRONIC MAIL SECURITY
Theory of Computation Turing Machines.
ELECTRONIC MAIL SECURITY
CHAPTER 1: THE DATABASE ENVIRONMENT AND DEVELOPMENT PROCESS
Respectful Type Converters
UNIT-II CHAPTER-4 SOFTWARE REQUIREMENT DEFINITION
Presentation transcript:

What is a text within the Digital Humanities, or some of them at least? Manfred Thaller, Universität zu Köln Digital Humanities 2012, July 20 th 2012

Information I

Claude Shannon: "A Mathematical Theory of Communication", Bell System Technical Journal, Shannon 3

The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point. (Shannon, 1948, 379) Shannon 4

It is wet outside. It must be raining … 5

The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point. Frequently the messages have meaning; that is they refer to or are correlated according to some system with certain physical or conceptual entities. These semantic aspects of communication are irrelevant to the engineering problem. (Shannon, 1948, 379) Shannon 6

It is wet outside. It must be raining … 7

Shannon It is wet outside. It is wet outside … 8

Shannon

„Ladder of Knowledge“ wisdom knowledge information data 10

Information 11

Data 12

Data are stored. E.g.: 22°C. Information are data interpreted within a context: "In this lecture hall the temperature is 22°C". This context is fixed and identical for all recipients of information. Data  Information 13

Knowledge is the result of a more complex process. E.g. the decision, derived from the room temperature of 22 ° centigrade, to get out of your jacket; or not. This context is different between recipients of information. Information  Knowledge 14

Data 22 ° C 22 ‘ ’ So … Information 22 ° C in lecture hall M 22 ° 22 [ NOT ASCII { 0, 22 } ] 15

Langefors “Infological Equation”: original I = i (D, S, t) I ::= Information i() ::= interpretative process D ::= Data S ::= Previous knowledge t ::= time Börje Langefors, Essays on Infology, Studentliteratur: Lund, 1995 Langefors 16

Information II

Receiving information … The kink has been kiled. Oh, that was John II. Very strange spelling, even for that time.. 18

Receiving information … Oh, that was John II. Very strange spelling, even for that time.. Notice: We can not consult the sender any more …. ? 19

Langefors “Infological Equation”: original I = i (D, S, t) I ::= Information i() ::= interpretative process D ::= Data S ::= Previous knowledge t ::= time Börje Langefors, Essays on Infology, Studentliteratur: Lund, 1995 Langefors 20

Langefors “Infological Equation”: generalization 1 I 2 = i (I 1, S 2, t) I ::= Information i() ::= interpretative process D ::= Data S ::= Previous knowledge t ::= time Börje Langefors, Essays on Infology, Studentliteratur: Lund, 1995 Langefors 21

Receiving information … Oh, that was John II. Very strange spelling, even for that time.. Notice: We can not consult the sender any more …. ? 22

Receiving information … Or, was it 100 years earlier? Very strange spelling, even for that time.. Notice: We can not consult the sender any more …. ? 23

Langefors “Infological Equation”: generalization 2 I x = i (I x-1, S x, t) I ::= Information i() ::= interpretative process D ::= Data S ::= Previous knowledge t ::= time Börje Langefors, Essays on Infology, Studentliteratur: Lund, 1995 Langefors 24

Langefors “Infological Equation”: generalization 3 S x = s (I x-1, t) I ::= Information i() ::= interpretative process D ::= Data S ::= Previous knowledge, s = knowledge generating process t ::= time Börje Langefors, Essays on Infology, Studentliteratur: Lund, 1995 Langefors 25

Langefors “Infological Equation”: generalization 4 I x = i (I x-α, S x-β, t) I ::= Information i() ::= interpretative process D ::= Data S ::= Previous knowledge t ::= time Börje Langefors, Essays on Infology, Studentliteratur: Lund, 1995 Langefors 26

Langefors “Infological Equation”: generalization 5 I x = i (I x-α, s(I x-β, t), t) I ::= Information i() ::= interpretative process D ::= Data S ::= Previous knowledge t ::= time Börje Langefors, Essays on Infology, Studentliteratur: Lund, 1995 Langefors 27

Data 22 ° C 22 ‘ ’ Remember … Information 22 ° C in lecture hall M 22 ° 22 [ NOT ASCII { 0, 22 } ] 28

Changeable datatypes int myVariable; char myVariable; temperature myVariable; obj myVariable; myVariable.useAsInt(); myVariable.useAsChar(); myVariable.addInterpretation(temperature,Centigrade); 29

Notes: (1)If this is so, the assumption of Comp. Sci., that information is represented by structures on which algorithms operate, can be replaced by a more general understanding, according to which information is a state of a set of perpetually active algorithms. (2) Has that any practical meaning? Langefors 30

A practical interlude

► Photoshop ► Planets: the problem 32

png tiff Extractor Comparator image info 2 image info 1 the same? Format conversion png rulestiff rules Planets: the vision 1 33

Obj 1 Obj 2 Extractor Comparator object info 2 object info 1 the same? Format conversion rule set 1rule set 2 Planets: the vision 2 34

Planets: the vision 3 Obj 1 Obj 2 Extractor Comparator XCDL 2 XCDL 1 the same? Format conversion XCEL 1XCEL 2 Specification of „similiarity“ to be used: „comparator comparison [Language] “ (coco). Specification of „similiarity“ observed: „comparator results [Language] “ (copra). Abstract description of file content: „eXtensible Characterisation Definition Language“ (XCDL), able to describe the content of digital objects (=1 + n more files), processible by a software tool for further analysis. Machine readable form of a file format specification: „eXtensible Characterisation Extraction Language“ (XCEL), able to describe any machine readable format in a formal language, processible by a software tool for extraction of content as XCDL. 35

This is a text … fontsize 48 unsignedInt8 Text in XCDL 36

7A 11 9B F4 DA 9C B … … title Ebstorf Mappa Mundi ASCII Image in XCDL 37

Generalizing the practical solution

Allows to make statements about the proximity of two objects on the "y" axis. Irrespective of the "shape" of the object. Dimensions: geometry 39

Allows to make statements about the proximity of two objects on the "y" axis. Irrespective of the “object" that is at the abstract position. Dimensions: textual / conceptual 40

Dimensions are by definition orthogonal. Dimensions can have any sort of metric:  Rational: { - ∞ … + ∞ }  Integer range: { 0 … 100 }  Nominal: { medieval, early modern, modern }  Image: {, }  … Dimensions: metrics 41

(1) Biggin (2) Biggin (3) Biggin (4) Biggin Which of the chunks are more similar to each other: (1) and (2) or (1) and (3)? Four texts … 42

… in a coordinate space. 43

Liber exodi glosatus An image in a textual coordinate space 44

Liber exodi glosatus An text in an image coordinate space 45

An image in a semantic coordinate space Bishop Cardinal Monk Priest Monk 46

Semantics in an image coordinate space Bishop Cardinal Monk Priest Monk 47

Generalization 1 Biggin Visualization {bold, italic} Interpretation {surname, topographic name} 48

Generalization 2 Series of atomic content tokens Conceptual dimension 1 Conceptual dimension 2 49

Generalization 3 { T, C 1, C 2 } 50

Generalization 4 { T, { C 1, C 2, …, C n } } 51

Generalization 5 { T, C n } (1) Texts are sequences of content carrying atomic tokens. (2) Each of these tokens has a position in an n-dimensional conceptual universe. 52

Generalization 6 { X, Y, C n } 53

Generalization 7 { T 1, T 2, C n } (1) Images are planes of content carrying atomic tokens. (2) Each of these tokens has a position in an n- dimensional conceptual universe. 54

Generalization 8 I ::= { { T 1, T 2, … T m }, C n } (1) Information objects are m-dimensional arrangements of content carrying atomic tokens. (2) Each of these tokens has a position in an n- dimensional conceptual universe. 55

Generalization 9 I ::= {T m, C n } (1) Information objects are m-dimensional arrangements of content carrying atomic tokens. (2) Each of these tokens has a position in an n- dimensional conceptual universe. (3) All of this, of course, is recursive … 56

Another practical interlude

Virtual Research Environments  Virtuelles deutsches Urkundennetz (Virtual network of German charters) 58

digitization editing research transcription publication symbol manipulation teachteach A model of historical research

base image editorial coordinates semantic coordinates textual coordinates publication coordinates symbol coordinates didactic coordinates A model of historical research

Conclusion

Summary (1)All texts, for which we cannot consult the producer, should be understood as a sequence of tokens, where we should keep the representation of the tokens and the representation of our interpretation thereof completely separate. (2)Such representations can be grounded in information theory. (3)These representations are useful as blueprints for software on highly divergent levels of abstraction. 62

Thank you! 63