ICS-FORTH, July 2000. Classifying Historical Documents. Maria Theodoridou, Martin Doerr, Foundation for Research and Technology - Hellas, Institute of Computer Science.


Classifying Historical Documents
Maria Theodoridou, Martin Doerr
Foundation for Research and Technology - Hellas
Institute of Computer Science
Heraklion, Crete

The classification problem
- Automatic transcription is not possible
  - inaccurate OCR software
  - interpretation dependent
- Manual keyword assignment
  - time-consuming process
  - keywords not necessarily unique
  - inconsistent between users
  - not obvious for users in retrieval
- Complete classification only on parts of the database
  - by different aspects
  - at different times
  - by different people

Archival standards
- Dublin Core Metadata Elements
- EAD: Encoded Archival Description, Document Type Definition
- ISAD(G): General International Standard Archival Description

Task Analysis
- The archivist maintains the inventory
  - organizes fonds and subfonds (manageable units and provenance)
  - assigns identification numbers to ensure integrity
  - documents provenance and chronology of collective units
- Handling of the material is hazardous to health and to the material
  - replace access by electronic surrogate
  - preserve electronic copies for preservation of contents
- Researchers are granted access to study parts
  - focused studies, resulting in publications
  - primary information partially overlaps between studies

Idea of Operation
- Scanned images replace access to originals
- Researchers should leave core documentation on partial contents
- Ergonomic classification user interface (minutes per document)
- Thesauri assist classification

Classification structure
- Classification by a semantic net of metadata
  - Analysis of the entities of the archive material
  - Classification of documents by:
    - (1) date and type of administrative act
    - (2) described activities
  - Syntactic structure to describe multiple and nested activities
  - Notion of identity of persons, places, objects
  - Coherent classification on instance and concept level

Historical Archives: Modelling collections
[Diagram. Entities: ArchivalType, Fonds, Subfonds, CurrentFonds, HistoricalFonds, Current Subfonds, Historical Subfonds, FilmArchive. Links: subfonds_of, belongs_to, copy_of, copy_of_part, part_of. Legend: classification, attribute, generalisation; ArchivalDescription link categories: structural, derived_from, corresponding.]

Historical Archives: Modelling collections and objects
[Diagram. Entities: ArchivalType, Physical ArchivalType, Conceptual ArchivalType, UnitOfDescription, Item, Fonds, Subfonds, CurrentFonds, HistoricalFonds, Current Subfonds, Historical Subfonds, FilmArchive. Links, with their category in parentheses: subfonds_of (s), belongs_to (s), copy_of (d), copy_of_part (d), originates_from (c), kept_in (c), part_of (c). Legend: classification, attribute, generalisation; ArchivalDescription link categories: structural (s), derived_from (d), corresponding (c).]

Historical Archives: Modelling objects vs. contents
[Diagram. Entities: Physical ArchivalType, Conceptual ArchivalType, UnitOfDescription, Microfilm, Sheet, File, Book, Item, Unit, Series, Document, Picture, BookPage, Shot, SheetPage, Photograph. Links, with their category in parentheses: contains (s), contains_first (s), contains_second (s), copy_of (d), corresponds_to (c). Legend: classification, attribute, generalisation; ArchivalDescription link categories: structural (s), derived_from (d), corresponding (c).]

Historical Archives: Modelling processes
[Diagram. Entities: EventType, DescriptionType, ActionType, ArchivalType, ConceptualArchivalType, PhysicalArchivalType, ElectronicDocumentType, ElectronicProcessingType, UnitOfDescription, Fonds, Unit, Item, Document, Picture, SheetPage, ScannedPage, ElectronicDocument, ElectronicProcessing, Scanning, Editing, Transcription, Translation, Occurence. Links: history, product, produced_from, result, corresponds_to. Legend: classification, attribute, generalisation; ArchivalDescription link categories: structural, derived_from, corresponding.]

Historical Archives: The Facets
- Four levels:
  - the act of documentation
  - the act of administration
  - the targeted social activity
  - other related activities and items
- Questions that need to be answered:
  - Who? Persons and organizations
  - Where? Places
  - When? Time
  - What? Objects
  - How? Activities and actions

Historical Archives: Faceted classification by concepts
[Diagram layers: Facet Polyhierarchies; Instances (metadata); Manuscripts' Digital Library.]

Historical Archives: Faceted classification by concepts - An example
[Diagram labels. Facet Polyhierarchies: Persons and Organisations > Individuals; Places > Houses. Instances (metadata): Martin; house nr.415; link "live in" / "is Martin's". Manuscripts' Digital Library.]
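The example on this slide can be sketched in a few lines of Python. This is only an illustrative reading of the diagram, not the ARCHON implementation: the assumption is that Individuals is a narrower concept of Persons and Organisations, that Houses is a narrower concept of Places, and that "live in" is a link between two instances. The names `BROADER`, `INSTANCE_OF`, `LINKS`, `instances_under` and `linked_to` are hypothetical.

```python
# Hypothetical facet polyhierarchy: each concept maps to its broader concepts.
# A concept may have several broader concepts, hence the lists.
BROADER = {
    "Individuals": ["Persons and Organisations"],
    "Houses": ["Places"],
}

# Instance level (metadata): each instance is classified under concepts.
INSTANCE_OF = {
    "Martin": ["Individuals"],
    "house nr.415": ["Houses"],
}

# Typed links between instances, as (subject, relation, object).
LINKS = [("Martin", "live in", "house nr.415")]


def ancestors(concept):
    """Return every broader concept reachable from `concept`."""
    seen = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        for parent in BROADER.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def instances_under(concept):
    """Return instances classified under `concept` or any narrower concept."""
    found = []
    for instance, concepts in INSTANCE_OF.items():
        if any(c == concept or concept in ancestors(c) for c in concepts):
            found.append(instance)
    return sorted(found)


def linked_to(instance, relation):
    """Return the instances that `instance` points to via `relation`."""
    return [obj for subj, rel, obj in LINKS if subj == instance and rel == relation]
```

Under these assumptions, a query at the facet level, such as `instances_under("Places")`, reaches the individual house, while `linked_to("Martin", "live in")` follows the instance-level link. The identity of Martin and of the house is kept as an instance and not as a repeated keyword.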

Historical Archives: The ARCHON classification
Item
- has type: Document Type
- has publication date: Date
- has creation date: Date
- has description: Activity
  - has activity type: Activity Type
  - has actor type: Actor Type
  - has object type: Object Type
  - has place type: Place Type
  - happened at: Date
  - has actor: Actor
    - has type: Actor Type
  - has place: Place
    - has type: Place Type
  - has object: Object
    - has type: Object Type
  - has related activity: Activity
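The attribute list above reads naturally as a nested record. The following Python sketch is one possible rendering, under several assumptions: the class and field names are hypothetical, dates are kept as plain strings, multi-valued attributes are modelled as lists, and the sample values (including the document type "court record") are invented for illustration only.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Actor:
    name: str
    actor_type: str  # "has type: Actor Type"


@dataclass
class Place:
    name: str
    place_type: str  # "has type: Place Type"


@dataclass
class ArchivalObject:
    name: str
    object_type: str  # "has type: Object Type"


@dataclass
class Activity:
    activity_type: str
    happened_at: Optional[str] = None
    # Type-level description, usable when no identified instance is known.
    actor_types: List[str] = field(default_factory=list)
    object_types: List[str] = field(default_factory=list)
    place_types: List[str] = field(default_factory=list)
    # Instance-level description.
    actors: List[Actor] = field(default_factory=list)
    places: List[Place] = field(default_factory=list)
    objects: List[ArchivalObject] = field(default_factory=list)
    # Nested activities ("has related activity").
    related_activities: List[Activity] = field(default_factory=list)


@dataclass
class Item:
    document_type: str
    creation_date: Optional[str] = None
    publication_date: Optional[str] = None
    descriptions: List[Activity] = field(default_factory=list)


def all_activity_types(item: Item) -> List[str]:
    """Collect activity types of an item, following related activities."""
    found: List[str] = []
    seen = set()
    queue = list(item.descriptions)
    while queue:
        activity = queue.pop(0)
        if id(activity) in seen:  # guard against cyclic relations
            continue
        seen.add(id(activity))
        found.append(activity.activity_type)
        queue.extend(activity.related_activities)
    return found


# Invented sample record, for illustration only.
tax = Activity(activity_type="tax regulation")
sale = Activity(
    activity_type="selling",
    actors=[Actor("Martin", "farmer")],
    objects=[ArchivalObject("house nr.415", "house")],
    place_types=["village"],
    related_activities=[tax],
)
item = Item(document_type="court record", descriptions=[sale])
```

Keeping `related_activities` on `Activity` itself is what lets a single item describe multiple and nested activities without flattening them into a keyword list.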

Historical Archives: The ARCHON classification
- Where:
  - Activity Type = marriage, selling, condemnation, tax regulation, statistics, ...
  - Actor Type = Pasha, judge, farmer, ..., but also: Witness
  - Place Type = City, village, monastery, prefecture, ...
  - Object Type = house, payment, privilege, ...
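A controlled vocabulary of this kind can be used to check a classification before it is stored. The sketch below is a minimal illustration, assuming a flat set of terms per facet; the slide's lists are open-ended, so the sets here hold only the terms it names. `VOCABULARY` and `unknown_terms` are hypothetical names.

```python
# Terms named on the slide; the real thesauri are larger and hierarchical.
VOCABULARY = {
    "Activity Type": {"marriage", "selling", "condemnation",
                      "tax regulation", "statistics"},
    "Actor Type": {"Pasha", "judge", "farmer", "Witness"},
    "Place Type": {"City", "village", "monastery", "prefecture"},
    "Object Type": {"house", "payment", "privilege"},
}


def unknown_terms(assignment):
    """Return (facet, term) pairs that are not in the vocabulary.

    `assignment` maps a facet name to the term chosen by the user.
    Matching ignores letter case; an unknown facet is also reported.
    """
    problems = []
    for facet, term in assignment.items():
        allowed = VOCABULARY.get(facet)
        if allowed is None:
            problems.append((facet, term))
            continue
        if term.casefold() not in {t.casefold() for t in allowed}:
            problems.append((facet, term))
    return problems
```

For example, `unknown_terms({"Place Type": "castle"})` reports the pair as unknown, which a classification interface could use to prompt the user to pick an existing term or propose a new one.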

ARXONHierarchy and ARXONFacet
[Diagram, labels translated from Greek. Facets: Description, Place, Document, Object, Activity, Time, Actor, Type. Further nodes: Buildings, Christian Month, Muslim Month, Movable, Immovable, Immaterial, Administrative Place, Physical Place, Content, Administrative Acts, Judicial Cases, Role in the case, Person, Organisation, Presence in the case, Issuer/Recipient, Others. Legend: classification, attribute, generalization.]

Classifying Historical Documents: Conclusions
- Faceted classification by concepts
  - has high precision
  - maintains identity of concepts and not keywords
  - creates a base of domain knowledge
  - preserves the syntactic structure of the expression used for the classification