Published by Suzanna Sullivan. Modified over 9 years ago.
ICS-FORTH, July 2000

Classifying Historical Documents
Maria Theodoridou, Martin Doerr
Foundation for Research and Technology - Hellas
Institute of Computer Science
Heraklion, Crete

The classification problem
- Automatic transcription is not possible
  - inaccurate OCR software
  - interpretation dependent
- Manual keyword assignment
  - time-consuming process
  - keywords not necessarily unique
  - inconsistent between users
  - not obvious for users in retrieval
- Complete classification only on parts of the database
  - by different aspects
  - at different times
  - by different people

Archival standards
- Dublin Core Metadata Elements
- EAD: Encoded Archival Description Document Type Definition
- ISAD(G): General International Standard Archival Description

Task Analysis
- The archivist maintains the inventory
  - organizes fonds and subfonds (manageable units and provenance)
  - assigns identification numbers to ensure integrity
  - documents provenance and chronology of collective units
- Handling of the material is hazardous to health and to the material
  - replace access by electronic surrogates
  - preserve electronic copies for preservation of contents
- Researchers are granted access to study parts
  - focused studies, resulting in publications
  - primary information partially overlaps between studies

Idea of Operation
- Scanned images replace access to originals
- Researchers should leave core documentation on partial contents
- Ergonomic classification user interface (minutes per document)
- Thesauri assist classification

Classification structure
- Classification by semantic net of metadata
  - Analysis of entities of the archive material
  - Classification of documents by:
    - (1) date and type of administrative act
    - (2) described activities
  - Syntactic structure to describe multiple and nested activities
  - Notion of identity of persons, places, objects
  - Coherent classification on instance and concept level

Historical Archives: Modelling collections
[Diagram: semantic network of archival collections]
- Entities: ArchivalType, Fonds, CurrentFonds, HistoricalFonds, FilmArchive, Subfonds, Current Subfonds, Historical Subfonds
- Links: subfonds_of, belongs_to, copy_of, copy_of_part, part_of
- ArchivalDescription: structural, derived_from, corresponding
- Legend: classification, attribute, generalisation

Historical Archives: Modelling collections and objects
[Diagram: semantic network of archival collections and objects]
- Entities: Physical ArchivalType, Conceptual ArchivalType, UnitOfDescription, Item, ArchivalType, Fonds, CurrentFonds, HistoricalFonds, FilmArchive, Subfonds, Current Subfonds, Historical Subfonds
- Links: subfonds_of (s), belongs_to (s), copy_of (d), copy_of_part (d), originates_from (c), kept_in (c), part_of (c)
- ArchivalDescription: structural (s), derived_from (d), corresponding (c)
- Legend: classification, attribute, generalisation

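The link labels on this slide carry a category marker, (s), (d) or (c), matching the three ArchivalDescription kinds named in the diagram. A minimal sketch of how such typed links might be stored and filtered is given below. The link labels and categories come from the slide; the class names, the function name and the endpoint names ("Subfonds A", "Fonds X", and so on) are hypothetical and serve only to make the example concrete.

```python
from dataclasses import dataclass
from enum import Enum


class LinkCategory(Enum):
    """The three ArchivalDescription link kinds named on the slide."""
    STRUCTURAL = "s"
    DERIVED_FROM = "d"
    CORRESPONDING = "c"


@dataclass(frozen=True)
class Link:
    source: str
    label: str
    target: str
    category: LinkCategory


# Labels and categories follow the slide; the endpoints are hypothetical.
links = [
    Link("Subfonds A", "subfonds_of", "Fonds X", LinkCategory.STRUCTURAL),
    Link("Microfilm 7", "copy_of", "Fonds X", LinkCategory.DERIVED_FROM),
    Link("Item 12", "kept_in", "Subfonds A", LinkCategory.CORRESPONDING),
]


def links_by_category(all_links, category):
    """Return the links that belong to one link category."""
    return [link for link in all_links if link.category is category]


structural = links_by_category(links, LinkCategory.STRUCTURAL)
for link in structural:
    print(f"{link.source} --{link.label}--> {link.target}")
```

Keeping the category on the link itself, rather than inferring it from the label, lets new link labels be added without changing the filtering code.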
Historical Archives: Modelling objects vs. contents
[Diagram: semantic network of physical objects and their contents]
- Entities: Physical ArchivalType, Conceptual ArchivalType, UnitOfDescription, Microfilm, Sheet, File, Book, ItemUnit, Series, Item, Document, Picture, BookPage, Shot, SheetPage, Photograph
- Links: contains (s), contains_first (s), contains_second (s), copy_of (d), corresponds_to (c)
- ArchivalDescription: structural (s), derived_from (d), corresponding (c)
- Legend: classification, attribute, generalisation

Historical Archives: Modelling processes
[Diagram: semantic network of electronic processing of archival material]
- Entities: EventType, DescriptionType, UnitOfDescription, SheetPage, Fonds, ItemUnit, Item, Document, Picture, Scanning, Editing, Transcription, Translation, Occurrence, ConceptualArchivalType, PhysicalArchivalType, ArchivalType, ElectronicDocumentType, ActionType, ElectronicProcessingType, ElectronicProcessing, ElectronicDocument, ScannedPage
- Links: history, product, produced_from, result, corresponds_to
- ArchivalDescription: structural, derived_from, corresponding
- Legend: classification, attribute, generalisation

Historical Archives: The Facets
- Four levels:
  - The act of documentation
  - The act of administration
  - The targeted social activity
  - Other related activities and items
- Questions that need to be answered:
  - Who? Persons and organizations
  - Where? Places
  - When? Time
  - What? Objects
  - How? Activities and actions

Historical Archives: Faceted classification by concepts
[Diagram labels: Facet Polyhierarchies; Instances (metadata); Manuscripts' Digital Library]

Historical Archives: Faceted classification by concepts - An example
[Diagram labels: Facet Polyhierarchies; Instances (metadata); Manuscripts' Digital Library]
- Concepts: Persons and Organisations, Individuals, Places, Houses
- Instances: Martin, house nr. 415
- Link labels: live in; is Martin's

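The example on this slide separates a concept level (the facet polyhierarchies) from an instance level (the metadata). The sketch below illustrates one way such a split could support retrieval by a broader concept. It assumes that "Individuals" sits under "Persons and Organisations" and "Houses" under "Places", which is a reading of the diagram labels rather than something the slide states in words; the dictionary and function names are hypothetical.

```python
# Concept level: each concept lists its broader concepts. A polyhierarchy
# allows more than one broader concept per entry.
# Assumption: the parent/child pairs below are a reading of the diagram.
broader = {
    "Persons and Organisations": [],
    "Individuals": ["Persons and Organisations"],
    "Places": [],
    "Houses": ["Places"],
}

# Instance level: each instance is classified under one concept.
instance_of = {
    "Martin": "Individuals",
    "house nr. 415": "Houses",
}


def with_broader(concept):
    """Return the concept together with all of its broader concepts."""
    seen = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(broader.get(current, []))
    return seen


def instances_under(concept):
    """Return, sorted, the instances classified under the concept or a narrower one."""
    return sorted(
        name
        for name, own_concept in instance_of.items()
        if concept in with_broader(own_concept)
    )


print(instances_under("Places"))
print(instances_under("Persons and Organisations"))
```

Because instances point at concepts rather than at free-text keywords, a query for "Places" finds house nr. 415 even though the word "Places" was never assigned to it directly.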
Historical Archives: The ARCHON classification
Item
  has type: Document Type
  has publication date: Date
  has creation date: Date
  has description: Activity
    has activity type: Activity Type
    has actor type: Actor Type
    has object type: Object Type
    has place type: Place Type
    happened at: Date
    has actor: Actor
      has type: Actor Type
    has place: Place
      has type: Place Type
    has object: Object
      has type: Object Type
    has related activity: Activity

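The attribute list on this slide reads like a record schema, with activities that can point to further related activities. A minimal sketch of that schema as Python dataclasses follows. The nesting is an assumption, since the slide's indentation is not preserved here; the concept-level attributes (has actor type, has object type, has place type) are left out for brevity; and all class names, field names and sample values other than the type terms are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Actor:
    name: str
    actor_type: str          # "has type: Actor Type"


@dataclass
class Place:
    name: str
    place_type: str          # "has type: Place Type"


@dataclass
class ArchivalObject:
    name: str
    object_type: str         # "has type: Object Type"


@dataclass
class Activity:
    activity_type: str                       # "has activity type"
    happened_at: Optional[str] = None        # "happened at: Date"
    actors: List[Actor] = field(default_factory=list)
    places: List[Place] = field(default_factory=list)
    objects: List[ArchivalObject] = field(default_factory=list)
    related_activities: List["Activity"] = field(default_factory=list)


@dataclass
class Item:
    document_type: str                       # "has type: Document Type"
    creation_date: Optional[str] = None
    publication_date: Optional[str] = None
    descriptions: List[Activity] = field(default_factory=list)


def walk(activity: Activity) -> Iterator[Activity]:
    """Yield an activity and, recursively, its related activities."""
    yield activity
    for related in activity.related_activities:
        yield from walk(related)


# Hypothetical sample record: a sale of a house with a related tax regulation.
tax = Activity(activity_type="tax regulation")
sale = Activity(
    activity_type="selling",
    actors=[Actor("Actor A", "farmer"), Actor("Actor B", "witness")],
    places=[Place("Village V", "village")],
    objects=[ArchivalObject("house nr. 415", "house")],
    related_activities=[tax],
)
item = Item(document_type="hypothetical document type", descriptions=[sale])

print([a.activity_type for a in walk(item.descriptions[0])])
```

The recursive `related_activities` field is what allows multiple and nested activities to be described within one document.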
Historical Archives: The ARCHON classification
- Where:
  - Activity Type = marriage, selling, condemnation, tax regulation, statistics, ...
  - Actor Type = Pasha, judge, farmer, ..., but also: witness
  - Place Type = city, village, monastery, prefecture, ...
  - Object Type = house, payment, privilege, ...

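The type lists on this slide act as controlled vocabularies for the classification. The sketch below shows how a classification interface might check proposed terms against such lists. The terms are the ones named on the slide, which marks each list as incomplete; the names `VOCABULARY` and `unknown_terms` and the sample input are hypothetical.

```python
# Partial vocabularies: only the terms named on the slide are included.
VOCABULARY = {
    "Activity Type": {"marriage", "selling", "condemnation",
                      "tax regulation", "statistics"},
    "Actor Type": {"Pasha", "judge", "farmer", "witness"},
    "Place Type": {"city", "village", "monastery", "prefecture"},
    "Object Type": {"house", "payment", "privilege"},
}


def unknown_terms(assignments):
    """Return the (facet, term) pairs whose term is not in the vocabulary."""
    return [
        (facet, term)
        for facet, term in assignments
        if term not in VOCABULARY.get(facet, set())
    ]


# Hypothetical input from a classifier, including one misspelled term.
proposed = [
    ("Activity Type", "selling"),
    ("Place Type", "vilage"),
    ("Object Type", "house"),
]
problems = unknown_terms(proposed)
print(problems)
```

Rejecting or flagging terms outside the vocabulary is one way to avoid the inconsistent keywords that free manual assignment produces.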
ARXONHierarchy and ARXONFacet
[Diagram: facet hierarchy, labels translated from Greek]
- Labels: Description, Place, Document, Object, Activity, Time, Actor, Type, Facet, Buildings, Christian Month, Muslim Month, Movable, Administrative Place, Immovable, Non-material, Physical Place, Content, Administrative Acts, Judicial Cases, Role in the case, Person, Organization, Presence in the case, Issuer/Recipient, Other
- Legend: classification, attribute, generalization

Classifying Historical Documents: Conclusions
- Faceted classification by concepts
  - has high precision
  - maintains identity of concepts and not keywords
  - creates a base of domain knowledge
  - preserves the syntactic structure of the expression used for the classification