National Library of Finland
Metadata in the Digitisation Process
Cultural unity and diversity of the Baltic Sea Region – common history, different languages, mixed culture
Helsinki, 21st–22nd October 2010
Tiina Ison, Senior Analyst, National Library of Finland
Outline
1. Front End - National Digital Library and Long-Term Preservation (KDK/PAS)
2. Back End - Digitisation Production Process, METS Profiles
3. Descriptive Metadata
4. Administrative/Technical Metadata
5. Structural Metadata
6. Wrapping things together: METS Profile
7. Processes towards distributed work, crowd sourcing, annotation and ontologies
1. Front End: National Digital Library and Long-Term Preservation Infrastructure
Ministry of Education
Libraries / Archives / Museums
BACK END SYSTEMS
In their digitisation production, memory institutions produce authentic, trustworthy digitised content and collections
OPM-KD Project, digitisation production reviewed: ulletin09/article6.html
Infrastructure Initiatives: National Digital Library, National Long-Term Preservation, Rights Management...
METS profiles
Architecture of the National Digital Library (Kansallisen Digitaalisen Kirjaston Arkkitehtuuri)
2. Back End: Digitisation Production Processes, METS Profiles
SOURCE MATERIAL (PHYSICAL COLLECTIONS): Newspapers, Serials, Books, Parchments, Notes, Maps, Audio
CATALOGUING: Descriptive metadata (MARC21/MODS), Two Bibliographic Records
SCANNING: Administrative/technical metadata (MIX/PREMIS)
POST PROCESSING: Structural metadata (METS, ALTO)
LEVEL OF MARK UP: Articles, Illustrations, Poems
METS EXPORT: Packages include JPEG2000, OCR TXT as ALTO XML, PDF, JPEG (150), METSXML, MARCXML
DIGITAL RESOURCE (COMPREHENSIVE DIGITAL COLLECTIONS): Standards & OAI-PMH compliant METS SIP packages
3. Descriptive Metadata
CATALOGUING
Catalogued Items
Un-catalogued Items – Minimal bibliographic record
Bar Code IDs – Unique IDs for Physical Items
Ingest of bibliographic metadata into digitisation production
MARC21 conversion into MARCXML (MODS)
Two bibliographic records – physical and digital (link 776)
Post cataloguing for minimal records
Enrichment of catalogue
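The pairing of a physical and a digital record through field 776 can be illustrated with a minimal sketch. It assumes Python's standard xml.etree.ElementTree and the MARCXML "slim" namespace. The record identifiers, the title, and the helper names make_record and linked_id are hypothetical, and the library's actual conversion tooling is not described in the slides.

```python
import xml.etree.ElementTree as ET

# MARCXML ("MARC21 slim") namespace
MARC_NS = "http://www.loc.gov/MARC21/slim"
ET.register_namespace("marc", MARC_NS)


def q(tag):
    """Return the namespace-qualified name of a MARCXML element."""
    return f"{{{MARC_NS}}}{tag}"


def make_record(control_number, title, other_form_id=None):
    """Build a minimal MARCXML record (hypothetical helper).

    If other_form_id is given, a 776 field (additional physical form
    entry) is added that points at the record for the other form.
    """
    record = ET.Element(q("record"))
    ET.SubElement(record, q("controlfield"), {"tag": "001"}).text = control_number
    f245 = ET.SubElement(record, q("datafield"),
                         {"tag": "245", "ind1": "0", "ind2": "0"})
    ET.SubElement(f245, q("subfield"), {"code": "a"}).text = title
    if other_form_id is not None:
        f776 = ET.SubElement(record, q("datafield"),
                             {"tag": "776", "ind1": "0", "ind2": " "})
        # Subfield w carries the control number of the linked record
        ET.SubElement(f776, q("subfield"), {"code": "w"}).text = other_form_id
    return record


def linked_id(record):
    """Return the control number referenced in field 776, or None."""
    sub = record.find(
        f"{q('datafield')}[@tag='776']/{q('subfield')}[@code='w']")
    return sub.text if sub is not None else None


# Hypothetical identifiers for one physical item and its digital copy
physical = make_record("PHYS-0001", "Example title", other_form_id="DIGI-0001")
digital = make_record("DIGI-0001", "Example title", other_form_id="PHYS-0001")

print(ET.tostring(digital, encoding="unicode"))
```

Each record points at the other, so a catalogue user can move between the physical item and its digitised version in either direction.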
4. Administrative/Technical Metadata
SCANNING
Technical Metadata for Digital Still Images (NISO Z39.87 Data Dictionary)
MIX: an XML Schema designed for expressing technical metadata for digital still images, e.g. image width, color space, color profile, scanner metadata, digital camera settings
Preservation Metadata/PREMIS (information about actions on objects, on events, on the technical environment)
Rights Metadata (access restriction)
Persistent IDs
5. Structural Metadata
POST PROCESSING
Navigation, use and access?
Logical Structure
Physical Structure
METS structMap – relationships between parts
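As a rough illustration of how a METS structMap can express both structures over the same page images, the sketch below builds one physical and one logical map with Python's standard xml.etree.ElementTree. The file IDs, labels, div types, and the helper names physical_map and logical_map are assumptions for illustration and are not taken from the library's METS profiles.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def m(tag):
    """Return the namespace-qualified name of a METS element."""
    return f"{{{METS_NS}}}{tag}"


def physical_map(page_file_ids):
    """Physical structure: one div per scanned page, in page order."""
    struct_map = ET.Element(m("structMap"), {"TYPE": "PHYSICAL"})
    issue = ET.SubElement(struct_map, m("div"), {"TYPE": "issue"})
    for order, file_id in enumerate(page_file_ids, start=1):
        page = ET.SubElement(issue, m("div"),
                             {"TYPE": "page", "ORDER": str(order)})
        ET.SubElement(page, m("fptr"), {"FILEID": file_id})
    return struct_map


def logical_map(parts):
    """Logical structure: one div per content unit (article, poem, ...).

    parts is a list of (type, label, file_ids) tuples; a unit may span
    several pages, and several units may share a page.
    """
    struct_map = ET.Element(m("structMap"), {"TYPE": "LOGICAL"})
    issue = ET.SubElement(struct_map, m("div"), {"TYPE": "issue"})
    for part_type, label, file_ids in parts:
        part = ET.SubElement(issue, m("div"),
                             {"TYPE": part_type, "LABEL": label})
        for file_id in file_ids:
            ET.SubElement(part, m("fptr"), {"FILEID": file_id})
    return struct_map


# Hypothetical two-page issue with an article and a poem
pages = ["IMG0001", "IMG0002"]
physical = physical_map(pages)
logical = logical_map([
    ("article", "Example article", ["IMG0001", "IMG0002"]),
    ("poem", "Example poem", ["IMG0002"]),
])

print(ET.tostring(logical, encoding="unicode"))
```

The physical map supports page-by-page navigation, while the logical map lets a reader jump straight to an article or poem regardless of where it falls on the pages.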
6. Level of Structural Mark Up
LEVEL OF MARK UP
Material types: books, serials, newspapers, audio, projects
Granularity - different levels of structural mark up, e.g. article, illustration, poem
Granularity - all material types: pages, footnotes, running title, tables, advertisements, image (captions and categories)
Labour intensive
Phased approach in production
Crowd sourcing
7. Wrapping things together: METS Profiles
METS profiles for different material types: monographs, serials, newspapers, audio…
Export files: JPEG2000 (lossless), PDF, OCR TXT as ALTO XML, JPEG (150dpi), METSXML and MARCXML
The METS container or wrapper provides a SIP package, compliant with OAI-PMH, for delivery and exchange of digital objects across systems.
Wraps descriptive, administrative and structural metadata + PREMIS:
MODS and MARCXML for descriptive and bibliographical metadata
MIX for image technical metadata
PREMIS for preservation metadata
(standards portfolio)
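The wrapping role of METS can be sketched as a skeleton package that holds empty slots for the descriptive, administrative and structural sections. This is a minimal sketch using Python's standard xml.etree.ElementTree. The section IDs, file names, and the helpers add_md and build_package are hypothetical, the metadata payloads are left empty, and the result is not validated against any actual METS profile.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def m(tag):
    """Return the namespace-qualified name of a METS element."""
    return f"{{{METS_NS}}}{tag}"


def add_md(parent, section, section_id, mdtype):
    """Add a metadata section with an empty mdWrap/xmlData slot."""
    sec = ET.SubElement(parent, m(section), {"ID": section_id})
    wrap = ET.SubElement(sec, m("mdWrap"), {"MDTYPE": mdtype})
    return ET.SubElement(wrap, m("xmlData"))


def build_package(files):
    """Build a skeleton METS document for a list of (id, mime, href)."""
    root = ET.Element(m("mets"))

    # Descriptive metadata: MODS and MARCXML
    add_md(root, "dmdSec", "DMD-MODS", "MODS")
    add_md(root, "dmdSec", "DMD-MARC", "MARC")

    # Administrative metadata: MIX (METS type NISOIMG) and PREMIS
    amd = ET.SubElement(root, m("amdSec"), {"ID": "AMD-1"})
    add_md(amd, "techMD", "TECH-MIX", "NISOIMG")
    add_md(amd, "digiprovMD", "PROV-PREMIS", "PREMIS")

    # File inventory: one entry per exported file
    file_sec = ET.SubElement(root, m("fileSec"))
    group = ET.SubElement(file_sec, m("fileGrp"))
    for file_id, mime, href in files:
        f = ET.SubElement(group, m("file"),
                          {"ID": file_id, "MIMETYPE": mime})
        ET.SubElement(f, m("FLocat"),
                      {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": href})

    # Structural metadata: a single page div pointing at every file
    struct_map = ET.SubElement(root, m("structMap"), {"TYPE": "PHYSICAL"})
    page = ET.SubElement(struct_map, m("div"), {"TYPE": "page"})
    for file_id, _, _ in files:
        ET.SubElement(page, m("fptr"), {"FILEID": file_id})
    return root


# Hypothetical export files for one page
package = build_package([
    ("P1-JP2", "image/jp2", "page0001.jp2"),
    ("P1-ALTO", "text/xml", "page0001.xml"),
    ("P1-JPG", "image/jpeg", "page0001.jpg"),
])

print(ET.tostring(package, encoding="unicode"))
```

Keeping every metadata type in its own section means a receiving system can read only the parts it understands while still treating the package as one object.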
8. Processes towards distributed work, crowd sourcing, annotation and ontologies
OCR Correction
Content and context as part of digitisation processes…
Automatic and semiautomatic processes for data extraction…
Distributed work processes, e.g. for:
Mark up level
OCR correction
Controlled annotation
Social tagging
THANK YOU