Metadata Metadata Mark-up and Management © Adolf Knoll, National Library of the Czech Republic.

Metadata
Metadata are added value for digital files, for which they form a container:
- to identify them
- to enable easier access and navigation
- to control the entire compound document
- to enable archival storage
- to enable research work and publication, even of critical editions, etc.

Compound Document
A document consisting of interconnected metadata and data files:
- the metadata are added descriptions (mostly pieces of text)
- the data are external files produced by digitizing parts of the original document (images, texts, sound files, even video files)
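A minimal sketch of how such a compound document could be modelled, assuming one metadata record per digitized piece. The class names, field names and file names are hypothetical and are not taken from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class DataFile:
    """An external file produced by digitization (image, text, sound, video)."""
    path: str          # hypothetical file name
    media_type: str    # e.g. "image", "text", "sound", "video"


@dataclass
class MetadataRecord:
    """An added description (mostly text) that points at the data it describes."""
    description: str
    data_files: list[DataFile] = field(default_factory=list)


@dataclass
class CompoundDocument:
    """Interconnected metadata and data files that together form one document."""
    title: str
    records: list[MetadataRecord] = field(default_factory=list)

    def all_data_files(self) -> list[DataFile]:
        # Walk the metadata to reach every data file: the metadata act as the container.
        return [f for record in self.records for f in record.data_files]


page_1 = MetadataRecord(
    description="Page 1 of the manuscript",
    data_files=[
        DataFile("page001_preview.jpg", "image"),
        DataFile("page001_excellent.jpg", "image"),
    ],
)
document = CompoundDocument(title="Example manuscript", records=[page_1])
print([f.path for f in document.all_data_files()])
```

The data files are only reachable through the metadata records, which is one way to express that the metadata form the container for the data.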

What is described?
OBJECTS
- of which the document consists and which build the document
- which have an unchanging substance
- whose representations can vary between occurrences
- which can have some important additional characteristics

Object: OISEAU
BIRD      PTÁK     VOGEL
Cock      Kohout   Hahn
Eagle     Orel     Adler
Penguin   Tučňák   Pinguin
Falcon    Sokol    Falke
Duck      Kachna   Ente

Objects
- They are defined by the creator or interpreter of the document.
- They can be built from any sequence or amount of bits in the metadata or data areas.
It should be established:
- which types of objects must be distinguished
- how they should be marked

Object OISEAU
- We have decided to have such an object (an animal with wings and feathers that lays eggs).
- We have decided to mark anything having these characteristics as OISEAU.
- We know that this object has different names in different languages (bird, pták, Vogel, птица, pasăre, …).
- We know that in reality only concrete birds appear (duck, cock, falcon, penguin, eagle, …).

Objects and contents
Semantically poor content
- formal objects (paragraph, heading, note, …)
- used for formatting
- languages built on these objects are used for output (HTML, MS Word, …)
- PRESCRIPTIVE MARK-UP
Semantically rich content
- content-oriented objects (author, flower, house, …)
- used for understanding
- languages built on these objects are used for description (MARC, TEI, EAD, DOBM, …)
- DESCRIPTIVE MARK-UP
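The difference between the two kinds of mark-up can be sketched with the same three values marked both ways. The element names in the descriptive fragment are illustrative assumptions, not part of any real schema; Python's standard xml.etree.ElementTree module is used only to query the fragments.

```python
import xml.etree.ElementTree as ET

# Prescriptive (formatting-oriented) mark-up: says how the text should look.
prescriptive = "<p><b>Karel Šefelín</b>, <i>Hronov</i>, 1914</p>"

# Descriptive (content-oriented) mark-up: says what each piece of text is.
# The element names are illustrative, not taken from a real schema.
descriptive = (
    "<postcard>"
    "<publisher>Karel Šefelín</publisher>"
    "<place>Hronov</place>"
    "<date>1914</date>"
    "</postcard>"
)

formatted = ET.fromstring(prescriptive)
described = ET.fromstring(descriptive)

# With descriptive mark-up we can ask for a value by its meaning.
publisher = described.findtext("publisher")

# With prescriptive mark-up we can only ask for "the bold text"; whether that
# is a publisher, an author or a place is not recorded anywhere.
bold_text = formatted.findtext("b")

print(publisher, "|", bold_text)
```

Both queries return the same string, but only the descriptive fragment records that the string is a publisher.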

SGML
Standard Generalized Markup Language
- a general language for marking objects
- to be applied, it needs to be made more concrete (this is done via a DTD)
- thus, second-level applications can be written
- these applications are used directly, or they require additional definitions (DTDs)
- SGML applications and derivatives: HTML, TEI, XML (a simplified subset of SGML), …

DESIGNING OUR PROJECT

What do we need?
- Open communication
- Internal precision and cohesion of markup
- Multiple outputs, reuse of marked data, freedom to add new marked data
- Complex document control and management
- An open and flexible content-oriented description principle

What do we work with?
For a manuscript of 300 pages, we work with:
- more than 1500 digital data files produced through digitization (five quality levels: Gallery, Preview, Internet, User, Excellent; 300 x 5, plus images of covers, end-sheets, ...)
- more than 300 descriptive metadata files (one for each digitized piece of the original, plus files for bibliographic and technical descriptions, plus technological files)
This means that the above-mentioned requirements must be applied to a complex document consisting of more than 1800 computer files, which play various roles.
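The file counts can be checked with a few lines of arithmetic. This is only a sketch: the parameters extra_images and extra_metadata are hypothetical, and it assumes that covers and end-sheets are also stored at all five quality levels, which the slides do not state.

```python
QUALITY_LEVELS = ["Gallery", "Preview", "Internet", "User", "Excellent"]


def count_files(pages: int, extra_images: int = 0, extra_metadata: int = 0) -> dict:
    """Estimate the number of files in one compound document.

    extra_images and extra_metadata are hypothetical parameters for covers,
    end-sheets and the bibliographic, technical and technological files.
    """
    # Assumption: every digitized image exists once per quality level.
    data_files = (pages + extra_images) * len(QUALITY_LEVELS)
    # One descriptive metadata file per digitized page, plus the extra files.
    metadata_files = pages + extra_metadata
    return {
        "data": data_files,
        "metadata": metadata_files,
        "total": data_files + metadata_files,
    }


print(count_files(pages=300))
```

With 300 pages and no extras this gives 1500 data files and 300 metadata files, the lower bounds quoted above.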

Independence
Metadata should be independent of display: pure values.
We must know:
- which features of objects to describe: we need DESCRIPTION RULES
- how to mark up these objects: we need RULES for MARK-UP
- how to formalize which objects are described, and how: we need a DTD
- how to display the compound document: we need rules for display (transformation rules)
If the platform is SGML or XML, we write a DTD and XSL tools.

type of document; place; place of publishing; publisher; date; addressee

Description elements
- author
- type of document: postcard
- place: Hronov
- place of publishing: Hronov
- publisher: Karel Šefelín
- date: 1914
- addressee: František Bittnar
- annotation: Streets of Hronov in 1914; postcard written by my great-grandmother to her husband, who was doing his military service
However, there may be better rules, e.g. AACR2, which define how to describe a postcard; we should adopt them, or some other approach more widely applied than this proposal of ours.

How to mark the elements?
In the DTD:
In the metadata file: Hronov
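As an illustration of the two sides, the sketch below pairs a small DTD with a metadata fragment that uses it. The element names (postcard, place, publisher, date) are assumptions made for this example. The check only compares element names; it is not DTD validation, which would need a validating parser.

```python
import re
import xml.etree.ElementTree as ET

# Illustrative DTD: declares which elements a postcard description may contain.
DTD = """
<!ELEMENT postcard (place, publisher, date)>
<!ELEMENT place (#PCDATA)>
<!ELEMENT publisher (#PCDATA)>
<!ELEMENT date (#PCDATA)>
"""

# Illustrative metadata file using the elements declared above.
METADATA = """
<postcard>
  <place>Hronov</place>
  <publisher>Karel Šefelín</publisher>
  <date>1914</date>
</postcard>
"""


def declared_elements(dtd: str) -> set[str]:
    """Collect the element names declared in the DTD text."""
    return set(re.findall(r"<!ELEMENT\s+([\w.-]+)", dtd))


def undeclared_elements(xml_text: str, dtd: str) -> set[str]:
    """Return the tags used in the metadata that the DTD does not declare."""
    root = ET.fromstring(xml_text.strip())
    used = {element.tag for element in root.iter()}
    return used - declared_elements(dtd)


print(undeclared_elements(METADATA, DTD))
```

An empty result means every tag in the metadata file has a declaration in the DTD; the order and nesting of elements are not checked here.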

Write
postcard
Hronov
Karel Šefelín
1914
František Bittnar
Streets of Hronov in 1914; postcard written by my great-grandmother to her husband, who was doing his military service
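A sketch of how these values could be written as a marked-up record. The tag names and the root element name record are hypothetical; the values are those of the postcard described above.

```python
import xml.etree.ElementTree as ET

# Values from the description of the postcard; the tag names are hypothetical.
FIELDS = [
    ("type", "postcard"),
    ("place", "Hronov"),
    ("publisher", "Karel Šefelín"),
    ("date", "1914"),
    ("addressee", "František Bittnar"),
    ("annotation", "Streets of Hronov in 1914; postcard written by my "
                   "great-grandmother to her husband, who was doing his "
                   "military service"),
]


def build_record(fields):
    """Wrap each pure value in an element named after what the value is."""
    record = ET.Element("record")
    for tag, value in fields:
        ET.SubElement(record, tag).text = value
    return record


record = build_record(FIELDS)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

The record contains only values and their meaning; nothing in it says how the postcard description should be displayed.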

Publish
- XSL transformation of the XML files … in order to display them
- Index with a database tool to provide even better access
- Link metadata with image data
This is work for professionals.
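The display step can be sketched without an XSLT processor. The Python function below is a stand-in for an XSL stylesheet, not XSL itself: it applies display rules that are kept apart from the metadata and links the record to its image. The tag names, the image attribute and the file name are hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

# Illustrative metadata record; tag names and the image file name are hypothetical.
RECORD = """
<record image="postcard_preview.jpg">
  <place>Hronov</place>
  <publisher>Karel Šefelín</publisher>
  <date>1914</date>
</record>
"""

# Display rules kept apart from the metadata: tag name -> label shown to the user.
LABELS = {"place": "Place", "publisher": "Publisher", "date": "Date"}


def to_html(xml_text: str) -> str:
    """Turn a metadata record into an HTML fragment that also links the image."""
    record = ET.fromstring(xml_text.strip())
    rows = [
        f"<tr><th>{escape(LABELS.get(child.tag, child.tag))}</th>"
        f"<td>{escape(child.text or '')}</td></tr>"
        for child in record
    ]
    image = escape(record.get("image", ""), quote=True)
    return f'<img src="{image}"><table>{"".join(rows)}</table>'


print(to_html(RECORD))
```

Changing the labels or the HTML layout does not touch the metadata file, which is the point of keeping metadata independent of display.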

Tools
- Simple browsing
- Internet access tools