Metadata. Andy Powell, Technical Development and Research, UKOLN, University of Bath

Presentation transcript:

1 Metadata Andy Powell Technical Development and Research UKOLN University of Bath

2 Metadata: What is metadata? (an introduction); The Dublin Core (metadata for the Web); Metadata management (models for dealing with Web-site metadata); UKOLN metadata projects (overviews and problems)

3 What is metadata? By definition: "data about data"; "data which provides information about a resource". By example: title, author, subject classification, shelf mark, digital format, terms and conditions, location (URL).

4 What is metadata? (2) By usage: resource discovery (searching, location; authentication; quality/rating); semantic interoperability; resource management; user interface (grouping resources for printing; 3-D visualisations).

5 Range of formats: Dublin Core, IAFA, SOIF, MARC, TEI headers, CIMI. Range from simple to rich, and from robot generated to hand crafted. Services: Alta Vista, NetFirst, Lycos.

6 Where is metadata? Embedded within resource (HTML tags); linked to resource; remote database: distributed, or union (centralised).

7 Who creates metadata? Publisher side: author, webmaster, institution. Service side: search service, third party creators. Robot generated or hand crafted.

8 Dublin Core: 15 element core metadata set. Primarily intended to aid resource discovery on the Web. Main usage currently embedded into HTML META tags. All elements optional and repeatable. Status? Agreed syntax for embedding in HTML; still discussion about the use of some of the elements.

9 Dublin Core History: 4 DC meetings (Dublin, Warwick, Dublin, Canberra; DC-5 in Helsinki coming soon). Mailing list discussions. W3C interest: RDF (PICS-NG), MCF. Various projects. Still no significant interest from the big search engines :-(

10 DC Elements - 1 Title. Subject: intended to promote use of controlled vocabularies but in practice likely to be used for uncontrolled lists of keywords. Description: abstract. Creator. Publisher.

11 DC Elements - 2 Contributor. Date: the date ‘the resource was made available in its present form’; agreed default format uses a subset of ISO 8601, e.g. ... Type: category of resource (document, image, sound, home page, novel, poem, etc.); still much discussion about the content of this element. Format: MIME type. Identifier.

12 DC Elements - 3 Source. Language: language of the resource, NOT the metadata. Relation: no guidelines for usage currently. Coverage: separate working party looking at usage. Rights: rights management seen as too complex for DC; this will give a URL to some external information.

13 Simple Example UKOLN Home Page...
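
The markup shown on this slide did not survive in the transcript. As a rough illustration of the embedding described on slide 8 (Dublin Core held in HTML META tags, every element optional and repeatable), the following Python sketch renders a record as META tags. The function name `dc_meta_tags` and all values other than the title "UKOLN Home Page" are hypothetical, and the exact tag layout is an assumption rather than a copy of the original slide.

```python
from html import escape


def dc_meta_tags(record):
    """Render a mapping of Dublin Core element names to values as META tags.

    Hypothetical helper: the NAME="DC.element" layout is an assumed convention.
    """
    lines = []
    for element, values in record.items():
        # Elements are repeatable, so accept either one value or a list of values.
        if isinstance(values, str):
            values = [values]
        for value in values:
            lines.append(
                '<META NAME="DC.{}" CONTENT="{}">'.format(
                    element, escape(value, quote=True)
                )
            )
    return "\n".join(lines)


# Illustrative record: only the title comes from the slide, the rest is made up.
example = {
    "title": "UKOLN Home Page",
    "subject": ["metadata", "resource discovery"],
}
print(dc_meta_tags(example))
```

The output is intended to be pasted into the HEAD section of the page being described.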

14 Element qualifiers: Need to refine meaning in some cases. TYPE: refines meaning of element, sub-divides element namespace. SCHEME: element value taken from external schema, e.g. LCSH for DC.subject, Z39.53 for DC.language. LANGUAGE: language of element value (not of the resource being described!).

15 Examples - TYPE Original DC.creator tag Non-personal author Author’s address

16 Examples - SCHEME Library of Congress Subject Heading … or …
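
The example tags on slides 15 and 16 were lost in the transcript, so the qualifier syntax the presentation actually used cannot be recovered here. The sketch below shows one possible encoding of the three qualifiers from slide 14. It assumes a dotted sub-element name for TYPE and the HTML `SCHEME` and `LANG` attributes for the other two. The function `qualified_meta` and the sample values are hypothetical.

```python
from html import escape


def qualified_meta(element, value, type_=None, scheme=None, lang=None):
    """Build one qualified Dublin Core META tag (assumed syntax, see lead-in)."""
    name = "DC." + element
    if type_:
        # TYPE sub-divides the element namespace, e.g. DC.creator.email
        name += "." + type_
    attrs = ['NAME="{}"'.format(escape(name, quote=True))]
    if scheme:
        # SCHEME names the external scheme the value is taken from, e.g. LCSH
        attrs.append('SCHEME="{}"'.format(escape(scheme, quote=True)))
    if lang:
        # LANGUAGE is the language of the value, not of the resource
        attrs.append('LANG="{}"'.format(escape(lang, quote=True)))
    attrs.append('CONTENT="{}"'.format(escape(value, quote=True)))
    return "<META " + " ".join(attrs) + ">"


# Hypothetical values for illustration only.
print(qualified_meta("creator", "someone@example.org", type_="email"))
print(qualified_meta("subject", "Metadata", scheme="LCSH", lang="en"))
```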

17 Metadata Management Practical issues of using Dublin Core for Internet resource description... UKOLN metadata system Requirements 3 models for metadata management Implementation at UKOLN

18 UKOLN metadata system requirements Easy to use Work with a variety of methods of creating HTML Simple migration to future metadata formats Separate metadata from resource

19 Managing Dublin Core (1) HTML authoring tool: embed by hand using HTML or text editor. Pros: simple; may be useful for training and familiarisation. Cons: may not be possible with all editors; maintenance problems; easy to make errors.

20 DC-dot A Web based tool for creating Dublin Core tags Automatic generation of some tags based on content of the resource Forms based editing of tags Cut-and-paste output into HTML Conversion to other formats… SOIF, ROADS/WHOIS++, USMARC, GILS...

21 Managing Dublin Core (2) Web-site management tool: use a Web-site management tool, for example NetObjects Fusion. Pros: use of Web-site management tools likely to increase; object-oriented database approach. Cons: proprietary formats; early days, too early to evaluate use for metadata yet?

22 Managing Dublin Core (3) On-the-fly generation: hold Dublin Core separately and embed on-the-fly using server-side include (SSI). Pros: separates metadata from resource; future migration fairly simple. Cons: performance; lack of integration with HTML tools; server specific.

23 UKOLN metadata system (1) Embed on-the-fly Apache SSI script Store metadata using SOIF records Use MS-Access as tool to create the records Associate metadata with resource by co-locating them in the Web server filestore

24 UKOLN metadata system (2) MS-Access Database; HTML editor; intro.html.soif: { keywords{13}: xxx, yyy, zzz description{14}: blah blah b author{13}: Stark, Isobel... }; Apache syntax for calling server-side script

25 UKOLN metadata system (3) MS-Access front end... Filename browser Text boxes Name choosers UKOLN specific metadata

26 UKOLN metadata system (4) UKOLN Web server; intro.html; intro.html.soif: { keywords{13}: xxx, yyy, zzz description{14}: blah blah b author{13}: Stark, Isobel... }; SSI script; Web robot
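
The flow on slides 23 to 26 (a SOIF record stored next to the page, read by a server-side script that emits META tags at request time) can be sketched in Python. This is not the UKOLN SSI script. It is a minimal illustration that assumes single-line attribute values, a hypothetical URL in the record header, and a hypothetical mapping from the SOIF attribute names on the slide to Dublin Core elements. The description value "blah blah blah" is inferred from the declared length of 14.

```python
import re
from html import escape

# Matches lines such as "keywords{13}:<tab>xxx, yyy, zzz"; the number is the value length.
ATTRIBUTE = re.compile(r"^([\w.-]+)\{(\d+)\}:\s*(.*)$")

# Assumed mapping from the SOIF attribute names on the slide to Dublin Core elements.
SOIF_TO_DC = {"keywords": "subject", "description": "description", "author": "creator"}


def soif_path_for(html_path):
    """Metadata is co-located with the resource: intro.html -> intro.html.soif."""
    return html_path + ".soif"


def parse_soif(text):
    """Parse a simplified SOIF record (single-line values only) into a dict."""
    fields = {}
    for line in text.splitlines():
        match = ATTRIBUTE.match(line.strip())
        if match:
            name, size, value = match.groups()
            # Keep only as many characters as the declared length allows.
            fields[name] = value[: int(size)]
    return fields


def render_meta(fields):
    """Emit one META tag per mapped attribute, as an SSI script might."""
    tags = []
    for name, value in fields.items():
        element = SOIF_TO_DC.get(name)
        if element:
            tags.append(
                '<META NAME="DC.{}" CONTENT="{}">'.format(element, escape(value, quote=True))
            )
    return "\n".join(tags)


# Sample record modelled on the slide; the URL is hypothetical.
SAMPLE = """@FILE { http://www.example.org/intro.html
keywords{13}:\txxx, yyy, zzz
description{14}:\tblah blah blah
author{13}:\tStark, Isobel
}"""

print(soif_path_for("intro.html"))
print(render_meta(parse_soif(SAMPLE)))
```

Keeping the record in a separate file is what makes later migration to another metadata format a matter of changing the rendering step only.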

27 Issues Performance Interaction with Web caches Dublin Core vs Alta Vista style metadata Granularity Which pages should have metadata?

28 What's the point... …of embedding DC tags? Alta Vista isn't going to look for them But, worth doing... within individual projects within specific communities (e.g. eLib) Improve local search facilities e.g. load SOIF records into a Netscape Catalogue Server Web-site management benefits

29 UKOLN Metadata projects ROADS Software for Subject Service DESIRE European Web indexing NewsAgent Current awareness service for Library and Information Staff BIBLINK Information flow from publishers to National Bibliographic Agencies

30 ROADS Resource Organisation and Discovery in Subject-based Services Web based tools for Subject Services SOSIG, ADAM, OMNI, … Manage and search Internet resource descriptions ROADS templates (based on IAFA templates) WHOIS++

31 ROADS - WHOIS++ (1) Simple client-server search and retrieve protocol Developed originally for ‘white pages’ applications Offer search facilities across several Subject Services Distribute a Subject Service across several physical servers Query routing - centroids and CIP

32 ROADS - WHOIS++ (2) Centroid generated by ADAM contains… “you’ll find the string ‘mona’ in the ‘title’ attribute of at least one record in the ADAM database”. CGI-based WHOIS++ client SOSIG OMNI ADAM CIP sharing of centroids Web browser
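
The centroid idea on this slide (a summary saying that a given string occurs in a given attribute of at least one record) can be illustrated with a short Python sketch. It is a simplification and not the WHOIS++ or CIP wire format. The record contents, the whitespace tokenisation and the function names are assumptions made for the example.

```python
from collections import defaultdict


def build_centroid(records):
    """Collect, per attribute, the set of tokens that occur in at least one record."""
    centroid = defaultdict(set)
    for record in records:
        for attribute, value in record.items():
            for token in value.lower().split():
                centroid[attribute].add(token)
    return dict(centroid)


def may_match(centroid, attribute, term):
    """True if the server holding this centroid could have a matching record."""
    return term.lower() in centroid.get(attribute, set())


def route_query(centroids, attribute, term):
    """Return the servers worth forwarding the query to."""
    return [
        server
        for server, centroid in centroids.items()
        if may_match(centroid, attribute, term)
    ]


# Hypothetical records for two of the Subject Services named on the slide.
CENTROIDS = {
    "ADAM": build_centroid([{"title": "Mona Lisa"}]),
    "SOSIG": build_centroid([{"title": "Social science gateway"}]),
}

print(route_query(CENTROIDS, "title", "mona"))
```

A client holding the shared centroids only needs to query the servers returned by `route_query`, which is the point of query routing.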

33 DESIRE European Web cataloguing Subject Services EuroSOSIG (Bristol), EELS (Lund), Arts (Koninklijke Bibliotheek) Manually created ROADS templates European Web Index based on Nordic Web Index (NWI) Robot generated, all resources Multiple servers linked with Z39.50 GILS

34 DESIRE - current work (1) Internationalisation of ROADS Use of robots to: aid manual cataloguing of resources build indexes based on list of URLs in a ROADS database Robot will use embedded Dublin Core if available

35 DESIRE - current work (2) Re-design of EWI robot - including: support for Dublin Core EWI records GILS-II compatible Allow users to search across subject services and the EWI using Z39.50 by converting ROADS records into GILS records by building a WHOIS++ to Z39.50 gateway

36 NewsAgent Current awareness service for LIS... Distributed database servers at LITC, FD, UKOLN - Z39.50 metadata (and some full-text) based on DALI Mixture of content streams Variety of access methods Web, and Z39.50 clients user-configurable profiles

37 NewsAgent - Content Journals Program, VINE, Journal of Librarianship and Information Science News and briefing material LA, IIS, UKOLN (Ariadne), BL, LITC Web pages lists and USENET news

38 NewsAgent - Harvesting Web crawler looking for embedded Dublin Core Limiting the harvest –simple heuristics –use of Dublin Core Relation element parser
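
A harvester of the kind described on this slide needs to pull embedded Dublin Core out of each page. The Python sketch below uses the standard library `html.parser` module for that. How NewsAgent actually used the Relation element to limit the harvest is not stated, so the `links_to_follow` helper, which treats Relation values that look like URLs as the only links to crawl next, is an assumption, as are the sample page and its URL.

```python
from html.parser import HTMLParser


class DublinCoreExtractor(HTMLParser):
    """Collect (element, value) pairs from META tags whose NAME starts with "DC."."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names for us.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        content = attributes.get("content")
        if name.lower().startswith("dc.") and content is not None:
            self.elements.append((name[3:].lower(), content))


def extract_dublin_core(html_text):
    parser = DublinCoreExtractor()
    parser.feed(html_text)
    return parser.elements


def links_to_follow(elements):
    """Assumed heuristic: follow only URLs given in DC Relation elements."""
    return [
        value
        for element, value in elements
        if element.startswith("relation") and value.startswith(("http://", "https://"))
    ]


# Hypothetical page for illustration.
PAGE = (
    '<html><head>'
    '<META NAME="DC.title" CONTENT="Intro">'
    '<META NAME="DC.relation" CONTENT="http://www.example.org/next.html">'
    '<meta name="keywords" content="not Dublin Core">'
    '</head><body></body></html>'
)

found = extract_dublin_core(PAGE)
print(found)
print(links_to_follow(found))
```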

39 BIBLINK: Information flow between publishers (traditional; new, i.e. CD-ROM or Web and new to publishing) and National Bibliographic Agencies. British Library, UK; Biblioteca Nacional, Madrid, Spain; Bibliothèque Nationale de France, Paris; Koninklijke Bibliotheek, Den Haag, Netherlands; Nasjonalbiblioteket, Rana, Norway; Universitat Oberta de Catalunya, Barcelona, Spain.

40 BIBLINK - research Scope Electronic publications suitable for inclusion in National Bibliographies Metadata Dublin Core (with extensions!), SGML DTD Identifiers ISBN, ISSN, SICI, DOI, URN Transmission Simple or Web crawler Authentication MD5 hash assigned to each resource
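
The authentication item above assigns an MD5 hash to each resource. As a minimal sketch of what that involves, the Python below computes the hash with the standard `hashlib` module and compares it with a previously recorded value. The function names are hypothetical, and how BIBLINK stored or transmitted the checksum is not covered by the slide.

```python
import hashlib


def md5_checksum(data: bytes) -> str:
    """Return the MD5 digest of a resource as a hexadecimal string."""
    return hashlib.md5(data).hexdigest()


def is_unchanged(data: bytes, recorded_checksum: str) -> bool:
    """Check a resource against the checksum recorded in its metadata."""
    return md5_checksum(data) == recorded_checksum.lower()


# Hypothetical resource content for illustration.
resource = b"<html><body>Example publication</body></html>"
recorded = md5_checksum(resource)

print(is_unchanged(resource, recorded))          # same bytes
print(is_unchanged(resource + b" ", recorded))   # one byte added
```

A checksum of this kind detects that the bytes of a resource have changed since the metadata was created; it does not by itself say who changed them.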

41 BIBLINK - data set Minimum data set –Author, Title, Publisher, Place of Publication, Price, Extent (size), Keywords, Description, Edition/Version, Date of Publication, System Requirements, Format, Language, Terms and Conditions, Frequency, Identifier, Contributor, Checksum Similar to DC but some don’t fit… Issues over conversion to MARC

42 BIBLINK - demonstrator: Publishers; NBAs/National Libraries; Dublin Core; UNIMARC; ??MARC. Cataloguing in Publication (CIP) level records. Conversion on to local MARC format using USEMARCON. Enhanced records optionally returned to publishers.

43 Conclusions Think about metadata as a ‘process’ Dublin Core syntax now stable enough to use Use within projects initially Choose metadata management model appropriate to your site Consider long term maintenance and transition to other formats