Approaches to Metadata Quality
Marieke Guy, QA Focus
UKOLN: a centre of expertise in digital information management
Introduction
This presentation will cover:
What is metadata quality?
Why is it important?
Why are we not creating quality metadata at the moment?
A simple case study: the acronym tag
How can we improve metadata quality?
How do QA processes fit in?
Metadata Quality: What?
Deciding on what metadata you need
This is done by considering what you are trying to achieve (functional requirements), e.g. searching records by title, author name, keywords
Schema/application profile
Ensuring metadata is created in a controlled way
Catalogue guidelines, training cataloguers etc.
Quality control processes
Usability tests: does your metadata help you achieve your requirements?
Quality is about fitness for purpose
Metadata Quality: Why?
Interoperability
Resource discovery
–Poor recall and precision
–Inconsistency of search results
–Ambiguities
With low quality metadata you can only offer a low quality service
Metadata Quality: Why not?
Time-consuming
Costly
Lack of skills
Lack of guidelines
Lack of usable tools
Lack of validation and QA
Barrier to participation
Case Study: Acronym Tag
Phrase elements add structural information to text fragments
The abbr tag indicates an abbreviated form (e.g. WWW, HTML, URI, et al., etc.) and includes initialisms
The acronym tag indicates an acronym (e.g. FAIR, CETIS)
The title attribute can be used to provide the full or expanded form of the expression
E.g. <abbr title="World Wide Web">WWW</abbr>
Acronym Tag: Why?
Using the abbr and acronym tags allows you to add metadata to your HTML documents
This helps with accessibility (screen readers) and with understanding jargon
It enables you to create a glossary of acronyms for your site, which saves time
For examples of how to do this see Tom Heath’s acronym tool
Markup has recently been made easier by the introduction of Acrobot
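The glossary idea above can be sketched in a few lines of Python. This is only an illustration using the standard library's html.parser module, not the approach taken by Tom Heath's tool or Acrobot; the class name GlossaryBuilder, the function build_glossary and the sample page are all hypothetical.

```python
from html.parser import HTMLParser


class GlossaryBuilder(HTMLParser):
    """Collect short form -> expansion pairs from abbr and acronym elements."""

    def __init__(self):
        super().__init__()
        self.glossary = {}
        self._title = None  # title of the abbr/acronym element currently open
        self._text = []     # text fragments seen inside that element

    def handle_starttag(self, tag, attrs):
        if tag in ("abbr", "acronym"):
            # Elements without a title attribute are ignored (title is None).
            self._title = dict(attrs).get("title")
            self._text = []

    def handle_data(self, data):
        if self._title is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in ("abbr", "acronym") and self._title is not None:
            short_form = "".join(self._text).strip()
            if short_form:
                # Keep the first expansion seen for each short form.
                self.glossary.setdefault(short_form, self._title)
            self._title = None


def build_glossary(html_text):
    """Return an alphabetically ordered glossary for one HTML document."""
    parser = GlossaryBuilder()
    parser.feed(html_text)
    parser.close()
    return dict(sorted(parser.glossary.items()))


# Hypothetical sample page used only for illustration.
page = (
    '<p>The <abbr title="World Wide Web">WWW</abbr> is written in '
    '<abbr title="HyperText Markup Language">HTML</abbr>. See the '
    '<acronym title="Frequently Asked Questions">FAQ</acronym>.</p>'
)

for short_form, expansion in build_glossary(page).items():
    print(short_form, "=", expansion)
```

Running the same function over every page of a site and merging the results would give a site-wide glossary, provided the markup is applied consistently.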
Acronym Tag: Issues (1)
People don’t know this tag exists!
Confusion over whether abbr or acronym is used
–All acronyms are abbreviations, but not all abbreviations are acronyms
–Acronyms are actually a subset of abbreviations
–Consider the differences in pronunciation
Lack of consistency in the way words are pronounced e.g. URL, SQL
W3C states that authors should use style sheets to specify the pronunciation of an abbreviated form
Acronym Tag: Issues (2)
Some abbreviations are confusing because they
–Are accepted into everyday language e.g. info, Mac
–Are abbreviated in one language but spoken in another e.g. "e.g." is short for exempli gratia but is read as "for example"
–No longer mean anything e.g. UKOLN
Should they be marked up?
Does the reader need more information?
How relevant are they?
Acronym Tag: Issues (3)
Nesting decisions e.g. FAQs in the tags vs just FAQ with the 's' left outside
Capitalisation in the meanings e.g. hewlett-packard vs Hewlett-Packard
Punctuation e.g. I.T. vs IT
International context
Acronym Tag: Issues (4)
Markup errors
–One tag used where another was intended
–Markup placed inside attribute values
Invalid characters e.g. unescaped characters such as & (which should be written as &amp;)
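One way to avoid unescaped characters and stray markup in attributes is to generate the element rather than type it by hand. The sketch below assumes Python and its standard html.escape function; the helper name make_abbr is hypothetical and is not part of any QA Focus tool.

```python
from html import escape


def make_abbr(short_form, expansion, tag="abbr"):
    """Return an abbr or acronym element with safely escaped content."""
    if tag not in ("abbr", "acronym"):
        raise ValueError("tag must be 'abbr' or 'acronym'")
    # escape() turns &, <, > and quotes into character entities, so the
    # title attribute cannot be broken by markup or a stray ampersand.
    safe_title = escape(expansion, quote=True)
    safe_text = escape(short_form)
    return '<{0} title="{1}">{2}</{0}>'.format(tag, safe_title, safe_text)


print(make_abbr("R&D", "Research & Development"))
print(make_abbr("foo", "<b>foo</b>"))
```

The first call escapes the ampersand in both the text and the title; the second shows that markup passed in as an expansion ends up as harmless text inside the attribute.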
Acronym Tag: Solutions
To deal with the issues when using the acronym tag, QA Focus have developed:
–A policy: Oxford ED; no punctuation; formal definition, with additional info in normal text
–A set of procedures: CMS? Staff development
–Liaison strategy
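Parts of such a policy can be checked automatically. The following Python sketch flags two of the issues raised earlier: punctuation in the short form (I.T. vs IT) and inconsistent capitalisation of expansions (hewlett-packard vs Hewlett-Packard). The function name find_policy_issues and the input format, a list of (short form, expansion) pairs, are assumptions made for illustration; the actual QA Focus procedures are not described in these slides.

```python
from collections import defaultdict


def find_policy_issues(entries):
    """Report punctuated short forms and inconsistent expansions.

    entries is a list of (short_form, expansion) pairs, for example
    harvested from the abbr and acronym elements of a site.
    """
    issues = []
    expansions = defaultdict(set)
    for short_form, expansion in entries:
        unpunctuated = short_form.replace(".", "")
        if unpunctuated != short_form:
            issues.append(
                f"{short_form}: punctuation found, policy form is {unpunctuated}"
            )
        # Group expansions by the unpunctuated form so I.T. and IT compare.
        expansions[unpunctuated].add(expansion)
    for short_form, variants in sorted(expansions.items()):
        if len(variants) > 1:
            issues.append(
                f"{short_form}: inconsistent expansions {sorted(variants)}"
            )
    return issues


# Hypothetical entries used only for illustration.
entries = [
    ("I.T.", "Information Technology"),
    ("IT", "Information Technology"),
    ("HP", "hewlett-packard"),
    ("HP", "Hewlett-Packard"),
]

for issue in find_policy_issues(entries):
    print(issue)
```

A check like this only finds candidates for review; deciding which expansion is the correct one remains an editorial decision.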
General Ways Forward (1)
Use of metadata schemas
Use of cataloguing guidelines
Use of controlled vocabularies/taxonomies
Collaboration between resource creators and information specialists
General Ways Forward (2)
More targeted help
Input/submission tools
–Use of examples
–Usability testing e.g. field order
–Integration of guidelines/help
–Recognition of the limitation of forms
Further discussion in the community of the issues involved
QA Processes
QA Processes
Random sampling of metadata input
Automated systems and tools
–Metadata analysis
–Data cleansing
–Data enhancement
Look at automated creation vs manual creation
–Cheap software/expensive checking
–Expensive software/expensive checking
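Random sampling combined with a simple automated check might look like the Python sketch below. The record structure (dictionaries with an id key), the required fields (taken from the title, author and keywords example earlier in the talk) and all function names are assumptions for illustration, not a description of an existing QA Focus tool.

```python
import random

# Hypothetical required fields, based on the earlier functional
# requirements example: searching by title, author name, keywords.
REQUIRED_FIELDS = ("title", "author", "keywords")


def sample_records(records, sample_size, seed=None):
    """Pick a random sample of records; a seed makes the sample repeatable."""
    rng = random.Random(seed)
    return rng.sample(records, min(sample_size, len(records)))


def missing_fields(record, required=REQUIRED_FIELDS):
    """List the required fields that are absent or empty in one record."""
    missing = []
    for field in required:
        value = record.get(field)
        if isinstance(value, str):
            value = value.strip()
        if not value:  # None, empty string or empty list
            missing.append(field)
    return missing


def qa_report(records, sample_size, seed=None):
    """Map record id -> missing fields for the sampled records with problems."""
    report = {}
    for record in sample_records(records, sample_size, seed):
        problems = missing_fields(record)
        if problems:
            report[record["id"]] = problems
    return report


# Hypothetical records used only for illustration.
records = [
    {"id": 1, "title": "Approaches to Metadata Quality",
     "author": "Marieke Guy", "keywords": ["metadata", "QA"]},
    {"id": 2, "title": "Untitled report", "author": "  ", "keywords": ["QA"]},
    {"id": 3, "title": "", "author": "A. N. Other", "keywords": []},
]

print(qa_report(records, sample_size=2, seed=1))
```

The sample size controls the trade-off between checking cost and coverage; the records flagged here would still need a person to correct them.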
Useful URLs (1)
UKOLN
QA Focus
Exploiting ACRONYM And ABBR HTML Elements
Using The ACRONYM And ABBR HTML Elements On The QA Focus Web Site
W3C details on acronyms
Useful URLs (2)
Acrobot: Abbreviation and Acronym Generator
Building Quality Assurance into Metadata Creation (Barton, Currier & Hey)
Abbreviations, Acronyms, Initialisms