Metadata ARLIS Study Day 9 September 2009 John Hargreaves Technical Support Officer JISC Digital Media.

1 Metadata ARLIS Study Day 9 September 2009 John Hargreaves Technical Support Officer JISC Digital Media

2 JISC Digital Media JISC Digital Media is a JISC Advisory Service providing advice and guidance to the UK Further and Higher Education communities on all aspects of finding, making, managing and using digital images, moving images and sound files.

3 Services –Web resources http://www.jiscdigitalmedia.ac.uk/ –Helpdesk info@jiscdigitalmedia.ac.uk (for FE/HE; limited support for other sectors) –Training http://www.jiscdigitalmedia.ac.uk/training/ –Email list and blog http://www.jiscmail.ac.uk/tasi http://www.jiscdigitalmedia.ac.uk/blog/ –Consultancy

4 Metadata Content: Areas to cover What is metadata? What metadata do I need to collect? Where does metadata come from? How is metadata organised? The importance of vocabularies Real examples of metadata collection and use

5 What is Metadata? Image courtesy of stock.xchng

6 What is Metadata? OED: “Data operating at a higher level of abstraction” “Useful information about stuff” - serves purpose - has structure - is referential - potentially anything! Common definition: “Structured data about data”

7 “Useful” – Purposes –Finding, identifying and understanding a resource Descriptive/Discovery metadata e.g. “Title”, “Subject” –Creating, managing and preserving a resource Administrative, Technical, and Preservation metadata e.g. “Format”, “Filesize”

8 “Useful” – Purposes –Organising and relating resources Structural and Packaging metadata e.g. “Is part of”, “Master image location” –Using a resource Usage and User-contributed metadata e.g. “Published in”, “License requirements”, “User rating”

9 “Information” - Structure –General categories e.g. “Format” or “Subject” Metadata schemas –Specific values e.g. “JPEG” or “Dog” Metadata vocabularies
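
The split between schema (categories) and vocabulary (values) can be sketched in a few lines of Python. This is only an illustration: the `SCHEMA` table, the allowed values beyond "JPEG" and "Dog", and the `validate` function are assumptions made for the example, not part of any standard.

```python
# Hypothetical schema: each category maps to its vocabulary.
# None means the category accepts free text.
SCHEMA = {
    "Title": None,
    "Format": {"JPEG", "TIFF", "PNG"},
    "Subject": {"Dog", "Cat", "Horse"},
}


def validate(record):
    """Return a list of problems found in a metadata record (a dict)."""
    problems = []
    for category, value in record.items():
        if category not in SCHEMA:
            # The schema does not define this category at all.
            problems.append(f"unknown category: {category}")
            continue
        allowed = SCHEMA[category]
        if allowed is not None and value not in allowed:
            # The category exists, but the value is outside its vocabulary.
            problems.append(f"value {value!r} not in vocabulary for {category}")
    return problems
```

A record such as `{"Title": "My dog", "Format": "JPEG", "Subject": "Dog"}` passes, while `{"Format": "jpg"}` is flagged because "jpg" is not the vocabulary's term for that format.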

10 “About Stuff” – Reference –Different ‘levels’ of a resource (e.g. collection, item, component) –Different ‘layers’ within a resource (e.g. physical resources, intermediaries, digital resources) –Things outside the resource (e.g. rights ownership)

11 Some initial questions… –What am I actually describing? –For whom? –For what purposes? –What categories and vocabularies might I need to assemble? –Where am I going to get the metadata from? –Where am I going to keep it?

12 Metadata can have different origins… –“Implicit”– derived from the image itself (typically technical data) –“Explicit” – brought to the image (typically descriptive metadata; might be ‘legacy’ data, or newly created) –New metadata might be: Provided by an image contributor Inferred from a context Added by a cataloguer Added by a user All of above
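
As a rough sketch of the two origins, the Python below reads "implicit" technical metadata from a file and merges it with "explicit" values supplied by a person. The function names and the rule that explicit values win on conflict are assumptions for illustration. It uses only the standard library, so the format is guessed from the file extension rather than read from the image data.

```python
import mimetypes
import os


def implicit_metadata(path):
    """Collect technical metadata that can be derived from the file itself."""
    mime_type, _ = mimetypes.guess_type(path)  # guessed from the extension only
    return {
        "Filename": os.path.basename(path),
        "Filesize": os.path.getsize(path),  # size in bytes
        "Format": mime_type,
    }


def merge(implicit, explicit):
    """Combine both origins; explicit (human-supplied) values win on conflict."""
    return {**implicit, **explicit}
```

In practice the explicit part might come from a contributor's form, a legacy database, or a cataloguer, as listed on the slide.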

13 … and can exist in different locations –Embedded within the digital resource itself –Held in a traditional database –Within an XML encoding Jga-0019a Sanctuary of Apollo
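
A minimal sketch of the XML option, using the identifier and title shown on the slide. The element names `record`, `identifier` and `title` are assumptions chosen for the example; a real project would take them from its chosen schema.

```python
import xml.etree.ElementTree as ET

# Build a small record; element names are illustrative only.
record = ET.Element("record")
ET.SubElement(record, "identifier").text = "Jga-0019a"
ET.SubElement(record, "title").text = "Sanctuary of Apollo"

# Serialise to text, as it might be stored or exchanged.
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)

# Read it back to show the encoding round-trips.
parsed = ET.fromstring(xml_text)
print(parsed.findtext("title"))
```

The same two values could equally be a row in a database table or fields embedded in the image file; the encoding changes, the metadata does not.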

14 “Metadata Communities” –Libraries (e.g. Yale Library Catalogue http://orbis.library.yale.edu/) –Individual, published, non-unique items –Long tradition of highly standardised metadata, shared cataloguing, interoperability (e.g. AACR2/MARC, DDC, LC Name Authorities…)

15 “Metadata Communities” –Archives (e.g. Online Archive of California http://www.oac.cdlib.org/) –Large, unique collections, context very important, limited resources –Common standards are relatively recent, Collection descriptions (“Finding aids”) (e.g. ISAD(G)/EAD, ISAAR(CPF)…)

16 “Metadata Communities” –Museums and galleries (e.g. British Museum http://www.britishmuseum.org/) –Large, unique and often diverse collections, context and administration important –Have typically developed in-house approaches, common standards relatively recent (e.g. CDWA, Spectrum…)

17 “Metadata Communities” –Photographers/Picture Libraries (e.g. UCAR Atmospheric Research Photo Library http://www.fin.ucar.edu/ucardil/) –Individual items, simple systems, focus on metadata within images –In-house approaches, “niche” standardisation (e.g. for technical and embedded metadata)

18 Choosing, adapting, and mapping schemas –Ideally we’d pull a schema off the shelf and begin cataloguing –Choice is clear for some collections but difficult for others (esp. where collection spans resource types or communities) –Adaptation is common and generally necessary (but needs to be done carefully!) –You might be combining several standard schemas or developing your own and mapping to standards for particular purposes

19 Dublin Core –International (ISO 15836:2003) cross-community standard for describing digital resources http://dublincore.org/ –Concentrates on descriptive/discovery metadata –“1:1 rule” (1 record for 1 thing) –Frequently adapted, mapped-to, used to achieve interoperability –Elements: Title, Creator, Subject, Description, Publisher, Contributor, Date, Type, Format, Identifier, Source, Language, Relation, Coverage, Rights
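
The sketch below shows one way a record restricted to the fifteen Dublin Core elements might be serialised as XML. The namespace URI is the standard one for the Dublin Core element set; the `metadata` wrapper element, the `to_dc_xml` function and the convention that each element holds a list of values are assumptions for this example.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)

# Ask ElementTree to use the conventional "dc" prefix when serialising.
ET.register_namespace("dc", DC_NS)


def to_dc_xml(record):
    """Serialise {element: [values]} as XML, rejecting non-DC elements."""
    root = ET.Element("metadata")  # wrapper name is an assumption
    for name, values in record.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"{name!r} is not a Dublin Core element")
        for value in values:  # every element may repeat
            ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")
```

Calling `to_dc_xml({"title": ["Sanctuary of Apollo"], "subject": ["Temples", "Greece"]})` yields one `dc:title` and two `dc:subject` elements (the subject values here are made up for the example).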

20 Three ways to adapt a schema –Adapting schemas: (1) Extend (2) Qualify (3) Simplify –Consequences for interoperability
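
To make the interoperability consequence concrete, here is a small sketch in which an adapted record is reduced back to plain core elements before sharing. Everything here is hypothetical: the four-element `CORE` subset, the `element.qualifier` dot notation, the `local.` prefix for extensions, and the dates in the sample record.

```python
# A simplified schema: only four core elements are kept (adaptation 3).
CORE = {"title", "creator", "date", "subject"}

adapted = {
    "title": "Sanctuary of Apollo",
    "date.created": "1998",          # qualified element (adaptation 2)
    "date.digitised": "2009",        # qualified element (adaptation 2)
    "local.shelfmark": "Jga-0019a",  # local extension (adaptation 1)
}


def dumb_down(record):
    """Reduce an adapted record to plain core elements for sharing."""
    shared = {}
    for key, value in record.items():
        base = key.split(".", 1)[0]  # "date.created" -> "date"
        if base in CORE:
            shared.setdefault(base, []).append(value)
        # Anything outside CORE (e.g. "local.shelfmark") is dropped here.
    return shared


print(dumb_down(adapted))
```

The shared record keeps both dates but no longer says which is which, and the local shelfmark disappears entirely. That loss is the price of each adaptation when records leave the home system.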

21 VRA Core –Visual Resources Association –Version 4.0 is now also available –Concentrates on descriptive/discovery metadata –For art and cultural images –Influenced by Dublin Core –1:1 rule (Work/Image) –Frequently adapted –http://www.vraweb.org/ –Elements: Record Type, Type, Title, Measurements, Material, Technique, Creator, Date, Location, ID Number, Style/Period, Culture, Subject, Relation, Description, Source, Rights

22 IPTC Core –International Press Telecommunications Council –Schema for embedding metadata within an image –Version 1.0 (for XMP) launched in 2004 –http://www.iptc.org/IPTC4XMP/ –Contact Information (e.g. Creator, Address, Email) –Content Information (e.g. Description, Keywords) –Image Information (e.g. Intellectual Genre, Location) –Status Information (e.g. Title, Source, Copyright)

23 SEPIADES –Safeguarding European Photographic Images for Access –For photographic collections –Very extensive, with many sub-categories –Covers description and administration, physical works and their digital reproductions –Multi-level description which can describe a whole collection at many levels at once (based on archival metadata) –http://www.knaw.nl/ecpa/sepia/workinggroups/wp5/sepiadestool/sepiadesdef.pdf

24 CDWA –Categories for the Description of Works of Art –Describes art works or cultural objects –Museum/gallery community –Extensive with many sub-categories –Covers description and administration, original works and their reproductions –Can describe complex objects with multiple parts –Note that there is a ‘lite’ version –http://www.getty.edu/research/conducting_research/standards/cdwa/index.html

25 Some Established Mappings –Mapping metadata schemas: –Getty crosswalks: http://www.getty.edu/research/conducting_research/standards/intrometadata/crosswalks.html –UKOLN resources: http://www.ukoln.ac.uk/metadata/
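
In code, a crosswalk is essentially a lookup table from one schema's elements to another's. The sketch below is illustrative only and is not the Getty crosswalk: it maps the VRA Core elements that share a name with a Dublin Core element, plus "ID Number" to "identifier" as an assumed equivalence, and reports everything else as unmapped.

```python
# Illustrative mapping only; consult a published crosswalk for real work.
VRA_TO_DC = {
    "Title": "title",
    "Creator": "creator",
    "Date": "date",
    "Subject": "subject",
    "Description": "description",
    "Relation": "relation",
    "Source": "source",
    "Rights": "rights",
    "Type": "type",
    "ID Number": "identifier",  # assumed equivalence
}


def crosswalk(vra_record):
    """Map a VRA-style record to DC names; return (mapped, unmapped names)."""
    mapped, unmapped = {}, []
    for element, value in vra_record.items():
        target = VRA_TO_DC.get(element)
        if target is None:
            unmapped.append(element)  # no equivalent in this sketch
        else:
            mapped[target] = value
    return mapped, unmapped
```

Reporting the unmapped elements, rather than silently dropping them, makes it easier to decide whether a mapping needs extending before records are shared.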

26 Vocabularies: “Controlling your Language” Image courtesy of stock.xchng

27 Why Use Controlled Vocabularies? –Better retrieval –Improved cataloguing efficiency and consistency –‘Disambiguate’ the language (e.g. ‘bank’) –Put things in their place (e.g. classify, identify relationships) –Support interoperability (improved cross-searching and metadata sharing)
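
The ‘bank’ example can be sketched as follows. The preferred terms, the entry terms and the `control` function are all invented for illustration; a real vocabulary would supply its own headings.

```python
# Entry terms (as a cataloguer might type them) -> preferred term.
VOCABULARY = {
    "riverbank": "Banks (landforms)",
    "river bank": "Banks (landforms)",
    "savings bank": "Banks (financial institutions)",
}

# Terms that cannot be resolved without a human decision.
AMBIGUOUS = {
    "bank": ["Banks (landforms)", "Banks (financial institutions)"],
}


def control(term):
    """Return the preferred term, or None if the term is not in the vocabulary."""
    key = term.strip().lower()
    if key in AMBIGUOUS:
        raise ValueError(f"{term!r} is ambiguous; choose one of {AMBIGUOUS[key]}")
    return VOCABULARY.get(key)
```

Because "riverbank" and "River Bank" both resolve to the same heading, a search on that heading retrieves every record regardless of how the cataloguer typed the term.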

28 Ways to Control Vocabularies –Data entry rules or guidelines –Formal subject headings –Thesauri –Classifications –Authority lists (people, places, events…) –In-house keyword lists –Uncontrolled cataloguer-added keywords? –Combination of approaches

29 Formal Controlled Vocabularies –Library of Congress Subject Heading (LCSH): Great Britain -- History -- Norman period, 1066-1154 –Art and Architecture Thesaurus (AAT): Anglo-Norman (full hierarchy = Styles and Periods \ European \ Medieval \ Anglo-Norman) –Dewey Decimal Classification (DDC): 942.02 (900=History, 940=European History, 942=British History, 942.02=Norman period) –Library of Congress Name Authorities: William I, King of England, 1027 or 8-1087 –Cataloguer keyword: William the Conqueror
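
The Dewey example on this slide shows how a classification encodes hierarchy in the notation itself. A minimal sketch, using only the four captions given on the slide and assuming a three-digit main class number (the `hierarchy` function is hypothetical):

```python
# Captions taken from the slide; a real DDC table is far larger.
DDC = {
    "900": "History",
    "940": "European History",
    "942": "British History",
    "942.02": "Norman period",
}


def hierarchy(notation):
    """Return captions from broadest to narrowest for a DDC notation."""
    main = notation.split(".")[0]  # "942.02" -> "942"
    # Broader classes are found by zero-filling the main number.
    candidates = [main[0] + "00", main[:2] + "0", main, notation]
    path, seen = [], set()
    for code in candidates:
        if code in DDC and code not in seen:
            seen.add(code)
            path.append(DDC[code])
    return path


print(hierarchy("942.02"))
# → ['History', 'European History', 'British History', 'Norman period']
```

This is what allows a search on a broad class such as 940 to be widened or narrowed without any extra metadata being recorded.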

30 What about ‘Uncontrolled’ Keywords? –Made up by a cataloguer at the point of cataloguing –Not an either/or situation – your metadata can accommodate both –A mix of both can assist with retrieval

31 Alternative Vocabularies Consider some more creative approaches: –Ask some of your users to ‘catalogue’ a representative sample of your collection –Get your users to do the cataloguing! (e.g. tagging or “folksonomies” – more later) –Get the technology to do the cataloguing! (e.g. CBIR – more later) –Draw on vocabularies from other communities, traditions and disciplines –Use an alternative vocabulary source (e.g. a children’s encyclopaedia, book index)

32 CBIR & ‘Folksonomy’ using Flickr –Exploring Flickr by colour: http://labs.systemone.at/retrievr/ –Using Flickr to catalogue a collection: http://www.flickr.com/photos/Library_of_Congress/

33 Another kind of user metadata –User-generated metadata –Web browser ‘cookies’ –Page tracking –Failed search analysis –Can provide very useful feedback –Can enable you to offer additional services to users (e.g. customisation and email notification)
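
Failed search analysis can be as simple as counting the queries that returned nothing. The sketch below assumes a search log is available as pairs of (query, number of results); that log format and the `failed_searches` function are assumptions for the example.

```python
from collections import Counter


def failed_searches(log, top=3):
    """Return the most frequent zero-result queries as (query, count) pairs."""
    failed = Counter(
        query.strip().lower()  # treat "Norman" and "norman " as the same query
        for query, result_count in log
        if result_count == 0
    )
    return failed.most_common(top)
```

Terms that users repeatedly search for without success are candidates for new keywords, or for entry terms pointing at an existing preferred heading.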

34 Examples for evaluation –Galaxy Zoo - http://www.galaxyzoo.org/ –Staffordshire PastTrack - http://www.staffspasttrack.org.uk/ –History Wired - http://historywired.si.edu/

35 Back to those initial questions –What am I actually describing? –For whom? –For what purpose? –What categories and vocabularies might I need to assemble? –Where am I going to get the metadata from? –Where am I going to keep it? –How am I going to exploit it?

36 Further Support and Guidance –Web site: http://www.jiscdigitalmedia.ac.uk/ –Helpdesk: http://www.jiscdigitalmedia.ac.uk/helpdesk/ –JISC Mail: https://www.jiscmail.ac.uk/cgi-bin/webadmin?A1=ind0907&L=JISCDIGITALMEDIA

