1
Using Metadata in CONTENTdm Diana Brooking and Allen Maberry Metadata Implementation Group, Univ. of Washington Crossing Organizational Boundaries Oct. 29, 2002
2
Outline The metadata “environment”: factors that influence basic decisions Structure of metadata: Dublin Core, field structure in CONTENTdm Content standards: what goes into the fields, formatting, controlled vocabularies The data dictionary: bringing it all together
3
Metadata: what is it? Data about data –“Metadata are data that describe the attributes of a resource; characterize its relationships; support its discovery, management, and effective use; and exist in an electronic environment.” (Sherry Vellucci, LRTS 44 (1), 1999) Commonly known as cataloging
4
Metadata: how is it used? For description: information for display with the image For searching: users search for images by searching for text attached to the image
5
Basic Decisions: Description How much information do you have? How much information do your users need/want? –What is depicted in the image? –Who created it? –Why is it important? Why did you select it? How much detail do you need to go into?
6
Basic Decisions: Searching How will users find the images? What will they be looking for? What aspects are they interested in? How will you find the images? What are your staff’s needs? At what level do you need to distinguish images from one another? At what level do you need to bring like resources together?
7
Decision Factors Size of file –50 images (small enough to browse) –10,000 images (need for more precise searching) –10,000 images of many different things vs. 10,000 images of trains
8
Decision Factors Audience –General public vs. specialists (e.g., railroad enthusiasts) Institutional mission –Say you are a railroad museum (audience expectations)
9
Decision Factors Legacy data –Starting from scratch –Years of good cataloging –Years of inconsistent cataloging Software issues –What kind of data can the system handle? –What are its search capabilities? –Short-term vs. long-term view
10
Basic Dublin Core Metadata What is the Dublin Core Metadata Element Set (DCMES)? Why was it developed, and how has it been developed? A short history of the DC Initiative is available at http://www.dublincore.org/about/overview/
11
Dublin Core Metadata Element Set There are 15 basic elements See Dublin Core Element Set, Version 1.1 - Reference Description But it is adaptable and expandable to fit the needs of different users through the use of “application profiles”
12
Dublin Core and CONTENTdm CONTENTdm is designed around the Dublin Core (Very) basic overview of how CONTENTdm works –CONTENTdm uses DC element names as file names –Because each database has constant file names it is easy to combine them to search either one or more collections
13
Dublin Core mapping An example: –Collection A has a field “Photographer” mapped to DC:Creator, and Collection B has a field “Artist” mapped to DC:Creator. Searching across both databases searches the CONTENTdm index “Creat*” and retrieves data from the index for both “Photographers” and “Artists” for collections A + B or A+B+n…
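The mapping idea above can be sketched in a few lines of Python. This is an illustration only and does not use CONTENTdm's actual API or file formats: the collection names, the made-up sample records, and the helper search_dc are all hypothetical, chosen to mirror the Photographer/Artist example.

```python
# Hypothetical illustration only; not CONTENTdm's real data model or API.
# Each collection maps its own field labels to Dublin Core element names.
FIELD_MAPS = {
    "A": {"Photographer": "creator", "Title": "title"},
    "B": {"Artist": "creator", "Title": "title"},
}

# Made-up sample records, keyed by collection.
RECORDS = {
    "A": [{"Photographer": "Smith, Jane", "Title": "Train at depot"}],
    "B": [{"Artist": "Smith, John", "Title": "Harbor sketch"}],
}


def search_dc(element, term, collections):
    """Return (collection, record) pairs where a field mapped to
    `element` contains `term` (case-insensitive)."""
    hits = []
    for name in collections:
        mapping = FIELD_MAPS[name]
        for record in RECORDS[name]:
            for label, value in record.items():
                if mapping.get(label) == element and term.lower() in value.lower():
                    hits.append((name, record))
                    break  # one hit per record is enough
    return hits


# Searching "creator" reaches Photographer in A and Artist in B.
results = search_dc("creator", "smith", ["A", "B"])
```

The point of the sketch is that the search is expressed against the shared element name ("creator"), so each collection can keep its own local label.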
14
Dublin Core and searching What are the practical consequences of this? –In cross-database searching, one can search on specific fields. However, the names of these fields will not be “Photographer” or “Artist”, but “Creator”, because that is the common name of the index in each collection. –However, you can do a keyword search on all “searchable” fields in the database, whether they are mapped to a Dublin Core field or not.
15
Modern Book Arts field labels –bibliographic description = descr0 –text production = descr1 –image production = descr2, etc. Cross-database search index –Description = descr*
16
Dublin Core tips –Be careful about what information you put in searchable fields, even if they are not mapped to a DC element. –If you have multiple collections, it is very important that the same type of data is consistently mapped to the same DC elements.
17
Content Standards Used for choosing and formatting the data that goes into the fields. Increase coherence and intelligibility of description Enhance reliability of retrieval Enable compatibility with other collections (cross-database searching) Make maintenance and possible migration of data to other software easier
18
Standards = Consistency “Date” field: dates should always be formatted the same way “Photographer” field: same person’s name should always appear in the same form “Subject” field: same topic should have the same term used to describe it across images If different terms or formats are used, the user may not even realize that more than one search is necessary
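As a rough sketch of what consistent date formatting could look like in practice, the Python below normalizes a few input styles to one form. The target format (YYYY-MM-DD), the list of accepted input formats, and the function name normalize_date are assumptions for illustration; a real project would take its format from its own data dictionary.

```python
from datetime import datetime

# Assumed input styles a project might encounter; extend as needed.
# Month names are parsed using the default (English) locale.
KNOWN_FORMATS = ["%Y-%m-%d", "%m/%d/%Y", "%B %d, %Y", "%b. %d, %Y"]


def normalize_date(raw):
    """Return the date as YYYY-MM-DD, or None if no known format matches."""
    text = raw.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue  # try the next format
    return None  # leave unparseable values for a person to review
```

Returning None rather than guessing keeps ambiguous or approximate dates (for example "circa 1900") out of the normalized field until a cataloger decides how to record them.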
19
Examples of Content Standards For description: Anglo-American Cataloging Rules, 2nd ed., 2002 revision (libraries) Graphic Materials: Rules for Describing Original Items and Historical Collections, 1982; revisions available electronically (libraries, also museums, historical societies, LC Prints & Photo., CORBIS)
20
Content Standards: Controlled Vocabularies “Any subset of the lexicon of a natural language. A list of preferred and nonpreferred terms produced by the process of vocabulary control. Types of controlled vocabularies include subject heading lists and thesauri.” (NISO)
21
Controlled vocabs for which fields? When you need consistency across images, so that user searches find all … –Proper names for things (people, places, etc.) –Subjects depicted in the images Not necessary when you have… –Fields that contain data more likely to be unique to the particular image (title, notes, other free text fields)
22
Remember… You can have fields that don’t use controlled vocabularies, but where you still need consistency in format: –Dates –Image numbers –Physical description You could create your own controlled vocab lists (if you really had to)
23
Controlled Vocabularies For names: Library of Congress/National Authority File: http://authorities.loc.gov Union List of Artist Names (Getty): http://www.getty.edu/research/tools/vocabulary/ulan USGS Geographic Names Information System: http://geonames.usgs.gov/gnishome.html
24
Controlled Vocabularies For subjects: Library of Congress Subject Headings: http://authorities.loc.gov LC Thesaurus for Graphic Materials: http://www.loc.gov/rr/print/tgm1 Art & Architecture Thesaurus (Getty): http://www.getty.edu/research/tools/vocabulary/aat Chenhall’s Nomenclature (The Revised Nomenclature for Museum Cataloging. Walnut Creek: Altamira Press, 1995)
25
Vocabulary conflicts? DC Subject: LCSH vs. AAT –Church buildings vs. Churches DC Coverage: LC vs. Board of Geographic Names –Moscow vs. Moskva Challenge of meeting needs of diverse collections and users, while maintaining consistency within and between databases
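One way to keep a single database consistent when vocabularies disagree is a local table of preferred terms. The Python sketch below is hypothetical: the table PREFERRED, the function preferred_term, and the choice of which form wins are assumptions for illustration, not something prescribed by LCSH, AAT, or Dublin Core.

```python
# Hypothetical project-level choices; which form is preferred is a local decision.
# Keys are lower-cased variants, values are the form the project stores.
PREFERRED = {
    "subject": {
        "churches": "Church buildings",
        "church buildings": "Church buildings",
    },
    "coverage": {
        "moskva": "Moscow",
        "moscow": "Moscow",
    },
}


def preferred_term(dc_element, term):
    """Return the project's preferred form of `term` for a DC element,
    or the trimmed term unchanged if no preference is recorded."""
    cleaned = term.strip()
    return PREFERRED.get(dc_element, {}).get(cleaned.lower(), cleaned)
```

For example, preferred_term("subject", "Churches") yields "Church buildings" under these assumed choices, while an unlisted term passes through unchanged.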
26
Data Dictionaries For each project a data dictionary documents: Database-specific field labels Mapping of fields to DC elements Data formatting instructions for each field Recommended controlled vocabularies UW data dictionaries: http://www.lib.washington.edu/msd/mig/datadicts/default.html MOHAI
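The four things a data dictionary documents can be pictured as one record per field. The Python sketch below is a hypothetical structure, not the format of the UW data dictionaries: the class FieldSpec, the sample entries, and the helper labels_for are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FieldSpec:
    """One data dictionary entry (hypothetical structure)."""
    label: str                      # database-specific field label
    dc_element: Optional[str]       # mapped DC element, or None if unmapped
    formatting: str                 # data formatting instructions
    vocabularies: List[str] = field(default_factory=list)  # recommended vocabularies


# Invented sample entries for an imaginary photograph collection.
DATA_DICTIONARY = [
    FieldSpec("Photographer", "creator", "Last name, First name",
              ["Library of Congress authority file"]),
    FieldSpec("Subject", "subject", "One term per entry",
              ["LC Thesaurus for Graphic Materials"]),
    FieldSpec("Date", "date", "YYYY-MM-DD"),
    FieldSpec("Image number", None, "Prefix plus four digits"),
]


def labels_for(dictionary, dc_element):
    """List the local field labels mapped to a given DC element."""
    return [spec.label for spec in dictionary if spec.dc_element == dc_element]
```

Keeping the mapping, formatting rule, and vocabulary together per field makes it straightforward to check, for instance, which local labels feed the "creator" index.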