Tools and Techniques for Creating, Maintaining, and Distributing Shareable Metadata Jenn Riley Metadata Librarian Indiana University Digital Library Program.

1 Tools and Techniques for Creating, Maintaining, and Distributing Shareable Metadata Jenn Riley Metadata Librarian Indiana University Digital Library Program

2 What does this record describe? http://museum.university.edu/unique identifier State University Museum of Ichthyology, Fish Field Notes jpeg These pages may be freely searched and displayed. Permission must be received for subsequent distribution in print or electronically. Please go to http://museum.university.edu/ for more information. image 1926; 0070; 06; Little S. Br. Pere Marquette R.; THL26-68; 71300; 71301; 71302; 71303; 71304; 71305; 71306; 71307; 71308; 71309; 07; 1926/07/06; R12W; S09; Second collector Moody; T16N Cottus bairdi; Esox lucius; Cottus cognatus; Etheostoma nigrum; Salmo trutta; Oncorhynchus mykiss; Catostomus commersoni; Pimephales notatus; Margariscus margarita; Rhinichthys atratulus; mottled sculpin; northern pike; slimy sculpin; johnny darter; brown trout; rainbow trout; white sucker; bluntnose minnow; pearl dace; blacknose dace; bairdi; lucius; cognatus; nigrum; trutta; mykiss; commersoni; notatus; margarita; atratulus; Cottus; Esox; Cottus; Etheostoma; Salmo; Oncorhynchus; Catostomus; Pimephales; Margariscus; Rhinichthys; 1926-07-06; ; Boleosoma; Salmo; Hyborhynchus; Semotilus; ; fario; gairdneri--irideus; atronasus--obtusus--meleagris UND Michigan 1926 Langlois, v. 1 1926--1926; Record harvested via OAI PMH 2-27-2007

4 Collection Registries ????? GEM Photograph from Indiana University Charles W. Cushman Collection

5 Why we should care Library/archive/museum data is useful ◦ Even when objects aren’t digitized It’s our mission to distribute information We should be leaders in the networked information environment We have good ideas, but others do too We should therefore make it easier for our data to be used by others

6 Shareable Metadata… Is quality metadata Promotes search interoperability - “the ability to perform a search over diverse sets of metadata records and obtain meaningful results” (Priscilla Caplan) Is human understandable outside of its local context Is useful outside of its local context Preferably is machine processable

7 Shareable Metadata as a View Metadata is not monolithic Metadata should be a view projected from a single information object Create multiple views appropriate for groups of important sharing venues Depends on: ◦ Use ◦ Audience

8 The 6 Cs & Lots of Ss of Shareable Metadata Content Coherence Context Communication Consistency Conformance to Standards

9 Content How element values are structured affects whether the record is shareable For your institution, the resource, and the defined audience, choose the appropriate: ◦ Vocabularies ◦ Content standards ◦ Granularity of description ◦ Version of the resource to describe ◦ Elements to use Don’t include empty elements in shared records

10 Coherence A shareable metadata record should make sense on its own, outside of the local institutional context and without access to the resource itself Place values in appropriate elements Repeat elements instead of “packing” multiple values into one field Avoid local jargon, abbreviations and codes Ensure mappings from local to shared metadata formats result in coherent records
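The advice to repeat elements rather than "pack" values, and to leave out empty elements, can be sketched in a few lines. This is only an illustration: the semicolon delimiter is an assumption based on the sample record on slide 2, the function names are hypothetical, and Dublin Core `subject` is used simply as a familiar target element.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"


def unpack_values(packed):
    """Split a semicolon-packed field into separate, non-empty values."""
    parts = (part.strip() for part in packed.split(";"))
    return [part for part in parts if part]


def build_subject_elements(packed):
    """Create one dc:subject element per value instead of one packed element."""
    elements = []
    for value in unpack_values(packed):
        element = ET.Element(f"{{{DC_NS}}}subject")
        element.text = value
        elements.append(element)
    return elements


# Hypothetical packed field, modeled on the record shown on slide 2.
packed = "mottled sculpin; northern pike; ; slimy sculpin"
subjects = build_subject_elements(packed)
print([element.text for element in subjects])
```

The empty slot between the second and third semicolons produces no element at all, so the shared record carries three repeated subjects and nothing empty.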

11 Context Appropriate context allows a user to understand a resource based on the metadata record alone Shareable metadata records should: ◦ Include information not used locally ◦ Exclude information only used locally Collection level records can help, but don’t rely on them

12 Communication Information supplementing your metadata records can be useful to an aggregator ◦ Intended audiences ◦ Record creation methods ◦ Controlled vocabularies used ◦ Content standards used ◦ Accrual practices ◦ Existence of analytical or supplementary materials ◦ Provenance of materials Can be within or external to a sharing protocol

13 Consistency Consistency allows aggregators to apply same indexing or enhancement logic to an entire group of records Can be affected by change in policy or personnel over time Pay special attention to consistency of: ◦ How metadata elements are used ◦ How (and which) vocabularies are used for a particular element ◦ Syntax encoding schemes
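One place where inconsistent syntax encoding shows up quickly is dates: the record on slide 2 alone contains 1926/07/06, 1926-07-06, and 2-27-2007. A rough sketch of normalizing these to a single form (ISO 8601 here) follows. The list of accepted input formats is an assumption, and reading 2-27-2007 as month-day-year is a guess that would need confirming against local practice.

```python
from datetime import datetime

# Assumed input formats, tried in order. Month-day-year for the last one
# is an assumption; a day-month-year source would need a different entry.
KNOWN_FORMATS = ("%Y-%m-%d", "%Y/%m/%d", "%m-%d-%Y")


def normalize_date(raw):
    """Return the date as YYYY-MM-DD, or None if no known format matches."""
    text = raw.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None


for value in ("1926/07/06", "1926-07-06", "2-27-2007", "Second collector Moody"):
    print(value, "->", normalize_date(value))
```

Returning None instead of guessing keeps unparseable values visible for human review rather than silently rewriting them.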

14 Conformance to Standards Technical conformance to all types of standards is essential. Without it, processing tools and routines simply break. ◦ Sharing protocols (e.g. OAI-PMH) ◦ Metadata structure standards ◦ Controlled vocabularies and syntax encoding schemes ◦ Content standards ◦ Technical standards (e.g. XML, character encoding)
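The most basic technical conformance check, XML well-formedness, can be automated before a record ever leaves the building. The sketch below uses only Python's standard library and a hypothetical function name. It checks well-formedness only; validating against a schema or DTD would need an additional tool.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_bytes):
    """Return (True, "") if the bytes parse as XML, else (False, reason)."""
    try:
        ET.fromstring(xml_bytes)
    except ET.ParseError as exc:
        return False, str(exc)
    return True, ""


# Hypothetical records: the second one never closes its title element.
good = b"<record><title>Fish Field Notes</title></record>"
bad = b"<record><title>Fish Field Notes</record>"

print(check_well_formed(good))
print(check_well_formed(bad))
```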

15 Generic high-level workflow Plan: ◦ Write metadata creation guidelines ◦ Choose standards for native metadata ◦ Who to share with? ◦ Choose shared metadata formats Create: ◦ Create metadata (thinking about shareability) Transform: ◦ Perform conceptual mapping ◦ Perform technical mapping ◦ Validate transformed metadata ◦ Test shared metadata with protocol conformance tools Share: ◦ Implement sharing protocol Assess: ◦ Communicate with aggregators ◦ See who is collecting your metadata ◦ Review your metadata in aggregations
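The conceptual and technical mapping steps in the workflow above can be illustrated with a small crosswalk. In this sketch the conceptual mapping is a table from local field names to simple Dublin Core elements, and the technical mapping is the code that applies it. The local field names, the sample record, and the function name are all hypothetical, and Python stands in here for whatever transformation language a project actually uses.

```python
import xml.etree.ElementTree as ET

OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Conceptual mapping: hypothetical local field name -> Dublin Core element.
# Local-only fields are left out of the table, so they are never shared.
CROSSWALK = {
    "item_title": "title",
    "photographer": "creator",
    "date_taken": "date",
    "topics": "subject",
}


def to_oai_dc(local_record):
    """Technical mapping: build an oai_dc record from a local record dict."""
    ET.register_namespace("oai_dc", OAI_DC_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for local_field, dc_name in CROSSWALK.items():
        values = local_record.get(local_field, [])
        if isinstance(values, str):
            values = [values]
        for value in values:
            # Skip blank values so no empty elements reach the shared record.
            if value and value.strip():
                child = ET.SubElement(root, f"{{{DC_NS}}}{dc_name}")
                child.text = value.strip()
    return root


# Hypothetical local record; "shelf_code" is used only locally.
record = {
    "item_title": "Street scene",
    "photographer": "Charles W. Cushman",
    "topics": ["streets", " ", "automobiles"],
    "shelf_code": "B12-4",
}
shared = to_oai_dc(record)
print(ET.tostring(shared, encoding="unicode"))
```

Keeping the crosswalk as data, separate from the code that applies it, makes the conceptual decisions reviewable by people who do not read code.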

16 No single “right” workflow exists for all situations Our tools sometimes dictate parts of our workflow ◦ Be careful not to let them do this too much - tools serve us, not vice-versa Start workflow design from well-defined goals (not processes) Fundamental principles to follow ◦ Put the right information in front of the right person at the right time ◦ Ensure shareability is a common theme underlying it all ◦ Generate multiple views from a single master

17 Choose the best tools for the job Important every step of the way ◦ Programming languages ◦ Commercial or open-source software packages ◦ Repository solutions ◦ Metadata creation interfaces Promotes both efficiency and quality Define needed functionality, and negotiate (compromise) from there

18 Thinking big picture Must find a reasonable balance between the perfect solution for a single set of materials and fully streamlined processes that treat everything the same way One approach - define categories of material and design reusable workflows for each

19 Defining categories of material By resource type ◦ Text ◦ Documentary images ◦ Art images ◦ Musical audio recordings ◦ etc…. (including getting more specific) By managing institution? ◦ May provide barriers for our users - see Elings/Waibel: “Metadata for All” article in First Monday, 2007 ◦ But institutional mission is a factor in determining the appropriate views of a resource to share

20 Reusable parts of workflow Decisions on metadata structure standards, content standards, controlled vocabularies, etc. Metadata creation tools Automated processing techniques XSLT stylesheets and other data management code SIP/AIP/DIP architecture Delivery systems

21 Generalization is worth the effort You will have to go back and do it again at some point ◦ Fixing typos, errors, etc. ◦ Adding new content over time ◦ Adding new metadata format or sharing mechanism ◦ Migration to another system Need both workflow tools and documentation to be accessible Generalization will allow you to minimize the effort of redoing something and focus more on the new stuff

22 Make the most of automation Automate the repetitive tasks as much as feasible, but only where it makes sense For example: ◦ Create as much technical metadata as possible from the file itself ◦ Derive basic structural metadata from filenaming conventions ◦ Develop automated processes that are triggered when an XML file is placed in a “drop box” or submitted via a specialized tool ◦ Develop easy-to-use tools to apply the same metadata to a defined group of records
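Deriving basic structural metadata from file naming conventions might look something like the following. The convention itself (collection, four-digit item number, three-digit page number, for example fishnotes-0070-001.tif) is invented for this sketch, as are the function names; a real project would substitute its own pattern.

```python
import re
from collections import defaultdict

# Hypothetical convention: <collection>-<item, 4 digits>-<page, 3 digits>.<ext>
FILENAME_PATTERN = re.compile(
    r"^(?P<collection>[a-z]+)-(?P<item>\d{4})-(?P<page>\d{3})\.(?P<ext>tif|jpg)$"
)


def parse_filename(name):
    """Return the parts encoded in a filename, or None if it does not conform."""
    match = FILENAME_PATTERN.match(name)
    if match is None:
        return None
    return {
        "collection": match.group("collection"),
        "item": match.group("item"),
        "page": int(match.group("page")),
        "format": match.group("ext"),
    }


def page_order(filenames):
    """Group conforming files by item and list each item's pages in order."""
    items = defaultdict(list)
    for name in filenames:
        parsed = parse_filename(name)
        if parsed is None:
            continue  # non-conforming names are left for manual review
        items[parsed["item"]].append((parsed["page"], name))
    return {item: [name for _, name in sorted(pages)] for item, pages in items.items()}


files = [
    "fishnotes-0070-002.tif",
    "fishnotes-0070-001.tif",
    "notes.txt",
    "fishnotes-0071-001.tif",
]
print(page_order(files))
```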

23 Basic workflow at IU (1) Metadata standards chosen Metadata creation guidelines written and tools developed/adapted Fedora content model developed or existing appropriate one identified Metadata/markup created (and perhaps digitization performed) ◦ Sometimes in phases by different people

24 Basic workflow at IU (2) Metadata transformed via XSLT (one per category of material, with some tweaking for each collection) into all desired formats, and loaded into Fedora Metadata for sharing loaded into OAI-PMH data provider Appropriate staff alerted for parallel metadata creation for OPAC (generally collection level) Note several opportunities for greater efficiency
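Once metadata is loaded into an OAI-PMH data provider, an aggregator retrieves it with plain HTTP requests. As a rough sketch of what such a request looks like, the function below assembles a ListRecords URL from the protocol's standard arguments (verb, metadataPrefix, set, from). The base URL and set name are placeholders, not IU's actual endpoint.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL for a data provider."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec        # optional selective harvesting by set
    if from_date:
        params["from"] = from_date      # optional lower bound on datestamp
    return f"{base_url}?{urlencode(params)}"


# Placeholder base URL and set name, for illustration only.
url = list_records_url("http://example.org/oai", set_spec="fieldnotes")
print(url)
```

Requesting your own records this way, and reading what comes back, is a cheap first test of how the shared view looks from outside the local system.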

25 One step at a time Implementing shareable metadata practices likely will be done incrementally We’re still learning how to best achieve effective shareability Best practices grow and change over time Must be positioned to respond quickly to new metadata standards and technologies as they evolve

26 Shareable metadata isn’t just about OAI-PMH Some other options: ◦ Lightweight APIs (e.g., OpenLibrary) ◦ Google SiteMaps ◦ OpenURL ◦ SRU ◦ OAI-ORE ◦ Linked data Jim Michalko, RLG: library data sharing mechanisms are “high value and low participation” Notice Z39.50 isn’t on this list.

27 Promoting new uses The academic institution-built metadata (and/or content) aggregation seems to have plateaued ◦ See Ricky Erway RLG report “Seeking Sustainability” We must provide a variety of options for accessing our data, to support a variety of uses We shouldn’t necessarily stop collaboration and aggregation, but we should allow others to do this too, with our metadata (and maybe even our content)

28 Terminologies services Sharing our authority data is potentially even more useful than sharing our descriptive data RLG/OCLC doing some work in this area ◦ Moving terminologies to the “network level” Some possible uses ◦ Give me more information on this concept/person/etc. ◦ What are this term’s broader, narrower, related terms? ◦ What are all the synonyms for this term?
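The three questions listed above map naturally onto a small lookup interface. The sketch below is an in-memory toy, not any actual RLG or OCLC service: the class and method names are hypothetical, and the tiny vocabulary is made up for illustration using species names from the record on slide 2.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    label: str
    synonyms: list = field(default_factory=list)
    broader: list = field(default_factory=list)
    related: list = field(default_factory=list)


class Terminology:
    """A toy terminology with the lookups a terminology service might offer."""

    def __init__(self, terms):
        self._terms = {term.label: term for term in terms}

    def preferred_label(self, name):
        """Resolve a label or synonym (case-insensitively) to the preferred label."""
        wanted = name.casefold()
        for term in self._terms.values():
            names = [term.label, *term.synonyms]
            if any(candidate.casefold() == wanted for candidate in names):
                return term.label
        return None

    def synonyms(self, label):
        return sorted(self._terms[label].synonyms)

    def broader(self, label):
        return sorted(self._terms[label].broader)

    def narrower(self, label):
        # Narrower terms are derived: every term that names this one as broader.
        return sorted(t.label for t in self._terms.values() if label in t.broader)

    def related(self, label):
        return sorted(self._terms[label].related)


# Hypothetical vocabulary for illustration.
vocabulary = Terminology([
    Term("trout"),
    Term("brown trout", synonyms=["Salmo trutta"], broader=["trout"], related=["rainbow trout"]),
    Term("rainbow trout", synonyms=["Oncorhynchus mykiss"], broader=["trout"]),
])
print(vocabulary.narrower("trout"))
print(vocabulary.preferred_label("salmo trutta"))
```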

29 Tools supporting the creation of shareable metadata Our existing metadata creation tools are embarrassingly bad Current technologies provide many opportunities for improvement Good tools make it easy to do the right thing and hard to do the wrong thing Can operate when metadata is first created or in a later review step Here are some ideas…

30 Directly in XML Generally only a good idea for markup languages, rather than metadata structure standards ◦ And often not even then Some supplemental tools can help ◦ Validation to Schema/DTD (of course) ◦ “Preview” function ◦ “Report card” function, e.g., with Schematron

31 Modularize All metadata for a resource doesn’t have to be created at once ◦ Transcription vs. authority work vs. subject analysis ◦ Descriptive vs. technical vs. structural ◦ Us vs. users! Provide optimized views for each metadata creation function ◦ Perhaps even different systems ◦ But always provide metadata creators with a way to see how the metadata will be used

32 Abandon the record-centric approach Patterns (and outliers) emerge from data in the aggregate Reporting capabilities ◦ Sortable, deduplicated lists of values from a given field or set of fields ◦ How many of this field per record ◦ How many distinct values used in this field ◦ Data overlap between fields
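The reporting capabilities listed above are straightforward to prototype once records are available as simple structures. The sketch below assumes each record is a dictionary mapping a field name to a list of values; that representation, the function names, and the sample records are all assumptions made for illustration.

```python
from collections import Counter


def distinct_values(records, field_name):
    """Sorted, deduplicated list of every value used in a field."""
    return sorted({value for record in records for value in record.get(field_name, [])})


def values_per_record(records, field_name):
    """How many records carry 0, 1, 2, ... occurrences of a field."""
    return Counter(len(record.get(field_name, [])) for record in records)


def overlap(records, field_a, field_b):
    """Values that appear in both fields anywhere in the set of records."""
    in_a = set(distinct_values(records, field_a))
    in_b = set(distinct_values(records, field_b))
    return sorted(in_a & in_b)


# Hypothetical records for illustration.
records = [
    {"title": ["Fish Field Notes"], "subject": ["Cottus bairdi", "Esox lucius"]},
    {"subject": ["Esox lucius"], "description": ["Esox lucius"]},
    {"title": ["Fish Field Notes"]},
]
print(distinct_values(records, "subject"))
print(values_per_record(records, "subject"))
print(overlap(records, "subject", "description"))
```

A record with no subject at all shows up in the zero bucket of the per-record count, which is exactly the kind of outlier that is hard to spot one record at a time.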

33 Useful features Data type validation (while entering data in that field!) Auto-complete Record-level validation Spell check Integration of metadata creation guidelines into software tools

34 Integration of controlled vocabularies Should be seamless Provide access to entire authority record rather than just the heading For short vocabularies, provide a combo box For longer vocabularies ◦ Auto-complete ◦ Ajax-y interactions with hierarchical and alphabetical views Similar features could be used to perform maintenance of vocabularies
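For longer vocabularies, the auto-complete behavior can be prototyped with a sorted list and a binary search on the typed prefix. This is a minimal sketch under stated assumptions: the class name is hypothetical, matching is case-insensitive on the start of the heading only, and a production tool would more likely query the vocabulary's own service.

```python
import bisect


class Autocompleter:
    """Case-insensitive prefix suggestions over a list of vocabulary headings."""

    def __init__(self, headings):
        # Sort on the case-folded form but remember the original heading.
        self._entries = sorted((heading.casefold(), heading) for heading in headings)
        self._keys = [folded for folded, _ in self._entries]

    def suggest(self, prefix, limit=10):
        key = prefix.casefold()
        # Binary search for the first heading that is >= the typed prefix.
        start = bisect.bisect_left(self._keys, key)
        results = []
        for folded, original in self._entries[start:]:
            if len(results) >= limit or not folded.startswith(key):
                break
            results.append(original)
        return results


# Headings taken from the sample record on slide 2.
completer = Autocompleter([
    "Cottus bairdi",
    "Cottus cognatus",
    "Catostomus commersoni",
    "Esox lucius",
])
print(completer.suggest("cot"))
```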

35 Working around system limitations Many digital asset management systems don’t support a second shareable copy of records Do your best to split the difference with system records Use creative interface design for your local system Use extra-protocol documentation for communicating with aggregators Lobby your vendor!

36 Good practice requires collaboration One person can’t do it all Implementing shareable metadata requires a primary advocate to ensure shareability is a consideration at all steps of the workflow Many people will need to be involved

37 Role of metadata specialists Often are the shareable metadata advocate Choose standards and sharing protocols Write metadata creation guidelines Be prepared to compromise!

38 Role of technical staff Evaluate feasibility of technical plans Help with prioritization of options Locate and evaluate existing code to minimize duplication of effort Abstract specific processes for general use

39 Other collaborators Collection managers User specialists Project managers Catalogers/metadata creators Reference staff Granting agencies

40 Final thoughts about sharing Shareable metadata represents a fundamental shift in thinking ◦ Your metadata is no longer a destination, it is information that will serve as building blocks for other services ◦ Your metadata must operate effectively in an increasingly decontextualized environment Creating shareable metadata ◦ Will require more work on your part ◦ Will require our software to support (more) standards ◦ Is no longer an option, it’s a requirement

41 Yes, this is hard… …and we’re just starting to learn how to do it effectively and efficiently There’s plenty of room for leadership in this area.

