Western Waters Digital Library


1 Western Waters Digital Library
Building a Multi-State Aggregated Collection Using CONTENTdm®
Carol Hixson, Head of Metadata and Digital Library Services, Knight Library, University of Oregon
Kenning Arlitsch, Head of Information Technology, Marriott Library, University of Utah
CNI Task Force Meeting, December 6, 2004

2 Overview
Led by Greater Western Library Alliance (GWLA)
Funded by IMLS
Objectives:
  Begin developing comprehensive information resource
  Establish a viable technical infrastructure
  Serve as a collaborative model
Initial river basins focus: Platte, Colorado, Rio Grande, and Columbia
The comprehensive information resource will eventually include information of all kinds (legal, technical, scientific, historic) and all formats (photographs, documents, maps, audio/video, datasets, etc.). The IMLS proposal focuses on government surveys, homestead and user rights, compact agreements, reclamation, and state regulatory water issues.

3 Overview (cont’d)
Geographically distributed collections
  12 of 30 GWLA institutions currently participate
  Each site runs a CONTENTdm server
CONTENTdm Multi-Site Server at Univ. of Utah
  harvests only metadata
  creates aggregated index for single-site search
  results link out to remote sites
Balancing local control vs. central usability
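The harvesting arrangement on this slide can be pictured with a minimal sketch. It assumes each site exposes its records as simple dictionaries; the site names, record fields, example URLs, and the `build_index` and `search` helpers are hypothetical illustrations, not the CONTENTdm Multi-Site Server's actual interface.

```python
from collections import defaultdict

# Hypothetical harvested metadata: only descriptive fields and a link back
# to the owning site are copied; the digital objects stay on the remote servers.
HARVESTED = {
    "site-a": [
        {"id": "a1", "title": "Columbia River survey map",
         "url": "http://site-a.example/item/a1"},
    ],
    "site-b": [
        {"id": "b1", "title": "Platte River compact agreement",
         "url": "http://site-b.example/item/b1"},
        {"id": "b2", "title": "Columbia Basin reclamation report",
         "url": "http://site-b.example/item/b2"},
    ],
}


def build_index(harvested):
    """Build one inverted index (word -> set of (site, id)) over all sites."""
    index = defaultdict(set)
    for site, records in harvested.items():
        for record in records:
            for word in record["title"].lower().split():
                index[word].add((site, record["id"]))
    return index


def search(index, harvested, term):
    """Return sorted URLs on the remote sites for records matching one word."""
    lookup = {(site, r["id"]): r["url"]
              for site, recs in harvested.items() for r in recs}
    return sorted(lookup[key] for key in index.get(term.lower(), ()))


index = build_index(HARVESTED)
print(search(index, HARVESTED, "Columbia"))
```

The central index answers the query, but every hit points back to the contributing site, which is where the tension between local control and central usability shows up.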

4 Current Participants
Arizona State U.
Brigham Young U.
Colorado State U.
Oregon State U.
University of Arizona
University of New Mexico
University of Oregon
University of Nebraska
University of Nevada-L.V.
University of Utah
University of Washington
Washington State U.

16 Challenges
Metadata*
Searching aggregated metadata*
Selection of content
Communication*
Technology
Consortia funding models

17 Metadata Challenges
Project participants have agreed to follow the Western States Dublin Core Metadata Best Practices, version 2.0
The standards provide considerable latitude for some elements
Some participants are harvesting from legacy collections that were created without reference to these standards
Underlying CONTENTdm is a mapping to Dublin Core, both simple and qualified. Dublin Core itself provides a lot of latitude in the application of its standards, which is why application profiles are being developed for a variety of communities. Even within DC, there is confusion and overlap between different elements. A prime example is Contributor vs. Creator. The definitions for these two elements use a lot of overlapping terminology, which raises the question of why one would even need to distinguish between these roles in an element set that calls itself “Core.” Locally, a site administrator makes decisions on the use of these elements. In a consortial setup, with 11 or more participating sites, such local decisions on a basic core element can affect the effectiveness of searching across collections.
The Western States document gives considerably more guidance on how to map different types of information to various Dublin Core elements. However, there is still considerable latitude in input standards for various fields. In addition, some participants are harvesting from legacy collections created without reference to those standards, resulting in inconsistent values being supplied in the same field.
We have set up a metadata task force to help us try to resolve some of these challenges. The group is led by me and Nancy Chaffin (Colorado State) and also includes Kayla Willey (BYU), Brad Eden (UNLV), and Terry Reese (Oregon State).

18 Different Application of Metadata Standards
Digitization Specifications
  Mandatory and repeatable
  Not mapped to a Dublin Core element
  Refers to a variety of standards
  Lot of local latitude in: labeling the field, input standards
One of the required fields for Western Waters is one that the Western States document labels Digitization Specifications. It is not mapped to any DC element, yet it is required and repeatable. The document says to use the field to record technical information about the digitization of the resource. It lists a number of strongly recommended elements for this area (which can be put into one single field or several separate fields), including file size, quality, compression, and extent of master file, as well as a number of recommended elements such as creation hardware and software, preferred presentation, object producer, operating system, checksum value, and creation methodology.
At UO, we capture the different pieces of information in several distinct fields, and we often map the distinct pieces to a DC element. This still complies with the project guidelines. Utah captures the same type of data, but in different fields and with different input conventions for those fields. There is huge variation in the amount and presentation of this information, including whether it is available for public display. We are both in compliance with the standards we are using but have applied those standards in very different ways. Is this important? Hard to say at this point.

19 Application of Metadata Standards (Cont’d)
Date.Original and Date.Digital
  Both fields are mandatory (when applicable)
  Western States Best Practices document gives clear guidance
  Both map to Dublin Core Date
  Both say to follow the W3C Date Time Format: yyyy-mm-dd (1897-07-16 for July 16, 1897)
This is the first issue that the metadata group worked on because we considered it low-hanging fruit: the standards were clear and didn’t provide latitude. Nevertheless, the legacy data from some of the harvested collections doesn’t conform. Some of those sites (like OSU) have converted or plan to convert their legacy data so that it fits the standards. Others are less willing to do so. Does it matter? The data is currently searchable, regardless of the format. Might it matter depending on other search interfaces or system upgrades? Possibly.
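A conversion of legacy dates to the yyyy-mm-dd form might look like the following minimal sketch. The list of legacy formats and the `to_w3c_date` helper are assumptions made for illustration; each site would need to inventory the formats actually present in its own collections.

```python
from datetime import datetime

# Hypothetical legacy formats; a real collection would need its own list.
LEGACY_FORMATS = ("%Y-%m-%d", "%B %d, %Y", "%m/%d/%Y", "%d %B %Y")


def to_w3c_date(value):
    """Return the date as yyyy-mm-dd, or None if no known format matches."""
    text = value.strip()
    for fmt in LEGACY_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue  # try the next candidate format
    return None  # leave unparseable values for manual review


for raw in ("July 16, 1897", "7/16/1897", "circa 1897"):
    print(raw, "->", to_w3c_date(raw))
```

Returning None rather than guessing keeps uncertain values such as "circa 1897" out of the converted data until someone has looked at them.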

20 DC Mapping and Aggregated Searching
This slide illustrates one of the major metadata challenges: the difficulty of aggregated searching. One of the hardest fields to search across collections is Subject. This slide shows the underlying DC mapping for subject within the UO’s WWDL collection. Note that there are five different fields mapped to DC Subject. In the local UO collection, we have built search interfaces that allow us to create more meaningful searches.
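The effect of mapping several local fields to one Dublin Core element can be shown with a minimal sketch. The local field names below are hypothetical stand-ins (the slide's actual five fields are not listed in this transcript), and `to_dublin_core` is an illustrative helper, not part of CONTENTdm.

```python
# Hypothetical local field names; only the idea that several local fields
# share one Dublin Core element is taken from the slide.
LOCAL_TO_DC = {
    "Title": "title",
    "Subject (LCSH)": "subject",
    "Subject (TGM)": "subject",
    "Place name": "subject",
}


def to_dublin_core(local_record):
    """Collapse a local record into Dublin Core elements (lists of values)."""
    dc = {}
    for field, values in local_record.items():
        element = LOCAL_TO_DC.get(field)
        if element is None:
            continue  # field has no DC mapping, so it is not exported
        dc.setdefault(element, []).extend(values)
    return dc


record = {
    "Title": ["Bonneville Dam"],
    "Subject (LCSH)": ["Dams"],
    "Subject (TGM)": ["Dams"],
    "Place name": ["Oregon"],
}
print(to_dublin_core(record))
```

After the mapping, the aggregated record holds one flat list of subject values: the source vocabulary of each term is gone, and terms shared by two vocabularies appear twice.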

21 UO’s Local Site
http://libweb.uoregon.edu/catdept/digcol/wwdl/index
Here’s the top page for entry into UO’s local site.

22 Local Customized Search Interfaces
Locally, we can create separate search interfaces for the different types of subject. So the fact that we use both LC and TGM terms is not a problem in our local site: they are in separate fields, and we have built search interfaces to reflect that.

23 Local Customized Search Interfaces
This is one of the customized search interfaces we’ve developed at UO. On the central site, however, such customization and separation of distinct types of subject data is not possible.

24 No Mapping to Encoding Schema
One of the challenges for aggregated searching is that the software does not allow for mapping to encoding schema. CONTENTdm does allow for mapping to qualified DC but does not extend that to encoding schema. If it did, that would provide some assistance for libraries using different source vocabularies. At the moment, there is considerable redundancy and contradiction between TGM and LCSH. In an aggregated search, this can produce confusing search results.

25 Inconsistent Search Results
If one searches on subject within the aggregated site using the TGM term “Afro-Americans,” there is only one result. If one happens to search on African-Americans, the LCSH term, there are 8 results (fortunately still including the single result found with the TGM search). The Western States standards recommend only that libraries use a controlled vocabulary; they do not specify either the source vocabulary or the way it should be used.
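One conceivable mitigation, sketched here purely as an illustration and not as something the project or CONTENTdm implements, is to expand a query with equivalent terms from the other vocabulary. The equivalence table, the sample records, and both helper functions are hypothetical.

```python
# Hypothetical crosswalk: each set groups terms treated as equivalent
# across vocabularies (here, the TGM and LCSH forms from the slide).
EQUIVALENT_TERMS = [
    {"afro-americans", "african-americans"},
]


def expand_term(term):
    """Return the term plus any cross-vocabulary equivalents, lowercased."""
    key = term.lower()
    for group in EQUIVALENT_TERMS:
        if key in group:
            return set(group)
    return {key}


def search_subject(records, term):
    """Return ids of records whose subjects match the expanded term set."""
    wanted = expand_term(term)
    return [r["id"] for r in records
            if wanted & {s.lower() for s in r["subjects"]}]


# Made-up records for illustration only.
records = [
    {"id": "r1", "subjects": ["Afro-Americans", "African-Americans"]},
    {"id": "r2", "subjects": ["African-Americans"]},
    {"id": "r3", "subjects": ["Dams"]},
]
print(search_subject(records, "Afro-Americans"))
```

With the crosswalk in place, a search on either form returns the same set of records; without it, the TGM form would match only the first record.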

26 Inconsistent Search Results (cont.)
Developing effective search mechanisms and granularity across collections is complicated and requires strict adherence to metadata standards. Yet there is great variation in local practice, all reflecting carefully thought-out needs at the local level. For instance, searching on the subject term Bonneville Dam (Or. and Wash.), which is the strict LC form of the dam’s name, produces no search results, even though many records in the collection have this term mapped to subject. However, searching on Bonneville Dam Oregon and Washington does return results, including those from the UO collection. CONTENTdm treats OR as a Boolean operator and not as a word, so using the standard LC abbreviation for the state of Oregon would not suit us. Other sites are following LC and AACR2 strictly. We also break place names into a separate field on our local site and spell out all states fully, rather than using the LC/AACR2 abbreviations. Other sites do not. We are all in compliance with the Western States document but have made very local decisions.
Reaching consensus on subject fields would be extremely difficult, yet this has an enormous impact on aggregated searching. There is also considerable variation in how terms are applied. Some sites apply broader as well as specific terms (since the software doesn’t support any hierarchical linking), and others try to follow the traditional library cataloging philosophy of using only the most specific terms that match the item. A search on Columbia River in the aggregated site retrieves 407 hits, many of which are images that do not show the river itself.
How do you resolve issues like that and still move forward on building the joint collection? Do we really want to try to replicate the MARC/AACR2 environment with different standards-setting bodies and strict compliance? Is it necessary to do that? I say no. Others would disagree.
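The local UO practice of spelling out state names can be sketched as a small normalization step. The abbreviation table is a tiny illustrative subset, and `spell_out_states` is a hypothetical helper; nothing here reflects how CONTENTdm itself parses queries.

```python
import re

# Small illustrative subset of LC/AACR2 state abbreviations (assumption:
# a real table would cover every abbreviation in use).
STATE_ABBREVIATIONS = {"Or.": "Oregon", "Wash.": "Washington"}


def spell_out_states(heading):
    """Drop parentheses and replace abbreviated state names with full names."""
    tokens = re.sub(r"[()]", " ", heading).split()
    return " ".join(STATE_ABBREVIATIONS.get(token, token) for token in tokens)


print(spell_out_states("Bonneville Dam (Or. and Wash.)"))
```

Applied to the strict LC heading, this yields the spelled-out form that the notes report as returning results, with no token left that the search engine could read as the OR operator.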

27 Communication
GWLA direction and oversight
  Executive Board/Library directors (30)
  Digital Projects Task Force
WWDL communication forums
  Electronic discussion list
  Monthly reports and conference calls with project coordinators
  Semi-annual meetings at ALA
  Online repository for reports, standards, minutes, etc.
Consortia are fragile
  Misunderstandings arise as a result of misinformation
  Are problems the result of software, our metadata, our interfaces?
  Issues/concerns/opinions must be voiced openly
Good communication is crucial in large consortial projects. GWLA is a large organization, and interested subgroups include the Executive Board, the 30 library directors themselves, and the Digital Projects Task Force. The daily management of the WWDL itself includes several communication forums, including an electronic discussion list, monthly phone calls with the project coordinators, monthly reports, and meetings at ALA conferences. Despite these opportunities, misunderstandings sometimes arise, often because of misinformation that is repeated without verification. Are the problems we’re experiencing the result of the software, our application of metadata, or the interfaces we build? Issues and concerns such as these must be voiced openly because consortia are inherently fragile.

28 Opportunities
Widening use of metadata standards
Collectively digitize more types of material
Develop consensus on methods for presenting different types of material
Improve CONTENTdm software by providing consortia feedback to DiMeMa Inc.

29 Contact Information
Website –
Presentation -
Carol Hixson, Head of Metadata and Digital Library Services, Knight Library, University of Oregon (541)
Kenning Arlitsch, Head of Information Technology, Marriott Library, University of Utah (801)
Adrian Alexander, Executive Director, Greater Western Library Alliance (816)

