1
University of Michigan’s OAIster
Lessons Learned
Kat Hagedorn
OAIster/Metadata Harvesting Librarian
University of Michigan, DLPS
October 7, 2002
2
Overview of Issues
- Use of tools (harvester, transformation script)
- Metadata foibles
- Communication with data providers
- What users want
- Questions we’d like to answer
3
Use of Tools
- Using the harvester was easy and very functional
- Transformation script evolved as system evolved
- Lessons learned:
  - Re-harvesting set from scratch involves getting into the SQL
  - Scheduling harvests can be time intensive; can’t harvest more than 10 medium-size repositories at a time
  - Normalization and filtering routines were perfected as we progressed (e.g., removal of super-long records so ranking algorithms worked appropriately)
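The two operational lessons on this slide (harvesting no more than about 10 repositories at a time, and dropping super-long records before ranking) can be illustrated with a minimal Python sketch. This is not the OAIster harvester or transformation script; the function names, the example URLs, and the record-length threshold are all assumptions made for illustration, since the slides do not state them.

```python
from typing import Iterable, Iterator, List

# The slide reports roughly 10 medium-size repositories as a practical ceiling.
MAX_CONCURRENT_REPOSITORIES = 10
# Assumed value: the slides give no actual length threshold for "super-long".
MAX_RECORD_CHARS = 20_000


def batch_repositories(
    urls: List[str], size: int = MAX_CONCURRENT_REPOSITORIES
) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` repository URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def drop_overlong_records(
    records: Iterable[str], limit: int = MAX_RECORD_CHARS
) -> List[str]:
    """Keep only records whose text length does not exceed `limit`."""
    return [record for record in records if len(record) <= limit]


if __name__ == "__main__":
    # Hypothetical repository base URLs, for demonstration only.
    repos = [f"http://example.org/repo{i}/oai" for i in range(23)]
    for number, batch in enumerate(batch_repositories(repos), start=1):
        print(f"harvest run {number}: {len(batch)} repositories")
```

A fixed character limit is the simplest possible filter; a real pipeline might instead measure length per field or after normalization.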
4
Metadata Foibles
- Inconsistent use of tags
  - “oai-dc” used instead of “dc”
  - Namespace occurrences placed willy-nilly
- Widely varying data values
  - DC Subject uses thesaurus or doesn’t
  - Resource types either too general or too specific
- XML validation errors, assumedly due to incorrect UTF-8 formatting
- Repositories differ in levels of specificity
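Two of the foibles above lend themselves to automated checks: records that are not valid UTF-8 or not well-formed XML, and Dublin Core elements that arrive under an unexpected prefix such as “oai-dc”. The sketch below uses only the Python standard library and is an illustration under assumptions, not the checks OAIster ran. The function names are hypothetical, and the first function tests well-formedness only, not validity against a schema.

```python
import io
import xml.etree.ElementTree as ET
from typing import Optional, Set

# Namespace URI of the Dublin Core element set.
DC_NAMESPACE = "http://purl.org/dc/elements/1.1/"


def diagnose_record(raw: bytes) -> Optional[str]:
    """Return None if `raw` is UTF-8 and well-formed XML, else a short reason."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        return f"invalid UTF-8 at byte {exc.start}"
    try:
        ET.fromstring(raw)
    except ET.ParseError as exc:
        return f"not well-formed XML: {exc}"
    return None


def dc_prefixes(raw: bytes) -> Set[str]:
    """Collect every prefix the record binds to the Dublin Core namespace."""
    prefixes = set()
    for _event, (prefix, uri) in ET.iterparse(io.BytesIO(raw), events=("start-ns",)):
        if uri == DC_NAMESPACE:
            prefixes.add(prefix)
    return prefixes
```

Because a namespace-aware parser identifies elements by namespace URI rather than by prefix, a record that binds “oai-dc” to the Dublin Core namespace can still be read the same way as one that uses “dc”; `dc_prefixes` simply reports which spelling a provider chose.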
5
Communication With Data Providers
- Currently, about 10% of data providers have XML validation problems
  - Solved a number of these by contacting the data provider directly
  - Most more than willing to look at their data again
- Informal community that could be made more formal (one fix helps multiple service providers)
6
What Users Want
- It depends…
- We know what irritates them
  - No browse function
  - Not knowing exactly how to search in a new service
  - Metadata inconsistencies
- Continual improvements to service
7
Questions We’d Like to Answer
- Duplication
- Restrictions
- Normalization
- Relevancy ranking
- “Best” answers
- Portal offshoots…especially as data increases
8
Contact Info
Kat Hagedorn
UM Digital Library Production Service
khage@umich.edu
http://oaister.umdl.umich.edu/
For technical info: Mike Burek, mburek@umich.edu