Why are there so many catalogs and what can we do about it? Robin Wendler (and Dale Flecker) November 2, 2000 Tufts Metadata Conference.



Catalogs galore
Traditional library materials
Social science data sets
Art and Cultural Images
Archives
Botanical specimens
Biomedical Images
Geo-spatial Data
Networked Resources

Robin: Research resources do not reside only in libraries. The visibility made possible by the web is breaking down the barriers between different kinds of research organizations, not least in the minds of researchers. The greater availability of automation options and, most importantly, the easy distribution made possible by the web have influenced more departments to make their information available online and have led to rising expectations about online access to collections and services.

At Harvard, this manifests itself in the development of multiple systems to provide public access to different parts of the collections. These parallel public access systems augment HOLLIS as ways to make Harvard resources known to the community. These systems are tailored to a particular kind of intellectual access, which is often closely but not exclusively associated with a particular type of material.

In each of these systems, some form of cataloging is created, and internal consistency in the data is necessary if searches of the system are to yield predictable results. The information is created in different departments, by staff with different kinds of training, with reference to different standards and vocabularies.

REASONS FOR MULTIPLE CATALOGS
Desire for autonomy
Varying functional requirements
Community-specific conventions, terminology
Different metadata formats appropriate for different materials or in different contexts

DESIRE FOR AUTONOMY
Catalogs operated by different administrative units such as
–libraries
–museums
–archives
–herbaria
–academic departments
–research labs
–hospitals
–...
…units which may have more interest in interoperating with their fellows across institutional boundaries than with other kinds of organizations within the institution

FUNCTIONAL NEEDS DIFFER
Library catalogs:
–support circulation; placing holds, recalls, or requests from remote storage
–optimized for searching a large database and browsing large result sets
–draw a line between finding and using material
–use standards to support large scale exchange of metadata
–standard metadata lends itself to automated processing (e.g., authority control, identifying duplicates, merging records, creating well-ordered result lists)

FUNCTIONAL NEEDS DIFFER
Image catalogs: integrate display of images with the catalog; “light table”, image comparison tools
Geospatial catalogs: search via “bounding polygon” interface and determine relevance based on proportion of overlap; support “preview” rendering of data
Statistical data catalogs: order datasets from ICPSR, exploratory statistical modeling
Biomedical image catalogs: link between research projects, supporting images and resulting publications
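
The geospatial bullet, relevance "based on proportion of overlap", can be illustrated with a minimal sketch. It makes several assumptions that are not in the slides: the bounding polygon is simplified to an axis-aligned box, area is computed on a flat plane in degrees, and "proportion of overlap" is taken to mean the fraction of the query area that a dataset covers. The names BBox, intersection, and overlap_score are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BBox:
    """Axis-aligned box in degrees (a simplification of a bounding polygon)."""
    west: float
    south: float
    east: float
    north: float

    def area(self) -> float:
        # Clamp each side to zero so an empty (non-overlapping) box has no area.
        return max(0.0, self.east - self.west) * max(0.0, self.north - self.south)


def intersection(a: BBox, b: BBox) -> BBox:
    # The result may be "inside out" when the boxes do not meet; area() handles that.
    return BBox(max(a.west, b.west), max(a.south, b.south),
                min(a.east, b.east), min(a.north, b.north))


def overlap_score(query: BBox, dataset: BBox) -> float:
    """Fraction of the query area covered by the dataset (assumed definition)."""
    query_area = query.area()
    if query_area == 0.0:
        return 0.0
    return intersection(query, dataset).area() / query_area


query = BBox(0, 0, 10, 10)
datasets = {
    "half": BBox(5, 0, 20, 10),     # covers the eastern half of the query
    "none": BBox(20, 20, 30, 30),   # does not touch the query
    "all": BBox(-5, -5, 15, 15),    # fully contains the query
}
# Rank dataset names from most to least relevant.
ranked = sorted(datasets, key=lambda name: overlap_score(query, datasets[name]),
                reverse=True)
```

A real geospatial catalog would work with true polygons and would have to handle boxes that cross the date line; the sketch only shows why this kind of ranking needs a search function that a bibliographic catalog has no reason to offer.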

TERMINOLOGY AND CONVENTIONS DIFFER
For people, organizations, places, topics...
–Libraries use Library of Congress and Medical Subject Headings
–VIA uses the Art and Architecture Thesaurus and Union List of Artist Names
–Herbaria use standardized botanical names and form personal names according to centuries-old practice
–Geodesy uses conventional notations for geographic coordinates

METADATA DIFFERS...
Because of historically different practices
–Library standards require describing the object in hand
–Photo collection standards describe the object pictured
–Archives describe collective materials as they are organized
And these differences are reflected in the formats used to record the descriptions

METADATA DIFFERS...
With the structure of what is being described
–Image cataloging is often hierarchic, with many pictures of a single described object, site, etc.
–The cataloging for an archival collection is structured to replicate the logical arrangement of the collection
–Dataset descriptions include variables and their locations

METADATA DIFFERS...
With community schemes and standards
–Libraries use MARC and AACR2
–The GIS community uses FGDC
–The archival community uses EAD
–The survey data community will be using DDI
–The image community will use VRA Core
–The text encoding community uses TEI
Start with elements, move toward rules

MORE REASONS FOR MULTIPLE CATALOGS
Smaller catalogs are easier to use
–1.8% of all HOLLIS searches exceed maximum result set limit (126,659 searches of 7 million in FY99)
–fewer functions to learn, but those used more often
Specific catalogs can be tailored to targeted audiences
–increasing precision of search results
–providing richer (or more frequently needed) functionality
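
The 1.8% figure follows from the two counts on the slide. A quick check, assuming the rounded total of "7 mill." searches means exactly 7,000,000:

```python
over_limit_searches = 126_659   # searches that exceeded the result set limit (from the slide)
total_searches = 7_000_000      # assumption: the slide's rounded total taken as exact

share = over_limit_searches / total_searches
print(f"{share:.1%}")  # prints 1.8%
```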

BUT...
Multiple catalogs are confusing
–How does a user know where to look?
Multiple catalogs are inconvenient
–Need to repeat a search multiple times

SOME POSSIBLE SOLUTIONS
Replicated descriptions
Distributed search
Super-catalog
Links

REPLICATED DESCRIPTIONS
Same material described in more than one catalog
–MARC AMC records and EAD finding aids
–MARC and the library Portal
–MARC for ICPSR datasets and Harvard/MIT Data Center records
Geodesy to experiment with single point of metadata creation/maintenance feeding two catalogs (HOLLIS and Geodesy)

REPLICATED DESCRIPTIONS
Issues
–Can be labor intensive
–Added maintenance burden
–Mapping between metadata standards doesn’t work well
  ALWAYS involves some loss (of data, of meaning, of specificity, and/or of accuracy)
  may be extremely difficult, e.g., hierarchical VIA records or EAD finding aids would not map well into MARC
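
The loss involved in mapping between standards can be made concrete with a toy crosswalk. This is a sketch under stated assumptions: the field names, the CROSSWALK table, and the flatten function are hypothetical and do not represent real VIA, EAD, or MARC structures. It only shows how a flat target format drops both unmapped elements and hierarchy.

```python
# Hypothetical crosswalk from a hierarchical image record to a flat record.
CROSSWALK = {"title": "title", "creator": "author", "date": "date"}


def flatten(record, crosswalk):
    """Map a record into the flat target format and report what was lost."""
    flat = {}
    lost = []
    for field, value in record.items():
        # Nested values (lists or dicts) have no place in the flat target format.
        if field in crosswalk and not isinstance(value, (list, dict)):
            flat[crosswalk[field]] = value
        else:
            lost.append(field)
    return flat, sorted(lost)


record = {
    "title": "Chartres Cathedral",
    "creator": "Unknown",
    "style": "Gothic",                       # no equivalent in the crosswalk
    "views": [                               # hierarchy: many pictures of one work
        {"title": "West facade"},
        {"title": "North rose window"},
    ],
}

flat, lost = flatten(record, CROSSWALK)
print(flat)   # the replicated description
print(lost)   # elements that did not survive the mapping
```

Here "creator" survives only as the narrower label "author" (a loss of meaning), while "style" and the nested "views" do not survive at all (a loss of data and of structure).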

DISTRIBUTED SEARCH (diagram)
1. QUERY (user to SEARCH FRONT END)
2. QUERY (front end to SYSTEM 1, SYSTEM 2, SYSTEM 3)
3. RESPONSE (each system to front end)
4. SUMMARY OR CONSOLIDATED RESPONSE (front end to user)

DISTRIBUTED SEARCH
Front-end query interface
–Reformats user query as appropriate for each target system
  May allow user to choose which target(s) to query
–Sends queries in parallel
–Handles search results
  May consolidate results into single set
  May simply summarize number of hits, and pass user to specific target system to display results
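
The front-end behavior listed above can be sketched in a few lines. Everything here is a stand-in: the targets are in-memory lists rather than real catalogs, and make_target, distributed_search, and the "find <index> <term>" syntax are hypothetical names chosen for illustration. A real front end would speak each system's own protocol, such as Z39.50.

```python
from concurrent.futures import ThreadPoolExecutor


def make_target(name, index_label, records):
    """Build a fake target system with its own query syntax and search function."""
    def format_query(term):
        # Each target names its index differently ("author" vs "person").
        return f"find {index_label} {term}"

    def search(query):
        term = query.split(" ", 2)[2].lower()
        return [r for r in records if term in r.lower()]

    return {"name": name, "format_query": format_query, "search": search}


TARGETS = [
    make_target("library", "author", ["Audubon, John James. Birds of America",
                                      "Darwin, Charles. Origin of Species"]),
    make_target("images", "person", ["Portrait of John James Audubon",
                                     "Audubon print: Carolina parakeet"]),
]


def distributed_search(term, targets, chosen=None):
    """Reformat the query per target, search in parallel, then summarize and consolidate."""
    selected = [t for t in targets if chosen is None or t["name"] in chosen]
    with ThreadPoolExecutor(max_workers=len(selected) or 1) as pool:
        futures = {t["name"]: pool.submit(t["search"], t["format_query"](term))
                   for t in selected}
        results = {name: future.result() for name, future in futures.items()}
    summary = {name: len(hits) for name, hits in results.items()}
    consolidated = [(name, hit) for name, hits in results.items() for hit in hits]
    return summary, consolidated


summary, consolidated = distributed_search("audubon", TARGETS)
```

The summary corresponds to the "number of hits" option and the consolidated list to the "single set" option. Because the front end waits on every future, the whole search is only as fast as the slowest target.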

DISTRIBUTED SEARCH -- ISSUES
Front-end system is complex
–Need to understand each target system
  Search syntax
  Results responses and formats
  Easier if all targets support Z39.50
–Constant maintenance is required as target systems are modified
Performance sensitive to weakest link

DISTRIBUTED SEARCH -- ISSUES
Target systems frequently have non-parallel functions or use different terminology
  “find author” vs “find person”
  “cancer” vs “neoplasms”
Consolidating results into a single set is difficult
–How to de-duplicate when same item is described in more than one system
–How to order heterogeneous result sets
–How to display heterogeneous data formats
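
Both issues can be sketched briefly. The translation tables and the title-based duplicate key below are assumptions made for illustration; the target names ("library", "images", "medical") are hypothetical, and real de-duplication across heterogeneous records is much harder than comparing normalized titles.

```python
# Hypothetical per-target tables: how each system names an index or a subject.
INDEX_NAMES = {"library": {"creator": "author"}, "images": {"creator": "person"}}
SUBJECT_TERMS = {"medical": {"cancer": "neoplasms"}}


def translate(target, index, term):
    """Rewrite a generic query into one target's index names and vocabulary."""
    target_index = INDEX_NAMES.get(target, {}).get(index, index)
    target_term = SUBJECT_TERMS.get(target, {}).get(term.lower(), term)
    return f"find {target_index} {target_term}"


def dedupe(hits):
    """Keep the first hit for each normalized title (a crude duplicate key)."""
    seen = set()
    unique = []
    for system, title in hits:
        key = "".join(ch for ch in title.lower() if ch.isalnum())
        if key not in seen:
            seen.add(key)
            unique.append((system, title))
    return unique


hits = [("hollis", "Birds of America"),
        ("via", "Birds of America."),      # same item, described in a second system
        ("via", "Carolina parakeet")]
unique_hits = dedupe(hits)
```

Every entry in these tables has to be maintained by hand as target systems change, which is part of the maintenance burden that comes with distributed search.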

SUPER-CATALOG (diagram)
1. CONTRIBUTE METADATA (SYSTEM 1, SYSTEM 2, SYSTEM 3 to SUPER-CATALOG)
2. QUERY (user to SUPER-CATALOG)
3. RESPONSE (SUPER-CATALOG to user)

SUPER-CATALOG
Union catalog of data from separate systems
–Data collected through contribution or via “harvesting”
Data may require homogenizing
–Format
–Data elements
–Terminology

SUPER-CATALOG -- ISSUES
Homogenizing can be complex
–Terminology particularly difficult
Homogenizing tends towards least-common-denominator
–If one contributor only labels “person”, cannot offer “author” search
Likely to produce a catalog of “apples and oranges”
–Single photographs/whole archival collections
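
The least-common-denominator effect can be shown with a toy union catalog. The FIELD_MAP table, the record fields, and the function names are hypothetical; the sketch assumes that the only safe way to merge an "author" field with an undifferentiated "person" field is to generalize both to "person".

```python
# Hypothetical homogenizing rule: specific roles collapse into the generic label.
FIELD_MAP = {"author": "person", "photographer": "person"}


def homogenize(record, field_map=FIELD_MAP):
    """Rename contributed fields to the union catalog's common element set."""
    merged = {}
    for field, value in record.items():
        target = field_map.get(field, field)
        # Collect values in a list in case two source fields map to one target.
        merged.setdefault(target, []).append(value)
    return merged


def common_fields(records):
    """Fields present in every record, i.e. what can be searched across the whole catalog."""
    field_sets = [set(r) for r in records]
    return set.intersection(*field_sets) if field_sets else set()


contributions = [
    {"title": "Birds of America", "author": "Audubon, John James"},    # library record
    {"title": "Carolina parakeet", "person": "Audubon, John James"},   # image record
]
union_catalog = [homogenize(r) for r in contributions]
searchable = common_fields(union_catalog)
```

After homogenizing, "author" no longer exists as a searchable element, even for the records that originally carried it.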

RELATED IDEA: “ACADEMIC LYCOS”
Super catalog built from data in many academic research catalogs across institutions
Built on Internet search engine technology
–Based on familiar concepts and interfaces
Being explored by DLF with Mellon Foundation encouragement

LINKS
Supports navigation and assistance for sequential searching of multiple systems
After searching one catalog, user given options of pursuing same query in other sources
Primary exemplar is SFX system

SFX SYSTEM (diagram)
1. QUERY (user to SYSTEM 1)
2. RESULTS WITH “LINKS?” BUTTON (SYSTEM 1 to user)
3. USER CLICKS “LINKS?” BUTTON (user to SFX SYSTEM)
4. PAGE WITH MULTIPLE LINK OPTION BUTTONS (SFX SYSTEM to user)
5. BUTTON GENERATES PRE-FORMATTED SEARCH (to SYSTEM 2)
6. RESULTS (SYSTEM 2 to user)
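
Steps 4 and 5 of the flow, a links page whose buttons each generate a pre-formatted search, can be sketched as follows. This is not SFX's actual interface: the LINK_TARGETS table, the example.org URLs, the parameter names, and build_links are all hypothetical, and the sketch only illustrates that a links server needs per-target knowledge of search syntax.

```python
from urllib.parse import urlencode

# Hypothetical link targets: base URL plus a map from record fields to URL parameters.
LINK_TARGETS = {
    "library catalog": ("https://catalog.example.org/search",
                        {"title": "ti", "creator": "au"}),
    "image catalog": ("https://images.example.org/find",
                      {"creator": "person"}),
}


def build_links(record):
    """Return one pre-formatted search URL per target that can use the record's data."""
    links = {}
    for name, (base_url, params) in LINK_TARGETS.items():
        query = {param: record[field]
                 for field, param in params.items() if field in record}
        if query:  # skip targets for which the record offers nothing to search on
            links[name] = f"{base_url}?{urlencode(query)}"
    return links


links = build_links({"title": "Birds of America", "creator": "Audubon"})
for name, url in links.items():
    print(name, url)
```

Nothing in this approach checks whether the generated search will find anything in the target system, so a generated link may still turn out to be a dead end.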

LINKS -- ISSUES
Each source system must be modified to provide appropriate “LINKS?” button
Links server must understand data formats and search syntax for each linked system
Does not address problems of non-parallel terminology and search functionality
Potential user frustration, as many links will be dead ends

THEREFORE….
Many approaches, no ideal solution
–Fundamental problem in digital libraries
–Problem and solutions being widely analyzed today