A New Kind of Catalog Charley Pennell Principal Cataloger for Metadata North Carolina State University North Carolina Library Association 2007.



Where is this talk headed?
- Local motivation
- National trends
- What is Endeca?
- Features
- Does Endeca work?
- Where are we going from here?
- Where is everybody else going?

Why a new catalog? What was wrong with the old one?

A little TRLN catalog primer
- TRLN libraries (Duke, NCCU, NCSU, UNC-CH) jointly develop and maintain BIS
- DRA implemented for catalog (UNC & Duke continue Acq/Serials modules)
- No integrated keyword/browse capability
- Web2 catalog implemented
- Sirsi & DRA “merge” in 2002; Taos DOA

A little TRLN catalog primer 2
- NCSU & NCCU to Unicorn; Duke to Aleph; UNC-CH to Millennium
- Sirsi/Dynix merger, 2004: vendor focus shifts (even more) toward school/public market
- While agreeing to continue to support Web2, S/D increasingly looking to merge all product catalogs into single interface

What was the catalog lacking?
- Simplicity: a simple, hopefully uncluttered interface
- Interactivity: ways to interact with results to get better results
- Forgiveness: just fix my typos and case errors, don’t make me feel stupid!
- Response time: always
- Real-time sorting: the limit is how many?!!
- Relevance ranking: as if!
- Web services: use the Web to repurpose data, enable mash-ups, add-ons & improvements

Which interface is ready for immediate use?

So, why DOES everyone think that the catalog sucks (stinks)?
"Most integrated library systems, as they are currently configured and used, should be removed from public view." - Roy Tennant, OCLC

The old model

The integrated library system
- Historically, the ILS developed as an inventory control system for use by library staff only
- First library automation systems (Plessey, CLSI, Geac, Innovative) were designed around circulation or acquisitions functions
- Interaction time was calibrated to the slow pace of backroom work where the audience was basically captive
- Staff focus on known-item searching, not resource discovery

The catalog as part of the ILS
- The first integrated OPACs were veneers on top of existing inventory management systems—patrons & staff competed for system resources! They still do!
- First OPACs allowed for browse only; early keyword searching restricted to certain fields (A/T/S) only
- Libraries with no IT support were stuck with what their vendor provided and the enhancement process for improvements
- Libraries with IT support created their own systems: BIS, NOTIS, Claremont Colleges, Georgetown, PALS, DOBIS/LIBIS

The state of the ILS in 2007
- Customer demands for increasing functionality in a marketplace with little $$ to spend have reduced the ILS vendor pool through mergers and buyouts
- New functionality (multi-search, ERMS, E-Ref, ILL, etc.) increasingly being met by stand-alone and third party applications
- Increasing competition from open source (Koha, Evergreen, Scriblio, LibraryThing) and e-commerce
- Q: Is our dogged adherence to MARC the only thing keeping the remaining ILS vendors afloat?

The state of the catalog 2007
- Library users’ search expectations have been conditioned by interactions with commercial Websites and Google, with which Libraries can barely afford to compete, but must
- Libraries are becoming increasingly virtual as users interact with us online (e-resources, Second Life)
- User expectations for online experiences are more interactive, instantaneous, and inviting

Perhaps most importantly… The information resources represented in the catalog represent a shrinking percentage of what end users need or want Calhoun’s Aristotelian vs. Copernican views of the catalog

What do users want from the OPAC?
- Make subject searching in online catalogs easier using post-Boolean probabilistic searching with automatic spelling correction, term weighting, intelligent stemming, relevance feedback, and output ranking
- Streamline users' book selection decisions at the catalog by adding tables of contents and back-of-the-book indexes to cataloging (i.e., metadata) records
- Reduce the many failed subject searches by expanding the online catalog with full texts: journal and newspaper articles, encyclopedias, dissertations, government documents, etc.
- Increase finding strategies in online catalogs through the library classification
-- Markey, Karen (2007). “The online library catalog: Paradise lost and paradise regained”, D-Lib Magazine, 13(1/2).

“Many researchers express surprise at the brevity (from one to three words) of the queries people submit to online systems. Belkin tells why so few words make up their queries, "Precisely because of the inquirer's lack of knowledge about a problem area, it is impossible to specify what would resolve it." For Belkin, the saving grace is the inquirer's ability to recognize what he or she wants or does not want during the course of the search. Therein lies an important solution to the problem— information systems that report results for easy eyeballing and instantaneous recognition of relevant possibilities.” – Karen Markey

What is an Endeca?

A software company based in Cambridge, MA A search and information access technology provider for a number of major e-commerce websites Developers of the Endeca Information Access Platform

Endeca features
- Commercial-strength search/sort speeds
- Site customizable relevance ranking
- Faceted browse
- True browsing (LC classification)
- Spell-checking
- “Did you mean?”
- Automatic word stemming
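The "Did you mean?" feature in the list above can be pictured with a small sketch. This is not Endeca's algorithm, which the slides do not describe; it only illustrates the general idea of suggesting the closest indexed term for a misspelled query, using Python's standard difflib. The vocabulary and the cutoff value are assumptions made for illustration.

```python
import difflib

# Hypothetical vocabulary; a real catalog would draw terms from its own index.
INDEXED_TERMS = ["statistics", "psychology", "cataloging", "metadata", "ordovician"]


def did_you_mean(query, vocabulary=INDEXED_TERMS, cutoff=0.8):
    """Return a suggested spelling for `query`, or None if none applies."""
    term = query.strip().lower()
    if term in vocabulary:
        # The term is already indexed, so there is nothing to suggest.
        return None
    # get_close_matches ranks candidates by a similarity ratio between 0 and 1.
    matches = difflib.get_close_matches(term, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

A call such as did_you_mean("statistcs") would offer "statistics", while a correctly spelled or wholly unrelated term yields no suggestion.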

Endeca at NCSU Libraries Went live in January 2006 Works with a text version of a daily snapshot of Libraries’ MARC & other metadata Used to improve the discovery portion of the library catalog Interoperates with ILS for holdings, current availability status Web2 interface still present for known item & authority searching

Implementation timeline
- License / negotiation: Spring 2005
- Acquire: Summer 2005
- Implementation:
  - August 2005: vendor training
  - September 2005: finalize requirements
  - October 2005 – January 2006: design and development
  - January 12, 2006: go-live date
- Widen to TRLN partners: Winter 2008

Implementation Team
- Implementation Team brought together from IT, DLI, Cataloging, Collections, Reference, Circulation
- Worked on indexing, UI, usability testing, etc.
- Areas of contention:
  - Number of initial search boxes (1 or 2)
  - Order, grouping of facets
  - Placement of classification hierarchies, breadcrumbs
  - Use of “search” and “browse” on tabs
- Visualization aided by Tito’s wireframes

8th (and Final) Revision: Aggregate holdings information by library. Reduces complexity of continuing and online resources. Brief view vs. Full view gives user choice about displaying holdings.

NCSU Endeca features Facets Results Call # browse Breadcrumbs

Features we started with Faceted browse Availability facet Breadcrumbs Spell check / Did you mean Hierarchical subject browse based on LCC Fuzzy link to live Web2 data New book browse for titles added in last week only

Features that we’ve added New book browse based on relative date (last week, last month, last three months) RSS feeds based on user results “Search within” results Send search to TRLN partners Static unique link to live Web2 data

Relevance ranking
Based on locally customizable algorithm:
- Most relevant: query exactly as entered
- For multi-term searches: phrase match
- Field match: title match more relevant than notes match
- Other factors:
  - number of fields matched
  - weighted frequency
  - static ordering (publication date, circulation stats)
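The tiers named on the relevance ranking slide can be sketched as a sort key. This is an illustration only, not NCSU's configuration or Endeca's ranking modules: the record layout (title, notes, year) is hypothetical, weighted frequency is approximated by simple term counts, and circulation statistics are left out.

```python
def relevance_key(record, query):
    """Build a sort key; larger tuples mean more relevant records."""
    q = query.lower().strip()
    terms = q.split()
    title = record.get("title", "").lower()
    notes = record.get("notes", "").lower()

    exact = title == q                  # query exactly as entered
    phrase = q in title or q in notes   # phrase match for multi-term queries
    title_hits = sum(t in title.split() for t in terms)
    notes_hits = sum(t in notes.split() for t in terms)
    fields_matched = (title_hits > 0) + (notes_hits > 0)
    # Title hits come before notes hits; publication year is the static tie-breaker.
    return (exact, phrase, title_hits, fields_matched, notes_hits,
            record.get("year", 0))


def rank(records, query):
    """Return records ordered from most to least relevant."""
    return sorted(records, key=lambda r: relevance_key(r, query), reverse=True)
```

Because Python compares tuples element by element, each later factor only breaks ties left by the earlier ones, which mirrors the tiered ordering on the slide.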

Faceting at the NCSU Libraries Follows on what we have learned from the commercial Web search model Mines metadata already available via MARC record, local class number, ILS item categories, circ status, and date stamping Required massive clean-up of 6xx subdivisions Allows both pre- and post-coordinate limits Uses table mapping to enable drilling down through call number results

Facet refinements Availability Author Library Format Language New(ness) LC Classification Subject: Topic Subject: Genre Subject: Region Subject: Era

A single facet need not represent data from a single field
- Single Unicorn item types (Book, Kit, Manuscript, Map, Data set)
- Multiple Unicorn item types (Audio, Microform, Thesis/Dissertation, Software & Multimedia, Videos)
- Leader byte 07 (Bib lvl): Journal, Magazine
- Library (Online)
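A minimal sketch of how one Format facet could be assembled from several sources, as this slide describes. The item-type codes and the "ONLINE" library code are hypothetical; the slide names the facet values, not the underlying codes. The only outside fact used is that MARC leader position 07 holds the bibliographic level, where "s" means serial.

```python
# Hypothetical item-type codes mapped to facet values named on the slide.
ITEM_TYPE_TO_FORMAT = {
    "BOOK": "Book",
    "MAP": "Map",
    "VHS": "Videos",          # several item types collapse into one facet value
    "DVD": "Videos",
    "MICROFILM": "Microform",
    "MICROFICHE": "Microform",
}


def format_facets(item_types, bib_level, libraries):
    """Combine item types, leader/07 and library into Format facet values."""
    facets = {ITEM_TYPE_TO_FORMAT[t] for t in item_types
              if t in ITEM_TYPE_TO_FORMAT}
    if bib_level == "s":          # MARC leader/07 value for serials
        facets.add("Journal, Magazine")
    if "ONLINE" in libraries:     # assumed code for the Online "library"
        facets.add("Online")
    return sorted(facets)
```

Under these assumptions, a record held as both DVD and VHS gets the single value "Videos", and an online serial gets both "Journal, Magazine" and "Online".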

Ranking facet results by number of postings makes sense in a short list, but not in a long list
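One way to act on this observation is to switch the ordering once a facet has many values. The sketch below is an assumption, not what the NCSU interface does: it sorts by number of postings for short lists and falls back to alphabetical order for long ones. The threshold of 10 and the record layout are hypothetical.

```python
from collections import Counter


def facet_values(records, field, short_list_max=10):
    """Return (value, count) pairs for one facet, ordered for display."""
    counts = Counter(v for r in records for v in r.get(field, []))
    if len(counts) <= short_list_max:
        # Short list: most postings first, ties broken alphabetically.
        return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    # Long list: alphabetical order is easier to scan than raw counts.
    return sorted(counts.items(), key=lambda kv: kv[0].lower())
```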

The author facet is less useful in some types of searches …

… than others!

Technical overview
Raw MARC data → NCSU exports and reformats → Flat text files → Data Foundry (parse text files) → Indices → MDEX Engine → HTTP → NCSU Web Application
(Data Foundry and MDEX Engine belong to the Information Access Platform)

MARC ingest
- MARC → flat text file(s) for ingest by Endeca.
- Transformation accomplished with MARC4J.
- Opportunity to manipulate data on the back-end.
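The flattening step can be pictured with a short sketch. NCSU's transformation was done in Java with MARC4J, and the slides do not show the flat-file layout, so everything here is an assumption for illustration: the record is a plain list of (tag, subfields) pairs rather than a parsed MARC record, and the output uses a hypothetical "id|tag|value" line format.

```python
def flatten_record(record_id, fields, delimiter="|"):
    """Turn one record into delimited text lines, one line per field.

    `fields` is a list of (tag, [(subfield_code, text), ...]) pairs, a
    hypothetical stand-in for a parsed MARC record.
    """
    lines = []
    for tag, subfields in fields:
        # Join subfield text; this is where back-end clean-up could happen.
        value = " ".join(text.strip() for _code, text in subfields)
        # Keep the delimiter out of the data so each line stays parseable.
        value = value.replace(delimiter, " ")
        lines.append(delimiter.join([record_id, tag, value]))
    return lines
```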

Transformed data

The end result… Video

Other Endeca library catalogs Phoenix Public Library: McMaster University: Florida Center for Library Automation Individual Florida universities etc.

Does Endeca work?

Problems: authority control
- Endeca is a keyword search engine; “browse” can only be effected using sort options
- There is no authority control within Endeca itself, rather it relies on AC within ILS
- To make use of available metadata, subjects were split along subdivisions. Authors were not
- Talks were held with the vendor to explain the potential for drawing on authority x-refs to collocate searches

Problems: subject context
- Problems with wrong delimiter values (esp. $v)
- Problems maintaining context in atomized LCSH
  - One-way relationships: English language$vDictionaries$xSpanish
  - Chronological headings devoid of geographic context: Cuba$xHistory$yRevolution, 1959
  - Phrase headings expressed in multiple subdivisions: Prisoners$xAbuse of
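The loss of context is easy to see once a heading is split into facet values. The sketch below assumes a mapping from subdivision codes to the subject facets listed earlier ($x to Topic, $v to Genre, $y to Era, $z to Region), which follows standard MARC subfield meanings but is not confirmed as NCSU's exact rule. The lead_facet parameter is a hypothetical way to say whether the leading $a is topical or geographic.

```python
import re

# Assumed mapping from subdivision codes to the subject facets.
SUBFIELD_TO_FACET = {"x": "Topic", "v": "Genre", "y": "Era", "z": "Region"}


def split_heading(heading, lead_facet="Topic"):
    """Split 'Cuba$xHistory$yRevolution, 1959' into per-facet values."""
    # re.split with a capturing group yields [lead, code, value, code, value, ...]
    parts = re.split(r"\$([a-z])", heading)
    pairs = [(lead_facet, parts[0])]
    for code, value in zip(parts[1::2], parts[2::2]):
        pairs.append((SUBFIELD_TO_FACET.get(code), value))
    facets = {}
    for facet, value in pairs:
        if facet and value.strip():
            facets.setdefault(facet, []).append(value.strip())
    return facets
```

With the Cuba heading from the slide and lead_facet="Region", the Era facet receives "Revolution, 1959" on its own, with nothing tying it to Cuba. Likewise "Prisoners$xAbuse of" yields the separate Topic values "Prisoners" and "Abuse of", the second of which is meaningless alone.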

Problems: subject hierarchies
- Chronological hierarchy not built into $y
  - “19th century” does not subsume , , , , , Civil War, , etc.
  - Geological periods exist as text only (Ordovician, Pleistocene, etc.)
  - Some chronological headings are expressed as text in 650$a: Middle Ages, Nineteen sixties
- Geographic hierarchy not consistent between 651 and 650
  - $zNorth Carolina$zRaleigh
  - $aRaleigh (N.C.)
- BT/NT/RT relationships from authority file lacking
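A date hierarchy of the kind missing from $y could in principle be derived from the years that appear in a heading. The sketch below is only an illustration of that idea, not something the slides say NCSU built: it extracts four-digit years and maps them to century labels. Text-only periods such as "Middle Ages" or "Ordovician" carry no years and would still need a lookup table.

```python
import re


def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 19 -> '19th'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


def centuries_for(era_heading):
    """List the century labels spanned by the years in a $y heading."""
    years = [int(y) for y in re.findall(r"\b(\d{4})\b", era_heading)]
    if not years:
        return []  # text-only era: nothing can be derived
    first = (min(years) - 1) // 100 + 1   # e.g. 1861 -> 19
    last = (max(years) - 1) // 100 + 1
    return [ordinal(c) + " century" for c in range(first, last + 1)]
```

Under this scheme a heading such as "Civil War, 1861-1865" would be assigned to "19th century", so a search limited to that century could subsume it.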

Some potential solutions
- Search behavior education
- FAST (Faceted Application of Subject Terminology)
- Web2 x-refs to redirect searches to Endeca
- Combining $z hierarchies
- Hierarchy lists

What do our users think?

“The new Endeca system is incredible. It would be difficult to exaggerate how much better it is than our old online card catalog (and therefore that of most other universities). I've found myself searching the catalog just for fun, whereas before it was a chore to find what I needed.” - NCSU Undergrad, Statistics “The new library catalog search features are a big improvement over the old system. Not only is the search extremely fast, but seemingly it's much more intelligent as well.” - NCSU faculty, Psychology

Usability testing

Usage statistics

Newness wearing off? March ‘06 - May ‘06 July ‘06-January ‘07

July 06 – Jan 07

Where are we going from here?

Future directions
- Additional hierarchies (geographic names, dates)
- Make use of NAF, SAF, particularly cross-reference structure
- Massage underlying metadata
  - Addition of Date Cataloged – Done!
  - Addition of LC Class numbers to e-resources – Done!
  - FRBR work numbers/records? – Tested!
  - FAST headings?
- Accommodation of true browse for all indexes

Future opportunities
- Expanding the scope of the implementation to the 10M records in TRLN (Duke, NCCU, NCSU, UNC-Chapel Hill)
- Enrich catalog through external web services: book jackets, reviews, TOC, etc. – Amazon, OCLC, LibraryThing, Bowker Syndetics
- Build use-case based cross-application shopping cart functionality
- Integrate catalog w/other tools through web services—“Free the Data”

Web services…

Mobile device searching

Where is everybody else going?
- Catalogs detaching themselves from ILS
  - Detached data lends itself to experimentation
  - Don’t have to throw out baby with bathwater when better interfaces come out
  - Data itself safe and secure in ILS
- MARC becoming superfluous; MARC’s granularity NOT!
- Social interaction: reviews, folksonomic tags, ratings

Phoenix Public Library on Endeca

III’s new faceted catalog, Encore

ExLibris Primo at Vanderbilt

Athens County, OH—Koha Zoom open source

Georgia PINES—Evergreen open source

Casey Bisson’s Scriblio

Danbury Public powered by LibraryThing

OCLC WorldCat Local at UW

Thanks for listening! Charley Pennell Principal Cataloger for Metadata NCSU Libraries North Carolina State University Raleigh, NC More info at: