Transparency in Discovery: Issues and progress in the ecosystem of Index-based Discovery
Marshall Breeding
April 7, 2014
Description
Breeding, co-chair of the NISO Open Discovery Initiative, describes the general landscape of library resource discovery products, the trend toward web-scale, index-based services, and some of the issues that sparked this initiative to bring increased transparency and other improvements to the ecosystem involving libraries, content providers, and discovery service creators.
Online Catalog
Scope of Search: Books, Journals, and Media at the Title Level
Not in scope:
–Articles
–Book Chapters
–Digital objects
[Diagram: Search, Search Results, ILS Data]
Index-based Discovery (2009-present)
[Diagram: Search, Search Results, Consolidated Index, Pre-built harvesting and indexing]
Content sources shown: ILS Data, Digital Collections, Web Site Content, Institutional Repositories, Aggregated Content packages, Open Access, E-Journals, Reference Sources, …
Also shown: Usage-generated Data, Customer Profile
Discovery Service Installations
[Chart: Product, Installed]
EBSCO EDS
Primo
AquaBrowser
Encore
LS2 PAC
Summon
Enterprise
Civica Sorcer
Axiell Arena
Chamo
Discovery Concerns
Important space for libraries and publishers
Discovery brings value to library collections
Discovery brings uncertainty to publishers
Uneven participation diminishes impact
Ecosystem dominated by private agreements
Complexity and uncertainty pose barriers to participation
Heterogeneous Representations
Content objects represented by
–MARC Records for books and journal titles
–Citation data for articles
–Full text for articles
–Full text for books
–Abstracts and Indexing data: controlled vocabularies, related terms, abstracts, selected index terms produced by subject experts
–Other metadata or enrichment
Discovery index issues
Indexing full-text enables keyword-based relevancy
Citations or structured metadata provide basic terms to support search & retrieval and faceted navigation
A&I terms provide access points, relevancy indicators that cannot be reproduced algorithmically
Important to understand what is indexed
–Currency, dates covered, full-text or citation
–Many other factors
Library Perspective
Strategic investments in subscriptions
Strategic investments in Discovery Solutions to provide access to their collections
Expect comprehensive representation of resources in discovery indexes
–Problem with access to resources not represented in index
–Encourage all publishers to participate and to lower thresholds of technical involvement and clarify the business rules associated with involvement
Need to be able to evaluate the coverage and performance of competing index-based discovery products
Collection Coverage?
To work effectively, discovery services need to cover comprehensively and evenly the body of content represented in library collections
What primary publishers participate?
What secondary or A&I publishers participate?
Is content indexed at the citation or full-text level?
What are the restrictions for non-authenticated users?
How can libraries understand the differences in coverage among competing services?
Web-scale search problem
Problem: how to deal with resources not provided for ingestion into the consolidated index
[Diagram: Search, Search Results, Consolidated Index, Pre-built harvesting and indexing; participating sources: Digital Collections, Web Site Content, Institutional Repositories, E-Journals, ILS Data, Aggregated Content packages, …; Non Participating Content Sources marked "?"]
Representation of A&I
Important to understand how a discovery service incorporates A&I resources
–Does it receive content from the A&I provider directly and make use of value-added terminology?
–If not: citations or full-text indexing of some portion of the titles represented in the A&I product
–NOT the same, and possibly misleading
Evaluating the Coverage of Index-based Discovery Services
Intense competition: how well the index covers the body of scholarly content stands as a key differentiator
Difficult to evaluate based on numbers of items indexed alone.
Important to ascertain how your library’s content packages are represented by the discovery service.
Important to know what items are indexed by citation, which are full text, and how A&I content is handled
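One way a library might approach the evaluation described above is to tally, for each of its content packages, what share of titles a discovery index represents at each indexing level. The sketch below is illustrative only: the level names, the `coverage_by_package` function, and the sample holdings are all assumptions made for this example, not part of any vendor's or ODI's tooling, and real coverage data would have to come from the disclosures of the discovery service and content providers.

```python
from collections import Counter, defaultdict

# Hypothetical indexing levels for this sketch: "a_and_i" stands for records
# supplied directly by an A&I provider, "none" for titles absent from the index.
LEVELS = ("full_text", "citation", "a_and_i", "none")


def coverage_by_package(holdings):
    """Return, per content package, the share of titles at each indexing level.

    `holdings` is an iterable of (package, title, level) tuples.
    """
    counts = defaultdict(Counter)
    for package, _title, level in holdings:
        if level not in LEVELS:
            raise ValueError(f"unknown indexing level: {level!r}")
        counts[package][level] += 1

    report = {}
    for package, by_level in counts.items():
        total = sum(by_level.values())
        report[package] = {level: by_level[level] / total for level in LEVELS}
    return report


# Made-up sample data, for illustration only.
sample = [
    ("Package A", "Journal 1", "full_text"),
    ("Package A", "Journal 2", "citation"),
    ("Package A", "Journal 3", "none"),
    ("Package A", "Journal 4", "full_text"),
    ("Package B", "Index 1", "a_and_i"),
    ("Package B", "Index 2", "citation"),
]

report = coverage_by_package(sample)
for package, shares in sorted(report.items()):
    line = ", ".join(f"{level}: {share:.0%}" for level, share in shares.items())
    print(f"{package}: {line}")
```

Reporting shares per package rather than a single item count reflects the point that raw numbers of indexed items alone are hard to compare across services.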
Some Key Areas for Publishers
1. Expose content appropriately
2. Trust that access to material will be controlled consistent with subscription terms
3. “Fair” Linking
4. Materials not disadvantaged or underrepresented in library discovery implementations
5. Usage reporting
Library Technology Reports
The Current State of Library Resource Discovery Products: Context, Library Perspectives, and Vendor Positions
In press for publication January 2014
LTR Components
Vendor questionnaire
Library Survey
Industry announcements
Other articles and publications
Library Discovery Survey
Survey executed to gather data from libraries regarding their experiences with discovery services
Responses received from 396 libraries: 29 countries represented, 252 responses from United States
Responses by library type:
Academic: 247
Consortium: 15
Government Agency: 2
Law: 7
Medical: 5
Museum: 1
National: 1
Other: 1
Public: 96
Special: 14
State: 4
Theology: 3
Overall Effectiveness
Comprehensiveness: Academic Libraries
Relevancy Effectiveness
Objectivity in Discovery
Objectivity in Discovery: Academics
Example Product rating chart
OPEN DISCOVERY INITIATIVE
ODI context
Facilitate a healthy ecosystem among discovery service providers, libraries and content providers
ODI Pre-History
June 26, 2011: Exploratory ALA Annual
July 2011: NISO expresses interest
Aug 7, 2011: Proposal drafted by participants submitted to NISO
Aug 2011: Proposal accepted by D2D
Vote of approval by NISO membership
Oct 2011: ODI launched
Feb 2012: ODI Workgroup Formed
Organization
Reports in NISO through the Discovery to Delivery (D2D) Topic Committee
Staff support from NISO through Nettie Lagace
Co-Chairs
–Jenny Walker (Ex Libris)
–Marshall Breeding (Library Consultant)
D2D Observers:
–Jeff Penka (OCLC)
–Lucy Harrison (CCLA)
ODI Timeline
Milestone                                   Target Date   Status
Appointment of working group                Dec 2011
Approval of charge and initial work plan    Mar 2012
Agreement on process and tools              Jun 2012
Completion of information gathering         Jan 2013
Completion of initial draft                 Jun 2013
Completion of final draft                   Sep 2013
Public comment                              Nov 2013
Revision and Approval                       Apr
Balance of Constituents
Libraries
–Marshall Breeding, Vanderbilt University
–Jamene Brooks-Kieffer, Kansas State University
–Laura Morse, Harvard University
–Ken Varnum, University of Michigan
–Sara Brownmiller, University of Oregon
–Lucy Harrison, College Center for Library Automation (D2D liaison/observer)
–Michele Newberry
Publishers
–Lettie Conrad, SAGE Publications
–Roger Schonfeld, ITHAKA/JSTOR/Portico
–Jeff Lang, Thomson Reuters
–Linda Beebe, American Psychological Assoc
–Aaron Wood, Alexander Street Press
Service Providers
–Jenny Walker, Ex Libris Group
–John Law, Serials Solutions
–Michael Gorrell, EBSCO Information Services
–David Lindahl, University of Rochester (XC)
–Jeff Penka, OCLC (D2D liaison/observer)
ODI Project Goals:
Identify … needs and requirements of the three stakeholder groups in this area of work.
Create recommendations and tools to streamline the process by which information providers, discovery service providers, and librarians work together to better serve libraries and their users.
Provide effective means for librarians to assess the level of participation by information providers in discovery services, to evaluate the breadth and depth of content indexed and the degree to which this content is made available to the user.
Specific deliverables
Standard vocabulary
NISO Recommended Practice:
–Data format & transfer
–Communicating content rights
–Levels of indexing, content availability
–Linking to content
–Usage statistics
–Evaluate compliance
Inform and Promote Adoption
ODI Stakeholder Survey
Collected data from Sept 11 through Oct 4, 2012
Each subgroup developed questions pertinent to its area of concern
Selected results
Libraries: do you use a discovery service?
–Yes: 74%, Planning to soon: 17%, No: 5%, Don’t know: 4%
Smallest discoverable unit:
–Component title: 9%, Article: 25%, Collective work record: 11%, All the above: 50%
Linking from A&I entry: 75 prefer linking to full text on original publisher’s server
Content providers (74)
Contribute data: Yes-All: 44%, Some: 48%, No: 8%
–Current data: 12%, Current + back files: 85%
Barriers to contributing:
–IP concerns, technology, staff resources
Challenges in delivery:
–Complicated formats: 15%, transmission of data: 18%, allocation of personnel: 23%, can’t automate: 12%, None: 20%
Issues surrounding A&I resources
Concern that A&I resources not be freely available to non-authenticated users, but only to subscribing institutions
How to “credit” A&I data that contributes to search results
–Example: Index entry produced by enhancing full-text with A&I data
Preservation of the value added by A&I in the discovery ecosystem
ODI Final Report
Issued for public comment
Comment period closed November 18,
Report Topics
Introduction
–In scope / out of scope
–Terms and definitions
Evolution of Discovery
–Related initiatives
Recommendations
General Recommendations
Create oversight group
Conformance checklist for:
–Discovery Service Providers
–Content Providers
Recommendations for Content Providers
Content providers should make items available to discovery service providers.
–Basic: Citations: specific metadata elements
–Enhanced: additional metadata + Full-text
Provide to Libraries: disclosure of participation in discovery services
Recommendations for Discovery Service Creators
Disclosure of content indexed
–Specific metadata fields
Fair / non-biased linking
–Mechanisms for libraries to choose versions preferred for linking
–Annual statement regarding neutrality of linking or relevance
–Provide links to A&I services when applicable
Usage statistics to Publishers
Current work / Next Steps
Finalize document based on comments from ODI members
Submit for final approval by NISO D2D
Hopefully finished by the end of April
Connect with ODI
ODI Project website:
Interest group mailing list:
ODI:
Discovery Service Trends
Progress in cooperation between content providers and discovery
–ProQuest announces deals with OCLC and Ex Libris
Technical development of discovery services continues
–Improved methods for relevancy and tools for exploring library resources
Convergence of Discovery with new Library Services Platforms
Convergence
Discovery and Management solutions will increasingly be implemented as matched sets
–Ex Libris: Primo / Alma
–Serials Solutions: Summon / Intota
–OCLC: WorldCat Local / WorldShare Platform
–Except: Kuali OLE, EBSCO Discovery Service
Both depend on an ecosystem of interrelated knowledge bases
APIs exposed to mix and match, but efficiencies and synergies are lost
QUESTIONS?