Web-scale Discovery Tools and the Backgrounding of Government Information CHRISTOPHER C. BROWN, UNIVERSITY OF DENVER, MAIN LIBRARY ELECTRONIC RESOURCES.


Web-scale Discovery Tools and the Backgrounding of Government Information CHRISTOPHER C. BROWN, UNIVERSITY OF DENVER, MAIN LIBRARY ELECTRONIC RESOURCES & LIBRARIES, AUSTIN, TX, FEB. 23, 2015

The Days of Broadcast Search
Often referred to as “federated search”, but it was not.
It was disintegrated search, and maybe you could say federated results.
Characterized by long waits (connection wait times, handshakes, mapping of search terms, merging and “deduping” of results) and messy results.
Searched across metadata only. No full text (FT) searching.

Metasearch or broadcast search was a fail
Inordinate delays
Messy “on the fly” merging
Sloppy clustering
Missing results
No FT searching

Metasearch or Broadcast Search
Innovative technology, but never lived up to expectations
Web Metasearch Examples: Metasearch, Dogpile
Library Metasearch Examples: MetaLib, Research Pro, ENCompass, Central Search/360 Search, WebFeat, AGent, iBistro

Metasearch or Broadcast Search
Was often referred to as “federated search” but it was not; at best it was federated results.
Web-scale discovery tools are the real “federated search”. But since vendors already used that term, they had to invent “discovery”.
[Diagram: Broadcast Search: Query; Information Silos; Metadata only; Merging and de-duping on-the-fly; Federated Results]
[Diagram: Webscale Discovery: Query; Metadata and sometimes FT; Federated Search and Federated Results]
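The contrast between the two architectures can be sketched in a few lines of Python. This is a toy illustration only: the silo names, the records, and the title-based de-duplication key are all hypothetical, and no vendor's actual implementation is implied. Broadcast search queries every silo at request time and merges on the fly; the discovery approach harvests and de-dupes once, then searches a single central index.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical silos (assumption): each source holds its own metadata records.
SILOS = {
    "catalog": [{"id": "a1", "title": "Migratory Bird Treaty Act hearing"}],
    "articles": [
        {"id": "b7", "title": "Migratory bird  treaty act hearing"},
        {"id": "b9", "title": "Plant rDNA database"},
    ],
}


def normalize(title):
    """Crude de-duplication key: lowercase with collapsed whitespace."""
    return " ".join(title.lower().split())


def search_silo(records, term):
    """Metadata-only match inside one silo."""
    return [r for r in records if term.lower() in normalize(r["title"])]


def broadcast_search(term):
    """Query every silo at request time, then merge and de-dupe on the fly."""
    with ThreadPoolExecutor() as pool:
        result_sets = list(pool.map(lambda recs: search_silo(recs, term), SILOS.values()))
    merged, seen = [], set()
    for results in result_sets:
        for rec in results:
            key = normalize(rec["title"])
            if key not in seen:
                seen.add(key)
                merged.append(rec)
    return merged


def build_central_index():
    """Discovery approach: harvest and de-dupe once, ahead of any query."""
    index = {}
    for records in SILOS.values():
        for rec in records:
            index.setdefault(normalize(rec["title"]), rec)
    return index


CENTRAL_INDEX = build_central_index()


def discovery_search(term):
    """Search the single pre-built index; no per-query merging is needed."""
    return [rec for key, rec in CENTRAL_INDEX.items() if term.lower() in key]
```

In the broadcast version every query pays for the slowest silo plus the merge step, which is where the long waits and messy results come from; in the central-index version that cost is paid once at harvest time.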

Breadth and Depth
Breadth refers to the number and kinds of resources covered. Google Scholar is rather narrow in breadth, covering scholarly articles, selected technical reports, and Google Books “bleedthrough”. Discovery tools are much broader in scope, covering books and ebooks, magazine articles, scholarly articles, trade publications, newsletters, newspapers, dissertations, technical reports, maps, audio-visual materials, institutional repositories, and many other sources.
Depth refers to how deeply a search tool goes down into the resource. Google (Google Web, Google Scholar, and Google Books) usually indexes the full text of nearly everything. Discovery tools vary greatly in their full text search reach. Sometimes they don’t have access to the full text, only metadata. Other times they have access but choose not to provide full text search access.

Metasearch tools were weak, but they searched rather evenly
[Diagram: Surface Searching across Magazines, Scholarly Journals, Dissertations, Newspapers, eBooks, Print Books, Gov Info]

Let’s Face It: Google Scholar Does It Better: There is Room for Improvement
[Diagram: Discovery Tools: Surface Searching (indexing: title, author, keywords, abstract) and Deep Searching (full text of content – every last word) across Gov Info, Magazines, Scholarly Journals, Dissertations, Newspapers, eBooks, Print Books]
[Diagram: Google Scholar: full text of content – every last word, across Scholarly Journals, Google Books “bleedthrough”, Misc: tech rpts, other]

Google Scholar: Setting the Bar

Discovery Tools are not very good at exposing primary sources
Not all primary sources are created equal
Archival resources: unique and interesting
Government information: arguably one of the most important primary sources – public policy affects us all.
Primary Sources Include:
Vendor primary sources (archival collections)
Institutional repositories or digital repositories
Government primary sources

Why do Discovery Tools succeed more than metasearch tools?
The Information Access Anomaly
Columns: Book (average); Journal Article (average); Google (Scholar/Books)
Typical Length - full text (FT): 200 pages x 400 = 80,000 words; 15 pages x 400 = 6,000 words
Surrogate Record (SR): words (75 ave.); words (400 ave. 1)
SR to FT ratio: 1 to 10,666; 1 to 15; 1 to
What this chart means: Very often at the Reference Desk, students will say “Why doesn’t the library have any books on my topic?” Actually, we do; it’s just that our discovery tools are weak.
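The arithmetic behind the chart can be reproduced with a short Python sketch. The function names are illustrative, not from the presentation. Only the journal-article ratio is computed here, because both of its inputs (6,000 full-text words, a 400-word surrogate record on average) are stated in the chart.

```python
WORDS_PER_PAGE = 400  # per-page figure used in the chart


def full_text_words(pages, words_per_page=WORDS_PER_PAGE):
    """Estimated full-text length in words."""
    return pages * words_per_page


def sr_to_ft_ratio(sr_words, ft_words):
    """Full-text words standing behind each surrogate-record word."""
    return ft_words / sr_words


book_ft = full_text_words(200)      # 200 pages x 400 words
article_ft = full_text_words(15)    # 15 pages x 400 words
article_ratio = sr_to_ft_ratio(400, article_ft)

print(f"Book full text: {book_ft:,} words")
print(f"Article full text: {article_ft:,} words")
print(f"Article SR to FT ratio: 1 to {article_ratio:g}")
```

The larger the ratio, the more of an item's content is invisible to a search that sees only the surrogate record, which is why books suffer most under metadata-only indexing.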

Relevance vs. Discovery
Metadata: Higher Relevance
Full Text: Higher Discovery
Full text indexing within Discovery Tools is uneven, maybe even erratic. Efforts should be made to increase full text presence within their indexing.
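A toy inverted index makes the trade-off concrete. The sketch below is a minimal illustration with two invented records (assumptions, not real documents): a metadata-only index returns fewer, more on-topic hits, while a full-text index also finds items where the term appears only deep in the content.

```python
import re

# Invented sample records (assumption): a short surrogate plus a longer full text.
DOCS = {
    "doc1": {
        "metadata": "Report on river restoration",
        "full_text": "Report on river restoration. Witnesses discussed salmon habitat and dam removal.",
    },
    "doc2": {
        "metadata": "Salmon fisheries handbook",
        "full_text": "Salmon fisheries handbook. Covers licensing and seasons.",
    },
}


def tokenize(text):
    """Lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs, field):
    """Inverted index: word -> set of record ids, built from one field."""
    index = {}
    for doc_id, doc in docs.items():
        for word in tokenize(doc[field]):
            index.setdefault(word, set()).add(doc_id)
    return index


def search(index, query):
    """Ids of records containing every query word (boolean AND)."""
    words = tokenize(query)
    if not words:
        return set()
    return set.intersection(*(index.get(w, set()) for w in words))


METADATA_INDEX = build_index(DOCS, "metadata")
FULL_TEXT_INDEX = build_index(DOCS, "full_text")

print(search(METADATA_INDEX, "salmon"))        # only the record that is about salmon
print(search(FULL_TEXT_INDEX, "salmon"))       # also the record that merely mentions it
print(search(FULL_TEXT_INDEX, "dam removal"))  # found only through full text
```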

Unique Library of Congress Trials A unique opportunity in 2012

Testing Discovery Systems
Select item to be tested (scholarly article, ebook, government publication, etc.)
Test each of the four discovery tools to ensure metadata is present
Test each of the four to see if they can retrieve full text (test both with and without quotes around text)
Text should not contain gremlin characters or cross lines (line breaks)
Use Google as a control (Google Web, Google Scholar, Google Books)
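The phrase-selection rules above can be checked mechanically before a test string is pasted into each tool. The following Python sketch is one possible reading, not the presenter's procedure: it assumes "gremlin characters" means anything outside printable ASCII (curly quotes, ligatures, accented letters, soft hyphens), and both function names are hypothetical.

```python
def is_clean_test_phrase(text):
    """True if the phrase sits on one line and uses only printable ASCII.

    Assumption: any non-ASCII or control character counts as a "gremlin".
    """
    if "\n" in text or "\r" in text:
        return False
    return all(32 <= ord(ch) < 127 for ch in text)


def build_queries(text):
    """Return the two query forms to try: with and without quotes."""
    phrase = " ".join(text.split())  # collapse stray whitespace
    return {"quoted": f'"{phrase}"', "unquoted": phrase}


candidate = "night of drinking. In this case, it is not the time spent"
if is_clean_test_phrase(candidate):
    for form, query in build_queries(candidate).items():
        print(form, "->", query)
```

A phrase such as an author name with diacritics would be rejected by this check, which matches the intent of avoiding strings that different indexes may normalize differently.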

Test 1
Citation: Wood, Phillip K., Kenneth J. Sher, and Patricia C. Rutledge. "College student alcohol consumption, day of the week, and class schedule." Alcoholism: Clinical and Experimental Research 31, no. 7 (2007):
Text 1: drinking but based instead on the sum of the number of
Text 2: night of drinking. In this case, it is not the time spent
Results (columns: Test, G. Scholar, Summon, EDS, Primo, WC Disc.):
Citation: Yes
Text 1: Yes No
Text 2: Yes No

Test 2
Citation: Garcia, Sònia, Teresa Garnatje, and Aleš Kovařík. "Plant rDNA database: ribosomal DNA loci information goes online." Chromosoma 121, no. 4 (2012):
Text 1: "FISH data is stored prior to its publication, without being"
Text 2: "outline of how to work with the database. The Simple"
Results (columns: Test, G. Scholar, Summon, EDS, Primo, WC Disc.):
Citation: Yes
Text 1: Yes No
Text 2: Yes No

Test 3
Citation: Pound, Pandora, Shah Ebrahim, Peter Sandercock, Michael B. Bracken, and Ian Roberts. "Where is the evidence that animal research benefits humans?." BMJ 328, no (2004):
Text 1: An unpublished study by Ciccone and Candelise
Text 2: Moreover, if animal experiments fail to inform medical
Text 3: single consultations. PP and SE applied (unsuccessfully) to the
Results (columns: Test, G. Scholar, Summon, EDS, Primo, WC Disc.):
Citation: Yes
Text 1: Yes No Yes No
Text 2: Yes No Yes No

Test 4
Citation: Tartter, Molly A., and Lara A. Ray. "A prospective study of stress and alcohol craving in heavy drinkers." Pharmacology Biochemistry and Behavior 101, no. 4 (2012):
Text 1: "Participants were contacted for an on-line follow-up at 6 and 12 months after evaluation in the laboratory"
Text 2: "The ACQ was modified to encompass the four quantitative indices of alcohol use recommended"
Results (columns: Test, G. Scholar, Summon, EDS, Primo, WC Disc.):
Citation: Yes
Text 1: Yes No Yes No
Text 2: Yes No Yes No

Test 5
Citation: Muggah, Robert, and Keith Krause. "Closing the gap between peace operations and post-conflict insecurity: towards a violence reduction agenda." International Peacekeeping 16, no. 1 (2009):
Text 1: "armed violence prevention and reduction programmes that draw upon"
Text 2: "before, during, and after wars come to a close. Armed violence does"
Results (columns: Test, G. Scholar, Summon, EDS, Primo, WC Disc.):
Citation: Yes
Text 1: Yes No
Text 2: Yes No Yes No

Testing for Government Information
Choice of benchmark: FDsys -

Sources for Government Information
The Government Publishing Office has these freely available tools:
Catalog of Government Publications (CGP) – library catalog, metadata only
MetaLib – metasearch tool, searches metadata only -
FDsys – repository, searches metadata and full text -
◦FDsys searches full text by default.
◦In 2011 FDsys officially replaced GPO Access as the official repository for Legislative, Executive, and Judicial Branch documents that GPO hosts.
◦FDsys was engineered from the ground up to bring quick, faceted searching for discovery of government information.
◦FDsys content is available for anyone to download and for third-party vendors to utilize.

Catalog of Government Publications: GPO’s Online Catalog

MetaLib: the GPO Metasearch Platform
MetaLib is a broadcast search for government information. No FT searching, only metadata.
Database: status, hits
Access to Archival Databases (AAD) System - NARA: DONE, 9930
AGRICOLA Books: DONE, 11
AUL Index to Military Periodicals: DONE, 0
Catalog of U.S. Government Publications (CGP): DONE, 88
Education Resources Information Center (ERIC): FETCHING, 10000
EPA Publications and Newsletters: DONE, 10
Federal Digital System (FDsys): DONE
Library of Congress (LOC): DONE, 12830
PubMed: DONE, 11

FDsys – Default FT Searching of Official Content
Although some of this content does not lend itself to inclusion in Discovery Tools, some of these subsets are essential for students to work with public policy, legislative research, and history.

Priorities of what Full Text Government Information to Include in Discovery Tools
Congressional Reports – legislative intent (1995-present)
Congressional Hearings – social history and policy making (1995-present)
Congressional Documents – budget, treaties, special reports, and legislation-related (1995-present)
Public and Private Laws (1995-present)
Compilation of Presidential Documents (1993-present)
Public Papers of the Presidents ( )
Economic Report of the President (1995-present)

Summon has govdocs – but their own PQ Congressional content, not FDsys

Congressional Hearings and Congressional Reports
Hearings are important because they can provide valuable background information including statistics, scientific research, social implications, and stakeholder perspectives.
Reports are even more important than hearings in that they can show legislative intent – why a bill is needed. They can provide majority and minority perspectives and section-by-section explication of bills.

Test cases – Congressional Hearings
Hearings are important because they can provide valuable background information including statistics, scientific research, social implications, and stakeholder perspectives.

Gov Test 1 – Congressional Hearings
Citation: United States Exotic bird species and the Migratory Bird Treaty Act: oversight field hearing before the Subcommittee on Fisheries Conservation, Wildlife and Oceans of the Committee on Resources, U.S. House of Representatives, One Hundred Eighth Congress, first session, Tuesday, December 16, 2003, in Annapolis, Maryland. Washington: U.S. G.P.O.
Text 1: "change to the MBTA that would make clear that invasive birds are not protected"
Text 2: "it is going into the Everglades system, in and around the"
Results (columns: Test, Google, FDsys, Summon, EDS, Primo, WC Disc.):
Citation: Yes; Cat rec only
Text 1: Yes No
Text 2: Yes No

Test cases – Congressional Reports
Reports are even more important than hearings in that they can show legislative intent – why a bill is needed. They can provide majority and minority perspectives and section-by-section explication of bills.

Gov Test 2 – Congressional Report
Citation: United States Authorizing the designation of national environmental research parks by the Secretary of Energy, and for other purposes report (to accompany H.R. 2729) (including cost estimate of the Congressional Budget Office). Washington, D.C.: U.S. G.P.O.
Text 1: "Biggert for her co-sponsorship and Ranking Member Hall for his"
Text 2: "Chair GORDON. Let me again in closing say that just because we"
Results (columns: Test, Google, FDsys, Summon, EDS, Primo, WC Disc.):
Citation: Yes; Cat rec only
Text 1: Yes No
Text 2: Yes No

Gov Test 3: Compilation of Presidential Documents
Citation: Remarks on the Patient Protection and Affordable Care Act, April 1, Compilation of Presidential Documents pdf pdf
Text 1: "health insurance who didn't just a few years ago, and that's something to be proud of"
Text 2: "understand. I've got to admit, I don't get it. Why are folks working so hard for people not"
Results (columns: Test, Google, FDsys, Summon, EDS, Primo, WC Disc.):
Citation: Yes; Cat rec only; No
Text 1: Yes; Yes*; No
Text 2: Yes; Yes*; No
*Not the freely available FDsys content.

Conclusions
Government information, when it is available through a Discovery Tool, is available through licensed content, not freely available sources.
Government information should not be backgrounded, buried with the many other content types.
Since GPO will hand over FDsys content to any vendor that wants it, vendors need to figure out how to acquire it, even if it means developing a new ingest method.

What can be done?
Ask your vendor
Ask GPO

A Charge to Vendors
Keep working on relevance ranking
Add more full text to your index (include all types of content: scholarly articles, magazine articles, books and ebooks)
Work with GPO to acquire FDsys metadata and full text
The vendor that figures this out first will have a competitive edge.

Questions? Christopher C. Brown, Reference Technology Integration Librarian; Government Documents Librarian University of Denver, Main Library - (303)