Deep Indexing: Harnessing the Power of Data Discovery Mark Hyer VP, Secondary Publishing.

Presentation transcript:


Agenda
– Attempts to Search Tables and Figures Today
– Deep Indexing: A Quick Overview
– Deep Indexing Projects
– Questions for the Group

Search and Retrieval
Explosion of content available through:
–Abstracting and Indexing
–Full Text
–Dissertations
–Government Documents (e.g., insect-resistant corn statistics in South Dakota)
–Special Collections
–Social Networks
–Open Web…

Search and Retrieval: Bad News for Researchers, Libraries, Information Centers
● Confusion
● Un-scholarly citing
● Irrelevant/meaningless
● Value erosion

Before the Article
– Months in the field or lab, resulting in images, tables, and graphs
– Aggregation of information from other articles
– Figures are often used to assess relevance
– A highly non-linear visual environment

The Finished Article

Article-Level Indexing
– Not interested
– Abstract and title: the basic indexing
– OK, but…
– We cannot search this

Jens Vigen (CERN: European Organization for Nuclear Research, since 1954), HEP Information, the Bloomsbury Conference
"Imagine your ideal scientific information system in five years… Which additional features will be important?"
[Chart; surviving label fragment: "…conference slides to articles)"; surviving values: 74%, 68%, 70%]

Deep Indexing: A New Approach to Searching Scholarly Literature

ProQuest Findings:
–Figures and tables represent the distilled essence of research: the closest thing to raw datasets
–Tables and figures are often invisible to traditional article-level and full-text searching
  ● False hits when searching for specific tables and figures
  ● Large result sets to browse
–Establish the relevance of the related data and findings: direct links to the document after qualification

What is Deep Indexing?
–Identification of tables and figures (images, charts, maps, etc.) found within a document
–Extraction of critical data and information within and surrounding each table or figure (including the full caption) to provide indexing
Deep indexing of data summaries makes objects visible and retrievable.
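The two steps above (identify the objects, then index the text around them) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the ProQuest/CSA process: it assumes articles are available as plain text, that captions sit on their own line starting with "Figure N." or "Table N:", and that indexing means a simple inverted index over caption words. The names `IndexedObject`, `identify_objects`, `build_index` and `search` are hypothetical.

```python
import re
from collections import defaultdict
from dataclasses import dataclass

# Assumed caption convention: a line starting with "Figure 3." or "Table 2:".
CAPTION_RE = re.compile(r"^(Figure|Fig\.|Table)\s+(\d+)[.:]\s*(.+)$", re.MULTILINE)


@dataclass(frozen=True)
class IndexedObject:
    """One table or figure found in an article (hypothetical record)."""
    article_id: str
    kind: str       # "figure" or "table"
    number: int
    caption: str


def identify_objects(article_id, text):
    """Step 1: identify tables and figures by their caption lines."""
    objects = []
    for label, number, caption in CAPTION_RE.findall(text):
        kind = "table" if label == "Table" else "figure"
        objects.append(IndexedObject(article_id, kind, int(number), caption.strip()))
    return objects


def build_index(objects):
    """Step 2: index each object under every word of its caption."""
    index = defaultdict(set)
    for obj in objects:
        for term in re.findall(r"[a-z0-9]+", obj.caption.lower()):
            index[term].add(obj)
    return index


def search(index, query):
    """Return the objects whose caption contains every query word."""
    terms = re.findall(r"[a-z0-9]+", query.lower())
    if not terms:
        return set()
    hits = set(index.get(terms[0], set()))
    for term in terms[1:]:
        hits &= index.get(term, set())
    return hits


# Invented sample text, for illustration only.
sample = (
    "Intro text mentions corn.\n"
    "Figure 1. Corn yield by county in South Dakota\n"
    "More prose.\n"
    "Table 2: Insect resistance rates\n"
)
objects = identify_objects("art-1", sample)
index = build_index(objects)
```

Because only caption text is indexed, a query such as `search(index, "corn yield")` returns the figure itself rather than every article that mentions corn somewhere in its body, which is the "false hits" problem the previous slide describes.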

Deep Indexing: A New Approach to Searching Scholarly Literature

BioText Search Engine
The system indexes all open access articles available at PubMed Central. New articles are indexed daily. The current collection consists of more than 300 journals, 40,000 articles, 100,000 figures, and 60,000 tables.

Google Images
Needs “labelling” to retrieve anything sensible:
– Make sure the page topic corresponds to the search term
– Make the image a standard size (such as 800 x 600)
– Save the image in .jpg or .gif format (not necessary, but recommended)
– Use the search term in the alt attribute and the title attribute, and make sure that you have included the width and height declarations
Or use their new “labeller”: You'll be randomly paired with a partner who's online and using the feature. Over a 90-second period, you and your partner will be shown the same set of images and asked to provide as many labels as possible to describe each image you see. When your label matches your partner's label, you'll earn some points and move on to the next image until time runs out.
Does not search within the tables, figures, and other images, only in the labelling surrounding them.

CSA Illustrata: First commercial deep indexing project
CSA Illustrata makes it possible to search indexed:
– Graphs
– Illustrations
– Maps
– Photographs
– Tables
– Transmission/emission images

Deep Indexing

Contextual.....

Relevant.....

And we asked the researchers…
"Overwhelmingly, respondents said the ability to search for specific types of objects would make a difference in their search and discovery processes… save time… work more efficiently… aid in presentations… find more relevant results."
Tenopir, C., & Sandusky, R.J. (2006). The Value of CSA Deep Indexing for Researchers - Draft Final Report.

New Object Types: Technology
Graph Types: Control Chart; Diffraction Pattern; Gantt Chart; Phase Diagram
Diagram Types: Pole Figure; Spectrum
Table Types: Truth Table
Illustration Types: Architectural Design or Blueprint; Floor Plan; Mechanical Design or Blueprint; Organizational Chart; Schematic – Circuit Diagram

Questions for the Group
What are the most important sources for objects that may be hidden?
–Government Users
–Research Community at Large
If you start a deep indexing project, what resources will you need?
Document Acquisition

[Workflow diagram: Creation of the Tables & Figures Index. Surviving labels, grouped by stage:]
Article Acquisition: XML or variant; PDF text; PDF image; Hardcopy; Scan; OCR
Image Processing: Automated Image Extraction; Manual Image Zoning
Indexing: Machine-Assisted Indexing (Subject, Taxonomic, Geographic, Statistical); Manual Indexing; Indexing Review

Deep Indexing of Objects
Categorization
– What “type” of object is represented: a map, a graph?
Addition of Descriptors
– Terms from the dependent and independent variables
– What are the units?
– Subject, taxonomic, geographic terms from caption and article text
– What statistical techniques are used for analysis?
Example categories:
…
Figure – Graph – 3D Surface Plot
Figure – Graph – Histogram/Bar Chart
Figure – Illustration – Gene/Protein – Maps & Sequences
Figure – Illustration – Molecular Structure
Figure – Map – Bathymetric Map
Figure – Map – Topographic Map
Figure – Photograph – Satellite Image
…
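One way to picture the categorization and descriptors above is as a small record per object: a hierarchical category plus a bag of descriptor fields. The Python sketch below is an assumption-laden illustration, not the CSA Illustrata schema. The `ObjectRecord` class, the helper names, the descriptor keys, and all sample values are hypothetical; only the category labels are taken from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectRecord:
    """One deep-indexed object (hypothetical structure)."""
    object_id: str
    category: tuple                      # e.g. ("Figure", "Map", "Topographic Map")
    descriptors: dict = field(default_factory=dict)


def parse_category(label):
    """Split a label such as 'Figure – Graph – Histogram/Bar Chart' into levels."""
    return tuple(part.strip() for part in label.split("–") if part.strip())


def in_category(record, *prefix):
    """True when the record's category starts with the given levels."""
    return record.category[:len(prefix)] == prefix


# Descriptor values below are invented for illustration.
records = [
    ObjectRecord("obj-1", parse_category("Figure – Map – Bathymetric Map"),
                 {"units": ["m"], "geographic": ["North Sea"]}),
    ObjectRecord("obj-2", parse_category("Figure – Map – Topographic Map"),
                 {"units": ["m"], "geographic": ["South Dakota"]}),
    ObjectRecord("obj-3", parse_category("Figure – Graph – Histogram/Bar Chart"),
                 {"variables": ["yield", "year"], "statistics": ["ANOVA"]}),
]

# Retrieve every map, whatever its subtype.
maps = [r.object_id for r in records if in_category(r, "Figure", "Map")]
```

Storing the category as a tuple of levels lets a search match at any depth: `in_category(r, "Figure")` finds all figures, while adding `"Map"` narrows the result to maps only.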

Deep Indexing of Research Material
Selection of other relevant data
– May help the researcher filter search results
– Includes:
  ● Presence of a predictive model
  ● Indication of data display in color
  ● Number of figure panels
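The three extra fields listed above behave like search facets. A minimal Python sketch of filtering on them follows; it assumes each indexed object already carries the three values, and the class name, field names, and sample data are all hypothetical rather than taken from any ProQuest product.

```python
from dataclasses import dataclass


@dataclass
class ObjectFacets:
    """Filterable attributes of one indexed object (hypothetical fields)."""
    object_id: str
    has_predictive_model: bool
    in_color: bool
    panel_count: int


def filter_objects(objects, predictive_model=None, in_color=None, min_panels=None):
    """Keep objects matching every facet that was supplied; None means 'any'."""
    kept = []
    for obj in objects:
        if predictive_model is not None and obj.has_predictive_model != predictive_model:
            continue
        if in_color is not None and obj.in_color != in_color:
            continue
        if min_panels is not None and obj.panel_count < min_panels:
            continue
        kept.append(obj)
    return kept


# Invented sample data, for illustration only.
sample = [
    ObjectFacets("fig-1", has_predictive_model=True, in_color=True, panel_count=4),
    ObjectFacets("fig-2", has_predictive_model=False, in_color=True, panel_count=1),
    ObjectFacets("fig-3", has_predictive_model=True, in_color=False, panel_count=2),
]

multi_panel_models = filter_objects(sample, predictive_model=True, min_panels=2)
```

Leaving a facet as `None` skips it, so the same function covers a broad search and a narrow one without separate code paths.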

To Learn More
For more information:
-Library Journal Webinar
-Whitepaper
-Reviews
Stay in Touch… Network… Phone