CONCERT 2000, Taipei Adding Value to Full Text Databases: A Look at the Digital Vault and Intelligent Document Linking By: Richard Hollingsworth, Bell & Howell Information and Learning
Mission To effectively search multiple knowledge sets? Process needs to be intuitive Organization is imperative Answers need to be precise (of course!) Don’t want to create more confusion
Approaches Meta Searching Search everything at once Controlled Searching Need to begin with a ‘core’ set of data. Index Taxonomy
Meta Searching Meta Searching is the buzz Search everything at once
Difficulties in Meta Searching Not all databases are created equal Different Thesauri Different Vocabularies Different Engines Different Formats Z39.50 Is one solution Well… we all know what that means!
Example User searches on ‘Clinton and campaign finance reform’ Search Web, Biographies, Business, Health, News, OPAC, Reference Results are either organized in categories or combined.
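As a rough illustration of the example above, the sketch below sends one query to several sources and returns the hits either grouped by category or combined. The source names follow the slide, but the records, the naive any-term matching, and the function names are assumptions made for illustration, not a description of any real meta search engine.

```python
# Hypothetical stand-ins for separately built databases.
SOURCES = {
    "News": [
        "Clinton signs campaign finance reform pledge",
        "Senate debates reform bill",
    ],
    "Biographies": ["Clinton, William Jefferson", "Lincoln, Abraham"],
    "Health": ["Health care finance in rural clinics"],
}


def search_source(records, query_terms):
    """Naive match: keep records containing any query term (case-insensitive)."""
    terms = [t.lower() for t in query_terms]
    return [r for r in records if any(t in r.lower() for t in terms)]


def meta_search(query_terms, combine=False):
    """Send the same query to every source; group hits by source or merge them."""
    by_source = {
        name: search_source(records, query_terms)
        for name, records in SOURCES.items()
    }
    if combine:
        return [hit for hits in by_source.values() for hit in hits]
    return by_source


query = ["Clinton", "campaign", "finance", "reform"]
grouped = meta_search(query)
combined = meta_search(query, combine=True)
```

In this toy data the Health source returns a hit only because it shares the word "finance", which is the kind of noise a query sent to every database at once can produce.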
Results Confusing at best Why search biographies for “campaign finance reform”? Doesn’t highlight related or relevant topics and figures to the original query. Doesn’t add any intelligence into the process. Navigating from one set to another.
Summary on Meta Searching
Controlled Searching Start with a ‘known’ core Integrate relevant components around it
Example User searches on ‘Clinton and campaign finance reform’ Search General Reference, Business and News Link to OPAC, Biographies, and other reference material.
Results Core relevant list of articles to begin the research process Advantages Known query syntax Expected results output Easier navigation Simple method of joining ‘relevant’ resources Supports idea generation and thinking.
Controlled Searching Start with a ‘known’ core Integrate relevant components around it
A solution! ProQuest IDL Intelligent Document Linking Three major components Term recognition or markup Knowledgebase(s) External content sources
Markup Term recognition Sophisticated software that marks up terms in the text such as people, places, and companies. Extension of our auto-indexing software. Software is tunable. Limit to the vocabulary Remove the limits
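The tunable markup step might look something like the sketch below. It is a minimal illustration only: the vocabulary, the capitalized-word fallback pattern, and the `mark_up` function are assumptions, not the actual ProQuest auto-indexing software. With `limit_to_vocabulary=True` only known terms are marked; with the limit removed, unknown capitalized phrases are marked as well.

```python
import re

# Hypothetical controlled vocabulary mapping each term to a type.
VOCABULARY = {
    "Bill Clinton": "person",
    "Taipei": "place",
    "Bell & Howell": "company",
}

# Fallback for the "limits removed" mode: runs of two or more capitalized words.
CAPITALIZED_RUN = re.compile(r"\b(?:[A-Z][a-z]+ )+[A-Z][a-z]+\b")


def mark_up(text, limit_to_vocabulary=True):
    """Return (start, end, term, type) spans for recognized terms, in text order."""
    spans = []
    for term, kind in VOCABULARY.items():
        for match in re.finditer(re.escape(term), text):
            spans.append((match.start(), match.end(), term, kind))
    if not limit_to_vocabulary:
        covered = [(start, end) for start, end, _, _ in spans]
        for match in CAPITALIZED_RUN.finditer(text):
            # Skip candidates that overlap a term already found in the vocabulary.
            overlaps = any(s < match.end() and match.start() < e for s, e in covered)
            if not overlaps:
                spans.append((match.start(), match.end(), match.group(), "unknown"))
    return sorted(spans)


sentence = "Bill Clinton met Al Gore in Taipei."
limited = [span[2] for span in mark_up(sentence)]
unlimited = [span[2] for span in mark_up(sentence, limit_to_vocabulary=False)]
```

Limiting to the vocabulary favors precision, while removing the limit picks up names the vocabulary does not yet contain ("Al Gore" here) at the cost of more false positives.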
Knowledgebase Known list of ‘answers’ First integrated knowledgebase is The World Book Encyclopedia Easy to read Conveniently available in the ProQuest vault Great fit for current general reference and news products.
External Sources OPAC Best of Web (Index of Web Sites) Other subscription and free sites. Dictionary Maps 3rd party Database Subscriptions Others….
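Once a term is marked up, linking it could be as simple as one lookup in the knowledgebase plus one generated query per external source. The sketch below assumes a dictionary-backed knowledgebase and URL templates on placeholder `example` domains; none of these names or addresses come from ProQuest.

```python
from urllib.parse import quote_plus

# Hypothetical knowledgebase: term -> short 'answer'.
KNOWLEDGEBASE = {
    "Bill Clinton": "42nd president of the United States.",
}

# Hypothetical external sources, each a URL template with a {q} placeholder.
EXTERNAL_SOURCES = {
    "OPAC": "https://opac.example.edu/search?q={q}",
    "Dictionary": "https://dictionary.example.com/define?term={q}",
}


def build_links(term):
    """Collect the knowledgebase answer (if any) and one link per external source."""
    links = {}
    if term in KNOWLEDGEBASE:
        links["Knowledgebase"] = KNOWLEDGEBASE[term]
    for name, template in EXTERNAL_SOURCES.items():
        links[name] = template.format(q=quote_plus(term))
    return links


clinton_links = build_links("Bill Clinton")
taipei_links = build_links("Taipei")
```

Keeping the sources in a table of templates means a new OPAC or subscription site can be added without touching the markup step.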
Future Additional KnowledgeBases Health example follows Business Adding premium content sources to supplement our high quality A&I databases.
Future Additional support for external sources Help us define these! Additional markup capabilities Subjects/Concepts Products etc...
Digital Vault TM Opening the Vault on 500 Years of History
Digitizing The Vault Bell & Howell Information and Learning’s microfilm “vault” is the largest commercially available collection 20,000 periodicals, 7,000 newspapers and 400 Research Collections and 1,000,000 dissertations 3 climate controlled underground vaults Over 5.5 billion page images Using the microfilm contained in the vaults to create the largest digital collection
Opening The Vault First Digital Vault product released Early English Books Online (EEBO) Focus in 2000 Bringing a core collection of periodicals to libraries Two New Collections - Gerritsen’s Collection on Women’s History - Genealogy and Local History Digital Sanborn Maps
Early English Books Online EEBO - Digitized version of 3 “Early English Books”-related collections “Total surviving record of the English-speaking world from ” This is the first book printed in English by the famous printer, William Caxton in While printed in English it was actually printed in France. (Huntington Library)
Early English Books Online Database “transforms scholarship” of material from this era Electronic works are now used in a classroom setting - not just for graduate study Search and retrieve works instantly Covers virtually every subject area Science, mathematics, engineering, women’s studies, etc.
Periodicals from the Digital Vault TM Connecting tomorrow’s library with the past
Digital Vault - Periodicals Definition Web-access to page images from retrospective journals and magazines Cover to cover coverage - opinion and fact remain intact alongside the advertisements of the day Creation of ‘dirty ASCII’ for keyword searching plus TOC level access Seamless integration with current content in ProQuest
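To show how uncorrected ("dirty") text can still support keyword access to page images, the sketch below builds a small inverted index from OCR output to page identifiers. The page identifiers, sample text, and function names are invented for illustration and are not taken from the Digital Vault itself.

```python
import re
from collections import defaultdict

# Hypothetical uncorrected OCR text, keyed by page-image identifier.
PAGES = {
    "1895-03-p012": "The new bicycle is the wonder of the age",
    "1895-03-p013": "Advertisement: buy a bicyc1e today",  # OCR misread "l" as "1"
}


def build_index(pages):
    """Map each lowercase token to the set of page images that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(page_id)
    return index


def search(index, keyword):
    """Return the sorted page identifiers whose OCR text contains the keyword."""
    return sorted(index.get(keyword.lower(), set()))


index = build_index(PAGES)
clean_hits = search(index, "bicycle")
dirty_hits = search(index, "bicyc1e")
```

The user always sees the page image, so OCR errors never appear on screen; the cost is that a misread word, like the one on the second page here, cannot be found under its correct spelling.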
Digital Dissertations
Maintaining the Scholarly Record Capturing dissertation research in depth: From 1938 to the present, Bell & Howell Information and Learning has been the recognized repository for dissertations in North America. World-wide access to over 1.6 million citations More than 1 million titles available in full text Each year over 55,000 new titles added Retrospective North American coverage to 1861 REFERENCE COPIES ON DEMAND ARCHIVE
Creating the Digital Library Beginning with 1997 submissions, we are converting all incoming paper dissertations to Adobe PDF format. Currently there are over 170,000 full text dissertations and Master’s theses available for downloading in our digital archive. We are now accepting dissertations in digital format. Institutions can submit via CD-ROM or an FTP server.
ProQuest Digital Dissertations Access to the Dissertation Abstracts database: Visitors have free access to over 100,000 titles Library subscription to the entire 1.6 million citation database including free access to all PDF files from their school Free twenty-four page previews for all PDF files Search by both fixed-field and keyword
Adding Value to Full Text Quality of Index and Abstracts SiteBuilder Technology Intelligent Document Linking Digital Vault Initiative and relevant knowledgebase(s)