Seamless Searching of Numeric and Textual Resources Funded by a National Library Leadership Grant from the Institute of Museum and Library Services

Presentation transcript:

Seamless Searching of Numeric and Textual Resources
Funded by a National Library Leadership Grant from the Institute of Museum and Library Services
Michael Buckland, Aitao Chen, Fredric Gey and Ray Larson
Friday Afternoon Seminar, Feb 14,

From numbers to texts: Iritani, Evelyn. "Normalizing ties to Vietnam important steps for U.S. firms; California stands to profit handsomely when barriers fall to trade with fast-growing country." Los Angeles Times v114 (July 12, 1995):D1. An article found using the keywords “Import” and “Vietnam” as the query.

From text to numbers: "U.S. bans import of most European meat". Los Angeles Times v116, n14 (Dec 14, 1997):A22. (On fear of mad cow disease.) "Ban on cattle and sheep is extended to all Europe." New York Times v147, sec1 (Dec 14, 1997):16(N), 42(L). (The U.S. Agriculture Department responds to threat of 'Mad Cow' disease.) Topic of interest: imports of beef to the United States from Britain. The sources at http://govinfo.kerr.orst.edu/import/import.html show: no reported edible beef imports from the United Kingdom.

Seamless Search Project Goals:
Phase I: The development and demonstration of a library gateway that supports searching both text and socio-economic numeric databases.
Phase II: The demonstration of a library gateway supporting searches between text and numeric databases.

Data Sets to create Entry Vocabulary Indexes: MELVYL MARC Files
Number of MARC records in the training data set: ~4,246,000.
A sample training record extracted from a MARC record:
Book title: A study of operant conditioning under delayed reinforcement in early infancy
LC Subject Headings: Infant psychology. Operant conditioning.

Statistical association of title words and LCSH
[Diagram: three columns labeled Title Words, Doc IDs and LCSHs. Title words: behavior, infant, infancy, psychology, child, attitude, baby, development. Doc IDs: doc1, doc2, doc3, doc4, doc5. LCSHs: Infant psychology; Operant conditioning; Infant development; Psychology; Parent and child.]
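The association between title words and LCSHs could be computed along the following lines. This is only a minimal sketch under stated assumptions: the slides do not name the statistic used, so Dunning's log-likelihood ratio (G²) over a 2x2 co-occurrence table is assumed here, and the function names (`build_evi`, `log_likelihood`) and all training records except the first are invented for illustration.

```python
import math
from collections import Counter, defaultdict

STOPWORDS = {"a", "an", "and", "in", "of", "the", "under"}

# Toy training records (title, LCSHs). Only the first record comes from the
# slides; the others are invented for illustration.
RECORDS = [
    ("A study of operant conditioning under delayed reinforcement in early infancy",
     ["Infant psychology", "Operant conditioning"]),
    ("Infant behavior and development", ["Infant psychology", "Infant development"]),
    ("Psychology of the baby", ["Infant psychology", "Psychology"]),
    ("Parent and child attitude", ["Parent and child"]),
    ("Child psychology", ["Psychology"]),
]


def title_words(title):
    """Lowercased set of title words with a few stopwords removed."""
    return {w for w in title.lower().split() if w not in STOPWORDS}


def _xlogx(k):
    """k * ln(k), with the convention 0 * ln(0) = 0."""
    return k * math.log(k) if k > 0 else 0.0


def log_likelihood(k11, k12, k21, k22):
    """Dunning's G^2 for a 2x2 table (assumed statistic, not from the slides)."""
    n = k11 + k12 + k21 + k22
    cells = _xlogx(k11) + _xlogx(k12) + _xlogx(k21) + _xlogx(k22)
    rows = _xlogx(k11 + k12) + _xlogx(k21 + k22)
    cols = _xlogx(k11 + k21) + _xlogx(k12 + k22)
    return 2.0 * (cells + _xlogx(n) - rows - cols)


def build_evi(records):
    """Map each title word to a list of (LCSH, weight), strongest first."""
    n = len(records)
    word_count, heading_count, pair_count = Counter(), Counter(), Counter()
    for title, headings in records:
        words = title_words(title)
        word_count.update(words)
        heading_count.update(set(headings))
        pair_count.update((w, h) for w in words for h in set(headings))

    evi = defaultdict(list)
    for (word, heading), k11 in pair_count.items():
        k12 = word_count[word] - k11        # word present, heading absent
        k21 = heading_count[heading] - k11  # heading present, word absent
        k22 = n - k11 - k12 - k21           # neither present
        # G^2 is symmetric, so keep only pairs that co-occur more than chance.
        if k11 * n <= word_count[word] * heading_count[heading]:
            continue
        evi[word].append((heading, log_likelihood(k11, k12, k21, k22)))
    for word in evi:
        evi[word].sort(key=lambda pair: -pair[1])
    return dict(evi)
```

Calling `build_evi(RECORDS)["operant"]` on this toy data ranks "Operant conditioning" above "Infant psychology", because the word and the first heading occur together in exactly one record and nowhere else.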

Word to LCSH Entry Vocabulary Index (EVI)
List of the LCSHs that are most closely associated, statistically, with the query word: alcoholism.
[Table columns: Rank, LCSH, Weight. LCSHs in rank order from 1: alcoholism alcoholic alcohol alcoholism and employment drug abuse alcohol, ethyl drinking of alcoholic beverages substance abuse]

Words to LCSH Entry Vocabulary Index (EVI)
List of LCSHs that are most closely associated, statistically, with the German query word: Wirtschaftspolitik.
[Table columns: Rank, LCSH, Weight. LCSHs in rank order from 1: economic policy german (west) switzerland regional planning economics 92.14]
Note: The top-ranked LCSH “economic policy” happens to be the English translation of the German word “Wirtschaftspolitik”.

Words to LCSH Entry Vocabulary Index (EVI)
List of LCSHs that are most closely associated, statistically, with the phrase peanut butter as a query.
[Table columns: Rank, LCSH, Weight. LCSHs in rank order from 1: peanut cookery (peanut butter) cookery (peanuts) peanut industry peanut butter butter schulz, charles m cookery]

Word to LCSH Entry Vocabulary Index (EVI)
List of LCSHs that are most closely associated with the query: Vietnam War.
[Table columns: Rank, LCSH, Weight. LCSHs in rank order from 1: world war, vietnamese conflict, united states world war, vietnam]
Note: “Vietnam War” is not an established (authorized) LCSH. The established LCSH is “Vietnamese conflict”.
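The word-to-LCSH lookups on these slides, including the phrase query "peanut butter", might be sketched as follows. This is an illustration under assumptions rather than the project's method: the slides do not say how weights for several query words are combined, so simple summation is assumed, the name `suggest_headings` is hypothetical, and the weights in `TOY_EVI` are invented placeholders, not values from the slides.

```python
from collections import defaultdict

# Hypothetical index: word -> {LCSH: weight}. The weights are invented
# placeholders and do not reproduce any values from the slides.
TOY_EVI = {
    "peanut": {"peanuts": 3.0, "cookery (peanuts)": 2.0, "peanut butter": 1.5},
    "butter": {"butter": 3.0, "peanut butter": 2.5, "cookery (peanut butter)": 1.0},
}


def suggest_headings(query, evi, top_n=5):
    """Rank LCSHs for a free-form query by summing per-word weights (assumed rule)."""
    totals = defaultdict(float)
    for word in query.lower().split():
        for heading, weight in evi.get(word, {}).items():
            totals[heading] += weight
    # Highest total first; ties are broken alphabetically for stable output.
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]
```

With this toy index, `suggest_headings("peanut butter", TOY_EVI)` puts "peanut butter" first because both query words contribute to it. A word that is not in the index contributes nothing, which is how a searcher's own vocabulary can still lead to an established heading whenever any of the query words carries an association.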

LCSH to Words Entry Vocabulary Index
List of words that are most closely associated, statistically, with the Library of Congress Subject Heading: Alcoholism.
[Table columns: Rank, Words, Weight. Words in rank order from 1: alcohol, alcoholism, abuse, drug, drink, alcoholic, treatment, prevention, problem, addiction]

EVI-based Access to MELVYL
[Architecture diagram labels: Web Browser; Web server (httpd, CGI); EVI access; gateway access; HTTP/Z39.50 Gateway; MELVYL Z39.50 server; other Z39.50 servers. Protocols: HTTP, Z39.50. Data flows: free-form query; ranked list of LCSHs; search results; full MARC record.]

Counting California Database
A collection of some 3,000 numeric tables, organized into 16 topics and 184 subtopics.
Sample topics: Banking, Finance and Insurance; Elections; Population and Demographics; Social Services and Public Assistance.
Sample subtopics under Agriculture and Natural Resources: Farms and Farming; Fishing; Forestry and Lumber; Minerals.

Enhanced Access to Counting California Database
Conventional probabilistic retrieval of numeric tables using table captions, mapping the query to the text of the captions.
Access to numeric tables through the words-to-subtopic entry vocabulary index.
education libraries STATISTICS, STATEWIDE SUMMARY BY TYPE OF LIBRARY CALIFORNIA, TO
A sample record created from

Probabilistic Access to Counting California Database
The query "public libraries in California" returns a ranked list of captions.
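Ranking table captions against a free-text query could look roughly like the sketch below. The slides say only that the retrieval is probabilistic and do not give the model, so BM25 is swapped in here as a familiar stand-in. The captions, the function name `bm25_rank` and the parameter defaults are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical captions, invented for illustration (not from the database).
CAPTIONS = [
    "Public library statistics, statewide summary, California",
    "Personal income tax returns by county, California",
    "Farms and farm acreage by county",
]


def tokens(text):
    """Lowercase words with surrounding commas and periods stripped."""
    return [w.strip(",.").lower() for w in text.split() if w.strip(",.")]


def bm25_rank(query, captions, k1=1.2, b=0.75):
    """Return (score, caption) pairs, best match first, using BM25."""
    docs = [tokens(c) for c in captions]
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n
    # Document frequency: number of captions containing each word.
    df = Counter(w for d in docs for w in set(d))
    query_words = set(tokens(query))

    scored = []
    for caption, doc in zip(captions, docs):
        tf = Counter(doc)
        score = 0.0
        for w in query_words:
            if w not in tf:
                continue
            idf = math.log(1 + (n - df[w] + 0.5) / (df[w] + 0.5))
            length_norm = k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[w] * (k1 + 1) / (tf[w] + length_norm)
        scored.append((score, caption))
    return sorted(scored, key=lambda pair: -pair[0])
```

For the query "public libraries in California" the first toy caption ranks highest because it matches both "public" and "california". Note that "libraries" does not match "library" here, since the sketch does no stemming; that kind of vocabulary mismatch is the gap the entry vocabulary index is meant to bridge.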

EVI-based Access to Counting California Database
Ranked list of subtopics that are most closely associated, statistically, with the query: personal/individual income tax.
[Ranked subtopics from 1: income government earnings and tax revenues property tax property tax personal income tax 59.99]

Numeric Tables with Subtopic: Personal income tax.

Traverse Searching Between Online Catalogs and Numeric Databases
[Diagram labels: search interface 1; search interface; query; EVI; LCSH; online catalog; marcnew; search results; captions; numeric table; numeric database.]

Melvyl MARC record as source of a query

Extract from MARC as a query. Any caption can become a query.
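Turning a catalog record into a query for the numeric database might be sketched as below. The slides do not say which fields are extracted, so this assumes the title (MARC tag 245) and topical subject headings (tag 650). The dictionary layout, the stopword list and the name `query_from_marc` are hypothetical; the sample record reuses the training record shown on the earlier slide.

```python
import re

STOPWORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "under"}

# A simplified record: MARC tag -> list of field values (assumed layout).
SAMPLE_RECORD = {
    "245": ["A study of operant conditioning under delayed reinforcement in early infancy"],
    "650": ["Infant psychology", "Operant conditioning"],
}


def query_from_marc(record, max_terms=8):
    """Build a free-text query from title (245) and subject (650) fields."""
    text = " ".join(record.get("245", []) + record.get("650", []))
    seen, terms = set(), []
    for word in re.findall(r"[a-z]+", text.lower()):
        if word in STOPWORDS or word in seen:
            continue  # skip stopwords and repeated words
        seen.add(word)
        terms.append(word)
    return " ".join(terms[:max_terms])
```

The resulting string can be submitted to a caption search or an entry vocabulary lookup like any typed query. A caption could be passed through the same word extraction to search the online catalog in the other direction.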

Final Report on “Seamless Searching of Numeric and Textual Resources” Project,
Two sequels:
1. Adding search by place: “Going Places in the Catalog: Improved Geographic Access,” funded by a National Library Leadership Project from the Institute of Museum and Library Services,
2. Multilingual Search Across Multiple Genres: Proposal submitted Feb 13, 2003!