The Significance of Vocabulary Michael Buckland School of Information Management and Systems University of California, Berkeley.



The Significance of Vocabulary
An economic claim: vocabulary problems reduce the benefits of, and the return on investment in, information services.
Vocabulary is used for indexicality, so issues of identity are central to LIS.
Vocabulary is central to digital libraries.
Vocabulary is central to explaining the history of conceptions of LIS.

A correctly formed Library of Congress Subject Heading, but who would think of such search terms?
God --- Knowableness --- History of doctrines --- Early church, ca Congresses.

Economic Rationale:
Massive investment in repositories.
Large investment in categorization schemes: classifications, thesauri, concept codes, headings, …
Categorization schemes are usually specialized and stylized.
They are increasingly unfamiliar to searchers, hence ineffective, inefficient use.

Remedy
Support for searching unfamiliar metadata vocabularies: an interface to translate the searcher’s vocabulary into the system’s vocabulary.

Examples
Automobile import, export data (Census Bureau)
Automobiles? No data.
Cars? “Railway or tramway stock”
(Passenger motor vehicles, spark ignition engine.)

“Automobiles”, also known as...
TL / in Library of Congress Classification in U.S. Patent Classification in Standard Industrial Classification

Example: Coastal pollution
F SU COASTAL POLLUTION    0
F TW COASTAL POLLUTION
SUMMARIZE SUBJECTS
LCSH: Marine pollution; Coastal zone management; Water --- Pollution; Petroleum industry and trade; Beach erosion; Coasts; Barrier islands
MeSH: Seawater; Water pollution; Bacteria; Water microbiology; Air pollution; Environmental monitoring; Bathing beaches

International Harmonized Commodity Classification System: “Computer”
HS 84: “Nuclear reactors, boilers, machines and mechanical appliances”
HS 8471: “Automatic data processing machines and units thereof, magnetic or optical readers, machines for transcribing data”
HS : “Digital auto data proc mach contng in the same housing a CPU and input & output device”

INSPEC Thesaurus subdomain-based indexes:
“Water” subdomain: Fission reactor safety; Fission reactor fuel; Polymers; Organic insulating materials; Water supply; Cable insulation; Insulation testing; and Insulating oils.
“Biology” subdomain: Water; Biomechanics; Physiological models; Neurophysiology; Cellular effects of radiation.
“Information Studies” subdomain: Agriculture; Natural resources; Forecasting theory; Operations research; Erosion.

Example: Vietnam War. U.C. MELVYL Online Catalog
FIND XSU VIETNAM WAR
Search Results: 0 records
FIND XSU VIETNAMESE CONFLICT
Search Results: 4,190 records

Dictionaries don’t always help
Emanuel Goldberg: Aerial photography using a “Drachen”
Actual meaning: Aerodynamic tethered balloon.
Standard contemporary English was: Aerostat.
German: Drachen (= Kite in dictionary)

“Entry vocabulary” search interfaces:
Software and algorithms map natural language vocabulary to specialized metadata terms.
Allows users to enter ordinary language queries while taking advantage of existing subject headings and categorization.
Uses co-occurrence statistics to link users’ ordinary language terms to system vocabularies.
Statistical association between lexical items in titles and abstracts and the system’s metadata vocabulary.
Suggests the most likely system vocabulary.
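
The co-occurrence idea on this slide can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the names `tokenize`, `build_index`, and `suggest` are hypothetical, the four training records are invented toy data, and the score used here (the share of a word's records that carry a heading, summed over the query words) is a simple stand-in chosen for readability, not necessarily the association statistic the project used.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(records):
    """Count how often each title/abstract word co-occurs with each heading.

    records: iterable of (text, headings) pairs, where text is a title or
    abstract and headings are the metadata terms assigned to that record.
    """
    word_heading = defaultdict(Counter)  # word -> heading -> co-occurrence count
    word_total = Counter()               # word -> number of records containing it
    for text, headings in records:
        for word in set(tokenize(text)):
            word_total[word] += 1
            for heading in headings:
                word_heading[word][heading] += 1
    return word_heading, word_total


def suggest(query, index, top_n=3):
    """Rank headings by summed co-occurrence share over the query words."""
    word_heading, word_total = index
    scores = Counter()
    for word in set(tokenize(query)):
        for heading, count in word_heading.get(word, {}).items():
            scores[heading] += count / word_total[word]
    # Sort by descending score, then alphabetically so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [heading for heading, _ in ranked[:top_n]]


# Hypothetical toy training data: (title, assigned headings).
records = [
    ("Oil spills and coastal pollution",
     ["Marine pollution", "Petroleum industry and trade"]),
    ("Coastal pollution and beach closures",
     ["Marine pollution", "Bathing beaches"]),
    ("Managing the coastal zone", ["Coastal zone management"]),
    ("Vietnam war memoirs", ["Vietnamese Conflict"]),
]
index = build_index(records)

print(suggest("coastal pollution", index, top_n=1))
print(suggest("vietnam war", index))
```

With this toy data, a searcher's phrase that is not itself a heading is still mapped to the heading most often assigned to records whose titles use that phrase. A real index would be trained on a large set of already-indexed records.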

Thesaurus navigation
Facilitates browsing where structure is present: broader, narrower, related terms.
Guides searcher to other parts of the structure.
Retrieval set analysis.
Navigation within micro-domain.
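
Browsing by broader, narrower, and related terms can be sketched with a small in-memory structure. This is an illustration under stated assumptions: the `Term` and `Thesaurus` classes and their methods are hypothetical, and the relationships loaded below are invented for the example rather than taken from LCSH or MeSH.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """One thesaurus entry with its three kinds of links."""
    label: str
    broader: list = field(default_factory=list)
    narrower: list = field(default_factory=list)
    related: list = field(default_factory=list)


class Thesaurus:
    def __init__(self):
        self.terms = {}

    def _get(self, label):
        """Return the entry for label, creating it on first use."""
        if label not in self.terms:
            self.terms[label] = Term(label)
        return self.terms[label]

    def add_broader(self, narrow, broad):
        """Record that `broad` is a broader term of `narrow` (and the inverse)."""
        n, b = self._get(narrow), self._get(broad)
        if broad not in n.broader:
            n.broader.append(broad)
        if narrow not in b.narrower:
            b.narrower.append(narrow)

    def add_related(self, a, b):
        """Record a symmetric related-term link."""
        ta, tb = self._get(a), self._get(b)
        if b not in ta.related:
            ta.related.append(b)
        if a not in tb.related:
            tb.related.append(a)

    def neighbours(self, label):
        """Terms one step away, grouped by relationship type."""
        t = self.terms[label]
        return {"broader": list(t.broader),
                "narrower": list(t.narrower),
                "related": list(t.related)}

    def ancestors(self, label):
        """All broader terms reachable by walking upward, nearest first."""
        seen, queue = [], list(self.terms[label].broader)
        while queue:
            current = queue.pop(0)
            if current not in seen:
                seen.append(current)
                queue.extend(self.terms[current].broader)
        return seen


# Hypothetical relationships, for illustration only.
th = Thesaurus()
th.add_broader("Marine pollution", "Water pollution")
th.add_broader("Water pollution", "Pollution")
th.add_related("Water pollution", "Water microbiology")

print(th.neighbours("Water pollution"))
print(th.ancestors("Marine pollution"))
```

A browsing interface could show `neighbours` for the term the searcher has landed on and let them step up, down, or sideways from there.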

Web access:
WWW forms-based application supported by Perl.
Supports searches on remote repositories.
Four subdomain dictionaries in three databases:
--- BIOSIS (Biological abstracts): subdomain “water”
--- INSPEC: subdomains “information science”, “water”
--- U.S. Patent Office classification

Statement of work:
Varied prototype Entry Vocabulary Modules (EVMs).
Unintrusive development of EVMs by agents.
Sensitivity to subdomains.
Natural language processing to augment statistical term frequency.
Recommendations for metadata “codebooks” for numeric databases.