Guidelines and Principles for Developing Search and Browse Vocabularies May 31, 2003 Rice University Houston, TX Amy J. Warner, PhD



2 Epicurious.com

3 Navigation/Taxonomy
Vehicle Brands
  Cars
    MR2 Spyder
    Celica
    Matrix
    Avalon
    Camry Solara
    Camry
    Prius
    Corolla
    ECHO
  SUVs/Vans
    Land Cruiser
    Sequoia
    4Runner
    Sienna
    Highlander
    RAV4
  Trucks
    Tundra
    Tacoma
Vehicle Parts
  Vehicle Accessories
    Carriers
      Bicycle Carriers
      Ski Carriers
    Roof Racks
    Splash Guards
    Security Systems
  Tires
  Engines & Transmissions
Celica Brochure
Camry Brochure

4 Synonym Rings
• Cholesterol
• Blood Cholesterol
• Serum Cholesterol
• Good Cholesterol
• Bad Cholesterol
• LDL
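
A synonym ring treats every member as equivalent at search time, with no preferred term. A minimal sketch of how a search layer might use one is below; the function names `build_ring_index` and `expand_query` are hypothetical, and the ring contents are the terms from the slide.

```python
# Synonym ring from the slide; no member is "preferred" over the others.
CHOLESTEROL_RING = frozenset({
    "cholesterol", "blood cholesterol", "serum cholesterol",
    "good cholesterol", "bad cholesterol", "ldl",
})


def build_ring_index(rings):
    """Map every lower-cased term to the ring that contains it."""
    index = {}
    for ring in rings:
        for term in ring:
            index[term.lower()] = ring
    return index


def expand_query(query, index):
    """Return all ring members for the query, or the query alone if unringed."""
    key = query.strip().lower()
    return sorted(index.get(key, {key}))


index = build_ring_index([CHOLESTEROL_RING])
print(expand_query("LDL", index))
```

A search for any one member is expanded to all six, which is why rings favor recall over precision: "Good Cholesterol" and "Bad Cholesterol" retrieve the same documents.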

5 Medline

6 MeSH & UMLS

7 Controlled Vocabulary Defined
• A subset of natural language.
• A list of preferred and (sometimes) variant terms.
• With semantic relationships (hierarchical and associative) (sometimes) defined.
• Used to tag document attributes (describe facets).
  – Topic / Subtopic
  – Audience
  – Language
  – Form
• Or can be used to create labeling scheme for navigation.

8 Cornerstones of Vocabulary Control
• Use unambiguous labels/search terms.
• Make distinctions among labels/search terms clear.
• Make choices about wording and specificity of labels/search terms based on user testing and on size of collection.
• Use other semantic relationships (hierarchical, associative) if necessary to organize large lists of labels/search terms.

9 Continuum of Vocabulary Control
Less → More
• Synonym Control
  – USE/Used for relationship
      Vehicle crashes USE Vehicle collisions
      Vehicle collisions UF Vehicle crashes
  – Synonym Rings
      Vehicle collisions
      Vehicle crashes
      Crashes
      Collisions
• Hierarchical Relationships
  – Broader/Narrower Terms
      Vehicle collisions NT Truck collisions
      Truck collisions BT Vehicle collisions
  – Browse Categories
      Vehicle safety
      Truck safety
      Truck collisions
      Vehicle safety
  – Site Index
  – Taxonomies
• Associative Relationships
  – Part/Whole, Cause/Effect, etc.
      Vehicle parts RT Vehicles
      Vehicles RT Vehicle parts
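
Each relationship on the continuum is stated in both directions (USE/UF, BT/NT, RT/RT). A minimal sketch of a structure that keeps the two directions in sync is below. The `Thesaurus` class and its method names are hypothetical, not taken from any standard or product; the terms are the examples from the slide.

```python
from collections import defaultdict

# Reciprocal tag for each relationship type shown on the slide.
INVERSE = {"USE": "UF", "UF": "USE", "BT": "NT", "NT": "BT", "RT": "RT"}


class Thesaurus:
    def __init__(self):
        # term -> relationship tag -> set of target terms
        self.links = defaultdict(lambda: defaultdict(set))

    def relate(self, source, tag, target):
        """Record 'source TAG target' and the reciprocal link in one step."""
        self.links[source][tag].add(target)
        self.links[target][INVERSE[tag]].add(source)

    def preferred(self, term):
        """Follow a USE reference to the preferred term, if there is one."""
        targets = self.links[term]["USE"]
        return next(iter(targets)) if targets else term

    def narrower(self, term):
        """Return the term's narrower terms in alphabetical order."""
        return sorted(self.links[term]["NT"])


thesaurus = Thesaurus()
thesaurus.relate("Vehicle crashes", "USE", "Vehicle collisions")
thesaurus.relate("Truck collisions", "BT", "Vehicle collisions")
thesaurus.relate("Vehicle parts", "RT", "Vehicles")
```

Entering only one side of each pair is enough: `relate` adds "Vehicle collisions UF Vehicle crashes" and "Vehicle collisions NT Truck collisions" automatically, so the reciprocal entries cannot drift apart.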

10 Steps in Controlled Vocabulary Construction
• Group terms by subject (facet analysis)
• Link synonyms and variants.
  – Synonym Rings: Vehicle collisions, Vehicle crashes, Crashes, Collisions
• Identify broader and narrower terms.
  – Taxonomies / Hierarchies
• Identify related terms.
  – Thesauri

11 Purposes of Standard
• Base choices on ‘best practice’.
• Base choices on known principles.
• Foster interoperability.

12 Current NISO Thesaurus Standard
• Guidelines for the construction, format, and management of monolingual thesauri: Z
• Not a technical standard, but a set of guidelines.
• Emphasizes search thesauri.
• Emphasizes postcoordinate retrieval.
• Used mainly for abstracting and indexing services.
• Does not put the standard in context.

13 Why Revise
• Not revised since
• Number of downloads high, reflecting interest.
• Does not take the web environment into account.
  – Navigation schemes are controlled vocabularies too.
  – Is out of date in terms of computing technology in general:
      Software for managing thesauri has advanced.
      Software for leveraging thesauri through an interface has advanced.
• Currently little attention paid to user testing.

14 Term forms
• Currently
  – Emphasizes rigid rules for grammatical form.
  – Emphasizes short phrases as terms.
• Suggested revision
  – Loosen rules on grammatical form.
  – Allow for longer, more complex phrases.
• Rationale
  – Software can perform automatic stemming.
  – Navigation schemes are more precoordinate.
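
The stemming rationale can be illustrated with a deliberately crude sketch. The suffix rules below are an assumption made for illustration and are far simpler than any production stemmer; `naive_stem` and `normalize` are hypothetical names.

```python
def naive_stem(word):
    """Strip a few common English plural endings (illustrative rules only)."""
    word = word.lower()
    if word.endswith("ies") and len(word) > 4:
        return word[:-3] + "y"          # accessories -> accessory
    if word.endswith(("shes", "ches", "xes", "sses")):
        return word[:-2]                # crashes -> crash
    if word.endswith("s") and not word.endswith("ss"):
        return word[:-1]                # collisions -> collision
    return word


def normalize(phrase):
    """Stem each word so singular and plural forms of a term compare equal."""
    return " ".join(naive_stem(w) for w in phrase.split())


print(normalize("Vehicle crashes") == normalize("vehicle crash"))
```

If the retrieval software conflates "Vehicle crashes" and "vehicle crash" on its own, the vocabulary no longer needs a strict rule about which grammatical form a term must take.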

15 Semantic Relationships
• Current standard
  – Only accounts for explicit equivalence relationships.
  – Hierarchical relationship only allowed for genus-species relationship, with a few exceptions.
  – Associative relationship only allowed across categories.
• Proposed revision
  – Provide guidelines for choosing unambiguous labels.
  – Provide guidelines for loose browse categories.
• Rationale
  – Labeling schemes and pick lists often do not account for explicit synonymy relationships.
  – Hierarchical navigation schemes need to be less rigid.

16 Browse Categories

17 Usability Testing
• Current standard
  – Discusses users but does not include guidelines for testing with users.
• Proposed revision
  – Provide guidelines for open card sort testing of high level categories.
  – Provide guidelines for closed card sorting of term groups under high level categories.
• Rationale
  – User testing is an important consideration when choosing terms and term relationships.
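
One common way to summarize open card sort results is to count how often participants place two cards in the same pile. The sketch below assumes that approach. The participant data is invented for illustration, and `co_occurrence` is a hypothetical helper, not part of any standard.

```python
from collections import Counter
from itertools import combinations


def co_occurrence(sorts):
    """Count, for each pair of cards, how many participants grouped them together."""
    counts = Counter()
    for piles in sorts:              # one participant's sort
        for pile in piles:           # one pile of cards
            for a, b in combinations(sorted(set(pile)), 2):
                counts[(a, b)] += 1
    return counts


# Hypothetical results from three participants (not real test data).
sorts = [
    [["Camry", "Corolla"], ["Tundra", "Tacoma"]],
    [["Camry", "Corolla", "Tundra"], ["Tacoma"]],
    [["Camry", "Corolla"], ["Tundra", "Tacoma"]],
]
counts = co_occurrence(sorts)
print(counts.most_common(2))
```

Pairs with high counts are candidates for the same high-level category; pairs that are rarely grouped suggest a category boundary worth checking in a follow-up closed sort.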

18 Display
• Current standard
  – Emphasizes print copies of thesauri.
  – Screen display section oriented toward display of print copy.
• Proposed revision
  – Oriented more toward displays of vocabularies that only exist in digital format.
• Rationale
  – Most web vocabularies do not have print counterparts.

19 Interoperability
• Current standard
  – Does not address issues associated with interoperability.
• Proposed revision
  – Will address major issues and problems associated with interoperability, including multiple languages.
• Rationale
  – Being able to share information within and among organizations.

20 Construction and Maintenance
• Current standard
  – Emphasizes maintenance problems in print vocabularies.
  – Discusses software that manages stand-alone vocabularies.
• Proposed revision
  – Advance standards for changing, adding, deleting terms automatically.
  – Provide guidance for software that is connected to information retrieval systems.
• Rationale
  – Software has advanced significantly.

21 Process for Revising Standard
• Appoint editor.
• Appoint advisory group.
• Draft revision.
• Discuss drafts with advisory group.
• Vote on final draft by NISO board.

22 Editor & Advisory Group
• Amy Warner, lexonomy.com
• Vivian Bliss, Microsoft
• Carol Brent, ProQuest
• John Dickert, U.S. DoD
• Lynn El-Hoshy, Library of Congress
• Emily Fayen, SDC liaison
• Patricia Harpring, Getty
• Stephen Hearn, American Library Association
• Sabine Kuhn, American Chemical Society/Chemical Abstracts
• Pat Kuhr, H.W. Wilson
• Diane McKerlie, Design Strategy
• Peter Morville, Semantic Studios
• Stuart Nelson, National Library of Medicine
• Diane Vizine-Goetz, OCLC
• Marcia Lei Zeng, Special Libraries Association

23 Progress to Date
• Agreement on scope of revision.
• Agreement that guidelines should be placed in context.
• Agreement that guidelines should be educational as well as prescribing best practice.
• Agreement that guidelines should be forward looking in terms of new technologies.
• Agreement to write guidelines for elements and features that all vocabularies have in common, then consider their differences.
• Survey conducted to determine use of standard, other standards, software.

24 Other Players
• Communication with editor of British Standard.
• Communication and work with W3C to address issues of implementation of controlled vocabularies.

25 Relationship with Semantic Web and OWL
• Semantic Web is an ontological framework.
• Both terms in the ontology and the relationships between them are standardized using OWL (Web Ontology Language).
• Both the terms and the relationships are ‘deep’ semantically.
• This is a structure into which ‘shallower’ terms provided by using Z39.19 could be inserted.
• This would enhance interoperability because although we would not have complete agreement on vocabularies, we would have agreement on an effective structure for exchanging them.

26 Contact Me Amy J. Warner