
Presentation transcript:

1 How do we describe something?
- What is something about?
  - What is the content of an object "about"?
- Different methods (Wilson, 1968)
  - Counting terms (objective method)
  - Complete description/summarization
  - Unifying thought(s)
  - What stands out (main points)
- Challenges
  - Non-text
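As a rough illustration of the "counting terms" method on slide 1, the sketch below tallies word frequencies in a short passage and reports the most frequent ones as candidate subjects. The stopword list, the `count_terms` helper, and the sample sentence are all assumptions made for this example; they do not come from the slides or from Wilson (1968).

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (illustration only).
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "are", "for"}


def count_terms(text, top_n=3):
    """Return the top_n most frequent non-stopword terms in text."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return counts.most_common(top_n)


# Made-up sample passage.
sample = "Robins are birds. Birds nest in trees, and robins nest in the spring."
top_terms = count_terms(sample)
print(top_terms)
```

Counting is objective in the sense that two indexers running it get the same tallies, but it says nothing about which frequent term is the unifying thought, which is why the slide lists it as only one of several methods.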

2 Languages for aboutness
- Indexing languages:
  - Terminological tools
    - Thesauri (CV – controlled vocabulary)
    - Subject headings lists (CV)
    - Authority files for named entities (people, places, structures, organizations)
  - Classification
  - Keyword lists
  - Natural language systems (broad interpretation)

3 Aboutness: How to do it!
- Read the document [Intellectual reading]
  - look for key features
  - many indexers mark up the items
  - rarely have time to read the whole document
- Determine aboutness [Conceptual analysis]
- Translate aboutness into the vocabulary or scheme you are using
  - In general:
    - Subject headings: 1-3 headings
    - Descriptors: 5-8 descriptors
    - Classification: 1 notation (should it only be one!?)

4 Features of indexing languages:
- With the exception of a few general domain tools, they are generally domain specific.
  - MeSH
  - NASA Thesaurus
  - Astronomy Thesaurus
  - ERIC thesaurus
- Concepts (or concept representations) are arranged in a discernible order

5 Language schema designs
- Classified -- grouping
  - Hierarchies and facets
    - MeSH Browser
    - Art and Architecture (Getty AAT)
- Alphabetical -- horizontal
  - Verbal/Alphabetical (ordering/filing challenges)

6 Controlled Vocabulary
- Why do we have a controlled vocabulary?
- Three of you independently identify a new human gene, and each of you gives it a different name.
- How do we handle referencing, resolving, and using this concept when it has different names? Let alone across languages?!

7 Controlled Vocabulary
- A list or a database of subject terms in which each concept has a preferred term or phrase that will be used to represent it in the retrieval tool; the terms not used have references (syndetic structure), and often scope notes. There can be aliases for preferred terms (so that all three of your gene names get recorded and can be matched to the preferred term).
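The definition on slide 7 can be pictured as a lookup table from every entry term (preferred or alias) to one preferred term. The following is only a minimal sketch under that assumption: the `ControlledVocabulary` class, its method names, and the three gene names are hypothetical and invented for illustration.

```python
class ControlledVocabulary:
    """Maps entry terms (preferred terms and aliases) to preferred terms."""

    def __init__(self):
        self._preferred = {}   # case-folded entry term -> preferred term
        self.scope_notes = {}  # preferred term -> scope note text

    def add_concept(self, preferred, aliases=(), scope_note=None):
        # The preferred term and each alias all resolve to the preferred term.
        for term in (preferred, *aliases):
            self._preferred[term.casefold()] = preferred
        if scope_note is not None:
            self.scope_notes[preferred] = scope_note

    def resolve(self, term):
        """Return the preferred term for an entry term, or None if unknown."""
        return self._preferred.get(term.casefold())


cv = ControlledVocabulary()
# Hypothetical names: three labs named the same gene differently.
cv.add_concept(
    "GENE1",
    aliases=["Lab A factor", "XYZ-7"],
    scope_note="Hypothetical gene used only in this example.",
)
print(cv.resolve("xyz-7"))
```

Case-folding the lookup key is one simple design choice for matching; a real retrieval tool would likely need more normalization (spelling variants, punctuation, other languages).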

8 Example
- For gene names, there is an authority, the HUGO Gene Nomenclature Committee, that designates an official curated name for each gene.
- During the research process, however, there may have been multiple initial names.

9 More Examples
- Most processes, however, do NOT have standardized naming.
- For instance, genetic conditions are not named in one standard way. Doctors treating patients often propose the first name, but expert working groups often revise it later to a more appropriate name.

10 Cont'd
Names of genetic conditions may be derived from:
- The basic genetic or biochemical defect that causes the condition (for example, alpha-1 antitrypsin deficiency);
- One or more major signs or symptoms of the disorder (for example, hypermanganesemia with dystonia, polycythemia, and cirrhosis);
- The parts of the body affected by the condition (for example, craniofacial-deafness-hand syndrome);
- The name of a physician or researcher, often the first person to describe the disorder (for example, Marfan syndrome, which was named after Dr. Antoine Bernard-Jean Marfan);
- A geographic area (for example, familial Mediterranean fever, which occurs mainly in populations bordering the Mediterranean Sea); or
- The name of a patient or family with the condition (for example, amyotrophic lateral sclerosis, which is also called Lou Gehrig disease after the famous baseball player who had the condition).

11 Thesaurus (structured thesaurus)
- Lexical semantic relationships
- Composed of indexing terms/descriptors
- Descriptors = representations of concepts
- Concepts = Units of meaning (Svenonius)

12 Thesaurus
- Preferred terms
- Non-preferred terms
- Semantic relations between terms
- How to apply terms (guidelines, rules)
- Scope notes
- Adding terms (How to produce terms that are not listed explicitly in the thesaurus)

13 Preferred Terms
- Control form of the term
  - Spelling, grammatical form
    - Theatre / Theater
    - MLA / Modern Language Association
- Choose preferred term between synonyms
  - Brain cancer or Brain Neoplasms?

14 Common thesaural identifiers
- SN Scope Note
  - Instruction, e.g. don't invert phrases
- USE Use (another term in preference to this one)
- UF Used For
- BT Broader Term
- NT Narrower Term
- RT Related Term

15 Semantic Relationships
- Hierarchy
- Equivalence
- Association

16 Hierarchies of Meaning
- 'Glass'
  - 'Beer Glass'
  - 'Wine Glass'
    - 'Red wine glass'
    - 'White wine glass'
From: Controlled Vocabularies / Paul Miller, Interoperability Focus, UKOLN

17 Hierarchy
- Level of generality – both preferred terms
- BT (broader term)
  - Robins BT Birds
- NT (narrower term)
  - Birds NT Robins
  - Inheritance, very specific rules
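Because BT and NT are inverse links, recording one direction is enough to derive the other. The sketch below assumes a simple in-memory structure; the `Hierarchy` class and its method names are hypothetical, while the terms are taken from slides 16 and 17.

```python
from collections import defaultdict


class Hierarchy:
    """Stores BT links and derives the inverse NT links automatically."""

    def __init__(self):
        self.broader = defaultdict(set)   # term -> its broader terms (BT)
        self.narrower = defaultdict(set)  # term -> its narrower terms (NT)

    def add_bt(self, term, broader_term):
        """Record 'term BT broader_term' plus 'broader_term NT term'."""
        self.broader[term].add(broader_term)
        self.narrower[broader_term].add(term)

    def ancestors(self, term):
        """Return every broader term reachable from term, nearest first."""
        found = []
        queue = sorted(self.broader.get(term, ()))
        while queue:
            current = queue.pop(0)
            if current not in found:
                found.append(current)
                queue.extend(sorted(self.broader.get(current, ())))
        return found


h = Hierarchy()
h.add_bt("Robins", "Birds")
h.add_bt("Red wine glass", "Wine Glass")
h.add_bt("White wine glass", "Wine Glass")
h.add_bt("Wine Glass", "Glass")
h.add_bt("Beer Glass", "Glass")

print(h.narrower["Birds"])
print(h.ancestors("Red wine glass"))
```

Walking up the BT links is one way to read "inheritance": a search on 'Glass' could be expanded to include everything indexed under its narrower terms.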

18 Equivalence
- When two or more terms represent the same concept
- One is the preferred term (descriptor), where all the information is collected
- The other is the non-preferred term and helps the user to find the appropriate term

19 Equivalence
- Non-preferred term USE Preferred term
  - Nuclear Power USE Nuclear Energy
  - Periodicals USE Serials
- Preferred term UF (used for) Non-preferred term
  - Nuclear Energy UF Nuclear Power
  - Serials UF Periodicals
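USE and UF are two views of the same equivalence link, so one can be generated from the other. This is a minimal sketch of that idea; the function names `build_equivalence` and `entry_term` are hypothetical, and the term pairs come from slide 19.

```python
def build_equivalence(use_pairs):
    """From (non_preferred, preferred) pairs, build USE and UF lookups."""
    use = {}       # non-preferred term -> preferred term
    used_for = {}  # preferred term -> list of non-preferred terms
    for non_preferred, preferred in use_pairs:
        use[non_preferred] = preferred
        used_for.setdefault(preferred, []).append(non_preferred)
    return use, used_for


use, used_for = build_equivalence([
    ("Nuclear Power", "Nuclear Energy"),
    ("Periodicals", "Serials"),
])


def entry_term(term):
    """Follow a USE reference if there is one; otherwise keep the term."""
    return use.get(term, term)


print(entry_term("Periodicals"))
print(used_for["Nuclear Energy"])
```

A user who searches on the non-preferred term is redirected to the descriptor, where, as slide 18 puts it, all the information is collected.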

20 Association
- One preferred term is related to another preferred term
- Non-hierarchical
- "See also" function
- In any large thesaurus, a significant number of terms will mean similar things or cover related areas, without necessarily being synonyms or fitting into a defined hierarchy

21 Association
- Related Terms (RT) can be used to show these links within the thesaurus
  - Bed RT Bedding
  - Paint Brushes RT Painting
  - Vandalism RT Hostility
  - Programming RT Software
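The RT links on slide 21 can be sketched as an undirected graph. This assumes that RT is reciprocal (if A RT B, then B RT A), which is how thesaurus standards generally treat the associative relationship; the slides do not state it explicitly. The `add_rt` and `see_also` helpers are hypothetical names.

```python
from collections import defaultdict

related = defaultdict(set)  # preferred term -> set of related preferred terms


def add_rt(term_a, term_b):
    """Record 'term_a RT term_b' in both directions (assumed reciprocal)."""
    related[term_a].add(term_b)
    related[term_b].add(term_a)


def see_also(term):
    """Return the related terms for a 'see also' display, sorted."""
    return sorted(related.get(term, ()))


for a, b in [
    ("Bed", "Bedding"),
    ("Paint Brushes", "Painting"),
    ("Vandalism", "Hostility"),
    ("Programming", "Software"),
]:
    add_rt(a, b)

print(see_also("Software"))
```

Unlike BT/NT, these links carry no direction or level of generality, so a single symmetric table is enough to drive the "see also" function.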