Published by Lauren Eaton. Modified over 9 years ago.
1
Thesauri usage in information retrieval systems: example of LISTA and ERIC database thesaurus
Kristina Feldvari, Department of Information Sciences, Faculty of Philosophy in Osijek, Lorenza Jägera 9, Osijek, Croatia
2
How many of you use a thesaurus to refine your retrieval?
3
Content
Why thesaurus?
Thesaurus and IR systems
User comprehension of thesaurus
Comparison of ERIC and LISTA database thesauri:
  1) Which functions do thesauri support?
  2) How are thesauri displayed in databases?
Conclusion
5
Why thesaurus?
Lack of quality control: the main problem
A thesaurus is a vocabulary of key words, i.e., a standardized set of terms and phrases authorized for use in an indexing system to describe a subject area or information domain (Librarian Lexicon, 1984).
6
Thesaurus and IR systems
Limits and controls the diversity of natural language by fixing the expression that should be used for each concept (D. Bawden, 2001).
Guides indexing and retrieval based on controlled as well as natural language indexing (M. Lykke Nielsen, 2004).
7
Applications of thesaurus in storage and retrieval
To serve as a term authority for indexers, so that only "acceptable" terms are employed by indexers.
To enable indexers quickly to find the "right" term to signify a concept in mind; "right" in the sense that the term must not only connote the proper concept but also must be appropriately specific (or general) with respect to the information being indexed.
To serve as a means of validating the results of the indexing effort: to check correctness of spelling, to ensure that non-preferred synonyms are not employed by indexers, and to "flag" any terms newly required by the system.
To enable the addition of cross-references between terms in any publication and to validate such cross-references to guarantee against circularity and "blindness".
To enable appropriate formulation of queries put to either printed or computerized indexes.
To provide a starting point for other systems which require a vocabulary significantly similar to the one encompassed by the thesaurus at hand.
To encourage consistent use of terminology by authors, abstractors, and other originators of information.
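The cross-reference validation mentioned in the list above can be sketched in a few lines. This is a minimal illustration, not a method taken from the presentation: it assumes USE references are stored as a dictionary from a non-preferred term to its target, treats a "blind" reference as one whose target is not in the vocabulary at all, and treats a "circular" reference as a USE chain that loops back on itself. The function name and sample terms are hypothetical.

```python
def check_use_references(preferred, use_refs):
    """Return (blind, circular) lists of problematic entry terms.

    preferred: set of authorized terms (descriptors).
    use_refs:  dict mapping a non-preferred term to its USE target.
    Both structures are assumptions made for this sketch.
    """
    blind, circular = [], []
    for start in use_refs:
        seen = {start}
        term = use_refs[start]
        while True:
            if term in preferred:
                break  # chain ends at an authorized term: valid
            if term not in use_refs:
                blind.append(start)  # target is missing from the vocabulary
                break
            if term in seen:
                circular.append(start)  # chain returns to a term already visited
                break
            seen.add(term)
            term = use_refs[term]
    return sorted(blind), sorted(circular)


# Invented sample data for illustration only.
preferred = {"Libraries", "Information Retrieval"}
use_refs = {
    "Document Retrieval": "Information Retrieval",  # valid
    "Bibliothecas": "Library Buildings",            # blind: target missing
    "IR": "Info Retrieval",                         # circular pair
    "Info Retrieval": "IR",
}
blind, circular = check_use_references(preferred, use_refs)
```

Here `blind` holds the entry whose target does not exist, and `circular` holds the two entries that point at each other.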
8
User comprehension of thesaurus
Three main questions:
Thesaurus interface design
Processing options
End-user warrant
9
ERIC database thesaurus: main features
Alphabetical listing of terms
Browsing the Thesaurus by 41 categories
Boolean operators (basic and advanced search) to refine retrieval, with the possibility of truncation
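Boolean refinement and truncation can be pictured as set operations over the records indexed under each term. The sketch below is an assumption-laden illustration, not the ERIC implementation: the `postings` dictionary, the record numbers and the `truncate` helper are all invented, and truncation is modeled simply as a case-insensitive prefix match on terms.

```python
# Invented postings: term -> set of record numbers indexed with that term.
postings = {
    "Libraries": {1, 2, 3},
    "Library Education": {3, 4},
    "Thesauri": {2, 4, 5},
}


def truncate(stem, postings):
    """Records indexed under any term that starts with `stem` (like stem*)."""
    hits = set()
    for term, records in postings.items():
        if term.lower().startswith(stem.lower()):
            hits |= records
    return hits


librar = truncate("librar", postings)      # truncated search: librar*
both = librar & postings["Thesauri"]       # librar* AND Thesauri
either = librar | postings["Thesauri"]     # librar* OR Thesauri
without = librar - postings["Thesauri"]    # librar* NOT Thesauri
```

AND narrows the result, OR broadens it, and NOT excludes records, which is why these operators are described as a way to refine retrieval.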
10
ERIC database thesaurus: main features
Seven types of cross-references are used: Scope Note (SN), Use For (UF) and Use (USE) references, Narrower Terms (NT), Broader Terms (BT), Related Terms (RT) and Parenthetical Qualifiers.
Parenthetical qualifiers: used to identify a particular indexable meaning of a homograph.
Also contains: Record Type, Category, Use Term, Add Date and Posting.
Record Type: indicates the status of a term; one of the following will be used:
  Main: term is a descriptor used by indexers to organize the ERIC database by subject.
  Synonym: term is not a descriptor; see 'Use Term', below.
  Dead: term is no longer a descriptor; see alternative provided.
Category: indicates the broad group of terms to which this specific descriptor belongs; searchers can browse the Thesaurus by category to identify additional search terms.
Use Term: directs the searcher to use either a single Main term or a combination of Main terms.
Add Date: shows when the term was added to the Thesaurus.
Posting: provides the number of records indexed with the term.
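The fields listed above map naturally onto a small record structure. The following is a hedged sketch of how such an entry might be modeled, not ERIC's actual data format: the class name, field names, sample entries, dates and posting counts are all invented for illustration, and Dead terms are assumed to carry their alternative in the same field as Synonym terms.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ThesaurusRecord:
    term: str
    record_type: str = "Main"   # "Main", "Synonym" or "Dead"
    category: str = ""
    scope_note: str = ""        # SN
    use_for: List[str] = field(default_factory=list)    # UF
    use_term: List[str] = field(default_factory=list)   # USE
    broader: List[str] = field(default_factory=list)    # BT
    narrower: List[str] = field(default_factory=list)   # NT
    related: List[str] = field(default_factory=list)    # RT
    add_date: str = ""
    postings: int = 0           # number of records indexed with the term


def searchable_terms(term: str, records: Dict[str, ThesaurusRecord]) -> List[str]:
    """Return the Main term(s) a searcher should use for `term`."""
    rec = records[term]
    if rec.record_type == "Main":
        return [rec.term]
    # Synonym or Dead: follow the Use Term reference(s).
    return list(rec.use_term)


# Invented entries; values are not taken from the ERIC Thesaurus.
records = {
    "Thesauri": ThesaurusRecord(
        term="Thesauri",
        category="Example Category",
        use_for=["Controlled Vocabularies"],
        add_date="1970-01-01",
        postings=120,
    ),
    "Controlled Vocabularies": ThesaurusRecord(
        term="Controlled Vocabularies",
        record_type="Synonym",
        use_term=["Thesauri"],
    ),
}
```

With these entries, looking up the synonym "Controlled Vocabularies" leads the searcher to the Main term "Thesauri", which is the role the Use Term field plays in the display.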
11
LISTA database thesaurus: main features
Three types of display: Term begins with, Term contains and Relevancy ranked
Six types of cross-references are used: Scope Note (SN), Use For (UF) and Use (USE) references, Narrower Terms (NT), Broader Terms (BT), and Related Terms (RT)
Boolean operators to refine retrieval, and truncation
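The three display types differ only in how the browse query is matched against the term list. The sketch below illustrates the idea with an invented term list. The relevancy scoring in particular is an assumption (number of query words shared with the term, ties broken alphabetically); the presentation does not describe how LISTA actually ranks terms.

```python
# Invented term list for illustration.
terms = [
    "Information Retrieval",
    "Information Science",
    "Retrieval Systems",
    "Library Science",
]


def begins_with(query, terms):
    """'Term begins with' display: prefix match, alphabetical order."""
    q = query.lower()
    return sorted(t for t in terms if t.lower().startswith(q))


def contains(query, terms):
    """'Term contains' display: substring match, alphabetical order."""
    q = query.lower()
    return sorted(t for t in terms if q in t.lower())


def relevancy_ranked(query, terms):
    """'Relevancy ranked' display under an assumed word-overlap score."""
    words = set(query.lower().split())
    scored = []
    for t in terms:
        overlap = len(words & set(t.lower().split()))
        if overlap:
            scored.append((-overlap, t))  # more shared words sorts first
    return [t for _, t in sorted(scored)]
```

For example, `begins_with("information", terms)` lists the two terms starting with "Information", while `relevancy_ranked("information retrieval", terms)` puts "Information Retrieval" first because it shares both query words.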
12
ERIC database thesaurus: advantages and deficiencies
Advantages:
Basic thesaurus introduction
Search tips
Help topics and tutorials, "user feedback"
Thesaurus updates list
Searchable by category
Records contain: record type, use term, add date and posting
Deficiencies:
Lack of relevance rating
Some terminology not well explained (e.g. "n/a")
13
LISTA database thesaurus: advantages and deficiencies
Advantages:
The Use (USE) cross-reference appears in all displays
Possibility of "exploding" the term
Relevance rating
Deficiencies:
Lack of thesaurus introduction, search tips and tutorials
Lack of terminology explanation
Lack of homograph identification
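"Exploding" a term, listed among the LISTA advantages above, is commonly understood as searching the term together with its narrower terms. The sketch below assumes that meaning and assumes the expansion follows Narrower Term links through every level; the slides do not say how many levels the database expands. The `narrower` mapping and its terms are invented.

```python
def explode(term, narrower):
    """Return `term` plus all of its narrower terms, at every level.

    narrower: dict mapping a term to its list of Narrower Terms (NT).
    """
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guard against repeated or cyclic NT links
        seen.add(current)
        stack.extend(narrower.get(current, []))
    return sorted(seen)


# Invented hierarchy for illustration.
narrower = {
    "Libraries": ["Academic Libraries", "Public Libraries"],
    "Academic Libraries": ["University Libraries"],
}
exploded = explode("Libraries", narrower)
```

A search on the exploded set then retrieves records indexed under the broad term and under any of its more specific descendants.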
14
Conclusion
Information sources are growing enormously, hence the need for more effective information retrieval.
Boolean operators and keyword searching are not enough for retrieval because of the linguistic problems that can occur.
A thesaurus copes with these problems and is a vital retrieval tool in databases.
The main problem: users' limited comprehension of thesauri.
15
Thank you for your attention!
Kristina Feldvari, Department of Information Sciences, Faculty of Philosophy in Osijek, Lorenza Jägera 9, Osijek, Croatia