
1 Thesauri usage in information retrieval systems: example of LISTA and ERIC database thesaurus
Kristina Feldvari, Department of Information Sciences, Faculty of Philosophy in Osijek, Lorenza Jägera 9, Osijek, Croatia

2 How many of you use a thesaurus to refine your retrieval?

3 Content
Why thesaurus?
Thesaurus and IR systems
User comprehension of thesaurus
Comparison of ERIC and LISTA database thesauri:
1) Which functions do thesauri support?
2) How are thesauri displayed in databases?
Conclusion

4

5 Why thesaurus?
Lack of quality control: main problem
A thesaurus is a vocabulary of key words, i.e., a standardized set of terms and phrases authorized for use in an indexing system to describe a subject area or information domain (Librarian Lexicon, 1984).

6 Thesaurus and IR systems
A thesaurus limits and controls the diversity of natural language by fixing the expression that should be used for each concept (D. Bawden, 2001).
It guides indexing and retrieval based on controlled as well as natural language indexing (M. Lykke Nielsen, 2004).

7 Applications of thesaurus in storage and retrieval
To serve as a term authority for indexers, so that only "acceptable" terms are employed by indexers.
To enable indexers quickly to find the "right" term to signify a concept in mind; "right" in the sense that the term must not only connote the proper concept but also must be appropriately specific (or general) with respect to the information being indexed.
To serve as a means of validating the results of the indexing effort, from the viewpoint of correctness of spelling, to ensure that non-preferred synonyms are not employed by indexers, and to "flag" any terms newly required by the system.
To enable the addition of cross-references between terms in any publication and to validate such cross-references to guarantee against circularity and "blindness".
To enable appropriate formulation of queries put to either printed or computerized indexes.
To provide a starting point for other systems which require a vocabulary significantly similar to the one encompassed by the thesaurus at hand.
To encourage consistent use of terminology by authors, abstractors, and other originators of information.
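The cross-reference validation mentioned in this list can be sketched in a few lines. This is only an illustration, not a method taken from the presentation: it assumes that a "blind" reference is a USE reference whose chain ends at a term missing from the thesaurus, and that a "circular" reference is a USE chain that loops back on itself. The function name `check_use_references` and all sample terms are hypothetical.

```python
def check_use_references(preferred_terms, use_refs):
    """Return (blind, circular) lists of problematic entry terms.

    preferred_terms: set of authorized descriptors.
    use_refs: dict mapping a non-preferred term to the term it points to.
    Assumption: "blind" = chain ends at an unknown term,
                "circular" = chain revisits a term it already passed.
    """
    blind, circular = [], []
    for start in use_refs:
        seen = {start}
        current = use_refs[start]
        while True:
            if current in preferred_terms:
                break                   # chain ends at a valid descriptor
            if current in seen:
                circular.append(start)  # chain loops back on itself
                break
            if current not in use_refs:
                blind.append(start)     # chain ends at a term that does not exist
                break
            seen.add(current)
            current = use_refs[current]
    return sorted(blind), sorted(circular)


# Invented sample data, for illustration only.
preferred = {"Information Retrieval", "Thesauri"}
refs = {
    "IR": "Information Retrieval",                  # valid reference
    "Controlled Vocabularies": "Subject Headings",  # target is missing
    "Term Lists": "Word Lists",                     # these two point at each other
    "Word Lists": "Term Lists",
}
blind, circular = check_use_references(preferred, refs)
```

With this sample data, `blind` holds the entry whose target is missing and `circular` holds the two entries that point at each other.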

8 User comprehension of thesaurus
Three main questions:
Thesaurus interface design
Processing options
End-user warrant

9 ERIC database thesaurus: main features
Alphabetical listing of terms
Browsing the thesaurus by 41 categories
Boolean operators (basic and advanced search) to refine retrieval, with the possibility of truncation
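How Boolean operators and truncation refine retrieval can be illustrated with a toy inverted index. This is a minimal sketch under stated assumptions, not a description of how ERIC implements searching: the index contents and record numbers are invented, the helper `lookup` is hypothetical, and `*` is assumed as the truncation symbol.

```python
# Toy inverted index: indexing term -> set of record numbers (invented data).
index = {
    "thesauri": {1, 2, 5},
    "thesaurus construction": {2},
    "information retrieval": {1, 3, 5},
    "indexing": {3, 4},
}


def lookup(pattern, index):
    """Return the records for a term; a trailing '*' matches any term with that prefix."""
    if pattern.endswith("*"):
        prefix = pattern[:-1]
        hits = set()
        for term, records in index.items():
            if term.startswith(prefix):
                hits |= records
        return hits
    return set(index.get(pattern, set()))


truncated = lookup("thesaur*", index)             # matches both thesaurus terms
retrieval = lookup("information retrieval", index)

both = truncated & retrieval         # thesaur* AND information retrieval
either = truncated | retrieval       # thesaur* OR information retrieval
only_retrieval = retrieval - truncated  # information retrieval NOT thesaur*
```

AND narrows the result set, OR widens it, and NOT excludes records, which is why the operators are described here as a way to refine retrieval.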

10 ERIC database thesaurus: main features
Seven types of cross-references are used: Scope Note (SN), Use For (UF) and Use (USE) references, Narrower Terms (NT), Broader Terms (BT), Related Terms (RT) and Parenthetical Qualifiers.
Parenthetical Qualifiers: used to identify a particular indexable meaning of a homograph.
Each entry also contains: Record Type, Category, Use Term, Add Date and Posting.
Record Type: indicates the status of a term; one of the following is used:
  Main: the term is a descriptor used by indexers to organize the ERIC database by subject.
  Synonym: the term is not a descriptor; see Use Term, below.
  Dead: the term is no longer a descriptor; see the alternative provided.
Category: indicates the broad group of terms to which this specific descriptor belongs; searchers can browse the thesaurus by category to identify additional search terms.
Use Term: directs the searcher to use either a single Main term or a combination of Main terms.
Add Date: shows when the term was added to the thesaurus.
Posting: provides the number of records indexed with the term.
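The entry fields listed on this slide map naturally onto a small record structure. The sketch below is an assumption about how such an entry could be modelled, not ERIC's actual data format: the class `TermRecord`, the function `resolve`, and every sample term and number are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TermRecord:
    term: str
    record_type: str                  # "Main", "Synonym" or "Dead"
    category: str = ""
    use_terms: list = field(default_factory=list)  # Main term(s) to use instead
    broader: list = field(default_factory=list)    # BT
    narrower: list = field(default_factory=list)   # NT
    related: list = field(default_factory=list)    # RT
    scope_note: str = ""                           # SN
    postings: int = 0                 # number of records indexed with the term


def resolve(term, thesaurus):
    """Return the Main term(s) a searcher should use for the given entry."""
    record = thesaurus.get(term)
    if record is None:
        return []
    if record.record_type == "Main":
        return [record.term]
    # Synonym and Dead entries redirect through their Use Term field.
    return [t for t in record.use_terms
            if t in thesaurus and thesaurus[t].record_type == "Main"]


# Invented entries, not taken from the ERIC thesaurus.
thesaurus = {
    "Thesauri": TermRecord("Thesauri", "Main", category="Information Science",
                           broader=["Reference Materials"], postings=120),
    "Reference Materials": TermRecord("Reference Materials", "Main",
                                      narrower=["Thesauri"]),
    "Word Finders": TermRecord("Word Finders", "Synonym", use_terms=["Thesauri"]),
    "Term Books": TermRecord("Term Books", "Dead",
                             use_terms=["Thesauri", "Reference Materials"]),
}

main_for_synonym = resolve("Word Finders", thesaurus)
main_for_dead = resolve("Term Books", thesaurus)
```

A Synonym entry resolves to a single Main term here, while the Dead entry resolves to a combination of Main terms, mirroring the two cases described for Use Term.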

11 LISTA database thesaurus: main features
Three types of display: Term begins with, Term contains and Relevancy ranked
Six types of cross-references are used: Scope Note (SN), Use For (UF) and Use (USE) references, Narrower Terms (NT), Broader Terms (BT), and Related Terms (RT)
Boolean operators to refine retrieval, and truncation
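The difference between the "Term begins with" and "Term contains" displays can be shown with two simple filters. This is a hedged sketch, not LISTA's implementation: the function names and the term list are invented, and the relevancy-ranked display is left out because the slide does not say how relevance is computed.

```python
def term_begins_with(terms, query):
    """Terms whose text starts with the query (case-insensitive)."""
    q = query.casefold()
    return sorted(t for t in terms if t.casefold().startswith(q))


def term_contains(terms, query):
    """Terms that contain the query anywhere (case-insensitive)."""
    q = query.casefold()
    return sorted(t for t in terms if q in t.casefold())


# Invented term list, for illustration only.
terms = [
    "Information Retrieval",
    "Information Science",
    "Online Information Services",
    "Thesauri",
]

begins = term_begins_with(terms, "information")
contains = term_contains(terms, "information")
```

The "contains" display returns every term the "begins with" display returns, plus terms where the query appears later in the phrase.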

12 ERIC database thesaurus: advantages and deficiencies
Advantages:
Basic thesaurus introduction
Search tips
Help topics and tutorials, "user feedback"
Thesaurus updates list
Searchable by category
Entries contain record type, use term, add date and posting
Deficiencies:
Lack of relevance rating
Some terminology not well explained (e.g. "n/a")

13 LISTA database thesaurus: advantages and deficiencies
Advantages:
The Use (USE) cross-reference appears in all displays
Possibility of "exploding" the term
Relevance rating
Deficiencies:
Lack of thesaurus introduction, search tips and tutorials
Lack of terminology explanation
Lack of homograph identification
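"Exploding" a term can be sketched as expanding it with its narrower terms before searching. This is an illustration under assumptions: it takes "explode" to mean the term plus all of its narrower terms at every level, which may differ from what LISTA does in practice. The function `explode` and the sample hierarchy are hypothetical.

```python
from collections import deque


def explode(term, narrower):
    """Return the term followed by all of its narrower terms, breadth-first."""
    result = [term]
    seen = {term}
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for nt in narrower.get(current, []):
            if nt not in seen:       # guard against repeated or cyclic entries
                seen.add(nt)
                result.append(nt)
                queue.append(nt)
    return result


# Invented NT hierarchy, for illustration only.
narrower = {
    "Information Retrieval": ["Online Searching", "Query Formulation"],
    "Online Searching": ["Database Searching"],
}

exploded = explode("Information Retrieval", narrower)
# Combine the expanded terms with OR to form a broader search statement.
query = " OR ".join(f'"{t}"' for t in exploded)
```

Exploding trades precision for recall: the searcher enters one descriptor but also retrieves records indexed only under its more specific terms.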

14 Conclusion
Information sources are growing enormously, so more effective information retrieval is needed.
Boolean operators and keyword searching are not enough on their own because of the linguistic problems that can occur.
A thesaurus copes with these problems and is a vital retrieval tool in databases.
The main problem: users' limited comprehension of thesauri.

15 Thank you for your attention!
Kristina Feldvari, Department of Information Sciences, Faculty of Philosophy in Osijek, Lorenza Jägera 9, Osijek, Croatia

