User-thesaurus interaction in a web-based database: an evaluation of users’ search term selection behaviour
Ali Asghar Shiri and Crawford Revie
Department of Computer and Information Sciences, University of Strathclyde, Glasgow, UK
Infotech Oulu International Workshop on Information Retrieval IR’2001, September, Oulu, Finland
Context of the research
- User-centred approach to the IR process
- Thesauri as sources of search terms
- Search term selection for query formulation and expansion
- Thesaurus-enhanced interfaces
Objective: To explore ways of enhancing interfaces using thesaural knowledge structures
Rationale behind the study
Most investigations to date focus on:
a) the use of thesauri as sources of search terms for professional searchers
b) mediated online search environments
c) thesauri as query expansion sources
Developments in user interfaces which provide thesaurus interaction (commercial systems as well as research-based prototypes)
Materials and methods
- Information retrieval system
- Subjects
- Search requests
- Data gathering techniques: quantitative and qualitative
The Ovid interface to the CAB thesaurus
The Ovid interface to the CAB Abstracts database
Framework for analysis
Three measures were defined: states, moves and terms
states: begin search state, mapping state, thesaurus state, combine state, result display state
move types
a) conceptual moves: browsing, term selection, etc.
b) moves associated with system features: perform search, combine searches, etc.
terms
a) user supplied
b) system suggested
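One way to picture the framework is as a small data model in which each logged move carries its name and the state it occurred in. The sketch below is only an illustration under stated assumptions: the class names, the `move_type_percentages` helper and the four-move example log are hypothetical and are not the instruments or data used in the study. Only the state names, the two move types and the example move names come from the slide.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    """The five interface states named in the framework."""
    BEGIN_SEARCH = "begin search"
    MAPPING = "mapping"
    THESAURUS = "thesaurus"
    COMBINE = "combine"
    RESULT_DISPLAY = "result display"


class MoveType(Enum):
    """The two move types named in the framework."""
    CONCEPTUAL = "conceptual"
    SYSTEM_FEATURE = "system feature"


# Move names taken from the examples on the slide; a real coding
# scheme would list more moves than these four.
MOVE_TYPES = {
    "browsing": MoveType.CONCEPTUAL,
    "term selection": MoveType.CONCEPTUAL,
    "perform search": MoveType.SYSTEM_FEATURE,
    "combine searches": MoveType.SYSTEM_FEATURE,
}


@dataclass(frozen=True)
class Move:
    name: str      # must be a key of MOVE_TYPES
    state: State   # state in which the move was made


def move_type_percentages(moves):
    """Return the percentage of moves of each type in one search."""
    if not moves:
        return {move_type: 0.0 for move_type in MoveType}
    counts = Counter(MOVE_TYPES[move.name] for move in moves)
    return {t: 100.0 * counts[t] / len(moves) for t in MoveType}


# Invented log for illustration only (not data from the study).
example_log = [
    Move("browsing", State.THESAURUS),
    Move("term selection", State.THESAURUS),
    Move("browsing", State.THESAURUS),
    Move("perform search", State.COMBINE),
]
percentages = move_type_percentages(example_log)
print(percentages)
```

Keeping the move-to-type mapping in a single dictionary means the coding scheme can be extended without touching the counting logic.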
Figure: Percentage of different move types
Figure: Number of moves of different types per search
Descriptors browsed and selected by users
Figure: Average number of initial, browsed and selected terms per search
The relationship between the number of moves made and the number of descriptors viewed and selected
Figure: Relationship between the number of moves and the number of descriptors viewed
Figure: Relationship between the number of moves and the percentage of descriptors selected
Model of user browsing/selection behaviour
Look at average values: approx. (3: 60: 7) for (initial: viewed: selected)
Taking N to be the number of initial terms, we propose a ‘(N: 20N: 2N)’ rule for (initial: viewed: selected)
However, there is a wide range of values: viewed (6 to 132); selected (1 to 22)
Better to categorise searches/behaviour, e.g.:
- few initial terms/low level of interaction;
- few initial terms/high degree of interaction;
- many initial terms/low level of interaction, etc.
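As a quick arithmetic check, the proposed rule can be applied to the average number of initial terms and compared with the reported averages. This is a minimal sketch; the function name `predicted_counts` is hypothetical, and the only figures used are the averages quoted on the slide.

```python
def predicted_counts(initial_terms: int) -> tuple:
    """Apply the proposed (N: 20N: 2N) rule of thumb.

    Returns (initial, viewed, selected) term counts for N initial terms.
    """
    if initial_terms < 0:
        raise ValueError("initial_terms must be non-negative")
    return (initial_terms, 20 * initial_terms, 2 * initial_terms)


# Approximate averages reported on the slide: (initial, viewed, selected).
observed_average = (3, 60, 7)

predicted = predicted_counts(observed_average[0])
print(predicted)  # (3, 60, 6)
```

With N = 3 the rule reproduces the average number of terms viewed exactly and gives 6 selected terms against an observed average of about 7, so it is a rounded approximation rather than an exact fit.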
Effect of thesaurus interaction on term selection
Thesaurus Interface Impact Matrix (TIIM)
Matrix axes:
- Initial term entry: few terms (2 or less) / many terms (more than 2)
- Effect of interaction: number of terms remains similar / number of terms increased by a factor of 1.5 or more
Terms viewed, by search number, in the four cells of the matrix:
- S1: 27, S2: 6, S6: 53 (average number of terms viewed = 29)
- S12: 73, S8: 97, S7: 109 (average number of terms viewed = 93)
- S3: 7, S4: 25, S9: 27 (average number of terms viewed = 20)
- S5: 74, S11: 132, S10: 80 (average number of terms viewed = 95)
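The matrix can be read as a two-way classification followed by a per-cell average. The sketch below is an illustration under assumptions: `classify` is a hypothetical helper that takes "number of terms" to mean the count of terms before and after thesaurus interaction, which the slide does not define precisely, and the neutral cell labels stand in for matrix positions, with searches grouped in threes so that each group matches one of the four reported averages.

```python
from statistics import mean


def classify(initial_terms: int, final_terms: int) -> tuple:
    """Place a search on the two TIIM axes (assumed interpretation)."""
    if initial_terms < 1:
        raise ValueError("a search needs at least one initial term")
    entry = "few" if initial_terms <= 2 else "many"
    effect = "increased" if final_terms >= 1.5 * initial_terms else "similar"
    return (entry, effect)


# Terms viewed per search, taken from the slide. The cell labels are
# neutral placeholders, not matrix positions.
terms_viewed = {
    "cell 1": {"S1": 27, "S2": 6, "S6": 53},
    "cell 2": {"S12": 73, "S8": 97, "S7": 109},
    "cell 3": {"S3": 7, "S4": 25, "S9": 27},
    "cell 4": {"S5": 74, "S11": 132, "S10": 80},
}

# Average number of terms viewed per cell, rounded to a whole number.
averages = {
    cell: round(mean(searches.values()))
    for cell, searches in terms_viewed.items()
}
print(averages)
```

The rounded averages agree with the four values shown on the slide (29, 93, 20 and 95), which supports the grouping of searches used here.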
Users’ evaluation / impressions
- thesaurus seen as a useful tool for: alternative search terms; reformulation of queries; providing alternative perspectives
- interface found to be simple to learn and use. Specific mention made of hypertext descriptors and checkboxes for browsing/choosing descriptors
- suggested improvements to interface, such as: use of Boolean operators while selecting descriptors; position of icons and buttons
Future work
Full-scale study involving more users will explore:
- patterns of user behaviour during thesaurus-based browsing and searching activity
- the relationships between the users’ initial search terms and those selected from the thesaurus
- the relationship between query types and thesaurus browsing/searching
- features which best support a thesaurus-enhanced search interface
- design implications and application of findings