Improving search efficiency for economic evaluations in major databases using semantic technology
  Julie Glanville, Carol Lefebvre, Pamela Negosanti, Bill Porter
  Oct 2010
Overview
  Why are we interested in economic evaluations?
  Can economic evaluations be identified efficiently at present?
  This research project
  Methods
  Results
  Discussion
  Next steps
Why are we interested in economic evaluations?
  Systematic reviews and technology assessments frequently consider cost-effectiveness as well as effectiveness outcomes
  This information is published in economic evaluations
    Cost-effectiveness analyses
    Cost-utility analyses
    Cost-benefit analyses
  Issues in identifying reports of economic evaluations
    Poor reporting: abstracts may contain terms which signal an economic evaluation but not an explicit term
    Economics is often mentioned in passing in abstracts, which increases the number of irrelevant records retrieved
Can economic evaluations be identified efficiently?
  In healthcare databases
    Yes and no
    Specific economic evaluation databases are available (NHS EED and HEED)
    BUT top-up (supplementary) searches in large bibliographic databases may still be needed
  Beyond healthcare
    There seem to be no economic evaluation databases
    Need to search large bibliographic databases such as ERIC and Criminal Justice Abstracts
What about search filters?
Can search filters help?
  In healthcare databases
    Many search filters to find economic evaluations in EMBASE and MEDLINE achieve high sensitivity (100%) (1)
    BUT they have poor precision (less than 4%): a very high proportion of irrelevant studies is retrieved (1)
  Beyond health
    Few filters available
    Issues of precision likely to be similar to health
  (1) Glanville J, Kaunelis D, Mensinkai S. How well do search filters perform in identifying economic evaluations in MEDLINE and EMBASE. Int J Tech Assess Hlth Care 2009;25:
This research project
  How can we improve efficiency of retrieval of economic evaluations in large bibliographic databases?
    Traditional Boolean approaches don’t seem to be helping
    Indexing isn’t very helpful at present
  Can semantic analysis software help?
    Collaboration with Expert System to explore potential for identifying economic evaluations using their Cogito software
Semantic Net
Semantic analysis
  Analysis that assigns a meaning, a sense, to a syntactic structure and consequently to a linguistic unit, according to the knowledge contained in the semantic network.
Methods
  Gold standard set of 1950 economic evaluation records (published 2000, 2003, 2006) identified from NHS EED and then downloaded from MEDLINE
  Comparator set of 4136 matching MEDLINE records for the 3 years (2000, 2003, 2006)
    Not economic evaluations
    But identified using the NHS EED filter
  Loaded into Cogito
  Divided randomly into test sets and validation sets
  Used in-built semantic analysis and also created new rules to categorise records as economic evaluations or non-economic evaluations
Testing and validation
  Test set
    975 economic evaluations
    2068 comparator records
  Validation set
    975 economic evaluations
    2068 comparator records
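The random division into test and validation sets can be sketched as below. This is only an illustration: the slides do not say how the split was performed, and the record identifiers, the `split_in_half` helper and the seeds are hypothetical.

```python
import random


def split_in_half(record_ids, seed=0):
    """Shuffle a copy of record_ids and return (test, validation) halves."""
    shuffled = list(record_ids)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    midpoint = len(shuffled) // 2
    return shuffled[:midpoint], shuffled[midpoint:]


# Hypothetical identifiers standing in for the downloaded MEDLINE records.
gold_standard = [f"GS{i}" for i in range(1950)]
comparators = [f"CMP{i}" for i in range(4136)]

gs_test, gs_validation = split_in_half(gold_standard, seed=1)
cmp_test, cmp_validation = split_in_half(comparators, seed=2)

print(len(gs_test), len(gs_validation), len(cmp_test), len(cmp_validation))
```

Splitting the gold standard and comparator sets separately keeps the same proportion of economic evaluations in each half, which matches the 975 / 2068 counts reported for both sets.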
Results
  Test set (gold standard records = 975; comparator records = 2068)
  Validation set (gold standard records = 975; comparator records = 2068)
  Number of gold standard (GS) records retrieved: 975
  Number of comparator records retrieved
  Sensitivity (number of GS retrieved / number of GS records): 100%
  Precision (number of GS retrieved / number of records retrieved): 82.77% (test set), 71.69% (validation set)
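The two measures defined on this slide can be written out as a short sketch. The function names are illustrative, and the counts in the precision example are hypothetical, since the number of comparator records retrieved is not given here.

```python
def sensitivity(gs_retrieved, gs_total):
    """Number of gold standard (GS) records retrieved / number of GS records."""
    return gs_retrieved / gs_total


def precision(gs_retrieved, total_retrieved):
    """Number of GS records retrieved / number of records retrieved."""
    return gs_retrieved / total_retrieved


# From the slide: all 975 gold standard records were retrieved.
reported_sensitivity = sensitivity(975, 975)

# Hypothetical counts, for illustration only: 80 GS records among 100 retrieved.
example_precision = precision(80, 100)

print(f"sensitivity = {reported_sensitivity:.2%}")
print(f"example precision = {example_precision:.2%}")
```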
Results, 2 (combined test and validation sets)
  Using Cogito in-built semantic rules (no filter): precision 77.23%, sensitivity 100%
  Using filter with records scoring 50: precision 78%, sensitivity 90%
  Using filter with records scoring 100: precision 80%, sensitivity 85%
  Using filter with records scoring 200: precision 81%, sensitivity 83%
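The trade-off in this table, where a higher score cut-off raises precision and lowers sensitivity, can be illustrated with a toy sketch. Everything here is an assumption: the slides do not describe how Cogito scores records, "scoring 50" is read as "score of at least 50", and the scored records are invented.

```python
def evaluate_threshold(scored_records, threshold):
    """Return (precision, sensitivity) when keeping records with score >= threshold.

    scored_records is a list of (score, is_economic_evaluation) pairs.
    """
    retrieved = [is_ee for score, is_ee in scored_records if score >= threshold]
    total_relevant = sum(1 for _, is_ee in scored_records if is_ee)
    relevant_retrieved = sum(retrieved)
    precision = relevant_retrieved / len(retrieved) if retrieved else 0.0
    sensitivity = relevant_retrieved / total_relevant if total_relevant else 0.0
    return precision, sensitivity


# Invented scores: True marks an economic evaluation, False a comparator record.
toy_records = [
    (250, True), (220, True), (180, False), (150, True), (120, True),
    (90, False), (60, True), (40, False), (30, True), (10, False),
]

for cutoff in (50, 100, 200):
    p, s = evaluate_threshold(toy_records, cutoff)
    print(f"score >= {cutoff}: precision {p:.0%}, sensitivity {s:.0%}")
```

On the toy data, each higher cut-off discards more low-scoring comparator records but also loses some economic evaluations, which is the same direction of change as in the reported figures.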
Discussion
  Cogito performs as well as Boolean searching in terms of sensitivity
  Cogito has a much improved precision score compared to performance of Boolean filters
    Over 70% (Cogito) compared to under 10% (Glanville et al)
  Cogito performs well ‘out of the box’
    Although early training efforts did not improve precision, further exploration might yield improved results
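One way to read the precision comparison is as screening workload: the reciprocal of precision gives the average number of records that must be read to find one relevant record (often called the "number needed to read"). The sketch below applies that to the precision bounds quoted in these slides; the function name is illustrative.

```python
def number_needed_to_read(precision):
    """Average number of records screened per relevant record found."""
    if not 0 < precision <= 1:
        raise ValueError("precision must be in the interval (0, 1]")
    return 1 / precision


# Precision figures quoted in the slides, expressed as proportions.
quoted_precision = {
    "Boolean filters (under 10%)": 0.10,
    "Cogito (over 70%)": 0.70,
}

for label, value in quoted_precision.items():
    print(f"{label}: about {number_needed_to_read(value):.1f} records per relevant record")
```

Because the quoted figures are bounds, the results are bounds too: more than 10 records per relevant record for the Boolean filters, and fewer than about 1.5 for Cogito.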
Next steps
  Identifying funding to carry out further exploration
  Exploring economic evaluation identification optimisation further
  Exploring the effects of importing results from a range of databases into Cogito
  Exploring whether semantic analysis has potential to achieve improvements in retrieval of other hard-to-find research where filters do not perform well, such as diagnostic test accuracy studies and quality of life research
  Exploring the potential of semantic analysis for analysing records by study design obtained from a range of databases in healthcare, social care, education and criminal justice contexts
    In-built rules are database independent
For further information
  Julie Glanville, York Health Economics Consortium
  Bill Porter at Expert System