Rapid information retrieval by creating a parallel implementation of MEDLINE
  Bob Badgett
  Dept of Medicine, UTHSC San Antonio
  1/2006
As Mark Twain reportedly put it, "Be careful about reading health books; you may die of a misprint." (Johnson T. NEJM 1998)
Only errors that led to a proximate adverse event
  Adverse events occur in 12% of discharges
The most common diagnosis in primary care is…
  Questions occur in 1/3 of visits
    –We pursue answers to 55% of our questions
    –We find answers to 70% of those (with difficulty in 40%)
    –Result: only 40% of our questions are answered (guessing in 60%)
  The “diagnosis of information failure” occurs in about 20% of patients
    –Twice as common as the most frequent single primary care diagnosis
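The percentages on this slide chain together. A minimal sketch of the arithmetic, assuming the figures can simply be multiplied as independent fractions (the slide does not state how they were derived, so this is only a plausibility check):

```python
# Figures are taken from the slide; treating them as independent
# fractions that multiply is an assumption made for illustration.
question_rate = 1 / 3      # visits in which a question occurs
pursued = 0.55             # questions for which an answer is pursued
found_if_pursued = 0.70    # pursued questions for which an answer is found

answered = pursued * found_if_pursued              # share of all questions answered
unanswered = 1 - answered                          # share left to guessing
information_failure = question_rate * unanswered   # visits with an unanswered question

print(f"answered: {answered:.1%}")
print(f"unanswered: {unanswered:.1%}")
print(f"visits with an unanswered question: {information_failure:.1%}")
```

Under that assumption the results round to roughly 40% answered, 60% unanswered, and 20% of visits ending with an unanswered question, which matches the figures quoted on the slide.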
MEDLINE searching is misery when in a hurry
  –30 minutes to search
  –50% of clinical searches by experts fail
  –Compared to librarians, clinicians find
      50% fewer relevant articles
      50% more irrelevant articles
  Doctors have two minutes available
Current search engine
  –Live searching of MEDLINE
  –Iterative searching – queries per day
  –Internationally recognized
      Review: equivalent to PubMed
  –Basis of current grant proposals
      NLM in collaboration with American College of Physicians, Thomson-MicroMedex, others
Current method
  –Externally searches MEDLINE via PubMed
  –PubMed’s publicly stated limit is one search every 8 seconds
  –We run ~6 searches per query
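The cost of that limit can be worked out directly. A minimal sketch, assuming a single client that strictly spaces its searches 8 seconds apart and always needs 6 searches per query; the function names are hypothetical and only the two constants come from the slide:

```python
import math

PUBMED_MIN_INTERVAL_S = 8   # from the slide: one search every 8 seconds
SEARCHES_PER_QUERY = 6      # from the slide: ~6 searches per user query


def min_seconds_per_query(searches: int, interval_s: float) -> float:
    """Time needed to issue `searches` requests spaced `interval_s` apart.

    The first request goes out immediately, so only the gaps between
    requests count toward the wait.
    """
    return max(searches - 1, 0) * interval_s


def max_queries_per_day(searches: int, interval_s: float) -> int:
    """Upper bound on user queries per day for one rate-limited client."""
    searches_per_day = 24 * 60 * 60 / interval_s
    return math.floor(searches_per_day / searches)


print(min_seconds_per_query(SEARCHES_PER_QUERY, PUBMED_MIN_INTERVAL_S))  # 40
print(max_queries_per_day(SEARCHES_PER_QUERY, PUBMED_MIN_INTERVAL_S))    # 1800
```

Under these assumptions a single query takes at least 40 seconds if the limit is honored, which is far from the two minutes a doctor has for the whole task once reading time is included.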
Users of proposal
  Department of Medicine, UTHSC San Antonio
    –Bob Badgett
  School of Health Information Sciences, UTHSC Houston
    –Elmer Bernstam
Knowledge management – 1. Vastness
  USPSTF 1 – 1989: 60 Topics
  USPSTF 2 – 1996: 70 Topics
  USPSTF 3 – : >80 Topics
  Prevention: 7.4 hours/day
  Rx: Increasing # of meds
Knowledge management – Vast & complex
  Articles come
    –13 million citations
    –Half a million added per year
    –MEDLINE’s doubling time is 15 years
  Articles go
    –1/3 of research eventually refuted/attenuated JAMA PMID:
    –Original studies – T1/2 = 45 years Ann Intern Med PMID:
    –Practice guidelines – T1/2 = 6 years JAMA PMID:
  Some articles never should have been
    –25 of 33 streptokinase studies may not have been needed. PMID:
  But there is more…
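To make the doubling time and half-lives concrete, here is a minimal sketch that assumes simple exponential growth and decay. That model is an assumption for illustration; the cited studies may define their half-lives differently, and the function names are hypothetical.

```python
def annual_growth_rate(doubling_time_years: float) -> float:
    """Constant yearly growth rate implied by a doubling time."""
    return 2 ** (1 / doubling_time_years) - 1


def fraction_still_valid(years: float, half_life_years: float) -> float:
    """Fraction remaining after `years`, assuming exponential decay."""
    return 0.5 ** (years / half_life_years)


growth = annual_growth_rate(15)                # MEDLINE doubling time from the slide
guidelines_10y = fraction_still_valid(10, 6)   # practice guidelines, T1/2 = 6 years
studies_10y = fraction_still_valid(10, 45)     # original studies, T1/2 = 45 years

print(f"implied growth: {growth:.1%} per year")
print(f"guidelines still valid after 10 years: {guidelines_10y:.0%}")
print(f"original studies still valid after 10 years: {studies_10y:.0%}")
```

A 15-year doubling time implies growth of a little under 5% per year, which on 13 million citations is in the same range as the half million added yearly. Under the same model, about a third of practice guidelines would still stand after ten years, against well over 80% of original studies.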
Knowledge management – Misinformation
  Manuscript reviewers prefer manuscripts they agree with
    –J Lab Clin Med PMID:
  Quality of reviews and textbooks
    –Original author misquoted in 15% of references
    –Errors in citation of references – 25% BMJ PMID:
  Biases that hinder dissemination
    –Publication bias against negative studies BMJ PMID:
  Industry-sponsored research
  Media coverage of unpublished articles
    –1/3 never published
Proposed search engine
Overall strategy
  Search ‘systematic textbook’
    –PIER (American College of Physicians)
  Depending on query
    –National Guideline Clearinghouse
    –FDA
    –CDC
    –Others
  In case nothing found (20%?)
    –Evidence is too subtle or recent
    –MEDLINE
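The tiers above can be read as a fallback cascade. The sketch below assumes the tiers are tried in order and that the search stops at the first tier with any hit; the slide does not say whether later tiers are also consulted, so that is an assumption. All function names and the keyword-based routing are hypothetical stand-ins, not real APIs of PIER, FDA, CDC, or MEDLINE.

```python
from typing import Callable, Dict, List, Tuple

# A source is any callable mapping a query string to a list of hits.
Source = Callable[[str], List[str]]


def cascade_search(
    query: str,
    primary: Tuple[str, Source],
    secondary: Dict[str, Source],
    choose: Callable[[str], List[str]],
    fallback: Tuple[str, Source],
) -> Tuple[str, List[str]]:
    """Return (source_name, hits) from the first tier that finds anything."""
    name, source = primary
    hits = source(query)
    if hits:
        return name, hits
    # Second tier: only the sources judged relevant to this query.
    for name in choose(query):
        hits = secondary[name](query)
        if hits:
            return name, hits
    # Last resort: evidence too subtle or recent for the curated sources.
    name, source = fallback
    return name, source(query)


# Toy stand-ins for the real sources (hypothetical behavior).
def pier(query: str) -> List[str]:
    return ["PIER module on hypertension"] if "hypertension" in query else []


def fda(query: str) -> List[str]:
    return ["FDA safety alert"] if "recall" in query else []


def cdc(query: str) -> List[str]:
    return ["CDC recommendation"] if "vaccine" in query else []


def medline(query: str) -> List[str]:
    return [f"MEDLINE citation matching '{query}'"]


def choose(query: str) -> List[str]:
    # Toy routing rule; real routing would need to interpret the query.
    routes = (("FDA", "drug"), ("CDC", "vaccine"))
    return [name for name, keyword in routes if keyword in query]


def answer(query: str) -> Tuple[str, List[str]]:
    return cascade_search(
        query, ("PIER", pier), {"FDA": fda, "CDC": cdc}, choose, ("MEDLINE", medline)
    )


print(answer("hypertension treatment"))
print(answer("drug recall"))
print(answer("rare new syndrome"))
```

Stopping at the first hit keeps the common case cheap, so the expensive MEDLINE search is paid for only by the minority of queries that the curated sources cannot answer.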
MEDLINE: the data
  15 million records in XML
    –Currently 52 GB
    –Growing at 6 GB per year
  Updated weekly
  Its thesaurus, MeSH, has about 23,000 descriptors and is updated yearly
  The UMLS Metathesaurus has 5 million concept names
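At this size the XML cannot comfortably be loaded whole, so a local copy would likely be built by streaming through the files one record at a time. A minimal sketch using Python's standard `xml.etree.ElementTree.iterparse`; the element names (`MedlineCitationSet`, `MedlineCitation`, `PMID`, `Article/ArticleTitle`) are assumed to follow the MEDLINE citation layout, and the sample records are invented for illustration.

```python
import io
import xml.etree.ElementTree as ET
from typing import BinaryIO, Iterator, Tuple

# Invented sample data; the element names are an assumed layout.
SAMPLE = b"""<MedlineCitationSet>
  <MedlineCitation>
    <PMID>1</PMID>
    <Article><ArticleTitle>First hypothetical title</ArticleTitle></Article>
  </MedlineCitation>
  <MedlineCitation>
    <PMID>2</PMID>
    <Article><ArticleTitle>Second hypothetical title</ArticleTitle></Article>
  </MedlineCitation>
</MedlineCitationSet>"""


def iter_citations(stream: BinaryIO) -> Iterator[Tuple[str, str]]:
    """Yield (pmid, title) pairs without building the whole tree first."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "MedlineCitation":
            pmid = elem.findtext("PMID", default="")
            title = elem.findtext("Article/ArticleTitle", default="")
            yield pmid, title
            elem.clear()  # drop the record's children once it has been read


records = list(iter_citations(io.BytesIO(SAMPLE)))
print(records)
```

In a real loader each yielded record would be written to a local index rather than collected in a list, and the weekly update files could be passed through the same function.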
MEDLINE strategy
  Original studies
  Systematic reviews
  Practice guidelines
  Other types
  3-4 iterations with increasingly restrictive limits
  12 searches per query
  Need subsecond response
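One way to read this slide is four publication categories times three limit levels, giving 12 searches per query; that reading is an assumption. The sketch below runs all 12 searches concurrently against a local index and keeps, for each category, the strictest level that still returns hits. The `search` callable, the level numbering, and the toy corpus are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

PUBLICATION_TYPES = ["original study", "systematic review", "practice guideline", "other"]
LIMIT_LEVELS = [0, 1, 2]  # hypothetical: 0 = loosest limits, 2 = most restrictive

# search(query, publication_type, limit_level) -> list of matching titles
SearchFn = Callable[[str, str, int], List[str]]


def search_all(query: str, search: SearchFn) -> Dict[str, Tuple[int, List[str]]]:
    """Run every (type, level) search at once; keep the strictest non-empty level."""
    jobs = [(ptype, level) for ptype in PUBLICATION_TYPES for level in LIMIT_LEVELS]
    with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
        results = list(pool.map(lambda job: search(query, job[0], job[1]), jobs))
    best: Dict[str, Tuple[int, List[str]]] = {}
    for (ptype, level), hits in zip(jobs, results):
        if hits and (ptype not in best or level > best[ptype][0]):
            best[ptype] = (level, hits)
    return best


# Toy corpus: (publication type, strictest level the record satisfies, title).
CORPUS = [
    ("original study", 2, "Trial A"),
    ("original study", 0, "Cohort B"),
    ("systematic review", 1, "Review C"),
]


def toy_search(query: str, ptype: str, level: int) -> List[str]:
    # Ignores the query text; a real index would match on it as well.
    return [title for t, max_level, title in CORPUS if t == ptype and max_level >= level]


best = search_all("any query", toy_search)
print(best)
```

Issuing the searches in parallel means the response time is set by the slowest single search rather than the sum of all 12, which is what makes a subsecond target plausible on a local copy and impossible under an 8-second spacing between remote searches.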