Using Patient Data to Retrieve Health Knowledge
James J. Cimino, Mark Meyer, Nam-Ju Lee, Suzanne Bakken
Columbia University
AMIA Fall Symposium, October 25, 2005
Automated Retrieval with Clinical Data
[Workflow diagram; example term: MRSA]
1. Understand Information Needs
2. Get Information From EMR
3. Resource Selection
4. Resource Terminology
5. Automated Translation
6. Querying
7. Presentation
What's Hardest about Infobuttons?
–It's not knowing the questions
–It's not integrating clinical info systems
–It's not linking to resources
–It's translating source data to target terms
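The seven numbered steps above can be read as a simple pipeline. Here is a minimal runnable sketch of that flow; every function name and toy behavior is an illustrative assumption, not the authors' actual implementation:

```python
# Minimal sketch of the seven-step pipeline, with toy stand-ins for each
# component; none of these names or behaviors come from the talk itself.

def understand_information_need(event):              # step 1
    return f"What is known about {event['finding']}?"

def get_data_from_emr(event):                        # step 2
    return event["finding"]                          # e.g. a coded result, "MRSA"

def select_resource(question):                       # step 3
    return "PubMed"                                  # fixed choice for the sketch

def get_resource_terminology(resource):              # step 4
    return {"MRSA": "Methicillin Resistance"}        # toy MeSH-like vocabulary

def translate(source_term, target_terms):            # step 5: the hard part
    return target_terms.get(source_term, source_term)

def run_query(resource, term):                       # step 6
    return [f"{resource} result about {term}"]       # stand-in for a live search

def present(results):                                # step 7
    return "\n".join(results)

event = {"finding": "MRSA"}
resource = select_resource(understand_information_need(event))
term = translate(get_data_from_emr(event), get_resource_terminology(resource))
print(present(run_query(resource, term)))
```

Note that step 5, the translation from source data to target terms, is the step the previous slide singles out as the hard part.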
Types of Source Terminologies
Uncoded (narrative):
–Radiology reports (?): "…infiltrate is seen in the left upper lobe."
Coded:
–Lab tests (6,133): AMIKACIN, PEAK LEVEL
–Sensitivity tests (476): AMI 6 MCG/ML
–Microbiology results (2,173): ESCHERECHIA COLI
–Medications (15,311): UD AMIKACIN 1 GM VIAL
Types of Target Terminologies
Narrative search:
–PubMed
–RxList
–UpToDate
–Micromedex
–Lab Tests Online
–OneLook
–National Guideline Clearinghouse
Coded resource:
–Lexicomp
–CPMC Lab Manual
Coded search:
–PubMed
Term Samples
–100 terms from radiology reports (extracted using MedLEE)
–100 medication ingredients
–100 lab test analytes
–100 microbiology results
–94 sensitivity test reagents
The Experiments
–Identify sources of patient data
–Get a random sample of terms for each source (see the sketch after this list)
–Translate terms if needed (multiple methods)
–Perform automated retrieval with the terms
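As referenced above, the sampling step itself is simple. A minimal sketch, using a fabricated list of coded lab test names (the talk does not describe its sampling code; the 6,133 count comes from the source-terminology slide):

```python
# Hedged sketch of drawing one 100-term random sample; the term list is made up.
import random

lab_test_terms = [f"LAB_TEST_{i:04d}" for i in range(6133)]  # 6,133 coded lab tests
sample = random.sample(lab_test_terms, 100)                  # one 100-term sample
print(len(sample), sample[:3])
```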
Searches Performed
Uncoded:
–Radiology terms → Narrative search: PubMed, NGC, OneLook, UpToDate
Coded:
–Medications → Narrative search: RxList, Micromedex; Coded resource: Lexicomp; Coded search: PubMed
–Lab tests → Narrative search: Lab Tests Online; Coded resource: CPMC Lab Manual; Coded search: PubMed
–Sensitivity tests → Narrative search: RxList, Micromedex
–Microbiology results → Narrative search: UpToDate, PubMed; Coded search: PubMed
(16 source–resource pairings in total)
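For the PubMed searches above, a term can be submitted programmatically. A minimal sketch using NCBI's public E-utilities `esearch` endpoint; the endpoint and parameters are standard current E-utilities usage, not details from the talk (which predates some of this tooling):

```python
# Hedged sketch: counting PubMed citations for a term via NCBI E-utilities.
# Standard, current E-utilities usage; not how the study's searches were run.
import json
from urllib.parse import urlencode
from urllib.request import urlopen

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def pubmed_hit_count(term: str) -> int:
    """Return the number of PubMed citations matching a search term."""
    params = urlencode({"db": "pubmed", "term": term, "retmode": "json"})
    with urlopen(f"{ESEARCH}?{params}") as resp:
        data = json.load(resp)
    return int(data["esearchresult"]["count"])

print(pubmed_hit_count("methicillin-resistant staphylococcus aureus"))
```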
Mapping Methods
–Microbiology results to MeSH: semi-automated
–Lab test analytes to MeSH: automated, using the UMLS
–Medications to Lexicomp: natural language processing
–Lab tests to CPMC Lab Manual: manual matching
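A minimal sketch of the "automated, using the UMLS" idea: normalize the source string, then look it up in an index built from UMLS concept names. The two-entry index here is a toy stand-in for a table built from UMLS MRCONSO, and the normalization crudely approximates the UMLS lexical tools:

```python
# Hedged sketch of automated term-to-MeSH mapping via a UMLS-derived index.
# The toy index stands in for a real MRCONSO-derived lookup table.
import re

UMLS_INDEX = {
    "amikacin": "Amikacin",
    "escherichia coli": "Escherichia coli",
}

def normalize(term: str) -> str:
    """Lowercase, strip punctuation, collapse whitespace."""
    term = re.sub(r"[^a-z0-9 ]", " ", term.lower())
    return re.sub(r"\s+", " ", term).strip()

def map_to_mesh(source_term: str):
    """Return a MeSH heading for a source term, or None if unmapped."""
    return UMLS_INDEX.get(normalize(source_term))

print(map_to_mesh("AMIKACIN, PEAK LEVEL"))  # None: extra tokens defeat the match
print(map_to_mesh("ESCHERECHIA COLI"))      # None: source misspelling defeats it
print(map_to_mesh("Escherichia coli"))      # "Escherichia coli"
```

Failures like the first two lookups illustrate why, as the next slides report, only about half of the microbiology and lab terms reached MeSH.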
Results: Multiple Documents
[Results table not reproduced.] Retrieval success is reported as the percentage of terms that retrieved any results; numbers in parentheses are the average number of results (citations, documents, topics, definitions, etc., depending on the target resource) for those searches that retrieved at least one result.
Uncoded versus Coded Searches
–1,028/2,173 (47.3%) of microbiology test terms mapped to MeSH
–940/1,041 (90.3%) of lab test analytes mapped to LOINC
–485/940 (51.6%) of LOINC analytes mapped onward to MeSH (end-to-end coverage computed below)
[Two comparison tables (columns Result Type / Number / Ratio; rows Identical, Slight Diff, Large Diff) did not survive extraction and are not reconstructed here.]
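Chaining the two lab-test mapping rates gives the end-to-end coverage referenced above; this is pure arithmetic from this slide's counts:

```python
# End-to-end coverage of lab analytes reaching MeSH, from this slide's counts.
analytes_total = 1041
to_loinc = 940          # 90.3% of analytes mapped to LOINC
to_mesh = 485           # 51.6% of those mapped onward to MeSH
print(f"{to_mesh / analytes_total:.1%}")  # ~46.6% of all analytes reach MeSH
```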
Results: Single Document
[Results table not reproduced.] Retrieval success is reported as the percentage of terms that retrieved any results; numbers in parentheses are the average number of results for those searches that retrieved at least one result.
Results: Page of Links
[Results table not reproduced.] Results for RxList and Micromedex are difficult to quantify because they return heterogeneous lists of links; rather than counting links, we assessed true positive and false negative rates (shown in brackets in the table).
Micromedex versus RxList
–194 terms total
–158 terms found by both
–22 found by Micromedex but missed by RxList
–5 found by RxList but missed by Micromedex
–9 missed by both
–Totals: Micromedex 180, RxList 163
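The two counts filled in above (158 found by both, Micromedex 180) follow arithmetically from the slide's other numbers; a quick consistency check:

```python
# Consistency check for the RxList/Micromedex overlap counts on this slide.
total, missed_by_both = 194, 9
micromedex_only, rxlist_only = 22, 5

found_by_both = total - missed_by_both - micromedex_only - rxlist_only
print(found_by_both)                     # 158
print(found_by_both + rxlist_only)       # 163, matching RxList's stated total
print(found_by_both + micromedex_only)   # 180, Micromedex's total
```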
See For Yourself!
Discussion
–7 sources, 894 terms, 11 resources, 1,592 searches
–Automated retrieval is technically possible: found something % of the time; 12/16 experiments "succeeded"
–Translation often unsuccessful
–Automated indexing works
–Usefulness of translation to MeSH is marginal
–Good quality when retrieving pages of links (Micromedex and RxList)
–Good quality with concept-indexed resources
–Recall/precision of document retrievals unknown: need to define the question; additional evaluation needed
Next Steps
–Creation of a terminology management and indexing suite
–Formal analysis of the quality of answers
Acknowledgments
This work is supported in part by NLM grants R01LM07593 and R01LM07659 and NLM training grants LM and P20NR.