NLM Medical Text Indexer (MTI) BioASQ Challenge Workshop September 27, 2013 J.G. Mork, A. Jimeno Yepes, A. R. Aronson.



Disclaimer
The views and opinions expressed do not necessarily state or reflect those of the U.S. Government, and they may not be used for advertising or product endorsement purposes.

Outline
- MTI
  - Overview
  - Description
  - Performance
- Future Work
- Questions

MTI - Overview
- Summarizes input text into an ordered list of MeSH Headings
- In use since mid-2002 (Indexers, Cataloging, HMD)
- MTI as First-Line Indexer (MTIFL) since February 2011
- Developed with continued Index Section collaboration
- Uses article Title and Abstract
- Provides recommendations for 93% of indexed articles (2012)
[Slide figure: example citation titles "The weathervane.", "Before", "The in-betweeners.", "Valete, salvete."]

MTI
- MetaMap Indexing: actually found in text
- Restrict to MeSH: maps UMLS Concepts to MeSH
- PubMed Related Citations: not necessarily found in text

Unified Medical Language System (UMLS)
- Large multilingual biomedical vocabulary database
- UMLS Metathesaurus (currently using 2012AB)
- MetaMap Indexing uses a subset:
  - Requires only a UMLS license; for use with US-based projects
  - 2,461,504 concepts with 7,685,881 entries
  - English only
  - 75 of the 168 Source Vocabularies
- Changes twice a year

MetaMap Indexing (MMI)
- Used for finding UMLS concepts actually in the text
- Better coverage versus just looking for MeSH Headings
- Provides our best indicator of MeSH Headings
- Handles spelling variants, abbreviations, and synonym identification (handles most British spellings)
  - Obstructive Sleep Apnea
  - Obstructive Sleep Apnoea
  - OSA (3-ways ambiguous)
  - Heart Attack / Myocardial Infarction

Restrict to MeSH
- Allows us to map UMLS concepts to MeSH Headings
- Updated with each UMLS release
- Extends MMI abilities by mapping nomenclature to MeSH
Example: Encephalitis Virus, California
- ET: Jamestown Canyon virus
- ET: Tahyna virus
- ET: California Group Viruses
- Inkoo virus
- Jerry Slough virus
- Keystone virus
- Melao virus
- San Angelo virus
- Serra do Navio virus
- Snowshoe hare virus
- Trivittatus virus
- Lumbo virus
- South River virus
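The mapping on this slide can be pictured as a lookup table. The following is only a minimal sketch using the names listed above: the dictionary `TERM_TO_MESH` and the function `restrict_to_mesh` are hypothetical, and keying on names rather than UMLS concept identifiers is a simplifying assumption. The real table is derived from each UMLS release and is far larger.

```python
# Hypothetical, hand-built excerpt for illustration only.
MESH_HEADING = "Encephalitis Virus, California"

TERM_TO_MESH = {
    name.lower(): MESH_HEADING
    for name in [
        "Jamestown Canyon virus",
        "Tahyna virus",
        "California Group Viruses",
        "Inkoo virus",
        "Jerry Slough virus",
        "Keystone virus",
        "Melao virus",
        "San Angelo virus",
        "Serra do Navio virus",
        "Snowshoe hare virus",
        "Trivittatus virus",
        "Lumbo virus",
        "South River virus",
    ]
}


def restrict_to_mesh(concept_name):
    """Return the MeSH Heading for a concept name, or None if it is unmapped."""
    return TERM_TO_MESH.get(concept_name.strip().lower())


print(restrict_to_mesh("Keystone virus"))
print(restrict_to_mesh("Some unmapped virus"))
```

Lower-casing the keys is a design choice of this sketch so that capitalization differences in the input do not prevent a match.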

PubMed Related Citations

PubMed Related Citations (PRC)
- Uses PubMed pre-calculated related articles
- Uses only MeSH Headings: no Check Tags, no Subheadings, no Supplementary Concepts
- Provides terms not available in title/abstract
- Used to filter and support MeSH Headings identified by MetaMap Indexing
- Can provide unrelated terms, so it is heavily filtered
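The slides do not say how PRC terms are filtered. As one plausible illustration only, the sketch below pools the MeSH Headings of related citations and keeps those that recur. The function `prc_candidates`, the `min_support` threshold, and the sample citations are all assumptions made for this example, not MTI's actual method or data.

```python
from collections import Counter


def prc_candidates(related_citations, min_support=2):
    """Pool MeSH Headings from related citations.

    Keeps headings that appear in at least `min_support` citations,
    ordered by how many citations carry them (ties broken alphabetically).
    """
    counts = Counter()
    for headings in related_citations:
        counts.update(set(headings))  # count each heading once per citation
    kept = [(h, n) for h, n in counts.items() if n >= min_support]
    return sorted(kept, key=lambda item: (-item[1], item[0]))


# Made-up related citations, each given as its list of MeSH Headings.
related = [
    ["Sleep Apnea, Obstructive", "Polysomnography"],
    ["Sleep Apnea, Obstructive", "Continuous Positive Airway Pressure"],
    ["Sleep Apnea, Obstructive", "Polysomnography", "Obesity"],
]

print(prc_candidates(related))
```

Headings that occur in only one related citation are dropped here, which mirrors the slide's point that PRC can surface unrelated terms.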

Special Handling
- Forcing Recommendations
  - New MeSH Headings (first 6 to 12 months)
    - Correct: 66.96% (2,935 / 4,383)
  - "B" (Organisms) and "D" (Chemicals and Drugs) in title
    - Correct: 69.90% (77,882 / 111,416)
  - Most MeSH Headings and Supplementary Concepts in title
    - Correct: 81.18% (377,571 / 465,128)
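Each "Correct" figure on this slide is the number of correct forced recommendations divided by the number made. A small sketch re-deriving the three percentages from the counts on the slide follows; the function name `percent_correct` and the short dictionary labels are ours.

```python
def percent_correct(correct, recommended):
    """Percentage of recommendations that were correct, to two decimals."""
    return round(100.0 * correct / recommended, 2)


# (correct, recommended) counts copied from the slide.
forced = {
    "New MeSH Headings": (2935, 4383),
    "B and D in title": (77882, 111416),
    "Headings and Supplementary Concepts in title": (377571, 465128),
}

for label, (correct, recommended) in forced.items():
    print(f"{label}: {percent_correct(correct, recommended):.2f}%")
```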

Special Handling
- Forcing Recommendations (continued)
  - Check Tag Triggers (~3, Tree Rules)
    - "fetal heart rate" -> Female and Pregnancy
    - Correct: 81.69% (885,092 / 1,083,457)
  - 496 Triggers, all from Indexer Feedback
    - "saxs" -> X-Ray Diffraction + Scattering, Small Angle
    - Correct: 65.07% (73,692 / 113,257)
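A trigger can be thought of as a phrase that, when found in the text, forces one or more headings. The sketch below uses only the two triggers shown on this slide. The `TRIGGERS` table, the function `forced_recommendations`, and the whole-phrase, case-insensitive matching are assumptions for illustration; the slides do not describe how MTI actually matches triggers.

```python
import re

# The two example triggers from the slide; the real lists are much longer.
TRIGGERS = {
    "fetal heart rate": ["Female", "Pregnancy"],
    "saxs": ["X-Ray Diffraction", "Scattering, Small Angle"],
}


def forced_recommendations(text):
    """Return headings forced by any trigger phrase found in the text."""
    lowered = text.lower()
    forced = []
    for phrase, headings in TRIGGERS.items():
        # Match the phrase only on word boundaries (assumed behaviour).
        if re.search(r"\b" + re.escape(phrase) + r"\b", lowered):
            for heading in headings:
                if heading not in forced:
                    forced.append(heading)
    return forced


print(forced_recommendations("Monitoring of fetal heart rate during labor"))
print(forced_recommendations("SAXS analysis of a protein complex"))
```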

MTI Example

MTI as First Line Indexer (MTIFL)
- 89 journals currently in MTIFL program; 327 by end of 2015
- MTI & MTIFL philosophically different
- Almost 30 rules/heuristics used
- Special Filtering using MMI & PRC against each other
  - MMI tends to provide more general terms
  - PRC tends to provide more specific terms (or unrelated terms)
- Smaller, more accurate list of terms than MTI
Heuristic #6: MMI Only Term. If both MMI & PRC recommend a more specific term, remove the term.
Heuristic #7: PRC Only Term. If MMI does not have a more general term related, remove the term.
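Heuristics #6 and #7 can be sketched as set operations, under several assumptions: "more specific" is read as "lies below the other term in the MeSH tree", each term has a single tree number, and terms recommended by both paths are always kept. The tree numbers in `TREE` are illustrative and not checked against current MeSH, and `is_more_specific` and `apply_heuristics` are hypothetical names. This is one reading of the two rules, not MTIFL's implementation.

```python
# Illustrative tree numbers; real MeSH terms can have several.
TREE = {
    "Neoplasms": "C04",
    "Breast Neoplasms": "C04.588.180",
    "Polysomnography": "E01.370.520.625",
}


def is_more_specific(term, other):
    """True if `term` sits below `other` in the (illustrative) tree."""
    return TREE[term].startswith(TREE[other] + ".")


def apply_heuristics(mmi_terms, prc_terms):
    mmi, prc = set(mmi_terms), set(prc_terms)
    both = mmi & prc
    kept = set(both)  # assumption: terms from both paths are kept
    # Heuristic #6: drop an MMI-only term when both paths
    # recommend a more specific term.
    for term in mmi - prc:
        if not any(is_more_specific(b, term) for b in both):
            kept.add(term)
    # Heuristic #7: drop a PRC-only term unless MMI has a
    # more general related term.
    for term in prc - mmi:
        if any(is_more_specific(term, m) for m in mmi):
            kept.add(term)
    return sorted(kept)


print(apply_heuristics(
    ["Neoplasms", "Breast Neoplasms"],
    ["Breast Neoplasms", "Polysomnography"],
))
```

In this example "Neoplasms" is removed by Heuristic #6 because both paths recommend the more specific "Breast Neoplasms", and "Polysomnography" is removed by Heuristic #7 because no MMI term is more general than it.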

Performance
- Focus on Precision versus Recall
- Fruition of 2011 Changes

Future Work
- Structured Abstracts
- Full Text
- Author Supplied Keywords
- Improving Subheading Attachment
- Expanding MTIFL Program
- Assisting on Gene and Chemical Identification Projects
- Recommending some Publication Types
- Species Detection and Filtering

Questions?
- MTI Team Members:
  - Alan (Lan) R. Aronson
  - James G. Mork
  - Antonio J. Jimeno Yepes

Extensible
- Same program, five levels of filtering, customized output
  - All Processing: Base Filtering
  - Indexing: High Recall Filtering
  - Cataloging: High Recall Filtering
  - History of Medicine: High Recall Filtering
  - MTIFL: Balanced Recall/Precision Filtering
  - Strict: High Precision Filtering (not currently used)
- Ability to Turn Off All Filtering (used in experiments)

Data Creation & Management System (DCMS)

Challenges
- MTI Currently Not Able to Differentiate:
  - Species-specific terms
    - BIRC3 protein, human
    - Birc3 protein, mouse
    - Birc3 protein, rat
  - Concepts where words are separated by text
    - "Lon is an oligomeric ATP-dependent protease" in text should recommend Lon Protease (ET for Protease La)

Performance
- Current YTD (November 2012 to August 2013)
- Percentage Right (Precision)

             MTI                     MTIFL
Citations    539,157                 6,846
MMI Only     69.18% / 1,313,[...]    [...]% / 11,536
PRC Only     42.98% / 509,[...]      [...]% / 3,839
MMI+PRC      54.93% / 1,837,[...]    [...]% / 30,075
Overall      56.93%                  73.78%