Semantic indexing in PubMed. CERN Workshop on Innovations in Scholarly Communication (OAI8)

Presentation transcript:

Semantic indexing in PubMed
CERN Workshop on Innovations in Scholarly Communication (OAI8)
Geneva, Switzerland, June 20, 2013
Olivier Bodenreider
Lister Hill National Center for Biomedical Communications
Bethesda, Maryland - USA

Orientation
- NLM is the world's largest biomedical library
  - Located in Bethesda, Maryland, near Washington, DC
- PubMed provides access to MEDLINE, NLM’s bibliographic database of over 20M citations
  - MEDLINE covers 5600 journals and adds almost 1M new citations each year
  - PubMed is part of the Entrez system of the National Center for Biotechnology Information (NCBI)

Outline
- Anatomy of a MEDLINE citation
- Types of PubMed searches
  - Simple text search
  - Search based on MeSH indexing
- Automatic indexing
- Beyond topics

Anatomy of a MEDLINE citation
- Title
- Abstract
- Indexing

MeSH main heading[/subheading(s)] [+ * for major topic]
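The indexing pattern above can be illustrated with a small parser. This is a minimal sketch, not NLM code: the `IndexingTerm` class and `parse_indexing_term` function are hypothetical names, and the sketch assumes that headings contain no "/" and that an asterisk anywhere in the string marks the term as a major topic.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IndexingTerm:
    """One MeSH indexing term: heading, optional subheadings, major-topic flag."""
    heading: str
    subheadings: List[str] = field(default_factory=list)
    major: bool = False


def parse_indexing_term(text: str) -> IndexingTerm:
    """Parse 'Heading/subheading/...' with an optional '*' marking a major topic."""
    major = "*" in text  # assumption: any asterisk flags the whole term as major
    parts = [part.strip() for part in text.replace("*", "").split("/")]
    return IndexingTerm(heading=parts[0], subheadings=parts[1:], major=major)


term = parse_indexing_term("*Mood Disorders/chemically induced")
print(term.heading, term.subheadings, term.major)
```

The two sample strings used here and in later slides ("Ciprofloxacin/adverse effects", "Mood Disorders/chemically induced") both fit this pattern.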

Types of PubMed searches

Non-semantic search
- PubMed does not require the use of MeSH for querying
  - Supports “Google-like” text searches
    - “no librarian required”
  - But can identify MeSH terms even if they are not labeled as such

Non-semantic search: Example
- Find articles about the cheese Gruyère
  - Gruyère

MeSH (semantic) search
- Medical Subject Headings (MeSH)
  - Controlled vocabulary developed at NLM for indexing and retrieval of MEDLINE citations
  - ~26,000 descriptors (main headings)
  - <100 qualifiers (subheadings)
  - 214,000 supplementary concept records
- Hierarchical structure (“tree numbers”)
  - Supports query expansion (“explosion”)
    - Search for a descriptor or any of its descendants

Simple MeSH search: Example
- Find articles about drug-induced psychoses
  - "Psychoses, Substance-Induced"[Mesh]
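A query like this one can also be submitted programmatically. The sketch below only builds the request URL and sends nothing. It assumes the NCBI E-utilities `esearch` endpoint, which the slides do not mention; `esearch_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Assumption: the public NCBI E-utilities ESearch endpoint (not named in the slides).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(term: str, retmax: int = 20) -> str:
    """Build (but do not send) an ESearch request URL for a PubMed query."""
    params = {"db": "pubmed", "term": term, "retmax": retmax, "retmode": "json"}
    return ESEARCH + "?" + urlencode(params)


url = esearch_url('"Psychoses, Substance-Induced"[Mesh]')
print(url)
```

The quotes and brackets of the query are percent-encoded by `urlencode`, so the MeSH tag survives the trip to the server unchanged.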

Search with “Explosion”
- By default, PubMed retrieves articles indexed with a descriptor or any of its descendants
  - Use the [mesh:noexp] tag to prevent “explosion” from happening

“Explosion”: Example
- Find articles about fluoroquinolones (or their descendants)
  - "fluoroquinolones"[Mesh]
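Explosion follows from the tree numbers: a descendant's tree number extends its ancestor's. The following is a minimal sketch under that assumption. The `TOY_TREE` table and its tree numbers are invented for illustration and are not real MeSH tree numbers; `explode` is a hypothetical function name.

```python
# Toy hierarchy: descriptor -> list of tree numbers (invented, NOT real MeSH values).
TOY_TREE = {
    "Quinolones": ["X01.100"],
    "Fluoroquinolones": ["X01.100.200"],
    "Ciprofloxacin": ["X01.100.200.300"],
    "Ofloxacin": ["X01.100.200.400"],
    "Penicillins": ["X01.500"],
}


def explode(descriptor, tree=TOY_TREE):
    """Return the descriptor plus every descriptor located below it in the tree."""
    roots = tree[descriptor]
    matches = set()
    for name, numbers in tree.items():
        for number in numbers:
            # A descendant's tree number is the ancestor's number plus ".xxx" parts.
            if any(number == r or number.startswith(r + ".") for r in roots):
                matches.add(name)
    return sorted(matches)


print(explode("Fluoroquinolones"))
```

In this toy table, a search on "Fluoroquinolones" also covers "Ciprofloxacin" and "Ofloxacin", while a non-exploded search would match only the descriptor itself.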

Search leveraging synonymy in MeSH
- MeSH descriptors include related concepts (Entry terms)
  - Synonyms
  - Closely related concepts (clustered for indexing and retrieval purposes)
- All terms from a descriptor and its entry terms are used for retrieval in PubMed


Entry terms for “Addison Disease”

Search leveraging UMLS Synonymy
- Unified Medical Language System (UMLS)
  - Terminology integration system
  - ~130 biomedical terminologies
  - Synonymous terms clustered into concepts
- UMLS synonymy used in PubMed
  - Query translation happens “behind the scenes”
  - E.g., search on “primary adrenocortical insufficiency”
    - Retrieves articles about “Addison’s disease”
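The idea of translating a query through a synonym cluster can be sketched as a two-step lookup: term to concept, then concept to MeSH descriptor. This is a toy illustration only and is much simpler than PubMed's actual query translation. The tables, the concept identifier `C_ADDISON`, and the `translate` function are all hypothetical.

```python
# Toy synonym cluster (hypothetical identifiers, not real UMLS CUIs).
SYNONYM_TO_CONCEPT = {
    "addison's disease": "C_ADDISON",
    "addison disease": "C_ADDISON",
    "primary adrenocortical insufficiency": "C_ADDISON",
}
CONCEPT_TO_MESH = {"C_ADDISON": "Addison Disease"}


def translate(query: str) -> str:
    """Expand a free-text query with the MeSH descriptor of its concept, if known."""
    concept = SYNONYM_TO_CONCEPT.get(query.strip().lower())
    if concept is None:
        return f"{query}[All Fields]"  # no known synonym: plain text search
    heading = CONCEPT_TO_MESH[concept]
    return f'"{heading}"[MeSH Terms] OR {query}[All Fields]'


print(translate("primary adrenocortical insufficiency"))
```

A term with no entry in the table, such as "heart attack" here, falls back to a plain text search.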


No entry term for “Heart attack”

Query translation

Subheading restrictions
- Subheadings represent the context of use of a particular descriptor
  - Ciprofloxacin/Adverse effects
  - Mood Disorders/Chemically induced
- Assigned during indexing
- Can be queried in PubMed

Subheading restrictions: Example
- Find articles about drugs involved in adverse events
  - "Chemicals and Drugs Category"/adverse effects[MeSH]

Recapitulative example
- Find articles about drugs involved in adverse events and drug-induced manifestations
  - (("Chemicals and Drugs Category"[Mesh]) AND (adverse effects[sh] OR contraindications[sh] OR mortality[sh])) AND (chemically induced[sh] OR (("Drug-Induced Liver Injury"[Mesh:noexp]) OR ("Drug Eruptions"[Mesh:noexp]) OR ("Epidermal Necrolysis, Toxic"[Mesh]) OR ("Drug-Induced Liver Injury, Chronic"[Mesh]) OR ("Erythema Nodosum"[Mesh]) OR ("Serotonin Syndrome"[Mesh]) OR ("Hand-Foot Syndrome"[Mesh]) OR ("Neuroleptic Malignant Syndrome"[Mesh]) OR ("MPTP Poisoning"[Mesh]) OR ("Dyskinesia, Drug-Induced"[Mesh]) OR ("Neurotoxicity Syndromes"[Mesh:noexp]) OR ("Psychoses, Substance-Induced"[Mesh]) OR ("Akathisia, Drug-Induced"[Mesh]))) AND (medline[sb])
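A query of this size is easier to maintain when assembled from parts. The sketch below rebuilds a shortened version of it with string helpers. The helper names (`mesh`, `sh`, `any_of`, `all_of`) are hypothetical, and only three of the manifestation descriptors are included to keep the example short.

```python
def mesh(term: str, explode: bool = True) -> str:
    """Quote a descriptor and tag it, with or without explosion."""
    tag = "Mesh" if explode else "Mesh:noexp"
    return f'"{term}"[{tag}]'


def sh(name: str) -> str:
    """Tag a subheading."""
    return f"{name}[sh]"


def any_of(*clauses: str) -> str:
    return "(" + " OR ".join(clauses) + ")"


def all_of(*clauses: str) -> str:
    return "(" + " AND ".join(clauses) + ")"


drugs = mesh("Chemicals and Drugs Category")
adverse = any_of(sh("adverse effects"), sh("contraindications"), sh("mortality"))
manifestations = any_of(
    sh("chemically induced"),
    mesh("Drug-Induced Liver Injury", explode=False),
    mesh("Psychoses, Substance-Induced"),
)
query = all_of(drugs, adverse, manifestations, "medline[sb]")
print(query)
```

Grouping each OR list in its own parentheses before joining with AND keeps the Boolean structure explicit regardless of how the search engine orders operators.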

Automatic indexing

Automatic indexing: Motivation
- Indexing by humans is costly and has limited reproducibility
- Natural language processing can effectively support named entity recognition
- Automatic indexing can produce
  - Suggestions for human indexers
  - Final indexing for some journals

Automatic indexing: Principles
- Hybrid approach
  - Concepts extracted from title and abstract
    - Mapped from UMLS to MeSH
  - MeSH descriptors extracted from related citations
- Post-processing
  - Clustering and ranking
  - Integrate indexing rules
    - E.g., “rule of 3”: index with a higher-level descriptor rather than with 3 or more lower-level descriptors
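The “rule of 3” can be sketched as a post-processing pass over candidate descriptors. This is a simplified illustration, not the Medical Text Indexer's implementation: it assumes each descriptor has a single parent, supplied in a toy `parent_of` table, and `apply_rule_of_three` is a hypothetical function name.

```python
from collections import defaultdict


def apply_rule_of_three(candidates, parent_of, threshold=3):
    """Replace `threshold` or more siblings by their shared parent descriptor."""
    by_parent = defaultdict(list)
    for descriptor in candidates:
        parent = parent_of.get(descriptor)
        if parent is not None:
            by_parent[parent].append(descriptor)

    # Descriptors whose parent has enough candidate children get collapsed.
    replaced = {d for kids in by_parent.values() if len(kids) >= threshold for d in kids}

    result = []
    for descriptor in candidates:
        chosen = parent_of[descriptor] if descriptor in replaced else descriptor
        if chosen not in result:  # keep first-seen order, drop duplicates
            result.append(chosen)
    return result


# Toy single-parent table for illustration.
parent_of = {
    "Ciprofloxacin": "Fluoroquinolones",
    "Ofloxacin": "Fluoroquinolones",
    "Norfloxacin": "Fluoroquinolones",
    "Addison Disease": "Adrenal Insufficiency",
}
candidates = ["Ciprofloxacin", "Addison Disease", "Ofloxacin", "Norfloxacin"]
print(apply_rule_of_three(candidates, parent_of))
```

With only two sibling candidates the list is returned unchanged, since the rule applies from three lower-level descriptors upward.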

Automatic indexing: Workflow
- Medical Text Indexer

Automatic indexing: Applications
- MEDLINE indexing
  - Support MEDLINE indexing at NLM
    - 3600 new citations processed every weeknight
    - Suggestions displayed in the indexing environment
  - “First-line” indexing
    - For 75 journals
    - MTI recommendations serve as the first-line indexing
    - Then simply reviewed by a senior indexer
- Cataloging and History of Medicine
  - Assisted indexing

Beyond topics

Beyond concepts… relations
- Also known as
  - Facts
  - Predications
  - Nano-publications
  - …
- Relation extraction
  - Usually based on natural language processing (NLP)
    - E.g., SemRep
  - Relations stored in (subject, predicate, object) form
    - With provenance information
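The (subject, predicate, object) form with provenance maps naturally onto a small record type. The sketch below is illustrative only: the `Predication` class and `facts_about` function are hypothetical names, and the sample records, including their PMIDs and sentences, are placeholders rather than real extracted data.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Predication:
    """One extracted relation plus where it came from."""
    subject: str
    predicate: str
    obj: str
    pmid: str      # provenance: source citation (placeholder values below)
    sentence: str  # provenance: source sentence (placeholder values below)


def facts_about(predications: List[Predication], concept: str) -> List[Predication]:
    """Return the predications in which the concept is subject or object."""
    return [p for p in predications if concept in (p.subject, p.obj)]


# Placeholder records for illustration, not real SemRep output.
store = [
    Predication("Levodopa", "TREATS", "Parkinson Disease", "PMID-PLACEHOLDER-1", "(placeholder)"),
    Predication("Ciprofloxacin", "CAUSES", "Drug Eruptions", "PMID-PLACEHOLDER-2", "(placeholder)"),
]
for fact in facts_about(store, "Parkinson Disease"):
    print(fact.subject, fact.predicate, fact.obj, fact.pmid)
```

Keeping the PMID and sentence on every record is what lets a summary or an answer point back to the abstract that supports it.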

Experimental application: Semantic MEDLINE
- Multi-document summarization
- Based on a database of 60M predications extracted from MEDLINE
- Entities normalized to the UMLS Metathesaurus
- Relations aligned with the UMLS Semantic Network
- Interfaced with PubMed (for retrieving PMIDs) on a given topic
  - Forms the basis for summarization


Relation extraction: Applications
- Enhanced information retrieval
  - Indexing on relations in addition to concepts or main heading/subheading associations
- Multi-document summarization
  - E.g., extract and visualize the facts from 250 recent abstracts on the treatment of Parkinson’s disease
- Question answering
  - Clinical and biological questions
- Knowledge discovery
  - Connect facts from heterogeneous resources

Medical Ontology Research Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland - USA