Information Retrieval and its Application in Biomedicine Hong Yu 1,2, PhD Susan McRoy 1, PhD 1 Department of Computer Science 2 Department of Health Sciences.

Information Retrieval and its Application in Biomedicine
Hong Yu 1,2, PhD
Susan McRoy 1, PhD
1 Department of Computer Science
2 Department of Health Sciences
University of Wisconsin-Milwaukee
Sept 4
Introduction

What is Information Retrieval?
The field concerned with the acquisition, organization, and searching of knowledge-based information. (Hersh, 2003)

Speed Up Communication

Information
World Wide Web
Company documentation
Drug descriptions
Medical records
Books
Everything that is text, image, video, or sound and that can be represented digitally

Information in Biomedicine
Literature (over 17 million publications)
WWW
Electronic medical records
Genomics data
–DNA sequences, etc.
Knowledge representation
–Gene Ontology
Company databases
–Micromedex drug database

IR in Biomedicine
Index Medicus (Billings 1879)
MEDLARS (NLM 1966)
SAPHIRE (Hersh 1990)
PubMed (NLM 1996)
Arrowsmith (Smalheiser 1998)
BioText (Hearst 2003)
BioMedQA (Yu 2006)

Electronic and Open Publishing
The Internet and the Web have had a profound impact on the publishing of knowledge-based information
Most of the literature can be made available electronically
Open access
–The Bethesda Statement on Open Access Publishing (April 11, 2003)
–The Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities ( berlin/berlindeclaration.html) (2003)
–PubMed Central (NLM 2004)

Quality of Information
A lack of quality control
–Anyone can publish online
–A wealth of studies have concluded that healthcare information on the Web is of poor quality
Readability
–Hard to read

Information Needs and Seeking
Unrecognized needs
–Clinicians unaware of information needs or knowledge deficit
Recognized needs
–Clinicians aware of needs but may or may not pursue them
Pursued needs
–Information seeking occurs but may or may not be successful
Satisfied needs
–Information seeking successful

Evidence-Based Medicine

What You Will Learn
IR algorithms
–Indexing
–Query and Retrieval
–Evaluation
–Text Classification
–XML retrieval
–Web retrieval
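
As a rough illustration of the first two topics (indexing, then query and retrieval), here is a minimal sketch of an inverted index with Boolean AND retrieval. It is not taken from the course materials: the function names (`tokenize`, `build_index`, `search`) and the toy document collection are hypothetical, and the tokenizer is deliberately simplistic.

```python
from collections import defaultdict
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Indexing: map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Retrieval: return sorted ids of documents containing every query term (Boolean AND)."""
    terms = tokenize(query)
    if not terms:
        return []
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return sorted(result)


# Hypothetical toy collection, used only for illustration.
docs = {
    1: "Gene expression in breast cancer",
    2: "Breast cancer screening guidelines",
    3: "Gene Ontology annotation tools",
}
index = build_index(docs)
print(search(index, "breast cancer"))
print(search(index, "gene"))
```

Real systems add ranking, stemming, and stop-word handling on top of this structure; the sketch only shows how an index turns a query into a set intersection.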

What You Will Learn (Cont.)
Open-source IR tools
–What open-source IR tools are available
  Indexing/retrieval
  Part-of-speech and syntactic parsing
  Semantic parsing
  Discourse relations
  Machine-learning classifiers
How to use the tools?

What You Will Learn (Cont.)
State-of-the-art IR systems
–Baruch 1965 [BLIMP]
–SAPHIRE (Hersh 1990)
  Retrieval
–MedLEE (Friedman 1994)
  Extraction
–PubMed (NLM 1997)
–Arrowsmith Systems (Smalheiser 1998)
  Hidden Relation Discovery Tool
–GENIES (Friedman 2001)
  Extraction

BioNLP Systems
BioText (Hearst)
–Retrieval+Categorization
GeneWays (Rzhetsky)
–Extraction+Visualization
TextPresso (Muller)
–Retrieval+Extraction
iHOP (Hoffman and Valencia net.org/UniPub/iHOP/)
–Retrieval
BioMedQA (Yu)
–Question Answering

Advanced NLP applications

Beyond text: Image and Video
Image classification
–Finding concepts in captions and annotations
–Machine learning on textual & visual features
–Determining salient features in text and image separately and merging the results
Extracting text from image
–Understanding and correcting OCR (handwriting, equations)
–Finding text in images
Finding document text related to illustrations
Video retrieval

Beyond Extraction: Experimental Tools

Resources
Annotated collections (GENIA, Medstract, Yapex …)
Ontologies, tools, knowledge bases …
Publications, Conferences, Evaluations …
Centres and web portals

What We Provide
Textbook
–Christopher D. Manning, Prabhakar Raghavan and Hinrich Schutze. Introduction to Information Retrieval. Cambridge University Press, 2007 ( retrieval-book.html)
Office hour:
–Tuesdays, 3-4 pm, EMS 710, and by appointment
–Hong Yu,
–Susan McRoy,

What We Expect
Undergraduate:
–30% Homework, 35% Midterm exam, 35% Final exam or project
Graduate:
–20% Midterm exam, 40% Homework, 40% Project
The project may be done individually or in a team of 2-3 people. The final project will include a software system, a 2-3 page written project report, and an oral presentation. The report should describe the problem, the approach, and the evaluation, and should cite related work where appropriate.