1
Biomedical Text Mining and Its Applications
By: Raul Rodriguez-Esteban
Presented By: Ankita Tanwar
2
INTRODUCTION & MOTIVATION
Tutorial for biologists and computational biologists
Main motivation: to spread awareness
Introduces the term BioNLP
Introduces the main concepts in text mining
Lists multiple tools such as Whatizit, GoPubMed, GoGene, …
3
TEXT MINING: MAIN CONCEPTS
Terms and term recognition
Tools: Whatizit, Abner, GoPubMed/GoGene, …
Relationships between terms
Tools: MedGene, BioGene, Endeavour, G2G, …
Discovering relationships
Tool: Arrowsmith
Measure of output quality
F-measure, i.e., the harmonic mean of precision and recall
Comprehensive text mining
Sources of information: Medline and beyond
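The F-measure listed among the concepts above can be sketched in a few lines of Python. This is only an illustration: the function name `f_measure` and the sample counts are assumptions, not taken from the slides.

```python
def f_measure(tp: int, fp: int, fn: int) -> float:
    """Balanced F-measure: harmonic mean of precision and recall."""
    if tp == 0:
        # No correct hits: treat the score as zero instead of dividing by zero.
        return 0.0
    precision = tp / (tp + fp)  # share of recognized terms that are correct
    recall = tp / (tp + fn)     # share of true terms that were recognized
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for a term recognizer:
# 8 terms found correctly, 2 spurious hits, 4 terms missed.
print(f_measure(tp=8, fp=2, fn=4))
```

Because it is a harmonic mean, the score stays low when either precision or recall is low, which is why it is a common single-number summary of output quality.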
4
EXAMPLES OF TEXT RECOGNITION
5
Focusing on Tool GoPubMed
By: Ralph Delfs, Andreas Doms, Alexander Kozlenkov, and Michael Schroeder
6
INTRODUCTION
Allows users to find the information they need through the use of biomedical background knowledge.
It doesn't rank, the user does!
It retrieves PubMed abstracts for the user’s search query and sorts the relevant information into four top-level categories:
What
Who
Where
When
7
MOTIVATION
The biomedical literature grows at a tremendous rate and PubMed comprises over abstracts
Approaches such as protein interactions, pathways, and microarray data aim to improve literature search
But these approaches do not mimic human information foraging
8
CONTRIBUTIONS
Introduction and realization of ontology-based literature search
Derived a term-extraction algorithm
Derived an induced ontology from the extracted terms
9
Structure of GENE ONTOLOGY
10
GoPubMed: Main Idea
The main idea is to use GeneOntology to search and browse PubMed
Problems to be solved:
How to extract GeneOntology terms from PubMed abstracts
How to construct the relevant sub-ontology of GO
11
GoPubMed: Term Extraction
Use of regular expressions:
\w matches a word character
\s matches a whitespace character
the dot . matches any single character
To repeatedly match an expression there are three operators:
? requires the preceding pattern to appear once or not at all
+ requires it to appear at least once
* requires it to appear any number of times (including 0)
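The operators described on this slide can be tried out with Python's standard `re` module. The helper `matches` and the sample strings below are illustrative assumptions and are not part of the presentation.

```python
import re


def matches(pattern: str, text: str) -> bool:
    """Return True when the whole of `text` matches `pattern`."""
    return re.fullmatch(pattern, text) is not None


examples = [
    (r"\w\s\w", "a b"),        # word character, whitespace, word character
    (r"k.nase", "kinase"),     # . stands for any single character
    (r"kinases?", "kinase"),   # ? : the trailing 's' may be absent
    (r"kinases?", "kinases"),  # ? : or present exactly once
    (r"\w+", "kinase"),        # + : one or more word characters
    (r"\w*", ""),              # * : zero repetitions are allowed
]

for pattern, text in examples:
    print(pattern, repr(text), matches(pattern, text))
```

Note that `\w+` does not match the empty string, while `\w*` does; that is the only difference between the two operators.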
12
Term Extraction: Example
Keyword searched: cAMP-dependent kinase
Seed term: kinase activity
Seed child: cAMP-dependent protein kinase activity
Method to search for such a pattern: kinase \w+ cAMP-dependent .* kinase activity
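The example above can be checked with a short Python sketch. It rests on an assumption: the pattern line on the slide is read here as two separate expressions, `kinase \w+` and `cAMP-dependent .* kinase activity`. The slide does not say how the original tool applies them, so the variable names and the matching procedure are hypothetical.

```python
import re

# Assumed reading of the slide: two separate patterns.
seed_pattern = re.compile(r"kinase \w+")
child_pattern = re.compile(r"cAMP-dependent .* kinase activity")

seed_term = "kinase activity"
child_term = "cAMP-dependent protein kinase activity"

# fullmatch requires the entire term to fit the pattern.
seed_hit = seed_pattern.fullmatch(seed_term) is not None
child_hit = child_pattern.fullmatch(child_term) is not None

print("seed term matches:", seed_hit)
print("child term matches:", child_hit)
```

Under this reading, `\w+` stands for the word following "kinase" in the seed term, and `.*` stands for the words that the child term inserts between "cAMP-dependent" and "kinase activity".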
13
GoPubMed: Induced Ontology
Used to avoid unnecessary parts of the ontology that are not relevant to the given abstracts.
Given an ontology 𝑂 and a set of terms 𝑇′′ extracted from abstracts, construct a minimal sub-ontology of 𝑂.
Find all the intermediate terms on the paths from the terms in 𝑇′′ to the root.
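The construction described above can be sketched as a traversal from the extracted terms up to the root. This is a minimal illustration, not the authors' implementation: it assumes the ontology is given as a mapping from each term to its parent terms, and the toy hierarchy is made up for the example and is not the real Gene Ontology.

```python
from typing import Dict, List, Set


def induced_ontology(parents: Dict[str, List[str]], terms: Set[str]) -> Set[str]:
    """Collect the given terms plus every ancestor on a path to the root."""
    keep: Set[str] = set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term in keep:
            continue  # already visited via another path
        keep.add(term)
        stack.extend(parents.get(term, []))
    return keep


# Toy fragment (hypothetical): child term -> list of parent terms.
toy_parents = {
    "molecular_function": [],
    "catalytic activity": ["molecular_function"],
    "kinase activity": ["catalytic activity"],
    "protein kinase activity": ["kinase activity"],
    "binding": ["molecular_function"],
    "protein binding": ["binding"],
}

sub = induced_ontology(toy_parents, {"protein kinase activity"})
print(sorted(sub))
```

Branches that no extracted term leads to, such as "binding" in the toy data, are left out, which is what keeps the displayed sub-ontology small.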
14
Screenshot of Initial Prototype
15
Screenshot of Current Application
16
THANK YOU!!