Textpresso Application and Extensibility Eimear Kenny GMOD Meeting, April 2004.

Textpresso Advances

Application:
  Advanced literature search tool for curators
  Semi-automated curation tasks
  Automated curation tasks

Extensibility:
  Implementation of Textpresso for yeast literature

                                     ABSTRACT                                    FULL TEXT
Datatype         Human  Search term  True hits  Total hits  Recall  Precision    True hits  Total hits  Recall  Precision
Expression data  327    express*     […]        […]         […]%    55.5%        […]        […]         […]%    36.3%
Mapping data     36     map*         [0510%]                                     […]        […]         […]%    6.4%
RNAi data        220    rnai         […]        […]         […]%    71.4%        […]        […]         […]%    59.5%
Transgenes       95     transgenes*  8          23          8.4%    34.8%        […]        […]         […]%    21.7%
TOTAL            […]                 […]        […]         […]%    52%          637        2,117       94%     30.1%
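The recall and precision columns can be reproduced from the hit counts. The sketch below assumes the usual definitions (recall = true hits / human-curated instances, precision = true hits / total hits); the slide does not spell these out, and the function name `recall_precision` is hypothetical. It is checked against the Transgenes abstract figures, read here as 8 true hits out of 23 total hits against 95 human-curated instances.

```python
def recall_precision(true_hits, total_hits, human_curated):
    """Return (recall, precision) as percentages rounded to one decimal.

    Assumed definitions (not stated on the slide):
      recall    = true hits / instances found by the human curator
      precision = true hits / total hits returned by the search
    """
    if total_hits <= 0 or human_curated <= 0:
        raise ValueError("total_hits and human_curated must be positive")
    recall = 100.0 * true_hits / human_curated
    precision = 100.0 * true_hits / total_hits
    return round(recall, 1), round(precision, 1)


# Transgenes row, abstract search: 8 true hits, 23 total hits, 95 human-curated.
print(recall_precision(8, 23, 95))  # → (8.4, 34.8)
```

Under these assumed definitions the output matches the 8.4% recall and 34.8% precision shown for that row.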

Textpresso Ontology

Category groups: Relationships, Semantic, Biological Concepts

Categories: Gene, Transgene, Allele, Cell or Cell Group, Cellular Component, Nucleic Acid, Organism, Entity Feature, Life Stage, Phenotype, Strain, Sex, Clone, Molecular Function, Mutant, Drugs and Small Molecules, Association, Consort, Effect, Purpose, Pathway, Regulation, Comparison, Spatial Relation, Time Relation, Involvement, Characterization, Method, Biological Process, Action, Bracket, Determiner, Conjunction, Conjecture, Negation, Preposition, Pronoun, Punctuation

Example terms: “anti-rabbit IgG polyclonal antibody”, “eat-4”, “necessary for”, “Nomarski”, “epistasis”, “co-expressed with”, “homologue of”, “not”, “ZK512.6”

….. activation of let-7 RNA expression downregulates LIN-41 to relieve inhibition of lin-29.

Categories marked in the sentence: Biological Process, Regulation, Gene, Molecular Function, Biological Process

© Textpresso, 2004
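As a rough illustration of how terms in such a sentence could be labelled with ontology categories, here is a minimal lexicon-lookup sketch. The `LEXICON` contents and the function `tag_sentence` are hypothetical; which word carries which category on the slide is not recoverable from the transcript, so the assignments below are assumptions, and this is not Textpresso's actual markup code.

```python
import re

# Hypothetical mini-lexicon: term -> ontology category.
# The term-to-category assignments are illustrative assumptions.
LEXICON = {
    "let-7": "Gene",
    "lin-41": "Gene",
    "lin-29": "Gene",
    "downregulates": "Regulation",
    "expression": "Biological Process",
}


def tag_sentence(sentence):
    """Return (token, category) pairs for every token found in LEXICON."""
    tokens = re.findall(r"[A-Za-z0-9-]+", sentence)
    return [(tok, LEXICON[tok.lower()]) for tok in tokens if tok.lower() in LEXICON]


sentence = ("activation of let-7 RNA expression downregulates LIN-41 "
            "to relieve inhibition of lin-29.")
for token, category in tag_sentence(sentence):
    print(f"{token}\t{category}")
```

A case-insensitive lookup is used here only for brevity; it ignores the worm convention that upper-case names such as LIN-41 denote the protein rather than the gene.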

Using Textpresso to expedite curation

Find sentences from the literature that describe genetic interaction!

>= 2 named “Gene” && (>= 1 “Association” || >= 1 “Regulation”)
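The query condition above can be sketched as a simple predicate over the category labels found in one sentence. The function name `describes_genetic_interaction` and the input shape (a list with one label per tagged term) are assumptions made for illustration; each "Gene" label is assumed to stand for a named gene.

```python
from collections import Counter


def describes_genetic_interaction(categories):
    """Apply: >= 2 "Gene" && (>= 1 "Association" || >= 1 "Regulation").

    `categories` holds one ontology label per tagged term in a sentence
    (hypothetical input format).
    """
    counts = Counter(categories)
    return counts["Gene"] >= 2 and (
        counts["Association"] >= 1 or counts["Regulation"] >= 1
    )


# Two genes plus a regulation term: the sentence is kept.
print(describes_genetic_interaction(["Gene", "Regulation", "Gene"]))  # → True
# Only one gene: the sentence is dropped.
print(describes_genetic_interaction(["Gene", "Association"]))  # → False
```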

Interaction Type              A            B             C
Genetic Interactions          1 (0.5%)     13 (6.5%)     39 (19.5%)
Possible Genetic Interaction  3 (1.5%)     6 (3%)        14 (7%)
Non-genetic Interactions      4 (2%)       6 (3%)        12 (6%)
No Interaction                192 (96%)    175 (87.5%)   135 (67.5%)

100 sentences per hour!

1,986 articles → 17,851 sentences

31.4% Interaction Information
68.6% NO Interaction Information

1,224 Regulation (6.5%)
127 Physical Interaction (0.7%)
1,825 Possible Interaction (9.8%)
3,702 Genetic Interaction (19.8%)

Did you know?

“The Molecular Database Collection” (NAR, 2002, 2003, 2004)

Chart labels: MODs; Disease/Expr/Mut/Other; Seqn/Str

Textpresso goes to Stanford ……

Rob Nash, Stan Dong, Eimear Kenny, Rama Balakrishnan, Christopher Lane, Eurie Hong, Mike Cherry

Implementing Textpresso for Yeast

                 Worm Build                         Yeast Build
Corpus           >6,000 papers (~4,000 full text)   >60,000 journal articles (~15,000 full text)
Build time       1 week                             >2 weeks
Add papers       ~24 h                              ~3 d
Change ontology  rebuild                            rebuild
Database size    8G                                 30G?
Platform         Linux                              Solaris

Adapting Textpresso Ontology for Yeast

Worm biology → Yeast biology

Life Stage, Cell Cycle, Life Cycle, Cell Name or Group, Sex

Phenotype → Phenotype
Method → Method
Gene → Gene
Allele → Allele
Transgene → Transgene
Strain → Strain ??
Clone → Clone

Implementing Textpresso for MODs

                 Worm Build                         Yeast Build                                    Fly Build
Corpus           >6,000 papers (~4,000 full text)   >60,000 journal articles (~15,000 full text)   >140,000 journal articles (? full text)
Build time       1 week                             >2 weeks                                       ?
Add papers       ~24 h                              ~3 d                                           ?
Change ontology  rebuild                            rebuild                                        rebuild
Database size    8G                                 30G?                                           ?G
Platform         Linux                              Solaris                                        Solaris

Textpresso Ontology

Category groups: Relationships, Semantic, Biological Concepts

Categories: Gene, Transgene, Allele, Cell or Cell Group, Cellular Component, Nucleic Acid, Organism, Entity Feature, Life Stage, Phenotype, Strain, Sex, Clone, Molecular Function, Mutant, Drugs and Small Molecules, Association, Consort, Effect, Purpose, Pathway, Regulation, Comparison, Spatial Relation, Time Relation, Involvement, Characterization, Method, Biological Process, Action, Bracket, Determiner, Conjunction, Conjecture, Negation, Preposition, Pronoun, Punctuation, Life Cycle

FOR FLY
Anatomy
1. Chromosomal aberrations? (inversion, polytene, substitution, deletion, balancers, P elements, hypomorphs, hypermorphs)
2. Stresses? (nutrition, temperature, sleep)