TAIR: Bringing together data for the global plant biology community
Kate Dreher, curator, TAIR/PMN

Gene/protein function data at TAIR
- Acquiring new gene/protein function data in TAIR
  - TAIR curators
  - Research community
- Using gene function data
  - Searching by function
  - Working with large datasets
  - Connecting to other species

Capturing gene/protein function data at TAIR
Functional curation pipeline
- ~200 papers about Arabidopsis show up in PubMed each month
- TAIR curators link papers to appropriate loci
  - Please help us and all researchers who read your paper: report the AGI locus code for every gene in the paper
  - ASA1 is ambiguous: ANTHRANILATE SYNTHASE ALPHA SUBUNIT 1? ATTENUATED SHADE AVOIDANCE 1? ATP SULFURYLASE ARABIDOPSIS 1?
  - AT3G02260 is unique
- Papers are prioritized for in-depth literature curation
  - First priority: papers with data about unannotated / novel genes
- TAIR curators read primary literature to extract gene functional data
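The request to report AGI locus codes can be illustrated with a short script that pulls locus codes out of a block of text. This is a minimal sketch, not a TAIR tool: the function name `find_agi_codes` is hypothetical, and the pattern assumes the common form `AT` + chromosome (1-5, C or M) + `G` + five digits, as in AT3G02260.

```python
import re

# Assumed pattern for AGI locus codes such as AT3G02260 (chromosomes 1-5,
# plus C for chloroplast and M for mitochondrion). Illustrative only; this
# is not TAIR's own validation logic.
AGI_PATTERN = re.compile(r"\bAT[1-5CM]G\d{5}\b", re.IGNORECASE)


def find_agi_codes(text):
    """Return the distinct AGI locus codes in text, upper-cased, in order of appearance."""
    found = []
    for match in AGI_PATTERN.finditer(text):
        code = match.group(0).upper()
        if code not in found:
            found.append(code)
    return found


# A symbol such as ASA1 is not matched; only the locus codes are returned.
sentence = "ASA1 is ambiguous, but AT3G02260 and At2g01830 are unique."
print(find_agi_codes(sentence))  # ['AT3G02260', 'AT2G01830']
```

Running a check like this over a manuscript before submission is one way to confirm that every gene discussed is accompanied by an unambiguous identifier.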

Capturing gene/protein function data at TAIR
TAIR curators capture gene/protein function data using free text:
- Gene descriptions
- Gene names / symbols
  - Please help us by writing out the full name of each symbol, especially the first time it is published
  - GGT2 = Glutamate:Glyoxylate aminotransferase 2
- Mutant phenotype descriptions
  - We love to see ABRC stock numbers, SALK/SAIL/GABI-Kat IDs, etc.
  - Please check to see what allele numbers have already been used (e.g. phyA-201)
  - Textpresso is a big help!

Free text functional data on TAIR locus pages
- Gene description
- Gene names / symbols

Free text functional data on TAIR locus pages
- Mutant phenotypes

Capturing gene/protein function data at TAIR
TAIR curators capture gene/protein function data using controlled vocabularies:
- Gene Ontology terms
  - Molecular function (e.g. transcription factor activity)
  - Biological process (e.g. phosphate transport)
  - Cellular component (e.g. chloroplast)
- Plant Ontology terms
  - Plant structure (e.g. endosperm)
  - Plant growth and development stages (e.g. root primordium formation)

GO and PO functional data on TAIR locus pages
- GO terms
- PO terms

Getting detailed functional data

Capturing gene/protein function data at TAIR
- Do we know what every Arabidopsis protein does?
  - 9024 genes (~30%) are linked to experimental functional data (3/2010)
- Is there more information out there?
  - ~32% of the PubMed Arabidopsis papers from 2009 were curated
  - ~68% were not curated
  - Additional articles appear in plant journals not indexed by PubMed
- How can we get closer to our 2010 goal?
  - On-going TAIR curation
  - Increased community annotation
    - Journal/author collaborations
    - NEW on-line gene functional data submission tool

Contributing new functional data – as you publish
Journal collaborations
- First started in March 2008 with Plant Physiology
- Current collaborators and methods
  - Submit at ASPB website: Plant Physiology
  - Fill out a spreadsheet: The Plant Journal
  - Use the NEW TAIR gene functional data submission tool
    - Journal of Integrative Plant Biology
    - Plant, Cell and Environment
    - Journal of Experimental Botany
    - Plant Science
    - Environmental Botany
    - Plant Physiology and Biochemistry
- But you can use the tool TODAY!

Contributing new functional data – anytime!

Given by publisher or found online

We welcome data from ALL your publications... but please add them one at a time

AT2G01830 WOL Wooden Leg

Adding gene function annotations
- kinase
- But I actually know that it is a histidine kinase... Try entering a different search term

Adding gene function annotations
- histidine
- Is there an even more specific / appropriate term? Check TAIR Keywords

Choosing the best term

Providing an experimental method
- in vitro
- But what if my term or method does not appear?

Entering new terms and methods
- kinase
- sextuple mutant

Adding additional information
- Do I have more molecular function data about THIS gene in THIS paper?
- kinase
- Yes! / Nope!

Adding additional information

in vitro assay

Adding additional information

What other information can I add?
- Mutant phenotype information
- Identity of other loci in double / triple / quadruple mutants, etc.
- Description of gene
- Any other free text information
Example: The wol-8 EMS mutant (CS07856) has a point mutation in the first exon that introduces a premature stop codon. The roots of mutant plants fail to respond to the exogenous application of cytokinin.

Covering all the data
- Do I have any more information to add about OTHER genes in THIS paper?
- Yes!

Entering the data into the database
- Nope, no information about OTHER genes in THIS paper?
- Please contact us with any questions or problems during or after submitting your data.

Tips for gene function data submission
- Something is better than nothing...
  - If you don't have time to hunt around for the perfect term or method, please just give us what you can
- But, if possible, please try to be...
  - As complete as possible, e.g. if it's a kinase, also add that it's involved in the biological process of phosphorylation
  - As specific as possible, e.g. use "potassium transporter" instead of "transporter"
- Benefits of good annotation
  - Better understanding of individual gene functions
  - Better categorization / analysis of large-scale data sets
  - Better functional predictions for newly sequenced genomes
Vandepoele et al., 2009; BAR; TAIR
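The "as complete as possible" tip can be sketched as a small completeness check over a set of submitted annotations. Everything here is an illustrative assumption rather than the format of the TAIR submission tool: the `Annotation` record, its field names, and the `missing_aspects` helper are hypothetical, and the example values are taken from the kinase walkthrough in these slides.

```python
from dataclasses import dataclass

# The three Gene Ontology aspects named earlier in the presentation.
ASPECTS = ("molecular function", "biological process", "cellular component")


@dataclass(frozen=True)
class Annotation:
    """Hypothetical record for one submitted annotation (not TAIR's schema)."""
    locus: str   # AGI locus code
    aspect: str  # one of ASPECTS
    term: str    # the term chosen by the submitter
    method: str  # experimental method supporting the term


def missing_aspects(annotations, locus):
    """List the GO aspects for which the given locus has no annotation yet."""
    covered = {a.aspect for a in annotations if a.locus == locus}
    return [aspect for aspect in ASPECTS if aspect not in covered]


# Illustrative submission: a kinase annotated with its function and the
# related biological process, as the tip above suggests.
submitted = [
    Annotation("AT2G01830", "molecular function", "histidine kinase", "in vitro assay"),
    Annotation("AT2G01830", "biological process", "phosphorylation", "in vitro assay"),
]
print(missing_aspects(submitted, "AT2G01830"))  # ['cellular component']
```

A check of this kind only prompts the submitter; an aspect with no experimental data in the paper is simply left empty.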

Contributing new functional data – anytime!
- Many other data types still welcome!
- How can all this gene/protein function information be put to good use?

Finding the gene(s) you want...
Use Gene Search to find genes...
- involved in a specific biological process
- with a particular molecular function
- found in a specific compartment
- expressed in a particular place and/or during a specific developmental phase
Enter keywords for GO or PO terms
- Can limit by evidence codes

Finding the gene(s) you want...
Use Gene Search to find genes...
- involved in the same biological process
- with the same molecular function
- found in the same compartment
- expressed in the same place and/or at the same time
Enter keywords for GO or PO terms
- Can limit by evidence codes
Enter search terms / keywords for gene descriptions
Enter search terms / keywords for mutant phenotype
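The idea of searching by keyword and limiting by evidence code can be sketched on a small in-memory table. This is not the TAIR Gene Search interface or its API: the record layout, the `search_annotations` function, and the placeholder loci (`LOCUS_A` and so on) are assumptions made for illustration. IDA, IMP and IEA are standard GO evidence codes (direct assay, mutant phenotype, electronic annotation).

```python
def search_annotations(records, keyword, evidence_codes=None):
    """Return sorted loci whose term contains keyword, optionally limited by evidence code."""
    keyword = keyword.lower()
    hits = set()
    for record in records:
        if keyword not in record["term"].lower():
            continue
        if evidence_codes is not None and record["evidence"] not in evidence_codes:
            continue
        hits.add(record["locus"])
    return sorted(hits)


# Placeholder data: the loci and their annotations are invented for this sketch.
EXAMPLE_RECORDS = [
    {"locus": "LOCUS_A", "term": "potassium transporter", "evidence": "IDA"},
    {"locus": "LOCUS_B", "term": "transporter", "evidence": "IEA"},
    {"locus": "LOCUS_C", "term": "Potassium transporter", "evidence": "IEA"},
]

# Keyword only, then the same keyword limited to experimental evidence codes.
print(search_annotations(EXAMPLE_RECORDS, "potassium"))                  # ['LOCUS_A', 'LOCUS_C']
print(search_annotations(EXAMPLE_RECORDS, "potassium", {"IDA", "IMP"}))  # ['LOCUS_A']
```

The second call shows why the evidence filter matters: it separates experimentally supported annotations from purely computational ones.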

Putting gene/protein functional data to use

Working with large data sets
- Adding value to community-generated gene families
  - Over 150 gene families have been submitted by researchers

Adding information to data sets
- Adding value to community-generated gene families
  - Over 150 gene families have been submitted by researchers
- Attach data to your favorite protein family:
  - Generate a .txt file of AGI locus codes
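Generating the .txt file of AGI locus codes can be done by hand or with a few lines of script. The sketch below assumes a plain-text layout of one locus code per line, which the slide does not specify; the function name `write_locus_file` and the file name `my_gene_family.txt` are hypothetical.

```python
from pathlib import Path


def write_locus_file(locus_codes, path):
    """Write distinct, upper-cased locus codes to path, one per line, and return them."""
    cleaned = []
    for code in locus_codes:
        code = code.strip().upper()
        if code and code not in cleaned:
            cleaned.append(code)
    Path(path).write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return cleaned


# Example with two locus codes that appear in this presentation; the
# duplicate, lower-case entry is normalized and written only once.
family = ["AT2G01830", "AT3G02260", "at2g01830"]
print(write_locus_file(family, "my_gene_family.txt"))  # ['AT2G01830', 'AT3G02260']
```

Normalizing case and removing duplicates before saving keeps the list unambiguous when it is later uploaded to a bulk tool.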

Adding information to data sets

Connecting to other species
- Finding related genes in other species

save as text file

Connecting to other species

We are here to help:
- Please use the data we provide
- Please use the tools we provide
- Please use TAIR to help improve your research!
- Please contact us if we can be of any help!
  - Make an appointment to meet with us during the conference
  - Please come visit our exhibitor booth – 219 – Plant Genome Resources!
  - Please stop by poster tomorrow night
(432 fans so far...)

Acknowledgements
TAIR
Current Curators:
- Tanya Berardini (lead curator, functional annotation)
- David Swarbreck (lead curator, structural annotation)
- Peifen Zhang (Director and lead curator, metabolism)
- Philippe Lamesch (curator)
- Donghui Li (curator)
- Debbie Alexander (curator)
- A. S. Karthikeyan (curator)
- Marga Garcia (curator)
- Leonore Reiser (curator)
Eva Huala (Director)
Sue Rhee (Co-PI)
Tech Team Members:
- Bob Muller (Manager)
- Larry Ploetz (Sys. Administrator)
- Raymond Chetty
- Cynthia Lee
- Shanker Singh
- Chris Wilks
Recent Past Contributors:
- Anjo Chi (tech team)
- Vanessa Kirkup (tech team)
- Tom Meyer (tech team)
- Rajkumar Sasidharan (curator)

Department of Plant Biology
