Prioritization of Avian GO Annotation


Structural Annotation

Species   Genome build [2]   No. Entrez Genes   No. proteins (NRPD)   % predicted proteins   Proteins/gene
Chicken   2.1                19,979 [3]         31,819 [3]            46.62 [4]              1.59 [5]
Rat [1]   3.4                49,516             108,069               29.99                  2.18
Mouse     37.1               64,018             228,696               9.28                   3.57
Human     36.3               36,437             415,830               4.91                   11.41

1. The rat genome was published only 8 months prior to the chicken genome, yet rat has 2x as many genes in Entrez Gene and 3x as many proteins.
2. After two genome builds, chicken still has 5% of genomic sequence that has not been assigned to a chromosome, and mini-chromosomes have not been sequenced.
3. Chicken genes and proteins are under-represented in public databases.
4. Of the chicken proteins available from NRPD, almost half are predicted based upon computational analysis.
5. On average, chicken has only 1 protein per gene, so very little is known about isoforms and alternate transcripts in the chicken gene products.

NRPD: Non-redundant Protein Database

Phase 1: “Breadth”

7,478 chicken entries in UniProtKB → GOA provides IEA mapping for UniProtKB entries.
Initial strategy for AgBase biocurators was to add GO to chicken gene products that had none.
Since 46% of the chicken proteins in NRPD were predicted, they would have no GO → IEA, ISS, ISO ...
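The "breadth" step above, finding gene products that have no GO yet, could be sketched roughly as follows. This is only an illustration, not the AgBase pipeline: it assumes annotations are available as tab-delimited GAF lines (column 2 is the object ID, column 5 the GO ID), and the function name, the sample IDs and the sample lines are all made up for the example.

```python
def products_without_go(all_ids, gaf_text):
    """Return the IDs in all_ids that have no GO annotation line in gaf_text.

    Assumes GAF layout: tab-delimited, '!' starts a comment line,
    column 2 = DB object ID, column 5 = GO ID.
    """
    annotated = set()
    for line in gaf_text.splitlines():
        if not line or line.startswith("!"):
            continue  # skip blank lines and GAF header/comment lines
        cols = line.split("\t")
        if len(cols) > 4 and cols[4].startswith("GO:"):
            annotated.add(cols[1])
    return sorted(set(all_ids) - annotated)


# Hypothetical sample data: IDs, symbols and references are placeholders.
SAMPLE_GAF = "\n".join([
    "!gaf-version: 2.0",
    "UniProtKB\tID_A\tgeneA\t\tGO:0005515\tREF:x\tIEA",
    "UniProtKB\tID_C\tgeneC\t\tGO:0006955\tREF:y\tISS",
])

todo = products_without_go(["ID_A", "ID_B", "ID_C", "ID_D"], SAMPLE_GAF)
print(todo)  # ['ID_B', 'ID_D']
```

The entries returned are the candidates for a first pass of annotation; predicted proteins would be expected to dominate that list.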

Functional Annotation

[Chart: % of gene products annotated for human, mouse, rat and chicken; categories: no GO, AgBase, computational GO, manual GO]

The proportion of GO for chicken is over-represented because chicken gene products are under-represented in public databases.

Phase 2: “Depth”

What are the community needs?

GO Annotation of Arrays

DelMar14K, FHCRC, Tgu array
44K Agilent oligo array
AIIM array, Affymetrix

Should we be focusing on arrays? What arrays should we do?

GO Annotation Priorities?

- Provide “breadth” of coverage
- Annotate products represented on arrays
- Reference Genome targets
- Subject areas (immunity, nutrition/metabolism, development)
- Ad hoc as requested