CACAO Training Fall 2012. Community Assessment of Community Annotation with Ontologies (CACAO)

Presentation transcript:

CACAO Training Fall 2012

Community Assessment of Community Annotation with Ontologies (CACAO)

Semesters: Spring 2010, Fall 2010, Spring 2011, Fall 2011
Institutions: TAMU, UCL, TAMU, Miami (Ohio), North Texas, Penn State, Michigan State, TAMU, UCL, Swarthmore, Mississippi State, Hofstra, Houston Baptist, North Dakota State, Wisconsin, Wisconsin-Parkside
# Rounds: 1 round, 4 rounds, 5 rounds
Annotations/Submitted:
– Spring 2010: 118/153
– Fall 2010: 496/753
– Spring 2011: 723/1018
– Fall 2011: Not finished with assessments/1524

What’s in it for you? We hope you will:
– learn how we think about protein function
– gain skills that will help your future career
– enjoy contributing to a resource used by people all over the world
– have fun!

Annotation
Annotation: a note that is made while reading any form of text (from Wikipedia)
For scientists:
1. Nucleotide level: where the genes are in the genome
2. Protein level: what their functions are

Functional Annotation
Annotation: a note that is made while reading any form of text
Functional Annotation: a note in a specific format that is made based on evidence in a peer-reviewed paper about the attributes of a protein

Functional Annotations
Allow us to:
– Infer the function of genes
   Related by common descent
   Related by similar expression patterns
   Related by phylogenetic profiles
   …
Allow us to:
– Understand the capabilities of organisms’ genomes
– Understand patterns of gene expression
   In different environments
   In different tissues
   In disease states
   …

Functional Annotations Finding genes faster than we can understand them

Functional Annotations >21 million peer-reviewed articles in PubMed Many millions of proteins recorded in UniProt

Who classically makes functional annotations?
Literature, Datasets -> Biocurators (rate limiting) -> Database

Functional Annotations
– Accurate functional annotation for as many genes as possible
– A system of assigning function that allows both humans and computers to compare, contrast, analyze, and predict gene function
– Curators to make and/or check these assignments
For CACAO, we will train you to be biocurators.

Functional Annotation Functional Annotation: a note in a specific format that is made based on evidence in a peer-reviewed paper about the attributes of a protein Specific format = GO (Gene Ontology) Annotation

GO (Gene Ontology) Annotations
3 aspects (ontologies) for describing protein attributes:
1. Biological Process
2. Molecular Function
3. Cellular Component
Controlled vocabulary
– Everyone uses the same terms
– Terms have 7-digit IDs that computers can understand
Relationships between terms
GO:
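The point about 7-digit IDs that computers can understand can be illustrated with a small format check. This is a minimal sketch, not part of GONUTS: the function name `is_go_id` is hypothetical, and the sample ID GO:0005739 (mitochondrion) is supplied here for illustration rather than taken from the slides. It only checks the shape of the ID, not whether the term exists.

```python
import re

# A GO term ID is the prefix "GO:" followed by exactly seven digits.
GO_ID_PATTERN = re.compile(r"GO:[0-9]{7}")


def is_go_id(text: str) -> bool:
    """Return True if text is shaped like a GO term ID (format check only)."""
    return GO_ID_PATTERN.fullmatch(text.strip()) is not None


if __name__ == "__main__":
    for candidate in ["GO:0005739", "GO:5739", "mitochondrion"]:
        print(candidate, is_go_id(candidate))
```

A check like this catches a term name pasted where the ID belongs, or an ID with dropped leading zeros.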

Molecular Function
Activities or “jobs” of a gene product
GO: hexokinase activity
GO: kinase activity
From PMID: , rndsystems.com

Biological Process
A commonly recognized series of events
GO: cell division
GO: transcription, DNA dependent
GO: pathogenesis
From ridge.icu.ac.jp, edtech.clas.pdx.edu, scielosp.org

Cellular Component
Where a gene product acts
GO: mitochondrion
GO: peptidoglycan-based cell wall
GO: ribosome
From visualphotos.com, epmm.group.shef.ac.uk,

Where can you search for GO terms? GONUTS (gowiki.tamu.edu)

What do you actually need once you have found the correct term? GO:

Functional Annotation
Functional Annotation: a note in a specific format that is made based on evidence in a peer-reviewed paper about the attributes of a protein
Specific format = GO (Gene Ontology) Annotation
Peer-reviewed paper

Finding a scientific paper
It has to be a scientific paper with experimental data in it. (Anything else is a valid reason to challenge!!)
No review articles, no books, no textbooks, no Wikipedia articles, no class notes…
You will need the paper’s PMID.

Functional Annotation
Functional Annotation: a note in a specific format that is made based on evidence in a peer-reviewed paper about the attributes of a protein
Specific format = GO (Gene Ontology) Annotation
Peer-reviewed paper
Protein

What can you annotate? Proteins.
– Search PubMed for papers on a specific topic, protein, or GO term
– Search UniProt for something interesting (e.g. allergen) or a protein of interest (e.g. PcnB)
– Check the references in the paper you are currently reading
No matter what, you will need to find the protein’s accession on UniProt.
Use that accession to make a page for that protein on GONUTS.
Add your GO annotations to the protein’s page on GONUTS.

Why do you need an accession from UniProt?
1. UniProt is not editable by the community, but GONUTS is.
2. GONUTS can make a page that has the annotations from UniProt for any protein using its UniProt accession.
3. Correct and complete annotations at the end of the competition will be submitted back to UniProt.

How do you make a new protein page in GONUTS?
GoPageMaker will:
– Check if the page exists in GONUTS and take you there if it does.
– Make a page if it does not exist in GONUTS already and pull all of the annotations from UniProt into a table that you can edit.
Make as many protein pages as you would like!

Form for your annotation (when you edit the table)

4 REQUIRED parts of EVERY GO annotation
1. GO
2. Evidence code
3. Reference
4. Notes (about evidence)

Summary of Evidence Codes for CACAO
Evidence codes describe the type of work or analysis done by the authors.
– IDA: Inferred from Direct Assay
– IMP: Inferred from Mutant Phenotype
– IGI: Inferred from Genetic Interaction
– ISO: Inferred from Sequence Orthology
– ISA: Inferred from Sequence Alignment
– ISM: Inferred from Sequence Model
– IGC: Inferred from Genomic Context
If it’s not one of these 7, your annotation is incorrect!!!
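The "one of these 7" rule is easy to express as a lookup. The sketch below is illustrative only and is not how GONUTS checks annotations; the names `CACAO_EVIDENCE_CODES` and `check_evidence_code` are hypothetical. The codes and their expansions are the seven listed above.

```python
# The seven evidence codes accepted in CACAO, as listed on the slide.
CACAO_EVIDENCE_CODES = {
    "IDA": "Inferred from Direct Assay",
    "IMP": "Inferred from Mutant Phenotype",
    "IGI": "Inferred from Genetic Interaction",
    "ISO": "Inferred from Sequence Orthology",
    "ISA": "Inferred from Sequence Alignment",
    "ISM": "Inferred from Sequence Model",
    "IGC": "Inferred from Genomic Context",
}


def check_evidence_code(code: str) -> str:
    """Return the full name of an accepted code, or raise ValueError."""
    normalized = code.strip().upper()
    if normalized not in CACAO_EVIDENCE_CODES:
        allowed = ", ".join(sorted(CACAO_EVIDENCE_CODES))
        raise ValueError(f"{code!r} is not accepted in CACAO; use one of: {allowed}")
    return CACAO_EVIDENCE_CODES[normalized]


if __name__ == "__main__":
    print(check_evidence_code("IDA"))
```

Any other code, including codes that are valid elsewhere in GO, raises an error here because it falls outside the CACAO list.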

Functional Annotation
Functional Annotation: a note in a specific format that is made based on evidence in a peer-reviewed paper about the attributes of a protein
Specific format = GO (Gene Ontology) Annotation
Peer-reviewed paper
Protein
Evidence code

2 other parts that may rarely be required…
1. With/From
2. Qualifier
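The four required parts and two occasional parts map naturally onto a small record type. This is a rough sketch under stated assumptions: the class `Annotation`, its field names, and the helper `missing_parts` are hypothetical and do not reflect the actual GONUTS form. The GO ID and notes in the example are made up for illustration, and the reference is deliberately left blank to show an incomplete draft.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Annotation:
    # The four parts required in every GO annotation.
    go_id: str
    evidence_code: str
    reference: str  # the PMID of the paper containing the experimental data
    notes: str      # where in the paper the evidence is found
    # The two parts that are only rarely required.
    with_from: Optional[str] = None
    qualifier: Optional[str] = None


def missing_parts(annotation: Annotation) -> List[str]:
    """List the required parts that were left blank."""
    required = {
        "GO": annotation.go_id,
        "Evidence code": annotation.evidence_code,
        "Reference": annotation.reference,
        "Notes": annotation.notes,
    }
    return [name for name, value in required.items() if not value.strip()]


if __name__ == "__main__":
    draft = Annotation(
        go_id="GO:0005739",          # illustrative ID (mitochondrion)
        evidence_code="IDA",
        reference="",                # PMID not filled in yet
        notes="Hypothetical note pointing to the relevant figure",
    )
    print(missing_parts(draft))
```

An empty result means the annotation is complete; it says nothing about whether the annotation is correct, which still depends on the paper.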

How is CACAO scored?
Rounds
– Points for a complete AND correct annotation (1 week/round)
   4 necessary parts
   May be additional parts
   NOTE: We will take away points if the annotation is not correct when assessed by an experienced CACAO biocurator
– Challenges are used to steal points for incorrect and/or incomplete annotations (1 week/round)
   Identify a problem
   Suggest correct alternative
– Refinements can be entered by any team (during any challenge week)

Scoreboard & Challenges

Team & Individual Pages challenge

Challenges
1. Enter the reason for your challenge here (i.e. what’s wrong).
2. Provide the fix(es) for it.

Example Challenge
I don’t think IGI is appropriate for this annotation. IGI uses multiple strains or organisms to compare. The evidence listed is just showing mutations in the protein and its effects on Dynamin-1, endophilin, and GluR1. The evidence code should be changed to IMP instead, and the other two annotations will probably need to be deleted.

Multiple Challenges = Potentially More Points!

Scoreboard

UniProt
– Find your protein(s) here (UniProt accession required)
PubMed
– Find your papers about the protein’s attributes (molecular function, biological process, cellular component)
GONUTS
– Search for GO terms
– Make a page for your protein on GONUTS (using UniProt accession)
– Add your annotation to the protein’s Annotation table during the first (annotation) week of any round
– Review and challenge competitors’ annotations during the second (challenge) week of any round