CACAO Training ASM-JGI 2012.

Presentation transcript:

CACAO Training ASM-JGI 2012

Transferring information to new genomes Database Lists of genes Known functions of Homologs or subsets New knowledge This figure describes an ideal situation for how we obtain/transfer information to new genes/genomes. How well it works depends on the quantity and quality of what we can get out of the databases. The problem is…

Curation is rate limiting Literature Database Biocurators (rate limiting) Datasets

CACAO is growing

CACAO biodiversity Spring 2012 Annotations

CACAO 2 CACAO changes the job of the professionals from primary curation to assessment Growth in CACAO makes assessment rate limiting Solution: Promote CACAO veterans to help with assessment

BIOCURATORS ecoliwiki@gmail.com

The biocurator training …

What’s in it for you? We hope you will: learn how we think about protein function; gain skills that will help your future career; enjoy contributing to a resource used by people all over the world; have fun!

Annotation Annotation: a note that is made while reading any form of text For scientists, Nucleotide level: Where the genes are in the genome Protein level: What their functions are From Wikipedia

Functional Annotation Annotation: a note that is made while reading any form of text Functional Annotation: a note in a specific format that is made based on evidence in a peer-reviewed paper about the attributes of a protein

Functional Annotation Functional Annotation: a note in a specific format that is made based on evidence in a peer-reviewed paper about the attributes of a protein Specific format = GO (Gene Ontology) Annotation

GO (Gene Ontology) Annotations 3 aspects (ontologies) for describing protein attributes: 1. Biological Process 2. Molecular Function 3. Cellular Component Controlled vocabulary Everyone uses the same terms Terms have 7 digit IDs that computers can understand Relationships between terms GO:0005886
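
The "7 digit ID" format mentioned on this slide can be checked mechanically. The following is a minimal sketch, assuming only the format shown on the slide (the prefix `GO:` followed by exactly seven digits); the function name `is_go_id` is hypothetical, and a well-formed ID is not necessarily a term that exists in the ontology.

```python
import re

# Format described on the slide: "GO:" followed by exactly 7 digits.
# This checks the shape of the ID only, not whether the term exists.
GO_ID_PATTERN = re.compile(r"GO:[0-9]{7}")


def is_go_id(text: str) -> bool:
    """Return True if text has the shape of a GO term ID (hypothetical helper)."""
    return GO_ID_PATTERN.fullmatch(text) is not None


print(is_go_id("GO:0005886"))  # True
print(is_go_id("GO:5886"))     # False
```

A format check like this only catches typing mistakes; whether the term is the right one for the protein still has to be judged from the paper.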

Molecular Function: activities or “jobs” of a gene product GO:0004347 hexokinase activity GO:0016301 kinase activity From PMID:9341134, rndsystems.com

Biological Process: a commonly recognized series of events GO:0009405 pathogenesis GO:0006351 transcription, DNA-dependent GO:0051301 cell division From ridge.icu.ac.jp, edtech.clas.pdx.edu, scielosp.org

Cellular Component: where a gene product acts GO terms: mitochondrion, mitochondrial membrane, mitochondrial matrix, mitochondrial intermembrane space GO:0009274 peptidoglycan-based cell wall GO:0005840 ribosome GO:0005739 mitochondrion From visualphotos.com, epmm.group.shef.ac.uk, http://www.cellsignal.com/products/2415.html

Where can you search for GO terms? GONUTS: http://gowiki.tamu.edu QuickGO: http://www.ebi.ac.uk/QuickGO AmiGO: http://amigo.geneontology.org

What do you actually need once you have found the correct term? GO:0004713

Functional Annotation Functional Annotation: a note in a specific format that is made based on evidence in a peer-reviewed paper about the attributes of a protein Specific format = GO (Gene Ontology) Annotation Peer-reviewed paper

Finding a scientific paper Has to be a scientific paper with experimental data in it. (Anything else is a valid reason to challenge!!) No review articles, no books, no textbooks, no Wikipedia articles, no class notes… Those are OK for inspiration but not suitable as the reference. You will need the PMID number (e.g. 22110029). Biocurators can annotate the same paper as previous semesters, but they have to synthesize NOVEL annotations (can’t just remake the same annotations = no points!)
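
Since every annotation needs the PMID, it can help to normalize whatever was copied from PubMed into one consistent form. This is a minimal sketch, assuming the `PMID:<digits>` form that appears elsewhere in these slides (e.g. PMID:9341134); the helper name `format_pmid` is hypothetical, and the check does not confirm that the paper exists or contains experimental data.

```python
def format_pmid(raw: str) -> str:
    """Normalize a pasted PubMed ID to the form 'PMID:<digits>' (hypothetical helper)."""
    text = raw.strip()
    # Accept input that already carries the prefix, in any letter case.
    if text.upper().startswith("PMID:"):
        text = text[5:].strip()
    # A PMID is a plain number; anything else is rejected.
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"not a PMID: {raw!r}")
    return f"PMID:{text}"


print(format_pmid("22110029"))         # PMID:22110029
print(format_pmid(" pmid: 9341134 "))  # PMID:9341134
```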

Functional Annotation Functional Annotation: a note in a specific format that is made based on evidence in a peer-reviewed paper about the attributes of a protein Specific format = GO (Gene Ontology) Annotation Peer-reviewed paper Protein

What can you annotate? Proteins. Search PubMed for papers on a specific topic or protein or GO term. Search UniProt for something interesting (e.g. allergen) or a protein of interest (e.g. PcnB). Check the references in the paper you are currently reading. No matter what, you will need to find the protein’s accession on UniProt (http://uniprot.org). Use that accession to make a page for that protein on GONUTS (http://gowiki.tamu.edu). Add your GO annotations to the protein’s page on GONUTS. (Presenter note: can either show this list after the students have run out of suggestions or write the items in as the students suggest them.)

Why do you need an accession from UniProt (http://www.uniprot.org)? * UniProt is not open for us to annotate proteins on, so we use a community-contributed website, GONUTS. GONUTS needs the UniProt accession so it can pull the existing annotations from UniProt into GONUTS. However, correct and complete annotations are submitted to the GO Consortium and then get added to UniProt (as part of the GO Consortium). * UniProt is not editable by the community, but GONUTS is. GONUTS can make a page that has the annotations from UniProt for any protein using its UniProt accession. Correct & complete annotations at the end of the competition will be submitted back to UniProt.
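
Because the accession is the key that links a GONUTS page to UniProt, it is worth checking that what was copied looks like an accession before making a page. The sketch below assumes the accession pattern as documented in the UniProt help pages (6 or 10 characters); the pattern is reproduced from memory and should be verified against UniProt before relying on it. The helper name `looks_like_uniprot_accession` is hypothetical, and a match does not mean the entry exists.

```python
import re

# Assumption: accession format as given in the UniProt help pages
# (verify against UniProt; this is not taken from the slides).
UNIPROT_ACCESSION = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def looks_like_uniprot_accession(text: str) -> bool:
    """Shape check only: does text look like a UniProt accession?"""
    return UNIPROT_ACCESSION.fullmatch(text.strip()) is not None


print(looks_like_uniprot_accession("P12345"))  # True
print(looks_like_uniprot_accession("PcnB"))    # False (a protein name, not an accession)
```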

How do you make a new protein page in GONUTS? GoPageMaker will: Check if the page exists in GONUTS & take you there if it does. Make a page if it does not exist in GONUTS already & pull all of the annotations from UniProt into a table that you can edit. Make as many protein pages as you would like!

Annotations edit table

Form for your annotation (when you edit the table)

4 REQUIRED parts of EVERY GO annotation Reference GO Notes (about evidence) Evidence code There are 24 rows in the annotations table on this protein’s page. Each row is a separate annotation. Suzi added the ones with the white background. The annotation shown on the slide is from a different protein’s page. It is merely to show what an annotation is.
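
The four required parts can be pictured as one record per table row. The following is a minimal sketch, not the GONUTS data model: the class `Annotation`, its field names, and the helper `missing_parts` are hypothetical, and the example values are placeholders taken from other slides rather than a real curated annotation.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class Annotation:
    """One row of an annotation table, with the 4 required parts (hypothetical names)."""
    go_id: str          # GO term ID
    reference: str      # PMID of the paper with experimental data
    evidence_code: str  # evidence code describing the type of work done
    notes: str          # notes about the evidence


def missing_parts(annotation: Annotation) -> List[str]:
    """Return the names of required parts that were left empty."""
    return [f.name for f in fields(annotation)
            if not getattr(annotation, f.name).strip()]


# Placeholder values for illustration only; not a real annotation.
draft = Annotation(go_id="GO:0004713", reference="PMID:22110029",
                   evidence_code="IDA", notes="")
print(missing_parts(draft))  # ['notes']
```

An empty result only means the annotation is complete; whether it is correct is what assessment and challenges decide.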

Summary of Evidence Codes for CACAO Evidence codes describe the type of work or analysis done by the authors IDA: Inferred from Direct Assay IMP: Inferred from Mutant Phenotype IGI: Inferred from Genetic Interaction ISO: Inferred from Sequence Orthology ISA: Inferred from Sequence Alignment ISM: Inferred from Sequence Model IGC: Inferred from Genomic Context If it’s not one of these 7, your annotation is incorrect!!! http://gowiki.tamu.edu/wiki/index.php/evidence_codes
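
The "one of these 7" rule is a simple membership test. A minimal sketch using exactly the codes listed on the slide; the names `CACAO_EVIDENCE_CODES` and `is_cacao_evidence_code` are hypothetical.

```python
# The 7 evidence codes accepted for CACAO, as listed on the slide.
CACAO_EVIDENCE_CODES = {
    "IDA": "Inferred from Direct Assay",
    "IMP": "Inferred from Mutant Phenotype",
    "IGI": "Inferred from Genetic Interaction",
    "ISO": "Inferred from Sequence Orthology",
    "ISA": "Inferred from Sequence Alignment",
    "ISM": "Inferred from Sequence Model",
    "IGC": "Inferred from Genomic Context",
}


def is_cacao_evidence_code(code: str) -> bool:
    """Return True if code is one of the 7 codes accepted for CACAO."""
    return code.strip().upper() in CACAO_EVIDENCE_CODES


print(is_cacao_evidence_code("IDA"))  # True
print(is_cacao_evidence_code("XYZ"))  # False
```

Passing this check does not make the code the right one; it still has to match the type of experiment the authors actually did.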

Functional Annotation Functional Annotation: a note in a specific format that is made based on evidence in a peer-reviewed paper about the attributes of a protein Specific format = GO (Gene Ontology) Annotation Peer-reviewed paper Protein Evidence code

2 other parts that may rarely be required… Qualifier With/From

How is CACAO scored? Rounds: Points are awarded for a complete AND correct annotation (normally 1 week/round, today = 25 mins). An annotation has 4 necessary parts and may have additional parts. NOTE: We will take away points if the annotation is not correct when assessed by an experienced CACAO biocurator. Challenges are used to steal points for incorrect and/or incomplete annotations (normally 1 week/round, today = 20 mins): identify a problem and suggest a correct alternative. Refinements can be entered by any team (during any challenge week). Important part: challenges. These are what make the difference every semester in the final standings!!! Also: my rubrics require a minimum of 15 correct annotations over 4 rounds of CACAO for an excellent grade on Rubric #2. There are 3 ways to get those: complete & correct annotations; refining & correcting someone else’s annotation; refining an annotation of yours that was refined by another team but not corrected completely.

Scoreboard & Challenges http://gowiki.tamu.edu/wiki/index.php/Category:ASM_JGI_challenge

Team & Individual Pages challenge

Challenges 1. Enter the reason for your challenge here (i.e. what’s wrong). 2. Provide the fix(es) for it.

Annotation discussion (aka argument)

UniProt – http://uniprot.org Find your protein(s) here (UniProt accession required). PubMed – http://pubmed.org Find your papers about the protein’s attributes (molecular function, biological process, cellular component). GONUTS – http://gowiki.tamu.edu Search for GO terms. Make a page for your protein on GONUTS (using the UniProt accession). Add your annotation to the protein’s Annotation table during the first (annotation) week of any round. Review and challenge competitors’ annotations during the second (challenge) week of any round.

ASM-JGI Competition! You now have 25 mins to: Use the assigned paper for your group and … Find the correct UniProt accession Make the page for the protein on GONUTS Make at least one annotation You will have 20 mins to challenge other teams’ annotations What fields are wrong & why?!