Functional manual annotation including GO


Functional manual annotation including GO Exercises for hands-on session

Exercise 1: AmiGO search
Type in “cell surface” and explore the results.

Exercise 1: AmiGO search, continued
What is the GO term accession for “cell surface”?
What is the definition of the “cell surface” term?
Name one gene product that has the GO term “cell surface” annotated to it.

Answers to Exercise 1: AmiGO
The cell surface GO ID is GO:0009986.
The definition of the term is: The external part of the cell wall and/or plasma membrane.
Many gene products are annotated with this term; any one listed in the AmiGO results for the term is a valid answer.
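For anyone who wants to script this lookup rather than use the AmiGO web page: GO terms are also distributed as OBO-format text, where each term is a small stanza of tag: value lines. The sketch below parses one such stanza. The stanza is hand-written for illustration and simplified (the ID, name, and definition are the ones from the answer above); the function name parse_stanza is hypothetical, not part of any GO tool.

```python
import re

# Hand-written, simplified OBO-style stanza (an assumption for illustration).
# The ID, name, and definition come from the Exercise 1 answer.
OBO_STANZA = '''[Term]
id: GO:0009986
name: cell surface
namespace: cellular_component
def: "The external part of the cell wall and/or plasma membrane."
'''


def parse_stanza(text):
    """Return a dict of tag -> value for a single [Term] stanza."""
    term = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the [Term] stanza header.
        if not line or line.startswith("["):
            continue
        tag, _, value = line.partition(": ")
        # Definitions are quoted; keep only the quoted text.
        if tag == "def":
            match = re.match(r'"(.*)"', value)
            value = match.group(1) if match else value
        term[tag] = value
    return term


term = parse_stanza(OBO_STANZA)
print(term["id"], term["name"])
print(term["def"])
```

The three questions in Exercise 1 map onto the id, def, and (in an annotation file, not shown here) gene product fields.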

Exercise 2: Functional Annotation
Analyze and annotate the unknown T. brucei protein sequence T_brucei_unknown.fasta, which is on the flash drive in the functional_and_go directory.
>unknown_T. brucei protein_sequence
MLRRLGVRHFRRTPLLFVGGDGSIFERYTEIDNSNERRINALKGCGMFEDEWIATEKVHGANFGIYSIEGEKMIRYAKRSGIMPPNEHFFGYHILIPELQRYITSIREMLCEKQKKKLHVVLINGELFGGKYDHPSVPKTRKTVMVAGKPRTISAVQTDSFPQYSPDLHFYAFDIKYKETEDGDYTTLVYDEAIELFQRVPGLLYARAVIRGPMSKVAAFDVERFVTTIPPLVGMGNYPLTGNWAEGLVVKHSRLGMAGFDPKGPTVLKFKCTAFQEISTDRAQGPRVDEMRNVRRDSINRAGVQLPDLESIVQDPIQLEASKLLLNHVCENRLKNVLSKIGTEPFEKEEMTPDQLATLLAKDVLKDFLKDTEPSIVNIPVLIRKDLTRYVIFESRRLVCSQWKDILKRQSPDFSE*

Analyze and annotate the sequence
BLAST your sequence at NCBI, and interpret the results.
Use the Pfam site to search for Pfam and TIGRFAMs domains.
Use the Superfamily site to search for SCOP domains. Examine the output.
Search for families and motifs (InterPro, PROSITE, SignalP, TargetP, TMHMM) and examine the output.
Summarize the results in an annotation.
Search for GO terms. Annotate all possible GO terms.

Functional annotation of unknown T. brucei protein
Summarize sequence homology:
Domain(s):
Motif(s):
Name, or functional assignment:
GO assignment(s):
The annotations are in the Manual Functional Annotation and GO printouts, with the exception of additional GO term annotations.

Additional possible GO term
For RNA editing ligase:
Cellular component: mitochondrion GO:0005739 (ISS with CBS:TargetP)

Exercise 3: Functional Annotation
Analyze and annotate the unknown Aedes aegypti protein sequence A_Aegypti_unknown.fasta, which is in your directory on the flash drive.
>unknown_Aedes_aegypti_protein_85aa
MASREAVRRAVQNVRPILSVDREEARKRVLNLYKAWYRQIPYIVMDYDIPKSVEQCREKLREEFLKHKNVTDIRVIDMLVIKGML
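Before pasting a sequence into BLAST or a domain search, it can help to check that the FASTA record is intact. The sketch below is a minimal plain-Python FASTA reader (the helper names read_fasta and clean_protein are hypothetical) that confirms the Aedes aegypti sequence has the 85 residues its header claims. It also strips a trailing "*" stop marker, such as the one at the end of the T. brucei sequence in Exercise 2, on the assumption that some search forms may not accept it.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def clean_protein(seq):
    """Uppercase the sequence and drop any trailing '*' stop marker."""
    return seq.upper().rstrip("*")


# The Exercise 3 sequence, with header and residues on separate lines.
FASTA_TEXT = """>unknown_Aedes_aegypti_protein_85aa
MASREAVRRAVQNVRPILSVDREEARKRVLNLYKAWYRQIPYIVMDYDIPKSVEQCREKLREEFLKHKNVTDIRVIDMLVIKGML
"""

header, seq = read_fasta(FASTA_TEXT)[0]
seq = clean_protein(seq)
print(header, len(seq))
```

In practice the text would be read from A_Aegypti_unknown.fasta instead of the embedded string.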

Analyze and annotate the sequence
BLAST your sequence at NCBI, and interpret the results.
Use the Pfam site to search for Pfam and TIGRFAMs domains. Examine the output.
Search for families and motifs (InterPro, Superfamily, PROSITE, SignalP, TargetP, TMHMM) and examine the output.
Summarize the results in an annotation.
Search for GO terms. Annotate all possible GO terms.

Annotation for unknown Aedes aegypti protein sequence
Summarize sequence homology:
Domain(s) and Families:
Motif(s):
Name, or functional assignment:
GO assignment(s):
The annotations are in the Manual Functional Annotation printout, with the exception of GO term annotations.

GO terms
Predicted cellular component: mitochondrion GO:0005739 (ISS with CBS:TargetP)
Predicted cellular component: integral to membrane GO:0016021 (ISS with CBS:TMHMM)
Predicted molecular function: NADH dehydrogenase (ubiquinone) activity GO:0008137 (ISS with UniProt:P56556), deduced through sequence similarity (p-value 3.7e-16) using the AmiGO BLAST function.
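Each annotation above has the same four parts: a GO ID, the term name, an evidence code (ISS, inferred from sequence similarity), and a "with" reference naming the tool or protein that supports it. One possible way to keep such annotations consistent is a small record type that checks the GO ID format. This is only a sketch: the GoAnnotation class and its field names are hypothetical and do not follow any official annotation file layout.

```python
import re
from dataclasses import dataclass

# GO accessions are "GO:" followed by seven digits.
GO_ID_PATTERN = re.compile(r"^GO:\d{7}$")


@dataclass(frozen=True)
class GoAnnotation:
    """One manual GO annotation (hypothetical layout for illustration)."""
    go_id: str
    term: str
    aspect: str    # "cellular_component" or "molecular_function" here
    evidence: str  # evidence code, e.g. "ISS"
    with_ref: str  # tool or protein the inference is made "with"

    def __post_init__(self):
        if not GO_ID_PATTERN.match(self.go_id):
            raise ValueError(f"malformed GO ID: {self.go_id!r}")


# The three annotations proposed for the Aedes aegypti protein.
annotations = [
    GoAnnotation("GO:0005739", "mitochondrion",
                 "cellular_component", "ISS", "CBS:TargetP"),
    GoAnnotation("GO:0016021", "integral to membrane",
                 "cellular_component", "ISS", "CBS:TMHMM"),
    GoAnnotation("GO:0008137", "NADH dehydrogenase (ubiquinone) activity",
                 "molecular_function", "ISS", "UniProt:P56556"),
]

for a in annotations:
    print("\t".join([a.go_id, a.term, a.aspect, a.evidence, a.with_ref]))
```

Validating at construction time means a mistyped accession (for example a missing digit) is caught when the annotation is written down rather than when it is submitted.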