Protein analysis and proteomics (Part 1 of 2)

Copyright notice
Many of the images in this PowerPoint presentation are from Bioinformatics and Functional Genomics by Jonathan Pevsner (ISBN ). Copyright © 2003 by John Wiley & Sons, Inc. These images and materials may not be used without permission from the publisher. We welcome instructors to use these PowerPoints for educational purposes, but please acknowledge the source. The book has a homepage that includes hyperlinks to the book chapters.

Outline for today: Protein analysis and proteomics
Individual proteins
-- Protein families
-- Physical properties
-- Localization
-- Function
Large-scale protein analysis
-- 2D protein gels
-- Yeast two-hybrid
-- Rosetta Stone approach
-- Pathways

DNA, RNA, protein: four perspectives on a protein
[1] Protein families
[2] Physical properties
[3] Protein localization
[4] Protein function
Gene ontology (GO):
-- cellular component
-- biological process
-- molecular function
Page 224

Perspective 1: Protein domains and motifs Page 225

Definitions
Signature: a protein category such as a domain or motif
Domain: a region of a protein that can adopt a 3D structure
-- a fold
-- a family is a group of proteins that share a domain
-- examples: zinc finger domain, immunoglobulin domain
Motif (or fingerprint): a short, conserved region of a protein, typically 10 to 20 contiguous amino acid residues
Page 225

15 most common domains (human)
Zn finger, C2H2 type          1093 proteins
Immunoglobulin                1032
EGF-like                       471
Zn-finger, RING                458
Homeobox                       417
Pleckstrin-like                405
RNA-binding region RNP-1       400
SH3                            394
Calcium-binding EF-hand        392
Fibronectin, type III          300
PDZ/DHR/GLGF                   280
Small GTP-binding protein      261
BTB/POZ                        236
bHLH                           226
Cadherin                       226
Page 227

15 most common domains (various species) The European Bioinformatics Institute (EBI) offers many key proteomics resources: Page 227

Definition of a domain
According to InterPro at EBI: A domain is an independent structural unit, found alone or in conjunction with other domains or repeats. Domains are evolutionarily related.
According to SMART: A domain is a conserved structural entity with distinctive secondary structure content and a hydrophobic core. Homologous domains with common functions usually show sequence similarities.
Page 226

Varieties of protein domains
-- Extending along the length of a protein
-- Occupying a subset of a protein sequence
-- Occurring one or more times
Page 228

Example of a protein with domains: Methyl CpG binding protein 2 (MeCP2)
The protein includes a methylated DNA binding domain (MBD) and a transcriptional repression domain (TRD). MeCP2 is a transcriptional repressor. Mutations in the gene encoding MeCP2 cause Rett Syndrome, a neurological disorder that primarily affects girls.
Page 227

Page 228 Result of an MeCP2 blastp search: A methyl-binding domain shared by several proteins

Page 228 Are proteins that share only a domain homologous?

Example of a multidomain protein: HIV-1 pol
1003 amino acids long, cleaved into three proteins with distinct activities:
-- aspartyl protease
-- reverse transcriptase
-- integrase
We will explore HIV-1 pol and other proteins at the Expert Protein Analysis System (ExPASy) server.
Page 229

SwissProt entry for HIV-1 pol links to many databases

Page 231 ProDom entry for HIV-1 pol shows many related proteins

Page 231 Proteins can have both domains and patterns (motifs) Domain (aspartyl protease) Domain (reverse transcriptase) Pattern (several residues) Pattern (several residues)

Definition of a motif
A motif (or fingerprint) is a short, conserved region of a protein. Its size is often 10 to 20 amino acids. Simple motifs include transmembrane domains and phosphorylation sites. These do not imply homology when found in a group of proteins.
PROSITE is a dictionary of motifs (there are currently 1600 entries). In PROSITE, a pattern is a qualitative motif description (a protein either matches a pattern, or not). In contrast, a profile is a quantitative motif description. We will encounter profiles in Pfam, ProDom, SMART, and other databases.
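
To make the "matches or not" nature of a PROSITE pattern concrete, here is a minimal sketch that translates a simplified subset of the PROSITE pattern syntax into a Python regular expression. The function names are hypothetical, the example pattern N-{P}-[ST]-{P} (the commonly cited N-glycosylation site pattern) is not taken from these slides, and the sketch handles only the basic syntax elements (x, [..], {..}, repeat counts, and the < and > anchors).

```python
import re


def prosite_to_regex(pattern):
    """Translate a simplified PROSITE-style pattern into a Python regex string."""
    pieces = []
    for element in pattern.rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):   # pattern anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):     # pattern anchored at the C-terminus
            suffix, element = "$", element[:-1]
        # Split "x(2,4)" into core "x" and optional repeat bounds "2" and "4".
        core, low, high = re.fullmatch(
            r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element
        ).groups()
        if core == "x":               # x means any amino acid
            core = "."
        elif core.startswith("{"):    # {ABC} means any residue except A, B, C
            core = "[^" + core[1:-1] + "]"
        repeat = ""
        if low is not None:
            repeat = "{" + low + ("," + high if high else "") + "}"
        pieces.append(prefix + core + repeat + suffix)
    return "".join(pieces)


def matches_pattern(pattern, sequence):
    """Qualitative test: True if the sequence contains the pattern, else False."""
    return re.search(prosite_to_regex(pattern), sequence) is not None


if __name__ == "__main__":
    print(prosite_to_regex("N-{P}-[ST]-{P}"))
    print(matches_pattern("N-{P}-[ST]-{P}", "AANGTAA"))
```

A profile, by contrast, would assign a score to each position rather than a yes/no answer, which is why it is described as quantitative.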

Perspective 2: Physical properties of proteins Page 233

Physical properties of proteins Many websites are available for the analysis of individual proteins. ExPASy and ISREC are two excellent resources. The accuracy of these programs is variable. Predictions based on primary amino acid sequence (such as molecular weight prediction) are likely to be more trustworthy. For many other properties (such as posttranslational modification of proteins by specific sugars), experimental evidence may be required rather than prediction algorithms. Page 236
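
As an illustration of why a sequence-based property such as molecular weight is comparatively trustworthy, the calculation can be sketched directly: sum the average masses of the residues and add one water for the free termini. The function name is hypothetical and the residue masses are standard average values typed in here, not taken from the slides. The sketch ignores posttranslational modifications, which is exactly where prediction becomes less reliable.

```python
# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one water accounts for the free N- and C-termini


def molecular_weight(sequence):
    """Estimate the average molecular weight (Da) of an unmodified protein."""
    total = WATER_MASS
    for residue in sequence.upper():
        if residue not in RESIDUE_MASS:
            raise ValueError("unknown residue: " + residue)
        total += RESIDUE_MASS[residue]
    return total


if __name__ == "__main__":
    # A single glycine should come out close to its known weight of about 75.07 Da.
    print(round(molecular_weight("G"), 2))
```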

Syntaxin, SNAP-25 and VAMP are three proteins that interact via coiled-coil domains

Introduction to Perspectives 3 and 4: Gene Ontology (GO) Consortium Page 237

The Gene Ontology Consortium
An ontology is a description of concepts. The GO Consortium compiles a dynamic, controlled vocabulary of terms related to gene products. There are three organizing principles:
-- Molecular function
-- Biological process
-- Cellular component
There is no centralized GO database. Instead, curators of organism-specific databases assign GO terms to gene products for each organism.
Page 237

Page 241 GO terms are assigned to LocusLink entries

The Gene Ontology Consortium: Evidence Codes
IC    Inferred by curator
IDA   Inferred from direct assay
IEA   Inferred from electronic annotation
IEP   Inferred from expression pattern
IGI   Inferred from genetic interaction
IMP   Inferred from mutant phenotype
IPI   Inferred from physical interaction
ISS   Inferred from sequence or structural similarity
NAS   Non-traceable author statement
ND    No biological data
TAS   Traceable author statement
Page 240
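
Evidence codes are useful mainly as a filter on annotations. The sketch below stores the codes listed above in a lookup table and keeps only annotations backed by an experiment. Two assumptions to note: the grouping of IDA, IEP, IGI, IMP, and IPI as "experimental" follows the GO Consortium's own convention rather than anything stated on this slide, and the annotation records are made-up illustrations, not real database entries.

```python
# Evidence codes and descriptions as listed on the slide.
EVIDENCE_CODES = {
    "IC": "Inferred by curator",
    "IDA": "Inferred from direct assay",
    "IEA": "Inferred from electronic annotation",
    "IEP": "Inferred from expression pattern",
    "IGI": "Inferred from genetic interaction",
    "IMP": "Inferred from mutant phenotype",
    "IPI": "Inferred from physical interaction",
    "ISS": "Inferred from sequence or structural similarity",
    "NAS": "Non-traceable author statement",
    "ND": "No biological data",
    "TAS": "Traceable author statement",
}

# Assumption: codes treated as experimental evidence (GO convention, not from the slide).
EXPERIMENTAL = {"IDA", "IEP", "IGI", "IMP", "IPI"}


def experimental_only(annotations):
    """Keep annotations whose evidence code denotes a direct experiment."""
    return [a for a in annotations if a["evidence"] in EXPERIMENTAL]


if __name__ == "__main__":
    # Hypothetical records for illustration only.
    annotations = [
        {"gene": "geneA", "term": "retinol binding", "evidence": "IDA"},
        {"gene": "geneA", "term": "transport", "evidence": "IEA"},
        {"gene": "geneB", "term": "visual perception", "evidence": "IMP"},
    ]
    for record in experimental_only(annotations):
        print(record["gene"], record["term"], EVIDENCE_CODES[record["evidence"]])
```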

Perspective 3: Protein localization Page 242

Protein localization Proteins may be localized to intracellular compartments, cytosol, the plasma membrane, or they may be secreted. Many proteins shuttle between multiple compartments. A variety of algorithms predict localization, but this is essentially a cell biological question. Page 240

Localization of 2,900 yeast proteins
Michael Snyder and colleagues incorporated epitope tags into thousands of S. cerevisiae cDNAs, and systematically localized proteins (Kumar et al., 2002). Their database includes 2,900 fluorescence micrographs.
Page 243

Perspective 4: Protein function Page 243

Protein function Function refers to the role of a protein in the cell. We can consider protein function from a variety of perspectives. Page 243

1. Biochemical function (molecular function) RBP binds retinol, could be a carrier Page 245

2. Functional assignment based on homology RBP could be a carrier too Other carrier proteins Page 245

3. Function based on structure RBP forms a calyx Page 245

4. Function based on ligand binding specificity RBP binds vitamin A Page 245

5. Function based on cellular process RBP is abundant, soluble, secreted Page 245

6. Function based on biological process RBP is essential for vision Page 245

7. Function based on “proteomics” or high throughput “functional genomics”
High throughput analyses show...
-- RBP levels elevated in renal failure
-- RBP levels decreased in liver disease
Page 245

Functional assignment of enzymes: the EC (Enzyme Commission) system
Oxidoreductases   1,003
Transferases      1,076
Hydrolases        1,125
Lyases              356
Isomerases          156
Ligases             126
Page 246
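
The six classes above correspond, in order, to the first digit of an EC number (1 through 6). A minimal sketch of reading the top-level class off an EC number follows. The function name is hypothetical, and the example EC 1.1.1.1 (alcohol dehydrogenase) is not taken from the slides.

```python
# First digit of an EC number mapped to the six classes listed on the slide.
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
}


def ec_class(ec_number):
    """Return the top-level enzyme class for an EC number such as 'EC 1.1.1.1'."""
    text = ec_number.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()       # drop an optional "EC" prefix
    first_field = text.split(".")[0]
    if not first_field.isdigit() or int(first_field) not in EC_CLASSES:
        raise ValueError("not one of the six classes listed here: " + ec_number)
    return EC_CLASSES[int(first_field)]


if __name__ == "__main__":
    print(ec_class("EC 1.1.1.1"))
```

The remaining three fields of an EC number narrow the assignment to a subclass, a sub-subclass, and finally an individual enzyme entry.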

Functional assignment of proteins: Clusters of Orthologous Groups (COGs)
-- Information storage and processing
-- Cellular processes
-- Metabolism
-- Poorly characterized
(Most useful for prokaryotes)
Page 247

This lecture continues in Part 2 with a discussion of two-dimensional gels and the yeast two-hybrid system.