Presented at: Pacific Symposium on Biocomputing January 3, 2012.


Tutorial: Protein Intrinsic Disorder
Jianhan Chen, Kansas State University
Jianlin Cheng, University of Missouri
A. Keith Dunker, Indiana University

Outline
• Intrinsically Disordered Proteins (IDPs)
  – Definitions
  – Methods for detecting IDPs and IDP regions
  – Examples
  – Prediction of disorder from amino acid sequence
  – Visit www.disprot.org
• Research Frontiers of IDPs: A Session Summary
  – Prediction methods for IDPs
  – Simulation of IDPs' conformations
  – Analysis of IDPs' function and evolution

Part I: Intrinsically Disordered Proteins

Definitions: Intrinsically Disordered Proteins (IDPs) and IDP Regions
Whole proteins and regions of proteins are intrinsically disordered if:
• they lack stable 3D structure under physiological conditions, and
• they exist instead as dynamic, inter-converting configurational ensembles without particular equilibrium values for their coordinates or bond angles.

Types of IDPs and IDP Regions
• Flexible and dynamic random coils, which are distinct from structured random coils.
• Transient helices, turns, and sheets in random coil regions.
• Stable helices, turns, and sheets, but unstable tertiary structure (e.g., molten globules).

Three of ~Sixty Methods for Studying IDPs and IDP Regions (Book in Press)
• X-ray diffraction: requires regular spacing for diffraction to occur. The mobility of IDPs and IDP regions causes them to disappear from the electron density map. Gives residue-specific information.
• NMR: various NMR methods can directly identify IDPs and IDP regions because they move faster than globular domains. Gives residue-specific information.
• Circular dichroism: IDPs and IDP regions typically give a "random-coil" type CD spectrum. Gives whole-protein information, not residue-specific information.

X-ray Determined Disorder: Calcineurin and Calmodulin
[Figure labels: A-Subunit, B-Subunit, Active Site, Autoinhibitory Peptide]
Meador W et al., Science 257: 1251-1255 (1992)
Kissinger C et al., Nature 378: 641-644 (1995)

NMR Determined Disorder: Breast Cancer Protein 1 (BRCA1)
• Structured: 103 + 217 = 320 residues; 320 / 1,863 ≈ 17%
• Unstructured (disordered): 1,543 / 1,863 ≈ 83%
Many such "natively unfolded proteins" or "intrinsically disordered proteins" have been described.
Mark WY et al., J Mol Biol 345: 275-287 (2005)

Intrinsic Disorder in the Protein Data Bank
            Observed          Not Observed   Ambiguous      Uncharacterized   Total
Eukarya     647067 (53.3%)    39077 (3.2%)   24621 (2.0%)   504312 (41.5%)    1215077 (100%)
Bacteria    573676 (82.8%)    19126 (2.7%)   17702 (2.6%)   82479 (11.9%)     692983
Viruses     76019 (35.7%)     4856 (2.3%)    3797 (1.8%)    127970 (60.2%)    212642
Archaea     60411 (89.4%)     2055 (3.0%)    2112 (3.1%)    3029 (4.5%)       67607
Total       1357173 (62.0%)   65114          48232 (2.2%)   717790 (32.8%)    2188309
Le Gall et al., J. Biomol Struct Dyn 24: 325-342 (2007)


Why are IDPs & IDP Regions unstructured?
IDPs & IDP Regions lack structure because:
• They lack a cofactor, ligand or partner.
• They were denatured during isolation.
• Their folding requires conditions found inside cells.
• Their lack of structure is encoded by their amino acid composition.

Amino Acid Compositions
[Figure labels: Surface, Buried]

Why are IDPs & IDP Regions unstructured?
• To a first approximation, amino acid composition determines whether a protein folds or remains intrinsically disordered.
• Given a composition that favors folding, the sequence details determine which fold.
• Given a composition that favors not folding, the sequence details provide motifs for biological function.

Prediction of Intrinsic Disorder
• Ordered / Disordered Sequence Data
• Attribute Selection or Extraction: Aromaticity, Hydropathy, Charge, Complexity
• Separate Training and Testing Sets
• Predictor Training: Neural Networks, SVMs, etc.
• Predictor Validation on Out-of-Sample Data
• Prediction
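The attribute-extraction and data-splitting steps of this workflow can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method behind any particular predictor: the window size of 21, the use of the Kyte-Doolittle scale for hydropathy, Shannon entropy as the complexity measure, and the function names are all choices made here, and sequences are assumed to contain only the 20 standard amino acids.

```python
import math
from collections import Counter

# Kyte-Doolittle hydropathy scale (assumed choice of hydropathy measure).
KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
      "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
      "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
      "Y": -1.3, "V": 4.2}
AROMATIC = set("FWY")


def window_features(window):
    """Return [aromaticity, hydropathy, net charge, complexity] for one window."""
    n = len(window)
    counts = Counter(window)
    aromaticity = sum(counts[a] for a in AROMATIC) / n
    hydropathy = sum(KD[a] for a in window) / n
    net_charge = (counts["K"] + counts["R"] - counts["D"] - counts["E"]) / n
    # Shannon entropy of the residue distribution, used as a complexity proxy.
    complexity = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return [aromaticity, hydropathy, net_charge, complexity]


def featurize(sequence, size=21):
    """One feature vector per residue, from a window centred on that residue."""
    half = size // 2
    return [window_features(sequence[max(0, i - half): i + half + 1])
            for i in range(len(sequence))]


def split_by_protein(protein_ids, test_fraction=0.2):
    """Split whole proteins so no protein contributes to both sets.

    The split is deterministic (sorted order) to keep the sketch reproducible.
    """
    ids = sorted(protein_ids)
    n_test = max(1, int(len(ids) * test_fraction))
    return ids[n_test:], ids[:n_test]
```

The resulting per-residue vectors, paired with ordered/disordered labels, are the kind of input a neural network or SVM would be trained on; splitting by protein rather than by residue keeps the validation set genuinely out-of-sample.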

PONDR® VL-XT, PONDR® VSL2B and PreDisorder
[Figure labels: XPA; (+) Disordered; (–) Structured]
Iakoucheva L et al., Protein Sci 3: 561-571 (2001)
Dunker AK et al., FEBS J 272: 5129-5148 (2005)
Deng X et al., BMC Bioinformatics 10: 436 (2009)

Predicted Disorder vs. Proteome Size

Why So Much Disorder? Hypothesis: Disorder Used for Signaling
• Sequence → Structure → Function
  – Catalysis
  – Membrane transport
  – Binding small molecules
• Sequence → Disordered Ensemble → Function
  – Signaling, sites for PTMs, partner binding
  – Regulation
  – Recognition
  – Control
Dunker AK, et al., Biochemistry 41: 6573-6582 (2002)
Dunker AK, et al., Adv. Prot. Chem. 62: 25-49 (2002)
Xie H, et al., J. Proteome Res. 6: 1882-1932 (2007)

Molecular Recognition Features (MoRFs)
[Figure panels: Proteinase A + Inhibitor IA3; viral protein pVIc + Adenovirus 2 Proteinase; Amphiphysin + α-adaptin C; β-amyloid protein + protein X11. Panel labels: ι-MoRF, complex-MoRF]
Vacic V, et al., J Proteome Res. 6: 2351-2366 (2007)

Protein Interaction Domains: GYF Bound to CD2
http://www.mshri.on.ca/pawson/domains.html; GOOGLE: Tony Pawson

Short and Long MoRFs in PDB
As of 1/11/11, PDB contained 70,695 entries:
• number of short* MoRFs = 7,681
• number of long** MoRFs = 8,525
• short MoRFs + long MoRFs = ~23% of PDB entries!
* Short = 5-30 aa
** Long = 31-70 aa
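The ~23% figure can be reproduced from the counts on the slide. The short check below assumes, as the slide's wording suggests, that the MoRF count is simply divided by the number of PDB entries; whether a single entry can contribute more than one MoRF is not stated in the source.

```python
# Counts as given on the slide (PDB snapshot of 1/11/11).
pdb_entries = 70695
short_morfs = 7681   # 5-30 aa
long_morfs = 8525    # 31-70 aa

total_morfs = short_morfs + long_morfs
fraction = total_morfs / pdb_entries

print(f"{total_morfs} MoRFs / {pdb_entries} entries = {fraction:.1%}")
```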

p53 MoRFs
Note the use of disordered tails!
Uversky VN & Dunker AK, BBA 1804: 1231-1264 (2010)

Part II: Research Frontiers of Intrinsically Disordered Proteins

Current Topics of Intrinsically Disordered Proteins
• Prediction of Intrinsically Disordered Proteins (IDPs)
• Simulation of IDPs' conformation
• Analysis of IDPs' function and evolution
Chen, Cheng, Dunker, PSB, 2012

IDP Prediction Methods
Identification of Disordered Regions
• Ab initio method
• Template-based method
• Clustering method
• Meta method
Deng et al., Molecular Biosystems, 2011

Benchmark on 117 CASP9 Targets
Disorder Predictor   ACC    AUC    Weighted Score  Pos. Sens.  Pos. Spec.  Neg. Sens.  Neg. Spec.  F-meas.
Prdos2               0.752  0.852  7.153           0.608       0.375       0.897       0.957       0.464
PreDisorder          0.748  0.819  7.187           0.650       0.300       0.846       0.960       0.410
biomine_DR_pdb       0.739  0.818  6.763           0.597       0.338       0.881       0.956       0.432
GSmetaDisorderMD     0.736  0.813  6.906           0.657       0.266       0.816       0.959       0.378
mason                0.730  0.740  6.297           0.537       0.416       0.923       0.952       0.469
ZHOU-SPINE-D         0.729  0.829  6.411           0.579       0.326       0.878       0.954       0.417
GSmetaserver         0.713  0.811  5.982           0.577       0.279       0.849       -           0.376
ZHOU-SPINE-DM        0.705  0.789  5.621           0.535       0.303       0.875       0.949       0.387
Distill-Punch1       0.701  0.797  5.392           0.505       -           -           0.946       0.405
GSmetaDisorder       0.694  0.793  5.268           0.519       0.287       0.869       0.947       0.370
OnD-CRF              -      0.733  5.513           0.586       0.231       0.802       0.950       0.332
CBRC_POODLE          0.693  0.828  4.958           0.447       0.425       0.939       0.944       0.435
MULTICOM             0.687  -      4.723           0.419       0.481       0.955       0.942       0.448
IntFOLD-DR           0.683  0.794  4.831           -           0.299       0.885       -           0.369
Biomine_DR_mixed     -      0.769  4.901           0.501       0.274       0.865       0.945       0.354
Spritz3              -      0.751  4.732           0.457       0.336       0.909       0.943       -
DISOPRED3C           0.669  0.851  3.975           0.349       0.775       0.990       0.937       -
GSmetaDisorder3D     -      0.781  4.142           -           -           -           -           -
biomine_DR           0.659  0.815  3.647           0.333       0.696       0.985       0.936       0.451
OnD-CRF-pruned       -      0.707  4.358           0.526       0.205       0.792       -           0.295
Distill              0.654  -      4.152           0.510       0.204       0.798       0.941       0.291
ULg-GIGA             0.589  0.718  1.302           0.191       -           0.988       0.924       0.290
(name missing)       0.572  -      -               0.152       0.647       0.992       0.920       0.247
"-" marks a value missing from this transcript. GSmetaDisorder3D has two further values, 0.398 and 0.399, whose columns are unclear. The last row has one further value, 0.644, which is either the AUC or the weighted score.
Deng et al., Molecular Biosystems, 2011
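Two standard identities appear to connect the benchmark columns, and they are a useful sanity check when reading the numbers. The sketch below assumes that ACC is balanced accuracy (the mean of sensitivity and specificity on the disordered class) and that F-meas. is the harmonic mean of precision and sensitivity; the slide does not define either, so both readings are assumptions, tested here against the Prdos2 values 0.608 (sensitivity), 0.375 (precision) and 0.897 (specificity).

```python
def balanced_accuracy(sensitivity, specificity):
    """Mean of the true-positive rate and the true-negative rate."""
    return (sensitivity + specificity) / 2


def f_measure(precision, sensitivity):
    """Harmonic mean of precision and sensitivity (recall)."""
    return 2 * precision * sensitivity / (precision + sensitivity)


# Prdos2 values as they appear in the benchmark (column meaning assumed).
prdos2_acc = balanced_accuracy(0.608, 0.897)
prdos2_f = f_measure(0.375, 0.608)

print(f"ACC ~ {prdos2_acc:.4f}, F-measure ~ {prdos2_f:.4f}")
```

Both results agree with the reported 0.752 and 0.464 to within rounding, which supports this reading of the columns without proving it.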

A Prediction Example by PreDisorder
Deng et al., Molecular Biosystems, 2011

Improve Disorder Prediction by Regression-Based Consensus
Peng and Kurgan, PSB, 2012


Construct IDP Ensembles Using Variational Bayesian Weighting with Structure Selection
• Construct a minimal number of conformations
• Estimate uncertainty in properties
• Validated against reference ensembles of α-synuclein
[Figure label: Alignment of weighted structures]
Fisher et al., PSB, 2012

Discover Intermediate States in IDP Ensemble by Quasi-Anharmonic Analysis
[Figure: bound and unbound forms of Nuclear Co-Activator Binding Domain (NCBD)]
Burger et al., PSB, 2012

Order-Disorder Transformation by Sequential Phosphorylations?
[Figure labels: Domain organization of human nucleophosmin (Npm); Order-Disorder Transition Triggered by Phosphorylation; Phosphorylation Sites (blue)]
Mitrea and Kriwacki, PSB, 2012


Classify Disordered Proteins by CH-CDF Plot
• CH-CDF: charge-hydropathy, cumulative distribution function
• Four classes: structured, mixed, disordered, rare
Huang et al., PSB, 2012
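The charge-hydropathy (CH) half of such a plot can be sketched as follows. The slide gives no formulas, so everything numeric here is an assumption: the Kyte-Doolittle scale rescaled to 0-1, and the commonly cited linear boundary ⟨R⟩ = 2.785⟨H⟩ − 1.151 between compact and extended proteins. The CDF half needs per-residue scores from a disorder predictor and is not shown. Function names are illustrative only.

```python
# Kyte-Doolittle hydropathy scale (assumed; rescaled to 0-1 below).
KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
      "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
      "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
      "Y": -1.3, "V": 4.2}


def mean_hydropathy(seq):
    """Mean hydropathy <H>, with each residue value rescaled to [0, 1]."""
    return sum((KD[a] + 4.5) / 9.0 for a in seq) / len(seq)


def mean_net_charge(seq):
    """Absolute mean net charge <R>, counting K/R as +1 and D/E as -1."""
    net = seq.count("K") + seq.count("R") - seq.count("D") - seq.count("E")
    return abs(net) / len(seq)


def ch_distance(seq):
    """Vertical distance from the assumed boundary <R> = 2.785<H> - 1.151.

    Positive values fall on the disordered side (high charge, low hydropathy),
    negative values on the structured side.
    """
    boundary = 2.785 * mean_hydropathy(seq) - 1.151
    return mean_net_charge(seq) - boundary
```

A highly charged, hydrophilic sequence such as "EEEEKKKKEEEE" gives a positive distance, while a hydrophobic one such as "IIIILLLLVVVV" gives a negative distance.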

Function Annotation of IDP Domains by Amino Acid Content
[Equation labels: frequency of an amino acid in sequence i; similarity between disordered proteins]
• Achieves similar function prediction precision, but much higher coverage, in comparison with BLAST
• CC: cellular component; MF: molecular function; BP: biological process
Patil et al., PSB, 2012
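The idea of comparing disordered proteins by amino acid content, rather than by alignment, can be illustrated with composition vectors. The transcript preserves only the equation labels, not the equations, so the similarity measure below (cosine similarity between 20-dimensional frequency vectors) is a stand-in chosen for this sketch and should not be read as the measure used by Patil et al.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Frequency of each of the 20 standard amino acids in the sequence."""
    counts = Counter(seq)
    return [counts[a] / len(seq) for a in AMINO_ACIDS]


def cosine_similarity(u, v):
    """Cosine of the angle between two composition vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


# Composition ignores residue order, so shuffled sequences score as identical.
same = cosine_similarity(composition("AAKKSS"), composition("SKASKA"))
different = cosine_similarity(composition("AAAA"), composition("KKKK"))
```

Because composition is independent of residue order, such a measure can relate sequences that an alignment tool cannot match, which is consistent with the higher coverage reported on the slide.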

High Conservation in Flexible Disordered Binding Sites
Hsu et al., PSB, 2012

Sequence Conservation & Co-Evolution in IDPs and Their Functional Implications
Jeong and Kim, PSB, 2012

Intrinsic Disorder Flanking DNA-Binding Domains of Human TFs
Guo et al., PSB, 2012

Modulate Protein-DNA Binding by Post-Translational Modifications at Disordered Regions
Vuzman et al., PSB, 2012

High Correlation between Disorder and Post-Translational Modification
Disorder-order transitions might be introduced by modifications that produce phosphoserine or phosphothreonine, mono-, di-, or trimethyllysine, sulfotyrosine, and 4-carboxyglutamate.
Gao and Xu, PSB, 2012

Acknowledgements
• Authors and reviewers of PSB IDP session
• IDP community
• PSB organizers
• Images.google.com
Thank You!