Protein Structures from A Statistical Perspective Jinfeng Zhang Department of Statistics Florida State University.

Presentation transcript:

Protein Structures from A Statistical Perspective Jinfeng Zhang Department of Statistics Florida State University

2 Introduction – Machines of Life Proteins play crucial roles in virtually all biological processes. Examples from the Protein Data Bank (PDB): rhodopsin, myosin, hemoglobin, pepsin.

3 From Sequence to Structure to Function The function of a protein is governed by its three-dimensional structure. – Structure is determined by sequence. DNA sequences → amino acid sequences → protein structures.

4 Protein Folding Problem

5 Evaluation of Energy Functions Current approach: rank the energy of the single native structure among those of decoy structures. – Neither necessary nor sufficient. [Figure: panels A and B, with labels "E", "X-ray structure" and "Near-native structures".]

6 X-ray Structures

7 Proteins are Dynamic Molecules

8 Y. J. Huang and G. T. Montelione, Nature, 438 (2005).

9 N. Furnham, T. Blundell, M. DePristo, Nature Structural & Molecular Biology (2006) 13: … A more suitable representation of a macro-molecular crystal structure would be an ensemble of models. The range of structures in the ensemble would be considered by any user of the structural information.

10 A New Criterion for Energy Function Evaluation Based on Probability of NNS [Figure labels: all structures; near-native structures (NNS); native structure.]

11 Off-lattice Discrete State Model J. Zhang, R. Chen, J. Liang (2006), Proteins, 63.

12 Sampling Near-Native Structures (NNS) by Sequential Monte Carlo (SMC) Two constraints: – Conformational – Energetic Partition functions: J. Zhang et al. (2007), Proteins, 66: 61-68. [Figure labels: all structures; NNS; native structure; SMC.]
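The partition-function expressions on this slide did not survive in the transcript. One natural reading, assumed here rather than taken from the slides, is that the probability of the near-native set is the ratio of two Boltzmann sums, $P_{NNS} = Z_{NNS} / Z_{all}$. The sketch below computes that ratio by direct summation over a small, made-up list of energies; the function name `p_nns`, the energies, and `kT = 1` are all hypothetical. In the talk the sums are estimated by SMC instead of enumerated.

```python
import math

def p_nns(energies, is_near_native, kT=1.0):
    """Boltzmann probability mass of the near-native subset (assumed definition)."""
    e_min = min(energies)  # shift energies so the exponentials stay well scaled
    weights = [math.exp(-(e - e_min) / kT) for e in energies]
    z_all = sum(weights)                                        # all structures
    z_nns = sum(w for w, near in zip(weights, is_near_native) if near)  # NNS only
    return z_nns / z_all

# Hypothetical toy ensemble: three low-energy near-native states, three others.
energies = [-5.0, -4.5, -4.0, -1.0, 0.0, 0.5]
near = [True, True, True, False, False, False]
print(p_nns(energies, near))
```

An energy function that concentrates probability on the near-native set gives a value close to 1; a flat energy function gives the fraction of structures that happen to be near-native.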

13 Comparison with Enumeration At length 15 with the 5-state model, protein 1ail has $1.04 \times 10^{9}$ conformations. The estimated number is $1.039 \times 10^{9}$ with a sample size of 10,000.

14 Energy Functions and Their Performances UP: Uniform Potential. CP: Contact Potential. CALSP: Contact And Local Sequence-structure Potential. J. Zhang et al. (2007), Proteins, 66. J. Zhang, R. Chen, J. Liang (2006), Proteins, 63.

15 Characterizing Ensemble of Side Chain Conformations Side chain conformations –Ensemble of structures with the same backbone but different side chain conformations.

16 Ensemble of Side Chain Conformations

17 Ensemble of Side Chain Conformations Number of side chain conformations: $N_{sc}$. Side chain conformational entropy: $S_{sc} = k_B \ln(N_{sc})$. Protein stability: $\Delta G = \Delta H - T \Delta S$.

18 Sequential Monte Carlo (SMC) Let $S_n = (r_1, \ldots, r_j, \ldots, r_n)$, with $r_j \in R_j = \{1, \ldots, M_j\}$. SMC samples side chains one at a time, keeping the residues that are already sampled fixed. Each sample $i$ has an associated weight $w^{(i)}$. At step $t$, a residue $r_t$ is picked and a rotamer $k$ is sampled from a given distribution with probability $p_k$; the weight is then updated.
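The weight-update formula is missing from this transcript. A common choice in sequential importance sampling, assumed here and not confirmed by the slides, is $w_t = w_{t-1} / p_k$; when rotamers are drawn uniformly from the clash-free ones, the mean weight estimates the number of clash-free conformations. The sketch below applies this to a toy model and checks it against brute-force enumeration. The clash rule, the rotamer counts in `sizes`, and all function names are hypothetical.

```python
import itertools
import random

def clashes(j, r_j, k, r_k):
    # Hypothetical toy rule: adjacent residues clash when they pick the same rotamer index.
    return abs(j - k) == 1 and r_j == r_k

def exact_count(sizes):
    """Count clash-free rotamer assignments by full enumeration."""
    count = 0
    for conf in itertools.product(*(range(m) for m in sizes)):
        if not any(clashes(j, conf[j], j + 1, conf[j + 1]) for j in range(len(sizes) - 1)):
            count += 1
    return count

def smc_count(sizes, n_samples, rng):
    """Estimate the same count by growing one residue at a time with weights."""
    total = 0.0
    for _ in range(n_samples):
        placed = []   # rotamers fixed so far
        weight = 1.0
        for t, m in enumerate(sizes):
            allowed = [k for k in range(m)
                       if not any(clashes(j, r_j, t, k) for j, r_j in enumerate(placed))]
            if not allowed:          # dead end: this sample contributes zero
                weight = 0.0
                break
            placed.append(rng.choice(allowed))  # uniform proposal, p_k = 1/len(allowed)
            weight *= len(allowed)              # assumed update: w <- w / p_k
        total += weight
    return total / n_samples

sizes = [3, 3, 2, 4, 3, 5]   # hypothetical rotamer counts M_j
exact = exact_count(sizes)
estimate = smc_count(sizes, 5000, random.Random(0))
print(exact, round(estimate, 1))
```

Taking the logarithm of the estimated count gives an entropy estimate in units of $k_B$.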

19 Sequential Sampling of Side Chains

20 Performance The total number of self-avoiding side-chain conformations for the fragment of 3ebx, residues 1-17, is 396,325,923,840 ≈ $3.96 \times 10^{11}$. The SMC estimate is $4.01 \times 10^{11}$ with a sample size of 1,000.

21 Incorporating SCE in Energy Function Without SCE: $\Delta G = \Delta H$ – $\Delta H$: residue contact potential. With SCE: $\Delta G = \Delta H - T \Delta S_{sc}$ – $\Delta H$: residue contact potential. – $\Delta S_{sc}$: side-chain entropy. – $T = 1$.
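To illustrate how adding the entropy term can change a decoy ranking, here is a minimal sketch that scores one native structure against a few decoys with and without $T \Delta S_{sc}$. The numbers are invented for illustration and are not data from the talk; `free_energy` and `native_rank` are hypothetical helper names, and $T = 1$ follows the slide.

```python
def free_energy(delta_h, delta_s_sc, temperature=1.0):
    """dG = dH - T * dS_sc, with T = 1 as on the slide."""
    return delta_h - temperature * delta_s_sc

def native_rank(native_score, decoy_scores):
    """Rank of the native structure; 1 means no decoy scores lower."""
    return 1 + sum(1 for s in decoy_scores if s < native_score)

# Hypothetical (dH, S_sc) pairs: the native has more side-chain entropy than the decoys.
native = (-10.0, 30.0)
decoys = [(-12.0, 20.0), (-11.0, 25.0), (-9.0, 28.0)]

rank_h = native_rank(native[0], [h for h, _ in decoys])
rank_g = native_rank(free_energy(*native), [free_energy(h, s) for h, s in decoys])
print(rank_h, rank_g)  # 3 1
```

In this toy case two decoys beat the native on contact energy alone, but none do once side-chain entropy is included.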

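22 $\Delta H$ vs. $\Delta H - \Delta S_{sc}$
Each entry gives the protein ID with its decoy set* and the number shown beside it, followed by the values for $\Delta H$ and $\Delta H - \Delta S_{sc}$. The two values are run together in this transcript and are reproduced as they appear.
1ctf (A, 630): 61
1r69 (A, 675): 245
1sn3 (A, 660): 8610
2cro (A, 674): 635
icb (A, 653): 1925
4pti (A, 687): 14383
4rxn (A, 677): 147
fc2 (B, 500): 75
1hdd-C (B, 500): 105
2cro (B, 500): 4717
4icb (B, 500): 11
1bl0 (C, 971): 8514
1eh2 (C, 2413): 9953
1jwe (C, 1407): 2881
smd3 (C, 1200): 2661
1beo (D, 2000): 672
1ctf (D, 2000): 101
1dkt-A (D, 2000): 5885
1fca (D, 2000): (no values in transcript)
1nkl (D, 2000): 2173
1pgb (D, 1572): 121
1b0n-B (E, 497): (no values in transcript)
1ctf (E, 497): 134
1dtk (E, 215): 11
1fc2 (E, 500): 323
1igd (E, 500): 1596
1shf-A (E, 437): 2 2
2cro (E, 500): 11
2ovo (E, 347): 192
4pti (E, 343): 11
* A: 4state_reduced, B: fisa, C: fisa_casp3, D: lattice_ssfit, E: lmds.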

23 Summary Proteins can be better modeled as an ensemble of conformations. – Entropy and free energy can be estimated by SMC. $P_{NNS}$ is a better criterion for evaluating energy functions. SCE is important for protein folding and structure modeling.

24 Acknowledgement Prof. Jun Liu, Department of Statistics, Harvard University. Prof. Jie Liang, Bioengineering Department, University of Illinois at Chicago. Prof. Rong Chen, Department of Information and Decision Science, University of Illinois at Chicago. Dr. Ming Lin, Department of Information and Decision Science, University of Illinois at Chicago. NIH, NSF for financial support!

25 Protein Interactions

26 Native & Decoy Structures of Protein Complexes [Figure labels: 1spb; 1brc; Native.]

27 Native & Decoy Structures $S_{sc}$ can differ by more than 20 in units of $k_B$, which corresponds to about 12 kcal/mol of free energy at 300 K ($T \Delta S = 20\,k_B \times 300$ K). The stability of a protein is around -5 to -20 kcal/mol. [Figure labels: Native; 1ctf.]
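The conversion from an entropy difference in $k_B$ units to a free-energy contribution can be checked with a few lines of arithmetic. This sketch assumes the per-mole form of $k_B$, the gas constant $R \approx 1.987 \times 10^{-3}$ kcal/(mol·K); the function names are hypothetical. It also evaluates $S_{sc} = k_B \ln(N_{sc})$ for the conformation count reported for the 3ebx fragment.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # molar gas constant in kcal/(mol*K)

def entropy_in_kB(n_conformations):
    """S / k_B = ln(N) for N equally weighted conformations."""
    return math.log(n_conformations)

def entropy_term_kcal_per_mol(delta_s_in_kB, temperature_K):
    """T * dS in kcal/mol for an entropy difference given in units of k_B."""
    return delta_s_in_kB * R_KCAL_PER_MOL_K * temperature_K

print(round(entropy_term_kcal_per_mol(20, 300), 1))  # 11.9
print(entropy_in_kB(396_325_923_840))                # entropy of the 3ebx fragment count, in k_B
```

An entropy gap of 20 $k_B$ is therefore comparable in size to the whole stability range quoted on the slide.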

28 Side-chain Modeling All heavy atoms are explicitly modeled. Side-chain flexibility – Rotamer library by D. Richardson. Excluded volume effect – Whether a pair of atoms $i$ and $j$ is considered a hard clash depends on $r_{ij}$, the distance between them; $r_0(i)$ and $r_0(j)$, the van der Waals radii of the two atoms; and $a$, a scaling coefficient.
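The clash inequality itself is not preserved in this transcript. A form consistent with the symbols defined on the slide, assumed here and not confirmed by the source, is $r_{ij} < a\,(r_0(i) + r_0(j))$. The sketch below implements that assumed test; the default `a = 0.6` and the 1.7 Å carbon radius are illustrative values, not figures from the talk.

```python
import math

def is_hard_clash(pos_i, pos_j, radius_i, radius_j, a=0.6):
    """Assumed criterion: r_ij < a * (r0(i) + r0(j)). The value of a is hypothetical."""
    r_ij = math.dist(pos_i, pos_j)  # Euclidean distance between atom centers
    return r_ij < a * (radius_i + radius_j)

# Two carbon-like atoms (illustrative radius 1.7 angstroms): threshold is 0.6 * 3.4 = 2.04.
print(is_hard_clash((0.0, 0.0, 0.0), (1.5, 0.0, 0.0), 1.7, 1.7))  # True
print(is_hard_clash((0.0, 0.0, 0.0), (3.0, 0.0, 0.0), 1.7, 1.7))  # False
```

A scaling coefficient below 1 lets atoms overlap slightly before a conformation is rejected, which is a common way to tolerate small errors in a fixed backbone and a discrete rotamer library.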

29 X-ray & NMR Structures X-ray: protein in crystal. NMR: protein in solution.

30 SCE vs. $R_g$ of X-ray and NMR Structures 23 proteins with both X-ray and NMR structures.