Protein Tertiary Structure Prediction. Protein Structure Prediction & Alignment Protein structure Secondary structure Tertiary structure Structure prediction.

Slides:



Advertisements
Similar presentations
Lecture 14: Special interactions. What did we cover in the last lecture? Restricted motion of molecules near a surface results in a repulsive force which.
Advertisements

Rosetta Energy Function Glenn Butterfoss. Rosetta Energy Function Major Classes: 1. Low resolution: Reduced atom representation Simple energy function.
Protein Structure – Part-2 Pauling Rules The bond lengths and bond angles should be distorted as little as possible. No two atoms should approach one another.
A Digital Laboratory “In the real world, this could eventually mean that most chemical experiments are conducted inside the silicon of chips instead of.
Survey of Molecular Dynamics Simulations By Will Welch For Jan Kubelka CHEM 4560/5560 Fall, 2014 University of Wyoming.
Molecular Modeling: Molecular Mechanics C372 Introduction to Cheminformatics II Kelsey Forsythe.
Amino Acids:. Peptide Bond * Elimination of water upon formation. * Peptide bond is flat.
Lawrence Hunter, Ph.D. Director, Computational Bioscience Program University of Colorado School of Medicine
Protein Threading Zhanggroup Overview Background protein structure protein folding and designability Protein threading Current limitations.
Molecular Dynamics, Monte Carlo and Docking Lecture 21 Introduction to Bioinformatics MNW2.
Case Studies Class 5. Computational Chemistry Structure of molecules and their reactivities Two major areas –molecular mechanics –electronic structure.
CISC667, F05, Lec21, Liao1 CISC 467/667 Intro to Bioinformatics (Fall 2005) Protein Structure Prediction 3-Dimensional Structure.
Protein Primer. Outline n Protein representations n Structure of Proteins Structure of Proteins –Primary: amino acid sequence –Secondary:  -helices &
Energetics and kinetics of protein folding. Comparison to other self-assembling systems?
. Protein Structure Prediction [Based on Structural Bioinformatics, section VII]
Molecular modelling / structure prediction (A computational approach to protein structure) Today: Why bother about proteins/prediction Concepts of molecular.
Protein Basics Protein function Protein structure –Primary Amino acids Linkage Protein conformation framework –Dihedral angles –Ramachandran plots Sequence.
Molecular Mechanics, Molecular Dynamics, and Docking Michael Strong, PhD National Jewish Health 11/23/2010.
Molecular Dynamics Classical trajectories and exact solutions
Proteins: Levels of Protein Structure Conformation of Peptide Group
Inverse Kinematics for Molecular World Sadia Malik April 18, 2002 CS 395T U.T. Austin.
Bioinf. Data Analysis & Tools Molecular Simulations & Sampling Techniques117 Jan 2006 Bioinformatics Data Analysis & Tools Molecular simulations & sampling.
Molecular Modeling Part I Molecular Mechanics and Conformational Analysis ORG I Lab William Kelly.
Protein Structure Prediction Dr. G.P.S. Raghava Protein Sequence + Structure.
8/30/2015 5:26 PM Homology modeling Dinesh Gupta ICGEB, New Delhi.
Construyendo modelos 3D de proteinas ‘fold recognition / threading’
Molecular Mechanics, Molecular Dynamics, and Docking
Empirical energy function Summarizing some points about typical MM force field In principle, for a given new molecule, all force field parameters need.
Molecular Modeling Part I. A Brief Introduction to Molecular Mechanics.
What are proteins? Proteins are important; e.g. for catalyzing and regulating biochemical reactions, transporting molecules, … Linear polymer chain composed.
Protein Secondary Structure Lecture 2/19/2003. Three Dimensional Protein Structures Confirmation: Spatial arrangement of atoms that depend on bonds and.
02/03/10 CSCE 769 Dihedral Angles Homayoun Valafar Department of Computer Science and Engineering, USC.
Protein “folding” occurs due to the intrinsic chemical/physical properties of the 1° structure “Unstructured” “Disordered” “Denatured” “Unfolded” “Structured”
Representations of Molecular Structure: Bonds Only.
RNA Secondary Structure Prediction Spring Objectives  Can we predict the structure of an RNA?  Can we predict the structure of a protein?
CZ5225 Methods in Computational Biology Lecture 4-5: Protein Structure and Structural Modeling Prof. Chen Yu Zong Tel:
Rotamer Packing Problem: The algorithms Hugo Willy 26 May 2010.
Protein Folding & Biospectroscopy F14PFB David Robinson Mark Searle Jon McMaster
Department of Mechanical Engineering
Conformational Entropy Entropy is an essential component in ΔG and must be considered in order to model many chemical processes, including protein folding,
A Technical Introduction to the MD-OPEP Simulation Tools
Applied Bioinformatics Week 12. Bioinformatics & Functional Proteomics How to classify proteins into functional classes? How to compare one proteome with.
Protein Folding and Modeling Carol K. Hall Chemical and Biomolecular Engineering North Carolina State University.
Molecular Mechanics Studies involving covalent interactions (enzyme reaction): quantum mechanics; extremely slow Studies involving noncovalent interactions.
Altman et al. JACS 2008, Presented By Swati Jain.
Lecture 16 – Molecular interactions
Protein Structure Prediction: Homology Modeling & Threading/Fold Recognition D. Mohanty NII, New Delhi.
Introduction to Protein Structure Prediction BMI/CS 576 Colin Dewey Fall 2008.
Molecular Modelling - Lecture 2 Techniques for Conformational Sampling Uses CHARMM force field Written in C++
ChE 452 Lecture 25 Non-linear Collisions 1. Background: Collision Theory Key equation Method Use molecular dynamics to simulate the collisions Integrate.
Structure prediction: Ab-initio Lecture 9 Structural Bioinformatics Dr. Avraham Samson Let’s think!
LSM3241: Bioinformatics and Biocomputing Lecture 6: Fundamentals of Molecular Modeling Prof. Chen Yu Zong Tel:
PROTEIN FOLDING: H-P Lattice Model 1. Outline: Introduction: What is Protein? Protein Folding Native State Mechanism of Folding Energy Landscape Kinetic.
Chemistry XXI Unit 3 How do we predict properties? M1. Analyzing Molecular Structure Predicting properties based on molecular structure. M4. Exploring.
Protein Structure and Bioinformatics. Chapter 2 What is protein structure? What are proteins made of? What forces determines protein structure? What is.
Molecular Mechanics (Molecular Force Fields). Each atom moves by Newton’s 2 nd Law: F = ma E = … x Y Principles of M olecular Dynamics (MD): F =
Structural classification of Proteins SCOP Classification: consists of a database Family Evolutionarily related with a significant sequence identity Superfamily.
Protein backbone Biochemical view:
Molecular dynamics (MD) simulations  A deterministic method based on the solution of Newton’s equation of motion F i = m i a i for the ith particle; the.
Protein Structure Prediction. Protein Synthesis the national health museum.
Protein Structure BL
Protein Structure and Properties
Protein Structure Prediction and Protein Homology modeling
Computational Analysis
Protein Structure Prediction
3-Dimensional Structure
Intro to Molecular Dynamics (MD) Simulation using CHARMM
CZ5225 Methods in Computational Biology Lecture 7: Protein Structure and Structural Modeling Prof. Chen Yu Zong Tel:
Protein structure prediction.
Presentation transcript:

Protein Tertiary Structure Prediction

Protein Structure Prediction & Alignment Protein structure Secondary structure Tertiary structure Structure prediction Secondary structure 3D structure Ab initio Comparative modeling Threading Structure alignment 3D structure alignment Protein docking

Predicting Protein 3D Structure Goal: Find the best fit of a sequence to a 3D structure Ab initio methods Attempt to calculate 3D structure “from scratch” Lattice models off-lattice models Energy minimization Molecular dynamics Comparative (homology) modeling Construct 3D model from alignment to protein sequences with known structure Threading (fold recognition/reverse folding) Pick best fit to sequences of known 2D/3D structures (folds)

How proteins interact? It is believed that hydrophobic collapse is a key driving force for protein folding It is believed that hydrophobic collapse is a key driving force for protein folding Hydrophobic core! Hydrophobic core! Analog: water and oil separation Analog: water and oil separation Model: A chain of twenty kinds of beats Model: A chain of twenty kinds of beats

“Elementary school kid model” Different assembles (shapes) Different assembles (shapes) Frustrated system Frustrated system Lots of local minimums Lots of local minimums Jose Onuchic, UCSD

Classes of Amino Acids

Cubic lattice model

Hydrophobic packing models Dill's HP model Dill's HP model Two classes of amino acids, hydrophobic (H) and polar (P) Two classes of amino acids, hydrophobic (H) and polar (P) Lattice model for position of amino acids. Lattice model for position of amino acids. Thread chain of H's and P's through lattice to maximize number of H-H contacts Thread chain of H's and P's through lattice to maximize number of H-H contacts 2D 3D3D

Hydrophobic Zipper

Most Designable Structures

All the chains here are 21 beads long. The upper panel shows some of the 107 exceptionally stable foldings of 80 sequences that maximize the number of H-H contacts. In the lower panel are a few of the other 117,676,504,514,560 combinations of sequences and foldings, selected at random. (Brian Hayes, American Scientists,1998) All the chains here are 21 beads long. The upper panel shows some of the 107 exceptionally stable foldings of 80 sequences that maximize the number of H-H contacts. In the lower panel are a few of the other 117,676,504,514,560 combinations of sequences and foldings, selected at random. (Brian Hayes, American Scientists,1998)

HP Lattice Model Simplifications in the model: Simplifications in the model: All amino acids are classified as hydrophobic (H) or polar (P). A protein is represented as a string of H’s and P’s. HHHHHPPPHHHPP All amino acids are classified as hydrophobic (H) or polar (P). A protein is represented as a string of H’s and P’s. HHHHHPPPHHHPP Space is discretized. Each amino acid is embedded to a single lattice point. A protein fold corresponds to a self-avoiding walk over the lattice. Space is discretized. Each amino acid is embedded to a single lattice point. A protein fold corresponds to a self-avoiding walk over the lattice. The energy function is defined as The energy function is defined as E =  (# of H-H contacts not including covalent interaction).

Example of HP lattice model Example of HP lattice model Hydrophobic amino acid Polar amino acid Peptide bond H-H contacts E = Number of H-H contacts (except for peptide bonds) = -7

HP Lattice Model Other lattices Other lattices 2D triangular lattice, 3D-diamond lattice 2D triangular lattice, 3D-diamond lattice Other energy functions Other energy functions HP=0, HH=-1, PP=1 HP=0, HH=-1, PP=1 Lattice model can be used Lattice model can be used Study qualitative features of protein folding Study qualitative features of protein folding Reduce search space in structure prediction methods Reduce search space in structure prediction methods Study potential effectiveness of the methods for structure prediction (inverse folding problem) Study potential effectiveness of the methods for structure prediction (inverse folding problem)

Inverse Folding Problem Example: Example: Can we find all protein sequences in GenBank with the globin fold? globin foldglobin fold Claim: Claim: There exist two native sequence S i, S j such that E(S(S i ), S i )  E(S(S i ), S j ) where S(S i ) and S(S j ) be the native structures of S i & S j. i.e. the sequence S j “scores” better on S i ’s native structure than S i itself. NO.

Exercise Find native structures of S 1 and S 2 Find native structures of S 1 and S 2 S 1 = HHPPPPHPPPH S 1 = HHPPPPHPPPH S 2 = HHPHPPHPHPH S 2 = HHPHPPHPHPH Thread S 2 on to the structure of S 1 and find the energy associated with that fold Thread S 2 on to the structure of S 1 and find the energy associated with that fold

Exercise Find native structures of S 1 and S 2 Find native structures of S 1 and S 2 S 1 = HHPPPPHPPPH S 1 = HHPPPPHPPPH S 2 = HHPHPPHPHPH S 2 = HHPHPPHPHPH Thread S 2 on to the structure of S 1 and find the energy associated with that fold Thread S 2 on to the structure of S 1 and find the energy associated with that fold S 1 S 1 E(S(S 1 ), S 1 ) = -2; E(S(S 1 ), S 2 ) = -3; E(S(S 2 ), S 2 ) = -4. H PP H H P P P HP P H PP H H P P H HP H H P P H H H P H P P H

More on Lattice Model Even the 2D HP packing problem (which is easier than the 3D one) turns out to be NP complete! Even the 2D HP packing problem (which is easier than the 3D one) turns out to be NP complete! Good approximation results exist. Good approximation results exist. 3/8 of optimal approximation (3D) 3/8 of optimal approximation (3D) In triangular lattice, algorithm for >60% of optimal packing In triangular lattice, algorithm for >60% of optimal packing Other interesting results in the model, e.g. Other interesting results in the model, e.g. Which sequences are native (a single optimal fold)? Which sequences are native (a single optimal fold)?

Summary Approach Reduce computation by limiting degrees of freedom Limit α-carbon (Cα) atoms to positions on 2D or 3D lattice Protein sequence → represented as path through lattice points H-P (hydrophobic-polar) cost model Each residue → hydrophobic (H) or hydrophilic (P) Score position of sequence → maximize H-H contacts Problem Still NP-hard Greatly simplified problem Emphasis on forming hydrophobic core Need more accurate cost models

Off-Lattice Models Approach Compromise between lattice model and molecular dynamics Backbone placement → allowed by Ramachandran plot Represent as phi & psi angles of α-carbon atoms Degree of precision α-carbon only All backbone atoms All backbone atoms + side chains (residues) Common conformation (positions) of side chain = rotamer Problem Still simplified problem Increased computation cost

Molecular Dynamics Goal Goal Provides a way to observe the motion of large molecules such as proteins at the atomic level – dynamic simulation Provides a way to observe the motion of large molecules such as proteins at the atomic level – dynamic simulation Approach Model all interatomic forces acting on atoms in protein Potential energy function (Newtonian mechanics) Potential energy function (Newtonian mechanics) Perform numerical simulations to predict fold Repeat for each atom at each time step Calculate & add up all (pairwise) forces bonds: bonds: non-bonded: electrostatic and non-bonded: electrostatic and van der Waals’ Newton’s 2nd law ? ) Apply force, move atom to new position (Newton’s 2nd law ? ) Obtain trajectories of motion of molecule Obtain trajectories of motion of molecule F = ma

MD Problem with MD Smaller time step → more accurate simulation Modeling folding is computationally intensive Current models require tiny ( second) time steps Simulations reported for at most seconds Folding requires 1 second or more Demo (12 nanosecond MD simulation) Demo

Types of Inter-atomic Forces

Molecular Dynamics

Potential Energy Components Components (1) bond length Bonds behave like spring with equilibrium bond length depending on bond type. Increase or decrease from equilibrium length requires higher energy.

Potential Energy (2) bond angle Bond angles have equilibrium value eg 108 for H-C-H Bond angles have equilibrium value eg 108 for H-C-H Behave as if sprung. Behave as if sprung. Increase or decrease in angle requires higher energy. Increase or decrease in angle requires higher energy.

Potential Energy (3) torsion angle Rotation can occur about single bond in A-B-C-D but energy depends on torsion angle (angle between CD & AB viewed along BC). Staggered conformations (angle +60, -60 or 180 are preferred).

Potential Energy (4) van der Waals interactions Interactions between atoms not near neighbours expressed by Lennard-Jones potential. Very high repulsive force if atoms closer than sum of van der Waals radii. Attractive force if distance greater. Because of strong distance dependence, van der Waals interactions become negligible at distances over 15 Å.

Potential Energy (5) Electrostatic interactions All atoms have partial charge eg in C=O, C has partial positive charge, O atom partial negative charge. Two atoms that have the same charge repel one another, those with unlike charge attract. Electrostatic energy falls off much less quickly than for van der Waals interactions and may not be negligible even at 30 Å.

Potential Energy Potential Energy is given by the sum of these contributions: Potential Energy is given by the sum of these contributions: Hydrogen bonds are usually supposed to arise by electrostatic interactions but occasionally a small extra term is added. Hydrogen bonds are usually supposed to arise by electrostatic interactions but occasionally a small extra term is added.

Force fields A force field is the description of how potential energy depends on parameters A force field is the description of how potential energy depends on parameters Several force fields are available Several force fields are available AMBER used for proteins and nucleic acids (UCSF) AMBER used for proteins and nucleic acids (UCSF) CHARMM (Harvard) CHARMM (Harvard) … Force fields differ: Force fields differ: in the precise form of the equations in the precise form of the equations in values of the constants for each atom type in values of the constants for each atom type

Obtain Trajectory Start with a initial structure (Ex. Structure from PDB) Start with a initial structure (Ex. Structure from PDB) Assign random starting velocities to the atoms Assign random starting velocities to the atoms Calculating the forces acting on each atom Calculating the forces acting on each atom Bonds, non-bonded (electrostatic and van der Val’s) Bonds, non-bonded (electrostatic and van der Val’s) Numerically integrate Newton’s equations of motion Numerically integrate Newton’s equations of motion Verlet method Verlet method Leapfrog method Leapfrog method After equilibrating the system, record the positions and momentum of the atoms as a function of time After equilibrating the system, record the positions and momentum of the atoms as a function of time

Molecular Dynamics Energy minimization gives local minimum, not necessarily global minimum. Energy minimization gives local minimum, not necessarily global minimum. Give molecule thermal energy so can explore conformational space & overcome energy barriers. Give molecule thermal energy so can explore conformational space & overcome energy barriers. Give atoms initial velocity random value + direction. Scale velocities so total kinetic energy =3/2kT * number atoms Give atoms initial velocity random value + direction. Scale velocities so total kinetic energy =3/2kT * number atoms Solve equation of motion to work out position of atoms at 1 fs. Solve equation of motion to work out position of atoms at 1 fs.