Solving NMR Structures II: Calculation and evaluation What NMR-based (solution) structures look like the NMR ensemble inclusion of hydrogen coordinates.

Presentation transcript:

Solving NMR Structures II: Calculation and evaluation
What NMR-based (solution) structures look like: the NMR ensemble, inclusion of hydrogen coordinates.
Methods for calculating structures: distance geometry, restrained molecular dynamics, simulated annealing.
Evaluating the quality of NMR structures: resolution, stereochemical quality, restraint violations, etc.

NMR data do not uniquely define a 3D protein structure (a single set of coordinates)
Restraints are ranges of allowed distances, angles, etc. rather than single values, reflecting the fact that the experimental data contain uncertainties in both measurement and interpretation.
Only a limited number of the possible restraints are observable experimentally, due to peak overlap/chemical shift degeneracy, lack of stereospecific assignments, etc.
The view of a protein structure as a single set of atomic coordinates may itself be physically unrealistic: proteins are dynamic molecules.

The NMR Ensemble
NMR methods do not calculate a single structure, but rather repeat a structure calculation many times to generate an ensemble of structures.
The structure calculations are designed to thoroughly explore all regions of conformational space that satisfy the experimentally derived restraints.
At the same time, they often impose some physical reasonableness on the system, such as bond angles, distances and proper stereochemistry.
The ideal result is an ensemble which
A. satisfies all the experimental restraints (minimizes violations)
B. at the same time accurately represents the full permissible conformational space under the restraints (maximizes RMSD between ensemble members)
C. looks like a real protein

The NMR Ensemble
At right, an ensemble of 25 structures for Syrian hamster prion protein (only the backbone is shown). Liu et al. Biochemistry (1999) 38.
The fact that NMR structures are reported as ensembles gives them a “fuzzy” appearance, which is both informative and sometimes annoying.

NMR structures include hydrogen coordinates
X-ray structures do not generally include hydrogen atoms in atomic coordinate files, because the heavy atoms dominate the diffraction pattern and the hydrogen atoms are not explicitly seen.
By contrast, NMR restraints such as NOE distance restraints and hydrogen bond restraints often explicitly include the positions of hydrogen atoms. Therefore, these positions are reported in the PDB coordinate files.

Methods for structure calculation
– distance geometry (DG)
– restrained molecular dynamics (rMD)
– simulated annealing (SA)
– hybrid methods

Starting points for calculations
To get the most unbiased, representative ensemble, it is wise to start the calculations from a set of randomly generated starting structures.
Alternatively, in some methods the same initial structure is used for each trial structure calculation, but the calculation trajectory is pushed in a different initial direction each time using a random-number generator.

DG: Distance geometry
In distance geometry, one uses the NOE-derived distance restraints to generate a distance matrix, which one then uses as a guide in calculating a structure.
Structures calculated from distance geometry will have the correct overall fold but usually have poor local geometry (e.g. improper bond angles and distances). Hence distance geometry must be combined with some extensive energy minimization method to generate physically reasonable structures.

rMD: Restrained molecular dynamics
Molecular dynamics involves computing the potential energy V as a function of the atomic coordinates. Usually this is defined as the sum of a number of terms:
$$V_{\mathrm{total}} = V_{\mathrm{bond}} + V_{\mathrm{angle}} + V_{\mathrm{dihedr}} + V_{\mathrm{vdW}} + V_{\mathrm{coulomb}} + V_{\mathrm{NMR}}$$
The first five terms here are “real” energy terms corresponding to such forces as van der Waals and electrostatic repulsions and attractions, and the cost of deforming bond lengths and angles. These come from a standard molecular force field such as CHARMM or AMBER.
The NMR restraints are incorporated into the $V_{\mathrm{NMR}}$ term, which is a “pseudoenergy” or “pseudopotential” term included to represent the cost of violating the restraints.

Pseudo-energy potentials for rMD
Generate “fake” energy potentials representing the cost of violating the distance or angle restraints. Here’s an example of a distance restraint potential:
$$V_{\mathrm{NOE}} = \begin{cases} K_{\mathrm{NOE}}\,(r_{ij} - r_{ij}^{l})^{2} & \text{if } r_{ij} < r_{ij}^{l} \\ 0 & \text{if } r_{ij}^{l} \le r_{ij} \le r_{ij}^{u} \\ K_{\mathrm{NOE}}\,(r_{ij} - r_{ij}^{u})^{2} & \text{if } r_{ij} > r_{ij}^{u} \end{cases}$$
where $r_{ij}^{l}$ and $r_{ij}^{u}$ are the lower and upper bounds of our distance restraint, and $K_{\mathrm{NOE}}$ is some chosen force constant, typically ~250 kcal mol$^{-1}$ nm$^{-2}$.
So it’s somewhat permissible to violate restraints, but doing so raises V.
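
The flat-bottomed potential described on this slide can be sketched in a few lines of Python. This is only an illustration, not the code of any particular structure calculation package: the function names and the dictionary layout are assumptions, and distances are taken to be in nm so that they match the units of the quoted force constant.

```python
def noe_restraint_energy(r, lower, upper, k_noe=250.0):
    """Flat-bottomed harmonic pseudo-energy for one distance restraint.

    r, lower and upper are distances in nm (an assumption made here so
    that they match k_noe, given in kcal mol^-1 nm^-2).
    """
    if r < lower:
        return k_noe * (r - lower) ** 2
    if r > upper:
        return k_noe * (r - upper) ** 2
    return 0.0  # inside the allowed range: no penalty


def total_noe_energy(distances, restraints, k_noe=250.0):
    """Sum the pseudo-energy over all restraints.

    distances:  dict mapping an atom pair to its current distance
    restraints: dict mapping the same pair to (lower, upper) bounds
    Both layouts are hypothetical, chosen only for this sketch.
    """
    return sum(
        noe_restraint_energy(distances[pair], lo, hi, k_noe)
        for pair, (lo, hi) in restraints.items()
    )
```

A distance inside the bounds contributes nothing, while the penalty grows quadratically with the size of the violation on either side.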

Example of NOE pseudopotential
[Figure: $V_{\mathrm{NOE}}$ plotted against the distance; it is 0 between $r_{ij}^{l}$ and $r_{ij}^{u}$.]
The potential rises steeply with the degree of violation.

SA: Simulated annealing
SA is essentially a special implementation of rMD and uses similar potentials, but it raises the temperature of the system and then cools it slowly so as not to get trapped in local energy minima.
SA is very efficient at locating the global minimum of the target function.
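
The idea of heating and then slowly cooling can be illustrated with a toy example. Note that this sketch swaps in a Metropolis Monte Carlo form of simulated annealing on a one-dimensional function, not the molecular dynamics based protocol used for NMR structures. The target function, the cooling schedule and all parameter values are hypothetical.

```python
import math
import random


def toy_energy(x):
    """Hypothetical 1-D target function with several local minima."""
    return 0.1 * x * x + math.sin(3.0 * x)


def anneal(energy, x0, t_start=5.0, t_end=0.01, cooling=0.99,
           step=0.5, seed=0):
    """Metropolis simulated annealing with a geometric cooling schedule."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    t = t_start
    while t > t_end:
        x_new = x + rng.uniform(-step, step)
        e_new = energy(x_new)
        # Always accept downhill moves; accept uphill moves with a
        # Boltzmann probability that shrinks as the temperature drops.
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / t):
            x, e = x_new, e_new
            if e < best_e:
                best_x, best_e = x, e
        t *= cooling  # slow cooling
    return best_x, best_e


best_x, best_e = anneal(toy_energy, x0=4.0)
```

At high temperature, uphill moves are accepted often, so the search can leave a local minimum. As the temperature falls, the search settles into whichever low-energy region it has reached.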

Dealing with ambiguous restraints
It is often not possible to tell which atoms are involved in a NOESY crosspeak, either because of a lack of stereospecific assignments or because multiple protons have the same chemical shift.
Sometimes an ambiguous restraint is included but is expressed ambiguously in the restraint file, e.g. 3 HA --> 6 HB#, where the # wildcard indicates that the beta protons of residue 6 are not stereospecifically assigned. This is quite commonly done for stereochemical ambiguities.
It is also possible to leave ambiguous restraints out and then try to resolve them iteratively using multiple cycles of calculation. This is often done for restraints that involve more complicated ambiguities, e.g. 3 HA --> 10 HN, 43 HN, or 57 HN, where three amides all have the same shift.
One can also make stereospecific assignments iteratively using what are called floating chirality methods.

Example of resolving an ambiguity during structure calculation
[Figure: atoms A, B and C, with resonances at 9.52 ppm and 4.34 ppm; ranges of interatomic distances observed in the trial ensemble: 9-11 Å and 3-4 Å.]
Due to resonance overlap between atoms B and C, an NOE crosspeak between 9.52 ppm and 4.34 ppm could be A to C or A to B, so this restraint is ambiguous.
But if an ensemble generated with this ambiguous restraint left out shows that A is never close to B, then the restraint must be A to C.
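
The reasoning in this example can be sketched as a simple filter over the trial ensemble. This is a hypothetical illustration only. The function name, the 5 Å cutoff for an observable NOE, and the individual distance values are assumptions, with the values chosen to fall within the 9-11 Å and 3-4 Å ranges and with the longer range assumed to be the A to B distance.

```python
def resolve_ambiguous_restraint(candidate_distances, upper_bound):
    """Return the candidate partners compatible with the NOE.

    candidate_distances: dict mapping each candidate atom to the list of
        distances (in Å) from atom A observed across the trial ensemble
    upper_bound: largest distance (in Å) at which an NOE is expected
        (an assumed cutoff, not a value from the slides)
    """
    return [
        atom
        for atom, distances in candidate_distances.items()
        if min(distances) <= upper_bound
    ]


trial = {
    "B": [9.0, 10.2, 11.0],  # assumed A-B distances in the trial ensemble
    "C": [3.0, 3.4, 4.0],    # assumed A-C distances in the trial ensemble
}
print(resolve_ambiguous_restraint(trial, upper_bound=5.0))  # prints ['C']
```

If exactly one candidate survives, the restraint can be assigned to it and added to the next round of calculation. If several survive, the restraint stays ambiguous.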

Iterative structure calculation with assignment of ambiguous restraints
There are programs, such as ARIA, with automatic routines for iterative assignment of ambiguous restraints.
Start with some set of unambiguous NOEs and calculate an ensemble.
The key to success is to make absolutely sure the restraints you start with are right!

Acceptance criteria: choosing structures for an ensemble
It is typical to generate 50 or more trial structures, but not all will converge to a final structure that is physically reasonable or consistent with the experimentally derived NMR restraints. We want to throw such structures away rather than include them in our reported ensemble.
These are typical acceptance criteria for including calculated structures in the ensemble:
– no more than 1 NOE distance restraint violation greater than 0.4 Å
– no dihedral angle restraint violations greater than 5°
– no gross violations of reasonable molecular geometry
Sometimes structures are rejected on other grounds as well:
– too many residues with backbone angles in disfavored regions of Ramachandran space
– too high a final potential energy in the rMD calculation
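
The first two acceptance criteria are easy to express as a filter. The following is a minimal sketch, assuming that the violations for each trial structure have already been computed and that the dihedral cutoff is in degrees. The function name and argument layout are hypothetical, and the geometry, Ramachandran and energy checks are left out.

```python
def accept_structure(noe_violations, dihedral_violations,
                     noe_cutoff=0.4, max_noe_over_cutoff=1,
                     dihedral_cutoff=5.0):
    """Apply the restraint-violation criteria to one trial structure.

    noe_violations:      violation sizes in Å, one per distance restraint
    dihedral_violations: violation sizes in degrees, one per angle restraint
    """
    n_large_noe = sum(1 for v in noe_violations if v > noe_cutoff)
    any_large_dihedral = any(v > dihedral_cutoff for v in dihedral_violations)
    return n_large_noe <= max_noe_over_cutoff and not any_large_dihedral


# Keep only the trial structures that pass (values are made up).
trials = {
    "model_01": ([0.1, 0.5], [2.0]),   # one NOE violation > 0.4 Å: accepted
    "model_02": ([0.5, 0.6], [2.0]),   # two NOE violations > 0.4 Å: rejected
    "model_03": ([0.1], [6.0]),        # dihedral violation > 5 degrees: rejected
}
accepted = [name for name, (noe, dih) in trials.items()
            if accept_structure(noe, dih)]
```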

Precision of NMR Structures (Resolution)
Precision is judged by the RMSD of the superimposed ensemble of accepted structures.
RMSDs for both backbone (Cα, N, C of C=O) and all heavy atoms (i.e. everything except hydrogen) are typically reported, e.g. backbone 0.6 Å, heavy atoms 1.4 Å.
Sometimes only the more ordered regions are included in the reported RMSD, e.g. for a 58-residue protein you will see RMSD (residues 5-58) if residues 1-4 are completely disordered.

Reporting ensemble RMSD
There are two major ways of calculating the RMSD of the ensemble:
– pairwise: compute RMSDs for all possible pairs of structures in the ensemble, and calculate the mean of these RMSDs
– from mean: calculate a mean structure from the ensemble and measure the RMSD of each ensemble structure from it, then calculate the mean of these RMSDs
Pairwise will generally give a slightly higher number, so be aware that these two ways of reporting RMSD are not completely equal. Usually the Materials and Methods, or a footnote somewhere in the paper, will indicate which is being used.
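
The two conventions can be compared directly in code. This sketch assumes the structures have already been superimposed and that each is given as a list of (x, y, z) tuples with atoms in the same order. The function names are hypothetical, and no fitting step is included.

```python
import math
from itertools import combinations


def rmsd(a, b):
    """RMSD between two pre-superimposed structures of equal length."""
    sq = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(sq / len(a))


def mean_structure(ensemble):
    """Average the coordinates of each atom over the ensemble."""
    n = len(ensemble)
    return [
        tuple(sum(s[i][k] for s in ensemble) / n for k in range(3))
        for i in range(len(ensemble[0]))
    ]


def pairwise_rmsd(ensemble):
    """Mean of the RMSDs over all pairs of structures."""
    pairs = list(combinations(ensemble, 2))
    return sum(rmsd(a, b) for a, b in pairs) / len(pairs)


def rmsd_from_mean(ensemble):
    """Mean of the RMSDs of each structure from the mean structure."""
    mean = mean_structure(ensemble)
    return sum(rmsd(s, mean) for s in ensemble) / len(ensemble)


# Two one-atom "structures" 2 Å apart (made-up coordinates).
ensemble = [[(0.0, 0.0, 0.0)], [(2.0, 0.0, 0.0)]]
print(pairwise_rmsd(ensemble))   # prints 2.0
print(rmsd_from_mean(ensemble))  # prints 1.0
```

In this small case the pairwise value is larger than the value measured from the mean, in line with the note that the two numbers are not interchangeable.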

“Minimized average” structure
A minimized average is just that: a mean structure is calculated from the ensemble and then subjected to energy minimization to restore reasonable geometry, which is often lost in the calculation of a mean.
This is NMR’s way of generating a single representative structure from the data. It is much easier to visualize structural features from a minimized average than from the ensemble.
For highly disordered regions a minimized average will not be informative and may even be misleading. Such regions are sometimes left out of the minimized average.
Sometimes when an NMR structure is deposited in the PDB, there will be separate entries for both the ensemble and the minimized average. It is nice when people do this.
Alternatively, a member of the ensemble may be identified which is considered the most representative (often the one closest to the mean).

How many restraints do we need to get a high-resolution NMR structure?
Usually ~15-20 NOE distance restraints per residue, but the total number is not as important as how many long-range restraints you have, meaning long-range in the sequence: |i-j| > 5, where i and j are the two residues involved.
Good NMR structures usually have ≥ ~3.5 long-range distance restraints per residue in the structured regions.
To get a very good quality structure, it is usually also necessary to have some stereospecific assignments, e.g. β hydrogens; Leu, Val methyls.
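
Counting long-range restraints per residue is a one-line calculation once the restraints are reduced to residue-number pairs. A minimal sketch follows. The function name and the input layout are assumptions, and the residue pairs are made up.

```python
def long_range_per_residue(restraints, n_residues, min_separation=5):
    """Number of long-range restraints per residue.

    restraints: iterable of (i, j) residue-number pairs, one per restraint
    n_residues: number of residues in the structured region
    A restraint counts as long-range when |i - j| > min_separation.
    """
    n_long = sum(1 for i, j in restraints if abs(i - j) > min_separation)
    return n_long / n_residues


# Hypothetical restraint list for a 4-residue stretch.
pairs = [(1, 2), (3, 20), (5, 11), (7, 12)]
print(long_range_per_residue(pairs, n_residues=4))  # prints 0.5
```

Here (3, 20) and (5, 11) count as long-range, while (7, 12) does not, because the separation must be strictly greater than 5.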

Assessing Structure Quality
NMR spectroscopists usually run their ensemble through the program PROCHECK-NMR to assess its quality.
A high-resolution structure will have backbone RMSD ≤ ~0.8 Å and heavy atom RMSD ≤ ~1.5 Å.
It will have low RMS deviation from restraints (good agreement with restraints).
It will have good stereochemical quality:
– ideally >90% of residues in core (most favorable) regions of the Ramachandran plot
– very few “unusual” side chain angles and rotamers (as judged by those commonly found in crystal structures)
– low deviations from idealized covalent geometry

Structural Statistics Tables
– list of restraints, number and type
– precision of structure (RMSD)
– agreement of ensemble structures with restraints (RMS)
– calculated energies
– sometimes also listings of Ramachandran statistics, deviations from ideal covalent geometry, etc.