Solving protein structures, molecular mechanics, and docking. Lecture 18, Introduction to Bioinformatics 2006.


Thursday May 4th NO LECTURE But … 13:30 – 15:15 hrs in S329 and S345: PRACTICAL HOMOLOGY SEARCHING

Today’s lecture
1. Experimental techniques for determining protein tertiary structure
2. Molecular motion simulated by molecular mechanics
3. Protein interaction and docking
   i. Ribosome example
   ii. Zdock method

If you throw up a stone, it is Physics. If it lands on your head, it is Biophysics. If you write a computer program, it is Informatics. If there is a bug in it, it is Bioinformatics

Experimentally solving protein structures
Two basic techniques:
1. X-ray crystallography
2. Nuclear Magnetic Resonance (NMR) techniques

1. X-ray crystallography
Purified protein → (crystallization) → crystal → X-ray diffraction → (phase problem) → electron density → 3D structure → biological interpretation

Protein crystals
– Regular arrays of protein molecules
– ‘Wet’: 20-80% solvent
– Few crystal contacts
– Protein crystals contain active protein: enzyme turnover, ligand binding
(Figure: example of crystal packing)

Examples of crystal packing
– β2-Glycoprotein I: ~90% solvent (extremely high!)
– Acetylcholinesterase: ~68% solvent

Problematic proteins (no crystallisation)
– Multiple domains: flexible. Similarly, floppy ends may hamper crystallization: change construct
– Membrane proteins: lipid bilayer, with hydrophilic and hydrophobic regions
– Glycoproteins: flexible and heterogeneous!

Experimental set-up
Components (figure labels): X-ray source, goniometer, liquid N2 gas stream, beam stop, detector
Options for wavelength:
– monochromatic, polychromatic
– variable wavelength

Diffraction image (figure labels): water ring; diffuse scattering (from the fibre loop); reciprocal lattice (in this case hexagonal); beam stop; direct beam; direction of increasing resolution; reflections (h,k,l) with intensity I(h,k,l)

The rules for diffraction: Bragg’s law
Scattered X-rays reinforce each other only when Bragg’s law holds:
$2 d_{hkl} \sin\theta = n\lambda$
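Bragg’s law can be explored numerically. The following is a minimal sketch, not part of the original lecture: the function names are hypothetical, and the wavelength of 1.5418 Å (a typical Cu Kα laboratory source) is an assumed example value.

```python
import math

def bragg_angle_deg(d_spacing, wavelength, order=1):
    """Bragg angle theta (degrees) from 2 d sin(theta) = n lambda."""
    s = order * wavelength / (2.0 * d_spacing)
    if s > 1.0:
        raise ValueError("no diffraction possible: n*lambda exceeds 2d")
    return math.degrees(math.asin(s))

def d_spacing_from_angle(theta_deg, wavelength, order=1):
    """Lattice spacing d (same unit as wavelength) from the Bragg angle."""
    return order * wavelength / (2.0 * math.sin(math.radians(theta_deg)))

# Assumed example wavelength in Angstrom (Cu K-alpha); not from the slides.
WAVELENGTH = 1.5418

for d in (4.0, 3.0, 2.0, 1.0):
    theta = bragg_angle_deg(d, WAVELENGTH)
    print(f"d = {d:.1f} A  ->  theta = {theta:.2f} degrees")
```

Smaller spacings d (higher resolution) diffract at larger angles, which is why resolution increases towards the edge of the diffraction image.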

Phase Problem
– Determining the structure of a molecule in a crystalline sample requires knowing both the amplitude and the phase of the photon wave being diffracted from the sample.
– The X-rays that are emitted start out with dispersed phases, and so the phases get lost.
– Unfortunately, phases contribute more to the information content of an X-ray diffraction pattern than amplitudes do.
– It is common to refer to phaseless X-ray data as having "lost phases".
– Luckily, several ways to recover the lost phases have been developed.

Building a protein model
– Find structural elements: α-helices, β-strands
– Fit amino-acid sequence

Effects of resolution on electron density (series of figures at d = 4 Å, 3 Å, 2 Å and 1 Å). Note: maps calculated with perfect phases.

Refinement process
– Bad phases → poor electron density map → errors in the protein model
– Interpretation of the electron density map → improved model → improved phases → improved map → even better model …
– This is an iterative process of refinement

Validation
– Free R-factor (cross validation)
   – Number of parameters/observations
– Ramachandran plot
– Chemically likely (WhatCheck)
   – Hydrophobic inside, hydrophilic outside
   – Binding sites of ligands, metals, ions
   – Hydrogen bonds satisfied
   – Chemistry in order
– Final B-factor (temperature) values
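The slide does not spell out the R-factor. It is commonly defined as $R = \sum \big| |F_{obs}| - |F_{calc}| \big| / \sum |F_{obs}|$, and the free R-factor is the same quantity computed over a test set of reflections that was excluded from refinement. The sketch below illustrates this; the function name and the amplitude values are hypothetical.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over paired amplitudes."""
    if len(f_obs) != len(f_calc):
        raise ValueError("observed and calculated lists must have equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator

# Hypothetical structure-factor amplitudes, for illustration only.
work_obs, work_calc = [10.0, 20.0, 30.0], [9.0, 22.0, 30.0]   # used in refinement
test_obs, test_calc = [15.0, 25.0], [12.0, 27.0]              # held out (free set)

print("R_work =", r_factor(work_obs, work_calc))
print("R_free =", r_factor(test_obs, test_calc))
```

Because the free set is never used to adjust the model, a free R-factor that is much higher than the working R-factor suggests that the model has been over-fitted (too many parameters for the number of observations).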

2. Nuclear Magnetic Resonance (NMR) 800 MHz NMR spectrometer

Nuclear Magnetic Resonance (NMR)
Pioneered by Richard R. Ernst, who won a Nobel Prize in chemistry in 1991, FT-NMR works by irradiating the sample, held in a static external magnetic field, with a short square pulse of radio-frequency energy containing all the frequencies in a given range of interest. The polarized magnets of the nuclei begin to spin together, creating an observable radio-frequency (RF) signal. Because the signal decays over time, this time-dependent pattern can be converted into a frequency-dependent pattern of nuclear resonances using a mathematical function known as a Fourier transformation, revealing the nuclear magnetic resonance spectrum. The use of pulses of different shapes, frequencies and durations in specifically designed patterns, or pulse sequences, allows the spectroscopist to extract many different types of information about the molecule.
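The conversion from a decaying time-domain signal to a frequency spectrum can be sketched in a few lines. This is an illustration under stated assumptions, not a real NMR processing pipeline: the signal is a single synthetic resonance at 30 Hz with a 0.2 s decay constant (both invented values), and the transform is a naive discrete Fourier transform using only the standard library.

```python
import cmath
import math

def simulate_fid(freq_hz, decay_s, dt, n_points):
    """Synthetic decaying signal: a cosine damped by exp(-t / decay)."""
    return [math.cos(2 * math.pi * freq_hz * i * dt) * math.exp(-i * dt / decay_s)
            for i in range(n_points)]

def dft(signal):
    """Naive O(n^2) discrete Fourier transform (fine for a few hundred points)."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

dt = 0.005          # sampling interval in seconds (assumed)
n_points = 200      # 1 s of signal, so the frequency resolution is 1 Hz
fid = simulate_fid(30.0, 0.2, dt, n_points)

# Only the first half of the spectrum is unique for a real-valued signal.
magnitudes = [abs(c) for c in dft(fid)[: n_points // 2]]
peak_bin = max(range(len(magnitudes)), key=magnitudes.__getitem__)
peak_hz = peak_bin / (n_points * dt)
print("resonance found at", peak_hz, "Hz")
```

The faster the signal decays, the broader the peak in the spectrum becomes.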

Nuclear Magnetic Resonance (NMR)
Time intervals between pulses allow, among other things, magnetization transfer between nuclei and, therefore, the detection of the kinds of nuclear-nuclear interactions that allowed for the magnetization transfer. Interactions that can be detected are usually classified into two kinds: through-bond interactions and through-space interactions. The latter are usually a consequence of the so-called nuclear Overhauser effect (NOE). Experiments of the nuclear-Overhauser variety may establish distances between atoms. These distances are subjected to a technique called Distance Geometry, which normally results in an ensemble of possible structures that are all relatively consistent with the observed distance restraints (NOEs). Richard Ernst and Kurt Wüthrich, in addition to many others, developed 2-dimensional and multidimensional FT-NMR into a powerful technique for the determination of the structure of biopolymers such as proteins or even small nucleic acids. This is used in protein nuclear magnetic resonance spectroscopy. Wüthrich shared the 2002 Nobel Prize in Chemistry for this work.
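The notion of a structure being "consistent with the observed distance restraints" can be made concrete with a small check. This is only a sketch: it does not implement Distance Geometry itself, the restraints are treated as simple upper bounds (an assumption), and the atom names and coordinates are hypothetical.

```python
import math

def restraint_violations(coords, restraints, tolerance=0.0):
    """Return (atom_i, atom_j, excess) for every upper-bound restraint exceeded.

    coords:     dict mapping atom name -> (x, y, z)
    restraints: list of (atom_i, atom_j, upper_bound) tuples
    """
    violations = []
    for atom_i, atom_j, upper in restraints:
        d = math.dist(coords[atom_i], coords[atom_j])  # requires Python 3.8+
        if d > upper + tolerance:
            violations.append((atom_i, atom_j, d - upper))
    return violations

# Hypothetical proton coordinates (Angstrom) and NOE-style upper bounds.
coords = {"HA_5": (0.0, 0.0, 0.0), "HN_6": (3.0, 4.0, 0.0), "HB_9": (0.0, 0.0, 2.5)}
restraints = [("HA_5", "HN_6", 4.0), ("HA_5", "HB_9", 3.0)]

for atom_i, atom_j, excess in restraint_violations(coords, restraints):
    print(f"{atom_i} - {atom_j} violated by {excess:.2f} A")
```

A candidate structure with few and small violations would be kept in the ensemble; one with many large violations would be discarded.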

2D NOESY spectrum (figure). Peptide sequence (N-terminal NH not observed): Arg-Gly-Asp-Val-Asn-Ser-Leu-Phe-Asp-Thr-Gly. Peaks labelled in the spectrum: Gly, Asp, Asn, Asp, Phe, Thr, Ser, Leu, Val.

NMR structure determination: hen lysozyme
– 129 residues
   – ~1000 heavy atoms
   – ~800 protons
– NMR data set
   – 1632 distance restraints
   – 110 torsion restraints
   – 60 H-bond restraints
– 80 structures calculated
– 30 low-energy structures used
(Figure: total energy plotted against structure number)

Solution Structure Ensemble
Disorder in the NMR ensemble:
– lack of data?
– or protein dynamics?

Problems with NMR
– Protein concentration in the sample needs to be high (multimilligram samples)
– Restricted to smaller proteins (although magnets get stronger)
– Uncertainties in NOEs introduced by internal motions in molecules (preceding slide)

Molecular motions Proteins are very dynamic systems Protein folding Protein structure Protein function (e.g. opening and closing of oxygen binding site in hemoglobin)

X-ray and NMR summary
– Both are experimental techniques for solving protein structures (although they both need a lot of computation)
– Nowadays both typically include many refinement and energy-minimisation steps to optimise the structure (next topic)

Protein motion Principles Simulation –MD –MC

The Ramachandran plot Allowed phi-psi angles Red areas are preferred, yellow areas are allowed, and white is avoided
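The phi and psi angles plotted in a Ramachandran plot are backbone dihedral (torsion) angles: phi is defined by the atoms C(i-1), N(i), CA(i), C(i) and psi by N(i), CA(i), C(i), N(i+1). A minimal sketch of computing a dihedral angle from four points follows; the function name and the test coordinates are illustrative assumptions.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def dihedral_deg(p0, p1, p2, p3):
    """Dihedral angle (degrees, range -180..180) about the p1-p2 bond."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)   # normals of the two planes
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Illustrative coordinates: a planar trans arrangement and a cis arrangement.
print(dihedral_deg((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, -1, 0)))  # trans
print(dihedral_deg((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 1, 0)))   # cis
```

Applying such a function to every residue of a structure yields the (phi, psi) pairs that are then compared with the preferred, allowed and avoided regions of the plot.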

Molecular mechanics techniques Two basic techniques: Molecular Dynamics (MD) simulations Monte Carlo (MC) techniques

Molecular Dynamics (MD) simulation
MD simulation can be used to study protein motions. It is often used to refine experimentally determined protein structures. It is generally not used to predict structure from sequence or to model the protein folding pathway. MD simulation can fold extended sequences to `global' potential energy minima for very small systems (peptides of length ten, or so, in vacuum), but it is most commonly used to simulate the dynamics of known structures.
Principle: an initial velocity is assigned to each atom, and Newton's laws are applied at the atomic level to propagate the system's motion through time.
MD simulation incorporates a notion of time.

Symbols: q = coordinates, p = momentum, K = kinetic energy, V = potential energy

Molecular Dynamics
Knowledge of the atomic forces and masses can be used to solve for the position of each atom along a series of extremely small time steps (on the order of femtoseconds; 1 fs = $10^{-15}$ seconds). The resulting series of snapshots of structural changes over time is called a trajectory. The use of this method to compute trajectories can be more easily seen when Newton's equation is expressed in the following form:
$F_i = -\partial V / \partial r_i = m_i \, \mathrm{d}^2 r_i / \mathrm{d}t^2$
The "leapfrog" method is a common numerical approach to calculating trajectories based on Newton's equation. This method gets its name from the way in which positions (r) and velocities (v) are calculated in an alternating sequence, `leaping' past each other in time. The steps use the velocity $v_i = \mathrm{d}r_i/\mathrm{d}t$ and the acceleration $a_i = \mathrm{d}^2 r_i/\mathrm{d}t^2$ of each atom.
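The alternating update of positions and velocities can be sketched as follows. This is a minimal one-dimensional illustration, not the lecture's own code: the test system is a harmonic oscillator with unit mass and unit force constant in arbitrary units (all assumptions), and the function names are hypothetical.

```python
import math

def leapfrog(force, mass, r0, v0, dt, n_steps):
    """Leapfrog integration: velocities live at half steps, positions at full steps."""
    r = r0
    # Initial half-step "kick" so that the velocity is defined at t = dt/2.
    v_half = v0 + 0.5 * dt * force(r) / mass
    trajectory = [r]
    for _ in range(n_steps):
        r = r + dt * v_half                       # position leaps over the velocity
        v_half = v_half + dt * force(r) / mass    # velocity leaps over the position
        trajectory.append(r)
    return trajectory

def spring_force(r, k=1.0):
    """Force of a harmonic spring, F = -dV/dr with V = k r^2 / 2."""
    return -k * r

# Arbitrary units: mass 1, force constant 1, so the exact solution is r(t) = cos(t).
traj = leapfrog(spring_force, mass=1.0, r0=1.0, v0=0.0, dt=0.01, n_steps=628)
print("simulated r(6.28) =", traj[-1], " exact =", math.cos(6.28))
```

In a real simulation r and v would be vectors for every atom and the force would come from the force field, but the update pattern is the same.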

Force field
The potential energy of a system can be expressed as a sum of valence (or bond), crossterm, and nonbond interactions:
$E_{total} = E_{valence} + E_{crossterm} + E_{nonbond}$
The energy of valence interactions comprises bond stretching ($E_{bond}$), valence angle bending ($E_{angle}$), dihedral angle torsion ($E_{torsion}$), and inversion (also called out-of-plane interactions) ($E_{inversion}$ or $E_{oop}$) terms, which are part of nearly all force fields for covalent systems. A Urey-Bradley term ($E_{UB}$) may be used to account for interactions between atom pairs involved in 1-3 configurations (i.e., atoms bound to a common atom):
$E_{valence} = E_{bond} + E_{angle} + E_{torsion} + E_{oop} + E_{UB}$
Modern (second-generation) force fields include cross terms to account for such factors as bond or angle distortions caused by nearby atoms. Cross terms can include the following: stretch-stretch, stretch-bend-stretch, bend-bend, torsion-stretch, torsion-bend-bend, bend-torsion-bend, stretch-torsion-stretch.
The energy of interactions between nonbonded atoms is accounted for by van der Waals ($E_{vdW}$), electrostatic ($E_{Coulomb}$), and (in some older force fields) hydrogen bond ($E_{hbond}$) terms:
$E_{nonbond} = E_{vdW} + E_{Coulomb} + E_{hbond}$

Force field

Van der Waals forces
$f(r) = a/r^{12} - b/r^{6}$ (figure: energy plotted against distance)
The Lennard-Jones potential is mildly attractive as two uncharged molecules or atoms approach one another from a distance, but strongly repulsive when they approach too closely. The resulting potential is shown (in pink) in the figure. At equilibrium, the pair of atoms or molecules tends toward a separation corresponding to the minimum of the Lennard-Jones potential (a separation of 0.38 nanometers for the case shown in the figure).
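Setting the derivative of $a/r^{12} - b/r^{6}$ to zero gives the position of the minimum, $r_{min} = (2a/b)^{1/6}$, with depth $-b^2/(4a)$. The sketch below checks this numerically. The coefficients are chosen so that the minimum falls at the 0.38 nm quoted on the slide; the well depth of 1 (arbitrary energy units) is an assumption, as are the function names.

```python
def lennard_jones(r, a, b):
    """Lennard-Jones potential a/r^12 - b/r^6 (repulsive minus attractive term)."""
    return a / r**12 - b / r**6

def lj_minimum(a, b):
    """Separation and energy at the minimum, from d/dr (a/r^12 - b/r^6) = 0."""
    r_min = (2.0 * a / b) ** (1.0 / 6.0)
    return r_min, -b * b / (4.0 * a)

# Coefficients built from an assumed well depth and the 0.38 nm separation.
R_MIN_NM = 0.38
WELL_DEPTH = 1.0                       # arbitrary energy units (assumption)
a = WELL_DEPTH * R_MIN_NM**12
b = 2.0 * WELL_DEPTH * R_MIN_NM**6

r_min, e_min = lj_minimum(a, b)
print("minimum at r =", r_min, "nm with energy", e_min)
for r in (0.30, 0.38, 0.50, 1.00):
    print(f"r = {r:.2f} nm  ->  energy = {lennard_jones(r, a, b):.4f}")
```

The printed values show the behaviour described above: strongly positive energy at short range, the most negative energy at 0.38 nm, and a weak attraction that fades at larger separations.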

Thermal bath

Figure: Snapshots of ubiquitin pulling with constant velocity at three different time steps.

Monte Carlo (MC) simulation
"Monte Carlo simulation" is a term for a general class of optimization methods that use randomization. The general idea is, given the current configuration and some figure of merit (e.g., the energy of the folded configuration), to generate a new configuration at random (or semi-random):
– If the energy of the new configuration is lower than that of the old configuration, always accept it as the next configuration;
– if it is higher, accept or reject it with some probability that depends on how much larger the new energy is than the old energy.
With $\Delta E = E(\text{new}) - E(\text{old})$: if $\Delta E < 0$ then accept; else if random[0, 1] < $e^{-\Delta E/kT}$ then accept; else reject.
Boltzmann probability of conformation c: $P(c) \propto e^{-E(c)/kT}$
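The acceptance rule above can be written out directly. The following is a minimal sketch under stated assumptions: the "conformation" is a single number x, the energy function is an invented double well, and the temperature, step size and function names are illustrative choices rather than anything from the lecture.

```python
import math
import random

def metropolis_accept(delta_e, kT, rng=random):
    """Accept downhill moves always, uphill moves with probability exp(-dE/kT)."""
    if delta_e < 0:
        return True
    return rng.random() < math.exp(-delta_e / kT)

def double_well(x):
    """Hypothetical energy: local minimum near x = +1, global minimum near x = -1."""
    return (x * x - 1.0) ** 2 + 0.3 * x

def monte_carlo(energy, x0, kT, step, n_steps, seed=0):
    """Random-walk search; returns the lowest-energy configuration visited."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for _ in range(n_steps):
        x_new = x + rng.uniform(-step, step)       # random trial move
        e_new = energy(x_new)
        if metropolis_accept(e_new - e, kT, rng):
            x, e = x_new, e_new
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e

# Start in the local minimum and let the search run.
best_x, best_e = monte_carlo(double_well, x0=1.0, kT=0.3, step=0.5, n_steps=5000)
print("lowest energy visited:", best_e, "at x =", best_x)
```

A higher kT makes uphill moves more likely to be accepted, so the walk crosses barriers more easily but also settles less precisely.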

Monte Carlo (MC) simulation
The idea is that by always accepting a better configuration, on average the system will tend to move toward a (local) energy minimum, while conversely, by sometimes accepting worse configurations, the system will be able to "climb" out of a sub-optimal local minimum, and perhaps fall into the basin of attraction of the global minimum. The specific algorithms for probabilistically generating and accepting new configurations define the type of "Monte Carlo" algorithm; some common methods are "Metropolis," "Gibbs Sampler," "Heat Bath," "Simulated Annealing," "Great Deluge," etc.
– MC techniques are computationally more efficient than MD
– MC simulations do not incorporate a notion of time!
(Figure: energy E plotted against configuration space (models), showing a local minimum and the global minimum)

#! /usr/bin/perl
#===============================================================================
#
# $Id: mcdemo.pl,v /03/12 16:13:28 jkleinj Exp $
#
# mcdemo: Demo program for MC simulation of the number pi
#
# (C) 2003 Jens Kleinjung
#
# Dr Jens Kleinjung, Room P440 |
# Bioinformatics Unit, Faculty of Sciences | Tel
# Free University Amsterdam | Fax
# De Boelelaan 1081A, 1081 HV Amsterdam |
#
#===============================================================================

# preset parameters (counters start at zero so the estimate is not biased)
$hits = 0;
$miss = 0;

for ($i=0; $i<100000; $i++) {
    # assign random x,y coordinates in the unit square
    $x = rand;
    $y = rand;
    # calculate radius (distance from the origin)
    $r = sqrt(($x*$x)+($y*$y));
    # sum up hits (inside the quarter circle) and misses
    if ($r <= 1) { $hits++; } else { $miss++; }
    # calculate pi: the fraction of hits approximates pi/4
    $pi = (4*$hits)/($hits + $miss);
    # print pi every 100 iterations
    if ($i%100 == 0) { print("$i $pi\n"); }
}
#===============================================================================

In many conformational search methods based on Monte Carlo (MC), after a MC move, the system is energy minimised, i.e. put in the lowest local energy conformation, for example by gradient descent (steepest descent).
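The minimisation step mentioned here can be sketched as a plain steepest-descent loop. This is an illustration only: the one-dimensional double-well energy, the fixed step size and the function names are assumptions, and real minimisers use the gradient over all atomic coordinates.

```python
def energy(x):
    """Hypothetical one-dimensional energy with two minima."""
    return (x * x - 1.0) ** 2 + 0.3 * x

def gradient(x):
    """Analytical derivative of the energy above."""
    return 4.0 * x * (x * x - 1.0) + 0.3

def steepest_descent(grad, x0, step=0.01, tol=1e-8, max_iter=100000):
    """Walk downhill along the negative gradient until it (nearly) vanishes."""
    x = x0
    for _ in range(max_iter):
        g = grad(x)
        if abs(g) < tol:
            break
        x -= step * g
    return x

# Minimise a configuration produced by a (hypothetical) MC move at x = 0.8.
x_min = steepest_descent(gradient, x0=0.8)
print("local minimum at x =", x_min, "with energy", energy(x_min))
```

Steepest descent only reaches the nearest local minimum, which is why it is combined with random MC moves that can carry the system into other basins.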

Take home messages
Experimentally determining protein structures
– X-ray diffraction
   – From crystallised protein sample to electron density map
   – Structure descriptors: resolution, R-factor
– Nuclear magnetic resonance (NMR)
   – Based on atomic nuclear spin
   – Produces a set of distances between residues (distance restraints)
   – Distances are used to build a protein model using Distance Geometry
Protein dynamics simulation
– Molecular dynamics
   – Follows Newton’s equations of motion
   – Simulates molecular movements through time
   – Very small time steps (2 femtoseconds)
Protein conformational search
– Monte Carlo
   – Conformations are randomly changed
   – Uses the Metropolis criterion to decide between conformation i and i+1, based on conformational internal energy and the Boltzmann equation
   – Has no notion of time; it is a conformational search protocol