SMA5233 Particle Methods and Molecular Dynamics Lecture 5: Applications in Biomolecular Simulation and Drug Design A/P Chen Yu Zong Tel: 6516-6877 Email:

Slides:



Advertisements
Similar presentations
Simulazione di Biomolecole: metodi e applicazioni giorgio colombo
Advertisements

SMA5233 Particle Methods and Molecular Dynamics Lecture 3: Force Fields A/P Chen Yu Zong Tel:
Computational methods in molecular biophysics (examples of solving real biological problems) EXAMPLE I: THE PROTEIN FOLDING PROBLEM Alexey Onufriev, Virginia.
Ion Solvation Thermodynamics from Simulation with a Polarizable Force Field Gaurav Chopra 07 February 2005 CS 379 A Alan GrossfeildPengyu Ren Jay W. Ponder.
Incorporating Solvent Effects Into Molecular Dynamics: Potentials of Mean Force (PMF) and Stochastic Dynamics Eva ZurekSection 6.8 of M.M.
Protein Threading Zhanggroup Overview Background protein structure protein folding and designability Protein threading Current limitations.
Case Studies Class 5. Computational Chemistry Structure of molecules and their reactivities Two major areas –molecular mechanics –electronic structure.
Lecture 3 – 4. October 2010 Molecular force field 1.
A COMPLEX NETWORK APPROACH TO FOLLOWING THE PATH OF ENERGY IN PROTEIN CONFORMATIONAL CHANGES Del Jackson CS 790G Complex Networks
Amino Acid and Protein1. 2  The formation of a peptide bond between glycine and alanine is shown in Figure 5.8. The product is called dipeptide, the.
Evaluation of Fast Electrostatics Algorithms Alice N. Ko and Jesús A. Izaguirre with Thierry Matthey Department of Computer Science and Engineering University.
Molecular Dynamics Molecular dynamics Some random notes on molecular dynamics simulations Seminar based on work by Bert de Groot and many anonymous Googelable.
Protein Tertiary Structure Prediction. Protein Structure Prediction & Alignment Protein structure Secondary structure Tertiary structure Structure prediction.
Protein Primer. Outline n Protein representations n Structure of Proteins Structure of Proteins –Primary: amino acid sequence –Secondary:  -helices &
. Protein Structure Prediction [Based on Structural Bioinformatics, section VII]
An Integrated Approach to Protein-Protein Docking
Molecular modelling / structure prediction (A computational approach to protein structure) Today: Why bother about proteins/prediction Concepts of molecular.
Molecular Mechanics, Molecular Dynamics, and Docking Michael Strong, PhD National Jewish Health 11/23/2010.
Stochastic Roadmap Simulation: An Efficient Representation and Algorithm for Analyzing Molecular Motion Mehmet Serkan Apaydin, Douglas L. Brutlag, Carlos.
LSM2104/CZ2251 Essential Bioinformatics and Biocomputing Essential Bioinformatics and Biocomputing Protein Structure and Visualization (3) Chen Yu Zong.
Coarse grained simulations of p7 folding
Inverse Kinematics for Molecular World Sadia Malik April 18, 2002 CS 395T U.T. Austin.
Bioinf. Data Analysis & Tools Molecular Simulations & Sampling Techniques117 Jan 2006 Bioinformatics Data Analysis & Tools Molecular simulations & sampling.
Chemistry in Biology.
Molecular Mechanics, Molecular Dynamics, and Docking
Ana Damjanovic (JHU, NIH) JHU: Petar Maksimovic Bertrand Garcia-Moreno NIH: Tim Miller Bernard Brooks OSG: Torre Wenaus and team.
Algorithms and Software for Large-Scale Simulation of Reactive Systems _______________________________ Ananth Grama Coordinated Systems Lab Purdue University.
02/03/10 CSCE 769 Dihedral Angles Homayoun Valafar Department of Computer Science and Engineering, USC.
Introduction to Protein Folding and Molecular Simulation Background of protein folding Molecular Dynamics (MD) Brownian Dynamics (BD) September, 2006 Tokyo.
 Four levels of protein structure  Linear  Sub-Structure  3D Structure  Complex Structure.
Statistical Physics of the Transition State Ensemble in Protein Folding Alfonso Ramon Lam Ng, Jose M. Borreguero, Feng Ding, Sergey V. Buldyrev, Eugene.
Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding I Prof. Corey O’Hern Department of Mechanical Engineering & Materials.
CZ5225 Methods in Computational Biology Lecture 4-5: Protein Structure and Structural Modeling Prof. Chen Yu Zong Tel:
Scheduling Many-Body Short Range MD Simulations on a Cluster of Workstations and Custom VLSI Hardware Sumanth J.V, David R. Swanson and Hong Jiang University.
BL5203 Molecular Recognition & Interaction Section D: Molecular Modeling. Chen Yu Zong Department of Computational Science National University of Singapore.
E-science grid facility for Europe and Latin America E2GRIS1 André A. S. T. Ribeiro – UFRJ (Brazil) Itacuruça (Brazil), 2-15 November 2008.
Department of Mechanical Engineering
Molecular Dynamics Simulation
NIH Resource for Biomolecular Modeling and Bioinformatics Beckman Institute, UIUC NAMD Development Goals L.V. (Sanjay) Kale Professor.
NIH Resource for Biomolecular Modeling and Bioinformatics Beckman Institute, UIUC NAMD Development Goals L.V. (Sanjay) Kale Professor.
Conformational Entropy Entropy is an essential component in ΔG and must be considered in order to model many chemical processes, including protein folding,
Bioinformatics: Practical Application of Simulation and Data Mining Markov Modeling II Prof. Corey O’Hern Department of Mechanical Engineering Department.
A Technical Introduction to the MD-OPEP Simulation Tools
Applied Bioinformatics Week 12. Bioinformatics & Functional Proteomics How to classify proteins into functional classes? How to compare one proteome with.
Molecular Dynamics Study of Aqueous Solutions in Heterogeneous Environments: Water Traces in Organic Media Naga Rajesh Tummala and Alberto Striolo School.
Molecular Dynamics simulations
Protein Folding and Modeling Carol K. Hall Chemical and Biomolecular Engineering North Carolina State University.
7. Lecture SS 2005Optimization, Energy Landscapes, Protein Folding1 V7: Diffusional association of proteins and Brownian dynamics simulations Brownian.
Protein Modeling Protein Structure Prediction. 3D Protein Structure ALA CαCα LEU CαCαCαCαCαCαCαCα PRO VALVAL ARG …… ??? backbone sidechain.
Introduction to Protein Structure Prediction BMI/CS 576 Colin Dewey Fall 2008.
Molecular Modelling - Lecture 2 Techniques for Conformational Sampling Uses CHARMM force field Written in C++
LSM3241: Bioinformatics and Biocomputing Lecture 6: Fundamentals of Molecular Modeling Prof. Chen Yu Zong Tel:
Protein Folding & Biospectroscopy Lecture 6 F14PFB David Robinson.
Molecular Dynamics Arjan van der Vaart PSF346 Center for Biological Physics Department of Chemistry and Biochemistry Arizona State.
FlexWeb Nassim Sohaee. FlexWeb 2 Proteins The ability of proteins to change their conformation is important to their function as biological machines.
Structural classification of Proteins SCOP Classification: consists of a database Family Evolutionarily related with a significant sequence identity Superfamily.
Dynameomics: Protein Mechanics, Folding and Unfolding through Large Scale All-Atom Molecular Dynamics Simulations INCITE 6 David A. C. Beck Valerie Daggett.
A Computational Study of RNA Structure and Dynamics Rhiannon Jacobs and Dr. Harish Vashisth Department of Chemical Engineering, University of New Hampshire,
A Computational Study of RNA Structure and Dynamics Rhiannon Jacobs and Harish Vashisth Department of Chemical Engineering, University of New Hampshire,
SMA5422: Special Topics in Biotechnology Lecture 9: Computer modeling of biomolecules: Structure, motion, and binding. Chen Yu Zong Department of Computational.
Page 1 Molecular Modeling Service in Profacgen. Page 2 The three-dimensional structure of a protein provides essential information about its biological.
March 21, 2008 Christopher Bruns
Protein Structure Prediction and Protein Homology modeling
Algorithms and Software for Large-Scale Simulation of Reactive Systems
Computational Analysis
An Integrated Approach to Protein-Protein Docking
Large Time Scale Molecular Paths Using Least Action.
CZ5225 Methods in Computational Biology Lecture 7: Protein Structure and Structural Modeling Prof. Chen Yu Zong Tel:
Algorithms and Software for Large-Scale Simulation of Reactive Systems
Experimental Overview
Presentation transcript:

SMA5233 Particle Methods and Molecular Dynamics Lecture 5: Applications in Biomolecular Simulation and Drug Design A/P Chen Yu Zong Tel: Room 08-14, level 8, S16 National University of Singapore

2 Proteins Proteins are life’s machines, tools and structures –Many jobs, many shapes, many sizes

3 Proteins Proteins are life’s machines, tools and structures –Nature reuses designs for similar jobs 1hdd 1enh 1f43 1ftt 1bw5 1du6 1cqt

4 Proteins Proteins are hetero-polymers of specific sequence –There are 20 common polymeric units (amino acids) Composed of a variety of basic chemical moieties –Chain lengths range from 40 amino acids on up M K L V D Y A G E

5 Proteins Proteins are hetero-polymers that adopt a unique fold M K L V D Y A G E

6 Proteins Protein folding as a reaction Products Reactants Transition state Free Energy Bad Good

7 Proteins Protein folding … Native Denatured / Partially Unfolded Transition state Free Energy Bad Good

8 Proteins Folded proteins Native Denatured / Partially Unfolded Transition state Free Energy Folded, active, functional, biologically relevant state (ensemble of conformers) Bad Good

9 Proteins Folded proteins Native Denatured / Partially Unfolded Transition state Free Energy Static, 3D coordinates of some proteins’ atoms are available from x-ray crystallography & NMR Bad Good

10 Proteins Folded proteins Native Denatured / Partially Unfolded Transition state Free Energy Static, 3D coordinates of some proteins’ atoms are available from PDB Bad Good

11 Proteins Folded proteins are complex and dynamic molecules Native Denatured / Partially Unfolded Transition state Free Energy Bad Good

12 Proteins Folded proteins are complex and dynamic molecules Native Denatured / Partially Unfolded Transition state Free Energy Bad Good

13 Molecular Dynamics MD provides atomic resolution of native dynamics PDB ID: 3chy, E. coli CheY 1.66 Å X-ray crystallography

14 Molecular Dynamics MD provides atomic resolution of native dynamics PDB ID: 3chy, E. coli CheY 1.66 Å X-ray crystallography

15 Molecular Dynamics MD provides atomic resolution of native dynamics 3chy, hydrogens added

16 Molecular Dynamics MD provides atomic resolution of native dynamics 3chy, waters added (i.e. solvated)

17 Molecular Dynamics MD provides atomic resolution of native dynamics 3chy, waters and hydrogens hidden

18 Molecular Dynamics MD provides atomic resolution of native dynamics native state simulation of 3chy at 298 Kelvin, waters and hydrogens hidden

19 Proteins Folding & unfolding at atomic resolution Native Denatured / Partially Unfolded Transition state Free Energy Disordered, non-functional, heterogeneous ensemble of conformers Bad Good

20 Proteins Protein folding, why we care how it happens Native Denatured / Partially Unfolded Transition state Free Energy mutation Many diseases are related to protein folding and / or misfolding in response to genetic mutation.

21 Proteins Protein folding, why we care how it happens Native Denatured / Partially Unfolded Transition state Free Energy mutation We need to comprehend folding to build nano-scale biomachines (that could produce energy, etc…)

22 Proteins Protein folding takes > 10 μs (often much longer) Native Denatured / Partially Unfolded Transition state Free Energy Bad Good

23 Proteins Protein folding takes > 10 μs (often much longer) Native Denatured / Partially Unfolded Transition state Free Energy Bad Good

24 Proteins Protein folding is the reverse of protein unfolding Native Denatured / Partially Unfolded Transition state Free Energy Bad Good

25 Proteins Protein unfolding is relatively invariant to temperature Native Denatured / Partially Unfolded Transition state Free Energy Temperature Bad Good

26 Molecular Dynamics MD provides atomic resolution of folding / unfolding unfolding simulation (reversed) of 3chy at 498 Kelvin, waters & hydrogens hidden

27 Forces Involved in the Protein Folding Electrostatic interactions van der Waals interactions Hydrogen bonds Hydrophobic interactions (Hydrophobic molecules associate with each other in water solvent as if water molecules is the repellent to them. It is like oil/water separation. The presence of water is important for this interaction.)

28 Energy Functions used in Molecular Simulation Electrostatic term H-bonding term Van der Waals term Bond stretching term Dihedral term Angle bending term r Φ Θ + ー O H r r r The most time demanding part.

29 System for MD Simulations Without water molecules With water molecules # of atoms: 304 # of atoms: ,377 = 7,681

30 MD Requires Huge Computational Cost Time step of MD (Δt) is limited up to about 1 fsec ( sec). ← The size of Δt should be approximately one-tenth the time of the fastest motion in the system. For simulation of a protein, because bond stretching motions of light atoms (ex. O-H, C-H), whose periods are about sec, are the fastest motions in the system for biomolecular simulations, Δt is usually set to about 1 fsec. Huge number of water molecules have to be used in biomolecular MD simulations. ← The number of atom-pairs evaluated for non-bonded interactions (van der Waals, electrostatic interactions) increases in order of N 2 (N is the number of atoms). It is difficult to simulate for long time. Usually a few tens of nanoseconds simulation is performed.

31 Time Scales of Protein Motions and MD Time (s) (fs) (ps) (μs)(ns) (ms) Bond stretching Permeation of an ion in Porin channel Elastic vibrations of proteins It is still difficult to simulate a whole process of a protein folding using the conventional MD method. MD α-Helix folding β-Hairpin folding Protein folding

32 Much Faster, Much Larger! Special-purpose computer –Calculation of non-bonded interactions is performed using the special chip that is developed only for this purpose. –For example; MDM (Molecular Dynamics Machine) or MD-Grape: RIKEN MD Engine: Taisho Pharmaceutical Co., and Fuji Xerox Co. Parallelization –A single job is divided into several smaller ones and they are calculated on multi CPUs simultaneously. –Today, almost MD programs for biomolecular simulations (ex. AMBER, CHARMm, GROMOS, NAMD, MARBLE, etc) can run on parallel computers.

33 Brownian Dynamics (BD) The dynamic contributions of the solvent are incorporated as a dissipative random force (Einstein’s derivation on 1905). Therefore, water molecules are not treated explicitly. Since BD algorithm is derived under the conditions that solvent damping is large and the inertial memory is lost in a very short time, longer time- steps can be used. BD method is suitable for long time simulation.

34 System for BD Simulations Without water molecules With water molecules # of atoms: 304 # of atoms: ,377 = 7,681

35 Algorithm of BD The Langevin equation can be expressed as Here, r i and m i represent the position and mass of atom i, respectively. ζ i is a frictional coefficient and is determined by the Stokes ’ law, that is, ζ i = 6πa i Stokes η in which a i Stokes is a Stokes radius of atom i and η is the viscosity of water. F i is the systematic force on atom i. R i is a random force on atom i having a zero mean = 0 and a variance = 6ζ i kTδ ij δ(t); this derives from the effects of solvent. For the overdamped limit, we set the left of eq.7 to zero, The integrated equation of eq. 8 is called Brownian dynamics; where Δt is a time step and ω i is a random noise vector obtained from Gaussian distribution. (7) (9) (8)

36 Computational Time of BD AlgorithmComputer # of atoms Time (sec) Efficiency MD Pentium4 2.8 GHz 7,6812, BD BD +MTS † Pentium4 2.8 GHz BD +MTS † IBM Regatta 8 CPU † MTS(Multiple time step) algorithm: This method reduces the frequency of calculation of the most time-demanding part ( non- bonded energy terms ). Computational time required for 1 nsec simulation of a peptide

37 Folding Simulation of an α-Helical Peptide using BD The fraction of native contacts Simulation time (nsec)

38 Folding Simulation of an β-Hairpin Peptide using BD The fraction of native contacts Simulation time (nsec)

39 Time Scales of Protein Motions and BD Time (s) (fs) (ps) (μs)(ns) (ms) BD method allows us to simulate for long time. BD α-Helix folding β-Hairpin folding Protein folding MD Bond stretching Permeation of an ion in Porin channel Elastic vibrations of proteins

40 EXAMPLE: Unfolding of Staphylococcal protein A through high temperature MD simulations D. Alonso and V. Daggett, PNAS 2000; 97:

41 Simulation Methodology Starting structure: NMR structures 1edk and 1bbd ENCAD program The protein was initially minimized 1,000 steps in vacuo. The minimized protein was then solvated with water in a box (approximately 1 g/ml) extending a minimum of 10 Å from the protein The box dimensions were then increased uniformly to yield the experimental liquid water density for the temperature of interest (0.997 g/ml at 298 K and at 498 K)

42 Simulation Methodology (cont) The systems were then equilibrated by minimizing the water for 2,000 steps, minimizing water and protein for 100 steps, performing MD of the water for 4,000 steps, minimizing the water for 2,000 steps, minimizing the protein for 500 steps, and minimizing the protein and water for 1,000 steps. Production MD simulations were then run using a 2-fs time step for several ns (T=298K, T=498K)

43 RMSD and RG as a function of simulation time

44 Snapshots along the unfolding trajectory

45 EXAMPLE 2: Identification of the N-terminal peptide binding site of GRP94 GRP94 - Glucose regulated protein 94 VSV8 peptide - derived from vesicular stomatitis virus Gidalevitz T, Biswas C, Ding H, Schneidman-Duhovny D, Wolfson HJ, Stevens F, Radford S, Argon Y. J Biol Chem. 2004

46 Biological and Drug Design Motivation The complex between the two molecules highly stimulates the response of the T-cells of the immune system. The complex between the two molecules highly stimulates the response of the T-cells of the immune system. The grp94 protein alone does not have this property. The activity that stimulates the immune response is due to the ability of grp94 to bind different peptides. The grp94 protein alone does not have this property. The activity that stimulates the immune response is due to the ability of grp94 to bind different peptides. Characterization of peptide binding site is highly important for drug design. Characterization of peptide binding site is highly important for drug design. Either the peptides or their derived non-peptide inhibitors can be developed into drugs for treating immune related or immune-regulating diseases

47 GRP94 molecule There was no structure of grp94 protein. Homology modeling was used to predict a structure using another protein with 52% identity. There was no structure of grp94 protein. Homology modeling was used to predict a structure using another protein with 52% identity. Recently the structure of grp94 was published. The RMSD between the crystal structure and the model is 1.3A. Recently the structure of grp94 was published. The RMSD between the crystal structure and the model is 1.3A.

48 Docking and Structure Optimization PatchDock was applied to dock the two molecules, without any binding site constraints followed by MD or MM simulation to optimize the docked structure. PatchDock was applied to dock the two molecules, without any binding site constraints followed by MD or MM simulation to optimize the docked structure. Docking results were clustered in the two cavities: Docking results were clustered in the two cavities:

49 GRP94 molecule There is a binding site for inhibitors between the helices. There is a binding site for inhibitors between the helices. There is another cavity produced by beta sheet on the opposite side. There is another cavity produced by beta sheet on the opposite side.

50 Advantages of Fully Atomic Models Computationally very costly Cannot reach the long time and length scales of biological interest Disadvantages of Fully Atomic Models Detailed level of description of protein and solvent Can use enhanced sampling techniques (see talks by Andrij and Arturo) Can use simplified (coarse-grained) models

51 Molecular Dynamics Scalable, parallel MD & analysis software: in lucem Molecular Mechanics 1 ilmm 1.Beck, Alonso, Daggett, (2004) University of Washington, Seattle

52 Molecular Dynamics ilmm is written in C (ANSI / POSIX) 64 bit math POSIX threads / MPI Software design philosophy: –Kernel Compiles user’s molecular mechanics programs Schedules execution across processor and machines –Modules, e.g. Molecular Dynamics Analysis CPU POSIX threads (multiprocessor machines) CPU Message Passing Interface (multiple machines) + VERY high bandwidth

53 Molecular Dynamics ilmm is written in C (ANSI / POSIX) 64 bit math POSIX threads / MPI Software design philosophy: –Kernel Compiles user’s molecular mechanics programs Schedules execution across processor and machines –Modules, e.g. Molecular Dynamics Analysis CPU POSIX threads (multiprocessor machines) CPU Message Passing Interface (multiple machines) + VERY high bandwidth

54 Dynameomics Simulate representative protein from all folds 1. Day R., Beck D. A. C., Armen R., Daggett V. Protein Science (2003) 10: fold population coverage 150 folds represent ~ 75% of known protein structures fold 1

55 Dynameomics Simulate representative protein from all folds –Native (folded) dynamics 20 nanosecond simulation at 298 Kelvin –Folding / unfolding pathway 3 x 2 ns simulations at 498 K 2 x 20 ns simulations at 498 K –Each target requires 6 simulations = MANY CPU HOURS MANY CPU HOURS

56 Dynameomics NERSC DOE INCITE award –2,000,000 + hours –906 simulations of 151 protein folds on Seaborg –One to two simulations per node (8 – 16 CPUs / simulation) –Opportunity to tune ilmm for maximum performance

57 Dynameomics Load balancing –Even distribution of non-bonded pairs to processors ~20% faster

58 Dynameomics Parallel efficiency –Threaded computations on 16 CPU IBM Nighthawk p, number of processors t(p), run-time using p processors parallel efficiency,

59 Dynameomics Simulate representative from top 151 folds –151 folds represent about 75% of known proteins ~ 11 μs of combined sim. time from 906 sims! ~ 2 terabytes of data (w/ 40 to 60% compression!) ~ 75 / 151 have been analyzed Validated against experiment where possible

60 Dynameomics Now what? –Simulate the top 1130 folds (>90%) More CPU time –Share simulation data from top 151 folds w/ world: Coordinates, analyses, available via WWW MicrosoftSQL database w/ On-Line Analytical Processing (OLAP) End-user queries of coordinate data, analyses, etc. –Data mining More CPU time, clever statistical algorithms, etc.