Drug Discovery Workshop, Catania 2009 Molecular dynamics of protein unfolding using the DEISA-Grid Andrew Emerson, CINECA Supercomputing Centre, Bologna,


Drug Discovery Workshop, Catania 2009 Molecular dynamics of protein unfolding using the DEISA-Grid Andrew Emerson, CINECA Supercomputing Centre, Bologna, Italy

DEISA/DEISA2 HPC Centers
DEISA: Distributed European Infrastructure for Supercomputing Applications.
• DEISA: May 2004 to April 2008
• eDEISA
• DEISA2: May 2008 to April 2011

DEISA2 HPC centers
• BSC: Barcelona Supercomputing Centre (Spain)
• CINECA: Consorzio Interuniversitario per il Calcolo Automatico (Italy)
• CSC: Finnish Information Technology Centre for Science (Finland)
• EPCC: University of Edinburgh and CCLRC (UK)
• ECMWF: European Centre for Medium-Range Weather Forecasts (UK, int)
• FZJ: Research Centre Juelich (Germany)
• HLRS: High Performance Computing Centre Stuttgart (Germany)
• IDRIS: Institut du Développement et des Ressources en Informatique Scientifique - CNRS (France)
• LRZ: Leibniz Rechenzentrum Munich (Germany)
• RZG: Rechenzentrum Garching of the Max Planck Society (Germany)
• SARA: Dutch National High Performance Computing (Netherlands)
• KTH: Kungliga Tekniska Högskolan (Sweden)
• CSCS: Swiss National Supercomputing Centre (Switzerland)
• JSCC: Joint Supercomputer Center of the Russian (Russia)
• CEA: Atomic Energy Commission (France)

European Grids – EGEE and DEISA
• EGEE (Enabling Grids for E-sciencE): a “community-based grid” for running many loosely coupled applications. Jobs can run over many computers simultaneously. Example: high-throughput virtual screening (docking).
• DEISA: jobs generally run on only one machine at a time. Better suited to tightly coupled (i.e. parallel) applications. Example: parallel molecular dynamics, QM, etc.

DEISA/DEISA2 computing power
• DEISA, 2004: ~30 TFlops
• DEISA2, 2008: ~1 PFlops
• Dedicated 10 Gbit/s network connection
• Dedicated subset of computing resources for researchers

DEISA Supercomputing Grid services
• Workflow management: based on UNICORE plus further extensions and services coming from DEISA’s JRA7 and other projects (UniGrids, …)
• Global data management: a well-defined architecture implementing extended global file systems on heterogeneous systems, fast data transfers across sites, and hierarchical data management at a continental scale.
• Science Gateways and portals: specific Internet interfaces that hide complex supercomputing environments from end users and facilitate access for new, non-traditional scientific communities.

Running applications via UNICORE; Global Data Management with GPFS
[Diagram: client, job and data flow between sites, each with CPU and GPFS, linked via NRENs]
• UNICORE (UNiform Interface to COmputing REsources): a platform-independent way of running applications.
• GPFS: a common file system avoids having to ftp files from one site to another.

Accessing DEISA: UNICORE clients, ...
• UNICORE client: downloadable Java client, with workflow capability
OR
• DESHL command-line client, more easily incorporated into graphical applications or portals:
    $ deshl list cineca/home
    $ deshl submit -q bcx myjob.sh
• ImmunoGrid portal
• Plus ssh, gsissh, portals
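To illustrate why a command-line client is easy to embed in a portal or script, here is a minimal Python sketch that only builds the two DESHL command lines quoted on the slide. The helper names (`deshl_list`, `deshl_submit`, `render`) are hypothetical, no DESHL options beyond those shown on the slide are assumed, and the commands are printed rather than executed.

```python
import shlex


def deshl_list(path):
    """Build the argument list for `deshl list <path>` (form taken from the slide)."""
    return ["deshl", "list", path]


def deshl_submit(queue, script):
    """Build the argument list for `deshl submit -q <queue> <script>`."""
    return ["deshl", "submit", "-q", queue, script]


def render(argv):
    """Render an argument list as a shell-safe command string for display or logging."""
    return " ".join(shlex.quote(arg) for arg in argv)


# Show the two commands from the slide without running them.
print(render(deshl_list("cineca/home")))
print(render(deshl_submit("bcx", "myjob.sh")))
```

A portal back end could pass such an argument list to `subprocess.run` on a machine where the client is installed; that step is left out here because it depends on the local setup.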

DEISA Extreme Computing Initiative (DECI)
The basic service-provision model for scientific users is currently the Extreme Computing Initiative.
• DECI launched in early 2005 to enhance DEISA’s impact on science and research
• European call for proposals in May-June every year. Multi-national proposals strongly encouraged
• Identification, enabling, deployment and operation of “flagship” applications in selected areas of science and technology
• Complex, demanding, innovative simulations that require the capabilities of DEISA compute resources
• Proposals reviewed by national evaluation committees
• Applications selected on the basis of scientific excellence, innovation potential and relevance criteria, as well as national priorities
• Once approved, the most powerful HPC architectures in Europe are assigned to the most challenging projects, and the most appropriate supercomputer architecture is selected for each project

DEISA Extreme Computing Initiative (DECI): supporting environment
• Application enabling by the Applications Task Force (ATASKF), a team of leading experts in high performance and grid computing
• DEISA Common Production Environment (DCPE): homogenising the heterogeneous DEISA software environments
• General environment and user-related application support
• European team of system operation specialists handling coordination and synchronisation of system services, maintenance measures and failure situations
• Training sessions

DEISA Extreme Computing Initiative (DECI): calls 2005-2008
DECI call 2005
• 51 proposals, 12 European countries involved
• 30m cpu-h requested
• 29 proposals accepted, 12m cpu-h awarded (normalized to IBM P4+)
DECI call 2006
• 41 proposals, 12 European countries involved
• 28m cpu-h requested
• 23 proposals accepted, 12m cpu-h awarded
DECI call 2007
• 63 proposals, 14 European countries involved
• 70m cpu-h requested
• 45 proposals accepted, ~30m cpu-h awarded
DECI call 2008
• 42 proposals accepted, ~49m cpu-h awarded
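The call statistics can be turned into acceptance rates with a few lines of Python. This is a rough sketch using only the figures quoted on the slide; the dictionary layout and the `rates` helper are illustrative. The 2007 award is approximate in the source, and it is an assumption that requested and awarded hours are expressed in comparable units. The 2008 call is omitted because the slide gives no proposal or request totals for it.

```python
# Figures from the slide; cpu-h values are in millions of hours.
DECI_CALLS = {
    2005: {"proposals": 51, "accepted": 29, "requested": 30, "awarded": 12},
    2006: {"proposals": 41, "accepted": 23, "requested": 28, "awarded": 12},
    2007: {"proposals": 63, "accepted": 45, "requested": 70, "awarded": 30},  # award is "~30m"
}


def rates(call):
    """Return (fraction of proposals accepted, fraction of requested cpu-h awarded)."""
    return (call["accepted"] / call["proposals"],
            call["awarded"] / call["requested"])


for year, call in sorted(DECI_CALLS.items()):
    accepted, awarded = rates(call)
    print(f"{year}: {accepted:.0%} of proposals accepted, {awarded:.0%} of requested cpu-h awarded")
```

On these numbers, somewhat more than half of the proposals were accepted in each call, while well under half of the requested time was awarded.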

Using DEISA: DECI Case Study
• DECI 2007 Project UnBlaMD
• Principal Investigator: Ivano Eberini (University of Milan)
• Title: Urea induced unfolding of Beta-lactoglobulin by Molecular Dynamics
• Area: Protein structure

Background - Protein folding problem
• Proteins are linear polymers of amino-acid monomers, but their function depends on their three-dimensional structure.
• Most proteins, e.g. enzymes and receptors, are biologically active only when correctly and completely folded.
• Protein misfolding can give rise to diseases such as Alzheimer’s, Creutzfeldt-Jakob disease (CJD), cystic fibrosis and some cancers.

Background - Protein folding problem
• Protein structures can in some cases be determined experimentally by X-ray diffraction or NMR. But for X-ray:
    - some globular proteins are difficult to crystallize
    - it cannot be used for trans-membrane proteins (about 30% of all proteins)
    - the crystal structure of a protein may not be an accurate representation of the structure in vivo.
• NMR can probe structure in solution but can generally be used only for small proteins.

“ab-initio” protein folding
• Homology modelling can be successful but obviously requires homology with a molecule of known structure.
• The big question is whether we can go from sequence to structure by calculation alone:
    ..STCGGGLIFYNQRKEGWPMPGGRKCGGTLHHHNY...

“ab-initio” protein folding
• Classical methods such as Molecular Dynamics generally give poor results:
    - folding is complex and slow (μs→mins) compared with MD timescales (ns)
    - proteins have many conformational degrees of freedom → complicated energy surface → many states to sample
    - other factors may also be involved, e.g. the accuracy of classical force-fields.
• Thus many researchers instead study the reverse process, unfolding, in the hope that this will give clues to the folding process. Unfolding is also easier to study experimentally.

Protein unfolding
• Experimentally, researchers use high T, high P, low pH or denaturants (e.g. guanidinium chloride or urea) to accelerate the unfolding process.
• In silico (i.e. MD), high T and urea are often used.
• Denaturing agents like urea must disrupt the hydrogen bonding pattern, since protein secondary structure is due to hydrogen bonds, but the precise mechanism is unknown.

Example: MD unfolding of ProteinL
A. Rocco et al, Biophys J
• At 400K and 480K: two different unfolding pathways, depending on the presence of urea.

DSSP: ProteinL at 300K and 350K
• At more realistic temperatures, however, nothing much happens. Insufficient simulation time?

Urea-induced unfolding of BLG by MD
Background: Urea-induced protein unfolding may provide a tractable route to understanding protein structure. However, most MD studies use extreme (i.e. non-physiological) temperatures. Is the unfolding due to high temperature or due to urea?
Aim: investigate the urea-induced unfolding mechanism of a small protein at ambient temperature and pressure by classical MD.
Computational resources available thanks to a grant of ~1M hrs from the DEISA Extreme Computing Initiative (“UnBlaMD”, DECI 2007 call).

β-lactoglobulin (BLG)
Why BLG?
• Ligand-binding protein (transport protein) of the lipocalin family, present in cow milk (whey)
• atoms, mass 18 kDa, 152 amino acids
• Extensively studied experimentally (~170 articles). Known to unfold at 300K in 8M urea.
• A full-blown protein, unlike a peptide or domain, but sufficiently small to be computationally treatable (we hope!)

System preparation (“application enabling”)
• NAMD chosen as main simulation engine due to high parallel scalability.
• Validation of Charmm params for urea using ProteinL. Selected those of Jorgensen (originally from OPLS).
• Structure available from PDB (1b8e)
• Water, 10M urea + Na+ created with Insight/Builder and Vega (checked correct assignment of cis/trans H).
• Short simulated annealing runs to equilibrate solvent (same protocol as ProteinL setup).
• Time for application enabling: ~3 months

Further application enabling - benchmarking
• Choose #procs and machine options to optimise performance

Production runs
• Started Jul-Aug 2008 on the IBM BlueGene BG/P at Jülich Supercomputing Centre (Jugene).
• Simulation details
    - NAMD 2.6 (CVS version compiled for BG/P)
    - NPT, 1 atm pressure
    - Langevin temperature scaling
    - Shake + 2fs timestep
    - cubic PBC, Particle Mesh Ewald for electrostatics
    - standard cutoffs, Charmm22
• Performance
    - BG/P (dual mode, 2048 PowerPC cores): ~19ns/day
    - Cineca Linux cluster (BCX, 256 AMD Opteron cores): ~13ns/day

Current status
• BG/P runs finished in November → 650ns
• Cineca runs started; hope to reach 900ns-1μs.
• Data obtained so far: ~0.7 TB in 60 trajectory files. Transferred from BG/P to Cineca via DEISA GFS.
• Analysis using Gromacs (after trajectory conversion) and VMD/Tcl. Data analysis is proving to be non-trivial! Perhaps use workflows or other automation?
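Combining the 2 fs timestep and the throughput figures quoted for the production runs with the trajectory lengths above gives a feel for the scale of the calculation. The sketch below is back-of-envelope arithmetic only; the function names are illustrative, and it ignores queue waits, restarts and I/O, so real elapsed times would be longer.

```python
FS_PER_NS = 1_000_000  # femtoseconds in one nanosecond


def steps_for(length_ns, timestep_fs=2.0):
    """Number of MD integration steps needed to cover length_ns of simulated time."""
    return int(round(length_ns * FS_PER_NS / timestep_fs))


def wallclock_days(length_ns, ns_per_day):
    """Idealised run time in days at a sustained throughput of ns_per_day."""
    return length_ns / ns_per_day


def ns_per_day_per_core(ns_per_day, cores):
    """Throughput normalised by core count, for comparing machines."""
    return ns_per_day / cores


print(steps_for(650))                            # steps for the 650 ns BG/P trajectory
print(steps_for(1000))                           # steps for a full 1 microsecond
print(round(wallclock_days(650, 19), 1))         # ideal days on BG/P at ~19 ns/day
print(round(wallclock_days(1000 - 650, 13), 1))  # ideal days for the remainder on BCX
print(ns_per_day_per_core(19, 2048), ns_per_day_per_core(13, 256))
```

On these figures the 650 ns BG/P trajectory corresponds to 325 million steps and roughly 34 days of uninterrupted running. Per core the BCX cluster is several times faster, although BG/P still delivers the higher total throughput.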

C-alpha RMSD
[Figure: C-alpha RMSD (nm) against time, in successive time windows starting with 0-12 ns]
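For readers unfamiliar with the quantity plotted, the following pure-Python sketch computes an RMSD between a reference set of C-alpha coordinates and a later frame. It is a simplified illustration and not the analysis used in the talk: it removes only the overall translation by centring both structures, whereas analysis packages normally also apply a least-squares rotational fit before computing the RMSD. The function names and the toy coordinates are made up for the example.

```python
import math


def centroid(coords):
    """Geometric centre of a list of (x, y, z) tuples."""
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))


def centred(coords):
    """Shift coordinates so that their centroid is at the origin."""
    cx, cy, cz = centroid(coords)
    return [(x - cx, y - cy, z - cz) for x, y, z in coords]


def rmsd(reference, frame):
    """RMSD between two equally sized coordinate sets after removing translation only."""
    if not reference or len(reference) != len(frame):
        raise ValueError("coordinate sets must be non-empty and of equal length")
    a, b = centred(reference), centred(frame)
    total = sum((p[i] - q[i]) ** 2 for p, q in zip(a, b) for i in range(3))
    return math.sqrt(total / len(a))


# Toy example with two atoms (same length units in and out, e.g. nm).
ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
print(rmsd(ref, [(5.0, 5.0, 5.0), (6.0, 5.0, 5.0)]))  # pure translation of ref
print(rmsd(ref, [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]))  # stretched along x
```

Skipping the rotational fit means this value also picks up any overall tumbling of the molecule, which is why production analyses superimpose the frames first.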

DSSP – comparison of t=0-12ns and a later time window
[Figure: DSSP secondary-structure plots for the two windows, showing β-sheet and α-helix]

H-Bond analysis
H-bond params: Angle=30°, Distance=3.5Å
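A geometric hydrogen-bond test with these two cutoffs can be sketched as follows. The slide does not say which atoms define the distance and the angle, so this example assumes a common convention: the distance is measured between donor and acceptor heavy atoms, and the angle is the hydrogen-donor-acceptor angle. The function name and the test coordinates are hypothetical, and coordinates are taken to be in Å.

```python
import math


def is_hbond(donor, hydrogen, acceptor, max_dist=3.5, max_angle_deg=30.0):
    """Geometric H-bond test (assumed convention: donor-acceptor distance,
    hydrogen-donor-acceptor angle). Coordinates are (x, y, z) tuples in Angstrom."""
    da = tuple(a - d for a, d in zip(acceptor, donor))  # donor -> acceptor vector
    dh = tuple(h - d for h, d in zip(hydrogen, donor))  # donor -> hydrogen vector
    dist_da = math.sqrt(sum(x * x for x in da))
    dist_dh = math.sqrt(sum(x * x for x in dh))
    if dist_da == 0.0 or dist_dh == 0.0 or dist_da > max_dist:
        return False
    cos_angle = sum(x * y for x, y in zip(da, dh)) / (dist_da * dist_dh)
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding error
    return math.degrees(math.acos(cos_angle)) <= max_angle_deg


donor, hydrogen = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
print(is_hbond(donor, hydrogen, (2.9, 0.0, 0.0)))  # linear and close
print(is_hbond(donor, hydrogen, (4.0, 0.0, 0.0)))  # beyond the distance cutoff
print(is_hbond(donor, hydrogen, (0.0, 2.9, 0.0)))  # 90 degree angle
```

Counting the donor/acceptor pairs that pass such a test in each trajectory frame gives the kind of time series from which a loss of backbone hydrogen bonds can be read off.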

Observations
Technical
• A 1μs trajectory of a real protein is very long by the standards of atomistic MD simulation and would not have been feasible without DEISA resources.
• The non-trivial problem of data storage and transfer was eased by the DEISA shared, high-performance file systems.
Scientific
• Data analysis has only just started, but it is clear that BLG has not unfolded and may not do so before 1μs.
• There are indications that the secondary structure is decreasing and that H-bonds in the protein backbone are being lost, but the process at 300K is clearly slow on the MD timescale.
• More detailed analysis is underway, particularly to understand which hydrogen bonds are being affected. It will probably be followed by higher-temperature or REMD runs.

Acknowledgements
• Ivano Eberini and his group at the University of Milan.
• Developers of Vega for their help with urea Charmm/NAMD parametrisation.
• Anna Tramontano for expert advice.
• DEISA for computer time and slides.
• Staff at the Juelich Supercomputing Centre for assistance in optimisation of NAMD on BG/P.