Harnessing Grid-Based Parallel Computing Resources for Molecular Dynamics Simulations
Josh Hursey
Villin Folding
Overview
An adaptive framework for harnessing low-latency parallel compute resources for protein folding research. It combines capability discovery, load balancing, process monitoring, and checkpoint/restart services to provide a platform for molecular dynamics simulations on a range of grid-based parallel computing resources, including clusters, SMP machines, and clusters of SMP machines (sometimes known as constellations).
Design Goals
Provide an easy-to-use, open source interface to significant computing resources for scientists performing molecular dynamics simulations on large biomolecular systems.
  Automate the process of running molecular systems on a variety of parallel computing resources
  Handle failures gracefully & automatically
  Don’t hinder performance possibilities
  Ease of use for scientists, system administrators, & contributors
  Provide low-friction install, configuration, & run-time interfaces
  Sustain tight linkage with project.
Open Source Building Blocks
  GROMACS: Molecular dynamics software package. Primary scientific core.
  FFTW: Fast Fourier Transform library. Used internally in GROMACS.
  LAM/MPI: Message Passing Interface implementation. Supports the MPI-2.0 specification.
  COSM: Distributed computing library to aid in portability. Provides capability discovery, logging, & base utilities.
  NetPIPE: Common tool for measuring bandwidth & latency. Used in capability discovery.
  Large-scale distributed computing project. Foundation for this project.
Contributor Setup
  1. Create a user to run
  2. Download & unpack the distribution
  3. Confirm LAM/MPI installation & configuration
  4. Start LAM/MPI: $ lamboot
  5. Configure using mother.conf
  6. Start: $ mpirun -np 1 bin/mother

$ lamnodes
n0 c1.cluster.earlham.edu:2:origin,this_node
n1 c2.cluster.earlham.edu:2:
n2 c3.cluster.earlham.edu:2:
n3 c4.cluster.earlham.edu:2:
n4 c5.cluster.earlham.edu:2:
n5 c6.cluster.earlham.edu:2:
n6 c7.cluster.earlham.edu:2:
n7 c8.cluster.earlham.edu:2:
n8 c9.cluster.earlham.edu:2:
n9 c10.cluster.earlham.edu:2:

$ cat conf/mother.conf
[Network]
LamHosts=n0,n1,n2,n3,n4,n5,n6
LamMother=n0
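The [Network] section of mother.conf lists LAM node identifiers taken from the lamnodes output. As a rough illustration only, the Python sketch below parses lamnodes-style lines and writes out such a section. The helper names (LamNode, parse_lamnodes, network_section) are hypothetical and not part of the distribution. Only the [Network], LamHosts, and LamMother keys shown on the slide are assumed to exist, and choosing the node flagged "origin" as the mother is an assumption based on the slide's example.

```python
from dataclasses import dataclass, field


@dataclass
class LamNode:
    """One line of lamnodes output (hypothetical helper type)."""
    node_id: str                 # LAM node identifier, e.g. "n0"
    host: str                    # host name
    cpus: int                    # CPU count reported for the node
    flags: list = field(default_factory=list)  # e.g. ["origin", "this_node"]


def parse_lamnodes(text):
    """Parse lines such as 'n0 c1.cluster.earlham.edu:2:origin,this_node'."""
    nodes = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("$"):
            continue  # skip blank lines and echoed shell prompts
        node_id, rest = line.split(None, 1)
        parts = rest.split(":")
        host = parts[0]
        cpus = int(parts[1]) if len(parts) > 1 and parts[1] else 1
        flags = [f for f in parts[2].split(",") if f] if len(parts) > 2 else []
        nodes.append(LamNode(node_id, host, cpus, flags))
    return nodes


def network_section(nodes, count):
    """Build a [Network] section from the first `count` nodes.

    Assumption: the mother runs on the node flagged 'origin',
    falling back to the first selected node.
    """
    chosen = nodes[:count]
    if not chosen:
        raise ValueError("at least one node is required")
    mother = next((n for n in chosen if "origin" in n.flags), chosen[0])
    return "\n".join([
        "[Network]",
        "LamHosts=" + ",".join(n.node_id for n in chosen),
        "LamMother=" + mother.node_id,
    ])
```

Passing the ten-node lamnodes listing above with count=7 yields the same three lines as the conf/mother.conf shown on the slide.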
Testing Environment: Cairo
Network Fabric: 2 Netgear GSM MB Switches, linked together by dual GBIC/1000 BT RJ45 modules
OS: Yellow Dog Linux (4.0 Release, SMP Kernel)
GCC:
Nodes: 16 Apple Xserves
Processor: Dual G4 PowerPC 999 MHz
L2 Cache: 256 KB
L3 Cache: 2 MB
Front Side Bus: 133 MHz
RAM: 1 GB PC2100 DDRAM
NIC: 1 on-board 10/100/1000 BT, 1 PCI 10/100/1000 BT
Hard Drive: 60 GB IBM Ultra ATA/ RPM w/ 2 MB cache
Motherboard: Apple Proprietary
Testing Environment: Molecules
Molecule | Description | Mass Points
DPPC | A phospholipid membrane, consisting of 1024 dipalmitoylphosphatidylcholine (DPPC) lipids in a bilayer configuration with 23 water molecules per lipid. | 121,856
Proteasome (Stable) | A peptide in a proteasome with explicit solvent and a Coulomb type of reaction field. | 119,507
Villin | The Villin headpiece, a 35-residue peptide, simulated with 3000 water molecules. | 9,389
Performance
[Plots: DPPC, Proteasome (Stable), Villin]
Future Directions
  New scientific cores (Amber, NAMD, etc.)
  Remove dependencies on pre-installed software
  Extend testing suite of molecules
  Extend range of parallel compute resources used in testing
  Abstract framework
  Investigate load balancing & resource usage improvements
  Architecture addition: Grandmothers
  Beta Release!
About Us
Charles Peck, Josh McCoy, John Schaefer, Vijay Pande, Erik Lindahl, Adam Beberg, Josh Hursey
Questions
Speedup
[Plots: DPPC, Proteasome (Stable), Villin]
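The slide does not say how speedup was computed. Assuming the conventional definition (run time on one processor divided by run time on p processors, with efficiency as speedup divided by p), a minimal Python sketch could look like the following. The function names are hypothetical and no timings from the actual test runs are implied.

```python
def speedup(t_serial, t_parallel):
    """Conventional speedup: single-processor time over p-processor time."""
    if t_parallel <= 0:
        raise ValueError("parallel run time must be positive")
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, processors):
    """Parallel efficiency: speedup divided by the processor count."""
    if processors < 1:
        raise ValueError("processor count must be at least 1")
    return speedup(t_serial, t_parallel) / processors


def scaling_table(timings):
    """Turn {processors: run_time} into (processors, speedup, efficiency) rows.

    Assumes the dictionary contains a measurement for 1 processor,
    which serves as the serial baseline.
    """
    t_serial = timings[1]
    return [
        (p, speedup(t_serial, t), efficiency(t_serial, t, p))
        for p, t in sorted(timings.items())
    ]
```

Feeding measured wall-clock times for each molecule into scaling_table would give one row per processor count, which is the kind of data a speedup plot is drawn from.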
Testing Environment: Bazaar
Network Fabric: 2 Switches (3Com 3300XM 100 MB, 3Com MB), linked together by a 3Com MultiLink cable
OS: SuSE Linux ( SMP Kernel)
GCC:
Nodes: 16 VA Linux 2200s
Processor: Dual Pentium III 500 MHz
L1 Cache: 32 KB
L2 Cache: 512 KB
Front Side Bus: 100 MHz
RAM: 512 MB SDRAM
NIC: Intel Pro 10/100B/100+ Ethernet (on-board)
Hard Drive: 18 GB WD Caviar 7200
Motherboard: Intel L330GX+ Server Board
Motivation