Parallel Computing in Computational Chemistry
Why? What happens at the molecular level?
Computational chemistry is a branch of chemistry that uses computer simulation to assist in solving chemical problems. It uses methods of theoretical chemistry, incorporated into efficient computer programs, to calculate the structures and properties of molecules.
Ref: J. Chem. Phys. 27, 1208 (1957); doi: 10.1063/1.1743957
PNO: 96; PBC; MD simulation; IBM-704
IBM-704
The first mass-produced computer with floating-point arithmetic hardware, introduced by IBM in 1954.
36-bit words; executes up to 12,000 calculations per second (about 12 kFLOPS).
Today: petaFLOPS, 10^15 (a quadrillion, i.e. a thousand trillion, calculations per second).
Future: exaFLOPS, 10^18 (a billion billion calculations per second).
1960: Vineyard group; simulated radiation damage of a Cu crystal with MD
1964: Rahman; MD simulation of liquid Ar
1969: Barker and Watts; Monte Carlo simulation of water
1971: Rahman and Stillinger; MD simulation of water
[Figures: Cray 1 (1976); Cray T3E (1995)]
CPU clock speed by year (Ref: www.maximumpc.com)
Year    Speed
1985    33 MHz
1989    100 MHz
1993    233 MHz
1996    385 MHz
1997    450 MHz
1999    570 MHz
2000 to 2006: 1.4, 2, 2.25, 2.3, 3.2 GHz
PARALLEL COMPUTING: a type of computation in which many calculations are carried out simultaneously, on the principle that large problems can often be divided into smaller ones, which are then solved at the same time.
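As a minimal sketch of that principle, the Python example below splits one large sum into chunks and hands each chunk to a worker. The names `chunk_bounds`, `partial_sum_of_squares`, and `parallel_sum_of_squares` are hypothetical. A thread pool is assumed only to keep the sketch self-contained: in CPython, threads show the decomposition but do not speed up pure-Python arithmetic, so a process pool or MPI would normally be used instead.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_bounds(n, n_workers):
    """Split range(n) into n_workers contiguous (start, stop) pieces."""
    base, extra = divmod(n, n_workers)
    bounds, start = [], 0
    for w in range(n_workers):
        stop = start + base + (1 if w < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def partial_sum_of_squares(bounds):
    """Smaller problem: sum of i*i over one chunk."""
    start, stop = bounds
    return sum(i * i for i in range(start, stop))


def parallel_sum_of_squares(n, n_workers=4):
    """Large problem: sum of i*i for i in range(n), solved chunk by chunk."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_sum_of_squares, chunk_bounds(n, n_workers)))
    return sum(partials)  # combine the partial results


if __name__ == "__main__":
    print(parallel_sum_of_squares(1_000_000))
```

The final `sum(partials)` is the only step that needs results from every worker; everything before it runs without any exchange of data.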
Instructions:
1- clean the windows
2- clean the door
3- clean the roof
4- clean the table
[Figure: in serial processing, all instructions of a problem go to one processor; in parallel processing, the instructions are divided among several processors.]
Required conditions for parallel processing:
1- Having suitable hardware
2- The problem can be parallelized
3- Having a suitable algorithm
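HARDWARE: Parallel hardware architectures
Shared memory: several CPUs access one common memory.
Distributed memory: each CPU has its own memory, and the CPUs are connected by a network.
[Figure: CPU block diagram with control unit, arithmetic logic unit, memory, input, and output]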
HARDWARE: Computational Units (CPU)
CPU: Central Processing Unit (basic arithmetic, logical, control, and input/output operations)
[Figure: single-core, dual-core, and quad-core CPUs]
HARDWARE: Computational Units (GPU)
GPU: Graphics Processing Unit
[Figure: CPU versus GPU]
[Figure: molecular dynamics simulation of a protein insertion process; NCSA Lincoln Cluster performance (8 Intel cores and 2 NVIDIA Tesla GPUs per node, 1 million atoms)]
Ref: www.ks.uiuc.edu/Research/namd
GPUs have a fundamentally different architecture from CPUs, so an application has to be programmed specifically for a GPU, using different techniques.
GPU constraints:
It needs new programming languages.
It needs a new programming paradigm.
Packages with GPU support:
NAMD (www.ks.uiuc.edu/Research/namd)
LAMMPS (lammps.sandia.gov)
Gromacs (www.gromacs.org)
DL_POLY 4 (www.stfc.ac.uk//research/app/ccg/software/DL_POLY/44516.aspx)
GAMESS 2012, closed-shell MP2 and closed-shell CCSD(T) energy (www.msg.ameslab.gov/gamess)
The problem can be parallelized
Counter-example: this loop cannot be parallelized, because each iteration needs the result of the previous one.
      x(1) = 100.
      DO 10 i = 2, 1000
         x(i) = sin(x(i-1))
   10 CONTINUE
i=2 : x(2)=sin(x(1))
i=3 : x(3)=sin(x(2))
i=4 : x(4)=sin(x(3))
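The contrast between a dependent and an independent loop can be sketched in Python as follows. The function names are hypothetical, and the second loop, `y[i] = sin(a[i])`, is an assumed counterpart that does not appear on the slide; it is included only to show what changes when no iteration reads the result of another.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def recurrence_serial(x1, n):
    """x[i] = sin(x[i-1]): iteration i cannot start before i-1 has finished."""
    x = [x1]
    for _ in range(1, n):
        x.append(math.sin(x[-1]))
    return x


def independent_serial(a):
    """y[i] = sin(a[i]): no iteration reads the result of another one."""
    return [math.sin(v) for v in a]


def independent_parallel(a, n_workers=4):
    """Same independent loop, with the iterations handed out to a pool of workers."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(math.sin, a))


if __name__ == "__main__":
    x = recurrence_serial(100.0, 1000)           # must run in order
    a = [0.01 * i for i in range(1000)]
    print(independent_parallel(a) == independent_serial(a))
```

Only the second loop can be split among workers in any order and still give the serial answer.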
Example: matrix multiplication. Each element C[i][j] depends only on row i of a and column j of b, so the elements can be computed independently.
for (i = 0; i < n; i++) {
    for (j = 0; j < n; j++) {
        C[i][j] = 0;
        for (k = 0; k < n; k++)
            C[i][j] += a[i][k] * b[k][j];
    }
}
[Figure: example matrices A and B]
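A minimal Python sketch of how the matrix product could be shared among workers is given below. Distributing by rows is an assumed choice (columns or blocks would work equally well), and `row_of_product` and `matmul_parallel` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def row_of_product(args):
    """Compute one row of C = A * B; it needs one row of A and all of B."""
    a_row, b = args
    n_cols = len(b[0])
    return [sum(a_row[k] * b[k][j] for k in range(len(b))) for j in range(n_cols)]


def matmul_parallel(a, b, n_workers=4):
    """Each row of C is an independent task, so rows are shared among workers."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(row_of_product, [(row, b) for row in a]))


if __name__ == "__main__":
    print(matmul_parallel([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

No worker writes to a row owned by another, so no synchronization is needed until the rows are collected.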
Parallel algorithms in computational chemistry-QM
Steps:
1- Obtain initial guess for density matrix
2- Fock matrix formation (two-electron integrals)
3- Diagonalize Fock matrix
4- Form new density matrix
5- Iterate from step 2 until converged
[Figure: share of computing time for integral evaluation, Fock matrix formation, density formation, annihilation, and others]
Ref: DOI: 10.1039/c002859b
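Fock matrix formation is the step that is usually distributed. The sketch below shows one possible scheme, assumed here and not taken from the cited reference: the elements of the closed-shell Fock matrix are dealt out to workers, and the partial results are then gathered. The names `g_element`, `g_batch`, and `fock_parallel` are hypothetical, and the two-electron integrals are assumed to be available as a plain four-index list `eri[mu][nu][lam][sig]`.

```python
from concurrent.futures import ThreadPoolExecutor


def g_element(mu, nu, density, eri):
    """Two-electron part of one closed-shell Fock element:
    G[mu][nu] = sum over lam, sig of P[lam][sig] * ((mu nu|lam sig) - 0.5 * (mu lam|nu sig))."""
    size = len(density)
    return sum(
        density[lam][sig] * (eri[mu][nu][lam][sig] - 0.5 * eri[mu][lam][nu][sig])
        for lam in range(size)
        for sig in range(size)
    )


def g_batch(batch, density, eri):
    """Work of one processor: the Fock elements in its batch of (mu, nu) pairs."""
    return [(mu, nu, g_element(mu, nu, density, eri)) for mu, nu in batch]


def fock_parallel(h_core, density, eri, n_workers=2):
    """F = H + G, with the (mu, nu) pairs dealt out round-robin to the workers."""
    size = len(h_core)
    pairs = [(mu, nu) for mu in range(size) for nu in range(size)]
    batches = [pairs[w::n_workers] for w in range(n_workers)]
    fock = [row[:] for row in h_core]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = [pool.submit(g_batch, b, density, eri) for b in batches]
        for fut in futures:
            for mu, nu, value in fut.result():
                fock[mu][nu] += value  # gather step
    return fock
```

Production codes exploit integral symmetry and screening and never store the full integral list; the sketch ignores both for brevity.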
Parallelization Strategies in MD
Molecular dynamics (MD) is a computer simulation technique in which the time evolution of a set of interacting atoms is followed by integrating their equations of motion.
Steps of an MD run:
1- Initialize
2- Force calculation
3- Motion
4- Analysis
5- Summarize
[Figure: computing time spent on forces and on others]
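The stages of an MD run can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the velocity Verlet integrator, the one-dimensional harmonic force with unit spring constant, and the names `harmonic_forces` and `run_md`.

```python
def harmonic_forces(positions):
    """Hypothetical force field: each atom is tied to the origin by a unit spring."""
    return [-x for x in positions]


def run_md(positions, velocities, n_steps, dt=0.01, mass=1.0):
    """Initialize, then repeat (force calculation, motion, analysis), then summarize."""
    x, v = list(positions), list(velocities)      # initialize
    f = harmonic_forces(x)
    energies = []
    for _ in range(n_steps):
        # motion: velocity Verlet (an assumed choice of integrator)
        v = [vi + 0.5 * dt * fi / mass for vi, fi in zip(v, f)]
        x = [xi + dt * vi for xi, vi in zip(x, v)]
        f = harmonic_forces(x)                    # force calculation
        v = [vi + 0.5 * dt * fi / mass for vi, fi in zip(v, f)]
        # analysis: total energy for the unit-spring potential
        kinetic = sum(0.5 * mass * vi * vi for vi in v)
        potential = sum(0.5 * xi * xi for xi in x)
        energies.append(kinetic + potential)
    drift = max(energies) - min(energies)         # summarize
    return x, v, drift
```

In a realistic system the force calculation dominates the cost of each step, which is why the strategies that follow concentrate on distributing it.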
Parallelization Strategies in MD-Replicated Data
Ref: ROM. J. BIOCHEM., 46, 2, 129-148 (2009)
Advantages:
Simplicity: this is a relatively easy parallel strategy to implement, requiring only minor changes to the scalar code.
Disadvantages:
Memory usage is high (every processor holds a copy of all the data).
Communication costs are quite high.
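A minimal Python sketch of the replicated-data idea follows. Each worker receives the complete position list, computes forces only for its own atoms, and a global sum then combines the partial force arrays. The pair force (a unit-charge repulsion) and all function names are hypothetical, chosen only to keep the example short.

```python
from concurrent.futures import ThreadPoolExecutor


def pair_force(ri, rj):
    """Hypothetical pair force on atom i due to atom j (unit-charge repulsion)."""
    d = [a - b for a, b in zip(ri, rj)]
    r2 = sum(c * c for c in d)
    inv_r3 = r2 ** -1.5
    return [c * inv_r3 for c in d]


def forces_for_atoms(my_atoms, positions):
    """One processor: it holds ALL positions but computes forces only for my_atoms."""
    partial = [[0.0, 0.0, 0.0] for _ in positions]
    for i in my_atoms:
        for j in range(len(positions)):
            if i != j:
                f = pair_force(positions[i], positions[j])
                partial[i] = [p + c for p, c in zip(partial[i], f)]
    return partial


def replicated_data_forces(positions, n_workers=2):
    """Deal the atoms out to workers, then sum the full-length partial arrays."""
    n = len(positions)
    blocks = [range(w, n, n_workers) for w in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(lambda blk: forces_for_atoms(blk, positions), blocks))
    # global sum: stands in for the communication step between processors
    return [[sum(p[i][c] for p in partials) for c in range(3)] for i in range(n)]
```

Both drawbacks are visible in the sketch: every worker allocates a force array for all atoms, and the global sum moves data proportional to the full system size regardless of the number of workers.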
Parallelization Strategies in MD-Force Decomposition
Properties:
Communication operations scale as N/sqrt(P) rather than N (P: number of processors).
Memory costs for the position and force vectors are reduced by a factor of sqrt(P).
Retains the simplicity of the RD (replicated data) technique.
Ref: DOI: 10.1007/1-4020-2670-5_15
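In the usual formulation of force decomposition, assumed here, the N x N matrix of pair forces is cut into blocks and each processor is given one block. The Python sketch below only builds that assignment and counts which positions a processor must hold; it assumes a square number of processors, and the function names are hypothetical.

```python
import math


def block_ranges(n_atoms, n_blocks):
    """Split atom indices 0..n_atoms-1 into n_blocks contiguous ranges."""
    size = math.ceil(n_atoms / n_blocks)
    return [range(b * size, min((b + 1) * size, n_atoms)) for b in range(n_blocks)]


def force_decomposition(n_atoms, n_procs):
    """Assign one block of the N x N force matrix to each processor.

    Assumes n_procs is a perfect square, so the processors form a
    sqrt(P) x sqrt(P) grid. Returns {proc_id: (row_atoms, col_atoms)}.
    """
    side = math.isqrt(n_procs)
    if side * side != n_procs:
        raise ValueError("this sketch assumes a square number of processors")
    blocks = block_ranges(n_atoms, side)
    return {a * side + b: (blocks[a], blocks[b]) for a in range(side) for b in range(side)}


def positions_needed(assignment, proc_id):
    """Atoms whose positions this processor must hold: its row and column blocks."""
    rows, cols = assignment[proc_id]
    return set(rows) | set(cols)
```

With 100 atoms on 16 processors, an off-diagonal processor holds 50 positions instead of all 100, which is where the reduction in memory and communication comes from.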
Parallelization Strategies in MD-Spatial Decomposition
[Figure: simulation box divided among processors, with cutoff radius rcut]
Properties:
The communication costs can be minimized.
Needs more sophisticated programming.
Ref: DOI: 10.1002/(SICI)1096-987X(199703)18:4<478::AID-JCC3>3.0.CO;2-Q
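A minimal Python sketch of spatial decomposition is shown below under simplifying assumptions: the box is cut into slabs along x only, there are no periodic images, and the cutoff is no larger than one slab width. The function name `spatial_decomposition` and its parameters are hypothetical.

```python
def spatial_decomposition(x_coords, box_length, n_domains, r_cut):
    """Slab decomposition along x (no periodic images, r_cut <= slab width).

    Returns, for each domain, the atoms it owns and the halo atoms it must
    receive from neighbouring domains (those within r_cut of a shared face).
    """
    width = box_length / n_domains
    owned = [[] for _ in range(n_domains)]
    halo = [[] for _ in range(n_domains)]
    for atom, x in enumerate(x_coords):
        d = min(int(x / width), n_domains - 1)    # domain that owns this atom
        owned[d].append(atom)
        # the atom is also needed by a neighbour if it lies within r_cut of the shared face
        if d > 0 and x - d * width < r_cut:
            halo[d - 1].append(atom)
        if d < n_domains - 1 and (d + 1) * width - x < r_cut:
            halo[d + 1].append(atom)
    return owned, halo
```

Only the halo atoms have to be communicated, and their number depends on rcut and the surface of a domain rather than on the total number of atoms.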
How to run a parallel program efficiently?
Two ways to use many CPUs:
1- Many independent programs run serially, each on its own CPU (embarrassingly parallel).
2- One problem is divided into parts, and each part runs on its own CPU.
Factors that affect efficiency:
Load balancing
Communication cost
Ratio of communication cost to computation cost
The number of CPUs
Amount of memory
The chosen algorithm
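The trade-off between computation cost and communication cost can be explored with a toy cost model. The model below is an assumption made for illustration, not a measured result: computation time divides evenly among P processors, while communication time grows linearly with P. Speedup and efficiency follow their usual definitions, S = T(1)/T(P) and E = S/P.

```python
def run_time(n_procs, t_compute, t_comm_per_proc):
    """Hypothetical cost model: computation divides by P, communication grows with P."""
    return t_compute / n_procs + t_comm_per_proc * (n_procs - 1)


def speedup(n_procs, t_compute, t_comm_per_proc):
    """S = T(1) / T(P)."""
    return run_time(1, t_compute, t_comm_per_proc) / run_time(n_procs, t_compute, t_comm_per_proc)


def efficiency(n_procs, t_compute, t_comm_per_proc):
    """E = S / P; 1.0 means every processor is fully used."""
    return speedup(n_procs, t_compute, t_comm_per_proc) / n_procs


def best_processor_count(max_procs, t_compute, t_comm_per_proc):
    """Processor count with the shortest modelled run time."""
    return min(range(1, max_procs + 1),
               key=lambda p: run_time(p, t_compute, t_comm_per_proc))
```

Under this model, adding processors beyond a certain count makes the run slower, so the number of CPUs should be chosen together with the algorithm and the problem size.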
Example: Hexanitroethane, C2N6O12, B3LYP/6-31g(df, pd), single-point calculation
THANKS!