First-Principles Molecular Dynamics for Petascale Computers
François Gygi, Dept of Applied Science, UC Davis


1 First-Principles Molecular Dynamics for Petascale Computers
François Gygi, Dept of Applied Science, UC Davis (fgygi@ucdavis.edu, http://eslab.ucdavis.edu)
Zhaojun Bai, Dept of Computer Science, UC Davis
Giulia Galli, Dept of Chemistry, UC Davis
Kwan-Liu Ma, Dept of Computer Science, UC Davis
Supported by NSF-ITR-HECURA 0749217

2 The Qbox project
Qbox is a C++/MPI implementation of First-Principles Molecular Dynamics (FPMD)
Qbox includes a quantum mechanical description of electronic structure within Density Functional Theory
Applications to Materials Science, Chemistry, Nanoscience
Software development focuses on large-scale parallelism

3 Qbox code architecture
Diagram components: Qbox, ScaLAPACK/PBLAS, BLACS, MPI, BLAS/ATLAS, XercesC (XML parser), FFTW lib, DGEMM lib
http://eslab.ucdavis.edu/software/qbox

4 Qbox performance results
Electronic structure of a 1000-atom Molybdenum sample, 12,000 electrons, LLNL BlueGene/L
8 k-points: 207.3 TFlop/s (56% of peak)
4 k-points: 187.7 TFlop/s (51% of peak)
1 k-point: 108.8 TFlop/s (30% of peak)
2006 ACM/IEEE Gordon Bell Award for peak performance
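The slide quotes each sustained rate together with its fraction of peak, so the machine peak each pair implies can be recovered by division. A minimal sketch of that arithmetic follows; the names `results` and `implied_peak` are illustrative, and because the percentages are rounded on the slide the three estimates agree only approximately.

```python
# Sustained rate (TFlop/s) and fraction of peak, as quoted on the slide.
results = {
    "8 k-points": (207.3, 0.56),
    "4 k-points": (187.7, 0.51),
    "1 k-point": (108.8, 0.30),
}


def implied_peak(sustained_tflops, fraction_of_peak):
    """Peak rate implied by a sustained rate and its quoted fraction of peak."""
    return sustained_tflops / fraction_of_peak


peaks = {label: implied_peak(*pair) for label, pair in results.items()}
for label, peak in peaks.items():
    print(f"{label}: implied peak of roughly {peak:.0f} TFlop/s")
```

All three rows point to a peak in the range of about 360 to 370 TFlop/s, which is a quick consistency check on the quoted figures.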

5 Current Qbox availability
On TeraGrid platforms
– Mercury, NCSA
– Cobalt, NCSA
– Tungsten, NCSA
– BlueGene/L, SDSC
– IBM p655, SDSC
Other platforms
– ANL BG/L
– ANL BG/P
– NERSC Franklin, Cray XT4
– NCSA Abe

6 New scalable algorithms for electronic structure calculations
One-sided Jacobi simultaneous diagonalization algorithm used in electronic structure calculations
– 64-node dual-dual-core AMD Opteron/Infinipath cluster
– 1 rack ANL BlueGene/L
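The slide names a one-sided Jacobi algorithm but gives no details. As background only, the sketch below shows the classical two-sided Jacobi scheme for jointly diagonalizing real symmetric matrices (in the style of Cardoso and Souloumiac). This is a different, simpler variant and not Qbox's implementation; all function names are illustrative.

```python
import math


def rotate(mats, vecs, p, q, c, s):
    """Apply the Givens rotation G (G[p][p]=G[q][q]=c, G[p][q]=-s, G[q][p]=s):
    each matrix M becomes G^T M G and the accumulated basis V becomes V G."""
    n = len(vecs)
    for m in mats:
        for i in range(n):  # columns p and q: M <- M G
            mp, mq = m[i][p], m[i][q]
            m[i][p], m[i][q] = c * mp + s * mq, -s * mp + c * mq
        for j in range(n):  # rows p and q: M <- G^T M
            mp, mq = m[p][j], m[q][j]
            m[p][j], m[q][j] = c * mp + s * mq, -s * mp + c * mq
    for i in range(n):
        vp, vq = vecs[i][p], vecs[i][q]
        vecs[i][p], vecs[i][q] = c * vp + s * vq, -s * vp + c * vq


def off_norm(mats):
    """Sum of squared off-diagonal entries over all matrices."""
    return sum(m[i][j] ** 2 for m in mats
               for i in range(len(m)) for j in range(len(m)) if i != j)


def joint_diagonalize(matrices, tol=1e-12, max_sweeps=50):
    """Find an orthogonal V that makes every V^T A_k V as diagonal as possible."""
    mats = [[row[:] for row in m] for m in matrices]  # work on copies
    n = len(mats[0])
    vecs = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(max_sweeps):
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                # Accumulate the 2x2 problem for this index pair over all matrices.
                ton = toff = 0.0
                for m in mats:
                    h1 = m[p][p] - m[q][q]
                    h2 = m[p][q] + m[q][p]
                    ton += h1 * h1 - h2 * h2
                    toff += 2.0 * h1 * h2
                # Angle minimizing the off-diagonal weight of the (p, q) block.
                theta = 0.25 * math.atan2(toff, ton)
                if abs(theta) > tol:
                    rotate(mats, vecs, p, q, math.cos(theta), math.sin(theta))
                    rotated = True
        if not rotated:  # a full sweep without rotations: converged
            break
    return vecs, mats


def from_eigenbasis(q, eigenvalues):
    """Build Q diag(eigenvalues) Q^T, used here only to make test matrices."""
    n = len(q)
    return [[sum(q[i][k] * eigenvalues[k] * q[j][k] for k in range(n))
             for j in range(n)] for i in range(n)]


# Two symmetric matrices that share an eigenbasis, hence commute.
basis = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
for (p, q), angle in zip([(0, 1), (1, 2), (0, 2)], [0.3, 0.7, 1.1]):
    rotate([], basis, p, q, math.cos(angle), math.sin(angle))
a = from_eigenbasis(basis, [1.0, 2.0, 3.0])
b = from_eigenbasis(basis, [5.0, -1.0, 2.0])
v, (da, db) = joint_diagonalize([a, b])
print("remaining off-diagonal weight:", off_norm([da, db]))
```

Each rotation touches only two rows and two columns of every matrix, which is the property that makes Jacobi-type methods attractive for parallel machines.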

7 Qbox scalability for nanoscience applications
Electronic structure of a 2260-atom silicon nanowire
Cray-XT4, up to 8k CPUs
Superlinear scaling due to cache effects and size-dependent MPI protocols
86% parallel efficiency between 2k and 8k CPUs
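To put the 86% figure in concrete terms, the sketch below assumes the usual definition of relative parallel efficiency, E = (T_ref × P_ref) / (T × P), and reads "2k" and "8k" as 2048 and 8192 CPUs. Both are assumptions, since the slide gives neither the definition nor the timings. The function names are illustrative.

```python
def relative_efficiency(t_ref, p_ref, t, p):
    """Parallel efficiency at p CPUs relative to a reference run at p_ref CPUs."""
    return (t_ref * p_ref) / (t * p)


def implied_speedup(efficiency, p_ref, p):
    """Speedup over the reference run implied by a given relative efficiency."""
    return efficiency * p / p_ref


speedup = implied_speedup(0.86, 2048, 8192)
print(f"implied speedup from 2k to 8k CPUs: {speedup:.2f}x")  # prints 3.44x
```

Under these assumptions, quadrupling the CPU count shortens the run by a factor of about 3.44 rather than the ideal 4.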

8 Qbox parallel I/O strategy
Advanced functions in MPI-IO (MPI_File_write_shared, etc.) are not supported by all file systems
Qbox uses a strategy based on shared file pointer objects
Achieves >700 MB/s write rate for file sizes of 50–250 GB

platform      #tasks   write speed
Cray-XT4      2048     778 MB/s
Cray-XT4      4096     715 MB/s
Cray-XT4      8192     687 MB/s
BG/P (ANL)    2048     814 MB/s
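The slide does not describe how Qbox implements its shared file pointer objects. As a rough illustration of the general idea only, the sketch below uses Python threads in place of MPI tasks: one shared offset is advanced atomically, and each writer then fills the byte range it reserved. `SharedFilePointer`, `write_block`, and `run_demo` are hypothetical names, and nothing here is Qbox code or an MPI-IO call.

```python
import os
import tempfile
import threading


class SharedFilePointer:
    """One offset shared by all writers; each reservation advances it."""

    def __init__(self):
        self._offset = 0
        self._lock = threading.Lock()

    def reserve(self, nbytes):
        # Atomically hand out the current offset and move past the block.
        with self._lock:
            start = self._offset
            self._offset += nbytes
            return start


def write_block(path, pointer, payload, log):
    start = pointer.reserve(len(payload))
    # Each writer opens the file itself and touches only its reserved range.
    with open(path, "r+b") as f:
        f.seek(start)
        f.write(payload)
    log.append((start, payload))


def run_demo(num_tasks=8):
    fd, path = tempfile.mkstemp()
    os.close(fd)
    pointer = SharedFilePointer()
    log = []
    # Writer i contributes 10 + i copies of one distinct byte.
    payloads = [bytes([65 + i]) * (10 + i) for i in range(num_tasks)]
    threads = [threading.Thread(target=write_block, args=(path, pointer, p, log))
               for p in payloads]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    with open(path, "rb") as f:
        data = f.read()
    os.remove(path)
    return data, log


data, log = run_demo()
print(len(data), "bytes written by", len(log), "writers")
```

The order of blocks in the file depends on which writer reserves first, but the blocks never overlap, which is the property a shared file pointer provides.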

9 Analysis of MPI message traffic patterns in Qbox
Multiple traffic patterns are involved during a Qbox simulation
– physics kernels
– 3D Fourier transforms
– ScaLAPACK linear algebra
Logical-to-physical mapping of tasks has a large impact on performance on large platforms (> 4k CPUs)
We are developing instrumentation and visualization tools to analyze message traffic patterns on various interconnect architectures
Figure: Mapping of 65536 MPI tasks on the 32x32x64 torus of the LLNL BG/L
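To see why the logical-to-physical mapping matters, the sketch below places 65536 ranks on a 32x32x64 torus and counts the links between two ranks. The placement rule (one task per node, x varying fastest) is an assumption chosen for illustration, not the mapping used on the LLNL machine or in Qbox, and the function names are hypothetical.

```python
DIMS = (32, 32, 64)  # torus dimensions quoted on the slide


def rank_to_coords(rank, dims=DIMS):
    """Assumed default mapping: x varies fastest, then y, then z."""
    nx, ny, _ = dims
    return rank % nx, (rank // nx) % ny, rank // (nx * ny)


def torus_hops(a, b, dims=DIMS):
    """Minimum number of links between two nodes, with wraparound per dimension."""
    return sum(min(abs(p - q), n - abs(p - q)) for p, q, n in zip(a, b, dims))


def hops_between_ranks(r1, r2, dims=DIMS):
    return torus_hops(rank_to_coords(r1, dims), rank_to_coords(r2, dims), dims)


for other in (1, 16, 31, 32, 65535):
    print(f"rank 0 to rank {other}: {hops_between_ranks(0, other)} hops")
```

With this mapping, ranks 0 and 16 are 16 hops apart, while ranks 0 and 65535 are only 3 hops apart thanks to the wraparound links. Distance in rank space therefore says little about distance on the network, and a different mapping changes which communication patterns stay local.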

10 Analysis of MPI message traffic patterns in Qbox
Figure: Screenshot of the message traffic visualization tool showing MPI calls in a ScaLAPACK matrix multiplication (C. Muelder, K-L Ma, UCDavis)

11 Qbox current developments
Deployment on TeraGrid track-2 platforms
Applications to Nanoscience simulations
– G. Galli, Chemistry, UCDavis
Specialized linear algebra algorithms
– Z. Bai, Computer Science, UCDavis
Visualization
– K-L. Ma, Computer Science, UCDavis
Application-specific data compression algorithms
Large dataset management (10^10 – 10^12 bytes)
XML standards for electronic structure data (http://www.quantum-simulation.org)
Supported by NSF-ITR-HECURA 0749217
http://eslab.ucdavis.edu

