Computational Techniques for Efficient Carbon Nanotube Simulation
  Ashok Srinivasan, Computer Science, Florida State University
  Namas Chandra, Mechanical Engineering, Florida State University
Outline
  - Background
  - Parallel nanotube simulation
  - Nanocomposites
  - Parallelization
  - Conclusions and future work
Background
  Uses of carbon nanotubes
    - Materials
    - NEMS
    - Transistors
    - Displays
    - Etc.
  CNT properties
    - Can span 23,000 miles without failing under its own weight
    - 100 times stronger than steel
    - Many times stiffer than any known material
    - Conducts heat better than diamond
    - Can be a conductor or an insulator without any doping
    - Lighter than a feather
Sequential computation
  Molecular dynamics, using Brenner's potential
    - Short-range interactions
    - Neighbors can change dynamically during the course of the simulation
  Computational scheme
    - Find the force on each particle due to interactions with "close" neighbors
    - Update the position and velocity of each atom
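The computational scheme above can be sketched as a minimal time-stepping loop. This is an illustration under stated assumptions, not the authors' code: a harmonic pair force stands in for Brenner's potential, velocity Verlet stands in for the predictor/corrector integrator, and `CUTOFF`, `k`, `r0`, `dt`, and `mass` are hypothetical values in arbitrary units.

```python
import math

CUTOFF = 1.5  # hypothetical interaction cutoff (arbitrary units)

def build_neighbor_list(pos, cutoff=CUTOFF):
    """Brute-force O(N^2) search for atoms closer than the cutoff."""
    n = len(pos)
    neighbors = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(pos[i], pos[j]) < cutoff:
                neighbors[i].append(j)
                neighbors[j].append(i)
    return neighbors

def compute_forces(pos, neighbors, k=1.0, r0=1.0):
    """Harmonic pair force (a stand-in, NOT Brenner's potential).

    Assumes no two atoms sit at exactly the same position.
    """
    forces = [[0.0, 0.0, 0.0] for _ in pos]
    for i, nbrs in enumerate(neighbors):
        for j in nbrs:
            if j <= i:
                continue  # visit each pair once
            d = [pos[j][c] - pos[i][c] for c in range(3)]
            r = math.sqrt(sum(x * x for x in d))
            mag = k * (r - r0) / r  # positive when stretched: pulls i toward j
            for c in range(3):
                forces[i][c] += mag * d[c]
                forces[j][c] -= mag * d[c]  # Newton's third law
    return forces

def step(pos, vel, dt=0.01, mass=1.0):
    """One velocity-Verlet step, updating pos and vel in place.

    The neighbor list is rebuilt before each force evaluation because
    neighbors can change as the atoms move.
    """
    f_old = compute_forces(pos, build_neighbor_list(pos))
    for i in range(len(pos)):
        for c in range(3):
            pos[i][c] += vel[i][c] * dt + 0.5 * f_old[i][c] / mass * dt * dt
    f_new = compute_forces(pos, build_neighbor_list(pos))
    for i in range(len(pos)):
        for c in range(3):
            vel[i][c] += 0.5 * (f_old[i][c] + f_new[i][c]) / mass * dt
    return pos, vel
```

Rebuilding the neighbor list on every force evaluation keeps the sketch short; a production code would normally rebuild it less often.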
Force computations
  - Pair interactions
  - Bond angles
  - Four body
Profile of execution time
  1: Force
  2: Neighbor list
  3: Predictor/corrector
  4: Thermostatting
  5: Miscellaneous
Profile for force computations
Parallel nanotube simulation
  - Shared memory
  - Message passing
  - Load balancing
Shared memory parallelization
  Do each of the following loops in parallel
    - For each atom i
        - Update forces due to atom i
        - If neighboring atoms are owned by other threads, update an auxiliary array
    - For each thread
        - Collect force terms for atoms it owns
  Srivastava et al., SC-97 and CSE 2001
    - Simulated 10^5 to 10^7 atoms
    - Speedup around 16 on 32 processors
    - Include long-range forces too
    - Lexical decomposition
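The two parallel loops above can be sketched as follows. This is a minimal illustration of the ownership and auxiliary-array idea only, assuming a lexical decomposition into contiguous index blocks, a hypothetical one-dimensional `pair_force`, and Python threads. Python threads do not give a real speedup for this kind of loop, so the sketch shows the data flow rather than the performance.

```python
from concurrent.futures import ThreadPoolExecutor

def owner_ranges(n_atoms, n_threads):
    """Lexical decomposition: contiguous, nearly equal blocks of atom indices."""
    base, extra = divmod(n_atoms, n_threads)
    ranges, start = [], 0
    for t in range(n_threads):
        end = start + base + (1 if t < extra else 0)
        ranges.append(range(start, end))
        start = end
    return ranges

def pair_force(xi, xj):
    """Hypothetical 1-D force on atom i due to atom j (spring of unit rest length)."""
    d = xj - xi
    return d - (1.0 if d > 0 else -1.0)

def parallel_forces(x, neighbors, n_threads=4):
    n = len(x)
    ranges = owner_ranges(n, n_threads)
    forces = [0.0] * n
    # One auxiliary map per thread: atom index -> force owed to a non-owned atom.
    aux = [dict() for _ in range(n_threads)]

    def compute(t):
        owned = ranges[t]
        for i in owned:
            for j in neighbors[i]:
                if j <= i:
                    continue  # each pair is handled by the owner of the lower index
                f = pair_force(x[i], x[j])
                forces[i] += f  # i is owned by this thread, so no conflict
                if j in owned:
                    forces[j] -= f
                else:
                    # j belongs to another thread: record in the auxiliary array
                    aux[t][j] = aux[t].get(j, 0.0) - f

    def collect(t):
        owned = ranges[t]
        for contributions in aux:
            for j, f in contributions.items():
                if j in owned:
                    forces[j] += f  # each thread collects terms for its own atoms

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        list(pool.map(compute, range(n_threads)))  # loop 1; list() waits for all
        list(pool.map(collect, range(n_threads)))  # loop 2, after loop 1 is done
    return forces
```

Because each thread writes only to entries of `forces` that it owns and to its own auxiliary map, neither loop needs locks.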
Message passing parallelization
  Decompose the domain into cells
    - Each cell contains its atoms
    - Assign a set of adjacent cells to each processor
    - Each processor computes values for its cells, communicating with neighbors when their data is needed
  Caglar and Griebel, World Scientific, 1999
    - Simulated 10^8 atoms on up to 512 processors
    - Linear speedup for 160,000 atoms on 64 processors
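The cell decomposition above can be sketched in a few functions. This is a single-process illustration under assumptions: cells are cubes whose side is at least the interaction cutoff, processors own slabs of adjacent cells along x, coordinates are non-negative, and all function names are hypothetical. No actual message passing is performed; `halo_cells` only identifies which remote cells a processor would need to receive.

```python
from collections import defaultdict
from itertools import product

def build_cells(pos, cell_size):
    """Map each integer cell coordinate to the indices of the atoms inside it."""
    cells = defaultdict(list)
    for idx, p in enumerate(pos):
        key = tuple(int(c // cell_size) for c in p)
        cells[key].append(idx)
    return cells

def cell_owner(key, cells_per_proc):
    """Slab decomposition: each rank owns cells_per_proc adjacent x-layers of cells."""
    return key[0] // cells_per_proc

def neighbor_pairs(pos, cells, cutoff):
    """Pairs within the cutoff, searching only a cell and its 26 surrounding cells.

    Correct only if the cell size is at least the cutoff.
    """
    pairs = set()
    for key, atoms in cells.items():
        for off in product((-1, 0, 1), repeat=3):
            nkey = tuple(k + o for k, o in zip(key, off))
            for i in atoms:
                for j in cells.get(nkey, ()):
                    if i < j:
                        d2 = sum((a - b) ** 2 for a, b in zip(pos[i], pos[j]))
                        if d2 < cutoff * cutoff:
                            pairs.add((i, j))
    return pairs

def halo_cells(cells, rank, cells_per_proc):
    """Occupied cells owned by other ranks that touch a cell owned by this rank."""
    halo = set()
    for key in cells:
        if cell_owner(key, cells_per_proc) != rank:
            continue
        for off in product((-1, 0, 1), repeat=3):
            nkey = tuple(k + o for k, o in zip(key, off))
            if nkey in cells and cell_owner(nkey, cells_per_proc) != rank:
                halo.add(nkey)
    return halo
```

In a real message-passing code, the atoms in each rank's halo cells are the data exchanged with neighboring processors before the force computation.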
Load balancing
Nanocomposites
  - Matrix-nanotube interface modeled with springs
  - An extra force term is computed for atoms attached to springs
  - Springs can break, requiring a substantial increase in computations in that region
  [Figure labels: spring, polymer matrix]
Parallelization
  Distributed shared memory
    - Balance the load
    - Ensure locality of data
  Simple lexical approach will result in load imbalance
  Balanced lexical
    - Adjust the domain size
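One way to read "balanced lexical" is to keep contiguous index blocks but move the block boundaries so that each block carries roughly the same load. The sketch below assumes a per-atom load estimate `loads` is available (how it is measured is not specified in the slides) and uses a greedy prefix-sum cut; the function names are hypothetical.

```python
def balanced_lexical(loads, n_parts):
    """Split indices 0..n-1 into n_parts contiguous blocks of roughly equal load.

    Assumes loads is non-empty. A block is closed as soon as the cumulative
    load reaches that block's share of the total.
    """
    total = sum(loads)
    bounds, running, part = [0], 0.0, 1
    for i, w in enumerate(loads):
        running += w
        while part < n_parts and running >= total * part / n_parts:
            bounds.append(i + 1)
            part += 1
    bounds.append(len(loads))
    return [range(bounds[k], bounds[k + 1]) for k in range(n_parts)]

def part_loads(loads, parts):
    """Total load carried by each block."""
    return [sum(loads[i] for i in block) for block in parts]

# Example: the last four atoms are five times as expensive as the first four.
loads = [1, 1, 1, 1, 5, 5, 5, 5]
parts = balanced_lexical(loads, 2)
print(part_loads(loads, parts))  # [14, 10], versus [4, 20] for equal-sized blocks
```

The blocks stay contiguous, so the scheme keeps the simplicity of the lexical approach while adjusting the size of each domain to the load.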
Breadth first search
  - We want to ensure locality too
  Balanced breadth first search
    - Perform a breadth first search on the atom interaction graph
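A possible reading of the balanced breadth first search is sketched below: atoms are visited in BFS order over the interaction graph and assigned to the current partition until it reaches its share of the load. The slides do not give the details, so the cut rule, the `start` atom, and the function names are assumptions. `cut_edges` counts interactions that cross partitions, as a simple proxy for non-local interactions.

```python
from collections import deque

def bfs_order(adj, start=0):
    """Visit order of a breadth first search; restarts cover disconnected parts."""
    seen = [False] * len(adj)
    order = []
    for root in [start] + list(range(len(adj))):
        if seen[root]:
            continue
        seen[root] = True
        queue = deque([root])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if not seen[v]:
                    seen[v] = True
                    queue.append(v)
    return order

def balanced_bfs_partition(adj, loads, n_parts, start=0):
    """Assign atoms to partitions in BFS order, cutting at equal shares of load."""
    total = sum(loads)
    owner = [0] * len(adj)
    running, part = 0.0, 0
    for u in bfs_order(adj, start):
        owner[u] = part
        running += loads[u]
        # Move on once this partition has reached its cumulative share.
        if part < n_parts - 1 and running >= total * (part + 1) / n_parts:
            part += 1
    return owner

def cut_edges(adj, owner):
    """Number of interactions whose two atoms have different owners."""
    return sum(1 for u in range(len(adj)) for v in adj[u]
               if u < v and owner[u] != owner[v])
```

Because neighbors in the graph are visited close together, atoms that interact tend to land in the same partition even when their indices are far apart.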
Other approaches
  Use general purpose software
    - Jostle
    - Metis
    - ParMetis
Experimental parameters
  - Nanotube with 1000 atoms
  - Spring probability: 0.05
  - Probability of a spring breaking in an iteration: 0.01
  - Load increase factor due to spring break: 200
  - Disturbance region depth: 3
  - Number of time steps: 100
Load imbalance
Non-local interactions
Load balancing time
Variation of load with time
Conclusions and future work
  - Neighbor search
  - Parallelization
      - Current approaches appear inadequate
      - General purpose software is too slow
      - Special purpose techniques appear promising
      - Stochastic versions of certain current techniques are possible
  - Multi-scale simulation of nano-composites
      - MD at nano-scale and FEM at larger scale