Neural Networks for Quantum Simulation
Brian Barch
Problem: Quantum Simulation
- Want to find the motion of atoms in a system
- Done by calculating the energy, then finding the forces
- Classically easy, but not for quantum wavefunctions
- Want to learn a function from atomic positions to system energy: a regression problem
[Image: poliovirus, used in a 10^8-atom simulation of infection]
Hydrogen Dataset
- Configurations of hydrogen atoms and associated energies, produced by iterative simulation methods
- 8 sets, 3000 samples each
- Each sample: 54 H atom positions in (x, y, z) and total energy
- Preprocessing:
  - Previously: calculated inter-atomic distances and angles, and from there symmetry function values; normalized the results
  - Currently: just inter-atomic distances and angles, not normalized
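As a rough illustration of the distance part of this preprocessing (a minimal numpy sketch; the 54-atom sample shape comes from the dataset above, everything else is my assumption):

```python
import numpy as np

def pairwise_distances(positions):
    """positions: (n_atoms, 3) array of H atom coordinates.
    Returns an (n_atoms, n_atoms) matrix of inter-atomic distances."""
    diff = positions[:, None, :] - positions[None, :, :]  # (n, n, 3) displacement vectors
    return np.sqrt((diff ** 2).sum(axis=-1))

# One sample from the dataset: 54 hydrogen atoms in (x, y, z)
sample = np.random.rand(54, 3)  # stand-in for real data
D = pairwise_distances(sample)  # feeds the symmetry function calculation
```

Angles would be computed analogously from triples of atoms.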
Base Model: Atomic NN structure
- Describe the environment of each atom with symmetry functions
- An atomic NN uses this vector to predict the energy of that atom
- Atomic NNs share weights
[Diagram: atom positions -> preprocess -> separate NNs with shared weights -> total energy]
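A minimal sketch of this shared-weight structure, assuming already-computed symmetry function inputs (the framework, layer widths, and the count of 8 symmetry functions are my assumptions, not the slides'):

```python
from keras.models import Model
from keras.layers import Input, Dense, TimeDistributed, Lambda
import keras.backend as K

n_atoms, n_sym = 54, 8  # 54 H atoms per sample; sym. func. count assumed

# Per-atom symmetry function vectors go through one shared sub-network
sym_in = Input(shape=(n_atoms, n_sym))
h = TimeDistributed(Dense(32, activation='tanh'))(sym_in)  # same weights for every atom
e = TimeDistributed(Dense(1))(h)                           # per-atom energy contributions
total = Lambda(lambda t: K.sum(t, axis=1),
               output_shape=(1,))(e)                       # total energy = sum over atoms
model = Model(sym_in, total)
model.compile(optimizer='adam', loss='mse')
```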
New Model: Atomic NN structure
- Reformulated the atomic NN structure as a convolution (see the sketch below)
- Used 1D convolutional layers, with atoms as the input dimension and symmetry function values as the channels
- Rather than preprocessing, the first layer calculates symmetry function values from distances and angles
- The second is a batch normalization layer
- Allows application to arbitrary numbers of atoms, plus GPU CNN optimizations
- Con: much slower on complicated symmetry functions
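A sketch of the convolutional reformulation (the symmetry function layer itself is sketched a few slides down; here a plain Input of already-computed values stands in for it, and layer widths are my assumptions):

```python
from keras.models import Model
from keras.layers import Input, Conv1D, BatchNormalization, Lambda
import keras.backend as K

n_sym = 8  # symmetry function values per atom (assumed count)

# Channels = symmetry function values; the atom axis is the "spatial" dim.
# Using None for the atom axis lets the same model run on any atom count.
g_in = Input(shape=(None, n_sym))
x = BatchNormalization()(g_in)               # the model's second layer
x = Conv1D(32, 1, activation='tanh')(x)      # width-1 convs == shared atomic NNs
e = Conv1D(1, 1)(x)                          # per-atom energies
total = Lambda(lambda t: K.sum(t, axis=1),
               output_shape=(1,))(e)         # sum of atomic energies
model = Model(g_in, total)
model.compile(optimizer='adam', loss='mse')
```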
[Diagram: base model (preprocess -> atomic NNs -> sum of atomic energies) vs. new model (distances D and angles θ -> sym. func. layer -> batch normalization -> conv layers over atoms -> sum of atomic energies)]
Base Model: Symmetry Functions
- Represent the atomic environment in a form invariant to changes that don't affect total energy (e.g. translations and rotations)
- Currently manually selected and used to preprocess data
  - NNs are trained on saved sets of symmetry function values
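For context, the standard Behler-Parrinello forms (the slides don't specify the exact set used, so take these as representative) are a radial function over pairwise distances and an angular function over atom triples, where f_c is a smooth cutoff function:

```latex
% Radial: sum over neighbors j of atom i (eta, R_s are the tunable parameters)
G_i^{\mathrm{rad}} = \sum_{j \neq i} e^{-\eta (R_{ij} - R_s)^2} f_c(R_{ij})

% Angular: triple sum over neighbor pairs (j, k); zeta, lambda, eta tunable
G_i^{\mathrm{ang}} = 2^{1-\zeta} \sum_{j \neq i} \sum_{k \neq i,j}
    (1 + \lambda \cos\theta_{ijk})^{\zeta}
    \, e^{-\eta (R_{ij}^2 + R_{ik}^2 + R_{jk}^2)}
    \, f_c(R_{ij}) \, f_c(R_{ik}) \, f_c(R_{jk})
```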
New Model: Symmetry Functions
- Calculated per batch by the first layer (sketched below)
  - Made possible by the miracle of Theano tensor broadcasting
- Slower than preprocessing, but... allows training of sym. func. parameters
  - No need to rely on hand-picked values anymore
  - Still need to hand-pick initial values for now
  - Used Keras' built-in analytic gradient tools
- Once good values are found, they can be used for preprocessing
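A sketch of what such a first layer could look like for the radial case (the class name RadialSymFunc, initializers, and the omission of the cutoff function are all my simplifications; broadcasting does the per-batch work the slide describes):

```python
from keras.layers import Layer
import keras.backend as K

class RadialSymFunc(Layer):
    """Turns pairwise distances into radial symmetry function values,
    with eta and R_s as trainable weights. Broadcasting expands
    (batch, atoms, atoms, 1) distances against (n_funcs,) parameters."""
    def __init__(self, n_funcs, **kwargs):
        self.n_funcs = n_funcs
        super(RadialSymFunc, self).__init__(**kwargs)

    def build(self, input_shape):
        # Initial values still hand-picked; training refines them
        self.eta = self.add_weight(name='eta', shape=(self.n_funcs,),
                                   initializer='ones', trainable=True)
        self.R_s = self.add_weight(name='R_s', shape=(self.n_funcs,),
                                   initializer='zeros', trainable=True)
        super(RadialSymFunc, self).build(input_shape)

    def call(self, D):
        # D: (batch, atoms, atoms) distance matrix; for brevity this skips
        # the cutoff function f_c and masking of the j = i diagonal
        d = K.expand_dims(D, -1)                       # (batch, atoms, atoms, 1)
        g = K.exp(-self.eta * K.square(d - self.R_s))  # broadcast over n_funcs
        return K.sum(g, axis=2)                        # (batch, atoms, n_funcs)

    def compute_output_shape(self, input_shape):
        return (input_shape[0], input_shape[1], self.n_funcs)
```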
New Model: Results
- The goal was less to optimize parameters than to apply a new NN structure to this problem
- Angle-based symmetry functions require triple-sum terms
  - Too expensive to compute on the fly on my computer
  - Haven't had a chance to run on the cluster yet
- Current results use a subset of symmetry functions, so they aren't comparable to previous results
- That being said...
Results: trainable vs. static parameters
- Trainable symmetry function parameters had a purely beneficial effect on accuracy
- Reduced MSE by ~10%, faster training, less overfitting
[Plot: training curves, static sym. funcs vs. trainable sym. funcs]
Results continued
- Improved accuracy most where accuracy was worst, e.g. for worse hyperparameters
- Parameters often remained near their initial values
  - Likely because the other weights had optimized for those parameters
- Other times there was significant deviation, e.g. multiple parameters converging to nearly the same value
- Future work will focus on avoiding such redundancies
  - Tried L1 regularization, but it was insufficient (see the sketch below)
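The L1 attempt mentioned above might look something like this drop-in change to the earlier RadialSymFunc sketch (the penalty strength 1e-4 is a guess; L1 pushes parameter magnitudes toward zero rather than directly separating duplicates, which may be why it was insufficient here):

```python
from keras import regularizers

# Replacement for the eta weight in RadialSymFunc.build:
# attach an L1 penalty so redundant symmetry function
# parameters are shrunk rather than left to duplicate each other
def build(self, input_shape):
    self.eta = self.add_weight(name='eta', shape=(self.n_funcs,),
                               initializer='ones', trainable=True,
                               regularizer=regularizers.l1(1e-4))
    ...
```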
Lessons and Challenges
From checkpoint 1:
- Visualizing can be challenging but is very important
New lessons:
- Implementation efficiency is surprisingly important, because it allows testing on the fly during development
- Linux is much easier to run code in than Windows 10
New challenges:
- Interpreting whether trained sym. func. parameters are meaningful
- Making angle-based sym. funcs efficient enough to run