Quantum Simulation Neural Networks

Presentation transcript:

Quantum Simulation Neural Networks
Brian Barch

Problem: Molecular Dynamics
[Figures: examples of molecular dynamics results - a snapshot of melting water from an NN-based simulation, and a poliovirus used in a 10^8-atom simulation of infection]
- We want to simulate the motion or the distributions of atoms.
- This is done by constructing a potential energy surface (PES), which predicts the energy from the atomic positions; the forces on the atoms are then obtained by differentiating the energy.
- The PES cannot be computed analytically because of the electron wavefunctions.
- Traditionally it is approximated with iterative methods, e.g. density functional theory (DFT) or quantum chemistry methods, which trade accuracy against computation time.
- These can be augmented with neural networks, which predict accurately and very quickly once trained.
- Neural network goal: predict the energy from the atomic positions.

Dataset: Atomic hydrogen
- 8 datasets at different temperatures and pressures, with ~3000 configurations of hydrogen atoms each.
- Each configuration: x, y, z coordinates for 54 atoms plus the total energy.
- As neural-network PES datasets go, this is very high dimensional.

Preprocessing
- Calculated interatomic distances and angles between atoms.
- Baseline model: these were used to compute symmetry function values, which were then normalized and saved.
- New model: only the distances and angles were preprocessed.

Symmetry functions
- Represent the atomic environment in a form that is invariant under changes that should not affect the energy, e.g. rotation, translation of the whole system, and atom index swaps.
- Differentiable but highly nonlinear. Because each symmetry function type has its own functional form and parameters, the full set cannot be written as a single vector-valued function (which would make things much easier).
- A cutoff function restricts them to the local atomic environment, which reduces the scaling of the cost with system size.
- Baseline model (as used in the literature): parameters are picked by hand, kept as static hyperparameters, and used to preprocess the data.

Atomic Neural Network Structure
[Diagram of the baseline NN structure: preprocessing -> symmetry functions -> separate atomic NNs that share weights -> sum of energies]
- Each atom is described by a vector of symmetry function values.
- Baseline model: the list of symmetry-function vectors is the input to the NN; an individual atomic NN is trained on each vector, and the atomic NNs share weights.
- The total energy is the sum of the outputs of the atomic NNs.

Project goal
- Design and implement trainable symmetry functions, i.e. turn a hyperparameter into an ordinary trainable parameter.
- To my knowledge, there are no papers even suggesting this.

Symmetry function types
[Figure: effects of the parameters of G2 - activation of symmetry function G2 for two atoms as a function of distance, for different parameter values]
- The cutoff function is used inside the other symmetry functions but is not a symmetry function itself.

New convolutional neural network model
[Diagram of the new model: distances D and angles θ -> symmetry function layer -> convolutional layers -> batch normalization -> sum of atomic energies Ê]

Changes to model
- Replaced the atomic NNs with 1D convolutional layers, with the atom index as the convolved dimension. This is functionally the same as the baseline but more efficient, and it allows GPU optimization (e.g. cuDNN). Different symmetry functions occupy different channels.
- The model takes distances and angles as input and applies a symmetry function layer to them.
- A batch normalization layer replaces the pre-normalized data.
- The final layer is the same as in the baseline model.

Trainable symmetry functions
- The symmetry function layer computes symmetry function values with the parameters stored as Theano tensors (a minimal sketch follows this list).
- This is more general, since the data no longer have to be preprocessed into symmetry function values.
- It allows backpropagation onto the symmetry function parameters, exactly as is done for the weights.
- It made the NN train more slowly, but once good symmetry function parameters are found they can be reused for preprocessing to avoid this cost.
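As a minimal sketch of how such a layer can be wired up, the snippet below implements a trainable radial symmetry function in Theano, assuming the standard Behler-Parrinello G2 form G2_i = sum_{j != i} exp(-eta (R_ij - Rs)^2) fc(R_ij) with a cosine cutoff fc. The parameter values, cutoff radius, learning rate, and the toy MSE loss are illustrative assumptions, not taken from the project code.

```python
import numpy as np
import theano
import theano.tensor as T

# Pairwise distance matrix for one configuration: shape (n_atoms, n_atoms).
R = T.matrix('R')

# Hypothetical trainable G2 parameters (Behler notation assumed):
# one (eta, Rs) pair per symmetry-function channel.
n_funcs = 8
eta = theano.shared(np.random.uniform(0.5, 4.0, n_funcs), name='eta')
Rs  = theano.shared(np.random.uniform(0.5, 2.0, n_funcs), name='Rs')
Rc  = 6.0  # cutoff radius; fixed, not trained

# Cosine cutoff fc(R) = 0.5*(cos(pi*R/Rc) + 1) inside Rc, zero outside.
fc = 0.5 * (T.cos(np.pi * R / Rc) + 1.0) * (R < Rc)
fc = fc * (1.0 - T.identity_like(R))  # drop the j = i self term

# G2_i = sum_{j != i} exp(-eta * (R_ij - Rs)^2) * fc(R_ij)
# Broadcast to (n_atoms, n_atoms, n_funcs) and sum over the neighbour axis j.
diff = R.dimshuffle(0, 1, 'x') - Rs.dimshuffle('x', 'x', 0)
G2 = T.sum(T.exp(-eta.dimshuffle('x', 'x', 0) * diff ** 2)
           * fc.dimshuffle(0, 1, 'x'), axis=1)          # (n_atoms, n_funcs)

# Toy target and loss, only to make the snippet self-contained; in the real
# model G2 feeds the shared-weight atomic networks and the loss is energy MSE.
target = T.matrix('target')
loss = T.mean((G2 - target) ** 2)

# Backpropagation onto the symmetry-function parameters, just as for weights.
g_eta, g_Rs = T.grad(loss, [eta, Rs])
lr = 0.01
train_step = theano.function(
    [R, target], loss,
    updates=[(eta, eta - lr * g_eta), (Rs, Rs - lr * g_Rs)])
```

Calling train_step(distances, targets) repeatedly moves eta and Rs by gradient descent; in the actual model the symmetry-function outputs feed the shared-weight atomic networks, and these parameter updates happen alongside the ordinary weight updates.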
Results
- The goal was to make trainable symmetry functions, and this succeeded.
- The new convolutional NN structure greatly increased the training and prediction speed of the NN through GPU optimization.
- G1 through G3 could be implemented quickly, but G4 and G5 use angles and therefore require a triple sum over all atoms. This causes a massive slowdown in the forward pass, and an even larger one during backpropagation.
- The baseline results relied heavily on G4 and G5, the most complex symmetry functions, so the new model could not beat my best results with the baseline model. It did, however, beat the results of the model used by researchers from LBL on this problem.
- Compared with the same symmetry functions left untrained, the trained ones always performed better: lower MSE, faster training, and less overfitting. The best case gave roughly a 10% MSE decrease, and the improvement was larger with worse hyperparameters.
- The trained symmetry function parameters were unpredictable: sometimes they stayed near their initialization, other times several converged to the same value or crossed over. I am unsure what to make of this.
[Figure: loss during training for a NN with static symmetry function parameters (left) and with trainable symmetry function parameters (right). Training MSE (red) and validation MSE (blue) are measured in electronvolts. 8 each of G1, G2, and G3 were used.]

Future directions
- Focus on increasing the efficiency of the implementation and training of the complex symmetry functions.
- I plan to represent symmetry functions not as a single complex layer but as a combination of multiple simple layers (add, multiply, exponent, etc.); an illustrative decomposition is sketched after this list.
- I will also focus on making sure the NNs learn useful symmetry function parameters, by penalizing redundancies and by optimizing training with an evolutionary neural network structure I built previously (based on MPANN).
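To make the proposed decomposition concrete, here is a small NumPy sketch that builds G2 out of elementary operations (shift, square, scale, exponentiate, mask, sum), each of which could become its own simple layer. The helper names and the Behler-style G2 form are assumptions for illustration, not the planned implementation.

```python
import numpy as np

def cutoff(R, Rc=6.0):
    """Cosine cutoff fc(R): smooth decay to zero at the cutoff radius Rc."""
    return 0.5 * (np.cos(np.pi * R / Rc) + 1.0) * (R < Rc)

# Elementary building blocks; each could become its own simple layer.
def shift(R, Rs):      return R - Rs           # subtract
def square(x):         return x ** 2           # power
def scale(x, eta):     return -eta * x         # multiply by a constant
def exponent(x):       return np.exp(x)        # exponentiate
def neighbour_sum(x):  return x.sum(axis=1)    # sum over neighbours j

def g2(R, eta, Rs, Rc=6.0):
    """G2_i = sum_{j != i} exp(-eta (R_ij - Rs)^2) fc(R_ij), composed from the pieces above."""
    mask = 1.0 - np.eye(R.shape[0])            # exclude the j = i self term
    return neighbour_sum(exponent(scale(square(shift(R, Rs)), eta))
                         * cutoff(R, Rc) * mask)
```

The hope, as described above, is that assembling G4 and G5 from the same primitives would make the angular triple sum easier to implement and optimize.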