Deep learning enhanced Markov State Models (MSMs)


Deep learning enhanced Markov State Models (MSMs) Wei Wang Feb 20, 2019

Outline General protocol of building MSM Challenges with MSM VAMPnets Time-lagged auto-encoder

Revisiting the protocol of building an MSM

Building an MSM needs a lot of expertise in both biology and machine learning. Wang, Cao, Zhu, Huang, WIREs Comput. Mol. Sci., e1343 (2017)

Criterion to choose a model: slowest dynamics. Choose the MSM that best captures the slowest transitions of the system. Wang, Cao, Zhu, Huang, WIREs Comput. Mol. Sci., e1343 (2017)

Choose the model with the slowest transitions (implied timescales, in μs). Da, Pardo, Xu, Zhang, Gao, Wang, Huang, Nature Communications, 7, 11244 (2016)
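
A small illustration of where such timescales come from may help here: the implied timescales follow from the eigenvalues of the transition probability matrix, t_i = -tau / ln(lambda_i). The 3-state matrix and lag time below are made up purely to show the formula, not taken from the cited work.

```python
import numpy as np

tau = 10.0                                   # lag time (arbitrary units)
T = np.array([[0.90, 0.08, 0.02],            # hypothetical 3-state TPM, rows sum to 1
              [0.05, 0.90, 0.05],
              [0.02, 0.08, 0.90]])

eigvals = np.sort(np.linalg.eigvals(T).real)[::-1]
timescales = -tau / np.log(eigvals[1:])      # skip the stationary eigenvalue lambda_1 = 1
print(timescales)                            # slowest process first
```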

Performing this cumbersome search: propose good clustering algorithms and features, then run the parametric search with good strategies (e.g., Osprey: http://msmbuilder.org/osprey/1.1.0)

Challenges: the parametric space is too large: Collective Variable (CV) selection. Need to propose good features. http://homepages.laas.fr/jcortes/algosb13/sutto-ALGO13-META.pdf

Challenges: the parametric space is too large: CV selection. Need to propose good features; poor features will worsen the clustering stage (figure panels: ground truth vs. tICA). Wehmeyer and Noe, J. Chem. Phys. 148, 241703 (2018)

Challenges: the parametric space is too large: clustering. Zhang et al., Methods in Enzymology, 578, 343-371 (2016)

Essence of these operations: linearly/nonlinearly transform the protein configurations into state vectors, $\mathbf{x}(t) \in \mathbb{R}^{3N} \rightarrow (p_1, p_2, \ldots, p_M)$ with $\sum_{j=1}^{M} p_j = 1$, e.g., (1, 0, 0, 0) or (0, 0, 1, 0). Husic and Pande, J. Am. Chem. Soc. 2018, 140, 2386-2396
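
As a minimal sketch of this mapping (random numbers stand in for real MD coordinates, and the array sizes are made up), a clustering step followed by one-hot encoding yields exactly such state vectors:

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical trajectory: 2000 frames of a system with N = 50 atoms,
# flattened to 3N-dimensional configuration vectors x(t).
n_frames, n_atoms = 2000, 50
X = np.random.rand(n_frames, 3 * n_atoms)     # stand-in for real MD coordinates

# Clustering stage of the MSM pipeline: assign each frame to one of M states.
M = 4
labels = KMeans(n_clusters=M, n_init=10, random_state=0).fit_predict(X)

# Hard (crisp) state vectors: one-hot rows (p_1, ..., p_M) with sum_j p_j = 1,
# e.g. (1, 0, 0, 0) or (0, 0, 1, 0) as on the slide.
P = np.eye(M)[labels]
assert np.allclose(P.sum(axis=1), 1.0)
```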

Deep learning can greatly help: it is powerful. In the mathematical theory of artificial neural networks, the universal approximation theorem states that a feed-forward network with a single hidden layer containing a finite number of neurons can approximate continuous functions on compact subsets of $\mathbb{R}^n$, under mild assumptions on the activation function. Deep learning has been widely applied in numerous fields (e.g., an image classifier outputting Dog: 0.99, Cat: 0.01). https://en.wikipedia.org/wiki/Universal_approximation_theorem

Deep learning can greatly help MSMs: just as an image classifier outputs class probabilities (Dog: 0.99, Cat: 0.01), a network can output macrostate membership probabilities (Macro1: 0.990, Macro2: 0.005, Macro3: 0.005).

Outline General protocol of building MSM Challenges with MSM VAMPnets Time-lagged auto-encoder

VAMPnets for deep learning of molecular kinetics. VAMPnets employ the variational approach for Markov processes (VAMP) to build a deep learning framework for molecular kinetics: a neural network encodes the entire mapping from molecular coordinates to Markov states (coordinates → state vector), combining the whole data-processing pipeline into a single end-to-end framework. The training objective is related to the implied timescale plot and is maximized. Noe et al., Nature Communications 9, 5 (2018)

Understanding VAMPnets: the basic structure of a neural network, and what the VAMP score is.

Basic structure of a neural network

Forward propagation: where can we get the weights?
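
A minimal numpy sketch of forward propagation through a single-hidden-layer network; the layer sizes and random weights are hypothetical, chosen only to show the data flow:

```python
import numpy as np

def forward(x, W1, b1, W2, b2):
    """Forward pass: x -> h = tanh(W1 x + b1) -> p = softmax(W2 h + b2)."""
    h = np.tanh(W1 @ x + b1)              # hidden-layer activations
    z = W2 @ h + b2                       # output logits
    e = np.exp(z - z.max())               # numerically stable softmax
    return e / e.sum(), h

# Hypothetical sizes: 30 inputs, 16 hidden units, 3 output probabilities.
rng = np.random.default_rng(0)
W1, b1 = 0.1 * rng.standard_normal((16, 30)), np.zeros(16)
W2, b2 = 0.1 * rng.standard_normal((3, 16)), np.zeros(3)
p, _ = forward(rng.standard_normal(30), W1, b1, W2, b2)   # p sums to 1
```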

Backpropagation to update the weights. Define an objective function $\epsilon = \sum_i (y_{\text{true}} - y_{\text{pred}})^2$; the weights are updated by stepping along the negative gradient (steepest-descent) direction. http://www.saedsayad.com/images/ANN_4.png
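
To make the update rule concrete, here is a sketch of gradient descent on the same squared-error objective for a toy linear model (the data and learning rate are invented); in a deep network, the chain rule propagates this gradient backwards through every layer:

```python
import numpy as np

# Toy regression data (hypothetical): y_true = X @ w_true + noise.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 5))
w_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y_true = X @ w_true + 0.01 * rng.standard_normal(200)

# Objective: eps = sum_i (y_true_i - y_pred_i)^2, minimized by gradient descent.
w, lr = np.zeros(5), 1e-3
for step in range(500):
    y_pred = X @ w
    grad = -2.0 * X.T @ (y_true - y_pred)    # d(eps)/dw
    w -= lr * grad                           # step in the negative gradient direction
print(np.round(w, 2))                        # close to w_true
```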

Backpropagation to update the weights https://independentseminarblog.files.wordpress.com/2017/12/giphy.gif

Backpropagation to update the weights. In VAMPnets, the objective function is the VAMP-2 score rather than the squared error $\epsilon = \sum_i (y_{\text{true}} - y_{\text{pred}})^2$; the weights are still updated along the gradient direction. http://www.saedsayad.com/images/ANN_4.png

VAMP-2 score: objective function. $\chi(x)$: state vector, e.g., $\chi(x) = (0, 1, 0)$ if $x$ belongs to state 2. Noe et al., Nature Communications 9, 5 (2018)

VAMP-2 score: related to the TPM. The score equals the sum of the squared eigenvalues of $T(\tau)$, $\sum_i \lambda_i(\tau)^2$; it is related to the implied timescale plot, and we want to maximize it. $\chi(x)$: state vector, e.g., $\chi(x) = (0, 1, 0)$ if $x$ belongs to state 2. Noe et al., Nature Communications 9, 5 (2018)
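
A sketch of one common formulation of the VAMP-2 score, computed directly from featurized frame pairs. Implementations differ in details such as mean removal, regularization, and whether the constant function is counted; the random soft assignments below are only placeholders:

```python
import numpy as np

def vamp2_score(chi_t, chi_tau, eps=1e-10):
    """VAMP-2 score of featurized frame pairs (chi(x_t), chi(x_{t+tau})).

    One common formulation: || C00^{-1/2} C0t Ctt^{-1/2} ||_F^2, i.e. the sum
    of squared singular values of the whitened time-lagged covariance matrix.
    """
    n = chi_t.shape[0]
    c00 = chi_t.T @ chi_t / n          # instantaneous covariance at time t
    ctt = chi_tau.T @ chi_tau / n      # instantaneous covariance at time t + tau
    c0t = chi_t.T @ chi_tau / n        # time-lagged covariance

    def inv_sqrt(c):
        vals, vecs = np.linalg.eigh(c)
        vals = np.maximum(vals, eps)
        return vecs @ np.diag(vals ** -0.5) @ vecs.T

    k = inv_sqrt(c00) @ c0t @ inv_sqrt(ctt)
    return np.sum(k ** 2)              # squared Frobenius norm

# Hypothetical soft state assignments for 1000 frame pairs and 3 states.
rng = np.random.default_rng(0)
chi_t = rng.dirichlet(np.ones(3), size=1000)
chi_tau = rng.dirichlet(np.ones(3), size=1000)
print(vamp2_score(chi_t, chi_tau))
```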

VAMPnets: example on alanine dipeptide. Input: xyz coordinates of the 10 heavy atoms; output: 6 membership probabilities (trying to lump into 6 states). Noe et al., Nature Communications 9, 5 (2018)
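
A sketch of the kind of "lobe" network this implies, written in PyTorch; the hidden-layer widths and activation are illustrative choices, not the architecture reported in the paper:

```python
import torch
import torch.nn as nn

# 30 inputs (xyz of 10 heavy atoms) -> softmax over 6 metastable states.
lobe = nn.Sequential(
    nn.Linear(30, 100), nn.ELU(),
    nn.Linear(100, 100), nn.ELU(),
    nn.Linear(100, 6), nn.Softmax(dim=1),
)

# The same network is applied to frames at time t and t + tau; the resulting
# soft assignments chi(x_t), chi(x_{t+tau}) feed the VAMP-2 objective, which
# is maximized (its negative is minimized) during training.
x_t = torch.randn(128, 30)            # a hypothetical mini-batch of frames
chi_t = lobe(x_t)                      # each row is a probability vector (sums to 1)
```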

VAMPnets: example on alanine dipeptide. Visualizing the outputs (soft assignments). Once we have the state vectors, we can calculate the TPM and obtain the kinetics. Noe et al., Nature Communications 9, 5 (2018)
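
A sketch of that last step with a simple (non-reversible) estimator, T(tau) = C00^-1 C0t; with hard one-hot assignments this reduces to the row-normalized transition-count matrix, and the random labels below are placeholders:

```python
import numpy as np

def estimate_tpm(chi_t, chi_tau):
    """Estimate a transition probability matrix from (soft) state assignments.

    Simple estimator: T(tau) = C00^{-1} C0t, with C00 = chi_t^T chi_t / n and
    C0t = chi_t^T chi_tau / n. Production MSM codes add symmetrization /
    reversibility constraints, which are omitted here.
    """
    n = chi_t.shape[0]
    c00 = chi_t.T @ chi_t / n
    c0t = chi_t.T @ chi_tau / n
    return np.linalg.solve(c00, c0t)

# Hypothetical hard assignments of 5000 frame pairs to 3 states.
rng = np.random.default_rng(0)
labels = rng.integers(0, 3, size=5001)
chi = np.eye(3)[labels]
T = estimate_tpm(chi[:-1], chi[1:])
print(T, T.sum(axis=1))            # rows sum to 1; kinetics follow from eig(T)
```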

Comparison with the traditional way to build an MSM. Advantages: no need to worry about choosing features for tICA or the clustering algorithms; the inputs are simply aligned trajectories; the variationally optimal model is found automatically. Disadvantages: easy to overfit the data; easy to get trapped in a local optimum. (Example: alanine dipeptide.) Noe et al., Nature Communications 9, 5 (2018)

Outline General protocol of building MSM Challenges with MSM VAMPnets Time-lagged auto-encoder

Other applications of deep learning in MSMs: CVs. Improve PCA/tICA through a nonlinear transformation trained by a (time-lagged) auto-encoder. PCA/tICA find the linear directions that maximize the variance / time-lagged covariance.

PCA: minimizing reconstruction error http://alexhwilliams.info/itsneuronalblog/2016/03/27/pca/

PCA: the linear version of an auto-encoder (original data → reconstructed data). Wehmeyer and Noe, J. Chem. Phys. 148, 241703 (2018)
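
A short numpy sketch of this equivalence on toy data: encoding with the top-k principal directions and decoding with their transpose is a linear auto-encoder, and its squared reconstruction error is exactly what PCA minimizes:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 10)) @ rng.standard_normal((10, 10))  # toy correlated data
Xc = X - X.mean(axis=0)                                             # center the data

k = 2
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
E = Vt[:k].T                      # "encoder" weights (10 -> k principal directions)
D = Vt[:k]                        # "decoder" weights (k -> 10)

X_rec = Xc @ E @ D                # reconstructed (centered) data
err = np.sum((Xc - X_rec) ** 2)   # the reconstruction error PCA minimizes
print(err)
```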

Improving tICA using a time-lagged auto-encoder: in tICA, the encoder E and decoder D are constant (linear) matrices; the network reconstructs the next frame from the current frame. Wehmeyer and Noe, J. Chem. Phys. 148, 241703 (2018)

Improving tICA using a time-lagged auto-encoder (lag time $\tau = 3$): in tICA, the encoder E and decoder D are constant (linear) matrices. Wehmeyer and Noe, J. Chem. Phys. 148, 241703 (2018)
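
A PyTorch sketch of the idea, assuming a made-up trajectory and illustrative layer sizes (not those of the paper): the network encodes the current frame and is trained to reconstruct the frame a lag time tau later:

```python
import torch
import torch.nn as nn

tau, dim, latent = 3, 30, 2
encoder = nn.Sequential(nn.Linear(dim, 50), nn.Tanh(), nn.Linear(50, latent))
decoder = nn.Sequential(nn.Linear(latent, 50), nn.Tanh(), nn.Linear(50, dim))
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

traj = torch.randn(1000, dim)          # stand-in for a real MD trajectory
x_t, x_tau = traj[:-tau], traj[tau:]   # time-lagged frame pairs

for epoch in range(50):
    opt.zero_grad()
    # Time-lagged reconstruction error: predict x(t + tau) from x(t).
    loss = ((decoder(encoder(x_t)) - x_tau) ** 2).mean()
    loss.backward()
    opt.step()
```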

The time-lagged auto-encoder improves over tICA (example: villin). Wehmeyer and Noe, J. Chem. Phys. 148, 241703 (2018)

Summary: Deep learning improves MSM construction by reducing the amount of prior knowledge required. However, deep learning may overfit the data when sampling is insufficient.