Structure learning with deep neural networks
6th Network Modeling Workshop, 6/6/2013
Patrick Michl

Page 2: Agenda: Autoencoders, Biological Model, Validation & Implementation.

Page 3 (Autoencoders): Real-world data is usually high dimensional … [Figure: scatter plots of the dataset and the model over axes x1, x2]

Page 4 (Autoencoders): … which makes structural analysis and modeling complicated! [Figure: scatter plots of the dataset and the model over axes x1, x2]

Page 5 (Autoencoders): Dimensionality reduction techniques like PCA … [Figure: PCA projection of the dataset over axes x1, x2]

Page 6 (Autoencoders): … cannot preserve complex structures! [Figure: PCA projection of the dataset and the resulting model over axes x1, x2]

Page 7 (Autoencoders): Therefore the analysis of unknown structures … [Figure: scatter plots of the dataset and the model over axes x1, x2]

Page 8 (Autoencoders): … requires more sophisticated nonlinear techniques! [Figure: scatter plots of the dataset and the model over axes x1, x2]

Page 9 (Autoencoders): Autoencoders are artificial neural networks … [Diagram: autoencoder mapping input data X to output data X', built from perceptrons and Gaussian units]

Page 10 (Autoencoders): Autoencoders are artificial neural networks … [Diagram: perceptrons output binary values in {0, 1}, Gaussian units output real values in R]

Page 11 (Autoencoders): Autoencoders are artificial neural networks … [Diagram as on page 9]

Page 12 (Autoencoders): … with multiple hidden layers. [Diagram: visible layers (input data X, output data X') and hidden layers]

Page 13 (Autoencoders): Such networks are called deep networks. [Diagram as on page 12]

Page 14 (Autoencoders): Such networks are called deep networks. Definition (deep network): Deep networks are artificial neural networks with multiple hidden layers.

Page 15 (Autoencoders): Such networks are called deep networks. [Diagram: deep network]

Page 16 (Autoencoders): Autoencoders have a symmetric topology … [Diagram: deep network with symmetric topology]

Page 17 (Autoencoders): … with an odd number of hidden layers.

Page 18 (Autoencoders): The small layer in the center works like an information bottleneck …

Page 19 (Autoencoders): … that creates a low-dimensional code for each sample in the input data.

Page 20 (Autoencoders): The upper stack does the encoding …

Page 21 (Autoencoders): … and the lower stack does the decoding. [Diagram: encoder stack above the bottleneck, decoder stack below]

Page 22 (Autoencoders): Definition (autoencoder): Autoencoders are deep networks with a symmetric topology and an odd number of hidden layers, consisting of an encoder, a low-dimensional representation, and a decoder.

Page 23 (Autoencoders): Autoencoders can be used to reduce the dimension of data …
Problem: dimensionality of the data.
Idea:
1. Train the autoencoder to minimize the distance between input X and output X'.
2. Encode X to a low-dimensional code Y.
3. Decode the low-dimensional code Y to the output X'.
4. The output X' is low dimensional.

Page 24 (Autoencoders): … if we can train them! [Idea as on page 23]

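The training idea on pages 23-24 can be stated compactly as a reconstruction objective. A minimal sketch in LaTeX, assuming a squared reconstruction error (the slides do not give the formula):

```latex
\min_{\theta_{\mathrm{enc}},\,\theta_{\mathrm{dec}}}
\sum_{x \in X} \big\| x - \mathrm{dec}\big(\mathrm{enc}(x)\big) \big\|^2,
\qquad Y = \mathrm{enc}(X), \quad X' = \mathrm{dec}(Y)
```
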
Page 25 (Autoencoders): Training. In feedforward ANNs, backpropagation is a good approach.

Page 26 (Autoencoders): Backpropagation, step 1: the distance (error) between the current output X' and the desired output Y is computed. This gives an error function.

Page 27 (Autoencoders): Backpropagation, step 1 (continued): the error function. Example: a linear neural unit with two inputs. [Formula not preserved in the transcript]

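The example formula on page 27 is not preserved in the transcript. A plausible reconstruction in LaTeX, assuming a squared-error function for a linear unit with inputs x1, x2, weights w1, w2, bias b and target t:

```latex
y = w_1 x_1 + w_2 x_2 + b, \qquad
E(w_1, w_2, b) = \tfrac{1}{2}\,(y - t)^2
```
```latex
\frac{\partial E}{\partial w_i} = (y - t)\,x_i, \qquad
\frac{\partial E}{\partial b} = (y - t), \qquad
w_i \leftarrow w_i - \eta\,(y - t)\,x_i
```
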
Page 28 (Autoencoders): Backpropagation (continued). [Formulas on this slide are not preserved in the transcript]

Page 29 (Autoencoders): Backpropagation (continued). [Formulas on this slide are not preserved in the transcript]

Page 30 (Autoencoders): … the problem is the multiple hidden layers! Problem: deep network.

Page 31 (Autoencoders): Backpropagation is known to be slow far away from the output layer … Problem: deep network, very slow training.

Page 32 (Autoencoders): … and can converge to poor local minima. Problem: deep network, very slow training, maybe a bad solution.

Page 33 (Autoencoders): The task is to initialize the parameters close to a good solution! Idea: initialize close to a good solution.

Page 34 (Autoencoders): Therefore the training of autoencoders has a pretraining phase …

Page 35 (Autoencoders): … which uses Restricted Boltzmann Machines (RBMs).

Page 36 (Autoencoders): Restricted Boltzmann Machines: RBMs are Markov random fields.

Page 37 (Autoencoders): RBMs are Markov random fields. Markov random field: every unit influences every neighbor, and the coupling is undirected. Motivation (Ising model): a set of magnetic dipoles (spins) is arranged in a graph (lattice) where neighbors are coupled with a given strength.

Page 38 (Autoencoders): Restricted Boltzmann Machine: bipartite topology of visible (v) and hidden (h) units. The local energy is used to calculate the probabilities of the unit values. Training: contrastive divergence (Gibbs sampling).

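Page 38 refers to the local energy without giving the formula. The standard energy of a binary RBM with visible units v, hidden units h, biases a, b and weights W (an assumption, since the slide's formula is not in the transcript) is:

```latex
E(v, h) = -\sum_i a_i v_i - \sum_j b_j h_j - \sum_{i,j} v_i\, w_{ij}\, h_j,
\qquad
P(v, h) = \frac{e^{-E(v, h)}}{\sum_{v', h'} e^{-E(v', h')}}
```
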
Page 39 (Autoencoders): Restricted Boltzmann Machine: Gibbs sampling. [Formulas not preserved in the transcript]

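Page 39's Gibbs-sampling details are lost in the transcript. A minimal contrastive-divergence (CD-1) sketch in Python/NumPy for a binary RBM; the function name, batch layout and learning rate are illustrative, not from the slides:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, a, b, lr=0.01, rng=np.random.default_rng()):
    """One contrastive-divergence (CD-1) step for a binary RBM.

    v0 : (batch, n_visible) data batch
    W  : (n_visible, n_hidden) weights; a, b : visible/hidden biases
    """
    # Positive phase: sample hidden units given the data
    ph0 = sigmoid(v0 @ W + b)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)

    # One Gibbs step: reconstruct the visibles, then recompute hidden probabilities
    pv1 = sigmoid(h0 @ W.T + a)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W + b)

    # Gradient approximation: <v h>_data - <v h>_model
    n = v0.shape[0]
    W += lr * (v0.T @ ph0 - v1.T @ ph1) / n
    a += lr * (v0 - v1).mean(axis=0)
    b += lr * (ph0 - ph1).mean(axis=0)
    return W, a, b
```
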
Page 40 (Autoencoders): Training, top layer: the top-layer RBM transforms real-valued data into binary codes.

Page 41 (Autoencoders): Therefore the visible units are modeled with Gaussians to encode the data …

Page 42 (Autoencoders): … and many hidden units with sigmoids to encode dependencies.

Page 43 (Autoencoders): The objective function is the sum of the local energies. [Local energy formula not preserved in the transcript]

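Page 43's local-energy formula is not preserved. For a top-layer RBM with Gaussian visible units (page 41) and sigmoid hidden units (page 42), a common form of the energy (an assumption, following the standard Gaussian-Bernoulli RBM) is:

```latex
E(v, h) = \sum_i \frac{(v_i - a_i)^2}{2\sigma_i^2}
        - \sum_j b_j h_j
        - \sum_{i,j} \frac{v_i}{\sigma_i}\, w_{ij}\, h_j
```
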
Page 44 (Autoencoders): Reduction: the next RBM layer maps the dependency encoding …

Page 45 (Autoencoders): … from the upper layer …

Page 46 (Autoencoders): … to a smaller number of sigmoids …

Page 47 (Autoencoders): … which can be trained faster than the top layer. [Local energy formula not preserved in the transcript]

Page 48 (Autoencoders): Unrolling: the symmetric topology allows us to skip further training.

Page 49 (Autoencoders): Unrolling (same text as page 48; the unrolled network diagram is not preserved in the transcript).

Page 50 (Autoencoders): After pretraining, backpropagation usually finds good solutions. Training pipeline: pretraining (top GRBM, then reduction RBMs), unrolling, finetuning (backpropagation).

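Page 50 summarizes the full training pipeline (RBM pretraining, unrolling, backpropagation finetuning). A high-level sketch of that flow in Python; train_grbm, train_rbm, encode and finetune_backprop are hypothetical helpers introduced only for illustration, not functions named on the slides:

```python
def train_autoencoder(X, layer_sizes):
    """Greedy layer-wise pretraining, unrolling, and finetuning (sketch).

    X           : data matrix (samples x features)
    layer_sizes : encoder layer sizes, e.g. [32, 24, 16, 8, 5]
    """
    weights, data = [], X
    for k, size in enumerate(layer_sizes):
        if k == 0:
            # Top layer: Gaussian visible units -> sigmoid hidden units (GRBM)
            W = train_grbm(data, n_hidden=size)      # hypothetical helper
        else:
            # Reduction layers: sigmoid -> sigmoid RBMs
            W = train_rbm(data, n_hidden=size)       # hypothetical helper
        weights.append(W)
        data = encode(data, W)                       # propagate data to the next layer

    # Unrolling: the decoder stack reuses the transposed encoder weights
    decoder = [W.T for W in reversed(weights)]
    network = weights + decoder

    # Finetuning: backpropagation on the reconstruction error ||X - X'||^2
    return finetune_backprop(network, X)             # hypothetical helper
```
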
Page 51 (Autoencoders): The algorithmic complexity of RBM training depends on the network size. Time complexity: O(i·n·w), where i is the number of iterations, n the number of nodes, and w the number of weights. Memory complexity: O(w).

Page 52: Agenda (repeated): Autoencoders, Biological Model, Validation & Implementation.

Page 53 (Biological Model): Network modeling with Restricted Boltzmann Machines (RBM). How to model the topological structure? [Diagram: nodes S, E and TF]

Page 54 (Biological Model): We define S and E as the visible data layer …

Page 55 (Biological Model): We identify S and E with the visible layer …

Page 56 (Biological Model): … and the TFs with the hidden layer in an RBM.

Page 57 (Biological Model): The training of the RBM gives us a model.

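Pages 54-57 identify S and E with the visible layer and the TFs with the hidden layer of an RBM. A small Python/NumPy sketch of how such a model could be laid out; the gene and TF counts are illustrative assumptions, not values from the slides:

```python
import numpy as np

n_genes = 20   # visible units: expression of the S and E genes (illustrative)
n_tfs = 4      # hidden units: transcription factors (illustrative)

rng = np.random.default_rng(0)
W = rng.normal(scale=0.01, size=(n_genes, n_tfs))  # gene-TF couplings
a = np.zeros(n_genes)                              # visible biases
b = np.zeros(n_tfs)                                # hidden biases

# After training (e.g. with the CD-1 update sketched above), the weight
# matrix W links each gene to the TFs and is read as the learned model.
```
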
Page 58: Agenda (repeated): Autoencoder, Biological Model, Implementation & Results.

Page 59 (Results): Validation of the results needs information about the true regulation and about the descriptive power of the data.

Page 60 (Results): Without this information, validation can only be done using artificial datasets!

Page 61 (Results): Artificial datasets: we simulate data in three steps.

Page 62 (Results): Step 1: choose the number of genes (E + S) and create random bimodally distributed data.

Page 63 (Results): Step 2: manipulate the data in a fixed order.

Page 64 (Results): Step 3: add noise to the manipulated data and normalize the data.

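Pages 62-64 describe the three simulation steps only in words. A minimal Python/NumPy sketch of what such a generator could look like; the concrete manipulation rule, gene count and noise level are illustrative assumptions, not the author's exact procedure:

```python
import numpy as np

def simulate(n_samples=1000, n_genes=20, noise=0.1, rng=np.random.default_rng(0)):
    # Step 1: random bimodally distributed data for all genes (E + S)
    modes = rng.integers(0, 2, size=(n_samples, n_genes))
    data = rng.normal(loc=modes * 2.0 - 1.0, scale=0.3)

    # Step 2: manipulate the data in a fixed order
    # (illustrative rule: each gene after the first depends on its predecessor)
    for j in range(1, n_genes):
        data[:, j] = 0.5 * data[:, j] + 0.5 * data[:, j - 1]

    # Step 3: add noise to the manipulated data and normalize
    data += rng.normal(scale=noise, size=data.shape)
    data = (data - data.mean(axis=0)) / data.std(axis=0)
    return data
```
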
Page 65 (Results): Simulation. [Figure-only slide; plot not preserved in the transcript]

Page 66 (Results): Simulation, step 2: manipulate the data. [Plot not preserved in the transcript]

Page 67 (Results): Simulation. [Figure-only slide; plot not preserved in the transcript]

Page 68 (Results): We analyse the data X with an RBM.

Page 69 (Results): We train an autoencoder with 9 hidden layers and 165 nodes:
Layers 1 & 9: 32 hidden units
Layers 2 & 8: 24 hidden units
Layers 3 & 7: 16 hidden units
Layers 4 & 6: 8 hidden units
Layer 5: 5 hidden units

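The layer sizes on page 69 can be written directly as a network definition. A sketch using Keras; the framework choice and the input dimension n_genes are assumptions, since the slides do not name an implementation:

```python
from tensorflow.keras import layers, models

n_genes = 30  # input dimension (illustrative; not given in the transcript)
hidden = [32, 24, 16, 8, 5, 8, 16, 24, 32]  # hidden layers 1-9 from the slide

model = models.Sequential([layers.Input(shape=(n_genes,))])
for units in hidden:
    model.add(layers.Dense(units, activation="sigmoid"))
model.add(layers.Dense(n_genes, activation="linear"))  # reconstruction X'

model.compile(optimizer="adam", loss="mse")
# model.fit(X, X, epochs=100, batch_size=32)  # train the network to reproduce X
```
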
Page 70 (Results): We transform the data from X to X' and reduce the dimensionality.

Page 71 (Results): We analyse the transformed data X' with an RBM.

Page 72 (Results): Let's compare the models.

Page 73 (Results): Another example with more nodes and a larger autoencoder.

Page 74: Conclusion
- Autoencoders can improve modeling significantly by reducing the dimensionality of data.
- Autoencoders preserve complex structures in their multilayer perceptron network; analysing those networks (for example with knockout tests) could give more structural information.
- The drawback is the high computational cost.
- Since the field of deep learning is becoming more popular (face recognition, voice recognition, image transformation), many new improvements addressing the computational costs have been made.

Page 75: Acknowledgement: eilsLABS, Prof. Dr. Rainer König, Prof. Dr. Roland Eils, and the Network Modeling Group.