Structure learning with deep autoencoders

Presentation on theme: "Structure learning with deep autoencoders"— Presentation transcript:

1 Structure learning with deep autoencoders
Network Modeling Seminar, 30/4/2013 Patrick Michl

2 Agenda: Autoencoders, Biological Model, Validation & Implementation

3 Real-world data is usually high dimensional …
(Figure: Dataset and Model panels with axes x1, x2)

4 … which makes structural analysis and modeling complicated!
(Figure: Dataset and Model panels, axes x1, x2; model: F(x1, x2) = ?)

5 Dimensionality reduction techniques like PCA …
(Figure: Dataset and Model panels with a PCA projection, axes x1, x2)

6 … cannot preserve complex structures!
(Figure: Dataset and PCA Model panels, axes x1, x2; model: x2 = αx1 + β)

7 Therefore the analysis of unknown structures …
(Figure: Dataset and Model panels, axes x1, x2)

8 … needs more sophisticated nonlinear techniques!
(Figure: Dataset and Model panels, axes x1, x2; model: x2 = f(x1))

9 Autoencoders are artificial neural networks …
(Diagram: input data X → Autoencoder → output data X'; unit types: perceptrons and Gaussian units)

10 Autoencoders are artificial neural networks …
(Diagram: as before, highlighting a single perceptron and a real-valued Gaussian unit)

11 Autoencoders are artificial neural networks …
(Diagram: as on slide 9)

12 … with multiple hidden layers.
(Diagram: input data X and output data X' as visible layers, perceptron layers in between as hidden layers)

13 Such networks are called deep networks.
(Diagram: as before)

14 Such networks are called deep networks.
Definition (deep network): Deep networks are artificial neural networks with multiple hidden layers.

15 Such networks are called deep networks.
(Diagram: the network is now labeled "Deep network")

16 Autoencoders have a symmetric topology …
(Diagram: deep network with a symmetric topology)

17 … with an odd number of hidden layers.
(Diagram: as before)

18 The small layer in the center works like an information bottleneck.
(Diagram: the central layer is labeled "Bottleneck")

19 … that creates a low dimensional code for each sample in the input data.
(Diagram: the bottleneck layer produces the low dimensional code)

20 The upper stack does the encoding …
(Diagram: the layers above the bottleneck are labeled "Encoder")

21 … and the lower stack does the decoding.
(Diagram: the layers below the bottleneck are labeled "Decoder")

22 … and the lower stack does the decoding.
Definition (autoencoder): Autoencoders are deep networks with a symmetric topology and an odd number of hidden layers, containing an encoder, a low dimensional representation, and a decoder.
Definition (deep network): Deep networks are artificial neural networks with multiple hidden layers.

23 Autoencoders can be used to reduce the dimension of data …
Problem: high dimensionality of the data. Idea: train the autoencoder to minimize the distance between input X and output X'. Encode X to a low dimensional code Y, then decode Y to the output X'; the code Y is a low dimensional representation of each sample.

24 … if we can train them!
(Same idea as the previous slide: minimize the reconstruction distance between X and X')
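As a rough illustration of this training idea (not the presenter's implementation; a NumPy sketch with made-up layer sizes and parameter names), the encoder maps X to a low dimensional code Y, the decoder maps Y back to X', and the reconstruction error measures the distance between X and X':

```python
import numpy as np

# Hypothetical single-code-layer sketch of the encode/decode idea.
# W_enc, W_dec, b_enc, b_dec are assumed parameter names, not from the talk.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))                # 100 samples, 8 dimensions

W_enc = rng.normal(scale=0.1, size=(8, 2))   # encode 8 -> 2 dimensions
b_enc = np.zeros(2)
W_dec = rng.normal(scale=0.1, size=(2, 8))   # decode 2 -> 8 dimensions
b_dec = np.zeros(8)

def encode(X):
    return np.tanh(X @ W_enc + b_enc)        # low dimensional code Y

def decode(Y):
    return Y @ W_dec + b_dec                 # reconstruction X'

Y = encode(X)
X_prime = decode(Y)
reconstruction_error = np.mean((X - X_prime) ** 2)
print(reconstruction_error)
```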

25 In feedforward ANNs, backpropagation is a good approach.
(Training method: backpropagation)

26 In feedforward ANNs, backpropagation is a good approach.
Backpropagation: the distance (error) between the current output X' and the wanted output Y is computed. This gives an error function: X' = F(X), error = (X' − Y)².

27 In feedforward ANNs, backpropagation is the method of choice.
Backpropagation: the error function from the previous slide, illustrated on an example (a linear neural unit with two inputs).

28 In feedforward ANNs, backpropagation is a good approach.
Backpropagation: calculating −∇error gives a vector that points in a direction which decreases the error; we update the parameters accordingly.

29 In feedforward ANNs, backpropagation is the method of choice.
Backpropagation: calculating −∇error gives a vector that points in a direction which decreases the error; we update the parameters accordingly and repeat.
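The slide's example of a linear neural unit with two inputs could be worked out as in the following sketch (data, learning rate, and variable names are illustrative): compute the error, follow −∇error, update the parameters, repeat.

```python
import numpy as np

# Toy data for a linear unit x' = w1*x1 + w2*x2 + b (values are made up).
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 2))
y = 0.7 * X[:, 0] - 0.3 * X[:, 1] + 0.1      # "wanted output" Y

w = np.zeros(2)
b = 0.0
learning_rate = 0.1

for step in range(200):
    x_prime = X @ w + b                      # current output X'
    error = np.mean((x_prime - y) ** 2)      # error function
    # -gradient points in a direction that decreases the error
    grad_w = 2 * X.T @ (x_prime - y) / len(y)
    grad_b = 2 * np.mean(x_prime - y)
    w -= learning_rate * grad_w              # update the parameters ...
    b -= learning_rate * grad_b              # ... and repeat

print(w, b, error)
```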

30 … the problem is the multiple hidden layers!
(Training: backpropagation; problem: deep network)

31 Backpropagation is known to be slow far away from the output layer …
(Problem: deep network, very slow training)

32 … and can converge to poor local minima.
(Problem: deep network, very slow training, possibly a bad solution)

33 The task is to initialize the parameters close to a good solution!
(Idea: initialize close to a good solution)

34 Therefore the training of autoencoders has a pretraining phase …
(Training: pretraining followed by backpropagation)

35 … which uses Restricted Boltzmann Machines (RBMs).
(Pretraining: Restricted Boltzmann Machines)

36 … which uses Restricted Boltzmann Machines (RBMs).
RBMs are Markov random fields.

37 … which uses Restricted Boltzmann Machines (RBMs).
Markov random field: every unit influences every neighbor, and the coupling is undirected. Motivation (Ising model): a set of magnetic dipoles (spins) is arranged in a graph (lattice) where neighbors are coupled with a given strength.

38 … which uses Restricted Boltzmann Machines (RBMs).
RBMs are Markov random fields with a bipartite topology of visible units (v) and hidden units (h); the local energy is used to calculate the probabilities of the unit values. Training: contrastive divergence (Gibbs sampling). (Diagram: visible units v1–v4 connected to hidden units h1–h3)
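A minimal sketch of one training update by contrastive divergence (CD-1, a single Gibbs step) for a binary RBM with the bipartite topology shown here; this assumes NumPy, and the learning rate and variable names are illustrative, not from the talk.

```python
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

n_visible, n_hidden = 4, 3                  # matches the v1..v4, h1..h3 diagram
W = rng.normal(scale=0.1, size=(n_visible, n_hidden))
b_v = np.zeros(n_visible)
b_h = np.zeros(n_hidden)

def cd1_step(v0, lr=0.05):
    """One contrastive divergence (CD-1) update from a batch of visible data."""
    # Positive phase: sample the hidden units given the data
    p_h0 = sigmoid(v0 @ W + b_h)
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
    # Gibbs step: reconstruct the visibles, then recompute hidden probabilities
    p_v1 = sigmoid(h0 @ W.T + b_v)
    v1 = (rng.random(p_v1.shape) < p_v1).astype(float)
    p_h1 = sigmoid(v1 @ W + b_h)
    # Parameter updates from the difference between the two phases
    n = len(v0)
    dW = lr * (v0.T @ p_h0 - v1.T @ p_h1) / n
    return dW, lr * (v0 - v1).mean(axis=0), lr * (p_h0 - p_h1).mean(axis=0)

v_data = (rng.random((20, n_visible)) < 0.5).astype(float)  # toy binary data
dW, db_v, db_h = cd1_step(v_data)
W += dW; b_v += db_v; b_h += db_h
```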

39 … which uses Restricted Boltzmann Machines (RBMs).
(Illustration: Gibbs sampling)

40 The top layer RBM transforms real-valued data into binary codes.
Top layer: V ≔ set of visible units, x_v ≔ value of unit v, x_v ∈ ℝ for all v ∈ V; H ≔ set of hidden units, x_h ≔ value of unit h, x_h ∈ {0, 1} for all h ∈ H.

41 Therefore visible units are modeled with Gaussians to encode the data …
x_v ~ N(b_v + Σ_h w_vh x_h, σ_v), with σ_v ≔ standard deviation of unit v, b_v ≔ bias of unit v, w_vh ≔ weight of edge (v, h). (Diagram: visible units v1–v4, hidden units h1–h5)

42 … and many hidden units with sigmoids to encode dependencies.
x_h ~ sigm(b_h + Σ_v w_vh x_v / σ_v), with σ_v ≔ standard deviation of unit v, b_h ≔ bias of unit h, w_vh ≔ weight of edge (v, h).

43 The objective function is the sum of the local energies.
Local energies: E_v ≔ −Σ_h w_vh (x_v/σ_v) x_h + ((x_v − b_v)/σ_v)²,  E_h ≔ −Σ_v w_vh (x_v/σ_v) x_h + x_h b_h.
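The two sampling rules of this Gaussian-Bernoulli top layer, written out as a sketch following the formulas above (NumPy; names are illustrative, and σ_v is kept fixed at 1 rather than learned):

```python
import numpy as np

rng = np.random.default_rng(3)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

n_visible, n_hidden = 4, 5                  # matches v1..v4, h1..h5
W = rng.normal(scale=0.1, size=(n_visible, n_hidden))
b_v, b_h = np.zeros(n_visible), np.zeros(n_hidden)
sigma_v = np.ones(n_visible)                # std. dev. of the visible units

def sample_hidden(x_v):
    # x_h ~ sigm(b_h + sum_v w_vh * x_v / sigma_v), sampled as Bernoulli
    p_h = sigmoid(b_h + (x_v / sigma_v) @ W)
    return (rng.random(p_h.shape) < p_h).astype(float)

def sample_visible(x_h):
    # x_v ~ N(b_v + sum_h w_vh * x_h, sigma_v)
    mean_v = b_v + x_h @ W.T
    return rng.normal(mean_v, sigma_v)

x_v = rng.normal(size=n_visible)            # real-valued input sample
x_h = sample_hidden(x_v)                    # binary code
x_v_reconstructed = sample_visible(x_h)
```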

44 The next RBM layer maps the dependency encoding …
Reduction layer: V ≔ set of visible units, x_v ∈ {0, 1} for all v ∈ V; H ≔ set of hidden units, x_h ∈ {0, 1} for all h ∈ H.

45 … from the upper layer …
x_v ~ sigm(b_v + Σ_h w_vh x_h), with b_v ≔ bias of unit v, w_vh ≔ weight of edge (v, h).

46 … to a smaller number of sigmoid units …
x_h ~ sigm(b_h + Σ_v w_vh x_v), with b_h ≔ bias of unit h, w_vh ≔ weight of edge (v, h).

47 … which can be trained faster than the top layer.
Local energies: E_v ≔ −Σ_h w_vh x_v x_h + x_v b_v,  E_h ≔ −Σ_v w_vh x_v x_h + x_h b_h.

48 The symmetric topology allows us to skip further training.
(Unrolling: the pretrained encoder stack is mirrored to initialize the decoder)

49 The symmetric topology allows us to skip further training.
(Unrolling, continued)

50 After pretraining, backpropagation usually finds good solutions.
Training pipeline: pretraining (top GRBM, then reduction RBMs), unrolling, finetuning with backpropagation.
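A sketch of what the unrolling step means in practice, under the usual assumption that the pretrained RBM weights form the encoder and their transposes, in mirrored order, initialize the decoder (sizes and names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical pretrained RBM weights for an 8 -> 5 -> 3 encoder stack
rbm_weights = [rng.normal(scale=0.1, size=(8, 5)),
               rng.normal(scale=0.1, size=(5, 3))]

# Unrolling: the encoder uses the weights as-is, the decoder their transposes
encoder_weights = list(rbm_weights)
decoder_weights = [W.T for W in reversed(rbm_weights)]
autoencoder_weights = encoder_weights + decoder_weights  # initialization for finetuning

for W in autoencoder_weights:
    print(W.shape)   # (8, 5) (5, 3) (3, 5) (5, 8)
```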

51 The algorithmic complexity of RBM training depends on the network size.
Time complexity: O(i·n·w), with i ≔ number of iterations, n ≔ number of nodes, w ≔ number of weights. Memory complexity: O(w).

52 Agenda: Autoencoders, Biological Model, Validation & Implementation

53 Network Modeling with Restricted Boltzmann Machines (RBMs)
How do we model the topological structure? (Diagram: signal genes S, effect genes E, and transcription factors TF)

54 Network Modeling with Restricted Boltzmann Machines (RBMs)
We define S and E as the visible data layer …

55 Network Modeling with Restricted Boltzmann Machines (RBMs)
We identify S and E with the visible layer …

56 Network Modeling with Restricted Boltzmann Machines (RBMs)
… and the TFs with the hidden layer of an RBM.

57 Network Modeling with Restricted Boltzmann Machines (RBMs)
Training the RBM gives us a model.
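One way to read this mapping as a configuration sketch, assuming each signal gene (S) and effect gene (E) becomes a visible unit and each transcription factor (TF) a hidden unit of the RBM; the gene and TF names below are hypothetical:

```python
import numpy as np

# Hypothetical gene sets; the real units would come from the expression dataset.
signal_genes = ["s1", "s2", "s3", "s4"]        # S
effect_genes = ["e1", "e2", "e3", "e4"]        # E
transcription_factors = ["tf1", "tf2", "tf3"]  # TF

visible_units = signal_genes + effect_genes    # S and E form the visible layer
hidden_units = transcription_factors           # TFs form the hidden layer

# The RBM weight matrix couples every visible gene with every TF;
# training adjusts these couplings from the data.
W = np.zeros((len(visible_units), len(hidden_units)))
print(W.shape)   # (8, 3)
```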

58 Agenda: Autoencoder, Biological Model, Implementation & Results

59 Results: Validation of the results needs information about the true regulation and about the descriptive power of the data.

60 Results: Validation needs information about the true regulation and about the descriptive power of the data. Without this information, validation can only be done using artificial datasets!

61 Results, Artificial datasets: We simulate data in three steps.

62 Results, Artificial datasets: Step 1: Choose the number of genes (E + S) and create random, bimodally distributed data.

63 Results, Artificial datasets: Step 1: Choose the number of genes (E + S) and create random, bimodally distributed data. Step 2: Manipulate the data in a fixed order.

64 Results, Artificial datasets: Step 1: Choose the number of genes (E + S) and create random, bimodally distributed data. Step 2: Manipulate the data in a fixed order. Step 3: Add noise to the manipulated data and normalize it.

65 Results, Simulation Step 1: Number of visible nodes: 8 (4 E, 4 S).
Create random data: random values from {−1, +1} + N(0, σ = 0.5).

66 Results, Simulation Step 2: Manipulate the data:
e1 = 0.25 (s1 + s2 + s3 + s4)
e2 = 0.5 s2 + Noise
e3 = 0.5 s3 + Noise
e4 = 0.5 s4 + Noise

67 Results, Simulation Step 3: Add noise N(0, σ = 0.5).
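The three simulation steps might look like this in NumPy; the coefficients follow the reconstruction of Step 2 above, which is partly ambiguous in this transcript, so treat them as illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)
n_samples = 500

# Step 1: bimodal random data for the 4 signal genes: {-1, +1} + N(0, 0.5)
s = rng.choice([-1.0, 1.0], size=(n_samples, 4)) + rng.normal(0, 0.5, (n_samples, 4))

# Step 2: manipulate the data in a fixed order (coefficients as reconstructed above)
noise = lambda: rng.normal(0, 0.5, n_samples)
e1 = 0.25 * s.sum(axis=1)
e2 = 0.5 * s[:, 1] + noise()
e3 = 0.5 * s[:, 2] + noise()
e4 = 0.5 * s[:, 3] + noise()
X = np.column_stack([s, e1, e2, e3, e4])     # 8 visible nodes (4 S, 4 E)

# Step 3: add noise N(0, 0.5) and normalize
X = X + rng.normal(0, 0.5, X.shape)
X = (X - X.mean(axis=0)) / X.std(axis=0)
```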

68 Results We analyse the data X with an RBM

69 We train an autoencoder with 9 hidden layers and 165 hidden nodes:
Layers 1 & 9: 32 hidden units; layers 2 & 8: 24; layers 3 & 7: 16; layers 4 & 6: 8; layer 5: 5 hidden units. (Visible layers: input data X and output data X')
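A quick sanity check of the stated architecture (hidden layers only; the size of the visible input and output layers depends on the dataset):

```python
hidden_layers = [32, 24, 16, 8, 5, 8, 16, 24, 32]   # symmetric, bottleneck of 5 units
assert len(hidden_layers) == 9                      # 9 hidden layers
assert sum(hidden_layers) == 165                    # 165 hidden nodes in total
```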

70 We transform the data from X to X' and reduce the dimensionality.

71 Results We analyse the transformed data X‘ with an RBM

72 Results: Let's compare the models.

73 Results: Another example with more nodes and a larger autoencoder.

74 Conclusion: Autoencoders can improve modeling significantly by reducing the dimensionality of data. Autoencoders preserve complex structures in their multilayer perceptron network, and analysing those networks (for example with knockout tests) could give more structural information. The drawback is the high computational cost. Since the field of deep learning is becoming more popular (face recognition, voice recognition, image transformation), many improvements in handling the computational cost have been made.

75 Acknowledgement: eilsLABS, PD Dr. Rainer König, Prof. Dr. Roland Eils, Network Modeling Group

