Asymptotic Behavior of Stochastic Complexity of Complete Bipartite Graph-Type Boltzmann Machines
Yu Nishiyama and Sumio Watanabe
Tokyo Institute of Technology, Japan
Background
Learning machines such as mixture models, hidden Markov models, and Bayesian networks are used in pattern recognition, natural language processing, gene analysis, and information systems. Mathematically, these are singular statistical models, for which Bayes learning is effective.
Problem: Calculations which involve the Bayes posterior require huge computational cost.
Mean field approximation replaces the Bayes posterior with a trial distribution. Its stochastic complexity quantifies the accuracy of the approximation and the difference from regular statistical models, and it is used for model selection.
The asymptotic behavior of the mean field stochastic complexity has been studied for several models:
- Mixture models [K. Watanabe et al., 2004]
- Reduced rank regressions [Nakajima et al., 2005]
- Hidden Markov models [Hosino et al., 2005]
- Stochastic context-free grammars [Hosino et al., 2005]
- Neural networks [Nakano et al., 2005]
Purpose
We derive an upper bound of the mean field stochastic complexity of complete bipartite graph-type Boltzmann machines. Boltzmann machines are graphical models, known in physics as spin systems.
Table of Contents
- Review: Bayes learning, mean field approximation, Boltzmann machines (complete bipartite graph-type)
- Main Theorem
- Outline of the Proof
- Discussion and Conclusion
Bayes Learning
True distribution: $q(x)$. Model: $p(x \mid w)$. Prior: $\varphi(w)$.
Bayes posterior:
$p(w \mid X^n) = \frac{1}{Z_n}\,\varphi(w)\prod_{i=1}^{n} p(x_i \mid w)$
Bayes predictive distribution:
$p(x \mid X^n) = \int p(x \mid w)\, p(w \mid X^n)\, dw$
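As a concrete illustration of these definitions, the following minimal sketch computes the Bayes posterior and predictive distribution numerically, assuming a Bernoulli model with a uniform prior on a discretized parameter grid (the model, prior, and grid are illustrative choices, not taken from the slides):

```python
import numpy as np

# Minimal sketch of Bayes learning: posterior and predictive distribution
# for an assumed Bernoulli model p(x|w) with a uniform prior phi(w) on a
# discretized parameter grid (illustrative setup, not from the slides).

rng = np.random.default_rng(0)
w_true = 0.3
x = rng.binomial(1, w_true, size=50)          # training data X^n

w = np.linspace(0.01, 0.99, 199)              # parameter grid
prior = np.full_like(w, 1.0 / len(w))         # uniform prior phi(w)

# Log likelihood: log prod_i p(x_i|w) at each grid point
log_lik = x.sum() * np.log(w) + (len(x) - x.sum()) * np.log(1.0 - w)

# Bayes posterior p(w|X^n) ∝ phi(w) * prod_i p(x_i|w)
post = prior * np.exp(log_lik - log_lik.max())
post /= post.sum()

# Bayes predictive distribution p(x|X^n) = ∫ p(x|w) p(w|X^n) dw
p_x1 = np.sum(w * post)                       # predictive prob. of x = 1
print(f"predictive p(x=1 | X^n) = {p_x1:.3f} (true w = {w_true})")
```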
Mean Field Approximation (1)
The Bayes posterior can be rewritten as
$p(w \mid X^n) = \frac{1}{Z_n}\exp\Bigl(\log\varphi(w) + \sum_{i=1}^{n}\log p(x_i \mid w)\Bigr).$
We consider the Kullback distance from a trial distribution $r(w)$ to the Bayes posterior,
$K(r) = \int r(w)\,\log\frac{r(w)}{p(w \mid X^n)}\,dw.$
Mean Field Approximation (2)
When we restrict the trial distribution to the factorized form $r(w) = \prod_k r_k(w_k)$, the approximation is called mean field approximation. The minimum value attained over such restricted $r(w)$ is called the mean field stochastic complexity $\bar{F}(n)$.
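The following toy sketch illustrates the factorized restriction: it approximates a correlated target distribution on a two-dimensional grid by a product trial distribution, decreasing the Kullback distance by coordinate descent (the 2-D toy target is an illustrative stand-in for the Bayes posterior, not a quantity from the slides):

```python
import numpy as np

# Toy mean field approximation: approximate a correlated target p(w1, w2)
# on a discrete grid by a factorized trial distribution
# r(w1, w2) = r1(w1) r2(w2), minimizing the Kullback distance
# K(r) = sum_w r(w) log( r(w) / p(w) ) by coordinate descent.

rng = np.random.default_rng(1)
logp = rng.normal(size=(30, 30))
logp[10:20, 10:20] += 3.0                     # a correlated bump
p = np.exp(logp)
p /= p.sum()
logp = np.log(p)                              # normalized log target

r1 = np.full(30, 1.0 / 30)                    # trial factor over w1
r2 = np.full(30, 1.0 / 30)                    # trial factor over w2
for _ in range(100):
    # Coordinate updates: r1(w1) ∝ exp( E_{r2}[ log p(w1, w2) ] ),
    # and symmetrically for r2. Each step cannot increase K(r).
    r1 = np.exp(logp @ r2)
    r1 /= r1.sum()
    r2 = np.exp(r1 @ logp)
    r2 /= r2.sum()

r = np.outer(r1, r2)
K = np.sum(r * (np.log(r) - logp))            # Kullback distance K(r || p)
print(f"K(r || p) after mean field updates: {K:.4f}")
```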
Complete Bipartite Graph-type Boltzmann Machines
The machine has $M$ input and output units $x = (x_1,\dots,x_M)$ and $H$ hidden units $y = (y_1,\dots,y_H)$, where each unit takes values in $\{-1,+1\}$. The parametric model is
$p(x \mid w) = \frac{1}{Z(w)} \sum_{y} \exp\Bigl(\sum_{i=1}^{M}\sum_{j=1}^{H} w_{ij}\, x_i\, y_j\Bigr).$
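A brute-force sketch of this model, under the reconstruction above (±1-valued units, weights $w_{ij}$ on the complete bipartite graph), is given below. Summing out each ±1 hidden unit yields a factor $2\cosh(\sum_i w_{ij} x_i)$, and sizes are kept tiny so the partition function can be enumerated exactly:

```python
import numpy as np
from itertools import product

# Complete bipartite graph-type Boltzmann machine:
# p(x|w) ∝ sum_y exp( sum_{ij} w_ij x_i y_j ), x in {-1,+1}^M visible,
# y in {-1,+1}^H hidden. Summing out each ±1 hidden unit gives a
# factor 2 cosh( sum_i w_ij x_i ).

M, H = 3, 2
rng = np.random.default_rng(2)
w = 0.5 * rng.normal(size=(M, H))             # bipartite weights w_ij

def unnormalized(x, w):
    # prod_j 2 cosh( sum_i w_ij x_i ): hidden units summed out
    return np.prod(2.0 * np.cosh(x @ w))

xs = [np.array(x) for x in product([-1, 1], repeat=M)]
Z = sum(unnormalized(x, w) for x in xs)       # partition function Z(w)

for x in xs:
    print(x, f"p(x|w) = {unnormalized(x, w) / Z:.4f}")
```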
True Distribution
We assume that the true distribution is included in the parametric model and that the number of its hidden units is $H_0$ ($H_0 \le H$). That is, the true distribution is $q(x) = p(x \mid w^*)$ for a true parameter $w^*$ realizable with $H_0$ hidden units.
Main Theorem
The mean field stochastic complexity of complete bipartite graph-type Boltzmann machines has an upper bound of the form
$\bar{F}(n) \le \lambda(M, H, H_0)\,\log n + C,$ where
$M$: the number of input and output units
$H$: the number of hidden units (learning machine)
$H_0$: the number of hidden units (true distribution)
$C$: constant
Outline of the Proof (Methods)
The upper bound is obtained by restricting the trial distribution to a normal distribution family; the prior depends on the Boltzmann machine.
Outline of the Proof [Lemma]
For the Kullback information $K(w)$ with Hessian matrix $J(w) = \nabla^2 K(w)$: if there exists a value $w^*$ of the parameter such that $K(w^*) = 0$ and the number of non-zero elements of the set of second order differentials $\{J_{kk}(w^*)\}$ is less than or equal to $\mu$, then the mean field stochastic complexity has the following upper bound:
$\bar{F}(n) \le \frac{\mu}{2}\,\log n + C.$
We apply this lemma to the Boltzmann machines. The Kullback information is given by
$K(w) = \sum_{x} q(x)\,\log\frac{q(x)}{p(x \mid w)},$
and its second order differentials with respect to the parameters $w_{ij}$ are computed at a true parameter.
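Under the same reconstruction of the model, the following sketch numerically checks the quantity appearing in the lemma for a tiny machine: it evaluates $K(w)$ on an enumerable state space and estimates the diagonal second order differentials at a true parameter by finite differences (the sizes $M=2$, $H=2$, $H_0=1$ and the value of $w^*$ are illustrative):

```python
import numpy as np
from itertools import product

# Numerical check of the lemma's quantity: Kullback information
# K(w) = sum_x q(x) log( q(x) / p(x|w) ) for a tiny complete bipartite
# Boltzmann machine, and the diagonal of its Hessian at a true
# parameter w*, counting the non-zero second order differentials.

M, H = 2, 2
xs = [np.array(x) for x in product([-1, 1], repeat=M)]

def p_model(w):
    un = np.array([np.prod(2.0 * np.cosh(x @ w)) for x in xs])
    return un / un.sum()

w_star = np.zeros((M, H))
w_star[:, 0] = [0.8, -0.5]                    # H0 = 1: one active hidden unit
q = p_model(w_star)                           # true distribution q(x)

def K(w_flat):
    p = p_model(w_flat.reshape(M, H))
    return np.sum(q * np.log(q / p))

# Central finite differences for the Hessian diagonal at w*
w0, eps = w_star.ravel(), 1e-4
diag = []
for k in range(w0.size):
    e = np.zeros_like(w0)
    e[k] = eps
    diag.append((K(w0 + e) - 2.0 * K(w0) + K(w0 - e)) / eps**2)

print("Hessian diagonal at w*:", np.round(diag, 4))
print("non-zero second order differentials:",
      sum(abs(d) > 1e-6 for d in diag))
```

In this toy run the directions attached to the inactive hidden unit are flat (second derivative zero), so only the weights of the active hidden unit contribute, matching the role of $\mu$ in the lemma.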
We take $w^*$ to be a true parameter, so that $K(w^*) = 0$ holds, and count the non-zero second order differentials at $w^*$. By using the lemma, we obtain the upper bound of the Main Theorem.
Discussion
Comparison with other studies. [Figure: stochastic complexity versus the number of training data $n$ in the asymptotic regime, comparing a regular statistical model, the upper bound for Bayes learning obtained by algebraic geometry [Yamazaki], and the upper bound derived here for the mean field approximation.]
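For reference, the regular-model baseline in this comparison follows the standard asymptotic form below (a general fact about regular statistical models with $d$ parameters; taking $d = MH$ as the parameter count of this machine is our assumption):

```latex
% Regular statistical model with d parameters (for this machine, d = MH):
\[
  F(n) \;=\; \frac{d}{2}\,\log n \;+\; O(1).
\]
% Singular models, under Bayes learning or its mean field approximation,
% instead satisfy bounds of the form
\[
  \bar{F}(n) \;\le\; \lambda \log n + C, \qquad \lambda \le \frac{d}{2},
\]
% so the curves in the figure differ in their coefficients of log n.
```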
Conclusion
We derived an upper bound of the mean field stochastic complexity of complete bipartite graph-type Boltzmann machines.
Future works: deriving a lower bound, and comparison with experimental results.