Robust Full Bayesian Learning for Neural Networks


1 Robust Full Bayesian Learning for Neural Networks
C. Andrieu, J. F. G. de Freitas and A. Doucet, Cambridge University Engineering Department. Arranged by Jinsan Yang.

2 Outline
Introduction
Problem Statement
Bayesian Model
Bayesian Computation
Reversible Jump Simulated Annealing
Convergence Results
Experiments

3 Introduction
ANNs are nonlinear approximation tools used for regression, classification and density estimation.
ANNs can approximate any continuous function arbitrarily well, provided the number of neurons is allowed to grow without bound.
Examples of application: speech recognition (Robinson 1994), handwritten digit recognition (Le Cun et al. 1989), financial modeling (References 1995), medical diagnosis (Baxt 1990).

4 Introduction Related Work
By using Gaussian priors for the weights and smoothing priors, it is possible to estimate the weights, variances and regularization coefficients (MacKay 1992 and others).
Use of hybrid Monte Carlo, and convergence of the NN function to a Gaussian process (Neal 1996 and others).
Methods for selecting the number of neurons:
  Penalized likelihood to avoid over-fitting: AIC, BIC, MDL (standard forms recalled below).
  Predictive assessment: split the data into training, validation and test sets, and choose the number of neurons that balances the bias and the error of the predictor on each set.
  Growing and pruning, for model selection.
  Bayesian growing and pruning by reversible jump MCMC.
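For reference, the penalized-likelihood criteria named above in their standard textbook forms (not reproduced from the paper): for a model with p free parameters, maximized likelihood \(\hat{L}\) and n data points,
\[
\mathrm{AIC} = -2\log\hat{L} + 2p,
\qquad
\mathrm{BIC} = -2\log\hat{L} + p\log n,
\]
while MDL selects the model minimizing the combined description length of the model and of the data given the model, which in its two-part form behaves asymptotically like BIC.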

5 Introduction In this paper
A full Bayesian model is used which accounts for model order uncertainty and regularization, and the results are shown to be robust to the prior specification.
The growing and pruning procedure is automated; AIC, BIC and MDL criteria are used for computational efficiency.
Convergence results are given for the RJMCMC and annealed RJMCMC algorithms.

6 Problem Statement Model
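The model equations on this slide are images in the original presentation. As a rough sketch of the kind of radial basis function regression model the paper uses (the exact parameterization and choice of basis function should be checked against the paper):
\[
y_t = \beta_0 + \beta^{\mathsf{T}} x_t + \sum_{j=1}^{k} a_j\,\phi\!\left(\lVert x_t - \mu_j\rVert\right) + n_t,
\qquad n_t \sim \mathcal{N}(0, \sigma^2),
\]
where k is the number of basis functions, the \(\mu_j\) are the RBF centers, the \(a_j\) (together with the linear terms \(\beta_0, \beta\)) are the coefficients, \(\phi\) is a fixed radial basis function, and \(n_t\) is Gaussian noise.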

7 Problem Statement Objective: estimate the number of basis functions k and the remaining model parameters from the data set {x, y} under the nonlinear regression model.

8 Bayesian Model and Aims
Priors are placed on the unknown k and the model parameters, together with hyper-priors (so the priors themselves are not fixed); from these, the joint distribution of all variables and the likelihood are written down.
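As a sketch of the hierarchical structure this implies (generic notation assumed here: \(\theta\) for the model parameters given k, \(\psi\) for the hyper-parameters):
\[
p(k, \theta, \psi, y \mid x) \;=\;
\underbrace{p(y \mid k, \theta, x)}_{\text{likelihood}}\;
\underbrace{p(\theta \mid k, \psi)}_{\text{parameter prior}}\;
\underbrace{p(k \mid \psi)}_{\text{model-order prior}}\;
\underbrace{p(\psi)}_{\text{hyper-prior}} .
\]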

9 Bayesian Model and Aims
The components of the prior are assumed to be independent of each other given the hyper-parameters, and the variance-type parameters are given conjugate inverse-gamma distributions.
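For reference, the inverse-gamma density used for such conjugate priors (standard form; the specific shape and scale values are those chosen in the paper):
\[
\mathcal{IG}(z;\,a, b) \;=\; \frac{b^{a}}{\Gamma(a)}\, z^{-(a+1)} \exp\!\left(-\frac{b}{z}\right), \qquad z > 0,
\]
which is conjugate to a Gaussian likelihood for variance parameters such as \(\sigma^2\).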

10 Bayesian Model and Aims
Assume the hyper-parameter priors shown on the slide (the slide also gives the DAG of the hierarchical model).

11 Bayesian Model and Aims
Inference aims: the posterior quantities of interest are computed using MCMC (see the sketch below).
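A minimal sketch of how MCMC serves these aims (generic form; the samples \((k^{(i)}, \theta^{(i)})\) are draws from the joint posterior):
\[
\mathbb{E}\!\left[f(k, \theta) \mid x, y\right] \;\approx\; \frac{1}{N} \sum_{i=1}^{N} f\!\left(k^{(i)}, \theta^{(i)}\right),
\qquad \left(k^{(i)}, \theta^{(i)}\right) \sim p(k, \theta \mid x, y),
\]
so quantities such as the posterior model-order probabilities p(k | x, y) (histogrammed on slide 26) and predictive means follow directly from the samples.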

12 Bayesian Model and Aims
Integration of the nuisance parameters (the amplitudes and the noise variance σ²), which can be marginalized analytically thanks to conjugacy.
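A sketch of what this analytic integration looks like in a conjugate Gaussian setting (generic form; the exact closed-form expression is derived in the paper):
\[
p(k, \mu_{1:k} \mid x, y) \;\propto\; \iint p(y \mid k, \mu_{1:k}, \alpha, \sigma^2, x)\; p(\alpha \mid k, \sigma^2)\; p(\sigma^2)\; d\alpha\, d\sigma^2 ,
\]
which has a closed form for a Gaussian likelihood with Gaussian/inverse-gamma priors, so the sampler only needs to explore the centers \(\mu_{1:k}\) and the dimension k.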

13 Bayesian Model and Aims

14 Bayesian Model and Aims

15 Bayesian Computation (fixed k)
Hybrid MCMC sampler (Gibbs steps + MH steps).
1. Updating the RBF centers.
Target distribution: the marginal posterior of the centers (the equation is shown on the slide).
Proposal distribution: each center is proposed either from a uniform distribution (with probability w) or from a conditional normal (with probability 1 - w), and the proposal is accepted with MH probability A (see the sketch below).
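A minimal sketch of such a center update (not the authors' code; log_post stands for the marginal log-posterior of the centers, and the names, ranges and step sizes here are assumptions):

    import numpy as np

    def update_center(mu, j, log_post, x_min, x_max, w=0.5, step=0.1, rng=None):
        """One Metropolis-Hastings update of center j, using the mixture proposal
        described on the slide: uniform over the covariate range with probability w,
        otherwise a normal random walk around the current value. Each component is a
        valid MH kernel on its own, and a fixed-probability mixture of such kernels
        leaves the target distribution invariant."""
        rng = np.random.default_rng() if rng is None else rng
        mu_prop = mu.copy()
        if rng.random() < w:
            # Independence proposal, uniform over [x_min, x_max]: the proposal
            # density is the same in both directions and cancels in the ratio.
            mu_prop[j] = rng.uniform(x_min, x_max)
        else:
            # Symmetric normal random walk: proposal densities also cancel.
            mu_prop[j] = mu[j] + step * rng.standard_normal()
        log_ratio = log_post(mu_prop) - log_post(mu)
        # Accept with probability A = min(1, exp(log_ratio)).
        if np.log(rng.random()) < log_ratio:
            return mu_prop, True
        return mu, False

In the paper the target is the marginal posterior of the centers after the nuisance parameters have been integrated out (slide 12).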

16 Bayesian Computation 2. Sampling the nuisance parameters
3. Sampling the hyper-parameters.
(Steps 2 and 3 are the Gibbs part of the hybrid sampler: the nuisance parameters and the hyper-parameters are drawn from their conjugate full conditional distributions, whose equations are shown on the slide; a generic sketch follows.)
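A minimal sketch of what such conjugate Gibbs draws look like for a generic linear-Gaussian model (not the paper's exact conditionals; the model, priors and all parameter names here are assumptions):

    import numpy as np

    def gibbs_sweep(D, y, sigma2, delta2, a0=2.0, b0=0.1, c0=2.0, d0=0.1, rng=None):
        """One Gibbs sweep over the nuisance parameters (amplitudes a, noise
        variance sigma2) and a hyper-parameter delta2 (prior variance of the
        amplitudes), for the generic model y = D a + n, n ~ N(0, sigma2 I),
        a ~ N(0, delta2 I), sigma2 ~ IG(a0, b0), delta2 ~ IG(c0, d0)."""
        rng = np.random.default_rng() if rng is None else rng
        n, p = D.shape
        # Amplitudes: Gaussian full conditional (ridge-like mean and covariance).
        V = np.linalg.inv(D.T @ D / sigma2 + np.eye(p) / delta2)
        m = V @ D.T @ y / sigma2
        a = rng.multivariate_normal(m, V)
        # Noise variance: inverse-gamma full conditional (draw a Gamma, invert).
        resid = y - D @ a
        sigma2 = 1.0 / rng.gamma(a0 + 0.5 * n, 1.0 / (b0 + 0.5 * resid @ resid))
        # Hyper-parameter: inverse-gamma full conditional from the prior on a.
        delta2 = 1.0 / rng.gamma(c0 + 0.5 * p, 1.0 / (d0 + 0.5 * a @ a))
        return a, sigma2, delta2

Here D would be the design matrix built from the current RBF centers; in the paper's collapsed sampler some of these draws are only needed for prediction, since the nuisance parameters are integrated out of the target.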

17 Bayesian Computation

18 Bayesian Computation (unknown k)
For unknown dimension, the sampler moves across subspaces of different dimension k (up to a maximum kmax). Suggested moves (a sketch of the move selection follows below):
  Birth of a new basis function.
  Death of a randomly chosen basis function.
  Merge of a randomly chosen basis function with its closest neighbour.
  Split of a randomly chosen basis function.
  Update of the RBF centers.
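A minimal sketch of the top-level reversible jump loop this move list implies (placeholder names only; the individual move functions and the move-probability schedule are assumptions, not the authors' code):

    import numpy as np

    def rjmcmc_step(state, move_probs, moves, rng=None):
        """One reversible-jump iteration: choose one of the five move types with
        the probabilities (b_k, d_k, m_k, s_k, u_k) for the current dimension k
        (they must sum to one, as on slide 19), then apply it. Each move returns
        (new_state, accepted) from its own Metropolis-Hastings(-Green) test."""
        rng = np.random.default_rng() if rng is None else rng
        names = ["birth", "death", "merge", "split", "update"]
        probs = [move_probs(name, state["k"]) for name in names]
        name = names[rng.choice(len(names), p=probs)]
        return moves[name](state, rng)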

19 Bayesian Computation
At each iteration one of the candidate moves is chosen at random according to the corresponding probability, where bk + dk + mk + sk + uk = 1 for 0 < k < kmax, with mk = dk, sk = bk and c* = 0.25 (see the sketch below).
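The value c* = 0.25 together with mk = dk and sk = bk suggests the standard construction of Green (1995) and Richardson and Green (1997); as a hedged sketch (the exact formula should be checked against the paper):
\[
b_k = c^{*} \min\!\left\{1, \frac{p(k+1)}{p(k)}\right\},
\qquad
d_{k+1} = c^{*} \min\!\left\{1, \frac{p(k)}{p(k+1)}\right\},
\]
with p(k) the prior on the model order, \(m_k = d_k\), \(s_k = b_k\) and \(u_k = 1 - b_k - d_k - m_k - s_k\); since \(b_k, d_k \le c^{*} = 0.25\), the update probability \(u_k\) is always non-negative.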

20 Bayesian Computation For fixed dimension

21 Bayesian Computation Birth and death moves
rbirth = (posterior distribution ratio) x (proposal ratio) x (Jacobian), and the birth move is accepted with probability min{1, rbirth}; the general form is recalled below.
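For reference, the general Metropolis-Hastings-Green form behind this ratio (standard reversible jump form, not copied from the slide; \(\theta'\) denotes the proposed (k+1)-dimensional parameters obtained from \(\theta\) and the random innovation u):
\[
A_{\mathrm{birth}} = \min\{1, r_{\mathrm{birth}}\},
\qquad
r_{\mathrm{birth}} =
\frac{p(k+1, \theta' \mid x, y)}{p(k, \theta \mid x, y)}
\times
\frac{d_{k+1}}{b_k\, q(u)}
\times
\left|\frac{\partial \theta'}{\partial(\theta, u)}\right| ,
\]
and the corresponding death move is accepted with probability min{1, 1/rbirth} evaluated at the matching states.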

22 Bayesian Computation

23 Bayesian Computation Split and merge moves
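The split and merge equations on this slide are images in the original presentation. A common construction for such moves (a sketch in the style of Richardson and Green, 1997; the paper's exact proposal may differ):
\[
\text{split:}\quad \mu \;\longrightarrow\; \mu_1 = \mu - u\,\varsigma,\;\; \mu_2 = \mu + u\,\varsigma, \quad u \sim \mathcal{U}[0,1],
\qquad
\text{merge:}\quad (\mu_1, \mu_2) \;\longrightarrow\; \mu = \tfrac{1}{2}(\mu_1 + \mu_2),
\]
where \(\varsigma\) is a fixed simulation parameter and the merge is only proposed for a center and its closest neighbour (as listed on slide 18); the acceptance ratio has the same posterior x proposal x Jacobian form as the birth and death moves.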

24 Bayesian Computation For fixed dimension

25 Experimental Results
Modification of the prior to fix ill-conditioning.
Experiment 1: signal detection, with 50 covariate points on [-2, 2].

26 Experimental Results Histograms of p(k|x,y)

27 Bayesian Computation For fixed dimension

28 Experimental Results Classification with discriminants
Data: 9 patients and one control, in a muscle tremor study. Three linear and three angular directions were measured and reduced to 2 features (o: patient, +: control).

29 Experimental Results

