Efficient Bounds for the Softmax Function: Applications to Inference in Hybrid Models
Guillaume Bouchard, Xerox Research Centre Europe
Deterministic Inference in Hybrid Graphical Models
[Figure: graphical model over nodes X0 to X5 and Y1 to Y3. Legend: discrete vs. continuous variable, observed vs. hidden variable.]
- Discrete variables with continuous* parents
  - No sufficient statistic
  - No conjugate distribution
  - Intractable inference
- Approximate deterministic inference
  - Local sampling
  - Deterministic approximations
    - Gaussian quadrature
    - Delta method
    - Laplace approximation
    - Maximize a lower bound to the variational free energy
*or a large number of discrete parents
December 7, 2007 Guillaume Bouchard, Xerox Research Center Europe
Variational inference
[Figure: plate model with inputs X1i, X2i, parameters β1, β2 and label Yi, repeated over data i. Legend: discrete vs. continuous variable, observed vs. hidden variable.]
- Focus on Bayesian multinomial logistic regression
- Mean field approximation
  - Q belongs to an approximation family
  - upper bound? max upper bound?
Bounding the log-partition function (1)
- Binary case dimension: classical bound [Jordan and Jaakkola]
- We propose its multiclass extension
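The equations on this slide did not survive extraction. As a minimal sketch, the code below assumes the classical binary bound is the standard Jaakkola-Jordan quadratic bound on log(1 + e^x), and that the multiclass extension applies it to each term of alpha + sum_k log(1 + e^(x_k - alpha)). The function names (`lam`, `jj_bound`, `multiclass_bound`) and the parameter names `alpha` and `xis` are illustrative choices, not taken from the slides.

```python
import math

def lam(xi):
    """Assumed Jaakkola-Jordan coefficient: (sigmoid(xi) - 1/2) / (2 * xi)."""
    if abs(xi) < 1e-8:
        return 0.125  # limit of the expression as xi -> 0
    return (1.0 / (1.0 + math.exp(-xi)) - 0.5) / (2.0 * xi)

def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def jj_bound(x, xi):
    """Quadratic upper bound on log(1 + exp(x)); tight at x = xi and x = -xi."""
    return lam(xi) * (x * x - xi * xi) + 0.5 * (x - xi) + softplus(xi)

def logsumexp(xs):
    """Numerically stable log(sum_k exp(x_k))."""
    m = max(xs)
    return m + math.log(sum(math.exp(v - m) for v in xs))

def multiclass_bound(xs, alpha, xis):
    """Upper bound on logsumexp(xs) for any alpha and any xis (one per class)."""
    return alpha + sum(jj_bound(v - alpha, xi) for v, xi in zip(xs, xis))

# Hypothetical usage: compare the bound with the exact value.
scores = [0.2, -1.0, 3.0, 0.5]
exact = logsumexp(scores)
bound = multiclass_bound(scores, alpha=0.7, xis=[1.0, 2.0, 2.5, 0.5])
print(exact, bound)
```

The bound holds for every choice of `alpha` and `xis`; these are the variational parameters one would tune to tighten it.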
Bounding the log-partition function (2)
[Figure: two panels, K=2 and K=10]
Other upper bounds
- Concavity of the log [e.g. Blei et al.]
- Worst curvature [Bohning]
- Bound using hyperbolic cosines [Jebara]
- Local approximation [Gibbs]: not proved to be an upper bound
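Two of the bounds listed above have well-known closed forms, sketched here for comparison. The code assumes the "concavity of the log" bound is the tangent bound log z <= z/xi + log(xi) - 1, and that the "worst curvature" bound is the second-order expansion around a point psi with the Hessian replaced by the fixed matrix (1/2)(I - 11^T/K). The function and argument names are illustrative.

```python
import math

def logsumexp(xs):
    """Numerically stable log(sum_k exp(x_k))."""
    m = max(xs)
    return m + math.log(sum(math.exp(v - m) for v in xs))

def concavity_bound(xs, xi):
    """Tangent bound from concavity of log; xi > 0, tight at xi = sum_k exp(x_k)."""
    return sum(math.exp(v) for v in xs) / xi + math.log(xi) - 1.0

def worst_curvature_bound(xs, psi):
    """Quadratic bound around psi using the curvature matrix (1/2)(I - 11^T/K)."""
    K = len(xs)
    lse_psi = logsumexp(psi)
    grad = [math.exp(p - lse_psi) for p in psi]   # softmax(psi)
    d = [x - p for x, p in zip(xs, psi)]
    linear = sum(g * di for g, di in zip(grad, d))
    # (1/2) * d^T A d with A = (1/2)(I - 11^T/K)
    quadratic = 0.25 * (sum(di * di for di in d) - sum(d) ** 2 / K)
    return lse_psi + linear + quadratic

# Hypothetical usage on a 3-class score vector.
scores = [0.2, -1.0, 3.0]
print(logsumexp(scores),
      concavity_bound(scores, xi=5.0),
      worst_curvature_bound(scores, psi=[0.0, 0.0, 0.0]))
```

Both bounds are exact at their expansion point and loosen as the scores move away from it.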
Proof
- Idea: Expand the product of inverted sigmoids
  - Upper-bounded by K quadratic upper bounds
  - Lower-bounded by a linear function (log-convexity of f)
- Proof: apply Jensen inequality to
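One possible reading of the upper-bound part of this argument is written out below. It assumes the "inverted sigmoids" are the factors 1 + e^{x_k - \alpha} and that each resulting term is then bounded by the standard Jaakkola-Jordan quadratic bound; the symbols \alpha, \xi_k and \lambda are notation introduced here, not recovered from the slide.

```latex
\begin{align*}
\prod_{k=1}^{K}\bigl(1 + e^{x_k-\alpha}\bigr)
  &= 1 + \sum_{k=1}^{K} e^{x_k-\alpha} + (\text{non-negative cross terms})
   \;\ge\; \sum_{k=1}^{K} e^{x_k-\alpha}, \\
\log\sum_{k=1}^{K} e^{x_k}
  &\le \alpha + \sum_{k=1}^{K} \log\bigl(1 + e^{x_k-\alpha}\bigr), \\
\log\bigl(1 + e^{x}\bigr)
  &\le \lambda(\xi)\,(x^2 - \xi^2) + \frac{x-\xi}{2} + \log\bigl(1 + e^{\xi}\bigr),
  \qquad \lambda(\xi) = \frac{1}{2\xi}\Bigl[\frac{1}{1+e^{-\xi}} - \frac{1}{2}\Bigr].
\end{align*}
```

Applying the last inequality with x = x_k - \alpha and a separate \xi_k for each class gives K quadratic upper bounds, one per term of the sum.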
Bounds on the Expectation
- Exponential bound
- Quadratic bound
- Simulations
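The two bounds on the expectation can be compared in a small simulation. This sketch assumes Gaussian scores x_k ~ N(mu_k, sigma_k^2), takes the exponential bound to be Jensen's inequality combined with the Gaussian moment generating function, and takes the quadratic bound to be the expectation of the per-class Jaakkola-Jordan bound. The names `mus`, `sigmas`, `alpha` and the Monte Carlo helper are illustrative assumptions.

```python
import math
import random

def lam(xi):
    """Assumed Jaakkola-Jordan coefficient, with its limit 1/8 at xi = 0."""
    if abs(xi) < 1e-8:
        return 0.125
    return (1.0 / (1.0 + math.exp(-xi)) - 0.5) / (2.0 * xi)

def softplus(x):
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def logsumexp(xs):
    m = max(xs)
    return m + math.log(sum(math.exp(v - m) for v in xs))

def exponential_bound(mus, sigmas):
    """E[log sum_k e^{x_k}] <= log sum_k exp(mu_k + sigma_k^2 / 2)."""
    return logsumexp([m + 0.5 * s * s for m, s in zip(mus, sigmas)])

def quadratic_bound(mus, sigmas, alpha):
    """Expected quadratic bound, with xi_k^2 = E[(x_k - alpha)^2]."""
    total = alpha
    for m, s in zip(mus, sigmas):
        second_moment = (m - alpha) ** 2 + s * s
        xi = math.sqrt(second_moment)
        total += lam(xi) * (second_moment - xi * xi)   # zero at this xi
        total += 0.5 * (m - alpha - xi) + softplus(xi)
    return total

def monte_carlo(mus, sigmas, n=20000, seed=0):
    """Sampling estimate of E[log sum_k e^{x_k}] with independent Gaussians."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(n):
        acc += logsumexp([rng.gauss(m, s) for m, s in zip(mus, sigmas)])
    return acc / n

mus, sigmas = [0.0, 1.0, -1.0], [1.0, 1.0, 1.0]
print(monte_carlo(mus, sigmas),
      exponential_bound(mus, sigmas),
      quadratic_bound(mus, sigmas, alpha=0.0))
```

Which bound is tighter depends on the means, the variances and the number of classes, so the printed values are only one data point.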
Bayesian multinomial logistic regression
- Exponential bound
  - Cannot be maximized in closed form
  - Gradient-based optimization
  - Fixed point equation (unstable!)
- Quadratic bound
  - Analytic update:
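The update equation itself is missing from the extracted slide. As an illustration of why the quadratic bound admits closed-form updates, the sketch below alternates the two coordinate-wise minimizers of the expected quadratic bound for Gaussian scores x_k ~ N(mu_k, sigma_k^2): xi_k^2 = E[(x_k - alpha)^2], and the value of alpha that zeroes the derivative of the bound. These formulas are derived here from the assumed Jaakkola-Jordan form and are not necessarily the update shown in the talk; all names are hypothetical.

```python
import math

def lam(xi):
    """Assumed Jaakkola-Jordan coefficient, with its limit 1/8 at xi = 0."""
    if abs(xi) < 1e-8:
        return 0.125
    return (1.0 / (1.0 + math.exp(-xi)) - 0.5) / (2.0 * xi)

def softplus(x):
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def expected_bound(mus, sigmas, alpha, xis):
    """Expectation of alpha + sum_k quadratic bound on log(1 + e^{x_k - alpha})."""
    total = alpha
    for m, s, xi in zip(mus, sigmas, xis):
        second_moment = (m - alpha) ** 2 + s * s
        total += lam(xi) * (second_moment - xi * xi)
        total += 0.5 * (m - alpha - xi) + softplus(xi)
    return total

def coordinate_updates(mus, sigmas, alpha=0.0, n_iter=20):
    """Alternate closed-form updates of xi (per class) and alpha."""
    K = len(mus)
    history = []
    for _ in range(n_iter):
        # xi update: xi_k^2 = E[(x_k - alpha)^2]
        xis = [math.sqrt((m - alpha) ** 2 + s * s) for m, s in zip(mus, sigmas)]
        # alpha update: root of the derivative of the bound with respect to alpha
        lams = [lam(xi) for xi in xis]
        alpha = ((K - 2) / 4.0 + sum(l * m for l, m in zip(lams, mus))) / sum(lams)
        history.append(expected_bound(mus, sigmas, alpha, xis))
    return alpha, xis, history

alpha, xis, history = coordinate_updates([0.0, 1.0, -1.0], [1.0, 1.0, 1.0])
print(alpha, history[0], history[-1])
```

Each step minimizes the bound exactly in one block of variables, so the recorded bound values never increase; no step size or line search is needed.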
Numerical experiments
- Iris dataset
  - 4 dimensions
  - 3 classes
  - Prior: unit variance
- Experiment
  - Learning: batch updates
  - Compared to MCMC estimation based on 100K samples
  - Error = Euclidean distance between the mean and variance parameters
- Results
  - The “worst curvature” bound is faster and better…
Conclusion
- Multinomial links in graphical models are feasible
  - Existing bounds work well
  - We can expect further improvements
- Remark
  - Better bounds are only needed for the Bayesian setting
  - For MAP estimation, even a loose bound converges
- Future work
  - Application to discriminative learning
  - Mixture-based mean-field approximation
Backup slides
Jebara’s bound
- One dimension: hyperbolic cosine bound
- Multi-dimensional case