Toric Modification on Machine Learning Keisuke Yamazaki & Sumio Watanabe Tokyo Institute of Technology
Agenda Learning theory and algebraic geometry Two forms of the Kullback divergence Toric modification Application to a binomial mixture model Summary
Agenda Learning theory and algebraic geometry Two forms of the Kullback divergence Toric modification Application to a binomial mixture model Summary
What is the generalization error? The true model Data Learning model Predictive model i.i.d. MLE MAP Bayes Generalization Error
Algebraic geometry connected to learning theory in the Bayes method. The formal definition of the generalization error. The true model DataLearning model Predictive model
Algebraic geometry connected to learning theory in the Bayes method. The formal definition of the generalization error. Another Kullback divergence : The true model DataLearning model Predictive model
Algebraic geometry connected to learning theory in the Bayes method. The formal definition of the generalization error. Another Kullback divergence : The true model DataLearning model Predictive model Machine Learning / Learning Theory Algebraic Geometry
The zeta function has an important role for the connection. Zeta function : h(w) holomorphic function C-infinity func. with compact support h(w) Machine Learning / Learning Theory Algebraic Geometry Analytic func.
The zeta function has an important role for the connection. Zeta function : Machine Learning / Learning Theory Algebraic Geometry holomorphic function Prior distribution Kullback divergence
The largest pole of the zeta function determines the generalization error. Machine Learning / Learning Theory Algebraic Geometry Asymptotic Bayes generalization error [ Watanabe 2001]
Agenda Learning theory and algebraic geometry Two forms of the Kullback divergence Toric modification Application to a binomial mixture model Summary
Calculation of the zeta function requires well-formed H(w). : Polynomial form : Factorized form Resolution of singularities
The resolution of singularities with blow-ups is an iterative method. Blow-ups
The resolution of singularities with blow-ups is an iterative method. Blow-ups Kullback divergence in ML is complicated and has high dimensional w :
The bottleneck is the iterative method. Blow-ups Applicable function Computational cost Any function Large
Toric modification is a systematic method for the resolution of singularities. Blow-upsToric Modification Applicable function Computational cost Any function Large
Agenda Learning theory and algebraic geometry Two forms of the Kullback divergence Toric modification Application to a binomial mixture model Summary
Newton diagram is a convex hull in the exponent space Toric Modification
The borders determine a set of vectors in the dual space. Toric Modification
Add some vectors subdividing the spanned area. Toric Modification
Selected vectors construct the resolution map. Toric Modification
Non-degenerate Kullback divergence The condition to apply the toric modification to the Kullback divergence For all i : Toric Modification
Toric modification is “systematic”. Toric Modification Blow-ups Blow-up The search space will be large. We cannot know how many iterations we need. The number of vectors is limited.
Toric modification can be an effective plug-in method. Blow-upsToric Modification Applicable function Computational cost Any function Large Non-degenerate function Limited
Agenda Learning theory and algebraic geometry Two forms of the Kullback divergence Toric modification Application to a binomial mixture model Summary
An application to a mixture model Mixture of binomial distributions The true model: Learning model:
The Newton diagram of the mixture
The resolution map based on the toric modification
The generalization error of the mixture of binomial distributions : Factorized form The zeta function: : Generalization error : Polynomial form
Agenda Learning theory and algebraic geometry Two forms of the Kullback divergence Toric modification Application to a binomial mixture model Summary
The Bayesian generalization error is derived on the basis of the zeta function. Calculation of the coefficients requires the factorized form of the Kullback divergence. Toric modification is an effective method to find the factorized form. The error of a binomial mixture is derived as the application.
Thank you !!! Toric Modification on Machine Learning Keisuke Yamazaki & Sumio Watanabe Tokyo Institute of Technology
Asymptotic form of G.E. has been derived in regular models. Essential assumption: Fisher information matrix