Label Distribution Learning and Its Applications. Xin Geng (耿新), Pattern Learning and Mining (PALM) Lab (模式学习与挖掘实验室, http://palm.seu.edu.cn), School of Computer Science and Engineering, Southeast University (东南大学), Nanjing, China
Learning with Ambiguity: label ambiguity grows from single-label learning (less ambiguity) to multi-label learning (more ambiguity). What comes after multi-label learning?
Label Ambiguity in Multi-label Learning: “What describes the instance?” Example labels: cloud, sky, water, building.
More Ambiguity? “How to describe the instance?” Example: some cloud, mostly sky, much water, a bit of building.
How to learn? Option 1: apply thresholding to obtain positive labels, then use multi-label learning (MLL). Not a good choice! Option 2: Label Distribution Learning (LDL), which assigns a real number to each label (importance, confidence level, ……). Keep more, learn more.
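LDL – Problem Formulation. Description degree: a real number assigned to a label for an instance, indicating the degree to which the label describes the instance. WLOG, each description degree lies in [0, 1]. Complete label set: the description degrees of all labels sum to 1 for each instance. Label distribution: the description degrees of all labels for one instance.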
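The two constraints in the formulation can be illustrated with a minimal sketch. The function names, the raw scores, and the choice of simple sum-normalization are assumptions made for illustration; the slides do not prescribe how raw annotations are turned into description degrees.

```python
def to_label_distribution(scores):
    """Normalize non-negative raw scores so the description degrees sum to 1."""
    if any(s < 0 for s in scores.values()):
        raise ValueError("raw scores must be non-negative")
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("at least one score must be positive")
    return {label: s / total for label, s in scores.items()}


def is_label_distribution(dist, tol=1e-9):
    """Check both constraints: each degree in [0, 1], and the degrees sum to 1."""
    in_range = all(0.0 <= d <= 1.0 for d in dist.values())
    return in_range and abs(sum(dist.values()) - 1.0) <= tol


# Hypothetical raw scores for the scene example (mostly sky, a bit of building).
raw = {"cloud": 2.0, "sky": 5.0, "water": 2.5, "building": 0.5}
dist = to_label_distribution(raw)
print(dist)
print(is_label_distribution(dist))
```

Thresholding this distribution into positive labels would discard the difference between "mostly sky" and "a bit of building", which is the information LDL keeps.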
LDL – Algorithms: two categories. (1) Conditional probability mass function (classification): model the mapping from the instance x to the label distribution d via a conditional PMF. (2) Multivariate support vector regression (regression): model the mapping from the instance x to the label distribution d via a multivariate support vector machine.
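Conditional Probability Mass Function: Learning from Label Distributions. Training set: instances x_i paired with their label distributions d_i. Goal: learn a conditional probability mass function that, given the instance x_i, generates a label distribution similar to d_i. Similarity is measured by the K-L divergence.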
Conditional Probability Mass Function: directly minimize the K-L divergence between the predicted and the real label distributions (LDs), with the conditional PMF parameterized as a maximum entropy (MaxEnt) model.
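A minimal sketch of this idea follows, assuming a MaxEnt model of the softmax form p(y_j | x) ∝ exp(θ_j · x) and using plain gradient descent as the optimizer. Gradient descent is swapped in here for brevity; it is not IIS-LLD or BFGS-LLD. The toy data, learning rate, and function names are hypothetical.

```python
import math


def maxent_predict(theta, x):
    """MaxEnt (softmax) model: p(y_j | x) is proportional to exp(theta_j . x)."""
    scores = [sum(t * v for t, v in zip(row, x)) for row in theta]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def kl_divergence(d, p, eps=1e-12):
    """K-L divergence from the real distribution d to the predicted one p."""
    return sum(dj * math.log(dj / max(pj, eps)) for dj, pj in zip(d, p) if dj > 0)


def total_kl(theta, X, D):
    return sum(kl_divergence(d, maxent_predict(theta, x)) for x, d in zip(X, D))


def gradient_step(theta, X, D, lr=1.0):
    """One gradient-descent step; d(KL)/d(theta_jk) = sum_i (p_ij - d_ij) * x_ik."""
    grad = [[0.0] * len(theta[0]) for _ in theta]
    for x, d in zip(X, D):
        p = maxent_predict(theta, x)
        for j in range(len(theta)):
            for k in range(len(x)):
                grad[j][k] += (p[j] - d[j]) * x[k]
    n = len(X)
    return [[t - lr * g / n for t, g in zip(row, grow)]
            for row, grow in zip(theta, grad)]


# Hypothetical toy data: two instances (last feature is a bias), three labels.
X = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
D = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]
theta = [[0.0] * 3 for _ in range(3)]

loss_before = total_kl(theta, X, D)
for _ in range(500):
    theta = gradient_step(theta, X, D)
loss_after = total_kl(theta, X, D)
print(loss_before, loss_after)
```

Because every real distribution sums to 1, the gradient reduces to the difference between predicted and real description degrees, weighted by the features.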
Conditional Probability Mass Function IIS-LLD [Geng, Yin, and Zhou, TPAMI’13] [Geng, Smith-Miles, and Zhou, AAAI’10]
Conditional Probability Mass Function BFGS-LLD [Geng and Ji, ICDMW’13]
Conditional Probability Mass Function CPNN [Geng, Yin, and Zhou, TPAMI’13]
Multivariate Support Vector Regression: two issues. (1) How to output a distribution composed of multiple components? Multivariate Support Vector Regression (M-SVR) [Fernandez et al., TSP’04]. (2) How to constrain each component of the distribution within the range of a probability, i.e., [0, 1]? Model the regression by a sigmoid function. Solving the two problems simultaneously: LDSVR [Geng and Hou, submitted to IJCAI’15] fits a sigmoid function to each component of the label distribution simultaneously by a support vector machine.
Multivariate Support Vector Regression: sigmoid model; target function of SVR; loss function.
Multivariate Support Vector Regression, the loss function: insensitive zone applied dimension by dimension. Problem: examples falling into the area ρ1 are penalized once, while those falling into the area ρ2 are penalized twice.
Multivariate Support Vector Regression, the loss function: multivariate insensitive zone. Problem: difficult to optimize and to apply the kernel trick.
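The difference between the two insensitive zones can be sketched as follows. The dimension-wise variant sums a standard ε-insensitive loss per component; the multivariate variant applies one zone to the Euclidean norm of the error, with a quadratic penalty outside it, which is assumed here to follow the M-SVR formulation. The error vectors and ε are hypothetical.

```python
import math


def dimensionwise_loss(error, eps):
    """Sum of per-dimension epsilon-insensitive losses (zone is a hypercube)."""
    return sum(max(0.0, abs(e) - eps) for e in error)


def multivariate_loss(error, eps):
    """One insensitive zone on the Euclidean norm of the error (zone is a ball).

    The quadratic penalty outside the zone is an assumption of this sketch.
    """
    u = math.sqrt(sum(e * e for e in error))
    return 0.0 if u < eps else (u - eps) ** 2


eps = 0.1
one_dim_outside = (0.3, 0.05)   # outside the zone in one dimension only
two_dims_outside = (0.3, 0.3)   # outside the zone in both dimensions
corner = (0.08, 0.08)           # inside the hypercube, outside the ball

for err in (one_dim_outside, two_dims_outside, corner):
    print(err, dimensionwise_loss(err, eps), multivariate_loss(err, eps))
```

In the dimension-wise case the second error vector is charged in both dimensions, while the multivariate zone treats every direction of the error uniformly through its norm.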
Multivariate Support Vector Regression, the loss function: measure the loss by calculating how far from z_i another point z′_i ∈ R^c must lie to give the same output as the ground truth.
Multivariate Support Vector Regression, the loss function: replacing u_i with u′_i/4; insensitive zone.
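One way to see why the factor 1/4 appears: the sigmoid has slope at most 1/4, so a distance measured before the sigmoid, divided by 4, bounds the distance measured after it. The sketch below checks this numerically. It assumes that u_i denotes the Euclidean distance between the predicted and the real label distribution and that u′_i denotes the distance between z_i and z′_i; the slide formulas are not available, so both readings and all numbers are assumptions.

```python
import math


def sigmoid(z):
    return [1.0 / (1.0 + math.exp(-v)) for v in z]


def logit(d):
    """Inverse sigmoid: the point z' whose sigmoid output equals d."""
    return [math.log(v / (1.0 - v)) for v in d]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


z = [0.4, -1.0, -2.2]        # hypothetical pre-sigmoid model output z_i
d_true = [0.7, 0.2, 0.1]     # hypothetical ground-truth label distribution

z_prime = logit(d_true)                 # z'_i maps exactly to the ground truth
u = euclidean(sigmoid(z), d_true)       # distance after the sigmoid
u_prime = euclidean(z, z_prime)         # distance before the sigmoid

print(u, u_prime / 4.0)
print(u <= u_prime / 4.0)
```

Measuring the error before the sigmoid keeps the loss in the space where the regression is linear in the (kernelized) features.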
Age Estimation [Geng, Yin, and Zhou, TPAMI’13] [Geng, Smith-Miles, and Zhou, AAAI’10]: aging is a slow and gradual process, so faces at close ages look quite similar. Can we use the neighboring ages to relieve the ‘lack of training samples’ problem?
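A minimal sketch of how neighboring ages could share one training face: the single chronological age is replaced by a label distribution that peaks at the true age and decays over nearby ages. The discretized Gaussian shape, the value of sigma, and the age range are assumptions of this sketch, not details stated on the slide.

```python
import math


def age_label_distribution(true_age, ages, sigma=2.0):
    """Discretized Gaussian centered at the true age, normalized to sum to 1."""
    weights = [math.exp(-((a - true_age) ** 2) / (2.0 * sigma ** 2)) for a in ages]
    total = sum(weights)
    return [w / total for w in weights]


ages = list(range(0, 70))               # hypothetical label set: ages 0..69
dist = age_label_distribution(25, ages)

# The face labeled 25 also contributes, with smaller degrees, to ages 23..27.
for a in range(23, 28):
    print(a, round(dist[a], 4))
```

With such a distribution, one face image serves as a training example for its own age and, to a lesser degree, for the adjacent ages.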
Age Estimation [Geng, Yin, and Zhou, TPAMI’13] [Geng, Smith-Miles, and Zhou, AAAI’10] Experiment
Head Pose Estimation [Geng and Xia, CVPR’14] Bivariate Label Distribution
Head Pose Estimation [Geng and Xia, CVPR’14] Experiment
Multilabel Ranking for Natural Scene Images [Geng and Luo, CVPR’14] Multilabel ranking requires (1) a bipartition of the relevant (positive) and irrelevant (negative) labels and (2) a proper ranking over the relevant labels. Multiple rankers: subjective, inconsistent “ground truth”.
Multilabel Ranking for Natural Scene Images [Geng and Luo, CVPR’14] Multilabel Ranking by Preference Distribution: virtual labels serve as the split point between relevant and irrelevant labels.
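The split-point idea can be sketched as follows: once a preference distribution over the real labels plus one virtual label is available, labels ranked above the virtual label are treated as relevant and the rest as irrelevant. The label names, the numbers, and the function name are hypothetical; how the preference distribution is learned in the cited work is not shown here.

```python
def rank_with_virtual_label(preference, virtual="<virtual>"):
    """Rank labels by preference and split them at the virtual label."""
    threshold = preference[virtual]
    ranked = sorted((label for label in preference if label != virtual),
                    key=lambda label: preference[label], reverse=True)
    relevant = [label for label in ranked if preference[label] > threshold]
    irrelevant = [label for label in ranked if preference[label] <= threshold]
    return relevant, irrelevant


# Hypothetical predicted preference distribution for one scene image.
preference = {
    "sky": 0.40, "water": 0.25, "cloud": 0.15,
    "<virtual>": 0.10,
    "building": 0.06, "desert": 0.04,
}
relevant, irrelevant = rank_with_virtual_label(preference)
print(relevant)
print(irrelevant)
```

A single distribution thus yields both outputs of multilabel ranking: the bipartition and the order among the relevant labels.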
Multilabel Ranking for Natural Scene Images [Geng and Luo, CVPR’14] Experiment
Crowd Counting [Wang, Zhang and Geng, Neurocomputing’15]
Pre-release Prediction of Crowd Opinion on Movies [Geng and Hou, submitted to IJCAI’15] Predict the crowd rating distribution from pre-release metadata.
Pre-release Prediction of Crowd Opinion on Movies [Geng and Hou, submitted to IJCAI’15] Experiment
Conclusion. Label distribution learning: a more general framework than single-label and multi-label learning; deals with different importance of labels; matches certain problems better; needs special design. It is useful when: there is a natural measure of description degree; there are multiple labeling sources for one instance; the labels are correlated with each other; ……
Interested? Download the LDL Matlab package from http://cse.seu.edu.cn/PersonalPage/xgeng/LDL
Thank You http://palm.seu.edu.cn