Label Distribution Learning and Its Applications

Presentation transcript:

Label Distribution Learning and Its Applications. Xin Geng (耿新), Pattern Learning and Mining (PALM) Lab (http://palm.seu.edu.cn), School of Computer Science and Engineering, Southeast University, Nanjing, China.

Learning with Ambiguity: Single-label Learning, Multi-label Learning, then "?". Label ambiguity runs from less ambiguity to more ambiguity.

Label Ambiguity: Multi-label Learning. “What describes the instance?” Labels: cloud, sky, water, building.

More Ambiguity? “How to describe the instance?” Some cloud, mostly sky, much water, a bit of building.

How to learn? Option 1: threshold the description into positive labels, then apply MLL. Not a good choice! Option 2: Label Distribution Learning (LDL), which assigns a real number to each label (importance, confidence level, ……). Keep more, learn more.

LDL – Problem Formulation. Description degree: a real number $d_x^y$ is assigned to the label $y$ for the instance $x$. WLOG, $d_x^y \in [0, 1]$. Label distribution with a complete label set: $\sum_y d_x^y = 1$.

LDL – Problem Formulation
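A minimal sketch of the formulation, assuming the usual LDL convention that description degrees lie in [0, 1] and sum to 1 over the complete label set. The function name `to_label_distribution` and the raw scores for the scene example are hypothetical, chosen only to illustrate "some cloud, mostly sky, much water, a bit of building".

```python
def to_label_distribution(scores):
    """Normalize non-negative label scores into description degrees.

    The result maps each label to a value in [0, 1], and the values sum to 1.
    """
    if any(s < 0 for s in scores.values()):
        raise ValueError("scores must be non-negative")
    total = sum(scores.values())
    if total == 0:
        raise ValueError("at least one score must be positive")
    return {label: s / total for label, s in scores.items()}


# Hypothetical raw scores for the natural-scene example.
scene = to_label_distribution(
    {"cloud": 1.0, "sky": 5.0, "water": 3.0, "building": 1.0}
)
```

Thresholding this distribution into positive labels would keep only which labels apply and discard how strongly each one applies.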

LDL – Algorithms. Two categories: (1) Conditional Probability Mass Function (classification): model the mapping from the instance x to the label distribution d via a conditional PMF. (2) Multivariate Support Vector Regression (regression): model the mapping from the instance x to the label distribution d via a multivariate support vector machine.

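Conditional Probability Mass Function: learning from label distributions. Training set: instances paired with their label distributions. Goal: learn a conditional probability mass function that, given each instance, can generate a label distribution similar to the one provided for it, with similarity measured by the K-L divergence.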
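The K-L divergence used as the similarity measure can be sketched as follows. This is an illustrative helper, not code from the LDL package; the name `kl_divergence` and the `eps` floor that guards against division by zero are assumptions.

```python
import math


def kl_divergence(real, predicted, eps=1e-12):
    """KL(real || predicted) for two distributions over the same label order.

    Terms with a zero real degree contribute nothing; predicted degrees are
    floored at eps so the logarithm stays finite.
    """
    return sum(
        d * math.log(d / max(p, eps))
        for d, p in zip(real, predicted)
        if d > 0
    )


same = kl_divergence([0.5, 0.5], [0.5, 0.5])
different = kl_divergence([0.5, 0.5], [0.25, 0.75])
```

The divergence is zero when the two distributions coincide and grows as the prediction drifts away from the real label distribution.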

Conditional Probability Mass Function: directly minimize the K-L divergence between the predicted and real label distributions (LDs), using a maximum entropy (MaxEnt) model.
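A rough sketch of this idea, assuming the MaxEnt model takes the common exponential form p(y|x; θ) ∝ exp(θ_y · x). Plain gradient descent is swapped in here for the IIS and BFGS optimizers named on the following slides, purely to keep the example short. The toy data and the names `maxent_predict`, `kl_loss`, and `fit` are hypothetical.

```python
import math


def maxent_predict(theta, x):
    """Return p(y | x; theta) for every label y (softmax of theta[y] . x)."""
    scores = [sum(t * xk for t, xk in zip(row, x)) for row in theta]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def kl_loss(theta, X, D):
    """Sum over instances of KL(real distribution || predicted distribution)."""
    total = 0.0
    for x, d in zip(X, D):
        p = maxent_predict(theta, x)
        total += sum(dj * math.log(dj / pj) for dj, pj in zip(d, p) if dj > 0)
    return total


def fit(X, D, lr=0.5, steps=2000):
    """Minimize the K-L loss by gradient descent (an illustrative optimizer)."""
    n_labels, n_features = len(D[0]), len(X[0])
    theta = [[0.0] * n_features for _ in range(n_labels)]
    for _ in range(steps):
        grad = [[0.0] * n_features for _ in range(n_labels)]
        for x, d in zip(X, D):
            p = maxent_predict(theta, x)
            for j in range(n_labels):
                for k in range(n_features):
                    # d(KL)/d(theta[j][k]) = (p_j - d_j) * x_k
                    grad[j][k] += (p[j] - d[j]) * x[k]
        for j in range(n_labels):
            for k in range(n_features):
                theta[j][k] -= lr * grad[j][k] / len(X)
    return theta


# Toy data: two instances (first feature acts as a bias), three labels.
X = [[1.0, 0.0], [1.0, 1.0]]
D = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]
theta = fit(X, D)
```

Because the real degrees of each instance sum to 1, the gradient reduces to the difference between predicted and real degrees, scaled by the feature value.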

Conditional Probability Mass Function IIS-LLD [Geng, Yin, and Zhou, TPAMI’13] [Geng, Smith-Miles, and Zhou, AAAI’10]

Conditional Probability Mass Function BFGS-LLD [Geng and Ji, ICDMW’13]

Conditional Probability Mass Function CPNN [Geng, Yin, and Zhou, TPAMI’13]

Multivariate Support Vector Regression. Two issues: (1) How to output a distribution composed of multiple components? Multivariate Support Vector Regression (M-SVR) [Fernandez et al., TSP’04]. (2) How to constrain each component of the distribution within the range of a probability, i.e., [0, 1]? Model the regression by a sigmoid function. Solve the two problems simultaneously: LDSVR [Geng and Hou, submitted to IJCAI’15] fits a sigmoid function to each component of the label distribution simultaneously by a support vector machine.

Multivariate Support Vector Regression: sigmoid model, target function of SVR, loss function.
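A small sketch of how a sigmoid keeps every output component inside the probability range. The linear map `W`, `b` stands in for the regression function (no kernel is used here), and the final division by the sum is an assumption added so the result is a proper distribution; none of these names come from the slides.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function with values in (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_distribution(W, b, x):
    """Map x to one sigmoid output per label, then normalize to sum to 1."""
    z = [sum(w * xk for w, xk in zip(row, x)) + bj for row, bj in zip(W, b)]
    g = [sigmoid(zj) for zj in z]
    total = sum(g)
    return [gj / total for gj in g]


# Hypothetical parameters for three labels and two features.
W = [[2.0, -1.0], [0.5, 0.5], [-3.0, 1.0]]
b = [0.1, -0.2, 0.0]
d_hat = predict_distribution(W, b, [1.0, 2.0])
```

All components are produced jointly from the same input, which is the multivariate part; the sigmoid handles the range constraint.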

Multivariate Support Vector Regression: the loss function. Dimension-by-dimension insensitive zone. Problem: examples falling into the area ρ1 are penalized once, while those falling into the area ρ2 are penalized twice.

Multivariate Support Vector Regression: the loss function. Multivariate insensitive zone. Problem: difficult to optimize and to apply the kernel trick.
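The contrast between the two insensitive zones can be illustrated with a toy comparison. The per-dimension version sums ordinary ε-insensitive losses, so an error outside the zone in two dimensions is counted twice; the multivariate version penalizes the Euclidean norm of the whole error vector once. The quadratic form of the multivariate loss and the value ε = 0.1 are assumptions made for illustration.

```python
import math


def per_dimension_loss(error, eps):
    """Sum of epsilon-insensitive losses, one per output dimension."""
    return sum(max(0.0, abs(e) - eps) for e in error)


def multivariate_loss(error, eps):
    """Quadratic epsilon-insensitive loss on the norm of the error vector."""
    norm = math.sqrt(sum(e * e for e in error))
    return max(0.0, norm - eps) ** 2


eps = 0.1
once = per_dimension_loss([0.15, 0.0], eps)     # outside in one dimension
twice = per_dimension_loss([0.15, 0.15], eps)   # outside in both dimensions
corner_box = per_dimension_loss([0.09, 0.09], eps)
corner_ball = multivariate_loss([0.09, 0.09], eps)
```

The last two values show the geometric difference: a point near the corner of the hypercube is free under the per-dimension zone but already penalized under the hyperspherical one.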

Multivariate Support Vector Regression: the loss function. Measure the loss by calculating how far away from $z_i$ another point $z'_i \in \mathbb{R}^c$ must be to give the same output as the ground truth.

Multivariate Support Vector Regression: the loss function. Replacing $u_i$ with $u'_i/4$. Insensitive zone.

Age Estimation [Geng, Yin, and Zhou, TPAMI’13] [Geng, Smith-Miles, and Zhou, AAAI’10]. Aging is a slow and gradual process. Faces at close ages look quite similar. Can we use the neighboring ages to relieve the ‘lack of training samples’ problem?
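One way to let neighboring ages share a training face is to replace the single age label with a distribution centered on the chronological age. The sketch below assumes a discretized Gaussian shape with a hypothetical width `sigma`; the slide itself does not state which shape or width is used.

```python
import math


def age_label_distribution(true_age, ages, sigma=2.0):
    """Spread one age label over neighboring ages with a Gaussian profile."""
    weights = [
        math.exp(-((a - true_age) ** 2) / (2.0 * sigma ** 2)) for a in ages
    ]
    total = sum(weights)
    return [w / total for w in weights]


ages = list(range(0, 70))
dist = age_label_distribution(25, ages)
peak_age = ages[dist.index(max(dist))]
```

A face labeled 25 then also contributes, with smaller description degrees, to learning ages such as 24 and 26.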

Age Estimation [Geng, Yin, and Zhou, TPAMI’13] [Geng, Smith-Miles, and Zhou, AAAI’10] Experiment

Head Pose Estimation [Geng and Xia, CVPR’14] Bivariate Label Distribution

Head Pose Estimation [Geng and Xia, CVPR’14] Experiment

Multilabel Ranking for Natural Scene Images [Geng and Luo, CVPR’14]. Multilabel ranking: a bipartition of the relevant (positive) and irrelevant (negative) labels, plus a proper ranking over the relevant labels. Multiple rankers: subjective, inconsistent “ground truth”.

Multilabel Ranking for Natural Scene Images [Geng and Luo, CVPR’14]. Multilabel ranking by preference distribution: virtual labels serve as the split point between relevant and irrelevant labels.
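A hedged sketch of how a virtual label could act as the split point: labels whose degree in the preference distribution exceeds that of the virtual label are treated as relevant and ranked by degree, and the rest are irrelevant. The label name `"<virtual>"`, the tie rule, and the numbers are hypothetical.

```python
def rank_with_virtual_label(distribution, virtual="<virtual>"):
    """Split real labels into ranked relevant and irrelevant lists."""
    threshold = distribution[virtual]
    ranked = sorted(
        (label for label in distribution if label != virtual),
        key=lambda label: distribution[label],
        reverse=True,
    )
    relevant = [l for l in ranked if distribution[l] > threshold]
    irrelevant = [l for l in ranked if distribution[l] <= threshold]
    return relevant, irrelevant


# Hypothetical predicted preference distribution for one image.
preference = {
    "sky": 0.40,
    "water": 0.30,
    "<virtual>": 0.15,
    "cloud": 0.10,
    "building": 0.05,
}
relevant, irrelevant = rank_with_virtual_label(preference)
```

A single predicted distribution thus yields both the bipartition and the ranking over the relevant labels.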

Multilabel Ranking for Natural Scene Images [Geng and Luo, CVPR’14] Experiment

Crowd Counting [Wang, Zhang and Geng, Neurocomputing’15]

Pre-release Prediction of Crowd Opinion on Movies [Geng and Hou, submitted to IJCAI’15]. Input: pre-release metadata. Output: crowd rating distribution.

Pre-release Prediction of Crowd Opinion on Movies [Geng and Hou, submitted to IJCAI’15] Experiment

Conclusion. Label distribution learning: a more general framework than single-label and multi-label learning; deals with different importance of labels; matches certain problems better; needs special design. It is useful when: there is a natural measure of description degree; there are multiple labeling sources for one instance; the labels are correlated with each other; ……

Interested? Download the LDL Matlab package from http://cse.seu.edu.cn/PersonalPage/xgeng/LDL

Thank You http://palm.seu.edu.cn