Published by Morgan Russell. Modified over 9 years ago.
1
Unsupervised Transfer Classification: Application to Text Categorization
Tianbao Yang, Rong Jin, Anil Jain, Yang Zhou, Wei Tong
Michigan State University
2
Overview
  Introduction
  Related Work
  Unsupervised Transfer Classification
  Problem Definition
  Approach & Analysis
  Experiments
  Conclusions
3
Introduction
  Classification: supervised learning, semi-supervised learning
  What if no label information is available?
  Classification is impossible in general, but not with some additional information
  Spectrum: supervised, semi-supervised, unsupervised classification
4
Introduction
  Unsupervised transfer classification (UTC): given a collection of training examples and their assignments to auxiliary classes, build a classification model for a target class
  Auxiliary class 1, …, auxiliary class K: assignments are available
  Target class: no labeled training examples; only the prior and conditional probabilities are available
5
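Introduction: Motivating Examples
  Image annotation: binary assignments (0/1) are given for the words sky, sun, and water; the assignments for grass are unknown (?)
  Social tagging: tags phone, verizon, apple, and google; binary assignments (0/1) are given for the auxiliary tags, and the target tag is unknown (?)
  Known columns are the auxiliary classes; unknown columns are the target classes
  How to predict an annotation word or social tag that does not appear in the training data?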
6
Related Work
  Transfer Learning: transfer knowledge from a source domain to a target domain
  Similarity: transfers label information for auxiliary classes to the target class
  Difference: assumes NO label information for the target class
  Multi-Label Learning, Maximum Entropy Model
7
Unsupervised Transfer Classification
  Data for auxiliary classes: examples and their assignments to auxiliary classes
  Class information: prior probability and conditional probabilities
  Goal: a target classification model that predicts the target class label
8
Maximum Entropy Model (MaxEnt)
  Objective: favor the uniform distribution
  Constraints: feature statistics computed from the conditional model match feature statistics computed from the training data
  (symbol lost in extraction): the jth feature function
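The equations on this slide did not survive extraction. As a reference point only, a standard conditional MaxEnt formulation that matches the three annotations (entropy objective, model-side statistics, data-side statistics) can be written as below. The notation ($p$, $f_j$, $x_i$, $y_i$, $n$, $d$) is assumed here and may differ from the authors' slides.

```latex
\begin{aligned}
\max_{p}\quad & -\sum_{i=1}^{n}\sum_{y} p(y \mid x_i)\,\log p(y \mid x_i)
  && \text{(favor the uniform distribution)} \\
\text{s.t.}\quad
  & \frac{1}{n}\sum_{i=1}^{n}\sum_{y} p(y \mid x_i)\, f_j(x_i, y)
    \;=\; \frac{1}{n}\sum_{i=1}^{n} f_j(x_i, y_i),
  && j = 1,\dots,d, \\
  & \sum_{y} p(y \mid x_i) = 1, && i = 1,\dots,n .
\end{aligned}
```

The left-hand side of the first constraint is the feature statistic computed from the conditional model; the right-hand side is the same statistic computed from the labeled training data.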
9
Generalized MaxEnt
  The bound on the feature statistics holds with a large probability (formula lost in extraction)
  Equality constraints are replaced by inequality constraints
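One common way to relax the equality constraints of MaxEnt is to bound the gap between model-side and data-side feature statistics. The form below is a sketch of that idea, not a transcription of the slide: the tolerance $\beta_j$ and the shorthand $\hat{m}_j$ for the data-side statistic are assumptions, and the authors' generalized MaxEnt may use a different penalty or norm.

```latex
\begin{aligned}
\max_{p}\quad & -\sum_{i=1}^{n}\sum_{y} p(y \mid x_i)\,\log p(y \mid x_i) \\
\text{s.t.}\quad
  & \left|\, \frac{1}{n}\sum_{i=1}^{n}\sum_{y} p(y \mid x_i)\, f_j(x_i, y) \;-\; \hat{m}_j \,\right|
    \;\le\; \beta_j, && j = 1,\dots,d, \\
  & \sum_{y} p(y \mid x_i) = 1, && i = 1,\dots,n .
\end{aligned}
```

Setting every $\beta_j = 0$ recovers the equality-constrained problem; a positive $\beta_j$ tolerates estimation error in $\hat{m}_j$, which is the situation when the statistic holds only with a large probability.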
10
Generalized MaxEnt
11
Feature statistics computed from training data: unknown for the target class
How to extend generalized MaxEnt to unsupervised transfer classification?
12
Unsupervised Transfer Classification
  Estimating the feature statistics of the target class from those of the auxiliary classes
13
Building the Relation between Auxiliary Classes and Target Class
  Independence Assumption
14
Unsupervised Transfer Classification
  Estimating feature statistics for the target class by regression
  Inputs: feature statistics for the auxiliary classes, class information
  Output: feature statistics for the target class
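The regression step can be illustrated with a small least-squares sketch. It assumes one plausible reading of the independence assumption: the mean feature vector of each auxiliary class is a mixture of the target-positive and target-negative mean feature vectors, weighted by p(target = 1 | auxiliary class k). The function name, argument names, and this mixture model are assumptions made for illustration, not the estimator from the paper.

```python
import numpy as np


def estimate_target_statistics(aux_stats, cond_probs):
    """Estimate target-class feature statistics by least squares.

    aux_stats:  (K, d) array; row k is the mean feature vector of the
                examples assigned to auxiliary class k.
    cond_probs: (K,) array; entry k is the assumed class information
                p(target = 1 | auxiliary class k).

    Returns U with shape (2, d): row 0 estimates the mean feature vector
    of target-positive examples, row 1 that of target-negative examples.
    """
    aux_stats = np.asarray(aux_stats, dtype=float)
    p = np.asarray(cond_probs, dtype=float)
    # Assumed mixture model: aux_stats[k] ~ p[k] * U[0] + (1 - p[k]) * U[1]
    W = np.column_stack([p, 1.0 - p])
    # Solve min_U ||W U - aux_stats||_F^2
    U, *_ = np.linalg.lstsq(W, aux_stats, rcond=None)
    return U


if __name__ == "__main__":
    U_true = np.array([[1.0, 0.0, 2.0], [0.0, 1.0, 0.0]])
    p = np.array([0.9, 0.5, 0.1])
    M = np.column_stack([p, 1.0 - p]) @ U_true
    print(estimate_target_statistics(M, p))
```

With at least two auxiliary classes whose conditional probabilities differ, the system is identifiable; more auxiliary classes average out noise in the individual statistics.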
15
Unsupervised Transfer Classification
  Dual problem
  (symbol lost in extraction): a function of U; its definition can be found in the paper
16
Consistency Result
  With a large probability, a bound (formula lost in extraction) relates two quantities:
  the optimal dual solution using the label information for the target class
  the dual solution obtained by the proposed approach
17
Experiments
  Task: text categorization
  Data sets: multi-labeled data
  Protocol: leave one class out as the target class
  Metric: AUC (area under the ROC curve)
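The protocol and metric can be sketched in a few lines of plain Python. This is a minimal illustration with hypothetical function names: `leave_one_class_out` enumerates the target/auxiliary splits, and `auc` computes the area under the ROC curve as the fraction of (positive, negative) pairs ranked correctly, counting ties as one half.

```python
def leave_one_class_out(class_names):
    """Yield (target, auxiliary_classes) pairs, one per class."""
    for target in class_names:
        auxiliary = [c for c in class_names if c != target]
        yield target, auxiliary


def auc(scores, labels):
    """Area under the ROC curve via pairwise ranking.

    scores: predicted scores, higher means more likely positive.
    labels: ground-truth labels, 1 for positive and 0 for negative.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    for target, auxiliary in leave_one_class_out(["a", "b", "c"]):
        print(target, auxiliary)
    print(auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

The pairwise form is quadratic in the number of examples; it is used here for clarity rather than speed.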
18
Experiments: Baselines
  cModel: train a classifier for each auxiliary class, then linearly combine them for the target class
  cLabel: predict the assignment of the target class for the training examples by linearly combining the labels of the auxiliary classes, then train a classifier using the predicted labels for the target class
  GME-avg: use the generalized MaxEnt model; compute the feature statistics for the target class by linearly combining those for the auxiliary classes
  Proposed approach: GME-Reg
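The first step of the cLabel baseline, combining auxiliary labels into a predicted target label, might look like the sketch below. The slides do not say how the combination weights are chosen, so `weights` and `threshold` are hypothetical inputs here, and the function names are illustrative only.

```python
import numpy as np


def combine_auxiliary_labels(Y_aux, weights):
    """Linearly combine auxiliary-class labels into a target-class score.

    Y_aux:   (n, K) binary matrix; Y_aux[i, k] = 1 if example i belongs
             to auxiliary class k.
    weights: (K,) combination weights (hypothetical; not given in the slides).
    """
    Y_aux = np.asarray(Y_aux, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return Y_aux @ weights


def to_pseudo_labels(scores, threshold=0.5):
    """Turn scores into 0/1 pseudo-labels for the target class."""
    return (np.asarray(scores) >= threshold).astype(int)


if __name__ == "__main__":
    Y_aux = [[1, 0], [0, 1], [1, 1]]
    scores = combine_auxiliary_labels(Y_aux, [0.7, 0.2])
    print(scores, to_pseudo_labels(scores))
```

The pseudo-labels would then be used to train an ordinary supervised classifier for the target class, which is the second step of cLabel.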
19
Experiment (I) Estimate class information from training data
20
Experiment (I)
  Comparison to the classifier of the target class learned by supervised learning
  Figure annotations: 1500, 2500
21
Experiment (II)
  Obtain class information from external sources
  Datasets: bibtex and delicious
  External sources:
    bibsonomy (www.bibsonomy.org/tags): bibtex
    ACM DL (www.portal.acm.org): bibtex
    del.icio.us (www.delicious.com/tag): delicious
22
Experiment (II)
  Comparison with supervised classification
  Figure annotations: 650, 1000~1200
23
Conclusions
  A new problem: unsupervised transfer classification
  A statistical framework for unsupervised transfer classification
    based on generalized maximum entropy
    robust estimation of feature statistics for the target class
    provable performance through consistency analysis
  Future Work
    relax the independence assumption
    better estimation of feature statistics for the target class
24
Thanks
Questions?