Published by Percival Baldwin. Modified over 6 years ago.
1
Dirichlet-Bernoulli Alignment: A Generative Model for Multi-Class Multi-Label Multi-Instance Corpora
Shuang-Hong Yang, Hongyuan Zha, Bao-Gang Hu, NIPS 2009
Presented by Haojun Chen
Some contents are from the authors' paper and poster
2
Outline
Introduction
Dirichlet-Bernoulli Alignment (DBA) Model
Model Inference and Prediction
Experiments
Conclusion
3
Introduction
This paper considers the multi-class, multi-label, multi-instance classification (M3C) problem.
Goal: infer class labels for both a pattern and its instances.
Pattern (e.g. document), Class (e.g. topic), Instance (e.g. paragraph), Feature (e.g. word)
Figure is adopted from the authors' poster
4
Problem Formalization
For a multi-class, multi-label, multi-instance corpus, we define:
the set of input patterns
the corresponding labels
the set of instances in pattern n
the dictionary of features
a bag of discrete features
the class label
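The quantities defined on this slide can be sketched as a small data structure. This is only an illustration: the names `Pattern`, `Corpus`, `instances`, `labels`, and `dictionary` are hypothetical, and the toy vocabulary is made up.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List

# Hypothetical names throughout; they are not the slide's notation.
Instance = List[int]  # a bag of discrete features, stored as dictionary indices


@dataclass
class Pattern:
    instances: List[Instance]              # the set of instances in this pattern
    labels: FrozenSet[int] = frozenset()   # pattern-level class labels (may be several)


@dataclass
class Corpus:
    dictionary: List[str]                  # dictionary of features
    num_classes: int
    patterns: List[Pattern] = field(default_factory=list)


# Toy corpus: one document with two paragraphs and two class labels.
corpus = Corpus(dictionary=["oil", "price", "wheat", "export"], num_classes=3)
corpus.patterns.append(
    Pattern(instances=[[0, 1, 1], [2, 3]], labels=frozenset({0, 2}))
)
```

Storing each instance as a list of dictionary indices keeps repeated features, which is what a bag (rather than a set) of features requires.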
5
Basic Assumption
Assumption 1 [Exchangeability]: A corpus is a bag of patterns, and each pattern is a bag of instances.
Assumption 2 [Distinguishability]: Each pattern can belong to several classes, but each instance belongs to a single class.
Tree Structure Assumption
6
Dirichlet-Bernoulli Alignment (DBA) Model (1)
DBA generative process: Sample pattern-level class mixture
7
Dirichlet-Bernoulli Alignment (DBA) Model (2)
For each of the M instances in X:
Choose an instance-level class label
Generate the instance according to the observation model
8
Dirichlet-Bernoulli Alignment (DBA) Model (3)
Generate pattern-level label where
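The three generative steps across the DBA model slides can be simulated with a short sketch. The slides' formulas did not survive extraction, so the distributions here are assumptions: a Dirichlet prior for the class mixture, a categorical draw for each instance label, and a multinomial bag-of-features observation model. The pattern-level label step in particular is a stand-in (independent Bernoulli draws on the empirical class proportions), not the authors' labeling model. All function and parameter names are hypothetical.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw from Dirichlet(alpha) by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_pattern(alpha, feature_probs, num_instances, instance_len, rng):
    """Generate one pattern: class mixture, instance labels, instances, labels.

    alpha, feature_probs, and instance_len are illustrative parameters,
    not values taken from the slides.
    """
    num_classes = len(alpha)
    vocab = range(len(feature_probs[0]))

    # Step 1: sample the pattern-level class mixture.
    theta = sample_dirichlet(alpha, rng)

    z, instances = [], []
    for _ in range(num_instances):
        # Step 2a: choose the instance-level class label.
        z_m = rng.choices(range(num_classes), weights=theta, k=1)[0]
        # Step 2b: generate the instance (ASSUMED multinomial bag of features).
        x_m = rng.choices(vocab, weights=feature_probs[z_m], k=instance_len)
        z.append(z_m)
        instances.append(x_m)

    # Step 3: generate the pattern-level label. ASSUMPTION: one Bernoulli draw
    # per class, with success probability equal to the fraction of instances
    # assigned to that class. This is a stand-in, not the slides' model.
    y = [int(rng.random() < z.count(c) / num_instances) for c in range(num_classes)]
    return theta, z, instances, y


rng = random.Random(0)
alpha = [0.5, 0.5, 0.5]
feature_probs = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.7, 0.1, 0.1],
    [0.1, 0.1, 0.4, 0.4],
]
theta, z, instances, y = sample_pattern(alpha, feature_probs, 5, 8, rng)
```

Under this stand-in, a class that no instance was assigned to can never appear in the pattern-level label, which mirrors the idea that pattern labels are aligned with instance labels.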
9
Model Inference and Prediction
Parameter Estimation (MLE)
Variational Approximation
Prediction
Pattern Classification:
Instance Disambiguation:
10
Why The Name? Lower bound Fourth term
11
Experiments 1 Text classification
ModApte split of the Reuters text collection, documents, 10 classes
Each paragraph of a document is represented with the vector space model
Documents with empty label sets or length < 20 are eliminated
Remaining: 1879 docs, of which 721 (38.4%) have multiple labels
Compared with multinomial-event-model-based Naive Bayes (MNB) and two state-of-the-art multi-instance multi-label classifiers (MIMLSVM and MIMLBOOST)
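The filtering step and the multi-label share can be sketched as follows. This assumes that "length" means token count, which the slide does not state; the function names and toy documents are made up for illustration.

```python
def keep_document(tokens, labels, min_length=20):
    """Keep a document only if it has at least one label and enough tokens.

    ASSUMPTION: 'length' is measured in tokens; the slide does not say.
    """
    return len(labels) > 0 and len(tokens) >= min_length


def multi_label_fraction(label_sets):
    """Fraction of documents that carry more than one label."""
    return sum(1 for labels in label_sets if len(labels) > 1) / len(label_sets)


# Toy documents (made up for illustration): (tokens, labels)
docs = [
    (["w"] * 25, {"grain", "wheat"}),
    (["w"] * 30, {"crude"}),
    (["w"] * 10, {"earn"}),   # dropped: shorter than 20 tokens
    (["w"] * 40, set()),      # dropped: empty label set
]
kept = [(tokens, labels) for tokens, labels in docs if keep_document(tokens, labels)]
toy_fraction = multi_label_fraction([labels for _, labels in kept])

# The slide's reported share: 721 of 1879 remaining documents have multiple labels.
reported_share = round(100 * 721 / 1879, 1)
```

The last line reproduces the 38.4% figure from the two counts given on the slide.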
12
Experiments 2 Named entity disambiguation
Yahoo! Answers query log crawled in 2008, 101 classes, questions
300 entities for training and 100 for test
Compared with Multinomial Naive Bayes with TF (MNBTF) or TFIDF (MNBTFIDF) attributes, as well as a linear SVM classifier with TF (SVMTF) or TFIDF (SVMTFIDF) attributes
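The difference between the TF and TFIDF attributes used by the baselines can be illustrated with a minimal sketch. The slide does not say which weighting variant was used; this assumes raw-count TF and an unsmoothed idf of log(N / df). The function names and the toy questions are hypothetical.

```python
import math
from collections import Counter


def tf_vector(tokens):
    """Term-frequency attributes: raw counts of each term in one document."""
    return dict(Counter(tokens))


def tfidf_vectors(docs):
    """TF-IDF attributes with idf = log(N / df). ASSUMED variant, unsmoothed."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    idf = {term: math.log(n / count) for term, count in df.items()}
    return [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs]


# Toy questions mentioning the ambiguous entity "apple" (made up).
docs = [
    ["apple", "fruit", "apple"],
    ["apple", "company"],
]
tf = [tf_vector(d) for d in docs]
tfidf = tfidf_vectors(docs)
```

With this variant, a term that occurs in every document (here "apple") gets weight zero, so the context words carry the disambiguating signal.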
13
Conclusion
A Dirichlet-Bernoulli Alignment model is proposed and shown to be useful for both pattern classification and instance disambiguation.