Dirichlet-Bernoulli Alignment: A Generative Model for Multi-Class Multi-Label Multi-Instance Corpora
Shuang-Hong Yang, Hongyuan Zha, Bao-Gang Hu, NIPS 2009
Presented by Haojun Chen
Some contents are from the authors' paper and poster.

Outline
- Introduction
- Dirichlet-Bernoulli Alignment (DBA) Model
- Model Inference and Prediction
- Experiments
- Conclusion

Introduction
This paper considers the multi-class, multi-label, multi-instance classification (M3C) problem.
Goal: infer class labels for both a pattern and its instances.
Terminology: pattern (e.g. document), class (e.g. topic), instance (e.g. paragraph), feature (e.g. word).
Figure is adopted from the authors' poster.

Problem Formalization
For a multi-class, multi-label, multi-instance corpus, we define:
- the set of input patterns and the corresponding label sets,
- the set of instances contained in pattern n,
- the dictionary of features,
- each instance as a bag of discrete features,
- the class labels.
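To make the formalization concrete, here is a minimal sketch (not from the paper; the class and field names are hypothetical) of how such an M3C corpus could be represented in Python:

```python
from dataclasses import dataclass
from typing import List, Set

@dataclass
class Pattern:
    """One pattern (e.g. a document) in an M3C corpus."""
    instances: List[List[int]]   # each instance (e.g. paragraph) is a bag of discrete feature ids (words)
    labels: Set[int]             # pattern-level label set: a subset of {0, ..., K-1}

# A toy corpus with K = 3 classes and integer feature ids drawn from a small dictionary.
corpus = [
    Pattern(instances=[[0, 2, 2, 5], [1, 1, 4]], labels={0, 2}),  # two paragraphs, two topics
    Pattern(instances=[[3, 3, 6]],               labels={1}),     # one paragraph, one topic
]
```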

Basic Assumptions (Tree Structure)
- Assumption 1 [Exchangeability]: A corpus is a bag of patterns, and each pattern is a bag of instances.
- Assumption 2 [Distinguishability]: Each pattern can belong to several classes, but each instance belongs to a single class.
Together, these assumptions induce a tree structure: corpus, patterns, instances.

Dirichlet-Bernoulli Alignment (DBA) Model (1)
DBA generative process:
- Sample the pattern-level class mixture from a Dirichlet prior.

Dirichlet-Bernoulli Alignment (DBA) Model (2)
For each of the M instances in pattern X:
- Choose an instance-level class label from the pattern's class mixture.
- Generate the instance according to the class-conditional observation model.

Dirichlet-Bernoulli Alignment (DBA) Model (3)
Generate the pattern-level label set: for each class, a Bernoulli variable whose parameter is determined by the instance-level class assignments. This alignment of the Dirichlet class mixture with the Bernoulli labels gives the model its name.
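The three slides above describe the full generative process. The following is a rough numerical sketch of it, assuming multinomial observation models and a simple fraction-of-instances rule for the label step; the paper's exact alignment function may differ, so treat this as an illustration rather than the authors' model:

```python
import numpy as np

rng = np.random.default_rng(0)
K, V, M, L = 3, 50, 4, 8                   # classes, vocabulary size, instances per pattern, words per instance
alpha = np.ones(K)                          # Dirichlet prior over classes
beta = rng.dirichlet(np.ones(V), size=K)    # per-class word distributions (observation model)

# 1. Sample the pattern-level class mixture.
theta = rng.dirichlet(alpha)

# 2. For each instance, choose a class label and generate its features.
z = rng.choice(K, size=M, p=theta)                   # instance-level class labels
X = [rng.choice(V, size=L, p=beta[k]) for k in z]    # each instance is a bag of words

# 3. Generate the pattern-level label set from the instance labels
#    (assumption: a Bernoulli per class with rate = fraction of instances in that class,
#    standing in for the paper's alignment function).
pi = np.bincount(z, minlength=K) / M
y = rng.binomial(1, pi)
```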

Model Inference and Prediction
- Parameter estimation: maximum likelihood (MLE) with a variational approximation.
- Prediction:
  - Pattern classification: predict a pattern's label set from the pattern-level label posteriors.
  - Instance disambiguation: assign each instance to its most probable class.
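As an illustration of the two prediction rules, here is a hedged sketch that assumes inference has already produced per-instance class posteriors `phi` (shape M x K) and pattern-level label probabilities `p_y` (length K); the function name and the 0.5 threshold are illustrative, not taken from the paper:

```python
import numpy as np

def predict(phi: np.ndarray, p_y: np.ndarray, threshold: float = 0.5):
    """phi[m, k]: posterior of instance m belonging to class k;
    p_y[k]: posterior probability that the pattern carries label k."""
    # Pattern classification: keep every class whose label probability clears the threshold.
    pattern_labels = {k for k, p in enumerate(p_y) if p > threshold}
    # Instance disambiguation: each instance gets its single most probable class.
    instance_labels = phi.argmax(axis=1)
    return pattern_labels, instance_labels
```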

Why the Name?
- Lower bound
- Fourth term

Experiment 1: Text Classification
- Data: ModApte split of the Reuters-21578 text collection (10,788 documents, 10 classes).
- Each paragraph of a document is represented with the vector-space model.
- Documents with empty label sets or length < 20 are eliminated; 1,879 documents remain, 721 of them (38.4%) with multiple labels.
- Compared with multinomial-event-model Naive Bayes (MNB) and two state-of-the-art multi-instance multi-label classifiers (MIMLSVM and MIMLBOOST).
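For context, a minimal sketch of the paragraph-level bag-of-words representation described in this experiment, using scikit-learn's CountVectorizer; this illustrates the preprocessing idea only, not the authors' pipeline, and splitting paragraphs on blank lines is an assumption:

```python
from sklearn.feature_extraction.text import CountVectorizer

docs = ["First paragraph about grain.\n\nSecond paragraph about trade.",
        "A single paragraph about crude oil."]

vectorizer = CountVectorizer()
vectorizer.fit(docs)  # build the dictionary over whole documents

# Each document becomes a bag of instances: one sparse term-count vector per paragraph.
patterns = [vectorizer.transform(doc.split("\n\n")) for doc in docs]
print(patterns[0].shape)  # (num_paragraphs, vocabulary_size)
```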

Experiment 2: Named Entity Disambiguation
- Data: Yahoo! Answers query log crawled in 2008; 101 classes, 216,563 questions.
- 300 entities for training and 100 for testing.
- Compared with multinomial Naive Bayes using TF (MNBTF) or TF-IDF (MNBTFIDF) features, and a linear SVM classifier using TF (SVMTF) or TF-IDF (SVMTFIDF) features.
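The four baselines follow a standard text-classification pattern; here is a hedged scikit-learn sketch with toy data (the example questions and class names are made up and stand in for the Yahoo! Answers data, so this is not the authors' experimental setup):

```python
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC

baselines = {
    "MNBTF":    make_pipeline(CountVectorizer(), MultinomialNB()),
    "MNBTFIDF": make_pipeline(TfidfVectorizer(), MultinomialNB()),
    "SVMTF":    make_pipeline(CountVectorizer(), LinearSVC()),
    "SVMTFIDF": make_pipeline(TfidfVectorizer(), LinearSVC()),
}

# Toy training data: questions mentioning an ambiguous entity, labeled with its sense.
train_texts = ["Who is Michael Jordan the basketball player?",
               "Who is Michael Jordan the Berkeley professor?"]
train_labels = ["athlete", "scientist"]

for name, model in baselines.items():
    model.fit(train_texts, train_labels)
    print(name, model.predict(["Is Michael Jordan still teaching machine learning?"]))
```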

Conclusion
The Dirichlet-Bernoulli Alignment model is proposed and shown to be useful for both pattern classification and instance disambiguation.