Dirichlet-Bernoulli Alignment: A Generative Model for Multi-Class Multi-Label Multi-Instance Corpora
Shuang-Hong Yang, Hongyuan Zha, Bao-Gang Hu, NIPS 2009
Presented by Haojun Chen
Some contents are from the authors' paper and poster.

Outline
- Introduction
- Dirichlet-Bernoulli Alignment (DBA) Model
- Model Inference and Prediction
- Experiments
- Conclusion

Introduction
This paper considers the multi-class, multi-label, multi-instance classification (M3C) problem.
Goal: infer class labels for both a pattern and its instances.
Terminology: pattern (e.g. document), class (e.g. topic), instance (e.g. paragraph), feature (e.g. word).
Figure is adopted from the authors' poster.

Problem Formalization
For a multi-class, multi-label, multi-instance corpus, we define:
- the set of input patterns and the corresponding label sets,
- the set of instances contained in pattern n,
- the dictionary of features,
- each instance as a bag of discrete features,
- the class labels.
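To make the formalization concrete, here is a minimal sketch (not from the paper; the class and field names are hypothetical) of how such an M3C corpus could be represented in Python:

```python
from dataclasses import dataclass
from typing import List, Set

@dataclass
class Pattern:
    """One pattern (e.g. a document) in an M3C corpus."""
    instances: List[List[int]]   # each instance (e.g. paragraph) is a bag of discrete feature ids (words)
    labels: Set[int]             # pattern-level label set: a subset of {0, ..., K-1}

# A toy corpus with K = 3 classes and integer feature ids drawn from a small dictionary.
corpus = [
    Pattern(instances=[[0, 2, 2, 5], [1, 1, 4]], labels={0, 2}),  # two paragraphs, two topics
    Pattern(instances=[[3, 3, 6]],               labels={1}),     # one paragraph, one topic
]
```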

Basic Assumptions (Tree Structure)
- Assumption 1 [Exchangeability]: A corpus is a bag of patterns, and each pattern is a bag of instances.
- Assumption 2 [Distinguishability]: Each pattern can belong to several classes, but each instance belongs to a single class.
Together, these assumptions induce a tree structure: corpus, patterns, instances.

Dirichlet-Bernoulli Alignment (DBA) Model (1)
DBA generative process:
- Sample the pattern-level class mixture from a Dirichlet prior.

Dirichlet-Bernoulli Alignment (DBA) Model (2)
For each of the M instances in pattern X:
- Choose an instance-level class label from the pattern's class mixture.
- Generate the instance according to the class-conditional observation model.

Dirichlet-Bernoulli Alignment (DBA) Model (3)
Generate the pattern-level label set: for each class, a Bernoulli variable whose parameter is determined by the instance-level class assignments. This alignment of the Dirichlet class mixture with the Bernoulli labels gives the model its name.
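The three slides above describe the full generative process. The following is a rough numerical sketch of it, assuming multinomial observation models and a simple fraction-of-instances rule for the label step; the paper's exact alignment function may differ, so treat this as an illustration rather than the authors' model:

```python
import numpy as np

rng = np.random.default_rng(0)
K, V, M, L = 3, 50, 4, 8                   # classes, vocabulary size, instances per pattern, words per instance
alpha = np.ones(K)                          # Dirichlet prior over classes
beta = rng.dirichlet(np.ones(V), size=K)    # per-class word distributions (observation model)

# 1. Sample the pattern-level class mixture.
theta = rng.dirichlet(alpha)

# 2. For each instance, choose a class label and generate its features.
z = rng.choice(K, size=M, p=theta)                   # instance-level class labels
X = [rng.choice(V, size=L, p=beta[k]) for k in z]    # each instance is a bag of words

# 3. Generate the pattern-level label set from the instance labels
#    (assumption: a Bernoulli per class with rate = fraction of instances in that class,
#    standing in for the paper's alignment function).
pi = np.bincount(z, minlength=K) / M
y = rng.binomial(1, pi)
```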

Model Inference and Prediction
- Parameter estimation: maximum likelihood (MLE) with a variational approximation.
- Prediction:
  - Pattern classification: predict a pattern's label set from the pattern-level label posteriors.
  - Instance disambiguation: assign each instance to its most probable class.
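As an illustration of the two prediction rules, here is a hedged sketch that assumes inference has already produced per-instance class posteriors `phi` (shape M x K) and pattern-level label probabilities `p_y` (length K); the function name and the 0.5 threshold are illustrative, not taken from the paper:

```python
import numpy as np

def predict(phi: np.ndarray, p_y: np.ndarray, threshold: float = 0.5):
    """phi[m, k]: posterior of instance m belonging to class k;
    p_y[k]: posterior probability that the pattern carries label k."""
    # Pattern classification: keep every class whose label probability clears the threshold.
    pattern_labels = {k for k, p in enumerate(p_y) if p > threshold}
    # Instance disambiguation: each instance gets its single most probable class.
    instance_labels = phi.argmax(axis=1)
    return pattern_labels, instance_labels
```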

Why the Name?
- Lower bound
- Fourth term

Experiment 1: Text Classification
- Data: ModApte split of the Reuters-21578 text collection (10,788 documents, 10 classes).
- Each paragraph of a document is represented with the vector-space model.
- Documents with empty label sets or length < 20 are eliminated; 1,879 documents remain, 721 of them (38.4%) with multiple labels.
- Compared with multinomial-event-model Naive Bayes (MNB) and two state-of-the-art multi-instance multi-label classifiers (MIMLSVM and MIMLBOOST).
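For context, a minimal sketch of the paragraph-level bag-of-words representation described in this experiment, using scikit-learn's CountVectorizer; this illustrates the preprocessing idea only, not the authors' pipeline, and splitting paragraphs on blank lines is an assumption:

```python
from sklearn.feature_extraction.text import CountVectorizer

docs = ["First paragraph about grain.\n\nSecond paragraph about trade.",
        "A single paragraph about crude oil."]

vectorizer = CountVectorizer()
vectorizer.fit(docs)  # build the dictionary over whole documents

# Each document becomes a bag of instances: one sparse term-count vector per paragraph.
patterns = [vectorizer.transform(doc.split("\n\n")) for doc in docs]
print(patterns[0].shape)  # (num_paragraphs, vocabulary_size)
```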

Experiment 2: Named Entity Disambiguation
- Data: Yahoo! Answers query log crawled in 2008; 101 classes, 216,563 questions.
- 300 entities for training and 100 for testing.
- Compared with multinomial Naive Bayes using TF (MNBTF) or TF-IDF (MNBTFIDF) features, and a linear SVM classifier using TF (SVMTF) or TF-IDF (SVMTFIDF) features.
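The four baselines follow a standard text-classification pattern; here is a hedged scikit-learn sketch with toy data (the example questions and class names are made up and stand in for the Yahoo! Answers data, so this is not the authors' experimental setup):

```python
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC

baselines = {
    "MNBTF":    make_pipeline(CountVectorizer(), MultinomialNB()),
    "MNBTFIDF": make_pipeline(TfidfVectorizer(), MultinomialNB()),
    "SVMTF":    make_pipeline(CountVectorizer(), LinearSVC()),
    "SVMTFIDF": make_pipeline(TfidfVectorizer(), LinearSVC()),
}

# Toy training data: questions mentioning an ambiguous entity, labeled with its sense.
train_texts = ["Who is Michael Jordan the basketball player?",
               "Who is Michael Jordan the Berkeley professor?"]
train_labels = ["athlete", "scientist"]

for name, model in baselines.items():
    model.fit(train_texts, train_labels)
    print(name, model.predict(["Is Michael Jordan still teaching machine learning?"]))
```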

Conclusion
The Dirichlet-Bernoulli Alignment model is proposed and shown to be useful for both pattern classification and instance disambiguation.