Discriminative Training of Chow-Liu Tree Multinet Classifiers

Presentation transcript:

Discriminative Training of Chow-Liu Tree Multinet Classifiers
Huang, Kaizhu, Dept. of Computer Science and Engineering, The Chinese University of Hong Kong (CUHK)

Outline
- Background
  - Classifiers: discriminative classifiers and generative classifiers
  - Bayesian Multinet Classifiers
- Motivation
- Discriminative Bayesian Multinet Classifiers
- Experiments
- Conclusion

Discriminative Classifiers: directly maximize a discriminative function; a representative example is the Support Vector Machine (SVM).
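To make the slide's one-liner concrete, here is a minimal sketch of a margin-maximizing discriminative classifier using scikit-learn. The toy data and the choice of a linear kernel are my assumptions, not part of the talk:

```python
# A minimal sketch of a discriminative classifier: a linear SVM that
# directly maximizes the margin between two classes. The synthetic
# two-cluster data below is hypothetical, used only for illustration.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.0, 1.0, (50, 2)),   # cluster for class 0
               rng.normal(+1.0, 1.0, (50, 2))])  # cluster for class 1
y = np.array([0] * 50 + [1] * 50)

clf = SVC(kernel="linear", C=1.0).fit(X, y)      # margin maximization
print(clf.predict([[2.0, 2.0]]))                 # predicted class label
```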

Generative Classifiers: estimate a distribution for each class, P1(x|C1) and P2(x|C2), and then use Bayes' rule to perform classification.
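As a hedged illustration of this recipe (a Gaussian per class is my assumption; the talk's models are Chow-Liu trees), the following sketch fits P1 and P2 and classifies with Bayes' rule:

```python
# A minimal generative classifier: fit one Gaussian density per class,
# then assign x to argmax_i P(Ci) * Pi(x|Ci) via Bayes' rule.
import numpy as np
from scipy.stats import multivariate_normal

def fit_class_conditional(X):
    """Estimate a Gaussian density from the samples of one class."""
    return multivariate_normal(mean=X.mean(axis=0), cov=np.cov(X.T))

rng = np.random.default_rng(0)
X1 = rng.normal(-1.0, 1.0, (50, 2))   # hypothetical samples for class C1
X2 = rng.normal(+1.0, 1.0, (50, 2))   # hypothetical samples for class C2
p1, p2 = fit_class_conditional(X1), fit_class_conditional(X2)
prior1 = prior2 = 0.5                 # equal class priors assumed

x = np.array([0.3, 0.4])
label = 1 if prior1 * p1.pdf(x) >= prior2 * p2.pdf(x) else 2
print(f"Bayes' rule assigns x to class C{label}")
```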

Comparison. [Figure] Example of missing information; from left to right: original digit, cropped and resized digit, digit with 50% of pixels missing, digit with 75% of pixels missing, and occluded digit.

Comparison (Continued). Discriminative classifiers cannot deal with missing-information problems easily. Generative classifiers provide a principled way to handle them: when part of the input x is missing, we can marginalize P1 and P2 over the missing dimensions and perform classification with the marginal densities.
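A hedged sketch of this marginalization step, continuing the Gaussian illustration above (my assumption, not the slides' equations): for a joint Gaussian, integrating out the missing dimensions simply selects the observed sub-mean and sub-covariance.

```python
# Classification with missing features under a Gaussian generative model:
# marginalize each class density over the missing dimensions, then apply
# Bayes' rule to the observed coordinates only.
import numpy as np
from scipy.stats import multivariate_normal

def marginal_pdf(mean, cov, x_obs, obs_idx):
    """Density of the observed coordinates, missing ones integrated out."""
    m = mean[obs_idx]
    S = cov[np.ix_(obs_idx, obs_idx)]
    return multivariate_normal(mean=m, cov=S).pdf(x_obs)

# Hypothetical 2-D class models; feature 1 is missing, feature 0 observed.
mean1, cov1 = np.array([-1.0, -1.0]), np.eye(2)
mean2, cov2 = np.array([+1.0, +1.0]), np.eye(2)
obs_idx, x_obs = [0], np.array([0.2])

s1 = 0.5 * marginal_pdf(mean1, cov1, x_obs, obs_idx)   # P(C1) * P1(x_obs)
s2 = 0.5 * marginal_pdf(mean2, cov2, x_obs, obs_idx)   # P(C2) * P2(x_obs)
print("class C1" if s1 >= s2 else "class C2")
```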

Handling the Missing-Information Problem: SVM (discriminative) versus TJT, a generative model.

Motivation: a good classifier should combine the strategies of discriminative and generative classifiers. Our work trains one kind of generative classifier, the generative Bayesian Multinet Classifier, in a discriminative way.

Roadmap of our work

How does our work relate to other work?
1. Jaakkola and Haussler (NIPS 1998) connect generative and discriminative classifiers. Difference: our method performs the reverse process, going from generative classifiers to discriminative classifiers.
2. Discriminative training of HMMs and GMMs (Beaufays et al., ICASSP 1999; Hastie et al., JRSS 1996). Difference: our method is designed for Bayesian Multinet Classifiers, a more general classifier.

Problems of Bayesian Multinet Classifiers. The standard framework: split the pre-classified dataset into sub-dataset D1 for class 1 and sub-dataset D2 for class 2; estimate the distribution P1 to approximate D1 accurately and the distribution P2 to approximate D2 accurately; then use Bayes' rule to perform classification. Comment: this framework discards the divergence information between the classes, because each distribution is fitted to its own class in isolation (a sketch of the per-class estimation step follows below).
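Since the talk's per-class models are Chow-Liu trees, here is a hedged sketch of that estimation step: learn the maximum-likelihood tree structure over pairwise mutual information. This is my own illustration over binary features; the slides' implementation details are not shown in the transcript.

```python
# Learning a Chow-Liu tree structure for one class's sub-dataset:
# compute empirical mutual information between all feature pairs, then
# take a maximum spanning tree (Kruskal's algorithm with union-find).
import itertools
import numpy as np

def mutual_information(x, y):
    """Empirical mutual information between two discrete feature columns."""
    mi = 0.0
    for a in np.unique(x):
        for b in np.unique(y):
            pab = np.mean((x == a) & (y == b))
            pa, pb = np.mean(x == a), np.mean(y == b)
            if pab > 0:
                mi += pab * np.log(pab / (pa * pb))
    return mi

def chow_liu_edges(D):
    """Maximum spanning tree over pairwise MI; returns d - 1 tree edges."""
    d = D.shape[1]
    scored = sorted(
        ((mutual_information(D[:, i], D[:, j]), i, j)
         for i, j in itertools.combinations(range(d), 2)),
        reverse=True)
    parent = list(range(d))
    def find(u):                      # union-find with path halving
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u
    edges = []
    for _, i, j in scored:
        ri, rj = find(i), find(j)
        if ri != rj:                  # adding this edge keeps it a tree
            parent[ri] = rj
            edges.append((i, j))
    return edges

rng = np.random.default_rng(0)
D1 = rng.integers(0, 2, (100, 4))     # hypothetical sub-dataset for class 1
print(chow_liu_edges(D1))             # learned tree skeleton for P1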

Our Training Scheme

Mathematical Explanation: Bayesian Multinet Classifiers (BMC) and Discriminative Training of BMC.
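The equations on this slide did not survive transcription. As a hedged reconstruction: a BMC classifies by argmax_i P(Ci) Pi(x|Ci), and one natural discriminative criterion, assumed here for illustration and not necessarily the authors' exact objective, is the conditional log-likelihood of the labels, which couples P1 and P2 through the class posterior rather than fitting each to its own class alone.

```python
# Hedged sketch: a conditional log-likelihood objective for a two-class
# Bayesian multinet. Per-sample log-densities under each class model are
# hypothetical inputs; a real trainer would adjust the models' parameters
# to increase this quantity.
import numpy as np

def conditional_log_likelihood(log_p1, log_p2, y, log_prior1, log_prior2):
    """sum_n log P(y_n | x_n), with P(Ci | x) obtained via Bayes' rule.

    log_p1[n], log_p2[n]: log Pi(x_n | Ci) under each class model.
    y[n] in {1, 2}: true label of sample n.
    """
    joint = np.stack([log_prior1 + log_p1, log_prior2 + log_p2])   # (2, n)
    log_post = joint - np.logaddexp(joint[0], joint[1])            # normalize
    return np.sum(log_post[y - 1, np.arange(len(y))])

rng = np.random.default_rng(0)
log_p1 = rng.normal(-3.0, 1.0, 10)    # hypothetical log P1(x_n | C1)
log_p2 = rng.normal(-4.0, 1.0, 10)    # hypothetical log P2(x_n | C2)
y = rng.integers(1, 3, 10)            # hypothetical labels in {1, 2}
print(conditional_log_likelihood(log_p1, log_p2, y,
                                 np.log(0.5), np.log(0.5)))
```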

Mathematical Explanation (Continued)

Finding P1 and P2

Finding P1 and P2 (Continued)

Experimental Setup
- Datasets: two benchmark datasets from the UCI machine learning repository, Tic-tac-toe and Vote.
- Experimental environment: Windows 2000 platform; developing tool: Matlab 6.5.

Error Rate

Convergence Performance

Conclusion: A discriminative training procedure for generative Bayesian Multinet Classifiers is presented. This approach significantly improves the recognition rate on two benchmark datasets. A theoretical exploration of the convergence behavior of this approach is underway.