Multiple Classifier Combination for Character Recognition: Revisiting the Majority Voting System and its Variations. M.C. Fairhurst, University of Kent, UK; A. F. R. Rahman and H. Alam, BCL Technologies Inc., USA

Basic Problem Statement Given a number of experts working on the same problem, is a group decision superior to the individual decisions?

Ghosts from the Past… Jean-Charles de Borda (1781), N. C. de Condorcet (1785), Laplace (1795), Isaac Todhunter (1865), C. L. Dodgson (Lewis Carroll) (1873), M. W. Crofton (1885), E. J. Nanson (1907), Francis Galton (1907)

Is Democracy the Answer? In the ideal case it requires an infinite number of experts, and each expert should be competent.

How Does It Relate to Character Recognition? Each expert has its own strengths and weaknesses, its own peculiarities, a fresh approach to feature extraction, and a fresh approach to classification. But none is 100% correct!

Practical Resource Constraints Unfortunately, we have a limited number of experts, a limited number of training samples, and limits on feature size, classification time, and memory size.

Solution Clever algorithms to exploit the experts: their complementary information; their redundancy, as a check and balance; and the simultaneous use of arbitrary features and classification routines.

Question? The recent trend is towards complicated decision combination schemes, exhaustive classifier selection, and theoretical analysis in place of empirical methods. How sophisticated (read: complex) do the algorithms really need to be?

Majority Voting Philosophy Should the decision agreed upon by the majority of the experts be accepted without giving due credit to the competence of the individual experts? ---- OR ---- Should the decision delivered by the most competent expert be accepted without giving any importance to the majority consensus?

[1] Simple Majority Voting A decision is accepted if at least k of the n experts agree, where k = n/2 + 1 if n is even, and k = (n + 1)/2 if n is odd.
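As an illustration only (not code from the paper), a minimal Python sketch of this rule, assuming each expert outputs a single class label; names are hypothetical:

from collections import Counter

def simple_majority_vote(labels):
    # labels: one predicted class label per expert
    n = len(labels)
    k = n // 2 + 1  # equals n/2 + 1 for even n and (n + 1)/2 for odd n
    label, count = Counter(labels).most_common(1)[0]
    return label if count >= k else None  # None = reject (no majority)

print(simple_majority_vote(["A", "A", "B", "A", "C"]))  # 'A' (3 of 5 experts agree)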

[2] Weighted Majority Voting

[2] Weighted Majority Voting (Contd.) So if the decision of expert e_i to assign the unknown pattern to class c_j is denoted by d_ij (d_ij = 1 if expert i votes for class j, and 0 otherwise), with j = 1, …, m and m being the number of classes, then the final combined decision D_j supporting assignment to class c_j takes the form D_j = Σ_i w_i d_ij, where w_i is the weight reflecting the competence of expert i. The final decision is therefore the class c_j with the maximum D_j.
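A hedged Python sketch of this combination rule (the function and variable names are illustrative, not from the paper; the weights w_i are assumed to be supplied, e.g. estimated from each expert's accuracy on a validation set):

import numpy as np

def weighted_majority_vote(votes, weights, n_classes):
    # votes: class index chosen by each expert; weights: competence weight w_i per expert
    support = np.zeros(n_classes)      # D_j, the combined support for each class j
    for vote, w in zip(votes, weights):
        support[vote] += w             # D_j = sum_i w_i * d_ij
    return int(np.argmax(support))     # final decision: class with maximum D_j

print(weighted_majority_vote([1, 1, 0], [0.5, 0.3, 0.2], n_classes=2))  # 1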

[3] Class-wise Weighted Majority Voting

[4] Restricted Majority Voting (Top Choice)

[4] Restricted Majority Voting (Generalized)

[5] Class-wise Best Decision Selection

[6] Enhanced Majority Voting

[7] Ranked Majority Voting Uses not only the top choice but a ranked list of the other classes, and takes account of the negative votes cast by the experts against a particular decision. Each expert supplies not only its top-choice class but also a ranking of all the other candidate classes. The idea is to translate this ranking into "scores" that are comparable across all the decisions of all the experts.

[7] Ranked Majority Voting: Continued (Class Set Reordering) Highest Rank: take the highest rank assigned to a class by any expert. Borda Count: for each class, sum over the experts the number of classes ranked below it. Regression Method.
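A minimal Python sketch of the Borda count reordering (illustrative only; it assumes every expert returns a complete ranking of the classes, best first):

def borda_reorder(rankings, n_classes):
    # rankings: one list per expert, class indices ordered from most to least preferred
    scores = [0] * n_classes
    for ranking in rankings:
        for position, cls in enumerate(ranking):
            scores[cls] += n_classes - 1 - position  # classes ranked below cls by this expert
    return sorted(range(n_classes), key=lambda c: scores[c], reverse=True)

print(borda_reorder([[0, 1, 2], [1, 0, 2], [0, 2, 1]], n_classes=3))  # [0, 1, 2]: class 0 wins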

Selection of a Database NIST handwritten characters, collected off-line; 34 classes in total (0-9, A-Z, with no distinction between 0/O or I/1); over 34,000 character samples; size-normalized to 32×32.
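For illustration, a size-normalization step of this kind could look like the sketch below (the file name and the use of Pillow are assumptions; the paper does not state which tools were used):

from PIL import Image
import numpy as np

img = Image.open("char_sample.png").convert("L")  # hypothetical scanned character, grayscale
img = img.resize((32, 32))                        # normalize to 32x32, as in the database used here
pixels = (np.array(img) > 127).astype(np.uint8)   # optional simple binarization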

Performance of the Classifiers (columns: Expert, Accepted, Recognition, Error, Rejection). Experts evaluated: FWS, MPC, BWS, MLP.

Performance of the Combination (columns: Combination Method, Accepted, Recognition, Error, Rejection). Methods evaluated: Simple, Weighted, Class-wise Weighted, Restricted (Top Choice), Class-wise Best Decision, Restricted (Generalized), Enhanced (ENOCORE), Ranked (Borda), Committee, Regression.

Comparative Study (columns: Method, Accepted, Recognition, Error, Reject). Methods compared: BKSM, Sum Rule, GA, Best of MVS.

Conclusions Majority voting solutions can be very versatile and adaptive. Different variations may be adopted for different problem domains. The majority voting configuration is generic. Majority voting systems may be applied to other task domains with effectiveness equal to that of more complicated solutions.