Modeling Consensus: Classifier Combination for WSD
Authors: Radu Florian and David Yarowsky
Presenter: Marian Olteanu
Introduction
- Ensembles (classifier combination)
  - If the classifiers' errors are uncorrelated, combining N classifiers can reduce the error by a factor of N (to 1/N of its original value)
  - In practice, all classifiers tend to make errors on the hard examples
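One way to read the 1/N claim, as a sketch: assume (this is not stated on the slide) that each classifier's error \varepsilon_i is zero-mean with common variance \sigma^2, and that the ensemble averages the N classifiers. With uncorrelated errors the variance of the averaged error shrinks by a factor of N; with a common pairwise correlation \rho (a hypothetical parameter introduced here) the gain is capped, which matches the remark that classifiers tend to fail on the same hard examples.

```latex
% Uncorrelated errors: Cov(\varepsilon_i, \varepsilon_j) = 0 for i \neq j
\operatorname{Var}\!\left(\frac{1}{N}\sum_{i=1}^{N}\varepsilon_i\right)
  = \frac{1}{N^{2}}\sum_{i=1}^{N}\operatorname{Var}(\varepsilon_i)
  = \frac{\sigma^{2}}{N}

% Common pairwise correlation \rho between errors
\operatorname{Var}\!\left(\frac{1}{N}\sum_{i=1}^{N}\varepsilon_i\right)
  = \frac{\sigma^{2}}{N}\bigl(1 + (N-1)\,\rho\bigr)
  \;\xrightarrow{\;N\to\infty\;}\; \rho\,\sigma^{2}
```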
Approach & Features
- Automatic POS tagging and lemma extraction
- Features
  - Bag of words
  - Local
  - Syntactic
Classifier methods (6)
- Vector-based
  - Enhanced Naïve Bayes
  - Weighted Cosine
  - BayesRatio (good for sparse data)
Classifier methods (cont.)
- MMVC (Mixture Maximum Variance Correction)
  - 2 stages
  - Second stage: select the sense with variance over a threshold
Classifier methods (cont.)
- Discriminative models
  - TBL (Transformation-Based Learning)
  - Non-hierarchical decision lists
Combining classifiers Agreement
Combining classifiers (cont.)
- Three methods
  1. Combine the posterior sense probability distributions
Combining classifiers (cont.)
- Combination weights determined by:
  - Linear regression: minimize the mean square error (MSE)
  - Expectation-Maximization (EM)
  - Performance-based (PB): approximate the weight of classifier k with the performance of that classifier
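A minimal sketch of the first method with performance-based (PB) weights, assuming the combination is a weighted sum of the classifiers' posteriors with weights proportional to each classifier's accuracy on held-out data. The function names, the example posteriors, and the accuracy figures are all hypothetical and are not taken from the paper.

```python
def performance_based_weights(accuracies):
    """Turn per-classifier accuracies into weights that sum to 1 (PB assumption)."""
    total = sum(accuracies)
    return [a / total for a in accuracies]


def combine_posteriors(posteriors, weights):
    """Weighted sum of posteriors.

    posteriors[k][s] is classifier k's probability for sense s;
    weights[k] is the weight given to classifier k.
    """
    n_senses = len(posteriors[0])
    return [
        sum(w * p[s] for w, p in zip(weights, posteriors))
        for s in range(n_senses)
    ]


# Hypothetical posteriors from three classifiers over three senses.
posteriors = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.5, 0.4, 0.1],
]
# Hypothetical held-out accuracies for the same three classifiers.
weights = performance_based_weights([0.70, 0.60, 0.65])

combined = combine_posteriors(posteriors, weights)
best_sense = max(range(len(combined)), key=lambda s: combined[s])
```

Because the weights sum to 1 and each row is a distribution, the combined scores also form a distribution. Weights fitted by linear regression or EM could be passed to the same `combine_posteriors` function.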
Combining classifiers (cont.) 2. Combination based on Order Statistics
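A sketch of what an order-statistics combination might look like, assuming that for each sense one order statistic (median, max, or min) is taken over the classifiers' posteriors and the highest-scoring sense is chosen. The slide does not say which statistics are used; the function name and example values are hypothetical.

```python
from statistics import median


def order_statistic_combination(posteriors, statistic=median):
    """Score each sense by an order statistic over the classifiers' posteriors.

    posteriors[k][s] is classifier k's probability for sense s.
    statistic is any function mapping a list of floats to a float
    (for example statistics.median, max, or min).
    Returns (index of the best sense, list of per-sense scores).
    """
    n_senses = len(posteriors[0])
    scores = [statistic([p[s] for p in posteriors]) for s in range(n_senses)]
    best = max(range(n_senses), key=lambda s: scores[s])
    return best, scores


# Hypothetical posteriors from three classifiers over three senses.
posteriors = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.5, 0.4, 0.1],
]

best_median, median_scores = order_statistic_combination(posteriors, median)
best_max, max_scores = order_statistic_combination(posteriors, max)
best_min, min_scores = order_statistic_combination(posteriors, min)
```

On this toy input the median and max rules pick sense 0 while the min rule picks sense 1, so the choice of statistic can change the decision.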
Combining classifiers (cont.)
  3. Voting (each classifier chooses only one sense)
  - Simple voting: the sense with the maximum number of votes wins
  - TagPair
    - Each classifier votes
    - Each pair of classifiers votes for the sense most likely given their joint classification
- Combining – stacking
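The two voting schemes can be sketched as follows. This assumes the pairwise table for TagPair is estimated by counting gold senses on held-out data for each joint output of a classifier pair; that estimation detail, the function names, and the toy data are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter, defaultdict
from itertools import combinations


def plurality_vote(outputs):
    """Return the sense chosen by the most classifiers (ties: first seen)."""
    return Counter(outputs).most_common(1)[0][0]


def train_tagpair(held_out):
    """Count gold senses for every joint output of every classifier pair.

    held_out is a list of (outputs, gold_sense), where outputs holds
    one predicted sense per classifier.
    """
    table = defaultdict(Counter)
    for outputs, gold in held_out:
        for i, j in combinations(range(len(outputs)), 2):
            table[(i, j, outputs[i], outputs[j])][gold] += 1
    return table


def tagpair_vote(outputs, table):
    """Each classifier votes, then each pair votes for its most likely sense."""
    votes = Counter(outputs)  # individual votes
    for i, j in combinations(range(len(outputs)), 2):
        seen = table.get((i, j, outputs[i], outputs[j]))
        if seen:  # pairs with an unseen joint output cast no vote
            votes[seen.most_common(1)[0][0]] += 1
    return votes.most_common(1)[0][0]


# Hypothetical held-out data: three classifiers, two senses.
held_out = [
    (("money", "river", "river"), "money"),
    (("money", "river", "river"), "money"),
    (("money", "river", "money"), "money"),
    (("river", "river", "river"), "river"),
]
table = train_tagpair(held_out)

test_outputs = ("money", "river", "river")
simple_choice = plurality_vote(test_outputs)
tagpair_choice = tagpair_vote(test_outputs, table)
```

In this toy case simple voting follows the two classifiers that say "river", while the pairwise votes learned from the held-out data overturn that in favor of "money".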
Evaluation
Evaluation (unseen data)