Three Papers: AUC, PFA and Bioinformatics. The three papers are posted online.

Learning Algorithms for Better Ranking. Jin Huang, Charles X. Ling: Using AUC and Accuracy in Evaluating Learning Algorithms. IEEE Trans. Knowl. Data Eng. 17(3), 2005. Find the citations online (Google Scholar). Goal: accuracy vs. ranking. Secondary goal: decision trees vs. Bayesian networks in ranking; design algorithms that directly optimize ranking.

Accuracy: not good enough. Two classifiers rank the same ten test examples from left (lowest score) to right (highest), with the cutoff line (marked |) in the middle; higher ranking is more desirable. Classifier 1: – – – – + | – + + + +. Classifier 2: + – – – – | + + + + –. Accuracy of Classifier 1: 4/5. Accuracy of Classifier 2: 4/5. But intuitively Classifier 1 is better: its two mistakes sit right next to the cutoff, while Classifier 2 places a positive at the very bottom of the ranking and a negative at the very top.

Accuracy vs. ranking. Accuracy-based evaluation makes two assumptions: a balanced class distribution and equal misclassification costs (with a highly imbalanced distribution, a classifier that always predicts the majority class already scores a high accuracy while providing no useful ranking). Ranking-based evaluation sidesteps these assumptions. Problem: training examples are labeled, not ranked, so how do we evaluate a ranking?

ROC curve (Provost & Fawcett, AAAI’97)
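The ROC curve on this slide is an image in the original. As a hedged, minimal sketch of how such a curve can be traced from a ranked list of labels (the function name roc_points and the example input are my own, not from the slides or the paper), in Python:

# Sketch: trace ROC points (FPR, TPR) by sweeping a cutoff over a ranking.
# Labels are ordered from lowest score to highest, as in the slides' examples.
# Points run from (1, 1) down to (0, 0) as the cutoff rises.

def roc_points(ranked_labels):
    """ranked_labels[i] is True for a positive example; index 0 is the
    lowest-ranked (least positive-looking) example."""
    n_pos = sum(ranked_labels)
    n_neg = len(ranked_labels) - n_pos
    points = [(1.0, 1.0)]        # cutoff below everything: all predicted positive
    tp, fp = n_pos, n_neg        # positives/negatives currently above the cutoff
    for label in ranked_labels:  # raise the cutoff past one example at a time
        if label:
            tp -= 1
        else:
            fp -= 1
        points.append((fp / n_neg, tp / n_pos))
    return points

# Classifier 1 from the slides: - - - - + - + + + +
print(roc_points([False]*4 + [True, False] + [True]*4))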

How to calculate AUC. Rank the test examples in increasing order and let r_i be the rank of the i-th positive example (left: low r_i; right: high r_i = better). Let S_0 = ∑ r_i. Then AUC = (S_0 − n_0(n_0 + 1)/2) / (n_0 n_1), where n_0 is the number of positive examples and n_1 the number of negative examples (Hand & Till, 2001, MLJ).

An example. Classifier 1: – – – – + – + + + + (ranks 1 to 10, left to right). The positive examples sit at ranks r_i = 5, 7, 8, 9, 10, so S_0 = 5 + 7 + 8 + 9 + 10 = 39 and AUC = (39 – 5×6/2) / 25 = 24/25. Better result.
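As a minimal sketch of the rank-sum formula above (the function name and the list encoding of the rankings are my own), assuming ranks start at 1 with the lowest-ranked example on the left:

# Sketch: AUC from the rank-sum formula AUC = (S0 - n0*(n0+1)/2) / (n0*n1),
# where S0 is the sum of the ranks of the positive examples, n0 = #positives
# and n1 = #negatives.

def auc_from_ranking(ranked_labels):
    ranks_of_positives = [r for r, is_pos in enumerate(ranked_labels, start=1) if is_pos]
    n0 = len(ranks_of_positives)
    n1 = len(ranked_labels) - n0
    s0 = sum(ranks_of_positives)
    return (s0 - n0 * (n0 + 1) / 2) / (n0 * n1)

classifier1 = [False]*4 + [True, False] + [True]*4     # - - - - + - + + + +
classifier2 = [True] + [False]*4 + [True]*4 + [False]  # + - - - - + + + + -
print(auc_from_ranking(classifier1))  # 0.96 = 24/25
print(auc_from_ranking(classifier2))  # 0.64 = 16/25

Running it reproduces the 24/25 above and the 16/25 for Classifier 2 used later in the talk.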

ROC curve and AUC. If one ROC curve dominates another (lies above it everywhere), the dominating classifier is better; often two curves do not dominate each other. AUC (the area under the ROC curve) summarizes overall performance in a single number and can be used for evaluating ranking.

ROC curve and AUC. Traditional learning algorithms produce poor probability estimates as a by-product, decision tree algorithms in particular, though there are strategies to improve them. How about Bayesian network learning algorithms?
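One common strategy for decision trees is the Laplace correction applied at the leaves (used by C4.4-style trees; assuming this is among the strategies the slide refers to). A tiny hedged sketch, with an illustrative function name:

# Sketch: Laplace-corrected class probability at a decision-tree leaf.
# With k examples of a class out of n at a leaf and C classes, the raw
# estimate k/n is replaced by (k + 1) / (n + C), which avoids extreme 0/1
# probabilities and tends to give better rankings.

def laplace_leaf_probability(k, n, num_classes=2):
    return (k + 1) / (n + num_classes)

print(laplace_leaf_probability(k=3, n=3))  # 0.8 instead of a hard 1.0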

Evaluation of Classifiers. Classification accuracy or error rate. ROC curve and AUC.

AUC. Two classifiers: Classifier 1: – – – – + – + + + +; Classifier 2: + – – – – + + + + –. The AUC of Classifier 1: 24/25. The AUC of Classifier 2: 16/25 (its positives sit at ranks 1, 6, 7, 8, 9, so S_0 = 31 and AUC = (31 – 15)/25). Classifier 1 is better than Classifier 2!

AUC is more discriminating. For N examples there are only N + 1 different accuracy values but N(N + 1)/2 different AUC values, so AUC is a better and more discriminating evaluation measure than accuracy.

Naïve Bayes vs. C4.4. Overall, Naïve Bayes outperforms C4.4 in AUC (Ling & Zhang, submitted, 2002).

PCA in Face Recognition

Problem with PCA. The features are principal components, so they do not correspond directly to the original features. Problem for face recognition: we wish to pick a subset of the original features rather than composed ones. Principal Feature Analysis (PFA): pick the best uncorrelated subset of features of a data set, equivalent to choosing q of the dimensions of a random variable X = [x_1, x_2, …, x_n]^T.

How to find the q features? Consider the matrix whose rows are q_1, q_2, …, q_n: the i-th row q_i represents the i-th original feature in the principal subspace.

The subspace

Algorithm
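The Algorithm slide itself is an image. Below is a hedged sketch of a commonly described PFA recipe (project the features onto the leading q-dimensional principal subspace, cluster the feature rows, and keep the feature nearest each cluster center), which may differ in detail from the algorithm shown on the slide; the function name and the toy data are illustrative only.

# Sketch of a typical Principal Feature Analysis recipe (not necessarily the
# exact algorithm on the slide): the i-th row of the loading matrix represents
# the i-th original feature in the q-dimensional principal subspace; cluster
# these rows and keep the feature closest to each cluster center.
import numpy as np
from sklearn.cluster import KMeans

def principal_feature_analysis(X, q):
    """X: (samples x features) data matrix. Returns indices of selected features."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    A_q = eigvecs[:, -q:]                    # rows = original features, cols = top-q PCs
    km = KMeans(n_clusters=q, n_init=10, random_state=0).fit(A_q)
    selected = []
    for center in km.cluster_centers_:
        dists = np.linalg.norm(A_q - center, axis=1)
        selected.append(int(np.argmin(dists)))   # feature nearest this cluster center
    return sorted(set(selected))

# Illustrative example: pick 3 of 6 features from random data with one redundancy.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
X[:, 3] = X[:, 0] + 0.1 * rng.normal(size=200)   # make feature 3 redundant with 0
print(principal_feature_analysis(X, q=3))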

Result

When PCA does not work

PCA + Clustering = Bad Idea

More…

Rand Index for Clusters (Partitions)
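The Rand index slide content is an image; as a hedged sketch of the standard Rand index for comparing two partitions of the same items (the function name and the example labelings are my own):

# Sketch: Rand index between two partitions (clusterings) of the same items.
# RI = (a + b) / C(n, 2), where a = #pairs placed in the same cluster by both
# partitions and b = #pairs placed in different clusters by both.
from itertools import combinations

def rand_index(labels_a, labels_b):
    assert len(labels_a) == len(labels_b)
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = 0
    for i, j in pairs:
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a == same_b:   # both say "together" or both say "apart"
            agree += 1
    return agree / len(pairs)

# Example: two clusterings of 6 items.
print(rand_index([0, 0, 1, 1, 2, 2], [0, 0, 0, 1, 2, 2]))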

Results