Calibration from Probabilistic Classification


Calibration from Probabilistic Classification
Dr. Oscar Olmedo

Outline
Why calibrate ML probabilities
How to calibrate probabilities
Platt's method
Isotonic regression
Histogram binning

What is Calibration About
Many ML algorithms produce predicted probabilities that do not match empirical probabilities.
Learning well-calibrated models has not been researched as extensively as models that discriminate well (Naeini, 2016).

Why Calibrate
Calibration is useful when the probabilities of predictions are critical:
Reduced bias for model comparison
Problems with asymmetric misclassification costs
Examples: finance, marketing
Calibration may not always be necessary:
If only the rank ordering of predictions matters
If only an optimal split into classes is needed
(Naeini, 2016)

ML Algorithms and Calibration
Known to produce well-calibrated probabilities:
Discriminant analysis
Logistic regression
Not so well-calibrated probabilities:
Naïve Bayes
SVMs
Tree methods
Boosting
Neural networks

How to Calibrate
Calibration is a post-processing step.
It should not affect the rank of the predictions, only the numerical probabilities.
In a nutshell (a sketch follows this slide):
Split the data into a training set and a held-out set
Train the ML model on the training set
Fit a calibration map on the held-out set (three methods discussed below)
The final model, composed of the ML model and the calibration map, produces the probabilities
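A minimal sketch of this pipeline in R, with synthetic data and logistic regression as a placeholder scorer (all names here are illustrative, not from the slides); the three method sketches below reuse scores and calib$y:

    # Synthetic binary-outcome data; any data frame with a 0/1 label would do.
    set.seed(1)
    dat <- data.frame(x1 = rnorm(1000), x2 = rnorm(1000))
    dat$y <- rbinom(1000, 1, plogis(1.5 * dat$x1 - dat$x2))

    # 1. Split into a training set and a held-out calibration set.
    idx   <- sample(nrow(dat), 700)
    train <- dat[idx, ]
    calib <- dat[-idx, ]

    # 2. Train the ML model on the training set.
    model  <- glm(y ~ x1 + x2, data = train, family = binomial)
    scores <- predict(model, calib, type = "response")

    # 3. Fit a calibration map on (scores, calib$y) -- see the next slides.
    # 4. Final model: the model's score followed by the calibration map.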

Platt's Method
This method fits a sigmoid to the predicted values.
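Continuing the sketch from the "How to Calibrate" slide, Platt-style scaling can be approximated in R as a logistic regression of the held-out labels on the raw scores (Platt's original formulation also slightly smooths the 0/1 targets, omitted here):

    # Fit the sigmoid: P(y = 1 | score) = 1 / (1 + exp(-(a * score + b))).
    cal_df <- data.frame(y = calib$y, s = scores)
    platt  <- glm(y ~ s, data = cal_df, family = binomial)

    # Calibrated probability for any vector of new raw scores s0.
    platt_map <- function(s0) predict(platt, data.frame(s = s0), type = "response")
    platt_cal <- platt_map(scores)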

Isotonic Regression
Fits a piecewise-constant, monotonically non-decreasing function to the predicted values.
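A sketch using base R's isoreg, which implements the pool-adjacent-violators fit (again continuing the variables from the earlier sketch):

    # isoreg expects sorted x; the fit is a monotone, piecewise-constant map.
    ord <- order(scores)
    iso <- isoreg(scores[ord], calib$y[ord])

    # as.stepfun turns the fit into a step function usable on new scores.
    iso_map <- as.stepfun(iso)
    iso_cal <- iso_map(scores)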

Histogram Binning
Sort the predictions into bins; each prediction is replaced by the empirical fraction of positives in its bin (Naeini, 2016).
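A sketch of the simplest equal-width variant, continuing the earlier variables (Naeini, 2016 also studies binning schemes learned from the data):

    # Ten equal-width bins over [0, 1]; each bin's value is its positive rate.
    breaks <- seq(0, 1, by = 0.1)
    bins   <- cut(scores, breaks, include.lowest = TRUE)
    rate   <- tapply(calib$y, bins, mean)

    # A new score is mapped to the empirical rate of the bin it falls into.
    hist_map <- function(s0) unname(rate[cut(s0, breaks, include.lowest = TRUE)])
    hist_cal <- hist_map(scores)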

Effects of Boosting (figure; Niculescu-Mizil & Caruana, 2005)

Comparison of Methods (figure; Niculescu-Mizil & Caruana, 2005)

Platt's Method (figure; Niculescu-Mizil & Caruana, 2005)

Isotonic Regression (figure; Niculescu-Mizil & Caruana, 2005)

Visualizing Probabilities
LetterRecognition dataset, found in R's mlbench library
Predict the letter "Z"
16 attributes derived from pixel statistics
Reliability plot
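The deck does not show its code, but a reliability plot along these lines can be reproduced in R; naïve Bayes (from the e1071 package) is used here as an illustrative, typically poorly calibrated scorer, since the slides do not say which model was plotted:

    library(mlbench)   # LetterRecognition dataset
    library(e1071)     # naiveBayes

    data(LetterRecognition)
    set.seed(1)
    idx <- sample(nrow(LetterRecognition), floor(0.7 * nrow(LetterRecognition)))
    tr  <- LetterRecognition[idx, ]
    te  <- LetterRecognition[-idx, ]

    # Predicted probability that a test letter is "Z".
    nb <- naiveBayes(lettr ~ ., data = tr)
    p  <- predict(nb, te, type = "raw")[, "Z"]
    y  <- as.integer(te$lettr == "Z")

    # Reliability plot: mean predicted probability vs. observed frequency per bin.
    b <- cut(p, seq(0, 1, by = 0.1), include.lowest = TRUE)
    plot(tapply(p, b, mean), tapply(y, b, mean),
         xlab = "Mean predicted probability", ylab = "Observed frequency",
         xlim = c(0, 1), ylim = c(0, 1))
    abline(0, 1, lty = 2)   # the diagonal is perfect calibration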

Applying Isotonic Regression
Reliability plot after calibration (figure)
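Continuing the LetterRecognition sketch, isotonic regression can be applied to the naïve Bayes scores and the reliability curve redrawn (for brevity this fits and evaluates on the same held-out set; proper practice would use a third split):

    # Calibrate the scores with isotonic regression.
    ord   <- order(p)
    iso   <- isoreg(p[ord], y[ord])
    p_cal <- as.stepfun(iso)(p)

    # Overlay the calibrated reliability curve (filled points).
    b2 <- cut(p_cal, seq(0, 1, by = 0.1), include.lowest = TRUE)
    points(tapply(p_cal, b2, mean), tapply(y, b2, mean), pch = 19)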

Future Work
Research into multi-class calibration methods
Research into non-equal-width (dynamic) histogram binning methods
Research into ML methods that directly produce well-calibrated predictions

References
Mahdi Pakdaman Naeini. Obtaining Accurate Probabilities Using Classifier Calibration. PhD dissertation, University of Pittsburgh, 2017.
Alexandru Niculescu-Mizil and Rich Caruana. "Predicting good probabilities with supervised learning." Proceedings of the 22nd International Conference on Machine Learning (ICML). ACM, 2005.
Alexandru Niculescu-Mizil and Rich Caruana. "Obtaining calibrated probabilities from boosting." UAI, 2005.

Part Two: Careers in Data Science

Marketing Yourself
Networking: meetups. There are a number ongoing in the DC area: Data Science DC, Spark, ...
Make business cards to hand out to people you meet
Set up a LinkedIn account for an online presence; this is where recruiters will look
Post your resume to online sites such as indeed.com and monster.com
Follow up with recruiters

Tools and Expectations
Knowledge of statistics and machine learning
Tools: SQL, Python, R, Java, Scala, and Spark (an open source library written in Scala for distributed computing)
Online courses are a good resource
While a student, take electives to build your bag of tools