1
Calibration from Probabilistic Classification
Dr. Oscar Olmedo
2
Outline
Why calibrate ML probabilities
How to calibrate probabilities
  Platt’s method
  Isotonic regression
  Histogram binning
3
What is Calibration About
Many ML algorithms produce predicted probabilities that do not match empirical probabilities.
Learning well-calibrated models has not been researched as extensively as learning models that discriminate well (Naeini, 2016).
4
Why Calibrate
Calibration is useful when the probabilities of predictions are critical:
  Reduced bias for model comparison
  People with asymmetric misclassification costs
  Examples: finance, marketing
Calibration may not always be necessary:
  If only interested in the rank ordering of predictions
  If only interested in an optimal split to get classes
(Naeini, 2016)
5
ML algorithms and Calibration
Known to produce well-calibrated probabilities:
  Discriminant analysis
  Logistic regression
Not so well-calibrated probabilities:
  Naïve Bayes
  SVM
  Tree methods
  Boosting
  Neural networks
6
How to calibrate
Calibration is a post-processing task.
It should not affect the rank of predictions, only the numerical probability.
In a nutshell:
  Split data into train and test
  Train ML model
  Calibrate on test set (3 methods discussed later)
  Final model to get probabilities is composed of the ML model and the calibration model
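The composition step and the rank-preservation property can be illustrated with a minimal sketch. Both `raw_model` and `calibrator` below are hypothetical stand-ins invented for this example (they are not from the slides); any strictly increasing calibration map would behave the same way.

```python
def compose(model, calibrator):
    """Final model: raw score from the ML model, passed through the calibration map."""
    return lambda x: calibrator(model(x))


def same_ranking(a, b):
    """True if two score lists order the examples identically."""
    def order(v):
        return sorted(range(len(v)), key=lambda i: v[i])
    return order(a) == order(b)


# Hypothetical stand-ins for a trained model and a fitted calibration map.
def raw_model(x):
    return min(max(0.5 + 0.4 * x, 0.0), 1.0)


def calibrator(p):
    return p ** 2  # strictly increasing on [0, 1], so ranks are preserved


final_model = compose(raw_model, calibrator)

xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
raw = [raw_model(x) for x in xs]
calibrated = [final_model(x) for x in xs]

# The numerical probabilities change, the ordering does not.
print(same_ranking(raw, calibrated))
```

In practice the calibrator is fitted on data that the ML model did not see during training, so that the calibration map is not fitted to overconfident training-set scores.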
7
Platt’s method
This method fits a sigmoid to the predicted values.
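A minimal sketch of fitting such a sigmoid, assuming the parameterization p = 1 / (1 + exp(-(a*s + b))) and plain gradient descent on the log loss. This is a simplification: Platt's original procedure also smooths the target labels and uses a more careful optimizer. The scores and labels are made-up toy values.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_platt(scores, labels, lr=0.1, n_iter=5000):
    """Fit p = sigmoid(a * score + b) by gradient descent on the mean log loss."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(n_iter):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y  # derivative of log loss w.r.t. the logit
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def platt_predict(score, a, b):
    """Map a raw model score to a calibrated probability."""
    return sigmoid(a * score + b)


# Toy held-out scores (e.g. SVM margins) and their true labels.
scores = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
labels = [0, 0, 0, 1, 0, 1, 0, 1, 1]

a, b = fit_platt(scores, labels)
calibrated = [platt_predict(s, a, b) for s in scores]
print(a, b)
```

Because the fitted slope `a` is positive here, the map is increasing and the rank ordering of the scores is unchanged.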
8
Isotonic Regression
Piecewise linear function, assumed to be monotonically increasing
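One common way to fit a monotonically increasing calibration map is the pool-adjacent-violators (PAV) algorithm. The sketch below assumes PAV for fitting and linear interpolation between fitted points for prediction; the function names and the toy data are illustrative and not taken from the slides.

```python
import bisect


def pav(y):
    """Pool adjacent violators: least-squares non-decreasing fit to the sequence y."""
    blocks = []  # each block is [sum of values, count]
    for v in y:
        blocks.append([v, 1])
        # Merge backwards while the previous block's mean exceeds the last block's mean.
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)
    return fitted


def fit_isotonic(scores, labels):
    """Sort by score, then fit a non-decreasing sequence to the labels."""
    pairs = sorted(zip(scores, labels))
    xs = [s for s, _ in pairs]
    ys = pav([y for _, y in pairs])
    return xs, ys


def isotonic_predict(x, xs, ys):
    """Linear interpolation between fitted points, clamped at both ends."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    j = bisect.bisect_right(xs, x)  # xs[j-1] <= x < xs[j]
    x0, x1, y0, y1 = xs[j - 1], xs[j], ys[j - 1], ys[j]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Toy held-out predictions and true labels.
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
labels = [0, 0, 1, 0, 0, 1, 1, 1]

xs, ys = fit_isotonic(scores, labels)
print(ys)
print(isotonic_predict(0.55, xs, ys))
```

The labels at scores 0.3, 0.4 and 0.5 violate monotonicity, so PAV pools them into a single block whose value is their mean.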
9
Histogram binning (Naeini, 2016)
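A minimal sketch of histogram binning, assuming equal-width bins on [0, 1] where each bin's calibrated probability is the fraction of positive examples that fall in it. The number of bins, the fallback for empty bins, and the toy data are choices made for this illustration only.

```python
def fit_histogram_binning(probs, labels, n_bins=5):
    """Return one calibrated probability per equal-width bin on [0, 1]."""
    pos = [0] * n_bins
    cnt = [0] * n_bins
    for p, y in zip(probs, labels):
        b = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes into the last bin
        pos[b] += y
        cnt[b] += 1
    # Empty bins fall back to the bin midpoint (an arbitrary choice for this sketch).
    return [pos[b] / cnt[b] if cnt[b] else (b + 0.5) / n_bins for b in range(n_bins)]


def binning_predict(p, bin_values):
    """Look up the calibrated probability of the bin that p falls into."""
    n_bins = len(bin_values)
    return bin_values[min(int(p * n_bins), n_bins - 1)]


# Toy held-out predictions and true labels.
probs = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
labels = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

bin_values = fit_histogram_binning(probs, labels, n_bins=5)
print(bin_values)
print(binning_predict(0.3, bin_values))
```

Unlike a sigmoid or an isotonic fit, nothing in this procedure forces the bin values to increase, so binning can change the rank ordering of predictions when a higher bin happens to have a lower positive rate.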
10
Effects of boosting (Niculescu-Mizil & Caruana, 2005)
11
Comparison of methods (Niculescu-Mizil & Caruana, 2005)
12
Platt’s method (Niculescu-Mizil & Caruana, 2005)
13
Isotonic Regression (Niculescu-Mizil & Caruana, 2005)
14
Visualizing Probabilities
LetterRecognition dataset
  Found in the R mlbench library
  Predict the letter “Z”
  16 attributes based on pixels
Reliability plot
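The points of a reliability plot can be computed by binning the predicted probabilities and comparing, per bin, the mean predicted probability with the observed fraction of positives. The sketch below uses made-up toy values, not the LetterRecognition data, and leaves out the plotting itself.

```python
def reliability_curve(probs, labels, n_bins=10):
    """Return (mean predicted probability, observed positive fraction) per non-empty bin."""
    sums = [0.0] * n_bins
    pos = [0] * n_bins
    cnt = [0] * n_bins
    for p, y in zip(probs, labels):
        b = min(int(p * n_bins), n_bins - 1)
        sums[b] += p
        pos[b] += y
        cnt[b] += 1
    return [(sums[b] / cnt[b], pos[b] / cnt[b]) for b in range(n_bins) if cnt[b]]


# Toy example: the model says 0.2 and 0.8, but the outcomes are all 0 and all 1.
probs = [0.2, 0.2, 0.2, 0.2, 0.8, 0.8, 0.8, 0.8]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

curve = reliability_curve(probs, labels, n_bins=2)
for mean_pred, frac_pos in curve:
    print(round(mean_pred, 3), frac_pos)
```

Plotting the observed fraction against the mean predicted probability gives the reliability plot; a well-calibrated model lies close to the diagonal.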
15
Applying Isotonic Regression
After calibration
16
Future Work
Research into multi-class calibration methods
Research into non-equal-size (or dynamic) histogram binning methods
Research into ML methods that produce well-calibrated predictions
17
References
Mahdi Pakdaman Naeini. Obtaining Accurate Probabilities Using Classifier Calibration. Diss. University of Pittsburgh, 2017.
Alexandru Niculescu-Mizil and Rich Caruana. "Predicting good probabilities with supervised learning." Proceedings of the 22nd International Conference on Machine Learning. ACM, 2005.
Alexandru Niculescu-Mizil and Rich Caruana. "Obtaining Calibrated Probabilities from Boosting." UAI
18
Part Two: Careers in Data Science
22
Marketing yourself
Networking
  Meetups: there are a number ongoing in the DC area (Data Science DC, Spark, …)
  Make business cards to hand out to people you meet
Set up a LinkedIn account for an online presence
  This is where recruiters will look
Post your resume to online sites such as indeed.com and monster.com
Follow up with recruiters
23
Tools and Expectations
Knowledge of:
  Statistics
  Machine learning
Tools:
  *SQL
  Python
  R
  Java
  Scala
  Spark, an open-source library written in Scala for distributed computing
Online courses are a good resource
While a student, take electives to build your bag of tools