Ian Bailey (Indiana University)

Measuring the Performance of Likelihood-Ratio-Based Systems

Faculty Advisor: Dr. Cedric Neumann

Introduction

Biometric and forensic systems are designed to infer the source of a “trace” with respect to a particular individual. These systems address two competing propositions:

Hp: the “trace” originates from a specific individual
Hd: the “trace” originates from a different individual

Biometric systems are designed to categorically support one of the two propositions. Their performance is measured by their discriminative ability, so minimizing error rates is essential. Forensic systems are designed to weigh the evidence in support of one proposition or the other. The magnitude of that support is critical: it must neither overstate nor understate the value of the evidence.

Both systems rely on the ratio of two probabilities: the probability of the “trace” given Hp versus the probability of the “trace” given Hd. This ratio is called the likelihood ratio (LR).

This work uses a desirable property of LRs that originates from the work of Alan Turing: the observation LR = c is c times more probable under Hp than under Hd [2]. Equivalently, in an ideally calibrated system, taking the LR of the LR returns the same LR value.

This work investigates the use of the empirical cross-entropy (ECE) of the posterior probabilities produced by biometric and forensic systems to measure their calibration and discriminative power [1,3].

Procedure

The major aspect of this work is to investigate the behavior of ECE plots by varying the distributions of LRs that are analyzed.
This makes it possible to examine whether the ECE plot does indeed show the discriminating power and the calibration of a set of LR values independently. The following cases are of particular interest:

Case 1: a system with perfect discrimination but poor calibration. Two distributions of LRs were chosen such that they were not symmetric about LR = 1 and did not overlap. The results can be seen in Fig. 2.

Case 2: a system with weak discrimination but perfect calibration. Two overlapping normal distributions were chosen to represent Hp and Hd. Both distributions were sampled 1,000,000 times, and the LR of each sample under the two competing distributions was calculated. We used the property that the LR of the LR is the LR to demonstrate that the system is perfectly calibrated before generating the ECE plot. In general, the slope of the Receiver Operating Characteristic (ROC, Fig. 3) is the LR [2]. In this case, the slope of the ROC obtained from the LRs calculated for the two sets of samples is the LR of the LRs. Results are shown in Fig. 4 and Fig. 5.

Case 3: a system with both perfect discrimination and perfect calibration. The same strategy as in Case 2 was used, except that the two chosen distributions did not overlap. Results are not shown in this poster.

Fig. 2 ECE plot showing perfect discrimination
Fig. 3 ROC curve of LRs
Fig. 4 LR v. LR of LR
Fig. 5 ECE plot showing perfect calibration

Conclusion

As can be seen, an ECE plot is a good tool for assessing the calibration and discriminating power of an LR-based system in biometry and forensic science. However, we note that ECE is based on measuring the difference between posterior probabilities and expected decisions. Therefore, entropy cannot be used to recalibrate a set of LRs from an uncalibrated operational forensic system, since obtaining posterior probabilities from such a system would require assigning prior probabilities to Hp and Hd.
Legal and scientific scholars have repeatedly argued that such an assignment is outside the realm of forensic scientists.

References

[1] Brümmer, N., & du Preez, J. (2013, April 8). The PAV Algorithm Optimizes Binary Proper Scoring Rules.
[2] Pepe, S. P. (2004). The Statistical Evaluation of Medical Tests for Classification and Prediction. New York City, New York: Oxford University Press.
[3] Ramos, D., & Gonzalez-Rodriguez, J. (2013, May 10). Reliable support: Measuring calibration of likelihood ratios. Forensic Science International, 130, 156-169.

Fig. 1 ECE plot. The smaller the difference between the red and blue lines, the better the calibration; the greater the distance between the black and blue lines, the better the discriminating power.

Acknowledgements

This work was made possible by the National Science Foundation REU Security Printing and Anti-Counterfeiting Site EEC-1559958. Thanks to Dr. Cedric Neumann, Dr. Alfred Boysen and Dr. Brian Logue for their support and Jessie Hendricks (SDSU) for providing the code for ROC calculations.
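
The Case 2 procedure and the ECE measure can be illustrated with a minimal sketch. It is not the code used for the poster. The means, standard deviation, sample size, the 0.5 prior, the factor of 5 used to miscalibrate the LRs, and all function names are assumptions chosen for illustration. The ECE follows the usual definition: the prior-weighted average of log2(1 + 1/(LR x prior odds)) over the Hp samples and of log2(1 + LR x prior odds) over the Hd samples.

```python
import math
import random


def log_lr(x, mu_p, mu_d, sigma):
    """Natural-log LR of N(mu_p, sigma) versus N(mu_d, sigma) at observation x."""
    return ((x - mu_d) ** 2 - (x - mu_p) ** 2) / (2.0 * sigma ** 2)


def log2_1p_exp(z):
    """Numerically stable log2(1 + exp(z))."""
    return (max(z, 0.0) + math.log1p(math.exp(-abs(z)))) / math.log(2.0)


def ece(log_lrs_p, log_lrs_d, prior_p):
    """Empirical cross-entropy in bits, given natural-log LRs and a prior for Hp."""
    log_odds = math.log(prior_p / (1.0 - prior_p))
    # Hp samples are penalised when the posterior odds are small.
    cost_p = sum(log2_1p_exp(-(l + log_odds)) for l in log_lrs_p) / len(log_lrs_p)
    # Hd samples are penalised when the posterior odds are large.
    cost_d = sum(log2_1p_exp(l + log_odds) for l in log_lrs_d) / len(log_lrs_d)
    return prior_p * cost_p + (1.0 - prior_p) * cost_d


def log_lr_of_lr(log_lrs_p, log_lrs_d, lo, hi):
    """Empirical log LR of the event lo <= log LR < hi (bin must be non-empty)."""
    frac_p = sum(lo <= l < hi for l in log_lrs_p) / len(log_lrs_p)
    frac_d = sum(lo <= l < hi for l in log_lrs_d) / len(log_lrs_d)
    return math.log(frac_p / frac_d)


# Illustrative settings (assumptions): the poster drew 1,000,000 samples per
# distribution; fewer are used here so that pure Python stays fast.
rng = random.Random(0)
N = 20_000
MU_P, MU_D, SIGMA = 1.0, -1.0, 1.0

llr_p = [log_lr(rng.gauss(MU_P, SIGMA), MU_P, MU_D, SIGMA) for _ in range(N)]
llr_d = [log_lr(rng.gauss(MU_D, SIGMA), MU_P, MU_D, SIGMA) for _ in range(N)]

ece_calibrated = ece(llr_p, llr_d, 0.5)
ece_neutral = ece([0.0], [0.0], 0.5)  # reference system that always reports LR = 1
ece_overstated = ece([5 * l for l in llr_p], [5 * l for l in llr_d], 0.5)

# "LR of the LR is the LR": samples with log LR near 1 should themselves
# occur about e^1 times more often under Hp than under Hd.
lr_of_lr_check = log_lr_of_lr(llr_p, llr_d, 0.5, 1.5)

print("ECE, calibrated LRs:", ece_calibrated)
print("ECE, neutral reference:", ece_neutral)
print("ECE, overstated LRs:", ece_overstated)
print("log LR of LRs in [0.5, 1.5):", lr_of_lr_check)
```

Evaluating `ece` over a range of prior probabilities, rather than at the single value 0.5, gives the kind of curve shown in an ECE plot. Multiplying the log LRs by a constant leaves their ordering, and therefore the discrimination, unchanged, so the higher ECE of the overstated set reflects lost calibration only.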