Measuring the Performance of Likelihood-Ratio-Based Systems
Ian Bailey (Indiana University)
Faculty Advisor: Dr. Cedric Neumann

Introduction
Biometric and forensic systems are designed to infer the source of a “trace” with respect to a particular individual. These systems address two competing propositions:
Hp: the “trace” originates from a specific individual
Hd: the “trace” originates from a different individual
Biometric systems are designed to categorically support one of the two propositions. Their performance is measured by their discriminative ability, so minimizing error rates is essential.
Forensic systems are designed to weigh the evidence in support of one proposition or the other. The magnitude of that support is critical: it must neither overstate nor understate the value of the evidence.
Both kinds of system rely on the ratio of two probabilities: the probability of the “trace” given Hp and the probability of the “trace” given Hd. This ratio is called the likelihood ratio (LR).
This work makes use of a desirable property of likelihood ratios that originates from the work of Alan Turing: an LR value of c is c times more likely to occur under Hp than under Hd [2]. Equivalently, in an ideally calibrated system, taking the LR of the LR returns the same LR value.
This work investigates the use of the empirical cross-entropy (ECE) of the posterior probabilities resulting from biometric and forensic systems to measure their calibration and discriminative power [1,3].

Procedure
The main aim of this work is to investigate how ECE plots behave as the distributions of the analyzed LRs are varied.
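A minimal Python sketch of this kind of experiment is given here, under stated assumptions. The poster does not give the ECE formula, so the sketch uses the commonly cited definition (a prior-weighted average of logarithmic penalties on the posterior probabilities). The two normal score distributions (means +0.5 and -0.5, standard deviation 1), the sample size, and all function names are illustrative choices, not the author's code. The sketch computes the ECE at one prior, compares it with a neutral system that always reports LR = 1, and checks the "LR of the LR is the LR" property in a single bin.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def likelihood_ratio(x, mean_p=0.5, mean_d=-0.5, sd=1.0):
    """LR = p(x | Hp) / p(x | Hd) for two normal models (assumed parameters)."""
    return normal_pdf(x, mean_p, sd) / normal_pdf(x, mean_d, sd)


def ece(lrs_hp, lrs_hd, prior_hp):
    """Empirical cross-entropy, in bits, of the posteriors implied by the LRs."""
    odds = prior_hp / (1.0 - prior_hp)
    # Penalty for LRs from Hp-true cases: large when the LR is small.
    penalty_hp = sum(math.log2(1.0 + 1.0 / (lr * odds)) for lr in lrs_hp) / len(lrs_hp)
    # Penalty for LRs from Hd-true cases: large when the LR is large.
    penalty_hd = sum(math.log2(1.0 + lr * odds) for lr in lrs_hd) / len(lrs_hd)
    return prior_hp * penalty_hp + (1.0 - prior_hp) * penalty_hd


def prior_entropy(prior_hp):
    """ECE of a neutral system that always reports LR = 1."""
    return -prior_hp * math.log2(prior_hp) - (1.0 - prior_hp) * math.log2(1.0 - prior_hp)


def lr_of_lr(lrs_hp, lrs_hd, lo, hi):
    """Ratio of the fractions of Hp and Hd LRs that fall in [lo, hi)."""
    frac_hp = sum(lo <= lr < hi for lr in lrs_hp) / len(lrs_hp)
    frac_hd = sum(lo <= lr < hi for lr in lrs_hd) / len(lrs_hd)
    return frac_hp / frac_hd


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 100_000             # assumed sample size, smaller than the poster's
lrs_hp = [likelihood_ratio(rng.gauss(0.5, 1.0)) for _ in range(n)]
lrs_hd = [likelihood_ratio(rng.gauss(-0.5, 1.0)) for _ in range(n)]

ece_half = ece(lrs_hp, lrs_hd, 0.5)
ratio_near = lr_of_lr(lrs_hp, lrs_hd, math.exp(0.4), math.exp(0.6))
print(f"ECE at prior 0.5: {ece_half:.3f} bits (neutral: {prior_entropy(0.5):.3f})")
print(f"LR of LR near LR = {math.exp(0.5):.2f}: {ratio_near:.2f}")
```

Repeating the `ece` call over a grid of priors would trace out one curve of an ECE plot. Because the LRs here are computed from the true generating densities, the binned ratio should land close to the LR at the bin center, which is what the calibration property predicts.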
Varying the LR distributions makes it possible to examine whether the ECE plot does indeed show the discriminating power and the calibration of a set of LR values independently of each other. The following cases are of particular interest:
Case 1: a system with perfect discrimination but poor calibration. Two distributions of LRs were chosen such that they were not symmetric about LR = 1 and did not overlap. The results can be seen in Fig. 2.
Case 2: a system with weak discrimination but perfect calibration. Two overlapping normal distributions were chosen to represent Hp and Hd. Both distributions were sampled 1,000,000 times, and the LR of each sample under the two competing distributions was calculated. Before generating the ECE plot, we used the property that the LR of the LR is the LR to demonstrate that the system is perfectly calibrated. In general, the derivative of the Receiver Operating Characteristic (ROC, Fig. 3) is the LR [2]. In this case, the slope of the ROC obtained from the LRs calculated for the two sets of samples is the LR of the LRs. Results are shown in Fig. 4 and Fig. 5.
Case 3: a system with both perfect discrimination and perfect calibration. The same strategy as in Case 2 was used, except that the two chosen distributions did not overlap. Results are not shown in this poster.

Fig. 2 ECE plot showing perfect discrimination
Fig. 3 ROC curve of LRs
Fig. 4 LR vs. LR of LR
Fig. 5 ECE plot showing perfect calibration

Conclusion
As the figures show, an ECE plot is a good tool for assessing the calibration and discriminating power of an LR-based system in biometry and forensic science. However, ECE is based on measuring the difference between posterior probabilities and expected decisions. Cross-entropy therefore cannot be used to recalibrate a set of LRs from an uncalibrated operational forensic system, since obtaining posterior probabilities from such a system would require assigning prior probabilities to Hp and Hd.
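The dependence on priors can be made explicit with the odds form of Bayes' theorem. The symbol E for the observed trace evidence is notation assumed here for illustration and does not appear on the poster:

```latex
\frac{P(H_p \mid E)}{P(H_d \mid E)}
  = \underbrace{\frac{P(E \mid H_p)}{P(E \mid H_d)}}_{\text{LR}}
    \times \frac{P(H_p)}{P(H_d)}
```

The LR alone fixes only the first factor on the right, so a posterior probability cannot be stated until the prior odds P(Hp)/P(Hd) are supplied.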
Legal and scientific scholars have repeatedly argued that assigning prior probabilities to Hp and Hd falls outside the role of forensic scientists.

References
[1] Brümmer, N., & du Preez, J. (2013, April 8). The PAV Algorithm Optimizes Binary Proper Scoring Rules.
[2] Pepe, S. P. (2004). The Statistical Evaluation of Medical Tests for Classification and Prediction. New York City, New York: Oxford University Press.
[3] Ramos, D., & Gonzalez-Rodriguez, J. (2013, May 10). Reliable support: Measuring calibration of likelihood ratios. Forensic Science International, 130,

Fig. 1 ECE plot. The smaller the difference between the red and blue lines, the better the calibration; the greater the distance between the black and blue lines, the better the discriminating power.

Acknowledgements: This work was made possible by the National Science Foundation REU Security Printing and Anti-Counterfeiting Site EEC. Thanks to Dr. Cedric Neumann, Dr. Alfred Boysen and Dr. Brian Logue for their support, and to Jessie Hendricks (SDSU) for providing the code for ROC calculations.