Download presentation
Presentation is loading. Please wait.
1
Executive Director and Endowed Chair
CS 5323 Base Rate Fallacy Prof. Ravi Sandhu Executive Director and Endowed Chair Lecture 13 © Ravi Sandhu World-Leading Research with Real-World Impact!
2
Base-Rate Fallacy S: Patient is Sick (has the disease) S ¬S R ᴧ S
True positive False positive R: Test Result is positive ¬R ᴧ S ¬R ᴧ ¬S ¬R False negative True negative © Ravi Sandhu World-Leading Research with Real-World Impact! 2
3
Base-Rate Fallacy S: Patient is Sick (has the disease)
System is under attack S ¬S R ᴧ S R ᴧ ¬S R True positive False positive R: Test Result is positive Alarm is raised ¬R ᴧ S ¬R ᴧ ¬S ¬R False negative True negative © Ravi Sandhu World-Leading Research with Real-World Impact! 3
4
Malware Detection Techniques
I will learn what is good and bad False positives: incorrect learning False negatives: incorrect learning I know what is bad and can detect it False positives: none False negatives: ever increasing I know what is good and can detect when you go beyond specification False positives: incomplete specification False negatives: incorrect specification Nwokedi Idika and Aditya Mathur, A Survey of Malware Detection Techniques, Purdue University, Feb 2007. © Ravi Sandhu World-Leading Research with Real-World Impact! 4
5
Base-Rate Fallacy S: Patient is Sick (has the disease) S ¬S R ᴧ S
True positive False positive R: Test Result is positive ¬R ᴧ S ¬R ᴧ ¬S ¬R False negative True negative © Ravi Sandhu World-Leading Research with Real-World Impact! 5
6
Base-Rate Fallacy S: Patient is Sick (has the disease) S ¬S R ᴧ S
True positive False positive P(R|S) = 0.99 P(R|¬S) = 0.01 R: Test Result is positive ¬R ᴧ S ¬R ᴧ ¬S ¬R False negative True negative These probabilities can be empirically estimated P(¬R|S) = 0.01 P(¬R|¬S) = 0.99 © Ravi Sandhu World-Leading Research with Real-World Impact! 6
7
Estimating P(R|S) etc 2000 sick 1000 not sick Test R is positive
is negative Test R is positive Test R is negative 1980 20 10 990 estimate P(R|S) = 0.99 P(¬R|S) = 0.01 P(R|¬S) = 0.01 P(¬R|¬S) = 0.99 Coincidentally equal © Ravi Sandhu World-Leading Research with Real-World Impact! 7
8
Estimating P(R|S) etc 2000 sick 1000 not sick Test R is positive
is negative Test R is positive Test R is negative 1980 20 30 970 estimate P(R|S) = 0.99 P(¬R|S) = 0.01 P(R|¬S) = 0.03 P(¬R|¬S) = 0.97 In general will not be equal © Ravi Sandhu World-Leading Research with Real-World Impact! 8
9
Base-Rate Fallacy S: Patient is Sick (has the disease) S ¬S R ᴧ S
True positive False positive P(R|S) = 0.99 P(R|¬S) = 0.03 Rows must total between 0 and 2 R: Test Result is positive ¬R ᴧ S ¬R ᴧ ¬S ¬R False negative True negative These probabilities can be empirically estimated P(¬R|S) = 0.01 P(¬R|¬S) = 0.97 Columns must total 1 © Ravi Sandhu World-Leading Research with Real-World Impact! 9
10
Base-Rate Fallacy S: Patient is Sick (has the disease)
We will continue with these numbers S ¬S R ᴧ S R ᴧ ¬S R True positive False positive P(R|S) = 0.99 P(R|¬S) = 0.01 R: Test Result is positive ¬R ᴧ S ¬R ᴧ ¬S ¬R False negative True negative These probabilities can be empirically estimated P(¬R|S) = 0.01 P(¬R|¬S) = 0.99 © Ravi Sandhu World-Leading Research with Real-World Impact! 10
11
Real Interest S: Patient is Sick (has the disease) S ¬S R ᴧ S R ᴧ ¬S R
True positive False positive P(S|R) = ?? P(¬S|R) = ?? Rows must total 1 R: Test Result is positive ¬R ᴧ S ¬R ᴧ ¬S ¬R False negative True negative These probabilities can be computed by Bayes’ theorem if we know P(S) P(S|¬R) = ?? P(¬S|¬R) = ?? Columns must total between 0 and 2 © Ravi Sandhu World-Leading Research with Real-World Impact! 11
12
Bayes’ Theorem P(S|R) = (P(S)×P(R|S))/ (P(S)×P(R|S)+P(¬S) )×P(R|¬S))
P(¬S|R) = 1 - P(S|R) P(S|¬R) = (P(S)×P(¬R|S))/ (P(S)×P(¬R|S)+P(¬S) )×P(¬R|¬S)) P(¬S|¬R) = 1 - P(S|¬R) © Ravi Sandhu World-Leading Research with Real-World Impact! 12
13
Base-Rate Fallacy S: Patient is Sick (has the disease)
We will continue with these numbers S ¬S R ᴧ S R ᴧ ¬S R True positive False positive P(R|S) = 0.99 P(R|¬S) = 0.01 R: Test Result is positive ¬R ᴧ S ¬R ᴧ ¬S ¬R False negative True negative These probabilities can be empirically estimated P(¬R|S) = 0.01 P(¬R|¬S) = 0.99 © Ravi Sandhu World-Leading Research with Real-World Impact! 13
14
Real Interest S: Patient is Sick (has the disease) Assume P(S)=0.0001
1 in 10,000 has disease S ¬S R ᴧ S R ᴧ ¬S R True positive False positive P(S|R) = P(¬S|R) = Rows must total 1 R: Test Result is positive ¬R ᴧ S ¬R ᴧ ¬S ¬R False negative True negative These probabilities can be computed by Bayes’ theorem if we know P(S) P(S|¬R) = P(¬S|¬R) = Columns must total between 0 and 2 © Ravi Sandhu World-Leading Research with Real-World Impact! 14
15
False Alarms Predominate!
Assume P(S)=0.0001 1 in 10,000 has disease P(S|R) requires P(R|¬S) © Ravi Sandhu World-Leading Research with Real-World Impact! 15
16
Base-Rate Fallacy S: Patient is Sick (has the disease)
Total population = 1,000,000 1 in 10,000 has disease S ¬S 100 999,900 R ᴧ S R ᴧ ¬S R True positive False positive R: Test Result is positive ¬R ᴧ S ¬R ᴧ ¬S ¬R False negative True negative R is 99% accurate for sick and non-sick populations © Ravi Sandhu World-Leading Research with Real-World Impact! 16
17
Base-Rate Fallacy S: Patient is Sick (has the disease)
Total population = 1,000,000 1 in 10,000 has disease S ¬S 100 999,900 R ᴧ S R ᴧ ¬S R True positive False positive 99 9,999 R: Test Result is positive ¬R ᴧ S ¬R ᴧ ¬S ¬R False negative True negative 1 989,901 R is 99% accurate for sick and non-sick populations © Ravi Sandhu World-Leading Research with Real-World Impact! 17
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.