Bayesian Methods: What they are and how they fit into Forensic Science

Outline
Bayes' Rule
Bayesian statistics (briefly!)
Conjugates
General parametric modeling using BUGS/MCMC software
Bayesian networks. Some software: GeNIe, SamIam, Hugin, gR R-packages
Bayesian hypothesis testing
The "Bayesian Framework" in forensic science
Likelihood ratios with Bayesian network software

A little about conditional probability: Bayes' Rule
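The rule itself appeared as an image on the original slide; in standard notation it is:

```latex
\Pr(A \mid B) \;=\; \frac{\Pr(B \mid A)\,\Pr(A)}{\Pr(B)}
```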

Probability Frequency: the ratio of the number of observations of interest (n_i) to the total number of observations (N). It is EMPIRICAL! Probability (frequentist): the frequency of observation i in the limit of a very large number of observations.

Probability Belief: a "Bayesian's" interpretation of probability. An observation (outcome, event) is a "measure of the state of knowledge" (Jaynes). Bayesian probabilities reflect degree of belief and can be assigned to any statement. Beliefs (probabilities) can be updated in light of new evidence (data) via Bayes' theorem.

Bayesian Statistics The basic Bayesian philosophy: Prior Knowledge × Data = Updated Knowledge (a better understanding of the world). In short: Prior × Data = Posterior.

Bayesian Statistics Bayesian-ism can be a lot like a religion: different "sects" of (dogmatic) Bayesians don't believe other "sects" are "true Bayesians". The major Bayesian "churches": Bayes nets / graphical models: Steffen Lauritzen (Oxford), Judea Pearl (UCLA). Parametric: BUGS (Bayesian inference Using Gibbs Sampling), MCMC (Markov-chain Monte Carlo): Andrew Gelman (Columbia), David Spiegelhalter (Cambridge). Empirical Bayes (data-driven): Brad Efron (Stanford).

Bayesian Statistics What's a Bayesian...?? Someone who adheres ONLY to the belief interpretation of probability? Someone who uses Bayesian methods? Someone who uses only Bayesian methods? Usually someone who likes to beat up on frequentist methodology...

Bayesian Statistics Why? Actually DOING Bayesian statistics is hard! (We will update this prior belief later.) Why? Parametric Bayesian methods: all probability functions are "parameterized".

Bayesian Statistics We have a prior "belief" for the value of the mean. We observe some data. What can we say about the mean now? We need Bayes' rule. YUK! And this is for an "easy" problem.
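The formula on this slide was an image in the original deck; for data x_1, …, x_n and an unknown mean μ, the posterior it refers to is presumably the standard one, with the integral in the denominator being the "YUK":

```latex
p(\mu \mid x_1,\dots,x_n)
  \;=\;
  \frac{p(x_1,\dots,x_n \mid \mu)\; p(\mu)}
       {\int p(x_1,\dots,x_n \mid \mu)\; p(\mu)\, d\mu}
```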

Bayesian Statistics So what can we do? Until ~1990: get lucky... Sometimes we can work out the integrals by hand. Sometimes the posteriors have the same form as the priors (conjugacy). Now there is software to evaluate the integrals. Some free options: MCMC: WinBUGS, OpenBUGS, JAGS. HMC: Stan.
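As an illustration of what MCMC software such as WinBUGS, JAGS, or Stan automates, here is a minimal random-walk Metropolis sampler for the Gaussian-mean example above. This is my own toy sketch, not anything from the talk; the data, prior, and proposal scale are invented for the demo:

```python
# A minimal sketch of what MCMC software automates: random-walk Metropolis
# sampling of the posterior for a Gaussian mean under a Gaussian prior.
# All numbers (data, prior, proposal scale) are invented for the demo.
import math
import random

random.seed(1)

data = [4.8, 5.2, 4.9, 5.4, 5.1]      # toy observations
sigma = 1.0                            # known observation sd (assumed)
prior_mu, prior_sd = 0.0, 10.0         # vague Gaussian prior on the mean

def log_post(mu):
    """Log posterior up to an additive constant: log prior + log likelihood."""
    lp = -0.5 * ((mu - prior_mu) / prior_sd) ** 2
    lp += sum(-0.5 * ((x - mu) / sigma) ** 2 for x in data)
    return lp

mu, samples = 0.0, []
for step in range(20000):
    proposal = mu + random.gauss(0.0, 0.5)           # random-walk proposal
    if math.log(random.random()) < log_post(proposal) - log_post(mu):
        mu = proposal                                # accept; else keep mu
    if step >= 5000:                                 # discard burn-in
        samples.append(mu)

post_mean = sum(samples) / len(samples)
print(round(post_mean, 2))   # close to the conjugate answer, about 5.07
```

For this conjugate Normal-Normal setup the exact posterior is available in closed form, which is what makes it a useful check on the sampler.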

Bayesian Networks A "scenario" is represented by a joint probability function. It contains variables relevant to a situation, which represent uncertain information, and "dependencies" between variables that describe how they influence each other. A graphical way to represent the joint probability function is with nodes and directed lines. This is called a Bayesian network (Pearl).

Bayesian Networks (A Very!!) Simple example (Wikipedia): What is the probability the grass is wet? It is influenced by the possibility of rain and by the possibility of sprinkler action; sprinkler action is itself influenced by the possibility of rain. Construct a joint probability function to answer questions about this scenario: Pr(Grass Wet, Rain, Sprinkler).

Bayesian Networks
Pr(Rain): yes 20%, no 80%
Pr(Sprinkler | Rain):
  Rain = no: sprinkler was on 40%, was off 60%
  Rain = yes: sprinkler was on 1%, was off 99%
Pr(Grass Wet | Sprinkler, Rain):
  sprinkler was on, rain yes: grass wet 99%, not wet 1%
  sprinkler was on, rain no: grass wet 90%, not wet 10%
  sprinkler was off, rain yes: grass wet 80%, not wet 20%
  sprinkler was off, rain no: grass wet 0%, not wet 100%

Bayesian Networks You observe the grass is wet. Given that observation, the other probabilities in the network, Pr(Rain) and Pr(Sprinkler), are adjusted.
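The updating step can be sketched directly by enumeration; this toy script (my own illustration, not part of the talk) uses the conditional probability tables above to compute the adjusted probability of rain once wet grass is observed:

```python
# A minimal sketch of exact inference by enumeration on the sprinkler network,
# using the conditional probability tables from the slide.
P_rain = {True: 0.20, False: 0.80}
# Pr(Sprinkler on | Rain)
P_sprinkler_on = {True: 0.01, False: 0.40}
# Pr(Grass wet | Sprinkler, Rain)
P_wet = {(True, True): 0.99, (True, False): 0.90,
         (False, True): 0.80, (False, False): 0.00}

def joint(rain, sprinkler, wet):
    """Pr(Rain, Sprinkler, Grass Wet) from the chain rule of the network."""
    p = P_rain[rain]
    p *= P_sprinkler_on[rain] if sprinkler else 1.0 - P_sprinkler_on[rain]
    pw = P_wet[(sprinkler, rain)]
    return p * (pw if wet else 1.0 - pw)

# Posterior Pr(Rain = yes | Grass Wet = yes): sum the sprinkler out.
num = sum(joint(True, s, True) for s in (True, False))
den = sum(joint(r, s, True) for r in (True, False) for s in (True, False))
print(round(num / den, 4))  # -> 0.3577
```

Observing wet grass raises the probability of rain from the 20% prior to about 36%; BN software such as GeNIe or SamIam performs exactly this kind of update, just more efficiently on larger networks.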

Bayesian Networks Areas where Bayesian networks are used: Medical recommendation/diagnosis: IBM/Watson, Massachusetts General Hospital/DXplain. Image processing. Business decision support: Boeing, Intel, United Technologies, Oracle, Philips. Information search algorithms and on-line recommendation engines. Space vehicle diagnostics: NASA. Search and rescue planning: US military. Requires software. Some free options: GeNIe (University of Pittsburgh), SamIam (UCLA), Hugin (free only for a few nodes), gR R-packages.

Bayesian Statistics Bayesian network for the provenance of a painting given trace evidence found on that painting

Bayesian Statistics Frequentist hypothesis testing: Assume/derive a "null" probability model for a statistic (e.g., sample averages follow a Gaussian curve). Say the sample statistic falls here: "Wow!" That's an unlikely value under the null hypothesis (small p-value).

Bayesian Statistics Bayesian hypothesis testing: Assume/derive a "null" probability model for a statistic, p(x|null), AND an "alternative" probability model, p(x|alt). Say the sample statistic falls here.

The "Bayesian Framework" Bayes' Rule (Aitken, Taroni): Hp = the prosecution's hypothesis. Hd = the defence's hypothesis. E = any evidence. I = any background information.

The "Bayesian Framework" Odds form of Bayes' Rule: Posterior Odds = Likelihood Ratio × Prior Odds; that is, the posterior odds in favour of the prosecution's hypothesis equal the likelihood ratio times the prior odds in favour of the prosecution's hypothesis.
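The annotated equation on this slide was an image; written out, the odds form is:

```latex
\underbrace{\frac{\Pr(H_p \mid E, I)}{\Pr(H_d \mid E, I)}}_{\text{posterior odds}}
\;=\;
\underbrace{\frac{\Pr(E \mid H_p, I)}{\Pr(E \mid H_d, I)}}_{\text{likelihood ratio}}
\;\times\;
\underbrace{\frac{\Pr(H_p \mid I)}{\Pr(H_d \mid I)}}_{\text{prior odds}}
```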

The "Bayesian Framework" The likelihood ratio has largely come to be the main quantity of interest in the forensic statistics literature: a measure of how much "weight" or "support" the "evidence" gives to one hypothesis relative to the other; here, Hp relative to Hd. Major players: Evett, Aitken, Taroni, Champod, influenced by Dennis Lindley.

The "Bayesian Framework" The likelihood ratio ranges from 0 to infinity. Points of interest on the LR scale: LR = 0 means the evidence TOTALLY DOES NOT SUPPORT Hp in favour of Hd. LR = 1 means the evidence does not support either hypothesis more strongly. LR = ∞ means the evidence TOTALLY SUPPORTS Hp in favour of Hd.

The "Bayesian Framework" A standard verbal scale of LR "weight of evidence" IS IN NO WAY, SHAPE OR FORM, SETTLED IN THE STATISTICS LITERATURE! A popular verbal scale is due to Jeffreys, but there are others. READ the British R v. T footwear case!

Bayesian Networks The likelihood ratio can be obtained from the BN once evidence is entered. Use the odds form of Bayes' theorem: compare the probabilities of the theories after we entered the evidence with the probabilities of the theories before we entered the evidence.

The “Bayesian Framework” Computing the LR from our painting provenance example:

How good of a "match" is it? Empirical Bayes (Efron): An I.D. is output for each questioned toolmark; this is a computer "match". What's the probability the tool is truly the source of the toolmark? There is a similar problem in genomics: detecting disease from microarray data. They use data and Bayes' theorem to get an estimate. "No disease" (genomics) corresponds to "not a true match" (toolmarks).

Empirical Bayes We use Efron's machinery for the "empirical Bayes two-groups model" (Efron). Surprisingly simple! Use binned data to do a Poisson regression. Some notation: S−: truly no association (the null hypothesis). S+: truly an association (the non-null hypothesis). z: a score derived from a machine-learning task to I.D. an unknown pattern with a group; z is a Gaussian random variate under the null.
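The binned-data Poisson-regression trick can be sketched in a few lines. This is my own toy reconstruction on simulated scores, not the group's code; the null proportion pi0, the bin count, and the polynomial degree are arbitrary demo choices:

```python
# A rough sketch of Efron's two-groups recipe: bin z-scores, fit the bin
# counts with a Poisson regression on a polynomial of z to smooth the mixture
# density f(z), then form lfdr(z) = pi0 * f0(z) / f(z).
# The z-scores are simulated; pi0 is set to the known simulation value.
import numpy as np

rng = np.random.default_rng(0)
z = np.concatenate([rng.normal(0, 1, 900),    # nulls (S-)
                    rng.normal(3, 1, 100)])   # non-nulls (S+)

edges = np.linspace(z.min(), z.max(), 41)
counts, _ = np.histogram(z, edges)
mid = 0.5 * (edges[:-1] + edges[1:])
t = (mid - mid.mean()) / mid.std()
X = np.vander(t, 7, increasing=True)          # degree-6 polynomial basis

beta = np.zeros(7)
beta[0] = np.log(counts.mean() + 1.0)         # sensible starting intercept
for _ in range(30):                           # Newton-Raphson for Poisson GLM
    mu = np.exp(np.clip(X @ beta, -20, 20))
    beta += np.linalg.solve(X.T @ (mu[:, None] * X), X.T @ (counts - mu))

width = edges[1] - edges[0]
f = np.exp(np.clip(X @ beta, -20, 20)) / (len(z) * width)  # smoothed f(z)
f0 = np.exp(-0.5 * mid**2) / np.sqrt(2 * np.pi)            # theoretical null
pi0 = 0.9                                                  # null proportion
lfdr = np.clip(pi0 * f0 / f, 0.0, 1.0)                     # local fdr

# lfdr should be near 1 in the null bulk and small in the right tail
print(round(float(np.interp(0.0, mid, lfdr)), 2),
      round(float(np.interp(4.0, mid, lfdr)), 2))
```

In practice the R package locfdr (and estimation of pi0 and the empirical null, rather than plugging in known values) does this fitting with more care.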

Empirical Bayes From Bayes' theorem we can get (Efron) the estimated probability of not a true "match", given the algorithm's output z-score associated with its "match". Names: posterior error probability (PEP) (Käll); local false discovery rate (lfdr) (Efron). This is an estimated "believability" of a machine-made association. Suggested interpretation for casework: we agree with Gelman and Shalizi (Gelman): "…posterior model probabilities …[are]… useful as tools for prediction and for understanding structure in data, as long as these probabilities are not taken too seriously."

Empirical Bayes Bootstrap procedure to get an estimate of the KNM (known non-match) distribution of "Platt scores" (Platt; e1071), inspired by Storey and Tibshirani's null-estimation method (Storey). Use a "training" set: use SVM to get the KM (known match) and KNM "Platt score" distributions. Use this to get p-values/z-values on a "validation" set. From the fit histogram, by Efron's method, get: the "mixture" density; the z-density given KNM, which should be Gaussian; and an estimate of the prior for KNM. What's the point?? We can test the fits!

Posterior Association Probability: Believability Curve [Figure: 12D PCA-SVM locfdr fit for Glock primer shear patterns, ±2 standard errors]

Bayes Factors/Likelihood Ratios In the "forensic Bayesian framework", the likelihood ratio is the measure of the weight of evidence. LRs are called Bayes factors by most statisticians. LRs give the measure of support the "evidence" lends to the "prosecution hypothesis" versus the "defense hypothesis", and they follow from Bayes' theorem.
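The formula referred to here was an image on the original slide; from Bayes' theorem the ratio is:

```latex
LR \;=\; \frac{\Pr(E \mid H_p)}{\Pr(E \mid H_d)}
   \;=\; \frac{\Pr(H_p \mid E)\,/\,\Pr(H_d \mid E)}{\Pr(H_p)\,/\,\Pr(H_d)}
```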

Bayes Factors/Likelihood Ratios Once the "fits" for the empirical Bayes method are obtained, it is easy to compute the corresponding likelihood ratios, using an identity between the posterior "match" probability and the lfdr.
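The identity and the resulting formula were images in the original deck; a plausible reconstruction in Efron's two-groups notation (π0 the null proportion, f0 and f1 the null and non-null densities, f the mixture) is:

```latex
\Pr(S^+ \mid z) \;=\; 1 - \operatorname{lfdr}(z),
\qquad
\operatorname{lfdr}(z) \;=\; \frac{\pi_0\, f_0(z)}{f(z)},
```

so that

```latex
LR(z) \;=\; \frac{f_1(z)}{f_0(z)}
      \;=\; \frac{1 - \operatorname{lfdr}(z)}{\operatorname{lfdr}(z)}
            \cdot \frac{\pi_0}{1 - \pi_0}.
```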

Bayes Factors/Likelihood Ratios Using the fit posteriors and priors we can obtain the likelihood ratios (Tippett; Ramos): known-match LR values and known non-match LR values.

Acknowledgements Professor Chris Saunders (SDSU) Professor Christophe Champod (Lausanne) Alan Zheng (NIST) Research Team: Dr. Martin Baiker Ms. Helen Chan Ms. Julie Cohen Mr. Peter Diaczuk Dr. Peter De Forest Mr. Antonio Del Valle Ms. Carol Gambino Dr. James Hamby Ms. Alison Hartwell, Esq. Dr. Thomas Kubic, Esq. Ms. Loretta Kuo Ms. Frani Kammerman Dr. Brooke Kammrath Mr. Chris Luckie Off. Patrick McLaughlin Dr. Linton Mohammed Mr. Nicholas Petraco Dr. Dale Purcel Ms. Stephanie Pollut Dr. Peter Pizzola Dr. Graham Rankin Dr. Jacqueline Speir Dr. Peter Shenkin Ms. Rebecca Smith Mr. Chris Singh Mr. Peter Tytell Ms. Elizabeth Willie Ms. Melodie Yu Dr. Peter Zoon