Reporting match error: casework, validation & language

Slides:



Advertisements
Similar presentations
DNA Identification: Mixture Weight & Inference
Advertisements

Overcoming DNA Stochastic Effects 2010 NEAFS & NEDIAI Meeting November, 2010 Manchester, VT Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA Cybergenetics.
Forensic DNA Inference ICFIS 2008 Lausanne, Switzerland Mark W Perlin, PhD, MD, PhD Joseph B Kadane, PhD Robin W Cotton, PhD Cybergenetics ©
New York State Police TrueAllele ® Casework Developmental Validation Cybergenetics © New York State DNA Subcommittee March, 2010.
Finding Truth in DNA Mixture Evidence Innocence Network Conference April, 2013 Charlotte, NC Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA.
How Inclusion Interpretation of DNA Mixture Evidence Reduces Identification Information American Academy of Forensic Sciences February, 2013 Washington,
Creating informative DNA libraries using computer reinterpretation of existing data Northeastern Association of Forensic Scientists November, 2011 Newport,
How TrueAllele ® Works (Part 3) Kinship, Paternity and Missing Persons Cybergenetics Webinar December, 2014 Mark W Perlin, PhD, MD, PhD Cybergenetics,
How TrueAllele® Works (Part 1)
TrueAllele ® Casework Validation on PowerPlex ® 21 Mixture Data Australian and New Zealand Forensic Science Society September, 2014 Adelaide, South Australia.
Using TrueAllele ® Casework to Separate DNA Mixtures of Relatives California Association of Criminalists October, 2014 San Francisco, CA Jennifer Hornyak,
Kern Regional Crime Laboratory Laboratory Director: Dr. Kevin W. P. Miller TRUEALLELE® WORK AND WORKFLOW: KERN COUNTY’S FIRST CASES APRIL 23, 2014.
Revolutionising DNA analysis in major crime investigations The Investigator Conferences Green Park Conference Centre May, 2014 Aylesbury, Buckinghamshire.
Revolutionising DNA analysis in major crime investigations The Investigator Conferences Green Park Conference Centre May, 2014 Aylesbury, Buckinghamshire.
Scientific Validation of Mixture Interpretation Methods 17th International Symposium on Human Identification Sponsored by the Promega Corporation October,
Computer Interpretation of Uncertain DNA Evidence National Institute of Justice Computer v. Human June, 2011 Arlington, VA Mark W Perlin, PhD, MD, PhD.
How TrueAllele ® Works (Part 2) Degraded DNA and Allele Dropout Cybergenetics Webinar November, 2014 Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh,
TrueAllele ® Genetic Calculator: Implementation in the NYSP Crime Laboratory NYS DNA Subcommittee May 19, 2010 Barry Duceman, Ph.D New York State Police.
Separating Familial Mixtures, One Genotype at a Time Northeastern Association of Forensic Scientists November, 2014 Hershey, PA Ria David, PhD, Martin.
Cybergenetics Webinar January, 2015 Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA Cybergenetics © How TrueAllele ® Works (Part 4)
Unleashing Forensic DNA through Computer Intelligence Forensics Europe Expo Forensic Innovation Conference April, 2013 London, UK Mark W Perlin, PhD, MD,
Rapid DNA Response: On the Wings of TrueAllele Mid-Atlantic Association of Forensic Scientists May, 2015 Cambridge, Maryland Martin Bowkley, Matthew Legler,
Getting Past First Bayes with DNA Mixtures American Academy of Forensic Sciences February, 2014 Seattle, WA Mark W Perlin, PhD, MD, PhD Cybergenetics,
Compute first, ask questions later: an efficient TrueAllele ® workflow Midwestern Association of Forensic Scientists October, 2014 St. Paul, MN Martin.
Virginia TrueAllele ® Validation Study: Casework Comparison Presented at AAFS, February, 2013 Published in PLOS ONE, March, 2014 Mark W Perlin, PhD, MD,
Murder in McKeesport October 25, 2008 Tamir Thomas.
When Good DNA Goes Bad International Conference on Forensic Research & Technology October, 2012 Chicago, Illinois Mark W Perlin, PhD, MD, PhD Cybergenetics,
Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA Cybergenetics © Duquesne University October, 2015 Pittsburgh, PA What’s in a Match?
Open Access DNA Database Duquesne University March, 2013 Pittsburgh, PA Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA Cybergenetics ©
Objective DNA Mixture Information in the Courtroom: Relevance, Reliability & Acceptance NIST International Symposium on Forensic Science Error Management:
Death Needs Answers: The DNA Evidence Cybergenetics © Andrea Niapas Book Launch Pittsburgh, PA May, 2013 Mark W Perlin, PhD, MD, PhD Cybergenetics,
DNA Identification: Quantitative Data Modeling Cybergenetics © Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA TrueAllele ® Lectures.
Separating DNA Mixtures by Computer to Identify and Convict a Serial Rapist Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA Garett Sugimoto,
Four person DNA mixture
A Match Likelihood Ratio for DNA Comparison
Forensic Stasis in a World of Flux
Overcoming Bias in DNA Mixture Interpretation
Validating TrueAllele® genotyping on ten contributor DNA mixtures
How to Defend Yourself Against DNA Mixtures
Error in the likelihood ratio: false match probability
Explaining the Likelihood Ratio in DNA Mixture Interpretation
Distorting DNA evidence: methods of math distraction
On the threshold of injustice: manipulating DNA evidence
Machines can work it out: Automated TrueAllele® workflow
Virginia TrueAllele® Validation Study: Casework Comparison
Solving sexual assault cases using DNA mixture evidence
Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA, USA
Suffolk County TrueAllele® Validation
American Academy of Forensic Sciences Criminalistics Section
Solving Crimes using MCMC to Analyze Previously Unusable DNA Evidence
Investigative DNA Databases that Preserve Identification Information
DNA Identification: Inclusion Genotype and LR
New York State Police TrueAllele® Validation
Forensic Thinking, Fast and Slow
DNA Identification: Stochastic Effects
Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA, USA
DNA Identification: Biology and Information
DNA Identification: Biology and Information
Forensic match information: exact calculation and applications
severed carotid artery
TrueAllele® computer technology
The Triumph and Tragedy of DNA Evidence
Probabilistic Genotyping to the Rescue for Pinkins and Glenn
Forensic validation, error and reporting: a unified approach
DNA Identification: Biology and Information
DNA Identification: Mixture Interpretation
Exonerating the Innocent with Probabilistic Genotyping
Testifying about probabilistic genotyping results
David W. Bauer1, PhD Nasir Butt2, PhD Jeffrey Oblock2
Using probabilistic genotyping to distinguish family members
Presentation transcript:

Reporting match error: casework, validation & language Cybergenetics Webinar User Group Meeting December, 2018 Pittsburgh, PA Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA Cybergenetics © 2003-2018 Cybergenetics © 2003-2011

How often would evidence match the wrong person as strongly as the defendant? information likelihood ratio unfamiliar concept How often probability frequency familiar concept

Non-contributor distribution Frequency of log(LR) information for an uncertain genotype, relative to random population CDF PMF Tail region gives error: probability of misleading evidence (PME)

Genotype prior probability 11 12 0.1586

Genotype posterior probability q 11 12 0.1586 0.0236

Genotype probability ratio (LR) q q/p 11 12 0.1586 0.0236 0.1485

Genotype match strength q q/p log(q/p) 11 12 0.1586 0.0236 0.1485 -0.8282

Plot genotype match strength q q/p log(q/p) 11 12 0.1586 0.0236 0.1485 -0.8282 -1 -½ ½ 1

Show another genotype value q q/p log(q/p) 11 12 10 13 0.1586 0.0299 0.0236 0.0114 0.1485 0.3805 -0.8282 -0.4197 -1 -½ ½ 1

Add another allele pair q q/p log(q/p) 11 12 10 13 10 10 0.1586 0.0299 0.0725 0.0236 0.0114 0.0946 0.1485 0.3805 1.3045 -0.8282 -0.4197 0.1154 -1 -½ ½ 1

And another p q q/p log(q/p) 11 12 10 13 10 10 10 12 0.1586 0.0299 0.0725 0.1610 0.0236 0.0114 0.0946 0.2743 0.1485 0.3805 1.3045 1.7037 -0.8282 -0.4197 0.1154 0.2314 -1 -½ ½ 1

And another genotype value q q/p log(q/p) 11 12 10 13 10 10 10 12 10 11 0.1586 0.0299 0.0725 0.1610 0.1428 0.0236 0.0114 0.0946 0.2743 0.5785 0.1485 0.3805 1.3045 1.7037 4.0517 -0.8282 -0.4197 0.1154 0.2314 0.6076 -1 -½ ½ 1

For each value, prior & strength log(q/p) 11 12 10 13 10 10 10 12 10 11 0.1586 0.0299 0.0725 0.1610 0.1428 -0.8282 -0.4197 0.1154 0.2314 0.6076 -1 -½ ½ 1

Chance of match strength ≥ 0 p log(q/p) 11 12 10 13 10 10 10 12 10 11 0.1586 0.0299 0.0725 0.1610 0.1428 0.3763 -0.8282 -0.4197 0.1154 0.2314 0.6076 -1 -½ ½ 1

Chance of strength ≥ 0.2314 p log(q/p) 11 12 10 13 10 10 10 12 10 11 0.1586 0.0299 0.0725 0.1610 0.1428 0.3038 -0.8282 -0.4197 0.1154 0.2314 0.6076 -1 -½ ½ 1

Chance of strength ≥ 0.5 p log(q/p) 11 12 10 13 10 10 10 12 10 11 0.1586 0.0299 0.0725 0.1610 0.1428 -0.8282 -0.4197 0.1154 0.2314 0.6076 -1 -½ ½ 1

Organize prior probability into a frequency plot 11 12 10 13 10 10 10 12 10 11 -1 -½ ½ 1

Chance of match strength ≥ 0 11 12 10 13 10 10 10 12 10 11 0.3763 -1 -½ ½ 1

Chance of strength ≥ 0.2314 11 12 10 13 10 10 10 12 10 11 0.3038 -1 -½ ½ 1

Error statement (one locus) For a match strength of 1.7, only 1 in 3.3 people would match as strongly match strength log(1.7) = 0.2314 ban prior probability 1 / 3.3 0.3038 -1 -½ ½ 1

Chance of strength ≥ 0.5 11 12 10 13 10 10 10 12 10 11 0.1428 -1 -½ ½ ½ 1

Probability mass function 11 12 10 13 10 10 10 12 10 11 -1 -½ ½ 1

11 12 10 13 10 10 10 12 10 11 9 10 11 12 13 -1 -½ ½ 1

Locus pmf

Multi-locus genotype information 1. STR loci are independent 2. Total log(LR) = sum of locus log(LR)’s Therefore, pmf of total log(LR) is the convolution of locus log(LR) pmf’s (math fact, from 1718)

Convolution of functions

Sequential locus pmf convolution pmf of one locus convolve with second locus pmf pmf of two loci convolve with third locus pmf pmf of three loci convolve with fourth locus pmf pmf of four loci convolve … pmf across all loci

Convolving more loci 5 10 15

Error statement For a match strength of 536 thousand, only 1 in 7.32 million people would match as strongly match strength log(536,000) = 5.7291 ban CDF population probability 1 / 7,320,000 PMF

Exact vs. sampled Exact all – 1024 genotypes accurate exact probability function convolution – fast Sampled some – 104 genotypes approximate sample using random profiles Monte Carlo – slow

How can TrueAllele® help in reporting match error? A homeless man took a woman into an alleyway and sexually assaulted her. He stole her phone so she couldn’t call for help.   He threatened her, saying, “Don't tell anyone about this or I will kill you” and “You are never going to see your mother again.” Fearing for her life, she followed him across a bridge and into a downtown Pittsburgh park. He sexually assaulted her again, but she screamed and ran toward a hotel. Hotel workers came to her aid, and chased after him. Police officers caught him a few blocks away.

Crime lab DNA analysis The Allegheny County Medical Examiner's Office developed informative DNA data from the evidence. Using limited DNA mixture interpretation methods, the lab said that no conclusion can be made due to insufficient data on some items, and the complexity of the data on others. They did not report DNA match statistics.

TrueAllele® interpretation log(LR) Description Victim Suspect non-sperm rectal swabs 16.06 25.81 sperm rectal swabs 3.69 right hand fingernails 30.72 21.31 left hand fingernails 29.97 16.30 LR values range from thousands to nonillions

Genotype selector

Match table

Select the “Noncontributor” Genotype info Select the “Noncontributor” log(LR) distribution

Noncontributor distribution

Noncontributor statistics

Noncontributor tail statistics PME = 1/28,700,000 << 1/4,890 = 1/LR For a match strength of 4,890, only 1 in 28.7 million people would match as strongly

Why are verbal equivalents unnecessary? • hides the real match strength information • not what a DNA expert actually believes • misleads the jury about “million” (Koehler) Just report PME error, along with the LR, when the match strength is under a million

Why are error rates from DNA evidence and validation studies similar? A group of evidence genotype noncontributor distributions

Validation pmf is the average • exact: average the evidence distributions (a second) • sample: compare evidence vs. random profiles (weeks)

Match strength histogram (ban) Distribution curve (milliban) Match strength histogram (ban) contributor noncontributor

Reporting validation-based error noncontributor For a match strength of 536 thousand, only 1 in 9.65 million people would match as strongly

Reporting exclusionary error Frequency of log(LR) information for an uncertain genotype, relative to evidence distribution CDF PMF

Conclusions • measuring error is built into genotype probability • always report the LR; can also report PME • verbal equivalents are not good science • validation is easy – average the evidence curves no “right” match answer is needed, just the evidence genotype distributions Perlin MW. Efficient construction of match strength distributions for uncertain multi-locus genotypes. Heliyon. 2018;4(10):e00824.