Inferring the Number of Contributors to Mixed DNA Profiles David Paoletti.



2 STR Samples
- Each parent contributes one allele
- Seeing one or two alleles at a locus implies at least one contributor
- Three or four alleles mean at least 2 contributors
- Etc.
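The allele-counting rule above can be sketched in a few lines. This is a minimal illustration, not the presenter's tool: the function name `min_contributors` and the locus names in the usage are hypothetical, and a profile is assumed to be a mapping from locus name to the set of alleles observed there.

```python
from math import ceil

def min_contributors(profile):
    """Lower bound on contributors from allele counting.

    Each person carries at most two distinct alleles per locus, so a locus
    showing k distinct alleles needs at least ceil(k / 2) contributors.
    `profile` maps locus name -> iterable of observed alleles (assumed layout).
    """
    return max((ceil(len(set(alleles)) / 2) for alleles in profile.values()),
               default=0)

# Hypothetical two-locus profile: 3 alleles at one locus, 2 at the other.
example = {"locus_A": {7, 8, 9}, "locus_B": {11, 12}}
bound = min_contributors(example)
```

The result is only a lower bound: contributors who share alleles can make a mixture look smaller than it is.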

3 Crime Scene Sample
- What is the actual number of contributors?
- Counting alleles may be misleading
  - More than 3% of 3-contributor mixtures appear to be from 2 individuals
  - With 4 contributors, 75% or more of the mixtures can appear to originate with fewer individuals

4 Bayesian Bottleneck
- We would like to be able to say something like: "There is a 90% chance that this sample contains 3 individuals, not 2"
- However, we do not know the priors
  - The actual percentage of crime scene samples that have a single contributor, etc.
- Criminals' self-interest is against helping

5 Create All Possible Mixtures
- Using a computer, consider every possible mixture that could have produced the crime scene sample
  - Do this for 2 contributors
  - Compute the probability for each potential mixture (using allele frequencies)
- Sum up the probabilities for all mixtures
- We refer to this as the Probabilistic Mixture Model, or PMM
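The enumerate-and-sum idea can be sketched for a single locus as follows. This is an illustration under stated assumptions rather than the PMM implementation itself: alleles are treated as independent draws from the population frequencies (unrelated contributors, no dropout), the function name `locus_probability` is hypothetical, and the frequencies are made up.

```python
from collections import Counter
from itertools import combinations_with_replacement
from math import factorial, prod

def locus_probability(observed, n_contributors, freq):
    """P(exactly the observed allele set | n contributors) at one locus.

    Enumerates every multiset of 2 * n_contributors alleles whose distinct
    alleles are exactly `observed`, and sums their probabilities.
    Assumes independent allele draws with frequencies `freq` (allele -> p).
    """
    observed = frozenset(observed)
    total = 0.0
    for combo in combinations_with_replacement(sorted(observed),
                                               2 * n_contributors):
        if set(combo) != observed:
            continue  # this candidate mixture would not show every observed allele
        counts = Counter(combo)
        # Number of distinct orderings of this multiset (multinomial coefficient).
        orderings = factorial(len(combo))
        for c in counts.values():
            orderings //= factorial(c)
        total += orderings * prod(freq[a] ** c for a, c in counts.items())
    return total

# Made-up frequencies for illustration only.
freq = {7: 0.1, 8: 0.2, 9: 0.3, 10: 0.4}
p_two = locus_probability({7, 8}, 2, freq)
```

Brute-force enumeration mirrors the slide's description and is adequate for small contributor counts; the number of candidate mixtures grows quickly as contributors are added.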

6 Example
- Call the available alleles at a locus 7, 8, 9, 10, 11, 12
- Assume 3 contributors, 4 unique alleles, 2 duplicates
- Assume we've chosen the unique alleles to be the alleles 7, 8, 9, 10
- Assume that as duplicates we've chosen 8, 8
- The probability of seeing this is: $p_7\,p_8\,p_9\,p_{10}\,p_8\,p_8 \times \text{permutations}$ (the numeric values are missing from this transcript)
- Not very likely, but u = 4 can occur in many different ways: (7,8,9,10,7,7), (7,8,9,10,7,8), …, (7,8,9,10,8,8), …, (9,10,11,12,12,12)
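A small sketch of this example, with loudly labeled assumptions: the allele frequencies below are invented (the slide's values are not in the transcript), and the "permutations" factor is read as the number of distinct orderings of the six alleles, which is one plausible interpretation rather than a confirmed one.

```python
from collections import Counter
from itertools import combinations_with_replacement
from math import factorial, prod

# Invented frequencies for alleles 7..12; NOT the values used on the slide.
freq = {7: 0.10, 8: 0.20, 9: 0.25, 10: 0.15, 11: 0.20, 12: 0.10}

def multiset_probability(alleles, freq):
    """Probability of drawing exactly these alleles, in any order."""
    counts = Counter(alleles)
    orderings = factorial(len(alleles))   # assumed meaning of "permutations"
    for c in counts.values():
        orderings //= factorial(c)
    return orderings * prod(freq[a] ** c for a, c in counts.items())

# Unique alleles 7, 8, 9, 10 plus duplicates 8, 8 (3 contributors, 6 alleles).
p_example = multiset_probability([7, 8, 9, 10, 8, 8], freq)

# Count the different allele combinations that give u = 4 unique alleles.
ways_u4 = sum(1 for combo in combinations_with_replacement(range(7, 13), 6)
              if len(set(combo)) == 4)
```

Any single combination is improbable, but there are many combinations with four unique alleles, so their summed probability can still be substantial.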

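7 Comparing Probabilities
- Repeat for a different number of contributors
- Compare any two as a likelihood ratio; for example: $\mathrm{LR} = \dfrac{P(\text{profile} \mid 2\ \text{contributors})}{P(\text{profile} \mid 3\ \text{contributors})}$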

8 What does LR Mean?
- Suppose from the previous example that the LR was 25
- This means that, if the number of contributors is actually 2, it is 25 times more likely to observe this profile than it is if the true number of contributors is 3
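A likelihood ratio of this kind can be sketched as below. Several things here are assumptions, not taken from the slides: loci are treated as independent so per-locus probabilities multiply, alleles are independent draws, the per-locus probability uses an inclusion-exclusion shortcut in place of explicit enumeration, and all names and frequencies are hypothetical.

```python
from itertools import combinations
from math import prod

def locus_probability(observed, n_contributors, freq):
    """P(exactly the observed allele set | n contributors), one locus.

    Inclusion-exclusion over subsets of the observed alleles: the chance that
    all 2n draws fall inside a subset T is (sum of frequencies in T) ** (2n).
    """
    alleles = sorted(observed)
    draws = 2 * n_contributors
    total = 0.0
    for k in range(1, len(alleles) + 1):
        sign = (-1) ** (len(alleles) - k)
        for subset in combinations(alleles, k):
            total += sign * sum(freq[a] for a in subset) ** draws
    return max(total, 0.0)  # guard against tiny negative rounding error

def likelihood_ratio(profile, freqs, n_numerator, n_denominator):
    """LR = P(profile | n_numerator) / P(profile | n_denominator)."""
    num = prod(locus_probability(obs, n_numerator, freqs[locus])
               for locus, obs in profile.items())
    den = prod(locus_probability(obs, n_denominator, freqs[locus])
               for locus, obs in profile.items())
    return num / den

# Made-up single-locus data for illustration.
freqs = {"locus_A": {7: 0.1, 8: 0.2, 9: 0.3, 10: 0.4}}
profile = {"locus_A": {7, 8}}
lr = likelihood_ratio(profile, freqs, 2, 3)
```

An LR above 1 favors the numerator's contributor count and an LR below 1 favors the denominator's. The denominator must be a feasible contributor count for the profile, otherwise the division fails.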

9 Verifying this Approach
- Create 2-person mixtures
- Create 3-person mixtures that appear (by allele counting) to be a mixture of two individuals
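One way to build such test sets is by simulation. The sketch below is an assumption-laden illustration, not the procedure used for the results reported in this talk: it draws alleles independently from invented frequencies at two hypothetical loci, whereas the talk's datasets are real population profiles.

```python
import random

# Invented frequencies at two hypothetical loci.
freqs = {
    "locus_A": {7: 0.10, 8: 0.20, 9: 0.25, 10: 0.15, 11: 0.20, 12: 0.10},
    "locus_B": {14: 0.30, 15: 0.30, 16: 0.20, 17: 0.10, 18: 0.10},
}

def simulate_mixture(n_contributors, freqs, rng):
    """Draw two alleles per contributor at each locus; keep the distinct ones."""
    profile = {}
    for locus, freq in freqs.items():
        alleles = list(freq)
        weights = [freq[a] for a in alleles]
        drawn = rng.choices(alleles, weights=weights, k=2 * n_contributors)
        profile[locus] = set(drawn)
    return profile

def looks_like_two(profile):
    """True if allele counting alone cannot rule out a 2-person mixture."""
    return all(len(obs) <= 4 for obs in profile.values())

rng = random.Random(0)  # fixed seed so the run is repeatable
three_person = [simulate_mixture(3, freqs, rng) for _ in range(1000)]
masked = [m for m in three_person if looks_like_two(m)]
```

The `masked` list holds the 3-person mixtures that allele counting would misread, which are the hard cases a contributor-number method needs to be tested on.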

10 Actual 2-person Mixtures

Dataset               Correctly Identified
FBI – Combined        99.57%
African American      99.52%
Bahamian              99.03%
Caucasian             98.91%
Jamaican              98.40%
Southwest Hispanic    98.95%
Trinidadian           99.27%

11 3-person Mixtures that appear to have only 2 Contributors

Dataset               Correctly Identified
FBI – Combined        60.02%
African American      62.82%
Bahamian              72.77%
Caucasian             70.18%
Jamaican              69.74%
Southwest Hispanic    72.67%
Trinidadian           70.07%

12 Adjusting the Threshold
- On the previous charts, the decision was based on comparing the likelihood ratio (LR) to a threshold of 1.0
- Suppose you want to be sure that you're making the correct decision, and decide that the LR must be higher
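A thresholded decision can be sketched as below. The slides do not say what happens when the LR falls short of a raised threshold, so treating that middle band as inconclusive is an assumption of this sketch, as are the function name and return labels.

```python
def decide(lr, threshold=1.0):
    """Pick a hypothesis from a likelihood ratio and a threshold >= 1.

    Returns "numerator" if the LR is at least the threshold, "denominator"
    if it is at most 1 / threshold, and "inconclusive" otherwise (the
    inconclusive band is an assumption, not something stated in the slides).
    """
    if threshold < 1.0:
        raise ValueError("threshold must be at least 1.0")
    if lr >= threshold:
        return "numerator"
    if lr <= 1.0 / threshold:
        return "denominator"
    return "inconclusive"

# With the default threshold of 1.0 every LR gets a decision.
default_call = decide(25.0)
# A stricter threshold leaves the same LR undecided.
strict_call = decide(25.0, threshold=100.0)
```

Raising the threshold trades coverage for confidence: fewer samples receive a call, but the calls that are made rest on stronger evidence.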

13 Effect of Changing LR Threshold

PMM Demonstration
- Three profiles from the publicly available FBI Dataset: Sample ID numbers 2000, 2017, B

15 Conclusions
- The PMM seldom predicts more contributors than the sample contains
- The PMM is much better than simple allele counting
- Using cognate frequencies produces better results

16 Future Work
- Compare cognate to non-cognate prediction ability
- Modify the approach for cases where one contributor's sample is known
- Combine with other approaches (that use peak height or area) for a consensus decision

Tool and Contact Info

18 References
- David R. Paoletti, Travis E. Doom, Michael L. Raymer, and Dan E. Krane, Inferring the Number of Contributors to Mixed DNA Profiles, IEEE-ACM Transactions on Computational Biology and Bioinformatics (in preparation)
- David R. Paoletti, Travis E. Doom, Michael L. Raymer, and Dan E. Krane, Assessing the Implications for Close Relatives in the Event of Similar but Nonmatching DNA Profiles, Jurimetrics, 46(2), Winter 2006, pg. 161–175.
- David R. Paoletti, Travis E. Doom, Carissa M. Krane, Michael L. Raymer, and Dan E. Krane, Empirical Analysis of the STR Profiles Resulting from Conceptual Mixtures, Journal of Forensic Sciences, 50(6), November 2005, pg. 1361–1366.
- Bruce Budowle and Tamyra R. Moretti, Genotype Profiles for Six Population Groups at the 13 CODIS Short Tandem Repeat Core Loci and Other PCR-Based Loci,