Statistical Weights of DNA Profiles. Forensic Bioinformatics. Dan E. Krane, Wright State University, Dayton, OH
DNA statistics: Coincidental 10-locus DNA profile matches are very rare. Several factors can make statistics less impressive: –Mixtures –Incomplete information –Relatives –Database searches
DNA profile
Comparing electropherograms: Evidence sample vs. Suspect #1's reference: EXCLUDE
Comparing electropherograms: Evidence sample vs. Suspect #2's reference: CANNOT EXCLUDE
What weight should be given to DNA evidence? Statistics do not lie. But, you have to pay close attention to the questions they are addressing. What is the chance that a randomly chosen, unrelated individual from a given population would have the same DNA profile observed in a sample?
Single source statistics: Random Match Probability (RMP) or Random Man Not Excluded (RMNE)
Single source samples. Formulae for RMNE at a locus: Heterozygotes: $2pq$. Homozygotes: $p^2$. Multiply across all loci. Statistical estimates: the product rule (each locus contributes a $2pq$ or $p^2$ term, and the terms are multiplied together).
Statistical estimate: Single source sample. At one heterozygous locus: $2pq = 2 \times 0.1454 \times 0.1097 \approx 0.0319$ (about 3.2%).
Statistical estimate: Single source sample. Per-locus genotype frequencies: 3.2% × 6.0% × 4.6% × 1.2% × 9.8% × 9.5% × 6.3% × 2.2% × 1.0% × 2.9% × 5.1% × 29.9% × 4.0% × 1.1% × 6.6% = 1 in 608,961,665,956,361,000,000, or 1 in 608 quintillion (less than one in one billion).
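The product rule above can be sketched in a few lines of Python. This is a minimal illustration, not the lab's software: the function names are made up for this example, and the per-locus percentages are taken as printed on the slide, so they are rounded and the product only approximates the slide's figure (same order of magnitude, not the same digits).

```python
from math import prod

def heterozygote_freq(p, q):
    """Expected frequency of a heterozygous genotype: 2pq."""
    return 2 * p * q

def homozygote_freq(p):
    """Expected frequency of a homozygous genotype: p^2."""
    return p * p

# Single-locus example from the slide: allele frequencies 0.1454 and 0.1097.
locus_freq = heterozygote_freq(0.1454, 0.1097)
print(f"one locus: {locus_freq:.4f}")  # about 0.0319, i.e. 3.2%

# Rounded per-locus genotype frequencies as printed on the slide.
per_locus = [0.032, 0.060, 0.046, 0.012,
             0.098, 0.095, 0.063, 0.022, 0.010,
             0.029, 0.051, 0.299, 0.040,
             0.011, 0.066]

# Product rule: multiply the per-locus frequencies across all loci.
profile_freq = prod(per_locus)
one_in = 1 / profile_freq
print(f"profile frequency: about 1 in {one_in:.2e}")
```

Because the inputs are rounded to one decimal place, the result lands in the hundreds of quintillions rather than reproducing the slide's exact number.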
Mixture statistics: Combined Probability of Inclusion (CPI) or Likelihood Ratios (LR)
Mixed DNA samples
Put two people's names into a mixture.
How many names can you take out?
How many contributors to a mixture if analysts can discard a locus? How many contributors to a mixture? Maximum # of alleles observed in a 3-person mixture | # of occurrences | Percent of cases: ,967, ,037, ,532, There are 146,536,159 possible different 3-person mixtures of the 959 individuals in the FBI database (Paoletti et al., November 2005 JFS). 3,398 7,274, ,469,398 26,788,
How many contributors to a mixture? Maximum # of alleles observed in a 4-person mixture | # of occurrences | Percent of cases: 413, ,596, ,068, ,637, , There are 57,211,376 possible different 4-way mixtures of the 194 individuals in the FBI Caucasian database (Paoletti et al., November 2005 JFS). (35,022,142,001 4-person mixtures with 959 individuals.)
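The mixture counts quoted above are simple binomial coefficients: the number of ways to choose 3 or 4 distinct individuals from the database. A short check, assuming each mixture is an unordered set of distinct people (the variable names are illustrative only):

```python
from math import comb

# Number of distinct k-person mixtures that can be formed from n individuals.
three_person_959 = comb(959, 3)
four_person_194 = comb(194, 4)
four_person_959 = comb(959, 4)

print(f"{three_person_959:,}")  # 3-person mixtures of 959 individuals
print(f"{four_person_194:,}")   # 4-person mixtures of 194 individuals
print(f"{four_person_959:,}")   # 4-person mixtures of 959 individuals
```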
CPI Stats
Combined Probability of Inclusion: Probability that a random, unrelated person could be included as a possible contributor to a mixed profile. For a mixed profile with the alleles 14, 16, 17, 18, contributors could have any of 10 genotypes: 14,14; 14,16; 14,17; 14,18; 16,16; 16,17; 16,18; 17,17; 17,18; 18,18. Probability works out as: $\mathrm{CPI} = (p_{14} + p_{16} + p_{17} + p_{18})^2$
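The identity behind this formula is that summing the expected frequencies of all 10 genotypes gives the same number as squaring the sum of the allele frequencies. The sketch below checks this for one locus. The allele frequencies are hypothetical placeholders (the slide's own values did not survive), and the helper names are invented for this example.

```python
from itertools import combinations_with_replacement

# Hypothetical allele frequencies for alleles 14, 16, 17, 18 (not from the slide).
freqs = {14: 0.10, 16: 0.20, 17: 0.25, 18: 0.15}

def inclusion_probability(freqs):
    """Inclusion probability at one locus: (sum of observed allele frequencies)^2."""
    return sum(freqs.values()) ** 2

def genotype_frequency(a, b, freqs):
    """p^2 for a homozygote, 2pq for a heterozygote."""
    if a == b:
        return freqs[a] ** 2
    return 2 * freqs[a] * freqs[b]

# Enumerate every genotype that uses only the observed alleles.
genotypes = list(combinations_with_replacement(sorted(freqs), 2))
summed = sum(genotype_frequency(a, b, freqs) for a, b in genotypes)

locus_pi = inclusion_probability(freqs)
print(len(genotypes), round(summed, 4), round(locus_pi, 4))
```

Multiplying such per-locus values across all loci gives the combined figure for the whole profile.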
CPI Stats. Per-locus inclusion probabilities: 62.1%, 91.5%, 23.5%, 19.2%, 40.7%, 47.6%, 99.0%, 54.4%, 61.2%, 8.4%, 91.6%, 63.7%, 8.8%, 82.9%, 31.1%. Combined: 1 in 1.3 million.
Mixtures with drop out
The testing lab's conclusions
Ignoring loci with missing alleles: Labs often claim that this is a conservative statistic. Ignores potentially exculpatory information. It fails to acknowledge that choosing the omitted loci is suspect-centric and therefore prejudicial against the suspect. –Gill, et al. DNA commission of the International Society of Forensic Genetics: Recommendations on the interpretation of mixtures. FSI
Likelihood approaches for mixtures where allelic drop-out may have occurred: Determining the rate of allelic drop-out is problematic. Determining the rate of allelic drop-in is problematic. Considering more than two possible contributors is computationally intensive. Considering mixtures of different racial groups can be computationally intensive. Contributions from different kinds of close relatives require special considerations.
How many names can you take out if you can use blanks?
The more blanks, the harder it is to eliminate anyone's name as possibly being in the mix.
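One way to read the names analogy is to treat each letter as an allele: a name can be "taken out" of the mixture only if it needs a letter the mixture does not contain, and each blank (an assumed drop-out) forgives one missing letter. The sketch below is only an illustration of that reading. The names and the `cannot_exclude` helper are hypothetical and not part of the original slides.

```python
def letters(name):
    """The set of distinct letters in a name (the 'alleles')."""
    return {c for c in name.lower() if c.isalpha()}

def cannot_exclude(name, mixture, blanks=0):
    """A name stays in the pool if at most `blanks` of its letters are missing."""
    missing = letters(name) - mixture
    return len(missing) <= blanks

# Mixture made from two hypothetical names.
mixture = letters("Anna") | letters("Bob")
candidates = ["Anna", "Bob", "Nan", "Noah", "Zed"]

# Which candidate names survive as the number of allowed blanks grows?
included = {
    blanks: [n for n in candidates if cannot_exclude(n, mixture, blanks)]
    for blanks in (0, 1, 3)
}
for blanks, names in included.items():
    print(blanks, names)
```

Even with no blanks, a name that was never put in ("Nan") cannot be excluded. Each additional blank lets more unrelated names through.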
The alternative suspect pool
Which allele frequency database should be used? Random match probabilities are typically generated for each of three major racial groups. Literally hundreds of alternative allele frequency databases are available. The racial background of a suspect is not relevant.
What is the relevant population?
A process of elimination: Consider that a suspect matches an evidence sample. If he is not the source of the DNA, then it must be someone else's. Whose might it be? Could the actual source be Caucasian, Afro-Caribbean, or Indo-Pakistan? If it cannot be, and there is no one else in the alternative suspect pool, then the suspect must be the source.
A suspect pool (A, B, C, D): D matches. It means something if we find that A, B and C are all unlikely to also match.
Database searches
Consider cold hits: UK's National DNA Database (NDNAD). Maintained by the Home Office. Contains 6,929,946 arrested individuals as of 31 March 2012. Assisted in 409,715 investigations (2,595 murders).
In which case is the DNA evidence most damning? Probable Cause Case: –Suspect is first identified by non-DNA evidence –DNA evidence is used to corroborate traditional police investigation –RMNE = 1 in 10 million. Cold Hit Case: –Suspect is first identified by search of DNA database –Traditional police work is no longer focus –RMNE = 1 in 10 million –DMP = in 1
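To see why a cold hit differs from a probable cause case, it helps to work out how often a database search would produce a coincidental match. The sketch below assumes that DMP stands for database match probability, that profiles in the database are independent, and that the true source is not in the database. Using the NDNAD size quoted earlier as the database size is also an assumption, since the slide's own DMP value did not survive. Function names are illustrative.

```python
import math

def expected_coincidental_hits(rmp, database_size):
    """Expected number of chance matches: N x RMP."""
    return rmp * database_size

def prob_at_least_one_hit(rmp, database_size):
    """Chance of one or more coincidental matches: 1 - (1 - RMP)^N."""
    return -math.expm1(database_size * math.log1p(-rmp))

rmp = 1 / 10_000_000        # RMNE of 1 in 10 million, as on the slide
database_size = 6_929_946   # NDNAD size as of 31 March 2012 (assumed database)

expected = expected_coincidental_hits(rmp, database_size)
p_any = prob_at_least_one_hit(rmp, database_size)
print(f"expected chance hits: {expected:.3f}")
print(f"probability of at least one chance hit: {p_any:.3f}")
```

Under these assumptions a profile with a 1 in 10 million frequency would be expected to hit by chance in a database of this size roughly as often as not, which is why the same RMNE carries different weight in the two scenarios.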
Familial searches: Database search yields a close but imperfect DNA match. Can suggest a relative is the true perpetrator. UK performs them relatively rarely – a total of 29 were carried out in … Reluctance to perform them in US since 1992 NRC report.
Is the true DNA match a relative or a random individual? Given a closely matching profile, who is more likely to match: a relative or a randomly chosen, unrelated individual? Use a likelihood ratio.
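As a rough illustration of what such a likelihood ratio looks like, the sketch below compares two hypotheses at a single heterozygous locus: the matching person is a full sibling of the profiled individual, or is unrelated. It uses the standard full-sibling match probability for a heterozygote, (1 + p + q + 2pq) / 4, with no subpopulation correction. A real familial search compares partial matches across a whole profile, so this is a simplified, single-locus sketch. The allele frequencies are reused from the single-source example, and the function names are invented here.

```python
def unrelated_match_prob(p, q):
    """Chance an unrelated person has the same heterozygous genotype: 2pq."""
    return 2 * p * q

def full_sibling_match_prob(p, q):
    """Chance a full sibling has the same heterozygous genotype."""
    return (1 + p + q + 2 * p * q) / 4

p, q = 0.1454, 0.1097  # allele frequencies from the single-source example

sibling = full_sibling_match_prob(p, q)
unrelated = unrelated_match_prob(p, q)

# Likelihood ratio: how much better "sibling" explains the match than "unrelated".
likelihood_ratio = sibling / unrelated
print(f"sibling: {sibling:.4f}, unrelated: {unrelated:.4f}, LR: {likelihood_ratio:.1f}")
```

A ratio above 1 favors the relative hypothesis at this locus. Per-locus ratios would be multiplied across loci to weigh the two explanations for the full profile.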
Is the true DNA match a relative or a random individual? This question is ultimately governed by two considerations: –What is the size of the alternative suspect pool? –What is an acceptable rate of false positives?
Additional (free) resources: Forensic Bioinformatics; GenoStat®; Eight 50-minute YouTube videos