Individual Identity and Population Assignment Lab. 8 Date: 10/17/2012
Goals 1. Use the likelihood ratio (LR) method to assess the probability that a person left DNA at a crime scene. 2. Assign individuals to a predefined population using the LR method. 3. Understand the role of human population differentiation in reliable population assignment.
Individual Identity Scenario: A person is accused of a murder. CSU was able to recover the skin cells of the perpetrator where the murder was alleged to have taken place. Genotyping of the sample gives a single DNA profile that matches the suspect.
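Individual Identity Ideally… For the courtroom… $LR = \frac{P(G \mid H_1)}{P(G \mid H_2)} = \frac{1}{\text{expected genotype frequency based on HWE}}$, where $G$ is the DNA profile, $H_1$ is the hypothesis that the sample came from the suspect, and $H_2$ is the hypothesis that it came from a random, unrelated person. The numerator equals 1 because the suspect's profile matches the sample.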
Example
Locus   Allele 1 (frequency)   Allele 2 (frequency)
A       A1 (0.3)               A2 (0.7)
B       B1 (0.4)
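A minimal Python sketch of the courtroom calculation for this example. It assumes Hardy-Weinberg genotype frequencies (p^2 for a homozygote, 2pq for a heterozygote) and independence between loci. It also assumes the locus B genotype is homozygous B1/B1: the second allele at locus B is not legible here, but that reading reproduces the "about 15" quoted on the Interpretation slide. The function names are illustrative only.

```python
def hwe_genotype_freq(p, q=None):
    """Expected genotype frequency under HWE.

    Pass one allele frequency for a homozygote (p^2),
    two allele frequencies for a heterozygote (2pq).
    """
    if q is None:
        return p * p
    return 2.0 * p * q


def likelihood_ratio(genotype_freqs):
    """LR = 1 / (product of per-locus genotype frequencies)."""
    match_prob = 1.0
    for freq in genotype_freqs:
        match_prob *= freq
    return 1.0 / match_prob


# Locus A: heterozygote A1/A2.
# Locus B: ASSUMED homozygote B1/B1 (second allele not legible in the table).
example_lr = likelihood_ratio([
    hwe_genotype_freq(0.3, 0.7),
    hwe_genotype_freq(0.4),
])
print(round(example_lr, 2))  # 14.88
```

Multiplying per-locus frequencies is only valid if the loci are independent, which is why the product is kept in its own function.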
Interpretation
“It is about 15 times more likely that the sample came from the suspect than from a random person unrelated to the suspect.”
Prosecutor’s Fallacy: “It is 15 times more likely that the sample is from the suspect than from someone else.”
Defense Attorney’s Fallacy: “Because the odds that the sample came from the suspect rather than someone else are only 15:1, there are hundreds of thousands of people who are just as likely to be the sources of the sample found at the scene.”
Interpretation Prosecutor’s Fallacy (part 2): “Given the DNA evidence, the probability that the sample came from somebody other than the suspect is ”.
Individual identity
Problem 1 The profile of a crime suspect genotyped for three of the CODIS loci is:
Suspect Profile
CODIS Locus   Allele 1 (frequency)   Allele 2 (frequency)
D8S              (0.1119)            14 (0.2238)
D21S11        28 (0.1049)            31 (0.0664)
D7S              (0.2797)            11 (0.2797)
1. Calculate the likelihood ratio and provide a strictly correct interpretation.
2. Calculate the posterior probability for each of the following scenarios:
Scenario 1: prior probability $P(H_1)$ = 1/(6.6 × 10^9). Rationale: all ~6.6 billion people on the planet are considered equally likely to be the perpetrator.
Scenario 2: prior probability $P(H_1)$ = 1/(3.03 × 10^8). Rationale: all ~303 million U.S. citizens are considered equally likely to be the perpetrator.
Scenario 3: prior probability $P(H_1)$ = 1/56,000. Rationale: all ~56,000 people (including students) currently living in Morgantown are considered equally likely to be the perpetrator.
Scenario 4: prior probability $P(H_1)$ = 1/18. Rationale: all 18 students currently enrolled in BIOL 464 are considered equally likely to be the perpetrator.
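The posterior probabilities in part 2 follow from Bayes' theorem in odds form: posterior odds = LR × prior odds. The sketch below is one way to organize that arithmetic. The function name is hypothetical, and the LR of 15 is borrowed from the earlier worked example purely as a placeholder, to be replaced by the LR obtained in part 1.

```python
def posterior_probability(lr, prior):
    """P(H1 | evidence) from the likelihood ratio and the prior P(H1)."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = lr * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


# Priors P(H1) from the four scenarios in Problem 1.
priors = {
    1: 1 / 6.6e9,    # everyone on the planet
    2: 1 / 3.03e8,   # all U.S. citizens
    3: 1 / 56_000,   # everyone living in Morgantown
    4: 1 / 18,       # students enrolled in BIOL 464
}

placeholder_lr = 15.0  # PLACEHOLDER: substitute the LR computed in part 1

posteriors = {
    scenario: posterior_probability(placeholder_lr, prior)
    for scenario, prior in priors.items()
}
for scenario, post in posteriors.items():
    print(f"Scenario {scenario}: posterior = {post:.3g}")
```

The same evidence gives very different posteriors depending on the prior, which is the point of comparing the four scenarios.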
Accounting for Pop. Structure $F_{ST}$ is typically much less than 0.01, but a conservative value of 0.01 is used with all ethnic groups. A value of 0.03 is used for Native Americans. Typically, the ethnic group with the lowest LR (most conservative) is used for the courtroom calculation.
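The formula behind this correction does not appear in the text above. One widely used form, assumed here, is the conditional match probability of Balding and Nichols (often cited as equations 4.10a and 4.10b of the 1996 National Research Council report), with θ standing in for F_ST. The sketch reuses the two-locus example, again assuming locus B is homozygous B1/B1; function names are hypothetical.

```python
def theta_genotype_freq(theta, p, q=None):
    """Match probability for one locus with a substructure correction.

    ASSUMED form (Balding-Nichols); reduces to p^2 or 2pq when theta = 0.
    Pass one allele frequency for a homozygote, two for a heterozygote.
    """
    denom = (1.0 + theta) * (1.0 + 2.0 * theta)
    if q is None:
        return (2.0 * theta + (1.0 - theta) * p) * (3.0 * theta + (1.0 - theta) * p) / denom
    return 2.0 * (theta + (1.0 - theta) * p) * (theta + (1.0 - theta) * q) / denom


def lr_with_structure(theta, loci):
    """LR = 1 / product of corrected per-locus match probabilities."""
    match_prob = 1.0
    for allele_freqs in loci:
        match_prob *= theta_genotype_freq(theta, *allele_freqs)
    return 1.0 / match_prob


# Locus A heterozygote (0.3, 0.7); locus B ASSUMED homozygote (0.4).
example_loci = [(0.3, 0.7), (0.4,)]
for theta in (0.0, 0.01, 0.03):
    print(theta, round(lr_with_structure(theta, example_loci), 2))
```

For this example the LR falls as θ rises, so the corrected value is the more conservative one to report.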
Problem 2 There are two hypotheses about the ethnicity of the crime perpetrator from Problem 1, and the frequencies of the alleles from the suspect's profile in these groups are as follows:
Locus, Allele, Frequency in group 1, Frequency in group 2
D8S
D8S
D21S
D21S
D7S
Based on these allele frequencies, what would be the LR in each group:
a) In the absence of substructure?
b) If $F_{ST}$ = 0.01 in group 1 and $F_{ST}$ = 0.03 in group 2?
c) Which estimates should be used in court for each case, and how should the evidence be presented?
Problem 3 Starting with the profile from Problem 1, use OmniPop200.1 to calculate likelihood ratios for a large number of ethnic groups.
a) Determine how many additional CODIS loci would be necessary to obtain a likelihood ratio greater than in any of the ethnic groups included in the database.
b) Which 2 ethnic groups show the largest difference in likelihoods? How great is the difference? What do you conclude about the importance of population structure for likelihood calculations using the CODIS loci? Does this conclusion depend on the loci or the genotypes observed?
Assigning individuals to predefined populations based on LR
Let's say:
$H_1$ is the hypothesis that an individual is from population 1.
$H_2$ is the hypothesis that the individual is from population 2.
$G$ is the multilocus genotype of the individual.
$LR = \frac{P(G \mid H_1)}{P(G \mid H_2)}$
Assumptions: Both populations are in HWE. Selected loci are in linkage equilibrium.
Example: LR = 230 means “the individual is 230 times more likely to originate from population 1 than from population 2.”
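Under the two assumptions above, the likelihood of G in each population is the product of its HWE genotype frequencies across loci, and the LR is the ratio of those two products. The sketch below illustrates this with made-up allele frequencies; every name and number in it is hypothetical. An allele that was not detected in a population (frequency 0) drives that population's likelihood to zero, so the sketch reports an infinite LR in that case instead of dividing by zero.

```python
import math


def genotype_likelihood(genotype, allele_freqs):
    """P(G | population) under HWE and linkage equilibrium.

    genotype:     {locus: (allele_1, allele_2)}
    allele_freqs: {locus: {allele: frequency}}; absent alleles count as 0.
    """
    likelihood = 1.0
    for locus, (a1, a2) in genotype.items():
        p = allele_freqs[locus].get(a1, 0.0)
        q = allele_freqs[locus].get(a2, 0.0)
        likelihood *= p * p if a1 == a2 else 2.0 * p * q
    return likelihood


def assignment_lr(genotype, freqs_pop1, freqs_pop2):
    """LR = P(G | H1) / P(G | H2)."""
    l1 = genotype_likelihood(genotype, freqs_pop1)
    l2 = genotype_likelihood(genotype, freqs_pop2)
    if l2 == 0.0:
        # Genotype impossible in population 2 given these frequencies.
        return math.inf if l1 > 0.0 else math.nan
    return l1 / l2


# HYPOTHETICAL data for illustration only.
genotype = {"L1": ("a", "b"), "L2": ("c", "c")}
pop1 = {"L1": {"a": 0.5, "b": 0.4}, "L2": {"c": 0.6}}
pop2 = {"L1": {"a": 0.1, "b": 0.2}, "L2": {"c": 0.3}}

lr = assignment_lr(genotype, pop1, pop2)
print(round(lr, 1))  # 40.0
```

An infinite LR reflects the sampled allele frequencies only; a "not detected" allele may simply be rare, which is one reason the accuracy of allele frequency estimates matters for assignment.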
Assignment success depends on: 1. Number and variability of molecular markers. 2. Number of potential source populations. 3. Accuracy of allele frequency estimates. 4. $F_{ST}$ (the degree of differentiation among the source populations).
Problem 4: Glacier National Park (British Columbia).
Black bear: less dangerous. Brown bear: dangerous.
Bear feces: black or brown bear?
Locus, Allele, Frequency in brown bears, Frequency in black bears
G1A
G1A, 196, Not detected, 0.074
G1D
G10B, Not detected
G10B
Bear Profile
Locus, Allele 1, Allele 2
G1A
G1D, 178
G10B, 156, 158
Problem 5: Use GenAlEx to analyze the human_struc.xls and human_by_region.xls data and compare the success of population assignment in these two datasets.
Pop, Self Pop, Other Pop
AFRICA913
AMERICA
EAST_ASIA
EURASIA
OCEANIA
Total
Percent
Pop, Self Pop, Other Pop
BantuKenya75
BiakaPygmy
Mandenka
MbutiPygmy
Total
Percent