Published by Sherman Henry. Modified over 9 years ago.
1
Lab 8: Individual Identity and Population Assignment
2
A Simple Application
There’s a toy factory which produces two types of coins. One type is normal, like any other coin in the world, and the other one is “magic” because it produces a “heads” result on 90% of flips. However, there’s no difference between the coins in appearance or weight.
One night, Nathan decided to steal some coins from the factory, but he mixed up the two types. Now… What’s the probability of getting “heads” if Nathan flips one of the stolen coins?
Nathan learned something about probability in high school, so he decided to use empirical estimation of the probability to identify the coin types. According to Nathan: if the probability of getting heads is 0.5, the coin is normal, and if the probability of getting heads is 0.9, the coin is magic.
You catch Nathan in the act of testing his coins, and he seems pretty disturbed. When you ask him about it, he replies: “I got heads on 70% of the flips for some coins, so I don’t know how to classify those.”
Can you tell him which type of coin is more likely to yield 70% heads?
3
Bayesian inference
H, hypothesis: its probability may be affected by data.
E, evidence: new data that were not used in computing the prior probability.
P(H), prior probability: the probability of H before E is observed.
P(H|E), posterior probability: the probability of H after E is observed.
P(E|H), likelihood: the probability of observing E if H is true.
Posterior is proportional to likelihood times prior.
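Written out for the two-hypothesis case used in this lab, the proportionality above becomes Bayes' theorem. The labels H_1 and H_2 are the two competing hypotheses, and the normalizing constant P(E) is assumed to be the sum over exactly these two:

```latex
P(H_i \mid E) = \frac{P(E \mid H_i)\,P(H_i)}{P(E)},
\qquad
P(E) = P(E \mid H_1)\,P(H_1) + P(E \mid H_2)\,P(H_2)
```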
4
Now
H1: it’s a normal coin.
H2: it’s a magic coin.
The victim (the owner of the factory) also told you during your investigation: “To maximize profit, 90% of the products in our pipeline are magic coins, and only 10% are normal coins.”
Nathan tossed each coin ten times to decide its type.
Can you use the prior information to decide which type of coin is most likely to have yielded 70% heads?
5
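P(H1|E) = 0.117 × 0.1 / 0.063 = 0.186
P(H2|E) = 0.057 × 0.9 / 0.063 = 0.814
Here 0.117 and 0.057 are the binomial probabilities of 7 heads in 10 tosses for a normal and a magic coin, respectively, and 0.063 is P(E), the sum of the two likelihood × prior products.
Given the prior probabilities, a coin that yields 70% heads is most likely a magic one.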
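The arithmetic on this slide can be reproduced with a short Python sketch. It assumes the evidence E is exactly 7 heads in Nathan's 10 tosses and that the tosses are independent; the function names `binomial_pmf` and `posteriors` are illustrative, not from the lab.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k heads in n independent flips with P(heads) = p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def posteriors(likelihoods, priors):
    """Bayes' theorem over a finite set of mutually exclusive hypotheses."""
    joint = [lik * prior for lik, prior in zip(likelihoods, priors)]
    evidence = sum(joint)  # P(E), the normalizing constant
    return [j / evidence for j in joint]

# Assumption: E = 7 heads out of 10 tosses (70% heads).
n_flips, n_heads = 10, 7
lik_normal = binomial_pmf(n_heads, n_flips, 0.5)  # P(E|H1)
lik_magic = binomial_pmf(n_heads, n_flips, 0.9)   # P(E|H2)

# Priors from the factory owner: 10% normal, 90% magic.
post_normal, post_magic = posteriors([lik_normal, lik_magic], [0.1, 0.9])

print(f"P(E|H1) = {lik_normal:.3f}, P(E|H2) = {lik_magic:.3f}")
print(f"P(H1|E) = {post_normal:.3f}, P(H2|E) = {post_magic:.3f}")
```

Without rounding the intermediate values, the posteriors come out at about 0.185 and 0.815, which agrees with the slide up to rounding. Note that the likelihoods alone favor the normal coin (0.117 versus 0.057); it is the 90% prior for magic coins that reverses the conclusion.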
7
Individual Identity Scenario: Skin cells under the fingernails of the murder victim match the DNA profile of a suspected Sicilian hitman who was seen exiting the apartment. Genotyping of the sample gives a single DNA profile that matches the suspect.
8
Individual Identity
Ideally…
For the courtroom: LR = P(E|H1) / P(E|H2), with P(E|H1) = 1 and P(E|H2) = expected genotype frequency based on HWE.
9
Example
Locus   Allele 1 (frequency)   Allele 2 (frequency)
A       A1 (0.3)               A2 (0.7)
B       B1 (0.4)
10
Interpretation
“It is about 15 times more likely that the sample came from the suspect than from a random person unrelated to the suspect.”
Prosecutor’s Fallacy: “It is 15 times more likely that the sample is from the suspect than from someone else.”
Defense Attorney’s Fallacy: “Because the odds that the sample came from the suspect rather than someone else are only 15:1, there are hundreds of thousands of people who are just as likely to be the source of the sample found at the scene.”
11
Interpretation Prosecutor’s Fallacy (part 2): “Given the DNA evidence, the probability that the sample came from somebody other than the suspect is 0.0672”.
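The numbers 15 and 0.0672 in the interpretation slides can be checked with a small sketch. It assumes the example profile is heterozygous A1/A2 at locus A and homozygous B1/B1 at locus B; the second allele at locus B is an assumption, chosen because it reproduces the quoted 0.0672. The helper name `hwe_genotype_freq` is illustrative.

```python
from math import prod

def hwe_genotype_freq(allele1, allele2, freqs):
    """Expected genotype frequency under HWE: p^2 for homozygotes, 2pq for heterozygotes."""
    p, q = freqs[allele1], freqs[allele2]
    return p * p if allele1 == allele2 else 2 * p * q

# Allele frequencies from the example slide.
freqs = {"A1": 0.3, "A2": 0.7, "B1": 0.4}

# Assumption: locus B is homozygous B1/B1 (not stated explicitly in the text).
profile = [("A1", "A2"), ("B1", "B1")]

# P(E|H2): probability that a random unrelated person carries this profile.
# Loci are multiplied, which assumes linkage equilibrium.
match_prob = prod(hwe_genotype_freq(a, b, freqs) for a, b in profile)

# P(E|H1) = 1, because the suspect's own profile always matches.
likelihood_ratio = 1.0 / match_prob

print(f"P(E|H2) = {match_prob:.4f}")
print(f"LR = {likelihood_ratio:.1f}")
```

Under these assumptions 0.0672 is P(E|H2), the probability of the evidence given an unrelated source. Reading it as P(H2|E), as the second prosecutor's statement does, swaps the conditional and ignores the prior.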
12
Problem 1. The profile of a crime suspect genotyped for three of the Combined DNA Index System (CODIS) loci used by the U.S. Federal Bureau of Investigation is:
Suspect Profile
CODIS Locus   Allele 1 (frequency)   Allele 2 (frequency)
D8S1179       12 (0.1119)            14 (0.2238)
D21S11        28 (0.1049)            31 (0.0664)
D7S820        11 (0.2797)            11 (0.2797)
This profile perfectly matches the profile from a hair sample found at the crime scene.
a) Calculate the likelihood ratio if H1 is the hypothesis that the sample found at the crime scene is from the suspect and H2 is the hypothesis that the sample found at the crime scene is from a person unrelated to the suspect. Be sure to provide a strictly correct interpretation of this likelihood ratio.
b) Calculate the posterior probability P(H1|E) for each of the following scenarios:
Scenario   Prior Prob. P(H1)   Rationale for P(H1)
1          1/(6.6 × 10^9)      All ~6.6 billion people on the planet are considered equally likely to be the perpetrators.
2          1/(3.03 × 10^8)     All ~303 million U.S. citizens are considered equally likely to be the perpetrators.
3          1/56,000            All ~56,000 people (including students) currently living in Morgantown are considered equally likely to be the perpetrators.
4          1/24                All 24 students currently enrolled for BIOL 464 are considered equally likely to be the perpetrators.
Discuss the extent to which different prior probabilities affect posterior probabilities in b), and why this is relevant to forensic identification of samples at crime scenes.
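One way to set up the calculation for Problem 1 is sketched below. It assumes HWE within loci, independence across loci, P(E|H1) = 1, and P(H2) = 1 - P(H1); the function names and the scenario labels are illustrative.

```python
from math import prod

def hwe_genotype_freq(p1, p2, homozygous):
    """Expected genotype frequency under HWE: p^2 if homozygous, otherwise 2pq."""
    return p1 * p2 if homozygous else 2 * p1 * p2

# (allele 1 frequency, allele 2 frequency, homozygous?) from the suspect profile table.
suspect_profile = {
    "D8S1179": (0.1119, 0.2238, False),
    "D21S11": (0.1049, 0.0664, False),
    "D7S820": (0.2797, 0.2797, True),
}

# P(E|H2): probability that an unrelated person has the same three-locus profile.
random_match_prob = prod(hwe_genotype_freq(*locus) for locus in suspect_profile.values())
likelihood_ratio = 1.0 / random_match_prob  # P(E|H1) is taken to be 1

def posterior_h1(lr, prior):
    """P(H1|E) from the likelihood ratio and P(H1), assuming P(H2) = 1 - P(H1)."""
    return lr * prior / (lr * prior + (1.0 - prior))

# Prior probabilities from the four scenarios in part b).
priors = {
    "world": 1 / 6.6e9,
    "U.S.": 1 / 3.03e8,
    "Morgantown": 1 / 56_000,
    "BIOL 464": 1 / 24,
}

print(f"LR = {likelihood_ratio:.0f}")
for label, prior in priors.items():
    print(f"{label}: prior = {prior:.3g}, posterior = {posterior_h1(likelihood_ratio, prior):.3g}")
```

The same likelihood ratio gives posteriors that span many orders of magnitude across the four priors, which is the point of the discussion question.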
13
Accounting for Pop. Structure
Homozygous loci    Heterozygous loci
FST is typically much less than 0.01, but a conservative value of 0.01 is used with all ethnic groups. A value of 0.03 is used for Native Americans.
Typically, the ethnic group with the lowest LR (most conservative) is used for the courtroom calculation.
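The formulas for the two cases are not reproduced in this text. One widely used correction, assumed here and not necessarily the one shown on the original slide, is the Balding-Nichols match probability, where θ stands for FST and p_i, p_j are the allele frequencies:

```latex
% Homozygous locus A_i A_i
P(A_iA_i) = \frac{[\,2\theta + (1-\theta)\,p_i\,]\,[\,3\theta + (1-\theta)\,p_i\,]}{(1+\theta)(1+2\theta)}

% Heterozygous locus A_i A_j
P(A_iA_j) = \frac{2\,[\,\theta + (1-\theta)\,p_i\,]\,[\,\theta + (1-\theta)\,p_j\,]}{(1+\theta)(1+2\theta)}
```

Setting θ = 0 recovers the HWE expectations p_i^2 and 2 p_i p_j.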
14
Problem 2. There are two hypotheses about the ethnicity of the crime perpetrator from Problem 1 (group 1 and group 2), and the frequencies of the alleles from the profile of the suspect in these groups are as follows:
Locus     Allele   Frequency in group 1   Frequency in group 2
D8S1179   12       0.1119                 0.1622
D8S1179   14       0.2238                 0.1982
D21S11    28       0.1049                 0.0495
D21S11    31       0.0664                 0.0495
D7S820    11       0.2797                 0.3378
Based on the allele frequencies in the two ethnic groups, what would be the estimated likelihood ratio in each group:
a) In the absence of substructure.
b) If FST = 0.01 in group 1, and FST = 0.03 in group 2.
c) Which estimates should be used in court for each case and how should the evidence be presented (i.e., please provide a correct interpretation of the likelihood ratios)?
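A possible setup for Problem 2 is sketched below. It assumes the Balding-Nichols form of the FST correction (the lab's own formulas may differ), a suspect genotype of 12/14, 28/31, and 11/11 as in Problem 1, and independence across loci. All function and variable names are illustrative.

```python
from math import prod

def match_prob(p_i, p_j, homozygous, theta=0.0):
    """Genotype match probability with a Balding-Nichols correction (theta = FST).

    With theta = 0 this reduces to the HWE values p^2 and 2pq.
    """
    denom = (1 + theta) * (1 + 2 * theta)
    if homozygous:
        return (2 * theta + (1 - theta) * p_i) * (3 * theta + (1 - theta) * p_i) / denom
    return 2 * (theta + (1 - theta) * p_i) * (theta + (1 - theta) * p_j) / denom

# Per group: (frequency of allele 1, frequency of allele 2, homozygous?) for each locus.
groups = {
    "group 1": [(0.1119, 0.2238, False), (0.1049, 0.0664, False), (0.2797, 0.2797, True)],
    "group 2": [(0.1622, 0.1982, False), (0.0495, 0.0495, False), (0.3378, 0.3378, True)],
}
fst = {"group 1": 0.01, "group 2": 0.03}  # values given in part b)

def likelihood_ratio(loci, theta=0.0):
    """LR = 1 / P(E|H2), multiplying match probabilities across loci."""
    return 1.0 / prod(match_prob(p, q, hom, theta) for p, q, hom in loci)

for name, loci in groups.items():
    lr_hwe = likelihood_ratio(loci)
    lr_fst = likelihood_ratio(loci, fst[name])
    print(f"{name}: LR without substructure = {lr_hwe:.0f}, LR with FST = {fst[name]} is {lr_fst:.0f}")
```

For these allele frequencies the correction raises every per-locus match probability, so the corrected likelihood ratios are smaller (more conservative) than the uncorrected ones.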
15
Assigning individuals to predefined populations based on LR
Let’s say:
H1 is the hypothesis that an individual is from population 1.
H2 is the hypothesis that the individual is from population 2.
G is the multilocus genotype of the individual.
Assumptions: Both populations are in HWE. Selected loci are in linkage equilibrium.
Example: LR = 230 means “the individual is 230 times more likely to originate from population 1 than from population 2”.
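Under the two stated assumptions, the likelihood ratio factors into per-locus terms. The notation below is introduced here for illustration: G_l is the genotype at locus l, L is the number of loci, and p_{k,i} is the frequency of allele A_i in population k at that locus.

```latex
LR = \frac{P(G \mid H_1)}{P(G \mid H_2)}
   = \prod_{l=1}^{L} \frac{P(G_l \mid \text{population } 1)}{P(G_l \mid \text{population } 2)},
\qquad
P(G_l \mid \text{population } k) =
\begin{cases}
p_{k,i}^{2} & \text{if } G_l = A_iA_i \\
2\,p_{k,i}\,p_{k,j} & \text{if } G_l = A_iA_j,\ i \neq j
\end{cases}
```

Linkage equilibrium justifies the product over loci, and HWE justifies the per-locus genotype frequencies.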
16
Assignment success depends on:
1. Number and variability of molecular markers.
2. Number of potential source populations.
3. Accuracy of allele frequency estimates.
4. FST.
17
Modern Tech
Plants: 2-locus combination of rbcL + matK as the plant barcode (Hollingsworth et al. 2009)
Nuclear ribosomal internal transcribed spacer (ITS) for plants (Li et al. 2011)
16S rRNA sequencing for microbiomes
Biodiversity and operational taxonomic units (OTUs)
18
Problem 3. During a camping trip in Glacier National Park (British Columbia), you discover what appear to be bear feces not far from where you had pitched your tent with the intention to spend the following several weeks. Knowing that both brown bears (Ursus arctos) and black bears (Ursus americanus) inhabit this area, you decide to determine whether your neighbor is a brown or a black bear. You mail a sample to a friend at Stanford and, within a day, you receive the genotype of the bear for three microsatellite loci and allele frequency distributions for brown and black bears in the area (see tables below).
a) Is it more likely that the feces are from a black bear or from a brown bear? Provide a correct interpretation of the likelihood ratio as part of your answer.
b) Discuss the practical significance of your findings. If you are lacking ideas, consider the fact that black bears rarely attack humans, whereas brown bear attacks result in an average of two human casualties per year in the U.S.
Bear Profile
Locus   Allele 1   Allele 2
G1A     192        196
G1D     178
G10B    156        158
Allele Frequencies
Locus   Allele   Frequency in brown bears   Frequency in black bears
G1A     192      0.019                      0.224
G1A     196      Not detected               0.074
G1D     178      0.069                      0.152
G10B    156      0.032                      Not detected
G10B    158      0.278                      0.190
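A sketch of one way to approach part a) follows. Two things in it are assumptions rather than part of the problem statement: locus G1D is treated as homozygous 178/178, and every "Not detected" allele is given a small placeholder frequency `floor`, because a frequency of exactly zero would make both likelihoods zero and the ratio undefined. The value of `floor` and all names are hypothetical.

```python
from math import prod

# None marks an allele reported as "Not detected" in that species.
allele_freqs = {
    "brown": {("G1A", 192): 0.019, ("G1A", 196): None, ("G1D", 178): 0.069,
              ("G10B", 156): 0.032, ("G10B", 158): 0.278},
    "black": {("G1A", 192): 0.224, ("G1A", 196): 0.074, ("G1D", 178): 0.152,
              ("G10B", 156): None, ("G10B", 158): 0.190},
}

# Assumption: G1D is homozygous 178/178 (only one allele is listed in the profile).
bear_profile = {"G1A": (192, 196), "G1D": (178, 178), "G10B": (156, 158)}

def genotype_likelihood(profile, freqs, floor):
    """P(G | species) under HWE and linkage equilibrium.

    Undetected alleles are replaced by `floor`, a hypothetical small frequency.
    """
    def freq(locus, allele):
        value = freqs[(locus, allele)]
        return floor if value is None else value

    per_locus = []
    for locus, (a1, a2) in profile.items():
        p, q = freq(locus, a1), freq(locus, a2)
        per_locus.append(p * q if a1 == a2 else 2 * p * q)
    return prod(per_locus)

def lr_black_vs_brown(floor):
    """Likelihood ratio P(G | black bear) / P(G | brown bear)."""
    return (genotype_likelihood(bear_profile, allele_freqs["black"], floor)
            / genotype_likelihood(bear_profile, allele_freqs["brown"], floor))

for floor in (0.01, 0.001):
    print(f"floor = {floor}: LR (black vs. brown) = {lr_black_vs_brown(floor):.1f}")
```

With the same placeholder in both species, the ratio comes out at about 90 in favor of a black bear regardless of the placeholder's value, because each species has exactly one undetected allele in a heterozygous locus and the factor cancels. The result would change if the two species were given different minimum frequencies, for example ones based on their respective sample sizes.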
19
Problem 4: Use GenAlEx to analyze the human_struc.xls and human_by_region.xls data to see the success of population assignment in these two datasets.
a) Discuss the success of population assignment in both analyses. If the success was different in the two analyses, explain why. If it was not, explain why not.
b) Compare these results to the Structure results from Lab #7. Which approach gave the clearest answer? Which do you think is most appropriate for this dataset? Which do you think would be most appropriate for assigning a randomly selected human to a population of origin?
c) Discuss the practical significance of your findings. For example, can population assignment be used reliably in criminal investigations? If yes, explain under what circumstances. If not, explain why not.
20
Problem 5. GRADUATE STUDENTS ONLY: Find an application of population assignment in the literature. Describe the question or hypothesis that was addressed, the test(s) applied, and the general conclusions. Be sure to critique the method. Two points of extra credit will be awarded if you discover an improper application of population assignment in a peer-reviewed publication. Be sure to send the paper to me with your report.