Lab 8: Individual Identity and Population Assignment.


A Simple Application
There is a toy factory that produces two types of coins. One type is normal, like any other coin in the world, and the other is “magic” because it produces a “heads” result on 90% of flips. However, the two types do not differ in appearance or weight. One night, Nathan decided to steal some coins from the factory, but he mixed up the two types. Now, what is the probability of getting “heads” if Nathan flips one of the stolen coins?
Nathan learned something about probability in high school, so he decided to use an empirical estimate of the probability to identify the coin types. According to Nathan: if the probability of getting heads is 0.5, the coin is normal, and if the probability of getting heads is 0.9, the coin is magic.
You catch Nathan in the act of testing his coins, and he seems pretty disturbed. When you ask him about it, he replies: “I got heads on 70% of the flips for some coins, so I don’t know how to classify those.”
Can you tell him which type of coin is more likely to yield 70% heads?

Bayesian inference
H, hypothesis: the probability of H may be affected by data.
E, evidence: new data that were not used in computing the prior probability.
P(H), prior probability: the probability of H before E is observed.
P(H|E), posterior probability: the probability of H after E is observed.
P(E|H), likelihood: the probability of observing E if H is true.
Posterior is proportional to likelihood times prior: P(H|E) = P(E|H) P(H) / P(E).

Now
H1: it is a normal coin.
H2: it is a magic coin.
The victim (the owner of the factory) also told you during your investigation: “To maximize profit, 90% of the products in our pipeline are magic coins, and only 10% are normal coins.”
Nathan tossed each coin ten times to decide its type.
Can you use the prior information to decide which type of coin is more likely to have yielded 70% heads?

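Given the prior probabilities, the coin is most likely a magic one. The likelihoods 0.117 and 0.057 are the binomial probabilities of 7 heads in 10 tosses under H1 and H2, and P(E) = 0.117 × 0.1 + 0.057 × 0.9 = 0.063.
P(H1|E) = 0.117 × 0.1 / 0.063 = 0.186
P(H2|E) = 0.057 × 0.9 / 0.063 = 0.814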
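The coin calculation can be checked with a short script. This is a minimal sketch, assuming each coin is tossed ten times and shows seven heads (the 70% case); the function names are illustrative and not part of the lab. With unrounded likelihoods the posteriors come out near 0.185 and 0.815, which agrees with the slide's 0.814 up to rounding of the inputs.

```python
from math import comb


def binomial_likelihood(heads, flips, p_heads):
    """P(E|H): probability of `heads` heads in `flips` tosses if P(heads) = p_heads."""
    return comb(flips, heads) * p_heads**heads * (1 - p_heads) ** (flips - heads)


def posteriors(likelihoods, priors):
    """Bayes' rule for a set of mutually exclusive, exhaustive hypotheses."""
    joint = [lik * prior for lik, prior in zip(likelihoods, priors)]
    evidence = sum(joint)  # P(E), the normalizing constant
    return [j / evidence for j in joint]


# H1: normal coin (P(heads) = 0.5); H2: magic coin (P(heads) = 0.9)
like_normal = binomial_likelihood(7, 10, 0.5)
like_magic = binomial_likelihood(7, 10, 0.9)

# Priors from the factory owner: 10% normal, 90% magic
post_normal, post_magic = posteriors([like_normal, like_magic], [0.1, 0.9])

print(f"P(E|H1) = {like_normal:.3f}, P(E|H2) = {like_magic:.3f}")
print(f"P(H1|E) = {post_normal:.3f}, P(H2|E) = {post_magic:.3f}")
```

Without the prior, the normal coin has the higher likelihood of producing 70% heads; the 90% prior in favor of magic coins is what reverses the conclusion.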

Individual Identity
Scenario: Skin cells under the fingernails of the murder victim match the DNA profile of a suspected Sicilian hitman who was seen exiting the apartment. Genotyping of the sample gives a single DNA profile that matches the suspect.

Individual Identity
Ideally…
For the courtroom… LR = P(E|H1) / P(E|H2), where P(E|H1) = 1 and P(E|H2) is the expected frequency of the profile based on HWE.

Example
Locus   Allele 1 (frequency)   Allele 2 (frequency)
A       A1 (0.3)               A2 (0.7)
B       B1 (0.4)
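The example can be worked through numerically. The sketch below assumes the likelihood ratio is 1 divided by the expected Hardy-Weinberg frequency of the profile, multiplied across loci. The second allele at locus B is missing from the transcript; treating locus B as homozygous B1/B1 is an assumption, chosen because it reproduces the value of about 15 quoted on the following slide. All function names are illustrative.

```python
def hwe_genotype_frequency(allele_1, allele_2, freqs):
    """Expected genotype frequency under HWE: p^2 (homozygote) or 2pq (heterozygote)."""
    p, q = freqs[allele_1], freqs[allele_2]
    return p * p if allele_1 == allele_2 else 2 * p * q


def likelihood_ratio(profile, freqs):
    """LR = P(E|H1) / P(E|H2), with P(E|H1) = 1 and P(E|H2) the HWE profile frequency."""
    profile_frequency = 1.0
    for allele_1, allele_2 in profile:
        # Multiplying across loci assumes the loci are independent
        profile_frequency *= hwe_genotype_frequency(allele_1, allele_2, freqs)
    return 1.0 / profile_frequency


allele_freqs = {"A1": 0.3, "A2": 0.7, "B1": 0.4}
# ASSUMPTION: locus B is homozygous (B1/B1); the transcript lost its second allele
profile = [("A1", "A2"), ("B1", "B1")]

lr = likelihood_ratio(profile, allele_freqs)
print(f"LR = {lr:.2f}")
```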

Interpretation
“It is about 15 times more likely that the sample came from the suspect than from a random person unrelated to the suspect.”
Prosecutor’s Fallacy: “It is 15 times more likely that the sample is from the suspect than from someone else.”
Defense Attorney’s Fallacy: “Because the odds that the sample came from the suspect rather than someone else are only 15:1, there are hundreds of thousands of people who are just as likely to be the sources of the sample found at the scene.”

Interpretation Prosecutor’s Fallacy (part 2): “Given the DNA evidence, the probability that the sample came from somebody other than the suspect is ”.

Problem 1. The profile of a crime suspect genotyped for three of the Combined DNA Index System (CODIS) loci used by the U.S. Federal Bureau of Investigation is shown below. This profile perfectly matches the profile from a hair sample found at the crime scene.
Suspect Profile
CODIS Locus   Allele 1 (frequency)   Allele 2 (frequency)
D8S           (0.1119)               14 (0.2238)
D21S11        28 (0.1049)            31 (0.0664)
D7S           (0.2797)               11 (0.2797)
a) Calculate the likelihood ratio if H1 is the hypothesis that the sample found at the crime scene is from the suspect and H2 is the hypothesis that the sample found at the crime scene is from a person unrelated to the suspect. Be sure to provide a strictly correct interpretation of this likelihood ratio.
b) Calculate the posterior probability P(H1|E) for each of the following scenarios:
Scenario   Prior Prob. P(H1)   Rationale for P(H1)
1          1/(6.6 × 10^9)      All ~6.6 billion people on the planet are considered equally likely to be the perpetrators.
2          1/(3.03 × 10^8)     All ~303 million U.S. citizens are considered equally likely to be the perpetrators.
3          1/56,000            All ~56,000 people (including students) currently living in Morgantown are considered equally likely to be the perpetrators.
4          1/24                All 24 students currently enrolled for BIOL 464 are considered equally likely to be the perpetrators.
Discuss the extent to which different prior probabilities affect the posterior probabilities in b), and why this is relevant to forensic identification of samples at crime scenes.
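For part b), a likelihood ratio and a prior can be combined through the odds form of Bayes' rule: posterior odds = LR × prior odds. The sketch below assumes H1 and H2 are treated as the only two possibilities, and it uses a placeholder LR of 15 taken from the earlier two-locus example, not the answer to part a). The function name is illustrative.

```python
def posterior_from_lr(lr, prior_h1):
    """Posterior P(H1|E) from a likelihood ratio and a prior P(H1).

    Assumes H1 and H2 are exhaustive, so P(H2) = 1 - P(H1).
    """
    prior_odds = prior_h1 / (1.0 - prior_h1)
    posterior_odds = lr * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


# Priors from the four scenarios in the problem statement
priors = {
    "world": 1 / 6.6e9,
    "U.S.": 1 / 3.03e8,
    "Morgantown": 1 / 56_000,
    "BIOL 464": 1 / 24,
}

placeholder_lr = 15.0  # ASSUMPTION: illustrative value only, not the Problem 1 result

for label, prior in priors.items():
    print(f"{label}: P(H1|E) = {posterior_from_lr(placeholder_lr, prior):.3g}")
```

The same LR yields very different posteriors depending on how large the pool of plausible sources is, which is the point the discussion question is driving at.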

Accounting for Pop. Structure
Homozygous loci
Heterozygous loci
F_ST is typically much less than 0.01, but a conservative value of 0.01 is used with all ethnic groups. A value of 0.03 is used for Native Americans. Typically, the ethnic group with the lowest LR (most conservative) is used for the courtroom calculation.
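The formulas for homozygous and heterozygous loci did not survive in this transcript. One widely used correction is the pair of conditional match probabilities recommended in the 1996 National Research Council report (the Balding-Nichols form), with θ standing in for F_ST. The sketch below assumes, without confirmation from the slide, that these are the intended formulas, and applies them to the earlier two-locus example (again assuming locus B is homozygous). Names are illustrative.

```python
def match_probability(p_i, p_j, theta, homozygous):
    """Genotype match probability with a substructure correction (theta ~ F_ST).

    ASSUMPTION: NRC (1996) / Balding-Nichols conditional match probabilities;
    with theta = 0 these reduce to the HWE values p^2 and 2pq.
    """
    denom = (1 + theta) * (1 + 2 * theta)
    if homozygous:
        return (2 * theta + (1 - theta) * p_i) * (3 * theta + (1 - theta) * p_i) / denom
    return 2 * (theta + (1 - theta) * p_i) * (theta + (1 - theta) * p_j) / denom


def corrected_lr(loci, theta):
    """LR = 1 / product of per-locus match probabilities."""
    profile_probability = 1.0
    for p_i, p_j, homozygous in loci:
        profile_probability *= match_probability(p_i, p_j, theta, homozygous)
    return 1.0 / profile_probability


# Two-locus example: A1/A2 heterozygote, B1/B1 homozygote (the latter is assumed)
example_loci = [(0.3, 0.7, False), (0.4, 0.4, True)]

for theta in (0.0, 0.01, 0.03):
    print(f"theta = {theta}: LR = {corrected_lr(example_loci, theta):.2f}")
```

In this example the LR shrinks as θ grows, which is why a larger θ is described as the more conservative choice.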

Problem 2. There are two hypotheses about the ethnicity of the crime perpetrator from Problem 1 (group 1 and group 2), and the frequencies of the alleles from the suspect's profile in these groups are as follows:
Locus   Allele   Frequency in group 1   Frequency in group 2
D8S
D8S
D21S
D21S
D7S
Based on the allele frequencies in the two ethnic groups, what would be the estimated likelihood ratio in each group:
a) In the absence of substructure.
b) If F_ST = 0.01 in group 1 and F_ST = 0.03 in group 2.
c) Which estimates should be used in court for each case, and how should the evidence be presented (i.e., please provide a correct interpretation of the likelihood ratios)?

Assigning individuals to predefined populations based on LR
Let's say:
H1 is the hypothesis that an individual is from population 1.
H2 is the hypothesis that the individual is from population 2.
G is the multilocus genotype of the individual.
Assumptions: Both populations are in HWE. Selected loci are in linkage equilibrium.
Example: LR = 230 means “the individual is 230 times more likely to originate from population 1 than from population 2.”
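Under the two stated assumptions, an assignment likelihood ratio can be sketched as LR = P(G|H1) / P(G|H2), where each term is the product over loci of the HWE genotype frequency in that population. The genotype and allele frequencies below are hypothetical, invented for illustration, and the function names are not from the lab.

```python
import math


def genotype_probability(genotype, pop_freqs):
    """P(G | population): product of HWE genotype frequencies across loci.

    Multiplying across loci relies on linkage equilibrium.
    """
    probability = 1.0
    for locus, (allele_1, allele_2) in genotype.items():
        p = pop_freqs[locus].get(allele_1, 0.0)
        q = pop_freqs[locus].get(allele_2, 0.0)
        probability *= p * p if allele_1 == allele_2 else 2 * p * q
    return probability


def assignment_lr(genotype, pop1_freqs, pop2_freqs):
    """LR = P(G|H1) / P(G|H2); infinite if G is impossible under population 2."""
    numerator = genotype_probability(genotype, pop1_freqs)
    denominator = genotype_probability(genotype, pop2_freqs)
    return math.inf if denominator == 0.0 else numerator / denominator


# HYPOTHETICAL data: two loci, allele frequencies made up for this sketch
genotype = {"locus1": ("a", "b"), "locus2": ("c", "c")}
pop1 = {"locus1": {"a": 0.5, "b": 0.2}, "locus2": {"c": 0.6}}
pop2 = {"locus1": {"a": 0.1, "b": 0.4}, "locus2": {"c": 0.3}}

lr = assignment_lr(genotype, pop1, pop2)
print(f"LR (population 1 vs population 2) = {lr:.1f}")
```

An allele that was never sampled in one population gives a genotype probability of zero there, so the ratio becomes zero or infinite; how to treat such alleles is a separate modeling decision that this sketch does not address.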

Assignment success depends on:
1. Number and variability of molecular markers.
2. Number of potential source populations.
3. Accuracy of allele frequency estimates.
4. F_ST.

Modern Tech
Plants: 2-locus combination of rbcL+matK as the plant barcode (Hollingsworth et al. 2009)
Nuclear ribosomal internal transcribed spacer (ITS) for plants (Li et al. 2011)
16S rRNA sequencing for microbiomes
Biodiversity and operational taxonomic units (OTUs)

Problem 3. During a camping trip in Glacier National Park (British Columbia), you discover what appear to be bear feces not far from where you had pitched your tent with the intention of spending the following several weeks. Knowing that both brown bears (Ursus arctos) and black bears (Ursus americanus) inhabit this area, you decide to determine whether your neighbor is a brown or a black bear. You mail a sample to a friend at Stanford and, within a day, you receive the genotype of the bear for three microsatellite loci and allele frequency distributions for brown and black bears in the area (see tables below).
a) Is it more likely that the feces are from a black bear or from a brown bear? Provide a correct interpretation of the likelihood ratio as part of your answer.
b) Discuss the practical significance of your findings. If you are lacking ideas, consider the fact that black bears rarely attack humans, whereas brown bear attacks result in an average of two human casualties per year in the U.S.
Bear Profile
Locus, Allele 1, Allele 2
G1A
G1D 178
G10B
Locus, Allele, Frequency in brown bears, Frequency in black bears
G1A
G1A 196 Not detected 0.074
G1D
G10B Not detected
G10B

Problem 4. Use GenAlEx to analyze the human_struc.xls and human_by_region.xls data to assess the success of population assignment in these two datasets.
a) Discuss the success of population assignment in both analyses. If the success differed between the two analyses, explain why. If it did not, explain why not.
b) Compare these results to the Structure results from Lab #7. Which approach gave the clearest answer? Which do you think is most appropriate for this dataset? Which do you think would be most appropriate for assigning a randomly selected human to a population of origin?
c) Discuss the practical significance of your findings. For example, can population assignment be used reliably in criminal investigations? If yes, explain under what circumstances. If not, explain why not.

Problem 5. GRADUATE STUDENTS ONLY: Find an application of population assignment in the literature. Describe the question or hypothesis that was addressed, the test(s) applied, and the general conclusions. Be sure to critique the method. Two points of extra credit will be awarded if you discover an improper application of population assignment in a peer-reviewed publication. Be sure to send the paper to me with your report.