Multiple Comparisons Measures of LD Jess Paulus, ScD January 29, 2013.



Today’s topics
1. Multiple comparisons
2. Measures of linkage disequilibrium: D’ and r^2; r^2 and power

Multiple testing & significance thresholds
Concern about multiple testing: standard thresholds (p < 0.05) will lead to a large number of “significant” results, the vast majority of which are false positives. There are various approaches to handling this statistically.

Possible Errors in Statistical Inference

                                  Unobserved truth in the population
Observed in the sample            H_a: SNP prevents DM                H_0: No association
Reject H_0: SNP prevents DM       True positive (1 - β)               False positive, Type I error (α)
Fail to reject H_0: No assoc.     False negative, Type II error (β)   True negative (1 - α)

Probability of Errors
α: also known as the “level of significance”. It is the probability of a Type I error, i.e. rejecting the null hypothesis when it is in fact true (a false positive). It is typically set at 5%.
p value: the probability of obtaining a result as extreme as, or more extreme than, the one found in your study by chance alone, i.e. if the null hypothesis were true.

Type I Error (α) in Genetic and Molecular Research
If no SNP is truly associated, a genome-wide association scan of 500,000 SNPs is expected to yield:
25,000 false positives by chance alone using α = 0.05
5,000 false positives by chance alone using α = 0.01
500 false positives by chance alone using α = 0.001
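The counts on this slide are the number of tests multiplied by α, which is the expected number of Type I errors if every null hypothesis is true. A minimal Python sketch of that arithmetic follows; the function name `expected_false_positives` is illustrative and does not come from the slides.

```python
def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of Type I errors, assuming every null hypothesis is true."""
    return n_tests * alpha


# Reproduce the three thresholds discussed for a 500,000-SNP scan.
for alpha in (0.05, 0.01, 0.001):
    print(alpha, round(expected_false_positives(500_000, alpha)))
```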

Multiple Comparisons Problem
The multiple comparisons (or "multiple testing") problem occurs when one considers a set, or family, of statistical inferences simultaneously. The more tests are performed, the more likely it is that at least one Type I error occurs. Several statistical techniques have been developed to adjust for multiple comparisons, for example the Bonferroni adjustment.
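To illustrate why Type I errors become more likely, the sketch below computes the probability of at least one false positive across a family of tests. It assumes the tests are independent and that every null hypothesis is true. The slide does not state that assumption, and SNPs in LD are not independent, so treat the numbers as a rough guide only.

```python
def familywise_error_rate(n_tests: int, alpha: float) -> float:
    """P(at least one false positive) for n independent tests of true nulls."""
    return 1.0 - (1.0 - alpha) ** n_tests


# The chance of at least one false positive grows quickly with the number of tests.
for m in (1, 10, 100):
    print(m, familywise_error_rate(m, 0.05))
```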

Adjusting alpha
Standard Bonferroni correction: test each SNP at the α* = α/m_1 level, where m_1 = number of markers tested. Assuming m_1 = 500,000, the Bonferroni-corrected threshold is α* = 0.05/500,000 = 1 × 10^-7. The correction is conservative when the tests are correlated. Permutation or simulation procedures may increase power by accounting for test correlation.
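The Bonferroni threshold on this slide can be sketched in a few lines of Python. The function names and the toy p-values are hypothetical and chosen only for illustration. The sketch applies the plain α/m rule and does not implement the permutation or simulation procedures mentioned above.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level alpha* = alpha / m."""
    return alpha / n_tests


def bonferroni_significant(p_values, alpha: float = 0.05):
    """Flag each p-value that passes the Bonferroni-corrected threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p <= threshold for p in p_values]


print(bonferroni_threshold(0.05, 500_000))           # genome-wide example from the slide
print(bonferroni_significant([0.001, 0.02, 0.04]))   # toy p-values, threshold is 0.05 / 3
```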

Measures of LD

Haplotype definition
Haplotype: an ordered sequence of alleles at a subset of loci along a chromosome. Haplotype analysis moves from examining single genetic markers to examining sets of markers.

Measures of linkage disequilibrium
Basic data: table of haplotype frequencies

Observed haplotypes (n = 16):
AG ag AG ag Ag AG ag AG AG ag AG Ag ag AG ag AG

        A             a            Total
G       8             0            8 (50%)
g       2             6            8 (50%)
Total   10 (62.5%)    6 (37.5%)    16
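The 2 × 2 table is a tally of the 16 haplotypes listed on this slide. The Python sketch below shows how such a tally could be produced. The helper name `haplotype_table` is an assumption made for illustration.

```python
from collections import Counter

# The 16 observed haplotypes from the slide (first letter: A/a locus, second: G/g locus).
haplotypes = "AG ag AG ag Ag AG ag AG AG ag AG Ag ag AG ag AG".split()


def haplotype_table(haps):
    """Return {haplotype: (count, frequency)} for the four possible haplotypes."""
    counts = Counter(haps)
    n = len(haps)
    return {h: (counts[h], counts[h] / n) for h in ("AG", "aG", "Ag", "ag")}


for hap, (count, freq) in haplotype_table(haplotypes).items():
    print(hap, count, freq)
```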

D’ and r^2 are most common
Both measure correlation between two loci.
D’ (D prime) ranges from 0 [no LD] to 1 [complete LD].
r^2 (r squared) also ranges from 0 to 1; it is the squared correlation between alleles on the same chromosome.

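D
The deviation of the observed frequency of a haplotype from the frequency expected under independence is a quantity called the linkage disequilibrium (D). If two alleles are in LD, D ≠ 0. Linkage equilibrium means D = 0. Because the range of D depends on the allele frequencies, complete dependency between loci is indicated by the normalized measure D’ = 1 rather than by D itself reaching 1.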
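Written out for the two-locus example used in these slides (alleles A/a at one locus, G/g at the other), the definition takes the following form. The symbols p_AG, p_A and p_G are assumed notation for the AG haplotype frequency and the A and G allele frequencies; the slides do not name them. The numbers come from the 16-haplotype sample shown earlier (8 AG haplotypes, 10 A alleles, 8 G alleles).

```latex
\[
D = p_{AG} - p_A \, p_G
\]
\[
D = \frac{8}{16} - \frac{10}{16} \cdot \frac{8}{16} = 0.5 - 0.3125 = 0.1875
\]
```

Under linkage equilibrium the haplotype frequency equals the product of the allele frequencies, so D = 0.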

Notation: 2 × 2 table of haplotype counts

        A       a       Total
G       n11     n10     n1.
g       n01     n00     n0.
Total   n.1     n.0

Measure         Ref.
D’              Lewontin (1964)
Δ^2 = r^2       Hill and Weir (1994)
δ*              Levin (1953)
λ               Edwards (1963)
Q               Yule (1900)

Worked example

Observed haplotypes (n = 16):
AG ag AG ag Ag AG ag AG AG ag AG Ag ag AG ag AG

        A             a            Total
G       8             0            8 (50%)
g       2             6            8 (50%)
Total   10 (62.5%)    6 (37.5%)    16

D’ = (8 × 6 - 0 × 2) / (8 × 6) = 1
r^2 = (8 × 6 - 0 × 2)^2 / (10 × 6 × 8 × 8) = 0.6
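The worked example can be checked numerically. The sketch below uses the standard frequency-based definitions, which are an assumption here because the formula column of the original slide did not survive: D = p_AG - p_A p_G, D’ = |D| / D_max, and r^2 = D^2 / (p_A p_a p_G p_g). The function name `ld_measures` is hypothetical.

```python
def ld_measures(n_AG: int, n_aG: int, n_Ag: int, n_ag: int):
    """Return (D, D', r^2) from the four haplotype counts of a two-locus sample."""
    n = n_AG + n_aG + n_Ag + n_ag
    p_AG = n_AG / n
    p_A = (n_AG + n_Ag) / n          # frequency of allele A
    p_G = (n_AG + n_aG) / n          # frequency of allele G
    p_a, p_g = 1.0 - p_A, 1.0 - p_G

    denom = p_A * p_a * p_G * p_g
    if denom == 0:
        raise ValueError("both loci must be polymorphic")

    d = p_AG - p_A * p_G
    # D_max is the largest |D| attainable given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * p_g, p_a * p_G)
    else:
        d_max = min(p_A * p_G, p_a * p_g)
    d_prime = abs(d) / d_max
    r2 = d * d / denom
    return d, d_prime, r2


# Counts from the slide: AG = 8, aG = 0, Ag = 2, ag = 6.
print(ld_measures(8, 0, 2, 6))
```

For the slide's counts this gives D’ = 1 and r^2 = 0.6, matching the hand calculation. The pair also shows that D’ can signal complete LD while r^2 stays below 1.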

r^2 and power
r^2 is directly related to study power. A low r^2 means that a large sample size is required to detect the LD between the markers. r^2 × N is the “effective sample size”: if a marker M and causal gene G are in LD, then a study with N cases and controls which measures M (but not G) will have the same power to detect an association as a study with r^2 × N cases and controls that directly measured G.

r^2 and power
Example: N = 1000 (500 cases and 500 controls), r^2 = 0.4. If you had genotyped the causal gene directly, you would need a total of only r^2 × N = 0.4 × 1000 = 400 (200 cases and 200 controls) for the same power.
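The effective-sample-size rule from these two slides can be sketched as follows. Both function names are illustrative. The second function simply inverts the rule to ask how large a marker-based study must be to match a direct study of a given size.

```python
def effective_sample_size(n: float, r2: float) -> float:
    """Sample size of a direct study with the same power as a marker study of size n."""
    if not 0.0 <= r2 <= 1.0:
        raise ValueError("r2 must lie between 0 and 1")
    return r2 * n


def required_marker_sample_size(n_direct: float, r2: float) -> float:
    """Marker-study size needed to match a direct study of size n_direct."""
    if not 0.0 < r2 <= 1.0:
        raise ValueError("r2 must be greater than 0 and at most 1")
    return n_direct / r2


print(effective_sample_size(1000, 0.4))        # slide example: 1000 genotyped at the marker
print(required_marker_sample_size(400, 0.4))   # the inverse question
```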

Today’s topics
1. Multiple comparisons
2. Measures of linkage disequilibrium: D’ and r^2; r^2 and power