Power Calculations for GWAS


What is Power?
Power is the probability that we will detect a true association between a SNP and a phenotype. It is common to use a power of 80% or 90% for study design. A power of 80% means that there is an 80% chance that the association will be discovered, given the assumptions that you have made. The main assumptions concern the p-value threshold for significance, the sample size, and the odds ratio or relative risk.

Relative Risk
RR = (DR/TR) / (DP/TP)

                          Disease   Healthy   Total
Risk Allele Count         DR        HR        TR
Protective Allele Count   DP        HP        TP

DR/TR is the probability of disease with the risk allele. DP/TP is the probability of disease with the protective allele.

Example:

                          Disease   Healthy   Total
Risk Allele               220       180       400
Protective Allele         780       820       1600
Total                     1000      1000      2000

DR/TR = 220/400 = 0.550
DP/TP = 780/1600 = 0.488
Relative Risk = 1.128
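As a minimal sketch of the arithmetic on this slide, the relative risk can be computed directly from the four allele counts. The function name `relative_risk` and its argument names are chosen here for illustration; they simply mirror the DR, HR, DP, HP labels in the table.

```python
def relative_risk(dr, hr, dp, hp):
    """Relative risk from allele counts: (DR/TR) / (DP/TP)."""
    tr = dr + hr  # total risk-allele count
    tp = dp + hp  # total protective-allele count
    return (dr / tr) / (dp / tp)


# Counts from the example table on this slide.
rr = relative_risk(dr=220, hr=180, dp=780, hp=820)
print(round(rr, 3))
```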

Odds Ratio
OR = (DR/HR) / (DP/HP)

                          Disease   Healthy   Total
Risk Allele Count         DR        HR        TR
Protective Allele Count   DP        HP        TP

DR/HR is the ratio of diseased to healthy with the risk allele. DP/HP is the ratio of diseased to healthy with the protective allele.

Example:

                          Disease   Healthy   Total
Risk                      220       180       400
Protected                 780       820       1600
Total                     1000      1000      2000

DR/HR = 220/180 = 1.222
DP/HP = 780/820 = 0.951
Odds Ratio = 1.285
For comparison, the Relative Risk from the same counts is 1.128.
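The odds ratio calculation can be sketched the same way. The function name `odds_ratio` is illustrative only, and the arguments follow the DR, HR, DP, HP labels used in the table.

```python
def odds_ratio(dr, hr, dp, hp):
    """Odds ratio from allele counts: (DR/HR) / (DP/HP)."""
    odds_risk = dr / hr        # diseased : healthy among risk alleles
    odds_protective = dp / hp  # diseased : healthy among protective alleles
    return odds_risk / odds_protective


# Counts from the example table on this slide.
or_value = odds_ratio(dr=220, hr=180, dp=780, hp=820)
print(round(or_value, 3))
```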

Comparison of Relative Risk and Odds Ratio
The comparison is shown for a minor allele frequency of 0.3. The exact relationship will depend on MAF. The odds ratio will always give a larger estimate of effect than the relative risk, in the sense that it lies further from 1.
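The gap between the two measures can be illustrated with a short sketch. It assumes we know `p0`, the probability of disease with the protective allele (DP/TP in the earlier tables), and converts a relative risk into the odds ratio it implies. The function name `or_from_rr` and the baseline values in the loop are illustrative choices, not values from the slides.

```python
def or_from_rr(rr, p0):
    """Odds ratio implied by a relative risk `rr` and baseline risk `p0`.

    p0 is the probability of disease with the protective allele;
    the risk with the risk allele is then p1 = rr * p0.
    """
    p1 = rr * p0
    if not 0 < p1 < 1:
        raise ValueError("rr * p0 must lie strictly between 0 and 1")
    return (p1 / (1 - p1)) / (p0 / (1 - p0))


# The odds ratio moves further from the relative risk as baseline risk grows.
for p0 in (0.01, 0.10, 0.30, 0.50):
    print(p0, round(or_from_rr(1.2, p0), 3))
```

When the baseline risk is small the two measures are close, which is why the odds ratio is often read as an approximation to the relative risk for rare diseases.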

Assumptions required for Power calculation
Power depends on:
Inheritance model
Sample size
Number of independent tests (SNPs tested)
Minor allele frequency
Probability of having the phenotype if you have the risk allele (odds ratio)
Linkage disequilibrium between the marker SNP and the causative SNP

Inheritance Models
Mendelian trait: Phenotype controlled by a single locus. Easiest to detect. All affected individuals will have the risk allele. Some unaffected individuals may have the risk allele (<100% penetrance). Example: Huntington's disease. Easily discovered in small sets of families.
Quantitative trait: Phenotype controlled by multiple loci, e.g. hundreds of loci have been found associated with height. Not all individuals with the phenotype will have the causative allele at a particular locus. A critical mass of causative loci may be required before the phenotype develops, e.g. an individual might need 50 risk alleles, out of many hundred possible risk alleles, to develop cancer. We need to discover a statistically significant excess of an allele in the affected population over the control population. We are measuring the increased risk of exhibiting the phenotype associated with each variant.

Inheritance model
Dominant: These are the easiest to detect, as the contribution to risk from a heterozygote will be the same as that from a homozygote. Therefore the association with the locus will be stronger. These were made famous by Mendel's peas, but I am not aware of any examples associated with quantitative traits and we will ignore them.
Additive (co-dominant): Both alleles contribute approximately the same amount to the risk of disease. This is the commonest mode of inheritance and the one that we will assume.
Recessive: This is the hardest to detect, since only homozygotes will be at risk of disease and these may be rare in the population. E.g. an allele present at 10% frequency will be homozygous in only 1% of the population (Hardy-Weinberg). Can be maintained by balancing selection, e.g. sickle cell anaemia.
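The Hardy-Weinberg figure quoted for the recessive case can be checked with a small sketch. It assumes a biallelic locus in Hardy-Weinberg equilibrium; the function name `hardy_weinberg` and the genotype labels `AA`, `Aa`, `aa` are illustrative.

```python
def hardy_weinberg(q):
    """Expected genotype frequencies for a biallelic locus.

    q is the frequency of allele 'a'; p = 1 - q is the frequency of 'A'.
    Returns p^2, 2pq and q^2 for genotypes AA, Aa and aa.
    """
    p = 1 - q
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}


# An allele at 10% frequency: homozygotes 'aa' are expected at q^2.
freqs = hardy_weinberg(0.10)
print({genotype: round(freq, 4) for genotype, freq in freqs.items()})
```

With q = 0.10 the homozygous class is about 1% of the population, while heterozygotes make up about 18%. This is why a recessive effect leaves far fewer at-risk individuals to observe than a dominant or additive effect at the same allele frequency.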

Population size
Increasing population size will increase the power to detect a given odds ratio. The plot shows how power increases with population size for a small odds ratio. The plot is highly dependent on the particular odds ratio chosen.
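The relationship between sample size and power can be sketched with a normal approximation to a two-sided allelic test (comparing risk-allele frequency in cases and controls). This is a rough illustration under stated assumptions, not the method used to draw the plot: the allele frequency in controls is taken to equal the MAF, each person contributes two independent alleles, and the function names and parameter values (MAF 0.3, odds ratio 1.2) are chosen here for the example.

```python
from math import sqrt
from statistics import NormalDist


def case_allele_freq(maf, odds_ratio):
    """Risk-allele frequency in cases implied by the control frequency and allelic OR."""
    odds = odds_ratio * maf / (1 - maf)
    return odds / (1 + odds)


def allelic_test_power(n_cases, n_controls, maf, odds_ratio, alpha=0.05):
    """Approximate power of a two-sided test of allele frequency difference.

    Assumes a normal approximation with unpooled variance and
    2 * n alleles sampled from each group.
    """
    p0 = maf
    p1 = case_allele_freq(maf, odds_ratio)
    se = sqrt(p1 * (1 - p1) / (2 * n_cases) + p0 * (1 - p0) / (2 * n_controls))
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)  # critical value for the chosen alpha
    z_effect = abs(p1 - p0) / se        # expected test statistic under the alternative
    return nd.cdf(z_effect - z_crit) + nd.cdf(-z_effect - z_crit)


# Power rises with the number of cases and controls for a fixed small odds ratio.
for n in (500, 1000, 2000, 5000):
    print(n, round(allelic_test_power(n, n, maf=0.3, odds_ratio=1.2), 3))
```

Passing a smaller `alpha` (for example a threshold corrected for many tests) lowers the power at every sample size, so the curve shifts to the right.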

Multiple Testing
We will be testing thousands of SNP loci, which are assumed to have independent effects. If we test 100 loci using a 5% alpha, then we would expect to get 5 positive associations even if all the data were completely random. We will use the Bonferroni correction to control for this: divide the alpha by the number of loci tested. If we use 100 SNP loci, then we would set the required alpha to 0.05/100 = 0.0005.
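The two calculations on this slide can be written as a short sketch. The function names `expected_false_positives` and `bonferroni_alpha` are illustrative; the numbers are the ones used on the slide.

```python
def expected_false_positives(alpha, n_tests):
    """Expected number of significant results when no locus has a true effect."""
    return alpha * n_tests


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance threshold under the Bonferroni correction."""
    return alpha / n_tests


n_loci = 100
print(expected_false_positives(0.05, n_loci))  # false positives expected at alpha = 0.05
print(bonferroni_alpha(0.05, n_loci))          # corrected per-test alpha
```

The same function applies to larger studies: the corrected threshold shrinks in proportion to the number of loci tested, which in turn reduces power unless the sample size is increased.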

Effect of Number of Tests on Power
Hong, E. P. & Park, J. W. Sample Size and Statistical Power Calculation in Genetic Association Studies. Genomics Inform 10, 117 (2012).

Effect of Minor Allele Frequency on Power
Hong, E. P. & Park, J. W. Sample Size and Statistical Power Calculation in Genetic Association Studies. Genomics Inform 10, 117 (2012).

Effect of Disease Prevalence on Power
Hong, E. P. & Park, J. W. Sample Size and Statistical Power Calculation in Genetic Association Studies. Genomics Inform 10, 117 (2012).

Effect of Linkage Disequilibrium on Power
Hong, E. P. & Park, J. W. Sample Size and Statistical Power Calculation in Genetic Association Studies. Genomics Inform 10, 117 (2012).

Exercise
In the folder “Power Analysis” on your flash disk there is a Word document, “Power Analysis.docx”. Please open it and follow the instructions.