Statistical Power and Meta-analysis

Statistical Power and Meta-analysis Pak Sham, Ben Neale, Meike Bartels International Workshop on Statistical Genetic Methods for Human Complex Traits March 3, 2015

What is statistical power? Statistical power is the probability of rejecting the null hypothesis in a statistical test when the alternative hypothesis is true. The concept of statistical power was introduced by Neyman and Pearson, extending Fisher’s work on significance testing.

Error rates. The type 1 error rate (α) is the probability of rejecting the null hypothesis when it is true. The type 2 error rate (β) is the probability of NOT rejecting the null hypothesis when the alternative hypothesis is true (power = 1-β). The false positive report rate is the probability that a null hypothesis is true when a significant result occurs.

Why calculate power? To determine if a study has a reasonable chance of success (detecting a true effect); to design a study that has the greatest power given available resources; to determine the minimum required sample size in a study. Usually obligatory in grant proposals.

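Both type 1 and type 2 error rates affect the false positive rate. Of 1000 tests, 990 are under H0 and 10 are under H1. Each group splits into non-significant (NS) and significant (S) results: 990α significant results are expected under H0 and 10(1-β) under H1. What is the false positive report rate?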

Bayes’ theorem: false positive report rate = α π0 / [α π0 + (1-β)(1-π0)], where α = type 1 error rate, β = type 2 error rate, π0 = prior probability of null hypothesis
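As a rough illustration of how α, β and π0 combine, the sketch below applies Bayes' theorem to the 1000-test example (990 tests under H0, so π0 = 0.99). The function name and the values α = 0.05 and power = 0.8 are assumptions chosen for illustration only.

```python
def false_positive_report_rate(alpha: float, beta: float, pi0: float) -> float:
    """P(H0 is true | result is significant), by Bayes' theorem."""
    false_positives = alpha * pi0                 # significant results arising under H0
    true_positives = (1.0 - beta) * (1.0 - pi0)   # significant results arising under H1
    return false_positives / (false_positives + true_positives)


# Illustrative (assumed) values: alpha = 0.05, power = 0.8, 990 of 1000 tests under H0
fprr = false_positive_report_rate(alpha=0.05, beta=0.2, pi0=0.99)
print(f"False positive report rate: {fprr:.3f}")
```

With these assumed inputs most significant results are false positives, which is why both error rates and the prior matter.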

Candidate gene studies. Hirschhorn et al. 2002 reviewed 166 putative single allelic associations with 2 or more replication attempts: 6 reliably replicated (≥75% positive replications); 97 with at least 1 replication; 63 with no subsequent replications. Other such surveys reached similar conclusions (Ioannidis 2003; Ioannidis et al. 2003; Lohmueller et al. 2003). Probably caused by widespread multiple testing, inadequate control of type 1 error rate, and inadequate statistical power.

Power calculation “triad” Effect size Sample size Power Given any two, calculate the third

Defining the effect The impact of the predictor variable (e.g. genotype) on the outcome variable (e.g. disease risk). Example:

Odds Ratio vs Risk Ratio Conversion between OR and RR requires knowledge of population risk (see So et al, Genetic Epi, 2011)
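A minimal sketch of one common conversion, assuming the risk in the reference (unexposed) group is known: RR = OR / (1 - p0 + p0 × OR). This is the textbook identity, not necessarily the procedure of So et al.; the function name and example values are hypothetical.

```python
def odds_ratio_to_risk_ratio(odds_ratio: float, baseline_risk: float) -> float:
    """Convert an odds ratio to a risk ratio, given the risk in the reference group."""
    return odds_ratio / (1.0 - baseline_risk + baseline_risk * odds_ratio)


# For a rare outcome the two measures nearly coincide; for a common one they diverge.
print(odds_ratio_to_risk_ratio(2.0, 0.001))
print(odds_ratio_to_risk_ratio(2.0, 0.10))
```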

Quantitative trait For trait X: Mean trait difference = E(X|G1) – E(X|G0)

Some complications. A biallelic locus has 3 possible genotypes. Comparing each genotype (AA and AB) to a reference genotype (BB) gives rise to 2 effect sizes. Simplification to a single effect size is only possible under particular model assumptions. Dominant model: AA and AB have the same effect. Recessive model: AB has no effect. Multiplicative model: the effect of AA equals the square of the effect of AB. Additive model: the effect of AA equals twice the effect of AB.
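The four models can be sketched as genotype codings applied to a single effect size. This is an illustrative encoding, with hypothetical names, in which the multiplicative model is treated as additive on the log risk-ratio scale.

```python
import math

# Genotype coding for (BB, AB, AA), with BB as the reference genotype
CODING = {
    "dominant": (0, 1, 1),   # AA and AB have the same effect
    "recessive": (0, 0, 1),  # AB has no effect
    "additive": (0, 1, 2),   # effect of AA is twice the effect of AB
}


def genotype_effects(model: str, beta: float) -> tuple:
    """Effects of (BB, AB, AA) relative to BB on a difference (or log) scale."""
    return tuple(beta * code for code in CODING[model])


def multiplicative_risk_ratios(rr_ab: float) -> tuple:
    """Risk ratios for (BB, AB, AA): additive on the log scale, so RR(AA) = RR(AB)^2."""
    return tuple(math.exp(e) for e in genotype_effects("additive", math.log(rr_ab)))


print(genotype_effects("dominant", 0.5))
print(multiplicative_risk_ratios(2.0))  # risk ratios of about 1, 2 and 4
```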

How to set the effect size? Replication studies Take effect size of original study (possibly with adjustment for winner’s curse if original study involved extensive multiple testing) Original studies Take typical effect sizes found by previous studies of similar phenotypes and similar genetic variants Take smallest effect size considered to be scientifically interesting Often desirable to consider a range of plausible effect sizes and present results in tables or graphs

Winner’s curse. Suppose 100 independent SNPs on a SNP chip each has 1% power to reach critical genome-wide significance. The probability that at least one SNP achieves genome-wide significance is 1 - 0.99^100 ≈ 0.63. The estimated effect size of the most significant SNP will also be much greater than its true effect size. A replication study with identical design and sample size has only a 1% chance of replicating a particular SNP at the same genome-wide level of significance.
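The arithmetic and the inflation of significant estimates can be checked with a small simulation. This is a sketch under stated assumptions: the genome-wide threshold is taken as a two-tailed p of 5e-8 (not given in the text), only the upper tail is considered, and the seed and number of draws are arbitrary.

```python
import random
from statistics import NormalDist, mean

N_SNPS = 100
POWER_EACH = 0.01

# Probability that at least one of the independent SNPs reaches significance
p_any_hit = 1 - (1 - POWER_EACH) ** N_SNPS

std = NormalDist()
z_crit = -std.inv_cdf(5e-8 / 2)  # assumed genome-wide threshold on the z scale
# True mean z that gives 1% power: P(Z > z_crit) = 0.01
true_mean_z = z_crit - std.inv_cdf(1 - POWER_EACH)

rng = random.Random(2015)  # arbitrary seed for reproducibility
draws = [rng.gauss(true_mean_z, 1.0) for _ in range(200_000)]
significant = [z for z in draws if z > z_crit]
mean_significant_z = mean(significant)

print(f"P(at least one hit) = {p_any_hit:.3f}")
print(f"true mean z = {true_mean_z:.2f}, mean z among significant = {mean_significant_z:.2f}")
```

Every draw that passes the threshold necessarily overestimates the true mean z, which is the winner's curse in miniature.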

Allele frequency Power is also influenced by the variance of the independent variable For a locus with allele frequency p, coded as 0, 1, 2 (additive model) and in Hardy-Weinberg equilibrium, the population variance of the genotype is 2p(1-p)
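The variance formula can be verified directly from the Hardy-Weinberg genotype frequencies. The function name below is hypothetical.

```python
def genotype_variance(p: float) -> float:
    """Variance of a 0/1/2 allele count under Hardy-Weinberg equilibrium."""
    freqs = {0: (1 - p) ** 2, 1: 2 * p * (1 - p), 2: p ** 2}
    mean_count = sum(g * f for g, f in freqs.items())  # equals 2p
    return sum(f * (g - mean_count) ** 2 for g, f in freqs.items())


for p in (0.05, 0.1, 0.5):
    print(p, genotype_variance(p), 2 * p * (1 - p))
```

The variance peaks at p = 0.5, so for a fixed per-allele effect common variants are easier to detect than rare ones.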

Sample size plot [figure: curves for OR = 1.2, 1.3, 1.5 and 2.0; Wang et al. (2005)]

Power calculation via NCP. Sample size, effect size and allele frequency determine the NCP; the NCP and α determine power.

NCP of chi-squared test. If Z ~ N(μ, 1), then Z² ~ χ²(df = 1, NCP = μ²), where df = degrees of freedom and NCP = non-centrality parameter. Mean of Z² = df + NCP. Nice properties of NCP for direct association: NCP ∝ sample size; NCP ∝ (a function of) the effect size of the genotype; NCP ∝ variance of genotype.

NCP determines power [figure label: Rejection of H0]

NCP determines power
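For a 1-df test the link between NCP and power can be sketched with the normal distribution alone, since Z² exceeds the critical chi-squared value exactly when |Z| exceeds its square root. The function name is hypothetical; only the Python standard library is used.

```python
from statistics import NormalDist

_STD = NormalDist()


def power_from_ncp(ncp: float, alpha: float) -> float:
    """Power of a 1-df chi-squared test with the given NCP and type 1 error rate."""
    mu = ncp ** 0.5                           # mean of Z, since NCP = mu^2
    z_crit = _STD.inv_cdf(1 - alpha / 2)      # square root of the critical chi-squared
    upper_tail = 1 - _STD.cdf(z_crit - mu)    # P(Z > z_crit)
    lower_tail = _STD.cdf(-z_crit - mu)       # P(Z < -z_crit)
    return upper_tail + lower_tail


print(round(power_from_ncp(5.933, 0.01), 4))
```

With NCP = 0 the function returns α, as expected under the null hypothesis.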

Linear regression: Y = α + βX + ε. H0: β = 0, usually tested with a t-test or F-test. In large samples, t ≈ Normal and F ≈ chi-squared. The NCP is directly related to the proportion of variance explained by the QTL when the residual variance is close to the trait variance.

Variance explained. For a binary disease trait determined by a liability-threshold model, the proportion of variance in liability explained by a SNP with allele frequency p and allelic odds ratio θ is approximately [formula not preserved in the transcript]. This, together with N, determines the NCP for simple random samples of the population. Note that the regression coefficient in a logistic regression represents ln(OR). Pawitan Y et al, PLOSone, 2009; So HC et al, Genet Epi, 2011

Indirect association. If a direct association study of a causal SNP would provide an NCP of λ, then an indirect association study of a SNP in LD with the causal SNP has an NCP of R²λ, i.e. the sample size needed to achieve the same power is increased by a factor of 1/R². Sham et al, Am J Hum Genet 2000
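As a hedged illustration, the sketch below derives R² from D' and allele frequencies (assuming positive D) and applies the 1/R² sample size factor. The input values are those of Exercise 1 in these slides; the function names are hypothetical.

```python
def r_squared(d_prime: float, p_causal: float, p_marker: float) -> float:
    """Squared correlation between two biallelic loci, from D' (positive D assumed)."""
    d_max = min(p_causal * (1 - p_marker), (1 - p_causal) * p_marker)
    d = d_prime * d_max
    return d * d / (p_causal * (1 - p_causal) * p_marker * (1 - p_marker))


def indirect_ncp(direct_ncp: float, r2: float) -> float:
    """NCP at a marker in LD with the causal SNP."""
    return r2 * direct_ncp


def sample_size_inflation(r2: float) -> float:
    """Factor by which sample size must grow to keep the same power."""
    return 1.0 / r2


r2 = r_squared(d_prime=0.8, p_causal=0.05, p_marker=0.1)
print(f"r^2 = {r2:.3f}, sample size factor = {sample_size_inflation(r2):.2f}")
```

A high D' can still correspond to a modest R² when allele frequencies differ, which is why R² rather than D' governs power.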

Selecting extremes. Selecting individuals with extreme (very low or very high) phenotypic values for genotyping can improve study efficiency: NCP_S / NCP_P = Var_S / Var_P, where S denotes the selected sample and P the population.

Repeated measurements. If the quantitative trait has test-retest correlation r, then taking the average of k independent measurements reduces the measurement variance from 1-r to (1-r)/k. The variance explained by a causal locus, and therefore the NCP, increases by a factor of 1 / [r + (1-r)/k] = k / [1 + (k-1)r] (with the single-measurement trait variance standardized to 1).
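A small sketch of the gain from averaging, assuming the single-measurement trait variance is standardized to 1 so that the stable component has variance r and the measurement error 1-r. The function name is hypothetical.

```python
def ncp_gain_from_repeats(r: float, k: int) -> float:
    """Factor by which variance explained (and NCP) grows when averaging k measurements."""
    variance_of_mean = r + (1 - r) / k  # stable part plus reduced measurement error
    return 1.0 / variance_of_mean


for k in (1, 2, 4, 10):
    print(k, round(ncp_gain_from_repeats(0.5, k), 3))
```

The gain is bounded above by 1/r, so repeated measurement helps most for traits with low test-retest correlation.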

Genetic Power Calculator (GPC) http://pngu.mgh.harvard.edu/~purcell/gpc/ Purcell, Cherny and Sham, Bioinformatics, 2003

Exercise 1. Candidate gene case-control study. Disease prevalence: 2%. Genotype risk ratio Aa = 2, genotype risk ratio AA = 4. Frequency of high risk disease allele = 0.05. Frequency of associated marker allele = 0.1. Linkage disequilibrium D-Prime = 0.8. Sample size: 500 cases, 500 controls. Type 1 error rate: 0.01. Calculate: marker allele frequencies in cases and controls, NCP, power.

Exercise 2. For a discrete trait TDT study. Assumptions: same as in Exercise 1. Sample size: 500 parent-offspring trios. Type 1 error rate: 0.01. Calculate: ratio of transmission of marker alleles from heterozygous parents, NCP, power.

Exercise 3. For the same assumptions as Exercise 1, find the type 1 error rates that correspond to 80% power for sample sizes of: 500 cases, 500 controls; 1000 cases, 1000 controls; 2000 cases, 2000 controls.

Answer to Exercise 1. Frequencies of the associated marker allele in cases and controls are 13.43% and 9.93%, respectively. NCP = 5.933. Power = 0.4443.
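These numbers can be approximately reproduced by hand. The sketch below is not the Genetic Power Calculator's code: it assumes Hardy-Weinberg equilibrium, positive LD, and an allelic 1-df test on 2N alleles per group, and all names are hypothetical.

```python
from statistics import NormalDist


def exercise1(prev=0.02, grr_het=2.0, grr_hom=4.0, p_risk=0.05, p_marker=0.10,
              d_prime=0.8, n_cases=500, n_controls=500, alpha=0.01):
    q_risk = 1 - p_risk
    # Baseline penetrance (genotype aa) implied by prevalence and genotype risk ratios
    f0 = prev / (q_risk ** 2 + 2 * p_risk * q_risk * grr_het + p_risk ** 2 * grr_hom)
    pen = {0: f0, 1: f0 * grr_het, 2: f0 * grr_hom}  # keyed by number of risk alleles
    # Haplotype frequencies, keyed by (carries risk allele, carries marker allele)
    d = d_prime * min(p_risk * (1 - p_marker), q_risk * p_marker)
    hap = {(1, 1): p_risk * p_marker + d, (1, 0): p_risk * (1 - p_marker) - d,
           (0, 1): q_risk * p_marker - d, (0, 0): q_risk * (1 - p_marker) + d}

    def risk_given_allele(a):
        # P(affected | one haplotype has a risk alleles), averaged over the other haplotype
        return p_risk * pen[a + 1] + q_risk * pen[a]

    case_freq = sum(f * risk_given_allele(r)
                    for (r, m), f in hap.items() if m == 1) / prev
    ctrl_freq = sum(f * (1 - risk_given_allele(r))
                    for (r, m), f in hap.items() if m == 1) / (1 - prev)
    # Allelic test NCP, counting 2N alleles in each group
    a1, a2 = 2 * n_cases, 2 * n_controls
    p_bar = (a1 * case_freq + a2 * ctrl_freq) / (a1 + a2)
    ncp = (a1 * a2 / (a1 + a2)) * (case_freq - ctrl_freq) ** 2 / (p_bar * (1 - p_bar))
    # Power of a 1-df test via the normal distribution
    std = NormalDist()
    z_crit = std.inv_cdf(1 - alpha / 2)
    mu = ncp ** 0.5
    power = (1 - std.cdf(z_crit - mu)) + std.cdf(-z_crit - mu)
    return case_freq, ctrl_freq, ncp, power


case_freq, ctrl_freq, ncp, power = exercise1()
print(f"cases {case_freq:.4f}, controls {ctrl_freq:.4f}, NCP {ncp:.3f}, power {power:.4f}")
```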

Answer to Exercise 2. Ratio of transmission of high-risk and low-risk alleles from heterozygous parents to affected children is 0.2417:0.1731. NCP = 5.667. Power = 0.4226.

Answer to Exercise 3

Sample size   NCP      Critical χ²   Critical α
500           5.933    2.54164       .111
1000          11.866   6.77605       .00924
2000          23.732   16.2403       .0000558

NCPs obtained by multiplication of NCP from Ex 1. Critical χ² obtained from inverse non-central chi-square distribution function. Critical α obtained from chi-square distribution function (with NCP set to 0).
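The table can be checked without a non-central chi-squared routine. The sketch below uses the 1-df equivalence with a shifted normal and solves for the critical value by bisection; the function names are hypothetical and the bisection stands in for the inverse distribution function named above.

```python
from statistics import NormalDist

STD = NormalDist()


def power_at(ncp: float, z_crit: float) -> float:
    """Power of a 1-df test that rejects when |Z| > z_crit, with Z ~ N(sqrt(NCP), 1)."""
    mu = ncp ** 0.5
    return (1 - STD.cdf(z_crit - mu)) + STD.cdf(-z_crit - mu)


def critical_values(ncp: float, target_power: float = 0.8):
    """Return (critical chi-squared, type 1 error rate) giving the target power."""
    lo, hi = 0.0, 40.0
    for _ in range(200):  # power falls monotonically as z_crit grows
        mid = (lo + hi) / 2
        if power_at(ncp, mid) > target_power:
            lo = mid
        else:
            hi = mid
    z_crit = (lo + hi) / 2
    return z_crit ** 2, 2 * STD.cdf(-z_crit)


for n, ncp in ((500, 5.933), (1000, 11.866), (2000, 23.732)):
    chi2_crit, alpha = critical_values(ncp)
    print(n, ncp, round(chi2_crit, 5), f"{alpha:.3g}")
```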

Meta-analysis. Combine data from multiple studies to increase statistical power and obtain more precise effect size estimates. Uses summary statistics from each study, rather than raw data: effect estimates (β) + standard errors; test statistics (Z) + sample sizes; p-values.

Steps in meta-analysis: (1) Identify relevant studies. (2) Obtain agreement for participation. (3) Ensure uniformity in phenotype definition, marker set (imputation), analysis method, and file format for summary statistics. (4) Share summary statistics files. (5) Combine the summary statistics of studies to give “meta-statistics” (Z, β, p, etc). (6) Look for evidence of heterogeneity in effect size. (7) Check for signs of publication bias.

Phenotype definition Make sure phenotypes have same definition in different studies. If not, use Z-combination, p-combination, or re-scale β’s and their standard errors before combining

Strand orientation. ATCTGGT[A/C]CTCCAT on one strand reads TAGACCA[T/G]GAGGTA on the complementary strand: A is equivalent to T, C is equivalent to G. No ambiguity.

Strand orientation. The annoying problem: ATCTGGT[A/T]CTCCAT reads TAGACCA[T/A]GAGGTA on the complementary strand. Allele A in one study may be labeled as T in another. G/C SNPs have the same problem.

Strand orientation. Two ambiguous SNP types: A/T and G/C. Flip alleles if probe sequence is complementary to reference sequence (the + strand): http://www.well.ox.ac.uk/~wrayner/strand/ Further checks: allele frequencies [know your population]; LD [if you have raw data]; directionless combination.
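The strand logic can be sketched as follows. This is an illustrative helper with hypothetical names, not part of METAL or any strand-checking tool: it flips alleles when the labels alone settle the strand and returns None for A/T and G/C SNPs, where they cannot.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_strand_ambiguous(a1: str, a2: str) -> bool:
    """A/T and G/C SNPs look identical on both strands."""
    return COMPLEMENT[a1] == a2


def flip(a1: str, a2: str) -> tuple:
    """Report the same alleles on the complementary strand."""
    return COMPLEMENT[a1], COMPLEMENT[a2]


def harmonise(ref_alleles: tuple, study_alleles: tuple):
    """Express study alleles on the reference strand, or None if labels cannot decide."""
    if is_strand_ambiguous(*ref_alleles):
        return None  # needs allele frequencies, LD or probe sequences instead
    if set(study_alleles) == set(ref_alleles):
        return study_alleles
    flipped = flip(*study_alleles)
    if set(flipped) == set(ref_alleles):
        return flipped
    return None  # alleles do not match on either strand


print(harmonise(("A", "C"), ("T", "G")))
print(harmonise(("A", "T"), ("T", "A")))
```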

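Weighted β: β = Σ w_i β_i / Σ w_i, with standard error SE(β) = 1/√(Σ w_i), where w_i = 1/SE_i² (inverse-variance weights)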
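A minimal sketch of a weighted β combination, assuming the standard fixed-effects inverse-variance weights w_i = 1/SE_i². The function name and example values are hypothetical.

```python
from statistics import NormalDist


def inverse_variance_meta(betas, standard_errors):
    """Fixed-effects meta-analysis: pooled beta, its SE, Z and two-tailed p-value."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    se = (1.0 / sum(weights)) ** 0.5
    z = beta / se
    p = 2 * NormalDist().cdf(-abs(z))
    return beta, se, z, p


beta, se, z, p = inverse_variance_meta([0.10, 0.20], [0.05, 0.05])
print(f"beta = {beta:.3f}, SE = {se:.4f}, Z = {z:.3f}, p = {p:.2e}")
```

This only makes sense when β and its standard error are on the same scale in every study.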

Weighted Z: Z = Σ w_i Z_i / √(Σ w_i²), where w_i = √N_i and N_i is the sample size of study i. The test statistics Z_i can be obtained from two-tailed p-values and the direction of effect, or from one-tailed p-values, using the inverse normal distribution function.
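A sketch of the Z-based combination, assuming weights equal to the square root of each study's sample size. The function names and example values are hypothetical.

```python
from statistics import NormalDist

STD = NormalDist()


def z_from_p(p_two_tailed: float, direction: int) -> float:
    """Signed Z from a two-tailed p-value and the direction of effect (+1 or -1)."""
    return direction * -STD.inv_cdf(p_two_tailed / 2)


def weighted_z(z_scores, sample_sizes):
    """Combine Z-scores with weights sqrt(N_i); the result is N(0, 1) under H0."""
    weights = [n ** 0.5 for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    return numerator / sum(w * w for w in weights) ** 0.5


zs = [z_from_p(0.01, +1), z_from_p(0.20, +1), z_from_p(0.50, -1)]
z_meta = weighted_z(zs, [1000, 2000, 500])
print(f"meta Z = {z_meta:.3f}, p = {2 * STD.cdf(-abs(z_meta)):.3g}")
```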

Directionless combination. Fisher’s method: sum of χ²’s. Correlation of p-values from the methods ~ 0.99. Chi-squared statistics can be weighted by sample size.
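A sketch of Fisher's method in its unweighted form: -2 Σ ln(p_i) follows a chi-squared distribution with 2k degrees of freedom under H0. The closed-form tail probability for even degrees of freedom keeps this to the standard library; the function name is hypothetical.

```python
import math


def fisher_combined(p_values):
    """Return Fisher's chi-squared statistic (2k df) and its combined p-value."""
    k = len(p_values)
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    half = statistic / 2.0
    # Survival function of chi-squared with 2k df: exp(-x/2) * sum_{j<k} (x/2)^j / j!
    p_combined = math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(k))
    return statistic, p_combined


stat, p_comb = fisher_combined([0.05, 0.05])
print(f"chi-squared = {stat:.3f} on 4 df, combined p = {p_comb:.4f}")
```

Because it ignores the direction of effect, two studies with opposite effects still reinforce each other under this method.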

Test for Heterogeneity Cochran’s Q
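Cochran's Q can be sketched as the weighted sum of squared deviations of each study's β from the fixed-effects estimate, with inverse-variance weights assumed. Under homogeneity Q is approximately chi-squared with k-1 degrees of freedom. The function name and example values are hypothetical.

```python
def cochran_q(betas, standard_errors):
    """Return Cochran's Q and its degrees of freedom (number of studies minus 1)."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    return q, len(betas) - 1


q, df = cochran_q([0.10, 0.20], [0.05, 0.05])
print(f"Q = {q:.3f} on {df} df")
```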

Publication bias: funnel plot. Positive studies are easier to publish than negative studies, especially if sample size is modest or small. Does not affect GWAS meta-analyses!

Why not random effects? Random effects (RE) meta-analysis allows β to differ in different populations. In contrast, fixed effects (FE) meta-analysis assumes the same β across populations. Since variation of β is likely to exist, why is FE meta-analysis preferred? Consider two null hypotheses: (1) H0: β = 0 for all populations; (2) H0: E(β) = 0 across populations. The first H0 is more appropriate, but the RE model is designed to test the second H0. Han & Eskin, 2011, AJHG

METAL Practical. You have run a GWA analysis. We will take the results file and a second results file to run a meta-analysis using METAL. Copy files from faculty/meike/2015/metal-practical to your own folder.

Documentation can be found on the METAL website: http://www.sph.umich.edu/csg/abecasis/metal/ and on the METAL wiki: http://genome.sph.umich.edu/wiki/Metal_Documentation

METAL. METAL is flexible. By default, METAL combines p-values across studies (taking sample size and direction of effect into account). Alternatively, it can use standard-error-based weights (but beta and standard error must use the same units in all studies).

METAL Requires results files ‘Driver’ file Describes the input files Defines meta-analysis strategy Names output file

Steps. (1) Check format of results files: ensure all necessary columns are available; modify files to include all information. (2) Prepare driver file: ensure headers match description; crosscheck that each results file matches its PROCESS name. (3) Run metal.

Results Files Previously asked for standard columns in SOP Current procedure is to upload complete files (results and info files)

INPUT FILES We will use the GWAs results (results1.txt) We will use a second set of results (results2.txt)

Columns METAL uses: SNP; OR and SE [for standard error meta-analysis]; P-value [for Z-score meta-analysis]. If we had two samples of different sizes we would have to add an N/weight column.

Meta-analysis running We will run meta-analysis based on effect size and on test statistic For the weights of test statistic, I’ve assumed that the sample sizes are the same METAL defaults to weight of 1 when no weight column is supplied

Step 2: driver file: meta_run_file
# PERFORM META-ANALYSIS based on effect size and on test statistic
# Loading in the input files with results from the participating samples
# Note: Order of samples is …[sample size, alphabetic order,..]
# Phenotype is ..
# MB March 2015
# Specify column names
MARKER SNP
ALLELE A1 A2
PVALUE P
EFFECT log(OR)
STDERR SE
# Process two results files
PROCESS results1.txt
PROCESS results2.txt
# Output file naming
OUTFILE meta_res_Z .txt
# Conduct Z-based meta-analysis from test statistic
ANALYZE
# Clear workspace
CLEAR
# Change meta-analysis scheme to beta + SE
SCHEME STDERR
# Output file naming
OUTFILE meta_res_SE .txt
# Conduct effect size meta-analysis
ANALYZE

Larger Consortia
# PERFORM META-ANALYSIS on P-values
# Loading in the input files with results from the participating samples
# Note: Order of samples is alphabetic
# Phenotype is WB
# 1. AGES_HAP
MARKER SNPID
ALLELE coded_all noncoded_all
EFFECT Beta
PVALUE Pval
WEIGHT n_total
GENOMICCONTROL ON
COLUMNCOUNTING LENIENT
PROCESS AGES_HAP.txt
# 2. ALSPAC_HAP
PROCESS ALSPAC_HAP.txt
# AND SO ON (in this case 40 files)

Running metal
metal < metal_run_file
metal is the command; metal_run_file is the driver file. This prints information about the METAL run to standard output [the terminal]. It will create 4 files: 2 results files (meta_res_Z1.txt and meta_res_SE1.txt) and 2 info files (meta_res_Z1.txt.info and meta_res_SE1.txt.info).

Output you’ll see Overview of METAL commands Any errors And your best hit from meta-analysis

To load into Haploview We have to change the header In the same directory run: ./reformat.sh This changes 1st column name to SNP We can then load the meta-analysis results files into haploview Same as before but load in the meta_res_Z1.txt Make sure to include gwas-example.bim

Plot

Example: Wellbeing. I presented a single-cohort GWAS on WB at the BGA conference in 2011 (n=3089). Social Science Genetic Association Consortium (SSGAC): SOP has been sent out. FEB 2012: only aware of a couple of cohorts with WB data. DEC 2012: 30 cohorts agreed to participate (PA: 90K). DEC 2014: 41 cohorts (PA: 137K, LS: 92K, WB: 172K).