Increasing Power in Association Studies by using Linkage Disequilibrium Structure and Molecular Function as Prior Information Eleazar Eskin UCLA.

Motivation: In a whole-genome association study, how should multiple hypothesis correction be performed to increase statistical power? Incorporate prior information on the molecular function of associated loci, together with information on linkage disequilibrium structure.

Main idea: The traditional method uses a single significance threshold. In practice, markers are not identical. Set a different threshold at each marker, which reflects both intrinsic (e.g. LD, allele frequency) and extrinsic information on the markers.

Standard Association Study: M markers in N cases and N controls; f_i = minor allele frequency at marker i; true case/control allele frequency; marker d: causal variant with a relative risk

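Standard Association Study: The test statistic is normally distributed with unit variance and a mean (the non-centrality parameter) determined by the effect size and N. Power at a single marker: the probability of detecting an association with N individuals at p-value (significance) threshold t.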
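The single-marker power can be sketched numerically. This is a minimal illustration, assuming a two-sided test and a statistic distributed as N(ncp, 1); the function name `power` and the parameter name `ncp` are hypothetical, and the slide's own formula for the non-centrality parameter is not reproduced here.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal, mean 0 and variance 1


def power(t, ncp):
    """Two-sided power at significance threshold t (0 < t <= 1).

    Assumes the test statistic is N(ncp, 1). The null is rejected
    when |statistic| exceeds z, the two-sided critical value for t.
    """
    z = -_STD.inv_cdf(t / 2.0)            # critical value, z >= 0
    upper_tail = 1.0 - _STD.cdf(z - ncp)  # P(statistic > z)
    lower_tail = _STD.cdf(-z - ncp)       # P(statistic < -z)
    return upper_tail + lower_tail
```

With `ncp = 0` (no association) the power equals the threshold t itself, which is the false positive rate; the power grows as the absolute value of `ncp` grows.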

Multiple Hypothesis Correction: Fix the false positive rate at each marker so that the total false positive rate is α. Bonferroni correction: t_i = α/M. Expected power: the sum over markers of c_i times the power at marker i, where c_i is the probability that marker i is causal. The false positive rate is the probability of rejecting a correct null hypothesis.
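A small sketch of the expected power under the Bonferroni correction, assuming a two-sided test with statistic N(ncp_i, 1) at each marker. The names `power`, `bonferroni_expected_power`, `c`, and `ncp` are hypothetical, and the non-centrality parameters are taken as given inputs rather than derived from allele frequencies.

```python
from statistics import NormalDist

_STD = NormalDist()


def power(t, ncp):
    """Two-sided power at threshold t for a statistic ~ N(ncp, 1)."""
    z = -_STD.inv_cdf(t / 2.0)
    return (1.0 - _STD.cdf(z - ncp)) + _STD.cdf(-z - ncp)


def bonferroni_expected_power(c, ncp, alpha):
    """Expected power when every marker uses the threshold alpha / M.

    c[i]   : assumed probability that marker i is causal
    ncp[i] : assumed non-centrality parameter if marker i is causal
    """
    m = len(c)
    t = alpha / m  # the same threshold at every marker
    return sum(c_i * power(t, ncp_i) for c_i, ncp_i in zip(c, ncp))
```

For example, `bonferroni_expected_power([0.5, 0.5], [3.0, 4.0], 0.05)` weights the power of two markers equally, each tested at threshold 0.025.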

Multi-Threshold Association: Allow a different threshold t_i for each marker. Power: the expected power evaluated at the per-marker thresholds t_i, with adjusted false positive rate. Goal: set values for t_i to maximize the power subject to the constraint that the thresholds sum to α.

Maximizing the Power: At the optimal point, the gradient of the power with respect to the threshold is equal at every marker. Given a value of the gradient, solve for the threshold at each marker that achieves that gradient. Do a binary search over the gradient until the thresholds sum to α.
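The search described on this slide can be sketched as follows. This is an illustration under stated assumptions, not the author's implementation: it assumes a two-sided test with statistic N(ncp_i, 1), moderate values of ncp_i, and given causal probabilities c_i. Under those assumptions the derivative of the power with respect to t is exp(-ncp²/2) · cosh(ncp · q) with q = Φ⁻¹(t/2), which decreases in t and can be inverted in closed form. All function and parameter names are hypothetical.

```python
import math
from statistics import NormalDist

_STD = NormalDist()


def power(t, ncp):
    """Two-sided power at threshold t for a statistic ~ N(ncp, 1)."""
    if t <= 0.0:
        return 0.0
    if t >= 1.0:
        return 1.0
    z = -_STD.inv_cdf(t / 2.0)
    return (1.0 - _STD.cdf(z - ncp)) + _STD.cdf(-z - ncp)


def expected_power(c, ncp, t):
    """Sum over markers of c_i times the power at threshold t_i."""
    return sum(ci * power(ti, mi) for ci, mi, ti in zip(c, ncp, t))


def threshold_for_gradient(g, c_i, ncp_i):
    """Threshold in [0, 1] at which c_i * dP/dt equals the gradient g.

    Solves c_i * exp(-ncp^2 / 2) * cosh(ncp * q) = g for q = Phi^-1(t/2).
    """
    if c_i <= 0.0:
        return 0.0  # a marker that cannot be causal gets no budget
    ratio = g * math.exp(ncp_i * ncp_i / 2.0) / c_i
    if ratio <= 1.0:
        return 1.0  # gradient stays above g everywhere: cap at t = 1
    if ncp_i == 0.0:
        return 0.0  # flat gradient c_i lies below g
    q = -math.acosh(ratio) / abs(ncp_i)
    return 2.0 * _STD.cdf(q)


def optimal_thresholds(c, ncp, alpha, iters=200):
    """Binary search over the common gradient until thresholds sum to alpha."""
    def total(g):
        return sum(threshold_for_gradient(g, ci, mi) for ci, mi in zip(c, ncp))

    lo = hi = 1.0
    while total(lo) < alpha:  # smaller gradient -> larger thresholds
        lo /= 2.0
    while total(hi) > alpha:  # larger gradient -> smaller thresholds
        hi *= 2.0
    for _ in range(iters):
        mid = math.sqrt(lo * hi)  # bisect on a log scale
        if total(mid) > alpha:
            lo = mid
        else:
            hi = mid
    # Use the upper end so the thresholds never sum to more than alpha.
    return [threshold_for_gradient(hi, ci, mi) for ci, mi in zip(c, ncp)]
```

Calling `optimal_thresholds([0.25] * 4, [1.0, 2.0, 3.0, 4.0], 0.05)` returns four thresholds that sum to roughly 0.05; passing them to `expected_power` allows a direct comparison with the uniform thresholds 0.05 / 4.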

Maximizing Power for Proxies: In practice, markers are tags for causal variation. Given K variants, assign each potential causal variant v_k to the best marker i. The effective non-centrality parameter is reduced by a factor of |r_ki|, where r_ki is the correlation coefficient between variant k and marker i. If v_k is causal, the power function when observing proxy marker i is the single-marker power evaluated at this reduced non-centrality parameter.

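Maximizing Power for Proxies: Each variant k has a probability c_k of being causal. The total power captured by marker i is the sum, over the variants assigned to it, of c_k times the power at the proxy. The total power of the association study is the sum of the power captured over all markers.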
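A minimal sketch of the proxy calculation, assuming a two-sided test with statistic N(ncp, 1) and assuming "best marker" means the marker with the largest |r_ki|. The names `assign_to_best_marker` and `total_power`, and the layout of the correlation table `r`, are hypothetical.

```python
from statistics import NormalDist

_STD = NormalDist()


def power(t, ncp):
    """Two-sided power at threshold t for a statistic ~ N(ncp, 1)."""
    z = -_STD.inv_cdf(t / 2.0)
    return (1.0 - _STD.cdf(z - ncp)) + _STD.cdf(-z - ncp)


def assign_to_best_marker(r_k):
    """Index of the marker most strongly correlated with variant k."""
    return max(range(len(r_k)), key=lambda i: abs(r_k[i]))


def total_power(c, ncp, r, t):
    """Total power of the study when causal variants are seen via proxies.

    c[k]    : assumed probability that variant k is causal
    ncp[k]  : non-centrality parameter if variant k were tested directly
    r[k][i] : correlation coefficient between variant k and marker i
    t[i]    : significance threshold at marker i
    """
    total = 0.0
    for c_k, ncp_k, r_k in zip(c, ncp, r):
        i = assign_to_best_marker(r_k)
        # The non-centrality parameter shrinks by |r_ki| at the proxy.
        total += c_k * power(t[i], abs(r_k[i]) * ncp_k)
    return total
```

A variant tagged perfectly (|r_ki| = 1) contributes its full power; weaker tags contribute less at the same threshold.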

Candidate Gene Study: 1000 cases and controls over the ENCODE regions, using markers on the Affymetrix 500k GeneChip

Robustness over relative risks

Whole Genome Association: Assumption: each SNP is equally likely to be causal, with a relative risk of 2. Power for the traditional study / multi-threshold association over 2,614,057 SNPs: average 0.593 / 0.610; average over power in [0.1, 0.9]: 0.568 / 0.615

Impact of extrinsic information: 1. cSNPs are more likely to be involved in disease. 2. Add information on sets of genes which are more likely to be involved in a specific disease. The 30,700 cSNPs in HapMap are assumed to contribute 20% of the disease-causing variation. Cancer Gene Census: 363 genes in which mutations have been implicated in cancer; 20% of causal variation is assumed to lie in these genes.
