Published by Garey Simmons. Modified over 8 years ago.
1
Sequence Kernel Association Tests (SKAT) for the Combined Effect of Rare and Common Variants
Statistics paper: Narahara
2
The American Journal of Human Genetics (2013)
SKAT
Wu, M.C., Lee, S., Cai, T., Li, Y., Boehnke, M., and Lin, X. (2011). Rare-variant association testing for sequencing data with the sequence kernel association test. Am. J. Hum. Genet. 89, 82–93.
Developed for rare-variant analysis
3
Background of rare variant analysis
Classic: burden tests
  Collapsing method: presence or absence (+/-) of rare variants in a region
  Counts of rare alleles
  Combined multivariate and collapsing (CMC) method
    rare variants: collapsed; common variants: each forms a separate group
    --> combined by Hotelling's T² statistic
  Weighted sum
Non-burden tests
  C-alpha test
  Sequence kernel association test (SKAT)
Problem of burden tests
  Burden tests assume that all rare variants influence the phenotype in the same direction with the same magnitude of effect (after weighting).
  --> Need for methods that are robust to different directions and magnitudes of effect
4
Development of SKAT
SKAT (2011)
A kernel regression approach
  non-parametric, non-linear regression
Flexible weighting function
  weights based on minor allele frequency
  weights based on SNP functional annotation
Wide range of application
  binary/continuous traits
  adjustment for covariates
  both rare and common variants (up-weighting rare variants)
Efficient computation
  score test for the variance component in a linear mixed model
5
Development of SKAT (2)
SKAT-O: Optimal unified approach (AJHG, 2012)
  Combination of a burden test and SKAT
  Burden test: optimal when most variants in a region are causal and the effects are in the same direction
  SKAT: optimal when a large fraction of the variants in a region are non-causal or the effects of causal variants are in different directions
Extension of SKAT-O to testing a combined effect of rare and common variants (AJHG, 2013)
6
Methods
7
Linear mixed model
Genetic effect: random effect
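A sketch of the model this slide refers to, written in the notation commonly used for SKAT. The symbols are assumptions, since the slide text itself names none: y_i is a continuous trait, X_i the covariates, G_i = (g_i1, ..., g_im) the genotypes in the region, and w_j a pre-specified weight for variant j.

```latex
\begin{align*}
y_i &= \alpha_0 + \boldsymbol{\alpha}^{\top} \mathbf{X}_i
      + \boldsymbol{\beta}^{\top} \mathbf{G}_i + \varepsilon_i,
      \qquad \varepsilon_i \sim N(0, \sigma^2) \\
\beta_j &\sim \text{arbitrary distribution with mean } 0
      \text{ and variance } w_j^{2}\,\tau \\
H_0 &: \tau = 0 \quad (\text{equivalently } \boldsymbol{\beta} = \mathbf{0})
\end{align*}
```

Treating the genetic effects as random reduces the test of all m coefficients to a test of the single variance component τ.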
8
Variance component score test
Choice of a kernel function
  weighted linear
  weighted quadratic
  weighted IBS
Kernel function = (weighted) genetic similarity between subjects
Choice of weights
  Typical parameters: a1 = 1, a2 = 25
P value given by the Davies method
  Approximation of the null distribution of the Q statistic
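The Q statistic with a weighted linear kernel can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the null model is intercept-only (no covariates), the trait is continuous, the weights are Beta(MAF; a1, a2) densities, and the function names `beta_pdf` and `skat_q` are hypothetical. It stops at Q and does not compute the Davies p-value.

```python
import math


def beta_pdf(x, a, b):
    """Density of the Beta(a, b) distribution at x, for 0 < x < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def skat_q(genotypes, phenotype, a1=1.0, a2=25.0):
    """Q = sum_j (w_j * S_j)^2 with a weighted linear kernel.

    genotypes: one row per subject, each a list of 0/1/2 minor-allele counts.
    phenotype: continuous trait values, one per subject.
    Assumption: intercept-only null model, so mu_hat is the sample mean.
    """
    n = len(phenotype)
    mean_y = sum(phenotype) / n
    residuals = [y - mean_y for y in phenotype]
    q = 0.0
    for j in range(len(genotypes[0])):
        column = [row[j] for row in genotypes]
        maf = sum(column) / (2 * n)  # sample minor allele frequency
        if maf <= 0.0 or maf >= 1.0:
            continue  # monomorphic variants carry no information
        w_j = beta_pdf(maf, a1, a2)  # up-weights rare variants when a1=1, a2=25
        s_j = sum(g * r for g, r in zip(column, residuals))  # score for variant j
        q += (w_j * s_j) ** 2
    return q


if __name__ == "__main__":
    toy_genotypes = [[0, 1], [1, 0], [0, 0], [1, 1]]
    toy_phenotype = [1.0, 2.0, 0.0, 3.0]
    print(skat_q(toy_genotypes, toy_phenotype))
```

With a1 = 1 and a2 = 25 the weight falls off quickly as MAF grows, which is what makes this choice suitable for rare variants.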
9
Optimal unified approach, SKAT-O
Next, they unified the burden test and SKAT to optimize rare-variant analysis.
Burden test
  More powerful than SKAT when most variants in a region are causal and the effects are in the same direction
SKAT
  More powerful than the burden test when a large fraction of the variants in a region are non-causal or the effects of causal variants are in different directions
10
Weighted burden test statistic and SKAT statistic
Weighted burden test: aggregates the variants before regression
SKAT: regresses on each variant first, then aggregates the individual variant statistics
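The two statistics can be written side by side as below. The notation is an assumption, chosen to match the usual SKAT-O presentation: g_ij is the genotype of subject i at variant j, μ̂_i the fitted mean under the null model, and w_j the variant weight.

```latex
\begin{align*}
S_j &= \sum_{i=1}^{n} g_{ij}\,(y_i - \hat{\mu}_i)
      && \text{score statistic for variant } j \\
Q_{\text{burden}} &= \Bigl( \sum_{j=1}^{m} w_j S_j \Bigr)^{2}
      && \text{sum first, then square} \\
Q_{\text{SKAT}} &= \sum_{j=1}^{m} w_j^{2} S_j^{2}
      && \text{square first, then sum}
\end{align*}
```

In the burden statistic, scores of opposite sign cancel inside the sum; in the SKAT statistic each score is squared before summation, so the direction of effect does not matter.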
11
Unifying two test statistics
The optimal value of ρ is determined by grid search.
Qρ is equivalently calculated by the formula of the score test statistic.
ρ: correlation between different βj's
  ρ=0: regression coefficients are not correlated with each other --> SKAT
  ρ=1: regression coefficients are perfectly correlated --> Burden test
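A minimal sketch of the unified statistic, assuming the usual SKAT-O form Q_ρ = (1 − ρ) Q_SKAT + ρ Q_burden. The function name `unified_q`, the toy scores, and the grid are hypothetical. The sketch only evaluates Q_ρ on a grid; the actual procedure selects ρ by the smallest p-value across the grid and then corrects for that selection, which is not shown here.

```python
def unified_q(scores, weights, rho):
    """Q_rho = (1 - rho) * Q_SKAT + rho * Q_burden for one value of rho.

    scores:  per-variant score statistics S_j
    weights: per-variant weights w_j
    """
    q_skat = sum((w * s) ** 2 for w, s in zip(weights, scores))
    q_burden = sum(w * s for w, s in zip(weights, scores)) ** 2
    return (1 - rho) * q_skat + rho * q_burden


if __name__ == "__main__":
    # Toy scores with mixed signs: the burden part nearly cancels out.
    toy_scores = [2.0, -1.5, 0.5]
    toy_weights = [1.0, 1.0, 1.0]
    for rho in [0.0, 0.25, 0.5, 0.75, 1.0]:
        print(rho, unified_q(toy_scores, toy_weights, rho))
```

At ρ = 0 the function returns the SKAT statistic and at ρ = 1 the burden statistic, matching the two limiting cases listed on the slide.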
12
Rare and common variants together in the SKAT-O framework
Different weighting functions are defined for rare and common variants. The effects of rare and common variants are fitted together using separate random effect terms.
13
Model
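One way to write a model with separate random-effect terms for rare and common variants is sketched below. The symbols are assumptions introduced for illustration: R and C index the rare and common variants in the region, and τ_rare and τ_common are the two variance components.

```latex
\begin{align*}
y_i &= \alpha_0 + \boldsymbol{\alpha}^{\top} \mathbf{X}_i
      + \sum_{j \in R} \beta_j\, g_{ij}
      + \sum_{k \in C} \gamma_k\, g_{ik}
      + \varepsilon_i \\
\beta_j &\sim \text{mean } 0,\ \text{variance } w_j^{2}\,\tau_{\text{rare}},
\qquad
\gamma_k \sim \text{mean } 0,\ \text{variance } v_k^{2}\,\tau_{\text{common}} \\
H_0 &: \tau_{\text{rare}} = \tau_{\text{common}} = 0
\end{align*}
```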
14
Statistic
Weighted sum of the statistics for rare and common variants
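A minimal sketch of such a weighted sum, assuming the form Q_φ = (1 − φ) Q_rare + φ Q_common. The function names are hypothetical, and "equal contribution" is read here as giving the two weighted terms the same standard deviation under the null, which is an assumption about the intended meaning.

```python
def combined_q(q_rare, q_common, phi):
    """Weighted sum of the rare-variant and common-variant statistics."""
    return (1 - phi) * q_rare + phi * q_common


def equal_contribution_phi(sd_rare, sd_common):
    """phi that makes (1 - phi) * Q_rare and phi * Q_common equally variable.

    Solving (1 - phi) * sd_rare == phi * sd_common for phi gives
    phi = sd_rare / (sd_rare + sd_common).
    """
    return sd_rare / (sd_rare + sd_common)


if __name__ == "__main__":
    # Hypothetical null standard deviations of the two statistics.
    phi = equal_contribution_phi(sd_rare=2.0, sd_common=6.0)
    print(phi, combined_q(q_rare=4.0, q_common=8.0, phi=phi))
```

Setting φ = 0 uses only the rare variants and φ = 1 only the common variants.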
15
Predefined parameters
Weights (as a function of MAF)
  Rare variants: Beta(1, 25)
  Common variants: Beta(0.5, 0.5)
Contribution, φ
  Equal contribution, or searching for the optimal value of φ
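The two weighting functions can be compared with a short sketch. The helper names are hypothetical, and the MAF cutoff separating rare from common variants is passed in explicitly because the slide does not state one; the value 0.01 in the example is an arbitrary assumption.

```python
import math


def beta_pdf(x, a, b):
    """Density of the Beta(a, b) distribution at x, for 0 < x < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def variant_weight(maf, cutoff):
    """Beta(1, 25) weight below the cutoff (rare), Beta(0.5, 0.5) otherwise (common)."""
    if maf < cutoff:
        return beta_pdf(maf, 1.0, 25.0)
    return beta_pdf(maf, 0.5, 0.5)


if __name__ == "__main__":
    for maf in [0.001, 0.005, 0.05, 0.2, 0.5]:
        print(maf, variant_weight(maf, cutoff=0.01))
```

Beta(1, 25) decreases steeply with MAF, so the rarest variants get the largest weights, while Beta(0.5, 0.5) is much flatter over the common-variant range.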
16
Appendix
17
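Kernel in statistics
In Bayesian statistics
The kernel of a probability density function (PDF) or probability mass function (PMF) is the form of the PDF or PMF in which any factors that are not functions of any of the variables in the domain (the normalization factor) are omitted.
Ex. kernel of a normal distribution
  PDF: $p(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$
  Kernel: $p(x \mid \mu, \sigma^2) \propto \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$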
18
Kernel in statistics (2)
In non-parametric statistics
A kernel is a weighting function.
Usage
  Kernel density estimation: to estimate random variables' density functions
  Kernel regression: to estimate the conditional expectation of a random variable
  Time series: to estimate the spectral density
  Estimation of a time-varying intensity for a point process
Definition
A kernel is a non-negative, real-valued, integrable function K satisfying the following two requirements:
  $\int_{-\infty}^{\infty} K(u)\,du = 1$ --> A kernel is a PDF.
  $K(-u) = K(u)$ for all u --> A kernel is symmetric about u=0.
If K is a kernel, then so is the function K*, where K*(u) = λK(λu) and λ > 0.
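As an illustration of the first usage, a kernel density estimate can be sketched with a Gaussian kernel. The function names, the sample, and the bandwidth are hypothetical choices for the example; the Gaussian is only one of many kernels that satisfy the definition.

```python
import math


def gaussian_kernel(u):
    """Standard normal density: non-negative, integrates to 1, symmetric about 0."""
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)


def kde(x, sample, bandwidth):
    """Kernel density estimate f_hat(x) = (1 / (n * h)) * sum_i K((x - x_i) / h)."""
    n = len(sample)
    total = sum(gaussian_kernel((x - xi) / bandwidth) for xi in sample)
    return total / (n * bandwidth)


if __name__ == "__main__":
    sample = [1.2, 1.9, 2.1, 2.4, 3.8]
    for x in [1.0, 2.0, 3.0, 4.0]:
        print(x, kde(x, sample, bandwidth=0.5))
```

The bandwidth plays the role of 1/λ in the rescaled kernel K*(u) = λK(λu): a smaller bandwidth gives a narrower, taller kernel around each observation.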
19
Kernel regression
Kernel regression is a non-parametric approach to finding a non-linear relation between a pair of random variables X and Y.
The goal is to estimate a function m that gives the conditional expectation of a variable Y relative to a variable X: $E(Y \mid X) = m(X)$
A kernel is used to estimate the function m.
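One common estimator of m is the Nadaraya-Watson estimator, a kernel-weighted average of the observed Y values. The slide does not say which estimator is meant, so this choice, the Gaussian kernel, and the function names are assumptions made for the sketch.

```python
import math


def gaussian_kernel(u):
    """Standard normal density used as the weighting function."""
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)


def nadaraya_watson(x0, xs, ys, bandwidth):
    """Estimate m(x0) = E(Y | X = x0) as a kernel-weighted average of ys.

    Observations whose x is close to x0 receive larger weights.
    Note: if x0 is extremely far from every x, all weights underflow to 0
    and the division fails; this sketch does not guard against that.
    """
    weights = [gaussian_kernel((x0 - x) / bandwidth) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0, 4.0]
    ys = [0.1, 0.9, 3.8, 9.2, 15.9]  # roughly y = x^2, a non-linear relation
    for x0 in [0.5, 1.5, 2.5, 3.5]:
        print(x0, nadaraya_watson(x0, xs, ys, bandwidth=0.5))
```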
20
Kernel trick
The kernel trick implicitly maps data into a higher-dimensional space, by replacing inner products with a kernel function, so that non-linear data can be separated by a hyperplane.
non-linear --> linear
Kernel function: K(x, z) = <Φ(x), Φ(z)>
  Φ(・): a function to project data into the higher-dimensional space
  <・, ・>: inner product
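The identity K(x, z) = <Φ(x), Φ(z)> can be checked numerically for a simple case. The degree-2 polynomial kernel on two-dimensional inputs and the function names below are choices made for this sketch, not taken from the slides.

```python
import math


def phi(x):
    """Explicit feature map from 2 dimensions to 3 for the degree-2 polynomial kernel."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


def inner(a, b):
    """Ordinary inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def poly_kernel(x, z):
    """K(x, z) = (<x, z>)^2, evaluated without ever constructing phi."""
    return inner(x, z) ** 2


if __name__ == "__main__":
    x, z = (1.0, 2.0), (3.0, 4.0)
    print(poly_kernel(x, z))        # computed in the original 2-D space
    print(inner(phi(x), phi(z)))    # computed in the 3-D feature space
```

Both lines print the same value up to floating-point rounding, which is the point of the trick: the kernel gives the inner product in the higher-dimensional space at the cost of a computation in the original space.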
21
Application of kernel method
Kernel PCA (non-linear PCA)
Kernel CCA
Support vector machine
22
Support vector machine
SVM is a machine learning approach that uses a kernel function to project data into a higher-dimensional space in which the data can be separated by a hyperplane.
With a non-linear kernel, SVM acts as a non-linear classifier.
23
Variance component score test
Lin, X. (1997). Variance component testing in generalised linear models with random effects. Biometrika 84, 309–326.
Variance component tests in linear mixed model
  Likelihood ratio test
  Score statistic: computationally efficient
  Wald statistic