The UNIVERSITY of NORTH CAROLINA at CHAPEL HILL Graph Regularized Dual Lasso for Robust eQTL Mapping Wei Cheng 1 Xiang Zhang 2 Zhishan Guo 1 Yu Shi 3 Wei.


1 The UNIVERSITY of NORTH CAROLINA at CHAPEL HILL. Graph Regularized Dual Lasso for Robust eQTL Mapping. Wei Cheng (1), Xiang Zhang (2), Zhishan Guo (1), Yu Shi (3), Wei Wang (4). (1) University of North Carolina at Chapel Hill, (2) Case Western Reserve University, (3) University of Science and Technology of China, (4) University of California, Los Angeles. Speaker: Wei Cheng. The 22nd Annual International Conference on Intelligent Systems for Molecular Biology (ISMB’14).

2 eQTL (Expression QTL). Goal: Identify genomic locations where genotype significantly affects gene expression.

3 Statistical Test. Partition individuals into groups according to the genotype of a SNP. Run a statistical test (t-test, ANOVA). Repeat for each SNP. [Figure: binary SNP genotype matrix (X) and gene expression level matrix (Z), indexed by individuals; plot of gene expression level against the genotype (0 or 1) of SNP1.]
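The per-SNP procedure on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes binary genotypes, a single expression profile, and Welch's t statistic as the test; the function names and the toy data are made up for the example.

```python
import numpy as np

def welch_t(a, b):
    """Welch's t statistic for two 1-D samples (unequal variances allowed)."""
    va = a.var(ddof=1) / a.size
    vb = b.var(ddof=1) / b.size
    return (a.mean() - b.mean()) / np.sqrt(va + vb)

def scan_snps(X, z):
    """Return one t statistic per SNP for a single expression profile z.

    X: (n_snps, n_individuals) binary genotype matrix, one row per SNP.
    z: (n_individuals,) expression levels of one gene.
    SNPs with fewer than two individuals in either group get NaN.
    """
    stats = np.full(X.shape[0], np.nan)
    for k, genotype in enumerate(X):
        g0, g1 = z[genotype == 0], z[genotype == 1]
        if g0.size >= 2 and g1.size >= 2:
            stats[k] = welch_t(g1, g0)
    return stats

# Toy data (invented for illustration): SNP 0 separates low from high
# expressers, SNP 1 does not.
X = np.array([[0, 0, 0, 0, 1, 1, 1, 1],
              [0, 1, 0, 1, 0, 1, 0, 1]])
z = np.array([2.0, 3.0, 1.0, 2.0, 9.0, 8.0, 11.0, 10.0])
t_stats = scan_snps(X, z)
```

A large absolute statistic for a SNP suggests that its genotype groups differ in expression; in practice the statistics would be converted to p-values and corrected for the number of SNPs tested.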

4 Lasso-based feature selection. X: the SNP matrix (each row is one SNP). Z: the gene expression matrix (each row is one gene expression level). Objective: [equation not preserved in the transcript].
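A standard Lasso objective consistent with the definitions of X and Z above can be written as follows. This is a sketch under assumed notation, not the slide's own formula: K SNPs, N genes, D individuals, a coefficient matrix W, and a sparsity weight \eta are symbols chosen here for illustration.

```latex
\min_{W \in \mathbb{R}^{N \times K}} \;
  \frac{1}{2}\,\lVert Z - W X \rVert_F^2 \;+\; \eta\,\lVert W \rVert_1,
\qquad X \in \mathbb{R}^{K \times D},\quad Z \in \mathbb{R}^{N \times D}
```

The first term measures how well the SNPs explain the expression levels, and the \ell_1 term drives most entries of W to zero so that only a few SNP-gene associations are selected.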

5 Incorporating prior knowledge. SNPs (and genes) are usually not independent. The interplay among SNPs and the interplay among genes can be represented as networks and used as prior knowledge. Prior knowledge: genetic interaction network, PPI network, gene co-expression network, etc. Example methods: group lasso, multi-task, SIOL, MTLasso2G.

6 Limitations of current methods. A clustering step is usually needed to obtain the grouping information. They do not take into consideration the incompleteness of the prior knowledge and the noise in it; e.g., PPI networks may contain many false interactions and miss true interactions. Other prior knowledge, such as location and gene pathway information, is not considered.

7 Motivation. Examples of prior knowledge: the genetic interaction network S, and gene-gene interactions represented by the PPI network (or gene co-expression network) G. W holds the regression coefficients to be learned.

8 GD-Lasso: Graph-regularized Dual Lasso. Objective: [equation not preserved in the transcript]. Annotated terms: the Lasso objective considering confounding factors (L), where ||L||_* is the nuclear norm, which keeps L low-rank; the graph regularizer; the fitting constraint for prior knowledge.
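To make the role of a graph regularizer concrete, the sketch below assumes the common Laplacian form tr(W (D_S − S) W^T), where D_S is the degree matrix of S. The transcript does not preserve the slide's formula, so this form, the function names, and the toy matrices are assumptions for illustration. The code checks numerically that the trace equals half the sum over SNP pairs of S_ij times the squared distance between the corresponding coefficient columns of W.

```python
import numpy as np

def laplacian(A):
    """Unnormalized graph Laplacian D - A of a symmetric adjacency matrix."""
    return np.diag(A.sum(axis=1)) - A

def graph_penalty(W, S):
    """tr(W (D_S - S) W^T): penalizes differing coefficient columns of W
    for SNP pairs that are connected in S (assumed form, see lead-in)."""
    return float(np.trace(W @ laplacian(S) @ W.T))

def pairwise_penalty(W, S):
    """The same quantity written as an explicit sum over SNP pairs."""
    K = S.shape[0]
    total = 0.0
    for i in range(K):
        for j in range(K):
            total += 0.5 * S[i, j] * np.sum((W[:, i] - W[:, j]) ** 2)
    return total

rng = np.random.default_rng(0)
# Toy SNP network on 3 SNPs (a path 0 - 1 - 2), invented for illustration.
S = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
W = rng.normal(size=(2, 3))   # 2 genes x 3 SNPs
smooth_W = np.ones((2, 3))    # identical columns: no penalty expected
```

Under this form the penalty is zero when connected SNPs have identical coefficient columns and grows as they diverge, which is how a prior network can pull related SNPs toward similar effects.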

9 GGD-Lasso: Generalized Graph-regularized Dual Lasso. Further incorporates location and pathway information. Objective: [equation not preserved in the transcript], where D(·, ·) is a nonnegative distance measure.

10 GGD-Lasso: Optimization. Executes the following two steps iteratively until the termination condition is met: 1) update W while fixing S and G; 2) update S and G according to W, while decreasing two terms of the objective [expressions not preserved in the transcript]. We can maintain a fixed number of edges in S and G. E.g., to update G, we can swap edge (i’, j’) and edge (i, j) when [condition not preserved in the transcript]. Location and pathway information can be further integrated.
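The edge-swapping idea in step 2 can be illustrated with a small sketch. The swap condition on the slide is not preserved in the transcript, so the criterion below is an assumption: an existing edge of G is exchanged for a missing one when the coefficient rows of the missing pair are closer than those of the existing pair. All function names and the toy data are hypothetical.

```python
import numpy as np

def row_dist(W, i, j):
    """Squared Euclidean distance between coefficient rows i and j of W."""
    return float(np.sum((W[i] - W[j]) ** 2))

def gene_penalty(W, G):
    """Sum over edges (i, j) of G of the squared distance between rows."""
    iu, ju = np.triu_indices(G.shape[0], k=1)
    return sum(G[i, j] * row_dist(W, i, j) for i, j in zip(iu, ju))

def swap_one_edge(W, G):
    """Swap the worst-fitting edge for the best-fitting non-edge if it helps.

    Assumed criterion (not from the slide): swap when the non-edge's rows
    are closer than the edge's rows. Returns a new matrix; the number of
    edges is unchanged.
    """
    n = G.shape[0]
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    edges = [p for p in pairs if G[p] == 1]
    non_edges = [p for p in pairs if G[p] == 0]
    if not edges or not non_edges:
        return G.copy()
    worst = max(edges, key=lambda p: row_dist(W, *p))
    best = min(non_edges, key=lambda p: row_dist(W, *p))
    G_new = G.copy()
    if row_dist(W, *best) < row_dist(W, *worst):
        G_new[worst] = G_new[worst[::-1]] = 0   # remove the old edge
        G_new[best] = G_new[best[::-1]] = 1     # add the new edge
    return G_new

# Toy example: genes 0 and 1 have nearly identical coefficients, but the
# prior network only links genes 0 and 2.
W = np.array([[1.0, 0.0],
              [1.0, 0.1],
              [-3.0, 2.0]])
G0 = np.array([[0, 0, 1],
               [0, 0, 0],
               [1, 0, 0]])
G1 = swap_one_edge(W, G0)
```

In a full alternating scheme this refinement would be interleaved with re-estimating W; only the network update is shown here.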

11 Experimental Study: simulation. 10 gene expression profiles are generated by [generating equations not preserved in the transcript].

12 Experimental Study: simulation. The ROC curve. The black solid line denotes what random guessing would have achieved.

13 Experimental Study: simulation. AUCs of Lasso, LORS, G-Lasso and GD-Lasso. In each panel, we vary the percentage of noise in the prior networks S_0 and G_0.
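For readers who want to reproduce this kind of comparison, an AUC can be computed from a vector of association scores and the known ground truth of a simulation. The sketch below uses the rank-sum identity for the area under the ROC curve. It is a generic illustration, not the authors' evaluation code, and the example scores and labels are invented.

```python
import numpy as np

def auc_score(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity.

    scores: 1-D array, larger means more likely a true association.
    labels: 1-D array of 0/1 ground truth. Ties receive average ranks.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels).astype(bool)
    order = np.argsort(scores, kind="mergesort")
    sorted_scores = scores[order]
    ranks = np.empty(scores.size)
    i = 0
    while i < scores.size:
        j = i
        while j + 1 < scores.size and sorted_scores[j + 1] == sorted_scores[i]:
            j += 1
        ranks[order[i:j + 1]] = 0.5 * (i + j) + 1.0  # average 1-based rank
        i = j + 1
    n_pos = labels.sum()
    n_neg = labels.size - n_pos
    u = ranks[labels].sum() - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)

# Invented examples: a perfect ranking, a reversed ranking, and all ties.
perfect = auc_score([0.9, 0.1, 0.8, 0.2], [1, 0, 1, 0])
reversed_rank = auc_score([0.1, 0.9], [1, 0])
all_tied = auc_score([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0])
```

In an eQTL simulation, the scores would typically be the absolute values of the estimated coefficients, flattened, and the labels would mark which SNP-gene pairs were truly associated.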

14 Experimental Study: Yeast. Yeast eQTL dataset: 112 yeast segregants generated from a cross of two inbred strains, BY and RM. Preprocessing: remove SNP markers with a percentage of NAs larger than 0.1 (the remaining incomplete SNPs are imputed), merge markers with the same genotypes, and drop genes with missing values. This yields 1017 SNP markers and 4474 expression profiles. Prior networks: genetic interaction network and PPI network (S and G).
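The preprocessing steps listed on this slide can be sketched as follows. This is a hedged illustration on synthetic arrays, not the authors' pipeline. It assumes missing values are encoded as NaN, that the 0.1 threshold is a fraction of individuals, and that imputation uses the per-marker majority genotype; the slide does not say which imputation method was used.

```python
import numpy as np

def preprocess(snps, expr, max_na_frac=0.1):
    """snps: (n_markers, n_individuals) float array, NaN marks a missing call.
    expr: (n_genes, n_individuals) float array, NaN marks a missing value.
    """
    # 1) Drop markers whose fraction of missing calls exceeds the threshold.
    na_frac = np.isnan(snps).mean(axis=1)
    snps = snps[na_frac <= max_na_frac].copy()
    # 2) Impute remaining missing calls. Assumption: per-marker majority
    #    genotype (the imputation method is not stated on the slide).
    for row in snps:
        missing = np.isnan(row)
        if missing.any():
            values, counts = np.unique(row[~missing], return_counts=True)
            row[missing] = values[np.argmax(counts)]
    # 3) Merge markers with identical genotype vectors (keep the first).
    _, first = np.unique(snps, axis=0, return_index=True)
    snps = snps[np.sort(first)]
    # 4) Drop genes with any missing expression value.
    expr = expr[~np.isnan(expr).any(axis=1)]
    return snps, expr

# Synthetic toy data with 20 individuals (invented for illustration).
n = 20
base = np.tile([0.0, 1.0], n // 2)
m_a = base.copy()
m_b = base.copy()                 # duplicate of m_a: merged away
m_c = 1.0 - base
m_c[0] = np.nan                   # 5% missing: kept and imputed
m_d = base.copy()
m_d[:5] = np.nan                  # 25% missing: dropped
snps_raw = np.vstack([m_a, m_b, m_c, m_d])
expr_raw = np.arange(3 * n, dtype=float).reshape(3, n)
expr_raw[1, 4] = np.nan           # gene with a missing value: dropped
snps_clean, expr_clean = preprocess(snps_raw, expr_raw)
```

Merging identical markers avoids perfectly collinear predictors, which a Lasso-type model cannot distinguish from one another.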

15 Experimental Study: Yeast. cis-enrichment analysis: (1) a one-tailed Mann-Whitney test on each SNP for cis hypotheses; (2) a paired Wilcoxon signed-rank test on the p-values obtained from (1). trans-enrichment: a similar strategy, in which genes regulated by transcription factors (TF) are used as trans-acting signals.

16 Experimental Study: Yeast. Pairwise comparison of different models using cis-enrichment and trans-enrichment analysis.

17 Experimental Study: Yeast. Summary of the top-15 hotspots detected by GGD-Lasso. Hotspot (12) in bold cannot be detected by G-Lasso. Hotspot (6) in italic cannot be detected by SIOL. Hotspot (3) in teletype cannot be detected by LORS.

18 Experimental Study: Yeast. Hotspots detected by different methods.

19 Conclusion. In this paper: We propose novel and robust graph regularized regression models that take into account the prior networks of SNPs and genes simultaneously. Exploiting the duality between the learned coefficients and the incomplete prior networks enables a more robust model. We also generalize our model to integrate other types of information, such as location and gene pathway information.

20 Thank You! Questions? Travel funding to ISMB 2014 was generously provided by DOE.

