SEQSpark: A Complete Analysis Tool for Large-Scale Rare Variant Association Studies Using Whole-Genome and Exome Sequence Data

Presentation transcript:

SEQSpark: A Complete Analysis Tool for Large-Scale Rare Variant Association Studies Using Whole-Genome and Exome Sequence Data. Di Zhang, Linhai Zhao, Biao Li, Zongxiao He, Gao T. Wang, Dajiang J. Liu, Suzanne M. Leal. The American Journal of Human Genetics, Volume 101, Issue 1, Pages 115-122 (July 2017). DOI: 10.1016/j.ajhg.2017.05.017. Copyright © 2017 American Society of Human Genetics.

Figure 1. Spark Architecture and SEQSpark Workflow. (A) Interaction of the Spark components (driver and workers) with the Hadoop filesystem (HDFS) components (NameNode and DataNodes). The NameNode is the master node and manages the file system's metadata. A file in the HDFS can be split into several blocks, and those blocks are stored across a set of slave nodes (DataNodes). The NameNode determines the mapping of blocks to DataNodes, while the DataNodes perform the read and write operations within the file system. The Spark driver obtains the metadata from the HDFS NameNode and then distributes the jobs to the Spark workers. (B) The SEQSpark workflow begins with importing the data and the databases used for annotation. The data are loaded into Spark's internal data structures. Data quality control and annotation can then be performed, followed by association testing.
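The division of labor in panel (A) can be illustrated with a small toy model. This is only a sketch in plain Python, not Hadoop or SEQSpark code: the class names, the 4-byte block size, the file name `toy.vcf`, and the round-robin placement rule are all assumptions made for illustration (real HDFS uses much larger blocks, replicates them, and applies its own placement policy). It shows the one point the caption makes: the NameNode holds only the block-to-DataNode mapping, while the DataNodes hold and serve the actual bytes.

```python
from itertools import cycle

BLOCK_SIZE = 4  # bytes per block; hypothetical toy value, real HDFS blocks are far larger


class NameNode:
    """Toy master node: stores metadata only (which DataNode holds which block)."""

    def __init__(self, datanode_ids):
        self.block_map = {}  # (filename, block_index) -> DataNode id
        self._next_node = cycle(datanode_ids)  # assumed round-robin placement

    def allocate(self, filename, n_blocks):
        """Decide which DataNode receives each block of a new file."""
        for index in range(n_blocks):
            self.block_map[(filename, index)] = next(self._next_node)

    def locate(self, filename):
        """Return [(block_index, datanode_id), ...] in block order."""
        keys = sorted(key for key in self.block_map if key[0] == filename)
        return [(key[1], self.block_map[key]) for key in keys]


class DataNode:
    """Toy worker node: performs the actual block reads and writes."""

    def __init__(self):
        self.blocks = {}

    def write(self, key, data):
        self.blocks[key] = data

    def read(self, key):
        return self.blocks[key]


def put_file(namenode, datanodes, filename, content):
    """Split content into blocks and store each on the node the NameNode chose."""
    chunks = [content[i:i + BLOCK_SIZE] for i in range(0, len(content), BLOCK_SIZE)]
    namenode.allocate(filename, len(chunks))
    for index, node_id in namenode.locate(filename):
        datanodes[node_id].write((filename, index), chunks[index])


def get_file(namenode, datanodes, filename):
    """Ask the NameNode where the blocks are, then read them from the DataNodes."""
    return b"".join(
        datanodes[node_id].read((filename, index))
        for index, node_id in namenode.locate(filename)
    )


datanodes = {"dn1": DataNode(), "dn2": DataNode()}
namenode = NameNode(["dn1", "dn2"])
put_file(namenode, datanodes, "toy.vcf", b"ACGTACGTAC")
print(namenode.locate("toy.vcf"))
print(get_file(namenode, datanodes, "toy.vcf"))
```

In this picture the Spark driver plays the role of the caller of `locate`: it asks for the metadata first and can then hand each worker the blocks it should process.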

Figure 2. UK10K Waist-to-Hip Ratio Data: Scatterplot of the First Two Principal Components and Quantile-Quantile Plots for the Association Analyses. (A) First two principal components (PCs) for the whole-genome sequence (WGS) data from 1,811 UK10K study subjects with waist-to-hip ratio (WHR) data. The PCs were constructed using variants with an MAF ≥ 0.01. For the first PC, μ = −0.0228 and STD = 9.8292 × 10^−5, while for the second PC, μ = −0.0021 and STD = 0.0073. The dashed lines outline 4 STDs for the first and second PCs. Thirteen individuals, shown in red, fall outside 4 STDs for the second PC and were removed from further analysis. (B) Quantile-quantile plots for each association analysis performed: single variants, CMC, BRV, VT, SKAT, and SKAT-O.
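The outlier rule in panel (A) can be sketched as follows. This is an illustrative sketch, not SEQSpark's implementation: the function name `flag_pc_outliers`, the use of the population standard deviation, and the choice to check every PC column are assumptions. The caption states only that samples beyond 4 STDs on the second PC were removed.

```python
from statistics import mean, pstdev


def flag_pc_outliers(pc_scores, n_std=4.0):
    """Return indices of samples more than n_std standard deviations
    from the mean on at least one principal component.

    pc_scores: list of per-sample tuples, e.g. [(pc1, pc2), ...].
    """
    outliers = set()
    n_pcs = len(pc_scores[0])
    for j in range(n_pcs):
        column = [row[j] for row in pc_scores]
        mu = mean(column)
        sd = pstdev(column)  # population SD; an assumption, the caption does not say which
        if sd == 0:
            continue  # a constant column cannot have outliers
        for i, value in enumerate(column):
            if abs(value - mu) > n_std * sd:
                outliers.add(i)
    return outliers


# Made-up scores: 29 unremarkable samples plus one extreme on the second PC.
scores = [(1.0 if i % 2 == 0 else -1.0, 0.0) for i in range(29)] + [(-1.0, 10.0)]
flagged = flag_pc_outliers(scores)
kept = [row for i, row in enumerate(scores) if i not in flagged]
print(sorted(flagged), len(kept))
```

Because the mean and standard deviation are computed from the same samples being screened, a single extreme value inflates the threshold. With few samples no point can exceed 4 STDs, so the rule is only meaningful for cohorts of reasonable size, such as the 1,811 subjects described above.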