Phevor Combines Multiple Biomedical Ontologies for Accurate Identification of Disease-Causing Alleles in Single Individuals and Small Nuclear Families


Marc V. Singleton, Stephen L. Guthery, Karl V. Voelkerding, Karin Chen, Brett Kennedy, Rebecca L. Margraf, Jacob Durtschi, Karen Eilbeck, Martin G. Reese, Lynn B. Jorde, Chad D. Huff, Mark Yandell
The American Journal of Human Genetics, Volume 94, Issue 4, Pages 599-610 (April 2014). DOI: 10.1016/j.ajhg.2014.03.010. Copyright © 2014 The American Society of Human Genetics.

Figure 1. Variant Prioritization for Known Disease-Causing Alleles
Performance comparisons of four different variant-prioritization tools before (A) and after (B) postprocessing them with Phevor. Two copies of a known disease-causing allele were randomly selected from HGMD and spiked into a single target exome at the reported genomic location; hence, these results model simple, recessive diseases. This process was repeated 100 times for 100 different, randomly selected genes already associated with disease in order to determine margins of error. Bar charts show the percentage of time for which the disease-associated gene was ranked among the top ten candidates genome-wide (red) or among the top 100 candidates (blue); white denotes a rank greater than 100 in the candidate list. For the Phevor analyses in (B), each tool’s output files were fed to Phevor along with a phenotype report containing the HPO terms annotated to each disease-associated gene. The table below the bar charts summarizes this information in more detail. Bars do not reach 100% because of false negatives, i.e., not every tool is able to prioritize every disease-causing allele. When the target gene’s disease-causing alleles were unscored or predicted to be benign by a tool, the gene was placed at the midpoint of the list of the 22,107 annotated human genes.
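The tallying described in the Figure 1 caption can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: the function names are hypothetical, an unscored or benign-predicted gene is represented as `None`, the midpoint is taken as the integer half of 22,107, and the three bins are treated as disjoint (ranks 1 to 10, 11 to 100, and above 100).

```python
TOTAL_GENES = 22_107  # annotated human genes, as stated in the caption


def effective_rank(rank, total_genes=TOTAL_GENES):
    """Return the rank used in the tally.

    `None` stands for a target gene whose alleles were unscored or predicted
    benign; it is placed at the midpoint of the gene list (integer midpoint
    is an assumption of this sketch).
    """
    return total_genes // 2 if rank is None else rank


def tally(ranks, total_genes=TOTAL_GENES):
    """Return the percentage of trials falling in each (disjoint) rank bin."""
    if not ranks:
        raise ValueError("at least one trial is required")
    counts = {"top_10": 0, "top_100": 0, "beyond_100": 0}
    for rank in ranks:
        r = effective_rank(rank, total_genes)
        if r <= 10:
            counts["top_10"] += 1
        elif r <= 100:
            counts["top_100"] += 1
        else:
            counts["beyond_100"] += 1
    return {name: 100.0 * n / len(ranks) for name, n in counts.items()}


# Hypothetical ranks from five trials; None marks a false negative.
example = tally([1, 5, 50, None, 200])
```

A false negative therefore always lands in the "greater than 100" bin, which is why the colored bars in the figure cannot reach 100% for a tool that fails to score some alleles.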

Figure 2. Variant Prioritization for Genes Previously Unassociated with Disease
The procedure used in Figure 1B was repeated, but instead the disease-associated gene’s ontological annotations were removed from all but the specified ontologies prior to running Phevor. For economic reasons, only VAAST results are shown. Removing the disease-associated gene’s annotations from all ontologies mimics the case of a previously unreported allele in a gene with unknown GO function, process, and cellular location and no previous association with a known disease or phenotype. This is equivalent to running VAAST alone (“none”), and the leftmost bar chart and table column summarize these results. The right-hand bar and table column (“All”) summarize the results of running VAAST and Phevor with the current ontological annotations of the disease-associated gene. The “GO only” column reports the results of removing the disease-associated gene’s phenotype annotations, depicting discovery success with only GO ontological annotations. This column models the ability of Phevor to identify a disease association when that gene is annotated to GO but has no disease, human, or model-organism phenotype annotations. In contrast, the “MPO, HPO, and DO” column assays the impact of removing a gene’s GO annotations but leaving its disease, human, and model-organism phenotype annotations intact.

Figure 3. Comparison of Phevor to the Exomiser’s PHIVE
Comparison of disease-allele-identification success rates for Phevor and the PHIVE methodology, which is available through the Exomiser. The Exomiser is based upon ANNOVAR’s filtering logic; thus, the Phevor comparison uses ANNOVAR as the variant-prioritization tool. Shown are the results of 100 searches of known recessive disease-associated genes. Identical variant files and phenotype descriptions were given to Exomiser + PHIVE and ANNOVAR + Phevor. Bar charts show the percentage of time for which the target, i.e., disease-associated, gene was ranked among the top ten candidates genome-wide (red) or among the top 100 candidates (blue); white denotes a rank greater than 100 in the candidate list. The table below the bar charts summarizes this information in more detail. Bars do not reach 100% because of false negatives, i.e., the tool reported the disease-causing allele to be nondeleterious; these cases were placed at the midpoint of the list of 22,107 annotated human genes.

Figure 4. Phevor Accuracy and Atypical Disease Presentation
In order to evaluate the impact of incorrect diagnosis or atypical phenotypic presentation on Phevor’s accuracy, we repeated the analysis shown in Figure 1; this time, we randomly shuffled the phenotype descriptions for each gene at runtime and used the same phenotype descriptions for every member of a case cohort. For economic reasons, only VAAST results are shown. The results of running VAAST with and without Phevor for case cohorts of one, three, and five unrelated individuals are shown. As would be expected, providing Phevor with incorrect phenotype data significantly affected its diagnostic accuracy. For a single affected individual, Phevor’s accuracy declined from ranking the damaged gene among the top ten candidates genome-wide in 100% of cases to doing so in 26% of cases. Nevertheless, Phevor was still able to improve upon the performance of VAAST alone. Despite the misleading phenotype data, Phevor placed 95% of the damaged genes in the top ten candidates with cohorts of three and five unrelated affected individuals, because the additional statistical power provided by VAAST increasingly outweighed the incorrect prior probabilities provided by Phevor.
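The shuffling step in the Figure 4 caption can be illustrated with a minimal sketch. It assumes, purely for illustration, that phenotype descriptions are held in a dictionary keyed by gene and that "shuffling" means permuting whole descriptions among genes; the caption does not say how the shuffle was implemented or whether a gene could keep its own description by chance. The function name and the placeholder gene and term labels are hypothetical.

```python
import random


def shuffle_phenotypes(gene_to_terms, seed=None):
    """Reassign whole phenotype descriptions among genes at random.

    A plain permutation is used, so a gene may occasionally retain its own
    description; that behaviour is an assumption of this sketch.
    """
    rng = random.Random(seed)  # local generator keeps the shuffle reproducible
    genes = list(gene_to_terms)
    descriptions = [gene_to_terms[g] for g in genes]
    rng.shuffle(descriptions)
    return dict(zip(genes, descriptions))


# Placeholder labels only; these are not real genes or HPO terms.
original = {
    "gene_a": ["term_1", "term_2"],
    "gene_b": ["term_3"],
    "gene_c": ["term_4", "term_5"],
}
shuffled = shuffle_phenotypes(original, seed=0)
```

In a cohort setting the same shuffled description would then be supplied for every member of the cohort, as the caption states.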

Figure 5. Phevor Analyses of Three Clinical Cases
Plotted on the x axes of each Manhattan plot are the genomic coordinates of the candidate genes. The y axes show the log10 value of the ANNOVAR score, VAAST p value, or Phevor score depending upon the panel. Black, filled circles denote top-ranked gene(s), all of which had either the same ANNOVAR score or the same VAAST p value. Red circles denote the gene containing disease-causing allele(s). For purposes of comparison to VAAST, we transformed the ANNOVAR scores to frequencies by dividing the number of gene candidates identified by ANNOVAR by the total number of annotated human genes.
(A) Phevor identified NFKB2 as a disease-associated gene. (Top) Results of running ANNOVAR (left) and VAAST (right) on the union of variants identified in affected members of family A and those in the affected individual from family B. Both ANNOVAR and VAAST identified a large number of equally likely candidate genes. NFKB2 (shown in red) was among them in both cases. (Bottom) Phevor identified a single best candidate, NFKB2, by using the VAAST output, and NFKB2 was ranked second with the ANNOVAR output (two other genes were tied for first place).
(B) Phevor identified a de novo variant in STAT1 as responsible for a previously undescribed phenotype in an already disease-associated gene. (Top) Results of running ANNOVAR (left) and VAAST (right) on the single affected individual’s exome. Both ANNOVAR and VAAST identified multiple candidate genes. STAT1 (shown in red) was among them in both cases. (Bottom) Phevor identified a single best candidate, STAT1, by using the VAAST output. STAT1 was the third best candidate with the ANNOVAR output.
(C) Phevor identified a mutation in ABCB11, a known disease-associated gene. (Top) Results of running ANNOVAR (left) and VAAST (right) on the single affected child’s exome. Both ANNOVAR and VAAST identified a number of equally likely candidate genes. ABCB11 (shown in red) was among them. (Bottom) Phevor identified a single best candidate, ABCB11, by using the ANNOVAR and VAAST outputs.
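The ANNOVAR score transformation mentioned in the Figure 5 caption amounts to a single division followed by a logarithm. The sketch below is illustrative only: the function names are hypothetical, and the gene total of 22,107 is borrowed from the Figure 1 caption on the assumption that the same annotation set applies here.

```python
import math

TOTAL_GENES = 22_107  # assumed to match the gene count given for Figure 1


def annovar_frequency(n_candidates, total_genes=TOTAL_GENES):
    """Fraction of annotated genes that ANNOVAR retained as candidates."""
    if not 0 < n_candidates <= total_genes:
        raise ValueError("n_candidates must be between 1 and total_genes")
    return n_candidates / total_genes


def log10_frequency(n_candidates, total_genes=TOTAL_GENES):
    """log10 of the frequency, the quantity plotted on the y axis."""
    return math.log10(annovar_frequency(n_candidates, total_genes))
```

Because every candidate that passes ANNOVAR's filters receives the same frequency under this transformation, the candidates appear tied on the plot, which is consistent with the caption's description of many equally likely candidate genes.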