Benchmarking Methods for Identifying Causal Mutations
Tal Friedman

Rare Genetic Diseases
Our goal: identify and diagnose rare genetic diseases
Difficult for clinicians due to incredibly low exposure
Often not already documented

Overview
Why are we doing any of this?
Background
What you did
Why this was important
Finding the cause of rare genetic diseases (introduce rare genetic diseases)
We have PhenomeCentral (PC), which tries to help clinicians find additional patients and candidate genes
It uses the HPO to match patients by phenotype
But we don't really know how well it's working
Exomiser is a good start (how well is it working?)
1) Try to recreate results, 2) extend to the patient matching domain

PhenomeCentral
Clinicians upload patient data

PhenomeCentral
Matchmaking algorithm displays most similar patients
Get additional evidence from other clinicians

Background
Phenotype: observable characteristics
Human Phenotype Ontology (HPO), Robinson et al.

Exomiser (Robinson et al., 2014)

Objectives
Reproduce Exomiser performance
Expand to new patient similarity domain

Patient Simulation
[Diagram labels: Control Genome, Mutation, HPO Terms, Infected Patient, Disease]
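The simulation on this slide can be sketched roughly as follows. This is a minimal illustration, assuming a patient is built by spiking a disease's causal mutation into a control genome and sampling some of the disease's HPO terms. The data layout, the `simulate_patient` helper, and all gene, variant, and term names are hypothetical placeholders, not the pipeline used in the talk.

```python
import random

# Hypothetical disease annotation: one causal variant plus associated phenotype terms.
DISEASE = {
    "name": "disease_X",
    "causal_variant": ("GENE_X", "variant_x"),
    "hpo_terms": ["term_1", "term_2", "term_3", "term_4"],
}

def simulate_patient(control_variants, disease, n_terms, rng):
    """Spike the causal variant into a control genome and sample phenotype terms."""
    variants = list(control_variants) + [disease["causal_variant"]]
    rng.shuffle(variants)  # do not leak the causal variant through its position
    k = min(n_terms, len(disease["hpo_terms"]))
    terms = rng.sample(disease["hpo_terms"], k=k)
    return {
        "variants": variants,
        "hpo_terms": sorted(terms),
        "truth": disease["causal_variant"],  # kept aside for scoring a ranking
    }

# Placeholder control genome with three benign variants.
control = [("GENE_A", "v1"), ("GENE_B", "v2"), ("GENE_C", "v3")]
patient = simulate_patient(control, DISEASE, n_terms=3, rng=random.Random(0))
```

A prioritization tool can then be scored by where it ranks `patient["truth"]` among `patient["variants"]`.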

Results

Patient Similarity
Phenotypic similarity algorithm
Hypothesis: same disease/causal gene
Combine Exomiser results
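To make the idea of phenotypic similarity concrete, here is a small sketch. It is a stand-in, not the PhenomeCentral algorithm: it scores two patients by the Jaccard overlap of their term sets after adding every ancestor term, so that related but non-identical terms still count as partial matches. The toy ontology and the function names are assumptions made for illustration.

```python
# Hypothetical toy ontology (child -> parents); the real HPO is far larger.
PARENTS = {
    "root": [],
    "abnormal_heart": ["root"],
    "abnormal_eye": ["root"],
    "septal_defect": ["abnormal_heart"],
    "valve_defect": ["abnormal_heart"],
    "cataract": ["abnormal_eye"],
}

def with_ancestors(terms):
    """Return the given terms plus all of their ancestors in the toy ontology."""
    closed, stack = set(), list(terms)
    while stack:
        term = stack.pop()
        if term not in closed:
            closed.add(term)
            stack.extend(PARENTS[term])
    return closed

def similarity(terms_a, terms_b):
    """Jaccard overlap of the ancestor-closed term sets (0.0 to 1.0)."""
    a, b = with_ancestors(terms_a), with_ancestors(terms_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

same_organ = similarity(["septal_defect"], ["valve_defect"])  # share heart + root
different_organ = similarity(["septal_defect"], ["cataract"])  # share only root
```

Under the slide's hypothesis, pairs with the same disease or causal gene should tend to score higher than unrelated pairs, which is what the benchmark tests.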

Patient Pair Simulation
[Diagram labels: Disease; Control Genome A + sampled mutation + sampled HPO terms = Patient 1; Control Genome B + sampled mutation + sampled HPO terms = Patient 2; Phenotypic Noise & Imprecision]
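One possible reading of "Phenotypic Noise & Imprecision" is sketched below. It assumes that imprecision means reporting a more general parent term and that noise means adding terms unrelated to the disease. The toy ontology, the `perturb` and `simulate_pair` helpers, and the default probabilities are all hypothetical, and the real HPO allows several parents per term.

```python
import random

# Hypothetical toy ontology (child -> single parent).
PARENT = {
    "root": None,
    "abnormal_heart": "root",
    "abnormal_eye": "root",
    "septal_defect": "abnormal_heart",
    "cataract": "abnormal_eye",
}

def perturb(terms, rng, p_imprecise=0.3, n_noise=1):
    """Return a noisy, less precise copy of a patient's phenotype terms."""
    out = set()
    for term in terms:
        parent = PARENT[term]
        # Imprecision: report the more general parent term instead.
        if parent is not None and rng.random() < p_imprecise:
            out.add(parent)
        else:
            out.add(term)
    # Noise: add terms that were not sampled from the disease.
    candidates = sorted(t for t in PARENT if t != "root" and t not in out)
    out.update(rng.sample(candidates, k=min(n_noise, len(candidates))))
    return sorted(out)

def simulate_pair(disease_terms, rng, **kwargs):
    """Two patients with the same disease, each perturbed independently."""
    return perturb(disease_terms, rng, **kwargs), perturb(disease_terms, rng, **kwargs)

patient_1, patient_2 = simulate_pair(["septal_defect", "cataract"], random.Random(0))
```

Because each patient is perturbed independently, two patients with the same disease can end up with different term lists, which is what makes the matching problem non-trivial.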

Results (preliminary)

Challenges
Data
More data

Challenges
[Figure: ROC curve for the phenotypic similarity algorithm]

ROC Curves
For a binary classifier
Plots true positive rate (TPR) vs. false positive rate (FPR) for varying threshold values
Curves are often compared by area under the curve (AUC)
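As a worked illustration of the definition above, the sketch below computes ROC points and AUC in plain Python. The scores and labels are made-up example data (think of a similarity score per patient pair, with label 1 when the pair shares a disease), and the helper names are assumptions.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points, one per distinct threshold, strictest first."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Made-up example: three positive and three negative pairs.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
points = roc_points(scores, labels)
area = auc(points)  # 8/9: eight of the nine positive/negative pairs are ordered correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why AUC is a convenient single number for comparing curves.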

Acknowledgements

Questions!