Challenge 4 – Moving beyond coding regions

Slides:



Advertisements
Similar presentations
Why this paper Causal genetic variants at loci contributing to complex phenotypes unknown Rat/mice model organisms in physiology and diseases Relevant.
Advertisements

Targeted Data Introduction  Many mapping, alignment and variant calling algorithms  Most of these have been developed for whole genome sequencing and.
Computational Tools for Finding and Interpreting Genetic Variations Gabor T. Marth Department of Biology, Boston College
Considerations for Analyzing Targeted NGS Data Exome Tim Hague, CTO.
Presented by Karen Xu. Introduction Cancer is commonly referred to as the “disease of the genes” Cancer may be favored by genetic predisposition, but.
Considerations for Analyzing Targeted NGS Data Introduction Tim Hague, CTO.
Epigenome 1. 2 Background: GWAS Genome-Wide Association Studies 3.
The medical relevance of genome variability Gabor T. Marth, D.Sc. Department of Biology, Boston College
MES Genome Informatics I - Lecture VIII. Interpreting variants Sangwoo Kim, Ph.D. Assistant Professor, Severance Biomedical Research Institute,
Computational research for medical discovery at Boston College Biology Gabor T. Marth Boston College Department of Biology
Genetics-multistep tumorigenesis genomic integrity & cancer Sections from Weinberg’s ‘the biology of Cancer’ Cancer genetics and genomics Selected.
The medical relevance of genome variability Gabor T. Marth, D.Sc. Department of Biology, Boston College Medical Genomics Course – Debrecen,
CS177 Lecture 10 SNPs and Human Genetic Variation
Next-Generation Sequencing Eric Jorgenson Epidemiology 217 2/28/12.
Considerations for Analyzing Targeted NGS Data Exome Tim Hague, CTO.
LARVA: An integrative framework for Large-scale Analysis of Recurrent Variants in noncoding Annotations M Gerstein, Yale Slides freely downloadable from.
Cancer genomics Yao Fu March 4, Cancer is a genetic disease In the early 1970’s, Janet Rowley’s microscopy studies of leukemia cell chromosomes.
Recombination breakpoints Family Inheritance Me vs. my brother My dad (my Y)Mom’s dad (uncle’s Y) Human ancestry Disease risk Genomics: Regions  mechanisms.
Do not reproduce without permission 1 Gerstein.info/talks (c) (c) Mark Gerstein, 2002, Yale, bioinfo.mbb.yale.edu Gerstein Lab Aims in ModENCODE.
INTERPRETING GENETIC MUTATIONAL DATA FOR CLINICAL ONCOLOGY Ben Ho Park, M.D., Ph.D. Associate Professor of Oncology Johns Hopkins University May 2014.
Recent Advances in Genomic Science Julian Sampson Institute of Medical Genetics, Cardiff.
1 Finding disease genes: A challenge for Medicine, Mathematics and Computer Science Andrew Collins, Professor of Genetic Epidemiology and Bioinformatics.
From Reads to Results Exome-seq analysis at CCBR
Million Veteran Program: Industry Day Genomic Data Processing and Storage Saiju Pyarajan, PhD and Philip Tsao, PhD Million Veteran Program: Industry Day.
Genetics Journal Club Sumeet A. Khetarpal 10 December 2015.
Interpreting exomes and genomes: a beginner’s guide
Single Nucleotide Polymorphisms (SNPs
Pff disease education webinar series
Sungkyunkwan University, School of Medicine.
Genomic Analysis: GWAS
Ø Novel approaches for linkage mapping in dairy cattle
GraDe-SVM: Graph-Diffused Classification for the Analysis of Somatic Mutations in Cancer Morteza H.Chalabi, Fabio Vandin Hello.
Timing, rates and spectra of human germline mutation
Disease risk prediction
National Human Genome Research Institute
Genetic Testing for the Clinician
Functional Mapping and Annotation of GWAS: FUMA
Introduction to Direct-to-Consumer Genetic Testing
Senior Director of Genomics and Content
Thoughts related to integration of genomic information with clinical phenotypes & issues related to data privacy Mark Gerstein, Yale.
Detection of genes causing Fibromyalgia
University of Pittsburgh
POLYMORPHISMS & ASSOCIATION TESTS
Content and Labeling of Tests Marketed as Clinical “Whole-Exome Sequencing” Perspectives from a cancer genetics clinician and clinical lab director Allen.
Genetics and Genomics 5a. Integrative Genomics
Figure 1 The genomic nephrology workflow: genetic diagnosis and clinical application Figure 1 |The genomic nephrology workflow: genetic diagnosis and clinical.
Linking Genetic Variation to Important Phenotypes
Volume 67, Issue 4, Pages (April 2015)
Genetics and genomics of psychiatric disease
GersteinLab.org Overview
Revisiting the Thrifty Gene Hypothesis via 65 Loci Associated with Susceptibility to Type 2 Diabetes  Qasim Ayub, Loukas Moutsianas, Yuan Chen, Kalliope.
Pierre Nahon, Jessica Zucman-Rossi  Journal of Hepatology 
Medical genomics BI420 Department of Biology, Boston College
Genetics of Human Cardiovascular Disease
BF528 - Genomic Variation and SNP Analysis
Medical genomics BI420 Department of Biology, Boston College
An Expanded View of Complex Traits: From Polygenic to Omnigenic
BF528 - Whole Genome Sequencing and Genomic Variation
GWAS-eQTL signal colocalisation methods
Session 3: Coverage and Reimbursement for Genetic Testing
The principles of genetic association
Analysis of protein-coding genetic variation in 60,706 humans
Pan-cancer genome and transcriptome analyses of 1,699 paediatric leukaemias and solid tumours By: Anh Pham.
Somatic mutational rates and survival analysis of IBD-CRC.
Results from a GWAS of prostate cancer in the KP population (8,399 cases and 38,745 controls), highlighting key chromosomal regions. Results from a GWAS.
Comparison of variant associations from previous reports and results from the KP GWAS meta-analysis for 105 known prostate cancer risk SNPs. Plotted values.
The long tail of mutational hotspots in cancer.
Volume 76, Issue 6, Pages (December 2012)
High genomic fidelity of SCLC PDX models derived from both CTCs and biopsies. High genomic fidelity of SCLC PDX models derived from both CTCs and biopsies.
Presentation transcript:

Challenge 4 – Moving beyond coding regions Mark Gerstein Yale CMG

Moving beyond coding CMG projects have been mostly WES, with a few incorporating WGS (incl. Dubowitz & unsolved cases). Compared to WES, interpreting variants in non-coding regions is challenging. 3 things to consider in this regard (annotation qual., differential impact, variant qual.) …

Things to consider in moving beyond coding #1: Quality & scale of coding v. non-coding annotation & the impact of this on statistical power ENCODE has developed non-coding annotations & a number of tools have been developed to synthesize these (eg HaploReg, FunSeq, &c) Compared to coding regions, the underlying functional territory of non-coding regions is not as well defined nor is the differential effect of different mutations This creates power issues in non-coding variant prioritization. More precise (ie more compact) annotation may be useful. Also, integration of tissue-specific annotations & epigenetic data is important for deciphering impact of non-coding variants [Nature 547: 40]

Things we need to consider in moving from coding to non-coding #2: Most of the high-impact variants found so far tend to occur in coding regions (lessons from cancer genomics) Somatic coding driver vs non-coding passenger as an example of extreme dichotomy. Or is this a function of ascertainment ? Despite 1000s of WGS call sets, very few non-coding drivers have been found in cancer genomics [Rhienbay et al bioRxiv ’17; Khurana et al NRG ’16] In general (ie for CMG), do high-impact variants tend to occur in coding regions & “softer” regulatory ones, in non-coding regions? high Mendelian risk variants found from family studies Drivers found from cohort-level recurrence Common variants found in GWAS Effect Size Common SNPs w/o clinical utility Passenger mutations VUS in Mendelian studies low [Adapted from Thomas et al., Lancet ('15)]

Things we need to consider in moving from coding to non-coding #3: Variant calls (even coding ones) from WGS maybe more informative & accurate WGS can detect full spectrum of variants including SNPs, INDELs, & SVs. SVs, in particular, are harder to interpret just in terms of exomes [Yang et al. AJHG ‘15]. Accuracy of mapping can be better (even to coding), esp. w/ regard to repeats & pseudogenes [Zhang et al. PLOS Comp. Bio. ‘17]. Potentially better uniformity in coverage may lead to better accuracy in coding variants (& handling of mosaicism) [Belkadi et al. PNAS. ’15]. WGS also makes possible more precise references for mapping – ie individual-specific, personal dipoloid genomes & population specific references [Chaisson et al. NRG ’15; Rozowsky et al. MSB ‘11] .