Analysis of protein-coding genetic variation in 60,706 humans

Introduction: The Exome Aggregation Consortium (ExAC) generated exome DNA sequence data for 60,706 individuals of diverse ancestries. The results were used to study the role of different genes in pathogenicity (Mendelian disease) and the patterns of mutation.

The ExAC data set: Sequencing data from over 91,000 exomes were processed initially; after filtering, the final data set comprised 60,706 individuals. PCA was performed to identify the geographic ancestry of each ExAC individual. Population clusters corresponding to individuals of European, African, South Asian, East Asian, and Latin American ancestry were identified.

The size and diversity of public reference exome data sets: ExAC exceeds previous data sets in size for all studied populations.

Principal component analysis (PCA) was performed to divide ExAC individuals into 5 continental populations: European, African, South Asian, East Asian, and Latin American. The apparent separation between East Asian and other samples reflects a deficiency of Middle Eastern and Central Asian samples in the data set.
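As a rough illustration of how PCA can separate individuals by ancestry, the sketch below computes first-principal-component scores for a toy genotype matrix (rows are individuals, columns are variants, entries are counts of the alternate allele). The data, the function names, and the use of power iteration are assumptions made for this example; the slides do not describe the actual ExAC implementation.

```python
import math
import random


def center_columns(genotypes):
    """Subtract each variant's mean genotype so every column has mean zero."""
    n = len(genotypes)
    m = len(genotypes[0])
    means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    return [[row[j] - means[j] for j in range(m)] for row in genotypes]


def first_pc_scores(genotypes, iterations=200, seed=0):
    """Return a unit vector proportional to each individual's PC1 score.

    Uses power iteration on the individual-by-individual Gram matrix of the
    centered genotypes; its top eigenvector is proportional to the PC1 scores.
    """
    x = center_columns(genotypes)
    n = len(x)
    gram = [[sum(a * b for a, b in zip(x[i], x[k])) for k in range(n)]
            for i in range(n)]
    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for _ in range(iterations):
        w = [sum(gram[i][k] * v[k] for k in range(n)) for i in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v


# Hypothetical toy data: the first three individuals share one allele
# pattern, the last three share another.
toy_genotypes = [
    [0, 0, 1, 2],
    [0, 1, 1, 2],
    [0, 0, 2, 2],
    [2, 2, 0, 0],
    [2, 1, 0, 0],
    [2, 2, 1, 0],
]
scores = first_pc_scores(toy_genotypes)
```

In this toy case the two groups receive scores of opposite sign on the first component, which is the kind of clustering used to assign individuals to populations. Real analyses use many more variants and several components.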

Analysis of ExAC allele frequencies reveals that the majority of genetic variants are rare and novel; most alleles are low-frequency. The observed frequency distribution depends on factors such as mutational properties and selective pressure.
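A minimal sketch of how an allele frequency spectrum can be tabulated from allele counts (AC) and allele numbers (AN, the number of chromosomes genotyped at the site). The variant counts and the bin boundaries below are made up for illustration; only the 121,412 figure follows from the slides, as twice the 60,706 individuals.

```python
from collections import Counter


def allele_frequency(allele_count, allele_number):
    """Allele frequency = alternate allele count / chromosomes genotyped."""
    if allele_number <= 0:
        raise ValueError("allele_number must be positive")
    return allele_count / allele_number


def frequency_bin(allele_count, allele_number):
    """Assign a variant to a coarse frequency bin (bins are an assumption)."""
    if allele_count == 1:
        return "singleton"
    af = allele_frequency(allele_count, allele_number)
    if af < 0.001:
        return "<0.1%"
    if af < 0.01:
        return "0.1-1%"
    if af < 0.05:
        return "1-5%"
    return ">=5%"


# Hypothetical (AC, AN) pairs; 121,412 = 2 x 60,706 chromosomes.
toy_variants = [
    (1, 121412),
    (1, 121000),
    (3, 121412),
    (500, 121412),
    (2000, 121412),
    (40000, 121412),
]
spectrum = Counter(frequency_bin(ac, an) for ac, an in toy_variants)
```

With real data, most variants would fall into the singleton and lowest-frequency bins, which is the pattern the slide describes.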

The proportion of possible variation observed by mutational context and functional class: over half of all possible CpG transitions are observed. A similar pattern is seen across the three functional classes (synonymous, missense, and nonsense), with lower proportions for missense and nonsense variants due to selective pressures.

The number and frequency distribution of indels by size: compared to in-frame indels, frameshift variants are less

Filtering for Mendelian variant discovery: ExAC improves variant interpretation in rare disease. Its value lies in its use as a reference data set for clinical sequencing approaches.
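The filtering idea can be sketched as follows: candidate variants from a patient are kept only if their allele frequency in a reference panel is at or below a chosen threshold. The variant identifiers, the frequencies, and the choice to treat variants absent from the reference as frequency 0 are all assumptions for this example, not details taken from the slides.

```python
def filter_rare_candidates(candidates, reference_af, max_af=0.001):
    """Keep candidates whose reference allele frequency is at most max_af.

    Variants missing from the reference panel are treated as frequency 0
    (an assumption of this sketch), so they are always kept.
    """
    kept = []
    for variant in candidates:
        af = reference_af.get(variant, 0.0)
        if af <= max_af:
            kept.append(variant)
    return kept


# Hypothetical patient variants and reference panel frequencies.
patient_variants = ["var_A", "var_B", "var_C", "var_D"]
panel_frequencies = {"var_A": 0.12, "var_B": 0.0004, "var_C": 0.02}

rare_candidates = filter_rare_candidates(patient_variants, panel_frequencies)
```

Here `var_A` and `var_C` are too common to plausibly cause a severe Mendelian disease and are dropped, while `var_B` (rare) and `var_D` (never seen in the panel) remain as candidates.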

Exome Sequencing Project (ESP) is not well-powered to filter at 0.1%: estimates of allele frequency in Europeans based on ESP are more precise at higher allele frequencies.
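One way to see why a small reference panel struggles at a 0.1% threshold is to compare confidence intervals for the same underlying frequency at two panel sizes. The sketch below uses the Wilson score interval, which is a choice made here for illustration and not a method described in the slides. The 10,000-chromosome panel is a hypothetical smaller reference, not the actual ESP size; 121,412 is twice the 60,706 ExAC individuals.

```python
import math


def wilson_interval(allele_count, allele_number, z=1.96):
    """Approximate 95% Wilson score interval for an allele frequency."""
    if allele_number <= 0:
        raise ValueError("allele_number must be positive")
    p_hat = allele_count / allele_number
    z2 = z * z
    denom = 1.0 + z2 / allele_number
    center = (p_hat + z2 / (2.0 * allele_number)) / denom
    half_width = (
        z
        * math.sqrt(
            p_hat * (1.0 - p_hat) / allele_number
            + z2 / (4.0 * allele_number ** 2)
        )
        / denom
    )
    return max(0.0, center - half_width), min(1.0, center + half_width)


# A variant near 0.1% frequency seen in two panels of different size.
small_panel = wilson_interval(allele_count=10, allele_number=10_000)    # hypothetical panel
large_panel = wilson_interval(allele_count=121, allele_number=121_412)  # 2 x 60,706

small_width = small_panel[1] - small_panel[0]
large_width = large_panel[1] - large_panel[0]
```

The interval from the larger panel is several times narrower, so a variant can be placed above or below a 0.1% cutoff with much more confidence.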

Allele frequency of disease-causing variants in the Human Gene Mutation Database (HGMD) and/or ClinVar for well-characterized autosomal dominant and autosomal recessive disease genes. As the ExAC allele frequency increases, the number of such variants decreases for both autosomal dominant and autosomal recessive genes.

Effect of rare protein-truncating variants: The distribution of protein-truncating variants (PTVs) was analyzed. PTVs are variants that disrupt a gene through the introduction of a stop codon, a frameshift, or the disruption of a splice site.

The average ExAC individual has 85 heterozygous and 35 homozygous PTVs, of which 18.5 and 0.19, respectively, are rare.

Breakdown of PTVs per individual: across all populations, most PTVs found in a given individual are common (>5% allele frequency).

Number of genes with at least one PTV, or at least one homozygous PTV, across all populations: the discovery of both heterozygous and homozygous PTVs scales very differently across human populations.

Discussion: The large number of individuals provides high resolution for the analysis of low-frequency protein-coding variants in human populations. The ExAC resource is the largest database to date for estimating the allele frequency of protein-coding genetic variants. It provides a powerful filter for the analysis of candidate pathogenic variants in severe Mendelian diseases. In contrast to ESP, which is not well-powered to filter at allele frequencies of 0.1%, ExAC provides improved power for Mendelian analyses.

Different populations contribute to the discovery of gene-disrupting PTVs, providing guidance for the understanding of gene function. Common PTV variation is investigated. The discovery of homozygous PTVs is markedly enhanced in the South Asian samples.

Limitations? Most ExAC individuals were ascertained for biomedically important disease, although severe paediatric diseases were excluded. The inclusion of both cases and controls for several polygenic disorders means that ExAC certainly contains disease-associated variants. The inclusion of whole genomes will also be critical for investigating additional classes of functional variants and for identifying non-coding constrained regions. Detailed phenotype data are unavailable for the vast majority of ExAC samples.