EXOME SEQUENCING AND COMPLEX DISEASE: practical aspects of rare variant association studies
Alice Bouchoms, Amaury Vanvinckenroye, Maxime Legrand



WHAT IS EXOME SEQUENCING?
Exon: coding sequence of the DNA.
Exome sequencing: aims to sequence the coding part of the DNA, i.e. the exons.

INTRODUCTION
GWAS: helped discover common coding variants.
Exome sequencing: also finds rare coding variants. Faster, and better for large samples ( > individuals).
Before 2010: only a few publications on PubMed.
Now: more than 2000 publications on PubMed.

KEY QUESTIONS TO ASK YOURSELF

STUDY DESIGN
State the objectives.
Focus on extreme outcomes: unusual phenotypes or traits.
But be careful with de novo mutations.
Geographical restrictions?

STUDY DESIGN
Sequencing strategy?
Quality of the sample: 20x or greater level of coverage.
Depth of sequencing per person: 60x or greater.
Non-coding regions can still be useful: to determine ancestries or estimate genotypes, 0.2x to 2x.

VARIANT CALLING
Goal: obtain high-quality genotypes.
Several steps:
DNA contamination, DNA fingerprints, good follow-up?
Alignment with the reference genome, calibration of base quality scores, removal of duplicate reads.

VARIANT CALLING
After read mapping:
Sample quality metrics (spotting of outlier properties).
Variant calling: look for differences where overlaps appear in the alignment with the reference genome.

VARIANT CALLING
Machine-learning-based classifier: polymorphic variants vs. artifacts.
Evaluate metrics: true / false positives.
Quality metrics on samples.
Recommendation: minimum depth of coverage of 20x.
Development of standards for storing sequence data and variant calls.
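The 20x recommendation can be applied as a simple filter on variant calls. The sketch below is only an illustration: the record layout (a dictionary with "variant" and "depth" keys) and the function name are assumptions, not part of the presentation or of any particular variant caller.

```python
MIN_DEPTH = 20  # recommended minimum depth of coverage quoted on the slide


def filter_calls_by_depth(calls, min_depth=MIN_DEPTH):
    """Keep only the variant calls whose read depth reaches min_depth."""
    return [call for call in calls if call["depth"] >= min_depth]


# Hypothetical calls; the variant names and depths are made up for illustration.
calls = [
    {"variant": "var1", "depth": 35},
    {"variant": "var2", "depth": 12},
    {"variant": "var3", "depth": 20},
]

kept = filter_calls_by_depth(calls)
print([call["variant"] for call in kept])  # ['var1', 'var3']
```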

ASSOCIATION ANALYSIS
Goal: find the functional effects of variants.
Score: indicates the effect on protein function.
Separation between variants with high damage and the others.
If there are multiple annotations, 3 ways:
Focus on the longest transcript.
Focus on the most deleterious effect.
Focus on the canonical transcript.
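The three ways of resolving multiple annotations can be sketched as three small selection functions. Everything here is hypothetical: the annotation fields, the transcript names, and above all the severity ranking, which is an assumed ordering chosen for illustration and not one given in the presentation.

```python
# Assumed severity ranking (higher = more damaging); illustrative only.
SEVERITY = {"synonymous": 0, "missense": 1, "splice": 2, "stop": 3}


def longest_transcript(annotations):
    """Pick the annotation on the longest transcript."""
    return max(annotations, key=lambda a: a["length"])


def most_deleterious(annotations):
    """Pick the annotation with the most damaging predicted effect."""
    return max(annotations, key=lambda a: SEVERITY[a["effect"]])


def canonical_transcript(annotations):
    """Pick the annotation on the canonical transcript, or None if absent."""
    return next((a for a in annotations if a["canonical"]), None)


# Hypothetical annotations of one variant on three transcripts.
annotations = [
    {"transcript": "tx1", "effect": "missense", "length": 2100, "canonical": True},
    {"transcript": "tx2", "effect": "stop", "length": 1500, "canonical": False},
    {"transcript": "tx3", "effect": "synonymous", "length": 3000, "canonical": False},
]

print(longest_transcript(annotations)["transcript"])    # tx3
print(most_deleterious(annotations)["transcript"])      # tx2
print(canonical_transcript(annotations)["transcript"])  # tx1
```

The same variant ends up with three different annotations here, which is why the choice of rule should be fixed before the analysis.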

ASSOCIATION ANALYSIS
Single-variant association test.
Check of data quality.
Usual way of processing rare variants: gather them in groups acting on the same gene to do the analysis.

ASSOCIATION ANALYSIS
2 methods for processing groups:
Comparison of the number of variants between cases and controls.
Comparison with chance expectations.
Recommendation: at least one test of each category, with different thresholds.
If no threshold is available, use a variety of frequency cut-offs.
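The first method, comparing the number of rare variants between cases and controls under several frequency cut-offs, can be sketched as follows. This is a minimal illustration under assumed data structures (per-person minor-allele counts, a case flag, and per-variant allele frequencies); all values and names are made up, and it is not the implementation of any specific package.

```python
def rare_allele_burden(genotypes, is_case, allele_freq, max_freq):
    """Sum minor-allele counts over variants with frequency <= max_freq,
    separately for cases and controls."""
    rare = [v for v, freq in allele_freq.items() if freq <= max_freq]
    case_total = control_total = 0
    for person, counts in genotypes.items():
        burden = sum(counts.get(v, 0) for v in rare)
        if is_case[person]:
            case_total += burden
        else:
            control_total += burden
    return case_total, control_total


# Hypothetical variants in one gene and four individuals.
allele_freq = {"v1": 0.002, "v2": 0.008, "v3": 0.04}
genotypes = {
    "p1": {"v1": 1},
    "p2": {"v2": 1, "v3": 1},
    "p3": {"v3": 1},
    "p4": {},
}
is_case = {"p1": True, "p2": True, "p3": False, "p4": False}

# Repeat the comparison over a variety of frequency cut-offs.
for cutoff in (0.005, 0.01, 0.05):
    print(cutoff, rare_allele_burden(genotypes, is_case, allele_freq, cutoff))
```

The result changes with the cut-off, which is the reason for trying several thresholds rather than a single one.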

ASSOCIATION ANALYSIS
Packages are available to perform the tests with subsets of the data.
Example:
1. missense, splice and stop-altering variants
2. subset of deleterious variants
3. splice and stop-altering variants

ASSOCIATION ANALYSIS
There is no optimal choice for the analysis, because variants and their characteristics vary between genes.
Permutation-based approaches.
Statistical significance: if no permutation-based threshold, p values ≤
QQ plots to summarize the results.
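A permutation-based approach can be sketched by shuffling the case/control labels and recomputing a test statistic each time. The statistic used here (difference in mean rare-variant burden), the number of permutations, and the data are all assumptions made for illustration; the presentation does not specify them.

```python
import random


def mean_difference(burdens, labels):
    """Mean burden in cases (label 1) minus mean burden in controls (label 0)."""
    cases = [b for b, l in zip(burdens, labels) if l == 1]
    controls = [b for b, l in zip(burdens, labels) if l == 0]
    return sum(cases) / len(cases) - sum(controls) / len(controls)


def permutation_p_value(burdens, labels, n_permutations=999, seed=0):
    """One-sided empirical p value obtained by shuffling the labels."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = mean_difference(burdens, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if mean_difference(burdens, shuffled) >= observed:
            hits += 1
    # Add 1 to both counts so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical per-person burdens for one gene: four cases, four controls.
burdens = [3, 2, 3, 2, 0, 0, 1, 0]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

p_value = permutation_p_value(burdens, labels)
print(p_value)
```

Shuffling labels keeps the structure of the variants intact while breaking any link with disease status, so the shuffled statistics describe what chance alone would produce.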

APPROACHES FOR FOLLOW-UP
To demonstrate an association based on the analysed samples, additional samples are needed.

APPROACHES FOR FOLLOW-UP
Exome chip experiments examine most of the variants, but are not very sensitive in non-European populations.

APPROACHES FOR FOLLOW-UP
Statistical imputation: take the base which has the highest correlation with the missing one, and assume it is the same allele as T (i.e. minor or major).
But again, this is often not possible for mixed populations.
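The imputation idea described above can be sketched as a single-marker rule: in a reference panel, find the marker most correlated with the missing one, then copy its minor/major status. This is a deliberately simplified illustration, not how production imputation software works; the panel, the marker names, and the 0 = major / 1 = minor coding are assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def best_tag(reference, target):
    """Marker in the reference panel most correlated with the target marker."""
    others = [m for m in reference if m != target]
    return max(others, key=lambda m: pearson(reference[m], reference[target]))


def impute(observed, reference, target):
    """Assume the missing marker has the same status (minor/major) as its best tag."""
    return observed[best_tag(reference, target)]


# Hypothetical reference panel: one list of alleles per marker (0 = major, 1 = minor).
reference = {
    "target": [0, 0, 1, 1, 0, 1],
    "snpA": [0, 0, 1, 1, 0, 0],
    "snpB": [1, 0, 0, 1, 0, 1],
}
# A study sample typed at snpA and snpB but missing "target".
observed = {"snpA": 1, "snpB": 0}

print(best_tag(reference, "target"))          # snpA
print(impute(observed, reference, "target"))  # 1
```

The rule only works when the correlation seen in the reference panel also holds in the study sample, which is one way to read the caveat about mixed populations.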

ROLE OF FUNCTIONAL ASSAYS
Study the changes in the proteins due to coding variants.
Study why these changes result in diverse diseases.

FORWARD GENETICS
Another approach to study functional variants:
First look at which proteins show changes.
Then search the DNA sequence for the variant(s).

DISCUSSION
In other articles:
More care about sample quality.
Gain of sensitivity in variant calls if they are made across several samples.
Indels in variant calls are the major source of false positives. An alignment algorithm which allows gapped alignment is needed.
Check the association results in databases.

DISCUSSION
Because of costs, exome sequencing studies focus on the coding part of the genome.
Thus they are not suitable for non-exonic sequence (structural variants, chromosomal rearrangements).
These problems will be partially solved by the drop in sequencing costs.
