Published by Evelin Whiddon. Modified over 10 years ago.
1
The effect of Next-Generation Sequencing technology on complex trait research
Challenge analysis
Presented by Ladang Auxane, Mombaerts Laurent and Uyttendaele Vincent
December 10, 2013
University of Liège GBIO: Krystel Van Steen, Kyrill Bessonov
2
TABLE OF CONTENTS
Introduction
Challenge analysis
1. Optimizing parameters for study design
2. Storing and handling data
3. Mapping and aligning
4. Variant calling
5. Analyzing low frequency and rare variants
Applications
Discussion
Conclusion
3
INTRODUCTION
What is Next-Generation Sequencing (NGS)?
High throughput
Low cost
Applications
From 1970 until now (F. Sanger)
4
CHALLENGE ANALYSIS
1. Optimizing parameters for study design
Three main parameters:
Cost: the cost per unit of data is high. Sequence only parts of the genome?
Depth of coverage: it determines the power of the study.
Sample selection.
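The link between cost and depth of coverage can be made concrete with the usual average-coverage relation C = N × L / G (number of reads times read length, divided by the size of the targeted region). The sketch below is only an illustration: the function names and the Poisson approximation for uncovered bases are assumptions, not taken from the slides.

```python
import math


def expected_coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Average depth of coverage C = N * L / G."""
    return num_reads * read_length / genome_size


def reads_needed(target_coverage: float, read_length: int, genome_size: int) -> int:
    """Number of reads required to reach a target average coverage."""
    return math.ceil(target_coverage * genome_size / read_length)


def fraction_uncovered(coverage: float) -> float:
    """Poisson approximation (an assumption): chance that a base is hit by no read."""
    return math.exp(-coverage)


if __name__ == "__main__":
    # Hypothetical numbers: 30x coverage of a 3 Gb genome with 100 bp reads.
    print(reads_needed(30, 100, 3_000_000_000))
    # The same read budget on a 60 Mb targeted region gives a much deeper coverage.
    print(expected_coverage(900_000_000, 100, 60_000_000))
```

Sequencing only parts of the genome shrinks G, so the same number of reads buys a higher depth, which is the trade-off behind the first two parameters.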
5
CHALLENGE ANALYSIS
2. Storing and handling data
Two years ago: the concept of NGS was still theoretical.
Today: devices are operational and affordable → raw data are available.
6
CHALLENGE ANALYSIS
Production of raw data
Using a fluorescent-dye DNA sequencer:
Labeling of DNA strands with 4 fluorescent dyes
Separation of fragments by electrophoresis
Monitoring by chromatography
Storage of raw data
One run can provide up to 4 Tb of data → a huge storage capacity is required.
Handling of raw data
Algorithms will be applied for mapping → powerful computing tools are required.
7
CHALLENGE ANALYSIS
3. Mapping and aligning algorithms
De novo assembly: sequencing a genome without the use of a reference genome. Reads are assembled by finding the overlaps between them.
Mapping: building a sequence that is similar to a reference genome. Reads are aligned against the reference, which serves as a backbone.
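To illustrate what "assembled by overlaps" means, here is a toy greedy assembler: it repeatedly merges the two sequences with the longest suffix/prefix overlap. This is a minimal sketch on error-free reads, with hypothetical function names and a hypothetical `min_len` threshold; real assemblers use far more elaborate graph-based methods.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b (0 if below min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Merge the best-overlapping pair until no overlap of at least min_len remains."""
    contigs = list(reads)
    while True:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs.append(merged)


if __name__ == "__main__":
    # Three hypothetical reads that tile a 12-base sequence.
    print(greedy_assemble(["ATGCGT", "CGTACC", "ACCTGA"]))
```

Mapping, by contrast, never compares reads with each other: each read is placed independently on the reference.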
8
CHALLENGE ANALYSIS
Improving speed and efficiency of algorithms to deal with large throughput
Detecting non-unique mapping (reads corresponding to different sequences of the reference genome)
Taking into consideration different base qualities (degrees of certainty)
Using a more accurate reference genome (including individual sequences)
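Non-unique mapping can be shown with a deliberately naive exact-match mapper that reports every position where a read fits the reference. The sketch assumes error-free reads and a tiny made-up reference; production aligners use indexed, mismatch-tolerant search instead, and the function names here are hypothetical.

```python
def find_all(reference: str, read: str) -> list[int]:
    """Return every 0-based position where the read matches the reference exactly."""
    positions = []
    start = reference.find(read)
    while start != -1:
        positions.append(start)
        start = reference.find(read, start + 1)
    return positions


def classify_read(reference: str, read: str) -> tuple[str, list[int]]:
    """Label a read as unmapped, unique or multi-mapped from its hit count."""
    hits = find_all(reference, read)
    if not hits:
        return "unmapped", hits
    if len(hits) == 1:
        return "unique", hits
    return "multi-mapped", hits


if __name__ == "__main__":
    reference = "ACGTTTACGTAA"  # hypothetical reference with a repeated "ACGT"
    for read in ("ACGT", "TTTA", "GGGG"):
        print(read, classify_read(reference, read))
```

A multi-mapped read carries no information about which copy it came from, which is why such reads need to be flagged rather than placed arbitrarily.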
9
CHALLENGE ANALYSIS
4. Variant calling
Distinguishing true variants from sequencing or mapping errors → decrease the number of false positive SNP calls
Detecting misalignment around indels
Indel in the middle of a read: perfect match on either side → the algorithm opens a gap.
Indel at one extremity of the read: the indel is hard to recognize → misalignment of the read → false positive SNP call
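One simple way to separate a true variant from sequencing errors is to ask how likely the observed number of non-reference reads would be if errors alone produced them. The sketch below uses a one-sided binomial test; the per-base `error_rate` of 1% and the `alpha` threshold are illustrative assumptions, and real callers use richer models that include base and mapping qualities.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def looks_like_variant(alt_reads: int, depth: int,
                       error_rate: float = 0.01, alpha: float = 1e-3) -> bool:
    """True if errors alone are unlikely to explain this many alternate reads."""
    return prob_at_least(alt_reads, depth, error_rate) < alpha


if __name__ == "__main__":
    # Hypothetical site covered by 30 reads.
    print(looks_like_variant(1, 30))   # a single odd read is compatible with errors
    print(looks_like_variant(12, 30))  # 12 of 30 reads is hard to explain by errors
```

The test also shows why depth matters: at low depth even a real heterozygous site may not reach the threshold.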
10
CHALLENGE ANALYSIS
Considering different error rates depending on the base location
Nucleobases at the extremities of a read have a higher error rate.
A misalignment there yields a confident but false positive SNP call.
SOLUTION: algorithms that consider a recalibrated “base quality score” and select only the central portion of a read.
Decreasing the number of errors introduced by PCR artefacts
PCR → non-uniform coverage of the reference genome → over-represented reads
SOLUTION: paired-end sequencing libraries to discard clonal reads
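Both solutions can be sketched in a few lines. The example assumes Phred-scaled qualities (error probability 10^(-Q/10)), a hypothetical quality cutoff `min_q`, and read pairs described by a made-up tuple `(chrom, pos_mate1, pos_mate2, mean_quality)`; pairs sharing both mate positions are treated as clonal and only the best one is kept. It is an illustration, not the procedure of any specific tool.

```python
def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score into an error probability."""
    return 10 ** (-q / 10)


def trim_ends(seq: str, quals: list[int], min_q: int = 20) -> tuple[str, list[int]]:
    """Keep the central portion of a read by trimming low-quality bases at both ends."""
    start, end = 0, len(seq)
    while start < end and quals[start] < min_q:
        start += 1
    while end > start and quals[end - 1] < min_q:
        end -= 1
    return seq[start:end], quals[start:end]


def drop_clonal_pairs(pairs: list[tuple[str, int, int, float]]) -> list[tuple[str, int, int, float]]:
    """Keep one pair (the highest mean quality) per (chrom, pos_mate1, pos_mate2)."""
    best: dict[tuple[str, int, int], tuple[str, int, int, float]] = {}
    for pair in pairs:
        key = pair[:3]
        if key not in best or pair[3] > best[key][3]:
            best[key] = pair
    return list(best.values())


if __name__ == "__main__":
    print(phred_to_error_prob(20))
    print(trim_ends("ACGTAC", [5, 30, 35, 32, 10, 8]))
    print(drop_clonal_pairs([("chr1", 100, 350, 30.0),
                             ("chr1", 100, 350, 35.0),
                             ("chr1", 200, 450, 28.0)]))
```

Paired-end data help here because two independent fragments rarely share both end coordinates, whereas PCR copies of one fragment always do.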
11
CHALLENGE ANALYSIS
5. Analyzing low frequency and rare variants
May be a painful step!
Single-point analysis
Low power
Would require hundreds of thousands of individuals
Analysis across sample sets (composite analysis)
Somewhat lighter in terms of computing time and data volume
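The slides do not name a specific composite method. One common reading is a collapsing (burden) test: instead of testing each rare variant on its own, individuals are scored as carriers of any rare allele in a region, and carrier counts are compared between cases and controls. The sketch below makes that assumption and uses a one-sided Fisher exact test written with the standard library; the genotype layout and function names are hypothetical.

```python
from math import comb


def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """P(at least a carriers among cases) for the 2x2 table [[a, b], [c, d]].

    a, b = carrier / non-carrier cases; c, d = carrier / non-carrier controls.
    """
    cases = a + b
    carriers = a + c
    total = a + b + c + d
    upper = min(cases, carriers)
    favourable = sum(comb(cases, x) * comb(total - cases, carriers - x)
                     for x in range(a, upper + 1))
    return favourable / comb(total, carriers)


def burden_test(case_genotypes: list[list[int]], control_genotypes: list[list[int]]) -> float:
    """Collapse rare variants per individual, then test carriers in cases vs controls."""
    case_carriers = sum(1 for g in case_genotypes if any(g))
    control_carriers = sum(1 for g in control_genotypes if any(g))
    return fisher_one_sided(case_carriers, len(case_genotypes) - case_carriers,
                            control_carriers, len(control_genotypes) - control_carriers)


if __name__ == "__main__":
    # Hypothetical data: rows are individuals, columns are rare variants in one region.
    cases = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 0]]
    controls = [[0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0]]
    print(burden_test(cases, controls))
```

In this toy data each variant is seen only once, so a single-point test has nothing to work with, while the collapsed carrier count (3 of 4 cases against 0 of 4 controls) carries a signal.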
12
APPLICATIONS
The number of scientific publications has exploded!
13
DISCUSSION
Development of new study designs
Development of more effective methods to distinguish errors from low frequency and rare variants
Development of the most appropriate strategy to identify a given disease
14
DISCUSSION
Cost-benefit analysis
Whole genome sequencing is unlikely to be cost effective as it still presents huge challenges.
→ couple a reduction of the costs with an increase in efficiency and accuracy.
→ make NGS platforms marketable, competitive and usable for clinical applications.
15
DISCUSSION
Validation analysis
Standards for NGS clinical genomics are required, for instance to validate the accuracy of a test.
→ important downstream impact on patient diagnosis and management.
16
DISCUSSION
Current knowledge and research
Lack of knowledge about:
what a SNP implies
how to detect interactions between genes
what influence gene expression has
how the variance in the genome must be interpreted
…
The more tests we run, the more knowledge we gain, and the more associations between phenotype and genome we can establish.
17
CONCLUSION
Multiple issues
Study design
Error handling
Data interpretation
Enable a wide variety of applications
18
REFERENCES
A G Day-Williams, E Zeggini, The effect of Next-Generation Sequencing technology on complex trait research, Eur J Clin Invest 2011, Vol 41 :
[online on 7th December 2013]
[online on 9th December 2013]
Figures