Joachim De Schrijver: Short introduction on 454 sequencing; Variant Identification pipeline; Possibilities of a DB oriented pipeline; Examples

1 Joachim De Schrijver

2
• Short introduction on 454 sequencing
• Variant Identification pipeline
• Possibilities of a DB oriented pipeline
• Examples
  ◦ Coverage
  ◦ Improving PCR
  ◦ Fast Q assessment
  ◦ Homopolymers

3
• Roche/454 GS-FLX sequencing:
  ◦ Pyrosequencing
  ◦ ± 400,000 reads/run
  ◦ Average length: 200-250 bp
• Applications:
  ◦ Resequencing: Variant identification
  ◦ De novo (genome) sequencing: Assembly of new regions, plasmids or entire genomes
• Standard Software:
  ◦ Variants: Amplicon Variant Analyzer (AVA)
  ◦ Assembly: Standard 454 assembler

4
• Standard software
  ◦ + Easy to use
  ◦ + Reproducible results on similar datasets
  ◦ + GUI (graphical user interface)
  ◦ - No answer for ‘non-standard’ questions
    • Methylation experiments
    • Different types of experiments grouped together
    • …
  ◦ - What about ‘hidden’ information?
    • Homopolymer error rates
    • Quality score ~ length of sequenced read
    • ‘Multirun’ information
    • …

5
• Modular and database oriented pipeline
• Modular:
  ◦ Efficient planning
  ◦ Scalable
• Database (DB):
  ◦ No loss of data
  ◦ Grouping several runs together

6
• Basic idea: Data is processed and stored in the DB. Results (reports) are calculated ‘on the fly’ using the DB data.
  ◦ Fast & efficient
  ◦ Calculations only happen once
  ◦ Everybody can access the database without risk of data modification
  ◦ Reporting is independent of the data processing
• Paper: De Schrijver et al. 2009. Analysing 454 sequences with a modular and database oriented Variant Identification Pipeline
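The "store once, report on the fly" idea can be illustrated with a small sketch. This is not the VIP schema: the table name `mapped_read`, its columns, and the helper functions `build_db` and `reads_per` are all assumptions made for illustration, and an in-memory SQLite database stands in for whatever DB system the pipeline actually uses.

```python
import sqlite3

def build_db(rows):
    """Store processed reads once; every report is later derived from this table.

    The schema is hypothetical and only serves the illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mapped_read ("
        " run TEXT, lane INTEGER, mid TEXT, amplicon TEXT,"
        " start_pos INTEGER, end_pos INTEGER)"
    )
    conn.executemany("INSERT INTO mapped_read VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn

def reads_per(conn, column):
    """Report the number of reads per run, lane, MID or amplicon with one query."""
    allowed = {"run", "lane", "mid", "amplicon"}
    if column not in allowed:
        raise ValueError(f"unsupported grouping column: {column}")
    query = (
        f"SELECT {column}, COUNT(*) FROM mapped_read "
        f"GROUP BY {column} ORDER BY {column}"
    )
    return conn.execute(query).fetchall()

# Made-up example data: (run, lane, MID, amplicon, start, end)
rows = [
    ("run1", 1, "MID1", "ampA", 100, 320),
    ("run1", 1, "MID2", "ampA", 100, 310),
    ("run1", 2, "MID1", "ampB", 500, 730),
    ("run2", 1, "MID1", "ampA", 105, 330),
]
conn = build_db(rows)
print(reads_per(conn, "amplicon"))
print(reads_per(conn, "run"))
```

Because the reports are plain read queries, a new grouping (for example per lane) needs no reprocessing of the raw data, and reads from several runs can be combined in the same query.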

7
• VIP originally developed for variant identification
• Now being used in:
  ◦ Amplicon resequencing
  ◦ De novo shotgun
  ◦ Methylation
  ◦ ~ Solexa experiments
• ‘Hidden’ data can be extracted using intelligent querying strategies
• Results per lane/Multiplex MID/run…

8
• Coverage can be calculated per
  ◦ Lane
  ◦ MID
  ◦ Amplicon
  ◦ Base position
• Assessment of errors (PCR dropouts vs. human errors)
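As a rough illustration of coverage per base position, the sketch below counts how many mapped reads overlap each position of a region. The function name `coverage_per_base`, the inclusive `(start, end)` read coordinates, and the example numbers are assumptions; the slides do not describe how the pipeline computes coverage internally.

```python
def coverage_per_base(reads, region_start, region_end):
    """Return the read depth at every position of [region_start, region_end].

    `reads` is a list of (start, end) mapping coordinates, both inclusive.
    A difference array is used so each read costs O(1) instead of O(length).
    """
    length = region_end - region_start + 1
    diff = [0] * (length + 1)
    for start, end in reads:
        # Clip the read to the region; skip reads that do not overlap it.
        s = max(start, region_start)
        e = min(end, region_end)
        if s > e:
            continue
        diff[s - region_start] += 1
        diff[e - region_start + 1] -= 1
    coverage = []
    depth = 0
    for delta in diff[:length]:
        depth += delta
        coverage.append(depth)
    return coverage

# Made-up reads on a 10-base region
reads = [(1, 5), (3, 8), (7, 12)]
print(coverage_per_base(reads, 1, 10))
```

Restricting `reads` to one lane, MID or amplicon before the call gives the coverage at that level; a region whose depth stays at zero would be a candidate for the PCR-dropout versus human-error check mentioned above.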

9
• Amplicon Resequencing experiment
• Goal: Variant identification
• Length distributions
  ◦ Mapped
  ◦ Unmapped
  ◦ ‘Short’ mapped
• Additional length separation + Improved PCR
• Result: Improved efficiency
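The three length distributions could be derived along the following lines. This is only a sketch: the 100 bp cutoff for a ‘short’ mapped read, the 50 bp bin size, and the function names are assumptions, since the slides do not give the actual definitions.

```python
from collections import Counter

def split_by_category(reads, short_cutoff=100):
    """Group read lengths into mapped, 'short' mapped and unmapped.

    `reads` is a list of (length, is_mapped) pairs. The cutoff is a
    hypothetical value chosen for the example.
    """
    groups = {"mapped": [], "short_mapped": [], "unmapped": []}
    for length, is_mapped in reads:
        if not is_mapped:
            groups["unmapped"].append(length)
        elif length < short_cutoff:
            groups["short_mapped"].append(length)
        else:
            groups["mapped"].append(length)
    return groups

def length_histogram(lengths, bin_size=50):
    """Count reads per length bin; keys are the lower edge of each bin."""
    counts = Counter((length // bin_size) * bin_size for length in lengths)
    return dict(sorted(counts.items()))

# Made-up reads: (length in bp, mapped?)
reads = [(230, True), (245, True), (60, True), (40, False), (55, False)]
groups = split_by_category(reads)
for name, lengths in groups.items():
    print(name, length_histogram(lengths))
```

A large share of short unmapped reads in such a histogram is the kind of signal that would motivate an additional length separation step.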

10
• Can the length of a homopolymer be assessed using the Q score?
• Yes, when homopolymer length < 6 bp
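One way to start exploring the relation between homopolymer length and Q score is to locate the homopolymer runs in each read and pool the base qualities by run length. The sketch below does only that; it does not reproduce the analysis behind the slide, and the choice of the mean quality over all bases of a run, like the function names, is an assumption.

```python
from collections import defaultdict
from itertools import groupby

def homopolymer_runs(sequence):
    """Return (base, start_index, run_length) for every run of identical bases."""
    runs = []
    position = 0
    for base, group in groupby(sequence):
        length = len(list(group))
        runs.append((base, position, length))
        position += length
    return runs

def mean_quality_by_run_length(reads):
    """Pool base qualities by homopolymer run length and average them.

    `reads` is a list of (sequence, qualities) pairs with one quality
    value per base. Averaging over all bases of a run is an assumption.
    """
    totals = defaultdict(lambda: [0, 0])  # run length -> [quality sum, base count]
    for sequence, qualities in reads:
        for _base, start, length in homopolymer_runs(sequence):
            totals[length][0] += sum(qualities[start:start + length])
            totals[length][1] += length
    return {length: q_sum / n for length, (q_sum, n) in sorted(totals.items())}

# Made-up read with per-base quality scores
reads = [("AAACGG", [30, 28, 20, 35, 32, 24])]
print(homopolymer_runs("AAACGG"))
print(mean_quality_by_run_length(reads))
```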

11
• Fast assessment of the quality of a run
[Figure panels: "Lab work OK" and "Errors in lab work"]

12
• Biobix – UGent: Wim Van Criekinge, Tim De Meyer, Geert Trooskens, Tom Vandekerkhove, Leander Van Neste, Gerben Mensschaert
• CMG – UZ Gent: Jo Vandesompele, Jan Hellemans, Filip Pattyn, Steve Lefever, Kim Deleeneer, Jean-Pierre Renard
• NXT-GNT: Paul Coucke, Sofie Bekaert, Filip Van Nieuwerburgh, Dieter Deforce, Wim Van Criekinge, Jo Vandesompele

13 Questions? Joachim.deschrijver@ugent.be

