Joachim De Schrijver

• Short introduction on 454 sequencing
• Variant Identification pipeline
• Possibilities of a DB oriented pipeline
• Examples
  ◦ Coverage
  ◦ Improving PCR
  ◦ Fast Q assessment
  ◦ Homopolymers

• Roche/454 GS-FLX sequencing:
  ◦ Pyrosequencing
  ◦ ± 400,000 reads/run
  ◦ Average length: 200-250 bp
• Applications:
  ◦ Resequencing: variant identification
  ◦ De novo (genome) sequencing: assembly of new regions, plasmids or entire genomes
• Standard software:
  ◦ Variants: Amplicon Variant Analyzer (AVA)
  ◦ Assembly: standard 454 assembler

• Standard software
  ◦ + Easy to use
  ◦ + Reproducible results on similar datasets
  ◦ + GUI (graphical user interface)
  ◦ - No answer for ‘non-standard’ questions
    • Methylation experiments
    • Different types of experiments grouped together
    • …
  ◦ - What about ‘hidden’ information?
    • Homopolymer error rates
    • Quality score ~ length of sequenced read
    • ‘Multirun’ information
    • …

• Modular and database oriented pipeline
• Modular:
  ◦ Efficient planning
  ◦ Scalable
• Database (DB):
  ◦ No loss of data
  ◦ Grouping several runs together

• Basic idea: data is processed and stored in the DB. Results (reports) are calculated ‘on the fly’ using the DB data.
  ◦ Fast & efficient
  ◦ Calculations only happen once
  ◦ Everybody can access the database without risk of data modification
  ◦ Reporting is independent of the data processing
• Paper: De Schrijver et al. 2009. Analysing 454 sequences with a modular and database oriented Variant Identification Pipeline
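The split between a one-time processing step and on-demand reporting can be sketched as follows. This is only an illustration: the slides do not describe VIP's actual schema or database engine, so the `mapped_read` table, its columns, the sample rows, and the use of SQLite are all assumptions.

```python
import sqlite3

# Hypothetical schema: the slides do not describe VIP's actual tables.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE mapped_read (
        read_id   TEXT PRIMARY KEY,
        run       TEXT NOT NULL,
        lane      INTEGER NOT NULL,
        mid       TEXT NOT NULL,
        amplicon  TEXT,              -- NULL when the read did not map
        length    INTEGER NOT NULL
    )
    """
)

# Processing step: happens once per read, and the result is stored.
# The rows below are invented sample data.
processed = [
    ("r1", "run01", 1, "MID1", "ampA", 231),
    ("r2", "run01", 1, "MID1", "ampA", 228),
    ("r3", "run01", 1, "MID2", "ampB", 240),
    ("r4", "run02", 2, "MID1", "ampA", 119),
    ("r5", "run02", 2, "MID2", None, 64),
]
conn.executemany("INSERT INTO mapped_read VALUES (?, ?, ?, ?, ?, ?)", processed)
conn.commit()


def reads_per_amplicon(connection):
    """Reporting step: a read-only query evaluated on demand."""
    rows = connection.execute(
        """
        SELECT amplicon, COUNT(*) AS n_reads, COUNT(DISTINCT run) AS n_runs
        FROM mapped_read
        WHERE amplicon IS NOT NULL
        GROUP BY amplicon
        ORDER BY amplicon
        """
    )
    return rows.fetchall()


report = reads_per_amplicon(conn)
```

Because the report is a plain `SELECT`, it groups reads from several runs without touching the stored data, and new report types can be added without re-running the processing step.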

• VIP originally developed for variant identification
• Now being used in:
  ◦ Amplicon resequencing
  ◦ De novo shotgun
  ◦ Methylation
  ◦ ~ Solexa experiments
• ‘Hidden’ data can be extracted using intelligent querying strategies
• Results per lane/Multiplex MID/run…

• Coverage can be calculated per
  ◦ Lane
  ◦ MID
  ◦ Amplicon
  ◦ Base position
• Assessment of errors (PCR dropouts vs. human errors)
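The four coverage levels listed above can be illustrated with a small sketch. The record layout (lane, MID, amplicon, 0-based start, exclusive end) and the sample alignments are assumptions made for this example, not the pipeline's real data model.

```python
from collections import Counter, defaultdict

# Hypothetical alignment records: (lane, mid, amplicon, start, end),
# with a 0-based start and an exclusive end on the amplicon reference.
alignments = [
    (1, "MID1", "ampA", 0, 5),
    (1, "MID1", "ampA", 2, 6),
    (1, "MID2", "ampA", 0, 3),
    (2, "MID1", "ampB", 1, 4),
]


def coverage_by(records, key):
    """Count reads per group; `key` picks the grouping value from a record."""
    return Counter(key(rec) for rec in records)


def per_base_coverage(records):
    """Return {amplicon: Counter({position: depth})} for covered positions."""
    depth = defaultdict(Counter)
    for _lane, _mid, amplicon, start, end in records:
        for pos in range(start, end):
            depth[amplicon][pos] += 1
    return depth


per_lane = coverage_by(alignments, key=lambda rec: rec[0])
per_mid = coverage_by(alignments, key=lambda rec: rec[1])
per_amplicon = coverage_by(alignments, key=lambda rec: rec[2])
per_base = per_base_coverage(alignments)
```

A group or position that is missing from these counters has zero coverage, which is the kind of gap one would inspect when separating PCR dropouts from handling errors.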

• Amplicon resequencing experiment
• Goal: variant identification
• Length distributions
  ◦ Mapped
  ◦ Unmapped
  ◦ ‘Short’ mapped
• Additional length separation + improved PCR
• Result: improved efficiency
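A length distribution per read category could be computed along these lines. The slides give no definition of a ‘short’ mapped read, so the 100 bp cutoff, the 50 bp bin width, and the choice to treat ‘short mapped’ as a category separate from ‘mapped’ are assumptions, and the read list is invented.

```python
from collections import Counter

SHORT_CUTOFF = 100  # hypothetical threshold in bp; the slides give no value
BIN_WIDTH = 50      # hypothetical histogram bin width in bp


def classify(length, is_mapped):
    """Assign a read to one of the three categories named on the slide."""
    if not is_mapped:
        return "unmapped"
    return "short mapped" if length < SHORT_CUTOFF else "mapped"


def length_distributions(reads):
    """reads: iterable of (length, is_mapped) pairs.

    Returns {category: Counter({bin_start: number_of_reads})}.
    """
    dist = {"mapped": Counter(), "unmapped": Counter(), "short mapped": Counter()}
    for length, is_mapped in reads:
        bin_start = (length // BIN_WIDTH) * BIN_WIDTH
        dist[classify(length, is_mapped)][bin_start] += 1
    return dist


reads = [(231, True), (228, True), (64, True), (58, False), (245, True), (40, False)]
dist = length_distributions(reads)
```

Comparing the three histograms shows whether short fragments dominate the unmapped or ‘short’ mapped reads, which is the sort of observation that would motivate an additional length separation step.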

• Can the length of a homopolymer be assessed using the Q score?
• Yes, when homopolymer length < 6 bp
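The slide does not show how the Q score was related to homopolymer length. As a starting point only, the sketch below collects the quality scores of every homopolymer run in a read, so that score patterns can be compared across run lengths. The function name, the sample sequence, and the quality values are invented for illustration.

```python
from itertools import groupby


def homopolymer_runs(sequence, qualities, min_length=2):
    """Yield (base, start, length, quality_scores) for each run of identical bases."""
    if len(sequence) != len(qualities):
        raise ValueError("sequence and qualities must have the same length")
    pos = 0
    for base, group in groupby(sequence):
        length = len(list(group))
        if length >= min_length:
            yield base, pos, length, qualities[pos:pos + length]
        pos += length


# Invented example read with per-base quality scores.
seq = "ACGGGTTAAAAC"
qual = [40, 38, 37, 30, 22, 36, 31, 35, 30, 25, 18, 39]
runs = list(homopolymer_runs(seq, qual))
```

Each tuple gives the base, the 0-based start position, the run length, and the scores within the run. Aggregating these over many reads would be one way to check how well the scores track run length below and above 6 bp.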

• Fast assessment of the quality of a run
[Figure: two panels, ‘Lab work OK’ and ‘Errors in lab work’]

• Biobix – Ugent: Wim Van Criekinge, Tim De Meyer, Geert Trooskens, Tom Vandekerkhove, Leander Van Neste, Gerben Mensschaert
• CMG – UZ Gent: Jo Vandesompele, Jan Hellemans, Filip Pattyn, Steve Lefever, Kim Deleeneer, Jean-Pierre Renard
• NXT-GNT: Paul Coucke, Sofie Bekaert, Filip Van Nieuwerburgh, Dieter Deforce, Wim Van Criekinge, Jo Vandesompele

Questions? Joachim.deschrijver@ugent.be
