Next Generation Sequencing in Virus and Parasite Research.

1 Next Generation Sequencing in Virus and Parasite Research

2 Sanger read >800 bp; GS-FLX read ~250 bp | 500 bp, 100 Mb | 500 Mb per run. WGS, Annotation, Population Diversity, Pathogen Discovery. Applications presented: four main projects in the lab

3 Brugia malayi Genome Project Parasitic nematode, causes lymphatic filariasis Total scaffolds: ~8250 Longest scaffold: 6.5 Mb Total bases in scaffolds: 71 Mb Total span of scaffolds: 80 Mb Genome size ~100Mb 6 chromosomes in 8250 pieces Sanger (cloning bias)

4 Closing the Genome Next-generation sequencing Fingerprint maps Curating the Data DATABASE Mapping 5’ and 3’UTRs Functional annotation Re-assemble genome Re-annotate Brugia malayi Genome Project PHASE II – Use Next-Gen Data (Hybrid Sanger-GSFLX assembly) (Confirm UTRs by GSFLX)

5 Mix of random reads and paired reads Avg read length: ~220 bp ~100 Mb GS-FLX Sequencing of Worm gDNA and cDNA 5 runs = 5X coverage of the genome 5’UTR 3’UTR SL gDNA Paired-Ends and WGS UTRs Whole plate 4-well gasket

6 Mapping of paired and non-paired reads onto genomic assembly SEQUENCE ASSEMBLY hits 100% | 80% Paired-ends No apparent Bias 20Mb of Brugia reads = ~0.25X coverage
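The coverage figures on slides 5 and 6 appear to come from dividing sequenced bases by genome size. A minimal sketch of that arithmetic follows; the function name is illustrative, and the 80 Mb denominator used for slide 6 is an assumption (it is the scaffold span from slide 3, which reproduces the quoted ~0.25X).

```python
def fold_coverage(sequenced_bases: float, genome_size: float) -> float:
    """Average fold coverage = total sequenced bases / genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return sequenced_bases / genome_size


MB = 1_000_000

# Slide 5: 5 runs at ~100 Mb per run over a ~100 Mb genome.
print(fold_coverage(5 * 100 * MB, 100 * MB))  # 5.0

# Slide 6: 20 Mb of reads over the 80 Mb scaffold span (assumed denominator).
print(fold_coverage(20 * MB, 80 * MB))  # 0.25
```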

7 Sequencing UTRs of B. malayi mRNA P AAAA CIP TAP RNA ligase AAAA RT-PCR RNA oligo MmeI site NlaIII SAGE Tag Unique sequence Concatenated SAGE Tags AAAA DITAGS (variable length)

8 Sequencing Results One sequence run ~50Mb of data in ~400,000 reads 5’UTR 3’UTR SL

9 Data processing Raw Data Remove Linker, Small tags (<10), Identical, Junk Blast against Genome EST Exon CDS Unmatched tags Blast against Small contigs Mitochondrion Bacterial singletons
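The clean-up step listed on slide 9 (remove linker, drop tags shorter than 10, collapse identical tags, discard junk) could be sketched roughly as below. This is not the lab's pipeline: the linker sequence is hypothetical, and "junk" is assumed here to mean any tag containing characters other than A, C, G, T.

```python
def clean_tags(raw_reads, linker, min_len=10):
    """Strip a leading linker, then drop short, junk, and duplicate tags."""
    seen = set()
    kept = []
    for read in raw_reads:
        tag = read.upper()
        if tag.startswith(linker):
            tag = tag[len(linker):]   # remove linker
        if len(tag) < min_len:        # small tags (<10)
            continue
        if set(tag) - set("ACGT"):    # junk: assumed to mean non-ACGT characters
            continue
        if tag in seen:               # identical tags
            continue
        seen.add(tag)
        kept.append(tag)
    return kept


LINKER = "GATCCA"  # hypothetical linker, for illustration only
reads = [
    "GATCCAACGTACGTACGTAA",  # kept after linker removal
    "GATCCAACGTACGTACGTAA",  # identical to the first, dropped
    "GATCCAACGT",            # only 4 nt after linker removal, dropped
    "GATCCANNNNNNNNNNNNNN",  # junk, dropped
    "ttgacctagctagga",       # no linker, kept (upper-cased)
]
print(clean_tags(reads, LINKER))
# ['ACGTACGTACGTAA', 'TTGACCTAGCTAGGA']
```

The surviving tags would then go on to the BLAST steps named on the slide.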

10 EST 3’-tag SL-tag 5’-tag 40S ribosomal protein S18 Mapping of Tags

11 Intra-Host Diversity of Influenza A Virus Antigenic variants Drug resistant and Sensitive variants

12 HA1 HA2 566aa 1,757nt Amplicons: Mapped GS-FLX Sequence Reads on antigenic domain of Hemagglutinin 450bp

13 Mapped Translated GS-FLX Reads on Epitopes of HA1 Domain ED AB DBDDEC

14 Patterns: Non-Synonymous mutations are predominantly in epitope regions (13/19 sites) BBA A AAD #reads 2 3 122 1 122 1 2

15 Identifying rare variants: Drug resistance mutation Matrix segment in H1N1 isolate Resistant H1N1 1/437=0.2% agt (S) aat (N) N31S #reads 4 137 4 2 1 171 78 1 4 1 35
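The 1/437 = 0.2% figure on slide 15 is a simple read fraction. A small sketch of that calculation, together with the two codons named on the slide (agt encodes serine and aat encodes asparagine in the standard genetic code); the function name is illustrative.

```python
# Only the two codons named on the slide; standard genetic code.
CODON_TO_AA = {"AGT": "S", "AAT": "N"}


def variant_percent(variant_reads: int, total_reads: int) -> float:
    """Percentage of reads at a site that carry the variant."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * variant_reads / total_reads


print(CODON_TO_AA["AGT"], CODON_TO_AA["AAT"])  # S N
print(f"{variant_percent(1, 437):.1f}%")       # 0.2%
```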

16 SNP Analyses: Probability that Polymorphism is Real Base# A C G N T GAP SNP probability pbShort (polybayes) - Marth Lab, Boston College
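Slide 16 credits PolyBayes (pbShort) for the SNP probability. The sketch below is not that method. It is a much simpler binomial check, offered only to illustrate the question being asked: how likely is it that sequencing error alone would produce at least k variant reads out of n? The per-base error rate is an assumed input, not a value from the slides.

```python
from math import comb


def prob_at_least_k_errors(n_reads: int, k_variant: int, error_rate: float) -> float:
    """P(X >= k) for X ~ Binomial(n_reads, error_rate)."""
    p_fewer = sum(
        comb(n_reads, i) * error_rate**i * (1 - error_rate) ** (n_reads - i)
        for i in range(k_variant)
    )
    return 1.0 - p_fewer


# With 437 reads and an assumed 1% error rate, a single variant read is
# entirely compatible with error; a larger count would not be.
print(prob_at_least_k_errors(437, 1, 0.01))
print(prob_at_least_k_errors(437, 20, 0.01))
```

A low value suggests the polymorphism is unlikely to be error alone; a model such as PolyBayes additionally uses base quality values and prior expectations.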

17 Error Correction (homopolymer tracks)

18 Signal Processing: Length Distribution Adjusting the stringency of quality filters changes the length distribution: reads are slightly shorter BUT average quality is higher. Read length: Default 75,000 reads, avg length 200; Higher stringency 70,000 reads, avg length 195

19 Signal Processing: Quality Distribution Reduce the # of bases BUT increase the proportion of bases of HIGH QUALITY. Quality Score: Default 15 million bp; Higher stringency 14 million bp
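The read counts on slide 18 and the base totals on slide 19 are consistent with each other if total bases = reads x average length. A quick check, assuming the first pair of numbers on each slide is the default filter and the second pair is the higher-stringency filter:

```python
def total_bases(n_reads: int, avg_len: int) -> int:
    """Total bases = number of reads x average read length."""
    return n_reads * avg_len


default = total_bases(75_000, 200)  # default quality filter
strict = total_bases(70_000, 195)   # higher stringency

print(default, strict)  # 15000000 13650000
print(f"bases lost: {100 * (default - strict) / default:.1f}%")  # bases lost: 9.0%
```

So the stricter filter gives up roughly 9% of the bases, which rounds to the 15 million versus 14 million bp shown on slide 19.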

20 Whole Virus Genome Sequencing Limitation of read length BUT: - Isolate single genome (limiting dilution, other?) - Random prime or specific primers with barcodes - Use barcode to amplify - Multiplex: 20 barcodes, 16-well gasket = 320 samples

21 Virus Genomic Library Construction - Discovery - RNA → (1a) Reverse transcription (RT) → cDNA or ssDNA → (1b) DNA extension from random primers (NNNN) with Klenow exo- DNA polymerase → dsDNA → (2) Amplification from tags (PCR) → (3) Size selection & Sequencing: select 500 bp amplicons for emulsion PCR and pyrosequencing

22 Multiplexing by Barcoding Pools

23 Barcodes mapped onto reads NUCMER MySQL db BLASTN BLASTX Post-Processing Pipeline Reads clustered and reduced to a unique set
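The barcode mapping and the reduction to a unique set on slides 20 to 23 could look something like the sketch below. It assumes exact barcode matches at the start of each read and treats only identical reads as duplicates, which is a simplification of the clustering named on the slide. The barcode sequences are hypothetical.

```python
from collections import defaultdict


def demultiplex(reads, barcodes):
    """Group reads by an exact 5' barcode match and strip the barcode."""
    bins = defaultdict(set)  # a set keeps one copy of each identical read
    for read in reads:
        for name, code in barcodes.items():
            if read.startswith(code):
                bins[name].add(read[len(code):])
                break  # reads matching no barcode are silently skipped
    return {name: sorted(seqs) for name, seqs in bins.items()}


BARCODES = {"TagA": "ACGT", "TagB": "TGCA"}  # hypothetical barcode sequences
reads = ["ACGTGGGCCC", "ACGTGGGCCC", "TGCAAATTT", "CCCCGGGG"]
print(demultiplex(reads, BARCODES))
# {'TagA': ['GGGCCC'], 'TagB': ['AATTT']}

# Multiplexing capacity from slide 20: 20 barcodes x 16 gasket regions.
print(20 * 16)  # 320
```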

24 26,750 contigs → BLASTN → 56% match human DNA; 12,889 contigs → BLASTX → 120 match viruses

25 Oral Microbiome Project: Periodontal Disease, Caries. VIRAL BACTERIAL Pool 1 Family BU128 WV409 BK026 BR095 HIGH LOW HIGH LOW TagA TagB TagC TagD 5237684 BU128 WV409 BK026 BR095 WV001 WV213 BK044 BU130 WV001 WV213 BK044 BU130 BR009 WV597 WV631 BU133 BR009 WV597 WV631 BU133 BR023 WV041 BU137 WV628 BR023 WV041 BU137 WV628

26 Bacterial Diversity Heat Maps: Sequencing of 16S rRNA variable region Sequencing of PCR Amplicons 250bp in size

27 Acknowledgments School of Dental Medicine: Mary Marazita. Ghedin Lab, School of Medicine: Jay DePasse, Adam Fitch, Xu Zhang. Graduate School of Public Health: Robert Ferrell, Mike Barmaba. Funding: NIDCR/NIH, CTSI, JDRF, Burroughs-Wellcome Fund. GPCL: Debby Hollingshead, Paul Wood, Janette Lamb

