__________________________________________________________________________________________________ Fall 2015 GCBA 815

Tools and Algorithms in Bioinformatics
GCBA815, Fall 2015
Week 15: CLC Genomics

Matthew Cserhati, Ph.D.
Bioinformatics Programmer (Guda lab)
Department of Genetics, Cell Biology and Anatomy
University of Nebraska Medical Center

Introduction
- A comprehensive and user-friendly package for analyzing, comparing, and visualizing next-generation sequencing data
- Website: genomics-workbench/
- Latest version
- Also available campus-wide via INBREweb in a virtual machine, which we will test in this class

Types of tools
- Classical Sequence Analysis Tools
  - Alignments, sequence shuffling, motif search, nucleotide and protein analysis
- Molecular Biology Tools
  - Primer design, restriction analysis
- BLAST
  - Download databases, BLAST at NCBI, create database
- NGS Core Tools
  - QC report, trim reads, read mapping, consensus sequence extraction
- De Novo sequencing
- And much, much more! …

Description of test files
- Paired-end fastq files
  - X5: 3.4M reads
  - X8: 16.6M reads
- Derived from the whole genome
- Belonging to strains of the same microbial species
- Goal is whole genome assembly from these fastq files

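Before importing paired-end files it can help to confirm that both mate files hold the same number of reads. The following is a minimal sketch, assuming the common four-lines-per-record fastq layout; the function name `count_fastq_reads` and the in-memory demo reads are hypothetical and are not the X5/X8 data.

```python
import io

def count_fastq_reads(handle):
    """Count records in a fastq stream, assuming 4 lines per record."""
    n_lines = 0
    for i, line in enumerate(handle):
        # Every record is assumed to start with an '@' header line.
        if i % 4 == 0 and not line.startswith("@"):
            raise ValueError(f"line {i + 1}: expected '@' header")
        n_lines = i + 1
    if n_lines % 4 != 0:
        raise ValueError("truncated fastq: line count is not a multiple of 4")
    return n_lines // 4

# Tiny made-up mate files standing in for real R1/R2 files.
demo_r1 = io.StringIO("@read1/1\nACGT\n+\nIIII\n@read2/1\nGGCA\n+\nIIII\n")
demo_r2 = io.StringIO("@read1/2\nTTGA\n+\nIIII\n@read2/2\nCCAT\n+\nIIII\n")

n1 = count_fastq_reads(demo_r1)
n2 = count_fastq_reads(demo_r2)
print(n1, n2, "paired OK" if n1 == n2 else "mate files differ")
```

For real files, the same function could be given a handle from `open(path)` (or `gzip.open(path, "rt")` for compressed files).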
Exercises
- Import data
  - Open read data
  - Reference genomes
- Quality checks
  - QC report
  - Trimming
- Guided assembly
- De novo assembly
- Remove duplicate reads
- ORF prediction
- Extra: runs with Example data
  - mRNA secondary structure
  - Motif search

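To give a feel for what the ORF prediction exercise computes, here is a minimal sketch that scans the three forward reading frames for stretches running from ATG to the first in-frame stop codon. It is an illustration only, not the algorithm CLC Genomics Workbench uses; the function name `find_orfs`, the `min_codons` parameter, and the demo sequence are assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) tuples for forward-strand ORFs.

    An ORF here runs from an ATG to the first in-frame stop codon;
    `end` is exclusive and includes the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos  # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                end = pos + 3
                if (end - start) // 3 >= min_codons:
                    orfs.append((start, end, frame))
                start = None  # look for the next ATG in this frame
    return orfs

demo = "CCATGAAATTTTAGGG"  # made-up sequence
for start, end, frame in find_orfs(demo):
    print(frame, start, end, demo[start:end])
```

A fuller version would also scan the reverse complement and allow alternative start codons, both of which matter for microbial genomes.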
Guided vs. de novo genome assembly
- Guided
  - Aligns the reads in the fastq files to the genome of a related species
  - More efficient and precise than de novo assembly
  - Faster
  - Variant analysis is possible only with guided assembly
- De novo
  - Done when no related species is available
  - Results in contigs, which must be joined
  - Can be combined with mapping the contigs to the genome of a related species
  - Much slower
- Analogy: putting together a jigsaw puzzle with (guided) or without (de novo) a similar puzzle as a template

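The contrast between the two approaches can be sketched on a toy example. In the guided case each read is simply located on a reference; in the de novo case reads are merged by their overlaps with no reference at all. This is a deliberately simplified illustration (exact matches only, greedy merging, no sequencing errors), not how CLC Genomics Workbench maps or assembles reads; all names and sequences below are made up.

```python
def map_read(reference, read):
    """Guided: locate a read on the reference by exact match (-1 if absent)."""
    return reference.find(read)

def overlap(a, b, min_len=3):
    """Length of the longest suffix of a equal to a prefix of b (0 if < min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(reads, min_len=3):
    """De novo: repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(reads)
    while len(contigs) > 1:
        best = (0, None, None)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        if k == 0:
            break  # no overlaps left: remaining contigs stay separate
        merged = contigs[i] + contigs[j][k:]
        contigs = [c for n, c in enumerate(contigs) if n not in (i, j)] + [merged]
    return contigs

reference = "ATGGCGTACGTTAGC"          # toy "related genome"
reads = ["ATGGCGTA", "CGTACGTT", "CGTTAGC"]

print([map_read(reference, r) for r in reads])  # guided: one lookup per read
print(greedy_assemble(reads))                   # de novo: all-vs-all overlaps
```

The guided path needs one lookup per read, while the de novo path compares every sequence against every other in each round, which hints at why de novo assembly is much slower on real data.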
Variant analysis
- Basic
  - Germline and somatic variants
  - Detects any variants observed in reads
- Fixed ploidy
  - Germline variants
  - For known ploidy (microbe => 1)
  - Discards variants which are due to sequencing error or mapping artefacts
- Low frequency
  - Germline and somatic variants
  - For unknown/mixed ploidy
  - Discards variants which are due to sequencing error

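The idea behind these detectors can be illustrated with a simple frequency threshold on the bases observed at one reference position. This is a rough sketch only: the actual detectors in CLC Genomics Workbench rely on statistical models that are not reproduced here, and the function name `call_variant` and the thresholds `min_coverage` and `min_fraction` are assumptions chosen for illustration.

```python
from collections import Counter

def call_variant(ref_base, pileup, min_coverage=10, min_fraction=0.8):
    """Return the most common non-reference base if it passes the thresholds.

    pileup: string of bases observed in the reads covering one position.
    With ploidy 1 a true variant should appear in nearly all reads, so a
    high min_fraction discards rare mismatches as likely sequencing errors.
    """
    counts = Counter(pileup)
    coverage = sum(counts.values())
    if coverage < min_coverage:
        return None  # too few reads to decide
    alt, alt_count = max(
        ((base, n) for base, n in counts.items() if base != ref_base),
        key=lambda item: item[1],
        default=(None, 0),
    )
    if alt is not None and alt_count / coverage >= min_fraction:
        return alt
    return None

print(call_variant("A", "G" * 19 + "A"))                     # 19 of 20 reads show G
print(call_variant("A", "A" * 19 + "G"))                     # 1 of 20 reads shows G
print(call_variant("A", "A" * 19 + "G", min_fraction=0.01))  # permissive threshold
```

Lowering `min_fraction` moves the sketch toward the permissive "detect anything observed" behaviour, at the cost of also reporting likely sequencing errors.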
Sample outputs

Thanks for your attention!