Short Read Workshop Day 5: Mapping and Visualization

Video 2: Bowtie Short Read Aligner

http://bowtie-bio.sourceforge.net/index.shtml

Bowtie is an ultrafast, memory-efficient short read aligner geared toward quickly aligning large sets of short DNA sequences (reads) to large genomes. We like Bowtie for single-end read alignment.

Bowtie
  Only end-to-end, ungapped alignment of reads (no local or gapped alignment)
  Optimized for short read lengths (25-50bp); max read length 1000bp
  Alignments not allowed to overlap ambiguous characters (N, Y, R) in the reference
  Colorspace support

Bowtie2
  Local or end-to-end alignment options, with gapped alignment
  Optimized for longer read lengths (≥50bp); no max read length
  Allows alignments to overlap ambiguous characters (N, Y, R) in the reference
  No colorspace support

Bowtie Alignment Options

-v  Read mismatch mode
  Allow -v <int> mismatches in the whole read, where <int> = 0-3
  Quality scores are ignored
-n  Seed mismatch mode (default)
  Allow -n <int> mismatches in the seed region, where <int> = 0-3
  Seed length = -l <int>, where <int> > 5

Example with a 10mer seed region (settings: -n 0 -l 10):

  Read:   ATGCAGCTAGCTATCTAGGTAGAT
          ||||||||||||| |||| ||| |
  Genome: ATGCAGCTAGCTAGCTAGCTAGCT

The first 10 bases (the seed) match exactly; the three mismatches fall outside the seed.

Example in read mismatch mode (settings: -v 2):

  Read:   ATGCAGATAGCTATCTAGCTAGCT
          |||||| |||||| ||||||||||
  Genome: ATGCAGCTAGCTAGCTAGCTAGCT

The read has two mismatches across its whole length, which is within the -v 2 limit.
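The two mismatch modes can be checked by hand with a few lines of Python. This is only a sketch of the rules as stated on the slides, not Bowtie's actual search. The function names are made up for illustration, and the seed check ignores the quality-based limits that Bowtie also applies in -n mode.

```python
def count_mismatches(read: str, ref: str) -> int:
    """Count positions where an ungapped read differs from the reference."""
    if len(read) != len(ref):
        raise ValueError("ungapped comparison needs equal-length strings")
    return sum(1 for r, g in zip(read, ref) if r != g)


def passes_v_mode(read: str, ref: str, v: int) -> bool:
    """-v style check: at most v mismatches across the whole read."""
    return count_mismatches(read, ref) <= v


def passes_seed_check(read: str, ref: str, n: int, seed_len: int) -> bool:
    """-n/-l style check: at most n mismatches in the first seed_len bases.

    Assumption: only the seed rule is modeled; quality-based limits on
    mismatches outside the seed are left out.
    """
    return count_mismatches(read[:seed_len], ref[:seed_len]) <= n


genome = "ATGCAGCTAGCTAGCTAGCTAGCT"
seed_example = "ATGCAGCTAGCTATCTAGGTAGAT"  # slide example for -n 0 -l 10
v_example = "ATGCAGATAGCTATCTAGCTAGCT"     # slide example for -v 2

print(passes_seed_check(seed_example, genome, n=0, seed_len=10))
print(count_mismatches(seed_example, genome))
print(passes_v_mode(v_example, genome, v=2))
```

Note that the first read passes the seed rule with -n 0 -l 10 but would fail -v 2, because it carries three mismatches in total.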

Bowtie Alignment Report Options

[Figure: @read1 and @read2 drawn against Chr1 and Chr3, showing one uniquely mapping read and one non-uniquely mapping read]

Bowtie options to control alignment reporting:
  -k <int>  report up to <int> alignments for each read (default -k 1)
  -a        report all valid alignments for each read
  -m <int>  suppress reads with more than <int> alignments (use -m 1 to report only uniquely mapping reads)
  -S        write output in SAM format
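The effect of -k, -a and -m can be pictured as a filter over each read's list of valid alignments. The sketch below is a toy model, not Bowtie code. The function name, the read names and the coordinates are all made up, and read1 is simply assumed to be the uniquely mapping read.

```python
from typing import Dict, List, Optional, Tuple

Hit = Tuple[str, int]  # (chromosome, position); hypothetical coordinates


def report_alignments(
    hits: Dict[str, List[Hit]],
    k: int = 1,
    report_all: bool = False,
    m: Optional[int] = None,
) -> Dict[str, List[Hit]]:
    """Toy model of the reporting options.

    k          -> like -k: keep up to k alignments per read
    report_all -> like -a: keep every valid alignment
    m          -> like -m: drop reads with more than m alignments
    """
    reported = {}
    for read, locations in hits.items():
        if m is not None and len(locations) > m:
            continue  # read is suppressed entirely
        reported[read] = list(locations) if report_all else list(locations[:k])
    return reported


# Made-up example: read1 maps once, read2 maps to two places.
hits = {
    "read1": [("Chr1", 100)],
    "read2": [("Chr1", 500), ("Chr3", 900)],
}

print(report_alignments(hits))                   # default, like -k 1
print(report_alignments(hits, report_all=True))  # like -a
print(report_alignments(hits, m=1))              # like -m 1
```

With m=1 only read1 survives, which is the "uniquely mapping reads only" behavior described for -m 1.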

Using Bowtie

1.) Index the reference genome with bowtie-build (Video 1)
2.) Command line setup: set options like -n or -v, then give the reference genome index, the read file input, and the alignment output file

Bowtie by default assumes your reads are in a FASTQ file; for other file formats you must specify the format of the reads.

Most basic, default-setting run of Bowtie:

  $ bowtie -S index reads.fq bowtie-out.sam 2> bowtie-out.stderr

If you were running paired-end reads:

  $ bowtie -S index -1 fwd_reads.fq -2 rev_reads.fq bowtie-out.sam

Bowtie Alignment Report Options

[Figure: only @read1, the uniquely mapping read, drawn against Chr3 and Chr1]

To report only uniquely mapping reads, use -m 1: -m <int> suppresses reads with more than <int> alignments.

Bowtie2

Bowtie2 ≠ Bowtie
  Unlike Bowtie v1, Bowtie2 can perform gapped alignments
  You choose between --local and --end-to-end alignment; the default is --end-to-end
  Is generally faster and more sensitive for reads >50nt long
  Uses a scoring scheme based on matches, mismatches, gap opens, and gap extensions to decide which alignments to report

Bowtie2 end-to-end Alignment

Global/end-to-end alignment (default)
  Aligns the whole read
  Option: --end-to-end
  Can control how many gaps, how long, etc.

  Read:   ATGCAGCTAGCTAGCTAGCTAGCT
          |||||||||||||||   ||||||
  Genome: ATGCAGCTAGCTAGC---CTAGCT
  3nt insertion

  Read:   ATGCA--TAGCTAGCTAGCAAGCT
          |||||  |||||||||||| ||||
  Genome: ATGCAGCTAGCTAGCTAGCTAGCT
  2nt deletion, 1 mismatch
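To make the scoring scheme concrete, here is a small sketch that scores the two gapped examples above. The penalty values are assumptions taken from what the Bowtie2 manual documents as end-to-end defaults (mismatch up to 6, gap open 5, gap extend 3, no match bonus). Base qualities, which can lower the mismatch penalty, are ignored, and the function name is made up.

```python
def alignment_score(
    read: str,
    ref: str,
    match: int = 0,       # assumed end-to-end match bonus
    mismatch: int = 6,    # assumed maximum mismatch penalty
    gap_open: int = 5,    # assumed gap open penalty
    gap_extend: int = 3,  # assumed per-base gap extension penalty
) -> int:
    """Score two already-aligned strings of equal length ('-' marks a gap).

    A gap of length N costs gap_open + N * gap_extend.
    """
    if len(read) != len(ref):
        raise ValueError("aligned strings must have the same length")
    score = 0
    previous_gap = None  # which string carried a gap in the previous column
    for r, g in zip(read, ref):
        if r == "-" or g == "-":
            side = "read" if r == "-" else "ref"
            if side != previous_gap:
                score -= gap_open  # a new gap starts here
            score -= gap_extend
            previous_gap = side
        else:
            score += match if r == g else -mismatch
            previous_gap = None
    return score


insertion = alignment_score("ATGCAGCTAGCTAGCTAGCTAGCT",
                            "ATGCAGCTAGCTAGC---CTAGCT")
deletion = alignment_score("ATGCA--TAGCTAGCTAGCAAGCT",
                           "ATGCAGCTAGCTAGCTAGCTAGCT")
print(insertion)  # 3nt insertion: -(5 + 3*3)
print(deletion)   # 2nt deletion plus one mismatch: -(5 + 2*3) - 6
```

Under these assumed penalties the 3nt insertion scores -14 and the 2nt deletion with one mismatch scores -17, so in end-to-end mode a perfect alignment scores 0 and every difference pushes the score down.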

Preset alignment options for --end-to-end mode:
  --very-fast
  --fast
  --sensitive (default) = -D 15 -R 2 -L 22 -i S,1,1.15
  --very-sensitive

  -D sets how many consecutive seed extension attempts can fail before Bowtie2 moves on
  -L sets the seed length for multiseed alignment
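The -i value in the preset is a function of read length rather than a fixed number. As far as the Bowtie2 manual describes it, S,1,1.15 means f(x) = 1 + 1.15 * sqrt(x), where x is the read length, and results below 1 are raised to 1. The sketch below just evaluates that formula. The function names are made up, and how Bowtie2 rounds the result internally is not modeled.

```python
import math
from typing import Tuple


def parse_interval_function(spec: str) -> Tuple[float, float]:
    """Split a spec such as 'S,1,1.15' into (constant, coefficient).

    Only the square-root form ('S') is handled in this sketch.
    """
    kind, constant, coefficient = spec.split(",")
    if kind != "S":
        raise ValueError("only square-root interval functions are sketched here")
    return float(constant), float(coefficient)


def seed_interval(read_len: int, constant: float, coefficient: float) -> float:
    """Evaluate constant + coefficient * sqrt(read_len), with a floor of 1."""
    return max(1.0, constant + coefficient * math.sqrt(read_len))


for read_len in (36, 100):
    constant, coefficient = parse_interval_function("S,1,1.15")
    print(read_len, round(seed_interval(read_len, constant, coefficient), 2))
```

Longer reads get a longer interval between seeds, so fewer seeds per base are extracted from them.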

Bowtie2 Local Alignment

  Soft trimming/clipping of read ends
  Can include gaps
  Option: --local
  Also has -local versions of the preset options
    --sensitive-local (default) = -D 15 -R 2 -N 0 -L 20 -i S,1,0.75
  -N sets the number of mismatches allowed in the seed region

  Read:   ATGCAGCTAGCTAGCTAGCTAGCT
             |||||||||||||||||||
  Genome: GCACAGCTAGCTAGCTAGCTAGAC
  3’ and 5’ soft trimming
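Soft clipping in the example above can be reproduced with a simple maximum-scoring-segment search. This is a sketch of the idea only: it handles ungapped alignments, the function names are made up, and the match bonus of 2 and mismatch penalty of 6 are assumptions based on the local-mode defaults documented for Bowtie2.

```python
from typing import Tuple


def best_local_segment(
    read: str, ref: str, match: int = 2, mismatch: int = 6
) -> Tuple[int, int, int]:
    """Return (start, end, score) of the best ungapped segment, end exclusive."""
    best_score, best_start, best_end = 0, 0, 0
    run_score, run_start = 0, 0
    for i, (r, g) in enumerate(zip(read, ref)):
        run_score += match if r == g else -mismatch
        if run_score <= 0:
            # The current run is no longer worth keeping; restart after i.
            run_score, run_start = 0, i + 1
        elif run_score > best_score:
            best_score, best_start, best_end = run_score, run_start, i + 1
    return best_start, best_end, best_score


def soft_clipped_cigar(read: str, ref: str) -> str:
    """Describe the best segment as a CIGAR string with S (soft clip) and M."""
    start, end, _ = best_local_segment(read, ref)
    if end == start:
        raise ValueError("no positive-scoring local alignment found")
    parts = []
    if start:
        parts.append(f"{start}S")
    parts.append(f"{end - start}M")
    if len(read) - end:
        parts.append(f"{len(read) - end}S")
    return "".join(parts)


read = "ATGCAGCTAGCTAGCTAGCTAGCT"
genome = "GCACAGCTAGCTAGCTAGCTAGAC"
print(best_local_segment(read, genome))
print(soft_clipped_cigar(read, genome))
```

For the slide example, the 19 matching bases in the middle are kept and the mismatching ends are clipped, giving 3S19M2S.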

Bowtie2 Reporting Options
  Default: report a single alignment per read
  -a        report all valid alignments
  -k <int>  report up to <int> alignments per read

Using Bowtie2

1.) Index the reference genome with bowtie2-build (Video 1)
  Bowtie indexes are not compatible with Bowtie2, and vice versa
2.) Command line setup: set options like --local, then give the reference genome index (-x), the read file input, and the alignment output SAM file (-S)

A basic run of Bowtie2 in local mode (unpaired reads are passed with -U):

  $ bowtie2 --local -x index -U reads.fq -S bowtie-out.sam 2> bowtie-out.stderr

If you were running paired-end reads:

  $ bowtie2 --fr --local -x index -1 fwd_reads.fq -2 rev_reads2.fq -S bowtie-out.sam
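If you drive Bowtie2 from a script, the same command lines can be assembled as argument lists. The sketch below only uses flags that appear in this video plus -U for unpaired reads. The helper names are made up, and run_logged is defined but not called here because it needs Bowtie2 and an index on your system.

```python
import subprocess
from typing import List, Optional


def bowtie2_command(
    index: str,
    output_sam: str,
    reads: Optional[str] = None,
    mate1: Optional[str] = None,
    mate2: Optional[str] = None,
    local: bool = False,
) -> List[str]:
    """Build a bowtie2 argument list for single-end or paired-end reads."""
    cmd = ["bowtie2"]
    if local:
        cmd.append("--local")
    cmd += ["-x", index]
    if mate1 and mate2:
        cmd += ["-1", mate1, "-2", mate2]
    elif reads:
        cmd += ["-U", reads]
    else:
        raise ValueError("give either reads or both mate1 and mate2")
    cmd += ["-S", output_sam]
    return cmd


def run_logged(cmd: List[str], stderr_path: str) -> None:
    """Run cmd and write its standard error to a file, like `2> file`."""
    with open(stderr_path, "w") as log:
        subprocess.run(cmd, stderr=log, check=True)


single = bowtie2_command("index", "bowtie-out.sam", reads="reads.fq", local=True)
paired = bowtie2_command("index", "bowtie-out.sam",
                         mate1="fwd_reads.fq", mate2="rev_reads2.fq", local=True)
print(" ".join(single))
print(" ".join(paired))
```

Passing a list to subprocess.run avoids shell quoting problems with file names, at the cost of handling the stderr redirect yourself.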

Capture standard error from Bowtie(2)
  The alignment summary is written to standard error; redirect it to a file with 2>

Output SAM file can be used with a number of downstream applications
  Use SAMtools to convert to a BAM file
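As a quick check on the SAM output before handing it to downstream tools, you can count mapped and unmapped reads from the FLAG column. The sketch below relies only on the SAM format itself (tab-separated fields, header lines starting with @, flag bit 0x4 meaning unmapped). The function name and the example records are made up for illustration.

```python
from typing import Iterable, Tuple


def mapped_counts(sam_lines: Iterable[str]) -> Tuple[int, int]:
    """Return (mapped, unmapped) record counts from SAM-formatted lines."""
    mapped = unmapped = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # second column is the bitwise FLAG
        if flag & 0x4:
            unmapped += 1
        else:
            mapped += 1
    return mapped, unmapped


# Made-up records, not real Bowtie output.
example_sam = [
    "@HD\tVN:1.0\tSO:unsorted",
    "@SQ\tSN:Chr1\tLN:1000",
    "read1\t0\tChr1\t101\t255\t8M\t*\t0\t0\tATGCAGCT\tIIIIIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tGGGGCCCC\tIIIIIIII",
]
print(mapped_counts(example_sam))
```

For a real file you would pass an open file handle, for example mapped_counts(open("bowtie-out.sam")), since a file object yields lines one at a time.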