Short Read Workshop Day 5: Mapping and Visualization
Video 2: Bowtie Short Read Aligner
http://bowtie-bio.sourceforge.net/index.shtml
Bowtie is an ultrafast, memory-efficient short read aligner geared toward quickly aligning large sets of short DNA sequences (reads) to large genomes. We like Bowtie for single-end read alignment.

Bowtie
  End-to-end alignment of reads only (no local or gapped alignment)
  Optimized for short read lengths (25-50bp); max read length 1000bp
  Reads not allowed to overlap ambiguous reference characters (N, Y, R)
  Colorspace support

Bowtie2
  Local or end-to-end read alignment options, both with gaps
  Optimized for long read lengths (≥50bp); no max read length
  Allows reads to overlap multiple ambiguous reference characters (N, Y, R)
  No colorspace support
Bowtie Alignment Options

-v Read Mismatch
  Allow -v <int> mismatches in whole read
  Where <int> = 0-3
  Quality scores are ignored

-n Seed Mismatch (default)
  Allow -n <int> mismatches in seed region
  Where <int> = 0-3
  Seed length = -l <int>
  Where <int> ≥ 5

Settings: -n 0 -l 10 (no mismatches allowed in the 10mer seed region, the first 10 bases of the read; all three mismatches here fall outside the seed)

  Read:   ATGCAGCTAGCTATCTAGGTAGAT
          ||||||||||||| |||| ||| |
  Genome: ATGCAGCTAGCTAGCTAGCTAGCT
Bowtie Alignment Options: -v example

Settings: -v 2 (up to 2 mismatches allowed anywhere in the read)

  Read:   ATGCAGATAGCTATCTAGCTAGCT
          |||||| |||||| ||||||||||
  Genome: ATGCAGCTAGCTAGCTAGCTAGCT
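The two mismatch policies can be checked by hand on the slide examples. The following is a minimal Python sketch, not Bowtie's implementation: the function names are made up for illustration, the alignment is assumed to be ungapped at a fixed position, and the quality-based ceiling that Bowtie also applies in -n mode (the -e option) is not modeled.

```python
def mismatch_positions(read, genome):
    """Return 0-based positions where an ungapped read differs from the genome."""
    return [i for i, (r, g) in enumerate(zip(read, genome)) if r != g]


def passes_v(read, genome, v):
    """-v style check: at most v mismatches anywhere in the read."""
    return len(mismatch_positions(read, genome)) <= v


def passes_n(read, genome, n, seed_len):
    """-n style check: at most n mismatches within the first seed_len bases.

    Simplification: the quality-based limit on mismatches (-e) is ignored.
    """
    in_seed = [i for i in mismatch_positions(read, genome) if i < seed_len]
    return len(in_seed) <= n


genome = "ATGCAGCTAGCTAGCTAGCTAGCT"
seed_read = "ATGCAGCTAGCTATCTAGGTAGAT"   # slide example for -n 0 -l 10
whole_read = "ATGCAGATAGCTATCTAGCTAGCT"  # slide example for -v 2

print(mismatch_positions(seed_read, genome))            # mismatches lie past base 10
print(passes_n(seed_read, genome, n=0, seed_len=10))    # seed policy
print(passes_v(seed_read, genome, v=2))                 # same read under -v 2
print(passes_v(whole_read, genome, v=2))
```

In this sketch the first read satisfies -n 0 -l 10 but would be rejected by -v 2, because it carries three mismatches in total.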
Bowtie Alignment Report Options

[Figure: reads @read1 and @read2 aligned against Chr1 and Chr3; one is labeled a non-uniquely mapping read, the other a uniquely mapping read]

Bowtie Options To Control Alignment Reporting
  -k <int>  report up to <int> alignments for each read (default -k 1)
  -a        report all valid alignments for each read
  -m <int>  suppress all alignments for reads with more than <int> alignments (use -m 1 to report only uniquely mapping reads)
  -S        write output in SAM format
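The interaction of -k, -a and -m is easier to see on a toy example. This Python sketch models only the counting logic described above; the read names and coordinates are invented, and which alignment Bowtie actually picks under -k 1 is not modeled (the sketch simply takes the first in the list).

```python
def report(alignments, k=1, report_all=False, m=None):
    """Return the alignments reported for one read under a simplified model.

    alignments: list of valid alignment locations for the read.
    k:          like -k, report up to k alignments.
    report_all: like -a, report every valid alignment.
    m:          like -m, suppress the read entirely if it has more than m alignments.
    """
    if m is not None and len(alignments) > m:
        return []
    if report_all:
        return list(alignments)
    return list(alignments[:k])


# Hypothetical reads: one maps to two places, one maps to a single place.
hits = {
    "readA": ["Chr1:100", "Chr3:250"],
    "readB": ["Chr1:900"],
}

for name, locations in hits.items():
    print(name, "default:", report(locations))
    print(name, "-a:", report(locations, report_all=True))
    print(name, "-m 1:", report(locations, m=1))
```

With m=1 only the read that has a single valid alignment is reported, which is the uniquely-mapping filter described on the slide.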
Using Bowtie

1.) Index reference genome with bowtie-build (Video 1)
2.) Command line setup
  Set options like -n or -v
  Reference genome index
  Read file input
  Alignment output file

Bowtie by default assumes your reads are in a fastq file; for other file formats you must specify the format of the reads.

Most basic, default setting run of bowtie:
$ bowtie -S index reads.fq bowtie-out.sam 2> bowtie-out.stderr

If you were running paired-end reads:
$ bowtie -S index -1 fwd_reads.fq -2 rev_reads.fq bowtie-out.sam
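When many samples are aligned, it can help to assemble the command from its parts (options, index, reads, output) instead of retyping it. The helper below is a hypothetical Python sketch that only builds the argument list in the order used on the slide; it does not run Bowtie, and the function name and parameters are assumptions made for illustration.

```python
def bowtie_command(index, out_sam, reads=None, mate1=None, mate2=None, extra=()):
    """Assemble a bowtie (v1) argument list: options, index, reads, output."""
    cmd = ["bowtie", "-S", *extra, index]
    if mate1 and mate2:
        cmd += ["-1", mate1, "-2", mate2]   # paired-end input
    elif reads:
        cmd.append(reads)                    # single-end input
    else:
        raise ValueError("give either reads or both mate1 and mate2")
    cmd.append(out_sam)
    return cmd


single = bowtie_command("index", "bowtie-out.sam", reads="reads.fq")
paired = bowtie_command("index", "bowtie-out.sam",
                        mate1="fwd_reads.fq", mate2="rev_reads.fq")
strict = bowtie_command("index", "bowtie-out.sam", reads="reads.fq",
                        extra=("-v", "2", "-m", "1"))

print(" ".join(single))
print(" ".join(paired))
print(" ".join(strict))
```

The resulting list can be handed to `subprocess.run` once Bowtie is installed and the index has been built.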
Bowtie Alignment Report Options

[Figure: Chr1 and Chr3 with @read1, labeled as a uniquely mapping read]

Bowtie Options To Control Alignment Reporting
  -k <int>  report up to <int> alignments for each read (default -k 1)
  -a        report all valid alignments for each read
  -m <int>  suppress all alignments for reads with more than <int> alignments (use -m 1 to report only uniquely mapping reads)
  -S        write output in SAM format
Bowtie2

Bowtie2 ≠ Bowtie
  Unlike v1, Bowtie v2 can perform gapped alignments
  Must choose between --local and --end-to-end alignment; default is --end-to-end
  Is generally faster and more sensitive for reads >50nt long
  Uses a scoring scheme based on matches, mismatches, gap opens, and gap extensions to decide which alignments to report
Bowtie2 end-to-end Alignment

Global/end-to-end alignment (default)
  Align the whole read
  Option: --end-to-end
  Can control how many gaps, how long, etc.

3nt insertion:
  Read:   ATGCAGCTAGCTAGCTAGCTAGCT
          |||||||||||||||   ||||||
  Genome: ATGCAGCTAGCTAGC---CTAGCT

2nt deletion, 1 mismatch:
  Read:   ATGCA--TAGCTAGCTAGCAAGCT
          |||||  |||||||||||| ||||
  Genome: ATGCAGCTAGCTAGCTAGCTAGCT
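To make the scoring scheme concrete, the two gapped examples can be scored by hand. The Python sketch below assumes what I understand to be the Bowtie2 end-to-end defaults (no match bonus, mismatch penalty of at most 6, gap open 5 and gap extend 3 for both read and reference gaps, minimum score -0.6 + -0.6 × read length) and assumes a high-quality base at the mismatch; these values should be checked against the manual for the installed version.

```python
def gap_penalty(length, open_pen=5, extend_pen=3):
    """Penalty for one gap: open cost plus extend cost per gapped position."""
    return open_pen + extend_pen * length


def min_score_end_to_end(read_len, const=-0.6, coef=-0.6):
    """Assumed default threshold in end-to-end mode: a linear function of read length."""
    return const + coef * read_len


MISMATCH_PENALTY = 6   # assumed maximum penalty, i.e. a high-quality mismatched base
read_len = 24          # length of the slide reads

threshold = min_score_end_to_end(read_len)
score_insertion = -gap_penalty(3)                          # 3nt insertion example
score_del_mm = -(gap_penalty(2) + MISMATCH_PENALTY)        # 2nt deletion + 1 mismatch

print("threshold:", threshold)
print("3nt insertion:", score_insertion, "valid:", score_insertion >= threshold)
print("2nt deletion + mismatch:", score_del_mm, "valid:", score_del_mm >= threshold)
```

Under these assumed defaults the second example scores below the threshold for a read this short, which illustrates why the gap and mismatch settings matter more for short reads than for long ones.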
Bowtie2 end-to-end Alignment: preset options

Preset alignment options:
  --very-fast
  --fast
  --sensitive (default) = -D 15 -R 2 -L 22 -i S,1,1.15
  --very-sensitive

-D sets how many consecutive seed extension attempts can fail before Bowtie2 moves on
-L dictates seed length for multiseed alignment
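The -i value in the presets is a small function of read length rather than a fixed number: as I read the Bowtie2 manual, "S,1,1.15" means 1 + 1.15 × sqrt(read length), and the result is the spacing between seeds. The Python sketch below evaluates such strings for the linear (L), square-root (S) and natural-log (G) forms; the helper name is hypothetical, and rounding or lower bounds that Bowtie2 may apply are not modeled.

```python
import math


def eval_func(spec, read_len):
    """Evaluate a function string of the form '<type>,<const>,<coef>' at read_len."""
    kind, const, coef = spec.split(",")
    const, coef = float(const), float(coef)
    if kind == "L":
        return const + coef * read_len             # linear
    if kind == "S":
        return const + coef * math.sqrt(read_len)  # square root
    if kind == "G":
        return const + coef * math.log(read_len)   # natural log
    raise ValueError(f"unsupported function type: {kind}")


for length in (50, 100):
    print(length, "nt read, --sensitive seed interval:",
          round(eval_func("S,1,1.15", length), 2))
    print(length, "nt read, --sensitive-local seed interval:",
          round(eval_func("S,1,0.75", length), 2))
```

A smaller interval means more seeds per read, which is the main way the more sensitive presets trade speed for sensitivity.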
Bowtie2 Local Alignment

  Soft trimming/clipping
  Can include gaps
  Option: --local
  Also has -local versions of the preset options
  --sensitive-local (default) = -D 15 -R 2 -N 0 -L 20 -i S,1,0.75
  -N dictates mismatches allowed in seed region

3’ and 5’ soft trimming:
  Read:   ATGCAGCTAGCTAGCTAGCTAGCT
             |||||||||||||||||||
  Genome: GCACAGCTAGCTAGCTAGCTAGAC
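Soft trimming can be illustrated by finding the best-scoring stretch of the slide example and clipping whatever lies outside it. This Python sketch is a deliberate simplification: it compares read and genome at one fixed offset without gaps, whereas Bowtie2 performs gapped dynamic programming. The match bonus of 2 and mismatch penalty of 6 are what I understand to be the local-mode defaults and are assumptions here, as is the function name.

```python
def local_ungapped(read, genome, match=2, mismatch=-6):
    """Best-scoring contiguous stretch of an ungapped, same-offset comparison.

    Returns (score, cigar); clipped read ends are written with S, the aligned
    stretch with M. Assumes at least one matching base exists.
    """
    best_score, best_start, best_end = 0, 0, 0
    score, start = 0, 0
    for i, (r, g) in enumerate(zip(read, genome)):
        score += match if r == g else mismatch
        if score <= 0:
            # A non-positive running score cannot help; restart after this base.
            score, start = 0, i + 1
        elif score > best_score:
            best_score, best_start, best_end = score, start, i + 1
    left = best_start
    aligned = best_end - best_start
    right = len(read) - best_end
    cigar = (f"{left}S" if left else "") + f"{aligned}M" + (f"{right}S" if right else "")
    return best_score, cigar


read = "ATGCAGCTAGCTAGCTAGCTAGCT"
genome = "GCACAGCTAGCTAGCTAGCTAGAC"
print(local_ungapped(read, genome))
```

The clipped bases stay in the SAM record's sequence field; only the CIGAR string marks them as not aligned.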
Bowtie2 Reporting Options
  Default is to report a single alignment per read
  -a        report all valid alignments
  -k <int>  report up to <int> alignments per read
Using Bowtie2

1.) Index reference genome with bowtie2-build (Video 1)
  bowtie indexes are not compatible with bowtie2 and vice versa
2.) Command line setup
  Set options like --local
  Reference genome index
  Read file input
  Alignment output SAM file

Most basic run of bowtie2, here in --local mode:
$ bowtie2 --local -x index -U reads.fq -S bowtie-out.sam 2> bowtie-out.stderr

If you were running paired-end reads:
$ bowtie2 --fr --local -x index -1 fwd_reads.fq -2 rev_reads2.fq -S bowtie-out.sam
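The same run can be scripted so that standard error always lands in a file next to the SAM output. The following Python sketch uses hypothetical helper names; it passes unpaired reads with -U, the documented bowtie2 flag for single-end input, and `run_and_capture_stderr` is only defined here, not executed, since it needs bowtie2 and a built index.

```python
import subprocess


def bowtie2_command(index, out_sam, unpaired=None, mate1=None, mate2=None,
                    mode="--local"):
    """Assemble a bowtie2 argument list: mode, index, reads, SAM output."""
    cmd = ["bowtie2", mode, "-x", index]
    if mate1 and mate2:
        cmd += ["-1", mate1, "-2", mate2]   # paired-end input
    elif unpaired:
        cmd += ["-U", unpaired]              # single-end input
    else:
        raise ValueError("give either unpaired or both mate1 and mate2")
    cmd += ["-S", out_sam]
    return cmd


def run_and_capture_stderr(cmd, stderr_path):
    """Run cmd and write its standard error to stderr_path (like '2> file')."""
    with open(stderr_path, "w") as err:
        return subprocess.run(cmd, stderr=err, check=True)


single = bowtie2_command("index", "bowtie-out.sam", unpaired="reads.fq")
paired = bowtie2_command("index", "bowtie-out.sam",
                         mate1="fwd_reads.fq", mate2="rev_reads2.fq")
print(" ".join(single))
print(" ".join(paired))
# Example call, once bowtie2 is installed:
# run_and_capture_stderr(single, "bowtie-out.stderr")
```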
Capture standard error from Bowtie(2)
Output SAM file can be used with a number of downstream applications
Use SAMtools to convert to BAM file
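As a quick sanity check on a SAM file before handing it to other tools, the mapped and unmapped reads can be counted from the FLAG column (bit 0x4 marks an unmapped read in the SAM specification). The records in this Python sketch are invented for illustration and are not real Bowtie output.

```python
def count_mapped(sam_text):
    """Count mapped and unmapped records in SAM text using FLAG bit 0x4."""
    mapped = unmapped = 0
    for line in sam_text.splitlines():
        if not line or line.startswith("@"):
            continue                      # skip blank lines and header lines
        flag = int(line.split("\t")[1])   # FLAG is the second column
        if flag & 0x4:
            unmapped += 1
        else:
            mapped += 1
    return mapped, unmapped


# Made-up example records: two mapped reads and one unmapped read.
example_sam = "\n".join([
    "@HD\tVN:1.0\tSO:unsorted",
    "@SQ\tSN:Chr1\tLN:1000",
    "readA\t0\tChr1\t101\t42\t24M\t*\t0\t0\tATGCAGCTAGCTAGCTAGCTAGCT\t" + "I" * 24,
    "readB\t4\t*\t0\t0\t*\t*\t0\t0\tGGGGGGGGGGGGGGGGGGGGGGGG\t" + "I" * 24,
    "readC\t16\tChr1\t501\t42\t24M\t*\t0\t0\tATGCAGCTAGCTAGCTAGCTAGCT\t" + "I" * 24,
])

print(count_mapped(example_sam))
```

For the BAM conversion itself, a command along the lines of `samtools view -b -o bowtie-out.bam bowtie-out.sam` is the usual route; the output file name here is only an example.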