Canadian Bioinformatics Workshops www.bioinformatics.ca.

Slides:



Advertisements
Similar presentations
IMGS 2012 Bioinformatics Workshop: RNA Seq using Galaxy
Advertisements

Peter Tsai Bioinformatics Institute, University of Auckland
RNAseq analysis Bioinformatics Analysis Team
RNA-seq data analysis Project
TOPHAT Next-Generation Sequencing Workshop RNA-Seq Mapping
RNA-seq Analysis in Galaxy
Bacterial Genome Assembly | Victor Jongeneel Radhika S. Khetani
Before we start: Align sequence reads to the reference genome
NGS Analysis Using Galaxy
Mapping NGS sequences to a reference genome. Why? Resequencing studies (DNA) – Structural variation – SNP identification RNAseq – Mapping transcripts.
Introduction to RNA-Seq and Transcriptome Analysis
Expression Analysis of RNA-seq Data
Bioinformatics and OMICs Group Meeting REFERENCE GUIDED RNA SEQUENCING.
Introduction to RNA-Seq & Transcriptome Analysis
NGS data analysis CCM Seminar series Michael Liang:
Next Generation DNA Sequencing
TopHat Mi-kyoung Seo. Today’s paper..TopHat Cole Trapnell at the University of Washington's Department of Genome Sciences Steven Salzberg Center.
Transcriptome Analysis
RNA-seq workshop ALIGNMENT
An Introduction to RNA-Seq Transcriptome Profiling with iPlant.
Introductory RNA-seq Transcriptome Profiling. Before we start: Align sequence reads to the reference genome The most time-consuming part of the analysis.
Introduction to RNA-Seq
Introduction To Next Generation Sequencing (NGS) Data Analysis
Cloud Implementation of GT-FAR (Genome and Transcriptome-Free Analysis of RNA-Seq) University of Southern California.
Introductory RNA-seq Transcriptome Profiling. Before we start: Align sequence reads to the reference genome The most time-consuming part of the analysis.
Galaxy – Set up your account. Galaxy – Two ways to get your data.
RNA-Seq Transcriptome Profiling. Before we start: Align sequence reads to the reference genome The most time-consuming part of the analysis is doing the.
Trinity College Dublin, The University of Dublin GE3M25: Data Analysis, Class 4 Karsten Hokamp, PhD Genetics TCD, 07/12/2015
IGV tools. Pipeline Download genome from Ensembl bacteria database Export the mapping reads file (SAM) Map reads to genome by CLC Using the mapping.
The iPlant Collaborative
The iPlant Collaborative
Comparative transcriptomics of fungi Group Nicotiana Daan van Vliet, Dou Hu, Joost de Jong, Krista Kokki.
Computing on TSCC Make a folder for the class and move into it –mkdir –p /oasis/tscc/scratch/username/biom262_harismendy –cd /oasis/tscc/scratch/username/biom262_harismendy.
Short Read Workshop Day 5: Mapping and Visualization
Canadian Bioinformatics Workshops
Canadian Bioinformatics Workshops
Canadian Bioinformatics Workshops
Introduction to Exome Analysis in Galaxy Carol Bult, Ph.D. Professor Deputy Director, JAX Cancer Center Short Course Bioinformatics Workshops 2014 Disclaimer…I.
Canadian Bioinformatics Workshops
Canadian Bioinformatics Workshops
Canadian Bioinformatics Workshops
Canadian Bioinformatics Workshops
Canadian Bioinformatics Workshops
Canadian Bioinformatics Workshops
Group Medicago Basic Project: Gene expression in yeast Advanced Bioinformatics.
Canadian Bioinformatics Workshops
Canadian Bioinformatics Workshops
RNA Seq Analysis Aaron Odell June 17 th Mapping Strategy A few questions you’ll want to ask about your data… - What organism is the data from? -
Introductory RNA-seq Transcriptome Profiling of the hy5 mutation in Arabidopsis thaliana.
Canadian Bioinformatics Workshops
Canadian Bioinformatics Workshops
Canadian Bioinformatics Workshops
Canadian Bioinformatics Workshops
RNA-Seq with the Tuxedo Suite Monica Britton, Ph.D. Sr. Bioinformatics Analyst September 2015 Workshop.
Introductory RNA-seq Transcriptome Profiling
NGS File formats Raw data from various vendors => various formats
GCC Workshop 9 RNA-Seq with Galaxy
WS9: RNA-Seq Analysis with Galaxy (non-model organism )
RNA Sequencing Day 7 Wooohoooo!
Short Read Sequencing Analysis Workshop
Canadian Bioinformatics Workshops
Canadian Bioinformatics Workshops
Introductory RNA-Seq Transcriptome Profiling
Maximize read usage through mapping strategies
Canadian Bioinformatics Workshops
Computational Pipeline Strategies
Introduction to RNA-Seq & Transcriptome Analysis

RNA-Seq Data Analysis UND Genomics Core.
Quality Control & Nascent Sequencing
Presentation transcript:

Canadian Bioinformatics Workshops

2Module #: Title of Module

Module 2 RNA-seq alignment and visualization (tutorial)

Module 2 – RNA-seq alignment and visualization bioinformatics.ca Learning Objectives of Tutorial Run Bowtie2/TopHat2 (or STAR) with parameters suitable for gene expression analysis Use samtools to demonstrate the features of the SAM/BAM format and basic manipulation of these alignment files (view, sort, index, filter) Use IGV to visualize RNA-seq alignments, view a variant position, etc. Determine BAM-read counts at a variant position Use samtools flagstat, samstat, FastQC to assess quality of alignments

Module 2 – RNA-seq alignment and visualization bioinformatics.ca Tutorial files One part – Tutorial_Module2_Linux.txt Use Bowtie2/Tophat2 to align reads to the genome Compare performance of STAR aligner Examine features of SAM/BAM files Prepare files for loading in IGV Perform bam-read-count Create QC reports using samtools, FastQC, samstat

Module 2 – RNA-seq alignment and visualization bioinformatics.ca 6. Align reads with tophat Align all reads in the 8 libraries of the test data – 8 libraries with two files each (one for each read1 and read2 of the paired-end reads) Use tophat for the alignment – Supply the gene GTF file obtained in step 3 – Supply the bowtie indexed genome obtained in step 4 – The ‘-G’ option tells tophat to look for the exon-exon junctions of known transcripts. It will still look for novel exon-exon junctions as well Since there are 8 libraries in the test data set, 8 alignment commands are run On a test system, each of these alignments took ~1.5 minutes using 8 CPUs Each alignment job outputs a SAM/BAM file –

Module 2 – RNA-seq alignment and visualization bioinformatics.ca 6b. Align reads with STAR Again, align all reads in the 8 libraries of the test data, now with STAR – Supply the same gene GTF file obtained in step 3 – Supply the STAR indexed genome obtained in step 4 – The ‘-outSAMstrandField intronMotif’is needed so that STAR produces an alignment compatible with cufflinks How long did the alignment take compared to tophat? What additional steps are needed?

Module 2 – RNA-seq alignment and visualization bioinformatics.ca 7. Post-alignment vizualization Create indexed versions of bam files – These are needed by IGV for efficient loading of alignments Visualize spliced alignments – Identify exon-exon junction supporting reads – Identify differentially expressed genes – Compare tophat and STAR alignments Try to find variant positions Create a pileup from bam file Determine read counts at a specific position

Module 2 – RNA-seq alignment and visualization bioinformatics.ca 7. Post-alignment vizualization (IGV)

Module 2 – RNA-seq alignment and visualization bioinformatics.ca 8. Post-alignment QC Use 'samtools view' to see the format of a SAM/BAM alignment file – Use ‘FLAGs’ to filter out certain kinds of alignments Use 'samtools flagstat' to get a basic summary of an alignment Run samstat on Tumor/Normal BAMs and review the resulting report in your browser Use FastQC to perform basic QC of your alignments

Module 2 – RNA-seq alignment and visualization bioinformatics.ca 8. Post-alignment QC (samstat)

Module 2 – RNA-seq alignment and visualization bioinformatics.ca We are on a Coffee Break & Networking Session