Dr. Christoph W. Sensen and Dr. Jung Soh, Trieste Course 2017


RNA-Seq

What is RNA-Seq?
- An experimental protocol that uses next-generation sequencing technologies to sequence the messenger RNA molecules within a biological sample in an effort to determine the primary sequence and relative abundance of each mRNA
  Martin JA, Wang Z (2011) Next-generation transcriptome assembly. Nat Rev Genet. 12(10):671-682
- Also known as “Whole Transcriptome Shotgun Sequencing” (WTSS)

Sequencing strategy
- Combination of ½-plate of 454 and 1 lane of 108PE Illumina sequencing
  - excellent depth and coverage
  - high-quality assemblies
- Submission of total RNA samples
  - improves quality control
  - takes better advantage of sequencing facilities
  - similar overall cost
- 76SE Illumina sequencing on selected species for comparative transcriptomics
- Repeat sequencing in rare cases of low-quality initial output
[Workflow diagram labels: Plant material; Metabolite profiling; Biochemistry PIs; Total RNA extraction; Bioanalyzer (RNA quality); mRNA isolation; cDNA libraries; Genome Québec Innovation Centre; 454 (1/2-plate); Illumina 1 lane 108PE; Reference transcriptomes (75); Bioinformatics Innovation Centre; Bioinformatics]

RNA-Seq workflow
[Figure: RNA-Seq workflow diagram]
Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet. 10(1):57-63.

RNA-Seq vs. microarray

| Characteristics | RNA-Seq | Microarray |
|---|---|---|
| Which transcripts? | All in a sample | Only those for which probes are designed |
| Transcript sequence generation | Yes | No |
| Low-abundance transcript detection | | Limited |
| Abundance info source | Count (of the reads aligned to gene) | Fluorescence level (of the probe spot for gene) |
| Resolution | Base | Probe sequence |
| Background noise | Low | High |
| Additional info | Alternative splicing, transcriptome-level variation | |

RNA-Seq data analysis
- Map reads
  - Input: lots of short reads, reference genome
  - Output: table of mapped loci per read
- Bin reads to features
  - Input: feature annotation (exons, genes, transcripts)
  - Output: table of counts per feature
- Usually combined in a tool
- Normalize counts
  - Output: table of normalized quantification values per feature
- Detect differentially expressed (DE) features
  - Output: DE features

Mapping reads
- Need a reference genome
- Issues
  - Huge amounts of data
  - Reads spanning across exon junction
  - Alternative splicing
  - Reads mapping to multiple locations in the genome
- Most common mapping results format
  - SAM: sequence alignment/map
  - BAM: binary format of SAM
- Many tools
  - Bowtie, SOAP, BWA, SHRiMP, mrFAST, mrsFAST, ZOOM, SSAHA2, Mosaik
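To make the "table of mapped loci per read" concrete, here is a minimal Python sketch that reads the first six mandatory, tab-separated SAM fields (QNAME, FLAG, RNAME, POS, MAPQ, CIGAR) and skips header lines and unmapped reads (FLAG bit 0x4). The names `MappedRead`, `parse_sam_line` and the three example records are hypothetical and made up for illustration; a real analysis would more likely rely on an established SAM/BAM library.

```python
from typing import NamedTuple, Optional


class MappedRead(NamedTuple):
    """One row of the 'mapped loci per read' table (illustrative name)."""
    read_name: str
    reference: str
    position: int  # 1-based leftmost mapping position
    mapq: int
    cigar: str


def parse_sam_line(line: str) -> Optional[MappedRead]:
    """Return the mapped locus of one SAM record, or None for headers and unmapped reads."""
    if line.startswith("@"):  # header lines start with '@'
        return None
    fields = line.rstrip("\n").split("\t")
    qname, flag, rname, pos, mapq, cigar = fields[:6]
    if int(flag) & 0x4:  # FLAG bit 0x4 marks an unmapped read
        return None
    return MappedRead(qname, rname, int(pos), int(mapq), cigar)


# Made-up example records: one header, one mapped read, one unmapped read.
example_sam = [
    "@SQ\tSN:chr1\tLN:1000",
    "read1\t0\tchr1\t101\t60\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tTTTTTTTTTT\tIIIIIIIIII",
]
mapped_reads = [r for r in map(parse_sam_line, example_sam) if r is not None]
```

Only `read1` survives the filter here, mapped to `chr1` at position 101.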

Bowtie

Binning reads
- Need annotated features
  - Exons, genes, transcripts
- For each feature, the total number of reads mapped is produced
  - Not directly comparable across features/samples yet
  - Usually followed by normalization
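The binning step can be sketched as a simple interval-overlap count. This is a simplified illustration, not how any particular tool does it: the function name `count_reads_per_feature`, the feature names and all coordinates are assumptions, reads are treated as plain (reference, start, end) intervals, and a read overlapping two features is counted for both.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# (reference, start, end), 1-based inclusive coordinates
Interval = Tuple[str, int, int]


def count_reads_per_feature(reads: Iterable[Interval],
                            features: Dict[str, Interval]) -> Counter:
    """Count, for each annotated feature, the reads that overlap it."""
    counts = Counter({name: 0 for name in features})
    for ref, read_start, read_end in reads:
        for name, (feat_ref, feat_start, feat_end) in features.items():
            same_reference = ref == feat_ref
            overlaps = read_start <= feat_end and read_end >= feat_start
            if same_reference and overlaps:
                counts[name] += 1
    return counts


# Hypothetical annotation and mapped reads.
features = {"geneA": ("chr1", 100, 199), "geneB": ("chr1", 300, 399)}
reads = [
    ("chr1", 90, 110),   # overlaps geneA
    ("chr1", 150, 180),  # overlaps geneA
    ("chr1", 310, 340),  # overlaps geneB
    ("chr2", 100, 130),  # different reference, counted nowhere
]
counts = count_reads_per_feature(reads, features)
```

The nested loop is quadratic and only suitable for toy data; the resulting raw counts (2 for `geneA`, 1 for `geneB`) still need normalization before they can be compared.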

Normalizing counts
- Why normalize?
  - Longer features have more reads mapped
  - Deeper sequencing produces more reads
- RPKM is most frequently used
  - Reads Per Kilobase per Million reads
  - Defined as C/(LN)
    - C = number of reads mapped to a feature
    - L = length of the feature (in kilobases)
    - N = total number of reads from the sample (in millions)
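The RPKM definition C/(LN) translates directly into a few lines of Python. The function name `rpkm` and the example numbers are made up for illustration; the only thing taken from the slide is the formula and the units of L (kilobases) and N (millions of reads).

```python
def rpkm(read_count: int, feature_length_bp: int, total_reads: int) -> float:
    """Reads Per Kilobase per Million reads: C / (L * N)."""
    length_kb = feature_length_bp / 1_000      # L, in kilobases
    total_millions = total_reads / 1_000_000   # N, in millions of reads
    return read_count / (length_kb * total_millions)


# Two hypothetical features with the same read count in a 10-million-read sample:
# the longer feature receives the lower RPKM.
long_feature = rpkm(read_count=500, feature_length_bp=2_000, total_reads=10_000_000)
short_feature = rpkm(read_count=500, feature_length_bp=1_000, total_reads=10_000_000)
```

With these numbers the 2 kb feature gets an RPKM of 25 and the 1 kb feature gets 50, which shows how the length term compensates for longer features collecting more reads.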

RPKM examples http://jura.wi.mit.edu/bio/education/hot_topics/RNAseq/RNA_Seq.pdf

Gene model predicted for fungus Trametes versicolor using Augustus and RNA-seq hints
Above is a screenshot of a GBrowse instance for the fungal species Trametes versicolor for the Genozymes project. The project is sequencing both DNA and transcriptome (RNA-seq), and COE is responsible for annotation.
Example of a gene predicted using the ab initio predictor Augustus (Confident models), using hints from RNA-seq to check the accuracy of the prediction:
- Hints are built from short-read alignment of Illumina RNA-seq spliced reads onto the genome (Mapped Reads)
- Spliced reads show direct evidence of introns (next slide)
- Hints are used with ab initio predictors (Augustus) during training and prediction stages
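One way spliced reads give direct evidence of introns is through the `N` operation in a SAM CIGAR string, which marks a region of the reference skipped by the read. The sketch below derives intron intervals from an alignment's start position and CIGAR. It only illustrates the idea and is not the hint-building procedure used in the project; the name `introns_from_cigar` and the example alignment are assumptions.

```python
import re
from typing import List, Tuple

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REFERENCE_CONSUMING = set("MDN=X")  # operations that advance along the reference


def introns_from_cigar(position: int, cigar: str) -> List[Tuple[int, int]]:
    """Return 1-based inclusive reference intervals skipped by 'N' operations."""
    introns = []
    ref_pos = position
    for length_text, op in CIGAR_OP.findall(cigar):
        length = int(length_text)
        if op == "N":  # skipped reference region, i.e. a candidate intron
            introns.append((ref_pos, ref_pos + length - 1))
        if op in REFERENCE_CONSUMING:
            ref_pos += length
    return introns


# Hypothetical spliced read: 20 aligned bases, a 500-base gap, 30 aligned bases.
spliced = introns_from_cigar(position=101, cigar="20M500N30M")
unspliced = introns_from_cigar(position=101, cigar="50M")
```

For the spliced example the read covers reference positions 101 to 120, so the inferred intron spans 121 to 620; the unspliced read yields no intron.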

Splice Variants

“non-coding” RNA molecules LincRNA-p21 Tran et al., In press

Example
- MIRA Assembly
  - Contig: T_rep_c1201
  - Read members: 96
  - Length: 2429 bp
- Combined Assembly
  - T_rep_c1201 is part of a 6 member contig
  - 2 are partial transcripts assembled by PTA

Detecting Differential Expression
- Compare quantification values across samples or across features
- Most tools summarize/normalize counts and suggest DE features
  - Cufflinks/Cuffdiff, R packages (DESeq, edgeR, baySeq, TSPM), SAMtools
- DE features go through similar analysis to microarray data analysis (e.g. validation)
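As a rough illustration of comparing quantification values across samples, the sketch below computes a log2 fold change between two hypothetical samples and flags features that change at least two-fold. This is only a ranking heuristic, not what tools such as Cuffdiff, DESeq or edgeR do (they use replicates and statistical models). The pseudocount of 1, the two-fold threshold, and all names and values are assumptions.

```python
import math
from typing import Dict, List


def log2_fold_change(value_a: float, value_b: float, pseudocount: float = 1.0) -> float:
    """log2 of (B + pseudocount) / (A + pseudocount); the pseudocount avoids division by zero."""
    return math.log2((value_b + pseudocount) / (value_a + pseudocount))


def candidate_de_features(sample_a: Dict[str, float],
                          sample_b: Dict[str, float],
                          min_abs_log2fc: float = 1.0) -> List[str]:
    """Names of features whose absolute log2 fold change reaches the threshold."""
    return [
        name for name in sample_a
        if abs(log2_fold_change(sample_a[name], sample_b[name])) >= min_abs_log2fc
    ]


# Made-up normalized quantification values for two samples.
control = {"geneA": 7.0, "geneB": 15.0, "geneC": 31.0}
treated = {"geneA": 31.0, "geneB": 15.0, "geneC": 7.0}
candidates = candidate_de_features(control, treated)
```

Here `geneA` (log2 fold change of 2) and `geneC` (log2 fold change of -2) are flagged and `geneB` is not. Features flagged this way would still need statistical testing and validation.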

Cufflinks

Cufflinks Tutorial https://docs.google.com/document/d/1t1gi2Djxd0ykMVe2bF8BVOBsOsPngjFh2999u3rZq-A/edit?hl=en&authkey=CKL1i8sD#

Anaerobic biocorrosion in reactors filled with WP-LS medium

SSV1 Replication Cycle (UV Induced)