Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520

RNA-Seq

RNA-seq Protocol (Martin and Wang, Nat. Rev. Genet. 2011)

RNA-seq Applications
- Expression levels, differential expression
- Alternative splicing, novel isoforms
- Novel genes or transcripts, lncRNA
- Detect gene fusions
- Many different protocols
- Can use on any sequenced genome
- Better dynamic range, cleaner data

Experimental Design
- Assessing biological variation requires biological replicates (no need for technical replicates)
- 3 replicates preferred, 2 OK, 1 only for exploratory assays (not good for publications)
- For differential expression, don't pool RNA from multiple biological replicates
- Batch effects still exist; try to be consistent or process all samples at the same time

Experimental Design
- Ribo-minus (remove overly abundant rRNA)
- PolyA (mRNA, enrich for exons)
- Strand-specific (anti-sense lncRNA)
- Sequencing: PE (resolves redundancy) or SE for expression; PE for splicing and novel transcripts
- Depth: 30-50M reads for differential expression, deeper for transcript assembly
- Read length: longer for transcript assembly

RNA-seq Analysis

Alignment
- Prefer splice-aware aligners: TopHat, STAR (the aligner, not the DNASTAR software)
- BWA is also used, but it is not splice-aware on its own
- Sometimes need to trim the beginning bases of reads

Transcript Assembly
- Reference-based assembly: Cufflinks
- De novo assembly: Trinity

Quality Control: RSeQC

Expression Index
- RPKM (reads per kilobase of transcript per million reads of library)
  - Corrects for coverage and gene length
  - 1 RPKM ≈ 0.3-1 transcript per cell
  - Comparable between different genes within the same dataset
  - Tools: TopHat / Cufflinks
- FPKM (fragments per kilobase per million): for PE libraries, counts fragments instead of reads, so roughly RPKM/2
- TPM (transcripts per million)
  - Normalizes to transcript copies instead of reads
  - Accounts for longer transcripts having more reads
  - Tools: RSEM, HTSeq
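The two expression indices above can be sketched in a few lines. This is a minimal illustration, not the code used by Cufflinks or RSEM: the function names `rpkm` and `tpm` and the toy counts and lengths are assumptions made up for the example, and real tools use effective lengths and handle multi-mapped reads.

```python
def rpkm(counts, lengths_bp):
    """Reads per kilobase of transcript per million mapped reads."""
    total_reads = sum(counts)
    # count / (length in kb) / (library size in millions) = count * 1e9 / (length_bp * total)
    return [c * 1e9 / (length * total_reads) for c, length in zip(counts, lengths_bp)]


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then scale the rates to sum to 1e6."""
    rates = [c / length for c, length in zip(counts, lengths_bp)]
    total_rate = sum(rates)
    return [r * 1e6 / total_rate for r in rates]


# Hypothetical toy data: read counts and transcript lengths (bp) for three genes.
counts = [100, 800, 100]
lengths_bp = [1000, 8000, 2000]

rpkm_values = rpkm(counts, lengths_bp)
tpm_values = tpm(counts, lengths_bp)
```

In this toy data the first two genes have the same reads per kilobase, so they get the same RPKM and the same TPM even though one has eight times as many reads. TPM values always sum to one million within a sample, which RPKM values do not.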

Differential Expression

Sequencing Read Distribution
- Poisson distribution: # of events within an interval
- Sequencing data is overdispersed Poisson
- Negative binomial: # of successes before r failures occur, if the probability of each success is p
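Written out, the definition above gives the following probability mass function and moments. The symbols k (number of successes), μ (mean) and α (dispersion) are introduced here for illustration; the slide itself only names r and p.

```latex
% Negative binomial: k successes before the r-th failure, success probability p
P(X = k) = \binom{k + r - 1}{k} \, p^{k} (1 - p)^{r}, \qquad k = 0, 1, 2, \dots

\mathbb{E}[X] = \mu = \frac{r p}{1 - p}, \qquad
\operatorname{Var}[X] = \frac{r p}{(1 - p)^{2}} = \mu + \frac{\mu^{2}}{r} = \mu + \alpha \mu^{2},
\quad \alpha = \frac{1}{r}

% Poisson for comparison: variance equals the mean
\operatorname{Var}[X] = \mu
```

The extra term αμ² is what "overdispersed Poisson" means in practice: as α goes to 0 (r goes to infinity with μ fixed), the negative binomial approaches the Poisson.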

Differential Expression
- Negative binomial for RNA-seq
- Variance estimated by borrowing information from all the genes (hierarchical models)
- Test whether the mean μ_i is the same for gene i between samples j
- FDR?
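The slide raises FDR without naming a procedure. One common choice when testing thousands of genes is the Benjamini-Hochberg adjustment, sketched below in plain Python. The function name `benjamini_hochberg` and the four p-values are assumptions for illustration only; they do not come from the slides.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original gene order."""
    m = len(p_values)
    # Gene indices sorted by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-gene p-values from a differential expression test.
p_values = [0.01, 0.04, 0.03, 0.20]
q_values = benjamini_hochberg(p_values)
```

Genes whose adjusted value falls below a chosen threshold (0.05 is a frequent convention) are then reported as differentially expressed.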

Differential Expression
- Should we do differential expression on RPKM/FPKM or TPM?
- Cufflinks: RPKM/FPKM
- LIMMA-VOOM and DESeq: read counts (both take raw counts as input, not TPM)
- Power to detect DE is proportional to length, e.g. Gene A (1 kb) vs. Gene B (8 kb)
- Continued development and updates
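The Gene A / Gene B comparison can be made concrete with a little arithmetic. The sketch below assumes both genes are expressed at the same level and that counts are Poisson, which understates the noise in real, overdispersed data; the figure of 50 reads per kilobase is a made-up value.

```python
import math


def poisson_cv(expected_count):
    """Coefficient of variation of a Poisson count: sd / mean = 1 / sqrt(mean)."""
    return 1.0 / math.sqrt(expected_count)


# Hypothetical coverage: both genes are equally expressed, so reads scale with length.
reads_per_kb = 50.0
gene_lengths_kb = {"Gene A": 1.0, "Gene B": 8.0}

expected_reads = {gene: reads_per_kb * kb for gene, kb in gene_lengths_kb.items()}
relative_noise = {gene: poisson_cv(n) for gene, n in expected_reads.items()}
```

Gene B collects eight times as many reads as Gene A, so its count is measured with less relative noise, and the same fold change is easier to detect for the longer gene.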

Alternative Splicing
- Assign reads to splice isoforms

Isoform Inference
- Given a known set of isoforms
- Estimate x to maximize the likelihood of observing n
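One standard way to do this kind of maximum-likelihood estimation is expectation-maximization (EM); the slide does not say which algorithm is used, so the sketch below is an assumption. It takes x to be the isoform abundances and n the observed reads, models each read as drawn uniformly along its isoform of origin, and uses made-up names (`em_isoform_abundance`, `compat`, `eff_lengths`) and toy data.

```python
def em_isoform_abundance(compat, eff_lengths, n_iter=200):
    """Estimate the fraction of reads from each isoform by EM.

    compat[r] is the set of isoform indices that read r is compatible with;
    eff_lengths[j] is the (effective) length of isoform j.
    """
    k = len(eff_lengths)
    theta = [1.0 / k] * k  # start from equal read fractions
    n_reads = len(compat)
    for _ in range(n_iter):
        expected = [0.0] * k
        for isoforms in compat:
            # E-step: P(read came from isoform j) is proportional to theta_j / length_j.
            weights = {j: theta[j] / eff_lengths[j] for j in isoforms}
            total = sum(weights.values())
            for j, w in weights.items():
                expected[j] += w / total
        # M-step: new read fractions are the expected read assignments.
        theta = [e / n_reads for e in expected]
    return theta


# Hypothetical gene with two isoforms: isoform 0 (300 bp) includes a middle exon,
# isoform 1 (200 bp) skips it. Most reads are compatible with both.
compat = [{0, 1}] * 6 + [{0}] * 3 + [{1}] * 1
eff_lengths = [300.0, 200.0]
read_fractions = em_isoform_abundance(compat, eff_lengths)

# Sanity case: with only uniquely assignable reads, EM returns the raw proportions.
unique_only = em_isoform_abundance([{0}] * 3 + [{1}] * 1, eff_lengths)
```

Dividing each read fraction by its isoform length and renormalizing gives relative transcript abundances. With only ten reads the estimates here are very uncertain.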

Known Isoform Abundance Inference

Isoform Inference
- With a known isoform set, gene-level expression inference is sometimes very good even though isoform abundances have large uncertainty (e.g. when the known set is incomplete)
- De novo isoform inference is a non-identifiable problem if RNA-seq reads are short and the gene is long with too many exons
- Algorithm: MATS

Gene Fusion
- Seen more often in cancer samples
- Still a bit hard to call
- TopHat-Fusion in TopHat2
- Maher et al., Nature 2009

Other Applications
- RNA editing
  - Change in the RNA sequence after transcription
  - Most frequent: A to I (behaves like G), C to U
  - Editing enzymes evolved from mononucleotide deaminases; might be involved in RNA degradation
- Circular RNA
  - Mostly arises from splicing
  - Varying length, abundance, and stability
  - Possible function: sponge for RBP or miRNA

Summary
- RNA-seq design considerations
- Read mapping: TopHat, BWA, STAR
- De novo transcriptome assembly: Trinity
- Expression index: FPKM and TPM
- Differential expression
  - Cufflinks: versatile
  - LIMMA-VOOM and DESeq: better variance estimates
- Alternative splicing: MATS
- Gene fusion, RNA editing, circular RNA

Acknowledgement: Alisha Holloway, Simon Andrews, Radhika Khetani