An Introduction to RNA-Seq Data and Differential Expression Tools in R


Is it numbers? An Introduction to RNA-Seq Data and Differential Expression Tools in R
Kara Martinez, PhD Student at North Carolina State University

Central Dogma of Biology
Motsinger, A. (2017). Types of Biological Data [PowerPoint slides]. Retrieved from North Carolina State University ST 810.

What do we want to measure?
We are interested in analyzing gene expression:
  Whether a gene in the DNA is being used or not
  If so, to what degree?

What can we measure?
RNA Sequencing (RNA-Seq) data measures the presence and quantity of mRNA in a sample at a time point.
  Presence of mRNA: measures whether or not a gene is being expressed
  Quantity of mRNA: measures to what extent a gene is being expressed
  "At a time point": the transcriptome (the set of all mRNA in an organism) is constantly changing
https://en.wikipedia.org/wiki/RNA-Seq

Is it numbers?
[Figure: raw reads output]
Prithwishpal (2015, July 23). BaseMount: A Linux command line interface for BaseSpace. Retrieved from https://blog.basespace.illumina.com/2015/07/23/basemount-a-linux-command-line-interface-for-basespace/

Illumina Sequencing
3402 Bioinformatics Group. Next Generation Sequencing. Retrieved from http://www.3402bioinformaticsgroup.com/service/

Illumina Sequencing
Sequencing machine parameters
  Flowcell lane
  Tile number
  Coordinates of cluster within that tile
Read sequence
Quality values for the sequence
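The fields listed above are what a single FASTQ record carries. As a minimal sketch, the Python below parses one record. It assumes the colon-separated header layout used by recent Illumina software (instrument:run:flowcell:lane:tile:x:y) and Phred+33 quality encoding. The record itself is made up for illustration, and older header layouts would need a different split.

```python
from typing import NamedTuple


class IlluminaRead(NamedTuple):
    instrument: str
    run_id: str
    flowcell: str
    lane: int
    tile: int
    x: int
    y: int
    sequence: str
    qualities: list  # one Phred score per base


def parse_fastq_record(lines):
    """Parse one four-line FASTQ record: header, sequence, '+', quality."""
    header, sequence, separator, quality = [line.strip() for line in lines]
    if not header.startswith("@") or not separator.startswith("+"):
        raise ValueError("not a FASTQ record")
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality lengths differ")
    # Assumed layout: the text before the first space holds seven
    # colon-separated machine and cluster fields.
    fields = header[1:].split(" ")[0].split(":")
    if len(fields) != 7:
        raise ValueError("unexpected header layout")
    instrument, run_id, flowcell, lane, tile, x, y = fields
    # Phred+33 encoding (assumed): score = ASCII code - 33.
    phred = [ord(ch) - 33 for ch in quality]
    return IlluminaRead(instrument, run_id, flowcell,
                        int(lane), int(tile), int(x), int(y),
                        sequence, phred)


# Hypothetical record, invented for this example.
record = [
    "@SIM:1:FCX:2:1101:15343:197393 1:N:0:ATCACG",
    "GATTACA",
    "+",
    "IIIIHH#",
]
read = parse_fastq_record(record)
print(read.lane, read.tile, (read.x, read.y), read.qualities)
```

So the raw output is text, but every base comes with a number: its quality score.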

Sequence Alignment
JBrowse Configuration Guide (2012). Generic Model Organism Database (GMOD). Retrieved from http://gmod.org/wiki/JBrowse_Configuration_Guide

Tools for Sequence Alignment
TopHat2: uses Bowtie and is part of the Tuxedo Suite
GSNAP: Genomic Short-read Nucleotide Alignment Program
STAR: Spliced Transcripts Aligned to a Reference

Transcript Quantification

Transcript Quantification
Raw count of mapped reads
  HTSeq-Count: Python-based
  featureCounts: R package (wrapper for compiled C code); faster and requires less memory
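To make "raw count of mapped reads" concrete, here is a toy sketch of counting by interval overlap. It is not the HTSeq-Count or featureCounts implementation. It assumes half-open coordinates, ignores strand and splicing, and counts a read only when it overlaps exactly one gene. The gene and read coordinates are invented.

```python
from collections import Counter


def count_reads(genes, reads):
    """Count reads per gene; a read counts only if it overlaps exactly one gene.

    genes: dict of name -> (chrom, start, end), half-open [start, end)
    reads: list of (chrom, start, end), half-open [start, end)
    """
    counts = Counter({name: 0 for name in genes})
    for chrom, start, end in reads:
        hits = [
            name
            for name, (g_chrom, g_start, g_end) in genes.items()
            if g_chrom == chrom and start < g_end and g_start < end
        ]
        if len(hits) == 1:
            counts[hits[0]] += 1
        elif hits:
            counts["__ambiguous"] += 1   # overlaps more than one gene
        else:
            counts["__no_feature"] += 1  # overlaps no gene
    return dict(counts)


# Hypothetical annotation and aligned reads.
genes = {"geneA": ("chr1", 100, 200), "geneB": ("chr1", 180, 300)}
reads = [
    ("chr1", 110, 160),  # geneA only
    ("chr1", 120, 170),  # geneA only
    ("chr1", 150, 185),  # geneA and geneB
    ("chr1", 250, 290),  # geneB only
    ("chr1", 400, 450),  # no gene
    ("chr2", 110, 160),  # different chromosome
]
table = count_reads(genes, reads)
print(table)
```

The result is one integer per gene per sample, which is the input the count-based differential expression tools expect.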

Transcript Quantification
Estimate the counts
  RSEM: RNA-Seq by Expectation Maximization
    Uses Gibbs sampling to come up with 95% CIs for the ML estimates
  Cufflinks: uses TopHat output in an EM algorithm
  Other algorithms: Kallisto, eXpress, Sailfish
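The idea behind EM-based count estimation can be sketched on a toy problem where some reads map to more than one transcript. This is only an illustration of the EM idea, not RSEM's or Cufflinks' model. It assumes all transcripts have the same length, uses no read error model, and does no Gibbs sampling. The function name and the read data are hypothetical.

```python
def em_abundances(compatibility, n_transcripts, n_iter=200):
    """Estimate transcript proportions from read-to-transcript compatibility.

    compatibility: one non-empty list of transcript indices per read,
    naming the transcripts that read could have come from.
    """
    theta = [1.0 / n_transcripts] * n_transcripts  # start from equal shares
    expected = [0.0] * n_transcripts
    for _ in range(n_iter):
        # E-step: split each read across its compatible transcripts
        # in proportion to the current abundance estimates.
        expected = [0.0] * n_transcripts
        for compatible in compatibility:
            total = sum(theta[t] for t in compatible)
            for t in compatible:
                expected[t] += theta[t] / total
        # M-step: new proportions are the expected shares of all reads.
        theta = [c / len(compatibility) for c in expected]
    return theta, expected


# Hypothetical data: 6 reads unique to transcript 0, 2 unique to
# transcript 1, and 4 reads compatible with both.
reads = [[0]] * 6 + [[1]] * 2 + [[0, 1]] * 4
theta, expected_counts = em_abundances(reads, n_transcripts=2)
print(theta, expected_counts)
```

In this example the four ambiguous reads end up split 3 to 1, following the ratio of the uniquely mapped reads, so the estimated counts need not be whole numbers.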

Is it numbers?
Anders, S., McCarthy, D., et al. (2013) Count-based differential expression analysis of RNA sequencing data using R and Bioconductor. Nature Protocols, 8(9), 1765–1786. doi:10.1038/nprot.2013.099

Differential Expression Analysis Tools
Negative Binomial Models: edgeR, DESeq2
Poisson Models: GPSeq
Empirical Bayes: EBSeq, baySeq
Mixed Models: maSigPro, DyNB (MATLAB), timeSeq package

Differential Expression Analysis Tools
Negative Binomial Models: edgeR, DESeq2
  Both take count data
  Similar in functionality and performance
  Estimate dispersion parameters differently
    edgeR: more sensitive to outliers
    DESeq2: less powerful
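For readers unfamiliar with the dispersion parameter: under the negative binomial model the variance of a gene's counts is mean + dispersion × mean², so a dispersion of zero reduces to the Poisson case. The sketch below is a simple method-of-moments estimate for one gene. It is not how edgeR or DESeq2 estimate dispersion, since both share information across genes. It also assumes the counts are already comparable across samples (no library-size normalization), and the counts are invented.

```python
from statistics import mean, variance


def moment_dispersion(counts):
    """Method-of-moments dispersion for one gene's replicate counts.

    Solves  variance = mean + dispersion * mean**2  for the dispersion,
    truncating at zero when the data are no more variable than Poisson.
    """
    m = mean(counts)
    v = variance(counts)  # sample variance, n - 1 denominator
    if m == 0:
        return 0.0
    return max(0.0, (v - m) / (m * m))


# Hypothetical replicate counts for two genes.
overdispersed = moment_dispersion([10, 20, 30])  # mean 20, variance 100
poisson_like = moment_dispersion([9, 10, 11])    # variance below the mean
print(overdispersed, poisson_like)
```

With only a few replicates this per-gene estimate is very noisy, which is the motivation for the more elaborate estimators in the packages above.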

edgeR Output
Anders, S., McCarthy, D., et al. (2013) Count-based differential expression analysis of RNA sequencing data using R and Bioconductor. Nature Protocols, 8(9), 1765–1786. doi:10.1038/nprot.2013.099

Output
MMC, Volcano Plot, Venn Diagram
http://mmc.gnets.ncsu.edu
http://www.gettinggeneticsdone.com/2014/05/r-volcano-plots-to-visualize-rnaseq-microarray.html
Wong et al. (2015) BMC Genomics 16:425
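A volcano plot places each gene at x = log2 fold change and y = -log10(p-value), so genes with large and statistically significant changes sit in the upper corners. The sketch below computes those coordinates and a label for each gene. The cutoffs (|log2 fold change| of at least 1, p below 0.05), the function name, and the gene results are illustrative assumptions, not values from the talk. P-values must be greater than zero.

```python
import math


def volcano_points(results, lfc_cutoff=1.0, p_cutoff=0.05):
    """Return (gene, x, y, label) tuples for a volcano plot.

    results: iterable of (gene, log2_fold_change, p_value), with p_value > 0
    """
    points = []
    for gene, log2_fc, p_value in results:
        y = -math.log10(p_value)
        if p_value < p_cutoff and log2_fc >= lfc_cutoff:
            label = "up"
        elif p_value < p_cutoff and log2_fc <= -lfc_cutoff:
            label = "down"
        else:
            label = "ns"  # not significant or change too small
        points.append((gene, log2_fc, y, label))
    return points


# Hypothetical differential expression results.
results = [
    ("gene1", 2.5, 0.001),
    ("gene2", -1.8, 0.01),
    ("gene3", 0.3, 0.0005),
    ("gene4", 3.0, 0.2),
]
points = volcano_points(results)
for point in points:
    print(point)
```

The labelled points can then be passed to any plotting library.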

Bibliography: Detailed protocols
Anders, S., McCarthy, D., et al. (2013) Count-based differential expression analysis of RNA sequencing data using R and Bioconductor. Nature Protocols, 8(9), 1765–1786. doi:10.1038/nprot.2013.099
Chen, Y., McCarthy, D., et al. (2008) edgeR: differential expression analysis of digital gene expression data user's guide.
Conesa, A., Madrigal, P., et al. (2016) A survey of best practices for RNA-seq data analysis. Genome Biology, 17 (13). doi:10.1186/s13059-016-0881-8
Fang, Z., Martin, J., & Wang, Z. (2012) Statistical methods for identifying differentially expressed genes in RNA-Seq experiments. Cell & Bioscience, 2 (26).

Bibliography: Other packages
Li, B., Dewey, C. (2011) RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinformatics, 12 (323). doi:10.1186/1471-2105-12-323
Liao, Y., Smyth, G., Shi, W. (2014) featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics, 30 (7), 923–930. doi:10.1093/bioinformatics/btt656
Dobin, A., Davis, C., et al. (2013) STAR: ultrafast universal RNA-seq aligner. Bioinformatics, 29 (1). doi:10.1093/bioinformatics/bts635

Bibliography: Other
Wong, R., Lamm, M., & Godwin, J. (2015) Characterizing the neurotranscriptomic states in alternative stress coping styles. BMC Genomics, 16 (425). doi:10.1186/s12864-015-1626-x

Questions?