RNA-seq workshop COUNTING & HTSEQ Erin Osborne Nishimura.


Today’s simple analysis pipeline
.fastq file → trimmomatic/bbduk.sh → _trim.fastq file → TOPHAT2 → .bam/.sam file
Counting: .bam/.sam file → HTseq → counts.txt file → DESeq2/R → Differentially Abundant genes
Visualization: .bam/.sam file → bedtools genomecov → .bg file → bedGraphToBigWig → .bw file → IGV/UCSC → Pretty browser shots

Quantification with htseq

The problem

Counting reads
What will we count? Genes? Exons? Isoforms?
What are some of the issues we need to account for when counting reads? Paralogs? Overlap? Isoforms? Errors?
How to count?
–Raw counts
–RPKM: Reads per kilobase per million mapped reads
–FPKM: Fragments per kilobase per million mapped reads
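
As a minimal sketch of the RPKM normalization named on this slide: the raw count is divided by the gene length in kilobases and by the number of mapped reads in millions. The function name and the example numbers are hypothetical, chosen only for illustration; FPKM follows the same arithmetic with fragments (read pairs) in place of reads.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene per million mapped reads."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and mapped read total must be positive")
    kilobases = gene_length_bp / 1_000             # gene length in kb
    millions = total_mapped_reads / 1_000_000      # library size in millions
    return read_count / kilobases / millions


# Hypothetical gene: 500 reads, 2,000 bp long, library of 10 million mapped reads.
example_rpkm = rpkm(500, 2_000, 10_000_000)
print(example_rpkm)
```

Raw counts, not RPKM or FPKM values, are what count-based tools such as DESeq2 take as input.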

htseq-count
Manual: http://www-huber.embl.de/users/anders/HTSeq/doc/count.html
Paper: http://bioinformatics.oxfordjournals.org/content/31/2/166

The problem

The three htseq-count modes
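
The three modes are union, intersection-strict, and intersection-nonempty. The sketch below illustrates how they differ under a simplifying assumption: each read is represented as a list of feature sets, one per stretch of the alignment. The function `assign_read` and the gene names are hypothetical; htseq-count itself works on genomic intervals from a GTF file, and this is not its implementation.

```python
def assign_read(feature_sets, mode="union"):
    """Decide which feature a read counts toward.

    feature_sets: list of sets; each set holds the features overlapping
    one stretch of the read's alignment (an empty set means no feature).
    """
    if mode == "union":
        combined = set().union(*feature_sets)
    elif mode == "intersection-strict":
        combined = set.intersection(*feature_sets) if feature_sets else set()
    elif mode == "intersection-nonempty":
        nonempty = [s for s in feature_sets if s]
        combined = set.intersection(*nonempty) if nonempty else set()
    else:
        raise ValueError(f"unknown mode: {mode}")

    if not combined:
        return "__no_feature"
    if len(combined) == 1:
        return next(iter(combined))
    return "__ambiguous"


# Hypothetical read that hangs off the end of gene_A into unannotated sequence.
overhang = [{"gene_A"}, set()]
# Hypothetical read inside gene_A that partly overlaps gene_B as well.
overlap = [{"gene_A"}, {"gene_A", "gene_B"}]

for mode in ("union", "intersection-strict", "intersection-nonempty"):
    print(mode, assign_read(overhang, mode), assign_read(overlap, mode))
```

In this sketch the overhanging read is counted for gene_A in union and intersection-nonempty modes but not in intersection-strict, while the overlapping read is ambiguous only in union mode.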

Switch to hands-on tutorial https://github.com/erinosb/HTSF_workshop/blob/master/02_RNAseq_count.md

Assessing differential abundance

Assessing pairwise differential abundance, relatively simple Anders and Huber, 2010

Identifying genes with shared patterns across multiple samples, complex

For today… Anders and Huber, 2010

Many publications report performance comparisons of the different packages
Seyednasrollah et al., 2013 –http://bib.oxfordjournals.org/content/16/1/59.full.pdf+html
Soneson et al., 2013 –http://www.biomedcentral.com/1471-2105/14/91
Rapaport et al., 2013 –http://www.genomebiology.com/2013/14/9/r95

Why is this hard? Why is this different from other types of data?
Your question
The data
–Discreteness
–Small numbers of replicates
–Large dynamic range
–Outliers
–Data is overdispersed: variance does not scale linearly with the mean, which breaks the assumptions of some inference tests
Anders and Huber, 2010
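
To make the overdispersion point concrete, here is a small sketch comparing the variance a Poisson model expects (equal to the mean) with the variance under a negative binomial model in the parameterization used by DESeq2, variance = mean + dispersion × mean². The dispersion value 0.1 and the example means are assumptions picked for illustration, not estimates from any dataset.

```python
def poisson_variance(mean):
    """Under a Poisson model the variance equals the mean."""
    return mean


def negative_binomial_variance(mean, dispersion):
    """Negative binomial variance: mean + dispersion * mean**2."""
    return mean + dispersion * mean ** 2


# Hypothetical dispersion; real values are estimated per gene from replicates.
dispersion = 0.1
for mean in (10, 100, 1000):
    print(mean,
          poisson_variance(mean),
          negative_binomial_variance(mean, dispersion))
```

The gap between the two models grows with expression level, so a test that assumes Poisson noise would be overconfident for highly expressed genes.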

Why DESeq?
Original paper http://www.genomebiology.com/content/11/10/R106
DESeq2 paper http://www.genomebiology.com/2014/15/12/550
Bioconductor http://bioconductor.org/packages/release/bioc/html/DESeq2.html
Vignette https://www.bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.pdf

A final word about the fate of your data
You will need to submit your raw and processed files to a repository PRIOR to submitting your paper for publication.
Keep track of what you did!
–Module versions
–Conversion & transformation steps
–Settings/Options
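
One possible way to keep track of what you did is to write a small record for every processing step. The sketch below is only an illustration: the function `make_record`, its field names, and the file names are hypothetical, and the version string is a placeholder to be filled in from the tool's own output.

```python
import json
import platform


def make_record(step, tool, version, command, options):
    """Bundle the details of one processing step into a dictionary."""
    return {
        "step": step,
        "tool": tool,
        "version": version,      # module version as reported by the tool
        "command": command,      # the exact command line that was run
        "options": options,      # settings that differ from the defaults
        "python": platform.python_version(),
    }


# Hypothetical counting step; sample.bam and genes.gtf are placeholder names.
record = make_record(
    step="count reads per gene",
    tool="htseq-count",
    version="FILL_IN",
    command="htseq-count --format=bam --mode=union --stranded=no sample.bam genes.gtf",
    options={"mode": "union", "stranded": "no"},
)
print(json.dumps(record, indent=2))
```

Appending one such record per step to a log file gives a trail of versions, conversions, and settings to draw on at submission time.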

Switch to hands-on tutorial https://github.com/erinosb/HTSF_workshop/blob/master/02_RNAseq_count.md

Key Quality Control Metrics
Gene coverage –CEAS
Over-amplification –FASTQC
Complexity –TOPHAT output
Reproducibility
