Ribosomal Profiling Data Handling and Analysis
MGL Users Group
James Iben
5/4/15
Goals of Ribosomal Profiling
Basic RNAseq experiments measure relative quantities of transcripts in the cell.
Not all transcripts are equally engaged with translating ribosomes.
Ribosomal profiling captures RNA bound by polysomes, giving a snapshot of active translation in the cell.
It tends to provide a better proxy for protein levels in the cell than RNAseq alone (Duncan & Mata, Nature Structural & Molecular Biology 21, 641–647 (2014)).
It can also identify where ribosomes are engaged at the codon level, if RNA footprints are well trimmed to just the ribosome-protected fragments: pause sites, frame of translation, etc.
Sequencing Prepared Samples
Following an involved library preparation, short DNA fragments are sequenced according to regular RNA-seq protocols with minor modifications:
  Since inserts are expected to be primarily ~28 nucleotides, only single-end sequencing is performed.
  Fragments are read to 50 bp, with the expectation that some adapter will be read.
We performed sequencing of 14 samples:
  4 sets of triplicate conditions/controls (4x, Mod5, Tit1, Tyr)
  Two test mouse library preparations
Trimming of Sequenced Reads
Reads are trimmed for adapter sequence ONLY.
No quality trimming is performed at this stage, as fragment size is critically important.
The read length distribution is examined in each sample, expecting a tight distribution around the size of the ribosome-protected fragment.
Some ‘Full length’ (50 nt) reads may also be obtained. These are found to map primarily to contaminants (rRNA, tRNA, other ncRNA, etc.).
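The adapter-only trimming and length tally described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool used for this analysis: it handles exact adapter matches only, the function names and the `min_overlap` parameter are hypothetical, and the adapter sequence in the usage note is a placeholder because the slides do not state which adapter was used.

```python
from collections import Counter


def trim_adapter(read, adapter, min_overlap=8):
    """Remove the adapter (and anything after it) from the 3' end of a read.

    Exact matches only; no quality trimming is performed, so the
    remaining length reflects the insert size.
    """
    # Case 1: the complete adapter is present somewhere in the read.
    idx = read.find(adapter)
    if idx != -1:
        return read[:idx]
    # Case 2: only the start of the adapter was read before the run ended.
    # Try the longest possible partial match first.
    for k in range(min(len(adapter), len(read)) - 1, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]
    # Case 3: no adapter detected, i.e. a "full length" read.
    return read


def length_distribution(reads, adapter):
    """Return {insert length: fraction of reads} after adapter trimming."""
    counts = Counter(len(trim_adapter(r, adapter)) for r in reads)
    total = sum(counts.values())
    return {length: n / total for length, n in sorted(counts.items())}
```

For example, `length_distribution(reads, "CTGTAGGCACCATCAAT")` (placeholder adapter) returns the fraction of reads at each insert length, which is the quantity one would inspect for a tight peak near the footprint size.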
Length Distribution Obtained Suggests More Aggressive RNase Needed in Yeast Samples
No fragments of 34 bp or longer are seen, other than the 50 bp full-length reads.
Lacking a strong 28/29 bp (expected yeast footprint) component, positioning of A/P sites cannot be reliably performed.
[Figure: fraction of reads vs. length of fragment; panel label: Mouse]
Mapping of Reads
Alignment is performed using Bowtie1 (short, non-permissive mapping) against the transcriptome and genome.
Mapping is found primarily on coding RNA in the direction of the transcript, spanning introns.
Ribosomal RNA contamination was ~40% in pombe and ~10% in mouse.
Measure of Translational Efficiency of mRNAs
A measure of how engaged ribosomes are with mRNA.
Assumption: more ribosomes = more active translation.
Translational Efficiency (TE) is defined as the ribosomal occupancy of the message normalized by the amount of message in the cell (as measured by RNAseq):
TE = (RPKM in RiboProfiling) / (RPKM in RNAseq)
Previously, RNAseq had been performed on these same conditions in triplicate.
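As a worked illustration of the TE definition, the sketch below computes RPKM (reads per kilobase of transcript per million reads in the experiment) and the resulting ratio for a single gene. The function names are hypothetical, and returning `None` for genes with no RNAseq signal is an assumption of this sketch rather than something stated in the slides.

```python
def rpkm(read_count, gene_length_bp, total_reads):
    """Reads per kilobase of transcript per million reads in the experiment."""
    return read_count * 1e9 / (gene_length_bp * total_reads)


def translational_efficiency(rp_rpkm, rna_rpkm):
    """TE = (RPKM in ribosome profiling) / (RPKM in RNAseq).

    Returns None when the gene has no RNAseq signal, since the ratio
    is undefined in that case (an assumption of this sketch).
    """
    if rna_rpkm == 0:
        return None
    return rp_rpkm / rna_rpkm
```

With made-up numbers, a 2,000 bp gene with 500 footprint reads out of 10 million and 1,000 RNAseq reads out of 20 million has an RPKM of 25 in both experiments, giving a TE of 1.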
Quantitation and Comparisons (Approach 1)
Read density on genes was measured as RPKM (Reads Per Kilobase of transcript per Million reads in the experiment).
HTSeq was used with a GTF of gene definitions from the latest pombe build (ASM294v2.22), the same build used for alignment.
Tables were prepared for both RNAseq and RP experiments.
TE was calculated over 9 comparisons (3 RNAseq replicates x 3 RP replicates).
TE was compared across experimental conditions as a difference of means.
Bonferroni correction for multiple testing was applied to establish significance.
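The replicate pairing and the multiple-testing correction in Approach 1 might look roughly like the following for one gene. This is a sketch under assumptions: the function names are hypothetical, pairings with zero RNAseq RPKM are simply skipped, and the statistical test that produces the p-values is not specified in the slides, so it is left out.

```python
from itertools import product
from statistics import mean


def pairwise_te(rp_rpkms, rna_rpkms):
    """TE for every RP x RNAseq replicate pairing (3 x 3 = 9 values).

    Pairings with zero RNAseq RPKM are skipped (assumption of this sketch).
    """
    return [rp / rna for rp, rna in product(rp_rpkms, rna_rpkms) if rna > 0]


def te_difference(te_condition, te_control):
    """Difference of mean TE between a condition and its control."""
    return mean(te_condition) - mean(te_control)


def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

A gene would then be called significant when its adjusted p-value falls below the chosen threshold. With thousands of genes tested, this correction is conservative.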
Cross Sample Comparison of TEs
Per-gene mean TE is generally well correlated between conditions.
Spearman correlation >0.8
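For reference, a Spearman correlation between two vectors of per-gene mean TE can be computed as the Pearson correlation of their ranks. The standard-library sketch below is illustrative only; the function names are hypothetical and it is not the code that produced the value reported here.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant input")
    return cov / (sx * sy)


def spearman(te_a, te_b):
    """Spearman rank correlation between per-gene TE values of two conditions."""
    return pearson(average_ranks(te_a), average_ranks(te_b))
```

Because it works on ranks, the measure is insensitive to the heavy right tail that TE ratios typically show.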
ANOTA (Approach 2)
ANalysis Of Translational Activity
R Bioconductor package (Larsson O, Sonenberg N and Nadon R (2011). anota: ANalysis Of Translational Activity (ANOTA). R package version 1.16.0.)
Attempts to control for non-translation-related changes (localization, etc.) that may cause false positives when comparing raw TE.
Uses regression analysis between the translationally active mRNA levels and the cytosolic mRNA levels.
Depends on several criteria for appropriate use:
  No outlier samples
  Consistency (polysome prep, etc.)
  Residuals close to normal (no major bias)
Additional Types of Analysis
With well-trimmed fragments, reading frame may be analyzed.
Bioconductor package riboSeqR (Hardcastle TJ (2014). riboSeqR: Analysis of sequencing data from ribosome profiling experiments. R package version 1.2.0.)
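The idea behind reading-frame analysis can be shown with a small Python sketch that tallies, for each read length, which frame the 5' end of each footprint falls in relative to the start codon. This is not the riboSeqR API (riboSeqR is an R package). The function names and the input format, tuples of 0-based 5' position on the transcript and read length, are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def frame_counts(alignments, cds_start):
    """Tally read 5' ends by frame (0, 1, 2) relative to the CDS start.

    alignments: iterable of (five_prime_pos, read_length) tuples in
    0-based transcript coordinates (hypothetical input format).
    Returns {read_length: Counter({frame: count})}.
    """
    counts = defaultdict(Counter)
    for pos, length in alignments:
        frame = (pos - cds_start) % 3
        counts[length][frame] += 1
    return counts


def dominant_frame(counts_for_length):
    """Return (frame, fraction of reads) for the most common frame."""
    total = sum(counts_for_length.values())
    frame, n = counts_for_length.most_common(1)[0]
    return frame, n / total
```

For well-trimmed footprints, one frame is expected to dominate at the main footprint length. A roughly even split across the three frames would suggest the fragments are not trimmed tightly enough for codon-level analysis.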
Other Considerations
Transcript-level view of pausing
Codon occupancy
  The phenotype of these samples in particular is expected to alter decoding of codons somewhat.
Enrichment analysis of differentially expressed genes