RNA-Seq and RNA Structure Prediction

Presentation transcript:

RNA-Seq and RNA Structure Prediction Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520

Outline. RNA-seq: experiments; analysis (read mapping, expression index, isoform inference, differential expression). RNA structure prediction: covariance model, base-pair maximization, free energy method.

RNA-seq Mortazavi et al, Nat Meth 2008

RNA-Frag Has Less 3’ Bias (Wang et al. 2009)

RNA-Seq: Alternative to Microarrays. General expression profiling; novel genes; alternative splicing; gene fusion detection. Can be used on any sequenced genome. Better dynamic range. Cleaner and more informative data. Data analysis challenges.

Mapping. Bowtie or Maq mapping identifies transcribed exons, known or novel. Longer (e.g. 100 bp) paired-end libraries are better.

Transcript Abundances. More reads map to longer genes, and more reads map when sequencing is deeper. RPKM (reads per kilobase of transcript per million mapped reads) normalizes for both: 1 RPKM ~ 0.3-1 transcript per cell. Technical noise is low (Poisson distribution), but biological noise is high (overdispersion, modeled with a negative binomial).
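The RPKM definition can be written out as a short calculation. This is a minimal sketch in Python; the function name and the example numbers are made up for illustration and do not come from the slides.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total read count must be positive")
    # reads / (length in kb) / (total reads in millions)
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


# Hypothetical example: the same read count on a transcript twice as long
# gives half the RPKM, which is the length normalization at work.
short_gene = rpkm(500, 2_000, 20_000_000)
long_gene = rpkm(500, 4_000, 20_000_000)
print(short_gene, long_gene)
```

Dividing by both length and sequencing depth is what makes values comparable across genes and across samples of different depth.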

Different Alternative Splicing

Isoform Inference. Given a known set of isoforms, estimate the isoform abundances x that maximize the likelihood of the observed read counts n.
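The transcript does not give the exact likelihood, so the following is only a sketch of one common way to pose the problem: treat reads as draws from a mixture of isoforms and fit the mixing fractions by expectation-maximization. The function name, the input layout (one row per read, one likelihood per isoform), and the toy data are all assumptions for illustration, not the method from the slides.

```python
def em_isoform_fractions(read_likelihoods, n_iter=200):
    """Estimate the fraction of reads from each isoform by EM.

    read_likelihoods[r][k] is an assumed P(read r | isoform k):
    0 if the read is incompatible with the isoform, positive otherwise.
    """
    n_iso = len(read_likelihoods[0])
    theta = [1.0 / n_iso] * n_iso  # start from a uniform guess
    for _ in range(n_iter):
        expected = [0.0] * n_iso
        # E-step: split each read among isoforms by posterior weight.
        for lik in read_likelihoods:
            weights = [t * l for t, l in zip(theta, lik)]
            total = sum(weights)
            if total == 0:
                continue  # read compatible with no isoform; skip it
            for k, w in enumerate(weights):
                expected[k] += w / total
        # M-step: renormalize expected read counts into fractions.
        assigned = sum(expected)
        if assigned == 0:
            break
        theta = [e / assigned for e in expected]
    return theta


# Toy data: 6 reads unique to isoform A, 2 unique to B, 4 shared by both.
reads = [[1, 0]] * 6 + [[0, 1]] * 2 + [[1, 1]] * 4
print(em_isoform_fractions(reads))
```

In this toy case the shared reads end up split in the same 3:1 ratio as the unique reads, so the estimate settles near 0.75 for isoform A.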

Known Isoform Abundance Inference

De novo isoform inference

Isoform Inference. With a known isoform set, gene-level expression inference is often good even when isoform abundances have large uncertainty (e.g. because the known set is incomplete). De novo isoform inference is a non-identifiable problem with current RNA-seq protocols and (short) read lengths (e.g. when exon and isoform numbers are large).

Gene Fusion. Downregulation of tumor suppressors or upregulation of oncogenes (Maher et al., Nature 2009).

A Few Algorithms. Expression index and isoform inference: Cufflinks from Steven Salzberg, Rseq from Wing Wong, Scripture from Aviv Regev. Differential expression: Cufflinks, DESeq from Wolfgang Huber, edgeR from Gordon Smyth. Replicates are still preferred! Systematic evaluation is still needed.

Why Do We Care? RNA (tRNA, rRNA) structure determines function. Many non-coding RNA genes have special structures, which lead to special functions (ncRNA genes are covered later). Mostly RNA secondary structure: G-C and A-U pairs, plus G-U pairs.

Simple RNA Structures

More Complex Interactions: kissing hairpins, pseudoknots, hairpin-bulge contacts.

RNA Structure Representations

Covariance Models. Get related RNA sequences and obtain a multiple sequence alignment, e.g. orthologous RNA from many species, or a family of RNAs believed to have similar structure and function. The sequences must be similar enough that they can be initially aligned. Look at every pair of columns and check for covarying substitutions. The sequences should also be dissimilar enough for covarying substitutions to be detected.
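One way to score "covarying substitutions" between two alignment columns is mutual information, which the summary slide names for the covariance model. Below is a minimal sketch; the function name and the four-sequence toy alignment are hypothetical, and gaps are treated as ordinary symbols for simplicity.

```python
from collections import Counter
from math import log2


def mutual_information(alignment, col_i, col_j):
    """Mutual information (in bits) between two alignment columns."""
    pairs = [(seq[col_i], seq[col_j]) for seq in alignment]
    n = len(pairs)
    freq_i = Counter(a for a, _ in pairs)
    freq_j = Counter(b for _, b in pairs)
    freq_ij = Counter(pairs)
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        p_a = freq_i[a] / n
        p_b = freq_j[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Toy alignment: columns 0 and 4 change together and stay complementary,
# while columns 1-3 are fully conserved.
toy_alignment = ["GAAAC", "AAAAU", "CAAAG", "UAAAA"]
print(mutual_information(toy_alignment, 0, 4))  # covarying pair: high MI
print(mutual_information(toy_alignment, 1, 2))  # conserved columns: zero MI
```

The conserved columns score zero, which illustrates why the sequences need to be dissimilar enough: without substitutions there is no covariation signal to detect.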

Base-Pair Maximization. Find the structure with the maximum number of base pairs. An efficient dynamic programming solution was introduced by Nussinov (1970s). Compare a sequence against itself in a dynamic programming matrix. Since the structure folds upon itself, it is only necessary to calculate half the matrix. Four rules score the structure at a particular point.

Nussinov Algorithm. Initialization: the scores along the main diagonal and the diagonal just below it are set to zero.

Nussinov Algorithm. Fill matrix:

$$M[i][j] = \max \begin{cases} M[i+1][j-1] + S(x_i, x_j) \\ M[i+1][j] \\ M[i][j-1] \\ \max_{i<k<j} \left( M[i][k] + M[k+1][j] \right) \end{cases}$$

Nussinov Algorithm. Fill diagonal by diagonal (assume no bulge penalty, similar to the SW gap penalty). [Figure: dynamic programming matrix with axes i and j]

Nussinov Algorithm. Trace back from the upper right corner to get the structure.
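The initialization, fill, and traceback steps of the Nussinov algorithm can be put together in a short sketch. Two assumptions here are not stated on the slides: S(x_i, x_j) is taken to be 1 for G-C, A-U, and G-U pairs and the pairing case is disallowed otherwise, and no minimum hairpin loop length is enforced. Function names are illustrative.

```python
# Assumed pairing rule: Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}


def can_pair(a, b):
    return (a, b) in PAIRS


def nussinov_fill(seq):
    """Fill the upper triangle of M, diagonal by diagonal."""
    n = len(seq)
    # All zeros covers the main diagonal and the diagonal just below it.
    M = [[0] * n for _ in range(n)]
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            best = max(M[i + 1][j], M[i][j - 1])  # i or j left unpaired
            if can_pair(seq[i], seq[j]):
                best = max(best, M[i + 1][j - 1] + 1)  # i pairs with j
            for k in range(i + 1, j):  # bifurcation into two substructures
                best = max(best, M[i][k] + M[k + 1][j])
            M[i][j] = best
    return M


def nussinov_traceback(M, seq):
    """Recover one optimal structure in dot-bracket notation."""
    structure = ["."] * len(seq)
    stack = [(0, len(seq) - 1)]  # start at the upper right corner
    while stack:
        i, j = stack.pop()
        if i >= j:
            continue
        if M[i][j] == M[i + 1][j]:
            stack.append((i + 1, j))
        elif M[i][j] == M[i][j - 1]:
            stack.append((i, j - 1))
        elif can_pair(seq[i], seq[j]) and M[i][j] == M[i + 1][j - 1] + 1:
            structure[i], structure[j] = "(", ")"
            stack.append((i + 1, j - 1))
        else:
            for k in range(i + 1, j):
                if M[i][j] == M[i][k] + M[k + 1][j]:
                    stack.append((i, k))
                    stack.append((k + 1, j))
                    break
    return "".join(structure)


seq = "GGGAAACCC"
M = nussinov_fill(seq)
print(M[0][len(seq) - 1], nussinov_traceback(M, seq))
```

Only the upper triangle is ever written, matching the point that half the matrix suffices. The traceback returns one optimal structure; ties between equally good structures are broken by the order of the checks.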

Free Energy Method. Mfold: Mathews, JMB 1999. Predict the correct secondary structure by minimizing the free energy (ΔG). Energy contributions: base pairing and base stacking.

Energy Factors. Consecutive base pairing: good. Internal bulge: bad. Terminal base pairing: not stable. Hairpin, interior, and bulge loops contribute destabilizing energies.

Energy Minimization. Assumptions: the most likely structure is the most energetically stable one; the energy associated with any position is influenced only by local sequence and structure. Does not consider pseudoknot formation. Solved by dynamic programming.

Energy Minimization

Vienna RNA Package Vienna RNA web

Summary. RNA-seq: read mapping, expression index, isoform inference, differential expression (Cufflinks). Different techniques, analyses, and outputs for different tasks. Still awaiting the RNA-seq equivalent of RMA (the standard normalization method for microarrays). Third-generation sequencing might read whole transcripts. RNA structure prediction methods: covariance model (mutual information); base-pair maximization (Nussinov); free energy method (Mfold, Vienna RNA). Caution: the best is the enemy of the good.