Transcriptome Sequencing with Reference

Slides:



Advertisements
Similar presentations
RNA-seq library prep introduction
Advertisements

Functional Genomics with Next-Generation Sequencing
RNAseq.
© Wiley Publishing All Rights Reserved. Using Nucleotide Sequence Databases.
Walk-thru of CAGE exercise Also at /tag_analysis/ /tag_analysis/
Peter Tsai Bioinformatics Institute, University of Auckland
Transcriptome Assembly and Quantification from Ion Torrent RNA-Seq Data Alex Zelikovsky Department of Computer Science Georgia State University Joint work.
TOPHAT Next-Generation Sequencing Workshop RNA-Seq Mapping
Canadian Bioinformatics Workshops
Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520
Transcriptomics Jim Noonan GENE 760.
Displaying associations, improving alignments and gene sets at UCSC Jim Kent and the UCSC Genome Bioinformatics Group.
CS 374: Relating the Genetic Code to Gene Expression Sandeep Chinchali.
Why microarrays in a bioinformatics class? Design of chips Quantitation of signals Integration of the data Extraction of groups of genes with linked expression.
mRNA-Seq: methods and applications
Next generation sequencing Xusheng Wang 4/29/2010.
LECTURE 2 Splicing graphs / Annoteted transcript expression estimation.
Li and Dewey BMC Bioinformatics 2011, 12:323
Expression Analysis of RNA-seq Data
Todd J. Treangen, Steven L. Salzberg
What is comparative genomics? Analyzing & comparing genetic material from different species to study evolution, gene function, and inherited disease Understand.
Transcriptome analysis With a reference – Challenging due to size and complexity of datasets – Many tools available, driven by biomedical research – GATK.
Variables: – T(p) - set of candidate transcripts on which pe read p can be mapped within 1 std. dev. – y(t) -1 if a candidate transcript t is selected,
SAGExplore web server tutorial for Module II: Genome Mapping.
발표자 석사 2 년 김태형 Vol. 11, Issue 3, , March 2001 Comparative DNA Sequence Analysis of Mouse and Human Protocadherin Gene Clusters 인간과 마우스의 PCDH 유전자.
LOC_Os02g08480 Supplementary Figure S1. Exons shorter than a read length have few or no reads aligned. The gene at LOC_Os02g08040 contains exons shorter.
Next Generation Sequencing and its data analysis challenges Background Alignment and Assembly Applications Genome Epigenome Transcriptome.
Next Generation DNA Sequencing
TopHat Mi-kyoung Seo. Today’s paper..TopHat Cole Trapnell at the University of Washington's Department of Genome Sciences Steven Salzberg Center.
Adrian Caciula Department of Computer Science Georgia State University Joint work with Serghei Mangul (UCLA) Ion Mandoiu (UCONN) Alex Zelikovsky (GSU)
Vidyadhar Karmarkar Genomics and Bioinformatics 414 Life Sciences Building, Huck Institute of Life Sciences.
Verna Vu & Timothy Abreo
Exon Array Analysis Changing the Landscape of Gene Expresson Profiling Tzu L. Phang Ph. D. Department of Medicine Division of Pulmonary Sciences and Critical.
The iPlant Collaborative
RNA Sequencing I: De novo RNAseq
The generalized transcription of the genome Víctor Gámez Visairas Genomics Course 2014/15.
1 Transcript modeling Brent lab. 2 Overview Of Entertainment  Gene prediction Jeltje van Baren  Improving gene prediction with tiling arrays Aaron Tenney.
RNA-Seq Assembly 转录组拼接 唐海宝 基因组与生物技术研究中心 2013 年 11 月 23 日.
Proposed redefinition of “gene” requires it to have a biological role Gerstein MB, …, Snyder M Genome Res 17: example of complexities observed.
1 Global expression analysis Monday 10/1: Intro* 1 page Project Overview Due Intro to R lab Wednesday 10/3: Stats & FDR - * read the paper! Monday 10/8:
RNA-Seq Primer Understanding the RNA-Seq evidence tracks on the GEP UCSC Genome Browser Wilson Leung08/2014.
Introduction to RNAseq
Do not reproduce without permission 1 Gerstein.info/talks (c) (c) Mark Gerstein, 2002, Yale, bioinfo.mbb.yale.edu Gerstein Lab Aims in ModENCODE.
RNA-seq: Quantifying the Transcriptome
Manuel Holtgrewe Algorithmic Bioinformatics, Department of Mathematics and Computer Science PMSB Project: RNA-Seq Read Simulation.
-1- Module 3: RNA-Seq Module 3 BAMView Introduction Recently, the use of new sequencing technologies (pyrosequencing, Illumina-Solexa) have produced large.
IB Saccharomyces cerevisiae - Jan Major model system for molecular genetics. For example, one can clone the gene encoding a protein if you.
An Integer Programming Approach to Novel Transcript Reconstruction from Paired-End RNA-Seq Reads Serghei Mangul Department of Computer Science Georgia.
CyVerse Workshop Transcriptome Assembly. Overview of work RNA-Seq without a reference genome Generate Sequence QC and Processing Transcriptome Assembly.
RNA Sequencing and transcriptome reconstruction Manfred G. Grabherr.
Case study: Saccharomyces cerevisiae grown under two different conditions RNAseq data plataform: Illumina Goal: Generate a platform where the user will.
A high-resolution map of human evolutionary constraints using 29 mammals Kerstin Lindblad-Toh et al Presentation by Robert Lewis and Kaylee Wells.
Canadian Bioinformatics Workshops
Reliable Identification of Genomic Variants from RNA-seq Data Robert Piskol, Gokul Ramaswami, Jin Billy Li PRESENTED BY GAYATHRI RAJAN VINEELA GANGALAPUDI.
Canadian Bioinformatics Workshops
The Transcriptional Landscape of the Mammalian Genome
Volume 8, Issue 5, Pages (May 2017)
S1 Supporting information Bioinformatic workflow and quality of the metrics Number of slides: 10.
From: TopHat: discovering splice junctions with RNA-Seq
Volume 38, Issue 4, Pages (May 2010)
Volume 8, Issue 5, Pages (May 2017)
The transcript profiles in the three human cell lines based on RNA sequencing (RNA‐seq). The transcript profiles in the three human cell lines based on.
Figure 1 SpliceSeq “splice graphs” for the 7 qRT-PCR tested genes
Genome-wide binding sites of OsMADS1 and the distribution of binding sites in different regions of annotated genes. Genome-wide binding sites of OsMADS1.
Introduction to Alternative Splicing and my research report
Integrative omic approaches for the study of host–pathogen interactions Integrative omic approaches for the study of host–pathogen interactions (A) Proteomic.
Universal Alternative Splicing of Noncoding Exons
Comparison of species and function profiles with ultradeep sequencing data. Comparison of species and function profiles with ultradeep sequencing data.
Manfred Schmid, Agnieszka Tudek, Torben Heick Jensen  Cell Reports 
Presentation transcript:

Transcriptome Sequencing with Reference To be continue

Transcriptome Sequencing with Reference

Overview of RNA-Seq

Bioinformatics Analysis Pipeline for Transcritpome Sequencing with Reference

Gene structure refinement UTRs by RNA‐Seq in yeast (Nagalakshmi, U. et al.,2008)

Gene structure refinement The gene structure was optimized according to the distribution of the reads, information of paired-end and the annotation of reference gene. We can get the distribution of reads in the genome by aligning the continuous and overlap reads form a Transcription Active Region (TAR). According to paired-end data, we can connect the different TARs to form a potential gene model. We can compare the gene model with the existing gene annotated to extend the gene 5'and 3' end

Poly(A) tags from RNA-Seq A region containing two overlapping transcripts (ACT1, from the actin gene, and YFL040W, an uncharacterized ORF) from the Saccharomyces cerevisiae genome is shown. Arrows point to transcription direction. The poly(A) tags from RNA-Seq experiments are shown below these transcripts, with arrows indicating transcription direction. The precise location of each locus identified by poly(A) tags reveals the heterogeneity in poly(A) sites, for example, ACT1 has two big clusters, both with a few bases of local heterogeneity. The transcription direction revealed by poly(A) tags also helps to resolve 3'-end overlapping transcribed regions Nature Reviews Genetics 10, 57-63

Alternative Splicing (AS)

92%~94% of human multi-exon gene undergo Alternative Splicing (AS)

Exon junction reads

Output of SOAPals

Splice Junction Database

Novel Transcripts Novel transcripts can be found by high throughput sequencing since present databases may be incomplete. Gene models found in intergenic regions (200 bp away from upstream or downstream genes) were thought to be candidate of novel transcripts

How deep should we go? 80% of yeast genes (genome size: ~12MB) were detected at 4 million uniquely mapped RNA-Seq reads, and coverage reaches a plateau afterwards despite the increasing sequencing depth. Expressed genes are defined as having at least four independent reads from a 50-bp window at the 3' end. The number of unique start sites detected starts to reach a plateau when the depth of sequencing reaches 80 million in two mouse transcriptomes. ES, embryonic stem cells; EB, embryonic body. Nature Reviews Genetics 10, 57-63

How deep should we go? De novo assembled rice transcriptome 1.3 Gb RNA‐Seq data (genome size: ~400MB) 85% of assembled unigenes were covered by genemodels

RNA-Seq and microarray compared Expression levels are shown, as measured by RNA-Seq and tiling arrays, for Saccharomyces cerevisiae cells grown in nutrient-rich media. The two methods agree fairly well for genes with medium levels of expression (middle), but correlation is very low for genes with either low or high expression levels. Nature Reviews Genetics 10, 57-63

Nature Reviews Genetics 10, 57-63

Digital Gene Expression Profiling DGE

Tag length vs Complexity

Exercise AGAIN……………….