From: TopHat: discovering splice junctions with RNA-Seq


Bioinformatics. 2009;25(9):1105-1111. doi:10.1093/bioinformatics/btp120 | © 2009 The Author(s). This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Fig. 1. The TopHat pipeline. RNA-Seq reads are mapped against the whole reference genome, and those reads that do not map are set aside. An initial consensus of mapped regions is computed by Maq. Sequences flanking potential donor/acceptor splice sites within neighboring regions are joined to form potential splice junctions. The initially unmapped (IUM) reads are indexed and aligned to these splice junction sequences.
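The junction-building step in Fig. 1 can be illustrated with a small Python sketch. This is not TopHat's code: the function names, the `flank`, `min_intron` and `max_intron` parameters, and the restriction to canonical GT/AG sites on a single toy sequence are all assumptions made for illustration, and the sketch pairs every donor with every downstream acceptor rather than limiting the search to neighboring regions.

```python
from itertools import product


def find_sites(genome):
    """Return (donors, acceptors) for canonical GT...AG introns.

    donors:    0-based index of the G in each "GT" (first intronic base).
    acceptors: 0-based index one past each "AG" (first base of the next exon).
    """
    donors = [i for i in range(len(genome) - 1) if genome[i:i + 2] == "GT"]
    acceptors = [i + 2 for i in range(len(genome) - 1) if genome[i:i + 2] == "AG"]
    return donors, acceptors


def candidate_junctions(genome, flank=4, min_intron=4, max_intron=20000):
    """Join the sequence flanking each donor/acceptor pair.

    All parameter names and defaults are hypothetical, chosen for the toy
    example. Returns (donor, acceptor, junction_sequence) tuples.
    """
    donors, acceptors = find_sites(genome)
    junctions = []
    for d, a in product(donors, acceptors):
        intron_len = a - d
        if not (min_intron <= intron_len <= max_intron):
            continue  # acceptor is upstream of the donor, or intron out of range
        if d < flank or a + flank > len(genome):
            continue  # not enough exonic sequence to build the flanks
        upstream = genome[d - flank:d]      # exon bases just before the donor
        downstream = genome[a:a + flank]    # exon bases just after the acceptor
        junctions.append((d, a, upstream + downstream))
    return junctions


# Toy sequence: exon "AAAACC", intron "GTTTTTAG", exon "GGTTCC".
toy_genome = "AAAACC" + "GTTTTTAG" + "GGTTCC"
print(candidate_junctions(toy_genome))  # [(6, 14, 'AACCGGTT')]
```

Reads that failed to map to the genome would then be searched against sequences such as `AACCGGTT`, which exist only in the spliced transcript.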

Fig. 2. An intron entirely overlapped by the 5′-UTR of another transcript. Both isoforms are present in the brain tissue RNA sample. The top track is the normalized uniquely mappable read coverage reported by ERANGE for this region (Mortazavi et al., 2008). The lack of a large coverage gap causes TopHat to report a single island containing both exons. TopHat looks for introns within single islands in order to detect this junction.

Fig. 3. The seed and extend alignment used to match reads to possible splice sites. For each possible splice site, a seed is formed by combining a small amount of sequence upstream of the donor and downstream of the acceptor. This seed, shown in dark gray, is used to query the index of reads that were not initially mapped by Bowtie. Any read containing the seed is checked for a complete alignment to the exons on either side of the possible splice. In the light gray portion of the alignment, TopHat allows a user-specified number of mismatches. Because reads typically contain low-quality base calls on their 3′ ends, TopHat only examines the first 28 bp on the 5′ end of each read by default.
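A minimal Python sketch of the seed-and-extend idea in Fig. 3 follows. It is a simplification under stated assumptions: the function name and the `seed_half`, `max_mismatches` and `examine_len` parameters are hypothetical, the read index is replaced by a plain substring search, and the exon sequences are passed in directly.

```python
def spliced_hits(read, left_exon, right_exon,
                 seed_half=2, max_mismatches=1, examine_len=28):
    """Check one read against one candidate splice junction.

    left_exon ends at the donor site and right_exon starts at the acceptor.
    Returns a list of (split, mismatches), where split is the read offset of
    the first base that comes from right_exon. Names and defaults are
    illustrative assumptions, not TopHat's interface.
    """
    read = read[:examine_len]  # look only at the 5' end of the read
    # Seed: a few bases upstream of the donor plus a few downstream of the acceptor.
    seed = left_exon[-seed_half:] + right_exon[:seed_half]

    hits = []
    start = read.find(seed)
    while start != -1:
        split = start + seed_half
        left_part, right_part = read[:split], read[split:]
        # Extend: the whole read must fit on the two flanking exons.
        if len(left_part) <= len(left_exon) and len(right_part) <= len(right_exon):
            reference = (left_exon[len(left_exon) - len(left_part):]
                         + right_exon[:len(right_part)])
            # The seed matched exactly, so any mismatch lies outside it.
            mismatches = sum(1 for r, g in zip(read, reference) if r != g)
            if mismatches <= max_mismatches:
                hits.append((split, mismatches))
        start = read.find(seed, start + 1)
    return hits


left = "ACGTACGG"    # exon sequence ending at the donor
right = "TTCAGGCA"   # exon sequence starting at the acceptor
read = "TACGGTTCTG"  # spans the junction with one mismatch in the extension

print(spliced_hits(read, left, right))                    # [(5, 1)]
print(spliced_hits(read, left, right, max_mismatches=0))  # []
```

Requiring an exact seed match keeps the lookup cheap, while the mismatch budget applies only to the extension on either side of the seed.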

Fig. 4. TopHat sensitivity as RPKM varies. For genes transcribed above 15.0 RPKM, TopHat detects more than 80% of the junctions reported by ERANGE in the M. musculus brain tissue study. TopHat detects more than 72% of all junctions observed by ERANGE, including those in genes expressed at only a single transcript per cell. A de novo assembly of the RNA-Seq reads, followed by spliced alignment of the assembled transcripts, produces markedly poorer sensitivity, detecting around 40% of junctions in genes transcribed above 25.0 RPKM, but comparatively few junctions in more lowly transcribed genes.
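For readers unfamiliar with the units in Fig. 4, the sketch below computes RPKM (reads per kilobase of exon model per million mapped reads) and a sensitivity figure restricted to genes above an RPKM threshold. The function names, the data layout and the toy numbers are assumptions for illustration; they are not taken from the paper or from ERANGE.

```python
def rpkm(read_count, exon_length_bp, total_mapped_reads):
    """Reads per kilobase of exon model per million mapped reads."""
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


def sensitivity(reference, detected, gene_rpkm, min_rpkm):
    """Fraction of reference junctions, in genes above min_rpkm, that were detected.

    reference: dict mapping junction -> gene name (hypothetical layout)
    detected:  set of junctions reported by the tool under evaluation
    gene_rpkm: dict mapping gene name -> RPKM
    Returns None when no reference junction passes the threshold.
    """
    eligible = {j for j, gene in reference.items() if gene_rpkm[gene] > min_rpkm}
    if not eligible:
        return None
    return len(eligible & detected) / len(eligible)


# Toy data: 500 reads on a 2 kb exon model out of 10 million mapped reads.
print(rpkm(500, 2000, 10_000_000))  # 25.0

reference = {
    ("chr1", 100, 200): "g1",
    ("chr1", 300, 400): "g1",
    ("chr2", 50, 90): "g2",
}
gene_rpkm = {"g1": 25.0, "g2": 3.0}
detected = {("chr1", 100, 200), ("chr2", 50, 90)}

print(sensitivity(reference, detected, gene_rpkm, min_rpkm=15.0))  # 0.5
```

Sweeping `min_rpkm` over a range of thresholds and plotting the result would give a curve of the kind described in the caption.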

Fig. 5. The BLAT E-value distribution of known, previously unreported, and randomly generated splice junction sequences when searched against GenBank mouse ESTs. As expected, known junctions have high-quality BLAT hits to the EST database. Randomly-generated junction sequences do not. High-quality BLAT hits for more than 11% of the junctions identified by TopHat suggest that the UCSC gene models for mouse are incomplete. These junctions are almost certainly genuine, and because the mouse EST database is not complete, 11% is only a lower bound on the specificity of TopHat.

Fig. 6. TopHat detects junctions in genes transcribed at very low levels. The gene Pnlip was transcribed at only 7.88 RPKM in the brain tissue according to ERANGE, and yet TopHat reports the complete known gene model.

Fig. 7. A previously unreported splice junction detected by TopHat is shown as the topmost horizontal line. This junction skips two exons in the ADP-ribosylation gene Arfgef1. As explained in Section 2, islands of read coverage in the Bowtie mapping are extended by 45 bp on either side.
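The island extension mentioned in the Fig. 7 caption can be sketched as simple interval arithmetic. Only the 45 bp figure comes from the caption; the function name, the half-open coordinate convention, the clamping to chromosome bounds and the merging of islands that overlap after extension are assumptions of this sketch, not a description of TopHat's implementation.

```python
def extend_and_merge(islands, extend=45, chrom_len=None):
    """Extend each coverage island on both sides, then merge overlaps.

    islands: iterable of (start, end) half-open intervals on one chromosome.
    extend:  number of bases added on either side (45 bp per the caption).
    chrom_len: optional chromosome length used to clamp the right edge.
    Merging overlapping islands is an assumption made for this sketch.
    """
    extended = []
    for start, end in islands:
        new_start = max(0, start - extend)
        new_end = end + extend
        if chrom_len is not None:
            new_end = min(chrom_len, new_end)
        extended.append((new_start, new_end))

    merged = []
    for start, end in sorted(extended):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous island: grow it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


islands = [(100, 200), (260, 300), (1000, 1100)]
print(extend_and_merge(islands))  # [(55, 345), (955, 1145)]
```

Extending the islands gives the junction search some room to find donor and acceptor sites that lie just outside the region covered by mapped reads.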