Presented by: Pham Kien Cuong NUS Graduate School for Integrative Sciences and Engineering.

Presentation transcript:

Outline
- Problem background
- FusionSeq: Method
- FusionSeq: Assessment
- Conclusion

Problem background
- Fusion genes are key molecular events in cancers
- RNA-Seq reveals expressed fusion genes
- State of the art:
  – Lack of flexibility in choosing mapping tools
  – Little analysis of artefacts

FusionSeq: Method
1. Fusion transcript detection
2. Filtration cascade
3. Junction sequence identifier

FusionSeq: Method (filtration cascade)
- Misalignment filters (computational errors)
  – Large-scale sequence similarity
  – Small-scale sequence similarity
  – Repetitive regions
- Random pairing filter (experimental errors)
  – Abnormal insert size
- Misalignment + random pairing
- Scoring and ranking
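The cascade idea can be sketched in a few lines of Python: each filter is a predicate applied in turn, and the number of surviving candidates is recorded after every stage. This is only an illustration of the pattern, not FusionSeq's implementation. The candidate fields (`similar_sequence`, `in_repeat`, `insert_size_ok`) and the function `run_cascade` are hypothetical names introduced here, and the flags are assumed to have been computed beforehand.

```python
from typing import Callable, Dict, List, Tuple

# A candidate fusion is modelled as a plain dict of precomputed flags.
# All field names are hypothetical and exist only for this sketch.
Candidate = Dict[str, object]
Filter = Tuple[str, Callable[[Candidate], bool]]


def run_cascade(
    candidates: List[Candidate], filters: List[Filter]
) -> Tuple[List[Candidate], List[Tuple[str, int]]]:
    """Apply each filter in order and log how many candidates survive it."""
    survivors = list(candidates)
    log = []
    for name, keep in filters:
        survivors = [c for c in survivors if keep(c)]
        log.append((name, len(survivors)))
    return survivors, log


# Filters named after the slide; the predicates are placeholder assumptions.
FILTERS: List[Filter] = [
    ("sequence similarity", lambda c: not c["similar_sequence"]),
    ("repetitive regions", lambda c: not c["in_repeat"]),
    ("abnormal insert size", lambda c: bool(c["insert_size_ok"])),
]

if __name__ == "__main__":
    demo = [
        {"name": "A-B", "similar_sequence": False, "in_repeat": False, "insert_size_ok": True},
        {"name": "C-D", "similar_sequence": True, "in_repeat": False, "insert_size_ok": True},
        {"name": "E-F", "similar_sequence": False, "in_repeat": False, "insert_size_ok": False},
    ]
    kept, stage_log = run_cascade(demo, FILTERS)
    for stage, remaining in stage_log:
        print(f"{stage}: {remaining} candidates left")
```

Keeping the filters in an ordered list makes it easy to report how many candidates each stage removes, and to reorder or drop a stage when a filter proves too aggressive.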

Abnormal insert size filter

Scoring and ranking of candidates
- Supportive PE reads (SPER): number of inter-transcript PE reads (m_i) normalized by the total number of mapped PE reads (N_mapped)
- DASPER: the difference between the observed SPER and the analytically calculated expected SPER
- RESPER: the ratio of empirically computed SPERs
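A minimal Python sketch of these scores, under stated assumptions, may make the definitions concrete. SPER follows the slide directly (m_i divided by N_mapped). The optional `scale` factor, the way the expected SPER is supplied (as a precomputed number per candidate), and the reading of RESPER as a candidate's SPER divided by the mean SPER over all candidates are assumptions made for illustration. The slide does not spell out these details.

```python
from statistics import mean
from typing import Dict, List, Tuple


def sper(m_i: int, n_mapped: int, scale: float = 1.0) -> float:
    """Inter-transcript PE reads normalized by total mapped PE reads.

    `scale` (e.g. 1e6 for a per-million value) is an assumed convenience.
    """
    if n_mapped <= 0:
        raise ValueError("n_mapped must be positive")
    return scale * m_i / n_mapped


def dasper(observed_sper: float, expected_sper: float) -> float:
    """Difference between the observed and the expected SPER."""
    return observed_sper - expected_sper


def resper(candidate_sper: float, all_spers: List[float]) -> float:
    """Assumed reading: candidate SPER relative to the mean SPER."""
    return candidate_sper / mean(all_spers)


def rank_candidates(
    counts: Dict[str, int],
    n_mapped: int,
    expected: Dict[str, float],
    scale: float = 1.0,
) -> List[Tuple[str, float, float, float]]:
    """Return (name, SPER, DASPER, RESPER) rows sorted by DASPER, highest first."""
    spers = {name: sper(m, n_mapped, scale) for name, m in counts.items()}
    all_spers = list(spers.values())
    rows = [
        (name, s, dasper(s, expected.get(name, 0.0)), resper(s, all_spers))
        for name, s in spers.items()
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    # Toy numbers, invented purely to exercise the functions.
    toy_counts = {"A-B": 50, "C-D": 5, "E-F": 5}
    for row in rank_candidates(toy_counts, 1_000_000, {"A-B": 1.0}, scale=1e6):
        print(row)
```

Sorting by DASPER here is a design choice for the sketch; a real pipeline could equally rank by RESPER or by a combination of both.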

FusionSeq: Assessment
Applied to prostate cancer samples harbouring:
  – The common TMPRSS2-ERG fusion
  – Less common fusions (SLC45A3-ERG, NDRG1-ERG)
  – No evidence of known fusions
  – No fusions expected

FusionSeq: Assessment
- The most common fusion, TMPRSS2-ERG, is ranked at the top of the list
- The other known fusions between ERG and other 5' partners, namely SLC45A3 and NDRG1, are also included in the top candidates
- The remaining candidates appear to be read-through events, including ZNF649-ZNF577
- Novel fusion transcripts were identified and experimentally validated

FusionSeq: Assessment
- Filters remove 98% of candidates
- Real fusions stand out from artefacts as sequencing depth increases

FusionSeq: Assessment
- Exons included in fusion transcripts are expressed more highly than exons not included in them
- Junction sequences are identified and then validated by RT-PCR

FusionSeq: Assessment
Simulation: introduce inter-transcript reads into experimental data from a sample with no expected fusion
  – FusionSeq identified the introduced fusions

Conclusion
- FusionSeq demonstrates the ability to:
  – Filter and identify fusion transcripts
  – Work independently of the choice of mapping tool
- However, it needs a reference genome sequence and gene annotation
- Aggressive filters may discard some real fusions
