Metabolic pathway activity estimation from RNA-Seq data

Yvette Temate-Tiagueu, Qiong Cheng, Meril Mathew, Igor Mandric, Olga Glebova, Nicole Beth Lopanik, Ion Mandoiu and Alex Zelikovsky
Department of Computer Science and Department of Biology, Georgia State University; Computer Science and Engineering, University of Connecticut

Abstract
The application of RNA-Seq has enabled a range of differential analysis studies, including differential expression of pathways. Comparing metabolic pathways is a standard approach to studying the metabolic differences between species. In this study, we introduce a novel approach to characterize the pathway activity levels of two samples. We present XPathway, a set of pathway activity analysis tools based on KEGG-KAAS mapping of proteins to pathways. We applied the proposed methods to Bugula neritina RNA-Seq metagenomics data. Using the computational approaches implemented in XPathway, we identified several pathways with differential activity levels. Further validation of the initial results is conducted through qPCR.

Objectives
- Develop efficient algorithms for reliable estimation of pathway activity levels
- Identify pathways whose activity differs significantly between two conditions

Our Contribution
Using KEGG, a database resource for understanding high-level functions and utilities of the biological system from molecular-level information [Kanehisa M. and Goto S., 2000]:
(1) A novel graph-based approach to analyze pathway significance
(2) Representing a pathway as a set and inferring activity from the information extracted from those sets
(3) Validating the two approaches through differential expression analysis at the transcript and gene level, and through a qPCR experiment

Experimental studies: Bugula neritina
In the United States, three sibling species:
1. Deep-water (West coast of the United States)
2. Shallow-water (West and Southern East coasts)
3. Northern Atlantic (Northern East coast)

Illumina paired-end sequencing reads:
Sample 1: Bugula with symbiont
Sample 2: Bugula without symbiont
- 50 bp paired-end reads
- 200 bp mean fragment length
- Assembly into contigs by Trinity
- BLAST against the Swiss-Prot database

Methods
- Topology-based estimation of pathway significance
- EM-based estimation of pathway activity

[Figure: analysis workflow. Labels: RNA-Seq reads (2 samples); Trinity; contigs; BLAST; proteins; KEGG, SEED ortholog groups (K00161, K00162, K00163); graph-based pathway significance; pathway activity; Binary EM; IsoDE; contigs validation; differentially expressed pathways; experimental validation.]

Permutation models:
Model 1: permutation of labels (vertex labels swapping)
Model 2: permutation of edges (edges swapping)

In the induced graph:
- number of nodes, N
- number of edges, M
- number of green connected components
- number of nodes with 0 in- and out-degree
Density of the induced graph: M/(N-1)

Bootstrapping:
- Repeat 1000 times:
  1. Randomly switch edges
  2. Compute the density of the largest component
- Sort with respect to density
- Find the rank of the observed induced subgraph

Results
[Figure panels: Sample 1 and Sample 2; pathway activity levels with ratio.]

Validation
- Selected pathways for qPCR validation
- qPCR
For gene expression analyses:
- Select pathways with significantly different activity
- Select DE transcripts from these pathways
- Select the genes from these transcripts
- Primers are created to test genes per condition
- Preliminary results
- More primers ordered

Conclusions and Future Work
Our experimental studies on Bugula neritina RNA-Seq data (mutualistic symbiosis vs. none) show that, by analyzing metabolic pathways with our tool XPathway, we can effectively locate pathways whose activity levels differ significantly. This result has been validated through qPCR.

This project is supported in part by the Molecular Basis of Disease fellowship of GSU.
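The bootstrapping procedure on the poster (switch edges at random, compute the density M/(N-1) of the largest component, then rank the observed induced subgraph) can be sketched in Python as below. This is an illustration under stated assumptions, not the XPathway implementation: the function names, the use of weakly connected components, the degree-preserving form of the edge swap, the number of swaps per permutation, and the empirical p-value formula are all hypothetical choices that the poster does not specify. The "labels" branch is one possible reading of Model 1, where the active vertex set is redrawn at random with the same size.

```python
import random
from collections import defaultdict


def largest_component_density(edges, active):
    """Density M/(N-1) of the largest weakly connected component of the
    subgraph induced by `active` (assumption: weak connectivity)."""
    induced = [(u, v) for u, v in edges if u in active and v in active]
    neighbours = defaultdict(set)
    for u, v in induced:
        neighbours[u].add(v)
        neighbours[v].add(u)
    components, seen = [], set()
    for start in neighbours:
        if start in seen:
            continue
        # Collect one component with an iterative depth-first search.
        component, stack = {start}, [start]
        while stack:
            node = stack.pop()
            for nxt in neighbours[node]:
                if nxt not in component:
                    component.add(nxt)
                    stack.append(nxt)
        seen |= component
        components.append(component)
    if not components:
        return 0.0
    nodes = max(components, key=len)
    n = len(nodes)
    m = sum(1 for u, _ in induced if u in nodes)
    return m / (n - 1) if n > 1 else 0.0


def swap_edges(edges, rng, n_swaps):
    """Model 2 (assumed form): rewire (a,b),(c,d) -> (a,d),(c,b), which keeps
    every vertex's in- and out-degree. Expects distinct input edges."""
    edges = list(edges)
    if len(edges) < 2:
        return edges
    present = set(edges)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        new1, new2 = (a, d), (c, b)
        # Skip swaps that would create a self-loop or a duplicate edge.
        if a == d or c == b or new1 in present or new2 in present:
            continue
        present -= {edges[i], edges[j]}
        present |= {new1, new2}
        edges[i], edges[j] = new1, new2
    return edges


def permutation_p_value(edges, active, model="edges", n_perm=1000, seed=0):
    """Return the observed density and the fraction of permuted graphs whose
    density is at least as large (empirical p-value, an assumed statistic)."""
    rng = random.Random(seed)
    vertices = sorted({x for e in edges for x in e} | set(active))
    observed = largest_component_density(edges, active)
    at_least = 0
    for _ in range(n_perm):
        if model == "edges":
            shuffled = swap_edges(edges, rng, 10 * len(edges))
            null = largest_component_density(shuffled, active)
        else:
            # Model 1 (assumed form): random active set of the same size.
            random_active = set(rng.sample(vertices, len(active)))
            null = largest_component_density(edges, random_active)
        if null >= observed:
            at_least += 1
    return observed, (at_least + 1) / (n_perm + 1)


# Toy pathway: vertices stand in for ortholog groups, `active` for mapped ones.
toy_edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "c")]
toy_active = {"a", "b", "c"}
density, p_value = permutation_p_value(toy_edges, toy_active, n_perm=1000)
print(density, p_value)
```

On the toy graph the induced subgraph has N = 3 nodes and M = 3 edges, so the observed density is 3/2. The p-value depends on the random seed and on the assumed null model, so it is printed rather than stated here.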