

Introduction

The goal of translational bioinformatics is to transform increasingly voluminous genomic and biological data into diagnostics and therapeutics for the clinician. Microarray technology allows us to measure the expression of thousands of genes in a single experiment quickly and efficiently. Traditionally, comparative microarray analysis has been used to pinpoint genetic abnormalities in a disease of interest. By examining genes that are upregulated and downregulated in a disease state relative to a normal state, we can create a genetic profile of the disease. Microarrays have also been used to monitor changes in gene expression in response to drug treatments. Combining the results of disease-related and drug-related microarray experiments enables the discovery of possible functional connections between drugs, genes and diseases through common gene expression changes.

In a recent study, Lamb et al. present a collection of genome-wide transcriptional expression data from cultured human cells treated with bioactive small molecules. The study consists of 453 experiments with different dosages of 164 compound perturbagens and corresponding vehicle controls. The selected compounds include several FDA-approved drugs as well as some nondrug bioactive compounds chosen to represent a broad range of effects. The authors create 11 disease signatures manually by examining the relevant literature, and study associations between drugs, molecular compounds such as HDAC inhibitors, and disease states such as diet-induced obesity and Alzheimer's disease (Lamb et al., 2006).

The Connectivity Map Concept. Gene-expression profiles derived from the treatment of cultured human cells with a large number of perturbagens populate a reference database. Gene-expression signatures represent any induced or organic cell state of interest (left). Pattern-matching algorithms score each reference profile for the direction and strength of enrichment with the query signature (center). Perturbagens are ranked by this "connectivity score"; those at the top ("positive") and bottom ("negative") are functionally connected with the query state (right) through the transitory feature of common gene-expression changes. (Lamb et al., 2006)

In this work, we recreate and extend the drug-disease "connectivity map" using publicly available disease-related gene expression data obtained from the Gene Expression Omnibus (GEO). We automate the process of creating disease signatures from these data. We extend the original set of 11 signatures to nearly 70 diseases and predict possible therapeutics based on the drug-disease connectivity scores. We validate our findings using the known drug-disease associations from the Micromedex database.

Methods

Data

The Gene Expression Omnibus (GEO) is a publicly available gene expression and molecular abundance repository. It is an online resource for browsing, querying and retrieving gene expression data. The database contains roughly 200,000 microarray experiments derived from over 100 organisms, addressing a wide range of biological issues. For this project we are primarily interested in the GEO microarray datasets that allow a comparative analysis between diseased and normal individuals. In our analysis we combine data from nearly 70 disease microarray datasets obtained from GEO with gene expression data from human cell lines treated with roughly 160 drugs or small molecules. The drug-related data were generated by comparing treated and untreated cancer cell lines, including the MCF7 breast cancer cell line, the PC3 prostate cancer cell line, the HL60 leukemia cell line and the SKMEL5 melanoma cell line (Lamb et al., 2006).

Generating Disease Signatures

We start by choosing the single most representative disease dataset with a corresponding control in GEO.
We mine GEO for disease-related experiments by making use of annotations that link GEO experiments to the PubMed identifiers of the publications in which they were reported (Butte et al., 2006). We require that each dataset contains both a disease and a control experiment. We carry out Significance Analysis of Microarrays (SAM) (Tusher et al., 2001) on every control-disease pairing to generate a list of upregulated and downregulated genes for each disease state. Applying a 0.05 significance cutoff to the q-values from the SAM analysis, we generate a signature profile of significantly upregulated and downregulated genes for each disease of interest.

Data Processing and Analysis

To analyze data across multiple experiments from different platforms, we need to standardize the gene identifiers from chip probe IDs to NCBI GeneID (Chen et al., 2007).

Computing Enrichment Scores

For each treatment-disease pair we compute an enrichment score for the probe sets representing the upregulated and the downregulated signature genes separately, using a rank-based Kolmogorov-Smirnov statistic (Lamb et al., 2006). The scores from the upregulated and downregulated genes are combined into a single connectivity score for each drug-disease combination.

Results

Results Summary. A heatmap visualizing connectivity scores of 164 drugs and compounds for each of the 66 diseases. Red indicates a score of -1, suggesting the drug is a possible treatment for the disease. Green indicates a score of +1, suggesting a possible adverse reaction or cause for the disease.

Validation. We validate our findings by querying Micromedex for known drug-disease associations, namely FDA-approved treatments and known adverse effects. Above is the result for breast cancer. The green circles indicate FDA-approved treatments for breast cancer and the red circles indicate drugs that have been recorded as having an adverse effect in patients with breast cancer. The treatment vs. adverse effect distributions are significantly different (p-value = 0.008).

References

1. Butte AJ, Chen R. Finding Disease-Related Genomic Experiments Within an International Repository: First Steps in Translational Bioinformatics. AMIA Annu Symp Proc. 2006; 2006: 106–
2. Chen R, Butte AJ. AILUN: reannotating gene expression data automatically. Nature Methods 2007; 4:
3. Lamb J, Crawford ED, Peck D, et al. The Connectivity Map: using gene-expression signatures to connect small molecules, genes, and disease. Science 2006; 313:
4. Tusher V, et al. Significance Analysis of Microarrays Applied to the Ionizing Radiation Response. Proceedings of the National Academy of Sciences 2001; 98:

Integrating Multiple Publically Available Gene Expression Datasets to Predict Therapeutic Options across the Disease Nosology
Marina Sirota, Annie P. Chiang, Joel Dudley and Atul J. Butte

[Figure: workflow diagram. Labels: Experiments; Affy Probes; GeneID; Drug Expression Data (Up/Down); Disease Signatures (Up/Down).]
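
The scoring step described under Computing Enrichment Scores can be illustrated with a minimal sketch. It assumes the rank-based Kolmogorov-Smirnov formulation described by Lamb et al. (2006): a signed KS-style statistic is computed separately for the upregulated and downregulated signature genes, the two are combined into one raw score, and the raw scores are rescaled to the range -1 to +1 across drugs. The function names (`ks_score`, `connectivity_score`, `normalize_scores`), the toy gene labels, and the convention that the drug profile is ordered from most upregulated to most downregulated are illustrative assumptions, not details taken from this work.

```python
from typing import Dict, Iterable, List


def ks_score(positions: List[int], n: int) -> float:
    """Signed KS-style enrichment of signature genes in a ranked list.

    positions: 1-based ranks of the signature genes in the drug profile.
    n: total number of genes in the ranked drug profile.
    """
    t = len(positions)
    if t == 0:
        return 0.0
    ordered = sorted(positions)
    # a: largest excess of the signature's cumulative fraction over its rank fraction
    a = max((j + 1) / t - v / n for j, v in enumerate(ordered))
    # b: largest excess in the opposite direction
    b = max(v / n - j / t for j, v in enumerate(ordered))
    return a if a > b else -b


def connectivity_score(ranked_genes: List[str],
                       up: Iterable[str],
                       down: Iterable[str]) -> float:
    """Combine the up and down enrichment scores for one drug-disease pair.

    ranked_genes is assumed to be ordered from most upregulated to most
    downregulated by the drug (hypothetical convention for this sketch).
    """
    rank = {gene: i + 1 for i, gene in enumerate(ranked_genes)}
    n = len(ranked_genes)
    ks_up = ks_score([rank[g] for g in up if g in rank], n)
    ks_down = ks_score([rank[g] for g in down if g in rank], n)
    if ks_up * ks_down > 0:  # same sign: no coherent connection
        return 0.0
    return ks_up - ks_down


def normalize_scores(raw: Dict[str, float]) -> Dict[str, float]:
    """Rescale raw scores across drugs so they fall between -1 and +1."""
    p = max(raw.values(), default=0.0)  # largest raw score
    q = min(raw.values(), default=0.0)  # smallest raw score
    scaled = {}
    for drug, s in raw.items():
        if s > 0:
            scaled[drug] = s / p
        elif s < 0:
            scaled[drug] = -(s / q)
        else:
            scaled[drug] = 0.0
    return scaled


# Toy example: a 10-gene drug profile and a 4-gene disease signature.
profile = [f"g{i}" for i in range(1, 11)]
mimic = connectivity_score(profile, up=["g1", "g2"], down=["g9", "g10"])
reverse = connectivity_score(profile, up=["g9", "g10"], down=["g1", "g2"])
```

In this toy example the first query gives a positive raw score (the drug mimics the disease signature) and the second a negative one (the drug reverses it), which matches the reading of the heatmap in Results, where scores near -1 suggest a possible treatment.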