ONCOMINE: A Bioinformatics Infrastructure for Cancer Genomics

ONCOMINE: A Bioinformatics Infrastructure for Cancer Genomics
  Dan Rhodes
  Chinnaiyan Laboratory
  Bioinformatics Program
  Cancer Biology Training Program
  Medical Scientist Training Program
  University of Michigan Medical School

Outline
  Background
    DNA Microarrays and the Cancer Transcriptome
  ONCOMINE
    Data collection, normalization & storage
    Statistical Analysis
    Visualization of Data and Analysis
  ONCOMINE Data Integration
    Therapeutic Targets / Biomarkers
    Metabolic and Signaling Pathways
    Known protein-protein Interactions
  ONCOMINE tutorial

The Cancer Transcriptome

The Cancer Transcriptome
  180+ studies profiling human cancer
  Each profiling 5 to 100+ samples
  We estimate > 10,000 microarrays
  10k chips measuring 20k genes = 200+ million data points

Oncomine
  oncology + data-mining = oncomine
  105 independent datasets (90 analyzed)
  7,292 cancer microarrays
  79 million gene expression measurements
  382 distinct cancer signatures
  > 5 million tests of differential expression
  > 5 million tests of gene set enrichment
  > 5 billion pairwise correlations

Oncomine
  Database: relational, Oracle 9.2
  Statistical computing: R, Perl, Java
  Front End: Java Server Pages
  Server: Apache/Tomcat
  Graphics: Scalable Vector Graphics (SVG)

Data Collection
  Monthly PubMed searches (cancer + microarray + transcriptome + tumor + gene expression profiling)
  Gene Expression Repositories
    Gene Expression Omnibus (GEO) (http://www.ncbi.nlm.nih.gov/geo/)
    ArrayExpress (http://www.ebi.ac.uk/arrayexpress/)
    Stanford Microarray Database (http://genome-www5.stanford.edu/)
    Whitehead Cancer Genomics (http://www.broad.mit.edu/cancer/)

Data Normalization
  Global normalization: same scaling factors applied to all microarray features
    Mean and variance normalization
  Affymetrix: quantile normalization
  Spotted cDNA: loess normalization (normalize an M vs. A plot)
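The quantile normalization named on this slide can be sketched as follows. This is a minimal illustration in plain Python, not the ONCOMINE pipeline itself (which the slides say runs in R, Perl, and Java). The function name `quantile_normalize` and the tie handling (ties broken by position) are assumptions made for the example.

```python
def quantile_normalize(arrays):
    """Force every array to share one empirical distribution.

    arrays: list of equal-length lists of expression values, one list per
    microarray. Illustrative sketch only; ties are broken by position.
    """
    n_features = len(arrays[0])
    n_arrays = len(arrays)

    # Sort each array independently.
    sorted_arrays = [sorted(a) for a in arrays]

    # Reference distribution: mean of the k-th smallest value across arrays.
    reference = [
        sum(s[k] for s in sorted_arrays) / n_arrays for k in range(n_features)
    ]

    normalized = []
    for a in arrays:
        # Indices of this array's features, from smallest to largest value.
        order = sorted(range(n_features), key=lambda i: a[i])
        out = [0.0] * n_features
        for rank, idx in enumerate(order):
            # The feature with rank k receives the k-th reference value.
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


chips = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
result = quantile_normalize(chips)
```

After normalization each array contains the same set of values, so only the rank order of features within an array carries information.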

Data Storage
  Generic data structures to accommodate a variety of data
    Samples
    Microarray Features / Genes
    Normalized Data
    Statistical Tests
    Gene Sets
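One way to picture these generic structures is as a handful of record types linked by identifiers. The sketch below is purely hypothetical: the slides do not give the Oracle schema, so every class and field name here (`Sample`, `Feature`, `Measurement`, `GeneSet`, `DifferentialTest`, `measurements_in_gene_set`) is an assumption chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Sample:
    sample_id: int
    study: str
    # Free-form annotations let different studies carry different fields.
    annotations: dict = field(default_factory=dict)


@dataclass
class Feature:
    feature_id: int
    gene_symbol: str


@dataclass
class Measurement:
    # One normalized value for one feature on one sample.
    sample_id: int
    feature_id: int
    value: float


@dataclass
class GeneSet:
    name: str
    source: str
    gene_symbols: frozenset


@dataclass
class DifferentialTest:
    feature_id: int
    comparison: str
    statistic: float
    p_value: float


def measurements_in_gene_set(measurements, features, gene_set):
    """Return the measurements whose feature maps to a gene in gene_set."""
    wanted = {
        f.feature_id for f in features if f.gene_symbol in gene_set.gene_symbols
    }
    return [m for m in measurements if m.feature_id in wanted]


features = [Feature(1, "TP53"), Feature(2, "ACTB")]
measurements = [Measurement(100, 1, 2.5), Measurement(100, 2, 9.0)]
apoptosis = GeneSet("apoptosis", "Gene Ontology", frozenset({"TP53"}))
selected = measurements_in_gene_set(measurements, features, apoptosis)
```

Keeping measurements in a long (sample, feature, value) form is one design that accommodates arrays with different feature sets without changing the structure.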

Samples

Microarray Features / Genes

Normalized Data

Gene Sets

Statistical Tests

Differential Expression Analysis
  Two-sided t-test for each gene
  False discovery rate correction for multiple hypothesis testing
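The formula shown on this slide did not survive in the transcript, so the following is only a sketch under stated assumptions. It assumes the unequal-variance (Welch) form of the t statistic and the Benjamini-Hochberg procedure for the false discovery rate; the slide names neither variant. The function names are hypothetical, and p-values are taken as given because computing them requires the t distribution, which the Python standard library does not provide.

```python
import math
from statistics import mean, variance


def welch_t(group_a, group_b):
    """t statistic for one gene, assuming unequal variances (Welch form)."""
    standard_error = math.sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / standard_error


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


t_stat = welch_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

With one test per gene and thousands of genes per array, genes would then be reported when their adjusted value falls below a chosen FDR threshold.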

R, Oracle, RODBC

Oncomine Tutorial Part I
  Gene Differential Expression
  Gene Co-Expression
  Study Differential Expression
  WWW.ONCOMINE.ORG
  EMAIL: SHORTCOURSE
  PASSWORD: MCBI

Therapeutic Targets / Biomarkers
  Gene Ontology Consortium
    Biological Process (apoptosis, cell cycle)
    Cellular Component (cytoplasmic membrane, extracellular)
    Molecular Function (kinase, phosphatase, protease, etc.)
  Known Therapeutic Targets
    NCI Clinical Trials Database
    Therapeutic Target Database

Therapeutic Target Database
  338 proteins with a literature-documented inhibitor, antagonist, blocker, etc.
  http://xin.cz3.nus.edu.sg/group/cjttd/ttd.asp

Known Drug Targets Expressed in Bladder Cancer

Secreted proteins highly expressed in Ovarian Cancer

Metabolic & Signaling Pathways
  KEGG: Kyoto Encyclopedia of Genes & Genomes
    87 metabolic pathways, 1700 gene assignments
  Biocarta
    Signaling pathways reviewed and entered by ‘expert’ biologists
    215 signaling pathways, 3700 gene assignments

Pathway enrichment analysis
  Identify pathways and functional groups of genes deregulated in particular cancer types
  Enrichment analysis using Kolmogorov-Smirnov scanning (Lamb et al)

Kolmogorov-Smirnov Scanning (Lamb et al)
  Ranked gene list (* marks a member of the gene set):
  1 2* 3 4* 5 6* 7* 8 9 10 11 12 13 14 15 16 17 18* 19 20
  (1,2,3,4…,19,20) vs. (2,4,6,7,18)
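The slide's example, a 20-gene ranked list with set members at ranks 2, 4, 6, 7, and 18, can be worked through with a running-sum form of the two-sample Kolmogorov-Smirnov statistic. This is a sketch under assumptions: the slides do not give the exact scoring used in ONCOMINE or by Lamb et al, and the function name `ks_scan` is hypothetical.

```python
def ks_scan(n_genes, hit_ranks):
    """Scan a ranked list and return (max deviation, rank where it occurs).

    The running sum steps up by 1/n_hits at each gene-set member and down
    by 1/n_miss at every other gene. Its largest absolute excursion equals
    the two-sample KS statistic between member and non-member ranks.
    """
    hits = set(hit_ranks)
    n_hits = len(hits)
    n_miss = n_genes - n_hits

    running = 0.0
    best = 0.0
    best_rank = 0
    for rank in range(1, n_genes + 1):
        if rank in hits:
            running += 1.0 / n_hits
        else:
            running -= 1.0 / n_miss
        if abs(running) > abs(best):
            best, best_rank = running, rank
    return best, best_rank


# The example from the slide: (1,...,20) vs. (2,4,6,7,18).
statistic, peak_rank = ks_scan(20, [2, 4, 6, 7, 18])
```

For this example the running sum peaks at rank 7 with a value of 0.6, reflecting that four of the five set members sit near the top of the list. Significance would still need to be assessed separately, for instance by permuting gene labels.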

Pathway Enrichment
  Liver vs. other Normal tissues

Pathway Enrichment cont

Pathway enrichment analysis
  A search for the Biocarta pathways most enriched in a medulloblastoma signature (C2) uncovered involvement of the Ras/Rho pathway

Pathway enrichment analysis cont.
  A direct link to the Biocarta pathway provides the details (medulloblastoma genes are marked with red boxes)

Known Protein-Protein Interactions
  HPRD: Human Protein Reference Database
    Manually curated
    20,000+ papers, 15,000+ distinct interactions
  PKDB: Protein Kinase Database
    Natural Language Processing
    60,000+ abstracts suggest interaction, 16,000 distinct interactions
    Error prone
  Co-RIF: LocusLink Reference into Function
    12,000+ co-RIFs

Human Interactome Map (www.himap.org)

INTERACT

Oncomine Tutorial Part II
  Gene set filtering to identify therapeutic targets and biomarkers
  Enrichment analysis to identify pathways and processes deregulated in cancer
  Pathway and protein interaction networks deregulated in cancer

Acknowledgements Chinnaiyan Lab Pandey Lab IOB Radhika, Terry, Vasu, Jianjun, Scott, Soory Pandey Lab IOB Shanker, Nandan