The aim of my research is to establish a relation among diseases, physiological processes and the action of small molecules like mithramycin

Presentation transcript:

The aim of my research is to establish a relation among diseases, physiological processes and the action of small molecules like mithramycin.
"Our goal is to provide a generic solution to this problem by attempting to describe all biological states…in terms of genomic signatures, create a large public database of signatures of drugs and genes, and develop pattern-matching tools to detect similarities among these signatures."

FIRST GENERATION of CONNECTIVITY MAP
small molecules: 164 perturbagens tested (FDA-approved drugs and nondrug bioactive compounds)
cell lines: MCF7 (breast cancer), PC3 (prostate cancer), HL60 (leukemia), SKMEL5 (melanoma)
concentration and treatment: 10 µM (when the optimal concentration is unknown) for 6 h
control: cells in the same plate, treated with vehicle alone (medium, DMSO…)

OVERALL DATA
164 bioactive small molecules and corresponding vehicle controls
Affymetrix GeneChip microarrays HG U133A
564 gene expression profiles

Traditional method: HIERARCHICAL CLUSTERING
A CLUSTER is a collection of objects/data that are:
* similar to the other objects in the same cluster
* different from the objects in the other clusters
In hierarchical clustering the data are not partitioned into particular clusters in a single step. Instead, a series of partitions takes place, which may run from a single cluster containing all objects to n clusters each containing a single object.
This strategy has already been used to analyze data from yeast and rat tissues.
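
The "series of partitions" idea can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used in the work presented here: it assumes Euclidean distance and single linkage, runs in the agglomerative direction (from n single-object clusters to one cluster), and uses hypothetical toy profiles named drugA to drugD.

```python
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def agglomerate(profiles):
    """Return the series of partitions, from n singletons down to one cluster.

    Single linkage (an assumption): the distance between two clusters is
    the smallest distance between any pair of their members.
    """
    clusters = [[name] for name in profiles]  # start with n clusters
    history = [sorted(sorted(c) for c in clusters)]
    while len(clusters) > 1:
        # Find the indices of the two closest clusters.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: min(
                euclidean(profiles[a], profiles[b])
                for a in clusters[ij[0]]
                for b in clusters[ij[1]]
            ),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
        history.append(sorted(sorted(c) for c in clusters))
    return history


# Hypothetical toy profiles (three genes each), not data from the slides.
toy = {
    "drugA": [1.0, 2.0, 0.5],
    "drugB": [1.1, 2.1, 0.4],
    "drugC": [-1.0, 0.0, 2.0],
    "drugD": [-1.2, 0.1, 2.2],
}
steps = agglomerate(toy)
for partition in steps:
    print(partition)
```

Each printed line is one partition in the hierarchy; cutting the hierarchy at a given step yields the clusters at that level.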

Drawbacks of hierarchical clustering
- the structure obtained by this approach was related to cell type and batch effects
- all profiles must be generated on the same microarray platform
An analytical method was needed that could detect multiple components within the cellular response to a perturbation: a new method based on ranks and using the Kolmogorov-Smirnov statistic (comparable in purpose to a t-test).
QUERY SIGNATURE: gene expression profile correlated with a biological state
EXPRESSION PROFILES: gene expression profiles for the perturbagens tested
The query signature is compared with each expression profile.

Query signature: up-regulated (+) and down-regulated (-) genes
Profiles: gene expression profile for each perturbagen compared to its vehicle ( genes)
Connection: strong positive … null … strong negative
Connectivity score: +1 … 0 … -1
Connectivity map
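
A rank-based score of this kind can be sketched as follows. The slides do not give the formula, so everything here is an assumption made for illustration: a signed Kolmogorov-Smirnov-style running statistic is computed separately for the up and down tags, the two are combined only when they have opposite signs, and the result is rescaled to the range +1 … -1. The function names and toy gene lists are hypothetical.

```python
def ks_score(tag_genes, ranked_genes):
    """Signed KS-style enrichment of tag_genes within ranked_genes.

    ranked_genes runs from most up-regulated to most down-regulated in the
    treatment-vs-vehicle comparison. Positive: tags sit near the top.
    Negative: tags sit near the bottom.
    """
    n = len(ranked_genes)
    position = {g: i + 1 for i, g in enumerate(ranked_genes)}  # 1-based ranks
    ranks = sorted(position[g] for g in tag_genes if g in position)
    t = len(ranks)
    if t == 0:
        return 0.0
    a = max((j + 1) / t - v / n for j, v in enumerate(ranks))
    b = max(v / n - j / t for j, v in enumerate(ranks))
    return a if a > b else -b


def raw_connectivity(up_tags, down_tags, ranked_genes):
    """Combine up and down enrichment; same-sign results count as null."""
    ks_up = ks_score(up_tags, ranked_genes)
    ks_down = ks_score(down_tags, ranked_genes)
    if (ks_up > 0) == (ks_down > 0):
        return 0.0
    return ks_up - ks_down


def connectivity_scores(up_tags, down_tags, profiles):
    """Rescale raw scores so the strongest connections are +1 and -1."""
    raw = {name: raw_connectivity(up_tags, down_tags, ranked)
           for name, ranked in profiles.items()}
    top, bottom = max(raw.values()), min(raw.values())
    scaled = {}
    for name, s in raw.items():
        if s > 0:
            scaled[name] = s / top
        elif s < 0:
            scaled[name] = -(s / bottom)
        else:
            scaled[name] = 0.0
    return scaled


# Hypothetical toy data: ten genes, three perturbagen profiles.
genes = [f"g{i}" for i in range(1, 11)]
profiles = {
    "mimic": genes,                          # up tags on top, down tags at bottom
    "reverser": genes[::-1],                 # the opposite ordering
    "unrelated": ["g1", "g9", "g2", "g10"] + genes[2:8],
}
scores = connectivity_scores(["g1", "g2"], ["g9", "g10"], profiles)
print(scores)
```

Because only ranks are used, profiles measured in different batches or cell lines can be compared without putting their intensities on a common scale.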

SOME EXAMPLES
HDAC inhibitors
query signature: T24 (bladder), MDA435 and MDA468 (breast cancer) cells treated with HDAC inhibitors: vorinostat (SAHA), MS, trichostatin A
Gene expression profile: 8 up-regulated genes, 5 down-regulated genes

connectivity map
* Vorinostat
* Trichostatin A
* HC toxin
* Valproic acid
The Connectivity Map allows us to identify compounds not previously known to have this function.
In this case the results are independent of the cell lines used and of the dose of the drug.

Estrogens
query signature: MCF7 treated with 17β-estradiol (E2), the natural ligand of ER
129 up-regulated and 89 down-regulated genes
connectivity map
Both agonists and antagonists can be discovered directly from the Connectivity Map.
It is very important to collect the cells in an appropriate physiological state or molecular context.

Gedunin
Gedunin is able to abrogate AR activity in prostate cancer cells. Mechanism???
query signature: LNCaP treated for 6 h with gedunin
35 up-regulated and 35 down-regulated genes
connectivity map: high connectivity with HSP90 inhibitors

DISEASES
Diet-induced obesity
query signature: gene expression in a rat model of diet-induced obesity
163 up-regulated and 161 down-regulated genes
connectivity map: PPARγ agonists and inducers of adipogenesis
There is also a connection between data from rat and data from human cell lines (but only in PC3).

Alzheimer disease
query signature: two independent studies
- comparison between hippocampus from AD and normal brain: 40 genes
- comparison between cerebral cortex from AD and age-matched controls: 25 genes
Significant negative connectivity with DAPH

Dexamethasone resistance in ALL
query signature: comparison of cells from patients sensitive to dexamethasone and patients resistant to dexamethasone
connectivity map: sirolimus, an mTOR inhibitor
Treatment with sirolimus sensitizes CEM-CL cell lines to dexamethasone treatment.

Our data: SDK
[Figure: Sp1 at the start site, transcription; MTM at the Sp1 site, no transcription]
The anticancer activity of MTM has been associated with its ability to inhibit replication and transcription via cross-linking of the DNA strands; MTM is known to bind to the minor groove of GC-rich DNA as a Mg2+-dimer complex (MTM:Mg2+ = 2:1).
We tested a new MTM analog: SDK

query signature: A2780 treated with SDK 100 nM for 6 hours
3355 down-regulated genes
48 up-regulated genes
900 with ≥2-fold change
240 with ≥3-fold change
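
Counts like these come from comparing treated and control expression values gene by gene. The sketch below shows one way such a tally could be made; it is not the pipeline used for the SDK data. It assumes strictly positive intensities, treats fold change symmetrically for up- and down-regulation, and uses hypothetical gene names and values.

```python
from math import log2


def fold_change_summary(treated, control, thresholds=(2.0, 3.0)):
    """Count up/down genes and how many pass each fold-change cutoff.

    treated and control map gene name -> positive expression intensity.
    """
    summary = {"up": 0, "down": 0}
    summary.update({f">={t:g}-fold": 0 for t in thresholds})
    for gene, value in treated.items():
        ratio = value / control[gene]
        if ratio > 1:
            summary["up"] += 1
        elif ratio < 1:
            summary["down"] += 1
        # |log2 ratio| treats a 2-fold increase and a 2-fold decrease alike.
        magnitude = abs(log2(ratio))
        for t in thresholds:
            if magnitude >= log2(t):
                summary[f">={t:g}-fold"] += 1
    return summary


# Hypothetical intensities, not values from the experiment.
control = {"g1": 100.0, "g2": 100.0, "g3": 100.0, "g4": 100.0, "g5": 100.0}
treated = {"g1": 50.0, "g2": 25.0, "g3": 90.0, "g4": 300.0, "g5": 100.0}
summary = fold_change_summary(treated, control)
print(summary)
```

The genes passing the cutoffs, split by direction, would then form the up and down tags of the query signature.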

DISCUSSION
Encouraging results: the Connectivity Map can be used to
- group drugs with a common mechanism of action (HDAC inhibitors)
- discover an unknown mechanism of action (gedunin)
- identify potential new therapeutics
Genomic signatures are often conserved across different cell types and different origins.
But this pilot study also has several limitations:
- small number of cell lines used
- few concentrations
- interpretation of the results
- the method of statistical analysis

Bye bye

Non-parametric models differ from parametric models in that the model structure is not specified a priori but is instead determined from data. The term nonparametric is not meant to imply that such models completely lack parameters, but that the number and nature of the parameters are flexible and not fixed in advance. Nonparametric models are therefore also called distribution-free. A histogram is a simple nonparametric estimate of a probability distribution.

Non-parametric (or distribution-free) inferential statistical methods are mathematical procedures for statistical hypothesis testing which, unlike parametric statistics, make no assumptions about the frequency distributions of the variables being assessed. The most frequently used tests include the Kolmogorov-Smirnov test (often called the K-S test), which is used to determine whether two underlying probability distributions differ, or whether an underlying probability distribution differs from a hypothesized distribution, in either case based on finite samples.

Nonparametric statistical methods allow one to analyze data without making strong assumptions about the process that generated the data. For example, instead of assuming that the data have a Gaussian distribution, we might assume only that the distribution has a probability density that satisfies some weak smoothness conditions.
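
As a concrete illustration of the two-sample K-S test, the statistic is the largest vertical gap between the two empirical cumulative distribution functions. The minimal sketch below computes only that statistic D, not a p-value; the function name and sample values are hypothetical.

```python
from bisect import bisect_right


def ks_two_sample(x, y):
    """Two-sample Kolmogorov-Smirnov statistic D = max |F_x(v) - F_y(v)|."""
    xs, ys = sorted(x), sorted(y)
    d = 0.0
    # The empirical CDFs only change at observed values, so checking
    # every observed value is enough to find the largest gap.
    for v in sorted(set(xs) | set(ys)):
        fx = bisect_right(xs, v) / len(xs)  # fraction of x that is <= v
        fy = bisect_right(ys, v) / len(ys)  # fraction of y that is <= v
        d = max(d, abs(fx - fy))
    return d


# Hypothetical samples.
print(ks_two_sample([1, 2, 3, 4], [3, 4, 5, 6]))
```

D is 0 when the two samples have identical empirical distributions and 1 when they do not overlap at all; no assumption about the shape of either distribution is needed.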