CK, October, 2003 A Hidden Markov Model for Microarray Time Course Data Christina Kendziorski and Ming Yuan Department of Biostatistics and Medical Informatics.

Slides:



Advertisements
Similar presentations
Yinyin Yuan and Chang-Tsun Li Computer Science Department
Advertisements

Linear Models for Microarray Data
Pattern Finding and Pattern Discovery in Time Series
1 Parametric Empirical Bayes Methods for Microarrays 3/7/2011 Copyright © 2011 Dan Nettleton.
M. Kathleen Kerr “Design Considerations for Efficient and Effective Microarray Studies” Biometrics 59, ; December 2003 Biostatistics Article Oncology.
Darya Chudova, Alexander Ihler, Kevin K. Lin, Bogi Andersen and Padhraic Smyth BIOINFORMATICS Gene expression Vol. 25 no , pages
Chromatin Immuno-precipitation (CHIP)-chip Analysis
Principal Component Analysis (PCA) for Clustering Gene Expression Data K. Y. Yeung and W. L. Ruzzo.
Profiles for Sequences
Hidden Markov Models First Story! Majid Hajiloo, Aria Khademi.
INTRODUCTION TO Machine Learning 3rd Edition
Visual Recognition Tutorial
ETHEM ALPAYDIN © The MIT Press, Lecture Slides for.
HMM-BASED PATTERN DETECTION. Outline  Markov Process  Hidden Markov Models Elements Basic Problems Evaluation Optimization Training Implementation 2-D.
Sandrine Dudoit1 Microarray Experimental Design and Analysis Sandrine Dudoit jointly with Yee Hwa Yang Division of Biostatistics, UC Berkeley
Probabilistic Techniques for the Clustering of Gene Expression Data Speaker: Yujing Zeng Advisor: Javier Garcia-Frias Department of Electrical and Computer.
Lecture 5: Learning models using EM
Gene Expression Data Analyses (3)
Introduction to Computational Biology Topics. Molecular Data Definition of data  DNA/RNA  Protein  Expression Basics of programming in Matlab  Vectors.
Hidden Markov Models Usman Roshan BNFO 601. Hidden Markov Models Alphabet of symbols: Set of states that emit symbols from the alphabet: Set of probabilities.
Genome evolution: a sequence-centric approach Lecture 3: From Trees to HMMs.
1 Test of significance for small samples Javier Cabrera.
INTRODUCTION TO Machine Learning ETHEM ALPAYDIN © The MIT Press, Lecture Slides for.
A Unifying Review of Linear Gaussian Models
Handwritten Character Recognition using Hidden Markov Models Quantifying the marginal benefit of exploiting correlations between adjacent characters and.
1 A Presentation of ‘Bayesian Models for Gene Expression With DNA Microarray Data’ by Ibrahim, Chen, and Gray Presentation By Lara DePadilla.
Affymetrix GeneChips and Analysis Methods Neil Lawrence.
ArrayCluster: an analytic tool for clustering, data visualization and module finder on gene expression profiles 組員:李祥豪 謝紹陽 江建霖.
Applying statistical tests to microarray data. Introduction to filtering Recall- Filtering is the process of deciding which genes in a microarray experiment.
Department of Statistics, University of California, Berkeley, and Division of Genetics and Bioinformatics, Walter and Eliza Hall Institute of Medical Research.
Genetic Reduction of Inositol Triphosphate (InsP 3 ) Increases Tolerance of Tomato Plants to Oxidative Stress Mohammad Alimohammadi 1, Mohamed H. Lahiani.
1 Generative and Discriminative Models Jie Tang Department of Computer Science & Technology Tsinghua University 2012.
A A R H U S U N I V E R S I T E T Faculty of Agricultural Sciences Introduction to analysis of microarray data David Edwards.
A Short Overview of Microarrays Tex Thompson Spring 2005.
1 Identifying differentially expressed genes from RNA-seq data Many recent algorithms for calling differentially expressed genes: edgeR: Empirical analysis.
Introduction to Microarrays Dr. Özlem İLK & İbrahim ERKAN 2011, Ankara.
Statistical Methods for Identifying Differentially Expressed Genes in Replicated cDNA Microarray Experiments Presented by Nan Lin 13 October 2002.
Differential analysis of Eigengene Networks: Finding And Analyzing Shared Modules Across Multiple Microarray Datasets Peter Langfelder and Steve Horvath.
Statistics for Differential Expression Naomi Altman Oct. 06.
A Quantitative Overview to Gene Expression Profiling in Animal Genetics Armidale Animal Breeding Summer Course, UNE, Feb Analysis of (cDNA) Microarray.
CS Statistical Machine learning Lecture 24
Jiann-Ming Wu, Ya-Ting Zhou, Chun-Chang Wu National Dong Hwa University Department of Applied Mathematics Hualien, Taiwan Learning Markov-chain embedded.
Suppose we have T genes which we measured under two experimental conditions (Ctl and Nic) in n replicated experiments t i * and p i are the t-statistic.
Introduction to Microarrays Kellie J. Archer, Ph.D. Assistant Professor Department of Biostatistics
CONCLUSIONS TCDD responsiveness of HL1-1 cell was confirmed with CYP1A1 mRNA expression induction by QRT- PCR TCDD-elicited temporal and dose response.
The generalization of Bayes for continuous densities is that we have some density f(y|  ) where y and  are vectors of data and parameters with  being.
Comp. Genomics Recitation 10 4/7/09 Differential expression detection.
1 Hidden Markov Model Observation : O1,O2,... States in time : q1, q2,... All states : s1, s2,... Si Sj.
 Present by 陳群元.  Introduction  Previous work  Predicting motion patterns  Spatio-temporal transition distribution  Discerning pedestrians  Experimental.
1 Estimation of Gene-Specific Variance 2/17/2011 Copyright © 2011 Dan Nettleton.
SRCOS Summer Research Conf, 2011 Multiple Testing Under Dependency, with Applications to Genomic Data Analysis Zhi Wei Department of Computer Science New.
Discriminative Training and Machine Learning Approaches Machine Learning Lab, Dept. of CSIE, NCKU Chih-Pin Liao.
A Quantitative Overview to Gene Expression Profiling in Animal Genetics Armidale Animal Breeding Summer Course, UNE, Feb Analysis of (cDNA) Microarray.
A Quantitative Overview to Gene Expression Profiling in Animal Genetics Armidale Animal Breeding Summer Course, UNE, Feb Analysis of (cDNA) Microarray.
Cognitive Computer Vision Kingsley Sage and Hilary Buxton Prepared under ECVision Specific Action 8-3
Hidden Markov Models. A Hidden Markov Model consists of 1.A sequence of states {X t |t  T } = {X 1, X 2,..., X T }, and 2.A sequence of observations.
13 October 2004Statistics: Yandell © Inferring Genetic Architecture of Complex Biological Processes Brian S. Yandell 12, Christina Kendziorski 13,
Lab 5 Unsupervised and supervised clustering Feb 22 th 2012 Daniel Fernandez Alejandro Quiroz.
Estimation of Gene-Specific Variance
Bayesian Semi-Parametric Multiple Shrinkage
Mixture Modeling of the p-value Distribution
Statistical Models for Automatic Speech Recognition
Mixture Modeling of the Distribution of p-values from t-tests
Mixture modeling of the distribution of p-values from t-tests
Hidden Markov Models Part 2: Algorithms
Hidden Markov Autoregressive Models
Statistical Models for Automatic Speech Recognition
Parametric Empirical Bayes Methods for Microarrays
A Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models Jeff A. Bilmes International.
A Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models Jeff A. Bilmes International.
Presentation transcript:

CK, October, 2003 A Hidden Markov Model for Microarray Time Course Data Christina Kendziorski and Ming Yuan Department of Biostatistics and Medical Informatics University of Wisconsin-Madison Full references available at

CK, October, 2003  Oxidative stress on the heart (Prolla lab, UW Madison) Comparison of young and old mice. Baseline and 1, 3, 5, and 7 hours after stress induction. 30 MG-U74A Affymetrix chips.  Dioxin exposure in the liver (Bradfield lab, UW Madison) Comparison of three dioxin doses in mice. 1, 2, 4, 8, 16, 32, and 64 days after treatment. 168 cDNA arrays (132 were used).  Type II diabetes on the kidney (Park lab, LSU) Comparison of lean and obese rats. 2 hours, 1 day, 3 days and 7 days after treatment. 18 Affymetrix Rat 230A chips. Case Studies

CK, October, 2003 Data Structure  X = [ X 1, X 2,..., X T ]  g = 1,2,...,m genes ; k = 1,2,...,K conditions; j = 1,2,...j k replicates  t = 1, 2,..., T time points. 12..m12..m A1 A2 A3 A4 A AN-1 AN C1 C2 CK

CK, October, 2003  Identify genes differentially expressed at each time.  Identify a gene’s expression pattern over time.  Cluster genes within one biological condition  clustering, id of cyclic patterns, HMMs Important Tasks

CK, October, 2003  Motivation: demonstrated using one case study.  Empirical Bayes approach for identifying DE genes at a single time point.  Accounting for dependence via HMMs.  Three case studies. Outline

CK, October, 2003 EBarrays at each time (Oxidative Stress on Heart) Baseline1 hour3 hours5 hours7 hours DE Hour DE EE EE DE Hours DE EE EE DE Baseline 1 Hour Hours DE EE EE DE 5 Hours P(DE|DE) = 0.49 P(DE|EE) = 0.03 P(DE|DE) = 0.54 P(DE|EE) = 0.01 P(DE|DE) = 0.73 P(DE|EE) = 0.04

CK, October, 2003 Accounting for time (Oxidative Stress on Heart) Baseline1 hour3 hours5 hours7 hours DE- marg DE- HMM Increase 52%47%95%148%51% In common 100%95%97% 91%

CK, October, 2003 Empirical Bayes approach for identifying DE genes at a single time point.

CK, October, 2003 Hierarchical Model for Expression Data (One condition)

CK, October, 2003  Let denote data (one gene) in conditions C1 and C2.  Two patterns of expression: P0 (EE) : P1 (DE):  For P0,  For P1, Hierarchical Model for Expression Data (Two conditions)

CK, October, 2003 Hierarchical Mixture Model for Expression Data =>  Two conditions:  Multiple conditions: =>  Parameter estimates via EM  Bayes rule determines threshold here; could target specific FDR.

CK, October, 2003 Accounting for time dependence via HMMs

CK, October, 2003 Hidden Markov Model (Two conditions) EE DE time Intensity

CK, October, 2003 Hidden Markov Model (Assumptions)  Expression pattern processes is described by initial probability distribution and transition probability matrix A(t). A(t) might depend on time.  Observed expression vector is characterized by:  Temporal correlation in the data can be completely described by the pattern process:

CK, October, 2003 Hidden Markov Model: How does it help ?  Without HMM:  With HMM:

CK, October, 2003 Does it work ? One Simulation Study - and see Ming Yuan Case Studies

CK, October, 2003 One Simulation Study  HMM with 1500 genes, 2 conditions, 6 time points, no reps  10% DE at first time point. P(DE|EE)=0.1; P(DE|DE) varies.  Compare marginal, homogeneous and non-homogeneous HMM. P(DE|DE)MethodTime 1Time 2Time 3Time 4Time 5Time I II III I II III I II III I II III

CK, October, 2003 Simulation Study Results

CK, October, 2003 Oxidative Stress on Heart - Prolla Lab Baseline1 hour3 hours5 hours7 hours DE- marg DE- HMM Increase 52%47%95%148%51%  110 genes called EE at every time by marginal analysis, but DE at every time by HMM-EBarrays (HMME).

CK, October, 2003 ALL EE by marginal analyses; All DE by HMME Average of log_2(Intensity)

CK, October, 2003 ALL EE by marginal analyses; All DE by HMME Average of log_2(Intensity)

CK, October, 2003 EBarrays at each time (Type 2 Diabetes on Kidney) 2 Hours1 day3 days7 days DE Day DE EE EE DE 2 Hours Days DE EE EE DE 1 Day Days DE EE EE DE 3 Days P(DE|DE) = 0.2 P(DE|EE) = P(DE|DE) = 0.5 P(DE|EE) = P(DE|DE) = P(DE|EE) = 0.001

CK, October, 2003 EBarrays vs. HMME (Type 2 Diabetes on Kidney) 2 Hours1 day3 days7 days DE- marg DE- HMM Increase 44%85%-2%104% In common 98%88% 95%

CK, October, 2003 EE by marginal analyses; DE by HMME (2x) Average of log_2(Intensity)

CK, October, 2003 EBarrays vs. HMME (Dioxin on Liver) Day DE- marg DE- HMM In common % 76% 68% 77% 75% 79% 77%

CK, October, 2003 Summary  Correlation in expression patterns over time exists.  Most methods analyze time course data within condition.  HMME approach identifies temporal expression patterns.  HMME increases sensitivity.  Pattern information is provided at each time point.  RT-PCR results on the way.

CK, October, 2003 Hierarchical Model for Expression Data