Published by Sabrina Daniels. Modified over 9 years ago.
1
A Hidden Markov Model for Microarray Time Course Data
Christina Kendziorski and Ming Yuan
Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison
CK, October, 2003
Full references available at http://www.biostat.wisc.edu/~kendzior
2
Case Studies
Oxidative stress on the heart (Prolla lab, UW Madison): comparison of young and old mice; baseline and 1, 3, 5, and 7 hours after stress induction; 30 MG-U74A Affymetrix chips.
Dioxin exposure in the liver (Bradfield lab, UW Madison): comparison of three dioxin doses in mice; 1, 2, 4, 8, 16, 32, and 64 days after treatment; 168 cDNA arrays (132 were used).
Type II diabetes on the kidney (Park lab, LSU): comparison of lean and obese rats; 2 hours, 1 day, 3 days, and 7 days after treatment; 18 Affymetrix Rat 230A chips.
3
Data Structure
X = [X_1, X_2, ..., X_T]
g = 1, 2, ..., m genes; k = 1, 2, ..., K conditions; j = 1, 2, ..., j_k replicates; t = 1, 2, ..., T time points.
[Figure: data matrix with rows indexed by genes 1, 2, ..., m and columns by arrays A1, A2, ..., AN, grouped into conditions C1, C2, ..., CK.]
4
Important Tasks
Identify genes differentially expressed (DE) at each time.
Identify a gene's expression pattern over time.
Cluster genes within one biological condition: clustering, identification of cyclic patterns, HMMs.
5
Outline
Motivation: demonstrated using one case study.
Empirical Bayes approach for identifying DE genes at a single time point.
Accounting for dependence via HMMs.
Three case studies.
6
EBarrays at each time (Oxidative Stress on Heart)

Number of DE genes at each time:
        Baseline   1 hour   3 hours   5 hours   7 hours
DE        1005       701      499       380       637

Baseline to 1 hour:
                1 Hour DE   1 Hour EE
Baseline DE        494         511
Baseline EE        207        8831
P(DE|DE) = 0.49, P(DE|EE) = 0.03

1 hour to 3 hours:
                3 Hours DE   3 Hours EE
1 Hour DE          378          323
1 Hour EE          121         9221
P(DE|DE) = 0.54, P(DE|EE) = 0.01

........

5 hours to 7 hours:
                7 Hours DE   7 Hours EE
5 Hours DE         278          102
5 Hours EE         359         9304
P(DE|DE) = 0.73, P(DE|EE) = 0.04
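The conditional proportions quoted on this slide can be recomputed from the 2x2 counts. The following is a minimal sketch, assuming that the first pattern in each count refers to the earlier time and the second to the later time; the function name `transition_props` is hypothetical and not part of EBarrays.

```python
def transition_props(de_de, de_ee, ee_de, ee_ee):
    """Empirical P(DE at later time | pattern at earlier time).

    Arguments are gene counts named <earlier>_<later>; for example, de_ee is
    the number of genes called DE at the earlier time and EE at the later one.
    """
    p_de_given_de = de_de / (de_de + de_ee)
    p_de_given_ee = ee_de / (ee_de + ee_ee)
    return p_de_given_de, p_de_given_ee


# Counts for 1 hour to 3 hours (oxidative stress on heart).
p_dd, p_de = transition_props(378, 323, 121, 9221)
print(round(p_dd, 2), round(p_de, 2))  # prints 0.54 0.01
```

A value of P(DE|DE) far above P(DE|EE) is the dependence over time that motivates the HMM.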
7
Accounting for time (Oxidative Stress on Heart)

              Baseline   1 hour   3 hours   5 hours   7 hours
DE-marg         1005       701      499       380       637
DE-HMM          1531      1029      972       944       959
Increase         52%       47%      95%      148%       51%
In common       100%       95%      97%                 91%
8
Empirical Bayes approach for identifying DE genes at a single time point.
9
Hierarchical Model for Expression Data (One condition)
10
Hierarchical Model for Expression Data (Two conditions)
Let [...] denote data (one gene) in conditions C1 and C2.
Two patterns of expression:
P0 (EE, equivalently expressed): [...]
P1 (DE, differentially expressed): [...]
For P0, [...]
For P1, [...]
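As a sketch of how the two patterns are usually formalized in the EBarrays empirical Bayes framework: the notation below (x, mu, f_obs, pi, f_0, f_1) is assumed for illustration and is not taken from the slide itself.

```latex
\begin{align*}
x &= (x_{C1}, x_{C2}) \\
\text{P0 (EE)}&:\ \mu_{C1} = \mu_{C2}, \qquad \text{P1 (DE)}:\ \mu_{C1} \neq \mu_{C2} \\
f_0(x) &= \int \Big( \prod_{i} f_{\mathrm{obs}}(x_i \mid \mu) \Big)\, \pi(\mu)\, d\mu \\
\text{For P0:}\quad x &\sim f_0(x) \\
\text{For P1:}\quad x &\sim f_1(x) = f_0(x_{C1})\, f_0(x_{C2})
\end{align*}
```

Under P0 all measurements share one latent mean drawn from the prior; under P1 each condition has its own independently drawn mean, so the marginal density factors across conditions.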
11
Hierarchical Mixture Model for Expression Data
Two conditions: [...]
Multiple conditions: [...]
=> Parameter estimates via EM.
=> Bayes rule determines threshold here; could target specific FDR.
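The two thresholding options mentioned here can be illustrated with a small sketch. It assumes a two-component mixture with DE proportion `p_de` and per-gene marginal densities `f0` (EE) and `f1` (DE) that are already available; the function names and the example densities are hypothetical. The FDR rule shown (grow the list while the average posterior probability of EE stays under the target) is one common choice, not necessarily the one used in the talk.

```python
def posterior_de(f0, f1, p_de):
    """Posterior probability of DE for one gene via Bayes rule.

    f0, f1: marginal densities of the gene's data under EE and DE.
    p_de:   mixing proportion of DE genes (estimated by EM in practice).
    """
    num = p_de * f1
    return num / (num + (1.0 - p_de) * f0)


def fdr_call_set(post_de, target_fdr):
    """Largest gene list whose mean posterior P(EE) stays <= target_fdr."""
    order = sorted(range(len(post_de)), key=lambda g: post_de[g], reverse=True)
    called, running_ee = [], 0.0
    for g in order:
        running_ee += 1.0 - post_de[g]
        if running_ee / (len(called) + 1) > target_fdr:
            break
        called.append(g)
    return called


# Hypothetical densities for four genes.
f0 = [0.9, 0.2, 0.05, 0.5]
f1 = [0.1, 0.6, 0.8, 0.5]
post = [posterior_de(a, b, p_de=0.1) for a, b in zip(f0, f1)]

bayes_rule_calls = [g for g, p in enumerate(post) if p > 0.5]
fdr_calls = fdr_call_set(post, target_fdr=0.4)
```

Calling a gene DE when its posterior probability exceeds 0.5 is the Bayes rule threshold; the second function trades that fixed cutoff for a list-level error target.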
12
Accounting for time dependence via HMMs
13
Hidden Markov Model (Two conditions)
[Figure: hidden expression patterns (EE, DE) evolving over time, with observed intensities; axes: time, Intensity.]
14
Hidden Markov Model (Assumptions)
The expression pattern process is described by an initial probability distribution and a transition probability matrix A(t). A(t) might depend on time.
The observed expression vector is characterized by: [...]
Temporal correlation in the data can be completely described by the pattern process: [...]
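A sketch of these assumptions in standard HMM notation: the symbols s_t (hidden pattern at time t), x_t (observed expression vector at time t), pi_i, a_ij(t), and f_i are assumed here for illustration and may differ from the slide's own notation.

```latex
\begin{align*}
\pi_i &= P(s_1 = i), \qquad a_{ij}(t) = P(s_{t+1} = j \mid s_t = i), \qquad i, j \in \{\text{EE}, \text{DE}\} \\
x_t \mid s_t = i &\sim f_i(x_t) \\
P(x_1, \ldots, x_T \mid s_1, \ldots, s_T) &= \prod_{t=1}^{T} P(x_t \mid s_t)
\end{align*}
```

The last line is the conditional independence statement: given the pattern process, the observations at different times carry no further dependence.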
15
Hidden Markov Model: How does it help?
Without HMM: [...]
With HMM: [...]
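The contrast can be sketched numerically: without the HMM, a gene's pattern at time t is inferred from the data at time t alone; with the HMM, it is inferred from the data at all times. The code below is a minimal forward-backward sketch for one gene, assuming a transition matrix held fixed over time and state-conditional densities that are already computed. The function names and the numbers in the example are hypothetical.

```python
def forward_backward(init, trans, lik):
    """Posterior P(state at t | data at all times) for one gene.

    init:  initial state probabilities, e.g. [P(EE), P(DE)].
    trans: trans[i][j] = P(state j at t+1 | state i at t), fixed over time.
    lik:   lik[t][i] = density of the time-t data given state i.
    """
    T, S = len(lik), len(init)
    # Forward pass: alpha[t][i] proportional to P(x_1..x_t, s_t = i).
    alpha = []
    for t in range(T):
        if t == 0:
            a = [init[i] * lik[0][i] for i in range(S)]
        else:
            a = [lik[t][j] * sum(alpha[t - 1][i] * trans[i][j] for i in range(S))
                 for j in range(S)]
        z = sum(a)
        alpha.append([v / z for v in a])  # rescale to avoid underflow
    # Backward pass: beta[t][i] proportional to P(x_{t+1}..x_T | s_t = i).
    beta = [[1.0] * S for _ in range(T)]
    for t in range(T - 2, -1, -1):
        b = [sum(trans[i][j] * lik[t + 1][j] * beta[t + 1][j] for j in range(S))
             for i in range(S)]
        z = sum(b)
        beta[t] = [v / z for v in b]
    # Combine the two passes and normalize at each time.
    post = []
    for t in range(T):
        g = [alpha[t][i] * beta[t][i] for i in range(S)]
        z = sum(g)
        post.append([v / z for v in g])
    return post


def marginal_posterior(prior, lik_t):
    """Posterior P(state | data at one time only), ignoring other times."""
    g = [p * l for p, l in zip(prior, lik_t)]
    z = sum(g)
    return [v / z for v in g]


# States are ordered [EE, DE]; data mildly favor DE at each of three times.
init = [0.9, 0.1]
trans = [[0.97, 0.03], [0.5, 0.5]]
lik = [[1.0, 3.0], [1.0, 3.0], [1.0, 3.0]]

without_hmm = marginal_posterior(init, lik[0])[1]
with_hmm = forward_backward(init, trans, lik)[0][1]
```

In this example the evidence at any single time is too weak to call the gene DE, but consistent evidence across times raises the posterior probability of DE, which is the mechanism behind the increased number of DE calls.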
16
Does it work?
One simulation study (and see Ming Yuan).
Case studies.
17
One Simulation Study
HMM with 1500 genes, 2 conditions, 6 time points, no replicates.
10% DE at first time point. P(DE|EE) = 0.1; P(DE|DE) varies.
Compare marginal, homogeneous, and non-homogeneous HMM.

P(DE|DE)   Method   Time 1   Time 2   Time 3   Time 4   Time 5   Time 6
0.1        I         81.65    82.33    85.52    82.39    80.01    82.41
           II        81.69    82.18    82.15    82.29    80.50    81.88
           III       81.75    82.43    82.79    82.74    80.04    82.50
0.3        I         82.15   100.78   106.01   105.49   106.30   105.40
           II        83.14   103.22   108.67   108.53   108.97   106.43
           III       83.29   103.24   109.23   108.61   109.73   106.59
0.5        I         84.05   120.72   134.05   142.61   144.40   145.32
           II        88.12   133.40   151.95   161.68   163.42   154.50
           III       88.27   133.80   151.42   162.28   163.07   154.93
0.7        I         82.58   140.76   178.48   197.66   216.63   225.55
           II        91.68   170.75   222.93   252.92   269.49   262.92
           III       91.75   171.75   223.06   252.39   271.41   266.83
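To see how the simulation settings shape the number of truly DE genes over time, the sketch below propagates the DE fraction through a two-state chain. It assumes the transition probabilities are constant across time points; the function name is hypothetical, and the output is the expected count of truly DE genes, not a reproduction of the table entries.

```python
def expected_de_counts(n_genes, p_de_first, p_de_given_ee, p_de_given_de, n_times):
    """Expected number of truly DE genes at each time under a two-state chain."""
    frac, counts = p_de_first, []
    for _ in range(n_times):
        counts.append(n_genes * frac)
        # Genes are DE next time if they stay DE or switch from EE to DE.
        frac = frac * p_de_given_de + (1.0 - frac) * p_de_given_ee
    return counts


for p_dd in (0.1, 0.3, 0.5, 0.7):
    counts = expected_de_counts(1500, 0.10, 0.1, p_dd, 6)
    print(p_dd, [round(c, 1) for c in counts])
```

With P(DE|DE) = 0.1 the expected count stays at 150 at every time, while larger values of P(DE|DE) make the pool of DE genes grow after the first time point.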
18
Simulation Study Results
19
Oxidative Stress on Heart - Prolla Lab

              Baseline   1 hour   3 hours   5 hours   7 hours
DE-marg         1005       701      499       380       637
DE-HMM          1531      1029      972       944       959
Increase         52%       47%      95%      148%       51%

110 genes called EE at every time by marginal analysis, but DE at every time by HMM-EBarrays (HMME).
20
All EE by marginal analyses; all DE by HMME
[Figure; axis label: Average of log_2(Intensity)]
21
All EE by marginal analyses; all DE by HMME
[Figure; axis label: Average of log_2(Intensity)]
22
EBarrays at each time (Type 2 Diabetes on Kidney)

Number of DE genes at each time:
        2 Hours   1 day   3 days   7 days
DE         50      118      730      55

2 hours to 1 day:
               1 Day DE   1 Day EE
2 Hours DE        10         40
2 Hours EE       108      15765
P(DE|DE) = 0.2, P(DE|EE) = 0.007

1 day to 3 days:
               3 Days DE   3 Days EE
1 Day DE          59          59
1 Day EE         671       15134
P(DE|DE) = 0.5, P(DE|EE) = 0.042

3 days to 7 days:
               7 Days DE   7 Days EE
3 Days DE         37         693
3 Days EE         18       15175
P(DE|DE) = 0.051, P(DE|EE) = 0.001
23
EBarrays vs. HMME (Type 2 Diabetes on Kidney)

              2 Hours   1 day   3 days   7 days
DE-marg          50      118      730      55
DE-HMM           72      218      717     112
Increase         44%     85%      -2%    104%
In common        98%     88%               95%
24
EE by marginal analyses; DE by HMME (2x)
[Figure; axis label: Average of log_2(Intensity)]
25
EBarrays vs. HMME (Dioxin on Liver)

Day             1     2     4     8     16     32    64
DE-marg        57   143   177   376   1081    169   111
DE-HMM        142   212   296   543    904    211   149
In common     77%   76%   68%   77%    75%    79%   77%
26
Summary
Correlation in expression patterns over time exists.
Most methods analyze time course data within condition.
HMME approach identifies temporal expression patterns.
HMME increases sensitivity.
Pattern information is provided at each time point.
RT-PCR results on the way.
27
Hierarchical Model for Expression Data