
1 Conditional Chow-Liu Tree Structures for Modeling Discrete-Valued Vector Time Series
Sergey Kirshner, UC Irvine; Padhraic Smyth, UC Irvine; Andrew Robertson, IRI
July 10, 2004

2 Overview (UAI-2004, © Sergey Kirshner, UC Irvine)
- Data and its modeling aspects
- Model description
  – General approach: hidden Markov models
  – Capturing data properties: Chow-Liu trees; conditional Chow-Liu trees
- Inference and learning
- Experimental results
- Summary and future extensions

3 Snapshot of the Data
[figure: grid of observations for N stations (rows 1..N) over T days (columns 1..T)]

4 Data Aspects
- Correlation: spatial dependence
- Temporal structure: first-order dependence
- Variability of individual series: interannual variability

5 Modeling Precipitation Occurrence
- Southwestern Australia, 1978-92
- Western US, 1952-90

6 A Bit of Notation
- Vector time series R: R_{1:T} = R_1, ..., R_T
- Vector observation of R at time t: R_t = (A_t, B_t, ..., Z_t)
[figure: components A_t, B_t, C_t, ..., Z_t grouped into the vector R_t, for t = 1, ..., T]

7 Weather Generator
[figure: an independent first-order Markov chain per component: A_1 → A_2 → ..., B_1 → B_2 → ..., and so on]
Does not take spatial correlation into account.

8 Hidden Markov Model
[figure: hidden state chain S_1 → S_2 → ... → S_T, each S_t emitting the observation vector R_t]

9 HMM-Conditional-Independence
[figure: the emission P(R_t | S_t) factorizes as a product over components, P(A_t | S_t) P(B_t | S_t) ... P(Z_t | S_t), i.e., the components of R_t are conditionally independent given S_t]

10 HMM-CI: Is It Sufficient?
- Simple yet effective
- Requires a large number of values for S_t
- Emissions can be made to capture more spatial dependencies

11 Chow-Liu Trees
Approximation of a joint distribution with a tree-structured distribution [Chow and Liu 68]

12 Illustration of CL-Tree Learning
Pairwise marginals (four joint probabilities per pair) and mutual information:
  A-B: (0.56, 0.11, 0.02, 0.31), MI = 0.3126
  A-C: (0.51, 0.17, 0.17, 0.15), MI = 0.0229
  A-D: (0.53, 0.15, 0.19, 0.13), MI = 0.0172
  B-C: (0.44, 0.14, 0.23, 0.19), MI = 0.0230
  B-D: (0.46, 0.12, 0.26, 0.16), MI = 0.0183
  C-D: (0.64, 0.04, 0.08, 0.24), MI = 0.2603
[figure: the maximum spanning tree over nodes A, B, C, D selects the highest-MI edges]

13 Chow-Liu Trees
Approximation of a joint distribution with a tree-structured distribution [Chow and Liu 68]
Learning the structure and the probabilities:
- Compute individual and pairwise marginal distributions for all pairs of variables
- Compute mutual information (MI) for each pair of variables
- Build a maximum spanning tree for the complete graph with variables as nodes and MIs as edge weights
Properties:
- Efficient: O(#samples × (#variables)^2 × (#values per variable)^2)
- Optimal: maximizes likelihood among all tree-structured distributions
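The learning procedure on this slide can be sketched in Python (a minimal illustration on empirical counts; `mutual_information` and `chow_liu_edges` are hypothetical helper names, not from the talk):

```python
import math
from itertools import combinations
from collections import Counter

def mutual_information(samples, i, j):
    """Empirical MI between columns i and j of a list of tuples."""
    n = len(samples)
    pi = Counter(s[i] for s in samples)
    pj = Counter(s[j] for s in samples)
    pij = Counter((s[i], s[j]) for s in samples)
    mi = 0.0
    for (a, b), c in pij.items():
        # (c/n) * log( p(a,b) / (p(a) p(b)) ), written with counts
        mi += (c / n) * math.log((c * n) / (pi[a] * pj[b]))
    return mi

def chow_liu_edges(samples, n_vars):
    """Maximum spanning tree over MI edge weights (Kruskal with union-find)."""
    weights = sorted(((mutual_information(samples, i, j), i, j)
                      for i, j in combinations(range(n_vars), 2)), reverse=True)
    parent = list(range(n_vars))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    edges = []
    for w, i, j in weights:
        ri, rj = find(i), find(j)
        if ri != rj:              # greedily keep the edge if it joins components
            parent[ri] = rj
            edges.append((i, j))
    return edges
```

On data where A tracks B and C tracks D, the learned tree keeps the A-B and C-D edges, matching the illustration on the previous slide.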

14 HMM-Chow-Liu
[figure: an HMM whose emission for each hidden state is a state-specific Chow-Liu tree, P(R_t | S_t = k) = T_k(R_t), illustrated for three states S_t = 1, 2, 3 over components A_t, B_t, C_t, D_t]

15 Improving on Chow-Liu Trees
- Tree edges with low MI add little to the approximation.
- Observations from the previous time point can be more relevant than those from the current one.
- Idea: build the Chow-Liu tree allowing it to include variables from both the current and the previous time point.

16 Conditional Chow-Liu Forests
Extension of Chow-Liu trees to conditional distributions:
- Approximation of a conditional multivariate distribution with a tree-structured distribution
- Uses MI to build maximum spanning trees (a forest)
  – Variables of two consecutive time points serve as nodes
  – All nodes corresponding to the earlier time point are considered connected before the tree construction
- Same asymptotic complexity as Chow-Liu trees: O(#samples × (#variables)^2 × (#values per variable)^2)
- Optimal
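The conditional variant can be sketched by running the same maximum-spanning-tree step, but first merging all previous-time-point nodes into one union-find component, as the slide describes; only edges touching the current time point then survive (names and edge-list format are illustrative, not from the talk):

```python
def conditional_chow_liu_edges(weighted_edges, prev_nodes, all_nodes):
    """weighted_edges: list of (mi, u, v); prev_nodes: variables from time t-1.
    Returns the selected forest edges (each touches the current time point)."""
    parent = {v: v for v in all_nodes}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    # Pre-merge all previous-time nodes into a single component, so no
    # edge between two previous-time variables is ever added.
    prev = list(prev_nodes)
    for v in prev[1:]:
        parent[find(v)] = find(prev[0])
    edges = []
    for mi, u, v in sorted(weighted_edges, reverse=True):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            edges.append((u, v))
    return edges
```

For example, even if MI(A', B') is the largest weight, the A'-B' edge is skipped because A' and B' start in the same component, and cross-time edges such as A'-A are chosen instead.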

17 Example of CCL-Forest Learning
Pairwise marginals and mutual information (primed variables A', B', C' are from the previous time point):
  A-B:  (0.56, 0.11, 0.02, 0.31), MI = 0.3126
  A-C:  (0.51, 0.17, 0.17, 0.15), MI = 0.0229
  B-C:  (0.44, 0.14, 0.23, 0.19), MI = 0.0230
  A'-A: (0.57, 0.11, 0.11, 0.21), MI = 0.1207
  A'-B: (0.51, 0.17, 0.07, 0.25), MI = 0.1253
  A'-C: (0.54, 0.14, 0.14, 0.18), MI = 0.0623
  B'-A: (0.52, 0.07, 0.16, 0.25), MI = 0.1392
  B'-B: (0.48, 0.10, 0.11, 0.31), MI = 0.1700
  B'-C: (0.47, 0.11, 0.21, 0.21), MI = 0.0559
  C'-A: (0.48, 0.20, 0.20, 0.12), MI = 0.0033
  C'-B: (0.41, 0.26, 0.17, 0.16), MI = 0.0030
  C'-C: (0.53, 0.14, 0.14, 0.19), MI = 0.0625
[figure: maximum spanning forest over {A', B', C', A, B, C}, with the previous-time nodes treated as already connected]

18 AR-HMM
[figure: autoregressive HMM in which each observation R_t depends on both the hidden state S_t and the previous observation R_{t-1}]

19 HMM-Conditional-Chow-Liu
[figure: an HMM whose emission for each hidden state is a state-specific conditional Chow-Liu forest, P(R_t | R_{t-1}, S_t = k), illustrated for three states S_t = 1, 2, 3 over components A, B, C, D at times t-1 and t]

20 Inference and Learning for HMM-CL and HMM-CCL
Inference (calculating P(S | R, θ)):
- Recursively calculate P(R_{1:t}, S_t | θ) and P(R_{t+1:T} | S_t, θ) (Forward-Backward)
Learning (Baum-Welch or EM):
- E-step: calculate P(S | R, θ)
  – Forward-Backward
  – Calculate P(S_t | R, θ) and P(S_t, S_{t+1} | R, θ)
- M-step: maximize E_{P(S | R, θ)}[log P(S, R | θ′)]
Similar to mixtures of Chow-Liu trees
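The Forward-Backward recursions of the E-step can be sketched as follows (a generic unscaled HMM sketch; the emission term P(R_t | S_t), which for HMM-CL/HMM-CCL would come from the per-state tree likelihoods, is abstracted here as a precomputed matrix `e`):

```python
import numpy as np

def forward_backward(pi, A, e):
    """pi: (K,) initial state distribution; A: (K,K) transition matrix,
    A[i, j] = P(S_{t+1}=j | S_t=i); e: (T,K) emission likelihoods
    e[t, k] = P(R_t | S_t=k). Returns posteriors P(S_t | R) as a (T,K) array."""
    T, K = e.shape
    alpha = np.zeros((T, K))   # alpha[t, k] = P(R_{1:t}, S_t=k)
    beta = np.zeros((T, K))    # beta[t, k]  = P(R_{t+1:T} | S_t=k)
    alpha[0] = pi * e[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * e[t]
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (beta[t + 1] * e[t + 1])
    gamma = alpha * beta
    return gamma / gamma.sum(axis=1, keepdims=True)
```

A practical implementation would rescale alpha and beta at each step (or work in log space) to avoid underflow on long sequences.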

21 Chain Chow-Liu Forest (CCLF)
[figure: a chain of conditional Chow-Liu forests linking each R_{t-1} directly to R_t, with no hidden state]

22 Complexity Analysis
Model     # params              Time (per iteration)   Space
HMM-CI    K^2 + MK(V-1)         O(NTK(K+M))            O(NTK(K+M))
HMM-CL    K^2 + K(M-1)(V^2-1)   O(NTK(K+M^2 V^2))      O(NTK(K+M) + KM^2 V^2)
HMM-CCL   K^2 + KM(V^2-1)       O(NTK(K+M^2 V^2))      O(NTK(K+M) + KM^2 V^2)
N – number of sequences; T – length of each sequence; K – number of hidden states; M – dimensionality of each vector; V – number of possible values for each vector component
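As a worked instance of the parameter-count formulas in this table, with illustrative values K = 3 hidden states, M = 8 stations, V = 2 values per station (numbers chosen for illustration, not from the talk):

```python
K, M, V = 3, 8, 2  # illustrative: 3 hidden states, 8 stations, binary rain/no-rain

hmm_ci  = K**2 + M * K * (V - 1)          # K^2 + MK(V-1)      -> 33
hmm_cl  = K**2 + K * (M - 1) * (V**2 - 1) # K^2 + K(M-1)(V^2-1) -> 72
hmm_ccl = K**2 + K * M * (V**2 - 1)       # K^2 + KM(V^2-1)    -> 81
```

Even at this small size, the tree-based emissions cost only a modest constant factor more parameters than the conditional-independence emissions.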

23 Experimental Setup
Data:
- Australia: 15 seasons, 184 days each, 30 stations
- Western U.S.: 39 seasons, 90 days each, 8 stations
Measuring predictive performance:
- Choose K (number of states)
- Leave-one-out cross-validation
- Log-likelihood
- Error for prediction of a single entry given the rest

24 Australia (log-likelihood)

25 Australia (predictive error)

26 Deeper Look at Weather States

27 Western U.S. (log-likelihood)

28 Western U.S. (predictive error)

29 Summary
- Efficient approximation for finite-valued conditional distributions: conditional Chow-Liu forests
- New models for spatio-temporal finite-valued data: HMM with Chow-Liu trees; HMM with conditional Chow-Liu forests; chain Chow-Liu forests
- Applied to precipitation modeling

30 Future Work
- Extension to real-valued data
- Priors on tree structure and parameters [Jaakkola and Meila 00]: locations of the stations
- Interannual variability: atmospheric variables as inputs to a non-homogeneous HMM [Robertson et al 04]
- Other approximations for finite-valued multivariate data: maximum entropy; multivariate probit models (binary)

31 Acknowledgements
- DOE (DE-FG02-02ER63413)
- NSF (SCI-0225642)
- Dr. Stephen Charles of CSIRO, Australia
- Datalab @ UCI (http://www.datalab.uci.edu)

32 A Bit of Notation (backup)
[figure: the notation diagram from slide 6]

33 Weather Generator (backup)
[figure: the weather-generator diagram from slide 7]

34 Illustration of CL-Tree Learning (backup)
[figure: the MI table and spanning tree from slide 12]

35 Mixture of Chow-Liu Trees
- Chow-Liu trees inside a mixture model [Meila and Jordan 00]
- Parameters and structures learned by Expectation-Maximization
[figure: one Chow-Liu tree T_k(R_t) per mixture component, illustrated for three components]

36 HMM-Chow-Liu

37 HMM-Conditional-Chow-Liu

38 Conditional Chow-Liu Forests
- Extension of Chow-Liu trees to conditional distributions
- Same complexity as Chow-Liu trees

39 Chain Chow-Liu Forest (CCLF)

40 Weather Generator
[figure: each vector R_t split into per-station components R_{t,1}, ..., R_{t,4}]

41 Complexity Analysis
# params: HMM-CI K^2+MK(V-1); HMM-CL K^2+K(M-1)(V^2-1); HMM-CCL K^2+KM(V^2-1); CCLF M(V^2-1)
Time: O(TK^2 M) for the HMM variants; O(TM) for CCLF

