
Conditional Chow-Liu Tree Structures for Modeling Discrete-Valued Vector Time Series




  1. Conditional Chow-Liu Tree Structures for Modeling Discrete-Valued Vector Time Series. Sergey Kirshner, UC Irvine; Padhraic Smyth, UC Irvine; Andrew Robertson, IRI. July 10, 2004

  2. Overview
  • Data and its modeling aspects
  • Model description
  • General approach: hidden Markov models
  • Capturing data properties: Chow-Liu trees, conditional Chow-Liu trees
  • Inference and learning
  • Experimental results
  • Summary and future extensions

  3. Snapshot of the Data. [Figure: a grid of binary observations, with stations 1, ..., N as rows and time steps 1, ..., T as columns]

  4. Data Aspects
  • Correlation: spatial dependence
  • Temporal structure: first-order dependence
  • Variability of individual series: interannual variability

  5. Modeling Precipitation Occurrence. [Maps: station locations for the western US, 1952-90, and southwestern Australia, 1978-92]

  6. A Bit of Notation
  • Vector time series R: R1:T = R1, ..., RT
  • Vector observation of R at time t: Rt = (At, Bt, ..., Zt)
  [Figure: each vector Rt groups the component nodes At, Bt, Ct, ..., Zt]

  7. Weather Generator. [Figure: each component series A, B, C, ..., Z of R is modeled as an independent first-order chain]
  • Does not take correlation into account

  8. Hidden Markov Model. [Figure: hidden state chain S1, S2, ..., ST, with each state St emitting the observation vector Rt]

  9. HMM-Conditional-Independence. [Figure: the emission P(Rt | St) factorizes so that the components At, Bt, Ct, ..., Zt are conditionally independent given St]

  10. HMM-CI: Is It Sufficient?
  • Simple yet effective
  • Requires a large number of values for St
  • Emissions can be made to capture more spatial dependencies

  11. Chow-Liu Trees • Approximation of a joint distribution with a tree-structured distribution [Chow and Liu 68]

  12. Illustration of CL-Tree Learning
  Pairwise marginals (the four joint configurations) and mutual information for each pair of variables A, B, C, D:

  Pair   Pairwise marginal             MI
  AB     (0.56, 0.11, 0.02, 0.31)      0.3126
  AC     (0.51, 0.17, 0.17, 0.15)      0.0229
  AD     (0.53, 0.15, 0.19, 0.13)      0.0172
  BC     (0.44, 0.14, 0.23, 0.19)      0.0230
  BD     (0.46, 0.12, 0.26, 0.16)      0.0183
  CD     (0.64, 0.04, 0.08, 0.24)      0.2603

  [Figure: the maximum spanning tree over A, B, C, D keeps the highest-MI edges AB, CD, and BC]

  13. Chow-Liu Trees
  • Approximation of a joint distribution with a tree-structured distribution [Chow and Liu 68]
  • Learning the structure and the probabilities
    • Compute individual and pairwise marginal distributions for all pairs of variables
    • Compute mutual information (MI) for each pair of variables
    • Build a maximum spanning tree for the complete graph with variables as nodes and MIs as edge weights
  • Properties
    • Efficient: O(#samples × (#variables)² × (#values per variable)²)
    • Optimal: maximizes the likelihood over all tree-structured distributions
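The three steps on this slide can be sketched in Python. This is an illustrative implementation under assumed data shapes (one row per sample, one column per discrete variable), not the authors' code; the Kruskal-style union-find is one of several ways to build the maximum spanning tree.

```python
import numpy as np
from itertools import combinations

def mutual_information(joint):
    """Mutual information (in nats) of a 2-D joint probability table."""
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0  # skip zero cells, which contribute 0 to the sum
    return float((joint[nz] * np.log(joint[nz] / (px @ py)[nz])).sum())

def chow_liu_edges(data, n_values=2):
    """data: (n_samples, n_vars) array of discrete values in {0..n_values-1}.
    Returns the edge list of the maximum spanning tree under pairwise MI."""
    n, d = data.shape
    # Step 1-2: pairwise empirical joints and their MI weights
    weights = {}
    for i, j in combinations(range(d), 2):
        joint = np.zeros((n_values, n_values))
        for a, b in zip(data[:, i], data[:, j]):
            joint[a, b] += 1.0 / n
        weights[(i, j)] = mutual_information(joint)
    # Step 3: Kruskal-style maximum spanning tree over the MI weights
    parent = list(range(d))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    edges = []
    for (i, j), w in sorted(weights.items(), key=lambda kv: -kv[1]):
        ri, rj = find(i), find(j)
        if ri != rj:  # adding the edge does not create a cycle
            parent[ri] = rj
            edges.append((i, j))
    return edges
```

Counting all pairwise joints dominates the cost, giving the O(#samples × (#variables)² × (#values per variable)²) complexity quoted above.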

  14. HMM-Chow-Liu. [Figure: an HMM in which each hidden state St = k has its own Chow-Liu tree Tk(Rt) over the components At, Bt, Ct, Dt as its emission distribution, shown for St = 1, 2, 3]

  15. Improving on Chow-Liu Trees
  • Tree edges with low MI add little to the approximation.
  • Observations from the previous time point can be more relevant than those from the current one.
  • Idea: build a Chow-Liu tree that is allowed to include variables from both the current and the previous time point.

  16. Conditional Chow-Liu Forests
  • Extension of Chow-Liu trees to conditional distributions
    • Approximation of a conditional multivariate distribution with a tree-structured distribution
  • Uses MI to build a maximum spanning tree (forest)
    • Variables of two consecutive time points as nodes
    • All nodes corresponding to the earlier time point are considered connected before tree construction
  • Same asymptotic complexity as Chow-Liu trees: O(#samples × (#variables)² × (#values per variable)²)
  • Optimal
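As a hedged sketch, the "pre-connected past" construction might look like the following, where the MI weights are supplied directly. The node numbering and the Kruskal-style union-find are illustrative assumptions; since all previous-time nodes start in one component, the algorithm only adds edges that help model the current variables, yielding a forest.

```python
def conditional_chow_liu_edges(mi_weights, n_past, n_curr):
    """mi_weights: dict mapping node pairs (u, v) to mutual information,
    where nodes 0..n_past-1 are previous-time variables and
    n_past..n_past+n_curr-1 are current-time variables.
    Returns the chosen edges: a forest over the current variables,
    possibly rooted in previous-time variables."""
    n = n_past + n_curr
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    # All previous-time nodes are considered connected before construction.
    for u in range(1, n_past):
        parent[find(u)] = find(0)
    edges = []
    for (u, v), w in sorted(mi_weights.items(), key=lambda kv: -kv[1]):
        ru, rv = find(u), find(v)
        if ru != rv:  # edge joins two components, so keep it
            parent[ru] = rv
            edges.append((u, v))
    return edges
```

With the MI values from the slide-17 example (A', B', C' as nodes 0-2 and A, B, C as nodes 3-5), this selects the edges AB, B'B, and C'C.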

  17. Example of CCL-Forest Learning
  Pairwise marginals and mutual information, where primed variables (A', B', C') are from the previous time point:

  Pair   Pairwise marginal             MI
  AB     (0.56, 0.11, 0.02, 0.31)      0.3126
  AC     (0.51, 0.17, 0.17, 0.15)      0.0229
  BC     (0.44, 0.14, 0.23, 0.19)      0.0230
  A'A    (0.57, 0.11, 0.11, 0.21)      0.1207
  A'B    (0.51, 0.17, 0.07, 0.25)      0.1253
  A'C    (0.54, 0.14, 0.14, 0.18)      0.0623
  B'A    (0.52, 0.07, 0.16, 0.25)      0.1392
  B'B    (0.48, 0.10, 0.11, 0.31)      0.1700
  B'C    (0.47, 0.11, 0.21, 0.21)      0.0559
  C'A    (0.48, 0.20, 0.20, 0.12)      0.0033
  C'B    (0.41, 0.26, 0.17, 0.16)      0.0030
  C'C    (0.53, 0.14, 0.14, 0.19)      0.0625

  [Figure: with A', B', C' pre-connected, the maximum spanning forest adds the edges AB, B'B, and C'C]

  18. AR-HMM. [Figure: autoregressive HMM in which each observation Rt depends on both the hidden state St and the previous observation Rt-1]

  19. HMM-Conditional-Chow-Liu. [Figure: an HMM in which each hidden state St = k has its own conditional Chow-Liu forest over the current components At, Bt, Ct, Dt given the previous components At-1, Bt-1, Ct-1, Dt-1, shown for St = 1, 2, 3]

  20. Inference and Learning for HMM-CL and HMM-CCL
  • Inference (calculating P(S|R,Q))
    • Recursively calculate P(R1:t, St|Q) and P(Rt+1:T|St,Q) (Forward-Backward)
  • Learning (Baum-Welch or EM)
    • E-step: calculate P(S|R,Q)
      • Forward-Backward
      • Calculate P(St|R,Q) and P(St,St+1|R,Q)
    • M-step: maximize E_P(S|R,Q)[log P(S,R|Q')]
      • Similar to mixtures of Chow-Liu trees
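The E-step recursions can be sketched generically. Here `emission[t, k]` is a stand-in for P(Rt | St = k), which in HMM-CL/HMM-CCL would be computed from the per-state (conditional) tree distributions; that interface, and the scaled recursion below, are assumptions of this sketch rather than the authors' implementation.

```python
import numpy as np

def forward_backward(init, trans, emission):
    """init: (K,) initial state probabilities; trans: (K, K) row-stochastic
    transition matrix; emission: (T, K) with emission[t, k] = P(R_t | S_t=k).
    Returns gamma[t, k] = P(S_t = k | R_1:T) and the data log-likelihood."""
    T, K = emission.shape
    alpha = np.zeros((T, K))   # scaled forward messages P(R_1:t, S_t)
    beta = np.ones((T, K))     # scaled backward messages P(R_t+1:T | S_t)
    scale = np.zeros(T)        # scale[t] = P(R_t | R_1:t-1)
    alpha[0] = init * emission[0]
    scale[0] = alpha[0].sum()
    alpha[0] /= scale[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ trans) * emission[t]
        scale[t] = alpha[t].sum()
        alpha[t] /= scale[t]
    for t in range(T - 2, -1, -1):
        beta[t] = (trans @ (emission[t + 1] * beta[t + 1])) / scale[t + 1]
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)
    return gamma, float(np.log(scale).sum())
```

The per-step scaling keeps the messages numerically stable for long sequences, and the log-likelihood falls out as the sum of log scale factors.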

  21. Chain Chow-Liu Forest (CCLF). [Figure: no hidden states; each Rt is modeled directly by a conditional Chow-Liu forest given Rt-1]

  22. Complexity Analysis
  • N – number of sequences
  • T – length of each sequence
  • K – number of hidden states
  • M – dimensionality of each vector
  • V – number of possible values for each vector component

  23. Experimental Setup
  • Data
    • Australia: 15 seasons, 184 days each, 30 stations
    • Western U.S.: 39 seasons, 90 days each, 8 stations
  • Measuring predictive performance
    • Choose K (number of states)
    • Leave-one-out cross-validation
    • Log-likelihood
    • Error for prediction of a single entry given the rest
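The leave-one-out protocol above can be sketched as follows; `fit` and `log_likelihood` are hypothetical stand-ins for training a model on the retained seasons and scoring the held-out season, not functions from the paper.

```python
def leave_one_out_score(seasons, fit, log_likelihood):
    """seasons: list of held-out units (e.g. one sequence per season).
    Each round trains on all seasons but one and scores the one left out;
    returns the average held-out score."""
    scores = []
    for i in range(len(seasons)):
        train = seasons[:i] + seasons[i + 1:]  # all seasons except season i
        model = fit(train)
        scores.append(log_likelihood(model, seasons[i]))
    return sum(scores) / len(seasons)
```

Holding out whole seasons (rather than individual days) respects the temporal dependence within each season.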

  24. Australia (log-likelihood)

  25. Australia (predictive error)

  26. Deeper Look at Weather States

  27. Western U.S. (log-likelihood)

  28. Western U.S. (predictive error)

  29. Summary
  • Efficient approximation for finite-valued conditional distributions: conditional Chow-Liu forests
  • New models for spatio-temporal finite-valued data
    • HMM with Chow-Liu trees
    • HMM with conditional Chow-Liu forests
    • Chain Chow-Liu forests
  • Applied to precipitation modeling

  30. Future Work
  • Extension to real-valued data
  • Priors on tree structure and parameters [Jaakkola and Meila 00]
    • Locations of the stations
    • Interannual variability
  • Atmospheric variables as inputs to a non-homogeneous HMM [Robertson et al 04]
  • Other approximations for finite-valued multivariate data
    • Maximum entropy
    • Multivariate probit models (binary)

  31. Acknowledgements
  • DOE (DE-FG02-02ER63413)
  • NSF (SCI-0225642)
  • Dr. Stephen Charles of CSIRO, Australia
  • Datalab @ UCI (http://www.datalab.uci.edu)
