Latent tree models
1 / 27

Latent Tree Models - PowerPoint PPT Presentation

  • Uploaded on
  • Presentation posted in: General

AAAI 2014 Tutorial. Latent Tree Models. Nevin L. Zhang Dept. of Computer Science & Engineering The Hong Kong Univ. of Sci. & Tech. HKUST 2014. HKUST 1988. Latent Tree Models. Part I: Non-Technical Overview (25 minutes)

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.

Download Presentation

Latent Tree Models

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript

AAAI 2014 Tutorial

Latent Tree Models

Nevin L. Zhang

Dept. of Computer Science & Engineering

The Hong Kong Univ. of Sci. & Tech.





Latent Tree Models

  • Part I: Non-Technical Overview (25 minutes)

  • Part II: Definition and Properties (25 minutes)

  • Part III: Learning Algorithms

    (110 minutes, 30 minutes break half way)

  • Part IV: Applications (50 minutes)

Part I: Non-Technical Overview

  • Latent tree models

  • What can LTMs be used for:

    • Discovery of co-occurrence/correlation patterns

    • Discovery of latent variable/structures

    • Multidimensional clustering

  • Examples

    • Danish beer survey data

    • Text data

Latent Tree Models (LTMs)

  • Tree-structured probabilistic graphical models

    • Leaves observed (manifest variables)

      • Discrete or continuous

    • Internal nodes latent (latent variables)

      • Discrete

    • Each edge is associated with a conditional distribution

    • One node with marginal distribution

    • Defines a joint distributions over all the variables

      (Zhang, JMLR 2004)

Latent Tree Analysis (LTA)

From data on observed variables, obtain latent tree model

Learning latent tree models: Determine

  • Number of latent variables

  • Numbers of possible states for latent variables

  • Connections among nodes

  • Probability distributions

LTA on Danish Beer Market Survey Data

  • 463 consumers, 11 beer brands

  • Questionnaire: For each brand:

    • Never seen the brand before (s0);

    • Seen before, but never tasted (s1);

    • Tasted, but do not drink regularly (s2)

    • Drink regularly (s3).

(Mourad et al. JAIR 2013)

Why variables grouped as such?

  • Responses on brands in each group strongly correlated.

    • GronTuborg and Carlsberg: Main mass-market beers

    • TuborgClas and CarlSpec: Frequent beers, bit darker than the above

    • CeresTop, CeresRoyal, Pokal, …: minor local beers

  • In general, LTA partitions observed variables into groups such that

    • Variables in each group are strongly correlated, and

    • The correlations among each group can be properly be modeled using one single latent variable

Multidmensional Clustering

  • Each Latent variable gives a partition of consumers.

    • H1:

      • Class 1: Likely to have tasted TuborgClas, Carlspecand Heineken , but do not drink regularly

      • Class 2: Likely to have seen or tasted the beers, but did not drink regularly

      • Class 3: Likely to drink TuborgClas and Carlspec regularly

  • H0 and H2 give two other partitions.

  • In general, LTA is a technique for multiple clustering.

    • In contrast, K-Means, mixture models give only one partition.

Unidimensional vs Multidimensional Clustering

  • Grouping of objects intoclusters such that objects in the same cluster are similar while objects from different clusters are dissimilar.

  • Result of clustering is often a partition of all the objects.

How to Cluster Those?

How to Cluster Those?

Style of picture

How to Cluster Those?

Type of object in picture

Multidimensional Clustering

  • Complex data usually have multiple facets and can be meaningfully partitioned in multiple ways. Multidimensional clustering / Multi-Clustering

  • LTA is a model-based method for multidimensional clustering.

  • Other methods:

Clustering of Variables and Objects

  • LTA produces a partition of observed variables.

  • For each cluster of variables, it produces a partition of objects.

Binary Text Data: WebKB

  • 1041web pages collected from 4 CS departments in 1997

  • 336 words

Latent Tree Model for WebKB Data

89 latent variables

(Liu et al. MLJ 2013)

Latent Tree Modes for WebKB Data

Why variables grouped as such?

  • Words in each group tend to co-occur.

  • On binary text data, LTA partitions word variables into groups such that

    • Words in each group tend to co-occur and

    • The correlations can be properly be explained using one single latent variable

LTA is a method for identifying co-occurrence relationships.

Multidimensional Clustering

LTA is an alternative approach to topic detection

  • Y66=4: Object Oriented Programming (oop)

  • Y66=2: Non-oop programming

  • Y66=1: programming language

  • Y66=3: Not on programming

More on this in Part IV


  • Latent tree models:

    • Tree-structured probabilistic graphical models

    • Leaf nodes: observed variables

    • Internal nodes: latent variable

  • What can LTA be used for:

    • Discovery of co-occurrence patterns in binary data

    • Discovery of correlation patterns in general discrete data

    • Discovery of latent variable/structures

    • Multidimensional clustering

    • Topic detection in text data

    • Probabilistic modelling

Key References:

  • Anandkumar, A., Chaudhuri, K., Hsu, D., Kakade, S. M., Song, L., & Zhang, T. (2011). Spectral methods for learning multivariate latent tree structure. In Twenty-Fifth Conference in Neural Information Processing Systems (NIPS-11).

  • Anandkumar, A., Ge, R., Hsu, D., Kakade, S.M., and Telgarsky, M. Tensor decompositions for learning latent variable models. In Preprint, 2012a.

  • Anandkumar, A., Hsu, D., and Kakade, S. M. A method of moments for mixture models and hidden Markov models. In An abridged version appears in the Proc. Of COLT, 2012b.

  • Choi, M. J., Tan, V. Y., Anandkumar, A., & Willsky, A. S. (2011). Learning latent tree graphical models. Journal of Machine Learning Research, 12, 1771–1812.

  • Friedman, N., Ninio, M., Pe’er, I., & Pupko, T. (2002). A structural EM algorithm for phylogenetic inference.. Journal of Computational Biology, 9(2), 331–353.

  • Harmeling, S., & Williams, C. K. I. (2011). Greedy learning of binary latent trees. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(6), 1087–1097.

  • Hsu, D., Kakade, S., & Zhang, T. (2009). A spectral algorithm for learning hidden Markov models. In The 22nd Annual Conference on Learning Theory (COLT 2009).

Key References:

  • E. Mossel, S. Roch, and A. Sly. Robust estimation of latent tree graphical models: Inferring hidden states with inexact parameters. Submitted., 2011.

  • Mourad, R., Sinoquet, C., & Leray, P. (2011). A hierarchical Bayesian network approach for linkage disequilibrium modeling and data-dimensionality reduction prior to genomewideassociation studies. BMC Bioinformatics, 12, 16.

  • Mourad R., Sinoquet C., Zhang N. L., Liu T. F. and Leray P. (2013). A survey on latent tree models and applications. Journal of Artificial Intelligence Research, 47, 157-203 , 13 May 2013. doi:10.1613/jair.3879.

  • Parikh, A. P., Song, L., & Xing, E. P. (2011). A spectral algorithm for latent tree graphical models. In Proceedings of the 28th International Conference on Machine Learning (ICML-2011).

  • Saitou, N., & Nei, M. (1987). The neighbor-joining method: A new method for reconstructing phylogenetic trees.. Molecular Biology and Evolution, 4(4), 406–425.

  • Song, L., Parikh, A., & Xing, E. (2011). Kernel embeddings of latent tree graphical models. In Twenty-Fifth Conference in Neural Information Processing Systems (NIPS-11).

  • Tan, V. Y. F., Anandkumar, A., & Willsky, A. (2011). Learning high-dimensional Markov forest distributions: Analysis of error rates. Journal of Machine Learning Research,12, 1617–1653.

Key References:

  • T. Chen and N. L. Zhang (2006). Quartet-based learning of shallow latent variables. In Proceedings of the Third European Workshop on Probabilistic Graphical Model,59-66 , September 12-15, 2006.

  • Chen, T., Zhang, N. L., Liu, T., Poon, K. M., & Wang, Y. (2012). Model-based multidimensional clustering of categorical data. Artificial Intelligence, 176(1), 2246–2269.

  • Liu, T. F., Zhang, N. L., Liu, A. H., & Poon, L. K. M. (2013). Greedy learning of latent tree models for multidimensional clustering. Machine Learning, doi:10.1007/s10994-013-5393-0.

  • Liu, T. F., Zhang, N. L., and Chen, P. X. (2014). Hierarchical latent tree analysis for topic detection. ECML, 2014

  • Poon, L. K. M., Zhang, N. L., Chen, T., & Wang, Y. (2010). Variable selection in modelbased clustering: To do or to facilitate. In Proceedings of the 27th International Con-ference on Machine Learning (ICML-2010).

  • Wang, Y., Zhang, N. L., & Chen, T. (2008). Latent tree models and approximate inference in Bayesian networks. Journal of Articial Intelligence Research, 32, 879–900.

  • Wang, X. F., Guo, J. H., Hao, L. Z., Zhang, N.L., & P. X. Chen (2013). Recovering discrete latent tree models by spectral methods.

  • Wang, X. F., Zhang, N. L. (2014). A Study of Recently Discovered Equalities about Latent Tree Models using Inverse Edges. PGM 2014.

  • Zhang, N. L. (2004). Hierarchical latent class models for cluster analysis. The Journal of Machine Learning Research, 5, 697–723.

  • Zhang, N. L., & Kocka, T. (2004a). Effective dimensions of hierarchical latent class models. Journal of Articial Intelligence Research, 21, 1–17.

Key References:

  • Zhang, N. L., & Kocka, T. (2004b). Efficient learning of hierarchical latent class models. In Proceedings of the 16th IEEE International Conference on Tools with Artificial Intelligence (ICTAI), pp. 585–593.

  • Zhang, N. L., Nielsen, T. D., & Jensen, F. V. (2004). Latent variable discovery in classification models. Artificial Intelligence in Medicine, 30(3), 283–299.

  • Zhang, N. L., Wang, Y., & Chen, T. (2008). Discovery of latent structures: Experience with the CoIL Challenge 2000 data set*. Journal of Systems Science and Complexity, 21(2), 172–183.

  • Zhang, N. L., Yuan, S., Chen, T., & Wang, Y. (2008). Latent tree models and diagnosis in traditional Chinese medicine. Artificial Intelligence in Medicine, 42(3), 229–245.

  • Zhang, N. L., Yuan, S., Chen, T., & Wang, Y. (2008). Statistical Validation of TCM Theories. Journal of Alternative and Complementary Medicine, 14(5):583-7. 

  • Zhang, N. L., Fu, C., Liu, T. F., Poon, K. M., Chen, P. X., Chen, B. X., Zhang, Y. L. (2014). The Latent Tree Analysis Approach to Patient Subclassificationin Traditional Chinese Medicine. Evidence-Based Complementary and Alternative Medicine.

  • Xu, Z. X., Zhang, N. L., Wang, Y. Q., Liu, G. P., Xu, J., Liu, T. F., and Liu A. H. (2013). Statistical Validation of Traditional Chinese Medicine Syndrome Postulates in the Context of Patients with Cardiovascular Disease. The Journal of Alternative and Complementary Medicine. 18, 1-6.

  • Zhao, Y. Zhang , N. L., Wang, T. F., Wang, Q. G. (2014). Discovering Symptom Co-Occurrence Patterns from 604 Cases of Depressive Patient Data using Latent Tree Models. The Journal of Alternative and Complementary Medicine. 20(4):265-71.

  • Login