- 147 Views
- Uploaded on
- Presentation posted in: General

Latent Tree Models

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

AAAI 2014 Tutorial

Latent Tree Models

Nevin L. Zhang

Dept. of Computer Science & Engineering

The Hong Kong Univ. of Sci. & Tech.

http://www.cse.ust.hk/~lzhang

HKUST

2014

HKUST

1988

- Part I: Non-Technical Overview (25 minutes)
- Part II: Definition and Properties (25 minutes)
- Part III: Learning Algorithms
(110 minutes, 30 minutes break half way)

- Part IV: Applications (50 minutes)

- Latent tree models
- What can LTMs be used for:
- Discovery of co-occurrence/correlation patterns
- Discovery of latent variable/structures
- Multidimensional clustering

- Examples
- Danish beer survey data
- Text data

- Tree-structured probabilistic graphical models
- Leaves observed (manifest variables)
- Discrete or continuous

- Internal nodes latent (latent variables)
- Discrete

- Each edge is associated with a conditional distribution
- One node with marginal distribution
- Defines a joint distributions over all the variables
(Zhang, JMLR 2004)

- Leaves observed (manifest variables)

From data on observed variables, obtain latent tree model

Learning latent tree models: Determine

- Number of latent variables
- Numbers of possible states for latent variables
- Connections among nodes
- Probability distributions

- 463 consumers, 11 beer brands
- Questionnaire: For each brand:
- Never seen the brand before (s0);
- Seen before, but never tasted (s1);
- Tasted, but do not drink regularly (s2)
- Drink regularly (s3).

(Mourad et al. JAIR 2013)

- Responses on brands in each group strongly correlated.
- GronTuborg and Carlsberg: Main mass-market beers
- TuborgClas and CarlSpec: Frequent beers, bit darker than the above
- CeresTop, CeresRoyal, Pokal, …: minor local beers

- In general, LTA partitions observed variables into groups such that
- Variables in each group are strongly correlated, and
- The correlations among each group can be properly be modeled using one single latent variable

- Each Latent variable gives a partition of consumers.
- H1:
- Class 1: Likely to have tasted TuborgClas, Carlspecand Heineken , but do not drink regularly
- Class 2: Likely to have seen or tasted the beers, but did not drink regularly
- Class 3: Likely to drink TuborgClas and Carlspec regularly

- H1:
- H0 and H2 give two other partitions.
- In general, LTA is a technique for multiple clustering.
- In contrast, K-Means, mixture models give only one partition.

- Grouping of objects intoclusters such that objects in the same cluster are similar while objects from different clusters are dissimilar.

- Result of clustering is often a partition of all the objects.

Style of picture

Type of object in picture

- Complex data usually have multiple facets and can be meaningfully partitioned in multiple ways. Multidimensional clustering / Multi-Clustering
- LTA is a model-based method for multidimensional clustering.
- Other methods: http://www.siam.org/meetings/sdm11/clustering.pdf

- LTA produces a partition of observed variables.
- For each cluster of variables, it produces a partition of objects.

- 1041web pages collected from 4 CS departments in 1997
- 336 words

89 latent variables

(Liu et al. MLJ 2013)

Latent Tree Modes for WebKB Data

- Words in each group tend to co-occur.
- On binary text data, LTA partitions word variables into groups such that
- Words in each group tend to co-occur and
- The correlations can be properly be explained using one single latent variable

LTA is a method for identifying co-occurrence relationships.

LTA is an alternative approach to topic detection

- Y66=4: Object Oriented Programming (oop)
- Y66=2: Non-oop programming
- Y66=1: programming language
- Y66=3: Not on programming

More on this in Part IV

- Latent tree models:
- Tree-structured probabilistic graphical models
- Leaf nodes: observed variables
- Internal nodes: latent variable

- What can LTA be used for:
- Discovery of co-occurrence patterns in binary data
- Discovery of correlation patterns in general discrete data
- Discovery of latent variable/structures
- Multidimensional clustering
- Topic detection in text data
- Probabilistic modelling

Key References:

- Anandkumar, A., Chaudhuri, K., Hsu, D., Kakade, S. M., Song, L., & Zhang, T. (2011). Spectral methods for learning multivariate latent tree structure. In Twenty-Fifth Conference in Neural Information Processing Systems (NIPS-11).
- Anandkumar, A., Ge, R., Hsu, D., Kakade, S.M., and Telgarsky, M. Tensor decompositions for learning latent variable models. In Preprint, 2012a.
- Anandkumar, A., Hsu, D., and Kakade, S. M. A method of moments for mixture models and hidden Markov models. In An abridged version appears in the Proc. Of COLT, 2012b.
- Choi, M. J., Tan, V. Y., Anandkumar, A., & Willsky, A. S. (2011). Learning latent tree graphical models. Journal of Machine Learning Research, 12, 1771–1812.
- Friedman, N., Ninio, M., Pe’er, I., & Pupko, T. (2002). A structural EM algorithm for phylogenetic inference.. Journal of Computational Biology, 9(2), 331–353.
- Harmeling, S., & Williams, C. K. I. (2011). Greedy learning of binary latent trees. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(6), 1087–1097.
- Hsu, D., Kakade, S., & Zhang, T. (2009). A spectral algorithm for learning hidden Markov models. In The 22nd Annual Conference on Learning Theory (COLT 2009).

Key References:

- E. Mossel, S. Roch, and A. Sly. Robust estimation of latent tree graphical models: Inferring hidden states with inexact parameters. Submitted. http://arxiv.org/abs/1109.4668, 2011.
- Mourad, R., Sinoquet, C., & Leray, P. (2011). A hierarchical Bayesian network approach for linkage disequilibrium modeling and data-dimensionality reduction prior to genomewideassociation studies. BMC Bioinformatics, 12, 16.
- Mourad R., Sinoquet C., Zhang N. L., Liu T. F. and Leray P. (2013). A survey on latent tree models and applications. Journal of Artificial Intelligence Research, 47, 157-203 , 13 May 2013. doi:10.1613/jair.3879.
- Parikh, A. P., Song, L., & Xing, E. P. (2011). A spectral algorithm for latent tree graphical models. In Proceedings of the 28th International Conference on Machine Learning (ICML-2011).
- Saitou, N., & Nei, M. (1987). The neighbor-joining method: A new method for reconstructing phylogenetic trees.. Molecular Biology and Evolution, 4(4), 406–425.
- Song, L., Parikh, A., & Xing, E. (2011). Kernel embeddings of latent tree graphical models. In Twenty-Fifth Conference in Neural Information Processing Systems (NIPS-11).
- Tan, V. Y. F., Anandkumar, A., & Willsky, A. (2011). Learning high-dimensional Markov forest distributions: Analysis of error rates. Journal of Machine Learning Research,12, 1617–1653.

Key References:

- T. Chen and N. L. Zhang (2006). Quartet-based learning of shallow latent variables. In Proceedings of the Third European Workshop on Probabilistic Graphical Model,59-66 , September 12-15, 2006.
- Chen, T., Zhang, N. L., Liu, T., Poon, K. M., & Wang, Y. (2012). Model-based multidimensional clustering of categorical data. Artificial Intelligence, 176(1), 2246–2269.
- Liu, T. F., Zhang, N. L., Liu, A. H., & Poon, L. K. M. (2013). Greedy learning of latent tree models for multidimensional clustering. Machine Learning, doi:10.1007/s10994-013-5393-0.
- Liu, T. F., Zhang, N. L., and Chen, P. X. (2014). Hierarchical latent tree analysis for topic detection. ECML, 2014
- Poon, L. K. M., Zhang, N. L., Chen, T., & Wang, Y. (2010). Variable selection in modelbased clustering: To do or to facilitate. In Proceedings of the 27th International Con-ference on Machine Learning (ICML-2010).
- Wang, Y., Zhang, N. L., & Chen, T. (2008). Latent tree models and approximate inference in Bayesian networks. Journal of Articial Intelligence Research, 32, 879–900.
- Wang, X. F., Guo, J. H., Hao, L. Z., Zhang, N.L., & P. X. Chen (2013). Recovering discrete latent tree models by spectral methods.
- Wang, X. F., Zhang, N. L. (2014). A Study of Recently Discovered Equalities about Latent Tree Models using Inverse Edges. PGM 2014.
- Zhang, N. L. (2004). Hierarchical latent class models for cluster analysis. The Journal of Machine Learning Research, 5, 697–723.
- Zhang, N. L., & Kocka, T. (2004a). Effective dimensions of hierarchical latent class models. Journal of Articial Intelligence Research, 21, 1–17.

Key References:

- Zhang, N. L., & Kocka, T. (2004b). Efficient learning of hierarchical latent class models. In Proceedings of the 16th IEEE International Conference on Tools with Artificial Intelligence (ICTAI), pp. 585–593.
- Zhang, N. L., Nielsen, T. D., & Jensen, F. V. (2004). Latent variable discovery in classification models. Artificial Intelligence in Medicine, 30(3), 283–299.
- Zhang, N. L., Wang, Y., & Chen, T. (2008). Discovery of latent structures: Experience with the CoIL Challenge 2000 data set*. Journal of Systems Science and Complexity, 21(2), 172–183.
- Zhang, N. L., Yuan, S., Chen, T., & Wang, Y. (2008). Latent tree models and diagnosis in traditional Chinese medicine. Artificial Intelligence in Medicine, 42(3), 229–245.
- Zhang, N. L., Yuan, S., Chen, T., & Wang, Y. (2008). Statistical Validation of TCM Theories. Journal of Alternative and Complementary Medicine, 14(5):583-7.
- Zhang, N. L., Fu, C., Liu, T. F., Poon, K. M., Chen, P. X., Chen, B. X., Zhang, Y. L. (2014). The Latent Tree Analysis Approach to Patient Subclassificationin Traditional Chinese Medicine. Evidence-Based Complementary and Alternative Medicine.
- Xu, Z. X., Zhang, N. L., Wang, Y. Q., Liu, G. P., Xu, J., Liu, T. F., and Liu A. H. (2013). Statistical Validation of Traditional Chinese Medicine Syndrome Postulates in the Context of Patients with Cardiovascular Disease. The Journal of Alternative and Complementary Medicine. 18, 1-6.
- Zhao, Y. Zhang , N. L., Wang, T. F., Wang, Q. G. (2014). Discovering Symptom Co-Occurrence Patterns from 604 Cases of Depressive Patient Data using Latent Tree Models. The Journal of Alternative and Complementary Medicine. 20(4):265-71.