This paper presents proximal methods for sparse hierarchical dictionary learning, combining structured sparsity with dictionary learning. It reviews sparsity-inducing norms including the Lasso, the Group Lasso, and the Tree-Guided Group Lasso. Embedding a hierarchical structure in the regularizer improves dictionary learning by making the learned atoms more interpretable. Experimental results demonstrate the effectiveness of the learned dictionaries across applications including image patch imputation and text document classification. This work aims to bridge the gap between structured sparsity and dictionary learning.
Proximal Methods for Sparse Hierarchical Dictionary Learning
Rodolphe Jenatton, Julien Mairal, Guillaume Obozinski, Francis Bach
Presented by Bo Chen, June 11, 2010
Outline • 1. Structured Sparsity • 2. Dictionary Learning • 3. Sparse Hierarchical Dictionary Learning • 4. Experimental Results
Structured Sparsity • Lasso (R. Tibshirani, 1996) • Group Lasso (M. Yuan & Y. Lin, 2006) • Tree-Guided Group Lasso (Kim & Xing, 2009); the three penalties are written out below.
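For reference, the three penalties can be written as follows (notation assumed here: a coefficient vector β ∈ ℝᵖ, groups g ⊆ {1, …, p}, weights w_g > 0):

$$\Omega_{\text{Lasso}}(\beta) = \|\beta\|_1 = \sum_{j=1}^{p} |\beta_j|$$

$$\Omega_{\text{Group}}(\beta) = \sum_{g \in \mathcal{G}} w_g \,\|\beta_g\|_2 \quad \text{(groups disjoint)}$$

$$\Omega_{\text{Tree}}(\beta) = \sum_{g \in \mathcal{G}} w_g \,\|\beta_g\|_2 \quad \text{(any two groups either disjoint or nested)}$$

The tree-guided norm is thus the group lasso norm applied to a hierarchy of possibly overlapping groups.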
Tree-Guided Structure Example
[Figure: a tree over the tasks in a multi-task regression, illustrating the tree regularization definition of Kim & Xing, 2009]
Tree-Guided Structure Penalty
Introduce two parameters per internal node of the tree, trading off whether the tasks under that node are selected jointly or separately. The penalty term is first rewritten for the case of two tasks (K = 2), and then in general (Kim & Xing, 2009); see the reconstruction below.
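The penalty formulas on this slide were lost in extraction; the following reconstruction follows the definition in Kim & Xing (2009), so the exact notation is assumed. Each internal node v of the tree over the K tasks carries two weights s_v and g_v with s_v + g_v = 1, trading off separate (s_v) against joint (g_v) selection of the tasks below v. For K = 2 with a single internal node:

$$\Omega(\beta) = \lambda \sum_{j} \Big( s\,\big(|\beta_j^1| + |\beta_j^2|\big) + g\,\big\|(\beta_j^1, \beta_j^2)\big\|_2 \Big)$$

In general, the penalty is built recursively from the leaves up:

$$W_v(\beta^j) = \begin{cases} |\beta^j_v|, & v \text{ a leaf},\\ s_v \displaystyle\sum_{c \,\in\, \mathrm{children}(v)} W_c(\beta^j) \;+\; g_v \,\big\|\beta^j_{G_v}\big\|_2, & v \text{ internal}, \end{cases} \qquad \Omega(\beta) = \lambda \sum_j W_{\mathrm{root}}(\beta^j),$$

where β^j collects the coefficients of covariate j across tasks and G_v is the set of tasks (leaves) under node v.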
In Detail
[Figure: worked example of the tree-guided penalty weights; Kim & Xing, 2009]
Dictionary Learning • When structure information is introduced, dictionary learning and the group lasso differ as follows: • The group lasso is a regression problem in which each feature has its own physical meaning, so the imposed structure must itself be meaningful and correct; otherwise the 'structure' hurts the method. • In dictionary learning, the dictionary is unknown, so the structure information instead acts as a guide for learning a structured dictionary.
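Concretely, with data X = [x₁, …, xₙ] and D constrained to the set 𝒞 of dictionaries with unit-norm columns, the structured dictionary learning problem takes the form

$$\min_{D \in \mathcal{C},\; A} \;\sum_{i=1}^{n} \frac{1}{2}\,\|x_i - D\alpha_i\|_2^2 + \lambda\,\Omega(\alpha_i),$$

where Ω is the tree-structured norm above and αᵢ (a column of A) is the sparse code of sample xᵢ, with the tree defined over the dictionary atoms rather than over given features.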
Optimization • Proximal Operator for the Structured Norm
With the dictionary D fixed, the objective in each code reduces to

$$\min_{\alpha} \;\frac{1}{2}\|x - D\alpha\|_2^2 + \lambda\,\Omega(\alpha),$$

which a proximal-gradient method solves through the proximal problem at each step:

$$\mathrm{prox}_{\lambda\Omega}(z) = \operatorname*{argmin}_{\alpha} \;\frac{1}{2}\|\alpha - z\|_2^2 + \lambda\,\Omega(\alpha).$$

For the tree-structured penalty, this proximal operator can be computed exactly by one pass of group-wise soft-thresholding, visiting the groups from the leaves of the tree to the root.
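A minimal Python sketch of this composed proximal operator; the function names and group encoding here are my own assumptions, but the leaves-to-root composition is the paper's exact-prox result for hierarchical norms:

```python
import numpy as np

def group_shrink(alpha, idx, thresh):
    """l2 soft-thresholding of the sub-vector alpha[idx]: the proximal
    operator of thresh * ||.||_2 restricted to those coordinates."""
    norm = np.linalg.norm(alpha[idx])
    if norm > 0:
        alpha[idx] *= max(0.0, 1.0 - thresh / norm)
    return alpha

def tree_prox(z, groups, lam):
    """prox_{lam * Omega}(z) for Omega(a) = sum_g w_g ||a_g||_2 with
    tree-nested groups. `groups` is a list of (index_array, weight)
    pairs ordered leaves-to-root (every group before its ancestors);
    composing the group-wise shrinkages in that order computes the
    proximal operator exactly (Jenatton et al., 2010)."""
    alpha = z.copy()
    for idx, w in groups:
        alpha = group_shrink(alpha, idx, lam * w)
    return alpha

# Tiny example: 4 variables, leaf groups {0}, {1}, {2,3}, root {0,1,2,3}.
groups = [(np.array([0]), 1.0), (np.array([1]), 1.0),
          (np.array([2, 3]), 1.0), (np.array([0, 1, 2, 3]), 1.0)]
alpha = tree_prox(np.array([0.3, -2.0, 1.0, 1.5]), groups, lam=0.5)
```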
Learning the Dictionary
The two blocks of variables are optimized alternately: with the codes A fixed, D is updated (5 passes in each outer iteration); with D fixed, A is updated by the proximal-gradient step above.
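A simplified sketch of this alternating scheme, reusing `tree_prox` from above. The ISTA step size and the block-coordinate dictionary update (in the style of Mairal et al.'s dictionary learning) are assumptions; only the "5 dictionary passes per iteration" comes from the slides:

```python
def learn_dictionary(X, D, groups, lam, step, n_iter=100):
    """Alternate sparse coding and dictionary updates.
    X: (dim, n_samples) data; D: (dim, n_atoms) initial dictionary;
    step: ISTA step size, assumed <= 1 / ||D||_2^2."""
    A = np.zeros((D.shape[1], X.shape[1]))
    for _ in range(n_iter):
        # Sparse coding: one proximal-gradient (ISTA) pass on the codes.
        G = D.T @ (D @ A - X)            # gradient of 0.5 * ||X - DA||_F^2
        for j in range(A.shape[1]):
            A[:, j] = tree_prox(A[:, j] - step * G[:, j], groups, step * lam)
        # Dictionary update: 5 block-coordinate passes over the atoms,
        # each atom projected back onto the unit l2 ball.
        B, C = X @ A.T, A @ A.T
        for _ in range(5):
            for k in range(D.shape[1]):
                if C[k, k] > 0:
                    D[:, k] += (B[:, k] - D @ C[:, k]) / C[k, k]
                    D[:, k] /= max(1.0, np.linalg.norm(D[:, k]))
    return D, A
```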
Experiments: Natural Image Patches • The dictionary learned on the training set is used to impute the missing values in the test samples; each sample is an 8×8 patch. • Training set: 50,000 patches; test set: 25,000 patches. • 21 balanced tree structures of depth 3 and 4 are tested, varying the number of nodes in each layer.
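One standard way to run such an imputation, sketched here under my own assumptions about the protocol (the paper may differ in details): fit the sparse code on the observed pixels only, then fill in the missing pixels from the reconstruction.

```python
def impute_patch(x, observed, D, groups, lam, step, n_steps=200):
    """Fill in the missing entries of patch x. `observed` is a boolean
    mask over pixels; the code alpha is fitted by ISTA on the observed
    pixels only, and D @ alpha supplies the missing ones."""
    Do = D[observed]
    alpha = np.zeros(D.shape[1])
    for _ in range(n_steps):
        grad = Do.T @ (Do @ alpha - x[observed])
        alpha = tree_prox(alpha - step * grad, groups, step * lam)
    x_hat = x.copy()
    x_hat[~observed] = (D @ alpha)[~observed]
    return x_hat
```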
Experiments: Text Documents
Key points:
Visualization of NIPS Proceedings
Documents: 1,714; Words: 8,274
Postings Classification
Training set: 1,000; Testing set: 425; Documents: 1,425; Words: 13,312
Goal: classify the postings from the two newsgroups alt.atheism and talk.religion.misc.