
Multi-Label Prediction via Compressed Sensing

Presentation Transcript


  1. Multi-Label Prediction via Compressed Sensing By Daniel Hsu, Sham M. Kakade, John Langford, Tong Zhang (NIPS 2009) Presented by: Lingbo Li ECE, Duke University 01-22-2010 * Some notes are directly copied from the original paper.

  2. Outline • Introduction • Preliminaries • Learning Reduction • Compression and Reconstruction • Empirical Results • Conclusion

  3. Introduction • Large database of images; • Goal: predict who or what is in a given image. • Samples: images x with corresponding label vectors y ∈ {0,1}^d, where d is the total number of entities (labels) in the whole database. • One-against-all algorithm: learn a binary predictor for each label (class); computation is expensive when d is large (e.g., tens of thousands of labels, as in the data sets used later). • Assume the output vector y is sparse.
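
A rough back-of-the-envelope comparison (illustrative; d and k are taken from the image data set described later, the arithmetic is my own): with d ≈ 22,000 candidate labels and about k = 4 labels per image, one-against-all needs 22,000 binary predictors, whereas the compressed-sensing reduction needs only on the order of m ≈ k·log₂(d) ≈ 4 × 14.4 ≈ 58 regression outputs, ignoring constant factors.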

  4. Introduction [Figure: a label vector with d entries, only a few of which are 1 — i.e., a sparse binary vector.] Compressed sensing: for any sparse vector y, a random linear measurement matrix can compress y down to a number of measurements only logarithmic in the dimension d, while still allowing (near-)perfect reconstruction of y with high probability. Main idea: “Learn to predict compressed label vectors, and then use a sparse reconstruction algorithm to recover uncompressed labels from these predictions.”
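
A minimal end-to-end sketch of this idea (my own illustration, assuming numpy and scikit-learn; none of the variable names come from the paper):

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

rng = np.random.default_rng(0)
d, k, m = 1000, 4, 80               # labels, output sparsity, measurements (~ k log d up to constants)

y = np.zeros(d)
y[rng.choice(d, size=k, replace=False)] = 1.0   # a k-sparse 0/1 label vector

A = rng.standard_normal((m, d)) / np.sqrt(m)    # random linear compression A: R^d -> R^m
z = A @ y                                       # compressed labels, length m << d

omp = OrthogonalMatchingPursuit(n_nonzero_coefs=k, fit_intercept=False)
omp.fit(A, z)                                   # find a k-sparse y_hat with A @ y_hat ≈ z
y_hat = omp.coef_

print("true support:     ", sorted(np.nonzero(y)[0]))
print("recovered support:", sorted(np.nonzero(y_hat)[0]))
```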

  5. Preliminaries • X: input space; • Y ⊆ R^d: output (label) space, where d is the number of labels; • Training data: {(x_1, y_1), …, (x_n, y_n)} ⊂ X × Y; • Goal: to learn a predictor F: X → R^d with low mean-squared error E_x ||F(x) − E[y|x]||². Assume: • d is very large; • the expected value E[y|x] is sparse, with only a few non-zero entries.

  6. Learning reduction • Linear compression function A: R^d → R^m, where m ≪ d. • Goal: to learn a predictor H: X → R^m for the compressed labels. Original problem: from samples (x, y), predict the label y with the predictor F, to minimize E_x ||F(x) − E[y|x]||². Reduced problem: from compressed samples (x, Ay), predict the compressed label Ay with the predictor H, to minimize E_x ||H(x) − A·E[y|x]||².
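
A minimal training-side sketch of this reduction (my own illustration with synthetic data; a single multi-output ridge regressor stands in for the m regression problems, and all names are assumptions, not the authors' code):

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n, p, d, m = 500, 20, 1000, 80                   # samples, input dim, labels, measurements

X = rng.standard_normal((n, p))                  # inputs x_1..x_n
Y = (rng.random((n, d)) < 0.004).astype(float)   # sparse 0/1 label vectors y_1..y_n
A = rng.standard_normal((m, d)) / np.sqrt(m)     # linear compression A: R^d -> R^m

Z = Y @ A.T                                      # compressed labels A y_i, one row per sample
H = Ridge(alpha=1.0).fit(X, Z)                   # predictor H: X -> R^m (multi-output regression)
```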

  7. Reduction: training and prediction Reconstruction algorithm R: if H(x) is close to A·E[y|x], then the reconstruction R(H(x)) should be close to E[y|x].
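
Continuing the training sketch after slide 6 (still my own illustration), prediction applies a sparse reconstruction algorithm R — here scikit-learn's OMP, purely as an example — to the predicted compressed vector H(x):

```python
from sklearn.linear_model import OrthogonalMatchingPursuit

k = 4                                            # assumed output sparsity
z_hat = H.predict(X[:1]).ravel()                 # H(x): predicted compressed labels for one input
R = OrthogonalMatchingPursuit(n_nonzero_coefs=k, fit_intercept=False)
R.fit(A, z_hat)                                  # reconstruction: k-sparse y_hat with A @ y_hat ≈ H(x)
y_hat = R.coef_
predicted_labels = np.nonzero(y_hat)[0]          # indices of the predicted labels
```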

  8. Compression Functions Examples of valid compression functions (random matrices satisfying the required restricted-isometry-type conditions with high probability): matrices with i.i.d. Gaussian entries, matrices with random ±1 (Bernoulli) entries, and m randomly chosen rows of a Hadamard or Fourier matrix.
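
A minimal sketch constructing these three standard choices of compression matrix A (my own illustration, assuming numpy and scipy; the scaling by 1/√m is one common convention):

```python
import numpy as np
from scipy.linalg import hadamard

rng = np.random.default_rng(0)
d, m = 1024, 80                                  # hadamard() requires d to be a power of two

A_gaussian = rng.standard_normal((m, d)) / np.sqrt(m)            # i.i.d. Gaussian entries
A_bernoulli = rng.choice([-1.0, 1.0], size=(m, d)) / np.sqrt(m)  # random +/-1 entries
rows = rng.choice(d, size=m, replace=False)
A_hadamard = hadamard(d)[rows].astype(float) / np.sqrt(m)        # m random rows of a Hadamard matrix
```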

  9. Reconstruction Algorithms Examples of valid reconstruction algorithms: iterative and greedy algorithms • Orthogonal Matching Pursuit (OMP) • Forward-Backward Greedy (FoBa) • Compressive Sampling Matching Pursuit (CoSaMP)
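
A compact numpy sketch of Orthogonal Matching Pursuit, the first greedy algorithm listed above (illustrative only; library implementations are more careful about stopping rules and numerical issues):

```python
import numpy as np

def omp(A, z, k):
    """Greedy OMP: pick up to k columns of A to explain z, refitting least squares each step."""
    d = A.shape[1]
    residual = z.astype(float).copy()
    support = []
    for _ in range(k):
        j = int(np.argmax(np.abs(A.T @ residual)))        # column most correlated with the residual
        if j not in support:
            support.append(j)
        coef, *_ = np.linalg.lstsq(A[:, support], z, rcond=None)  # least squares on current support
        residual = z - A[:, support] @ coef
    x = np.zeros(d)
    x[support] = coef
    return x
```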

  10. General Robustness Guarantees Sparsity error is defined as sperr(k, ȳ) = ||ȳ − ȳ_k||², where ȳ_k is the best k-sparse approximation of ȳ = E[y|x]. What if the reduction creates a problem harder to solve than the original problem? The robustness guarantees bound the error of the reconstructed predictor in terms of the regression error on the compressed problem plus this sparsity error.

  11. Linear Prediction • If there is a perfect linear predictor of y, then there is a perfect linear predictor of the compressed label Ay:
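
In symbols (a one-line check using only linearity of expectation): if E[y|x] = Wx for some matrix W, then E[Ay|x] = A·E[y|x] = (AW)x, so the linear map x ↦ (AW)x predicts the compressed labels perfectly.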

  12. Experimental Results • Experiment 1: image data (collected by the ESP Game). 65k images, 22k unique labels; the 1k most frequent labels are kept; the least frequent occurs 39 times while the most frequent occurs about 12k times, with 4 labels per image on average; half of the data is used for training and half for testing. • Experiment 2: text data (collected from http://delicious.com/). 16k labeled web pages, 983 unique labels; the least frequent occurs 21 times, the most frequent about 6500 times, with 19 labels per web page on average; half of the data is used for training and half for testing. • Compression function A: select m random rows of the Hadamard matrix. • Reconstruction algorithms tested (greedy and iterative): OMP, FoBa, CoSaMP, and Lasso. • Correlation decoding (CD) is used as a baseline method for comparison.
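
A hedged sketch of the correlation-decoding baseline as I read it (assumed behaviour, not copied from the paper's code): score each label by the correlation of its column of A with the predicted compressed vector, then keep the top-k scores.

```python
import numpy as np

def correlation_decode(A, z_hat, k):
    """A: (m, d) compression matrix; z_hat: (m,) predicted compressed labels H(x)."""
    scores = A.T @ z_hat                   # correlation of z_hat with each column of A
    top = np.argsort(scores)[::-1][:k]     # indices of the k highest-scoring labels
    y_hat = np.zeros(A.shape[1])
    y_hat[top] = 1.0
    return y_hat
```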

  13. Experimental Results The figures report the precision of the predicted labels. [Figures: top two — image data; bottom — text data.]

  14. Conclusion • Application of compressed sensing to the multi-label prediction problem with output sparsity; • An efficient reduction whose number of predictions is only logarithmic in the number of original labels; • Robustness guarantees carrying over from the compressed problem to the original problem, and vice versa in the linear prediction setting.
