Latent Semantic Indexing via a Semi-discrete Matrix Decomposition



Presentation Transcript


    1. Latent Semantic Indexing via a Semi-discrete Matrix Decomposition

    2. Papers from the same authors with similar topics:
       Kolda, T.G. & O'Leary, D.P. A semidiscrete matrix decomposition for latent semantic indexing in information retrieval. ACM Trans. Inf. Syst., 1998, 16, 322–346.
       Kolda, T.G. & O'Leary, D.P. Latent semantic indexing via a semi-discrete matrix decomposition. In: Cybenko, G. & O'Leary, D.P. (eds.), Springer-Verlag, 1999, 107, 73–80.
       Kolda, T.G. & O'Leary, D.P. Algorithm 805: computation and uses of the semidiscrete matrix decomposition. ACM Transactions on Mathematical Software, 2000, 26, 415–435.

    3. Vector Space Framework Query:

    4. Weight of term in a document

    5. Weight of term in a document

    6. Motivation for using SDD: Singular Value Decomposition (SVD) is used in Latent Semantic Indexing (LSI) to estimate the structure of word usage across documents. Using the Semi-discrete Decomposition (SDD) instead of the SVD for LSI saves storage space and retrieval time.
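A back-of-the-envelope sketch of where the storage saving comes from. It assumes a rank-k SVD stores two dense factors plus k singular values as 4-byte floats, while a k-term SDD stores factors whose entries are only -1, 0, or 1 (2 bits each) plus k floating-point weights. The function names and the matrix sizes are hypothetical and are not taken from the slides.

```python
def svd_bytes(m, n, k, float_bytes=4):
    """Bytes for a rank-k SVD: U_k (m*k), V_k (n*k) and k singular values as floats."""
    return float_bytes * k * (m + n + 1)


def sdd_bytes(m, n, k, float_bytes=4):
    """Bytes for a k-term SDD: 2 bits per entry of X_k and Y_k, plus k float weights."""
    ternary_bits = 2 * k * (m + n)
    return (ternary_bits + 7) // 8 + float_bytes * k


# Hypothetical collection: 5000 terms, 1000 documents, k = 100.
m, n, k = 5000, 1000, 100
print(svd_bytes(m, n, k))  # 2400400
print(sdd_bytes(m, n, k))  # 150400
```

Under these assumptions the SDD factors take roughly one sixteenth of the space of SVD factors with the same k; in practice an SDD may need a larger k than the SVD to reach comparable retrieval quality, so the real ratio depends on the collection.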

    7. Why? Claim: the SVD has nice theoretical properties, but it carries a lot of information, probably more than this application needs.

    8. SVD vs SDD SVD: SDD:
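For reference, the two rank-k approximations of an m × n term-document matrix A are usually written as below. The symbols are assumed standard notation rather than copied from the slide.

```latex
\begin{aligned}
\text{SVD:}\quad & A_k = U_k \Sigma_k V_k^{T},
  && U_k \in \mathbb{R}^{m \times k},\ V_k \in \mathbb{R}^{n \times k}
     \text{ with orthonormal columns},\ 
     \Sigma_k = \operatorname{diag}(\sigma_1,\dots,\sigma_k),\ \sigma_i \ge 0 \\
\text{SDD:}\quad & A_k = X_k D_k Y_k^{T} = \sum_{i=1}^{k} d_i\, x_i y_i^{T},
  && X_k \in \{-1,0,1\}^{m \times k},\ Y_k \in \{-1,0,1\}^{n \times k},\ 
     D_k = \operatorname{diag}(d_1,\dots,d_k),\ d_i > 0
\end{aligned}
```

The only structural difference is the constraint on the outer factors: real-valued orthonormal columns for the SVD, entries restricted to -1, 0, and 1 for the SDD.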

    9. SDD is an approximate representation of the matrix. Repackaging the matrix as an SDD, even without dropping any terms, does not necessarily reproduce the original matrix. Convergence theorems show that the approximation approaches the original matrix as the number of terms k tends to infinity, although slowly. The rate of convergence depends on the initial estimate used to "initialize" the iterative decomposition algorithm.

    10. Result: Storage Space

    11. Medline test case

    12. Results on Medline test case

    13. Method for SDD

    14. Metrics in those papers:
       Kolda, T.G. & O'Leary, D.P. A semidiscrete matrix decomposition for latent semantic indexing in information retrieval. ACM Trans. Inf. Syst., 1998, 16, 322–346.
       Kolda, T.G. & O'Leary, D.P. Latent semantic indexing via a semi-discrete matrix decomposition. In: Cybenko, G. & O'Leary, D.P. (eds.), Springer-Verlag, 1999, 107, 73–80.
       Kolda, T.G. & O'Leary, D.P. Algorithm 805: computation and uses of the semidiscrete matrix decomposition. ACM Transactions on Mathematical Software, 2000, 26, 415–435.

    15. Greedy Algorithm

    16. Notes on the algorithm: The starting vector y has every 100th element equal to 1 and all other elements equal to 0. A_k → A as k → ∞. Minimizing the Frobenius norm of the residual can be simplified to finding an optimal x. The improvement threshold may be 0.01, where improvement = |new - old| / old.
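The notes above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' Algorithm 805 code: the function names, the `max_inner` cap, the returned residual, and counting "every 100th element" from index 0 are all choices made here. The inner subproblem (best x for a fixed y, and the reverse) is solved by sorting magnitudes, and the quantity tracked as "new" and "old" is assumed to be (x^T R y)^2 / (||x||^2 ||y||^2), the amount by which the squared Frobenius norm of the residual R drops.

```python
def best_discrete_vector(s):
    """Return z with entries in {-1, 0, 1} maximizing (z . s)^2 / (z . z)."""
    # Try keeping the 1, 2, ..., len(s) largest magnitudes and pick the best count.
    order = sorted(range(len(s)), key=lambda i: -abs(s[i]))
    best_count, best_value, running = 1, -1.0, 0.0
    for count, i in enumerate(order, start=1):
        running += abs(s[i])
        value = running * running / count
        if value > best_value:
            best_count, best_value = count, value
    z = [0] * len(s)
    for i in order[:best_count]:
        z[i] = 1 if s[i] >= 0 else -1
    return z


def sdd(A, k_max, tol=0.01, max_inner=50, stride=100):
    """Greedy k_max-term SDD of A (list of rows). Returns X, D, Y and the residual."""
    m, n = len(A), len(A[0])
    R = [list(row) for row in A]  # residual, starts as a copy of A
    X, D, Y = [], [], []
    for _ in range(k_max):
        # Starting vector: every stride-th element is 1, all others 0.
        y = [1 if j % stride == 0 else 0 for j in range(n)]
        old = 0.0
        for _ in range(max_inner):
            s = [sum(R[i][j] * y[j] for j in range(n)) for i in range(m)]
            x = best_discrete_vector(s)  # best x for the current y
            t = [sum(R[i][j] * x[i] for i in range(m)) for j in range(n)]
            y = best_discrete_vector(t)  # best y for the current x
            x_r_y = sum(t[j] * y[j] for j in range(n))
            # For entries in {-1, 0, 1}, the sum of magnitudes equals the squared 2-norm.
            scale = sum(map(abs, x)) * sum(map(abs, y))
            new = x_r_y * x_r_y / scale
            if old > 0 and abs(new - old) / old < tol:
                break
            old = new
        d = x_r_y / scale  # optimal weight for this x, y pair
        for i in range(m):
            for j in range(n):
                R[i][j] -= d * x[i] * y[j]
        X.append(x)
        D.append(d)
        Y.append(y)
    return X, D, Y, R
```

Each outer step subtracts d x y^T from the residual, so the squared Frobenius norm of the residual never increases. With a fixed starting vector the inner iteration can stall on a residual whose sampled columns are all zero, which is one reason the choice of starting vector affects how quickly A_k approaches A.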

    17. Finding x and d
