Gene Clustering by Latent Semantic Indexing of MEDLINE Abstracts
Ramin Homayouni, Kevin Heinrich, Lai Wei, and Michael W. Berry
University of Tennessee
presented by J. Jiang

Outline
Brief Overview of Biomedical Literature Mining
The Gene Clustering Problem
Latent Semantic Indexing
(ISMB '05 Tutorial Proposal, H. Shatkay)
This paper tries to improve the vector representation of documents using LSA.
$X = T_0 S_0 D_0^{T}$,
where the columns of $T_0$ are the eigenvectors of $X X^{T}$, and the columns of $D_0$ are the eigenvectors of $X^{T} X$. $S_0$ is diagonal. $S_0^{2}$ is the matrix of eigenvalues of $X X^{T}$ (or $X^{T} X$).
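The decomposition can be checked numerically. The following is a minimal sketch in Python with NumPy, not code from the paper; the small term-by-document matrix `X` is made up for illustration, and the variable names `T0`, `s0`, and `D0` simply mirror the symbols above.

```python
import numpy as np

# Hypothetical toy term-by-document matrix (rows: terms, columns: documents).
X = np.array([
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [1.0, 0.0, 0.0, 1.0],
])

# Thin SVD: X = T0 @ diag(s0) @ D0.T, singular values in descending order.
T0, s0, D0t = np.linalg.svd(X, full_matrices=False)
D0 = D0t.T

# The three factors reproduce X.
reconstruction_ok = np.allclose(T0 @ np.diag(s0) @ D0.T, X)

# Columns of T0 are eigenvectors of X X^T, columns of D0 of X^T X,
# in both cases with eigenvalues s0**2 (the diagonal of S0^2).
t_ok = np.allclose((X @ X.T) @ T0, T0 * s0**2)
d_ok = np.allclose((X.T @ X) @ D0, D0 * s0**2)

# Cross-check against a direct eigenvalue computation on X^T X.
eigvals = np.sort(np.linalg.eigvalsh(X.T @ X))[::-1]
eigen_ok = np.allclose(s0**2, eigvals)

print(reconstruction_ok, t_ok, d_ok, eigen_ok)
```

Using `full_matrices=False` keeps only as many columns of `T0` as there are singular values, which matches the diagonal form of $S_0$ used here.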
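$X \approx \hat{X} = T S D^{T}$,
where $T$, $S$, and $D$ keep only the $k$ largest singular values from $S_0$ and the corresponding columns of $T_0$ and $D_0$.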
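As a rough illustration of the reduced-rank approximation, the sketch below truncates the SVD of a made-up matrix to `k = 2` factors. The matrix and the choice of `k` are assumptions for the example only; the error check relies on the standard Eckart-Young result for truncated SVD.

```python
import numpy as np

# Hypothetical toy term-by-document matrix (rows: terms, columns: documents).
X = np.array([
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [1.0, 0.0, 0.0, 1.0],
])
k = 2  # number of factors kept (an arbitrary choice for this example)

T0, s0, D0t = np.linalg.svd(X, full_matrices=False)

# Keep the k largest singular values and the matching singular vectors.
T = T0[:, :k]
S = np.diag(s0[:k])
D = D0t[:k, :].T

# Rank-k approximation of X.
X_hat = T @ S @ D.T

# The Frobenius-norm error equals the root of the sum of the
# discarded squared singular values.
err = np.linalg.norm(X - X_hat, "fro")
expected_err = np.sqrt(np.sum(s0[k:] ** 2))
print(err, expected_err)
```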
[Figure: the first and second eigenvectors of a sample data set, taken from "A Tutorial on PCA" by Lindsay Smith]
$\hat{X}^{T} \hat{X} = D S^{2} D^{T} = (DS)(DS)^{T}$.
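This identity means document-to-document inner products in the reduced space can be computed from the rows of $DS$ alone. The sketch below checks that on a made-up matrix and also derives cosine similarities; the data, `k = 2`, and the use of cosine are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

# Hypothetical toy term-by-document matrix (rows: terms, columns: documents).
X = np.array([
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [1.0, 0.0, 0.0, 1.0],
])
k = 2

T0, s0, D0t = np.linalg.svd(X, full_matrices=False)
T, S, D = T0[:, :k], np.diag(s0[:k]), D0t[:k, :].T
X_hat = T @ S @ D.T

# Each row of DS holds one document's coordinates in the k-dimensional space.
DS = D @ S

# Inner products between documents: (DS)(DS)^T equals X_hat^T X_hat.
sim = DS @ DS.T
identity_ok = np.allclose(sim, X_hat.T @ X_hat)

# Cosine similarity between documents, from the same coordinates.
norms = np.linalg.norm(DS, axis=1)
cos = sim / np.outer(norms, norms)
print(identity_ok)
print(np.round(cos, 3))
```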
$D_q = X_q^{T} T S^{-1}$,
where $X_q$ is the query vector in the original space. $D_q$ is like a row of $D$.
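One way to see why $D_q$ behaves like a row of $D$ is to fold in a column of $X$ as if it were a query: the result should be that document's own row of $D$. The sketch below does this on a made-up matrix, then folds in a hypothetical new query; the data, `k = 2`, and the query terms are illustrative assumptions.

```python
import numpy as np

# Hypothetical toy term-by-document matrix (rows: terms, columns: documents).
X = np.array([
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [1.0, 0.0, 0.0, 1.0],
])
k = 2

T0, s0, D0t = np.linalg.svd(X, full_matrices=False)
T, S, D = T0[:, :k], np.diag(s0[:k]), D0t[:k, :].T
S_inv = np.diag(1.0 / s0[:k])  # inverse of the diagonal matrix S


def fold_in(x_q):
    """Project a term-space vector into the k-dimensional space: x_q^T T S^-1."""
    return x_q @ T @ S_inv


# Sanity check: folding in document 0 recovers row 0 of D.
Dq_doc0 = fold_in(X[:, 0])
fold_ok = np.allclose(Dq_doc0, D[0])

# A hypothetical new query that mentions terms 0 and 2.
X_q = np.array([1.0, 0.0, 1.0, 0.0, 0.0])
D_q = fold_in(X_q)
print(fold_ok, D_q)
```

Once folded in, the query can be compared with the rows of $D$ (or $DS$) in the same way as any document.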