Latent Semantic Kernels. Alejandro Figueroa. Outline. Introduction. From bag of words to semantic space. Representing text (Document-term matrix). Semantic Issues. Vector space kernels. Designing semantic kernels. Designing the proximity matrix. Generalised Vector Space Model.
Remark: [Stemming] different forms of a word can be treated as equivalent terms, which reduces the dimensionality of the term space.
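The effect of stemming on the size of the term space can be illustrated with a minimal sketch. The `crude_stem` function below is a hypothetical toy rule invented for this example (it only strips a few English suffixes); it is not a real stemming algorithm such as Porter's, and the token list is made up.

```python
def crude_stem(word):
    """Hypothetical toy stemmer: strip one common English suffix, if any."""
    for suffix in ("ing", "ed", "es", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


# Made-up tokens: several surface forms of two underlying terms.
tokens = ["kernel", "kernels", "index", "indexed", "indexing", "indexes"]

# Without stemming, every surface form is its own dimension.
vocabulary = sorted(set(tokens))

# With stemming, equivalent forms collapse onto a single dimension.
stemmed_vocabulary = sorted(set(crude_stem(t) for t in tokens))

print(len(vocabulary), "terms before stemming:", vocabulary)
print(len(stemmed_vocabulary), "terms after stemming:", stemmed_vocabulary)
```

In this toy case the six surface forms collapse onto two stems, so the document vectors would need two coordinates instead of six.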
Two terms co-occurring in a document are considered related, with the strength of the relationship given by the frequency and number of their co-occurrences.
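A small sketch may make the co-occurrence idea concrete. It assumes the usual Generalised Vector Space Model formulation, in which the term-by-term matrix D^T D (with D the document-term matrix) measures how strongly two terms co-occur, and the kernel between two documents is d1 (D^T D) d2^T. The corpus, the term labels and the helper names are hypothetical.

```python
def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


# Hypothetical document-term matrix D: rows are documents, columns are
# the terms ("kernel", "space", "matrix"), entries are term frequencies.
D = [
    [2, 1, 0],  # document 0
    [0, 1, 1],  # document 1
]

# Term-by-term co-occurrence matrix D^T D: entry (i, j) grows with how
# often, and in how many documents, terms i and j appear together.
cooc = matmul(transpose(D), D)


def gvsm_kernel(d1, d2, cooc):
    """GVSM kernel sketch: d1 * (D^T D) * d2^T."""
    projected = [sum(c * x for c, x in zip(row, d2)) for row in cooc]
    return sum(x * p for x, p in zip(d1, projected))


# Two new documents that share no term: one contains only "kernel",
# the other only "space".
a = [1, 0, 0]
b = [0, 1, 0]

plain = sum(x * y for x, y in zip(a, b))  # ordinary vector space kernel
semantic = gvsm_kernel(a, b, cooc)        # kernel using co-occurrences

print("bag-of-words kernel:", plain)
print("GVSM kernel:", semantic)
```

The two documents are orthogonal under the plain bag-of-words kernel, yet the co-occurrence kernel is non-zero because "kernel" and "space" appear together in document 0.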
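LSI reduces the space by keeping only the first k columns of U (the left singular vectors of the term-document matrix) and projecting documents onto them.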
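The projection onto the first k columns of U can be sketched with NumPy. This is an illustration under assumptions: U is taken from the singular value decomposition of the term-by-document matrix D^T, the small corpus is made up, and `lsi_kernel` is a hypothetical helper name.

```python
import numpy as np

# Hypothetical document-term matrix D: rows are documents, columns are
# terms. Terms 0 and 1 occur together, as do terms 2 and 3.
D = np.array([
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 2.0],
    [0.0, 0.0, 2.0, 1.0],
])

# SVD of the term-by-document matrix: D^T = U diag(s) V^T.
# The columns of U live in term space, ordered by singular value.
U, s, Vt = np.linalg.svd(D.T, full_matrices=False)

k = 2
U_k = U[:, :k]  # keep only the first k columns of U


def lsi_kernel(d1, d2, basis=U_k):
    """Inner product of two documents after projection onto the basis."""
    return float((d1 @ basis) @ (d2 @ basis))


# Two documents with no term in common: one uses only term 0,
# the other only term 1.
a = np.array([1.0, 0.0, 0.0, 0.0])
b = np.array([0.0, 1.0, 0.0, 0.0])

print("bag-of-words kernel:", float(a @ b))
print("LSI kernel (k=2):  ", lsi_kernel(a, b))
```

With k equal to the full number of terms the projection changes nothing and the kernel reduces to the ordinary inner product; choosing a smaller k is what merges co-occurring terms into shared dimensions.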
Query Q in any Language
Histogram H of substrings
Set S of Snippets
Set F of