
Dependence Language Model for Information Retrieval



Presentation Transcript


  1. Dependence Language Model for Information Retrieval • Jianfeng Gao, Jian-Yun Nie, Guangyuan Wu, Guihong Cao. SIGIR 2004.

  2. References • Ciprian Chelba, David Engle, et al. Structure and performance of a dependency language model. Eurospeech 1997. • Daniel D. K. Sleator and Davy Temperley. Parsing English with a Link Grammar. Technical Report CMU-CS-91-196, 1991.

  3. Why do we use the independence assumption? • The independence assumption is widely adopted in probabilistic retrieval theory. • Why? • It makes retrieval models simpler. • It makes retrieval operations tractable. • The shortcoming of the independence assumption • Independence does not hold in textual data. (See the sketch below.)
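To make the assumption concrete, here is a minimal Python sketch (not from the paper) of query-likelihood ranking under term independence; it uses unsmoothed maximum-likelihood estimates, so any unseen query term zeroes the score:

```python
from collections import Counter

# Minimal sketch of P(Q|D) = prod_i P(q_i|D) under the independence
# assumption, with unsmoothed maximum-likelihood term estimates.
def unigram_query_likelihood(query_terms, doc_terms):
    counts = Counter(doc_terms)
    doc_len = len(doc_terms)
    score = 1.0
    for q in query_terms:
        # A missing term gives 0 and zeroes the product,
        # which is one reason smoothing is needed later.
        score *= counts[q] / doc_len
    return score

print(unigram_query_likelihood(
    ["information", "retrieval"],
    ["information", "retrieval", "models", "information"]))  # 0.125
```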

  4. Recent ideas for modeling dependence • Bigram • Some language modeling approaches try to incorporate word dependence by using bigrams. • Shortcomings: • Word dependencies exist not only between adjacent words but also at greater distances. • Adjacent words are not always actually related. • The bigram language model showed only marginally better effectiveness than the unigram model. • Bi-term • The bi-term language model is similar to the bigram model except that the word-order constraint is relaxed: "information retrieval" and "retrieval of information" are assigned the same probability of generating the query. (A toy sketch of the difference follows.)
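A toy sketch (illustrative names, not the paper's code) of the one difference between the two models: the bigram counts ordered adjacent pairs, while the bi-term relaxes the order constraint:

```python
from collections import Counter

# Count adjacent word pairs: ordered for a bigram model,
# order-insensitive for a bi-term model.
def pair_counts(tokens, ordered=True):
    pairs = list(zip(tokens, tokens[1:]))
    if ordered:
        return Counter(pairs)                       # bigram: (w1, w2) != (w2, w1)
    return Counter(tuple(sorted(p)) for p in pairs) # bi-term: order relaxed

tokens = ["information", "retrieval", "needs", "retrieval", "information"]
print(pair_counts(tokens)[("information", "retrieval")])                 # bigram: 1
print(pair_counts(tokens, ordered=False)[("information", "retrieval")])  # bi-term: 2
```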

  5. Structure and performance of a dependency language model

  6. Introduction • This paper presents a maximum entropy language model that incorporates both syntax and semantics via a dependency grammar. • Dependency grammar: expresses the relations between words by a directed graph, which can incorporate the predictive power of words that lie outside bigram or trigram range.

  7. Introduction • Why we use N-grams • To record a full N-gram distribution over a vocabulary V, we need to store |V|^(N-1) (|V| - 1) independent parameters. • The drawback of N-grams • An N-gram model blindly discards relevant words that lie N or more positions in the past.
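As a worked instance of the standard parameter count: a full N-gram model has one distribution per history, i.e. |V|^(N-1) histories times (|V| - 1) free probabilities each:

```python
# Independent parameters of a full N-gram model over a vocabulary of
# size V: V**(N-1) histories, each with V-1 free probabilities.
def ngram_parameters(vocab_size: int, n: int) -> int:
    return vocab_size ** (n - 1) * (vocab_size - 1)

for n in (1, 2, 3):
    print(n, ngram_parameters(20_000, n))
# 1 -> 19999; 2 -> ~4.0e8; 3 -> ~8.0e12: raising N quickly becomes
# infeasible, which is why N-grams cannot simply look further back.
```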

  8. Structure of the model

  9. Structure of the model • Develop an expression for the joint probability P(W, K), where K is the set of linkages in the sentence W. • Then we get P(W) = Σ_K P(W, K). • Assume that the sum is dominated by a single term; then P(W) ≈ max_K P(W, K).

  10. A dependency language model for IR • Given a query Q = q1 q2 … qm, we want to rank documents D by P(Q|D). • Previous work: assume independence between query terms: P(Q|D) = Π_i P(q_i|D). • New work: assume that the term dependencies in a query form a linkage L.

  11. A dependency language model for IR • Assume that the sum over all possible linkages L is dominated by a single term: P(Q|D) ≈ P(L|D) P(Q|L, D). • Assume that each term depends on exactly one related query term generated previously, so P(Q|L, D) = P(q_h|D) Π_{(i,j)∈L} P(q_j|q_i, D), where q_h is the first (head) term generated. (A sketch of the linkage-selection step follows.)
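A small sketch of the linkage-selection step under the stated assumptions; `link_prob` and the candidate list are assumed inputs (the paper derives candidate linkages from a link-grammar parse, which is not reproduced here):

```python
import math

# Sketch: pick the most probable linkage among candidates, assuming
# independent links. `link_prob(qi, qj)` maps a term pair to
# P(l | qi, qj) (assumed positive); each candidate is a list of
# (i, j) index pairs over the query terms.
def best_linkage(query, candidates, link_prob):
    def log_p(linkage):
        return sum(math.log(link_prob(query[i], query[j])) for i, j in linkage)
    return max(candidates, key=log_p)
```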

  12. A dependency language model of IR

  13. A dependency language model for IR • Assume that the generation of a single term is independent of L. • Under this assumption, we would arrive at the same result starting from any term, so L can be represented as an undirected graph.

  14. A dependency language model for IR • Taking the logarithm yields the ranking formula: log P(Q|D) ≈ log P(L|D) + Σ_i log P(q_i|D) + Σ_{(i,j)∈L} log [P(q_i, q_j|D) / (P(q_i|D) P(q_j|D))], i.e., a unigram score plus a mutual-information-like term for each linked pair.
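A hedged sketch of this decomposed rank formula; `p_term` and `p_pair` stand in for the smoothed document-model estimates defined in the next slides:

```python
import math

# Sketch of the decomposed ranking score:
#   log P(Q|D) ~= log P(L|D)
#                 + sum_i log P(q_i|D)
#                 + sum over links (i,j) of log[ P(q_i,q_j|D) / (P(q_i|D)P(q_j|D)) ]
def rank_score(query, links, log_p_linkage, p_term, p_pair):
    score = log_p_linkage                 # log P(L|D)
    for q in query:                       # unigram part
        score += math.log(p_term(q))
    for i, j in links:                    # dependence (MI-like) part
        qi, qj = query[i], query[j]
        score += math.log(p_pair(qi, qj) / (p_term(qi) * p_term(qj)))
    return score
```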

  15. Parameter Estimation • Estimating P(L|Q): assume that the links are independent, so P(L|Q) = Π_{(i,j)∈L} P(l|q_i, q_j). • P(l|q_i, q_j) is the relative frequency with which q_i and q_j have a link in a sentence of the training data, given that they appear in the same sentence: P(l|q_i, q_j) = C(q_i, q_j, R = 1) / C(q_i, q_j), where C(q_i, q_j, R = 1) is the number of times the pair is linked and C(q_i, q_j) the number of times it co-occurs.
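A minimal sketch of this counting scheme, assuming training sentences arrive with their parsed links as unordered term pairs (the data layout is illustrative):

```python
from collections import defaultdict

# Estimate P(l | qi, qj) = C(qi, qj, R=1) / C(qi, qj): the relative
# frequency of a link between two terms given same-sentence co-occurrence.
def link_probabilities(parsed_sentences):
    cooc = defaultdict(int)    # C(qi, qj): co-occurrences in a sentence
    linked = defaultdict(int)  # C(qi, qj, R=1): co-occurrences with a link
    for tokens, links in parsed_sentences:  # links: set of frozenset pairs
        vocab = sorted(set(tokens))
        for a in range(len(vocab)):
            for b in range(a + 1, len(vocab)):
                pair = (vocab[a], vocab[b])
                cooc[pair] += 1
                if frozenset(pair) in links:
                    linked[pair] += 1
    return {p: linked[p] / cooc[p] for p in cooc}

sentences = [(["information", "retrieval", "models"],
              {frozenset({"information", "retrieval"})})]
print(link_probabilities(sentences)[("information", "retrieval")])  # 1.0
```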

  16. Parameter Estimation • Assumption: (formula on the original slide)

  17. Parameter Estimation • Estimating P(q_i|D): the document language model is smoothed with a Dirichlet prior, P(q_i|D) = (c(q_i; D) + μ P(q_i|C)) / (|D| + μ), where c(q_i; D) is the count of q_i in D, P(q_i|C) is the collection model, and μ acts as a constant discount following the Dirichlet distribution.
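A one-function sketch of Dirichlet-prior smoothing as stated above; μ = 2000 is a commonly used default, not a value taken from the paper:

```python
# Dirichlet-prior smoothing of the unigram document model:
#   P(w|D) = (c(w; D) + mu * P(w|C)) / (|D| + mu)
def dirichlet_smoothed(word, doc_counts, doc_len, collection_prob, mu=2000.0):
    return (doc_counts.get(word, 0) + mu * collection_prob(word)) / (doc_len + mu)

# Toy usage: a 100-word document containing "retrieval" 3 times,
# with an assumed collection probability of 0.001 for the word.
print(dirichlet_smoothed("retrieval", {"retrieval": 3}, 100,
                         lambda w: 0.001))  # ~0.00238
```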

  18. Parameter Estimation • Estimating the dependency term P(q_j|q_i, D) for linked pairs (formula on the original slide).

  19. Experimental Setting • Documents were stemmed and stop words were removed. • Queries are TREC topics 202 to 250, run on TREC disks 2 and 3.

  20. The flow of the experiment • Training data (for weight computation): count link frequencies to estimate P(l|q_i, q_j). • Query: find the linkages of the query; get P(L|Q); find the best L by argmax_L P(L|Q). • Document: count term frequencies to get P(L|D), P(q_i|D), and P(q_j|q_i, D). • Combine the components and rank the documents.

  21. Result: BM & UG • BM: binary independence retrieval model • UG: unigram language model approach • UG achieves performance similar to, or worse than, that of BM.

  22. Result: DM • DM: dependency model • The improvement of DM over UG is statistically significant.

  23. Result: BG • BG: bigram language model • BG is slightly worse than DM in five out of six TREC collections, but substantially outperforms UG in all collections.

  24. Result: BT1 & BT2 • BT: bi-term language model (BT1 and BT2 are two variants).

  25. Conclusion • This paper introduces the linkage of a query as a hidden variable. • Each query term is generated in turn, depending on other related terms according to the linkage. • This approach covers several language modeling approaches as special cases. • The experiments show that it substantially outperforms the unigram and bigram models and the classical probabilistic retrieval model.
