
Exploiting Proximity Feature in Statistical Translation Models for Information Retrieval


Presentation Transcript


  1. Exploiting Proximity Feature in Statistical Translation Models for Information Retrieval Presenter: 101598035 邱威霖

  2. Reference • CIKM '13: Proceedings of the 22nd ACM International Conference on Information & Knowledge Management • Xinhui Tu, Jing Luo, Maofu Liu, College of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, China • Bo Li, Tingting He, Department of Computer Science, Central China Normal University, Wuhan, China

  3. Outline • Introduction • Statistical Translation Model For Retrieval • Proximity-Based Translation Language Model • Experiments • Conclusions

  4. Introduction • In this paper, they study how to explicitly incorporate proximity information into the existing translation language model, and propose a proximity-based translation language model, called TM-P, with three variants. • In the TM-P models, a new concept is introduced to model the proximity of word co-occurrences, which is then used to estimate translation probabilities.

  5. Statistical Translation Model For Retrieval • In the basic language modelling approach, documents are ranked by the probability that the query text could be generated by the document language model. • Given a query q = q1, q2, …, qm and a document d, the query likelihood scoring function is as follows:
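The equation image from this slide is not preserved in the transcript. The standard query-likelihood scoring function it refers to (the slide's notation may differ slightly) is:

$$\log p(q \mid d) = \sum_{i=1}^{m} \log p(q_i \mid d)$$

where p(q_i | d) is the probability that the document language model generates query word q_i.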

  6. Statistical Translation Model For Retrieval • In the translation language modelling approach, the document model p(w|d) can be calculated by using the following "translation document model":
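The slide's equation is missing from the transcript; the classic translation document model (due to Berger and Lafferty) that this formulation follows is:

$$p(w \mid d) = \sum_{u \in d} p_t(w \mid u)\, p_{ml}(u \mid d)$$

where p_t(w | u) is the probability of translating document word u into query word w, and p_ml(u | d) is the maximum-likelihood estimate of u in d.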

  7. Statistical Translation Model For Retrieval • In this way, a word can be translated into its semantically related words with non-zero probability, which allows us to score a document by counting the matches between a query word and semantically related words in the document.

  8. Proximity-Based Translation Language Model • Previous studies have shown that the translation language model works better with Dirichlet prior smoothing [4][5]. • Therefore, in the rest of the paper, we focus only on the translation language model with Dirichlet prior smoothing, as follows:
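The equation is missing from the transcript; the Dirichlet-smoothed translation language model commonly used in this line of work (and presumably on the slide) is:

$$p(w \mid d) = \frac{\sum_{u \in d} p_t(w \mid u)\, c(u, d) + \mu\, p(w \mid C)}{|d| + \mu}$$

where c(u, d) is the count of u in d, |d| is the document length, p(w | C) is the collection language model, and μ is the Dirichlet prior.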

  9. Proximity-Based Translation Language Model • Estimating Translation Probability • They introduce a new concept, namely proximity-based word co-occurrence frequency (pcf), to model the proximity feature of word co-occurrences.
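The defining equation for pcf is not in the transcript. Given the kernel parameter σ discussed in the parameter sensitivity study (slide 17), a plausible reconstruction, offered here only as an assumption rather than the paper's verified formula, is a Gaussian kernel over the word-pair distance:

$$pcf(w, u, d) = \exp\!\left(-\frac{dist(w, u, d)^2}{2\sigma^2}\right)$$

so that word pairs occurring close together contribute more co-occurrence mass than distant ones.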

  10. Proximity-Based Translation Language Model • In this paper, three commonly used distance measures are adopted to calculate dist(w,u,d); a runnable sketch of all three appears after slide 13. • Minimum pair distance • Average pair distance • Average minimum pair distance

  11. Minimum pair distance • It is defined as the minimum distance between any occurrences of w and u in document d. • In the example (w at positions {1, 5, 9} and u at positions {2, 6}, as can be reconstructed from the arithmetic on slide 12), dist(w,u,d) is 1 and can be calculated from the position vectors.

  12. Average pair distance • It is defined as the average distance between w and u over all position combinations in d. • In the example, the distances from the first occurrence of w (at position 1) to the occurrences of u are 1 and 5; the same is computed for the next occurrence of w (at position 5), and so on. dist(w,u,d) for the example is ((2-1) + (6-1) + (5-2) + (6-5) + (9-2) + (9-6)) / (2 · 3) = 20/6 ≈ 3.33.

  13. Average minimum pair distance • It is defined as the average of the shortest distance between each occurrence of the least frequently occurring word and any occurrence of the other word. • In the example, u is the least frequently occurring word, so dist(w,u,d) = ((2−1) + (6−5)) / 2 = 1.
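A minimal Python sketch of the three distance measures, using the positions reconstructed from the slides' worked example (w at {1, 5, 9}, u at {2, 6}); the function names are illustrative, not from the paper:

```python
from itertools import product

def min_pair_distance(pos_w, pos_u):
    """Minimum distance over all occurrence pairs of w and u."""
    return min(abs(i - j) for i, j in product(pos_w, pos_u))

def avg_pair_distance(pos_w, pos_u):
    """Average distance over all occurrence pairs of w and u."""
    pairs = [abs(i - j) for i, j in product(pos_w, pos_u)]
    return sum(pairs) / len(pairs)

def avg_min_pair_distance(pos_w, pos_u):
    """Average, over each occurrence of the rarer word, of the
    shortest distance to any occurrence of the other word."""
    rare, other = sorted([pos_w, pos_u], key=len)
    return sum(min(abs(i - j) for j in other) for i in rare) / len(rare)

# Positions from the running example in slides 11-13
pos_w, pos_u = [1, 5, 9], [2, 6]
print(min_pair_distance(pos_w, pos_u))      # 1        (slide 11)
print(avg_pair_distance(pos_w, pos_u))      # 20/6 = 3.33...  (slide 12)
print(avg_min_pair_distance(pos_w, pos_u))  # 1        (slide 13)
```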

  14. Proximity-Based Translation Language Model • Then, the probability of translating word u into word w can be estimated as follows:
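The estimation formula is missing from the transcript. Following the normalization scheme used for co-occurrence-based translation estimates in earlier work [5], a plausible reconstruction (an assumption, not the paper's verified formula) normalizes pcf over the vocabulary:

$$p_t(w \mid u) = \frac{pcf(w, u)}{\sum_{w'} pcf(w', u)}$$

where pcf(w, u) aggregates the per-document values over the collection.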

  15. Proximity-Based Translation Language Model • Optimizing Self-translation Probability • In order to satisfy the constraints defined in [5], we adjust the translation language model as follows:
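The adjusted model is not shown in the transcript. In [5], the self-translation constraint is satisfied by interpolating with an identity translation, which matches the parameter s described in the parameter study (slide 17); a plausible reconstruction is:

$$p_s(w \mid u) = \begin{cases} s + (1 - s)\, p_t(w \mid u) & \text{if } w = u \\ (1 - s)\, p_t(w \mid u) & \text{otherwise} \end{cases}$$

which guarantees that a word always translates to itself with probability at least s.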

  16. Experiments • The experiments in this section use three main document collections: • the ad hoc data in TREC7 (528,155 articles) with TREC topics 351-400 • WSJ news articles with TREC topics 51-100 • technical reports in the DOE abstracts with TREC topics 51-100

  17. Experiments: Parameter Sensitivity Study • An important issue that may affect the robustness of the TM-P models is the sensitivity of their parameters s and σ. • The parameter s controls the amount of self-translation probability. • The kernel parameter σ determines the distance within which words are considered to be related. • In this section, we study how sensitive the MAP measure is to these two parameters.

  18. Experiments: Parameter Sensitivity Study

  19. Experiments: Parameter Sensitivity Study

  20. Conclusion • In this paper, a new type of translation language model, called TM-P, is proposed by explicitly incorporating proximity information into the existing translation language model. • The corresponding models based on these measures, TM-P1, TM-P2 and TM-P3, are evaluated on three standard TREC collections. • Our experimental results indicate that the TM-P models are more effective than state-of-the-art translation models. • Comparing the three variants of TM-P, TM-P3 is more effective than TM-P1 and TM-P2. In the future, we will study how to apply TM-P to other text processing tasks.

  21. End
