
Boosting the Ranking Function Learning Process using Clustering





Presentation Transcript


  1. Boosting the Ranking Function Learning Process using Clustering WIDM 2008

  2. Outline • Introduction • Problem definition • Approach • Evaluation • Conclusion

  3. Introduction • Abstract • As the Web continuously grows, the results returned by search engines are too many to review • Learning ranking functions from user feedback has gained a lot of attention • Such methods require a large amount of user feedback on the results • Goal: • Produce user feedback “automatically” by using clustering

  4. Problem definition • User feedback • Explicit feedback (user relevance judgements) • Implicit feedback • Click information • Users usually inspect only the first few results returned by a search engine, and click even fewer • Collecting relevance judgements from clickthrough data is a time-consuming process • Problem • How to use explicit feedback to generate implicit feedback? (relevance relations expansion)

  5. Approach procedure • Process • Assume that only the relevance judgements of the top-10 results (ranked by the BM25 feature) are available for each query • Group all the search results into clusters of documents having similar content • Expand the initial set (top-10 results) of relevance judgements using the cluster information

  6. Clustering • Represent each document by a feature vector whose dimension is the total number of distinct terms in all documents • Clustering method • Bisection clustering • Similarity measure • Cosine similarity
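The representation and similarity measure on this slide can be sketched as follows. This is an illustrative sketch only: the `tf_vector` and `cosine_similarity` names are assumptions, and it uses raw term frequencies rather than whatever exact weighting the paper applies.

```python
# Sketch of the document representation and cosine similarity from slide 6.
# Assumption: documents are sparse term-frequency vectors (the paper's exact
# term weighting is not given in this transcript).
import math
from collections import Counter

def tf_vector(doc_tokens):
    """Term-frequency vector over the document's tokens."""
    return Counter(doc_tokens)

def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

d1 = tf_vector("ranking function learning clustering".split())
d2 = tf_vector("clustering boosts ranking learning".split())
print(cosine_similarity(d1, d2))  # → 0.75 (3 shared terms, norms of 2 each)
```

Bisection clustering would then repeatedly split the least cohesive cluster in two using this similarity, until the desired number of clusters is reached.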

  7. Relation expansion (diagrams in original slide: train query; train query expansion)

  8. Relation expansion • Expansion Algorithm:
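The expansion algorithm shown on this slide (as a figure in the original) propagates the known judgements through the clusters. The sketch below is an assumption about its shape, since the transcript does not reproduce the algorithm: it assigns each unjudged document the majority label of the judged documents in its cluster; the paper's actual expansion rule may differ.

```python
# Illustrative sketch of cluster-based relevance-judgement expansion.
# Assumption: unjudged documents inherit the majority label of judged
# documents in the same cluster (function and variable names are made up).
def expand_judgements(judged, clusters):
    """judged: dict doc_id -> relevance grade (0, 1, or 2).
    clusters: list of lists of doc_ids.
    Returns an expanded dict covering unjudged cluster members too."""
    expanded = dict(judged)
    for cluster in clusters:
        labels = [judged[d] for d in cluster if d in judged]
        if not labels:
            continue  # no judged member: this cluster stays unlabeled
        # majority grade among the judged members of this cluster
        majority = max(set(labels), key=labels.count)
        for d in cluster:
            expanded.setdefault(d, majority)
    return expanded

judged = {"d1": 2, "d2": 0}
clusters = [["d1", "d3"], ["d2", "d4", "d5"]]
print(expand_judgements(judged, clusters))
# → {'d1': 2, 'd2': 0, 'd3': 2, 'd4': 0, 'd5': 0}
```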

  9. Evaluation • Dataset • LETOR OHSUMED collection • 348,566 records and 16,140 relevance judgements • 84 training queries and 22 testing queries • Relevance judgements • 0 (irrelevant), 1 (partially relevant), 2 (strongly relevant) • Training method • RankSVM
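RankSVM is a pairwise learner: the graded judgements (0/1/2) are turned into ordered document pairs before training. A minimal sketch of that pair construction, with illustrative names and toy feature vectors (not the OHSUMED features):

```python
# Sketch of how graded relevance judgements become pairwise training
# examples for RankSVM. `docs` maps doc id -> (feature_vector, grade);
# the data here is made up for illustration.
from itertools import combinations

def pairwise_examples(docs):
    """Return (higher, lower) pairs: the first doc should outrank the second."""
    pairs = []
    for (a, (fa, ga)), (b, (fb, gb)) in combinations(docs.items(), 2):
        if ga > gb:
            pairs.append((a, b))
        elif gb > ga:
            pairs.append((b, a))
        # ties produce no constraint
    return pairs

docs = {"d1": ([0.9], 2), "d2": ([0.4], 1), "d3": ([0.1], 0)}
print(pairwise_examples(docs))
# → [('d1', 'd2'), ('d1', 'd3'), ('d2', 'd3')]
```

Expanding the judgement set with cluster information directly increases the number of such pairs available to the learner, which is the boost the paper is after.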

  10. Evaluation • Clustering precision

  11. Evaluation • Using 160 relevance judgements

  12. Conclusion • We presented a methodology for increasing the training input of ranking function learning systems • Future work • Deciding whether a cluster is valid • Different cluster labeling methods
