Result Diversification Based On Query Specific Cluster Ranking

By Jiyin He, Edgar Meij, and Maarten de Rijke. Presenter: Bilge Koroglu. June 6, 2011.


Presentation Transcript


  1. Result Diversification Based On Query Specific Cluster Ranking. By Jiyin He, Edgar Meij, and Maarten de Rijke. Presenter: Bilge Koroglu. June 6, 2011

  2. Introduction • Multi-faceted queries • Jaguar: cocktail, car, animal, etc... • Ambiguity in search result lists • which of the web pages should be included • cocktail, car, animal, etc... • Possible solution: diversification • query specific clusters • important clusters (=high quality)

  3. Related Work • Probabilistic Ranking Principle • traditional retrieval strategy • maximize query-document similarity • Diversification requirements: • maximize query-document similarity • minimize document-document similarity • irrelevant document retrieval • Maximum Marginal Relevance (MMR) • Precision & diversity are inversely proportional (Figure 1) • Aim to increase early precision while giving diversified results
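
The two diversification requirements above (maximize query-document similarity, minimize document-document similarity) are what MMR trades off. A minimal sketch of greedy MMR selection follows, assuming the standard formulation with a trade-off weight; the names `mmr_select`, `query_sim`, `doc_sim`, and `lam` are illustrative and do not come from the slides.

```python
from typing import Callable, List, Sequence


def mmr_select(
    candidates: Sequence[str],
    query_sim: Callable[[str], float],
    doc_sim: Callable[[str, str], float],
    k: int,
    lam: float = 0.5,
) -> List[str]:
    """Greedy MMR: repeatedly pick the document that maximizes
    lam * sim(query, d) - (1 - lam) * max sim(d, already selected).

    lam = 1.0 reduces to plain relevance ranking; smaller values
    penalize redundancy more strongly. (lam and its default are
    assumptions for illustration.)
    """
    selected: List[str] = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def score(d: str) -> float:
            # Redundancy is the similarity to the closest selected document.
            redundancy = max((doc_sim(d, s) for s in selected), default=0.0)
            return lam * query_sim(d) - (1.0 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With a low `lam`, a slightly less relevant document from a new subtopic can outrank a highly relevant near-duplicate, which is the precision/diversity trade-off that Figure 1 refers to.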

  4. Related Work (cont’d) Figure 1. The trade-off between precision and diversity for MMR

  5. Diversification Methods • MMR • Facet modeling with Latent Dirichlet Allocation (FM-LDA) • Intent-Aware Select (IA-Select) • Round Robin Facet Selection • Selection of T • diversification on top T clusters • remaining documents ranked by their retrieval score • The ways of ranking clusters: • query likelihood • oracle ranker
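
The "selection of T" idea can be sketched as follows, using round-robin selection as the diversifier. This is only an illustration under stated assumptions: clusters arrive already ranked, each document has a retrieval score, and the function name `round_robin_top_t` is hypothetical.

```python
from typing import Dict, List


def round_robin_top_t(
    ranked_clusters: List[List[str]],
    scores: Dict[str, float],
    t: int,
) -> List[str]:
    """Diversify over the top-t clusters, then append everything else.

    ranked_clusters: clusters in decreasing order of estimated quality.
    scores: retrieval score per document id (assumed to cover every id).
    """
    top, rest = ranked_clusters[:t], ranked_clusters[t:]

    # Inside each top cluster, best-scoring documents come first.
    queues = [sorted(c, key=lambda d: scores[d], reverse=True) for c in top]

    # Round robin: take one document from each cluster in turn.
    picked: List[str] = []
    while any(queues):
        for queue in queues:
            if queue:
                picked.append(queue.pop(0))

    # Remaining documents are ranked purely by retrieval score.
    leftovers = sorted(
        (d for c in rest for d in c), key=lambda d: scores[d], reverse=True
    )

    # dict.fromkeys drops repeats while keeping order, in case a
    # document was assigned to more than one cluster.
    return list(dict.fromkeys(picked + leftovers))
```

Setting `t` to 0 falls back to the plain retrieval ranking, while a large `t` diversifies over every cluster, including low-quality ones.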

  6. Result Diversification with Cluster Ranking • Similar documents are in the same group • documents with the same subtopic • A relatively small number of clusters contain the actually relevant documents • For each query • the search result list is constructed by MRF • top-ranked documents are clustered • clusters are ranked by their relevance to the query, in decreasing order • a new search result list is composed from the high-quality clusters
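
The cluster-ranking step can be illustrated with a query likelihood ranker that treats each cluster as one large document. This is a sketch, not the authors' implementation: the Jelinek-Mercer smoothing, the weight `lam`, and the function name are assumptions, and documents are assumed to be pre-tokenized.

```python
import math
from collections import Counter
from typing import Dict, List


def rank_clusters_by_query_likelihood(
    query: List[str],
    clusters: Dict[str, List[List[str]]],
    lam: float = 0.5,
) -> List[str]:
    """Return cluster ids sorted by log P(query | cluster), best first.

    clusters maps a cluster id to its documents, each a list of tokens.
    Smoothing (an assumption): p = (1 - lam) * p_cluster + lam * p_collection.
    """
    # Background model built from every token in every cluster.
    collection = Counter(
        tok for docs in clusters.values() for doc in docs for tok in doc
    )
    collection_len = sum(collection.values())

    def log_likelihood(docs: List[List[str]]) -> float:
        tf = Counter(tok for doc in docs for tok in doc)
        length = sum(tf.values())
        total = 0.0
        for term in query:
            p_cluster = tf[term] / length if length else 0.0
            p_coll = collection[term] / collection_len if collection_len else 0.0
            p = (1.0 - lam) * p_cluster + lam * p_coll
            if p == 0.0:
                # Term unseen anywhere: the query cannot be generated.
                return float("-inf")
            total += math.log(p)
        return total

    return sorted(
        clusters, key=lambda cid: log_likelihood(clusters[cid]), reverse=True
    )
```

The ranked cluster ids can then be cut at the top T clusters before a diversification method composes the new result list.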

  7. Figure 2. Diversification with Cluster Ranking

  8. Experiments • Four research questions are investigated: • What is the impact of diversification with cluster ranking on the effectiveness of existing result diversification methods? • What are the impacts of the cluster ranker and the value of T? • How sensitive is the performance of the framework to the number of documents selected? • What conditions should clusters fulfill to be effective clusters for cluster ranking?

  9. Experiments: Question 1 • How much performance is gained by employing query specific clustering and applying result diversification to the retrieved documents? • query likelihood cluster ranker • automatically determined T (leave-one-out cross validation) • the higher the T, the lower the performance of FM-LDA • cluster ranking has a positive effect on performance for every method except IA-Select
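
Determining T by leave-one-out cross validation can be sketched as follows: for each held-out query, pick the T that performs best on average over all other queries. The data layout (`metric[query][t]`) and the function name are assumptions for illustration; the slides do not say which effectiveness measure is used.

```python
from typing import Dict


def leave_one_out_t(metric: Dict[str, Dict[int, float]]) -> Dict[str, int]:
    """Choose T for each query using only the other queries.

    metric[q][t] is the effectiveness score of query q when the top-t
    clusters are used. Every query is assumed to have a score for the
    same set of candidate T values.
    """
    if len(metric) < 2:
        raise ValueError("leave-one-out needs at least two queries")

    chosen: Dict[str, int] = {}
    for held_out in metric:
        train = [q for q in metric if q != held_out]
        candidates = sorted(metric[held_out])

        def mean_score(t: int) -> float:
            # Average effectiveness of this T over the training queries.
            return sum(metric[q][t] for q in train) / len(train)

        # The held-out query's own scores never influence its T.
        chosen[held_out] = max(candidates, key=mean_score)
    return chosen
```

The held-out query is then evaluated with its chosen T, so the reported performance does not depend on tuning T on the query being tested.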

  10. Experiments: Question 2 • What are the impacts of the cluster ranker and the value of T? Figure 3. Query likelihood ranker Figure 4. Oracle Cluster Ranking

  11. Experiments: Question 3 • How sensitive is the performance of the framework to the number of documents selected? Table 1. The effect of search result lists’ length

  12. Experiments: Question 3 (cont’d) Table 2. The effect of search result lists’ length

  13. Experiments: Question 4 • What conditions should clusters fulfill to be effective clusters for cluster ranking? • Diversified results should come from a small number of high-quality clusters Figure 5. Accumulated precision scores for hierarchical (above) and LDA clusters (below)

  14. Conclusion • Takes advantage of cluster-based retrieval for diversification • Aims to increase diversity while preserving early precision • Evaluation shows that the technique is effective and applicable • Worth investigating further with rigorous learning algorithms for the parameters
