
Web Information retrieval (Web IR)

Web Information retrieval (Web IR). Handout #13: Ranking based on User Behavior. Ali Mohammad Zareh Bidoki ECE Department, Yazd University alizareh@yaduni.ac.ir. Finding Ranking Function. R=f( Query, User behavior , web graph & content features) How can we use the user behavior?


Presentation Transcript


  1. Web Information Retrieval (Web IR) • Handout #13: Ranking Based on User Behavior • Ali Mohammad Zareh Bidoki, ECE Department, Yazd University • alizareh@yaduni.ac.ir

  2. Finding Ranking Function • R = f(query, user behavior, web graph & content features) • How can we use user behavior? • Explicit • Implicit • 80% of user clicks are related to the query • Click-through data • From search engine logs

  3. Click-through data • Triple (q, r, c) • q = query • r = ranked list • c = set of clicked docs • Click-through data (by Joachims)
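
A click-through triple of this kind can be held in a small record type. The sketch below is illustrative only: the class name `ClickThrough`, its field names, the helper `clicked_ranks`, and the sample query and documents are assumptions, not part of the handout.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class ClickThrough:
    """One search-log record (q, r, c). All names here are hypothetical."""
    query: str                 # q: the query string
    ranking: Tuple[str, ...]   # r: ranked list shown to the user
    clicked: FrozenSet[str]    # c: set of clicked docs

    def clicked_ranks(self) -> List[int]:
        """1-based positions of the clicked docs, in ranked order."""
        return [pos for pos, doc in enumerate(self.ranking, start=1)
                if doc in self.clicked]


# Made-up example record
record = ClickThrough(
    query="web ir",
    ranking=("d1", "d2", "d3"),
    clicked=frozenset({"d1", "d3"}),
)
print(record.clicked_ranks())
```

Keeping the full ranked list next to the clicked set preserves the position at which each click happened, which the clicked set alone would lose.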

  4. Benefits of Using Click-through Data • Democracy on the Web • Fills the gap between user needs and results • User clicks are more valuable than page content (search engine precision is judged by users, not page creators) • The degree of relevance between queries and documents increases (by adding click metadata to documents)

  5. [Figure: Web graph and Web entities (Docs, Words, Queries, Users)]

  6. Document Expansion Using Click-through Data • Google was the first to use anchor text as document content • Anchor text is a view of a document from another document

  7. Long-term incremental learning • Di is the vector of a document in the ith iteration • Q is the vector of the query for which this document was clicked • Alpha is the learning rate
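
The update formula on this slide did not survive in the transcript. One form consistent with the three definitions is an additive update, D(i+1) = D(i) + alpha * Q. That form is an assumption, not the handout's formula, and the function name `update_document` and the sample vectors are hypothetical.

```python
def update_document(doc_vec, query_vec, alpha):
    """Assumed update D_{i+1} = D_i + alpha * Q on sparse term->weight dicts.

    This is one plausible reading of the slide, not its actual formula.
    """
    updated = dict(doc_vec)  # copy so the previous iteration is left intact
    for term, weight in query_vec.items():
        updated[term] = updated.get(term, 0.0) + alpha * weight
    return updated


# Made-up vectors: the document is clicked for a query containing "web" and "click"
doc = {"ranking": 1.0, "web": 0.5}
query = {"web": 1.0, "click": 1.0}
doc_next = update_document(doc, query, alpha=0.1)
print(doc_next)
```

With a small alpha, each click moves the document vector only slightly toward the query, so the representation changes gradually as the log grows.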

  8. Naïve Method (NM) • A bipartite graph for docs and queries • Mij is the number of clicks on document j for query i

  9. Naïve Method (Cont.) • The weight between query qj and document di: • The metadata for document i is:
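
The weight and metadata formulas on this slide are missing from the transcript. The sketch below builds the click-count matrix M from slide 8 and then assumes a simple weighting, the share of a document's clicks that came from each query. That weighting, the function names, and the sample log are assumptions for illustration.

```python
from collections import defaultdict


def click_matrix(log):
    """M[q][d] = number of clicks on document d for query q."""
    M = defaultdict(lambda: defaultdict(int))
    for query, doc in log:
        M[query][doc] += 1
    return M


def document_metadata(M):
    """Assumed weighting: clicks(q, d) / total clicks on d.

    Returns, per document, the queries attached to it with their weights.
    """
    totals = defaultdict(int)
    for row in M.values():
        for doc, n in row.items():
            totals[doc] += n
    meta = defaultdict(dict)
    for query, row in M.items():
        for doc, n in row.items():
            meta[doc][query] = n / totals[doc]
    return meta


# Made-up log of (query, clicked document) pairs
log = [("web ir", "d1"), ("web ir", "d1"), ("ranking", "d1"), ("ranking", "d2")]
meta = document_metadata(click_matrix(log))
print(dict(meta))
```

In this toy log, d1 receives two of its three clicks from "web ir", so that query dominates its metadata.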

  10. Co-Visited Method • If two pages are clicked for the same query, they are called co-visited. • The similarity between two docs i and j is (visited(di) is the number of clicks on di, and visited(di,dj) is the number of queries for which both are clicked):
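
The similarity formula itself is missing from the transcript. A Jaccard-style ratio, visited(di,dj) / (visited(di) + visited(dj) - visited(di,dj)), is one form consistent with the definitions above; it is an assumption here. To keep the ratio between 0 and 1, the sketch also counts visited(di) as the number of queries for which di was clicked, which is a simplification of the slide's wording. The function name and sample data are hypothetical.

```python
def covisited_similarity(clicks_by_query, di, dj):
    """Assumed Jaccard-style co-visit similarity between two documents.

    clicks_by_query maps each query to the set of documents clicked for it.
    """
    queries_i = {q for q, docs in clicks_by_query.items() if di in docs}
    queries_j = {q for q, docs in clicks_by_query.items() if dj in docs}
    both = len(queries_i & queries_j)                  # visited(di, dj)
    union = len(queries_i) + len(queries_j) - both
    return both / union if union else 0.0


# Made-up click data
clicks_by_query = {
    "q1": {"d1", "d2"},
    "q2": {"d1"},
    "q3": {"d2", "d3"},
}
sim_12 = covisited_similarity(clicks_by_query, "d1", "d2")
sim_13 = covisited_similarity(clicks_by_query, "d1", "d3")
print(sim_12, sim_13)
```

Here d1 and d3 never share a query, so their similarity is zero even though both are co-visited with d2.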

  11. Co-Visited Disadvantages • It considers only document similarity (not query similarity) • Because users click on the top 10 pages, click data are sparse (1.5 queries per page) • So the similarity is not precise

  12. Iterative Method (IM) • O(q): the set of pages clicked for q • Oi(q): the ith page clicked for q • I(d): the set of queries for which d is clicked • Ii(d): the ith query for which d is clicked
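
The update equations for the iterative method are not in the transcript. The notation O(q) and I(d) suggests a mutual reinforcement between query similarity and document similarity, so the sketch below uses a SimRank-style iteration as a stand-in. The update rule, the `decay` constant, the iteration count, and all names are assumptions, not the handout's algorithm.

```python
from itertools import product


def iterative_similarity(clicks, decay=0.8, iterations=5):
    """SimRank-style sketch (an assumed stand-in for the slide's method).

    clicks maps each query q to its clicked pages O(q).
    Returns (query_sim, doc_sim), dicts keyed by ordered pairs.
    """
    O = {q: sorted(docs) for q, docs in clicks.items() if docs}
    I = {}  # I(d): queries for which d is clicked
    for q, docs in O.items():
        for d in docs:
            I.setdefault(d, []).append(q)
    queries, docs = list(O), list(I)
    sq = {(a, b): float(a == b) for a, b in product(queries, repeat=2)}
    sd = {(a, b): float(a == b) for a, b in product(docs, repeat=2)}
    for _ in range(iterations):
        new_sq, new_sd = {}, {}
        for a, b in product(queries, repeat=2):
            # Queries are similar when their clicked pages are similar
            total = sum(sd[x, y] for x in O[a] for y in O[b])
            new_sq[a, b] = 1.0 if a == b else decay * total / (len(O[a]) * len(O[b]))
        for a, b in product(docs, repeat=2):
            # Pages are similar when the queries that reach them are similar
            total = sum(sq[x, y] for x in I[a] for y in I[b])
            new_sd[a, b] = 1.0 if a == b else decay * total / (len(I[a]) * len(I[b]))
        sq, sd = new_sq, new_sd
    return sq, sd


# Made-up click data: d1 and d3 share no query but are linked through d2
clicks = {"q1": {"d1", "d2"}, "q2": {"d2", "d3"}, "q3": {"d4"}}
query_sim, doc_sim = iterative_similarity(clicks)
print(doc_sim["d1", "d3"], doc_sim["d1", "d4"])
```

In this toy data d1 and d3 are never clicked for the same query, so a pure co-visit count gives them zero similarity, while the iteration assigns them a positive score through the similar queries q1 and q2. The unrelated page d4 stays at zero.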

  13. Experimental Results • Experiments on a large real query click-through log (MSN query log data) indicate that the proposed algorithm achieves a relative improvement over • the baseline search system of 157%, • naïve query log mining of 17%, and • the co-visited algorithm of 17% • on top-20 precision, respectively.
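
For reference, top-k precision and relative improvement can be computed as in the sketch below. The rankings and relevance judgments are made up for illustration and are unrelated to the MSN log results quoted above; the function names are hypothetical.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked documents that are relevant."""
    return sum(1 for doc in ranked[:k] if doc in relevant) / k


def relative_improvement(new, base):
    """Relative gain of `new` over `base`; 1.0 means a 100% improvement."""
    return (new - base) / base


# Toy data with k=4 (the handout reports precision at k=20)
relevant = {"d1", "d2"}
baseline_ranking = ["d5", "d1", "d6", "d7"]
improved_ranking = ["d1", "d2", "d6", "d7"]

p_base = precision_at_k(baseline_ranking, relevant, k=4)
p_new = precision_at_k(improved_ranking, relevant, k=4)
print(p_base, p_new, relative_improvement(p_new, p_base))
```

A relative improvement of 157% therefore means the new precision is about 2.57 times the baseline precision, not 157 percentage points higher.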
