1 / 29

Accelerating the Pagerank Algorithm

Accelerating the Pagerank Algorithm. M. Campbell Missouri State University REU. The Information Retrieval Problem. Actually two…. Given a finite set of Documents D and a query q… I. Which elements of D are relevant to q? II. Of the relevant documents, which are most relevant?.

yanni
Download Presentation

Accelerating the Pagerank Algorithm

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Accelerating the Pagerank Algorithm M. Campbell Missouri State University REU

  2. The Information Retrieval Problem

  3. Actually two… Given a finite set of Documents D and a query q… I. Which elements of D are relevant to q? II. Of the relevant documents, which are most relevant?

  4. Exploiting the Structure of the document set Scholarly papers: Papers cite other papers. Papers which are cited the most are likely to be very important in their field. Additionally, the papers cited by important papers gain in relative importance.

  5. What about the internet? The internet has a similar structure due to hyperlinking. Pages which are very important get linked to by many pages, and pages linked to by important pages will likely be deemed to be more important than others.

  6. Looking abstractly at the link structure of the web

  7. The Pagerank Equation

  8. The Iterative Pagerank Equation

  9. Determining the Pagerank Vector by the Power Method This is the power method, where we are computing the eigenvector of H associated to the eigenvalue of 1

  10. Fixing the Link matrix to ensure the pagerank vector exists I. Dangling nodes

  11. Fixing The link matrix II. Reducibility (dangling webs) U is a probabilistic (entries add to one) “personalization” vector

  12. The Google matrix

  13. An alternate Method

  14. The linear system Letting It has been shown that v = rx for some scalar r where x is the solution of the system

  15. Options for solving the system There are many options for solving this system. I focused on three. I. Jacobi II. Gauss-Seidel III. Successive Over Relaxation(SOR) But first we study reorderings of the matrix to make it “nice” for the solver

  16. Stanford.edu web

  17. stanford.edu/berkley.edu web

  18. Stanford Reordered by descending outdegree

  19. SB Reordered by descending outdegree

  20. Stanford Reordered by descending indegree

  21. SB Reordered by descending indegree

  22. Reverse Cuthill Mckee

  23. The Breadth first search

  24. BFS reorder on Stanford web

  25. BFS on Stanford/Berkley

  26. The dangling node/BFS reordering

  27. Solving the BFS/Dangling system {

  28. Comparative Results

  29. Further studies • Preconditioning • Optimal implementation of • Gauss-Siedel/SOR Algorithm • III. Markov Chain Updating Problem with • Linear Solving • IV. Using Kendall-tau measure for • convergence criterion.

More Related