
The $25 Billion Eigenvector


dasan

Presentation Transcript


  1. The $25 Billion Eigenvector: How does Google do PageRank?

  2. The Imaginary Web Surfer:
     • Starts at any page,
     • Randomly goes to a page linked from the current page,
     • Randomly goes to any web page from a dangling page,
     • … except sometimes (e.g. 15% of the time), goes to a purely random page.
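The surfer's rules can be tried out directly with a small simulation. The sketch below is illustrative only: the ten-page link table is an assumption read off the nonzero pattern of the tiny-web matrix later in the deck (pages numbered 1 to 10), and the function name `surf` and its parameters are hypothetical.

```python
import random

# Assumed link table for a ten-page web: page -> pages it links to.
links = {
    1: [2, 5], 2: [3, 4, 6], 3: [4], 4: [5], 5: [6],
    6: [5], 7: [1, 10], 8: [7], 9: [8], 10: [1, 8, 9],
}

def surf(links, n, steps, jump_prob=0.15, seed=0):
    """Simulate the random surfer and return the fraction of visits per page."""
    rng = random.Random(seed)
    visits = [0] * n
    page = rng.randint(1, n)              # start at any page
    for _ in range(steps):
        visits[page - 1] += 1
        targets = links.get(page, [])
        if not targets or rng.random() < jump_prob:
            page = rng.randint(1, n)      # dangling page or random jump
        else:
            page = rng.choice(targets)    # follow a random outgoing link
    return [v / steps for v in visits]

freq = surf(links, 10, 100_000)
for page, f in enumerate(freq, start=1):
    print(page, round(f, 3))
```

Over a long walk the visit frequencies should settle near the PageRank values, since the walk is a Markov chain whose stationary distribution is the eigenvector the deck is about.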

  3. A tiny web: who should get the highest rank? [Figure: a directed graph of ten pages labeled A through J.]

  4. The associated stochastic matrix:
     0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.4400 0.0150 0.0150 0.2983
     0.4400 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150
     0.0150 0.2983 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150
     0.0150 0.2983 0.8650 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150
     0.4400 0.0150 0.0150 0.8650 0.0150 0.8650 0.0150 0.0150 0.0150 0.0150
     0.0150 0.2983 0.0150 0.0150 0.8650 0.0150 0.0150 0.0150 0.0150 0.0150
     0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.8650 0.0150 0.0150
     0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.8650 0.2983
     0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.2983
     0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.4400 0.0150 0.0150 0.0150
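A matrix like this can be rebuilt from a link table. The following is a minimal sketch, assuming the outlinks are those implied by the nonzero pattern of the matrix (pages numbered 1 to 10) and a follow-a-link probability of 0.85; the helper name `google_matrix` is hypothetical.

```python
# Assumed outlinks, read off the matrix: column j holds the links out of page j.
links = {
    1: [2, 5], 2: [3, 4, 6], 3: [4], 4: [5], 5: [6],
    6: [5], 7: [1, 10], 8: [7], 9: [8], 10: [1, 8, 9],
}

def google_matrix(links, n, p=0.85):
    """Return the n-by-n column-stochastic matrix as a list of rows."""
    base = (1.0 - p) / n                  # random-jump share, 0.015 for n = 10
    A = [[base] * n for _ in range(n)]
    for j in range(1, n + 1):
        targets = links.get(j, [])
        if targets:
            for i in targets:             # split p evenly over the outlinks
                A[i - 1][j - 1] += p / len(targets)
        else:
            for i in range(n):            # dangling page: uniform column
                A[i][j - 1] = 1.0 / n
    return A

A = google_matrix(links, 10)
col_sums = [sum(A[i][j] for i in range(10)) for j in range(10)]
print(round(A[1][0], 4), round(A[2][1], 4), round(A[3][2], 4))
print([round(s, 6) for s in col_sums])
```

A page with one outlink contributes 0.015 + 0.85 = 0.865, with two outlinks 0.015 + 0.425 = 0.44, and with three about 0.2983, which are the three distinct nonzero values in the slide. Every column sums to 1, which is what makes the matrix stochastic.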

  5. Start with equal components

  6. One iteration

  7. Two iterations

  8. Three iterations

  9. Four iterations

  10. Five iterations

  11. Six iterations

  12. Seven iterations

  13. Eight iterations

  14. Nine iterations

  15. Ten iterations

  16. The Eigenvector
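The sequence of slides from equal components to the eigenvector is the power method. Here is a hedged sketch of that iteration on a ten-page example; the link table is an assumption taken from the nonzero pattern of the stochastic matrix, and the tolerance, step limit, and function name are illustrative choices rather than anything stated in the deck.

```python
# Assumed link table (page -> outlinks) for the ten-page example.
links = {
    1: [2, 5], 2: [3, 4, 6], 3: [4], 4: [5], 5: [6],
    6: [5], 7: [1, 10], 8: [7], 9: [8], 10: [1, 8, 9],
}
n, p = 10, 0.85

# Dense column-stochastic matrix: (1 - p)/n everywhere plus p/outdegree on links.
A = [[(1.0 - p) / n] * n for _ in range(n)]
for j, targets in links.items():
    for i in targets:
        A[i - 1][j - 1] += p / len(targets)

def power_iteration(A, tol=1e-10, max_steps=1000):
    """Repeat x <- A x from equal components until x stops changing."""
    n = len(A)
    x = [1.0 / n] * n
    for step in range(1, max_steps + 1):
        y = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        total = sum(y)
        y = [v / total for v in y]        # guard against rounding drift
        change = sum(abs(a - b) for a, b in zip(x, y))
        x = y
        if change < tol:
            break
    return x, step

rank, steps = power_iteration(A)
for page, r in sorted(enumerate(rank, start=1), key=lambda t: -t[1]):
    print(page, round(r, 4))
```

Because every entry of the matrix is positive, the iteration converges to a unique positive vector that sums to 1, regardless of the starting vector.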

  17. The Imaginary Web Surfer:
     • Starts at any page,
     • Randomly goes to a page linked from the current page,
     • Randomly goes to any web page from a dangling page,
     • … except sometimes (e.g. 15% of the time), goes to a purely random page.

  18. [U,G] = surfer('http://google.com', 100)

  19. PageRank Power Iteration: 1 step

  20. PageRank Power Iteration: 2 steps

  21. PageRank Power Iteration: 3 steps

  22. PageRank Power Iteration: 4 steps

  23. PageRank Power Iteration: 5 steps

  24. PageRank Power Iteration: the limit

  25. And the winners are…
     'http://www.loc.gov/standards/iso639-2'
     'http://www.sil.org/iso639-3'
     'http://www.loc.gov/standards/iso639-5'
     'http://purl.org/dc/elements/1.1'
     'http://purl.org/dc/terms'
     'http://purl.org/dc'
     'http://creativecommons.org/licenses/by/3.0'
     'http://i.creativecommons.org/l/by/3.0/88x31.png'
     'http://www.nlb.gov.sg'
     'http://purl.org/dcpapers'
     'http://www.nl.go.kr'
     'http://purl.org/dcregistry'
     'http://www.kc.tsukuba.ac.jp/index_en.html'

  26. How much storage to hold this array?
     • Current estimate of indexed WWW: 4.7 · 10^10 web pages
     • If placed into an array, this would have 2.21 · 10^21 elements
     • If each element is stored in 4 bytes, this would be 8.8 · 10^21 bytes
     • Current estimate of the world's data storage capacity is 3.0 · 10^18 bytes (about 0.03% of the necessary space)
     Source: http://www.smartplanet.com/blog/thinking-tech/what-is-the-worlds-data-storage-capacity/6256
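The storage estimate is plain arithmetic and can be checked in a few lines. This sketch uses only the page count, the 4-byte element size, and the capacity figure quoted on the slide; the variable names are illustrative.

```python
pages = 4.7e10                  # estimated number of indexed web pages
elements = pages ** 2           # a dense n-by-n array has n^2 entries
bytes_needed = 4 * elements     # 4 bytes per entry
world_capacity = 3.0e18         # quoted estimate of world storage, in bytes

print(f"elements:     {elements:.3e}")
print(f"bytes needed: {bytes_needed:.3e}")
print(f"capacity / need: {world_capacity / bytes_needed:.2e}")
```

The dense array is out of reach by several orders of magnitude, which is the motivation for storing only the links.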

  27. How much time to do one power step?
     • Current estimate of indexed WWW: 4.7 · 10^10 web pages
     • If placed into an array, this would have 2.21 · 10^21 elements
     • Fastest current machine does 33.86 · 10^15 operations per second
     • One step of y = Ax takes 3.68 days

  28. How is x_{k+1} = A x_k performed? [Figure: the tiny web of pages A through J.]
     connection = [2 5 3 4 6 4 5 6 5 1 10 7 8 1 8 9]
     end = [2 5 6 7 8 9 11 12 13 16]

  29. How is x_{k+1} = A x_k performed?
     • x_{k+1} = (.15/n) e (where e is all 1's)
     • start = 1
     • for j = 1, …, n
        a) col_tot = end_j - start + 1
        b) for i = start, …, end_j
             ii = connection_i
             x_{k+1}(ii) = x_{k+1}(ii) + (.85/col_tot) · x_k(j)
        c) start = end_j + 1
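The loop above can be sketched in Python as follows. This is one reading of the slide, not a definitive implementation. It assumes the `connection` and `end` arrays of the ten-page example with the fused entries separated, so that `connection` lists the outlinks of page 1, then page 2, and so on, and `end[j]` is the 1-based position of the last outlink of page j. It also assumes that x sums to 1. The handling of pages with no outlinks is an addition that does not appear on the slide.

```python
# Link arrays for the ten-page example (1-based page numbers and positions).
connection = [2, 5, 3, 4, 6, 4, 5, 6, 5, 1, 10, 7, 8, 1, 8, 9]
end = [2, 5, 6, 7, 8, 9, 11, 12, 13, 16]

def power_step(x, connection, end, p=0.85):
    """One step y = A x using only the stored links (x must sum to 1)."""
    n = len(x)
    y = [(1.0 - p) / n] * n               # random-jump term, (.15/n) e
    start = 1
    for j in range(n):
        col_tot = end[j] - start + 1      # number of outlinks of page j+1
        if col_tot == 0:
            for i in range(n):            # assumed rule for dangling pages
                y[i] += p * x[j] / n
        else:
            for k in range(start, end[j] + 1):
                ii = connection[k - 1]    # target page of this link
                y[ii - 1] += p / col_tot * x[j]
        start = end[j] + 1
    return y

x = [0.1] * 10
y = power_step(x, connection, end)
print([round(v, 4) for v in y])
```

Each step touches every stored link once, so the cost and the storage scale with the number of links rather than with n squared.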
