Link Building

Link Building. Martin Olsen, Department of Computer Science, Aarhus University. Outline: Motivation and Introduction; Contribution; Link Building; Communities in Networks; Hedonic Games; Simple Games. What is Search Engine Optimization (SEO)?

Presentation Transcript


  1. Link Building. Martin Olsen, Department of Computer Science, Aarhus University

  2. Outline • Motivation and Introduction • Contribution • Link Building • Communities in Networks • Hedonic Games • Simple Games

  3. What is Search Engine Optimization (SEO)? • "... in 2012, companies will spend almost $9 billion on search engine optimization …" (The New York Times, January 2009) • Objective of SEO: A link to your page appears on page 1 of the search results

  4. WWW as a Graph

  5. PageRank. Random Surfer Perspective • 1000 random surfers • The random surfer zaps with probability 0.15 • [Figure: graph with nodes 1 to 10, each holding 100 surfers]

  6. PageRank. Random Surfer Perspective • 1000 random surfers • The random surfer zaps with probability 0.15 • Distribution after one tick • [Figure: graph with nodes 1 to 10; surfer counts shown: 143 = 85 + 85/2 + 15, 355 = 4 × 85 + 15, 270, 100, 58, and 15 on five nodes]
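
The one-tick arithmetic on this slide can be replayed with a short sketch. The edge list below is an assumption: the figure is lost in the transcript, and `OUT_LINKS` is inferred from the surfer counts on the slides rather than given by the author. The function name `tick` is likewise illustrative.

```python
# Edge list inferred from the numbers on the slides (an assumption, not from the source).
OUT_LINKS = {
    1: [4], 2: [1], 3: [1, 6], 4: [2], 5: [2],
    6: [2], 7: [3], 8: [3], 9: [3], 10: [3],
}
ZAP = 0.15  # probability that a surfer zaps to a uniformly random node


def tick(surfers, out_links=OUT_LINKS, zap=ZAP):
    """Move every surfer one step and return the new count per node."""
    n = len(out_links)
    total = sum(surfers.values())
    # Zapping surfers are spread uniformly over all nodes.
    nxt = {v: zap * total / n for v in out_links}
    # The remaining surfers follow an out-link chosen uniformly at random.
    for u, targets in out_links.items():
        share = (1 - zap) * surfers[u] / len(targets)
        for v in targets:
            nxt[v] += share
    return nxt


start = {v: 100.0 for v in OUT_LINKS}   # 1000 surfers, 100 per node
after_one = tick(start)
for node in sorted(after_one):
    print(node, round(after_one[node], 1))
```

Under this assumed edge list, node 3 receives 4 × 85 surfers from its four in-neighbours plus 15 zappers, and node 1 receives 85 + 85/2 + 15, matching the slide up to rounding.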

  7. PageRank. Random Surfer Perspective • 1000 random surfers • The random surfer zaps with probability 0.15 • Stationary distribution after 50 ticks • [Figure: surfers per node: node 1: 281, node 2: 280, node 3: 66, node 4: 254, node 5: 15, node 6: 43, nodes 7 to 10: 15 each]

  8. PageRank. Random Surfer Perspective • The random surfer zaps with probability 0.15 • [Figure: PageRank per node: node 1: 0.281, node 2: 0.280, node 3: 0.066, node 4: 0.254, node 5: 0.015, node 6: 0.043, nodes 7 to 10: 0.015 each]

  9. PageRank. Random Surfer Perspective • The random surfer zaps with probability 0.15 • [Figure: PageRank per node: node 1: 0.281, node 2: 0.280, node 3: 0.066, node 4: 0.254, node 5: 0.015, node 6: 0.043, nodes 7 to 10: 0.015 each] • PageRank ranking: 1, 2, 4, 3, 6 • PageRank is an important ingredient of the ranking mechanism • Relevance counts as well!
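
A minimal sketch of how the stationary values and the ranking on this slide could be computed by repeated ticking (power iteration). The edge list `OUT_LINKS` is an assumption inferred from the numbers on the slides, since the figure itself is lost; `pagerank` is an illustrative name, and the sketch assumes every node has at least one out-link.

```python
# Edge list inferred from the numbers on the slides (an assumption, not from the source).
OUT_LINKS = {
    1: [4], 2: [1], 3: [1, 6], 4: [2], 5: [2],
    6: [2], 7: [3], 8: [3], 9: [3], 10: [3],
}


def pagerank(out_links, zap=0.15, ticks=200):
    """Power iteration; assumes every node has at least one out-link."""
    n = len(out_links)
    rank = {v: 1.0 / n for v in out_links}
    for _ in range(ticks):
        nxt = {v: zap / n for v in out_links}          # zapping mass
        for u, targets in out_links.items():
            share = (1 - zap) * rank[u] / len(targets)  # mass following links
            for v in targets:
                nxt[v] += share
        rank = nxt
    return rank


rank = pagerank(OUT_LINKS)
ranking = sorted(rank, key=rank.get, reverse=True)
print({v: round(p, 3) for v, p in rank.items()})
print(ranking[:5])  # [1, 2, 4, 3, 6]
```

With the assumed edges, the top five nodes agree with the ranking on the slide; the nodes without in-links all keep the zapping share 0.15/10 = 0.015.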

  10. Link Building is an Important Aspect of SEO

  11. Contribution/Link Building • The Computational Complexity of Link Building (COCOON ´08), Olsen • Maximizing PageRank with new Backlinks (submitted), Olsen • MILP for Link Building (in preparation), Olsen, Viglas

  12. The Link Building Problem. Formal Definition • LINK BUILDING • Instance: G(V, E), t ∈ V, k ∈ Z+ • Solution: S ⊆ V \ {t} with |S| = k maximizing π_t after adding S × {t} to E
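
To make the definition concrete, here is a brute-force sketch that tries every set S of k source nodes and keeps the one giving the target t the highest PageRank. It is not the author's method: the names `pagerank`, `link_building` and the five-node `GRAPH` are hypothetical, and the running time grows like the number of k-subsets, so it only suits tiny instances.

```python
from itertools import combinations


def pagerank(out_links, zap=0.15, ticks=200):
    """Power iteration; assumes every node has at least one out-link."""
    n = len(out_links)
    rank = {v: 1.0 / n for v in out_links}
    for _ in range(ticks):
        nxt = {v: zap / n for v in out_links}
        for u, targets in out_links.items():
            share = (1 - zap) * rank[u] / len(targets)
            for v in targets:
                nxt[v] += share
        rank = nxt
    return rank


def add_links(out_links, sources, t):
    """Return a copy of the graph with the edges sources x {t} added."""
    modified = {u: set(vs) for u, vs in out_links.items()}
    for s in sources:
        modified[s].add(t)
    return modified


def link_building(out_links, t, k, zap=0.15):
    """Exhaustively search all S within V minus {t} with |S| = k."""
    candidates = [v for v in out_links if v != t]
    best_set, best_score = None, -1.0
    for sources in combinations(candidates, k):
        score = pagerank(add_links(out_links, sources, t), zap)[t]
        if score > best_score:
            best_set, best_score = set(sources), score
    return best_set, best_score


# Hypothetical toy graph: node -> set of nodes it links to.
GRAPH = {1: {2}, 2: {3}, 3: {1}, 4: {1}, 5: {1}}
best_set, best_score = link_building(GRAPH, t=2, k=2)
print(best_set, round(best_score, 3))
```

The exhaustive loop is only a reference point; the later slides discuss why an efficient exact algorithm is unlikely.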

  13. Link Building is not Trivial • [Figure: three copies of a graph on nodes 1 to 8, each annotated with PageRank values]

  14. PageRank Topology • Theorem *) • z_up: The expected number of visits to p for a random surfer starting at u prior to the first zapping event • [Figure: nodes i, j and 1; a new link implies an increase in PageRank]
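
The quantity described on this slide, the expected number of visits to p for a surfer starting at u before the first zap, can be computed numerically. The sketch below is an illustration under stated assumptions, not the slide's theorem: it stores the values in a nested dict `z[u][p]`, counts the visit at time 0, assumes every node has an out-link, and uses a hypothetical three-node `GRAPH`. It also uses the standard identity that the PageRank of p equals (zap / n) times the sum of `z[u][p]` over all u.

```python
def expected_visits(out_links, zap=0.15, rounds=200):
    """z[u][p]: expected visits to p from u before the first zap (time 0 included)."""
    nodes = list(out_links)
    z = {u: {p: 1.0 if u == p else 0.0 for p in nodes} for u in nodes}
    for _ in range(rounds):
        new = {}
        for u in nodes:
            # One visit if we start at p, plus the visits after a non-zapping step.
            row = {p: 1.0 if u == p else 0.0 for p in nodes}
            weight = (1 - zap) / len(out_links[u])
            for v in out_links[u]:
                for p in nodes:
                    row[p] += weight * z[v][p]
            new[u] = row
        z = new
    return z


def pagerank_from_visits(z, zap=0.15):
    """PageRank of p as (zap / n) * sum over start nodes u of z[u][p]."""
    n = len(z)
    return {p: zap / n * sum(z[u][p] for u in z) for p in z}


GRAPH = {1: [2, 3], 2: [3], 3: [1]}   # hypothetical toy graph
z = expected_visits(GRAPH)
pi = pagerank_from_visits(z)
print({p: round(x, 3) for p, x in pi.items()})
```

Each row of `z` sums to 1 / zap, the expected number of steps a surfer takes before zapping.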

  15. k-REGULAR INDEPENDENT SET ≤FPT LINK BUILDING • Does the graph contain an independent set of size k? • Can we turn this question into a Link Building problem? • [Figure: nodes i and j]

  16. k-REGULAR INDEPENDENT SET ≤FPT LINK BUILDING • Basic idea: Make z_ij relatively big • [Figure: gadget with nodes i, j, x, y and 1; the optimal solution is marked OPT]

  17. k-REGULAR INDEPENDENT SET ≤FPT LINK BUILDING • LINK BUILDING is W[1]-hard *): LINK BUILDING solvable in time f(k) · n^c ⇒ k-REGULAR INDEPENDENT SET solvable in time f(k) · n^c ⇒ W[1] = FPT • Another result: FPTAS for LINK BUILDING ⇒ NP = P • Basic idea: Make z_ij relatively big • [Figure: gadget with nodes i, j, x, y and 1; the optimal solution is marked OPT]

  18. Upper Bound: k = 1 fixed • The dashed link can be found in time corresponding to O(1) PageRank computations with a randomized scheme *). • [Figure: two copies of a graph on nodes 1 to 8 annotated with PageRank values, one with a dashed link added]

  19. Upper Bound: Mixed Integer Linear Programming Approach *) • Price for link from i • Compute the cheapest set of new incoming links that would make node 5 rank highest • [Figure: graph on nodes 1 to 8 annotated with PageRank values]

  20. A Quiz: Which of the two situations would be optimal for Martin?

  21. Contribution/Communities in Networks Communities in Large Networks: Identification and Ranking (WAW ´06) Olsen

  22. Communities in Networks Dolphins in Doubtful Sound [Newman, Girvan ´04]:

  23. What is a Community? • Informally: A community C is a set of nodes with relatively many links between them • Assumption/Observation: A CS site has relatively many CS links! • Formal definition based on assumption *): ∀v ∉ C, ∀u ∈ C: w_vC ≤ w_uC

  24. A Greedy Approach for Detecting Members of a Community *) • Repeat until C is a Community: • Find v ∉ C with maximum attention to C • C ← C ∪ {v} • Update attentions • Use two priority queues holding elements in C and V \ C • [Figure: 1) Old C 2) New C]
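
The greedy loop above can be sketched in a few lines. Several things here are assumptions rather than the author's implementation: the attention of v to C is taken to be the fraction of v's out-links that point into C, the candidates are rescanned on every step instead of being kept in two priority queues, and the seven-node `GRAPH` and the function names are hypothetical.

```python
def attention(v, C, out_links):
    """Assumed definition: fraction of v's out-links that point into C."""
    targets = out_links[v]
    if not targets:
        return 0.0
    return sum(1 for x in targets if x in C) / len(targets)


def is_community(C, out_links):
    """Every outsider pays at most as much attention to C as every member."""
    outside = [v for v in out_links if v not in C]
    if not outside:
        return True
    max_out = max(attention(v, C, out_links) for v in outside)
    min_in = min(attention(u, C, out_links) for u in C)
    return max_out <= min_in


def greedy_community(representatives, out_links):
    """Grow C from the representatives until it satisfies the definition."""
    C = set(representatives)
    while not is_community(C, out_links):
        outside = [v for v in out_links if v not in C]
        best = max(outside, key=lambda v: attention(v, C, out_links))
        C.add(best)
    return C


# Hypothetical toy graph with two loosely connected clusters.
GRAPH = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"],
    "d": ["a", "c", "x"],
    "x": ["y", "z"], "y": ["x", "z"], "z": ["x", "y", "d"],
}
community = greedy_community({"a"}, GRAPH)
print(sorted(community))  # ['a', 'b', 'c']
```

Rescanning costs time proportional to the graph size per step; the priority queues named on the slide are what would keep the cost tied to the community and its surroundings.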

  25. An Experiment. A Danish CS Community • Crawl of the dk-domain with 180,468 sites in total • Representatives = 4 CS sites • CS-Community with 556 sites • Minimum attention (members of C): 15.8% • Maximum attention (non-members): 15.4% • Ranking: • www.daimi.au.dk (CS U Aarhus) • www.diku.dk (CS U Copenhagen) • www.itu.dk (ITU Copenhagen) • www.cs.auc.dk (CS U Aalborg) • www.brics.dk (CS PhD School) • www.imm.dtu.dk (Informatics/Mathematical modeling DTU Copenhagen) • … • www.imada.sdu.dk (CS/Mathematics U Southern Denmark)

  26. Other Results • Computing nontrivial communities by the definition given is NP-hard • A simple model for the evolution of communities is presented. These communities probably obey the definition for large n if the out-degree of the nodes is (log n).

  27. Contribution/Hedonic Games • Nash Stability in Additively Separable Hedonic Games Is NP-Hard (CiE ´07), Olsen • Extended version: Nash Stability in Additively Separable Hedonic Games and Community Structures (Theory of Computing Systems ´09), Olsen

  28. An Additively Separable Hedonic Game Two buffaloes b1 and b2 that hate each other. They are only thirsty if they have a parasite on their back in which case they have to drink 9 l/h. Two gigantic parasites p1 and p2. They only want to sit on b1 and b2 respectively. Five waterholes w1, …,w5 with capacities 1, 2, 3, 4 and 8 l/h respectively.

  29. An Additively Separable Hedonic Game One Nash Equilibrium for the game: PARTITION ≤ NE in ASHG NPC *)
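
One way to read the reduction from PARTITION: with both buffaloes thirsty at 9 l/h each, the five waterholes (capacities 1, 2, 3, 4 and 8 l/h) must be split into two groups of equal total capacity. The brute-force sketch below solves that small PARTITION instance; the function name `partition` is illustrative and the exhaustive search is only meant for tiny inputs.

```python
from itertools import combinations


def partition(values):
    """Split values into two groups with equal sums, or return None."""
    total = sum(values)
    if total % 2:
        return None
    half = total // 2
    items = list(enumerate(values))
    for size in range(1, len(items)):
        for combo in combinations(items, size):
            if sum(v for _, v in combo) == half:
                chosen = {i for i, _ in combo}
                left = [v for i, v in items if i in chosen]
                right = [v for i, v in items if i not in chosen]
                return left, right
    return None


waterholes = [1, 2, 3, 4, 8]      # capacities in l/h, from the slide
print(partition(waterholes))      # ([1, 8], [2, 3, 4])
```

Each group supplies exactly 9 l/h, which is what one thirsty buffalo needs.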

  30. Community Structures in Networks Put a 1 on each connection between two dolphins. The community structure is a NE! NE  community structure? NE’s are NP-hard to compute even with symmetric and positive payoffs*)
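
A minimal sketch of a Nash stability check for an additively separable hedonic game with symmetric payoffs, using payoff 1 on each connection as on this slide. The six-node friendship graph `EDGES` is hypothetical (the dolphin network is not reproduced here), and the function names are illustrative. A partition is treated as Nash stable when no player gains by moving alone to another coalition or to a new singleton coalition.

```python
def utility(player, coalition, payoff):
    """Sum of pairwise payoffs between player and the coalition's members."""
    return sum(payoff.get(frozenset((player, other)), 0)
               for other in coalition if other != player)


def is_nash_stable(partition, payoff):
    """True if no single player can strictly gain by switching coalition."""
    for coalition in partition:
        for player in coalition:
            current = utility(player, coalition, payoff)
            if current < 0:          # leaving to be alone yields utility 0
                return False
            for other in partition:
                if other is not coalition and utility(player, other, payoff) > current:
                    return False
    return True


# Hypothetical network: two triangles joined by the single connection c-d.
EDGES = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]
PAYOFF = {frozenset(edge): 1 for edge in EDGES}

triangles = [{"a", "b", "c"}, {"d", "e", "f"}]
lopsided = [{"a", "b"}, {"c", "d", "e", "f"}]
print(is_nash_stable(triangles, PAYOFF))  # True
print(is_nash_stable(lopsided, PAYOFF))   # False
```

In the second partition, c has one connection in its own coalition and two in the other, so c would switch.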

  31. Contribution/Simple Games On the Complexity of Problems on Simple Games (submitted) Freixas, Molinero, Olsen, Serna

  32. Open Problems/Future Work • In the thesis we show LINK BUILDING ∈ APX. Is there a PTAS for LINK BUILDING? • Surgical Link Building: • Isolate the Community C • Model all pages in V \ C as one page • Use MILP • Use information on distribution of PageRank • Does the stuff presented really work? • Thank You!

  33. Link Building. A Real World Example Dear X We are trying to get more links to our website to help improve its rating on the search engines. We were wondering if you could put a link to our site … on your webpage or blog. If you have a website or a Blog and put a link to our page on it then to say thank you for each month it is up, I will give you … Source: An e-mail to a colleague X

  34. Link Building is not Trivial. 2nd Example • Assumption: Obtaining a link from one green node is slightly better for node 1 compared to obtaining a link from one blue node. • Now node 1 can pick three incoming links for free. What should node 1 choose? • [Figure: node 1 with green and blue nodes]

  35. No FPTAS for LINK BUILDING if NP ≠ P *) • [Figure: gadget with nodes i, j, x, y and 1; the optimal solution is marked OPT]

  36. Power Law

  37. Fixed Parameter Tractability: FPT and W[1] • FPT: Solvable in time f(k) · n^c, e.g. k-VERTEX COVER • W[1]: k-INDEPENDENT SET and k-REGULAR INDEPENDENT SET are complete for W[1] • LINK BUILDING is W[1]-hard *)
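
As an illustration of what "solvable in time f(k) · n^c" means, here is the textbook bounded search tree for k-VERTEX COVER: some endpoint of any remaining edge must be in the cover, so the recursion branches two ways and has depth at most k. This is a standard technique, not something taken from the slides, and the function name and example graph are illustrative.

```python
def vertex_cover(edges, k):
    """Return a vertex cover with at most k nodes as a set, or None."""
    if not edges:
        return set()          # nothing left to cover
    if k == 0:
        return None           # edges remain but the budget is spent
    u, v = edges[0]
    # Either u or v must be in any cover of the first edge: branch on both.
    for pick in (u, v):
        remaining = [e for e in edges if pick not in e]
        cover = vertex_cover(remaining, k - 1)
        if cover is not None:
            return cover | {pick}
    return None


path = [(1, 2), (2, 3), (3, 4)]   # a path on four nodes
print(vertex_cover(path, 1))      # None
print(vertex_cover(path, 2))
```

The search tree has at most 2^k leaves and each call does work linear in the number of edges, so the exponential part depends on k only. No algorithm of this kind is expected for W[1]-hard problems such as k-INDEPENDENT SET.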

  38. Other Results • Computing nontrivial communities by the definition given is NP-hard • A simple model for the evolution of communities is presented. These communities probably obey the definition for large n if the out-degree of the nodes is (log n).

  39. Upper Bound: Mixed Integer Linear Programming Approach *) • The dashed links show the cheapest modification that will bring node 5 to the top of the ranking. Computed using a MILP approach. • Alternatively we could go for the maximum improvement in the ranking for a given budget. • [Figure: two copies of a graph on nodes 1 to 8 annotated with PageRank values and a "price for" label]
