CSE 450 – Web Mining Seminar Professor Brian D. Davison Fall 2005

A Presentation on "When Experts Agree: Using Non-Affiliated Experts to Rank Popular Topics" (K. Bharat & G. A. Mihaila, WWW10 Conference, May 2001, Hong Kong), presented by Osama Ahmed Khan, 10/06/2005.


Presentation Transcript


  1. CSE 450 – Web Mining Seminar, Professor Brian D. Davison, Fall 2005 • A Presentation on "When Experts Agree: Using Non-Affiliated Experts to Rank Popular Topics" by K. Bharat & G. A. Mihaila, WWW10 Conference, May 2001, Hong Kong • Presented by Osama Ahmed Khan, 10/06/2005

  2. Problem • Query on Popular Topic • Content Analysis Solution • Most Authoritative Pages

  3. Technical Terms • Expert • Recommendation • Non-affiliation

  4. Hilltop Algorithm • Expert Lookup: Detecting Host Affiliation, Expert Selection, Expert Indexing • Target Ranking: Computing Expert Score, Computing Target Score

  5. Detecting Host Affiliation • Conditions • Same first 3 octets of IP address (e.g. 127.0.0.1 and 127.0.0.15) • Same rightmost non-generic token of hostname (e.g. www.ibm.com and www.ibm.co.mx) • Union-Find Algorithm
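
The two affiliation conditions on this slide, merged transitively with union-find, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `GENERIC_TOKENS` set, the rule that two-letter tokens are country codes, the function names, and the sample IP addresses (taken from documentation-reserved ranges) are all assumptions.

```python
# Assumed, illustrative list of generic hostname tokens; a real system
# would need a much fuller list of suffixes.
GENERIC_TOKENS = {"com", "co", "org", "net", "edu", "gov", "ac"}


def rightmost_non_generic_token(hostname):
    """Return the rightmost dot-separated token that is not generic.

    Assumption: tokens of two characters or fewer are country codes.
    """
    for token in reversed(hostname.lower().split(".")):
        if token not in GENERIC_TOKENS and len(token) > 2:
            return token
    return hostname.lower()


class UnionFind:
    """Disjoint sets over hostnames, with path halving in find()."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def affiliation_classes(hosts):
    """hosts: dict mapping hostname -> IPv4 address string.

    Hosts sharing the first three IP octets, or the same rightmost
    non-generic hostname token, end up in the same set.
    """
    uf = UnionFind()
    first_seen = {}  # affiliation key -> first host seen with that key
    for host, ip in hosts.items():
        uf.find(host)  # register the host even if it matches nothing
        keys = [
            ("ip", ".".join(ip.split(".")[:3])),
            ("name", rightmost_non_generic_token(host)),
        ]
        for key in keys:
            if key in first_seen:
                uf.union(first_seen[key], host)
            else:
                first_seen[key] = host
    return uf
```

Because the sets are merged transitively, a host that shares an IP prefix with www.ibm.com is also treated as affiliated with www.ibm.co.mx, even if those two share only a hostname token.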

  6. Expert Selection • Retrieve all webpages with: Out-degree > Threshold (k) (e.g. k = 5) • Expert will have: URLs pointing to k distinct non-affiliated hosts
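
A rough sketch of the selection test on this slide, assuming the affiliation classes have already been computed. The function name `is_expert` and the `affiliation_of` callback (hostname to affiliation class) are hypothetical names introduced for illustration; the sketch checks mutual non-affiliation among the linked hosts only.

```python
from urllib.parse import urlsplit


def is_expert(out_links, affiliation_of, k=5):
    """Return True if a page qualifies as an expert.

    out_links: list of absolute URLs found on the page.
    affiliation_of: hypothetical callback mapping a hostname to an
        identifier of its affiliation class.
    k: threshold from the slide (e.g. k = 5).
    """
    # First filter: out-degree must exceed the threshold.
    if len(out_links) <= k:
        return False
    # Second filter: links must reach at least k distinct affiliation classes.
    classes = set()
    for url in out_links:
        host = urlsplit(url).hostname
        if host is not None:  # skip relative or malformed URLs
            classes.add(affiliation_of(host))
    return len(classes) >= k
```

Counting affiliation classes rather than raw hostnames is what keeps a page that links to many mirrors of one organization from qualifying.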

  7. Expert Indexing • Inverted Index • Mapping Keywords to Experts • Key Phrases • Match Positions
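
One way to picture the inverted index on this slide is a mapping from each keyword to the experts, key phrases, and positions where it occurs. The sketch below is an assumed in-memory layout, not the index format used by the authors; `build_expert_index` and the tuple layout are hypothetical.

```python
from collections import defaultdict


def build_expert_index(experts):
    """Build keyword -> [(expert, phrase_no, position), ...].

    experts: dict mapping an expert URL to its list of key phrases
        (plain strings). Tokenization is a simple whitespace split.
    """
    index = defaultdict(list)
    for expert, phrases in experts.items():
        for phrase_no, phrase in enumerate(phrases):
            for position, word in enumerate(phrase.lower().split()):
                index[word].append((expert, phrase_no, position))
    return index
```

Looking up a query keyword then yields every expert that mentions it, together with enough position information to tell which key phrase matched.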

  8. Computing Expert Score • Condition • At least 1 URL with all query keywords • Expert Score: (S0, S1, S2) • Si = SUM{key phrases p with k-i query terms} LevelScore(p) * FullnessFactor(p,q), where k is the number of query terms • Expert_Score = 2^32 * S0 + 2^16 * S1 + S2
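
The expert score on this slide can be sketched as below, reading the weights as the powers of two 2^32 and 2^16, so that S0 dominates S1, which dominates S2. The slide does not define LevelScore or FullnessFactor, so both are passed in as caller-supplied functions; the function name, the phrase representation, and the rule that a phrase must match at least one query term are assumptions. The "at least 1 URL with all query keywords" condition is not checked here.

```python
def expert_score(phrases, query_terms, level_score, fullness_factor):
    """Combine (S0, S1, S2) into a single expert score.

    phrases: list of key phrases, each a list of lowercase terms.
    query_terms: list of lowercase query keywords (k = its length).
    level_score, fullness_factor: caller-supplied functions standing in
        for LevelScore(p) and FullnessFactor(p, q).
    """
    k = len(query_terms)
    query = set(query_terms)
    s = [0.0, 0.0, 0.0]
    for phrase in phrases:
        matched = len(query & set(phrase))
        i = k - matched  # phrase contributes to S_i when it has k - i query terms
        # Assumption: phrases matching no query term never contribute.
        if matched > 0 and 0 <= i <= 2:
            s[i] += level_score(phrase) * fullness_factor(phrase, query_terms)
    return 2**32 * s[0] + 2**16 * s[1] + s[2]
```

The large gaps between the weights mean that a phrase matching all query terms outranks any plausible number of phrases that miss one term.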

  9. Computing Target Score • Condition • At least 2 non-affiliated experts • Target Score: • Edge_Score(E,T) = Expert_Score(E) * SUM{query keywords w} occ(w,T) • Target_Score(T) = SUM{experts E} Edge_Score(E,T)
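
A minimal sketch of the target scoring step: each expert-to-target edge is weighted by the expert's score times the summed keyword occurrence counts, and a target is kept only if at least two mutually non-affiliated experts point to it. The occurrence counts are treated as given inputs, and the names `target_scores`, `edges`, and `affiliation_of` are hypothetical.

```python
from collections import defaultdict


def target_scores(edges, expert_scores, query_terms, affiliation_of):
    """Return dict target -> Target_Score for qualifying targets.

    edges: dict mapping (expert, target) -> {keyword: occurrence count},
        standing in for occ(w, T) on that edge.
    expert_scores: dict mapping expert -> Expert_Score.
    affiliation_of: hypothetical callback mapping an expert to an
        identifier of its affiliation class.
    """
    by_target = defaultdict(list)
    for (expert, target), occ in edges.items():
        occurrences = sum(occ.get(w, 0) for w in query_terms)
        by_target[target].append((expert, expert_scores[expert] * occurrences))

    scores = {}
    for target, expert_edges in by_target.items():
        # Condition: at least 2 non-affiliated experts point to the target.
        classes = {affiliation_of(expert) for expert, _ in expert_edges}
        if len(classes) >= 2:
            scores[target] = sum(edge for _, edge in expert_edges)
    return scores
```

Targets endorsed by a single expert, or only by experts from one affiliation class, are dropped rather than given a low score.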

  10. Evaluation • Locating Specific Popular Targets

  11. Evaluation (Contd.) • Gathering Relevant Pages

  12. Conclusion • Characteristics: Popular Queries, Expert Subset • Hilltop vs. PageRank and Topic Distillation

  13. Thank You
