Using ODP Metadata to Personalize Search. Presented by Lan Nie, 09/21/2005, Lehigh University. Introduction. ODP metadata: 4 million sites, 590,000 categories. Tree structure. Categories: inner nodes. Pages: leaf nodes, high quality, representative. Using ODP metadata to personalize search.

Presentation Transcript

Using ODP Metadata to Personalize Search

Presented by Lan Nie

09/21/2005, Lehigh University

Introduction
  • ODP metadata
    • 4 million sites, 590,000 categories
    • Tree Structure
      • Categories: inner node
      • Pages: leaf node, high quality, representative
  • Using ODP Metadata to personalize search
    • 4 billion vs. 4 million
    • Using ODP Metadata for personalized search
    • Is biasing possible in the ODP context?
      • Extend ODP classifications from the current 4 million sites to the 4-billion-page Web automatically by biasing

Using ODP Metadata For Personalized Search
  • User Profile: several topics from ODP selected by user
  • Personalized Search
    • Send query Q to a search engine S (e.g., Google, ODP Search)
    • Res = URLs returned by S
    • For i = 1 to size(Res)
      • Dist[i] = Distance(Res[i], Prof)
    • Re-sort Res based on Dist
  • Representation
    • Both the user profile and a URL (50% in Google directory) can be represented as a set of nodes in the directory tree
  • Distance(Profile, URL)
    • Minimum distance between the two sets of nodes
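
The re-ranking loop above can be sketched in Python. This is a minimal illustration under stated assumptions, not the presenter's implementation: ODP nodes are written as slash-separated paths, the distance between two nodes is the number of tree edges through their lowest common subsumer, and the names `tree_distance`, `distance`, `personalize`, and `annotations` are hypothetical. URLs without any ODP annotation are assumed to sort last, keeping the engine's original order.

```python
from itertools import product


def tree_distance(node_a, node_b):
    """Number of tree edges between two category paths (assumed metric)."""
    path_a, path_b = node_a.split("/"), node_b.split("/")
    common = 0  # depth of the lowest common subsumer
    for part_a, part_b in zip(path_a, path_b):
        if part_a != part_b:
            break
        common += 1
    return (len(path_a) - common) + (len(path_b) - common)


def distance(profile, url_topics):
    """Minimum distance between the two sets of nodes."""
    if not profile or not url_topics:
        return float("inf")  # assumption: unannotated URLs are farthest
    return min(tree_distance(p, u) for p, u in product(profile, url_topics))


def personalize(results, annotations, profile):
    """Re-sort results by distance to the profile (sorted() is stable)."""
    return sorted(
        results,
        key=lambda url: distance(profile, annotations.get(url, ())),
    )


# Hypothetical example data
profile = {"Top/Science/Biology"}
annotations = {
    "u1": {"Top/Arts/Music"},
    "u2": {"Top/Science/Biology/Genetics"},
}
reranked = personalize(["u1", "u2", "u3"], annotations, profile)
```

Because `sorted` is stable, results at equal distance keep the order the search engine returned them in.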
  • Naïve Distances
    • Minimum tree distance
      • Intra-topic links
      • Subsumer
    • Graph shortest path
      • Inter-topic links
  • Complex Distance
    • The deeper the subsumer, the more related the nodes
  • Combining with Google PageRank
    • Some Google results are not annotated

Extending ODP Annotations To The Web
  • Manual annotation for the whole web is impossible
  • Biasing is an implicit way for extending annotations to the Web
  • Is biasing possible in the ODP context?
    • Are ODP entries good biasing sets, i.e., do they generate rankings that differ enough from the non-biased ranking to yield relevant results?
  • When does biasing make a difference?
    • Find the characteristics a biasing set must exhibit in order to obtain relevant results


Experimental Setup

  • Compare the similarity between top 100 non-biased PageRank results and biased results
  • Similarity Measures
    • OSim: degree of overlap between the top n elements of two rank lists
    • KSim: degree of agreement on ordering between the two rank lists
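
The two measures can be sketched as follows. The slides do not give formulas, so this is one plausible reading: OSim is taken as the size of the overlap of the two top-n sets divided by n, and KSim as the fraction of item pairs from the union of both lists that the two lists order the same way. Placing an item missing from a list at the end of that list, and counting tied pairs as disagreement, are assumptions of this sketch. The function names are hypothetical, and the lists are assumed to contain no duplicates.

```python
from itertools import combinations


def osim(list_a, list_b, n):
    """Overlap of the top-n elements of two rank lists, in [0, 1]."""
    return len(set(list_a[:n]) & set(list_b[:n])) / n


def ksim(list_a, list_b):
    """Fraction of pairs on whose relative order both lists agree."""
    union = set(list_a) | set(list_b)
    if len(union) < 2:
        return 1.0  # nothing to disagree on
    rank_a = {item: i for i, item in enumerate(list_a)}
    rank_b = {item: i for i, item in enumerate(list_b)}
    agree = total = 0
    for u, v in combinations(union, 2):
        # Assumption: items absent from a list rank after all its items.
        diff_a = rank_a.get(u, len(list_a)) - rank_a.get(v, len(list_a))
        diff_b = rank_b.get(u, len(list_b)) - rank_b.get(v, len(list_b))
        total += 1
        if diff_a * diff_b > 0:  # same strict order in both lists
            agree += 1
    return agree / total
```

Two lists with the same items in reverse order give an OSim of 1 but a KSim of 0, which is why the two measures are reported together.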
Choice of Biasing Sets
    • Top [0-10]% PageRank pages
    • Top [0-2]% PageRank pages
    • Randomly selected pages
    • Low PageRank pages
  • Varied the sum of scores within the set between 0.000005% and 10% of the total sum over all pages (TOT).
  • Experiments were done on a crawl of 3 million pages, and then applied to the Stanford WebBase crawl.
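
To make the quantities above concrete, here is a small sketch of biased PageRank and of the share of the total score held by a biasing set. It uses a standard power iteration in which the random jump lands only on the biasing set. The damping factor of 0.85, the iteration count, the toy graph, and the names `pagerank` and `tot_share` are assumptions for illustration and do not come from the slides.

```python
def pagerank(links, bias=None, damping=0.85, iterations=50):
    """Power-iteration PageRank; random jumps go only to `bias` pages.

    `links` maps every page to the list of pages it links to.
    With bias=None the jump is uniform (non-biased PageRank).
    """
    nodes = sorted(links)
    bias = set(nodes if bias is None else bias)
    jump = {n: (1.0 / len(bias) if n in bias else 0.0) for n in nodes}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new = {n: (1.0 - damping) * jump[n] for n in nodes}
        for src in nodes:
            targets = links[src]
            if targets:
                share = damping * rank[src] / len(targets)
                for dst in targets:
                    new[dst] += share
            else:
                # Dangling page: spread its score along the jump vector.
                for n in nodes:
                    new[n] += damping * rank[src] * jump[n]
        rank = new
    return rank


def tot_share(rank, biasing_set):
    """Sum of scores inside the set as a fraction of the total sum."""
    return sum(rank[p] for p in biasing_set) / sum(rank.values())


# Hypothetical four-page graph
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
unbiased = pagerank(graph)
biased = pagerank(graph, bias={"d"})
share = tot_share(unbiased, {"d"})
```

Multiplying `share` by 100 gives the percentage of TOT that the experiments vary between 0.000005% and 10%.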
According to the random model of biasing, every set with TOT below 0.015% is good for biasing.
  • Results are not influenced by the crawl size (3-million-page crawl vs. 120-million-page WebBase crawl)
  • Entries in ODP have TOT below 0.015%, thus biasing is possible in the ODP context
Conclusions
  • A personalized search algorithm that ranks URLs by the distance between the user profile and the URL in the ODP taxonomy.
  • Biasing on ODP entries takes effect, so it is feasible to extend the manual ODP classification to the Web.