Tagging with Queries: How and Why?


Ioannis Antonellis, Hector Garcia-Molina, Jawed Karim

Presentation Transcript
Content on the Web

Back Link Text

Search queries

Page Text

Forward Link Text

Cnn Obama Critics news

Stanford Infolab

How?
  • Basic observation: the HTTP referrer field contains the search query

1) Extract queries from web access log

Web Access Log

a997c1950718d75c03f22ca8715e50b3 [28/Feb/2007:23:45:47 -0800] /group/svsa/cgi-bin/www/officers.php http://www.google.com/search?sourceid=navclient&ie=UTF-8&rls=HPIB,HPIB:2006-47,HPIB:en&q=sexy+random+facts

a64344ffd6638d0f6fb2a0284f98b28b [28/Feb/2007:23:45:49 -0800] /group/King/ "http://www.google.com.au/search?hl=en&q=Martin+Luther+King&meta="

413fa663474b2288c1661882e7e62aea [28/Feb/2007:23:46:02 -0800] /group/pandegroup/folding/results.html "http://www.google.com/search?sourceid=navclient-menuext&ie=UTF-8&q=RESULTS"

3d2edd4dfa7778da92875ee67a319433 [28/Feb/2007:23:46:03 -0800] /group/vpge/sgsi/entrepreneurship/ "http://www.google.com/search?hl=en&q=summer+institute+of+entrepreneurship"

ac49793239a6c490023e460fd4863a48 [28/Feb/2007:23:46:06 -0800] / "http://www.google.com/search?sourceid=navclient&hl=ko&ie=UTF-8&rlz=1T4SUNA_ko___KR209&q=stanford"

1c9893680
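
Step 1 can be sketched in Python against log lines like the ones above. This is only an illustration, not the authors' extraction code: the function name `extract_query_tag` is hypothetical, the field layout (visitor hash, bracketed timestamp, requested path, referrer) is inferred from the sample lines, and reading the query from the `q` parameter is an assumption that fits the Google referrers shown here.

```python
from urllib.parse import parse_qs, urlsplit


def extract_query_tag(log_line):
    """Return (requested_path, search_query), or None if no query is found."""
    # Assumed layout: <visitor hash> [<timestamp>] <requested path> <referrer>
    _, sep, rest = log_line.partition("] ")
    if not sep:
        return None  # no timestamp field, e.g. a truncated line
    parts = rest.split(None, 1)
    if len(parts) != 2:
        return None  # requested path or referrer is missing
    path = parts[0]
    referrer = parts[1].strip().strip('"')  # some sample lines quote the referrer
    # parse_qs decodes "+" and %-escapes in the referrer's query string.
    terms = parse_qs(urlsplit(referrer).query).get("q")
    if not terms:
        return None
    return path, terms[0]


line = (
    "a64344ffd6638d0f6fb2a0284f98b28b [28/Feb/2007:23:45:49 -0800] "
    '/group/King/ "http://www.google.com.au/search?hl=en&q=Martin+Luther+King&meta="'
)
print(extract_query_tag(line))  # ('/group/King/', 'Martin Luther King')
```

Other search engines use different parameter names, so a fuller version would need a per-engine mapping.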

2) Embed JavaScript code in web pages to capture search queries

Embeddable code

  • Requires convincing the server administrator or page owner

Query tags

Information value of Query Tags

WebBase

  • Datasets:
  • Stanford Query Logs: 360,000 URLs, 900,000 query tags
  • Delicious: 3,000 URLs, 5,500 tags

Experiments - Summary
  • URLs coverage
  • Query vs Delicious Tags
  • Query/Delicious Tags vs Pagetext

URLs coverage
  • Query logs provide tags for ~110 times more URLs than delicious
  • 13% of delicious URLs (380 URLs) are tagged only by delicious
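
The two coverage figures above are set comparisons over the URLs each source has tagged. A minimal sketch follows, on toy data rather than the Stanford datasets; the function name `coverage` and the dictionary keys are hypothetical, and the slides do not say how URLs were normalized before comparison.

```python
def coverage(query_urls, delicious_urls):
    """Compare the sets of URLs tagged by query logs and by delicious."""
    query_urls, delicious_urls = set(query_urls), set(delicious_urls)
    only_delicious = delicious_urls - query_urls
    return {
        # How many times more URLs the query logs tag than delicious does.
        "url_ratio": len(query_urls) / len(delicious_urls),
        # URLs tagged by both sources (the "common URLs").
        "common": len(query_urls & delicious_urls),
        # Share of delicious URLs that the query logs never tag.
        "only_delicious_share": len(only_delicious) / len(delicious_urls),
    }


# Toy data: six URLs tagged via query logs, two via delicious.
stats = coverage(
    {"/a", "/b", "/c", "/d", "/e", "/f"},
    {"/a", "/z"},
)
print(stats)
```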

Query Tags
  • Query logs provide 42 query tags per URL on average

Delicious Tags
  • Delicious provides 3 tags per URL on average

Tags for common URLs
  • Query logs provide 250 query tags per URL on average for common URLs
  • Delicious provides 5 tags per URL on average for common URLs

Query Tags vs Page Text
  • For every URL, 1 out of 3 query tags are not present in the pagetext
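
One way to measure this for a single URL is sketched below. The matching rule is an assumption, since the slides do not say how presence in the page text was decided: here a tag counts as present only if every one of its words appears in the page text, case-insensitively. The names `tokens` and `missing_fraction` are hypothetical.

```python
import re


def tokens(text):
    """Lowercased alphanumeric words in text, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def missing_fraction(tags, page_text):
    """Fraction of tags with at least one word absent from the page text."""
    tags = list(tags)
    if not tags:
        return 0.0
    page_words = tokens(page_text)
    # A tag is "missing" when its words are not a subset of the page's words.
    missing = [tag for tag in tags if not tokens(tag) <= page_words]
    return len(missing) / len(tags)


# Toy example: the third tag never appears in the page text.
tags = ["stanford", "folding results", "protein simulation"]
page_text = "Folding@home results at Stanford"
print(missing_fraction(tags, page_text))
```

Averaging this fraction over all URLs gives a figure comparable to the one on the slide, under the stated matching assumption.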

Delicious Tags vs Page Text
  • For every URL, 1 out of 2 delicious tags is not present in the page text

Tags for common URLs
  • For common URLs, 1 out of 2 query/delicious tags is not present in the page text

Conclusions

Query tags:

  • Can be extracted in a distributed fashion
  • Are a promising new source of information
  • Can provide many new tags for a large fraction of the Web

Thank You!

(DEMO)

http://tags.stanford.edu
