Presentation Transcript
craigslist++

sean anastasi

joseph chen

tatiana gershanovich

andreas sekine

cse454 craigslist++

slide2

our goal

  • to enhance craigslist’s interface
    • show related items also being sold at craigslist
    • show related items from other third-party sites

slide3

how we do it

  • main components
    • crawler (heritrix)
    • clusterer (carrot2)
    • relevance sorting
    • user interface (greasemonkey)
    • other stuff

slide4

crawler

  • specific crawling needs
    • volatile data
    • questionable legalities
  • heritrix
    • only crawling one domain
    • problematic setup
  • our setup
    • 2 crawlers for new posts, 1 cleaner
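
The split between crawlers for new posts and a cleaner could look roughly like the sketch below. It is an illustration only, not the project's Heritrix setup: the `PostStore` class, the `max_age` threshold, the example URLs, and the assumption that the cleaner evicts posts that have not been re-seen within that window are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PostStore:
    """Hypothetical in-memory index of crawled posts (not part of Heritrix)."""

    max_age: float  # assumed: seconds a post may go unseen before it is dropped
    last_seen: dict = field(default_factory=dict)  # post URL -> last-seen time

    def record(self, url: str, now: float) -> None:
        """Called by a crawler each time it finds the post still listed."""
        self.last_seen[url] = now

    def clean(self, now: float) -> list:
        """Called by the cleaner: evict posts not seen within max_age."""
        stale = [u for u, t in self.last_seen.items() if now - t > self.max_age]
        for url in stale:
            del self.last_seen[url]
        return sorted(stale)


# Made-up URLs and timestamps, purely for illustration.
store = PostStore(max_age=3600)
store.record("http://example.org/post-1", now=0)
store.record("http://example.org/post-2", now=3000)
removed = store.clean(now=4000)
```

Keeping the cleaner separate from the crawlers means volatile data can be expired on its own schedule without slowing down the discovery of new posts.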

slide5

clusterer

  • Carrot2
    • what to cluster (title, body or title + body)?
    • need of reclustering and combination
  • WordNet
    • combination of synonym clusters
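
One way to picture the combination of synonym clusters is the sketch below. It is not the project's code and calls neither Carrot2 nor WordNet: the `merge_synonym_clusters` function is hypothetical, and the `synonyms` list of label pairs is a hand-written stand-in for whatever a WordNet lookup would return.

```python
def merge_synonym_clusters(clusters, synonyms):
    """Merge clusters whose labels are synonyms of each other.

    clusters: dict mapping cluster label -> list of post ids
    synonyms: iterable of (label_a, label_b) pairs (stand-in for WordNet)
    """
    parent = {label: label for label in clusters}

    def find(label):
        # Follow parent links to the representative label (with path halving).
        while parent[label] != label:
            parent[label] = parent[parent[label]]
            label = parent[label]
        return label

    for a, b in synonyms:
        if a in parent and b in parent:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                # Arbitrary choice: keep the alphabetically smaller label.
                keep, drop = sorted((root_a, root_b))
                parent[drop] = keep

    merged = {}
    for label, posts in clusters.items():
        merged.setdefault(find(label), []).extend(posts)
    return merged


# Made-up cluster labels and post ids, purely for illustration.
clusters = {"couch": [1, 2], "sofa": [3], "bike": [4], "bicycle": [5, 6]}
synonyms = [("couch", "sofa"), ("bike", "bicycle")]
merged = merge_synonym_clusters(clusters, synonyms)
```

The union-find structure also handles chains of synonyms, so three labels linked pairwise end up in a single cluster.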

slide6

relevance sorting

slide7

relevance sorting (cont.)

slide8

user interface

  • greasemonkey
    • show related posts (grouped by clusters)
    • show which items have data
  • jquery
    • folding item lists
    • mouseover details/images
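
The "grouped by clusters" display can be illustrated with the short sketch below. The real interface is a Greasemonkey user script, so this Python version only shows the grouping logic; the `group_related_posts` function and the sample data are hypothetical.

```python
from collections import defaultdict


def group_related_posts(related):
    """Group (cluster_label, post_title) pairs by cluster label.

    Titles keep the order in which they arrive, and clusters appear in
    the order their first post was seen.
    """
    groups = defaultdict(list)
    for label, title in related:
        groups[label].append(title)
    return dict(groups)


# Made-up related posts, purely for illustration.
related = [
    ("couch", "leather couch, good condition"),
    ("bicycle", "road bike 54cm"),
    ("couch", "free sofa, must pick up"),
]
grouped = group_related_posts(related)
```

Each key of `grouped` would correspond to one foldable item list in the page, with the titles as its entries.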

slide9

other

  • amazon product advertising api
  • yahoo term extraction
  • botnet

slide10

demo

  • greasemonkey plugin
    • https://addons.mozilla.org/en-US/firefox/addon/748
  • craigslist++ script
    • http://cubist.cs.washington.edu/~lidor7/craigslistpp.user.js
  • craigslist
    • http://seattle.craigslist.org/
