IST 441 Example Projects

Undergrad Project
  • Find a customer, for example someone interested in Xbox game forums.
  • Build a search engine for Xbox game forums etc.
  • Compare two approaches: Google CSE and LucidWorks.
  • Steps:
    • Crawl websites (at most 5).
      • Determine the crawl depth and how to include or exclude certain pages and file types.
    • Extract information and build the index.
    • Experiment with different rankings (see “relevancy workbench” app in your LucidWorks installation).
      • http://ist441.ist.psu.edu:8988/relevancy/experiment
    • Perform search and compare the precision@K values.
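The last step, comparing precision@K between the two engines, can be sketched as follows. This is a minimal illustration, not part of the course materials: the function name `precision_at_k`, the document IDs, and the relevance judgments are all hypothetical, and it assumes you have labeled by hand which results are relevant for a query.

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Return the fraction of the top-k ranked results that are relevant.

    ranked_ids: result identifiers in the order the engine returned them.
    relevant_ids: set of identifiers judged relevant for the query.
    If fewer than k results were returned, the missing slots count as misses.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    top_k = ranked_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    return hits / k


if __name__ == "__main__":
    # Hypothetical judgments and result lists for one query (assumed data).
    relevant = {"d1", "d3", "d7"}
    results_by_engine = {
        "Google CSE": ["d1", "d2", "d3", "d4", "d5"],
        "LucidWorks": ["d3", "d7", "d1", "d9", "d2"],
    }
    for engine, ranked in results_by_engine.items():
        for k in (1, 3, 5):
            print(f"{engine} P@{k} = {precision_at_k(ranked, relevant, k):.2f}")
```

Averaging these values over several queries per engine gives a fairer comparison than a single query.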
Graduate Project
  • Crawling academic institution webpages in Qatar (it’s a small domain).
    • Integrating a more powerful crawler such as Nutch or Heritrix with the LucidWorks system.
    • Focused crawling, i.e., crawling for specific types of pages such as researchers’ home pages.
  • Modifying the parser to extract specific information such as email addresses and phone numbers from a web page.
  • Modifying Solr schema and/or ranking functions.
  • Comparing search results with Google CSE.
  • Discuss with the instructor for more information.
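The parser modification mentioned above (extracting email addresses and phone numbers) might start from something like the sketch below. It is only an assumption about one possible approach: it runs on text already extracted from a page, the regular expressions are deliberately simple and will miss obfuscated addresses, and the name `extract_contacts` and the sample text are hypothetical rather than part of Solr, Nutch, or LucidWorks.

```python
import re

# Simplified patterns (assumptions, not a full RFC 5322 or E.164 implementation).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# A digit, then at least six digits/separators, ending in a digit.
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{6,}\d")


def extract_contacts(text):
    """Return de-duplicated, sorted emails and phone-like strings found in text."""
    emails = sorted(set(EMAIL_RE.findall(text)))
    phones = sorted(set(match.strip() for match in PHONE_RE.findall(text)))
    return {"emails": emails, "phones": phones}


if __name__ == "__main__":
    # Hypothetical page text; the address and number are made up.
    sample = "Contact: jane.doe@example.edu or +974 4403 0000."
    print(extract_contacts(sample))
```

The extracted values could then be stored in dedicated fields added to the Solr schema, so that they can be searched or displayed separately from the page body.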