Deepin Search

Deepin Search. Wenxu Li & Ziming Zhai. Motivation. Google gives you the best results for everyone, but maybe not the best for you. Besides keyword match, you may also care about site speed, site quality, or the category a site belongs to.

Presentation Transcript


  1. Deepin Search
     • Wenxu Li & Ziming Zhai

  2. Motivation
     • Google gives you the best results for everyone, but maybe not the best for you.
     • Besides keyword match, you may also care about site speed, site quality, or the category a site belongs to.
     • It would be great if users could create their own ranking methods.

  3. Our Approach
     • Retrieve the first 24 results from Google
     • Send requests to Amazon Alexa Services to get insight information for each result URL
     • Use our formula to calculate the score of each criterion for each URL
     • Allow the user to change the weight of each criterion
     • Re-rank the results based on the final scores
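
The five steps above can be sketched as one small pipeline. This is only an illustration under stated assumptions: `rerank`, `fetch_results`, `fetch_insight`, and `score_all` are hypothetical names, and the real Google and Alexa calls are passed in as plain functions rather than reproduced here.

```python
from typing import Callable, Dict, List

Scores = Dict[str, float]


def rerank(
    query: str,
    weights: Scores,
    fetch_results: Callable[[str, int], List[str]],    # hypothetical: search-engine lookup
    fetch_insight: Callable[[str], Dict[str, float]],  # hypothetical: per-URL insight lookup
    score_all: Callable[[List[Dict[str, float]]], List[Scores]],  # hypothetical scorer
    limit: int = 24,
) -> List[str]:
    """Return the result URLs ordered by their weighted final score."""
    # Step 1: retrieve the first `limit` results for the query.
    urls = fetch_results(query, limit)
    # Step 2: get insight information for each result URL.
    insights = [fetch_insight(url) for url in urls]
    # Step 3: turn the raw insight values into one score per criterion.
    per_url = score_all(insights)
    # Step 4: apply the user-chosen weight of each criterion.
    final = [
        sum(weights.get(name, 0.0) * value for name, value in scores.items())
        for scores in per_url
    ]
    # Step 5: re-rank by final score, highest first (ties keep the original order).
    order = sorted(range(len(urls)), key=lambda i: final[i], reverse=True)
    return [urls[i] for i in order]
```

Passing the remote lookups in as arguments keeps the sketch runnable without network access and makes the ranking step easy to check on its own.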

  4. Main Functions
     • Customized rank: users set the weights of five criteria with scroll bars
       ◦ Speed
       ◦ Quality
       ◦ Popularity
       ◦ Date Created
       ◦ Keyword Match
     • View detailed information for each URL: a general description, three months of traffic information, and related sites
     • Display results by category: results can be grouped and shown by category

  5. Amazon Alexa Services
     • URL Info
       ◦ Related Links, Categories, LinksInCount
       ◦ Rank, RankByCountry, RankByCity
       ◦ UsageStats, Speed, Keyword, SiteData
       ◦ ContactInfo, AdultContent, Language, OwnedDomains
     • SitesLinkingIn
     • CategoryListings
       ◦ DMOZ: returns a list of sites within a given category
     • Traffic History since 06-01-2007
       ◦ Rank, Reach, PageView

  6. Calculate Ranking Scores
     • Formula:
       ◦ S = U_s*S_speed + U_t*S_time + U_q*S_quality + U_p*S_popularity + U_d*S_google
       ◦ Each U is a user-set weight and each S is the score for that criterion
     • Normalization:
       ◦ S_quality = (value - min)/(max - min)
     • Popularity Score:
       ◦ Reach * PageView
     • Quality Score:
       ◦ 0.5 * S_LinksInCount + 0.5 * S_PageView
     • Dummy Variable (Keep Google Ranking)
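
The formulas on this slide can be worked through in a few lines. This is a minimal sketch, not the authors' code: the function names are hypothetical, mapping an all-equal column to 0.0 is an assumption, and so is normalizing the Reach * PageView product before it is used as the popularity score.

```python
from typing import Dict, List, Sequence

CRITERIA = ("speed", "time", "quality", "popularity", "google")


def min_max(values: Sequence[float]) -> List[float]:
    """Min-max normalization: (value - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: when every value is equal, score them all as 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def quality_scores(links_in: Sequence[float], page_views: Sequence[float]) -> List[float]:
    """Quality = 0.5 * normalized LinksInCount + 0.5 * normalized PageView."""
    s_links = min_max(links_in)
    s_views = min_max(page_views)
    return [0.5 * a + 0.5 * b for a, b in zip(s_links, s_views)]


def popularity_scores(reach: Sequence[float], page_views: Sequence[float]) -> List[float]:
    """Popularity = Reach * PageView, normalized afterwards (an assumption)."""
    return min_max([r * p for r, p in zip(reach, page_views)])


def final_score(weights: Dict[str, float], scores: Dict[str, float]) -> float:
    """S = sum over the five criteria of weight * score."""
    return sum(weights[name] * scores[name] for name in CRITERIA)
```

Setting every weight except the one for `google` to zero leaves only the Google term, which matches the slide's note that the dummy variable keeps the Google ranking.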

  7. Architecture

  8. Implementation
     • jQuery + PHP + MySQL
     • AJAX + JSON + XML
     • Hosted on GoDaddy
     • Amazon Alexa cost: $1.60 so far ($0.15 per 1,000 requests)
     • Use a hash (inverted index) to index URLs
     • Use a trie structure to organize URLs into categories
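
The last two bullets can be illustrated with a small sketch. It is written in Python rather than the PHP used by the project, and it rests on assumptions: categories are taken to be slash-separated paths such as "Top/Computers/Internet", and `CategoryTrie`, `url_index`, and `index_url` are hypothetical names.

```python
from typing import Dict, List


class CategoryTrie:
    """Trie keyed by category path segments; each node holds the URLs filed there."""

    def __init__(self) -> None:
        self.children: Dict[str, "CategoryTrie"] = {}
        self.urls: List[str] = []

    def insert(self, path: str, url: str) -> None:
        """File `url` under a slash-separated category path."""
        node = self
        for part in path.strip("/").split("/"):
            node = node.children.setdefault(part, CategoryTrie())
        node.urls.append(url)

    def urls_under(self, path: str) -> List[str]:
        """Return every URL filed at `path` or in any of its subcategories."""
        node = self
        for part in path.strip("/").split("/"):
            child = node.children.get(part)
            if child is None:
                return []
            node = child
        found: List[str] = []
        stack = [node]
        while stack:  # iterative depth-first walk of the subtree
            current = stack.pop()
            found.extend(current.urls)
            stack.extend(current.children.values())
        return found


# Hash index: a dict gives constant-time lookup of stored info by URL.
url_index: Dict[str, dict] = {}


def index_url(url: str, info: dict) -> None:
    """Store (or overwrite) the info record for a URL."""
    url_index[url] = info
```

A trie suits this data because a query for a parent category returns everything filed in its subcategories with a single walk of one subtree.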

  9. Performance
     • Each query (everything on the fly)
       ◦ 5*3 connections to Google
       ◦ 24*5 connections to Amazon Alexa
       ◦ GoDaddy limits the number of connections
       ◦ In practice, more than 200 connection requests per query
     • AJAX splits one big task into 6 tasks, each handling only one kind of information
     • Store retrieved information in the database and update it regularly
       ◦ This saves money
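
The "store and update regularly" idea can be sketched as a cache with a maximum age. This is an illustration only: an in-memory dict stands in for the MySQL table, and `InsightCache`, `fetch`, and `max_age_seconds` are hypothetical names.

```python
import time
from typing import Callable, Dict, Tuple


class InsightCache:
    """Serve stored insight data and refetch only when a row is too old."""

    def __init__(
        self,
        fetch: Callable[[str], dict],  # hypothetical remote lookup (one paid request)
        max_age_seconds: float,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch = fetch
        self._max_age = max_age_seconds
        self._clock = clock
        self._rows: Dict[str, Tuple[float, dict]] = {}  # url -> (stored_at, info)
        self.remote_calls = 0  # number of requests actually sent

    def get(self, url: str) -> dict:
        now = self._clock()
        row = self._rows.get(url)
        if row is not None and now - row[0] < self._max_age:
            return row[1]  # fresh enough: no remote request
        self.remote_calls += 1
        info = self._fetch(url)
        self._rows[url] = (now, info)
        return info
```

Using the figures from the slides, an uncached query sends 24*5 = 120 Alexa requests, which at $0.15 per 1,000 requests comes to about $0.018 per query; repeated queries served from the stored rows avoid that charge until the rows expire.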

  10. Demo
     • http://www.zhaiziming.com/deepin
