
Social Search Engine


cloris

Presentation Transcript


  1. Social Search Engine Using trusted metadata to improve the relevance of search results

  2. The motivation factor • The essential idea of the SpaK social search engine is that people make decisions based primarily on the opinions of a few people whom they trust. The average person has a set of experts to consult in particular areas: the computer expert, the car expert, the fashion expert, the financial expert. If the opinions of these experts can be collected, they are extremely useful: it is this metadata (data about other data) that allows the most intelligent filtering and sorting of the information on the internet. • "We're the search engine, but you're the fuel."

  3. What is a search engine? • A search engine is a program designed to help find information stored on a computer system, such as the World Wide Web, a corporate or proprietary network, or a personal computer. • The search engine allows one to ask for content meeting specific criteria and retrieves a list of references that match those criteria. • Search engines use regularly updated indexes to operate quickly and efficiently. • A program called a ‘crawler’ builds the index of web pages as it ‘crawls’ through the links on each page.

  4. Problems with existing search engines • PageRank system: PageRank assumes each incoming link is a valid vote for a website, but some links are not genuine votes at all. Spammers plant links in guestbooks and blogs, so link-based ranking is becoming less effective.
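The effect of link spam on a link-voting scheme can be illustrated with a minimal sketch. This is a simplified textbook-style PageRank iteration, not the ranking any real search engine runs; the page names, the damping factor of 0.85 and the 20 "guestbook" pages are all assumptions made up for the illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank. `links` maps each page to the pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the same small baseline score.
        new_rank = {p: (1.0 - damping) / len(pages) for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current rank equally among its outgoing links.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / len(pages)
                for t in pages:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical three-page web: "target" and "b" each receive one link from "a".
honest = {"a": ["target", "b"], "b": ["a"], "target": ["a"]}

# The same web after 20 hypothetical guestbook pages are spammed with links to "target".
spammed = dict(honest)
for i in range(20):
    spammed[f"guestbook{i}"] = ["target"]

honest_rank = pagerank(honest)
spam_rank = pagerank(spammed)
print(honest_rank["target"] / honest_rank["a"])
print(spam_rank["target"] / spam_rank["a"])
```

In the honest graph "target" and "b" are ranked equally; after the spam links are added, "target" outranks "b" even though nothing about its content has changed.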

  5. Problems with existing search engines • Word frequency: In this method, relevance is decided based on the number of times a query word appears in a web page. Word frequency can be inflated artificially by inserting unrelated keywords in a ‘meta’ tag of the page.
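A minimal sketch of why raw word frequency is easy to game. The scoring function and both sample pages are hypothetical, and the sketch assumes the indexer counts keywords from a page's meta tags together with its visible text.

```python
import re
from collections import Counter


def term_frequency_score(query, page_text):
    """Score a page by how often the query words occur in it (naive word frequency)."""
    words = re.findall(r"[a-z0-9]+", page_text.lower())
    counts = Counter(words)
    return sum(counts[term] for term in query.lower().split())


# A genuine page that mentions the topic once.
honest_page = "Photos and travel notes from our trip to Thailand."

# An unrelated page stuffed with the keyword (assumed to come from its meta tags).
stuffed_page = "Cheap watches. " + "thailand " * 50

print(term_frequency_score("thailand", honest_page))
print(term_frequency_score("thailand", stuffed_page))
```

Under this naive score the stuffed page wins by a wide margin, which is the weakness the slide describes.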

  6. An alternative: Social Search • Existing search engines (examples: Yahoo, MSN, Google, AltaVista) + Social community (examples: engineers, doctors, students, actors, Indians, Americans) → Social search engines

  7. About social search engines • Social search engines are a class of search engines that use social networks to organize, prioritize or filter search results. • They use ‘metadata’ to judge the relevance of web pages to a user. • ‘Metadata’ is defined as ‘data about data’. • In this case, the metadata refers to the feedback given by the community about the web pages.

  8. Continued… • It is really about people indexing the information we find on the web, instead of the computational formulae that guide the traditional sites. • Since relevance is based on trust, the users of such a social search engine are protected from spam and phishing sites.

  9. How it works: an example • A user searches for "Thailand", and the search engine chooses the page containing photos of a friend's Thailand vacation.

  10. An illustration…

  11. Continued…

  12. Continued…

  13. Design And Implementation Details

  14. The registration process (flowchart) • New user → Sign up → Registration page. • YES: the user is assigned a unique UID and username, a new entry is made in the members table, and the user proceeds to Login. • NO: Re-register.

  15. After successful login: the profile page • 1. View Community • 2. View Buddies • 3. Plain search • 4. Search History • 5. Categories

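  16. Community (flowchart) • View community: a list of other registered members is shown. • When a member is selected, the buddy table in the database is accessed to check whether that member is already a buddy. • NO: rate your friend on a scale of 1 to 5, then display the message "ADDED AS FRIEND". • YES: display the message "ALREADY ADDED AS FRIEND".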
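The add-buddy flow on this slide can be sketched as follows. The slides do not show the schema, so the `buddies` table and its columns (`uid`, `buddy_uid`, `rating`) are assumptions, and the sketch uses Python with an in-memory SQLite database in place of the project's PHP and MySQL.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical buddy table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE buddies ("
        "uid INTEGER, buddy_uid INTEGER, rating INTEGER, "
        "PRIMARY KEY (uid, buddy_uid))"
    )
    return conn


def add_buddy(conn, uid, buddy_uid, rating):
    """Add a buddy with a 1-5 trust rating unless the pair is already stored."""
    if not 1 <= rating <= 5:
        raise ValueError("rating must be between 1 and 5")
    row = conn.execute(
        "SELECT 1 FROM buddies WHERE uid = ? AND buddy_uid = ?",
        (uid, buddy_uid),
    ).fetchone()
    if row is not None:
        return "ALREADY ADDED AS FRIEND"
    conn.execute(
        "INSERT INTO buddies (uid, buddy_uid, rating) VALUES (?, ?, ?)",
        (uid, buddy_uid, rating),
    )
    conn.commit()
    return "ADDED AS FRIEND"


db = make_db()
print(add_buddy(db, 1, 2, 4))
print(add_buddy(db, 1, 2, 5))
```

The composite primary key is a design choice of this sketch: it makes the database itself reject duplicate buddy pairs even if the explicit check is bypassed.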

  17. Buddies… • View buddy → Click to delete → The database entry is deleted → Display message: "DATABASE ENTRY HAS BEEN DELETED".

  18. Search history… • Personal search history • Search on specified topic • Buddy search history

  19. Search (flowchart) • Stem the query, for the user and his buddies, with Porter’s stemmer algorithm. • Select the query. • Apply the similarity function KN(p, q) / max[kn(p), kn(q)]. • Aggregate user rating, clicks and similarity. • Result array extracted using community feedback. • Default result array (default results) obtained using the search API. • Display the re-ordered results.

  20. The Similarity Function • For each stemmed word, we select a similar stem from the database, and the queries associated with that stem are extracted. • Each extracted query is compared with the user query using the similarity function KN(p, Q) / max[kn(p), kn(Q)], • where KN(p, Q) is the number of words common to the extracted query (p) and the user query (Q), • kn(p) is the number of words in the extracted query, and • kn(Q) is the number of words in the user query.
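A minimal sketch of the similarity function described on this slide. The function name, the whitespace tokenisation and the choice to count each common word once are assumptions; the slides do not say how duplicate words or empty queries are handled.

```python
def similarity(extracted_query, user_query):
    """KN(p, Q) / max(kn(p), kn(Q)) for two whitespace-separated queries."""
    p = extracted_query.lower().split()
    q = user_query.lower().split()
    if not p or not q:
        # Assumption: an empty query is similar to nothing.
        return 0.0
    common = len(set(p) & set(q))      # KN(p, Q): words the two queries share
    return common / max(len(p), len(q))  # divide by the longer query's word count


print(similarity("thailand beach photos", "thailand photos"))
```

In the example, two of the three words in the longer query are shared, so the similarity is 2/3. Identical queries score 1 and queries with no words in common score 0.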

  21. Continued… • The output of the similarity function is a real number between 0 and 1. The similarity value for a user query is stored in the database against the corresponding extracted query. • We calculate an aggregate value that is a function of (similarity * clicks * rating) and order the links in descending order of this value. • The array of links obtained from community feedback is compared with the default search results of the API. The default search results of the API are then rearranged based on the metadata received from the community.
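The aggregation and re-ordering step can be sketched as below. The slides only say the aggregate is "a function of (similarity * clicks * rating)", so the plain product used here is an assumption, as are the record layout, the function name, and the rule that links without community feedback keep their default order after the boosted ones.

```python
def rerank(default_results, feedback):
    """Reorder API results using community feedback.

    default_results: list of URLs in the order returned by the search API.
    feedback: iterable of (url, similarity, clicks, rating) tuples.
    """
    scores = {}
    for url, sim, clicks, rating in feedback:
        score = sim * clicks * rating          # assumed aggregate: a plain product
        scores[url] = max(score, scores.get(url, 0.0))

    # Links that the community has feedback on go first, highest score first.
    boosted = sorted(
        (u for u in default_results if u in scores),
        key=lambda u: scores[u],
        reverse=True,
    )
    # Everything else keeps the order the API gave it.
    rest = [u for u in default_results if u not in scores]
    return boosted + rest


default = ["a.example", "b.example", "c.example", "d.example"]
community = [
    ("c.example", 0.5, 4, 5),   # score 10.0
    ("b.example", 1.0, 1, 3),   # score 3.0
    ("z.example", 1.0, 9, 5),   # not in the default results, so it is ignored
]
print(rerank(default, community))
```

With this data the result is c.example, b.example, a.example, d.example: the two links with community feedback move to the front in score order, and the remaining links follow in their original order.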

  22. Continued… • For each user, the output is arranged in decreasing order of the aggregate value, based on previous searches made by the user's buddies for the same search query or a related query.

  23. Stemming: its importance • Stemming is the process of stripping the suffix off a word. • Stemming is important for our project because words with a common stem usually have similar meanings, for example: predict, prediction, predicted. • Keywords in the search query are grouped according to their stems.

  24. The Porter Stemmer

  25. Description of Porter Stemmer • A consonant in a word is a letter other than A, E, I, O or U, and other than Y preceded by a consonant. If a letter is not a consonant then it is a vowel. A consonant will be denoted by c, a vowel by v. A list ccc... of length greater than 0 will be denoted by C, and a list vvv... of length greater than 0 will be denoted by V. Any word, or part of a word, therefore has one of the four forms: • CVCV ... C • CVCV ... V • VCVC ... C • VCVC ... V • These may all be represented by the single form [C]VCVC ... [V], where the square brackets denote arbitrary presence of their contents. • Using (VC){m} to denote VC repeated m times, this may again be written as [C](VC){m}[V]. m will be called the measure of any word or word part when represented in this form. The case m = 0 covers the null word.
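The measure m can be computed with a short sketch. It covers only the measure, not the full Porter stemmer, and the function names are made up for the illustration. It follows Porter's published definition of a consonant, under which Y counts as a vowel when it is preceded by a consonant.

```python
def is_consonant(word, i):
    """True if word[i] is a consonant under Porter's definition."""
    ch = word[i]
    if ch in "aeiou":
        return False
    if ch == "y":
        # Y is a consonant at the start of a word or after a vowel,
        # and a vowel when it follows a consonant.
        return i == 0 or not is_consonant(word, i - 1)
    return True


def measure(word):
    """Return m in the form [C](VC){m}[V] for a word or word part."""
    word = word.lower()
    pattern = []
    for i in range(len(word)):
        kind = "c" if is_consonant(word, i) else "v"
        # Collapse runs of the same kind, so "tr" becomes one C and "ou" one V.
        if not pattern or pattern[-1] != kind:
            pattern.append(kind)
    # Each vowel run followed by a consonant run is one VC pair.
    return "".join(pattern).count("vc")


for w in ["tree", "by", "trouble", "oats", "troubles", "private"]:
    print(w, measure(w))
```

For example, "tree" collapses to CV and has m = 0, "trouble" collapses to CVCV and has m = 1, and "troubles" collapses to CVCVC and has m = 2.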

  26. User interface design

  27. Technologies used • Linux: the operating system • Apache: the web server • MySQL: the RDBMS • PHP: Hypertext Preprocessor • CGI: Common Gateway Interface • CSS: Cascading Style Sheets

  28. The End. Thank you. Please try it out!
