ScoreFinder: A P2P Asynchronous Implementation
Guan Gui
2009 MERIT Summer Research Project, supervised by Aaron Harwood and Scott Douglas
Melbourne School of Engineering
Motivation
• The Web contains a significant amount of User Generated Content (UGC) whose credibility is unconfirmed. Manually confirming this content with a fixed number of human experts is infeasible.
• Many sites, such as YouTube, Flickr and the iTunes Store, use reviewer feedback and simple techniques to compute the credibility or rank/rating of UGC.
• Existing techniques do not consider a wide range of additional aspects, such as reviewer behaviour (e.g. malicious scoring/feedback), differing distributions of scores, and the confidence of reviewers in their scores.
Existing techniques
• Google’s PageRank
• HITS
• SALSA
• Hilltop
• TruthFinder
• TruthRank
• Firefox Stumble plugin
• Reputation systems in P2P
Overview of our Approach
• In this work we consider UGC to be a set of articles submitted by users.
• We consider a set of annotators who provide (score, confidence) values for the articles they review.
• We have implemented a P2P-based platform that demonstrates an iterative algorithm for ranking articles.
• The algorithm, called ScoreFinder, was developed by Yang Liao, who is working with Professor Rao Kotagiri and Dr Aaron Harwood.
• Our implementation uses a P2P networking suite provided by NICTA, with Scott Douglas.
How does the calculation work? See Liao, Kotagiri and Harwood for details.
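The published ScoreFinder algorithm is described in the cited work; as a rough illustration of the general fixed-point idea behind such iterative ranking schemes, the sketch below alternates between estimating article scores from confidence- and expertise-weighted annotator reviews, and re-estimating annotator expertise from agreement with the current consensus. The weighting and update rules here are assumptions for illustration, not the actual ScoreFinder formulas.

```python
# Hypothetical sketch of an iterative annotator-weighted ranking scheme.
# This is NOT the published ScoreFinder algorithm; it only illustrates the
# general idea: article scores and annotator expertise are refined together
# until they stabilise, which down-weights malicious or unreliable scorers.

def iterative_rank(reviews, iterations=20):
    """reviews: list of (annotator, article, score, confidence) tuples,
    with score and confidence in [0, 1]."""
    annotators = {a for a, _, _, _ in reviews}
    articles = {art for _, art, _, _ in reviews}
    expertise = {a: 1.0 for a in annotators}   # start all annotators equal
    scores = {art: 0.5 for art in articles}

    for _ in range(iterations):
        # 1. Re-estimate article scores as confidence- and
        #    expertise-weighted means of the submitted scores.
        for art in articles:
            num = den = 0.0
            for a, ar, s, c in reviews:
                if ar == art:
                    w = expertise[a] * c
                    num += w * s
                    den += w
            if den > 0:
                scores[art] = num / den
        # 2. Re-estimate expertise from agreement with the consensus:
        #    annotators far from the consensus lose weight.
        for a in annotators:
            errs = [abs(s - scores[ar]) for an, ar, s, _ in reviews if an == a]
            expertise[a] = 1.0 - sum(errs) / len(errs)
    return scores, expertise
```

With two honest annotators and one outlier on the same article, the outlier's influence shrinks over the iterations, pulling the final score toward the honest consensus.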
Centralized Implementation
[Diagram: Annotators review Articles, submitting (Score, Confidence) pairs to a central ScoreFinder, which outputs a Ranking.]
Decentralized Implementation
[Diagram: each annotator's peer runs ScoreFinder with an Annotator component and a Tracker component, sharing Articles over the network.]
Example relationships
[Diagram: example relationships between Peers, Annotators and Trackers.]
Peer Architecture
[Diagram: the ScoreFinder web interface is served by a built-in web server; the ScoreFinder daemon runs the ScoreFinder algorithm with its Tracker and Annotator components on top of the Badumna Network Suite.]
User data and Annotator data
• Users submit the following data describing their articles: the title, a URL to the article, a description/abstract, and a list of user-defined topic contributions.
• Users are also annotators and can review any article submitted to the system. Annotators can search for articles, access them, and provide a score and a confidence for each.
• Annotators cannot review their own articles.
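The submission and review records described above can be sketched as a small data model. The field names and types here are illustrative assumptions, not the actual ScoreFinder schema; the one rule taken from the slides is that annotators cannot review their own articles.

```python
# Illustrative data model for article submissions and reviews.
# Field names are assumptions, not the actual ScoreFinder schema.
from dataclasses import dataclass

@dataclass
class Article:
    title: str
    url: str
    abstract: str
    topics: list            # user-defined topic contributions
    submitter: str          # id of the submitting user/annotator

@dataclass
class Review:
    annotator: str
    article: Article
    score: float            # the annotator's score for the article
    confidence: float       # the annotator's confidence in that score

def submit_review(review: Review) -> Review:
    # Enforce the rule that annotators cannot review their own articles.
    if review.annotator == review.article.submitter:
        raise ValueError("annotators cannot review their own articles")
    return review
```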
Tracker & Annotator
[Diagram: the Annotator keeps a local copy of the data, synchronised with the Tracker's remote copy; each side runs the calculation on its own copy.]
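The tracker/annotator synchronisation described above can be sketched as follows: each annotator works on a local copy of its review data and periodically pushes it to the tracker, which merges the copies before recalculating. The class names, the per-entry version counter, and the newest-version-wins merge rule are assumptions for illustration, not the actual ScoreFinder protocol.

```python
# Hypothetical sketch of tracker/annotator data synchronisation.
# Versioned entries and the merge rule are illustrative assumptions.

class Tracker:
    def __init__(self):
        # (annotator, article) -> (score, confidence, version)
        self.remote_copy = {}

    def sync(self, local_copy):
        # Merge an annotator's local copy, keeping the newest
        # version of each (annotator, article) entry.
        for key, (score, conf, version) in local_copy.items():
            current = self.remote_copy.get(key)
            if current is None or version > current[2]:
                self.remote_copy[key] = (score, conf, version)

class Annotator:
    def __init__(self, name, tracker):
        self.name, self.tracker = name, tracker
        self.local_copy = {}

    def review(self, article, score, conf):
        # Record (or revise) a review locally, bumping its version.
        key = (self.name, article)
        _, _, version = self.local_copy.get(key, (None, None, 0))
        self.local_copy[key] = (score, conf, version + 1)

    def sync(self):
        # Push the local copy to the tracker's remote copy.
        self.tracker.sync(self.local_copy)
```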
Badumna
[Diagram: the peers with public IP addresses form a subset of all peers; not every peer has a public IP address.]
Migratable Resource Controller
[Diagram: the Tracker extends the Migratable Controller, so it can migrate between peers (Peer, Annotator, Tracker).]
Web Interface
• Makes it possible for users to share one ScoreFinder instance over their home/office network.
• Accessible everywhere, even on an iPod touch.
• Easy to customise.
• More attractive.
• Could be fully integrated with future browsers such as Google Chrome and Firefox, e.g. as plugins.
Live Demo
• We have installed the software on Aaron Harwood’s desktop machine, and he has submitted an article.
• It is also installed on our laptop with multiple test users, and several test articles have been submitted.
• We will demonstrate searching for articles, submitting articles and providing scores for articles.
• We will also demonstrate fault tolerance: e.g. if a peer leaves the system, the articles submitted from that peer remain available.
Expert Notification
• The algorithm intrinsically computes an “expertness” value for each annotator.
• We can use this information to help annotators find interesting articles.
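One way the expertness values could drive notification is sketched below: recommend to each annotator the unreviewed articles that the most expert annotators have rated highly. The ranking rule (expertise-weighted score, top-N) is an assumption for illustration, not the notification scheme actually implemented.

```python
# Hypothetical sketch of expertness-driven article recommendation.
# The expertise-weighted ranking rule is an illustrative assumption.

def recommend(articles, reviews, expertise, annotator, top_n=3):
    """articles: list of article ids; reviews: (annotator, article, score)
    tuples; expertise: annotator -> expertness in [0, 1]."""
    seen = {art for a, art, _ in reviews if a == annotator}
    ranked = []
    for art in articles:
        if art in seen:
            continue  # skip articles the annotator has already reviewed
        # Weight each score by its author's expertness; take the best.
        expert_scores = [s * expertise[a] for a, ar, s in reviews if ar == art]
        if expert_scores:
            ranked.append((max(expert_scores), art))
    ranked.sort(reverse=True)
    return [art for _, art in ranked[:top_n]]
```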
Future Improvements
• XML serialization of objects
• Migration based on CPU idleness or speed
• JSON between the browser and the web interface handler
• Automatic import of articles in BibTeX format
• File distribution
• Pluggable base algorithm
• Statistics for the rating algorithm