
Reputation Systems for Open Collaboration

Reputation Systems for Open Collaboration. CACM 2010. Bo Adler, Luca de Alfaro, Ashutosh Kulshreshtha, Ian Pye. Reviewed by: Minghao Yan.


Presentation Transcript


  1. Reputation Systems for Open Collaboration
  CACM 2010. Bo Adler, Luca de Alfaro, Ashutosh Kulshreshtha, Ian Pye
  Reviewed by: Minghao Yan

  2. Reputation Systems Introduction
  • Open Collaboration:
    • egalitarian, meritocratic, self-organizing
    • efficient, but with challenges
      • quality: spam, vandalism
      • trust: how much can you rely on the content?
  • Reputation Systems:
    • compute reputation scores for objects within a domain, based on the objects' own content or on external ratings
    • help stem abuse
    • offer indications of content quality
    • regulate people's interactions in open collaboration
  • Relevance to our course content:
    • recommendation systems
    • PageRank and HITS are "page" reputation systems

  3. Reputation Systems Content-driven vs. User-driven

  4. Reputation Systems WikiTrust
  • a reputation system for wiki authors and content
  • goals:
    • incentivize users to make lasting contributions
    • help increase content quality and spot vandalism
    • offer a guide to content quality
  • consists of:
    • a user reputation system
      • users gain reputation when their edits are preserved by later revisions
      • users lose reputation when their edits are undone by subsequent users
    • a content reputation system
      • content gains reputation when revised by a high-reputation user
      • content loses reputation when disturbed by edits

  5. Reputation Systems User Reputation System
  • assumptions:
    • there is a sequence of revisions made by different authors
    • it is possible to compare two revisions and measure their difference
    • it is possible to track unchanged content across revisions
  • user reputation reflects the quality and quantity of the contributions a user makes
  • contribution quality:
    • good quality: the change is preserved in subsequent revisions
    • bad quality: the change is rolled back in subsequent revisions
    • how do we measure how good a contribution is?

  6. Reputation Systems Contribution Quality
  • relies on an edit distance function d:
    • d(r, r') = how many words have been deleted, inserted, replaced, or displaced going from r to r'
    • language-independent
  • with a: a past revision, b: the current revision, c: a future revision, the quality satisfies -1 <= q(b | a, c) <= 1
    • q(b | a, c) = 1: revision b fully preserved
    • q(b | a, c) = -1: revision b fully reverted
  • newly created revisions cannot be judged yet: no future revision c exists
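The slide gives only the range and the two extreme cases of q(b | a, c). The WikiTrust papers define the quality roughly as the triangle ratio (d(a,c) - d(b,c)) / d(a,b), clipped to [-1, 1]; the sketch below assumes that form and takes the three edit distances as precomputed inputs rather than implementing the paper's word-level distance.

```python
def quality(d_ab, d_bc, d_ac):
    """Quality of revision b judged against past revision a and
    future revision c, clipped to [-1, 1].

    d_ab = d(a, b): the amount of change b's author made
    d_bc = d(b, c): how much of b was later changed
    d_ac = d(a, c): the net change that survives from a to c
    """
    if d_ab == 0:
        return 0.0  # b changed nothing, so there is nothing to judge
    q = (d_ac - d_bc) / d_ab
    return max(-1.0, min(1.0, q))

# b fully preserved: the a -> b change survives into c, so d(b, c) = 0
print(quality(d_ab=10, d_bc=0, d_ac=10))   # 1.0
# b fully reverted: c looks like a again, so d(a, c) = 0
print(quality(d_ab=10, d_bc=10, d_ac=0))   # -1.0
```

The clipping matters: a future editor who rewrites far more than b touched would otherwise push q outside the intended range.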

  7. Reputation Systems User Reputation
  • only non-negative reputation values are considered
  • new users are assigned a reputation close to 0
  • choosing the judging revisions:
    • 5 subsequent, 5 preceding, 2 previous revisions by high-reputation authors, and 2 previous revisions with high average text reputation
    • why? – to make the system difficult to subvert
  • calculating the user reputation increment:
    • r(B) += k · d(a, b) · q(b | a, c) · log(r(C))
    • r(B) is the reputation of author B of revision b; the term above is its increment
    • r(C) is the reputation of author C of the judging revision c
    • why a logarithm? – it damps the influence of very-high-reputation judges, balancing reputation contributions between users
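The update rule above can be sketched directly. The constant k and the exact damping function are not specified on the slide; this sketch assumes log(1 + r) so the factor stays non-negative for r(C) >= 0 and a brand-new judge (reputation 0) contributes nothing, which may differ from the paper's exact choice.

```python
import math

def reputation_increment(rep_C, d_ab, q, k=1.0):
    """Reputation increment for author B of revision b, as judged by
    revision c written by author C.

    Mirrors the slide's rule: delta = k * d(a,b) * q(b|a,c) * log(r(C)),
    with log(1 + r) standing in for the unspecified damping (assumption).
    """
    return k * d_ab * q * math.log(1.0 + rep_C)

# Only non-negative reputations are kept, so clamp after updating.
rep_B = 5.0
delta = reputation_increment(rep_C=100.0, d_ab=10, q=0.8)
rep_B = max(0.0, rep_B + delta)
```

Note how the pieces interact: a large edit (d_ab) that is later reverted (q < 0) costs B more reputation than a small one, and the logarithm keeps a single very prominent judge from dominating.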

  8. Reputation Systems User Reputation
  • resistant to manipulation
    • the only way to damage someone's reputation is to actually revert their revision
  • fair, and resistant to Sybil attacks
    • B's reputation increases only if C has a higher reputation
    • Sybil attack – creating fake identities to gain reputation
  • evaluation
    • measures the ability of user reputation to predict the quality of future contributions
    • recall is high: high-reputation users are unlikely to be reverted
    • precision is low: many novice, low-reputation authors make good contributions

  9. Reputation Systems Content Reputation
  • informative, robust, explainable
  • how? – content reputation follows from how the content has been revised and from the reputation of each revision's author
    • the edited part is assigned a small fraction of the author's reputation
    • the unchanged part gains reputation
  • tweaks
    • deleting or re-arranging text leaves a low-reputation mark
    • a revision raises text reputation only up to the author's own reputation
    • each word is associated with the last few editing authors who raised the text's reputation
    • block moves are handled
    • edit-distance weighting is adopted
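To make the two rules on this slide concrete, here is a hypothetical per-word update. Everything about it is an illustrative assumption except the two slide rules it encodes: edited text gets a small fraction of the author's reputation, and unchanged text is raised toward, but never above, the revising author's own reputation.

```python
def update_word_reputation(word_rep, author_rep, was_edited, lift=0.1):
    """Hypothetical sketch of a per-word content-reputation update.

    word_rep:   current reputation of the word
    author_rep: reputation of the author of the new revision
    was_edited: whether this word was changed by the revision
    lift:       illustrative fraction/step size (not from the paper)
    """
    if was_edited:
        # freshly edited text starts with a small fraction of
        # its author's reputation
        return lift * author_rep
    if author_rep > word_rep:
        # unchanged text drifts up, capped at the author's reputation
        return min(author_rep, word_rep + lift * (author_rep - word_rep))
    # a pass-over by a lower-reputation author does not lower the text
    return word_rep
```

The cap is the anti-abuse point: an author cannot launder text up to a reputation higher than their own by repeatedly touching it.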

  10. Reputation Systems Crowdsensus
  • a reputation system that analyzes user edits to Google Maps
  • goals
    • measure the accuracy of the information users contribute
    • reconstruct the most likely correct listing information
  • design space
    • relies on the existence of a ground truth
    • user reputation is not visible
    • the notion of identity is stronger
    • global computation is possible

  11. Reputation Systems Crowdsensus
  • input
    • triples (u, a, v) – user u asserts that attribute a has value v
  • structure – a fixpoint graph algorithm
    • vertices are users and attributes
    • for each (u, a, v), insert an edge valued v from u to a and back
    • each user vertex carries a truthfulness value q_u
  • iterations
    • all q_u are initialized to an a-priori default
    • each user vertex sends (q, v) pairs to its attribute vertices
    • an attribute-inference algorithm derives a probability distribution over the candidate values (v1, v2, ..., vn)
    • each attribute vertex sends back to each user vertex the probability that its v_i is correct
    • a truthfulness-inference algorithm re-estimates the truthfulness of each user
    • repeat until a fixpoint is reached
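The iteration above can be sketched as a small fixpoint loop. This is a toy stand-in, not Google's algorithm: the attribute-inference step here is plain truthfulness-weighted voting, whereas the paper stresses that Crowdsensus uses something more sophisticated than independent Bayesian-style aggregation; the initial q_u of 0.8 is an arbitrary a-priori default.

```python
from collections import defaultdict

def crowdsensus_sketch(assertions, n_iters=10):
    """Toy fixpoint sketch of the Crowdsensus structure.

    assertions: list of (user, attribute, value) triples.
    Returns (q, belief): per-user truthfulness and, per attribute,
    a probability distribution over candidate values.
    """
    q = {u: 0.8 for u, _, _ in assertions}  # a-priori default q_u
    belief = {}
    for _ in range(n_iters):
        # attribute inference: weight each candidate value by the
        # truthfulness of the users asserting it, then normalize
        votes = defaultdict(lambda: defaultdict(float))
        for u, a, v in assertions:
            votes[a][v] += q[u]
        belief = {a: {v: w / sum(vs.values()) for v, w in vs.items()}
                  for a, vs in votes.items()}
        # truthfulness inference: a user's q_u is the mean belief
        # in the values that user asserted
        scores = defaultdict(list)
        for u, a, v in assertions:
            scores[u].append(belief[a][v])
        q = {u: sum(s) / len(s) for u, s in scores.items()}
    return q, belief

# Two users agree on a phone number; one asserts a different value.
triples = [("alice", "phone", "555"), ("bob", "phone", "555"),
           ("mallory", "phone", "000")]
q, belief = crowdsensus_sketch(triples)
```

Each round, agreement with the majority-weighted belief raises a user's truthfulness, which in turn increases the weight of their future votes; dissenters like "mallory" end up with both a lower q_u and a losing value.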

  12. Reputation Systems Crowdsensus
  • the heart of Crowdsensus – the attribute-inference algorithm
    • the standard approach would be Bayesian inference
    • it performs poorly in real cases:
      • the assertions are not independent
      • different business attributes have different characteristics
  • the complete system
    • handles attributes with multiple correct values
    • deals with spam
    • protects the system from abuse
    • is integrated with other data-pipeline components

  13. Reputation Systems Design Space
  • content-driven vs. user-driven
  • is the reputation system visible to users?
  • weak identity vs. strong identity
  • existence of a ground truth
    • affects which algorithm is used
  • chronological vs. global reputation updates
    • a global model can exploit information in the graph topology (PageRank, HITS)
    • a chronological model can leverage past and future revisions to prevent attacks (e.g., Sybil attacks)

  14. Reputation Systems Design Space

  15. Reputation Systems Conclusion
  • reputation systems are the online equivalent of the body of laws that regulates people's interactions in the real world
  • reputation systems give users ways to evaluate content and improve the level of trust
  • the design of a reputation system should weigh the different design-space aspects
  • reputation systems should be robust and invulnerable to attacks (or there is no trust)
  • future directions:
    • reputation systems with a population-dynamics approach
    • reputation systems with multiple goals

  16. Reputation Systems Pros
  • well-defined reputation-system characteristics and goals
  • discussion of design aspects and their influence on reputation systems
  • detailed WikiTrust implementation tweaks for protecting the system from abuse and attacks
  • the comparison of two content-driven systems is well illustrated and supports the discussion of system design considerations
  • provides good measures of system accuracy, evaluated on real wiki data

  17. Reputation Systems Cons
  • lacks a deeper explanation of the algorithms in Crowdsensus
  • lacks evidence that the Crowdsensus algorithm outperforms standard Bayesian inference on real data
  • lacks a comparison of the user-driven and content-driven models' performance, and of how the two could work together
