Encouraging Peer Participation in Decentralized Systems: Havelaar Approach

Havelaar: A Robust and Efficient Reputation System for Active Peer-to-Peer Systems • Dominik Grolimund, Luzius Meisser, Stefan Schmid, Roger Wattenhofer • Distributed Computing Group, Computer Engineering and Networks Laboratory (TIK), ETH Zurich • NetEcon'06, June 10, Ann Arbor, Michigan, USA

Talk Outline • Environment • Existing Solutions • Principles of Havelaar • Evaluation • Conclusions

Do You Know YouTube? • Very popular online video platform • > 30 million users, growing rapidly • >> 1 million videos watched every day • >> 10,000 videos uploaded every day → very active!

Guess What: YouTube is Centralized • Hosted on servers • Simple, but: huge costs • $1 million / month for bandwidth and storage → Low quality → Limited (10-minute clips)

Imagine YouTube Being Decentralized • Files stored in a distributed storage system • Resources provided by the users • Uncontrollable environment: • unreliable, ordinary desktop computers • private users • turn computer on and off at any time • can leave the system forever at any time • open, attracts malicious agents, attacks • rational agents, free-riders

Three Key Problems 1. Availability How can the data be made immediately accessible when requested, although users can turn off their computer at any time? 2. Reliability How can the data be stored persistently, despite the inherent dynamics, node departures, and malicious nodes? 3. Incentives (focus of this talk) How can rational agents be encouraged to provide their resources without free-riding?

Kangoo – A Distributed Storage System • Research at ETH Zurich • Availability achieved with redundancy: • A file is divided into ~100 blocks, which are then encrypted and encoded into ~500 redundant fragments using erasure codes • Any 100 are sufficient to reconstruct the file • Lots of transactions necessary! • Usage of YouTube would result in tens of thousands of transactions per peer and week • Not ready yet, but you can subscribe for the beta: www.caleido.com/kangoo
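The "any 100 of ~500 fragments" property can be illustrated with a toy Reed-Solomon-style code. This is only a sketch under stated assumptions: the slides do not say which erasure code Kangoo uses, and the field size `P` and the helper names `encode` and `decode` are hypothetical. The k data blocks are treated as values of a polynomial at x = 0..k-1, and the fragments are its values at x = 0..n-1.

```python
# Toy Reed-Solomon-style erasure code over the prime field GF(257).
# Assumption: this is NOT Kangoo's actual code; it only illustrates "any k of n".
P = 257  # prime larger than any byte value; n must not exceed P


def _interpolate(points, x):
    """Evaluate at x the unique polynomial of degree < len(points) through points (mod P)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+)
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(blocks, n):
    """Turn k data blocks into n fragments (x, y); the first k fragments equal the data."""
    points = list(enumerate(blocks))
    return [(x, _interpolate(points, x)) for x in range(n)]


def decode(fragments, k):
    """Rebuild the k data blocks from any k distinct fragments."""
    if len(fragments) < k:
        raise ValueError("need at least k fragments")
    points = fragments[:k]
    return [_interpolate(points, x) for x in range(k)]


data = [72, 105, 33]                 # k = 3 blocks
fragments = encode(data, 6)          # n = 6 fragments
restored = decode([fragments[5], fragments[1], fragments[3]], 3)
```

With ~100 blocks and ~500 fragments the same idea applies at a larger scale: every fragment has to be uploaded to and later fetched from some peer, which is why so many transactions are needed.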

This Talk: Havelaar • How to encourage peers to provide their upload bandwidth? (storage and online time are handled by Kangoo itself) • Havelaar is independent of Kangoo → can be used for other systems as well • Robust to attacks • Efficient, scalable in the number of transactions

Existing Solutions • Direct reciprocity (e.g. BitTorrent) • Tit-for-tat, iterated prisoner's dilemma • Works for content distribution, but not for a system where interactions between the same two peers are too infrequent • Monetary-based (e.g. Karma) • Economic theory • But: centralized or else inefficient, market regulations, ... • Reputation systems (e.g. eBay) • Service differentiation: the higher your reputation, the better your service • Good, but how...

Reputation Systems How to keep track of the contribution of each peer? • Client (e.g. Kazaa) • Simple to subvert, as has been shown with Kazaa Lite • Centralized (e.g. eBay) • Far more transactions if used for fairness in a p2p system → server cluster would be needed • Decentralized • Good, but how...

Decentralized Reputation Systems • Direct observations do not scale to large networks with infrequent interactions • We need to incorporate second-hand observations • Big new problem: false reports

Coping with False Reports How to defend against false reports? • Max-flow • Maximum likelihood estimation • Bayesian approach • Transitivity of trust • weigh the voting by the reputation of the sender • Most systems are designed for a decentralized "pure reputation system" (e.g. eBay), but not meant for a fairness system where we need to track the contribution of each peer with lots of transactions

Storing Contribution Values Where to store the contribution value of each peer? • Flood in the system (e.g. EigenRep) • Request from peers before transaction • Store in a DHT: "DHT-based approach" • store and update contribution value of peer u at h(u) in a DHT → Scales linearly in the number of transactions

Introducing Havelaar • Approximation is good enough! • If peer u provides three times more than v, u should get about three times the download bandwidth of v • Track contribution value C: bandwidth b, size s • If the locally computed contribution value is close to the global / real one for all peers, that's fine

Local Vector • Every peer has a local observation vector o • After u downloads from v (bandwidth of 5, size 3), u will increase the entry of v by 5 * 3 (Cv += 15) • Only after complete transaction
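The update rule on this slide can be sketched in a few lines. The class name `LocalVector` and the `completed` flag are hypothetical; only the rule itself (add bandwidth times size to the provider's entry, and only after a complete transaction) comes from the slide.

```python
from collections import defaultdict


class LocalVector:
    """Local observation vector o of one peer (illustrative sketch)."""

    def __init__(self):
        # Entry for peer v holds the contribution this peer observed from v.
        self.entries = defaultdict(float)

    def record_download(self, provider, bandwidth, size, completed=True):
        """Credit the provider with bandwidth * size, but only for complete transactions."""
        if not completed:
            return
        self.entries[provider] += bandwidth * size


o_u = LocalVector()                                   # vector kept by peer u
o_u.record_download("v", bandwidth=5, size=3)         # Cv += 15
o_u.record_download("v", bandwidth=8, size=2, completed=False)  # ignored
```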

Send Local Vector To Successors • Once a round (~ week), peer w sends its observation vector o to its k successors (figure: h1(w), h2(w), h3(w), h4(w)) • k successors: determined by hash functions on the sender id w • same successors in every round • Defend against attacks: • can only send to its k successors → limited influence • can only send once per round • "self-observation" of the sender is dropped → cannot praise itself
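The three defenses on this slide can be sketched as checks on the receiving side. Everything concrete here is an assumption made for illustration: the slides do not specify the hash functions, so `h` uses salted SHA-256, successors are looked up on a sorted ring of peer positions as in consistent hashing, and the names `ID_SPACE`, `successors` and `Receiver` are hypothetical.

```python
import hashlib
from bisect import bisect_left

ID_SPACE = 2**32  # hypothetical identifier space


def h(i, sender_id):
    """i-th hash function applied to the sender id (assumed: salted SHA-256)."""
    digest = hashlib.sha256(f"{i}:{sender_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % ID_SPACE


def successors(sender_id, ring, k):
    """Positions of the k successors of sender_id on a sorted ring of peer positions."""
    result = []
    for i in range(1, k + 1):
        idx = bisect_left(ring, h(i, sender_id)) % len(ring)  # wrap around the ring
        result.append(ring[idx])
    return result  # deterministic, so the same in every round; may repeat on a small ring


class Receiver:
    def __init__(self, position, ring, k):
        self.position, self.ring, self.k = position, ring, k
        self.seen = set()  # (sender, round) pairs already accepted

    def accept(self, sender_id, round_no, vector):
        """Return the vector to aggregate, or None if the message is rejected."""
        if self.position not in successors(sender_id, self.ring, self.k):
            return None  # sender may only reach its k successors
        if (sender_id, round_no) in self.seen:
            return None  # only one vector per sender and round
        self.seen.add((sender_id, round_no))
        # Drop the sender's self-observation so it cannot praise itself.
        return {peer: value for peer, value in vector.items() if peer != sender_id}


ring = [i * (ID_SPACE // 10) for i in range(10)]  # ten hypothetical peers
succ = successors("w", ring, 2)
receiver = Receiver(succ[0], ring, 2)
```

Because anyone can recompute the successors from the sender id alone, a receiver can verify that a sender was entitled to contact it without any coordination.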

Aggregation: Need More Observations • Need more observations for an accurate approximation → aggregate exponentially more • Figure: a peer receives matrices [o1,o2,o3] from several senders and combines them with its own observations o0 into O = [o0, o1+o1+o1, o2+o2+o2]; o3 is dropped • All received vectors are used for the contribution update c • Defend against attacks: • for each entry, outliers are detected and dropped • praise or accusation "within bounds" will be smoothed out (lots of observations aggregated) • distribution of a vector can be analyzed → if spiked, then it is most likely an attack → drop, maybe even decrease the trust value of that peer
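The per-entry outlier defense can be sketched as follows. The slides do not say how outliers are detected, so the median-based rule, the `factor` parameter and the function names are assumptions chosen only to make the idea concrete: for each peer, reports far from the median are dropped and the rest are summed.

```python
from statistics import median


def aggregate_entry(reports, factor=3.0):
    """Sum the reports about one peer after dropping outliers relative to the median.

    Assumed rule: keep a report r only if median / factor <= r <= median * factor.
    """
    if not reports:
        return 0.0
    m = median(reports)
    kept = [r for r in reports if m / factor <= r <= m * factor]
    return float(sum(kept))


def aggregate(vectors, factor=3.0):
    """Aggregate several received observation vectors (dicts: peer -> value) entry by entry."""
    peers = set()
    for vector in vectors:
        peers.update(vector)
    return {
        peer: aggregate_entry([v[peer] for v in vectors if peer in v], factor)
        for peer in sorted(peers)
    }


received = [{"a": 10}, {"a": 12}, {"a": 11}, {"a": 1000, "b": 5}]
combined = aggregate(received)  # the exaggerated report of 1000 for "a" is dropped
```

A report that stays within the bounds is not dropped, but its effect shrinks as more honest observations are summed into the same entry.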

Rewarding • Always allocate full bandwidth → No artificial limits • Contention: two or more peers want to download from a third node at the same time → allocate according to the contribution values • Different resource allocation algorithms possible. We chose an algorithm similar to: An Incentive Mechanism for P2P Networks, R. T. B. Ma et al., ICDCS 2004
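A minimal sketch of allocation under contention, assuming a plain proportional share. This is not the algorithm of Ma et al. that the talk refers to; the function name `allocate` and the equal split for all-zero contributions are assumptions made for illustration.

```python
def allocate(capacity, contributions):
    """Split the full upload capacity among competing downloaders.

    contributions maps each requesting peer to its (locally computed) contribution value.
    Assumed rule: share proportional to contribution; equal split if all values are zero.
    """
    if not contributions:
        return {}
    total = sum(contributions.values())
    if total == 0:
        share = capacity / len(contributions)
        return {peer: share for peer in contributions}
    return {peer: capacity * c / total for peer, c in contributions.items()}


# u has contributed three times as much as v, so u gets three times the bandwidth.
shares = allocate(100, {"u": 30, "v": 10})
```

The full capacity is always handed out, so a peer with a low contribution value is only slowed down while others compete for the same uploader.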

Evaluation • We have analyzed and simulated Havelaar • 5 successors and a matrix with four vectors are already enough for huge networks with more than 100,000 nodes and 5,000 transactions per peer and round • (Figure: bootstrapping)

Communication Costs • Need to send a huge matrix, but: it does not depend on the number of transactions! • The more transactions, the higher the accuracy!

Conclusions • Havelaar for active, long-term peer-to-peer systems • Robust against attacks, false reports • Low communication costs: scalable in the number of transactions • Churn: not an issue because the local vector can be sent at any time in a round • Kangoo takes care of other attacks (Sybil attacks, whitewashing) and has strong identifiers

Thank you for your attention!
