Toward Fault-tolerant P2P Systems: Constructing a Stable Virtual Peer from Multiple Unstable Peers

Toward Fault-tolerant P2P Systems: Constructinga Stable Virtual Peer from Multiple Unstable Peers Kota Abe, Tatsuya Ueda (Presenter), Masanori Shikano, Hayato Ishibashi andToshio MatsuuraOsaka City University 14/Oct/2009 AP2PS2009 AP2PS2009

Background • P2P systems pros and cons • pros: scalability, no single point of failure, etc. • cons: hard to implement! • detect remote peer failure • replicate data over multiple peers • manage multiple pointers to backup peers • Implementing these measures is delicate work and troublesome burden for developers GOAL! • Implement a reliable layer for fault tolerant P2P systems AP2PS2009

Our Approach • Virtual Peer (VP) • Group multiple unstable peers to form a stablevirtual peer(redundant system) Virtual Peer(VP) Virtual Peer(VP) AP2PS2009

Our Approach (Cont'd) • A virtual peer consists of multiple member peers • A P2P application runs on a virtual peer as a virtual process • Failed member peer is replaced with another (non-failed) one • A virtual process is fault-tolerant • It does not fail even if some part of the member peers fail • Application developers do not need to take care of peer failure Virtual Peer Virtual Process memberpeer1 memberpeer2 memberpeer3 AP2PS2009

Issues to solve • How to achieve fault-tolerance of a virtual process? • How to ensure identical message sequences? • How to handle peer failure? • How to communicate with a remote virtual peer? AP2PS2009

1. Achieving fault-tolerance of virtual process • The state of a virtual process must be replicated over multiple member peers • Each member peer simultaneously and redundantly executes the same application, as a process • To maintain the state of each process identical: • A process must be a state machine • its state must be changed only by external messages • Also, each process receives the identical message sequence (aka atomic broadcasting) • Merit: application programs can be quite simple • Just process the received messages in order AP2PS2009

2. Ensuring identical message sequences • To implement atomic broadcast, the Paxos consensus algorithm is used • Paxos • Distributed algorithm to form a consensus between multiple nodes (peers) on an unreliable network • Only a dedicated leader peer can propose values • The leader is elected by using a leader election algorithm • All peers eventually choose an identical value • Majority agreement is required • All the member peers in VP execute Paxos algorithm • External messages sent to a VP are processed by the Paxos algorithm to be identically ordered AP2PS2009

3. Handling peer failure • Failed member peer must be replaced to keep the number of the peers constant • Otherwise the VP eventually will not be functional because majority agreement is required by Paxos • All the member peers must have a consistent view of membership configuration • Paxos is also used to update a member configuration without losing consistency AP2PS2009

3. Handling peer failure (Cont'd) Virtual Peer • The leader peer chooses another peer p from the P2P network • If leader peer fails, new leader is elected • The leader peer proposes a peer configuration change • p executes the same process • The state must be same • Process migration technique is used • Note that the majority of member peers must be alive during this replacing sequence Process Process Process Process Leader Peer Member Peer Member Peer Keep Alive choose randomly Timeout fail Peer p Paxos Process migration (replication) Virtual Peer AP2PS2009

4. Communication with virtual peer • How to deliver messages to VPs • Member peers are not fixed! • Solution: Use ALM (Application Level Multicast) • Each VP has a dedicated ALM group • All member peers join in • Messages sent to a VP are multicast to the group • We have implemented ALM by using range queries on Skip Graph ALM group Multicast by group id Virtual Peer AP2PS2009

Our implementation: musasabi • A platform forimplementing P2P services • Implemented in Java • Each peer executes a musasabi instance • An application program written in Java can be executed on musasabi • Java sandbox mechanism is used to protect a local node • musasabi uses PIAX for P2P networking • PIAX provides Skip Graph, ALM (over Skip Graph) etc. • http://www.piax.org/en/ Process 1 Process 2 Process 3 musasabi PIAX Java VM Base Operating SystemWindows, MacOS X, Linux … Configuration of musasabi AP2PS2009

Process migration in musasabi • musasabi supports strong mobility • Transfer the program, data and execution context (thread stack and program counter) • Not easy in Java (not supported by the standard JVMs) • Some implementations use customized JVMs or native libraries (not portable) • Not suitable for P2P systems! • Implementation of strong mobility in musasabi • Use Apache Javaflow library • Javaflow allows to capture and resume the execution context • Captured contexts can be transferred to a remote node! • Javaflow uses byte code translation technique and thus works on the standard JVMs AP2PS2009

Reliability of virtual peers Virtual Peer A VP fails if a majority of member peers fails • Maximum time toreplace a failed peeris 60sec • Each peer fails independently • Intervals of peerfailure are exponentiallydistributed • Peer failure rate:50% of peers failin an hour Process Process Process Leader Peer Member Peer Member Peer Keep Alive 7 peers choose randomly fail Timeout Process Paxos Peer p 60sec Reliability of VP 5 peers 3peers Process migration (replication) Elapsed time (days) 7 peers are enough in this case After 1 year Virtual Peer

MTTF of virtual peer Relation between MTTF (Mean Time To Failure) of a VP and # of its member peer is analyzed • Each peer fails independently • Intervals of peerfailure are exponentiallydistributed • Maximum time toreplace a failed peer is 60sec • Peer failure rate isvaried (x-axis) 9 peers MTTF of VP (year) 7 peers 5 peers Even in excessive peer failure environment, VP is stable if it has enough member peers Duration of 50% of the peers fail (minutes) frequently peer fails less frequently AP2PS2009

Conclusion and Future work • We proposed a novel method to construct a stable virtual peer from multiple unstable peers • Integrate the Paxos consensus algorithm, process migration technique and ALM • An application running on a VP virtually does not fail • Application programs can be quite simple • The method can be used for reducing development costs, and for improving stability, of P2P systems • Future work • Improve the method for choosing good member peers • Investigate and improve security issues of VPs • Evaluate the method on the Internet AP2PS2009

Questions? This research was partially supported by National Institute of Information and Communication Technology (NICT), Japan AP2PS2009

Toward Fault-tolerant P2P Systems: Constructing a Stable Virtual Peer from Multiple Unstable Peers

Toward Fault-tolerant P2P Systems: Constructing a Stable Virtual Peer from Multiple Unstable Peers

Presentation Transcript

Automatic Synthesis of Fault-Tolerance

Multiple-Input-Multiple- Output (MIMO) Systems

Peer-to-Peer Protocols and Systems

Fault Tolerance in Distributed Systems

Fault Tolerance

Computing in the

Stable Atmosphere

CS 552 Peer 2 Peer Networking

Fault-Tolerance: Practice Chapter 7

HSAemu

Fault Tolerance

Fault Tolerance in Protein Interaction Networks: Stable Bipartite Subgraphs and Redundant Pathways

Virtual Reality Systems

GEU 0047: Meteorology Lecture 6 Stability and Cloud Development

Software Fault Tolerance (SWFT) SWIFI in OSs

Software Fault Tolerance (SWFT) SWIFI in OSs

Pastry Scalable, decentralized object location and routing for large-scale peer-to-peer systems

Peer-to-Peer Networking and Networked Virtual Environments

Distributed Systems Concepts and Design Chapter 10: Peer-to-Peer Systems

Midterm Review CS 230 – Distributed Systems (ics.uci/~cs230)

Virtual Memory CENG331: Introduction to Computer Systems 12 th Lecture