Erasure Code Replication

Presentation Transcript

  1. Erasure Code Replication Presenter: W.K Lin (The Chinese University of Hong Kong)

  2. Why do we need replication? • Storage devices can fail. • Replication increases data availability, e.g. RAID. • The basic idea of replication: • Store copies of the data in several places to increase the chance that at least one copy can be found. • P2P systems often provide replication.

  3. Server-less VoD Architecture • There is no centralized video server to provide the video stream. • Each client in the system stores a subset of the video blocks. • The video blocks are encoded with an erasure code. • It is not necessary to stream from all peers for complete video playback. • Clients stream the video from other clients.

  4. Some Terminology • Peers are the computers/storage devices that store the data. • Peer availability μ is the fraction of time that a peer is up/online. • File availability A is the probability that the file can be recovered from the replicated data. • Storage overhead S is the ratio of the storage required after replication to the storage required before replication.

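  5. Whole File Replication • Whole file replication replicates the complete file. • If the storage overhead is S, then there are S copies of the data in the system. • File availability $A_w$ (the file is available if at least one of the S copies is on an online peer): $A_w = 1 - (1-\mu)^S$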

  6. Whole File Replication • It is not storage-efficient. Adapted from: Replication Strategies for Highly Available Peer to Peer Networks, Ranjita Bhagwan et al.

  7. Erasure Code Replication • Instead of replicating the whole file, replicate portions of the file. • Principle: • A file is divided into b blocks. • An erasure code adds redundancy to these b blocks, giving n blocks in total. • The n file blocks are made dependent on each other: each file block carries partial information about the other blocks. • Any b out of the n blocks are enough to recover the original file.
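
The "any b out of n" property can be illustrated with a small polynomial code. This is only a sketch under assumptions not in the slides: symbols are integers modulo the prime 257, the b data symbols are treated as the values of a polynomial of degree less than b at x = 0..b-1, and the names `encode`, `decode` and `recovers_from_any_subset` are made up for the illustration. It is not the code used by the presenter's system.

```python
from itertools import combinations

P = 257  # prime modulus (assumption for this sketch); requires n <= P


def lagrange_eval(points, x):
    """Evaluate at x, modulo P, the unique polynomial of degree
    < len(points) that passes through the given (x_i, y_i) points."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+).
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data, n):
    """Turn b data symbols into n blocks (x, y). The first b blocks
    are the data itself; the remaining n - b blocks are redundancy."""
    points = list(enumerate(data))
    return [(x, lagrange_eval(points, x)) for x in range(n)]


def decode(blocks, b):
    """Recover the b data symbols from any b distinct blocks."""
    if len(blocks) < b:
        raise ValueError("need at least b blocks")
    subset = blocks[:b]
    return [lagrange_eval(subset, x) for x in range(b)]


def recovers_from_any_subset(data, n):
    """Check that every choice of b blocks out of n decodes correctly."""
    b = len(data)
    blocks = encode(data, n)
    return all(decode(list(c), b) == data for c in combinations(blocks, b))
```

For example, `recovers_from_any_subset([72, 105, 33], 6)` encodes b = 3 symbols into n = 6 blocks (storage overhead S = 2) and tries all 20 three-block subsets.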

  8. Erasure Code Replication • Storage overhead S = n/b, or n = S·b. • Since any b out of the S·b blocks are enough to recover the file, the file availability $A_b$ is: $A_b = \sum_{i=b}^{Sb} \binom{Sb}{i} \mu^i (1-\mu)^{Sb-i}$ • Notice that whole file replication is a special case of erasure code replication with b = 1.
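
A minimal numerical sketch of the availability calculation, assuming each of the S·b blocks sits on a different peer and that peers are online independently with the same availability μ, so the number of reachable blocks is binomial. The function names are illustrative, and the direct `comb` product is only meant for modest b.

```python
from math import comb


def file_availability(S, b, mu):
    """Probability that at least b of the n = S*b blocks are reachable,
    with each block online independently with probability mu."""
    n = S * b
    return sum(comb(n, i) * mu**i * (1 - mu)**(n - i)
               for i in range(b, n + 1))


def whole_file_availability(S, mu):
    """Probability that at least one of S whole copies is reachable."""
    return 1 - (1 - mu)**S
```

With b = 1 the two functions agree, which matches the special case noted on the slide. For highly available peers, for instance `file_availability(2, 8, 0.9)` versus `whole_file_availability(2, 0.9)`, the erasure-coded file comes out more available at the same storage overhead.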

  9. Erasure Code Replication • Erasure code replication is more storage-efficient. Adapted from: Replication Strategies for Highly Available Peer to Peer Networks, Ranjita Bhagwan et al.

  10. Effectiveness of Erasure Code Replication • The effectiveness of erasure code replication is determined by two factors: • the combinatorial effect, i.e. $\binom{Sb}{b} \gg \binom{S}{1}$ • the peer availability factor $\mu^b (1-\mu)^{Sb-b}$ • Erasure code replication depends on S, b, and μ.
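
The two competing factors can be computed side by side. This is an illustrative sketch (the function name is made up): the first value counts the ways to choose exactly b online blocks out of S·b, and the second is the probability of any one such pattern when peers are independent with availability μ.

```python
from math import comb


def effectiveness_factors(S, b, mu):
    """Return (combinatorial factor, peer availability factor)
    for exactly b of the n = S*b blocks being online."""
    n = S * b
    combinatorial = comb(n, b)                 # grows quickly with b
    per_pattern = mu**b * (1 - mu)**(n - b)    # shrinks quickly with b
    return combinatorial, per_pattern
```

For S = 3, going from b = 1 to b = 4 raises the combinatorial factor from 3 to 495, while the per-pattern probability drops; which effect wins depends on S, b, and μ.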

  11. Effectiveness of Erasure Code Replication

  12. How Does Erasure Code Replication Perform? • File availability A ($A_w$ or $A_b$) as μ and S vary:

  13. A Related Problem • The Lee and Liew paper “Parallel Communications for ATM Network Control and Management” points out a similar problem: • An information string is divided into b parts, then encoded into n parts. • Any b out of the n parts are enough to recover the original information. • Very similar to our problem! • They prove a necessary bound Sμ > 1 for reliable communication.

  14. Erasure Code Bound (Sμ > 1) • The area above the curve defines the region in which erasure code replication is preferred for large b.
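
One way to see why Sμ = 1 is the dividing line is a Chebyshev argument. This is a sketch, not necessarily the proof given by Lee and Liew or in the appendix; it assumes the number of online blocks X is binomial with Sb trials and success probability μ.

```latex
% X ~ Binomial(Sb, \mu), so E[X] = Sb\mu and Var(X) = Sb\mu(1-\mu).
% The file is recoverable exactly when X \ge b.
%
% Case S\mu > 1 (the mean lies above b):
\[
  1 - A_b = P(X < b)
  \le P\bigl(|X - Sb\mu| \ge b(S\mu - 1)\bigr)
  \le \frac{Sb\mu(1-\mu)}{b^2 (S\mu - 1)^2}
  = \frac{S\mu(1-\mu)}{b\,(S\mu - 1)^2}
  \xrightarrow[b \to \infty]{} 0 .
\]
% Case S\mu < 1 (the mean lies below b):
\[
  A_b = P(X \ge b)
  \le P\bigl(|X - Sb\mu| \ge b(1 - S\mu)\bigr)
  \le \frac{S\mu(1-\mu)}{b\,(1 - S\mu)^2}
  \xrightarrow[b \to \infty]{} 0 .
\]
```

So for large b the availability tends to 1 when Sμ > 1 and to 0 when Sμ < 1.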

  15. Erasure Code Replication Sensitivity Analysis • We need to use a large b in order to benefit from erasure code replication. • If the system is operating at a level Sμ ≈ 1, a small fluctuation in the system parameters will harm the system.

  16. Erasure Code Replication Sensitivity Analysis • The system is designed to operate at S = 3, μ = 0.35, so Sμ = 1.05 > 1. • Assume a 10% measurement error in μ.
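
The effect of such a measurement error can be explored numerically. The sketch below assumes independent peers with identical availability (a binomial model) and reads "10% error" as the true μ being 10% lower than the target, i.e. 0.315, which gives Sμ = 0.945 < 1. Function names are illustrative. The probabilities are computed in log space because the binomial coefficients for large b do not fit in a float.

```python
import math


def log_binom_pmf(n, k, p):
    """Natural log of the Binomial(n, p) probability of exactly k
    successes; requires 0 < p < 1."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p))


def file_availability(S, b, mu):
    """P(at least b of the S*b blocks are online), summed in log space."""
    n = S * b
    total = sum(math.exp(log_binom_pmf(n, k, mu)) for k in range(b, n + 1))
    return min(1.0, total)


def sensitivity(S, mu, rel_error, b):
    """Availability at the target mu and at mu reduced by rel_error."""
    mu_low = mu * (1 - rel_error)
    return file_availability(S, b, mu), file_availability(S, b, mu_low)
```

Calling `sensitivity(3, 0.35, 0.10, b)` for increasing b (say 10, 100, 1000) shows the two availabilities moving apart: the target point sits just above the bound and improves with b, while the mis-measured point sits just below it and degrades.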

  17. Related Work I: • Markov chain model for a simple birth/death process. Adapted from: Design and Analysis of a Fault-Tolerant Mechanism for a Server-Less Video-On-Demand System, Lee and Yeung

  18. Related Work I: • Mean time to failure of the model: • Result:

  19. Related Work II: • Another Markov model: • c: connected state, mean time to stay = λ • u: disconnected state, mean time to stay = μ • d: dead state • α: the probability of going to the dead state d. Adapted from: Data Durability in Peer to Peer Storage Systems, Gil Utard and Antoine Vernois

  20. Related Work II: Storage overhead S=3

  21. Conclusion • Traditionally, erasure code replication has been very successful, e.g. RAID. • A strict bound, Sμ > 1, has to be satisfied for a system to gain from erasure code replication. • Erasure code replication is sensitive to system measurement errors. • This partly explains why erasure code replication is not seen in P2P systems.

  22. Future Directions • Most analyses are based on the assumption that all peers have the same availability level. • In a real system, peers may have different failure and recovery rates. • Replica distribution and discovery are open research problems: • How should replicas be placed and located if the peers have different availabilities? • If the system fails, how can the lost replicas be recovered?

  23. ~ End of presentation ~

  24. Appendix • Proof: Let X be a binomial random variable having mean $\mu' = Sb\mu$ and variance $\sigma^2 = Sb\mu(1-\mu)$.

  25. Appendix • Similarly,