
Network Coding for Distributed Storage Systems • IEEE Transactions on Information Theory, September 2010 • Alexandros G. Dimakis, Brighten Godfrey, Yunnan Wu, Martin J. Wainwright, Kannan Ramchandran
Outline • Introduction • Background • Analysis • Evaluation • Conclusion
Introduction • Distributed storage systems provide reliable access to data through redundancy spread over individually unreliable nodes. • Storing encoded data in distributed storage systems: • the encoded data are spread across nodes. • less redundancy is required than with replication. • stored data must be replaced periodically.
Introduction • Key issues in distributed storage systems: • repair bandwidth • storage space • How can encoded data be generated in a distributed way while transferring as little data as possible?
MDS Codes • A common practice for repairing a single node failure in an erasure-coded system: • a new node reconstructs the whole encoded data object. • it then generates just one encoded block. • Maximum Distance Separable (MDS) code: • (n, k)-MDS property • the original file can be recovered from any k of the n encoded blocks.
MDS Codes • [Figure: a file of size M is divided into k pieces of size M/k, MDS-encoded into blocks of size M/k, and stored at n nodes.]
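The (n, k)-MDS property can be illustrated with the smallest nontrivial case. The sketch below is not taken from the slides: it assumes a toy (3, 2) single-parity code built with XOR, and the function names `encode` and `decode` are hypothetical. Practical systems would typically use a code such as Reed-Solomon instead.

```python
from itertools import combinations


def xor_bytes(p: bytes, q: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(p, q))


def encode(data: bytes) -> list[bytes]:
    """Split data into k = 2 halves and append one XOR parity block (n = 3)."""
    if len(data) % 2:
        raise ValueError("toy example expects even-length data")
    half = len(data) // 2
    a, b = data[:half], data[half:]
    return [a, b, xor_bytes(a, b)]


def decode(blocks: dict[int, bytes]) -> bytes:
    """Recover the file from any 2 of the 3 blocks, keyed by block index."""
    if len(blocks) < 2:
        raise ValueError("need at least k = 2 blocks")
    if 0 in blocks and 1 in blocks:
        a, b = blocks[0], blocks[1]
    elif 0 in blocks:
        a = blocks[0]
        b = xor_bytes(a, blocks[2])  # b = a XOR parity
    else:
        b = blocks[1]
        a = xor_bytes(b, blocks[2])  # a = b XOR parity
    return a + b


# Every choice of k = 2 blocks out of n = 3 recovers the original file.
original = b"network!"
stored = encode(original)
all_pairs_recover = all(
    decode({i: stored[i], j: stored[j]}) == original
    for i, j in combinations(range(3), 2)
)
```

Each block has size M/k, so the total storage is nM/k, matching the divide-encode-store flow in the figure.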
Introduction • Redundancy must be continually refreshed as nodes fail in distributed storage systems. • this results in large data transfers across the network.
Introduction • Erasure codes can be repaired without communicating the whole data object. • (4, 2)-MSR example when a node fails: • the surviving nodes generate smaller parity packets of their data. • they forward them to the newcomer. • the newcomer mixes the packets to generate two new packets. • [Figure: repair diagram with labels of 0.5.]
Introduction • This paper identifies an optimal tradeoff curve between storage and repair bandwidth. • smaller storage space => less redundancy => more repair bandwidth • The paper calls codes that lie on this optimal tradeoff curve regenerating codes.
Introduction • Minimum-Storage Regenerating (MSR) codes: • can be efficiently repaired. • Minimum-Bandwidth Regenerating (MBR) codes: • each storage node stores slightly more than M/k. • the repair bandwidth can be reduced.
Erasure Codes • Classical coding theory focuses on the tradeoff between redundancy and error tolerance. • In terms of the redundancy-reliability tradeoff, Maximum Distance Separable (MDS) codes are optimal. • the best-known are Reed-Solomon codes.
Network Coding • Network coding allows • the intermediate nodes to generate output data by encoding previously received input data. • information to be “mixed” at intermediate nodes. • This paper investigates the application of network coding for the repair problem in distributed storage. • tradeoff between storage and repair network bandwidth
Distributed Storage Systems • Erasure codes could reduce bandwidth use by an order of magnitude compared with replication. • Hybrid strategy: • one special storage node maintains one full replica. • other nodes store multiple erasure-encoded fragments. • the replica node transfers only M / k bytes to create a new encoded fragment. • a problem arises when the replica is lost.
Storage-Bandwidth Tradeoff • The normal redundancy we want to maintain requires n active storage nodes: • each stores α bits. • a newcomer downloads β bits each from any d surviving nodes. • total repair bandwidth is γ = dβ. • For each set of parameters (n, k, d, α, γ), there is a family of information flow graphs, each of which corresponds to a particular evolution of node failures / repairs.
Storage-Bandwidth Tradeoff • Denote this family of directed acyclic graphs by G(n, k, d, α, γ). • Example: the point (n, k, d, α, γ) = (4, 2, 3, 1 Mb, 1.5 Mb) is feasible.
Storage-Bandwidth Tradeoff • Theorem 1: For any α ≥ α*(n, k, d, γ), the points (n, k, d, α, γ) are feasible.
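The threshold α*(n, k, d, γ) is not shown on the slide. The sketch below uses the piecewise form as recalled from the paper's Theorem 1, so treat the formulas as an assumption to check against the paper: α* = M/k for γ ≥ f(0), and α* = (M − g(i)γ)/(k − i) for f(i) ≤ γ < f(i − 1), with f(i) = 2Md / ((2k − i − 1)i + 2k(d − k + 1)) and g(i) = (2d − 2k + i + 1)i / (2d). The function name `alpha_star` and its argument order are hypothetical.

```python
from fractions import Fraction


def alpha_star(n, k, d, M, gamma):
    """Minimum per-node storage for repair bandwidth gamma.

    Returns None when gamma is below f(k - 1), i.e. no storage
    amount makes the point feasible (assumed form of Theorem 1).
    """
    if not (k <= d <= n - 1):
        raise ValueError("expected k <= d <= n - 1")
    M, gamma = Fraction(M), Fraction(gamma)

    def f(i):
        # Bandwidth breakpoints of the piecewise-linear tradeoff curve.
        return 2 * M * d / ((2 * k - i - 1) * i + 2 * k * (d - k + 1))

    def g(i):
        return Fraction((2 * d - 2 * k + i + 1) * i, 2 * d)

    if gamma >= f(0):
        return M / k  # minimum-storage regime
    for i in range(1, k):
        if f(i) <= gamma < f(i - 1):
            return (M - g(i) * gamma) / (k - i)
    return None  # gamma < f(k - 1): infeasible for every alpha
```

For the (4, 2, 3) example with an assumed file size of M = 2 Mb, `alpha_star(4, 2, 3, 2, Fraction(3, 2))` evaluates to 1, which agrees with the feasible point (4, 2, 3, 1 Mb, 1.5 Mb).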
Theorem Proof (2/4 to 4/4) • [Equation-only slides; the formulas were not preserved.]
Storage-Bandwidth Tradeoff • Code repair can be achieved if and only if the underlying information flow graph has sufficiently large min-cuts.
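The min-cut condition can be checked numerically. The sketch below assumes the lower bound on the min-cut used in the paper's analysis, the sum over i = 0, ..., min(d, k) − 1 of min{(d − i)β, α}, and treats a point as feasible when that bound is at least the file size M. The bound is recalled from the paper rather than shown on the slides, and the function names are hypothetical.

```python
from fractions import Fraction


def min_cut_bound(k, d, alpha, gamma):
    """Assumed min-cut lower bound: sum of min((d - i) * beta, alpha)."""
    beta = Fraction(gamma) / d  # each of the d helpers sends beta = gamma / d
    alpha = Fraction(alpha)
    return sum(min((d - i) * beta, alpha) for i in range(min(d, k)))


def is_feasible(M, k, d, alpha, gamma):
    """A data collector can recover M only if the min-cut is at least M."""
    return min_cut_bound(k, d, alpha, gamma) >= M


# (n, k, d, alpha, gamma) = (4, 2, 3, 1 Mb, 1.5 Mb), assuming M = 2 Mb.
example_ok = is_feasible(2, 2, 3, 1, Fraction(3, 2))
# Lowering the repair bandwidth to 1 Mb makes the min-cut too small.
example_too_small = is_feasible(2, 2, 3, 1, 1)
```

With β = 0.5 Mb the two terms are min(1.5, 1) and min(1, 1), which sum to exactly 2 Mb, so the example point sits on the boundary of the feasible region.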
Storage-Bandwidth Tradeoff • Optimal tradeoff curve between storage α and repair bandwidth γ • [Figure: tradeoff curves, with labeled points (γ = 1, α = 0.2) and (γ = 1, α = 0.1).]
Special Cases (1/2) • Minimum-Storage Regenerating (MSR) Codes • [Equations not preserved.]
Special Cases (2/2) • Minimum-Bandwidth Regenerating (MBR) Codes • [Equations not preserved.]
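The two extreme points of the tradeoff curve have closed forms. The expressions below are recalled from the paper, not from the slides, and should be read as an assumption: the MSR point is (α, γ) = (M/k, Md / (k(d − k + 1))), and the MBR point has α = γ = 2Md / (2kd − k² + k). The function names are hypothetical, and M, k, d are taken to be integers so that exact fractions can be used.

```python
from fractions import Fraction


def msr_point(M, k, d):
    """(alpha, gamma) at the minimum-storage end of the tradeoff curve."""
    alpha = Fraction(M, k)
    gamma = Fraction(M * d, k * (d - k + 1))
    return alpha, gamma


def mbr_point(M, k, d):
    """(alpha, gamma) at the minimum-bandwidth end; storage equals bandwidth."""
    value = Fraction(2 * M * d, 2 * k * d - k * k + k)
    return value, value


# (n, k, d) = (4, 2, 3) with an assumed file size of M = 2 Mb.
msr_example = msr_point(2, 2, 3)
mbr_example = mbr_point(2, 2, 3)
```

For this example the MSR point is (1 Mb, 1.5 Mb) and the MBR point is (1.2 Mb, 1.2 Mb): the MBR node stores slightly more than M/k = 1 Mb, and in exchange a repair downloads less.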
Outline • Introduction • Background • Analysis • Evaluation • Node Dynamics and Objectives • Model • Quantitative Results • Conclusion
Node Dynamics and Objectives (1/2) • A permanent failure • the permanent departure of a node from the system • a disk failure resulting in loss of the data stored on the node • A transient failure • node reboot • temporary network disconnection
Node Dynamics and Objectives (2/2) • A file is available • if it can be reconstructed from the data stored on currently available nodes. • A file is durable • if, after permanent node failures, it may be available at some point in the future.
Model (1/5) • The model has two key parameters, f and a. • a fraction f of the nodes storing file data fail permanently per unit time. • at any given time, a node storing data is available with some probability a. • The expected availability and maintenance bandwidth of various redundancy schemes can be computed for maintaining a file of M bytes.
Model (2/5) • Replication • redundancy: R replicas • store RM bytes in total • replace fRM bytes per unit time • the file is unavailable if no replica is available • probability (1 − a)^R • Ideal Erasure Codes • n = kR packets, redundancy R = n / k • transfer just M / k bytes for each new packet • replace fRM bytes per unit time • unavailability probability Σ_{i=0}^{k−1} C(n, i) a^i (1 − a)^{n−i}
Model (3/5) • Hybrid • n = k(R − 1) • store RM bytes in total • transfer fRM bytes per unit time • the file is unavailable if the replica is unavailable and fewer than k erasure-coded packets are available • probability (1 − a) · Σ_{i=0}^{k−1} C(n, i) a^i (1 − a)^{n−i}
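The unavailability probabilities for replication, ideal erasure codes, and the hybrid scheme follow from the stated conditions. The sketch below assumes that nodes are available independently, each with probability a. A replicated file is unavailable when all R replicas are down, an erasure-coded file when fewer than k of its n packets are up, and a hybrid file when the replica is down and fewer than k packets are up. The function names are hypothetical.

```python
from math import comb


def unavail_replication(a, R):
    """All R replicas are unavailable at the same time."""
    return (1 - a) ** R


def unavail_erasure(a, n, k):
    """Fewer than k of the n packets are available (binomial tail)."""
    return sum(comb(n, i) * a**i * (1 - a) ** (n - i) for i in range(k))


def unavail_hybrid(a, n, k):
    """Replica unavailable AND fewer than k coded packets available."""
    return (1 - a) * unavail_erasure(a, n, k)


# Same redundancy R = 2 with availability a = 0.5 (illustrative values only).
rep = unavail_replication(0.5, 2)
era = unavail_erasure(0.5, 4, 2)
```

With these illustrative inputs replication gives 0.25 and the (4, 2) erasure code gives 0.3125. Larger k at the same redundancy changes the comparison, so the values here are not meant as a general ranking.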
Model (4/5) • Minimum-Storage Regenerating Codes • store RM bytes in total • redundancy R = n / k • replace fRM bytes per unit time • extra amount of information • unavailability
Model (5/5) • Minimum-Bandwidth Regenerating Codes • store M n bytes in total • redundancy R = n / k • replace f M n bytes per unit time • extra amount of information • unavailability
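The "extra amount of information" factors did not survive on the slides. As a hedged sketch, the code below derives them from the closed-form repair bandwidths at d = n − 1, which is an assumption here: an MSR repair downloads (M/k)(n − 1)/(n − k) and an MBR repair downloads (M/k)(2n − 2)/(2n − k − 1), compared with M/k for an ideal erasure code. Maintenance bandwidth is then modeled as f·n repairs per unit time. All function names are hypothetical.

```python
from fractions import Fraction


def msr_overhead(n, k):
    """Repair download relative to M/k for MSR codes, assuming d = n - 1."""
    return Fraction(n - 1, n - k)


def mbr_overhead(n, k):
    """Repair download relative to M/k for MBR codes, assuming d = n - 1."""
    return Fraction(2 * n - 2, 2 * n - k - 1)


def maintenance_bandwidth(f, M, n, k, overhead=1):
    """Bytes replaced per unit time: f * n repairs, each (M / k) * overhead."""
    return Fraction(f) * n * Fraction(M, k) * overhead


# Illustrative values: n = 10, k = 5, M = 1000 bytes, f = 1/100 per unit time.
f = Fraction(1, 100)
ideal = maintenance_bandwidth(f, 1000, 10, 5)
msr = maintenance_bandwidth(f, 1000, 10, 5, msr_overhead(10, 5))
mbr = maintenance_bandwidth(f, 1000, 10, 5, mbr_overhead(10, 5))
```

With overhead 1 the result reduces to fRM, the figure listed for ideal erasure codes. The MBR factor is the smaller of the two, at the cost of storing more than M/k per node.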
Quantitative Comparison • Comparison with Hybrid • disadvantage of Hybrid: asymmetric design • MBR codes • disadvantage: • reconstructing the entire file requires communication with n − 1 nodes. • if the reading frequency of a file is sufficiently high and k is sufficiently small, this inefficiency could become unacceptable.
Conclusion • This paper presented a general theoretic framework that can: • determine the information that must be communicated to repair failures in encoded systems. • identify a tradeoff between storage and repair bandwidth. • One potential application area for the proposed regenerating codes is distributed archival storage or backup. • regenerating codes can potentially offer desirable tradeoffs in terms of redundancy, reliability, and repair bandwidth.