Network Coding for Distributed Storage Systems

IEEE Transactions on Information Theory, September 2010 • Alexandros G. Dimakis, Brighten Godfrey, Yunnan Wu, Martin J. Wainwright, Kannan Ramchandran

Outline • Introduction • Background • Analysis • Evaluation • Conclusion

Introduction • Distributed storage systems provide reliable access to data through redundancy spread over individually unreliable nodes. • Storing encoded data in distributed storage systems • the encoded data are spread across nodes. • requires less redundancy than replication. • stored data must be replaced periodically.

Introduction • Key issues in distributed storage systems: • repair bandwidth • storage space • How can encoded data be generated in a distributed way while transferring as little data as possible?

MDS Codes • A common practice for repairing a single node failure in an erasure-coded system: • a new node reconstructs the whole encoded data object. • it then generates just one encoded block. • Maximum Distance Separable (MDS) code. • (n, k)-MDS property • the original file can be recovered from any k of the n encoded blocks.

MDS Codes • [Figure: a file of size M is divided into fragments of size M/k, MDS-encoded, and stored at n nodes, each holding M/k.]
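The (n, k)-MDS property can be illustrated with the simplest possible case. The sketch below is not the code used in the paper; it assumes a single XOR parity code with n = 3 and k = 2, and the function names `encode` and `decode` are hypothetical. It splits a file into two fragments of size M/2, adds one parity fragment, and rebuilds the file from any two of the three.

```python
def xor_bytes(u: bytes, v: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(u, v))


def encode(data: bytes) -> list:
    """Split data into k = 2 fragments and append one XOR parity fragment (n = 3)."""
    if len(data) % 2:
        # Pad to an even length; a real system would also record the true length.
        data += b"\x00"
    half = len(data) // 2
    a, b = data[:half], data[half:]
    return [a, b, xor_bytes(a, b)]


def decode(fragments: dict) -> bytes:
    """Rebuild the (padded) data from any two fragments, keyed by node index 0..2."""
    if len(fragments) < 2:
        raise ValueError("need at least k = 2 fragments")
    if 0 in fragments and 1 in fragments:
        a, b = fragments[0], fragments[1]
    elif 0 in fragments:
        a = fragments[0]
        b = xor_bytes(a, fragments[2])  # b = a XOR parity
    else:
        b = fragments[1]
        a = xor_bytes(b, fragments[2])  # a = b XOR parity
    return a + b


if __name__ == "__main__":
    stored = encode(b"regenerating")
    for lost in range(3):
        survivors = {i: f for i, f in enumerate(stored) if i != lost}
        print(lost, decode(survivors))
```

Each node holds M/2 bytes, so the total stored is 3M/2, and losing any one node still leaves enough to recover the file.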

Introduction • Redundancy must be continually refreshed as nodes fail in distributed storage systems. • large data transfers across the network.

Introduction • Erasure codes can be repaired without communicating the whole data object. • (4, 2)-MSR example when a node fails: • the surviving nodes generate smaller parity packets of their data. • they forward them to the newcomer. • the newcomer mixes the packets to generate two new packets. • [Figure: repair diagram; every packet is labeled 0.5.]

Introduction • This paper identifies an optimal tradeoff curve between storage and repair bandwidth. • smaller storage space => less redundancy => more repair bandwidth • This paper calls codes that lie on this optimal tradeoff curve regenerating codes.

Introduction • Minimum-Storage Regenerating (MSR) codes. • can be efficiently repaired. • Minimum-Bandwidth Regenerating (MBR) codes. • storage node stores slightly more than M/k . • the repair bandwidth can be reduced.

Erasure Codes • Classical coding theory focuses on the tradeoff between redundancy and error tolerance. • In terms of the redundancy-reliability tradeoff, Maximum Distance Separable (MDS) codes are optimal. • the best known are Reed-Solomon codes.

Network Coding • Network coding allows • the intermediate nodes to generate output data by encoding previously received input data. • information to be “mixed” at intermediate nodes. • This paper investigates the application of network coding for the repair problem in distributed storage. • tradeoff between storage and repair network bandwidth

Distributed Storage Systems • Erasure codes could reduce bandwidth use by an order of magnitude compared with replication. • Hybrid strategy: • one special storage node maintains one full replica. • the other nodes store multiple erasure-encoded fragments. • the replica node transfers only M/k bytes to create a new encoded fragment. • the problem is what happens when the replica itself is lost.

Information Flow Graph

Storage-Bandwidth Tradeoff • The normal redundancy we want to maintain requires n active storage nodes • each storing α bits • when a node fails, a newcomer downloads β bits each from any d surviving nodes • total repair bandwidth is γ = dβ • For each set of parameters (n, k, d, α, γ), there is a family of information flow graphs, each of which corresponds to a particular evolution of node failures / repairs.

Storage-Bandwidth Tradeoff • Denote this family of directed acyclic graphs by G(n, k, d, α, γ) • the point (n, k, d, α, γ) = (4, 2, 3, 1 Mb, 1.5 Mb) is feasible.

Storage-Bandwidth Tradeoff • Theorem 1: For any α ≥ α*(n, k, d, γ), the points (n, k, d, α, γ) are feasible.

Theorem Proof (1/4)

Theorem Proof (2/4)

Theorem Proof (3/4)

Theorem Proof (4/4)

Storage-Bandwidth Tradeoff • Code repair can be achieved if and only if the underlying information flow graph has sufficiently large min-cuts.
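The min-cut condition can be checked numerically. The sketch below assumes the lower bound on the min-cut used in the paper's analysis, namely the sum over i = 0, ..., min(d, k) − 1 of min{(d − i)β, α}, and treats a point as feasible when that bound is at least the file size M. The function names are hypothetical, and the file size M = 2 Mb in the usage example is an assumption inferred from α = M/k with k = 2 and α = 1 Mb.

```python
from fractions import Fraction


def min_cut_bound(k, d, alpha, beta):
    """Lower bound on the min-cut seen by a data collector that contacts k nodes.

    The i-th contacted node contributes min((d - i) * beta, alpha).
    """
    return sum(min((d - i) * beta, alpha) for i in range(min(d, k)))


def is_feasible(M, k, d, alpha, gamma):
    """True if storage alpha and total repair bandwidth gamma can support a file of size M."""
    beta = Fraction(gamma) / d  # each of the d helpers sends beta = gamma / d
    return min_cut_bound(k, d, Fraction(alpha), beta) >= M


if __name__ == "__main__":
    # The (4, 2, 3, 1 Mb, 1.5 Mb) point from the slides, assuming M = 2 Mb.
    print(is_feasible(2, 2, 3, 1, Fraction(3, 2)))
    # Same storage with less repair bandwidth.
    print(is_feasible(2, 2, 3, 1, 1))
```

Exact fractions are used so that the boundary case, where the bound equals M exactly, is not affected by floating-point rounding.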

Storage-Bandwidth Tradeoff • Optimal tradeoff curve between storage α and repair bandwidth γ • [Figure: tradeoff curves, with marked points (γ = 1, α = 0.2) and (γ = 1, α = 0.1).]

Special Cases (1/2) • Minimum-Storage Regenerating (MSR) Codes

Special Cases (2/2) • Minimum-Bandwidth Regenerating (MBR) Codes
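The two extreme points of the tradeoff curve have closed forms. The sketch below assumes the expressions from the paper's analysis: the MSR point is (α, γ) = (M/k, Md / (k(d − k + 1))), and the MBR point has α = γ = 2Md / (2kd − k² + k). The function names `msr_point` and `mbr_point` are hypothetical, and integer inputs are assumed.

```python
from fractions import Fraction


def msr_point(M, k, d):
    """(alpha, gamma) at the minimum-storage end of the tradeoff curve."""
    alpha = Fraction(M, k)
    gamma = Fraction(M * d, k * (d - k + 1))
    return alpha, gamma


def mbr_point(M, k, d):
    """(alpha, gamma) at the minimum-bandwidth end; storage equals repair bandwidth."""
    gamma = Fraction(2 * M * d, 2 * k * d - k * k + k)
    return gamma, gamma


if __name__ == "__main__":
    # k = 2, d = 3, with the file size M = 2 assumed for illustration.
    print(msr_point(2, 2, 3))
    print(mbr_point(2, 2, 3))
```

With k = 2, d = 3 and M = 2, the MSR point comes out as α = 1, γ = 3/2, which agrees with the (4, 2, 3, 1 Mb, 1.5 Mb) example; the MBR point stores slightly more per node (6/5) but needs less repair traffic.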

Outline • Introduction • Background • Analysis • Evaluation • Node Dynamics and Objectives • Model • Quantitative Results • Conclusion

Node Dynamics and Objectives (1/2) • A permanent failure • the permanent departure of a node from the system • a disk failure resulting in loss of the data stored on the node • A transient failure • node reboot • temporary network disconnection

Node Dynamics and Objectives (2/2) • A file is available • if it can be reconstructed from the data stored on currently available nodes. • A file is durable • if, after permanent node failures, it may be available at some point in the future.

Model (1/5) • The model has two key parameters, f and a. • a fraction f of the nodes storing file data fail permanently per unit time. • at any given time, a node storing data is available with some probability a. • The expected availability and maintenance bandwidth of various redundancy schemes can be computed for maintaining a file of M bytes.

Model (2/5) • Replication • redundancy: R replicas • store RM bytes in total • replace fRM bytes per unit time • the file is unavailable if no replica is available • probability (1 − a)^R • Ideal Erasure Codes • n = kR, redundancy R = n/k • transfer just M/k bytes for each packet • replace fRM bytes per unit time • unavailability probability: fewer than k of the n packets are available, Σ_{i=0}^{k−1} C(n, i) a^i (1 − a)^(n−i)

Model (3/5) • Hybrid • n = k(R − 1) • store RM bytes in total • transfer fRM bytes per unit time • The file is unavailable if the replica is unavailable and fewer than k erasure-coded packets are available • probability (1 − a) · Σ_{i=0}^{k−1} C(n, i) a^i (1 − a)^(n−i)
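The unavailability conditions for replication, ideal erasure codes, and the hybrid scheme can be evaluated directly. The sketch below assumes that each node is available independently with probability a, so that the number of available packets is binomial; the function names are hypothetical and the parameter values in the usage example are illustrative only, not taken from the paper's measurements.

```python
from math import comb


def unavail_replication(a, R):
    """All R replicas are unavailable at once."""
    return (1 - a) ** R


def unavail_ideal(a, n, k):
    """Fewer than k of the n erasure-coded packets are available."""
    return sum(comb(n, i) * a**i * (1 - a) ** (n - i) for i in range(k))


def unavail_hybrid(a, n, k):
    """The full replica is unavailable and fewer than k coded packets are available."""
    return (1 - a) * unavail_ideal(a, n, k)


if __name__ == "__main__":
    a, k, R = 0.9, 7, 2            # illustrative values
    print(unavail_replication(a, R))
    print(unavail_ideal(a, k * R, k))          # n = kR
    print(unavail_hybrid(a, k * (R - 1), k))   # n = k(R - 1)
```

The same redundancy R is used for all three schemes in the usage example, so the outputs can be compared at equal storage cost.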

Model (4/5) • Minimum-Storage Regenerating Codes • store RM bytes in total • redundancy R = n/k • replace fRM bytes per unit time • extra amount of information • unavailability

Model (5/5) • Minimum-Bandwidth Regenerating Codes • store total M n bytes • redundancy Rn / k • replace f M n bytes per unit time • extra amount of information • unavailability
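The "extra amount of information" for the two regenerating codes can be expressed as a ratio of the repair download to the M/k bytes an ideal erasure code would transfer. The sketch below assumes that each newcomer contacts d = n − 1 surviving nodes and uses the MSR and MBR repair-bandwidth expressions from the paper's analysis with that value of d; the names `delta_msr` and `delta_mbr` are chosen for this sketch.

```python
from fractions import Fraction


def delta_msr(n, k):
    """Repair download of an MSR code relative to M/k, with d = n - 1 helpers."""
    return Fraction(n - 1, n - k)


def delta_mbr(n, k):
    """Repair download of an MBR code relative to M/k, with d = n - 1 helpers."""
    return Fraction(2 * (n - 1), 2 * n - k - 1)


if __name__ == "__main__":
    for n, k in [(4, 2), (10, 5), (14, 7)]:
        print(n, k, delta_msr(n, k), delta_mbr(n, k))
```

Both ratios are at least 1, and the MBR ratio is never larger than the MSR ratio for k ≥ 1, which reflects the lower repair bandwidth that MBR codes buy with extra storage.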

Estimating f and a

Quantitative Results (1/2)

Quantitative Results (2/2)

Quantitative Comparison • Comparison With Hybrid • Disadvantage: asymmetric design • MBR codes • Disadvantage: • reconstructing the entire file requires communication with n − 1 nodes • if the reading frequency of a file is sufficiently high and k is sufficiently small, this inefficiency could become unacceptable.

Conclusion • This paper presented a general theoretic framework that can • determine the information that must be communicated to repair failures in encoded systems. • identify a tradeoff between storage and repair bandwidth. • One potential application area for the proposed regenerating codes is distributed archival storage or backup. • regenerating codes can potentially offer desirable tradeoffs in terms of redundancy, reliability, and repair bandwidth.
