## Erasure Correcting Codes for Highly Available Storage

Thomas Schwarz, S.J.

**Error Control Codes**
• Use redundancy to correct errors
• Designed for
  • Ease of Encoding
  • Decoding (Calculation of syndrome / location of error)
  • Error Correction Power (Burst Errors / Low Redundancy)

Block Codes: Information Symbols + Parity Symbols
$(i_1\ i_2\ i_3\ i_4\ i_5\ i_6\ i_7\ i_8\ p_1\ p_2\ p_3)$

Typical Applications:
• Communication: Deep Space (“A match made in heaven”), Telephone, Computer Networks, Streaming
• Audio, Video (CD, DVD)
• Storage (Main Memory, Magnetic & Optical Devices)

**Error Correcting Codes**
Most applications use hardware-implemented encoding and decoding.

**Erasure Correcting Codes**
Protect against erasure of data.
Simplest Erasure Correcting Code: Parity
$i_1\ i_2\ i_3\ i_4\ i_5\ i_6\ i_7\ i_8\ p$, where $p = i_1 \oplus i_2 \oplus i_3 \oplus i_4 \oplus i_5 \oplus i_6 \oplus i_7 \oplus i_8$

Some applications implement encoding and decoding in hardware (e.g. RAIDs). Software implementation is much more feasible because of the simpler decoding problem.

Ideal Properties:
• Systematic: Data is stored explicitly. Data updates do not change other data.
• MDS: Only as much parity data is created as is necessary to reconstruct the maximum level of failures.
• Simple encoding and decoding.

**Parity Based Codes**
Only use parity of data (XOR operation) for ease of coding and decoding.

History: Protection for Multitrack Magnetic Recording. Prusinkiewicz & Budkowski 1976: horizontal and diagonal parity over parallel tracks, in the order Parity 1, Data 1, Data 2, Data 3, Parity 2.

Extend the scheme by using lines of different slopes. Patel 1985: horizontal + 2 diagonals (slopes 0, 1, -1). However, the code is optimal only if the data band is infinite. If not, there is (slightly) more parity than data.

**Parity Based Array Codes**
Idea: Break up data into m symbols.
Arrange the symbols in columns. Use horizontal and vertical lines to calculate parity. 1st column: horizontal parity, 2nd column: vertical parity.

But it is not so simple! The array shown on the slide (figure not reproduced) is a legitimate code word, but it is indistinguishable from the zero code word after failure of columns 1 and 3.

Number of Data Columns needs to be prime.

**EvenOdd**
• Better version of array codes for two parity columns
• Code words are two-dimensional m-1 by m arrays with two additional parity columns

The EvenOdd code has as code words the m-1 by m+2 arrays of symbols $a_{i,j}$ such that (in the standard EvenOdd definition)

$$a_{i,m} = \bigoplus_{t=0}^{m-1} a_{i,t}, \qquad a_{i,m+1} = S \oplus \bigoplus_{t=0}^{m-1} a_{\langle i-t \rangle_m,\,t}, \qquad S = \bigoplus_{t=1}^{m-1} a_{m-1-t,\,t}$$

for $0 \le i \le m-2$, where $\langle x \rangle_m$ denotes $x \bmod m$ and $a_{m-1,t} = 0$ is an imaginary all-zero row.

**EvenOdd Encoding**
Set m=5. Start with an arbitrary 4 by 5 data array.

Fill in the horizontal parity lines and calculate S to be $a_{3,1} + a_{2,2} + a_{1,3} + a_{0,4}$; in the example, $S = 0+1+0+0 = 1$.

**EvenOdd Decoding**
Assume that the last two data columns have failed. Use the parity columns to calculate S. Use S=1 and the magenta diagonal to find the data symbol in the last column. Then use the horizontal parity for one more symbol. The blue diagonal now can be exploited.

**EvenOdd**
EvenOdd requires that m is a prime. Hence, for a given number n of data lines, choose m to be the smallest prime $\ge n$. Set the superfluous data columns to zero.

Encoding and decoding only use XOR operations. The given formulae suggest an iterative procedure, but the equations can easily be expanded to calculate the symbols in parallel.

**Higher Array Codes**
There exist array codes using only XOR operations that can correct up to m erasures. The decoding process involves the solution of a linear equation.

**Algebraic Block Codes**
Interpret symbols (larger than bits) as elements of a Galois Field. Calculate parity symbols as linear combinations of the data symbols.

**Galois Fields**
Only $GF(2^f)$ for simplicity’s sake.
Elements: Bit strings of length f.
Addition: XOR
Multiplication: Much more complicated.

**Galois Field Multiplication**
For $GF(2^8)$. Elements are bytes.
Method 1: Identify a byte with a binary polynomial, e.g. (0100 1001) = $x^6+x^3+1$. Multiply two polynomials as polynomials modulo a generator polynomial, e.g. modulo 1 0001 1101 = $x^8+x^4+x^3+x^2+1$. Combination of XORs and shifts!

This multiplication gives a field structure to $GF(2^f)$. The multiplicative group is cyclic: There are elements $\alpha$ such that all nonzero elements can be written as $\alpha^i$, $i = 0, 1, \dots, 2^f-1$.

For each non-zero element $x \in GF(2^f)$ define $\log(x) = i$ iff $\alpha^i = x$. Define $\operatorname{antilog}(i) = \alpha^i$. Calculate

$$x \cdot y = \begin{cases} \operatorname{antilog}(\log(x) + \log(y)) & \text{if } x \ne 0 \ne y \\ 0 & \text{if } x = 0 \text{ or } y = 0 \end{cases}$$

Can be implemented with two tables, two zero comparisons, four additions, and three memory accesses: 9 elementary operations in a processor with sufficient L1 cache to store $3 \cdot (2^f-1)$ entries.

**Linear Erasure Correcting Block Codes**
m data symbols $u = (u_0, u_1, u_2, \dots, u_{m-1})$.
Figure: rows $u_j, u_j', u_j'', u_j''', \dots$ for $j = 0, \dots, 3$, with the labels "Bucket 0", "Bucket 3" and "Code Word $u''$".

Add $k = n - m$ parity symbols for code word a.
Figure: the same data rows followed by parity rows $p_0, p_0', p_0'', p_0''', \dots$ through $p_{k-1}, p_{k-1}', p_{k-1}'', p_{k-1}''', \dots$, with the labels "Bucket 0", "Bucket 3" and "Parity Bucket k-1".

Calculate the parity symbols as a linear combination of the data symbols: $a = u \cdot G$, with “Generator Matrix” G.

**Properties of a Good Generator Matrix**
• Systematic: Left m by m matrix is the identity matrix.
• MDS: All matrices formed from m different columns of G are invertible.
Thus: Any m coordinates of code word a suffice to calculate data word u.

**Generation of Generator Matrices**
• Find the largest rectangular matrix with the MDS property.
• Multiply from the left with the inverse of the matrix formed by the first m columns.
Result is still MDS and now systematic.

**Large MDS Matrices**
There are known families of matrices with the MDS property:
• Cauchy: $m+n = 2^f$
• Vandermonde: $n = 2^f-1$
• Twice extended Vandermonde: $n = 2^f+1$

**Vandermonde Generator Matrix**
• Write column m as a linear combination of the first m columns.
• Multiply column i ($i = 0, 1, \dots, m-1$) with this coefficient (non-zero according to Cramer’s Rule). This preserves MDS.
• Multiply with $A^{-1}$, where A is the matrix consisting of columns 0 to m-1.

**RS Erasure Correcting Codes**
• The generator matrix is that of a twice extended, generalized Reed-Solomon code.
• Large number of parity symbols: If symbols are bytes, then the code length is 257.

Encoding: Generation of a parity symbol costs:
• m multiplications with known coefficients
• m-1 XOR operations
• 7m-1 elementary operations in total

Change of one data symbol in a data word:
• Calculate the difference $d = u_i^{new} - u_i^{old}$.
• Send d to the site maintaining the parity symbol.
• Multiply with coefficient $g_{i,l}$ of G.
• Add to existing parity.
Cost: 7 elementary operations per parity site, 1 elementary operation at the data site, 1 message.

Erasure Correction, typical cases:
• Parity site has failed: Regenerate parity from the data sites.
• Data site has failed: Use column m to regenerate the data from the other data sites and the XOR stored at this first parity site.

Erasure Correction, general case:
• Collect m survivors among data and parity sites.
• Invert the matrix consisting of the corresponding columns of G.
• Each replacement site uses this matrix and G in order to calculate a decoding matrix H.
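To make the EvenOdd encoding slides concrete, here is a minimal Python sketch that computes the horizontal and diagonal parity columns for a prime m. It assumes the standard EvenOdd definition (diagonal parity adjusted by S, with an imaginary all-zero row m-1). The function name `evenodd_encode`, the list-of-rows layout, and the sample data array are choices made for this illustration and are not taken from the slides.

```python
def evenodd_encode(data, m):
    """Append horizontal and diagonal parity to an (m-1) x m data array.

    `data` is a list of m-1 rows with m symbols each (bits, or ints that are
    XORed bitwise); m is assumed to be prime. Returns (m-1) x (m+2) rows.
    """
    assert len(data) == m - 1 and all(len(row) == m for row in data)
    a = [list(row) for row in data] + [[0] * m]  # imaginary all-zero row m-1

    # S is the XOR along the diagonal a[m-2][1], a[m-3][2], ..., a[0][m-1].
    s = 0
    for t in range(1, m):
        s ^= a[m - 1 - t][t]

    code = []
    for i in range(m - 1):
        horizontal = 0
        diagonal = s
        for t in range(m):
            horizontal ^= a[i][t]
            diagonal ^= a[(i - t) % m][t]  # diagonal starting at a[i][0]
        code.append(list(data[i]) + [horizontal, diagonal])
    return code


if __name__ == "__main__":
    # Hypothetical 4 x 5 data array (m = 5), made up for this example.
    data = [[1, 0, 1, 1, 0],
            [0, 1, 1, 0, 0],
            [1, 1, 0, 0, 1],
            [0, 1, 0, 1, 0]]
    for row in evenodd_encode(data, 5):
        print(row)
```

Because the code is linear, the claim that any two column failures can be repaired is equivalent to saying that no nonzero code word has fewer than three nonzero columns, which can be checked exhaustively for small m.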
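The two multiplication methods from the Galois Field Multiplication slides can be sketched in Python as follows. The sketch assumes $GF(2^8)$ with the generator polynomial 1 0001 1101 named in the slides and takes $\alpha$ = 0x02, which is a primitive element for this polynomial. The names `build_tables`, `gf_mul` and `gf_mul_slow` are made up for this example. The antilog table is stored twice in a row so that `log(x) + log(y)` needs no modulo reduction, which is one plausible reading of the $3 \cdot (2^f-1)$ table entries mentioned above.

```python
POLY = 0x11D  # x^8 + x^4 + x^3 + x^2 + 1 (1 0001 1101)


def build_tables(poly=POLY):
    """Build log and (doubled) antilog tables for GF(2^8), alpha = 0x02."""
    log = [0] * 256          # log[0] is unused: zero has no logarithm
    antilog = [0] * 510      # two copies of the 255 powers of alpha
    x = 1
    for i in range(255):
        antilog[i] = antilog[i + 255] = x
        log[x] = i
        x <<= 1              # multiply by alpha
        if x & 0x100:        # degree 8 reached: reduce modulo the polynomial
            x ^= poly
    return log, antilog


LOG, ANTILOG = build_tables()


def gf_mul(x, y):
    """Table-based product: antilog(log(x) + log(y)), or 0 if x or y is 0."""
    if x == 0 or y == 0:
        return 0
    return ANTILOG[LOG[x] + LOG[y]]


def gf_mul_slow(x, y, poly=POLY):
    """Method 1: polynomial multiplication with shifts and XORs."""
    result = 0
    while y:
        if y & 1:
            result ^= x
        x <<= 1
        if x & 0x100:
            x ^= poly
        y >>= 1
    return result


if __name__ == "__main__":
    # 0x02 * 0x80 is x * x^7 = x^8, which reduces to x^4 + x^3 + x^2 + 1.
    print(hex(gf_mul(0x02, 0x80)))  # 0x1d
```

The shift-and-XOR version is useful mainly as a reference against which the table lookup can be checked.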
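The encoding, delta update, and general-case erasure correction described in the slides can be sketched end to end in Python. This is an illustration under stated assumptions, not the author's implementation: symbols are bytes in $GF(2^8)$ with generator polynomial 1 0001 1101, and instead of the twice extended Vandermonde construction the sketch builds a systematic generator matrix $G = [I \mid C]$ from a Cauchy matrix $C$, another of the MDS families listed in the slides, because it needs fewer lines. All function names are hypothetical.

```python
POLY = 0x11D  # x^8 + x^4 + x^3 + x^2 + 1

# Log/antilog tables for GF(2^8); EXP is doubled to avoid a modulo in mul().
EXP, LOG = [0] * 510, [0] * 256
_x = 1
for _i in range(255):
    EXP[_i] = EXP[_i + 255] = _x
    LOG[_x] = _i
    _x <<= 1
    if _x & 0x100:
        _x ^= POLY


def mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]


def inv(a):
    return EXP[255 - LOG[a]]  # requires a != 0


def generator(m, k):
    """Systematic m x (m+k) matrix G = [I | C], C[i][j] = 1/(x_i + y_j)."""
    assert m + k <= 256
    return [[int(i == j) for j in range(m)] +
            [inv(i ^ (m + j)) for j in range(k)]  # x_i = i, y_j = m + j
            for i in range(m)]


def encode(u, G):
    """Code word a = u * G (row vector times matrix)."""
    a = [0] * len(G[0])
    for i, ui in enumerate(u):
        for l, g in enumerate(G[i]):
            a[l] ^= mul(ui, g)
    return a


def invert(A):
    """Gauss-Jordan inversion of a square matrix over GF(2^8)."""
    m = len(A)
    M = [list(row) + [int(i == j) for j in range(m)] for i, row in enumerate(A)]
    for c in range(m):
        p = next(r for r in range(c, m) if M[r][c])  # pivot row
        M[c], M[p] = M[p], M[c]
        f = inv(M[c][c])
        M[c] = [mul(f, v) for v in M[c]]
        for r in range(m):
            if r != c and M[r][c]:
                g = M[r][c]
                M[r] = [v ^ mul(g, w) for v, w in zip(M[r], M[c])]
    return [row[m:] for row in M]


def decode(survivors, G):
    """Recover u from any m surviving symbols, given as {column: symbol}."""
    m = len(G)
    cols = sorted(survivors)[:m]
    assert len(cols) == m, "need at least m survivors"
    B_inv = invert([[G[i][c] for c in cols] for i in range(m)])
    u = [0] * m
    for j, c in enumerate(cols):
        for i in range(m):
            u[i] ^= mul(survivors[c], B_inv[j][i])
    return u


def update_parity(parity, old, new, g):
    """Delta update: add g * (new - old) to the parity; subtraction is XOR."""
    return parity ^ mul(old ^ new, g)
```

For example, with `G = generator(4, 3)` and `a = encode(u, G)`, calling `decode` on any four of the seven symbols of `a` returns `u`. Re-encoding the recovered `u` with `G` then regenerates the symbols of the failed sites, which corresponds to applying the decoding matrix H from the last slide.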