Fountain Codes. D.J.C. MacKay. IEE Proceedings - Communications, Vol. 152, No. 6, December 2005
Outline • Introduction • Fountain Codes • Intermission • LT Codes • Raptor Codes
Introduction • Erasure Channel: • Files are transmitted in multiple small packets. • Each packet is either received without error or lost entirely. • Example: the Internet. • How to deal with packet loss? • Some simple retransmission protocols: • ACKs for missing packets. • ACKs for received packets. • Erasure Correcting Codes.
Introduction • Why Erasure Correcting Codes? • Retransmission is wasteful when erasures are frequent: • ACKs for missing packets: the ACK traffic would be enormous. • ACKs for received packets: the server would end up sending multiple copies of most packets. • Broadcast channel with erasures: one server, many receivers (A, C, D, E, F), each losing a different subset of packets.
Introduction • Erasure Correcting Codes: • Block codes, such as an (N, K) Reed-Solomon code: • If any K of the N transmitted symbols are received, the original K source symbols can be recovered. • High complexity: O(K(N−K) log₂N). • Must estimate the erasure probability f first, then choose the code rate R = K/N before transmission. • Ex. If the loss rate is 50%, set the code rate R = K/N = 1 − 50% = 1/2, i.e. N = 2K.
Introduction • Erasure Correcting Codes: • Block codes, such as an (N, K) Reed-Solomon code: • If f is larger than expected (the decoder receives fewer than K symbols): • Ex. We assumed a loss rate of 50% and set the code rate R = 1/2 (N = 2K); however, the actual loss rate is 66.7%, so the proper code rate should be lower: R = 1/3 (N = 3K). • We would like a simple way to extend the code on the fly to create a lower-rate (N′, K) code. • With a fixed block code: no way!
Fountain Codes • Fountain Codes are rateless: • The number of encoded packets generated can be determined on the fly. • Fountain Codes can also have fantastically small encoding and decoding complexities. • Depends on the careful choice of Degree Distribution.
Intermission • Balls-and-Bins Problem: • Imagine that we throw N balls independently at random into K bins. What is the probability that a given bin ends up with no balls in it? (diagram: K bins, N throws)
Intermission • Balls-and-Bins Problem: • After N balls have been thrown, what is the probability that a given bin has no ball in it? • The probability that one particular bin is empty after N balls have been thrown: (1 − 1/K)^N ≈ e^(−N/K)
Intermission • Balls-and-Bins Problem: • After N balls have been thrown, the probability that one particular bin is empty is ≈ e^(−N/K). • The expected number of empty bins: δ = K e^(−N/K), which roughly implies: the probability that every bin has a ball is (1 − δ) only if N > K ln(K/δ).
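The estimate above is easy to check numerically. A minimal simulation sketch (the values K = 100, δ = 0.5 and the function name `empty_bins` are illustrative choices, not from the paper):

```python
import math
import random

def empty_bins(K, N, trials=2000, seed=0):
    """Average number of empty bins after throwing N balls
    uniformly at random into K bins, over many trials."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        hit = [False] * K
        for _ in range(N):
            hit[rng.randrange(K)] = True
        total += hit.count(False)
    return total / trials

K, delta = 100, 0.5
N = math.ceil(K * math.log(K / delta))  # N = K ln(K/delta) throws
print(empty_bins(K, N))                 # close to K e^(-N/K), i.e. ~delta
```

With N = K ln(K/δ) throws, the simulated number of empty bins lands near δ, matching the formula on the slide.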
LT Codes • Luby Transform (LT) Codes: • Encoding process: • For the ith encoded packet, select a degree dᵢ from a carefully chosen Degree Distribution (the Robust Soliton Distribution). • Choose dᵢ source packets uniformly at random. • XOR the chosen packets together. • Decoding process: • Decode degree-one encoded packets directly. • Remove the resolved edges iteratively. (example graph over x1…x6: y1 = x2, y2 = x4, y3 = x1⊕x3, y4 = x2⊕x5, y5 = x3⊕x5⊕x6)
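The encode and peel steps above can be sketched in Python (a hedged illustration, not Luby's implementation; packets are modeled as small integers, and `lt_encode`/`lt_decode` are made-up names):

```python
import random

def lt_encode(source, degrees, seed=0):
    """For each degree d, XOR a random d-subset of the source packets."""
    rng = random.Random(seed)
    encoded = []
    for d in degrees:
        idx = rng.sample(range(len(source)), d)
        value = 0
        for i in idx:
            value ^= source[i]
        encoded.append((set(idx), value))
    return encoded

def lt_decode(packets, k):
    """Peeling decoder: resolve degree-one packets, strip the recovered
    symbol from every other packet, repeat until stuck or done."""
    packets = [[set(idx), val] for idx, val in packets]
    recovered = {}
    progress = True
    while progress and len(recovered) < k:
        progress = False
        for p in packets:
            if len(p[0]) == 1:
                i = p[0].pop()
                recovered[i] = p[1]
                progress = True
                for q in packets:          # peel x_i out of the graph
                    if i in q[0]:
                        q[0].remove(i)
                        q[1] ^= recovered[i]
    return recovered

# The slide's example graph: y1=x2, y2=x4, y3=x1^x3, y4=x2^x5, y5=x3^x5^x6
source = [0b001, 0b010, 0b011, 0b100, 0b101, 0b110]   # x1..x6
conns = [{1}, {3}, {0, 2}, {1, 4}, {2, 4, 5}]          # 0-based indices
packets = []
for idx in conns:
    v = 0
    for i in idx:
        v ^= source[i]
    packets.append((idx, v))
recovered = lt_decode(packets, len(source))
print(sorted(recovered))   # [1, 3, 4]: only x2, x4, x5 are recoverable here
```

With only 5 encoded packets for 6 source packets, decoding necessarily stalls partway: y1 gives x2, y2 gives x4, peeling x2 out of y4 gives x5, and then no degree-one packet remains.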
LT Codes • Designing the Degree Distribution: • A few encoded packets must have high degree. • To ensure that every source packet is connected to some encoded packet. • Many encoded packets must have low degree. • So that the decoding process can get started, and keep going. • This also keeps the total number of XOR operations in encoding and decoding small. (graph: x1…x5 connected to y1…y5)
LT Codes • Some properties of the Degree Distribution: • The complexity (of both encoding and decoding) scales linearly with the number of edges in the graph. • Key factor: the average degree of the encoded packets. • How small can the number of edges be? • Recall the Balls-and-Bins Problem: • Balls: edges. • Bins: source packets. • How few edges can guarantee that every source packet has at least one edge on it? (every bin has a ball)
LT Codes • Some properties of the encoder: • The encoder throws edges onto the source packets at random. • The number of edges must be at least of order K ln K. • Balls-and-Bins Problem: • The expected number of empty bins: δ = K e^(−N/K), which roughly implies: the probability that every bin has a ball is (1 − δ) only if N > K ln(K/δ).
LT Codes • For the decoder: • If the decoder receives the optimal number, K, of encoded packets, the average degree of each encoded packet must be at least ln K. • So the number of edges must be at least of order K ln K. • The complexity (of both encoding and decoding) of an LT code is therefore at least of order K ln K. • Luby showed that this bound on complexity can indeed be achieved by a careful choice of Degree Distribution.
LT Codes • Ideally, to avoid redundancy: • We would like exactly one check node to have degree one at each iteration. • Ideal Soliton Distribution: ρ(1) = 1/K; ρ(d) = 1/(d(d−1)) for d = 2, 3, …, K. • The expected average degree under this distribution is roughly ln K.
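The Ideal Soliton Distribution can be written down and checked directly (a short sketch; `ideal_soliton` is a made-up name):

```python
import math

def ideal_soliton(K):
    """rho(1) = 1/K; rho(d) = 1/(d(d-1)) for d = 2..K.
    The sum telescopes: 1/K + sum 1/(d(d-1)) = 1 exactly."""
    rho = {1: 1.0 / K}
    for d in range(2, K + 1):
        rho[d] = 1.0 / (d * (d - 1))
    return rho

K = 10000
rho = ideal_soliton(K)
avg_degree = sum(d * p for d, p in rho.items())
print(sum(rho.values()))          # 1.0: it really is a distribution
print(avg_degree, math.log(K))    # average degree is roughly ln K
```

The average degree works out to the harmonic sum 1/K + Σ 1/(d−1), which grows like ln K, matching the slide's claim.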
LT Codes • In practice, this distribution works poorly: • Fluctuations around the expected behavior: • Sometimes during decoding there will be no degree-one check node. • A few source packets will receive no connections at all. • A small modification fixes these problems. • Robust Soliton Distribution: • More degree-one check nodes. • A few more high-degree check nodes.
LT Codes • Robust Soliton Distribution: • Two extra parameters: c and δ. • The expected number of degree-one check nodes (throughout the decoding process) is about S = c ln(K/δ) √K. • δ: a bound on the probability that decoding fails to run to completion after K′ encoded packets have been received. • c: a free parameter, smaller than 1. • Luby's key result.
LT Codes • Luby defines a positive function τ(d): τ(d) = S/(Kd) for d = 1, 2, …, (K/S) − 1; τ(K/S) = (S/K) ln(S/δ); τ(d) = 0 for d > K/S, then adds the Ideal Soliton Distribution ρ to τ and normalizes to obtain the Robust Soliton Distribution μ: μ(d) = (ρ(d) + τ(d)) / Z, where Z = Σ_d (ρ(d) + τ(d)). ※ Once the receiver has received K′ = KZ encoded packets, decoding can run to completion with probability at least 1 − δ.
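Putting ρ, τ, and Z together gives a distribution whose constants can be compared against the slides (a sketch; `robust_soliton` is a made-up name, and rounding K/S to an integer spike position is my own choice):

```python
import math

def robust_soliton(K, c, delta):
    """mu(d) = (rho(d) + tau(d)) / Z, with S = c ln(K/delta) sqrt(K)."""
    S = c * math.log(K / delta) * math.sqrt(K)
    spike = int(round(K / S))                 # position of the tau spike
    rho = {d: (1.0 / K if d == 1 else 1.0 / (d * (d - 1)))
           for d in range(1, K + 1)}
    tau = {d: 0.0 for d in range(1, K + 1)}
    for d in range(1, spike):
        tau[d] = S / (K * d)
    tau[spike] = (S / K) * math.log(S / delta)
    Z = sum(rho[d] + tau[d] for d in range(1, K + 1))
    mu = {d: (rho[d] + tau[d]) / Z for d in range(1, K + 1)}
    return mu, S, Z

mu, S, Z = robust_soliton(K=10000, c=0.03, delta=0.5)
print(round(S), round(10000 / S), round(Z, 2))   # 30 337 1.03
```

The printed values reproduce the constants quoted later on the performance slide (S = 30, K/S = 337, Z = 1.03 for c = 0.03, δ = 0.5, K = 10,000).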
LT Codes ※ The high-degree spike τ(K/S) ensures that every source packet is likely to be connected to a check node. ※ The small-degree part of τ ensures the decoding process gets started.
LT Codes ※ Histograms of the number of encoded packets N required in order to recover the source data, for K = 10,000.
LT Codes ※ Practical performance of LT codes: three experimental decodings are shown, each with an overhead of about 10%. ※ All codes created with c = 0.03, δ = 0.5 (S = 30, K/S = 337, Z = 1.03), and K = 10,000.
Raptor Codes • Complexity cost: • LT Codes: O(K ln K), where K is the number of source packets. • Average degree of the encoded packets: ln K. • Encoding and decoding complexity: ln K per encoded packet. • Raptor Codes: linear-time encoding and decoding. • Concatenate a weakened LT code with an outer code. • Average degree of the weakened LT code ≈ 3.
Raptor Codes • Weakened LT Code: • Average degree of encoded packets ≈ 3. • A fraction of the source packets will receive no connections at all. • What fraction? • Balls-and-Bins Problem: the fraction of empty bins is about e⁻³ ≈ 5%.
Raptor Codes • Shokrollahi's trick: ※ Encoder: • Find an outer code that can correct erasures at rate δ̃ = e⁻³ ≈ 5%, then pre-code the K source packets into K′ ≈ K/(1 − e⁻³) packets. • Transmit this slightly enlarged data using a weakened LT code. ※ Decoder: • Once slightly more than K encoded packets have been received, it can recover a fraction (1 − δ̃) of the pre-coded packets (roughly K packets). • Then use the outer code to recover all the original data.
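In numbers, the trick looks like this (a sketch under the slides' assumption that the weakened LT decoder leaves about e⁻³ of the pre-coded packets uncovered; K = 10,000 is just an example size):

```python
import math

K = 10000
erasure = math.exp(-3)                # fraction of pre-coded packets the
                                      # weakened LT decoder misses, ~5%
K_pre = math.ceil(K / (1 - erasure))  # pre-coded block size the outer
                                      # code must produce
print(round(erasure, 3), K_pre)
```

So the outer code only has to expand the data by roughly 5%, while the weakened LT code keeps the per-packet work constant: that is how the K ln K total cost drops to linear.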
Raptor Codes ※ Schematic diagram of a Raptor Code: K = 16 source packets are pre-coded into K′ = 20 packets; a weakened LT code produces the transmitted packets, and from N = 18 received packets, 17 of the pre-coded packets are covered.
Raptor Codes ※ The idea of a weakened LT Code. ※ LT codes created with c = 0.03, δ = 0.5, truncated at degree 8 (giving an average degree of about 3).
Soliton Distribution • Ideal Soliton Distribution: ρ(1) = 1/K; ρ(d) = 1/(d(d−1)) for d = 2, 3, …, K. • The average degree is roughly ln K.
Soliton Distribution • Robust Soliton Distribution μ: μ(d) = (ρ(d) + τ(d)) / Z, where Z = Σ_d (ρ(d) + τ(d)), S = c ln(K/δ) √K, and τ(d) = S/(Kd) for d = 1, …, (K/S) − 1; τ(K/S) = (S/K) ln(S/δ); τ(d) = 0 for d > K/S. • The average degree is roughly ln K.