Iterative Coding for Broadband Communications: New Trends in Theory and Practice

Amir H. Banihashemi
Broadband Communications and Wireless Systems (BCWS) Centre
Dept. of Systems & Computer Engineering, Carleton University

Outline
• Iterative coding schemes and LDPC codes
• Min-sum algorithm and its modifications (Zarkeshvari, Zhao)
• New schedules for iterative decoding (Mao, Xiao)
• Normalized and offset belief propagation (Yazdani, Hemati)
• Majority-based algorithms (Zarrinkhat)
• Hybrid algorithms (Zarrinkhat, Xiao)
• Bootstrap decoding and reliability-based scheduling (Nouh)
• Iterative decoding in analog electronics and optics (Hemati)
• Dynamics of asynchronous continuous-time iterative decoding (Hemati)
• RC-LDPC codes in hybrid ARQ schemes (Yazdani)
• LDPC codes on channels with burst errors (Hong)

LDPC codes and iterative decoding
• Iterative coding schemes, such as turbo codes and LDPC codes, provide an excellent performance/complexity tradeoff.
• Iterative decoding can be naturally described using graph representations (Tanner graph (TG)).
• For linear block codes:
[Figure: Tanner graph with check nodes I, II, III and variable nodes 1 to 7]
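As a minimal sketch of how a Tanner graph follows from a parity-check matrix, the Python below builds the two adjacency lists (check nodes to variable nodes, and the reverse). The matrix H is one common parity-check matrix of the (7,4) Hamming code, assumed here only because it has three checks and seven variables like the figure; the function name tanner_graph is hypothetical, and nodes are numbered from 0.

```python
# Assumed example: one common parity-check matrix of the (7,4) Hamming code.
# Rows are check nodes, columns are variable nodes.
H = [
    [1, 1, 1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0, 1],
]


def tanner_graph(H):
    """Return (check_nbrs, var_nbrs) adjacency lists of the Tanner graph of H."""
    n = len(H[0])
    # Variable nodes attached to each check node (nonzero entries of each row).
    check_nbrs = [[v for v, h in enumerate(row) if h] for row in H]
    # Check nodes attached to each variable node (nonzero entries of each column).
    var_nbrs = [[c for c, row in enumerate(H) if row[v]] for v in range(n)]
    return check_nbrs, var_nbrs


check_nbrs, var_nbrs = tanner_graph(H)
```

Each 1 in H becomes one edge of the graph, so the number of edges equals the number of nonzero entries of H.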

Iterative decoding algorithms
• There are a number of iterative message-passing decoding algorithms, each offering a particular tradeoff between error performance and decoding complexity.
• Best performing: "belief propagation (BP)" or "sum-product (SP)." It converges to the a posteriori probabilities (APP) for bits on a cycle-free graph.
• Less complex: "min-sum (MS)," also referred to as "max-sum" or "max-product." It converges to the maximum-likelihood (ML) solution for codewords on a cycle-free graph.

Min-sum algorithm
• Min-sum can be considered an approximation to BP in the log-likelihood ratio (LLR) domain.
• Advantages over BP:
  - Simpler to implement
  - Doesn’t require an estimate of noise power
  - More robust against quantization error
• Disadvantage: inferior error performance
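To make the approximation concrete, the sketch below compares the standard LLR-domain BP check-node rule (the tanh rule) with its min-sum counterpart for one set of incoming messages. The function names and the example LLR values are illustrative assumptions, not taken from the slides.

```python
import math


def bp_check(llrs):
    """BP (tanh-rule) check-node output for the given incoming LLRs."""
    prod = 1.0
    for x in llrs:
        prod *= math.tanh(x / 2.0)
    # Clamp so that atanh stays finite for very reliable inputs.
    prod = max(min(prod, 1.0 - 1e-12), -1.0 + 1e-12)
    return 2.0 * math.atanh(prod)


def ms_check(llrs):
    """Min-sum check-node output: product of signs times smallest magnitude."""
    sign = 1.0
    for x in llrs:
        if x < 0:
            sign = -sign
    return sign * min(abs(x) for x in llrs)


incoming = [1.5, -0.8, 2.3]   # illustrative LLR values
bp_out = bp_check(incoming)
ms_out = ms_check(incoming)
```

Both rules agree on the sign of the output, while the min-sum magnitude is never smaller than the BP magnitude. Min-sum needs only comparisons and sign flips, which is where the implementation simplicity comes from.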

Min-sum algorithm on BI-AWGN Channel
• Min-sum:
  - Initialization:
  - Check node step:
  - Variable node step:
  - Hard decision (at variable node s):
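The four steps named on this slide can be sketched as a flooding-schedule min-sum decoder. This is a sketch under stated assumptions rather than the formulation used in the talk: BPSK mapping 0 → +1 and 1 → −1, messages initialized directly with the received values (so no noise-power estimate is needed), and a small (7,4) Hamming parity-check matrix as the example code. The names min_sum_decode, H and y are hypothetical.

```python
def min_sum_decode(H, y, max_iter=20):
    """Flooding min-sum decoding of received values y (assumed BPSK: 0 -> +1, 1 -> -1)."""
    m, n = len(H), len(H[0])
    edges = [(c, v) for c in range(m) for v in range(n) if H[c][v]]
    # Initialization: variable-to-check messages carry the raw channel values.
    v2c = {(c, v): y[v] for (c, v) in edges}
    c2v = {e: 0.0 for e in edges}
    hard = [0 if val >= 0 else 1 for val in y]
    for _ in range(max_iter):
        # Check node step: sign product and minimum magnitude of the other inputs.
        for (c, v) in edges:
            others = [v2c[(c, u)] for u in range(n) if H[c][u] and u != v]
            sign = -1.0 if sum(x < 0 for x in others) % 2 else 1.0
            c2v[(c, v)] = sign * min(abs(x) for x in others)
        # Variable node step: channel value plus all incoming check messages,
        # minus the message received on the outgoing edge (extrinsic rule).
        total = [y[v] + sum(c2v[(c, v)] for c in range(m) if H[c][v])
                 for v in range(n)]
        for (c, v) in edges:
            v2c[(c, v)] = total[v] - c2v[(c, v)]
        # Hard decision, then stop early if every parity check is satisfied.
        hard = [0 if t >= 0 else 1 for t in total]
        if all(sum(H[c][v] * hard[v] for v in range(n)) % 2 == 0
               for c in range(m)):
            break
    return hard


# Assumed example: (7,4) Hamming code, all-zero codeword sent,
# third received value pushed to the wrong side by noise.
H = [
    [1, 1, 1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0, 1],
]
y = [0.9, 1.1, -0.2, 0.8, 1.2, 0.7, 1.0]
decoded = min_sum_decode(H, y)
```

Because every operation is a sign flip, a comparison or an addition, scaling all received values by a positive constant leaves the decisions unchanged, which is why the channel noise power does not enter the initialization.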

Min-sum algorithm
• Effects of clipping and quantization on MS at short block lengths are studied:
  - Clipping improves the performance.
  - 4 quantization bits provide performance close to, or even better than, that of unquantized MS (compared to 6 bits for BP in the LLR domain).
• Simple modifications that can considerably improve MS performance are proposed:
  - Modified MS can outperform BP!
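A minimal sketch of clipping followed by uniform quantization of a message is given below. The symmetric quantizer with 2^bits − 1 levels and the threshold value 3.5 are assumptions for illustration; the slides do not state the quantizer design or the optimal threshold.

```python
def clip_and_quantize(x, threshold, bits):
    """Clip x to [-threshold, threshold] and round to a symmetric uniform grid.

    The grid has 2**bits - 1 levels (zero plus 2**(bits-1) - 1 on each side).
    """
    levels = 2 ** (bits - 1) - 1      # number of positive levels
    step = threshold / levels         # spacing between adjacent levels
    clipped = max(-threshold, min(threshold, x))
    return round(clipped / step) * step


# Illustrative use with 4 bits and a hypothetical clipping threshold of 3.5,
# which gives a step of 0.5.
q_large = clip_and_quantize(10.0, 3.5, 4)    # saturates at the threshold
q_small = clip_and_quantize(0.74, 3.5, 4)    # rounds to the nearest level
q_neg = clip_and_quantize(-1.3, 3.5, 4)
```

In a quantized decoder this function would be applied to the channel values and to every message after each update.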

Quantized MS: (1268,456) irregular code

Quantized MS: (273,191) and (8000,4000) regular codes

Modified MS Algorithms: (1268,456) code

Min-sum and its modifications: concluding remarks
• With the optimal clipping threshold, only 4 bits suffice to obtain near (or even better than) unquantized performance.
• Modifications to the min-sum algorithm that considerably improve performance with a small increase in complexity are proposed.
• In some cases, the modified min-sum algorithms, even in their quantized form, outperform belief propagation!
• This indicates that algorithms which are optimal on cycle-free graphs do not necessarily deliver the best performance on graphs with cycles.
• Min-sum with unconditional correction seems to be a very good choice for practical digital decoding of LDPC codes.

Message-passing schedules
Motivation: Given an LDPC code with a particular TG, a given channel model, and an iterative decoding algorithm, is there any room for performance improvement? Yes, with similar or even lower complexity!
Main idea: Schedule the message-passing on the TG according to the structure of the graph to minimize the sub-optimality of the decoder.
Implementation: Girth- and closed-walk-dependent schedules:
  - Node-based vs. edge-based
  - Unidirectional vs. bidirectional
  - Deterministic vs. probabilistic

Message-passing schedules
• Different schedules provide different performance/complexity tradeoffs.
• In general, more complex schedules perform better.
• Edge-based and probabilistic schedules are more complex to implement than node-based and deterministic schedules, respectively. Bidirectional schedules are roughly twice as complex as the corresponding unidirectional ones.
• The performance/complexity tradeoff is not only a function of the schedule and the TG, but also depends on the decoding algorithm and the channel model.

Message-passing schedules
• Codes:
  I. Regular (1200,600)
  II. Regular (8000,4000)
  III. Irregular (1268,456)
  IV. Irregular (3072,1024)
• Channel models: BSC, AWGN, Rayleigh fading (with and without SI)
• Decoding algorithms: Gallager’s algorithm A, BP, MS

Edge-based vs. node-based schedule (GA for code I over BSC)

Bidirectional vs. unidirectional schedule (BP for code II over AWGN channel)

Deterministic vs. probabilistic schedule (BP for code III over AWGN channel)

Scheduling for MS (code I over AWGN channel)

Scheduling on uncorrelated Rayleigh fading channels (BP, code IV)

Message-passing schedules: concluding remarks
• Different schedules provide different tradeoffs between error performance and decoding complexity.
• The tradeoff depends not only on the girth and closed-walk distributions of the TG, but also on the decoding algorithm and the channel model.
• In general, the new schedules outperform the conventional flooding schedule.

Reliability-based schedule: (273,191) regular code

Reliability-based schedule: (273,191) PG code

Normalized and offset BP
• Motivation: The reliability of BP messages is overestimated on graphs with cycles.
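One way to counter overestimated reliabilities is to shrink message magnitudes, either by a multiplicative factor (normalization) or by a subtractive constant (offset). The sketch below shows both corrections applied to a single LLR message. The function names, the values of alpha and beta, and the choice of which messages are corrected are assumptions, since the slide does not give these details.

```python
import math


def normalize(msg, alpha):
    """Scale an LLR message by a factor alpha (0 < alpha <= 1)."""
    return alpha * msg


def offset(msg, beta):
    """Reduce the magnitude of an LLR message by beta >= 0, keeping its sign.

    Messages whose magnitude is below beta are set to zero.
    """
    return math.copysign(max(abs(msg) - beta, 0.0), msg)


# Illustrative values only: alpha = 0.8 and beta = 0.5 are hypothetical.
scaled = normalize(2.0, 0.8)
shifted = offset(-2.0, 0.5)
erased = offset(0.3, 0.5)
```

Both corrections leave the sign (the hard decision) of a message untouched and only reduce the confidence attached to it.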

Majority-based decoding algorithms
• Majority-based algorithms are based on a generalized majority-decision rule: for the ensemble of (dv, dc)-regular graphs (dc > dv ≥ 3), a majority-based algorithm of order ω, 0 ≤ ω ≤ dv – 1 – ⌈dv / 2⌉, denoted by MBω, is defined by

Majority-based decoding algorithms
• They are particularly attractive for their remarkably simple implementation (per iteration).
• Both Gallager’s algorithm A and standard majority decoding belong to this family.
• We investigate the dynamics of these algorithms using density evolution and compute the threshold values for regular LDPC codes decoded by these algorithms.
• It appears that many of these algorithms enjoy very fast convergence, and/or have better threshold values compared to Gallager’s algorithm A.
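A variable-node update for a majority-based algorithm of order ω might look like the sketch below. The flipping rule used here is an assumption: the node sends its channel bit unless at least ⌈dv / 2⌉ + ω of the dv − 1 extrinsic check messages disagree with it. Under this assumed rule, ω = 0 is a plain majority decision and the largest allowed order requires all extrinsic messages to disagree, which matches Gallager's algorithm A. The function name is hypothetical.

```python
import math


def mb_variable_message(channel_bit, extrinsic_bits, omega):
    """Variable-to-check message under an assumed majority rule of order omega.

    extrinsic_bits holds the dv - 1 binary messages from the other check nodes.
    """
    dv = len(extrinsic_bits) + 1
    max_order = dv - 1 - math.ceil(dv / 2)
    if not 0 <= omega <= max_order:
        raise ValueError("omega out of range for this variable-node degree")
    disagree = sum(1 for b in extrinsic_bits if b != channel_bit)
    # Flip the channel bit only if enough extrinsic messages contradict it.
    if disagree >= math.ceil(dv / 2) + omega:
        return 1 - channel_bit
    return channel_bit


# dv = 5: order 0 flips on 3 of 4 disagreements, order 1 needs all 4.
m0 = mb_variable_message(0, [1, 1, 1, 0], omega=0)
m1 = mb_variable_message(0, [1, 1, 1, 0], omega=1)
m2 = mb_variable_message(0, [1, 1, 1, 1], omega=1)
```

Each update needs only a counter and a comparison, which reflects the simple per-iteration implementation noted on the slide.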

Threshold values

Convergence speed
Number of iterations required to achieve an average fraction of erroneous messages below 10⁻⁶. The channel parameter is 90% of the smallest threshold value amongst different orders.

Majority-based decoding: concluding remarks
• Many of the majority-based algorithms have a larger noise threshold and enjoy much faster convergence than Gallager’s algorithm A.
• They can be used in conjunction with soft decoding algorithms in hybrid platforms to achieve very good performance/complexity tradeoffs.

Hybrid algorithms
• Combining different iterative decoding algorithms with the aim of improving the performance/complexity tradeoff.
• Suppose that there are N message-passing algorithms which can be used to decode over a given channel. A hybrid algorithm is defined by two probability mass functions at each iteration, one for partitioning the variable nodes and one for partitioning the check nodes into N partitions. The nodes in the i-th partition process the messages according to the i-th algorithm.
• Class I: time-invariant
• Class II: switch-type

Hybrid algorithms • Threshold values for some optimized hybrid algorithms:

Hybrid algorithms: concluding remarks
• Hybrid algorithms can provide large improvements in threshold and speed of convergence compared to their constituent algorithms.
• Class II (switch-type) algorithms have slightly better thresholds than class I (time-invariant) algorithms. The latter class, however, is far less sensitive to channel conditions and thus can be more attractive in practice.
• The convergence region of many majority-based algorithms extends to , which indicates that these simple algorithms can drive the error probability to zero once a more powerful algorithm has already reduced it sufficiently.
• Majority-based algorithms are good candidates for class II hybrid algorithms.

Iterative decoding in analog electronics
• The need for real-valued computations and the iterative nature of the BP algorithm have motivated some very recent research on analog implementations (1999 – 2002).
• This is projected to improve the ratio of speed to power consumption by two orders of magnitude.
• Proposed implementations are based on either BiCMOS or subthreshold CMOS technologies.
• We show that the min-sum algorithm can be implemented in full CMOS technology.
• Max winner-take-all (WTA) circuits with high swing, low voltage and very good accuracy have been designed.

Full CMOS min-sum analog iterative decoder
• Current-mode circuits.
• Lower fabrication cost and/or simpler design compared to previously reported analog iterative decoders that are based on BiCMOS or subthreshold CMOS technology.
• The higher robustness of MS helps mitigate the problems of mismatch and of parameter variations due to temperature changes in large analog integrated circuits.

Full CMOS min-sum analog iterative decoder
• Modules with a large number of inputs can be fabricated easily, and simulations show that increasing the number of inputs does not increase the delay by much.
• Special circuits have also been designed for deep submicron technologies, where short-channel effects degrade the performance of conventional circuits and low-voltage power supplies are used.
• The functionality of the circuits has been tested by simulating the decoder for a (7,4) Hamming code in TSMC 0.18 μm CMOS technology.
• An MS decoder for a regular (32,24) code has been designed and submitted for fabrication.

Dynamics of asynchronous continuous-time iterative decoding
• Iterative decoding with the flooding schedule can be formulated as a fixed-point problem solved iteratively by the successive substitution method.
• Analog asynchronous decoding can be approximated as the application of the well-known successive over-relaxation (SOR) method to the fixed-point problem.
• Simulation results confirm that SOR, which is in general superior to the simpler successive substitution method, can considerably improve the performance of BP and MS for short codes.
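The difference between successive substitution and a relaxed update can be illustrated on a toy scalar fixed-point problem, x = cos(x). This is not a decoder: the function, the starting point and the relaxation factor beta = 0.6 are assumptions chosen only to show the update rule x ← x + beta · (f(x) − x), which reduces to successive substitution for beta = 1.

```python
import math


def iterate(f, x0, beta=1.0, tol=1e-10, max_iter=1000):
    """Solve x = f(x) with the relaxed update x <- x + beta * (f(x) - x).

    Returns the final iterate and the number of iterations used.
    beta = 1 gives plain successive substitution.
    """
    x = x0
    for k in range(1, max_iter + 1):
        x_new = x + beta * (f(x) - x)
        if abs(x_new - x) < tol:
            return x_new, k
        x = x_new
    return x, max_iter


# Toy problem: the fixed point of cos(x) is approximately 0.7390851332.
plain_root, plain_iters = iterate(math.cos, 1.0, beta=1.0)
relaxed_root, relaxed_iters = iterate(math.cos, 1.0, beta=0.6)
```

On this toy problem both updates reach the same fixed point, but the relaxed one needs far fewer iterations. In a decoder, x would be the vector of all messages and f one flooding iteration.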

Simulation results

Dynamics of analog iterative decoding: concluding remarks
• Implementing iterative decoding algorithms in analog circuits not only increases the ratio of speed to power consumption compared to digital synchronous circuits, but can also provide better performance.
• This work also suggests yet another framework for improving iterative decoding algorithms in general, and belief propagation in particular, on graphs with cycles.

RC-LDPC codes in hybrid ARQ schemes
• Type-II hybrid ARQ protocol
• Rate-compatible (RC) LDPC codes constructed by progressive edge growth (PEG) construction
• Linear-time encoding
• Design of puncturing and extending patterns

RC-LDPC codes in hybrid ARQ schemes

Wish I had more time! Thanks!
