## Codes for Deletion and Insertion Channels with Segmented Errors

**Codes for Deletion and Insertion Channels with Segmented Errors**
Zhenming Liu, Michael Mitzenmacher
Harvard University, School of Engineering and Applied Sciences

**The Most Basic Channels**
• Binary erasure channel.
  • Each bit is replaced by a ? with probability p.
  • Very well understood.
• Binary symmetric channel.
  • Each bit flipped with probability p.
  • Very well understood.
• Binary deletion channel.
  • Each bit deleted with probability p.
  • We don’t even know the capacity!!!

**Motivation**
• Capacity/coding results for deletion/insertion channels are very hard.
• Very little theory for practical coding schemes.
• Huge gap between codes and capacity bounds.
• Perhaps this is an artifact of the model.
• Are independent deletions/insertions the right model for insertions/deletions in practice?
• Do different models yield much better results?
• If so, would highlight challenges of original model.

**Model Motivation**
• Claim: Deletion/insertion errors occur because of timing mismatches.
• Mechanisms running at slightly different speeds.
• Clock drift.
• After one deletion (or insertion), some time passes before the next.

**Channel Model: Segmented Deletions**
• Input is divided into consecutive blocks of b bits.
• Channel guarantee: at most one deletion per block.
• No block markers at output.
• Example: b = 8.
  00001110001111
  0001011100101111
  00010111001011
  0001011100101111

**Segmented Deletion Model**
• More general than models requiring a gap between deletions.
• Two consecutive deletions can occur on the boundary.
• Can define similar segmented insertion model.

**Codes for Segmented Deletions: Our Approach**
• Create a codebook C with strings of b bits.
• Codeword is concatenation of blocks from C.
• Aim to decode blocks from left to right, without losing synchronization, regardless of errors.
• Questions:
  • How can this be done?
  • What properties does C need?
  • How large can C be?

**Notation**
• Let D1(u) be all strings obtainable by deleting 1 bit from u.
• And, for a set of strings S, D1(S) = ∪ D1(u) over all u ∈ S.
• Codebook C is 1-deletion correcting if D1(u) ∩ D1(v) = ∅ for all distinct u, v ∈ C.
  • Fixed map from strings with 1 deletion to codeword.
  • Our C will have this property.
• Let pref(u) be first k - 1 bits of k-bit string u, and suff(u) be last k - 1 bits.
• Similarly define pref(S), suff(S).

**Intuition**
• At start of decoding, after reading first b - 1 bits, we know the first block.
  • Assuming C is 1-deletion correcting.
• But don’t know if next block starts at bit b or bit b + 1 of received string.
  • Is marked received 0 from 1st block or 2nd?
• Can’t resolve ambiguity.
• Need to make sure ambiguity does not grow.
• Key invariant: each successive block starts in one of two positions.
• Sent: 00100100????????
• Received: 00100100…

**Theorem Statement**
• For a segmented deletion channel with blocklength b, consider a codebook C of strings of length b satisfying:
  • [conditions missing from the slide transcript]
• Such a codebook allows linear time left-to-right decoding.

**Proof Sketch**
• Maintain invariant: suppose block starts at position k or k + 1 of received string R. To decode block:
  • Done if [condition missing from the slide transcript]
  • Otherwise [formula missing from the slide transcript], and this determines the sent block.
• As long as sent block not of form [pattern missing from the slide transcript], next block starts at position k + b - 1 or k + b.

**Finding Valid Codebooks**
• Restrictions lead to independent set problem.
  • Each possible b-bit codeword is a vertex.
  • Throw out vertices for restricted strings.
  • Edge between two vertices u, v if [condition missing from the slide transcript]
• Maximum independent set = largest codebook.
  • Can be found exhaustively for small b.
  • Use heuristics (greedy) for larger b.

**Results**
• Codes from exhaustive search:
  • 8-bit blocks, 12 codewords: rate > 44%
  • 9-bit blocks, 20 codewords: rate > 48%
• Codes from heuristics:
  • 16-bit blocks, 740 codewords: rate > 59%
• Decoding is simple and easily done in hardware.

**Insertions**
• Can analyze segmented insertion channels the same way.
• Surprising result: the codebooks for insertions and codebooks for deletions have the same properties!
• Non-obvious symmetry!

**Improvements**
• Extended scheme simulated in extended version of paper.
• Ideas:
  • Increase C so that multiple decodings are locally possible (per block).
  • Use parity checks (local/global) to remove spurious decodings.
  • Use dynamic programming to enforce globally consistent decoding.
• Results in higher rates, but slower, and currently no provable guarantees.

**Conclusions and Open Questions**
• Codes ready for implementation.
  • Any users?
• Theoretical limits.
  • Capacity bounds for segmented channels?
  • Time/capacity tradeoffs?
• Possible improvements.
  • Analysis of more general dynamic-programming based scheme?
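
The channel model, the D1 notation, and the rates quoted in the Results slide can be made concrete with a short Python sketch. It is illustrative only and is not the authors' implementation. The function names (`d1`, `is_one_deletion_correcting`, `segmented_deletion_channel`, `rate`) and the per-block deletion probability `p` are assumptions introduced here; the slides specify only the guarantee of at most one deletion per block, not how deletions are distributed. The check covers only the 1-deletion-correcting property, not the additional codebook conditions of the theorem, and the rate is taken to be log2(|C|) / b.

```python
import math
import random
from itertools import combinations


def d1(u: str) -> set[str]:
    """All strings obtainable by deleting exactly one bit from u (D1(u) in the slides)."""
    return {u[:i] + u[i + 1:] for i in range(len(u))}


def is_one_deletion_correcting(codebook: list[str]) -> bool:
    """True if D1(u) and D1(v) are disjoint for every pair of distinct codewords."""
    return all(d1(u).isdisjoint(d1(v)) for u, v in combinations(codebook, 2))


def segmented_deletion_channel(bits: str, b: int, rng: random.Random, p: float = 0.5) -> str:
    """Delete at most one bit from each consecutive b-bit block.

    Assumption: each block independently suffers a deletion with probability p,
    at a uniformly random position. No block markers appear in the output.
    """
    out = []
    for start in range(0, len(bits), b):
        block = bits[start:start + b]
        if block and rng.random() < p:
            i = rng.randrange(len(block))
            block = block[:i] + block[i + 1:]
        out.append(block)
    return "".join(out)


def rate(num_codewords: int, b: int) -> float:
    """Information bits per transmitted bit for a codebook of b-bit blocks."""
    return math.log2(num_codewords) / b


if __name__ == "__main__":
    sent = "0001011100101111"  # the 16-bit string from the b = 8 example
    # One deletion in each 8-bit block can produce the 14-bit string 00001110001111.
    print("0000111" in d1(sent[:8]) and "0001111" in d1(sent[8:]))
    print(segmented_deletion_channel(sent, 8, random.Random(0)))
    print(is_one_deletion_correcting(["0000", "1111"]))
    for size, b in [(12, 8), (20, 9), (740, 16)]:
        print(b, size, rate(size, b))
```

With these definitions the three rates evaluate to roughly 0.448, 0.480, and 0.596, consistent with the "> 44%", "> 48%", and "> 59%" figures on the Results slide.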