
A Scalable Architecture for LDPC Decoding

A Scalable Architecture for LDPC Decoding. Cocco, M.; Dielissen, J.; Heijligers, M.; Hekstra, A.; Huisken, J. Design, Automation and Test in Europe Conference and Exhibition, 2004. Proceedings, Volume 3, Feb. 16-20, 2004, pages 88-93.


Presentation Transcript


  1. A Scalable Architecture for LDPC Decoding • Cocco, M.; Dielissen, J.; Heijligers, M.; Hekstra, A.; Huisken, J. • Design, Automation and Test in Europe Conference and Exhibition, 2004. Proceedings, Volume 3, Feb. 16-20, 2004, pages 88-93

  2. Outline • Introduction • Serial approach • UMP algorithm • Dataset in check nodes • Check operation • Computation skill • Memory reduction • Computation for Iteration

  3. Introduction • High-rate (R = 0.9) LDPC code • Row weight K (average 30) • High code rate, long codeword length, and high SNR • Memory reduction (1/10)

  4. Serial Approach • Storage media application (optical or magnetic) • Relaxed delay requirement • Process from first bit node to last bit node • Memory storage for message

  5. UMP Algorithm • "FOR 40 ITERATIONS DO" • "FOR ALL BIT NODES DO" • "FOR EACH INCOMING ARC X" • "SUM ALL INCOMING LLRs EXCEPT OVER X" • "SEND THE RESULT BACK OVER X" • "NEXT ARC" • "NEXT BIT NODE" • "FOR ALL CHECK NODES DO" • "FOR EACH INCOMING ARC X" • "TAKE THE ABS MINIMUM OF THE INCOMING LLRs EXCEPT OVER X" • "TAKE THE XOR OF THE INCOMING LLRs EXCEPT OVER X" • "SEND THE RESULT BACK OVER X" • "NEXT ARC" • "NEXT CHECK NODE" • "NEXT ITERATION"
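The loop on this slide can be sketched in Python as follows. This is a minimal illustration and not the authors' implementation: the function name `ump_decode`, the list-of-rows representation of the parity-check matrix `H`, and the sign convention (a positive LLR favours bit 0) are all assumptions made for the example. It also assumes every check node has at least two connected bit nodes, and it reads the "XOR" step as the XOR of the hard bits (signs) of the incoming LLRs.

```python
def ump_decode(H, channel_llr, iterations=40):
    """Hypothetical sketch of UMP (min-sum) decoding.

    H is a list of 0/1 rows (one per check node); channel_llr holds one
    LLR per bit node. Assumption: positive LLR means bit 0 is more likely.
    """
    m, n = len(H), len(H[0])
    check_to_bit = {(c, b): 0.0 for c in range(m) for b in range(n) if H[c][b]}
    bit_to_check = {}
    for _ in range(iterations):
        # Bit nodes: sum all incoming LLRs except the one over arc X.
        for b in range(n):
            arcs = [c for c in range(m) if H[c][b]]
            total = channel_llr[b] + sum(check_to_bit[(c, b)] for c in arcs)
            for c in arcs:
                bit_to_check[(c, b)] = total - check_to_bit[(c, b)]
        # Check nodes: abs minimum and XOR of hard bits, except over arc X.
        for c in range(m):
            arcs = [b for b in range(n) if H[c][b]]
            for x in arcs:
                others = [bit_to_check[(c, b)] for b in arcs if b != x]
                magnitude = min(abs(v) for v in others)
                negative = sum(v < 0 for v in others) % 2
                check_to_bit[(c, x)] = -magnitude if negative else magnitude
    # Hard decision on the channel LLR plus all incoming check messages.
    posterior = [
        channel_llr[b] + sum(check_to_bit[(c, b)] for c in range(m) if H[c][b])
        for b in range(n)
    ]
    return [1 if v < 0 else 0 for v in posterior]


# Toy example: a length-3 repetition code (b0 = b1, b1 = b2).
H = [[1, 1, 0],
     [0, 1, 1]]
decoded = ump_decode(H, [2.0, -1.0, 3.0])
```

In the toy example the middle bit arrives with the wrong sign but low reliability, and the two neighbouring bits outvote it.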

  6. UMP algorithm • No knowledge of the channel SNR is needed → robust performance • No complex mathematical function (tanh x) is needed → area saving

  7. Dataset in check nodes • Minimum: the overall minimum value • One-but-minimum: the second-smallest value • Index: the position of the input that supplied the minimum

  8. Check operation • Compute the exclusive OR of all hard bits output by the connected bit nodes, except the jth. • Compute the minimum of the absolute values of the LLRs of all K bit nodes to which the check node is connected, except the jth.

  9. Computation skill • Minimum: if LLRj is not the overall minimum, the result for j is the overall minimum; otherwise it is the one-but-minimum.
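The selection rule above can be checked against a brute-force "all except j" computation. The sketch below is illustrative only: the names `build_dataset` and `check_output` are hypothetical, and storing the overall parity of the hard bits alongside the minimum, one-but-minimum, and index is an assumption made here so that the sign of the output can also be recovered.

```python
def build_dataset(llrs):
    """Summarise the K incoming LLRs of one check node (K >= 2 assumed)."""
    magnitudes = [abs(v) for v in llrs]
    index = min(range(len(llrs)), key=magnitudes.__getitem__)
    minimum = magnitudes[index]
    one_but_minimum = min(m for i, m in enumerate(magnitudes) if i != index)
    parity = sum(v < 0 for v in llrs) % 2  # XOR of all hard bits (assumed field)
    return minimum, one_but_minimum, index, parity


def check_output(dataset, j, llr_j):
    """Message back to input j, using only the compact dataset."""
    minimum, one_but_minimum, index, parity = dataset
    # If j supplied the overall minimum, fall back to the one-but-minimum.
    magnitude = one_but_minimum if j == index else minimum
    # Removing j's own hard bit from the overall parity gives "XOR except j".
    sign_bit = parity ^ (llr_j < 0)
    return -magnitude if sign_bit else magnitude


def brute_force(llrs, j):
    """Reference: recompute min and XOR over all inputs except j."""
    others = [v for i, v in enumerate(llrs) if i != j]
    magnitude = min(abs(v) for v in others)
    return -magnitude if sum(v < 0 for v in others) % 2 else magnitude


llrs = [3.0, -1.0, 2.5, -4.0]
dataset = build_dataset(llrs)
outputs = [check_output(dataset, j, v) for j, v in enumerate(llrs)]
```

Only three magnitudes-related values and one parity bit are kept per check node here, instead of one message per connected bit node.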

  10. Memory reduction • Original size • Reduced size

  11. Memory unit inside Check node

  12. Computation for Iteration • "FOR 40 ITERATIONS DO" • "FOR ALL BIT NODES DO" • "CALCULATE THE OUTPUT MESSAGES FROM THE 3 CONNECTED CHECK NODES" • "DO RUNNING CHECK NODE UPDATES ON THE 3 CHECK NODES" • "NEXT BIT NODE" • "NEXT ITERATION"
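One way to read this schedule is sketched below. It is a guess at the structure and not the authors' design: the function names, the four-field dataset layout (minimum, one-but-minimum, index, parity), the per-arc sign store `sign`, and the all-zero initial datasets are assumptions. Each bit node reads the OLD datasets of its connected check nodes, forms its outputs, and folds them into the NEW datasets as running updates. Every check node is assumed to have at least two connected bit nodes.

```python
INF = float("inf")


def running_update(ds, bit, llr):
    """Fold one bit-to-check message into a NEW dataset in place."""
    mag = abs(llr)
    if mag < ds[0]:
        ds[1], ds[0], ds[2] = ds[0], mag, bit  # new overall minimum
    elif mag < ds[1]:
        ds[1] = mag                            # new one-but-minimum
    ds[3] ^= llr < 0                           # running XOR of hard bits


def serial_iteration(checks_of_bit, channel_llr, old, sign):
    """One pass over all bit nodes, first to last."""
    new = [[INF, INF, -1, 0] for _ in old]
    posterior = []
    for bit, checks in enumerate(checks_of_bit):
        incoming = []
        for c in checks:
            minimum, one_but_minimum, index, parity = old[c]
            mag = one_but_minimum if index == bit else minimum
            s = parity ^ sign[(c, bit)]        # XOR of all hard bits except ours
            incoming.append(-mag if s else mag)
        total = channel_llr[bit] + sum(incoming)
        posterior.append(total)
        for c, msg in zip(checks, incoming):
            out = total - msg                  # sum of all inputs except over c
            running_update(new[c], bit, out)
            sign[(c, bit)] = out < 0           # remembered for the next pass
    return new, posterior


def serial_decode(checks_of_bit, num_checks, channel_llr, iterations=40):
    old = [[0.0, 0.0, -1, 0] for _ in range(num_checks)]  # no information yet
    sign = {(c, b): False for b, cs in enumerate(checks_of_bit) for c in cs}
    posterior = list(channel_llr)
    for _ in range(iterations):
        old, posterior = serial_iteration(checks_of_bit, channel_llr, old, sign)
    return [1 if v < 0 else 0 for v in posterior]


# Toy example: length-3 repetition code; bit 1 sits on both checks.
checks_of_bit = [[0], [0, 1], [1]]
decoded = serial_decode(checks_of_bit, 2, [2.0, -1.0, 3.0])
```

In this sketch the NEW datasets simply become the OLD ones at the end of each pass, so per-arc message storage is replaced by one dataset per check node plus one sign bit per arc.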

  13. Computation for Iteration (figure: datasets, each shown as NEW | OLD)

  14. Time folded architecture (block diagram: serial input, serial output, control with R/W & address, FSM & PC, μROM, computational kernel, prefetcher, memory)

  15. Prefetch • Every dataset is statically used for 30 consecutive cycles. • On average, 2 read (R) and 2 write (W) operations are required every clock cycle. • Delayed writeback • Dataset caching

  16. Tiled architecture (block diagram, per tile: FSM & PC, μROM, computational kernel, prefetcher, memory)

  17. Result and area distribution • N = 1020, R = 0.5, 57 tiles: 36 mm² in 0.13 μm at 1 GHz, 300 Mb/s

  18. Conclusion • Speedup and multiple simultaneous accesses → prefetch • Reduced memory access latency → memory hierarchy • Increased performance → N-tiled architecture • A modified version can be pipelined
