An Improved Split-Row Threshold Decoding Algorithm for LDPC Codes Tinoosh Mohsenin, Dean Truong and Bevan M. Baas VLSI Computation Lab, ECE Department University of California, Davis
Outline Introduction LDPC Decoding Goals and Key Features Split-Row Threshold Decoding Method Error Performance Results Split-Row Threshold Decoder Implementation and Results Conclusion
LDPC Decoding Message passing decoding LDPC decoding challenges High interconnect complexity for large number of processing nodes Large delay, area, and power dissipation caused by long and global wire
Outline Introduction to LDPC Decoding Goals and Key Features Split-Row Threshold Decoding Method Error Performance Results Split-Row Threshold Decoder Implementation and Results Conclusion
LDPC Decoder Design Goals and Features Key goals Very high throughput and high energy efficiency Area efficient (small circuit area) Well suited for long-length and large row weight LDPC codes Easy implementation with automatic CAD tools Good error performance Split-Row decoding key features Reduced interconnect complexity Reduced processor complexity T. Mohsenin and B. Baas, “Split-row: A reduced complexity, high throughput LDPC decoder architecture,” in ICCD, 2006 T. Mohsenin and B. Baas, “High-throughput LDPC decoders using a multiple Split- Row method,” in ICASSP, 2007
Standard vs. Split-Row Decoding standard decoding Split-Row decoding = H split - sp 1 C V 3 5 8 10 reduction of check processor area reduction of input wires to check processor
MinSum vs. MinSum Split-Row Sign Magnitude MinSum: MinSum Split-Row:
Outline Introduction to LDPC Decoding Goals and Key Features Split-Row Threshold Decoding Method Error Performance Results Split-Row Threshold Decoder Implementation and Results Conclusion
MinSum Split-Row Threshold Algorithm A signal (Threshold_en) is passed from each partition, which indicates whether a partition has a minimum less than a given threshold (T). Based on Threshold_en status, the check nodes take as their minimum of their own local Min or T. Optimum threshold value (T) is obtained by empirical simulations Threshold_en Sp1=1 Threshold_en Sp0=0 MinSum Split-Row Threshold:
Impact of Threshold Selection 15 decoding iterations SNR=4.2 dB Optimum T=0.2 Optimum T=0.2 (6,32) (2048,1723) LDPC Code Optimum threshold (T) is independent of SNR and decoding iteration
Outline Introduction to LDPC Decoding Goals and Key Features Split-Row Threshold Decoding Method Error Performance Results Split-Row Threshold Decoder Implementation and Results Conclusion
Error Performance for (16,4) (1536,1155) LDPC Code In the Plot: BPSK modulation AWGN channel Based on 80 error blocks Maximum 15 iterations SPA: Sum Product Algorithm MS: MinSum Normalized For MS Split-Row Threshold T=0.3 0.08dB 0.42 dB
Multi-Split-Row Threshold Decoding Divide parity check matrix to Spn (Spn>2) partitions Partitioning can be arbitrary so long as there are at least two variable nodes per partition Example: (6,32) (2048,1723) LDPC Code 32/Spn variable nodes
Error Performance for (2048,1723) 10GBASE-T Code MS Split-Row-2 Threshold is 0.07 dB away from MS MS Split-Row-16 Threshold is 0.22 dB away from MS and is 0.12 dB better than Split-Row-2 Original. Decoder Split-2 Split-4 Split-8 Split-16 Optimum T 0.2 0.23 0.24 0.22 dB 0.12 dB
Outline Introduction to LDPC Decoding Goals and Key Features Split-Row Threshold Decoding Method Error Performance Results Split-Row Threshold Decoder Implementation and Results Conclusion
Comparison of Decoders (6,32) (2048,1723) 10GBASE-T code with 15 decoding iterations. 10GBASE-T Code 65 nm, 7 M, 1.3 V MinSum standard Split-2 Threshold Split-4 Threshold Split-8 Threshold Split-16 Threshold Split-16 vs.MinSum Area Utilization 25% 40% 85% 95% 98% 3.9x Area (mm2) 18.2 8.9 5.0 4.5 3.8 4.8x Speed (MHz) 17 40 53 112 101 5.9x Throughput @ 15 iter (Gbps) 2.3 5.5 7.2 15.2 13.8 6x CAD Tool CPU Time (hour) >78 36 18 10 5 >15.6x
Conclusion Split-Row Threshold algorithm improves the error performance when compared with original Split-Row. Split-Row Threshold allows for high level of partitionings without losing significant error performance. Higher level of partitioning reduces the number of connections between check and variable processors. This results in a higher logic utilization, smaller and faster circuits. We can meet the demands of high speed applications while obtaining very low area when compared to standard decoding.
Support Special thanks Acknowledgements ST Microelectronics NSF Grant 430090 and CAREER award 546907 Intel SRC GRC Grant 1598 and CSR Grant 1659 Intellasys UC Micro SEM Special thanks Professor Shu Lin