Published by Joshua King. Modified over 9 years ago.
1
Split-Row: A Reduced Complexity, High Throughput Low Density Parity Check (LDPC) Decoder Architecture
Tinoosh Mohsenin and Bevan M. Baas
VLSI Computation Lab, ECE Department, University of California, Davis
2
Outline
- Introduction to LDPC Codes
- Split-Row Decoder Algorithm
- Error Performance Comparison
- Decoder Implementation Results
- Conclusion
3
Error Correction in Communication Systems
Error correction is used in most communication systems.
4
LDPC Codes Applications
Standards:
- 10 Gigabit Ethernet (10GBASE-T): 2006
- Digital Video Broadcasting (DVB-S2): 2005
- Next generation of WiFi and WiMAX
Problems with current LDPC decoders:
- Insufficient memory bandwidth
- High interconnect complexity
[www.ieee802.org/3/an/]
5
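LDPC Coding
[Figure: at the transmitter, an encoded image is sent over a noisy channel; at the receiver, the panels show the received image, the result after iteration 1, the result after iteration 14, and the decoded image. Images modified from [MacKay 2001].]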
6
LDPC Decoding: Message Passing Algorithm
Performs row and column operations iteratively.
Example parity check matrix:
1 0 0 0 0 1 0 1 0
0 1 0 1 0 0 0 0 1
0 0 1 0 1 0 1 0 0
0 0 1 1 0 0 0 1 0
1 0 0 0 1 0 0 0 1
0 1 0 0 0 1 1 0 0
[Figure: decoding flow. The received information from the channel enters a loop of row processing, column processing, error correction, and parity check. Row processing produces the α messages and column processing produces the β messages.]
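The row and column structure of the example matrix on this slide can be made concrete with a short sketch. It reads the six 9-bit groups as the rows of a 6 × 9 parity check matrix (an assumption about how the flattened digits were laid out) and lists which columns feed each row processor and which rows feed each column processor. The names `H_ROWS`, `row_neighbors`, and `col_neighbors` are illustrative, not taken from the slides.

```python
# Parity check matrix digits from the slide, one string per row (6 checks x 9 bits).
H_ROWS = [
    "100001010",
    "010100001",
    "001010100",
    "001100010",
    "100010001",
    "010001100",
]
H = [[int(c) for c in row] for row in H_ROWS]

# Columns that take part in each row (the inputs of each row processor).
row_neighbors = [[j for j, h in enumerate(row) if h] for row in H]
# Rows that take part in each column (the inputs of each column processor).
col_neighbors = [[i for i, row in enumerate(H) if row[j]] for j in range(len(H[0]))]

row_weights = [len(n) for n in row_neighbors]
col_weights = [len(n) for n in col_neighbors]
print("row weights:", row_weights)
print("column weights:", col_weights)
```

For this matrix every row has weight 3 and every column has weight 2, so each row processor exchanges messages with three column processors and each column processor with two row processors.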
7
Serial Decoders
One or a few row and column processing units.
Features:
- Simple
- Small area
- Small number of memories
Disadvantages:
- Low memory bandwidth
- Low throughput: 100 Kbps to 10 Mbps
8
Full Parallel Decoders
Row and column processors are directly mapped according to the parity check matrix.
High throughput
Disadvantages:
- Large circuit area
- High interconnect complexity
Example: 2048-bit 10GBASE-T code (row weight = 32, column weight = 6, 5-bit quantization):
- 139 mm² in 0.18 µm CMOS
- 122,000 long inter-processor wires
- 1.3 Gbps
9
Outline
- Introduction to LDPC Codes
- Split-Row Decoder Algorithm
- Error Rate Comparison
- Decoder Implementation Results
- Conclusion
10
Key Features of Split-Row Decoder
Row processing (dominates decoder complexity):
- Increased parallelism
- Reduced number of memory accesses
- Reduced processor complexity
Results:
- Smaller decoder area and higher utilization
- Lower interconnect complexity
- Higher throughput
- Simpler hardware implementation
11
Standard vs. Split-Row Decoder
[Figure: parity check matrix partitioning.]
- Standard decoder: N columns, row weight = Wr
- Split-Row decoder: two halves, each with N/2 columns and row weight = Wr/2
12
Split-Row Algorithm: Mathematical View
- The magnitude part of the row processor output, α, is larger for the Split-Row decoder.
- Normalizing the α values with a scale factor S < 1 improves the error performance of the Split-Row decoder.
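A small numeric sketch can illustrate why the α magnitude grows and where the scale factor S enters. It assumes, based on the description in this deck, that each half takes the minimum only over its own inputs while the sign is still computed over the whole row. The function names, the input values, and S = 0.5 are hypothetical and chosen only for illustration.

```python
def sign_of(x):
    """Return -1 for negative inputs and +1 otherwise."""
    return -1 if x < 0 else 1


def minsum_row(betas):
    """Standard MinSum row update over the full row."""
    alphas = []
    for j in range(len(betas)):
        others = betas[:j] + betas[j + 1:]
        sign = 1
        for b in others:
            sign *= sign_of(b)
        alphas.append(sign * min(abs(b) for b in others))
    return alphas


def split_row_update(betas, scale):
    """Assumed Split-Row update: local minimum per half, sign over the full row.

    Each half needs at least two inputs so that a local minimum exists.
    """
    half = len(betas) // 2
    alphas = []
    for start, stop in ((0, half), (half, len(betas))):
        for j in range(start, stop):
            # Magnitude: minimum over the other inputs in the same half only.
            local_min = min(abs(betas[k]) for k in range(start, stop) if k != j)
            # Sign: product over every other input in the row, both halves.
            sign = 1
            for k, b in enumerate(betas):
                if k != j:
                    sign *= sign_of(b)
            alphas.append(scale * sign * local_min)
    return alphas


betas = [0.9, -2.0, 1.5, -0.4, 3.0, 1.1]  # made-up inputs, row weight 6
standard = minsum_row(betas)
unscaled = split_row_update(betas, 1.0)
scaled = split_row_update(betas, 0.5)      # S = 0.5 is an arbitrary value below 1
print(standard)
print(unscaled)
print(scaled)
```

Because a minimum over half of the inputs can never be smaller than the minimum over all of them, the unscaled Split-Row magnitudes are at least as large as the standard ones; multiplying by S < 1 pulls them back down.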
13
Outline
- Introduction to LDPC Codes
- Split-Row Decoder Algorithm
- Error Performance Comparison
- Decoder Implementation Results
- Conclusion
14
Bit Error Rate Performance Comparison
- Code length: 1536 bits
- Message length: 1155 bits
- Row weight: 16
- Column weight: 4
- No. of iterations: 15
[Figure: bit error rate curves. Legend: MS = MinSum; MS Split-Row = MinSum Split-Row; S = scale factor. Annotation on plot: 0.6 dB.]
15
Bit Error Rate Performance Comparison
- Code length: 2048 bits
- Message length: 1723 bits
- Row weight: 32
- Column weight: 6
- No. of iterations: 15
[Figure: bit error rate curves. Legend: MS = MinSum; MS Split-Row = MinSum Split-Row; S = scale factor. Annotation on plot: 0.3 dB.]
16
Outline
- Introduction to LDPC Codes
- Split-Row Decoder Algorithm
- Error Rate Comparison
- Decoder Implementation Results
- Conclusion
17
A Full-Parallel Decoder Implementation
LDPC code example:
- Code length = 1536 bits
- Message length = 770 bits
- Row weight = 6
- Column weight = 3
In the Split-Row decoder:
- The number of wires between the two halves is 3% of the total number of wires.
- Row processors in each half are 2.7 times smaller.
- Each row processor in each half is connected to only 3 column processors.
18
Full Parallel Decoder Architecture
0.18 µm CMOS technology, 6 metal layers
[Figure: Standard MinSum and Split-Row decoder implementations.]
Split-Row, each half includes:
- 768 row processors
- 768 column processors
19
Split-Row vs. Standard Decoder
- 1536-bit (3,6) quasi-cyclic LDPC code
- Quantization is set to 5 bits per message.
- For the throughput computation, the number of decoding iterations is set to 15.
- Reported numbers are based on chip implementation results in 0.18 µm CMOS.

                       Avg. wire    Chip size  Clk freq.  Throughput  CAD tool P&R     CAD tool P&R
                       length (mm)  (mm²)      (MHz)      (Gbps)      run time (mins)  req. mem (GB)
Standard MinSum        0.224        22.1       32         3.2         320              3.9
Split-Row (this work)  0.142        16.8       53         5.4         193              2.3
Improvement            1.58×        1.3×       1.7×       1.7×        1.65×            1.7×
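The improvement row can be cross-checked with a few lines of arithmetic. The sketch below uses the values as read from the table (0.224, 22.1, 32, 3.2, 320, 3.9 for Standard MinSum and 0.142, 16.8, 53, 5.4, 193, 2.3 for Split-Row); the dictionary keys are illustrative names, and treating clock frequency and throughput as "larger is better" is the only assumption.

```python
# Values as read from the results table (0.18 um implementation).
standard = {"wire_mm": 0.224, "area_mm2": 22.1, "clk_mhz": 32,
            "gbps": 3.2, "pnr_min": 320, "mem_gb": 3.9}
split = {"wire_mm": 0.142, "area_mm2": 16.8, "clk_mhz": 53,
         "gbps": 5.4, "pnr_min": 193, "mem_gb": 2.3}

# Clock frequency and throughput improve when they grow; everything else
# (wire length, area, run time, memory) improves when it shrinks.
LARGER_IS_BETTER = {"clk_mhz", "gbps"}

improvement = {}
for key in standard:
    if key in LARGER_IS_BETTER:
        improvement[key] = split[key] / standard[key]
    else:
        improvement[key] = standard[key] / split[key]

for key, ratio in improvement.items():
    print(f"{key}: {ratio:.2f}x")
```

The computed ratios agree with the reported improvements to within rounding.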
20
Conclusion
The Split-Row decoder method provides a significant reduction in circuit area.
Results in:
- Reduced wire interconnect complexity
- Increased circuit area utilization
- Increased speed
- Simpler implementation
A good tradeoff between hardware complexity and error performance.
21
Acknowledgments
- Intel Corporation
- UC Micro
- NSF Grant No. 0430090
- UCD Faculty Research Grant
22
MinSum: Message Passing (Row processing)
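For reference, the MinSum row update is commonly written as follows. This is the textbook form, assumed here rather than copied from the slide. It uses the deck's symbols α (row processor output) and β (column processor output); the index set V(i), meaning the columns that take part in row i, is notation introduced for this sketch.

```latex
\alpha_{ij} \;=\; \prod_{j' \in V(i) \setminus j} \operatorname{sign}\!\left(\beta_{ij'}\right)
\;\times\; \min_{j' \in V(i) \setminus j} \left|\beta_{ij'}\right|
```

Each output toward column j excludes the message that came from column j itself, which is why both the product and the minimum run over V(i) \ j.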
23
Message Passing (Column processing)
λ_j is the received information.
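The column update can be sketched in a few lines, assuming the usual message passing form in which the message sent back to a row is the received information λ_j plus the α messages from all other rows. The function name `column_update`, the example values, and the convention that a non-negative total maps to bit 0 are assumptions made for illustration.

```python
def column_update(lam, alphas):
    """Column processing for one column j.

    lam    -- received channel information for this column (lambda_j)
    alphas -- alpha messages from the rows connected to this column
    Returns the beta message for each connected row and a hard decision.
    """
    # The message to row i leaves out the alpha that came from row i.
    betas = [lam + sum(alphas[:i] + alphas[i + 1:]) for i in range(len(alphas))]
    # The hard decision uses every incoming message (assumed: total >= 0 means bit 0).
    total = lam + sum(alphas)
    hard_bit = 0 if total >= 0 else 1
    return betas, hard_bit


# Made-up values for a column of weight 3.
betas, hard_bit = column_update(-0.5, [1.0, -0.25, 2.0])
print(betas, hard_bit)
```

In this example the channel value alone is negative, but the three α messages outweigh it, so the hard decision after the update differs from the raw channel decision.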
24
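[Figure: message passing on the parity check matrix H, with labels λ1, α, and y1.]
H =
1 0 0 0 0 1 0 1 0
0 1 0 1 0 0 0 0 1
0 0 1 0 1 0 1 0 0
0 0 1 1 0 0 0 1 0
1 0 0 0 1 0 0 0 1
0 1 0 0 0 1 1 0 0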
25
Parity check result:
- = 0: stop decoding
- ≠ 0: repeat decoding
26
LDPC Codes
- An LDPC code is defined by a binary matrix called the parity check matrix, H.
- Rows define parity check equations (constraints) between encoded symbols in a code word, and the number of columns defines the length of the code.
- V is a valid code word if H·Vᵀ = 0.
- The decoder in the receiver checks whether the condition H·Vᵀ = 0 holds.
Example: parity check matrix for a (9, 5) LDPC code, row weight = 4, column weight = 2:
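The check H·Vᵀ = 0 can be sketched as a modulo-2 matrix-vector product. The sketch below does not use the (9, 5) example named above; it uses the 6 × 9 matrix whose digits appear elsewhere in this deck (row weight 3, column weight 2). The function name `syndrome` and the two test vectors are illustrative.

```python
H_ROWS = [
    "100001010",
    "010100001",
    "001010100",
    "001100010",
    "100010001",
    "010001100",
]
H = [[int(c) for c in row] for row in H_ROWS]


def syndrome(H, v):
    """Return H * v^T over GF(2): one parity bit per row of H."""
    return [sum(h * bit for h, bit in zip(row, v)) % 2 for row in H]


valid = [1, 0, 1, 0, 1, 0, 0, 1, 0]      # satisfies every parity check of this H
corrupted = [0, 0, 1, 0, 1, 0, 0, 1, 0]  # same word with the first bit flipped

print(syndrome(H, valid))      # all zeros: stop decoding
print(syndrome(H, corrupted))  # non-zero entries: repeat decoding
```

A single flipped bit violates exactly the rows in which that bit takes part, which here are the first and fifth parity checks.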
27
Row and Column Processor Architecture
[Figure: block diagrams of the column processor (Col. Proc.) and the row processor (Row Proc.).]
28
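[Figure: Split-Row decoder layout, with the row and column processors of the left half and the row and column processors of the right half.]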
30
Throughput = Clk × Code length / Imax
P = c·f·V²
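The throughput formula can be checked against the reported implementation numbers. The sketch below plugs in the 1536-bit code length, 15 iterations, and the 32 MHz and 53 MHz clock frequencies reported for the two decoders; the function name `throughput_bps` is illustrative.

```python
def throughput_bps(clk_hz, code_length_bits, max_iterations):
    """Throughput = Clk * Code length / Imax, in bits per second."""
    return clk_hz * code_length_bits / max_iterations


standard_gbps = throughput_bps(32e6, 1536, 15) / 1e9
split_row_gbps = throughput_bps(53e6, 1536, 15) / 1e9
print(f"Standard MinSum: {standard_gbps:.2f} Gbps")
print(f"Split-Row:       {split_row_gbps:.2f} Gbps")
```

The results, about 3.28 Gbps and 5.43 Gbps, are consistent with the 3.2 Gbps and 5.4 Gbps reported for the two chips.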
31
What is the critical path, and how do you make sure that the sign is computed correctly?
Answer: The critical path is the sign computation, which depends on the other half. The statistical timing analysis in place and route reports the slowest path delay, so it ensures that the circuit works correctly.
Why does the decoder chip become smaller when you split it in half?
Answer: First, the size and total number of column processors do not change. The main benefit comes from the row processors, which shrink by more than a factor of two. The reason is that a row processor contains several stages of comparators, and the number of comparators drops by more than half when the number of inputs is halved.
You mentioned that the design is power efficient, but you didn't report any power numbers.
Answer: For this paper we did not get power numbers, but power can be estimated from the fact that most of the energy goes into the wires (P = ½·c·V²·f), and we can say it scales down linearly, so it is about a 58% reduction.
Are there other works close to your design?
32
Which applications can tolerate this error performance loss?
Answer: This is a very broad question. It really depends on the power budget and on how low a BER you need to reach.
What is the difference between Viterbi and LDPC codes?
What is the difference between turbo and LDPC codes?
If you don't know the answer: I was not involved in that part of the project, but from what I know ….
Review the previous works.
If asked why the chip figure is not square?
If somebody asks: the way you proposed didn't decrease the number of wires, so how can you say that it decreases the interconnect complexity? You should notice that we are talking about long wires. Because when there is a large number of wires connecting one
33
Hard decision vs. soft decision:
- In hard decision decoding, each received symbol is thresholded to yield a single received bit as input to the decoding algorithm, and the messages passed between variable and check nodes are single bits only.
- In soft decision decoding, multiple bits are used to represent each received symbol and the messages passed between variable and check nodes.
How did you compute