On-line arithmetic for detection in digital communication receivers Sridhar Rajagopal and Joseph R. Cavallaro Rice University {sridhar,cavallar}@rice.edu This work is supported by Nokia, TI, TATP and NSF
CDMA communication system [Figure: base station receiving direct and reflected signals from User 1 and User 2, plus noise and interference.] Single-user detection: treat the interference from other users as noise. Multiuser detection: cancel the interference and jointly detect the data of all users.
Benefits of multiuser detection [Figure: Error rate vs. SNR. x-axis: SNR (in dB); y-axis: bit error rate. Curves: single-user (channel estimation + detection); multiuser estimation + single-user detection; multiuser (channel estimation + detection).]
Contributions Accelerate multiuser detection for 3G communication systems. Reduced computations: identified the minimum necessary. Increased throughput and reduced latency using on-line arithmetic. Increase in throughput: 2.63X to 3X. Decrease in latency: 1.5X to 1.79X.
Key concept Conventional detection: high-precision operations (8-32 bits) followed by a test for the sign. The actual decision depends only on the most significant digits (1-3 bits). Use most-significant-digit-first (MSDF) computation to find the sign and avoid computing the remaining digits.
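The key concept can be illustrated with a minimal sketch. It assumes a signed-digit number whose digits lie in {-(r-1), ..., r-1} for radix r and arrive most significant digit first; under that assumption the first non-zero digit fixes the sign, because the remaining digits cannot outweigh it. The function name sign_msdf is hypothetical and not taken from the slides.

```python
def sign_msdf(digits):
    """Return (sign, digits_examined) for a signed-digit number, MSD first.

    Assumption: every digit lies in {-(r-1), ..., r-1} for radix r, so the
    first non-zero digit decides the sign of the whole value.
    """
    examined = 0
    for dgt in digits:
        examined += 1
        if dgt != 0:
            # Stop here: the less significant digits are never computed.
            return (1 if dgt > 0 else -1), examined
    return 0, examined  # every digit was zero, so the value is exactly zero


# Radix-4 example: digits 0, -1, 3, 2 encode -1/16 + 3/64 + 2/256 = -2/256.
sign, used = sign_msdf([0, -1, 3, 2])
print(sign, used)
```

In this example the decision is reached after two of the four digits, which is the saving the slide refers to.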
Outline: Single and multiuser detection; On-line implementation for BPSK modulation (single-user detector, multiuser detector); Performance comparisons; Extensions to higher modulation schemes; Conclusions
Multiuser interference Interference is due to past, current and future bits of other users. [Figure: timing diagram of the received signal (r(i-1), r(i), r(i+1), r(i+2)); because of the relative delays, bit i of the desired user overlaps with previous bits (i-1) and future bits (i+1) of other users.]
Multistage Parallel Interference Cancellation (PIC) Conventional code matched filter: r - received signal, A - channel estimates, y - soft decision, d - detected bits. Iterate for convergence (PIC): l - iteration index, i - time, with interference estimates.
Multiuser detection in a CDMA communication system Start with a single-user detector estimate. Subtract interference from other users in stages (Parallel Interference Cancellation). [Figure: received bits (r) -> code matched filter detector -> PIC (Stage 1) -> PIC (Stage 2) -> PIC (Stage 3) -> detected bits (d)]
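The staged procedure can be sketched as follows. This is a simplified illustration, not the detector from the slides: it assumes a synchronous, noise-free, two-user BPSK model with a known cross-correlation matrix R and amplitudes a, whereas the slides address interference from past and future bits as well. The names pic_detect, R and a are hypothetical.

```python
def sign(v):
    """Hard BPSK decision."""
    return 1 if v >= 0 else -1


def pic_detect(y, R, a, stages=3):
    """Matched-filter decision followed by `stages` PIC stages.

    y: matched-filter soft outputs, one per user
    R: cross-correlation matrix of the spreading codes (assumed known)
    a: received amplitudes, i.e. the channel estimates (assumed known)
    """
    K = len(y)
    d = [sign(v) for v in y]  # single-user (matched filter) estimate
    for _ in range(stages):
        # Estimate each user's interference from the previous stage's bits.
        interference = [
            sum(R[k][j] * a[j] * d[j] for j in range(K) if j != k)
            for k in range(K)
        ]
        # Subtract it for all users in parallel and decide again.
        d = [sign(y[k] - interference[k]) for k in range(K)]
    return d


# Near-far example: user 2 is three times stronger than user 1.
R = [[1.0, 0.5], [0.5, 1.0]]
a = [1.0, 3.0]
bits = [1, -1]
y = [sum(R[k][j] * a[j] * bits[j] for j in range(2)) for k in range(2)]
print(pic_detect(y, R, a, stages=0), pic_detect(y, R, a, stages=3))
```

With these assumed values the matched filter alone gets user 1 wrong, and the cancellation stages recover the transmitted bits.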
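Conventional detector [Figure: N multiplications (delay 2*log2(d)*tconv) feed an adder tree that runs from N/2 additions down to 1 addition (delay log2(d)*tconv per level), producing bits d(i-1), d(i), d(i+1).] Latency = Throughput = (log2(N)+2)*log2(d)*tconv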
Pipelined architecture for multiuser detection
Conventional multiuser detector Latency = (2*S-1)*tCPIC + 2*tCMF. Throughput = tCMF.
On-line arithmetic Uses a redundant number representation. Pipelined bit-serial arithmetic with most-significant-digit-first (MSDF) computation. Successive computations start as soon as their input digits are available (on-line delay δ = 1..4, typically). Algorithms are available for various operations (+, *, /, sqrt) and for fixed-point computations. [Figure: input digit streams x1 x2 x3 x4 x5 ... and y1 y2 y3 y4 y5 enter MSD first; output digits z1 z2 z3 z4 z5 follow after the on-line delay.]
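As an illustration of digit-serial MSDF computation, the sketch below adds two radix-4 signed-digit fractions with an on-line delay of one digit. It uses the textbook carry-free signed-digit addition rule and assumes digits in {-(r-1), ..., r-1} with radix r >= 3; it is not the adder design from the slides, and the names online_add and value are hypothetical.

```python
from fractions import Fraction


def online_add(xs, ys, radix=4):
    """Add two signed-digit fractions digit by digit, MSD first.

    xs, ys: digits in {-(radix-1), ..., radix-1}; digit i has weight
    radix**-(i+1). Yields the sum's digits MSD first, beginning with one
    extra leading digit of weight radix**0. Assumes radix >= 3.
    """
    pending = 0  # interim digit of the previous position, awaiting a transfer
    for x, y in zip(xs, ys):
        s = x + y
        if s >= radix - 1:
            transfer, interim = 1, s - radix
        elif s <= -(radix - 1):
            transfer, interim = -1, s + radix
        else:
            transfer, interim = 0, s
        # An output digit is final once the next input digit pair has been
        # seen: that is the one-digit on-line delay.
        yield pending + transfer
        pending = interim
    yield pending


def value(digits, radix=4, first_weight=-1):
    """Exact value of an MSD-first digit list; the first digit has weight
    radix**first_weight."""
    return sum(Fraction(d) * Fraction(radix) ** (first_weight - i)
               for i, d in enumerate(digits))


x = [1, -2, 3]   # 11/64
y = [2, 3, -1]   # 43/64
z = list(online_add(x, y))
print(z, value(z, first_weight=0) == value(x) + value(y))
```

Because the output is produced as a stream, a consumer that only needs the sign can stop pulling digits from the generator as soon as a non-zero digit appears.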
Adder implementation Notation: tconv = conventional adder time per bit; tOL = on-line delay time per digit; d = bit precision.
Multiplier implementation
On-line detector [Figure: N multiplications, N/2 additions, 1 addition.] Execution time depends on the time needed to compute the first non-zero most significant digit (MSD). Limitation: throughput therefore depends on the SNR.
Dependency of execution time for on-line addition on SNR [Figure: time taken for addition vs. signal amplitude (-1 to 1), comparing conventional addition with on-line addition.]
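The amplitude dependence can be mimicked in software with a rough sketch: count how many radix-4 digits must be produced before the first non-zero one appears. The digit generation below is a plain truncating conversion chosen only for illustration (a hardware on-line unit produces its digits differently), and the names to_digits and digits_until_sign are hypothetical.

```python
def to_digits(value, radix=4, n=8):
    """Signed radix-`radix` digits of a value in (-1, 1), MSD first, truncated
    to n digits. Illustrative conversion only."""
    s = -1 if value < 0 else 1
    mag = abs(value)
    digits = []
    for _ in range(n):
        mag *= radix
        dig = int(mag)          # next digit of the magnitude
        digits.append(s * dig)  # carry the sign on every digit
        mag -= dig
    return digits


def digits_until_sign(value, radix=4, n=8):
    """Number of digits examined before the sign is known (n if never)."""
    for count, dig in enumerate(to_digits(value, radix, n), start=1):
        if dig != 0:
            return count
    return n


for amplitude in (0.5, 0.1, -0.02):
    print(amplitude, digits_until_sign(amplitude))
```

Values close to zero need more digits before the sign is resolved, which is why the on-line detector slows down when the signal amplitude is small.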
On-line single user detector
On-line multiuser detector
Implementation considerations
Number of users: N = K = 32
Bit precision: d = 8
Number of detection stages: S = 3
Radix: r = 4
On-line delay: tOL = 2
Time for 1-bit addition: tCONV = 1
Average time for first non-zero MSD: tstop = 2
Performance comparisons
Detector     Metric      Conventional                       On-line                       Speedup
Single user  Latency     (log2(N)+2)*log2(d)*tconv = 21     ... tOL + tstop = 14          1.50
Single user  Throughput  (log2(N)+2)*log2(d)*tconv = 21     m*tOL = 8                     2.63
Multi-user   Latency     (2*S-1)*tCPIC + 2*tCMF = 168       tMF + S*tPIC + m*S*tOL = 94   1.79
Multi-user   Throughput  tCMF = 24                          8                             3.00
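The speedups can be checked with a few lines of arithmetic. The sketch below computes the conventional single-user figure from its formula and takes the remaining times (14, 8, 94, 168, 24) as quoted on this slide. Reading m as the number of radix-4 digits needed for d = 8 bits is an assumption; it is consistent with m*tOL = 8 but is not stated in the slides.

```python
from math import log2

# Parameters from the implementation considerations slide.
N, d, S, radix = 32, 8, 3, 4
t_conv, t_ol, t_stop = 1, 2, 2

# Conventional single-user detector: latency = throughput.
conv_single = (log2(N) + 2) * log2(d) * t_conv

# Assumption: m is the number of radix-4 digits covering d bits.
m = d / log2(radix)
online_throughput = m * t_ol

# Remaining times as quoted on the slide.
online_single_latency = 14
conv_multi_latency, online_multi_latency = 168, 94
conv_multi_throughput = 24

speedups = {
    "single-user latency": conv_single / online_single_latency,
    "single-user throughput": conv_single / online_throughput,
    "multiuser latency": conv_multi_latency / online_multi_latency,
    "multiuser throughput": conv_multi_throughput / online_throughput,
}
for name, s in speedups.items():
    print(f"{name}: {s:.3f}")
```

The single-user throughput ratio is exactly 2.625, which the slide quotes as 2.63.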
Extensions for higher modulation schemes QPSK/QAM: 2 bits/symbol, given by the signs of the real and imaginary parts. Proposed for 3G communication systems. Real and imaginary parts are processed independently.
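For QPSK the same sign test can be applied to the two components separately. The sketch below assumes that the real and imaginary parts of the soft decision each arrive as an MSD-first signed-digit stream with digits in {-(r-1), ..., r-1}; the function names are hypothetical.

```python
def first_nonzero_sign(digits):
    """Sign given by the first non-zero digit of an MSD-first stream."""
    for dgt in digits:
        if dgt != 0:
            return 1 if dgt > 0 else -1
    return 0  # all digits zero


def detect_qpsk(real_digits, imag_digits):
    """Detect the two bits of a QPSK symbol from the signs of its parts.

    The two streams are independent, so each stops at its own first
    non-zero digit.
    """
    return first_nonzero_sign(real_digits), first_nonzero_sign(imag_digits)


print(detect_qpsk([0, 2, -1], [-1, 3, 0]))
```

Each component needs only as many digits as its own magnitude requires, so the two decisions may complete at different times.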
M-QAM and M-PSK modulation M-QAM: get additional digits after the sign, or choose a higher-radix representation. More computation is needed when the value lies closer to a decision boundary. M-PSK: more complicated, because the decision regions are angle-based. [Figure: M-QAM and M-PSK constellations.]
Conclusions On-line arithmetic has significant potential for detection in communication systems. Extraneous computations are avoided. Higher throughput (2.63X to 3X) and lower latency (1.5X to 1.79X) achieved for detection. Suitable for a wide range of detection algorithms and modulation schemes.
Future Work Other blocks in a communication receiver using on-line arithmetic. Extensions to higher modulation schemes: on-line implementation for M-PSK. On-line arithmetic ASIC for a communication receiver.