Published by Byron Berry. Modified over 9 years ago.
1
RICE UNIVERSITY
Advanced Wireless Receivers: Algorithmic and Architectural Optimizations
Suman Das
Rice University, Department of Electrical and Computer Engineering & Center for Multimedia Communication
2
Introduction
Wireless is one of the fastest growing industries
[Figure: millions of cell-phone users (axis 0 to 700) by year, 1993 to 2001. Source: Ericsson]
"By 2002, a lot more cellular phones are going to have internet access than PCs." Larry Ellison, CEO, Oracle.
3
Ubiquitous wireless connectivity
  Wireless cellular
  Wireless LAN
  Bluetooth / home networks
  Ad-hoc network
4
Why advanced receiver algorithms?
  The number of wireless subscribers is growing
  Multimedia data is replacing voice traffic
  Higher and varied data rates (144 Kbps to 2 Mbps)
  Stricter quality of service (QoS)
  Wireless bandwidth remains a critical resource
  Current-generation receivers are suboptimal
5
Performance of advanced receivers
[Figure: bit error rate (log scale, 10^-4 to 10^0) versus SNR (4 to 16 dB) for the current receiver, the advanced receiver, and the theoretical limit]
Huge performance improvement
6
Computational requirements of advanced receivers
A 15-user system transmitting at 0.5 Mbps needs
  ~20 billion additions per second
  ~15 billion multiplications per second
Requires 32-bit floating-point precision
50 floating-point DSPs running at 200 MHz are needed to sustain the computation!
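As a rough sanity check on these figures, the sketch below computes the per-DSP load they imply. It assumes the work divides evenly across the 50 DSPs, which the slide does not state; the variable names are illustrative.

```python
# Figures quoted on the slide.
additions_per_s = 20e9
multiplications_per_s = 15e9
num_dsps = 50
clock_hz = 200e6

# Assumption (not from the slide): the load splits evenly across the DSPs.
ops_per_dsp = (additions_per_s + multiplications_per_s) / num_dsps
ops_per_cycle = ops_per_dsp / clock_hz

print(f"{ops_per_dsp / 1e6:.0f} M operations/s per DSP")
print(f"{ops_per_cycle:.1f} operations per cycle per DSP")
```

Under that assumption each DSP would have to sustain several arithmetic operations on every clock cycle, which is why the talk looks for algorithmic simplification rather than more hardware.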
7
My research
Receiver design
  High performance
  Low complexity
Approach
  Algorithmic simplification
  Efficient architectural mapping
8
Wireless channel model
[Figure: User 1 and User 2 transmit to a base station over a direct path and reflected paths, with added noise]
Channel effects
  Background noise
  Fading
  Multiple paths
  Multiple users: multiple access interference (MAI)
9
Code Division Multiple Access (CDMA)
Wideband CDMA: technology of choice
Users distinguished by spreading sequence
[Figure: spreading waveform S(t) versus time; one bit spans 7 chips (spreading gain = 7), with chip sequence -1 -1 1 -1 1 1 1]
Received signal [equation not recovered from the slide], where
  K: number of users
  P: number of paths
  w: attenuation
  delay (symbol not recovered)
  b: data bits
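A minimal sketch of spreading and despreading for a single user and a single path, using the 7-chip sequence shown on the slide. The function names and the noiseless, perfectly aligned channel are assumptions made for illustration; the multiuser, multipath case in the talk is more involved.

```python
# Spreading sequence from the slide (spreading gain = 7).
CODE = [-1, -1, 1, -1, 1, 1, 1]

def spread(bits, code=CODE):
    """Replace each +/-1 data bit by the bit times the chip sequence."""
    return [b * c for b in bits for c in code]

def despread(chips, code=CODE):
    """Correlate each bit-long block of chips with the code and take the sign."""
    n = len(code)
    decisions = []
    for start in range(0, len(chips), n):
        block = chips[start:start + n]
        correlation = sum(x * c for x, c in zip(block, code))
        decisions.append(1 if correlation >= 0 else -1)
    return decisions

bits = [1, -1, 1]
chips = spread(bits)
chips[3] = -chips[3]          # one corrupted chip is tolerated by the correlation
print(despread(chips))
```

Each decision sums 7 chips, so a single chip error only lowers the correlation magnitude from 7 to 5 without changing its sign.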
10
CDMA system
[Block diagram: TRANSMITTER (data, encoding, spreading, modulation) plus other users; RECEIVER (demodulation, channel estimation, detection, decoding) producing the detected bits of all K users]
Proposed advanced/multiuser receiver modules
  Designed in isolation
  Suboptimal design
11
Integrated receiver design
  Joint channel estimation and detection
  Joint detection and decoding
[Block diagram: RECEIVER (demodulation, channel estimation, detection, decoding) producing the detected bits of all K users]
12
Why separate channel estimation and detection?
[Figure: the received signal $r_i$ feeds a chip-matched filter for channel estimation and a code-matched filter for detection; a timeline shows bits $b_i$ and $b_{i+1}$ offset by the delay relative to the processing window for channel estimation]
They operate on different statistics
13
Towards an integrated solution
  Reuse computation from the channel estimation step
  Use the same discretized filter output
  Avoid alignment to the bit interval of each user
  Reduce computation
  Save hardware
14
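Components of the observation vector
[Figure: spreading code -1 -1 1 -1 1 1 1 carrying bit i = +1 and bit i+1 = -1 (1 1 -1 1 -1 -1 -1), shifted by the path delay and scaled by the attenuation; the observation window holds the partial codes 1 1 1 0 0 0 0 and 0 0 0 -1 -1 1 -1, weighted by $+w_{k,p}$ and $-w_{k,p}$]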
15
Matrix representation
[Figure: spreading code -1 -1 1 -1 1 1 1 with bit i = +1 and bit i+1 = +1 (preamble)]
Contribution of user k: $U_k Z_k b_k(i)$ + other users
All users: $r = U Z b$
16
Efficient statistics
Parametric approach
  Build channel model (number of paths)
  Estimate delay, attenuation
  Produce the code-matched filter output
Our approach
  Estimate effective spreading code $(UZ)$
  Code-matched filter: $y = (UZ)^T r$
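The code-matched filter is a single matrix-vector product once the effective spreading code is known. The sketch below illustrates that step with a toy 4-chip, 2-user matrix; the matrix values, the name `matched_filter`, and the noiseless signal are assumptions for illustration, not data from the talk.

```python
def matched_filter(A, r):
    """Return y = A^T r, where A is given as a list of rows (chips x users)."""
    n_users = len(A[0])
    return [sum(A[i][k] * r[i] for i in range(len(A))) for k in range(n_users)]

# Hypothetical estimate of the effective spreading code (UZ): one column per user.
UZ = [
    [1,  1],
    [1, -1],
    [1,  1],
    [1, -1],
]
b = [1, -1]                                            # transmitted bits
r = [sum(row[k] * b[k] for k in range(2)) for row in UZ]  # r = (UZ) b, no noise

y = matched_filter(UZ, r)
print(y, [1 if v >= 0 else -1 for v in y])
```

Because the whole channel is folded into the columns of `UZ`, no per-path delay or attenuation has to be extracted before filtering.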
17
Simulation parameters
System parameters
  15 users
  3 paths
  Spreading gain: 31
Hardware platform
  TI C62 and C67 EVM boards
  64 KB each of internal program and data memory
  256 KB SBSRAM, 8 MB SDRAM (external)
  Code Composer 1.0 to profile code
18
Effectiveness of integrated design
[Figure: bit error rate (log scale, 10^-4 to 10^0) versus SNR (-4 to 16 dB); legend: Multiuser, Parametric approach, UZ approach, Actual Parameters, Single User]
2 dB gain in performance
19
Computational savings
  Avoid extraction of actual channel parameters
  Avoid realignment of data for code-matched filtering
  Reduce intermediate storage requirement
  Avoid divisions (28 cycles) and square roots (38 cycles) on the DSP
20
Fixed-point behavior
Fixed-point advantages
  Speed
  Power
  Cost
Fixed-point analysis
  12 bits of precision required instead of 32 bits!
  Pack two 16-bit operations into one 32-bit register
  More packing with saturation arithmetic
  User power control!
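To illustrate the packing idea, the sketch below emulates in plain Python what a dual 16-bit saturating add on one 32-bit word would compute. It is a behavioral model only: the helper names are hypothetical and it does not correspond to any specific TI instruction.

```python
INT16_MIN, INT16_MAX = -32768, 32767

def sat16(x):
    """Clamp an integer to the signed 16-bit range instead of wrapping."""
    return max(INT16_MIN, min(INT16_MAX, x))

def pack2(hi, lo):
    """Pack two signed 16-bit values into one 32-bit word."""
    return ((hi & 0xFFFF) << 16) | (lo & 0xFFFF)

def unpack2(word):
    """Recover the two signed 16-bit halves of a 32-bit word."""
    def to_signed(v):
        return v - 0x10000 if v & 0x8000 else v
    return to_signed((word >> 16) & 0xFFFF), to_signed(word & 0xFFFF)

def sat_add2(a, b):
    """Add the halves of two packed words independently, with saturation."""
    a_hi, a_lo = unpack2(a)
    b_hi, b_lo = unpack2(b)
    return pack2(sat16(a_hi + b_hi), sat16(a_lo + b_lo))

total = sat_add2(pack2(30000, -30000), pack2(10000, -10000))
print(unpack2(total))
```

Saturation keeps an overflowing half pinned at the range limit, so an overflow in one lane neither wraps around nor spills into the neighboring lane.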
21
Time requirement
[Bar chart: normalized time, axis 0 to 100]
  Original: 100 (baseline)
  Unified synch + detect: 68.5
  16-bit fixed point: 41.8
2.39x speedup
22
Integrated receiver design
  Joint channel estimation and detection
    Effective spreading code approach
    Optimized detector design
  Joint detection and decoding
[Block diagram: RECEIVER (demodulation, channel estimation, detection, decoding) producing the detected bits of all K users]
23
Linear multiuser detector
  N: block length, K: number of users
  Received signal: $r = (UZ)\,b + n$
  Channel estimation: $(UZ)$
  Matched filter output: $y = (UZ)^T r$
  Linear detector: solve $R\,b + n = y$, with $R = (UZ)^T (UZ)$
  Size of the linear system: $NK$
  Direct inverse takes $O((NK)^3)$ operations
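The sketch below works through the linear detector on a toy noiseless example with one strong and one weak user. The matrix values are hypothetical, and plain Gaussian elimination stands in for the direct inverse; it is not the optimized algorithm described later in the talk.

```python
def solve(R, y):
    """Solve R x = y by Gaussian elimination with partial pivoting."""
    n = len(R)
    M = [list(map(float, R[i])) + [float(y[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(M[i][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for i in range(col + 1, n):
            factor = M[i][col] / M[col][col]
            for j in range(col, n + 1):
                M[i][j] -= factor * M[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

# Hypothetical effective spreading codes (columns): user 2 is 5x weaker.
UZ = [[1, 0.2], [1, 0.2], [1, 0.2], [1, -0.2]]
b = [1, -1]
r = [row[0] * b[0] + row[1] * b[1] for row in UZ]      # noiseless r = (UZ) b

cols = list(zip(*UZ))
y = [sum(c * v for c, v in zip(col, r)) for col in cols]             # y = (UZ)^T r
R = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]

sign = lambda v: 1 if v >= 0 else -1
print("matched filter:", [sign(v) for v in y])
print("linear detector:", [sign(v) for v in solve(R, y)])
```

In this example the matched filter alone misreads the weak user's bit because of interference from the strong user, while solving the linear system recovers both bits.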
24
Outline of the Kronecker algorithm
  Kronecker representation
    Isolates structure and the matrix blocks
    Fourier transform converts it to a block-diagonal system
    Computationally optimal
  Correlation matrix is block-Toeplitz
    Approximate it as a block-circulant system
    Solve N independent order-K systems iteratively
25
Speedup in detector
Complexity: $O(N^2 K^3)$ vs $O(NK^2 + KN \log N)$
[Bar chart: achievable data rate (Kbps), axis 0 to 90]
  Decorrelator: 10.4 Kbps
  Kronecker: 83.1 Kbps
26
Pipelining and parallelization
  Mostly matrix-based operations
  Detector: iterative algorithm
    Pipeline various iterations
  Parallelize operations
    Add more functional units
    Distribute data across functional units
    Distribute computations
27
Projected computation time
[Bar chart: achievable data rate (Kbps), axis 0 to 600, for the base multiuser algorithm; categories: DSP only, DSP + coprocessor support, hardware pipelining, pipelining + parallelization (30 adders and multipliers); labeled values: 20.75 Kbps, 154.3 Kbps, 564.5 Kbps]
28
Integrated receiver design
  Joint channel estimation and detection
    Effective spreading code approach
    Optimized detector design
  Joint detection and decoding
[Block diagram: RECEIVER (demodulation, channel estimation, detection, decoding) producing the detected bits of all K users]
29
Maximum a-posteriori (MAP) decoding
[Block diagram: TRANSMITTER takes data bits b, encoding produces coded bits d, followed by spreading and modulation; other users are added]
  Received signal: $r = UZd + n$
  Optimum decoding rule
    Constrained optimization problem
    Decode all users simultaneously
    Exponential complexity in the number of users
30
Single-user detection and decoding
  Suboptimum alternatives
  Isolate detection and decoding
[Block diagram: $r$ feeds matched filters MF 1 ... MF K; their outputs $y_1 \dots y_K$ feed Decoder 1 ... Decoder K, which produce $\hat{b}_1 \dots \hat{b}_K$]
31
Decoding matched filter outputs
[Figure: BER (log scale, 10^-4 to 10^0) versus SNR (1 to 8 dB); curves: MF + Viterbi, Optimal]
Huge performance loss!
32
Iterative detection and decoding
  User of concern c, interfering users I
  $r = (UZ)_c d_c + (UZ)_I d_I + z$
  Estimate $d_I$
  Eliminate interference: $\hat{y}_c = (UZ)_c^T \left( r - (UZ)_I d_I \right)$
  Estimate $d_c$ for the next step
  Complexity linear in the number of users
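A minimal sketch of the cancellation step for two users. Hard sign decisions stand in for the decoder output that the talk feeds back, and the matrix values and function names are hypothetical; the point is only to show the interference being subtracted before the matched filter is reapplied.

```python
def sign(x):
    return 1 if x >= 0 else -1

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cancel_and_detect(A, r, iterations=2):
    """Return (initial matched-filter decisions, decisions after cancellation)."""
    n_users = len(A[0])
    cols = [[row[k] for row in A] for k in range(n_users)]
    d = [sign(dot(cols[k], r)) for k in range(n_users)]   # stage 0: matched filter
    first = d[:]
    for _ in range(iterations):
        updated = []
        for c in range(n_users):
            # Subtract the current estimate of every interfering user from r.
            residual = r[:]
            for k in range(n_users):
                if k != c:
                    residual = [x - cols[k][i] * d[k] for i, x in enumerate(residual)]
            updated.append(sign(dot(cols[c], residual)))
        d = updated
    return first, d

# Hypothetical effective spreading codes: user 2 is 5x weaker than user 1.
UZ = [[1, 0.2], [1, 0.2], [1, 0.2], [1, -0.2]]
d_true = [1, -1]
r = [row[0] * d_true[0] + row[1] * d_true[1] for row in UZ]

first, final = cancel_and_detect(UZ, r)
print("matched filter only:", first)
print("after cancellation:", final)
```

In this toy case the weak user is misdetected at first and corrected once the strong user's contribution is removed from the received signal.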
33
Reduction in decoding complexity
Convolutional code
  Coded bits depend on past data bits
  Performance improves with memory length
Viterbi algorithm for decoding
  Complexity exponential in memory length
Our suboptimal approach
  Maximal weight basis decoding
  Complexity quadratic in memory length
34
Joint detection and decoding performance
[Figure: BER (log scale, 10^-5 to 10^0) versus SNR (1 to 8 dB); curves: MF + Viterbi, Iter1 + Subopt, Iter1 + Viterbi, Iter3 + Subopt, Optimal; rate = 1/2, second code parameter (symbol not recovered) = 7]
35
Joint detection and decoding
  Huge performance gain
  Suboptimal approximation
    Insignificant performance loss
    Significant computational gain
  Architecture for suboptimal decoding?
    Viterbi algorithm: butterfly architecture
    Have a sliding-window implementation
36
Summary of contributions
  Integrated channel estimation and detection model [WCNC]
  Optimized detection algorithm [PIMRC, Tr. Com]
  Fixed-point implementation [ICASSP, SPIE]
  Parallel architecture [Asilomar]
  Joint detection and decoding [Globecom, Tr. Com]
  Suboptimal decoding algorithm [Asilomar, Tr. Inf. Th.]
37
Future research
  Wireless cellular
  Wireless LAN
  Bluetooth / home networks
  Ad-hoc network
38
Future research
Universal wireless receiver
  Reconfigurable solution
  Power efficient
  Automate design?
Network-level interaction
  Resource allocation
  Quality-of-service guarantee
Application-level interaction
39
Further details
  http://www.ece.rice.edu/~suman
  http://www.ece.rice.edu/CMC
40
Convolutional codes
[Figure: encoder mapping data bits b to coded bits $d_{odd}$ and $d_{even}$]
Rate: 1/2, memory (symbol not recovered) = 2
  $d_2 = d_1$
  $d_4 = d_1 + d_3$
  $d_6 = d_1 + d_3 + d_5$
  $d_8 = d_3 + d_5 + d_7$
  $d_{10} = d_5 + d_7 + d_9$
$d_{odd}$: systematic bits
$d_{even}$: parity bits
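The parity equations above can be sketched as a small encoder. The sketch assumes bits take values in {0, 1} and that "+" means modulo-2 addition; the function name and the interleaved output order (systematic bit, then parity bit) are illustrative choices that match the odd/even indexing on the slide.

```python
def encode(bits, memory=2):
    """Rate-1/2 systematic encoder: each parity bit is the XOR of the current
    data bit and up to `memory` previous data bits."""
    coded = []
    for i, bit in enumerate(bits):
        window = bits[max(0, i - memory): i + 1]
        parity = sum(window) % 2
        coded.extend([bit, parity])   # d_odd = systematic, d_even = parity
    return coded

data = [1, 0, 1, 1, 0]
d = encode(data)
print(d)

# Check the slide's equations (1-based indices d_1 ... d_10).
d1 = [None] + d
assert d1[2] == d1[1]
assert d1[4] == (d1[1] + d1[3]) % 2
assert d1[6] == (d1[1] + d1[3] + d1[5]) % 2
assert d1[8] == (d1[3] + d1[5] + d1[7]) % 2
assert d1[10] == (d1[5] + d1[7] + d1[9]) % 2
```

Five data bits produce ten coded bits, consistent with the rate of 1/2 stated on the slide.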
41
Suboptimal single-user channel decoder
  $y = (y_1, \dots, y_N)$, $d = (d_1, \dots, d_N)$
  Viterbi algorithm: complexity grows exponentially with memory length
  If no codeword constraint: $d = \mathrm{sgn}(y)$
  Estimated $d$ may not be a codeword!
42
Maximum weight basis decoding
  More variables than equations
    $d_2 = d_1$
    $d_4 = d_1 + d_3$
    $d_6 = d_1 + d_3 + d_5$
    $d_8 = d_3 + d_5 + d_7$
    $d_{10} = d_5 + d_7 + d_9$
  NR independent variables (N: block length, R: rate)
  Choice depends on $y_i$
    $y = 7.5$: $d = 1$
    $y = -4.5$: $d = -1$
    $y = 0.5$: $d = ?$
  Want to choose the maximally independent subset with the largest total weight
43
Selection of maximally independent subset
  Set $I = \emptyset$
  Given $y$, sort the weights $|y_i|$, $i \in \{1, \dots, N\}$
  While $|I| < NR$
    Choose the location $e$ from $\{1, \dots, N\}$ with the largest weight such that $I \cup \{e\}$ is still an independent subset of $\{1, \dots, N\}$
    Set $I = I \cup \{e\}$
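The greedy selection can be sketched as follows. The sketch assumes that a set of locations is "independent" when the corresponding coded bits are linearly independent combinations of the data bits over GF(2), and it uses the rate-1/2, memory-2 code from the backup slide on convolutional codes; the bitmask representation, function names, and example values of y are illustrative. Locations are 0-based in the code.

```python
def position_masks(n_data, memory=2):
    """Bitmask of data bits that each coded position depends on
    (systematic bit, then parity bit, for every data bit)."""
    masks = []
    for i in range(n_data):
        parity = 0
        for j in range(max(0, i - memory), i + 1):
            parity |= 1 << j
        masks.extend([1 << i, parity])
    return masks

def residual_mask(mask, basis):
    """Reduce mask against an XOR basis keyed by leading bit; 0 means dependent."""
    while mask:
        lead = mask.bit_length() - 1
        if lead not in basis:
            return mask
        mask ^= basis[lead]
    return 0

def select_independent(y, masks, rank):
    """Greedily pick `rank` independent locations in order of decreasing |y_i|."""
    order = sorted(range(len(y)), key=lambda i: -abs(y[i]))
    basis, chosen = {}, []
    for i in order:
        rem = residual_mask(masks[i], basis)
        if rem:                               # still independent of the chosen set
            basis[rem.bit_length() - 1] = rem
            chosen.append(i)
            if len(chosen) == rank:
                break
    return chosen

y = [7.5, 6.0, 0.5, -4.5, 3.0, -2.0, 1.0, 5.0, -0.2, 2.5]   # hypothetical soft outputs
masks = position_masks(n_data=5)
print(select_independent(y, masks, rank=5))
```

In this example the second most reliable location is skipped because it carries the same information as the first ($d_2 = d_1$), so the next most reliable independent location is taken instead.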
44
Suboptimal decoding algorithm
  Choose M maximum independent subsets
  For each independent subset
    Compute the codeword $d_I$ obtained if $d_e = \mathrm{sgn}(y_e)$ on the subset
    Compute the likelihood $p(y \mid d_I)$
  Choose the codeword with the largest likelihood
  Decoding complexity reduced from exponential to quadratic in the memory length
45
Performance improvement
[Figure: BER (log scale, 10^-4 to 10^0) versus SNR (1 to 8 dB); curves: MF + MAP, 2-stage + MAP, Single User]
Performance approaches the single-user bound