1 RICE UNIVERSITY Advanced Wireless Receivers: Algorithmic and Architectural Optimizations
Suman Das
Rice University, Department of Electrical and Computer Engineering & Center for Multimedia Communication

2 RICE UNIVERSITY Introduction
Wireless is one of the fastest growing industries.
[Figure: millions of cell-phone users by year, 1993 to 2001 (vertical axis 0 to 700). Source: Ericsson]
“By 2002, a lot more cellular phones are going to have internet access than PCs.” Larry Ellison, CEO, Oracle.

3 RICE UNIVERSITY Ubiquitous wireless connectivity
[Diagram labels: Wireless Cellular, Wireless LAN, Bluetooth/Home Networks, Ad-hoc Network]

4 RICE UNIVERSITY Why advanced receiver algorithms?
- The number of wireless subscribers is growing
- Multimedia data is replacing voice traffic
- Higher and varied data rates (144 Kbps to 2 Mbps)
- Stricter quality of service (QoS)
- Wireless bandwidth remains a critical resource
Current-generation receivers are suboptimal.

5 RICE UNIVERSITY Performance of advanced receivers
[Figure: bit error rate (log scale, 10^-4 to 10^0) versus SNR (dB, 4 to 16). Curves: current receiver, advanced receiver, theoretical limit]
Huge performance improvement

6 RICE UNIVERSITY Computational requirements of advanced receivers
- A 15-user system transmitting at 0.5 Mbps needs
  - ~20 billion additions per second
  - ~15 billion multiplications per second
- Requires 32-bit floating-point precision
Sustaining this computation takes 50 floating-point DSPs running at 200 MHz!

7 RICE UNIVERSITY My research
- Receiver design
  - High performance
  - Low complexity
- Approach
  - Algorithmic simplification
  - Efficient architectural mapping

8 RICE UNIVERSITY Wireless channel model
[Diagram labels: User 1, User 2, Base Station, Direct Path, Reflected Paths, Noise]
- Channel effects
  - Background noise
  - Fading
  - Multiple paths
- Multiple users
  - Multiple access interference (MAI)

9 RICE UNIVERSITY Code Division Multiple Access (CDMA)
- Wideband CDMA: technology of choice
- Users distinguished by spreading sequence
[Figure: spreading waveform S(t) versus time, one bit divided into chips; spreading gain = 7; chip sequence -1 -1 1 -1 1 1 1]
Received signal (equation not captured in the transcript), with K: number of users, P: number of paths, w: attenuation, τ: delay, b: data bits
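The spreading idea on this slide can be sketched in a few lines. This is an illustration only, assuming a single user, a single path, and no noise; it uses the 7-chip sequence shown on the slide, and the function names `spread` and `despread` are hypothetical.

```python
# Chip sequence shown on the slide (spreading gain = 7 chips per bit).
CODE = [-1, -1, 1, -1, 1, 1, 1]

def spread(bits, code=CODE):
    """Replace every +1/-1 data bit by that bit times the chip sequence."""
    return [b * c for b in bits for c in code]

def despread(chips, code=CODE):
    """Correlate each block of len(code) chips with the code; keep the sign."""
    n = len(code)
    out = []
    for i in range(0, len(chips), n):
        corr = sum(x * c for x, c in zip(chips[i:i + n], code))
        out.append(1 if corr >= 0 else -1)
    return out

bits = [1, -1, 1]
chips = spread(bits)          # 3 bits become 21 chips
recovered = despread(chips)   # correlation with the code recovers the bits
```

With several users, each user would spread with a different sequence, and the correlation step is what separates them at the receiver.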

10 RICE UNIVERSITY CDMA system
[Block diagram labels: TRANSMITTER (ENCODING, SPREADING, MODULATION), input: data; OTHER USERS; RECEIVER (DECODING, DETECTION, DEMODULATION, CHANNEL ESTIMATION), output: detected bits of all K users]
- Proposed advanced/multiuser receiver modules
  - Designed in isolation
  - Suboptimal design

11 RICE UNIVERSITY Integrated receiver design
- Joint channel estimation and detection
- Joint detection and decoding
[Block diagram labels: RECEIVER (DECODING, DETECTION, DEMODULATION, CHANNEL ESTIMATION), output: detected bits of all K users]

12 RICE UNIVERSITY Why separate channel estimation and detection?
[Block diagram labels: Received signal, Chip-matched filter, Channel Estimation, Code-matched filter, Detection]
[Timing diagram labels: r_i, b_i, b_{i+1}, delay, time, Processing Window for Chan. Est.]
They operate on different statistics.

13 RICE UNIVERSITY Towards an integrated solution
- Reuse computation from channel estimation step
- Use same discretized filter output
- Avoid alignment to bit interval of each user
- Reduce computation
- Save hardware

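14 RICE UNIVERSITY Components of the observation vector
[Figure: two consecutive bits of one user, bit i = +1 (chips -1 -1 1 -1 1 1 1) and bit i+1 = -1 (chips 1 1 -1 1 -1 -1 -1), received with a delay and an attenuation. The observation window is the sum of (1 1 1 0 0 0 0) weighted by +w_{k,p} and (0 0 0 -1 -1 1 -1) weighted by -w_{k,p}]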
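The numbers in this figure can be reproduced with a short sketch. It assumes a single path with a delay of 3 chips (read off the three leading chips in the figure) and an arbitrary attenuation of 0.5; the helper name `observation_window` is hypothetical.

```python
# Chip sequence from the slide (spreading gain 7).
CODE = [-1, -1, 1, -1, 1, 1, 1]

def observation_window(bit_i, bit_next, delay, w, code=CODE):
    """Contribution of one path to one observation window.

    The window holds the last `delay` chips of bit i followed by the first
    len(code) - delay chips of bit i+1, both scaled by the attenuation w.
    """
    n = len(code)
    tail = code[n - delay:] + [0] * (n - delay)   # end of bit i, zero padded
    head = [0] * delay + code[:n - delay]         # start of bit i+1, zero padded
    return [w * (bit_i * t + bit_next * h) for t, h in zip(tail, head)]

# Same case as the figure: bit i = +1, bit i+1 = -1, delay of 3 chips.
r = observation_window(bit_i=+1, bit_next=-1, delay=3, w=0.5)
```

The two padded vectors `tail` and `head` are the columns that the next slide collects into the matrix U, with the attenuations collected in Z.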

15 RICE UNIVERSITY Matrix representation
[Figure: chips -1 -1 1 -1 1 1 1, with bit i = +1 and bit i+1 = +1; label: preamble]
r = U_k Z_k b_k(i) + other users
r = U Z b

16 RICE UNIVERSITY Efficient statistics
- Parametric approach
  - Build channel model (number of paths)
  - Estimate delay, attenuation
  - Produce the code-matched filter output
- Our approach
  - Estimate effective spreading code (UZ)
  - Code-matched filter: y = (UZ)^T r
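One way to picture the second approach is to fit the effective spreading code directly from a known preamble and then use it as the matched filter, without ever extracting delays or attenuations. The sketch below is an assumption-laden illustration, not the estimator from the talk: it uses one user, a noiseless channel, a chip-by-chip least-squares fit, and hypothetical names throughout.

```python
def window(b_i, b_next, tail, head):
    """Noiseless observation: tail of bit i plus head of bit i+1."""
    return [b_i * t + b_next * h for t, h in zip(tail, head)]

def estimate_effective_code(windows, preamble):
    """Least-squares fit of r = tail*b_i + head*b_{i+1}, one chip at a time."""
    pairs = list(zip(preamble[:-1], preamble[1:]))
    s11 = sum(x * x for x, _ in pairs)
    s12 = sum(x * y for x, y in pairs)
    s22 = sum(y * y for _, y in pairs)
    det = s11 * s22 - s12 * s12   # non-zero for a sufficiently varied preamble
    tail, head = [], []
    for j in range(len(windows[0])):
        t1 = sum(x * r[j] for (x, _), r in zip(pairs, windows))
        t2 = sum(y * r[j] for (_, y), r in zip(pairs, windows))
        tail.append((s22 * t1 - s12 * t2) / det)
        head.append((s11 * t2 - s12 * t1) / det)
    return tail, head

def matched_filter(tail, head, r):
    """y = (UZ)^T r for the two effective-code columns of this user."""
    return (sum(t * x for t, x in zip(tail, r)),
            sum(h * x for h, x in zip(head, r)))

# Assumed "true" effective code: delay of 3 chips, attenuation 0.5.
true_tail = [0.5, 0.5, 0.5, 0, 0, 0, 0]
true_head = [0, 0, 0, -0.5, -0.5, 0.5, -0.5]
preamble = [1, 1, -1, 1, -1, -1, 1, 1]   # known training bits (assumed)
windows = [window(a, b, true_tail, true_head)
           for a, b in zip(preamble[:-1], preamble[1:])]

est_tail, est_head = estimate_effective_code(windows, preamble)
y = matched_filter(est_tail, est_head, window(+1, -1, true_tail, true_head))
```

The point of the sketch is that `est_tail` and `est_head` already contain the delay and attenuation implicitly, so the filter output `y` is obtained without a separate parameter-extraction step.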

17 RICE UNIVERSITY Simulation parameters
- System parameters
  - 15 users
  - 3 paths
  - Spreading gain: 31
- Hardware platform
  - TI C62 and C67 EVM boards
  - 64 KB each of internal program and data memory
  - 256 KB SBSRAM, 8 MB SDRAM (external)
  - Code Composer 1.0 to profile code

18 RICE UNIVERSITY Effectiveness of integrated design
[Figure: bit error rate (log scale, 10^-4 to 10^0) versus SNR (dB, -4 to 16). Legend: Multiuser, Parametric approach, UZ approach, Actual Parameters, Single User]
2 dB gain in performance

19 RICE UNIVERSITY Computational savings
- Avoid extraction of actual channel parameters
- Avoid realignment of data for code-matched filtering
- Reduce intermediate storage requirement
- Avoid divisions (28 cycles) and square roots (38 cycles) on the DSP

20 RICE UNIVERSITY Fixed point behavior
- Fixed-point advantages
  - Speed, power, cost
- Fixed-point analysis
  - 12 bits of precision required instead of 32 bits!
  - Pack two 16-bit operations in 32-bit registers
  - More packing with saturation arithmetic
User power control!
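The packing and saturation ideas can be modelled in plain Python. This is a behavioural sketch only: a real DSP does the packed add in a single instruction, and the 12-bit format used in `quantize` (1 sign bit plus 11 fractional bits) is an assumption, as are all function names.

```python
INT16_MIN, INT16_MAX = -32768, 32767

def saturate16(x):
    """Clamp to the signed 16-bit range instead of wrapping around."""
    return max(INT16_MIN, min(INT16_MAX, x))

def pack16x2(hi, lo):
    """Pack two signed 16-bit values into one 32-bit word."""
    return ((hi & 0xFFFF) << 16) | (lo & 0xFFFF)

def unpack16x2(word):
    """Recover the two signed 16-bit halves of a 32-bit word."""
    def signed(v):
        return v - 0x10000 if v & 0x8000 else v
    return signed((word >> 16) & 0xFFFF), signed(word & 0xFFFF)

def packed_saturating_add(a, b):
    """Add both 16-bit lanes of two packed words, saturating each lane."""
    ah, al = unpack16x2(a)
    bh, bl = unpack16x2(b)
    return pack16x2(saturate16(ah + bh), saturate16(al + bl))

def quantize(x, frac_bits=11):
    """Round a value in [-1, 1) to a signed fixed-point integer."""
    scale = 1 << frac_bits
    return max(-scale, min(scale - 1, round(x * scale)))

total = unpack16x2(packed_saturating_add(pack16x2(30000, -5),
                                         pack16x2(10000, -7)))
```

In the example the upper lane would overflow to 40000, so it saturates at 32767, while the lower lane adds normally.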

21 RICE UNIVERSITY Time requirement
[Figure: bar chart of normalized time (axis 0 to 100). Bars: Original; Unified Synch + Detect: 68.5; 16-bit fixed-point: 41.8]
2.39X speedup

22 RICE UNIVERSITY Integrated receiver design
- Joint channel estimation and detection
  - Effective spreading code approach
  - Optimized detector design
- Joint detection and decoding
[Block diagram labels: RECEIVER (DECODING, DETECTION, DEMODULATION, CHANNEL ESTIMATION), output: detected bits of all K users]

23 RICE UNIVERSITY Linear multiuser detector
N: block length, K: number of users
- Received signal: r = (UZ) b + n
- Channel estimation: (UZ)
- Matched filter output: y = (UZ)^T r
- Linear detector: solve R b + n = y, where R = (UZ)^T (UZ)
- Size of the linear system: NK
- Direct inverse takes O((NK)^3) operations
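A toy instance makes the role of R concrete. The sketch below assumes two synchronous users, one bit each, no noise, and a made-up effective code matrix in which user 2 is three times stronger than user 1; it solves R b = y by plain Gaussian elimination, which is the direct method the Kronecker algorithm is meant to avoid.

```python
def gram(A):
    """R = A^T A for A stored as a list of rows (chips x users)."""
    K = len(A[0])
    return [[sum(row[i] * row[j] for row in A) for j in range(K)]
            for i in range(K)]

def matched_filter(A, r):
    """y = A^T r."""
    return [sum(row[k] * x for row, x in zip(A, r)) for k in range(len(A[0]))]

def solve(R, y):
    """Gaussian elimination with partial pivoting."""
    n = len(y)
    M = [row[:] + [v] for row, v in zip(R, y)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(M[i][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for i in range(col + 1, n):
            f = M[i][col] / M[col][col]
            for j in range(col, n + 1):
                M[i][j] -= f * M[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def sign(v):
    return 1 if v >= 0 else -1

# Hypothetical effective codes: column 0 is user 1, column 1 is user 2.
A = [[1, 3], [1, 3], [1, 3], [1, -3]]
b_true = [1, -1]
r = [sum(a * b for a, b in zip(row, b_true)) for row in A]

y = matched_filter(A, r)
mf_decisions = [sign(v) for v in y]               # plain matched filter
dec_decisions = [sign(v) for v in solve(gram(A), y)]  # linear detector
```

Here the matched filter gets user 1 wrong because of the strong interferer, while solving with R removes the multiple access interference and recovers both bits.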

24 RICE UNIVERSITY Outline of the Kronecker algorithm
- Kronecker representation
  - Isolates structure and the matrix blocks
  - Fourier transform converts it to a block-diagonal system
  - Computationally optimal
Correlation matrix is block-Toeplitz
Approximate it as a block-circulant system
Solve N independent order-K systems iteratively
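The reason the circulant approximation helps is that a Fourier transform diagonalizes a circulant system. The sketch below shows only the scalar special case (K = 1, so each "block" is a single number) and uses a naive O(N^2) DFT rather than an FFT to stay dependency-free; in the block case each frequency would yield a separate K-by-K system instead of a scalar division. All names and the example values are assumptions.

```python
import cmath

def dft(x, inverse=False):
    """Naive discrete Fourier transform (an FFT would be used in practice)."""
    n = len(x)
    s = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(s * 2j * cmath.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out

def circulant_matvec(c, x):
    """Multiply by the circulant matrix whose first column is c."""
    n = len(c)
    return [sum(c[(i - j) % n] * x[j] for j in range(n)) for i in range(n)]

def circulant_solve(c, y):
    """Solve C x = y: in the Fourier domain the system is diagonal."""
    C, Y = dft(c), dft(y)
    X = [yk / ck for yk, ck in zip(Y, C)]   # N independent scalar solves
    return [v.real for v in dft(X, inverse=True)]

c = [4, 1, 0, 1]            # first column of a small circulant matrix
x_true = [1, -1, 1, 1]
y = circulant_matvec(c, x_true)
x = circulant_solve(c, y)
```

The per-frequency division is the scalar counterpart of the "N independent order-K systems" on the slide.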

25 RICE UNIVERSITY Speedup in detector
Complexity: O(N^2 K^3) vs. O(N K^2 + K N log N)
[Figure: bar chart of achievable data rate (Kbps, axis 0 to 90). Decorrelator: 10.4 Kbps; Kronecker: 83.1 Kbps]

26 RICE UNIVERSITY Pipelining and parallelization
- Mostly matrix-based operations
- Detector: iterative algorithm
  - Pipeline various iterations
- Parallelize operations
  - Add more functional units
  - Distribute data across functional units
  - Distribute computations

27 RICE UNIVERSITY Projected computation time
[Figure: bar chart of achievable data rate (Kbps, axis 0 to 600) for the base multiuser algorithm. Configurations: DSP only; DSP + coprocessor support; hardware pipelining; pipelining + parallelization. Data rates shown: 20.75 Kbps, 154.3 Kbps, 564.5 Kbps. Annotation: 30 adders and multipliers]

28 RICE UNIVERSITY Integrated receiver design
- Joint channel estimation and detection
  - Effective spreading code approach
  - Optimized detector design
- Joint detection and decoding
[Block diagram labels: RECEIVER (DECODING, DETECTION, DEMODULATION, CHANNEL ESTIMATION), output: detected bits of all K users]

29 RICE UNIVERSITY Maximum a-posteriori (MAP) decoding
[Block diagram labels: TRANSMITTER (ENCODING, SPREADING, MODULATION), OTHER USERS, b, d]
- Received signal: r = UZd + n
- Optimum decoding rule
  - Constrained optimization problem
  - Decode all users simultaneously
Exponential complexity in number of users

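30 RICE UNIVERSITY Single-user detection and decoding
- Suboptimum alternatives
- Isolate detection and decoding
[Block diagram: the received signal r feeds matched filters MF 1 ... MF K; their outputs y_1 ... y_K go to Decoder 1 ... Decoder K, which produce the bit estimates b̂_1 ... b̂_K]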

31 RICE UNIVERSITY Decoding matched filter outputs
[Figure: BER (log scale, 10^-4 to 10^0) versus SNR (dB, 1 to 8). Curves: MF+Viterbi, Optimal]
Huge performance loss!

32 RICE UNIVERSITY Iterative detection and decoding
- User of concern c, interfering users I: r = (UZ)_c d_c + (UZ)_I d_I + z
- Estimate d_I
- Eliminate interference: ŷ_c = (UZ)_c^T (r - (UZ)_I d_I)
- Estimate d_c for the next step
Complexity linear in number of users
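The cancellation step can be illustrated on a toy case. In this sketch the estimates of the interfering bits are plain hard decisions from the previous stage, standing in for the decoder outputs that the talk feeds back; the two-user code matrix, the absence of noise, and all names are assumptions.

```python
def sign(v):
    return 1 if v >= 0 else -1

def matched_filter(A, r, k):
    """(UZ)_k^T r for user k, with A stored as rows (chips x users)."""
    return sum(row[k] * x for row, x in zip(A, r))

def cancel_and_detect(A, r, d_hat):
    """One stage: for each user c, subtract the other users' estimated
    contribution from r, then matched-filter the cleaned signal."""
    K = len(A[0])
    updated = []
    for c in range(K):
        cleaned = [x - sum(row[k] * d_hat[k] for k in range(K) if k != c)
                   for row, x in zip(A, r)]
        updated.append(sign(matched_filter(A, cleaned, c)))
    return updated

# Hypothetical effective codes; user 2 (column 1) is three times stronger.
A = [[1, 3], [1, 3], [1, 3], [1, -3]]
d_true = [1, -1]
r = [sum(a * d for a, d in zip(row, d_true)) for row in A]

stage0 = [sign(matched_filter(A, r, k)) for k in range(2)]  # no cancellation
stage1 = cancel_and_detect(A, r, stage0)                    # one iteration
```

Each stage costs one subtraction and one matched filter per user, which is where the linear complexity in the number of users comes from.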

33 RICE UNIVERSITY Reduction in decoding complexity
- Convolutional code
  - Coded bits depend on past data bits
  - Performance improves with memory length
- Viterbi algorithm for decoding
  - Complexity exponential in memory length
- Our suboptimal approach
  - Maximal weight basis decoding
  - Complexity quadratic in memory length

34 RICE UNIVERSITY Joint detection and decoding performance
[Figure: BER (log scale, 10^-5 to 10^0) versus SNR (dB, 1 to 8). Curves: MF + Viterbi, Iter1 + Subopt, Iter1 + Viterbi, Iter3 + Subopt, Optimal. Code: rate = 1/2, memory = 7]

35 RICE UNIVERSITY Joint detection and decoding
- Huge performance gain
- Suboptimal approximation
  - Insignificant performance loss
  - Significant computational gain
- Architecture for suboptimal decoding?
  - Viterbi algorithm: butterfly architecture
  - Have a sliding window implementation

36 RICE UNIVERSITY Summary of contributions
- Integrated channel estimation and detection model [WCNC]
- Optimized detection algorithm [PIMRC, Tr. Com]
- Fixed point implementation [ICASSP, SPIE]
- Parallel architecture [Asilomar]
- Joint detection and decoding [Globecom, Tr. Com]
- Suboptimal decoding algorithm [Asilomar, Tr. Inf. Th.]

37 RICE UNIVERSITY Future research
[Diagram labels: Wireless Cellular, Wireless LAN, Bluetooth/Home Networks, Ad-hoc Network]

38 RICE UNIVERSITY Future research
- Universal wireless receiver
  - Reconfigurable solution
  - Power efficient
  - Automate design?
- Network-level interaction
  - Resource allocation
  - Quality of service guarantee
- Application-level interaction

39 RICE UNIVERSITY Further details http://www.ece.rice.edu/~suman http://www.ece.rice.edu/CMC

40 RICE UNIVERSITY Convolutional codes
[Diagram: encoder with input b and outputs d_odd, d_even]
Rate: 1/2, memory = 2
d_2 = d_1
d_4 = d_1 + d_3
d_6 = d_1 + d_3 + d_5
d_8 = d_3 + d_5 + d_7
d_10 = d_5 + d_7 + d_9
d_odd: systematic bits
d_even: parity bits
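The parity equations above describe a systematic rate-1/2 encoder in which each parity bit is the sum of the current and the two previous data bits. A minimal sketch, assuming bits are represented as 0/1 and the sums are modulo 2 (the slide does not state the arithmetic explicitly); the function name `encode` is hypothetical.

```python
def encode(bits, memory=2):
    """Systematic rate-1/2 convolutional encoder.

    Output positions 1, 3, 5, ... (1-based) carry the data bits; positions
    2, 4, 6, ... carry the parity of the current and `memory` previous bits.
    """
    out = []
    for j, b in enumerate(bits):
        parity = 0
        for m in range(memory + 1):
            if j - m >= 0:
                parity ^= bits[j - m]   # addition modulo 2
        out += [b, parity]
    return out

d = encode([1, 0, 1, 1, 0])
# d[0], d[2], d[4], ... are d_1, d_3, d_5, ... on the slide (systematic bits);
# d[1], d[3], d[5], ... are d_2, d_4, d_6, ... (parity bits).
```

For instance, the fourth parity bit is d_8 = d_3 + d_5 + d_7, which in the list above is `d[7] == d[2] ^ d[4] ^ d[6]`.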

41 RICE UNIVERSITY Suboptimal single user channel decoder
- y = (y_1, ..., y_N)
- d = (d_1, ..., d_N)
- Viterbi algorithm:
  - Complexity grows exponentially with the memory length
- If no codeword constraint: d = sgn(y)
  - Estimated d may not be a codeword!

42 RICE UNIVERSITY Maximum weight basis decoding
d_2 = d_1
d_4 = d_1 + d_3
d_6 = d_1 + d_3 + d_5
d_8 = d_3 + d_5 + d_7
d_10 = d_5 + d_7 + d_9
- More variables than equations
- NR independent variables (N: block length, R: rate)
- Choice depends on y_i
  - y = 7.5: d = 1
  - y = -4.5: d = -1
  - y = 0.5: d = ?
Want to choose maximally independent subset with largest total weight

43 RICE UNIVERSITY Selection of maximally independent subset
- Set I = ∅
- Given y, sort the weights |y_i|, i = 1..N
- While |I| < NR
  - Choose the location e from {1..N} with the largest weight such that I ∪ {e} is still an independent subset of {1..N}
  - Set I = I ∪ {e}
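The greedy loop above can be sketched for the rate-1/2, memory-2 code from the earlier slide. The sketch assumes that "independent" means linear independence over GF(2) of the generator columns belonging to the chosen positions; each column is stored as a bitmask over the data bits, positions are 0-based, and all names and the sample y are hypothetical.

```python
def generator_columns(n_info, memory=2):
    """Column of the generator matrix for every coded position (bitmask)."""
    cols = []
    for j in range(n_info):
        cols.append(1 << j)                  # systematic position
        mask = 0
        for m in range(memory + 1):
            if j - m >= 0:
                mask |= 1 << (j - m)
        cols.append(mask)                    # parity position
    return cols

def reduce_vec(vec, basis):
    """Reduce vec against basis (dict: leading bit -> vector) over GF(2)."""
    while vec:
        lead = vec.bit_length() - 1
        if lead not in basis:
            return vec
        vec ^= basis[lead]
    return 0

def select_subset(y, cols, n_info):
    """Greedy choice of n_info independent positions with largest |y|."""
    order = sorted(range(len(y)), key=lambda i: -abs(y[i]))
    basis, chosen = {}, []
    for e in order:
        if len(chosen) == n_info:
            break
        v = reduce_vec(cols[e], basis)
        if v:                                # still independent: keep it
            basis[v.bit_length() - 1] = v
            chosen.append(e)
    return chosen

cols = generator_columns(3)                  # N = 6 coded bits, NR = 3
y = [7.5, -4.5, 0.5, 3.0, -0.2, 6.0]
subset = select_subset(y, cols, 3)
```

In this example position 1 is skipped even though |y| = 4.5 is large, because d_2 = d_1 makes it dependent on position 0, which was already chosen.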

44 RICE UNIVERSITY Suboptimal decoding algorithm
- Choose M maximum independent subsets
- For each independent subset I
  - Compute the codeword d_I, if d_e = sgn(y_e)
  - Compute the likelihood p(y | d_I)
- Choose the codeword with the largest likelihood
Decoding complexity reduced from O(2^m) to O(m^2), where m is the memory length

45 RICE UNIVERSITY Performance improvement
[Figure: BER (log scale, 10^-4 to 10^0) versus SNR (dB, 1 to 8). Curves: MF+MAP, 2stage + MAP, Single User]
Performance approaches single-user bound

