RICE UNIVERSITY Advanced Wireless Receivers: Algorithmic and Architectural Optimizations Suman Das Rice University Department of Electrical and Computer.

Presentation transcript:

RICE UNIVERSITY Advanced Wireless Receivers: Algorithmic and Architectural Optimizations Suman Das Rice University Department of Electrical and Computer Engineering & Center for Multimedia Communication

Introduction
Wireless is one of the fastest growing industries.
[Figure: millions of cell-phone users by year. Source: Ericsson]
“By 2002, a lot more cellular phones are going to have internet access than PCs.” (Larry Ellison, CEO, Oracle)

Ubiquitous wireless connectivity
[Diagram: Wireless Cellular, Wireless LAN, Bluetooth/Home Networks, Ad-hoc Network]

Why advanced receiver algorithms?
- The number of wireless subscribers is growing
- Multimedia data is replacing voice traffic
- Higher and varied data rates (144 Kbps - 2 Mbps)
- Stricter quality of service (QoS)
- Wireless bandwidth remains a critical resource
Current generation receivers are suboptimal.

Performance of advanced receivers
[Figure: bit error rate vs. SNR (dB) for the current receiver, the advanced receiver, and the theoretical limit]
Huge performance improvement.

Computational requirements of advanced receivers
- A 15-user system transmitting at 0.5 Mbps needs
  - ~20 billion additions per second
  - ~15 billion multiplications per second
- Requires 32-bit floating-point precision
50 floating-point DSPs running at 200 MHz are needed to sustain the computation!
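As a rough sanity check on these figures, the short sketch below works out what the stated load implies per DSP. It assumes (this is not stated on the slide) that additions and multiplications count equally and that the work divides evenly across the 50 devices.

```python
# Back-of-the-envelope check of the stated computational load.
# Assumption: every addition and multiplication counts as one operation
# and the load is split evenly across the DSPs.
additions_per_second = 20_000_000_000        # ~20 billion (from the slide)
multiplications_per_second = 15_000_000_000  # ~15 billion (from the slide)
num_dsps = 50                                # from the slide
clock_hz = 200_000_000                       # 200 MHz (from the slide)

total_ops_per_second = additions_per_second + multiplications_per_second
ops_per_dsp_per_second = total_ops_per_second / num_dsps
ops_per_cycle = ops_per_dsp_per_second / clock_hz

print(f"Total load: {total_ops_per_second / 1e9:.0f} billion operations per second")
print(f"Per DSP: {ops_per_dsp_per_second / 1e6:.0f} million operations per second")
print(f"Sustained operations per clock cycle per DSP: {ops_per_cycle:.1f}")
```

Under these assumptions each DSP would have to sustain several arithmetic operations on every clock cycle, which is why the later slides focus on reducing the operation count and the word length.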

My research
- Receiver design
  - High performance
  - Low complexity
- Approach
  - Algorithmic simplification
  - Efficient architectural mapping

Wireless channel model
[Diagram: User 1 and User 2 transmitting to the base station over a direct path and reflected paths, with noise]
- Channel effects
  - Background noise
  - Fading
  - Multiple paths
- Multiple users
  - Multiple access interference (MAI)

Code Division Multiple Access (CDMA)
[Diagram: spreading waveform S(t) vs. time, showing one bit divided into chips; spreading gain = 7]
- Wideband CDMA: technology of choice
- Users distinguished by spreading sequence
Received signal notation: K: number of users, P: number of paths, w: attenuation, τ: delay, b: data bits
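To make the idea of separating users by spreading sequence concrete, here is a minimal sketch with a spreading gain of 7. The two codes (a length-7 m-sequence and a cyclic shift of it), the bit patterns, and the synchronous, noise-free, single-path channel are illustrative assumptions, not taken from the slides.

```python
# Illustrative spreading and despreading with spreading gain 7.
# Assumptions: hypothetical codes, synchronous users, no noise, one path.
CODE_1 = [1, 1, 1, -1, 1, -1, -1]   # length-7 m-sequence mapped to +/-1
CODE_2 = [1, 1, -1, 1, -1, -1, 1]   # the same sequence cyclically shifted


def spread(bits, code):
    """Replace every +/-1 bit with the bit multiplied by each chip of the code."""
    return [bit * chip for bit in bits for chip in code]


def despread(chips, code):
    """Correlate each bit interval of the chip stream with the code."""
    gain = len(code)
    return [
        sum(chips[start + j] * code[j] for j in range(gain))
        for start in range(0, len(chips), gain)
    ]


bits_1 = [1, -1, 1]
bits_2 = [-1, -1, 1]

# The base station sees the sum of both users' chip streams.
received = [a + b for a, b in zip(spread(bits_1, CODE_1), spread(bits_2, CODE_2))]

stats_1 = despread(received, CODE_1)
detected_1 = [1 if s >= 0 else -1 for s in stats_1]
print(stats_1, detected_1)
```

The correlator output for user 1 is dominated by its own bit scaled by the spreading gain, while the second user only contributes a small cross-correlation term. That residual term is the multiple access interference that multiuser detection tries to remove.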

CDMA system
[Block diagram: transmitter (encoding, spreading, modulation) carrying data, combined with other users; receiver (demodulation, channel estimation, detection, decoding) producing detected bits of all K users]
- Proposed advanced/multiuser receiver modules
  - Designed in isolation
  - Suboptimal design

Integrated receiver design
- Joint channel estimation and detection
- Joint detection and decoding
[Block diagram: receiver (demodulation, channel estimation, detection, decoding) producing detected bits of all K users]

Why separate channel estimation and detection?
[Diagram labels: received signal, chip-matched filter, channel estimation, code-matched filter, detection; timing diagram with r_i, b_i, b_{i+1}, delay, time, and the processing window for channel estimation]
The two stages operate on different statistics.

Towards an integrated solution
- Reuse computation from the channel estimation step
- Use the same discretized filter output
- Avoid alignment to the bit interval of each user
- Reduce computation
- Save hardware

Components of the observation vector
[Diagram: bit i = +1 and bit i+1 = -1 arriving with attenuation and delay; path contributions w_{k,p} and -w_{k,p}]

Matrix representation
[Diagram: bit i = +1, bit i+1 = +1; U_k Z_k b_k(i) + other users; preamble]
r = U Z b

Efficient statistics
- Parametric approach
  - Build channel model (number of paths)
  - Estimate delay, attenuation
  - Produce the code-matched filter output
- Our approach
  - Estimate effective spreading code (UZ)
  - Code-matched filter y = (UZ)^T r
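The sketch below illustrates the second approach under strong simplifying assumptions: a noise-free, bit-synchronous model r_i = (UZ) b_i and a preamble whose bit patterns are orthogonal across users. The correlation estimator, the matrix values, and the function names are hypothetical and stand in for whatever estimator the original work uses.

```python
# Estimate the effective spreading codes (UZ) from a known preamble, then
# use the estimate directly as the code-matched filter.
# Assumptions: r_i = (UZ) b_i with no noise and no overlap between
# successive bits; the estimator is a plain correlation with the preamble.

def estimate_effective_codes(received, preamble):
    """Average the outer products r_i * b_i^T over the preamble."""
    n_chips, n_users, length = len(received[0]), len(preamble[0]), len(received)
    estimate = [[0.0] * n_users for _ in range(n_chips)]
    for r, b in zip(received, preamble):
        for n in range(n_chips):
            for k in range(n_users):
                estimate[n][k] += r[n] * b[k] / length
    return estimate


def matched_filter(codes, r):
    """Compute y = codes^T r, one output per user."""
    return [sum(codes[n][k] * r[n] for n in range(len(r)))
            for k in range(len(codes[0]))]


def observe(codes, bits):
    """Noise-free observation r = codes * bits."""
    return [sum(row[k] * bits[k] for k in range(len(bits))) for row in codes]


# Hypothetical effective codes: 4 chips by 2 users, unknown to the receiver.
true_codes = [[1.0, 0.5], [0.5, -1.0], [-0.75, 0.25], [0.25, 0.75]]
preamble = [[1, 1], [1, -1], [1, 1], [1, -1]]   # orthogonal across users

estimated = estimate_effective_codes([observe(true_codes, b) for b in preamble],
                                     preamble)
y = matched_filter(estimated, observe(true_codes, [1, -1]))
decisions = [1 if v >= 0 else -1 for v in y]
print(estimated, y, decisions)
```

No delay or attenuation is ever extracted here: the estimate is applied to the received vector as is, which is the saving the slide points to.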

Simulation parameters
- System parameters
  - 15 users
  - 3 paths
  - Spreading gain: 31
- Hardware platform
  - TI C62 and C67 EVM boards
  - 64 KB each of internal program and data memory
  - 256 KB SBSRAM, 8 MB SDRAM (external)
  - Code Composer 1.0 to profile code

Effectiveness of integrated design
[Figure: bit error rate vs. SNR (dB); legend: Multiuser, Parametric approach, UZ approach, Actual Parameters, Single User]
2 dB gain in performance.

Computational savings
- Avoid extraction of actual channel parameters
- Avoid realignment of data for code-matched filtering
- Reduce intermediate storage requirement
- Avoid divisions (28 cycles) and square roots (38 cycles) on the DSP

Fixed point behavior
- Fixed-point advantages: speed, power, cost
- Fixed-point analysis
  - 12 bits of precision required instead of 32 bits!
  - Pack two 16-bit operations in 32-bit registers
  - More packing with saturation arithmetic
User power control!
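The packing idea can be sketched in a few lines. This is an illustration in Python of what a packed, saturating 16-bit addition does, not the DSP instruction sequence used in the original work; the helper names are hypothetical.

```python
# Two signed 16-bit values share one 32-bit word, and additions clip at the
# 16-bit limits instead of wrapping around.
# Assumption: this models the arithmetic only, not any specific instruction.
INT16_MIN, INT16_MAX = -32768, 32767


def saturate16(value):
    """Clip an integer to the signed 16-bit range."""
    return max(INT16_MIN, min(INT16_MAX, value))


def pack2x16(high, low):
    """Store two signed 16-bit values in one 32-bit word."""
    return ((high & 0xFFFF) << 16) | (low & 0xFFFF)


def unpack2x16(word):
    """Recover the two signed 16-bit halves of a 32-bit word."""
    def to_signed(half):
        return half - 0x10000 if half & 0x8000 else half
    return to_signed((word >> 16) & 0xFFFF), to_signed(word & 0xFFFF)


def packed_saturating_add(word_a, word_b):
    """Add both 16-bit lanes independently, saturating each lane."""
    a_high, a_low = unpack2x16(word_a)
    b_high, b_low = unpack2x16(word_b)
    return pack2x16(saturate16(a_high + b_high), saturate16(a_low + b_low))


result = unpack2x16(packed_saturating_add(pack2x16(30000, -5), pack2x16(10000, -7)))
print(result)
```

In the example the upper lane would overflow with ordinary wrap-around arithmetic; saturation pins it at the largest 16-bit value while the lower lane is unaffected.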

Time requirement
[Figure: normalized time for the original, unified synch + detect, and 16-bit fixed-point implementations]
2.39x speedup.

Integrated receiver design
- Joint channel estimation and detection
  - Effective spreading code approach
  - Optimized detector design
- Joint detection and decoding
[Block diagram: receiver (demodulation, channel estimation, detection, decoding) producing detected bits of all K users]

Linear multiuser detector
- N: block length; K: number of users
- Received signal: r = (UZ) b + n
- Channel estimation: (UZ)
- Matched filter output: y = (UZ)^T r
- Linear detector: solve R b + n = y, where R = (UZ)^T (UZ)
- Size of the linear system: NK
- Direct inverse takes O((NK)^3) operations
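A toy instance of the linear detector is sketched below for a single bit interval with two users. The effective codes and amplitudes are made up, noise is omitted, and plain Gaussian elimination stands in for the direct inverse whose cost the slide quotes.

```python
# Toy linear (decorrelating) detector: solve R b = y with R = (UZ)^T (UZ).
# Assumptions: two users, one bit interval, no noise, hypothetical codes.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def solve(matrix, rhs):
    """Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [[float(v) for v in row] + [float(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(aug[row][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[row][c] -= factor * aug[col][c]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][c] * x[c] for c in range(row + 1, n))
        x[row] = (aug[row][n] - known) / aug[row][row]
    return x


# One effective code per user (columns of UZ); user 2 is much stronger.
codes = [[1, 1, 1, 1], [3, 3, 3, -3]]
bits = [1, -1]
r = [sum(b * code[n] for b, code in zip(bits, codes)) for n in range(4)]

y = [dot(code, r) for code in codes]                       # matched filter
R = [[dot(ci, cj) for cj in codes] for ci in codes]        # correlation matrix
soft = solve(R, y)                                         # linear detector

matched_filter_bits = [1 if v >= 0 else -1 for v in y]
decorrelator_bits = [1 if v >= 0 else -1 for v in soft]
print(matched_filter_bits, decorrelator_bits)
```

In this example the matched filter alone gets the weak user's bit wrong because of the strong interferer, while solving the linear system recovers both bits.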

Outline of the Kronecker algorithm
- Kronecker representation
  - Isolates structure and the matrix blocks
  - Fourier transform converts it to a block-diagonal system
  - Computationally optimal
- Correlation matrix is block-Toeplitz
- Approximate it as a block-circulant system
- Solve N independent order-K systems iteratively
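The reason the Fourier transform helps can be seen in the scalar case (K = 1), where a circulant matrix is diagonalized exactly by the DFT. The sketch below is only that special case with made-up numbers; in the block-circulant setting each frequency bin would hold a K x K system rather than a single division. A naive DFT is used for brevity where a real implementation would use an FFT.

```python
# Solving a circulant system C x = y by division in the DFT domain.
# Assumptions: scalar (K = 1) case, hypothetical matrix entries, naive DFT.
import cmath


def dft(values, inverse=False):
    n = len(values)
    sign = 1 if inverse else -1
    out = [
        sum(values[m] * cmath.exp(sign * 2j * cmath.pi * f * m / n) for m in range(n))
        for f in range(n)
    ]
    return [v / n for v in out] if inverse else out


def circulant_multiply(first_column, x):
    """y[n] = sum_m c[(n - m) mod N] * x[m]."""
    n = len(x)
    return [sum(first_column[(i - m) % n] * x[m] for m in range(n)) for i in range(n)]


def solve_circulant(first_column, rhs):
    """The DFT of the first column gives the eigenvalues of C."""
    eigenvalues = dft(first_column)
    rhs_freq = dft(rhs)
    solution_freq = [r / e for r, e in zip(rhs_freq, eigenvalues)]
    return [v.real for v in dft(solution_freq, inverse=True)]


first_column = [4, 1, 0, 1]          # symmetric, diagonally dominant
x_true = [1, -1, 1, 1]
y = circulant_multiply(first_column, x_true)
x_recovered = solve_circulant(first_column, y)
print(y, [round(v, 6) for v in x_recovered])
```

Each frequency bin is solved independently of the others, which is what makes the N small systems in the slide separable and easy to pipeline or parallelize.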

Speedup in detector
Complexity: O(N^2 K^3) vs. O(NK^2 + KN log N)
[Figure: achievable data rate (Kbps): decorrelator 10.4 Kbps, Kronecker 83.1 Kbps]

Pipelining and parallelization
- Mostly matrix-based operations
- Detector: iterative algorithm
  - Pipeline the iterations
- Parallelize operations
  - Add more functional units
  - Distribute data across functional units
  - Distribute computations

Projected computation time
[Figure: achievable data rate (Kbps) of the base multiuser algorithm for DSP only, DSP + coprocessor support, hardware pipelining, and pipelining + parallelization with 30 adders and multipliers; the plotted values are not preserved in the transcript]

Integrated receiver design
- Joint channel estimation and detection
  - Effective spreading code approach
  - Optimized detector design
- Joint detection and decoding
[Block diagram: receiver (demodulation, channel estimation, detection, decoding) producing detected bits of all K users]

Maximum a-posteriori (MAP) decoding
- Received signal: r = UZd + n
- Optimum decoding rule
- Constrained optimization problem
- Decode all users simultaneously
Exponential complexity in the number of users.
[Block diagram: transmitter (encoding, spreading, modulation) with other users; signals labeled b and d]

Single-user detection and decoding
- Suboptimum alternatives
- Isolate detection and decoding
[Block diagram: r feeds matched filters MF 1 ... MF K with outputs y_1 ... y_K, followed by Decoder 1 ... Decoder K with outputs \hat{b}_1 ... \hat{b}_K]

Decoding matched filter outputs
[Figure: BER vs. SNR (dB); curves: MF + Viterbi, Optimal]
Huge performance loss!

Iterative detection and decoding
- User of concern c, interfering users I: r = (UZ)_c d_c + (UZ)_I d_I + z
- Estimate d_I
- Eliminate interference: \hat{y}_c = (UZ)_c^T (r - (UZ)_I d_I)
- Estimate d_c for the next step
Complexity linear in the number of users.
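The cancellation step can be sketched for two users as follows. To keep the example short, a hard sign decision stands in for the channel decoder that would produce the estimates in the full scheme, and the effective codes are hypothetical; both are assumptions of this sketch.

```python
# One stage of interference cancellation:
#   y_c = (UZ)_c^T (r - (UZ)_I d_I), with d_I taken from the previous stage.
# Assumptions: hypothetical codes, no noise, sign decisions instead of decoding.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sign(value):
    return 1 if value >= 0 else -1


def matched_filter_decisions(codes, r):
    """Initial estimates: ignore the interference completely."""
    return [sign(dot(code, r)) for code in codes]


def cancel_and_detect(codes, r, previous):
    """Subtract every other user's estimated signal, then re-detect each user."""
    updated = []
    for c, code_c in enumerate(codes):
        residual = list(r)
        for i, code_i in enumerate(codes):
            if i != c:
                residual = [x - previous[i] * a for x, a in zip(residual, code_i)]
        updated.append(sign(dot(code_c, residual)))
    return updated


codes = [[1, 1, -1, 1], [3, 3, -3, -3]]   # weak user 1, strong user 2
true_bits = [1, -1]
r = [sum(b * code[n] for b, code in zip(true_bits, codes)) for n in range(4)]

initial = matched_filter_decisions(codes, r)
refined = cancel_and_detect(codes, r, initial)
print(initial, refined)
```

Each stage costs one subtraction and one correlation per interfering user, which is consistent with the linear growth in the number of users noted on the slide.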

Reduction in decoding complexity
- Convolutional code
  - Coded bits depend on past data bits
  - Performance improves with memory length
- Viterbi algorithm for decoding
  - Complexity exponential in memory length
- Our suboptimal approach
  - Maximal weight basis decoding
  - Complexity quadratic in memory length

Joint detection and decoding performance
[Figure: BER vs. SNR (dB); curves: MF + Viterbi, Iter1 + Subopt, Iter1 + Viterbi, Iter3 + Subopt, Optimal; rate = 1/2, ν = 7]

Joint detection and decoding
- Huge performance gain
- Suboptimal approximation
  - Insignificant performance loss
  - Significant computational gain
- Architecture for suboptimal decoding?
  - Viterbi algorithm: butterfly architecture
  - Have a sliding window implementation

Summary of contributions
- Integrated channel estimation and detection model [WCNC]
- Optimized detection algorithm [PIMRC, Tr. Com]
- Fixed point implementation [ICASSP, SPIE]
- Parallel architecture [Asilomar]
- Joint detection and decoding [Globecom, Tr. Com]
- Suboptimal decoding algorithm [Asilomar, Tr. Inf. Th.]

Future research
[Diagram: Wireless Cellular, Wireless LAN, Bluetooth/Home Networks, Ad-hoc Network]

Future research
- Universal wireless receiver
  - Reconfigurable solution
  - Power efficient
  - Automate design?
- Network level interaction
  - Resource allocation
  - Quality of service guarantee
- Application level interaction

Further details

Convolutional codes
[Diagram: encoder taking data bits b and producing d_odd and d_even]
- Rate: 1/2, memory (ν) = 2
- d_odd: systematic bits
- d_even: parity bits
Parity equations:
d_2 = d_1
d_4 = d_1 + d_3
d_6 = d_1 + d_3 + d_5
d_8 = d_3 + d_5 + d_7
d_10 = d_5 + d_7 + d_9
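The parity equations above correspond to a systematic rate-1/2 encoder in which each parity bit is the modulo-2 sum of the current data bit and the two previous ones. A minimal sketch follows, assuming the encoder starts from an all-zero state and using 0/1 bits; the function name is hypothetical.

```python
# Systematic rate-1/2 convolutional encoder matching the parity equations:
# odd positions carry the data bit, even positions carry the parity of the
# current and the two previous data bits.
# Assumption: the encoder memory starts at zero.

def encode(data_bits):
    coded = []
    for i, bit in enumerate(data_bits):
        previous_1 = data_bits[i - 1] if i >= 1 else 0
        previous_2 = data_bits[i - 2] if i >= 2 else 0
        coded.append(bit)                              # d_odd: systematic bit
        coded.append(bit ^ previous_1 ^ previous_2)    # d_even: parity bit
    return coded


codeword = encode([1, 0, 1, 1, 0])
print(codeword)
```

With the slide's 1-based numbering, `codeword[0]` is d_1 and `codeword[1]` is d_2, so d_2 = d_1 and d_4 = d_1 + d_3 can be checked directly on the output.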

Suboptimal single user channel decoder
- y = (y_1, ..., y_N)
- d = (d_1, ..., d_N)
- Viterbi algorithm
  - Complexity grows exponentially with ν
- If no codeword constraint: d = sgn(y)
  - Estimated d may not be a codeword!

Maximum weight basis decoding
- More variables than equations
  - NR independent variables (N: block length, R: rate)
- Choice depends on y_i
  - y = 7.5: d = 1
  - y = [value not preserved in the transcript]: d = -1
  - y = 0.5: d = ?
Parity equations:
d_2 = d_1
d_4 = d_1 + d_3
d_6 = d_1 + d_3 + d_5
d_8 = d_3 + d_5 + d_7
d_10 = d_5 + d_7 + d_9
Want to choose the maximally independent subset with the largest total weight.

Selection of maximally independent subset
- Set I = ∅
- Given y, sort the weights |y_i|, i in {1..N}
- While |I| < NR
  - Choose location e from {1..N} with the largest weight such that I ∪ e is still an independent subset of {1..N}
  - Set I = I ∪ e
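The greedy loop above can be sketched as follows. The sketch assumes that "independent" means the chosen positions correspond to linearly independent columns of the code's generator matrix over GF(2), and it builds that matrix from a systematic rate-1/2, memory-2 encoder with zero initial state. The helper names, the toy block length, and the values of y are hypothetical. Indices are 0-based, unlike the 1-based slide notation.

```python
# Greedy selection of the most reliable linearly independent positions.
# Assumptions: independence is tested on generator-matrix columns over GF(2);
# the code is a systematic rate-1/2, memory-2 code with zero initial state.

def encode(data_bits):
    coded = []
    for i, bit in enumerate(data_bits):
        previous_1 = data_bits[i - 1] if i >= 1 else 0
        previous_2 = data_bits[i - 2] if i >= 2 else 0
        coded += [bit, bit ^ previous_1 ^ previous_2]
    return coded


def generator_columns(num_data_bits):
    """Column n of the generator matrix, packed into an integer bit mask."""
    rows = [encode([int(j == i) for j in range(num_data_bits)])
            for i in range(num_data_bits)]
    return [sum(rows[i][n] << i for i in range(num_data_bits))
            for n in range(2 * num_data_bits)]


def try_add(basis, vector):
    """Insert vector into a GF(2) basis keyed by leading bit; False if dependent."""
    while vector:
        lead = vector.bit_length() - 1
        if lead not in basis:
            basis[lead] = vector
            return True
        vector ^= basis[lead]
    return False


def select_independent_subset(y, columns, subset_size):
    order = sorted(range(len(y)), key=lambda n: abs(y[n]), reverse=True)
    basis, chosen = {}, []
    for position in order:
        if len(chosen) == subset_size:
            break
        if try_add(basis, columns[position]):
            chosen.append(position)
    return chosen


columns = generator_columns(3)                 # N = 6, NR = 3
y = [3.0, 2.5, 0.1, 0.2, 1.0, 0.3]
chosen = select_independent_subset(y, columns, 3)
print(columns, chosen)
```

In this example position 1 has the second-largest weight but is skipped, because d_2 = d_1 makes it dependent on position 0, which was already chosen.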

Suboptimal decoding algorithm
- Choose M maximum independent subsets
- For each independent subset I
  - Compute the codeword d_I (if d_e = sgn(y_e))
  - Compute the likelihood p(y | d_I)
- Choose the codeword with the largest likelihood
Decoding complexity reduced from O(2^ν) to O(ν^2).

Performance improvement
[Figure: BER vs. SNR (dB); curves: MF + MAP, 2-stage + MAP, Single User]
Performance approaches the single-user bound.