RICE UNIVERSITY “Joint” architecture & algorithm designs for baseband signal processing. Sridhar Rajagopal and Joseph R. Cavallaro, Rice Center for Multimedia Communications. This work has been supported by Nokia, TI, TATP and NSF.

RICE UNIVERSITY Single-slide version of my talk [Diagram labels: Algorithms, DSP, VLSI, FPGA, IMAGINE; Multiuser channel estimation, Multiuser detection; Task-partitioning, Parallelism, Pipelining; Conventional arithmetic, On-line arithmetic; Instruction set extensions, Co-processor support, Functional unit design and usage; timeline: Distant Past, Recent Past, Recent and Near Future]

RICE UNIVERSITY Contents  Algorithms for channel estimation and detection  Conventional and on-line arithmetic designs  Programmable architecture design using the IMAGINE simulator

RICE UNIVERSITY Estimation - detection algorithms?  Sophisticated, computationally complex algorithms proposed for 3G - 4G standards  Typically need complex operations, huge matrix sizes, matrix inversions  Difficult for hardware implementation and for real-time performance

RICE UNIVERSITY Multiuser channel estimation algorithm  = {+1, -1} : training/tracking bits  = 8-bit integer (complex) : Received signal  N = spreading gain (typically fixed, e.g. 32)  K = number of users (variable, <=N)  = maximum likelihood channel estimate

RICE UNIVERSITY Iterative scheme for channel estimation  Bit-streaming : suitable for tracking (window length L)  Method of gradient descent  Stable convergence behavior  Simple fixed-point VLSI architecture [ASAP 2000]
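A minimal Python sketch of the gradient-descent idea on this slide. It rests on assumptions the transcript does not state: real-valued samples (the talk uses complex 8-bit integers), the signal model r = A b, window-averaged correlations R_bb and R_br, and the update A <- A - mu * (A R_bb - R_br). The names `outer_mean`, `estimate_channel`, `mu`, and `iterations` are illustrative and are not taken from the authors' architecture. For clarity it processes the whole window at once instead of bit-streaming.

```python
def outer_mean(xs, ys):
    """Average of the outer products xs[i] * ys[i]^T over the window."""
    acc = [[0.0] * len(ys[0]) for _ in range(len(xs[0]))]
    for x, y in zip(xs, ys):
        for i, xi in enumerate(x):
            for j, yj in enumerate(y):
                acc[i][j] += xi * yj
    return [[v / len(xs) for v in row] for row in acc]


def estimate_channel(bits, received, mu=0.5, iterations=60):
    """Iteratively approach the estimate A = R_br * inverse(R_bb).

    bits:     L vectors of K training bits, each +1 or -1
    received: L vectors of N received samples
    Returns an N x K matrix (list of rows).
    """
    K, N = len(bits[0]), len(received[0])
    R_bb = outer_mean(bits, bits)        # K x K bit autocorrelation
    R_br = outer_mean(received, bits)    # N x K cross-correlation
    A = [[0.0] * K for _ in range(N)]
    for _ in range(iterations):
        for n in range(N):
            # Row n of the gradient: (A R_bb - R_br)[n]
            grad = [sum(A[n][j] * R_bb[j][k] for j in range(K)) - R_br[n][k]
                    for k in range(K)]
            A[n] = [a - mu * g for a, g in zip(A[n], grad)]
    return A
```

No matrix inversion appears in the loop, which is the property the slide highlights. Because the bits are +1 or -1, the products in `outer_mean` reduce to additions and subtractions in a fixed-point design. The step size must stay below 2 divided by the largest eigenvalue of R_bb for the iteration to converge.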

RICE UNIVERSITY Comparisons  DSPs unable to exploit bit-level parallelism  Inefficient storage of bits  Replacing multiplications by additions/subtractions
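The last two points can be illustrated with a short Python sketch: when one operand is a +1/-1 bit, a correlation needs no multiplier, and the bits can be stored one per binary digit instead of one per machine word. The function name `correlate_packed` and the LSB-first packing convention are assumptions made for this example.

```python
def correlate_packed(bits, samples):
    """Dot product of samples with a +1/-1 sequence packed into an integer.

    Bit i of `bits` (least significant bit first) set means +1, clear means -1.
    Only additions and subtractions are used.
    """
    total = 0
    for i, sample in enumerate(samples):
        if (bits >> i) & 1:
            total += sample   # bit is +1
        else:
            total -= sample   # bit is -1
    return total
```

For example, `correlate_packed(0b1011, [3, 5, 2, 7])` computes 3 + 5 - 2 + 7 = 13. A DSP that stores each bit in a full word and issues a multiply per term spends storage and cycles that a bit-level datapath avoids.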

RICE UNIVERSITY Multiuser detection innovations  Developed a simple architecture for asynchronous multiuser detection for CDMA [ +, x ]  Bit-streaming  reduced latency  eliminates window edge computations  lower memory requirements  Pipelined stages  higher throughput (with more hardware)

RICE UNIVERSITY Block Pipelined Detector Variable latency [Worst case (1st bit): D*latency per bit] 2 extra edge bit computations per stage. [Timing diagram: bit blocks 1-12 and 11-22 each pass through the matched filter (MF) and three PIC stages over time, producing bits 2-11 and 12-21.]

RICE UNIVERSITY Bit-streaming multiuser detection Savings in memory by D 2

RICE UNIVERSITY Pipelining the multiuser detector Matched Filter (causal) PIC - Stage 1 PIC - Stage 2 PIC - Stage 3 TIME Latency = 2*latency per bit (D/2 speedup over block) eliminated edge bit computations. [ISCAS 2001]
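The matched filter followed by PIC (parallel interference cancellation) stages can be sketched in Python as below. This is a simplified synchronous, real-valued version written for illustration; the talk addresses the asynchronous case with bit-streaming and pipelining, which this sketch does not model. The names `matched_filter`, `pic_detect`, and `stages` are assumptions, and `A` is taken to be an N x K matrix whose column k holds user k's spreading code scaled by its amplitude.

```python
def sign(x):
    """Hard decision; zero is mapped to +1 by convention."""
    return 1 if x >= 0 else -1


def matched_filter(A, r):
    """y[k] = sum over chips n of A[n][k] * r[n], for each user k."""
    K = len(A[0])
    return [sum(A[n][k] * r[n] for n in range(len(r))) for k in range(K)]


def pic_detect(A, r, stages=3):
    """Matched filter decisions refined by `stages` rounds of PIC."""
    K = len(A[0])
    y = matched_filter(A, r)
    # Cross-correlation matrix R = A^T A between the users' signatures.
    R = [[sum(row[j] * row[k] for row in A) for k in range(K)]
         for j in range(K)]
    bits = [sign(v) for v in y]
    for _ in range(stages):
        # Every user subtracts the interference estimated from the
        # previous stage's decisions for all other users, in parallel.
        bits = [sign(y[k] - sum(R[k][j] * bits[j] for j in range(K) if j != k))
                for k in range(K)]
    return bits
```

With `A = [[1, 3], [1, 3], [1, 3], [1, -3]]` and `r = [-2, -2, -2, 4]` (user bits +1 and -1, the second user three times stronger), the matched filter alone decides `[-1, -1]`, while one PIC stage recovers `[1, -1]`. Each stage depends only on the previous stage's decisions, which is what allows the stages to be laid out as a pipeline.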

RICE UNIVERSITY Matched filter with conventional arithmetic T ~ log(N) * log(d) N - dot product size d - precision

RICE UNIVERSITY Conventional MF using CSAs T ~ a + log(d+c) a,c - constants
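A carry-save adder (CSA) reduces three operands to a sum word and a carry word with no carry propagation, so a long accumulation needs only one carry-propagate addition at the very end. The Python sketch below shows the idea on integers; it models the arithmetic, not the timing or the authors' circuit, and the names `csa` and `csa_sum` are assumptions for this example. It relies on Python integers behaving as two's complement values of unlimited width, so signed terms work as well.

```python
def csa(a, b, c):
    """3:2 compressor: returns (s, carry) with s + carry == a + b + c."""
    s = a ^ b ^ c                                # bitwise sum without carries
    carry = ((a & b) | (b & c) | (a & c)) << 1   # majority bits, shifted left
    return s, carry


def csa_sum(operands):
    """Add many operands using CSA steps and a single final addition."""
    terms = list(operands)
    while len(terms) > 2:
        a, b, c = terms.pop(), terms.pop(), terms.pop()
        terms.extend(csa(a, b, c))
    return sum(terms)   # the only carry-propagate addition
```

For a matched filter, the operands would be the received samples with their signs set by the code bits, for example `csa_sum([-2, -2, -2, 4])` gives -2.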

RICE UNIVERSITY Key concept in on-line arithmetic  Conventional detection - high precision operations (8-32 bits) followed by testing for sign.  Actual detection dependent only on most significant digits (1-3 bits).  Use MSDF computation to find the sign and avoid computation of the successive digits. [Arith-15] Detection
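A small Python sketch of why most-significant-digit-first (MSDF) results allow early decisions. It assumes a signed-digit fraction whose digits d_1, d_2, ... satisfy |d_i| <= radix - 1 and whose value is the sum of d_i / radix^i. Under that assumption the remaining digits can never outweigh the first nonzero digit, so the sign is known as soon as that digit arrives. The names `msdf_sign`, `sd_value`, and `max_digits` are illustrative, and the sketch covers only the sign test, not an on-line adder.

```python
from fractions import Fraction


def msdf_sign(digits, max_digits=None):
    """Return (sign, digits_consumed) for a signed-digit stream, MSD first.

    Stops at the first nonzero digit. If `max_digits` digits are examined
    without finding one, returns sign 0 (undecided at that precision).
    """
    consumed = 0
    for d in digits:
        if max_digits is not None and consumed >= max_digits:
            break
        consumed += 1
        if d != 0:
            return (1 if d > 0 else -1), consumed
    return 0, consumed


def sd_value(digits, radix=2):
    """Exact value of the signed-digit fraction, used only for checking."""
    return sum(Fraction(d, radix ** (i + 1)) for i, d in enumerate(digits))
```

For the stream `[0, 1, -1, -1]` the sign is decided after two digits, and the exact value 1/16 confirms it is positive. A value close to the decision boundary produces leading zeros and takes longer to decide; cutting the search off at a fixed number of digits leaves some decisions open, which is the trade-off between speed and error probability.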

RICE UNIVERSITY Comparisons of arithmetic schemes

RICE UNIVERSITY Using on-line arithmetic for detection [Plot: time taken for addition (normalized) versus received signal amplitude (normalized), with channel symbols -1 and +1]

RICE UNIVERSITY Equations Probability of error for optimal BPSK detection Probability of error for on-line BPSK detection r – radix of the number system p – number of digits
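For reference, the textbook error probability for coherent BPSK in additive white Gaussian noise is P_e = Q(sqrt(2 E_b / N_0)), which equals 0.5 * erfc(sqrt(E_b / N_0)). The Python sketch below evaluates it; the function name `bpsk_error_probability` and the decibel input are assumptions for this example, and whether the slide wrote the optimal case in exactly this form cannot be confirmed from the transcript. The on-line expression in terms of r and p is not reproduced here.

```python
import math


def bpsk_error_probability(ebn0_db):
    """Bit error probability of coherent BPSK in AWGN at the given Eb/N0 (dB)."""
    ebn0 = 10 ** (ebn0_db / 10)              # convert dB to a linear ratio
    return 0.5 * math.erfc(math.sqrt(ebn0))  # equals Q(sqrt(2 * Eb/N0))
```

This curve is the baseline against which any loss from deciding on only the first few digits would be measured.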

RICE UNIVERSITY Probability of error using on-line

RICE UNIVERSITY On-line MF implementation T ~ c c - constant

RICE UNIVERSITY Throughput comparisons

RICE UNIVERSITY Area comparisons

RICE UNIVERSITY Implementing higher modulation schemes

RICE UNIVERSITY Conclusions on arithmetic schemes  CSAs better than straightforward implementation  1.35 - 1.6X speedup for 8-32 bit precision  1.64 - 1.14X less area  If reduced precision computations, on-line still better  1.67 - 2.12X speedup over CSA  0.64 - 12.73X less area over CSA

RICE UNIVERSITY A programmable architecture?  Flexibility in the algorithm requirements  channel dependent computations  changing algorithms on-the-fly  seamless switching between wireless LAN and wideband CDMA -- RENE.  Simulator needed to test performance of algorithms  extensions/modifications for critical operations

RICE UNIVERSITY Algorithms needed for 3G base-band base-station implementation  Wireless LAN: Equalization, FFT, Viterbi decoding  W-CDMA: Channel estimation, Multiuser detection, Viterbi/Turbo decoding  If you felt that life was too easy: Multiple antennas, Long spreading codes, Space-Time codes

RICE UNIVERSITY The IMAGINE architecture and simulator  IMAGINE is a media signal processor, built at Stanford.  Many common workload features  Good starting point to explore.  Local expertise - Dr. Scott Rixner ( rixner@rice.edu )

RICE UNIVERSITY IMAGINE architecture  Great for media processing algorithms  1024 pt FFT in 7.4 µs on a 500 MHz processor with an 8-cluster (48 units)  3.8W of power  Great for parallel, vector and streaming computations  Performance/extensions to sequential computation kernels such as Viterbi traceback need to be investigated.

RICE UNIVERSITY Conclusions  Algorithm steps for designing communication systems  Design hardware-efficient versions  Fixed-point implementation  DSP implementation - bottlenecks  Task partitioning, pipelining, parallelism  Computer arithmetic ideas -- VLSI  Integration into a programmable processor
