Software Defined Radio – A High Performance Embedded Challenge Hyunseok Lee, Yuan Lin, Yoav Harel, Mark Woh, Scott Mahlke, Trevor Mudge, and 1 Krisztian.


Software Defined Radio – A High Performance Embedded Challenge
Hyunseok Lee, Yuan Lin, Yoav Harel, Mark Woh, Scott Mahlke, Trevor Mudge, and Krisztian Flautner (1)
University of Michigan; (1) ARM Ltd

Advanced Computer Architecture Laboratory, University of Michigan

Contents
- Software defined radio
- Categories of wireless networks
- Core technologies for future networks
- Case study: W-CDMA network
  - Major algorithms
  - Workload characterization
  - Architectural implications

Software Defined Radio

Wireless Communication System
[Figure: protocol stack of a wireless communication system. Labels: Application; Upper Protocol Layers (MAC, Link, Network, Transport; PPP, IP, TCP/UDP); Physical Layer (PHY) with Baseband Processing and Analog Front-end; packets, bits, "Air"]

Anatomy of Cellular Phone

Protocol on Wireless Platform
[Figure: mapping of protocol functions onto a wireless platform. Labels: source coding (audio: AMR/QCELP; video: MPEG); upper layers (MAC, Link, Network, Transport); physical layer (PHY); ASIC (hardware); GPP (software); DSP/accelerator; application processor; baseband processor]

Software Defined Radio (SDR)
- Use software routines instead of ASICs for the physical-layer operations of a wireless communication system.
- [Figure: ASICs (PHY) replaced by programmable hardware running software routines]
- Both the analog front-end and the digital baseband are within the scope of SDR.

Levels of SDR

Tier   | Name                            | Description
Tier 0 | Hardware Radio (HR)             | Implemented using hardware components. Cannot be modified.
Tier 1 | Software Controlled Radio (SCR) | Only control functions are implemented in software: inter-connects, power levels, etc.
Tier 2 | Software Defined Radio (SDR)    | Software control of a variety of modulation techniques, wide-band or narrow-band operation, security functions, etc.
Tier 3 | Ideal Software Radio (ISR)      | Programmability extends to the entire system, with analog conversion only at the antenna.
Tier 4 | Ultimate Software Radio (USR)   | Defined for comparison purposes only.

Why do we need SDR?
- Seamless wireless connection (end user)
  - Widely different wireless protocols
    - TDMA: GSM, AMPS
    - CDMA: IS-95, cdma2000, W-CDMA, IEEE 802.11b
    - OFDM: IEEE 802.11a/g/n, WiMAX
  - Needs a terminal that can support multiple wireless protocols
- Easy infrastructure upgrade (service provider)
  - Wireless protocols evolve continuously, e.g. W-CDMA → W-CDMA + HSDPA
- Time to market (manufacturer)
  - Reduce hardware development time and cost

Where can we use SDR?
- Basestations
  - Weak constraints on power and area
  - Support several hundred subscribers
  - Will be commercialized first
- Wireless terminals
  - Tight constraints on power and area
  - Will be commercialized next

Why is SDR challenging?
- Analog front-end
  - Must be tunable across a range of carrier frequencies and bandwidths
- Digital baseband
  - Supercomputer-level computation power: > 50 Gops per subscriber
  - Tight power budget: 200 ~ 300 mW
  - High level of programmability: a combination of heterogeneous signal processing algorithms

Our Strategy
- Performance
  - Exploit the parallelism in signal processing and forward error correction (FEC) algorithms
- Power
  - Limit the programmability to minimize power consumption
  - Minimize both active and idle mode power consumption
- There is a trade-off between power efficiency and programmability

Categories of Wireless Networks

Categories of Wireless Networks

WWAN (Wireless Wide Area Network)

WLAN / WMAN
- WMAN: Wireless Metro Area Network
  - For the last-mile problem
  - IEEE 802.16d: Fixed WiMAX
  - IEEE 802.16e: Mobile WiMAX
- WLAN: Wireless Local Area Network
  - High data rate
  - Poor mobility support

WPAN (Wireless Personal Area Network)
- Interconnecting personal devices

Core technologies of future networks

OFDM (Orthogonal Frequency Division Multiplexing)
- Transmits the signal over several sub-carriers.
- The frequency spectra of the sub-carriers overlap (high spectral efficiency).
- Highly susceptible to frequency error in the receiver.

Major Computation in OFDM system
- FFT / IFFT
  - N = 64: IEEE 802.11a
  - N = 256~2048: IEEE 802.16 WiMAX
  - Data precision: 12~16 bits
- Amount of computation for OFDM operation
  - ~10^8 complex multiplications/sec
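As a rough sanity check on that order of magnitude, the sketch below counts complex multiplications for a radix-2 FFT. It assumes the textbook (N/2) log2 N count, one FFT per OFDM symbol, and a 4 microsecond symbol period (the IEEE 802.11a value); none of these figures appear on the slide, and the function names are illustrative.

```python
def fft_complex_mults(n: int) -> int:
    """Complex multiplications in one radix-2 FFT of size n: (n/2) * log2(n)."""
    assert n > 0 and n & (n - 1) == 0, "n must be a power of two"
    log2_n = n.bit_length() - 1
    return (n // 2) * log2_n


def mults_per_second(n: int, symbol_period_s: float) -> float:
    """Multiplication rate when one size-n FFT is computed per OFDM symbol."""
    return fft_complex_mults(n) / symbol_period_s


# Assumption: 4 microsecond OFDM symbol period (IEEE 802.11a), not from the slide.
rate_64 = mults_per_second(64, 4e-6)
print(f"N=64: {fft_complex_mults(64)} mults per FFT, {rate_64:.1e} mults/s")
```

Under these assumptions a single 64-point FFT stream needs about 5 × 10^7 complex multiplications per second, which is in line with the slide's order-of-magnitude figure.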

MIMO (Multiple Input Multiple Output)
- Uses multiple antennas for signal transmission and reception.
- In the ideal case, channel capacity increases linearly.
- Can effectively compensate for the multipath fading effect.
- Significantly increases receiver complexity.
- Channel capacity with a single antenna: $C = W \log_2(1 + \mathrm{SNR})$
- Channel capacity with n and m antennas at the two ends: $C = \min(n, m) \cdot W \log_2(1 + \mathrm{SNR})$
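The two capacity formulas can be compared numerically. This is a minimal sketch; the 20 MHz bandwidth and 20 dB SNR are illustrative assumptions, not values from the slides, and the function names are hypothetical.

```python
import math


def siso_capacity(bandwidth_hz: float, snr_linear: float) -> float:
    """C = W * log2(1 + SNR), in bits per second."""
    return bandwidth_hz * math.log2(1 + snr_linear)


def mimo_capacity(n: int, m: int, bandwidth_hz: float, snr_linear: float) -> float:
    """Ideal-case MIMO capacity: C = min(n, m) * W * log2(1 + SNR)."""
    return min(n, m) * siso_capacity(bandwidth_hz, snr_linear)


snr = 10 ** (20 / 10)   # assumed 20 dB SNR, converted to a linear ratio
w = 20e6                # assumed 20 MHz bandwidth
c_single = siso_capacity(w, snr)
c_4x4 = mimo_capacity(4, 4, w, snr)
print(f"single antenna: {c_single / 1e6:.1f} Mbps, 4x4 MIMO: {c_4x4 / 1e6:.1f} Mbps")
```

The ideal 4x4 case is exactly four times the single-antenna capacity, which is the linear increase the slide refers to.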

Computation in MIMO receiver
- Amount of computation in a MIMO receiver
  - $M$: number of Tx/Rx antennas
  - $L_T$: length of preamble
  - $L_P$: length of payload
- 4 Tx/Rx antennas, 100 Mbps, 64 QAM, 1/2 coding rate
  - ~6 × 10^8 computations/sec

LDPC code
- Low Density Parity Check (LDPC) code
  - Turbo-code-like coding gain with lower implementation cost
- Encoding
  - Matrix multiplication, c = xG
  - G (generator matrix) is a large matrix (e.g. a 4K × 4K matrix)
- Decoding
  - Equivalent to finding the most probable vector x such that Hx mod 2 = 0
  - H (parity check matrix) is a large sparse matrix
- Implementation
  - There is a trade-off between coding gain and implementation complexity
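The encoding and parity-check relations can be illustrated on a toy code. The sketch below uses the (7,4) Hamming code as a stand-in: it is neither low-density nor large, so it only demonstrates c = xG and the check H c mod 2 = 0, not LDPC decoding. The matrices and function names are assumptions for illustration.

```python
# Systematic (7,4) Hamming code: G = [I | P], H = [P^T | I]. Stand-in only, not an LDPC code.
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]


def encode(x):
    """Codeword c = xG over GF(2)."""
    return [sum(x[i] * G[i][j] for i in range(len(G))) % 2 for j in range(len(G[0]))]


def syndrome(c):
    """Parity check H c mod 2; all zeros means c is a valid codeword."""
    return [sum(h * bit for h, bit in zip(row, c)) % 2 for row in H]


codeword = encode([1, 0, 1, 1])
corrupted = codeword.copy()
corrupted[2] ^= 1                      # flip one bit to model a channel error
print(codeword, syndrome(codeword), syndrome(corrupted))
```

A valid codeword gives an all-zero syndrome, and a single flipped bit gives a non-zero one; a real LDPC decoder searches iteratively for the most probable vector that satisfies all checks.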

Hybrid ARQ
- Reuses erroneous frames when decoding the retransmitted frame.
- Requires a huge buffer space.

Case Study : W-CDMA system

Major Algorithms

Physical layer of W-CDMA
[Figure: W-CDMA physical layer block diagram, annotated with the role of each stage]
- Error correction
- Overcome severe errors within a short time interval
- Assign the signal waveform that is optimal for data transmission
- Suppress signal components outside the pass band

Channel Encoder/Decoder
- Encoder
  - Adds systematic redundancy to the source data
- Decoder
  - Fixes errors in the received data using the systematic redundancy generated by the encoder
- The W-CDMA system uses
  - Convolutional code (for short voice and control messages)
  - Turbo code (for video streams and high-speed packet data)

Channel Encoder
- Consists of flip-flops and exclusive-OR gates.
- Has negligible impact on workload.
- [Figure: shift register of eight delay elements (D) fed by Input; Output 0 uses G0 = 561 (octal), Output 1 uses G1 = 753 (octal)]
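A shift-register encoder of this kind can be sketched in a few lines. The generator values G0 = 561 and G1 = 753 (octal) come from the slide; the bit ordering (most significant generator bit taps the newest input bit) and the function names are assumptions, and tail bits are left out.

```python
K = 9            # constraint length: current input bit plus eight delay elements
G0 = 0o561       # generator for output 0 (from the slide)
G1 = 0o753       # generator for output 1 (from the slide)


def parity(value: int) -> int:
    """XOR of all bits in value (what the exclusive-OR gates compute)."""
    return bin(value).count("1") & 1


def conv_encode(bits):
    """Rate-1/2 convolutional encoder; emits two output bits per input bit."""
    reg = 0
    out = []
    for b in bits:
        # Assumed convention: the newest bit enters at the MSB, older bits shift right.
        reg = (b << (K - 1)) | (reg >> 1)
        out.append(parity(reg & G0))
        out.append(parity(reg & G1))
    return out


# A single 1 followed by zeros reads out the generator taps, MSB first.
impulse_response = conv_encode([1] + [0] * (K - 1))
print(impulse_response[0::2], impulse_response[1::2])
```

The encoder is only shifts and XORs, which is consistent with the slide's point that it contributes little to the workload.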

Channel Decoder
- Determines the most probable code sequence from the received sequence.
- Selects the code C_i with the minimum distance to the received sequence r.
- One of the dominant workloads.
- [Figure: code set {C_i} = C_1, C_2, ..., C_N; received signal r; distances d_1, d_2, ..., d_N between r and each code]

Channel Decoder – Viterbi Algorithm
- The most popular decoding algorithm for convolutional codes.
- Consists of three steps:
  - Branch metric calculation (BMC): abs(a-b), parallelizable
  - Add compare select (ACS): min(a+b, c+d), parallelizable
  - Trace back (TB): recursive pointer tracing, sequential
- Amount of operation in W-CDMA
  - 16 Kbps voice: ~2 Gops
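The three steps can be seen together in a small hard-decision decoder. To keep the sketch short it uses a K = 3 code with generators 7 and 5 (octal) as a stand-in for the K = 9 W-CDMA code; the bit ordering and all names are assumptions for illustration.

```python
K = 3                       # stand-in constraint length (W-CDMA uses K = 9)
GENS = (0o7, 0o5)           # stand-in generators, not the W-CDMA ones
N_STATES = 1 << (K - 1)
INF = float("inf")


def parity(value):
    return bin(value).count("1") & 1


def step(state, bit):
    """One encoder transition: returns (next_state, output_bits)."""
    reg = (bit << (K - 1)) | state
    return reg >> 1, tuple(parity(reg & g) for g in GENS)


def encode(bits):
    state, out = 0, []
    for b in bits:
        state, o = step(state, b)
        out.extend(o)
    return out


def viterbi_decode(received):
    n_out = len(GENS)
    metric = [0] + [INF] * (N_STATES - 1)        # encoder starts in state 0
    history = []
    for t in range(len(received) // n_out):
        r = received[n_out * t: n_out * (t + 1)]
        new_metric = [INF] * N_STATES
        choice = [None] * N_STATES
        for s in range(N_STATES):
            if metric[s] == INF:
                continue
            for bit in (0, 1):
                nxt, out = step(s, bit)
                bm = sum(abs(a - b) for a, b in zip(r, out))   # BMC
                cand = metric[s] + bm                           # ACS: add
                if cand < new_metric[nxt]:                      # ACS: compare, select
                    new_metric[nxt] = cand
                    choice[nxt] = (s, bit)
        metric = new_metric
        history.append(choice)
    # TB: follow the stored pointers back from the best final state.
    state = min(range(N_STATES), key=lambda s: metric[s])
    decoded = []
    for choice in reversed(history):
        state, bit = choice[state]
        decoded.append(bit)
    return decoded[::-1]


message = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
coded = encode(message)
noisy = coded.copy()
noisy[3] ^= 1                                    # one channel error
print(viterbi_decode(noisy) == message)
```

The inner loops over states are independent of one another, which is why BMC and ACS map well to vector hardware, while the trace-back loop depends on the previous iteration and stays sequential.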

Channel Decoder – Turbo decoder
- Two algorithms are widely used
  - SOVA (Soft Output Viterbi Algorithm)
    - Less computation intensive
    - Lower error correction performance
  - Max-LogMAP algorithm
    - More computation required
    - Higher error correction performance
- Amount of operation in W-CDMA
  - For 128 Kbps streaming data: ~18 Gops

Turbo Decoder
- Based on multiple iterations of SOVA / Max-LogMAP blocks.
- More iterations give better performance.

Block Interleaver/Deinterleaver
- Overcomes severe signal attenuation within a short time interval, which frequently appears on wireless channels.
- Interleaver: randomizes the sequence of source data.
- Deinterleaver: recovers the original sequence by reordering.
- Amount of operation: < 10 Mops
- [Figure: interleaving and deinterleaving]

Spreader/Despreader
- Allows several signals to be transmitted at the same time (x[n] and y[n] in the diagram).
- Based on the orthogonality between spreading codes.

Spreader/Despreader
- The spreader/despreader also suppresses noise.
- Amount of operation: ~4 Gops
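The role of code orthogonality can be shown with two short spreading codes. This is a minimal sketch: the length-4 codes, the ±1 symbol values, and the function names are illustrative assumptions, not the codes used in W-CDMA.

```python
# Two orthogonal length-4 codes (their element-wise product sums to zero).
CODE_X = [1, 1, 1, 1]
CODE_Y = [1, -1, 1, -1]


def spread(symbols, code):
    """Multiply each symbol by every chip of the spreading code."""
    return [s * c for s in symbols for c in code]


def despread(chips, code):
    """Correlate each code-length block with the code and normalize."""
    n = len(code)
    return [
        sum(chips[i + k] * code[k] for k in range(n)) // n
        for i in range(0, len(chips), n)
    ]


x = [1, -1, 1]
y = [-1, -1, 1]
# Both signals share the channel at the same time.
channel = [a + b for a, b in zip(spread(x, CODE_X), spread(y, CODE_Y))]
print(despread(channel, CODE_X), despread(channel, CODE_Y))
```

Correlating with one code cancels the contribution of the other, so each receiver recovers its own symbols from the summed signal.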

Scrambler/Descrambler
- Randomizes the output signal by multiplying it with a pseudo-random sequence, the so-called scrambling code.
- Allows multiple terminals to communicate at the same time.
- Amount of operation: ~3 Gops
- [Figure: terminal 1 with scrambling code n; terminal 2 with scrambling code m]

Low Pass Filter
- Suppresses signal components outside the pass band (that is, in the stop band).
- [Figure: filtering in the time and frequency domains. Labels: impulse signal, sinc function, band-limited signal, band-unlimited signal]

Low Pass Filter
- Uses a conventional FIR filter.
- Number of filter taps (N) = 32 ~ 64
- Amount of operation: ~12 Gops
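A direct-form FIR filter is a sum of N multiply-accumulate operations per output sample, which is why the tap count drives the workload. The sketch below uses a 4-tap moving average as an assumed stand-in for the 32 to 64 tap filter on the slide; the coefficients and names are illustrative only.

```python
def fir_filter(samples, taps):
    """Direct-form FIR: y[n] = sum over k of taps[k] * x[n - k]."""
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:              # samples before the start are treated as zero
                acc += h * samples[n - k]
        out.append(acc)
    return out


taps = [0.25] * 4                       # assumed coefficients: 4-tap moving average
impulse_out = fir_filter([1, 0, 0, 0, 0, 0], taps)
step_out = fir_filter([1] * 8, taps)
print(impulse_out, step_out[-1])
```

Feeding an impulse returns the tap values themselves, and a constant input settles to the sum of the taps, which is the low-pass behaviour in its simplest form.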

Rake Receiver – Multipath fading
- The rake receiver mitigates the multipath fading effect.
- Multipath fading is a major cause of unreliable wireless channel characteristics.
- [Figure: the transmitted signal x(t) arriving over one, two, and three paths]
  - $y(t) = a_0 x(t)$
  - $y(t) = a_0 x(t) + a_1 x(t - d_1)$
  - $y(t) = a_0 x(t) + a_1 x(t - d_1) + a_2 x(t - d_2)$

Rake Receiver - Functions
- Ideally, the function of the rake receiver is to aggregate the signal terms with proper delay compensation.
  - Input: $y(t) = a_0 x(t) + a_1 x(t - d_1) + a_2 x(t - d_2)$
  - Output: $r(t) = a_0 x(t - t_{delay}) + a_1 x(t - d_1 - d_{est1}) + a_2 x(t - d_2 - d_{est2}) = (a_0 + a_1 + a_2)\, x(t - t_{delay})$
- We need to know the delay spread of the received signal, which varies randomly.

Rake Receiver – Detect Delay Spread
- Scan the received signal in the frame buffer while computing its correlation with the scrambling code sequence.
- [Figure: correlation window sliding over the received signal; the correlation result shows peaks a_0, a_1, a_2 at delays 0, d_1, d_2]
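The scan can be sketched as a sliding correlation. The example below is a toy: it uses a 13-chip Barker sequence as an assumed stand-in for the scrambling code, two paths with amplitudes 2 and 1, and a hand-picked threshold; none of these values come from the slides.

```python
# Assumed stand-in for the scrambling code: 13-chip Barker sequence.
CODE = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]


def correlate(received, code, offset):
    """Correlation of the code with the received signal at one window position."""
    return sum(received[offset + k] * code[k] for k in range(len(code)))


def find_paths(received, code, threshold):
    """Slide the correlation window over the buffer and keep the strong peaks."""
    paths = []
    for offset in range(len(received) - len(code) + 1):
        value = correlate(received, code, offset)
        if abs(value) >= threshold:
            paths.append((offset, value))
    return paths


# Two-path channel: amplitude 2 at delay 0 and amplitude 1 at delay 5 (assumed values).
received = [0] * 20
for k, chip in enumerate(CODE):
    received[k] += 2 * chip
    received[k + 5] += chip

paths = find_paths(received, CODE, threshold=10)
print(paths)
```

Each detected offset corresponds to one multipath component, and the peak height is proportional to that path's amplitude.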

Computation of Rake Receiver
- Correlation computation: $L_W \cdot L_B \cdot F$
  - $L_W$: correlation window = 320
  - $L_B$: frame buffer size = 5120
  - $F$: operation frequency = 50
  - ~80 mega multiplications/sec
  - Multiplications can be converted into subtractions
- Amount of operation in W-CDMA: ~25 Gops
- Most dominant workload
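The multiplication count follows directly from the three parameters. The short check below assumes that F is the number of full buffer scans per second, which the slide does not state explicitly.

```python
L_W = 320      # correlation window length
L_B = 5120     # frame buffer size
F = 50         # operation frequency (assumed: scans per second)

# One multiplication per window element, per buffer position, per scan.
mults_per_second = L_W * L_B * F
print(f"{mults_per_second:,} multiplications/sec")
```

The product is 81,920,000, which matches the slide's figure of roughly 80 mega multiplications per second.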

Rake Receiver – Overall Architecture
[Figure: rake receiver block diagram with three stages]
- Detects the delay spread
- Compensates for the propagation delay
- Recombines the signal terms without delay

Power Control
- The receiver controls the transmission power of the transmitter in order to minimize interference to other users.
- The required computation is negligible.
- [Figure: terminal and basestation; pilot signal strength plotted against the reference level, with the power control command sequence u d u u d d u]
  - Pilot signal strength below the reference level: the terminal sends an UP command (u)
  - Pilot signal strength above the reference level: the terminal sends a DOWN command (d)
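The up/down rule amounts to one comparison per command, which is why its cost is negligible. The sketch below is a toy closed loop; the 1 dB step, the fixed channel gain, the reference level, and the choice to send DOWN when the pilot equals the reference are all assumptions made for illustration.

```python
def power_control_command(pilot_strength_db, reference_level_db):
    """'u' (UP) if the pilot is below the reference level, otherwise 'd' (DOWN)."""
    return "u" if pilot_strength_db < reference_level_db else "d"


def apply_command(tx_power_db, command, step_db=1.0):
    """The transmitter raises or lowers its power by one step (assumed 1 dB)."""
    return tx_power_db + step_db if command == "u" else tx_power_db - step_db


def run_loop(n_steps, tx_power_db=0.0, channel_gain_db=-90.0, reference_level_db=-85.0):
    commands = []
    for _ in range(n_steps):
        pilot = tx_power_db + channel_gain_db          # received pilot strength
        cmd = power_control_command(pilot, reference_level_db)
        tx_power_db = apply_command(tx_power_db, cmd)
        commands.append(cmd)
    return "".join(commands)


print(run_loop(8))
```

With a static channel the loop climbs to the reference level and then alternates between UP and DOWN around it.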

H/W operation states
- [Figure: radio resource control states defined in the W-CDMA specification, alongside operation states defined according to H/W activity]
- Idle
  - For long idle periods between sessions
  - Periodic wake-up for control message reception
  - Minimum workload, but dominates terminal standby time
- Control Hold
  - For short idle periods between packet bursts
  - Holds a narrow control channel for fast transition to Active
  - Intermediate workload
- Active
  - For packet burst transmission periods
  - Uses high-speed packet channels up to 2 Mbps
  - Most heavily loaded state

Workload Characterization

Workload Profile
- One operation is equivalent to one RISC instruction.
- Searcher, turbo decoder, and LPF are the dominant workloads.
- The workload profile varies according to the operation state.

Processing Time Requirement
- A mixture of algorithms with various processing time requirements
- Classified into two categories
  - Heavy workload with long processing time (turbo decoder, searcher)
  - Light workload with short processing time (scrambler, spreader, LPF, power control)

Parallelism
- Most heavy-workload algorithms have significant vector parallelism.
- The data width of most operations is 8 bits.

Memory Access Pattern
- A huge memory is not required.
- Traffic between algorithms is not dominant.
- The access rate of the scratch pad memory is very high.

Instruction Breakdown
- ADD/SUB are the dominant instructions.
- Multiplication is not dominant in heavy workloads.

Frequent Computations
- Most multiplications are simplified into cheaper operations.
- Multiplication in LPF-Rx cannot be simplified because both operands are 16-bit integers.

Architectural Implications

Architectural Implications
- SIMD, because
  - We can exploit vector parallelism in W-CDMA algorithms
  - High power efficiency can be achieved by sharing control logic between datapath elements
- Chip multiprocessor, because
  - There is substantial algorithm-level parallelism
  - There are many tiny sequential algorithms
  - Multiple SIMD + scalar
- [Figure: SIMD ... scalar processing elements connected by an interconnection network]

Architectural Implications
- Memory structure
  - Cache-free: the memory access pattern exhibits very dense spatial locality
  - Small data memory (<64K)
  - Small instruction memory (<4K)
- Simple interconnection network
  - Low inter-processor communication is possible through algorithm-level task mapping onto each PE

Architectural Implications
- Power management
  - Large workload variation according to operation state and radio channel condition changes
  - Various power management schemes can be applied: DVS, DFS, clock gating
  - Idle mode power must be minimized because it dominates terminal standby time

W-CDMA benchmark suite
- C-based implementation of W-CDMA physical layer operations.
- Used for the workload characterization done in this paper.
- Available at

Conclusion
- We discussed:
  - what SDR is and why it is a challenging topic for embedded systems
  - the evolution of wireless protocols and the core technologies of emerging protocols
- We analyzed:
  - the workload characteristics of the W-CDMA protocol and their architectural implications

Backup Slides

Viterbi Algorithm – Trellis Diagram
- The Viterbi algorithm is based on the trellis diagram.
- The trellis diagram represents all possible state transitions of the encoder.

Viterbi Algorithm - BMC
- The BMC (branch metric calculation) operation computes the difference between the received sequence r and the outputs of the trellis diagram:
  $\mathrm{BMC}_{i,j} = \mathrm{distance}(r_{ij}, o_{ij}) = |r_{ij} - o_{ij}|$
  - $o_{ij}$: output of the state transition from i to j
  - $r_{ij}$: corresponding received sequence
- All BMC operations in a trellis diagram can be done in parallel.
- Example: the distance between r = (01) and $C_n$ = (10) is 2.

Viterbi Algorithm - ACS
- The ACS (add compare select) operation is equivalent to finding a locally optimal code sequence.
- If C_1 has the smallest ACS value at node state i, then the ACS values of C_2 and C_3 are always greater than that of C_1.
- [Figure: add, compare, and select steps on the trellis]

Viterbi Algorithm - TB
- Traces back the code sequence that is closest to the received sequence.
- Sequential algorithm.

Block Interleaver/Deinterleaver
- Interleaver
  - Write row by row sequentially
  - Read column by column according to the predefined permutation pattern
- Deinterleaver
  - Write column by column according to the predefined permutation pattern
  - Read row by row sequentially
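The write and read orders above can be sketched directly. The 3 × 4 block size and the column permutation [0, 2, 1, 3] are assumptions for illustration, not the W-CDMA pattern, and the function names are hypothetical.

```python
def interleave(data, n_rows, n_cols, col_perm):
    """Write row by row, then read column by column in the permuted column order."""
    assert len(data) == n_rows * n_cols
    rows = [data[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)]
    return [rows[r][c] for c in col_perm for r in range(n_rows)]


def deinterleave(data, n_rows, n_cols, col_perm):
    """Write column by column in the permuted order, then read row by row."""
    assert len(data) == n_rows * n_cols
    rows = [[None] * n_cols for _ in range(n_rows)]
    stream = iter(data)
    for c in col_perm:
        for r in range(n_rows):
            rows[r][c] = next(stream)
    return [value for row in rows for value in row]


source = list(range(12))
perm = [0, 2, 1, 3]                    # assumed permutation pattern
mixed = interleave(source, 3, 4, perm)
print(mixed, deinterleave(mixed, 3, 4, perm) == source)
```

Neighbouring source symbols end up several positions apart after interleaving, so a short burst of channel errors is spread across the block once the receiver restores the order.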