Wireless Networking and Communications Group


Reducing Complexity in Signal Processing Algorithms for Communication Receiver and Image Display Software
Prof. Brian L. Evans
27 July 2010, Seminar at the American University of Beirut

Outline
  Embedded digital systems
  Generating sinusoidal waveforms
  Discrete-time filters
  Multicarrier equalizers
  Image halftoning algorithms
  Conclusion
[Timeline graphic: 2004 to 2010]

Embedded Digital Systems
Often work on application-specific tasks
In consumer products (2008 units)
  1200M cell phones
  70M DSL modems
  300M PCs
  55M cars/light trucks
  100M digital cameras
  30M gaming consoles (2007)
  100M DVD players
  iPhone has six programmable processors (2008)
Embedded programmable processors
  Inexpensive with small area and volume
  Predictable off-chip input/output (I/O) rates
  "Low" power (TI C5504: 45 mW at 300 MHz)
  Limited on-chip memory
  Fixed-point arithmetic

Embedded Digital Systems
Memory access in processors
  External I/O: block data transfers to/from on-chip memory
  Internal I/O: on-chip memory to CPU registers using data buses (e.g. TI C6000 processor has two 32-bit data buses)
Common word sizes for signal processing software
  64-bit floating-point for desktop computing (e.g. Matlab)
  32-bit floating-point for pro-audio and sonar beamforming
  16-bit fixed-point for speech, consumer audio, image processing
IEEE floating-point operations
  Handle many special cases (e.g. +∞, -∞ and not a number)
  Add, multiply, divide have comparable hardware complexity

Embedded Digital Systems
Fixed-point operations
  Multiplication based on addition operations
  Division takes 1-2 instructions per bit of accuracy
Multiplication can consume much dynamic power
  Truncate constants for power savings [Han, Evans & Swartzlander, 2005]
[Figure: multiplier used in TI C64 processors, labeled 56%]

Generating Sinusoidal Waveforms
Sample continuous-time cosine signal at rate fs
  Discrete-time fixed frequency ω0 = 2π f0 / fs
  Example: f0 = 1200 Hz and fs = 8000 Hz, so ω0 = (3/10) π
  Discrete-time realization drops fs term in front of cosine
Math library call to cos function in C
  Uses double-precision floating-point arithmetic
  No standard in C for internal implementation
  Generally meant for high-accuracy desktop calculations
Call to gsl_sf_cos_e in GNU Scientific Library 1.8
  20 multiply, 30 add, 2 divide, 2 power calculations per output
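The frequency conversion in the example above can be checked with a short sketch. It uses Python's math library cosine in place of the C library call discussed on the slide; the function names are hypothetical and chosen only for illustration.

```python
import math

def discrete_frequency(f0_hz, fs_hz):
    """Return omega0 = 2*pi*f0/fs in radians per sample."""
    return 2.0 * math.pi * f0_hz / fs_hz

def cosine_samples(f0_hz, fs_hz, num_samples):
    """Sample cos(2*pi*f0*t) at t = n/fs with a math library call per sample."""
    omega0 = discrete_frequency(f0_hz, fs_hz)
    return [math.cos(omega0 * n) for n in range(num_samples)]

# Slide example: f0 = 1200 Hz and fs = 8000 Hz give omega0 = (3/10) pi.
omega0 = discrete_frequency(1200.0, 8000.0)
samples = cosine_samples(1200.0, 8000.0, 20)
```

Each output sample costs one full library call here, which is the expense the recursion and lookup table methods try to avoid.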

Generating Sinusoidal Waveforms
Difference equation with input x[n] and output y[n]
  y[n] = (2 cos ω0) y[n-1] - y[n-2] + x[n] - (cos ω0) x[n-1]
  From inverse z-transform of z-transform of cos(ω0 n) u[n]
  Impulse response gives cos(ω0 n) u[n]
  Initial conditions are all zero
  2 multiplications and 3 adds per output value
  Buildup in error as n increases due to feedback
Lookup table: pre-compute samples offline
  Discrete-time frequency ω0 = 2π f0 / fs = 2π N / L
  All common factors between integers N and L removed
  Period: ω0 n = 2π k = 2π (N / L) n, first satisfied at n = L, so store L samples
  Entries in either floating-point or fixed-point format
  Table would contain N periods of the cosine
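A minimal sketch of both methods, assuming double-precision floating point and an impulse input to the difference equation. The function names are hypothetical; N = 3 and L = 20 follow from reducing 1200/8000 for the earlier example frequency.

```python
import math

def cosine_by_recursion(omega0, num_samples):
    """Impulse response of y[n] = 2cos(w0) y[n-1] - y[n-2] + x[n] - cos(w0) x[n-1]."""
    c = math.cos(omega0)          # computed once, offline
    y1 = y2 = 0.0                 # y[n-1], y[n-2]: zero initial conditions
    x1 = 0.0                      # x[n-1]
    out = []
    for n in range(num_samples):
        x0 = 1.0 if n == 0 else 0.0   # unit impulse input
        y0 = 2.0 * c * y1 - y2 + x0 - c * x1
        out.append(y0)
        y2, y1, x1 = y1, y0, x0
    return out

def cosine_lookup_table(n_cycles, length):
    """Store L samples of cos(2*pi*(N/L)*n) after removing common factors."""
    g = math.gcd(n_cycles, length)
    n_cycles, length = n_cycles // g, length // g
    return [math.cos(2.0 * math.pi * n_cycles * n / length) for n in range(length)]

def cosine_by_lookup(table, num_samples):
    """Read the table circularly; no arithmetic on the sample values."""
    return [table[n % len(table)] for n in range(num_samples)]

table = cosine_lookup_table(1200, 8000)        # reduces to N = 3, L = 20
recursion = cosine_by_recursion(0.3 * math.pi, 50)
lookup = cosine_by_lookup(table, 50)
```

In floating point the recursion tracks the library cosine closely over short runs; the error buildup noted above becomes more visible for long runs or fixed-point coefficients.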

Generating Sinusoidal Waveforms
Signal quality vs. implementation complexity in generating cos(ω0 n) u[n] with ω0 = 2π N / L

Method                MACs/sample  ROM (words)  RAM (words)  Quality in floating pt.  Quality in fixed pt.
C math library call   30           22           1            Second best              N/A
Difference equation   2                         3            Worst
Lookup table                       L                         Best

MAC: multiplication-accumulation
RAM: random access memory (writeable)
ROM: read-only memory

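Discrete-Time Filters
Finite impulse response (FIR) filter
  Impulse response h[k] has finite extent k = 0, …, M-1
  Discrete-time convolution: $y[k] = \sum_{m=0}^{M-1} h[m]\, x[k-m]$
[Block diagram: tapped delay line. Input x[k] passes through M-1 unit delays (z^-1); the taps x[k], x[k-1], … are weighted by h[0], h[1], h[2], …, h[M-1] and summed (Σ) to give y[k]]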
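The tapped delay line can be sketched directly in code. This is an illustrative floating-point version that assumes the input is zero before k = 0; the function name is hypothetical.

```python
def fir_filter(h, x):
    """Compute y[k] = sum over m of h[m] * x[k-m], with x[k] = 0 for k < 0."""
    delay_line = [0.0] * len(h)       # holds x[k], x[k-1], ..., x[k-M+1]
    y = []
    for sample in x:
        # Shift the newest sample in and drop the oldest one.
        delay_line = [sample] + delay_line[:-1]
        # One multiply-accumulate per tap.
        y.append(sum(hm * xm for hm, xm in zip(h, delay_line)))
    return y

averaged = fir_filter([0.5, 0.5], [1.0, 1.0, 1.0])        # two-tap averager
impulse_response = fir_filter([1.0, 2.0, 3.0], [1.0, 0.0, 0.0, 0.0])
```

Feeding a unit impulse returns the coefficients themselves, which is a quick check on the tap ordering.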

Discrete-Time Filters
Infinite impulse response (IIR) filter
  Biquad building block: 2 poles and 0-2 zeros
  Biquad is short for biquadratic: transfer function is ratio of two quadratic polynomials
  Generally, coefficients a1, a2, b0, b1, b2 are real-valued
[Block diagram: biquad with input x[k], internal signal v[k], unit delays giving v[k-1] and v[k-2], feedback coefficients a1 and a2, feedforward coefficients b0, b1 and b2, summers (Σ), and output y[k]]
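A sketch of one biquad section using the signal names from the diagram. The sign convention is an assumption (feedback terms are added, so the denominator is 1 - a1 z^-1 - a2 z^-2); other texts negate a1 and a2. The function name is hypothetical.

```python
import math

def biquad(b, a, x):
    """One biquad section with state v[k].

    Assumed convention:
      v[k] = x[k] + a1 v[k-1] + a2 v[k-2]
      y[k] = b0 v[k] + b1 v[k-1] + b2 v[k-2]
    """
    b0, b1, b2 = b
    a1, a2 = a
    v1 = v2 = 0.0                     # v[k-1], v[k-2]
    y = []
    for xk in x:
        vk = xk + a1 * v1 + a2 * v2   # feedback (poles)
        y.append(b0 * vk + b1 * v1 + b2 * v2)   # feedforward (zeros)
        v2, v1 = v1, vk
    return y

# The cosine difference equation is itself a biquad under this convention:
# b = (1, -cos w0, 0) and a = (2 cos w0, -1).
w0 = 0.3 * math.pi
impulse = [1.0] + [0.0] * 29
oscillator = biquad((1.0, -math.cos(w0), 0.0), (2.0 * math.cos(w0), -1.0), impulse)
```

With a1 = a2 = 0 the section reduces to a three-tap FIR filter, so the same routine covers the 0 to 2 zero cases.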

Discrete-Time Filters: FIR Filters vs. IIR Filters
Implementation complexity (1)
  FIR: Higher
  IIR: Lower (sometimes by factor of four)
Minimum order design
  FIR: Parks-McClellan (Remez exchange) algorithm (2)
  IIR: Elliptic design algorithm
Stable?
  FIR: Always
  IIR: May become unstable when implemented (3)
Linear phase
  FIR: If impulse response is symmetric or anti-symmetric about midpoint
  IIR: No, but phase may be made approximately linear over passband (or other band)
(1) For same piecewise constant magnitude specification
(2) Algorithm by Kaiser to estimate minimum order for Parks-McClellan algorithm may be off by 10%. Search for minimum order is often needed.
(3) Algorithms can tune design to implementation target to minimize risk

Discrete-Time Filters
Keep roots computed by filter design algorithms
  Polynomial deflation (rooting) reliable in floating-point
  Polynomial inflation (expansion) may degrade roots
Choice of IIR filter structure matters
  Direct form IIR structures expand zeros and poles, and may become unstable for large order filters (order > 12)
  Cascade of biquads expands zeros and poles in each biquad
Minimum order design not always most efficient
  Efficiency depends on target implementation
  Consider power-of-two coefficient design
  Efficient designs may require search of infinite design space
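One way to see why implementation can destabilize an IIR design is to quantize the feedback coefficients of a single biquad and re-check stability. The pole location, word lengths and function names below are assumptions chosen for illustration, as is the convention that the denominator is 1 - a1 z^-1 - a2 z^-2. Stability is tested with the standard second-order stability triangle.

```python
import math

def quantize(value, fractional_bits):
    """Round to the nearest multiple of 2**-fractional_bits."""
    scale = 1 << fractional_bits
    return round(value * scale) / scale

def biquad_is_stable(a1, a2):
    """True if both poles of 1 / (1 - a1 z^-1 - a2 z^-2) lie strictly inside the unit circle."""
    c1, c2 = -a1, -a2                 # denominator z^2 + c1 z + c2
    return abs(c2) < 1.0 and abs(c1) < 1.0 + c2

# Hypothetical pole pair close to the unit circle.
radius, angle = 0.999, 0.3 * math.pi
a1 = 2.0 * radius * math.cos(angle)
a2 = -radius * radius

stable_exact = biquad_is_stable(a1, a2)
stable_8_bits = biquad_is_stable(quantize(a1, 8), quantize(a2, 8))
stable_4_bits = biquad_is_stable(quantize(a1, 4), quantize(a2, 4))
```

With 4 fractional bits, a2 rounds to -1 and the poles land on the unit circle, so the section is no longer strictly stable. Poles of a high-order direct form are more sensitive still, because every coefficient depends on all the roots.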

Halftime: AUB Summer 2005 EECE 503 Real-Time DSP Lab
  Embedded digital systems
  Generating sinusoidal waveforms
  Discrete-time filters
  Multicarrier equalizers
  Image halftoning algorithms
  Conclusion

Channel Equalization
Channel degrades transmitted signal
  Nonlinear distortion, e.g. amplitude nonlinearities
  Linear distortion, e.g. convolution by channel impulse response
  Additive noise, e.g. thermal (Gaussian) and impulsive
Equalization compensates linear distortion
  Spreading/attenuation in time
  Magnitude/phase distortion in frequency
[Block diagram: message bit stream, Transmitter, Channel, Receiver with Equalizer, received bit stream]

Multicarrier Modulation
Divide channel into narrowband subchannels
Discrete multitone modulation
  Baseband transmission based on fast Fourier transform (FFT)
  Each subchannel carries single-carrier transmission
  Standardized for digital subscriber line (DSL) communication
  Subchannels are 4.3 kHz wide in DSL systems
[Figure: channel magnitude versus frequency, divided into subchannels, each with its own carrier]

Channel Equalization
Equalization in DSL receivers increases bit rate by 10x
Equalizer
  Shortens channel impulse response (time domain equalization)
  Compensates phase/magnitude distortion (frequency domain equalization)
Single carrier system: g is scalar constant
  FIR filter w performs time and frequency domain equalization
Multicarrier system: g is FIR filter of length n+1
  Time domain equalizer (w), then FFT and frequency domain equalizer
[Block diagram, discretized baseband system: training signal xk passes through channel h, noise nk is added to give yk, and equalizer w produces rk; the receiver generates xk and passes it through the ideal channel z^(-Δ) g; the error ek is the difference between the two branch outputs]

Multicarrier Equalization
Maximum shortening SNR time domain equalizer
  Minimize energy leakage outside shortened channel length
  For each position of window Δ [Melsa, Younce & Rohrs, 1996]
  Cholesky decomposition of B leads to optimal eigensolution
  Computationally intensive: O(Lw^3)
  Floating-point multiplications/divisions
  Restricts time domain equalizer (TEQ) length to be less than n+1
[Figure: channel impulse response and effective channel impulse response, with a window of n+1 samples]

Time Domain Equalizer Design
[Figure: bit rate (Mbps) versus training complexity in log10(multiply-add operations), TEQ length of 17]
Data rates averaged over eight standard DSL test lines [Martin et al., 2006]
Most efficient floating-point versions of algorithms used

Time Domain Equalizer Design
Unified framework [Martin et al., 2006]
  A and B are square (Lw × Lw) and depend on choice of Δ
  Constraint prevents trivial non-practical solution w = 0
  Find eigenvector for largest generalized eigenvalue
Iterative methods
  Formulations: power method, alternating Lagrangian
  Division-free
  20 iterations to converge for 17-tap MSSNR TEQ design
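As a rough illustration of finding the eigenvector for the largest generalized eigenvalue, here is a textbook power iteration on the pair (B, A), solving B w = λ A w. It is not the division-free method from the slide: it uses a linear solve and a normalization at every step. Which matrix plays which role, the small test matrices, and all function names are assumptions.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):                  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def matvec(A, v):
    return [sum(a * vi for a, vi in zip(row, v)) for row in A]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def generalized_power_method(A, B, iterations=100):
    """Iterate w <- A^-1 B w, normalized; return (eigenvalue, eigenvector)."""
    w = [1.0] * len(A)
    for _ in range(iterations):
        w = solve(A, matvec(B, w))
        norm = math.sqrt(dot(w, w))               # keeps w away from w = 0
        w = [wi / norm for wi in w]
    eigenvalue = dot(w, matvec(B, w)) / dot(w, matvec(A, w))
    return eigenvalue, w

A = [[2.0, 0.0], [0.0, 1.0]]
B = [[2.0, 1.0], [1.0, 3.0]]
eigenvalue, w = generalized_power_method(A, B)
```

The normalization step plays the role of the constraint that rules out w = 0. The iteration count here is generous and unrelated to the 20 iterations quoted for the 17-tap design.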

Digital Image Halftoning
For display on devices with fewer bits of gray/color resolution than original image
  Grayscale: 8-bit image to 1-bit image
  Color: 24-bit RGB image to 12-bit RGB display
Produces artifacts
Each pixel in original image is 8-bit unsigned intensity in [0, 255]
For display, 0 is black and 255 is white
[Figure: original image x(m) and its halftone b(m) from thresholding at mid-gray]

Quantization with Feedback
Consider 4-bit data on 2-bit display (unsigned)
Feedback quantization error
  For constant input 1001 = 9
  Average output value ¼ (10 + 10 + 10 + 11) = 10.01 in binary, i.e. 1001 on the 4-bit scale
  4-bit resolution at DC!
Noise shaping
  Truncating from 4 to 2 bits increases noise by ~12 dB
  Feedback removes noise at DC and increases high-frequency (HF) noise
[Block diagram: 4-bit input signal words enter an adder; the upper 2 bits of the sum go to the display device and the lower 2 bits are fed back to the adder through a 1 sample delay]

Time   Adder input (upper)   Adder input (lower)   Sum    Output to display
1      1001                  00                    1001   10
2      1001                  01                    1010   10
3      1001                  10                    1011   10
4      1001                  11                    1100   11
Periodic

[Figure: added noise versus frequency f, 12 dB (2 bits)]
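The table can be reproduced with a few lines of integer arithmetic. This sketch assumes unsigned inputs and, like the table, does not saturate the sum; the function name is hypothetical.

```python
def quantize_with_feedback(samples, in_bits=4, out_bits=2):
    """Keep the upper bits for display and feed the dropped lower bits back."""
    drop = in_bits - out_bits
    mask = (1 << drop) - 1
    error = 0                          # delayed lower bits, initially zero
    sums, outputs = [], []
    for s in samples:
        total = s + error              # adder: input word plus previous error
        sums.append(total)
        outputs.append(total >> drop)  # upper bits go to the display
        error = total & mask           # lower bits are delayed one sample
    return sums, outputs

sums, outputs = quantize_with_feedback([0b1001] * 4)
average_output = sum(outputs) / len(outputs)
```

For the constant input 1001, the outputs are 10, 10, 10, 11 and their average is 2.25, which is 9/4: the 4-bit value is recovered on average.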

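Error Diffusion Halftoning
Quantize each pixel
Diffuse filtered quantization error to "future" pixels
[Block diagram: the shaped error is subtracted from input x(m) to give u(m); thresholding u(m) gives halftone b(m); the error e(m) is computed as the difference between b(m) and u(m), shaped by the error filter and fed back]
Error filter weights [Floyd & Steinberg, 1976]
            current pixel   7/16
    3/16    5/16            1/16
[Figures: original image, halftone, halftone spectrum]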
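A minimal sketch of Floyd-Steinberg error diffusion on a grayscale image stored as a list of rows. It assumes a raster scan, a fixed threshold at 128, the error defined as e(m) = b(m) - u(m), and error that falls outside the image being discarded. The function name is hypothetical.

```python
def floyd_steinberg(image, threshold=128):
    """Halftone an 8-bit grayscale image (list of rows) to values 0 or 255."""
    rows, cols = len(image), len(image[0])
    u = [[float(p) for p in row] for row in image]   # quantizer inputs u(m)
    halftone = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            b = 255 if u[r][c] >= threshold else 0   # threshold
            halftone[r][c] = b
            e = b - u[r][c]                          # quantization error
            # Subtract the weighted error from unprocessed ("future") pixels.
            if c + 1 < cols:
                u[r][c + 1] -= e * 7 / 16
            if r + 1 < rows:
                if c > 0:
                    u[r + 1][c - 1] -= e * 3 / 16
                u[r + 1][c] -= e * 5 / 16
                if c + 1 < cols:
                    u[r + 1][c + 1] -= e * 1 / 16
    return halftone

two_pixels = floyd_steinberg([[100, 100]])
```

For the two-pixel row, the first pixel quantizes to black and its error pushes the second pixel above the threshold, so the pair averages closer to the input gray level than plain thresholding would.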

Error Diffusion Halftoning

Artifact         Model       Compensation                              Added complexity
Sharpening       Linear      Sharpness control                         1 multiplication and 1 addition per pixel
False textures   Nonlinear   Deterministic bit flipping quantization   1 comparison per pixel

Deterministic bit flipping (DBF) quantizer [Damera-Venkata & Evans, 2001]
  Thresholds input to black (0) or white (255)
  Flips quantized value about mid-gray (128)
  Reduces false textures in mid-grays
  Implemented with two comparisons
[Plot: DBF(x) versus x, with output level 255 and input breakpoints x1, 128 and x2]

Sharpness Control
Model quantizer as gain plus noise [Kite, Evans & Bovik, 1997]
  Signal transfer function models sharpening: Ks ≈ 2 for Floyd-Steinberg
  Noise transfer function models noise shaping: Kn = 1
[Plots for ideal lowpass H(ω), with axis ticks at -ω1 and ω1]
  Signal transfer function with Ks = 2 (plot level 2): passes low and enhances high frequencies
  Noise transfer function (plot level 1): passes high-frequency noise

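Sharpness Control
Adjust by threshold modulation [Eschbach & Knox, 1991]
  Scale image by gain L and add it to quantizer input
  Flatten signal transfer function [Kite, Evans & Bovik, 2000]
[Block diagram: error diffusion loop with x(m), u(m), b(m) and e(m), plus a path that scales x(m) by gain L and adds it to the quantizer input]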
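A per-pixel sketch of the modified quantizer, assuming the error is still measured against u(m) rather than against the modulated quantizer input, so that adding L x(m) acts only as a shift of the threshold. The function name, the threshold of 128, and the example gain values are illustrative assumptions.

```python
def quantize_with_threshold_modulation(u, x, gain_l, threshold=128):
    """Quantize u + L*x but measure the error against u.

    Returns (b, e), where b is 0 or 255 and e = b - u is the error
    that the error filter would shape and feed back.
    """
    b = 255 if u + gain_l * x >= threshold else 0   # 1 multiply, 1 add per pixel
    e = b - u
    return b, e

plain = quantize_with_threshold_modulation(100.0, 100.0, 0.0)
modulated = quantize_with_threshold_modulation(100.0, 100.0, 1.0)
```

Setting the gain to zero recovers the ordinary threshold quantizer, and the extra cost per pixel is the single multiplication and addition listed for sharpness control.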

Results
[Images: original; Floyd-Steinberg; unsharpened; DBF quantizer]

Conclusion
Processor architecture
  Decrease data sizes to reduce on-chip memory usage and increase data bus efficiency
  Truncate multiplicand constants to reduce power
  Compute signal values by recursion or lookup table
Algorithm design
  Keep offline design results in full precision until end
  Order of calculations matters in implementation
  Exploit problem structure in developing fixed-point algorithms
  Linearize nonlinear systems to leverage linear system methods
Many other ways to reduce complexity exist

Invitations
Panel discussion on graduate studies
  Tomorrow (Wednesday) 1:30 to 2:30 pm in this room (RCR)
  Panelists: Prof. Zaher Dawy (AUB), Prof. Imad El-Hajj (AUB) and Prof. Brian Evans (UT Austin)
IEEE Workshop on Signal Processing Systems
  Early October 2011
  Short walk from the AUB campus
  Organizers include Prof. Magdy Bayoumi (Univ. of Louisiana at Lafayette), Prof. Brian Evans (UT Austin), Dean Ibrahim Hajj (AUB) and Prof. Mohammad Mansour (AUB)

Thank You!

Digital Signal Processors
DSP processor market
  ~1/3 of $25B embedded digital signal processing market
  For comparison, 2007 sales of Pfizer's cholesterol-lowering Lipitor: $13B
[Charts: applications (2007), annual revenue, market share. Source: Forward Concepts]

Introduction: Screening (Masking) Methods
Periodic thresholds to binarize image
  Periodic application leads to aliasing (gridding effect)
  Clustered dot screening is more resistant to ink spread
  Dispersed dot screening has higher spatial resolution
  Blue noise masking uses larger masks (e.g. 1" by 1")
[Figure: clustered dot mask and dispersed dot mask, each a threshold lookup table addressed by index]
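Screening amounts to tiling a small threshold table across the image and comparing each pixel with the threshold at its position. The 2 by 2 dispersed dot thresholds below are an illustrative assumption (a scaled Bayer-style ordering), not the mask from the slide, and the function name is hypothetical.

```python
def screen(image, mask):
    """Binarize an 8-bit grayscale image using a periodically tiled threshold mask."""
    mask_rows, mask_cols = len(mask), len(mask[0])
    return [[255 if pixel > mask[r % mask_rows][c % mask_cols] else 0
             for c, pixel in enumerate(row)]
            for r, row in enumerate(image)]

# Hypothetical 2x2 dispersed dot mask: consecutive thresholds are spread apart.
DISPERSED_DOT_2X2 = [[32, 160],
                     [224, 96]]

mid_gray = [[128] * 4 for _ in range(4)]
screened = screen(mid_gray, DISPERSED_DOT_2X2)
```

A constant mid-gray input produces a checkerboard that repeats with the mask period, which is the periodic structure behind the gridding effect.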

Linear Gain Model for Quantizer
Extend sigma-delta modulation analysis to 2-D
  Linear gain model for quantizer in 1-D [Ardalan and Paulos, 1988]
  Linear gain model for grayscale image [Kite, Evans, Bovik, 1997]
Error diffusion is modeled as linear, shift-invariant
  Signal transfer function (STF): quantizer acts as scalar gain
  Noise transfer function (NTF): quantizer acts as additive noise
[Diagram: quantizer with input u(m) and output b(m) split into a signal path, where us(m) is scaled by gain Ks to give Ks us(m), and a noise path, where noise n(m) is added to un(m) to give un(m) + n(m)]
Simple linear equalizers are easy to implement but enhance noise
Complex equalizers such as the DFE are required (how many taps does a DFE have?)
The computational complexity of a DFE increases very fast with bit rate

Spatial Domain
[Images: original image; threshold at mid-gray; dispersed dot screening; clustered dot screening; Floyd-Steinberg error diffusion; Stucki error diffusion]

Magnitude Spectra
[Images: original image; threshold at mid-gray; dispersed dot screening; clustered dot screening; Floyd-Steinberg error diffusion; Stucki error diffusion]

Human Visual System Modeling
Contrast at particular spatial frequency for visibility
  Bandpass: non-dim backgrounds [Manos & Sakrison, 1974; 1978]
  Lowpass: high-luminance office settings with low-contrast images [Georgeson & G. Sullivan, 1975]
  Exponential decay [Näsänen, 1984]
  Modified lowpass version [e.g. J. Sullivan, Ray & Miller, 1990]
  Angular dependence: cosine function [Sullivan, Miller & Pios, 1993]

Analysis and Modeling: Linear Gain Model for Quantizer
Best linear fit for Ks between quantizer input u(m) and halftone b(m)
  Stable for Floyd-Steinberg
  Can use average value to estimate Ks from only error filter
Sharpening: proportional to Ks [Kite, Evans & Bovik, 2000]
  Value of Ks: Floyd-Steinberg < Stucki < Jarvis
Weighted SNR using unsharpened halftone
  Floyd-Steinberg > Stucki > Jarvis at all viewing distances

Image      Floyd   Stucki   Jarvis
barbara    2.01    3.62     3.76
boats      1.98    4.28     4.93
lena       2.09    4.49     5.32
mandrill   2.03    3.38     3.45
Average    2.03    3.94     4.37
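One plausible reading of "best linear fit" is a least-squares gain through the origin: choose Ks to minimize the sum of (Ks u(m) - b(m))^2 over all pixels. The sketch below assumes that reading, and assumes both signals have been offset and scaled the same way beforehand. The function name and the toy data are hypothetical.

```python
def linear_gain_fit(u, b):
    """Least-squares gain Ks minimizing sum((Ks * u - b)**2)."""
    numerator = sum(ui * bi for ui, bi in zip(u, b))
    denominator = sum(ui * ui for ui in u)
    return numerator / denominator

# Toy data in which the output is exactly twice the input.
ks = linear_gain_fit([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

Applied to the quantizer input and halftone of a real image, a fit of this kind yields a single number per image and error filter, like the entries in the table.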