

10/12/2015
Customizing DSP algorithms does not always mean speed
A look at DFT / FFT issues
Frequency domain version of Lab. 1 FIR operations
M. R. Smith, Electrical and Computer Engineering, University of Calgary, Alberta, Canada

10/12/2015 ENCM Custom DSP -- not necessarily speed Copyright 2 / 37
Overview
Introduction
Industrial example of DFT/FFT
DFT -- FFT theory
Straight application
Proper application
“The KNOW-WHEN” application
Future talks
  The implications on DSP processor architecture
  How are actual DSP processors optimized for FFT operations?

References
Work originally done for “Beta Monitors”, Calgary
Talk first given to AMD FAE Meeting, Santa Clara
Published in Microprocessors and Microsystems: “FFT - fRISCy Fourier Transforms”
Copy made available on the ENCM515 web-site

Testing and using DSP Algorithms
Typical testing pattern -- use something simple
  Simple test of algorithm correctness
  Time signal = sum of sinusoids
  In test: expect, and get, sharp peaks in spectrum
Algorithms used in my research
  DFT -- Discrete Fourier Transform
  FFT -- Fast Fourier Transform
  ARMA -- Autoregressive Moving Average
  Wavelet

Testing and using DSP Algorithms
Typical testing pattern
  Simple test of algorithm correctness
  Time signal = sum of sinusoids
  In test: expect, and get, sharp peaks in spectrum
IN REAL LIFE -- this is not a valid test, as the following example shows. Many people working in the field don’t get the best out of their algorithms because they don’t realize that.
DFT -- Discrete Fourier Transform
  Implemented directly: Order(N x N) operations
  Implemented by FFT: Order(N x log2 N) operations

Industrial Example -- Equipment

Industrial Problem -- Result

Planned Solution -- Theory
Unwanted “noise” on a data set can be removed if the “noise” has particular frequency characteristics
Improvement is obtained by
  Transforming to the frequency domain,
  Cutting out (filtering) the unwanted “noise”, and then
  Inverse transforming to recover the original data form
Actually faster to operate in the frequency domain than the time domain (you can show the algorithms to be equivalent)
Frequency domain -- more memory needed
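The "cutting out" step can be sketched as zeroing a band of spectral bins. This is only an illustration under assumptions: the name `zero_band`, the split real/imaginary arrays, and the inclusive bin range are all hypothetical. For real input data the bins at N-k mirror the bins at k, so both must be zeroed for the inverse transform to stay real.

```c
#include <assert.h>
#include <stddef.h>

/* Zero spectral bins k_lo..k_hi (inclusive) and their mirror images at n-k.
 * zero_band is a hypothetical helper name for this sketch.
 * Requires 1 <= k_lo <= k_hi <= n/2 so that every mirror index is valid. */
void zero_band(double *re, double *im, size_t n, size_t k_lo, size_t k_hi)
{
    assert(re != NULL && im != NULL);
    assert(k_lo >= 1 && k_lo <= k_hi && k_hi <= n / 2);
    for (size_t k = k_lo; k <= k_hi; k++) {
        re[k] = 0.0;
        im[k] = 0.0;
        re[n - k] = 0.0;   /* mirror bin keeps the spectrum conjugate symmetric */
        im[n - k] = 0.0;
    }
}
```

A hard cut like this is the straightforward reading of the theory; it does nothing about any spreading of the noise into neighbouring bins.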

Planned Solution -- Visual Model

What algorithm could be used
Time domain filtering
  300-tap FIR -- same as in Lab. 1, 2 and 3
  N = size of the data (finite; 1024 in the counts below)
  Complexity: Order(N x Tap Length), 1024 * 300 = about 300,000 operations
Frequency domain filtering
  N-sized DFT
  Complexity
    Direct: Order(2 * N * N) = about 2,000,000 operations
    FFT: Order(2 * (N log N)) = about 20,000 operations
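The rounded counts can be checked by hand. The worked arithmetic below assumes N = 1024, a 300-tap filter, and a base-2 logarithm; reading the factor of 2 as one forward plus one inverse transform is also an assumption.

```latex
\begin{align*}
\text{Time domain FIR:}\quad & N \times \text{taps} = 1024 \times 300 = 307{,}200 \approx 3 \times 10^{5} \\
\text{Direct DFT:}\quad & 2N^{2} = 2 \times 1024^{2} = 2{,}097{,}152 \approx 2 \times 10^{6} \\
\text{FFT:}\quad & 2N\log_{2}N = 2 \times 1024 \times 10 = 20{,}480 \approx 2 \times 10^{4}
\end{align*}
```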

Direct DFT and FFT
Time savings -- number of complex multiplications compared for DFT and FFT
[Table: columns N, Direct (DFT), Radix 2 (FFT), % Change; the values did not survive extraction]
Key issue -- How can you handle the memory accesses and operations associated with the complex multiplications of data and Fourier coefficients? -- Data/Instruction fetch conflicts
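The comparison can be regenerated from the textbook formulas: N * N complex multiplications for a direct DFT and (N/2) * log2(N) for a radix-2 FFT, counting trivial twiddle factors as multiplications. The sketch below uses those formulas; the function names are hypothetical, and the results are not claimed to match the slide's original table.

```c
#include <assert.h>

/* Complex multiplications for a direct N-point DFT: N * N. */
unsigned long direct_dft_cmults(unsigned long n)
{
    return n * n;
}

/* Complex multiplications for a radix-2 FFT: (N/2) * log2(N).
 * N must be a power of two; trivial twiddles (multiply by 1) are counted. */
unsigned long radix2_fft_cmults(unsigned long n)
{
    unsigned long stages = 0;

    assert(n >= 2 && (n & (n - 1)) == 0);
    for (unsigned long m = n; m > 1; m >>= 1) {
        stages++;               /* one stage per factor of two */
    }
    return (n / 2) * stages;
}
```

For N = 1024 this gives 1,048,576 against 5,120 complex multiplications.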

Fast DFT algorithm implementation
DFT -- requires Order(N ^ 2) operations
FFT -- divide and conquer principle
  An N-pt DFT can be decimated into two N/2-pt DFTs plus “some twiddling on N terms”
  Then each N/2-pt DFT becomes two N/4-pt DFTs “plus twiddling”
  Then each N/4-pt DFT becomes two N/8-pt DFTs, etc.
  Order(N x log N)
PROVIDED you can handle bit reverse addressing efficiently. This is a crazy FFT addressing issue that must be handled when you store the data after running the FFT algorithm.

FFT -- divide and conquer
Ability to do “complex” BUTTERFLY quickly is needed!

Bit reverse addressing ability -- KEY
[Table: columns INPUT ADDRESS, OUTPUT ADDRESS, NEED; the values did not survive extraction]
Placing the array into the correct memory locations takes “time”
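Without hardware support, the reordering can be done in software. The following is a minimal sketch, assuming split real/imaginary arrays `x` and `y` and a length of exactly 2^bits; the function names are hypothetical. Each index is reversed bit by bit, which is the "time" cost the slide refers to.

```c
#include <assert.h>
#include <stddef.h>

/* Reverse the lowest `bits` bits of index i, e.g. 001 -> 100 for bits = 3. */
size_t bit_reverse(size_t i, unsigned bits)
{
    size_t r = 0;

    for (unsigned b = 0; b < bits; b++) {
        r = (r << 1) | (i & 1u);
        i >>= 1;
    }
    return r;
}

/* Reorder x[] and y[] in place so element i moves to bit_reverse(i).
 * n must equal 2^bits. Each pair is swapped once (only when j > i). */
void bit_reverse_reorder(double *x, double *y, size_t n, unsigned bits)
{
    assert(x != NULL && y != NULL);
    assert(n == ((size_t)1 << bits));
    for (size_t i = 0; i < n; i++) {
        size_t j = bit_reverse(i, bits);
        if (j > i) {
            double tx = x[i];
            double ty = y[i];
            x[i] = x[j];
            y[i] = y[j];
            x[j] = tx;
            y[j] = ty;
        }
    }
}
```

For eight points the resulting order is 0, 4, 2, 6, 1, 5, 3, 7.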

Algorithm -- Different forms
x, y = real/imaginary parts of the input -- one fetched on the J-Bus, the other on the K-Bus?
wr, wi = precalculated cosine/sine values -- J-Bus and K-Bus?
m = log2(N), where N is the number of points (a power of 2)

    n2 = N;
    for (k = 0; k < m; k++) {                /* Outer loop: one pass per FFT stage */
        n1 = n2;
        n2 = n2 / 2;
        ie = N / n1;                         /* Twiddle table step for this stage */
        ia = 1;                              /* As on the slide; assumes 1-based twiddle tables
                                                (use ia = 0 if wr[0], wi[0] hold the zero angle) */
        for (j = 0; j < n2; j++) {           /* Middle loop: one twiddle factor per pass */
            c = wr[ia];
            s = wi[ia];
            ia += ie;
            for (i = j; i < N; i += n1) {    /* Inner loop: butterflies sharing that twiddle */
                l = i + n2;                  /* BUTTERFLY offset */
                xt = x[i] - x[l];            /* Common */
                yt = y[i] - y[l];
                x[i] += x[l];                /* Upper */
                y[i] += y[l];
                x[l] = c * xt + s * yt;      /* Lower */
                y[l] = c * yt - s * xt;
            }
        }
    }

What processors can be used?
CISC -- Complex instruction set processor
  Basic and complex functions
  Control logic requires much real estate
  Many-cycle instructions
DSP -- Digital signal processing chip
  Specifically designed for DSP
  Specialized resources provided
  Dual-cycle instructions (many now single-cycle)
RISC -- Reduced instruction set processor
  Simple instructions done well
  Instructions complete in a single cycle
  Intelligent compiler needed

Real life application of Theory
Take 360 data points
Pad to 512 with zeros to match the size of the algorithm
  Everybody “knows” the FFT is faster when you use a “power of two” number of points
Use standard FFT algorithm
Zero unwanted “noise” components
Use standard inverse FFT
Transform “Angle” measurement to “Volume”
Area inside the hysteresis loop is associated with compressor efficiency

Frequency domain -- filtering
Distortions associated with “edge effects” mean that the frequency domain signal is not clean.
The last point and first point of the data are connected in the discrete domain.
The “cut” will remove more than just the “resonance” components.

Time Domain Result
Channel resonance -- old problem greatly reduced
New distortions evident at edges of data

Real Life versus Theory
Perfect data
  Infinitely long
  Perfectly sampled
Actual data
  Nyquist must be met (sample fast enough to cover signal and noise characteristics)
  Finite length of the data manipulated
  Can be analysed using Fourier theory by treating it as an infinitely long signal multiplied by a square window

Signal Characteristics -- Time/Frequency
[Figure: magnitude plots]

Windowing -- implied and deliberate
Windowing the data in the “TIME” domain spreads the “SPECTRUM”
  MAIN LOBE -- width of the main lobe determines resolution, or how close two similar sized peaks can be placed and still be separated
  SIDE LOBES -- height of the side lobes determines how close a small peak can be placed to a large peak and still be believed to be a “true” peak and not a “false” peak (side lobe)
Choose a window with the narrowest main lobe and smallest side lobes
MRI, seismic, telecommunications all have similar problems
This form of data distortion is often missed by naive users
KEY REFERENCE -- HARRIS -- Proc. IEEE 66, p51, 1978

Windowing occurs -- when?
ALL DATA ANYBODY GATHERS is always windowed
  NO EXCEPTIONS -- finite length in either time or frequency domain
DFT (and many other algorithms) treat data AS CYCLIC
  No problems if the CYCLIC model results in continuous data across the cycles (Nth order continuity is needed: amplitude continuous, slope continuous, 2nd derivative continuous)
  Discontinuities in data cause BIG problems in the frequency domain -- in particular padding with zeros in order to use any DFT algorithm
Some diseases in magnetic resonance imaging (MRI) are mimicked by discontinuity artifacts

How to fix?
Choose a better window
Naturally window
  Take data in a way that the data goes more smoothly to zero at the ends, so that the Nth order continuity requirements are met
Synchronously sample
  Very special case -- and possible for this data set
Use a different DSP algorithm approach
  Not always stable -- MA, AR, ARMA, Burg, wavelet etc.

Windows
W(m) = a0 + a1 cos(2 PI m / N) + a2 cos(4 PI m / N), 0 <= m < N
BEWARE: using -N/2 <= m < N/2 flips the sign of a1
1) Normal (Rect. window): a0 = 1, a1 = 0, a2 = 0
   Rectangular window in time becomes a sinc function (with side lobes) in frequency
2) Simple (Rect + 2 sinusoids): a0 = 0.54, a1 = -0.46, a2 = 0
   Becomes the rectangular window’s sinc function plus two shifted sinc functions. Adjust position and amplitude to compensate for errors
3) Blackman-Harris 3 term -- optimized: a0 = , a1 = , a2 =

Windowing -- 2 cycles
Remember to “window”, NOT cut out, the channel resonance in the frequency domain too!

Natural Window in time domain
1. Rearrange the way you sample so that data “naturally goes to same DC level” near ends
2. Remove DC offset then pad with zeros
Resolution between peaks in the frequency domain is a function of data length.
This example uses 2.5 cycles of the original data sequence
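Step 2 can be sketched as follows, taking the mean of the samples as the "DC offset" (an assumption; the slides do not say how the offset is estimated). The function name `remove_dc_and_pad` is hypothetical. Removing the offset first means the padded zeros continue the data at the level it ends on, rather than introducing a step.

```c
#include <assert.h>
#include <stddef.h>

/* Subtract the mean of in[0..n_in-1], copy the result into out[],
 * then fill out[n_in..n_out-1] with zeros. Requires n_out >= n_in > 0. */
void remove_dc_and_pad(const double *in, size_t n_in, double *out, size_t n_out)
{
    double mean = 0.0;

    assert(in != NULL && out != NULL);
    assert(n_in > 0 && n_out >= n_in);
    for (size_t i = 0; i < n_in; i++) {
        mean += in[i];
    }
    mean /= (double)n_in;
    for (size_t i = 0; i < n_in; i++) {
        out[i] = in[i] - mean;      /* DC offset removed */
    }
    for (size_t i = n_in; i < n_out; i++) {
        out[i] = 0.0;               /* zero padding */
    }
}
```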

Naturally window -- Match ends at “DC”
Not always possible with “real data”
Advantage -- no data distortion occurs when the window gets applied.
  Distortion actually does occur, but is hidden -- see later

Naturally windowed -- frequency domain

Naturally windowed -- time domain

Synchronously Sample the Data
As an engineer, you have to be able to reach back into your “ENCM and ENEL theory” and recognize when this sort of thing is possible and correct!
Not a solution for most data sets
  There must be a “TRUE”, exact, cyclic property present in the original data set.
  Algorithm must be applied “exactly correctly”
Windowing is still there!
  All the windowing distortions are still present -- BUT!!!!!!

Synchronously Sample -- Time/Frequency
SAMPLED AT “ZEROS” IN WINDOW’S SPECTRUM
Have an “exact” number of cycles in the window

Synchronously Sample -- Frequency domain

Synchronously Sample -- Time domain

Synchronously Sample
Not possible for most situations
  There is a “TRUE” cyclic property present in the data
Don’t pad with zeros -- use a 720 pt DFT
  This industrial example: 360 points round the cycle
Would a specialized FFT algorithm improve things?
  (2 x 2 x 3 x 3 x 2 x 5) -- speed much improved
Implemented directly using a specialized 720 point DFT
Customer satisfied with integer implementation on Z80
There are custom versions of FFT available for TigerSHARC
  Very highly parallel -- C:\ProgramFiles\Analog4.5\TS\Examples
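A specialized (mixed-radix) FFT is only attractive when the transform length factors into small primes. The sketch below is a plain trial-division factorizer for checking that; the name `prime_factors` is hypothetical and this is not the customer implementation. Note that the factors listed on the slide multiply out to 360, while 720 carries one more factor of 2.

```c
#include <assert.h>
#include <stddef.h>

/* Write the prime factors of n (ascending, with repeats) into factors[]
 * and return how many were written. Requires n >= 2 and enough room. */
size_t prime_factors(unsigned long n, unsigned long *factors, size_t max_factors)
{
    size_t count = 0;

    assert(factors != NULL && n >= 2);
    for (unsigned long p = 2; p * p <= n; p++) {
        while (n % p == 0) {
            assert(count < max_factors);
            factors[count++] = p;
            n /= p;
        }
    }
    if (n > 1) {                    /* whatever remains is itself prime */
        assert(count < max_factors);
        factors[count++] = n;
    }
    return count;
}
```

For 720 this yields 2, 2, 2, 2, 3, 3, 5, so only radix-2, radix-3 and radix-5 stages would be needed; for 512 it yields nine factors of 2.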

This sort of customization -- NOT NORMALLY POSSIBLE
What are the characteristics of general DSP algorithms?
What needs to be present on a processor to meet those requirements?
  Covered in earlier lecture
  See IEEE Micro Magazine, Dec “How RISCy is DSP”

Overview
Introduction
Industrial example of DFT/FFT
DFT -- FFT theory
Straight application
Proper application
“The KNOW-WHEN” application
Future talks
  The implications on DSP processor architecture
  How are actual DSP processors optimized for FFT operations?