Real-time 1-input 1-output DSP systems

Hard real-time: ALWAYS finish computing the output before the next input arrives.
Soft real-time: it is enough to finish on average.
N-input N-output DSP systems
There is no need to process all N inputs between the Nth input and the (N+1)th. By using a double or cyclic buffer, it is enough to process N inputs before the next N arrive.
Real-time demands algorithms that are O(N), since otherwise, no matter how fast the CPU, for large enough N we won't finish processing in time.
[Figure: double buffer and cyclic buffer]

DFT complexity
DFT: $X_k = \sum_{n=0}^{N-1} x_n W_N^{nk}$
We need to compute N values (k = 0 … N-1), each of which contains N products (n = 0 … N-1). Thus the straightforward DFT takes $N^2$ products.
The DFT is $O(N^2)$, but the Fast Fourier Transform reduces it to $O(N \log N)$. This is not low enough to guarantee real-time for all N, but it is sufficiently low to enable even extremely large Ns (processors are rated by how large an FFT they can perform in real-time).
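As a minimal sketch of the straightforward computation, the double loop below evaluates the sum directly. The function name `naive_dft` and the sign convention $W_N = e^{-2\pi i/N}$ are assumptions made for illustration; the slides do not define $W_N$ explicitly.

```python
import cmath

def naive_dft(x):
    """Direct DFT: X_k = sum over n of x_n * W_N^(n*k).

    Assumes W_N = exp(-2*pi*i/N) and a non-empty input.
    Uses N products for each of the N outputs, so N^2 products in total.
    """
    N = len(x)
    W = cmath.exp(-2j * cmath.pi / N)
    return [sum(x[n] * W ** (n * k) for n in range(N)) for k in range(N)]
```

For N = 4 this performs 16 products; doubling N quadruples the work, which is the growth the FFT avoids.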

Warm-up problem 1
Find the minimum and maximum of N numbers $x_0, x_1, x_2, x_3, \ldots, x_{N-2}, x_{N-1}$.
Finding the minimum alone takes N comparisons, and finding the maximum alone takes N comparisons, but finding both the minimum and the maximum takes only 1½ N comparisons instead of 2 N.
Run over the numbers in pairs, separating them into a larger list and a smaller list: this takes ½ N comparisons.
The maximum must be in the larger list: find it in ½ N comparisons.
The minimum must be in the smaller list: find it in ½ N comparisons.
Altogether 3/2 N comparisons, a 25% savings.
This uses decimation and can be performed in-place.
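The pairing trick can be sketched as follows. The function name `min_max` and the explicit comparison counter are illustrative assumptions. For clarity this version builds two separate lists rather than working in-place.

```python
def min_max(xs):
    """Return (minimum, maximum, comparisons) using the pairing trick."""
    if not xs:
        raise ValueError("need at least one value")
    comparisons = 0
    smaller, larger = [], []
    # Step 1: one comparison per pair splits the data into two lists.
    for i in range(0, len(xs) - 1, 2):
        a, b = xs[i], xs[i + 1]
        comparisons += 1
        if a < b:
            smaller.append(a)
            larger.append(b)
        else:
            smaller.append(b)
            larger.append(a)
    if len(xs) % 2:  # an unpaired last element joins both lists
        smaller.append(xs[-1])
        larger.append(xs[-1])
    # Step 2: the minimum can only be in the smaller list.
    lo = smaller[0]
    for v in smaller[1:]:
        comparisons += 1
        if v < lo:
            lo = v
    # Step 3: the maximum can only be in the larger list.
    hi = larger[0]
    for v in larger[1:]:
        comparisons += 1
        if v > hi:
            hi = v
    return lo, hi, comparisons
```

For 8 inputs the exact count is 4 + 3 + 3 = 10 comparisons, slightly under the 3/2 N = 12 quoted on the slide, which rounds each pass up to ½ N.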

Warm-up problem 2
Multiply two N-digit numbers (w.l.o.g. N binary digits).
Long multiplication takes $N^2$ 1-digit multiplications.
Partitioning the factors reduces this to $\frac{3}{4} N^2$.
We can continue recursively to reduce this to $O(N^{\log_2 3}) \approx O(N^{1.585})$:
3 multiplications, each of N/2 bits
$3^2$ multiplications, each of N/4 bits
...
$3^{\log_2 N}$ multiplications, each of 1 bit
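The slide does not name the method, but the 3-products-per-level recursion is commonly known as Karatsuba multiplication. A minimal sketch for non-negative Python integers follows; the function name `karatsuba` and the 1-bit base case are assumptions for illustration.

```python
def karatsuba(x, y):
    """Multiply non-negative ints using 3 half-size products per level."""
    if x < 2 or y < 2:  # 1-bit (or zero) operand: trivial product
        return x * y
    half = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << half) - 1
    x_hi, x_lo = x >> half, x & mask  # partition each factor into two halves
    y_hi, y_lo = y >> half, y & mask
    low = karatsuba(x_lo, y_lo)
    high = karatsuba(x_hi, y_hi)
    # One extra product replaces the two cross products x_hi*y_lo and x_lo*y_hi.
    cross = karatsuba(x_lo + x_hi, y_lo + y_hi) - low - high
    return (high << (2 * half)) + (cross << half) + low
```

Long multiplication would need four half-size products (low, high, and two cross terms); trading one of them for a few additions is what gives the 3/4 factor at each level.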

Decimation and Partition
Sequence: x0 x1 x2 x3 x4 x5 x6 x7
Partition (MSB sort): x0 x1 x2 x3 (LEFT) and x4 x5 x6 x7 (RIGHT)
Decimation (LSB sort): x0 x2 x4 x6 (EVEN) and x1 x3 x5 x7 (ODD)
Decimation in Time = Partition in Frequency
Partition in Time = Decimation in Frequency
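In code, the two ways of splitting a sequence are just two different slices. The helper names `partition` and `decimate` below are hypothetical and chosen only to mirror the slide.

```python
def partition(x):
    """Split by the most significant index bit: LEFT half and RIGHT half."""
    half = len(x) // 2
    return x[:half], x[half:]

def decimate(x):
    """Split by the least significant index bit: EVEN and ODD samples."""
    return x[0::2], x[1::2]

x = ["x0", "x1", "x2", "x3", "x4", "x5", "x6", "x7"]
left, right = partition(x)
even, odd = decimate(x)
```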

DIT FFT
If the DFT is $O(N^2)$, then the DFT of a half-length signal takes only 1/4 the time, and thus two half-length sequences take half the time.
Can we combine 2 half-DFTs into one big DFT?
Separate the sum in the DFT by decimation of the x values.
We recognize the DFTs of the even and odd sub-sequences.
We have thus made one big DFT into 2 little ones.
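The equations for this slide did not survive in the transcript. One way to write the derivation is sketched below, assuming $W_N = e^{-2\pi i/N}$ (so that $W_N^2 = W_{N/2}$); the symbols $X^E_k$ and $X^O_k$ for the half-length DFTs are notation introduced here for illustration.

```latex
\begin{align*}
X_k &= \sum_{n=0}^{N-1} x_n W_N^{nk} \\
    &= \sum_{m=0}^{N/2-1} x_{2m} W_N^{2mk}
     + \sum_{m=0}^{N/2-1} x_{2m+1} W_N^{(2m+1)k} \\
    &= \sum_{m=0}^{N/2-1} x_{2m} W_{N/2}^{mk}
     + W_N^{k} \sum_{m=0}^{N/2-1} x_{2m+1} W_{N/2}^{mk} \\
    &= X^E_k + W_N^{k} X^O_k
\end{align*}
```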

DIT is PIF
We get further savings by exploiting DIT = PIF.
Compare the frequency values in the 2 partitions. Note that they contain the same products, just with different signs (+ - + - + - + -).
All the odd terms have a - sign!
Combining, we get the "butterfly".
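The butterfly equations themselves are missing from the transcript. A sketch of what they look like, assuming $W_N = e^{-2\pi i/N}$ and writing $X^E_k$, $X^O_k$ (notation introduced here) for the length-$N/2$ DFTs of the even and odd samples:

```latex
\begin{align*}
X_k         &= X^E_k + W_N^{k} X^O_k \\
X_{k+N/2}   &= X^E_k - W_N^{k} X^O_k
\qquad k = 0, \ldots, N/2 - 1
\end{align*}
```

The minus sign follows from $W_N^{k+N/2} = -W_N^{k}$, together with the fact that $X^E_k$ and $X^O_k$ repeat with period $N/2$. One product $W_N^{k} X^O_k$ therefore serves two outputs.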

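What does this mean?
[Figure: a DFT of length N built from two DFTs of length N/2, one on the EVEN samples and one on the ODD samples, with the ODD outputs multiplied by $W_N^k$ for k = 0 ... N/2 - 1, producing the LEFT and RIGHT halves of $X_k$]
We have divided the DFT of length N into 2 DFTs of length N/2, plus a butterfly for each pair of outputs.
This can be used for a recursive FFT implementation.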
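A recursive implementation along these lines might look like the sketch below. The function name `fft_recursive` and the convention $W_N = e^{-2\pi i/N}$ are assumptions; the twiddle factors are computed on the fly here rather than precomputed.

```python
import cmath

def fft_recursive(x):
    """Radix-2 decimation-in-time FFT; len(x) must be a power of 2."""
    N = len(x)
    if N == 0 or N & (N - 1):
        raise ValueError("length must be a power of 2")
    if N == 1:
        return list(x)
    even = fft_recursive(x[0::2])  # DFT of length N/2 on even samples
    odd = fft_recursive(x[1::2])   # DFT of length N/2 on odd samples
    out = [0j] * N
    for k in range(N // 2):
        t = cmath.exp(-2j * cmath.pi * k / N) * odd[k]  # W_N^k times ODD_k
        out[k] = even[k] + t             # LEFT half of the output
        out[k + N // 2] = even[k] - t    # RIGHT half: same product, minus sign
    return out
```

Each level of recursion does N/2 butterflies, and there are log2(N) levels.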

DIT all the way
We have already saved computation, but we needn't stop after splitting the original sequence in two!
Each half-length sub-sequence can be decimated too.
Assuming that N is a power of 2, we continue decimating until we get to the basic N=2 butterfly.

DIT N=8 - step 0

DIT N=8 - step 1

DIT N=8 - step 2

Full DIT for N=8
Don't worry about the order, yet!
$W_2^0 = 1$ and $W_4^1 = W_8^2$

Complexity
We assume that all the Ws are precomputed.
An FFT of length N has $\log_2 N$ layers of butterflies, with ½ N butterflies per layer, each with 1 complex multiply and 2 complex adds (1 add and 1 subtract).
So there are ½ N $\log_2 N$ complex multiplies and N $\log_2 N$ complex adds.
Actually, a lot of these are trivial! The last layer has 1 trivial multiplication, the next to last layer has 2 trivial multiplications, ..., and the first layer has no non-trivial multiplications.
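To make the counts concrete, the small helper below tabulates them for a given N. The name `fft_op_counts` is hypothetical, and trivial multiplications are counted like any other, so the figures are the upper bounds quoted on the slide.

```python
def fft_op_counts(N):
    """Complex operation counts for a radix-2 FFT of length N (a power of 2)."""
    if N < 2 or N & (N - 1):
        raise ValueError("N must be a power of 2, at least 2")
    layers = N.bit_length() - 1          # log2(N) layers of butterflies
    butterflies = (N // 2) * layers      # N/2 butterflies per layer
    return {
        "layers": layers,
        "complex_multiplies": butterflies,   # 1 per butterfly
        "complex_adds": 2 * butterflies,     # 1 add and 1 subtract per butterfly
        "direct_dft_products": N * N,        # for comparison
    }
```

For N = 8 this gives 3 layers, 12 complex multiplies and 24 complex adds, against 64 products for the direct DFT.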

Real complexity
Each complex add is 2 real adds.
Each complex multiply is either 4 real multiplies and 2 real adds:
(a + i b) (c + i d) = (a*c - b*d) + i (a*d + b*c)
or 3 real multiplies and 5 real adds:
M1 = a*c
M2 = b*d
M3 = (a+b)*(c+d)
(a + i b) (c + i d) = (M1 - M2) + i (M3 - M2 - M1)
So N $\log_2 N$ complex adds = 2N $\log_2 N$ real adds.
½ N $\log_2 N$ complex multiplies = 2N $\log_2 N$ real multiplies and another N $\log_2 N$ real adds (altogether 3N $\log_2 N$ real adds),
or 3/2 N $\log_2 N$ real multiplies and another 5/2 N $\log_2 N$ real adds (altogether 9/2 N $\log_2 N$ real adds).
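The two ways of forming a complex product can be checked side by side. The function names `cmul4` and `cmul3` are hypothetical; each takes the real and imaginary parts as separate real numbers and returns the (real, imaginary) pair.

```python
def cmul4(a, b, c, d):
    """(a + ib)(c + id) with 4 real multiplies and 2 real adds."""
    return (a * c - b * d, a * d + b * c)

def cmul3(a, b, c, d):
    """(a + ib)(c + id) with 3 real multiplies and 5 real adds."""
    m1 = a * c
    m2 = b * d
    m3 = (a + b) * (c + d)          # 2 adds, 1 multiply
    return (m1 - m2, m3 - m2 - m1)  # 3 more adds/subtracts
```

For example, (1 + 2i)(3 + 4i) = -5 + 10i either way. Which variant is cheaper depends on the relative cost of a multiply and an add on the target processor.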

Bit reversal
The input seems to be in a strange order!
abcd → bcda → cdba → dcba
So the bits of the index have been reversed! (DSP processors have a special addressing mode for this.)
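On a general-purpose processor without such an addressing mode, the reversed index can be computed with shifts and masks. The names `bit_reverse` and `bit_reversed_order` below are illustrative assumptions.

```python
def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of `index` (abcd becomes dcba)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)  # move the current LSB into the result
        index >>= 1
    return result

def bit_reversed_order(N):
    """Input ordering for an in-place DIT FFT of length N (a power of 2)."""
    bits = N.bit_length() - 1
    return [bit_reverse(i, bits) for i in range(N)]
```

For N = 8 the ordering is 0, 4, 2, 6, 1, 5, 3, 7.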

DIT N=8 with bit reversal
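The signal-flow graph for this slide is not in the transcript. As a hedged sketch of the same idea, the routine below first permutes the input into bit-reversed order and then runs the log2(N) layers of butterflies in place. The name `fft_in_place` and the convention $W = e^{-2\pi i/\text{size}}$ for each layer are assumptions; twiddle factors are recomputed rather than precomputed to keep the example short.

```python
import cmath

def fft_in_place(x):
    """Iterative radix-2 DIT FFT on a list, overwriting it with its DFT."""
    N = len(x)
    if N == 0 or N & (N - 1):
        raise ValueError("length must be a power of 2")
    bits = N.bit_length() - 1
    # Bit-reversal permutation of the input.
    for i in range(N):
        j, v = 0, i
        for _ in range(bits):
            j = (j << 1) | (v & 1)
            v >>= 1
        if i < j:  # swap each pair only once
            x[i], x[j] = x[j], x[i]
    # Layers of butterflies: sub-DFT sizes 2, 4, ..., N.
    size = 2
    while size <= N:
        half = size // 2
        for start in range(0, N, size):
            for k in range(half):
                w = cmath.exp(-2j * cmath.pi * k / size)
                a = x[start + k]
                t = w * x[start + k + half]
                x[start + k] = a + t
                x[start + k + half] = a - t
        size *= 2
    return x
```

Because each butterfly reads and writes the same two locations, no second buffer is needed.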

DIF N=8
We can derive this from the DIT graph using the transposition theorem!
[Figure: DIF butterfly]
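The DIF butterfly diagram is not in the transcript. In the usual formulation, the DIF butterfly adds and subtracts first and applies the twiddle factor afterwards, to the difference. A recursive sketch under that assumption follows; the name `fft_dif` and the convention $W_N = e^{-2\pi i/N}$ are illustrative choices.

```python
import cmath

def fft_dif(x):
    """Radix-2 decimation-in-frequency FFT; len(x) must be a power of 2."""
    N = len(x)
    if N == 0 or N & (N - 1):
        raise ValueError("length must be a power of 2")
    if N == 1:
        return list(x)
    half = N // 2
    # DIF butterfly on the LEFT and RIGHT halves of the time signal.
    top = [x[n] + x[n + half] for n in range(half)]
    bottom = [(x[n] - x[n + half]) * cmath.exp(-2j * cmath.pi * n / N)
              for n in range(half)]
    out = [0j] * N
    out[0::2] = fft_dif(top)      # even-indexed frequencies X_0, X_2, ...
    out[1::2] = fft_dif(bottom)   # odd-indexed frequencies X_1, X_3, ...
    return out
```

Here the time signal is partitioned and the frequency outputs are decimated, the mirror image of the DIT case.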