Reconfigurable Computing S. Reda, Brown University Reconfigurable Computing (EN2911X, Fall07) Lecture 16: Application-Driven Hardware Acceleration (1/4)

Presentation transcript:

Reconfigurable Computing (EN2911X, Fall07)
Lecture 16: Application-Driven Hardware Acceleration (1/4)
Prof. Sherief Reda, Division of Engineering, Brown University

Fast Fourier transform
One of the most important subroutines in scientific computing. Used in many applications, including signal and image processing, solution of differential equations, multiplication of polynomials, and data compression. One of the most widely implemented hardware accelerators.

Discrete Fourier transform (DFT)
The DFT maps a set of input points to another set of output points. The operation is reversible.
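To make the mapping and its reversibility concrete, here is a minimal sketch that computes the DFT directly from its definition and then inverts it. It assumes the common signal-processing convention X_k = sum over n of x_n * exp(-j*2*pi*k*n/N), with the 1/N factor placed in the inverse; the transcript does not preserve the slide's own formula, so the sign and scaling are assumptions, and the names `dft` and `idft` are illustrative.

```python
import cmath

def dft(x):
    """Direct N-point DFT: X[k] = sum_n x[n] * exp(-2j*pi*k*n/N) (assumed convention)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    """Inverse DFT: opposite sign in the exponent and a 1/N scale factor."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

x = [1, 2, 3, 4]
X = dft(x)
x_back = idft(X)  # recovers x up to floating-point rounding
```

Each output point depends on every input point, which is why the direct method needs a full pass over the input for each of the N outputs.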

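Roots of unity
[Figure: unit circle in the complex plane with real and imaginary axes; the points (1, 0), (0, j), (-1, 0), and (0, -j) are marked.]
What are the Nth roots of unity? If N = 8 then we have [equation not preserved in transcript]. Define [equation not preserved in transcript].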
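The Nth roots of unity are the N complex numbers z with z^N = 1, equally spaced on the unit circle. The sketch below lists them for N = 8, numbering them counter-clockwise from 1 as exp(j*2*pi*k/N). That numbering direction is an assumption; the symbol the slide defines may use the opposite sign in the exponent. The function name `roots_of_unity` is illustrative.

```python
import cmath

def roots_of_unity(N):
    """Return the N complex solutions of z**N == 1, counter-clockwise from 1."""
    return [cmath.exp(2j * cmath.pi * k / N) for k in range(N)]

roots = roots_of_unity(8)
for k, z in enumerate(roots):
    print(f"k={k}: ({z.real:+.4f}, {z.imag:+.4f}j)")
```

For N = 8 the roots at k = 0, 2, 4, and 6 are the four axis points (1, 0), (0, j), (-1, 0), and (0, -j); the other four lie on the diagonals.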

Calculating the DFT
How many arithmetic operations (+ and *) do we need to calculate the DFT?
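A rough count, assuming the DFT is evaluated directly from its defining sum and the complex exponential factors are already available: each of the N outputs is a sum of N products.

```latex
\underbrace{N \cdot N}_{\text{complex multiplications}} = N^2,
\qquad
\underbrace{N \cdot (N-1)}_{\text{complex additions}} = N^2 - N,
\qquad
\text{total} = O(N^2).
```

For N = 8 this is 64 complex multiplications and 56 complex additions.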

Computing the DFT using the FFT
How can we do better? Use the Fast Fourier Transform (FFT): split the DFT sum into the DFT of the even-indexed inputs and the DFT of the odd-indexed inputs. The N-point DFT is thereby broken into two N/2-point DFTs.
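The even/odd split can be written out as follows. This is a sketch under the assumption that the DFT is defined with W_N = e^{-j 2\pi / N}; the symbols W_N, E_k, and O_k are introduced here for illustration and may differ from the slide's notation.

```latex
X_k = \sum_{n=0}^{N-1} x_n W_N^{kn}
    = \underbrace{\sum_{m=0}^{N/2-1} x_{2m}\, W_{N/2}^{km}}_{E_k \;(\text{DFT of even indices})}
    \;+\; W_N^{k} \underbrace{\sum_{m=0}^{N/2-1} x_{2m+1}\, W_{N/2}^{km}}_{O_k \;(\text{DFT of odd indices})}

X_k = E_k + W_N^{k}\, O_k, \qquad
X_{k+N/2} = E_k - W_N^{k}\, O_k, \qquad k = 0, 1, \ldots, N/2 - 1
```

The first line uses W_N^{2km} = W_{N/2}^{km}. The second uses the facts that E_k and O_k repeat with period N/2 and that W_N^{k+N/2} = -W_N^{k}, so each pair of outputs shares one multiplication.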

Example when N = 8
Objective: compute X_0, X_1, …, X_7 given x_0, x_1, …, x_7.
[Figure: two blocks labeled "magic box". The first takes the inputs x_0, x_2, x_4, x_6 and the second takes x_1, x_3, x_5, x_7; the outputs are X_0, X_1, …, X_7.]
Note that [equation not preserved in transcript].

Now let’s apply the idea recursively
[Figure: the same decomposition applied to each half. Inputs appear in the order x_0, x_4, x_2, x_6, x_1, x_5, x_3, x_7; outputs are X_0, X_1, …, X_7.]

One more time
[Figure: the fully decomposed 8-point transform. Inputs appear in the order x_0, x_4, x_2, x_6, x_1, x_5, x_3, x_7; outputs are X_0, X_1, …, X_7.]
How many operations do we need now? What is the execution time on a general-purpose CPU? What is the execution time on an FPGA? How many resources do you need?
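The recursive decomposition can be sketched in a few lines. This is a minimal radix-2 decimation-in-time FFT, not necessarily the formulation used in the lecture; it assumes the input length is a power of two and the exp(-j*2*pi*k/N) sign convention, and the function name `fft` is illustrative.

```python
import cmath

def fft(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    N = len(x)
    if N == 1:
        return list(x)
    even = fft(x[0::2])   # N/2-point DFT of the even-indexed samples
    odd = fft(x[1::2])    # N/2-point DFT of the odd-indexed samples
    X = [0j] * N
    for k in range(N // 2):
        t = cmath.exp(-2j * cmath.pi * k / N) * odd[k]  # twiddle factor times odd term
        X[k] = even[k] + t             # butterfly, upper output
        X[k + N // 2] = even[k] - t    # butterfly, lower output
    return X

X = fft([1, 2, 3, 4, 5, 6, 7, 8])
```

There are log2(N) levels of recursion and N/2 butterflies per level, so (N/2) * log2(N) butterflies in total: 12 for N = 8, each costing one complex multiplication and two complex additions. On a sequential processor the butterflies run one after another. The butterflies within one level do not depend on each other, which is the parallelism a hardware implementation can exploit, at the cost of more arithmetic units.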

Another way to visualize FFT computations
How can we determine the order of the first inputs?
[Figure: network of 12 butterfly blocks for N = 8. Inputs appear in the order x_0, x_4, x_2, x_6, x_1, x_5, x_3, x_7; outputs are listed as X_0, X_4, X_1, X_5, X_2, X_6, X_3, X_7.]
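The input order x_0, x_4, x_2, x_6, x_1, x_5, x_3, x_7 is the bit-reversed order: write each index in log2(N) bits and reverse the bits. A minimal sketch, assuming N is a power of two (the function name `bit_reversed_order` is illustrative):

```python
def bit_reversed_order(N):
    """Return the indices 0..N-1 in bit-reversed order; N must be a power of two."""
    bits = N.bit_length() - 1          # number of bits needed for an index
    order = []
    for i in range(N):
        rev = 0
        for b in range(bits):
            if i & (1 << b):           # bit b of i is set ...
                rev |= 1 << (bits - 1 - b)   # ... so set the mirrored bit
        order.append(rev)
    return order

print(bit_reversed_order(8))  # [0, 4, 2, 6, 1, 5, 3, 7]
```

For example, index 1 is 001 in binary, which reverses to 100 = 4, so x_4 sits in the second input position.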

Application of FFT: faster multiplication of two polynomials
Suppose we want to evaluate A(x) at x_0. How many operations do we need? Use Horner’s rule.
Suppose you have two polynomials represented by their coefficient vectors. How many operations does it take to add these two polynomials? How many operations does it take to multiply them?
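The three operations in coefficient form can be sketched as follows. Coefficient vectors are stored lowest degree first, which is an assumption about layout; the function names are illustrative.

```python
def horner(a, x0):
    """Evaluate A(x0) for A(x) = a[0] + a[1]*x + ... + a[N-1]*x**(N-1)."""
    result = 0
    for coeff in reversed(a):
        result = result * x0 + coeff   # one multiply and one add per coefficient
    return result

def poly_add(a, b):
    """Coefficient-wise sum of two equal-length vectors: N additions."""
    return [ai + bi for ai, bi in zip(a, b)]

def poly_mul(a, b):
    """Schoolbook product: every a[i] meets every b[j], so len(a)*len(b) multiplications."""
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c

a = [1, 2, 3]          # 1 + 2x + 3x^2
b = [4, 5, 6]          # 4 + 5x + 6x^2
print(horner(a, 2))    # 17
print(poly_add(a, b))  # [5, 7, 9]
print(poly_mul(a, b))  # [4, 13, 28, 27, 18]
```

Horner's rule evaluates at one point in O(N), addition is O(N), and the schoolbook product is O(N^2).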

Point-value representation
A point-value representation of a polynomial A(x) of degree-bound N is a set of N point-value pairs {(x_0, y_0), (x_1, y_1), …, (x_{N-1}, y_{N-1})} such that all of the x_k are distinct and y_k = A(x_k) for k = 0, 1, …, N-1.
How many operations do we need to compute the point-value representation of a polynomial? How can we do better?

Interpolation of polynomials from point-value representations
Given the point-value representation of a polynomial, how can we invert the evaluation, i.e., determine the coefficient form of the polynomial from its point-value representation? How can we find the a’s?
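One way to see that the a's are recoverable: evaluating at N points is a linear system in the unknown coefficients. The matrix form below is the standard Vandermonde formulation, written here with the symbols x_k, y_k, and a_k; it is a sketch of the idea, not necessarily the derivation shown on the slide.

```latex
\begin{pmatrix}
1 & x_0 & x_0^2 & \cdots & x_0^{N-1} \\
1 & x_1 & x_1^2 & \cdots & x_1^{N-1} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & x_{N-1} & x_{N-1}^2 & \cdots & x_{N-1}^{N-1}
\end{pmatrix}
\begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_{N-1} \end{pmatrix}
=
\begin{pmatrix} y_0 \\ y_1 \\ \vdots \\ y_{N-1} \end{pmatrix}
```

The matrix is invertible whenever the x_k are distinct, so the coefficients are uniquely determined. Solving the system by Gaussian elimination costs O(N^3); Lagrange's formula brings interpolation down to O(N^2).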

Adding and multiplying polynomials in point representation
If C(x) = A(x) + B(x), then we can easily get the point-value representation of C from the point-value representations of polynomial A and polynomial B. How many operations do we need? How about C(x) = A(x)*B(x)?
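A small sketch of both operations in point-value form, assuming A and B are sampled at the same points. The evaluation points and the helper name `evaluate` are illustrative choices. Note that the product of two polynomials of degree-bound N has degree-bound 2N - 1, so it needs that many points to be determined; the example uses 5 points for two degree-2 inputs.

```python
def evaluate(a, xs):
    """Point-value form of A at the points xs (Horner's rule per point)."""
    values = []
    for x in xs:
        y = 0
        for coeff in reversed(a):
            y = y * x + coeff
        values.append(y)
    return values

a = [1, 2, 3]            # A(x) = 1 + 2x + 3x^2
b = [4, 5, 6]            # B(x) = 4 + 5x + 6x^2
xs = [0, 1, 2, 3, 4]     # 5 distinct points: enough for the degree-4 product

ya, yb = evaluate(a, xs), evaluate(b, xs)
y_sum = [p + q for p, q in zip(ya, yb)]    # C = A + B: one addition per point
y_prod = [p * q for p, q in zip(ya, yb)]   # C = A * B: one multiplication per point
```

Both operations take one arithmetic operation per point, so O(N) in total, compared with O(N^2) for multiplication in coefficient form.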

How can we convert a polynomial quickly from coefficient form to point-value form and back?
[Diagram: two routes from the coefficient forms of the inputs to the coefficient form of the product.]
Ordinary multiplication: O(N^2)
Evaluate: O(N^2)
Point-wise multiplication: O(N)
Interpolate: O(N^2)
As it stands, the point-value route does not pay off: evaluation and interpolation each cost as much as ordinary multiplication. How can we evaluate and interpolate faster than O(N^2)? Can we choose the evaluation points smartly?

Choosing the evaluation points smartly......

Finally, multiplying polynomials in O(N log N)
Ordinary multiplication: O(N^2)
FFT: O(N log N)
Point-wise multiplication: O(N)
Inverse FFT: O(N log N)
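The three-step pipeline (FFT, point-wise multiplication, inverse FFT) can be sketched end to end. This is a minimal illustration under stated assumptions: integer coefficients stored lowest degree first, zero-padding to the next power of two that fits the product, and a simple recursive FFT. The names `fft` and `poly_mul_fft` are illustrative.

```python
import cmath

def fft(x, inverse=False):
    """Recursive radix-2 FFT (unscaled); len(x) must be a power of two."""
    N = len(x)
    if N == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1        # the inverse uses the conjugate roots of unity
    X = [0j] * N
    for k in range(N // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / N) * odd[k]
        X[k] = even[k] + t
        X[k + N // 2] = even[k] - t
    return X

def poly_mul_fft(a, b):
    """Multiply integer-coefficient polynomials via evaluate / multiply / interpolate."""
    out_len = len(a) + len(b) - 1      # number of coefficients in the product
    size = 1
    while size < out_len:              # pad to a power of two that holds the product
        size *= 2
    fa = fft(list(a) + [0] * (size - len(a)))   # evaluate A at the roots of unity
    fb = fft(list(b) + [0] * (size - len(b)))   # evaluate B at the same points
    fc = [p * q for p, q in zip(fa, fb)]        # point-wise multiplication, O(N)
    c = fft(fc, inverse=True)                   # interpolate (unscaled inverse FFT)
    return [round((v / size).real) for v in c[:out_len]]  # apply 1/size, drop rounding noise

print(poly_mul_fft([1, 2, 3], [4, 5, 6]))  # [4, 13, 28, 27, 18]
```

The rounding step is only valid because the inputs are integers; for real-valued coefficients the result would be returned as floats instead.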

Back to signal processing
A linear system with impulse response (b_0, b_1, …, b_{N-1}) receives the input signal (a_0, a_1, …, a_{N-1}). Its output over time is:
T=0: a_0 b_0
T=1: a_0 b_1 + a_1 b_0
T=2: a_0 b_2 + a_1 b_1 + a_2 b_0
…
The response of the system to the input signal at different times is equal to the coefficients of the polynomial produced by multiplying the input-signal polynomial with the impulse-response polynomial. This is commonly known as the convolution of the input and the system’s impulse response. How can we find the output response faster than O(N^2)?
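The listed terms follow one general pattern, written out below. The symbol y_T for the output at time T is introduced here for illustration.

```latex
y_T = \sum_{i=0}^{T} a_i \, b_{T-i}, \qquad T = 0, 1, \ldots, 2N-2,
\quad \text{with } a_i = b_i = 0 \text{ for } i > N-1 .
```

This is exactly the coefficient of x^T in the product of the two polynomials, so one standard answer to the closing question is to zero-pad both sequences to at least 2N - 1 points, take their FFTs, multiply point-wise, and apply the inverse FFT, for O(N log N) overall.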

Summary
The lecture covered one of the most important hardware accelerators: the FFT. We have seen how it can be parallelized and sped up, and examined some of its applications.