Linear Filters in StreamIt


Linear Filters in StreamIt
Andrew A. Lamb
MIT Laboratory for Computer Science, Computer Architecture Group
8/29/2002

Outline
Introduction
Dataflow Analysis
Hierarchical Matrix Combinations
Performance Optimizations

Basic Idea
Filter: a = pop(); b = pop(); c = (a+b)/2 + 1; push(c);
becomes
Matrix Mult.: x → y, where y = xA + b
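As a concrete check of this idea, the sketch below (Python, not StreamIt) runs the slide's work function against a list-based tape and compares it with y = xA + b. The helper names `work` and `affine`, the values A = [[0.5], [0.5]] and b = [1.0], and the convention that x[i] is the i-th item popped are assumptions made for illustration.

```python
def work(tape):
    """Direct transcription of the slide's work function on a list-based tape."""
    a = tape.pop(0)
    b = tape.pop(0)
    c = (a + b) / 2 + 1
    return [c]  # the single pushed value


def affine(x, A, b):
    """Compute y = xA + b for a row vector x; rows of A correspond to inputs."""
    return [sum(x[i] * A[i][j] for i in range(len(x))) + b[j]
            for j in range(len(b))]


# Assumed affine form of the filter: c = 0.5*x[0] + 0.5*x[1] + 1
A = [[0.5],
     [0.5]]
b = [1.0]

x = [4.0, 10.0]
direct = work(list(x))        # run the original code
via_matrix = affine(x, A, b)  # run the matrix form
```

Both paths produce the same single output for this input, which is the property the rest of the talk relies on.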

What is a Linear Filter?
Generic filters calculate some outputs (possibly) based on their inputs.
Linear filters: each output (y_j) is a weighted sum of the inputs (x_i) plus a constant:
$$y_j = \sum_{i=1}^{N} w_i x_i + b$$
where b is constant, w_i is constant for all i, and N is the number of inputs.

Linearity and Matrices
A matrix multiply is exactly a weighted sum.
We treat the inputs (x_i) and outputs (y_j) as vectors of values (x and y respectively).
A filter is represented as a matrix of weights A and a vector of constants b.
Therefore, the filter represents the equation y = xA + b.

Equation Example, y1

Equation Example, y2

Equation Example, y3
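The three "Equation Example" slides expand y = xA + b one output at a time. The original equations did not survive in this transcript, so the block below is an illustrative stand-in: it assumes a hypothetical filter with two inputs and three outputs, with entries a_{ij} of A and b_j of b named only for this example.

```latex
\begin{aligned}
\begin{bmatrix} y_1 & y_2 & y_3 \end{bmatrix}
  &= \begin{bmatrix} x_1 & x_2 \end{bmatrix}
     \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}
   + \begin{bmatrix} b_1 & b_2 & b_3 \end{bmatrix} \\
y_1 &= a_{11} x_1 + a_{21} x_2 + b_1 \\
y_2 &= a_{12} x_1 + a_{22} x_2 + b_2 \\
y_3 &= a_{13} x_1 + a_{23} x_2 + b_3
\end{aligned}
```

Column j of A holds the weights for output y_j, and entry j of b holds its constant offset.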

Usefulness of Linearity
Not all filters compute linear functions, e.g. push(pop()*pop());
Many fundamental DSP filters do: DFT/FFT, DCT, Convolution/FIR, Matrix Multiply

Example: DFT Matrix
DFT: entry at row n, column m

Example: IDFT Matrix
IDFT: entry at row m, column n
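The matrix entries on these two slides were figures and are not recoverable here. As a sketch, the code below builds the textbook DFT and IDFT matrices in the row-vector convention y = xA used throughout the talk. The normalization (none on the forward transform, 1/N on the inverse) and the function names are assumptions, not necessarily what the slides showed.

```python
import cmath


def dft_matrix(N):
    """N x N matrix with entry exp(-2*pi*j*n*m/N) at row n, column m."""
    return [[cmath.exp(-2j * cmath.pi * n * m / N) for m in range(N)]
            for n in range(N)]


def idft_matrix(N):
    """N x N matrix with entry exp(+2*pi*j*m*n/N) / N at row m, column n."""
    return [[cmath.exp(2j * cmath.pi * m * n / N) / N for n in range(N)]
            for m in range(N)]


def row_times_matrix(x, A):
    """Row vector times matrix: y[c] = sum over r of x[r] * A[r][c]."""
    return [sum(x[r] * A[r][c] for r in range(len(x)))
            for c in range(len(A[0]))]


x = [1.0, 2.0, 3.0, 4.0]
X = row_times_matrix(x, dft_matrix(4))        # time -> frequency
x_back = row_times_matrix(X, idft_matrix(4))  # frequency -> time
```

Because both transforms are plain matrix multiplies with b = 0, they are linear filters in the sense defined earlier.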

Usefulness, cont.
Matrix representations:
Are “embarrassingly parallel” [1]
Expose redundant computation
Let us take advantage of existing work in the DSP field
Well understood mathematics
[1] Thank you, Bill Thies

Outline
Introduction
Dataflow Analysis
Hierarchical Matrix Combinations
Performance Optimizations

Dataflow Analysis
Basic idea: convert the general code of a filter’s work function into an affine representation (e.g. y = xA + b).
The matrix A represents the linear combination of inputs used to calculate each output.
The vector b represents a constant offset that is added to the combination.

“Linear” Dataflow Analysis
Much like standard constant propagation.
Goal: for the argument of each push statement, have a vector of weights and a constant that represent it; these become a column in A and an entry in b.
Keep mappings from variables to their linear forms (i.e. vector + constant).

“Linear” Dataflow Analysis
Of course, we need the appropriate generating cases, e.g. constants and pop/peek(x).

“Linear” Dataflow Analysis
Like constant propagation, the confluence operator is set union.
We need combination rules to handle things like addition and multiplication (vector add and constant scale).
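The generating cases and combination rules can be sketched as follows. This is a minimal illustration for straight-line code only (it does not model the confluence operator), and the tuple representation `(weights, constant)` and all function names are assumptions rather than the actual StreamIt compiler data structures.

```python
def const_form(c, n):
    """Generating case: a literal constant has an all-zero weight vector."""
    return ([0.0] * n, float(c))


def peek_form(i, n):
    """Generating case: peek(i) has weight 1 on input i and no constant."""
    weights = [0.0] * n
    weights[i] = 1.0
    return (weights, 0.0)


def add(f, g):
    """Addition rule: add the weight vectors and add the constants."""
    return ([p + q for p, q in zip(f[0], g[0])], f[1] + g[1])


def scale(k, f):
    """Constant-scale rule: multiply both vector and constant by k."""
    return ([k * p for p in f[0]], k * f[1])


def is_constant(f):
    """True when the form does not depend on any input."""
    return all(p == 0 for p in f[0])


def mul(f, g):
    """A product stays linear only if at least one operand is a pure constant."""
    if is_constant(f):
        return scale(f[1], g)
    if is_constant(g):
        return scale(g[1], f)
    return None  # not linear, e.g. pop()*pop()


# 3*peek(0) + 4 over a two-item input window
linear = add(mul(const_form(3, 2), peek_form(0, 2)), const_form(4, 2))
# peek(0)*peek(1) is rejected
nonlinear = mul(peek_form(0, 2), peek_form(1, 2))
```

Returning `None` for a non-linear product is one simple way to mark a value as "not representable"; a real analysis would need a proper top element in its lattice.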

Ridiculous Example
a = peek(2); b = pop(); c = pop(); pop(); d = a + 2*b; e = d + 5;
The slide builds up the linear forms of a, b, c, d, and e in turn, one statement at a time.
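The linear forms on these slides were figures and did not survive extraction. The sketch below recomputes them under one assumed convention: x[i] is the i-th item on the input tape when the work function starts, so peek(i) after k pops refers to x[k + i]. The slides' own vector ordering may differ, and all helper names are hypothetical.

```python
N = 3  # the work function touches input items x[0]..x[2]


def unit(i):
    """Linear form for input item i: weight 1 on x[i], constant 0."""
    return ([1.0 if k == i else 0.0 for k in range(N)], 0.0)


def add(f, g):
    return ([p + q for p, q in zip(f[0], g[0])], f[1] + g[1])


def scale(k, f):
    return ([k * p for p in f[0]], k * f[1])


popped = 0  # number of items popped so far


def peek(i):
    """peek(i) looks i items past the current head of the tape."""
    return unit(popped + i)


def pop():
    """pop() returns the head of the tape and advances it."""
    global popped
    form = unit(popped)
    popped += 1
    return form


a = peek(2)                   # x[2]
b = pop()                     # x[0]
c = pop()                     # x[1]
pop()                         # x[2], discarded
d = add(a, scale(2, b))       # x[2] + 2*x[0]
e = add(d, ([0.0] * N, 5.0))  # x[2] + 2*x[0] + 5
```

Each variable ends up mapped to a (weights, constant) pair, which is the mapping the analysis keeps as it walks the work function.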

Constructing matrix A
Filter A: filter code computing a and b, followed by push(b); push(a);
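Once every push argument has a linear form, assembling A and b is mechanical. The sketch below assumes that the k-th push fills column k of A and entry k of b; the real compiler may order columns differently. The function `build_affine` and the two example forms are hypothetical.

```python
def build_affine(pushed_forms):
    """Assemble (A, b) from the linear forms of the push arguments.

    pushed_forms is a list of (weights, constant) pairs in push order.
    Column k of A is the weight vector of the k-th push (assumed ordering).
    """
    n_inputs = len(pushed_forms[0][0])
    A = [[form[0][i] for form in pushed_forms] for i in range(n_inputs)]
    b = [form[1] for form in pushed_forms]
    return A, b


# Hypothetical linear forms for the slide's variables a and b over two inputs.
form_a = ([1.0, 0.0], 0.0)  # a = x[0]
form_b = ([2.0, 1.0], 3.0)  # b = 2*x[0] + x[1] + 3

# push(b); push(a);
A, b_vec = build_affine([form_b, form_a])
```

With these inputs the first column of A comes from b and the second from a, mirroring the push order on the slide.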

Big Picture
FIR Filter: push = 2, pop = 1, peek = 3
    float[3] weights1 = {1, 2, 3};
    float sum1 = 0;
    float sum2 = 0;
    float mean1 = 5;
    float mean2 = 17;
    // weighted sum over the three-item peek window
    for (int i = 0; i < 3; i++) {
        sum1 += weights1[i] * peek(3 - i - 1);
    }
    // sum2 is never updated, so the first output is the constant -mean2
    push(sum2 - mean2);
    push(sum1 - mean1);
    pop();

Big Picture, cont.
Matrix Mult.: x → y, where y = xA + b, size(x) = 3, size(y) = 2
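For the FIR filter on the previous slide, the affine form can be written out and checked by hand. The sketch below assumes x[i] = peek(i) and that the k-th push is column k of A; under those assumptions A is 3×2 and b has two entries, matching size(x) = 3 and size(y) = 2. The Python names are illustrative only.

```python
WEIGHTS1 = [1, 2, 3]
MEAN1, MEAN2 = 5, 17


def work(window):
    """The slide's work function; window[i] plays the role of peek(i)."""
    sum1 = 0
    sum2 = 0  # never accumulated on the slide, so it stays 0
    for i in range(3):
        sum1 += WEIGHTS1[i] * window[3 - i - 1]
    return [sum2 - MEAN2, sum1 - MEAN1]  # the two pushed values, in order


# Assumed affine form: column 0 is the first push, column 1 the second.
A = [[0, 3],
     [0, 2],
     [0, 1]]
b = [-17, -5]


def affine(x):
    """Compute y = xA + b for a three-item input window."""
    return [sum(x[i] * A[i][j] for i in range(3)) + b[j] for j in range(2)]


x = [10, 20, 30]
direct = work(x)
via_matrix = affine(x)
```

The all-zero first column reflects that the first output ignores its inputs entirely.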

Outline
Introduction
Dataflow Analysis
Hierarchical Matrix Combinations
Performance Optimizations

Combining Filters
Basic idea: combine pipelines, splitjoins (and possibly feedback loops) of linear filters together.
We end up with a single large matrix representation.
The details are tricky in the general case (i.e. I am still working on them).

Combining Pipelines
Diagram: ye olde pipeline [A] → [B] becomes the single filter [C].
The matrix C is calculated as A’B’, where A’ and B’ have been appropriately scaled and duplicated to make the dimensions work out.
In the case where peek(B) ≠ pop(B), we might have to use two-stage filters or duplicate some work to get the dimensions to work out.
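In the simplest case the dimensions already match and no scaling or duplication is needed. The sketch below covers only that case, assuming push(A) = peek(B) = pop(B), so that z = (xA + bA)B + bB = x(AB) + (bA·B + bB). The function names are hypothetical and this is not the general algorithm the slide alludes to.

```python
def matmul(P, Q):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]


def vecmat(v, Q):
    """Row vector times matrix."""
    return [sum(v[k] * Q[k][j] for k in range(len(Q)))
            for j in range(len(Q[0]))]


def affine(x, M, c):
    """Compute xM + c."""
    return [p + q for p, q in zip(vecmat(x, M), c)]


def combine_pipeline(A, bA, B, bB):
    """Collapse y = xA + bA followed by z = yB + bB into z = xC + bC."""
    C = matmul(A, B)
    bC = [p + q for p, q in zip(vecmat(bA, B), bB)]
    return C, bC


A, bA = [[1, 2], [3, 4]], [1, 0]  # 2 inputs -> 2 outputs
B, bB = [[1], [1]], [5]           # 2 inputs -> 1 output
C, bC = combine_pipeline(A, bA, B, bB)

x = [1, 1]
staged = affine(affine(x, A, bA), B, bB)  # run the two filters in sequence
fused = affine(x, C, bC)                  # run the single combined filter
```

Note that the offset of the first filter is pushed through B, so the combined offset is not simply bA + bB.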

Combining Split Joins
Diagram: ye olde SplitJoin (splitter, [A1] [A2] [A3] … [AN], joiner) becomes the single filter [C].
A split join reorders data, so the columns of C are interleaved copies of the columns of A1 through AN.
Matching the rates of A1 through AN is a challenge that I am still working out.
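The column interleaving can be sketched for one easy configuration: a duplicate splitter, branches with identical peek and push rates, and a round-robin joiner that takes one item from each branch in turn. All of these are assumptions chosen to sidestep the rate-matching problem, and `combine_splitjoin` is a hypothetical helper.

```python
def combine_splitjoin(branches):
    """Interleave the columns of equally sized branch matrices.

    branches is a list of (A_k, b_k) pairs. With N branches, column j*N + k
    of C is column j of A_k, matching a joiner that takes one item per branch.
    """
    n_branches = len(branches)
    n_rows = len(branches[0][0])
    n_cols = len(branches[0][0][0])
    C = [[branches[k][0][i][j]
          for j in range(n_cols) for k in range(n_branches)]
         for i in range(n_rows)]
    c = [branches[k][1][j]
         for j in range(n_cols) for k in range(n_branches)]
    return C, c


A1, b1 = [[1, 2], [3, 4]], [0, 0]
A2, b2 = [[5, 6], [7, 8]], [1, 1]
C, c = combine_splitjoin([(A1, b1), (A2, b2)])
```

The offset vectors are interleaved in the same order as the columns, so entry j*N + k of c is entry j of b_k.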

Combining Feedback Loops
Diagram: ye olde FeedbackLoop (joiner, splitter, [A], [B]) becomes ?
It is unclear if we can do anything of use with a FeedbackLoop construct.
Eigenvalues might give information about stability, but it is not clear if that is useful… more thought is needed.

Outline
Introduction
Dataflow Analysis
Hierarchical Matrix Combinations
Performance Optimizations

Performance Optimizations
Take advantage of our compile-time knowledge of the matrix coefficients, e.g. don’t waste computation on zeros.
Try to leverage existing DSP work on factoring matrices.
Try to recognize parallel structures in our matrices.
Use frequency analysis.
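The first point, not wasting computation on zeros, can be illustrated with a toy code generator that emits one expression per output and skips zero weights. This is a sketch under assumed conventions (x[i] = peek(i), column j = j-th push); `emit_output_exprs` is a hypothetical name and not part of the StreamIt compiler.

```python
def emit_output_exprs(A, b):
    """Return one expression string per output, skipping zero coefficients."""
    exprs = []
    for j in range(len(b)):
        terms = []
        for i in range(len(A)):
            w = A[i][j]
            if w == 0:
                continue  # no multiply or add is emitted for a zero weight
            # a weight of exactly 1 needs no multiply either
            terms.append(f"peek({i})" if w == 1 else f"{w}*peek({i})")
        if b[j] != 0 or not terms:
            terms.append(str(b[j]))
        exprs.append(" + ".join(terms))
    return exprs


# A 3x2 matrix whose first column is entirely zero.
A = [[0, 3],
     [0, 2],
     [0, 1]]
b = [-17, -5]
exprs = emit_output_exprs(A, b)
```

Because the coefficients are known at compile time, the all-zero column collapses to a constant and costs no arithmetic at run time.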

Factoring for Performance
Unfactored matrix: 16 multiplies, 12 adds
Factored form: 14 multiplies, 6 adds
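The matrices behind these counts were figures and are not recoverable, but the 16 multiplies and 12 adds match a dense 4×4 product. The sketch below shows how such counts can be tallied. The rank-1 factorization used as the example is hypothetical, chosen only because its savings are easy to verify; it is not the factorization from the slide.

```python
def dense_cost(rows, cols):
    """(multiplies, adds) for y = xA with a dense rows x cols matrix A."""
    return rows * cols, (rows - 1) * cols


def sparse_cost(M):
    """(multiplies, adds) for y = xM when only nonzero entries are touched."""
    mults = sum(1 for row in M for w in row if w != 0)
    adds = 0
    for j in range(len(M[0])):
        nonzeros = sum(1 for row in M if row[j] != 0)
        adds += max(nonzeros - 1, 0)
    return mults, adds


def factored_cost(factors):
    """Total cost of applying the factors one after another."""
    costs = [sparse_cost(F) for F in factors]
    return sum(m for m, _ in costs), sum(a for _, a in costs)


# Hypothetical rank-1 example: A = U V with U 4x1 and V 1x4.
U = [[1], [2], [3], [4]]
V = [[1, 2, 3, 4]]

unfactored = dense_cost(4, 4)
factored = factored_cost([U, V])
```

Counting every nonzero as a multiply is a simplification; entries equal to ±1 would be cheaper still in practice.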

SPL/SPIRAL
Software package that will attempt to find a fast implementation of signal processing algorithms described as matrices.
It attempts to find a sparse factorization of an arbitrary matrix.
It can automatically derive the FFT (e.g. the Cooley-Tukey algorithm) from the DFT definition.
Claim that their performance is  FFTW [1].
[1] See http://www.fftw.org

Recognize Parallel Structure
We can go from a SplitJoin to a matrix. Perhaps we can recognize the reverse transformation.
Also, implement blocked matrix multiply to keep parallel resources busy.

Frequency Analysis
Diagram: [A] becomes DFT → [A’] → IDFT.
Instead of computing the matrix product straight up, possibly go to the frequency domain.
This rids us of the offset vector (it is added to the response at f = 0).
It might allow additional optimizations (because of possible symmetries exposed in the frequency domain).
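The slide does not say which frequency-domain transformation is intended. One standard instance is the convolution theorem: a circular convolution becomes an element-wise product between DFT and IDFT. The sketch below illustrates that, and also shows that a constant offset contributes only to the f = 0 bin. Circular (rather than linear) convolution and all function names are assumptions.

```python
import cmath


def dft(x, sign=-1):
    """Unnormalized DFT (sign=-1) or its conjugate kernel (sign=+1)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(sign * 2j * cmath.pi * n * m / N)
                for n in range(N))
            for m in range(N)]


def idft(X):
    """Inverse DFT with the 1/N normalization."""
    N = len(X)
    return [v / N for v in dft(X, sign=+1)]


def circular_convolve(x, h):
    """Time-domain circular convolution, O(N^2) multiplies."""
    N = len(x)
    return [sum(x[k] * h[(n - k) % N] for k in range(N)) for n in range(N)]


def convolve_via_frequency(x, h):
    """Same result via DFT, element-wise product, then IDFT."""
    X, H = dft(x), dft(h)
    return [v.real for v in idft([p * q for p, q in zip(X, H)])]


x = [1.0, 2.0, 3.0, 4.0]
h = [1.0, 1.0, 0.0, 0.0]
direct = circular_convolve(x, h)
via_freq = convolve_via_frequency(x, h)

# A constant offset vector has energy only in the f = 0 bin.
offset_spectrum = dft([3.0] * 4)
```

With a naive DFT this saves nothing; the benefit would come from using an FFT for the two transforms.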

Work left to do
Implementation of single filter analysis.
Combining hierarchical constructs.
Understand the math of automatic matrix factorizations (group theory).
Analyze frequency analysis.
Implement optimizations.
Get results.

Questions for the Future
Are there any other optimizations?
Can we produce inverted matrices, so that the programmer codes up the transmitter and StreamIt automatically creates the receiver? [1]
How many cycles of a “real” DSP application are spent computing linear functions?
Can we combine the linear description of what happens inside a filter with the SARE representation of what is happening between them? (POPL paper)
[1] Thank you, BT