Factorization of DSP Transforms using Taylor Expansion Diagrams
Jeremie Guillot, E. Boutillon, M. Ciesielski*, D. Gomez-Prado*, Q. Ren*, S. Askar*
LESTER Lab, Université de Bretagne SUD
*VLSI CAD Lab, University of Massachusetts, Amherst
Outline
Taylor Expansion Diagram
TED-based Factorization (DSP example)
Results
Conclusions
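Taylor Expansion Diagram
Graph-based representation of arithmetic expressions.
Based on the Taylor series expansion: $f(x) = f(0) + x\,f'(0) + \frac{x^2}{2}\,f''(0) + \dots$
[Figure: node f with outgoing edges labeled 1, x, x^2 leading to f(0), f'(0), f''(0)/2]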
Your First TED
Example: f(x,y) = 5x + 3y + 5xy - 3
Taylor decomposition with respect to x: f(x,y) = (3y-3) + x*(5y+5) = g(y) + x*h(y), where f(0,y) = g(y) and f_x'(0,y) = h(y)
g(y) = -3 + y*(3)
h(y) = 5 + y*(5)
Representation used by the tool: (^0 -3) means an (additive) edge with power 0 and weight -3.
[Figure: TED of f(x,y) with root node x, y nodes for g(y) and h(y), and terminal node ONE; edge labels: ^0 1, ^1 1, ^0 -3, ^1 3, ^0 5, ^1 5]
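The decomposition above can be reproduced mechanically. The following is a minimal sketch, assuming a polynomial is stored as a dictionary from exponent tuples to integer coefficients. This storage format and the helper name `taylor_cofactors` are illustrative assumptions, not the data structures of the tool.

```python
def taylor_cofactors(poly, var):
    """Split a polynomial by powers of the variable with index `var`.

    `poly` maps exponent tuples, e.g. (1, 0) for x, to coefficients.
    Returns {power: cofactor}; each cofactor no longer contains `var`.
    For a polynomial, the cofactor of power k equals the k-th Taylor
    coefficient f^(k)(0)/k! taken with respect to that variable.
    """
    cofactors = {}
    for exps, coeff in poly.items():
        power = exps[var]
        rest = exps[:var] + (0,) + exps[var + 1:]   # remove `var` from the monomial
        sub = cofactors.setdefault(power, {})
        sub[rest] = sub.get(rest, 0) + coeff
    return cofactors


# f(x, y) = 5x + 3y + 5xy - 3, exponents ordered as (x, y)
f = {(1, 0): 5, (0, 1): 3, (1, 1): 5, (0, 0): -3}

parts = taylor_cofactors(f, var=0)   # expand with respect to x
g = parts[0]                         # f at x = 0:      3y - 3
h = parts[1]                         # df/dx at x = 0:  5y + 5
```

Applying the same function to `g` and `h` with `var=1` yields the constants that appear as edge weights toward the terminal node.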
Your First TED, cont'd
After normalization: [Figure: normalized TED of f(x,y) with nodes x, y and terminal node ONE; edge labels: ^0 -3, ^1 5, ^0 5, ^1 3, ^0 1, ^1 1]
Properties:
Directed acyclic graph.
Compact representation of linear expressions.
When the graph is reduced, ordered and normalized, it is canonical: for a given function there exists only one representation (useful for verification, equivalence checking, ...).
Handles word-level and bit-level.
And more...
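The canonicity property can be illustrated with a small sketch: two differently written expressions of the same function produce the same ordered, reduced decomposition. The code below is a simplification under stated assumptions: polynomials are dictionaries from exponent tuples to coefficients, the decomposition is a nested tuple, weights stay at the terminals (no edge-weight normalization), and the helper names `add`, `mul`, `const` and `build` are hypothetical.

```python
def add(p, q):
    """Sum of two polynomials stored as {exponent tuple: coefficient}."""
    out = dict(p)
    for m, c in q.items():
        out[m] = out.get(m, 0) + c
    return out


def mul(p, q):
    """Product of two polynomials in the same representation."""
    out = {}
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            m = tuple(a + b for a, b in zip(m1, m2))
            out[m] = out.get(m, 0) + c1 * c2
    return out


def const(c):
    """Constant polynomial in two variables (x, y)."""
    return {(0, 0): c}


def build(poly, nvars, var=0):
    """Ordered, reduced Taylor decomposition as nested tuples."""
    poly = {m: c for m, c in poly.items() if c != 0}    # drop cancelled terms
    if var == nvars:
        return poly.get((0,) * nvars, 0)                # constant terminal
    groups = {}
    for m, c in poly.items():
        rest = m[:var] + (0,) + m[var + 1:]
        groups.setdefault(m[var], {})[rest] = c
    if set(groups) <= {0}:                              # variable absent: skip node
        return build(poly, nvars, var + 1)
    return (var, tuple((p, build(groups[p], nvars, var + 1))
                       for p in sorted(groups)))


X, Y = {(1, 0): 1}, {(0, 1): 1}

# f1 = 5x + 3y + 5xy - 3, written term by term
f1 = {(1, 0): 5, (0, 1): 3, (1, 1): 5, (0, 0): -3}
# f2 = 5(x+1)(y+1) - 2y - 8, the same function written differently
f2 = add(mul(const(5), mul(add(X, const(1)), add(Y, const(1)))),
         add(mul(const(-2), Y), const(-8)))

same = build(f1, 2) == build(f2, 2)
```

Because the variable order is fixed and absent variables are skipped, equivalence checking reduces to a structural comparison of the two results.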
TED-based Factorization, Example
Discrete Cosine Transform (DCT), one of the main blocks in JPEG/MPEG compression.
The DCT can be expressed as follows: $y(j) = \sum_{n=0}^{N-1} x(n)\,\mathrm{cosine}(n,j)$, for $j = 0, \dots, N-1$
A direct implementation (N=4):
    for j in 0 to N-1 loop
      temp := 0;
      for n in 0 to N-1 loop
        temp := temp + x(n)*cosine(n,j);
      end loop;
      y(j) <= temp;
    end loop;
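For readers who want to run the direct form, here is a minimal Python sketch of the same nested loops. The slide does not define `cosine(n,j)`; the kernel below assumes the unscaled DCT-II coefficient cos(pi*(2n+1)*j/(2N)), and the function name `dct_direct` is illustrative.

```python
import math


def cosine(n, j, N=4):
    """Assumed DCT-II kernel cos(pi*(2n+1)*j/(2N)), without scaling factors."""
    return math.cos(math.pi * (2 * n + 1) * j / (2 * N))


def dct_direct(x):
    """Direct evaluation with two nested loops, one product per (n, j) pair."""
    N = len(x)
    y = []
    for j in range(N):
        temp = 0.0
        for n in range(N):
            temp += x[n] * cosine(n, j, N)
        y.append(temp)
    return y


y = dct_direct([1.0, 2.0, 3.0, 4.0])
```

With this kernel, `y[0]` is simply the sum of the inputs, since every coefficient of the first output equals 1.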
TED-based Factorization, Example
DCT, direct implementation: Y = M*X
Cost: 12 additions, 16 multiplications.
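The operation count of the direct form follows from the shape of the matrix-vector product: each of the N outputs needs N products and N-1 sums. A small sketch, with the hypothetical helper name `direct_cost`:

```python
def direct_cost(N):
    """Operation count of a dense N x N matrix-vector product Y = M*X."""
    additions = N * (N - 1)       # N-1 sums for each of the N outputs
    multiplications = N * N       # one product per matrix entry
    return additions, multiplications


adds, mults = direct_cost(4)
```

The same formula gives 16256 additions and 16384 multiplications for N = 128, which are the original counts reported for DCT 128x128 in the results table.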
TED-based Factorization
TED for the DCT-II of size 4.
[Figure: TED of the size-4 DCT-II]
The nodes representing x0-x3 and x1-x2, together with their associated sub-graphs, are shared by Y1 and Y3.
TED-based Factorization
Changing the variable order helps identify candidates for common sub-expression elimination (CSE).
Reuse sub-expressions by creating new variables:
S0=x0-x3
S1=x0+x3
TED-based Factorization
Continue with the next substitutions:
S2=x1-x2
S3=x1+x2
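Why exactly these four sums and differences are worth extracting can be checked on the coefficient matrix itself. The sketch below is not the TED reordering procedure; it is a simplified scan that, assuming the unscaled DCT-II coefficients cos(pi*(2n+1)*j/(2N)), looks for input pairs whose coefficients are equal or opposite in at least two outputs. The function name `shared_candidates` is hypothetical.

```python
import math
from itertools import combinations

N = 4
# Assumed coefficient matrix M[j][n] of the unscaled DCT-II.
M = [[math.cos(math.pi * (2 * n + 1) * j / (2 * N)) for n in range(N)]
     for j in range(N)]


def shared_candidates(M, tol=1e-12):
    """Return {(a, b, sign): outputs} for xa+xb or xa-xb used by >= 2 outputs."""
    found = {}
    for a, b in combinations(range(len(M[0])), 2):
        for j, row in enumerate(M):
            if abs(row[a]) < tol:
                continue                                  # zero coefficient, nothing to share
            if abs(row[a] - row[b]) < tol:
                found.setdefault((a, b, '+'), []).append(j)   # c*(xa + xb)
            elif abs(row[a] + row[b]) < tol:
                found.setdefault((a, b, '-'), []).append(j)   # c*(xa - xb)
    return {k: rows for k, rows in found.items() if len(rows) >= 2}


candidates = shared_candidates(M)
```

Under these assumptions the scan reports x0-x3 and x1-x2 for outputs Y1 and Y3, and x0+x3 and x1+x2 for outputs Y0 and Y2, which matches the substitutions S0 to S3.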
TED-based Factorization
No more candidates can be found for common sub-expression elimination.
Each sub-expression Sn in this graph is represented by an adder.
The expressions can be rewritten as:
S0=x0-x3; S1=x0+x3; S2=x1-x2; S3=x1+x2;
Y0=S3+S1; Y1=A*S0+B*S2;
Y2=C*(S1-S3); Y3=-A*S2+B*S0;
Cost: 8 additions, 5 multiplications.
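The factored expressions can be checked numerically against the direct form. The slides do not give the values of A, B and C; the sketch below assumes the unscaled DCT-II, for which A = cos(pi/8), B = cos(3*pi/8) and C = cos(pi/4). The function names are illustrative.

```python
import math

A = math.cos(math.pi / 8)        # assumed value of A
B = math.cos(3 * math.pi / 8)    # assumed value of B
C = math.cos(math.pi / 4)        # assumed value of C


def dct4_factored(x0, x1, x2, x3):
    """Factored size-4 DCT: 8 additions/subtractions, 5 multiplications."""
    s0 = x0 - x3
    s1 = x0 + x3
    s2 = x1 - x2
    s3 = x1 + x2
    y0 = s3 + s1
    y1 = A * s0 + B * s2
    y2 = C * (s1 - s3)
    y3 = -A * s2 + B * s0
    return [y0, y1, y2, y3]


def dct4_direct(x):
    """Reference: dense matrix-vector product with the assumed DCT-II kernel."""
    return [sum(x[n] * math.cos(math.pi * (2 * n + 1) * j / 8) for n in range(4))
            for j in range(4)]


x = [1.0, -2.0, 3.5, 0.25]
factored = dct4_factored(*x)
direct = dct4_direct(x)
```

Both functions return the same four outputs up to rounding error, while the factored version uses 8 additions and 5 multiplications instead of 12 and 16.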
TED-based Factorization Algorithm
Results
Conclusions
TED makes the CSE process straightforward.
It extracts the functionality from the specification and reduces the amount of computation.
Other factorization schemes are currently under development (radix decomposition, etc.).
Applications: high-level synthesis, compilation, mathematical software, ...
Software: TEDify TEDify: a tool to optimize mathematical expressions using TEDs Available at: http://tango.ecs.umass.edu/TED/Doc/html/index.html
Thanks. Any questions?
Results

Transform    | Original # ADD | Original # MPY | # ADD after TED | # MPY after TED | Time
WHT 4x4      | 12             | 16             | 8               |                 | 0.08
WHT 8x8      | 56             | 64             | 24              |                 | 0.09
WHT 16x16    | 240            | 256            |                 |                 | 0.211
WHT 32x32    | 992            | 1024           | 160             |                 | 1.768
WHT 64x64    | 4032           | 4096           | 384             |                 | 27.158
DCT 4x4      | 12             | 16             | 8               | 5               | 0.084
DCT 8x8      |                |                | 34              | 21              | 0.097
DCT 16x16    |                |                | 126             | 85              | 0.182
DCT 32x32    |                |                | 454             | 341             | 1.210
DCT 64x64    |                |                | 1654            | 1365            | 16.035
DCT 128x128  | 16256          | 16384          | 6166            | 5461            | 468
DHT 4x4      |                |                |                 |                 | 0.092
DHT 8x8      |                |                | 32              | 4               | 0.094
DHT 16x16    |                |                | 112             | 28              | 0.195
DHT 32x32    |                |                | 360             | 140             | 1.386
DHT 64x64    |                |                | 1200            | 620             | 17.98
DHT 128x128  |                |                | 4016            | 2604            | 340
DHT 256x256  | 65280          | 65536          | 14000           | 10668           | 10756