What is the WHT anyway, and why are there so many ways to compute it? Jeremy Johnson 1, 2, 6, 24, 112, 568, 3032, 16768,…

1 What is the WHT anyway, and why are there so many ways to compute it? Jeremy Johnson 1, 2, 6, 24, 112, 568, 3032, 16768,…

2 Walsh-Hadamard Transform: $y = \mathrm{WHT}_N\, x$, where $N = 2^n$

3 WHT Algorithms
Factor $\mathrm{WHT}_N$ into a product of sparse structured matrices.
Compute $y = (M_1 M_2 \cdots M_t)\,x$ from right to left:
$y_t = M_t x$
…
$y_2 = M_2 y_3$
$y = M_1 y_2$

4 Factoring the WHT Matrix
$AC \otimes BD = (A \otimes B)(C \otimes D)$
$A \otimes (B \otimes C) = (A \otimes B) \otimes C$
$I_m \otimes I_n = I_{mn}$
$\mathrm{WHT}_2 \otimes \mathrm{WHT}_2 = (\mathrm{WHT}_2 \otimes I_2)(I_2 \otimes \mathrm{WHT}_2)$

5 Recursive and Iterative Factorization
$\mathrm{WHT}_8 = (\mathrm{WHT}_2 \otimes I_4)(I_2 \otimes \mathrm{WHT}_4)$
$= (\mathrm{WHT}_2 \otimes I_4)(I_2 \otimes ((\mathrm{WHT}_2 \otimes I_2)(I_2 \otimes \mathrm{WHT}_2)))$
$= (\mathrm{WHT}_2 \otimes I_4)(I_2 \otimes (\mathrm{WHT}_2 \otimes I_2))(I_2 \otimes (I_2 \otimes \mathrm{WHT}_2))$
$= (\mathrm{WHT}_2 \otimes I_4)((I_2 \otimes \mathrm{WHT}_2) \otimes I_2)((I_2 \otimes I_2) \otimes \mathrm{WHT}_2)$
$= (\mathrm{WHT}_2 \otimes I_4)(I_2 \otimes \mathrm{WHT}_2 \otimes I_2)((I_2 \otimes I_2) \otimes \mathrm{WHT}_2)$
$= (\mathrm{WHT}_2 \otimes I_4)(I_2 \otimes \mathrm{WHT}_2 \otimes I_2)(I_4 \otimes \mathrm{WHT}_2)$

6 $\mathrm{WHT}_8 = (\mathrm{WHT}_2 \otimes I_4)(I_2 \otimes \mathrm{WHT}_2 \otimes I_2)(I_4 \otimes \mathrm{WHT}_2)$
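The size-8 factorization can be checked numerically. The sketch below assumes the usual definition of $\mathrm{WHT}_{2^n}$ as the $n$-fold tensor power of $\mathrm{WHT}_2$, with matrices as plain lists of rows; the helper names `kron`, `matmul`, `identity` and `wht_matrix` are illustrative and not taken from the talk.

```python
def kron(a, b):
    """Kronecker (tensor) product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def matmul(a, b):
    """Plain matrix product of two lists-of-rows matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def identity(m):
    """m-by-m identity matrix."""
    return [[1 if i == j else 0 for j in range(m)] for i in range(m)]


WHT2 = [[1, 1], [1, -1]]


def wht_matrix(n):
    """WHT of size 2**n as the n-fold tensor power of WHT2 (assumed definition)."""
    m = [[1]]
    for _ in range(n):
        m = kron(WHT2, m)
    return m


I2, I4 = identity(2), identity(4)

# Iterative factorization: three sparse factors, each built from one WHT2.
iterative = matmul(matmul(kron(WHT2, I4), kron(kron(I2, WHT2), I2)),
                   kron(I4, WHT2))

# Recursive factorization: one WHT2 stage followed by two copies of WHT4.
recursive = matmul(kron(WHT2, I4), kron(I2, wht_matrix(2)))
```

Both `iterative` and `recursive` should equal `wht_matrix(3)`, the dense 8 x 8 matrix.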

7 WHT Algorithms: Recursive, Iterative, General
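The three formulas on this slide did not survive in the transcript. Generalizing the size-8 factorizations on the preceding slides, and taking the general product from the implementation slide, they are presumably the following (a reconstruction, not the slide's exact wording):

```latex
\begin{align*}
\text{Recursive:}\quad
  \mathrm{WHT}_{2^n} &= (\mathrm{WHT}_2 \otimes I_{2^{n-1}})\,(I_2 \otimes \mathrm{WHT}_{2^{n-1}}) \\
\text{Iterative:}\quad
  \mathrm{WHT}_{2^n} &= \prod_{i=1}^{n} \bigl(I_{2^{i-1}} \otimes \mathrm{WHT}_2 \otimes I_{2^{n-i}}\bigr) \\
\text{General:}\quad
  \mathrm{WHT}_{2^n} &= \prod_{i=1}^{t} \bigl(I_{2^{n_1+\cdots+n_{i-1}}} \otimes \mathrm{WHT}_{2^{n_i}} \otimes I_{2^{n_{i+1}+\cdots+n_t}}\bigr),
  \qquad n = n_1 + \cdots + n_t
\end{align*}
```

The recursive and iterative forms are the special cases $t = 2$, $(n_1, n_2) = (1, n-1)$ and $t = n$, $n_i = 1$ of the general form.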

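8 WHT Implementation
Definition/formula
– $N = N_1 N_2 \cdots N_t$, $N_i = 2^{n_i}$
– $x^{M}_{b,s} = (x(b), x(b+s), \ldots, x(b+(M-1)s))$
– $\mathrm{WHT}_{2^n} = \prod_{i=1}^{t} \bigl(I_{2^{n_1+\cdots+n_{i-1}}} \otimes \mathrm{WHT}_{2^{n_i}} \otimes I_{2^{n_{i+1}+\cdots+n_t}}\bigr)$
Implementation (nested loop)
    R = N; S = 1;
    for i = t, …, 1
        R = R / N_i
        for j = 0, …, R-1
            for k = 0, …, S-1
                x^{N_i}_{j N_i S + k, S} = WHT_{N_i} * x^{N_i}_{j N_i S + k, S}
        S = S * N_i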
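A minimal Python sketch of the nested loop, under the assumption that each factor $\mathrm{WHT}_{N_i}$ is applied in place to the strided subvector starting at index $j N_i S + k$ with stride $S$. The names `wht_direct` and `wht_split` are hypothetical, and the base case here is a plain butterfly loop rather than the unrolled code the package generates.

```python
def wht_direct(x, base, stride, n):
    """In-place WHT of size 2**n on x[base], x[base+stride], ... (butterfly base case)."""
    size = 1 << n
    h = 1
    while h < size:
        for start in range(0, size, 2 * h):
            for j in range(start, start + h):
                p = base + j * stride
                q = base + (j + h) * stride
                x[p], x[q] = x[p] + x[q], x[p] - x[q]
        h *= 2


def wht_split(x, parts):
    """Apply WHT_N in place, N = 2**sum(parts), one tensor factor per entry of parts."""
    N = 1 << sum(parts)
    assert len(x) == N
    R, S = N, 1
    for n_i in reversed(parts):          # i = t, ..., 1
        N_i = 1 << n_i
        R //= N_i
        for j in range(R):
            for k in range(S):
                # WHT_{N_i} on the subvector with base j*N_i*S + k and stride S
                wht_direct(x, j * N_i * S + k, S, n_i)
        S *= N_i
    return x
```

For example, `wht_split([1, 2, 3, 4], [1, 1])` and `wht_split([1, 2, 3, 4], [2])` both return `[10, -2, -4, 0]`: the ordered partition changes the order of the memory accesses, not the result.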

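9 Partition Trees
[Figure: example partition trees. For n = 4: right recursive (4 → 1, 3; 3 → 1, 2; 2 → 1, 1), left recursive (4 → 3, 1; 3 → 2, 1; 2 → 1, 1), iterative (4 → 1, 1, 1, 1), and balanced (4 → 2, 2; each 2 → 1, 1). A general tree with root 9 and children 3, 4, 2.]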

10 Ordered Partitions
There is a 1-1 mapping from ordered partitions of $n$ onto $(n-1)$-bit binary numbers.
⇒ There are $2^{n-1}$ ordered partitions of $n$.
Example: 1|1 1|1 1 1 1|1 1 → 1 + 2 + 4 + 2 = 9, encoded as 162 = 1 0 1 0 0 0 1 0 (a 1 bit marks a bar between adjacent units).
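The bijection can be sketched in a few lines of Python. The convention assumed here, read off the slide's example, is that the most significant bit corresponds to the gap after the first unit and a set bit places a bar in that gap; `composition_from_bits` is an illustrative name.

```python
def composition_from_bits(bits, n):
    """Map an (n-1)-bit number to an ordered partition of n.

    Bits are read from most to least significant; a 1 bit closes the
    current part (a bar), a 0 bit extends it by one unit.
    """
    parts, run = [], 1
    for i in range(n - 2, -1, -1):
        if (bits >> i) & 1:
            parts.append(run)
            run = 1
        else:
            run += 1
    parts.append(run)
    return parts


# The slide's example: 162 = 0b10100010 encodes 1 + 2 + 4 + 2 = 9.
example = composition_from_bits(0b10100010, 9)
```

Running it over all `2**(n-1)` bit patterns enumerates every ordered partition of `n` exactly once.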

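11 Enumerating Partition Trees
[Figure: the partition trees of size 3, grouped by the 2-bit code (00, 01, 10, 11) of the ordered partition at the root: the single leaf 3; splits of 3 into 1, 2 and into 2, 1, each with the 2 either left as a leaf or split into 1, 1; and the split of 3 into 1, 1, 1.]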

12 Counting Partition Trees
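The counting formulas on this slide are missing from the transcript. The sketch below makes one assumption: a node of size n is either a leaf or is split by an ordered partition of n into at least two parts, each part again the root of a partition tree. Under that assumption the count reproduces the sequence in the talk's title. The function names are illustrative.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def forests(m):
    """Sum, over all ordered partitions of m, of the product of tree counts of the parts."""
    if m == 0:
        return 1
    return sum(count_trees(k) * forests(m - k) for k in range(1, m + 1))


@lru_cache(maxsize=None)
def count_trees(n):
    """Number of partition trees with root n (a leaf, or a split into >= 2 parts)."""
    # First part k < n forces at least two parts; the rest is any forest of size n - k.
    return 1 + sum(count_trees(k) * forests(n - k) for k in range(1, n))


first_counts = [count_trees(n) for n in range(1, 9)]
```

`first_counts` evaluates to 1, 2, 6, 24, 112, 568, 3032, 16768, matching the title.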

13 WHT Package
Püschel & Johnson (ICASSP ’00)
Allows easy implementation of any of the possible WHT algorithms
Partition tree representation: W(n) = small[n] | split[W(n_1), …, W(n_t)]
Tools
– Measure runtime of any algorithm
– Measure hardware events (coupled with PCL)
– Search for good implementation
  – Dynamic programming
  – Evolutionary algorithm
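To make the grammar concrete, here is a small Python interpreter for partition trees. It is only an illustration of the representation, not the package's C implementation: trees are encoded as the hypothetical tuples `("small", n)` and `("split", [subtrees])`, and `small[n]` is computed with a simple butterfly loop.

```python
def size(tree):
    """Exponent n of the transform size 2**n computed by a partition tree."""
    kind, arg = tree
    return arg if kind == "small" else sum(size(t) for t in arg)


def apply_tree(tree, x, base=0, stride=1):
    """Apply the WHT described by tree, in place, to x[base + m*stride]."""
    kind, arg = tree
    N = 1 << size(tree)
    if kind == "small":
        h = 1
        while h < N:
            for start in range(0, N, 2 * h):
                for j in range(start, start + h):
                    p, q = base + j * stride, base + (j + h) * stride
                    x[p], x[q] = x[p] + x[q], x[p] - x[q]
            h *= 2
        return x
    R, S = N, 1
    for child in reversed(arg):              # rightmost factor first
        N_i = 1 << size(child)
        R //= N_i
        for j in range(R):
            for k in range(S):
                apply_tree(child, x, base + (j * N_i * S + k) * stride, S * stride)
        S *= N_i
    return x


leaf = ("small", 1)
iterative = ("split", [leaf, leaf, leaf, leaf])
right_recursive = ("split", [leaf, ("split", [leaf, ("split", [leaf, leaf])])])
balanced = ("split", [("split", [leaf, leaf]), ("split", [leaf, leaf])])
```

All of these trees compute the same size-16 transform; they differ only in how the work is scheduled, which is what the package's measurement and search tools explore.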

14 Histogram (n = 16, 10,000 samples)
Wide range in performance despite equal number of arithmetic operations ($n 2^n$ flops)
Pentium III consumes more run time (more pipeline stages)
UltraSPARC II spans a larger range

15 Operation Count
Theorem. Let $W_N$ be a WHT algorithm of size $N$. Then the number of floating point operations (flops) used by $W_N$ is $N \lg N$.
Proof. By induction.
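The slide does not spell out the induction. One way to fill it in, assuming that a base case of size $2^m$ uses $m 2^m$ additions and subtractions and that an algorithm for $N = N_1 \cdots N_t$ with $N_i = 2^{n_i}$ applies each $W_{N_i}$ to $N/N_i$ strided subvectors, is:

```latex
\begin{align*}
\mathrm{flops}(W_N)
  &= \sum_{i=1}^{t} \frac{N}{N_i}\,\mathrm{flops}(W_{N_i})
     && \text{factor $i$ calls $W_{N_i}$ on $N/N_i$ subvectors} \\
  &= \sum_{i=1}^{t} \frac{N}{N_i}\, N_i \lg N_i
     && \text{induction hypothesis} \\
  &= N \sum_{i=1}^{t} n_i \;=\; N n \;=\; N \lg N .
\end{align*}
```

The count is therefore independent of the partition tree, which is why the spread in the runtime histogram has to come from something other than arithmetic.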

16 Instruction Count Model
$A(n)$ = number of calls to the WHT procedure
$\alpha$ = number of instructions outside loops
$A_l(n)$ = number of calls to the base case of size $l$
$\alpha_l$ = number of instructions in the base case of size $l$
$L_i$ = number of iterations of the outer ($i=1$), middle ($i=2$), and inner ($i=3$) loop
$\beta_i$ = number of instructions in the outer ($i=1$), middle ($i=2$), and inner ($i=3$) loop body

17 Small[1]
        .file "s_1.c"
        .version "01.01"
    gcc2_compiled.:
        .text
        .align 4
        .globl apply_small1
        .type apply_small1,@function
    apply_small1:
        movl 8(%esp),%edx       // load stride S into EDX
        movl 12(%esp),%eax      // load base address of array x into EAX
        fldl (%eax)             // st(0)=R7=x[0]
        fldl (%eax,%edx,8)      // st(0)=R6=x[S]
        fld %st(1)              // st(0)=R5=x[0]
        fadd %st(1),%st         // R5=x[0]+x[S]
        fxch %st(2)             // st(0)=R5=x[0], st(2)=R7=x[0]+x[S]
        fsubp %st,%st(1)        // st(0)=R6=x[0]-x[S] (in AT&T syntax, fsubp with an %st(i) destination computes %st - %st(i))
        fxch %st(1)             // st(0)=R6=x[0]+x[S], st(1)=R7=x[0]-x[S]
        fstpl (%eax)            // store x[0]=x[0]+x[S]
        fstpl (%eax,%edx,8)     // store x[S]=x[0]-x[S]
        ret

18 Recurrences

20 Histogram using Instruction Model (P3)
$\alpha_1 = 12$, $\alpha_2 = 34$, and $\alpha_3 = 106$
$\alpha = 27$
$\beta_1 = 18$, $\beta_2 = 18$, and $\beta_3 = 20$

21 Algorithm Comparison

22 Dynamic Programming
$\min_{n_1 + \cdots + n_t = n} \mathrm{Cost}(T)$, where $T$ is the tree with root $n$ and subtrees $T_{n_1}, \ldots, T_{n_t}$, and $T_n$ is the optimal tree of size $n$.
This depends on the assumption that Cost depends only on the size of a tree and not on where it is located (true for IC, but false for runtime).
For IC, the optimal tree is iterative with appropriate leaves.
For runtime, DP is a good heuristic (used with binary trees).
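A minimal sketch of the search restricted to binary splits. The cost function is a toy stand-in, not the talk's instruction-count model or a measured runtime: a split of size n pays a fixed overhead once and calls a child of size k exactly 2**(n-k) times. The leaf costs 12, 34, 106 and the overhead 27 are borrowed from the instruction-model slide purely as sample numbers. A leaf is encoded as an int and a split as a pair of subtrees; `dp_best_trees` is a hypothetical name.

```python
def dp_best_trees(max_n, leaf_cost, split_overhead):
    """Return {n: (cost, tree)} with the cheapest tree found for each size up to max_n.

    Relies on the DP assumption from the slide: the cost of a subtree
    depends only on its size, so best[k] can be reused wherever a
    child of size k is needed.
    """
    best = {}
    for n in range(1, max_n + 1):
        candidates = []
        if n in leaf_cost:
            candidates.append((leaf_cost[n], n))
        for k in range(1, n):
            left_cost, left = best[k]
            right_cost, right = best[n - k]
            cost = (split_overhead
                    + (1 << (n - k)) * left_cost   # left child runs 2**(n-k) times
                    + (1 << k) * right_cost)       # right child runs 2**k times
            candidates.append((cost, (left, right)))
        best[n] = min(candidates, key=lambda c: c[0])
    return best


# Toy model: leaf_cost must contain size 1 so every size has a candidate.
best = dp_best_trees(6, leaf_cost={1: 12, 2: 34, 3: 106}, split_overhead=27)
```

With these sample numbers the search keeps sizes 1 to 3 as leaves and picks the balanced split (2, 2) for n = 4. With a measured runtime in place of the toy cost, the same loop is only a heuristic, as the slide notes.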

