1
What is the WHT anyway, and why are there so many ways to compute it? Jeremy Johnson
1, 2, 6, 24, 112, 568, 3032, 16768, … (the number of WHT algorithms for N = 2^1, 2^2, …)
2
Walsh-Hadamard Transform: y = WHT_N x, N = 2^n
3
WHT Algorithms
Factor WHT_N into a product of sparse structured matrices: WHT_N = M_1 M_2 ··· M_t
Compute y = (M_1 M_2 ··· M_t) x in stages:
  y_t = M_t x, …, y_2 = M_2 y_3, y = M_1 y_2
4
Factoring the WHT Matrix
(A ⊗ C)(D ⊗ E) = AD ⊗ CE
A ⊗ C = (A ⊗ I_m)(I_n ⊗ C), for A of size n and C of size m
WHT_mn = WHT_m ⊗ WHT_n
Example: WHT_4 = WHT_2 ⊗ WHT_2 = (WHT_2 ⊗ I_2)(I_2 ⊗ WHT_2)
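These identities can be checked numerically; a minimal sketch, where kron and matmul are ad-hoc list-of-lists helpers written for this check, not part of the WHT package:

```python
def kron(A, B):
    """Kronecker (tensor) product: (A x B)[i*p+k][j*q+l] = A[i][j]*B[k][l]."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def matmul(A, B):
    """Plain matrix product of two lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

W2 = [[1, 1], [1, -1]]     # WHT_2
I2 = [[1, 0], [0, 1]]

# WHT_4 = WHT_2 x WHT_2 = (WHT_2 x I_2)(I_2 x WHT_2)
W4 = kron(W2, W2)
factored = matmul(kron(W2, I2), kron(I2, W2))
```

Running this, W4 and factored agree entry by entry, which is exactly the (A ⊗ C)(D ⊗ E) = AD ⊗ CE identity with A = D = WHT_2 and C = E = I_2 swapped appropriately.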
5
Recursive and Iterative Factorization
Recursive: WHT_8 = (WHT_2 ⊗ I_4)(I_2 ⊗ WHT_4) = (WHT_2 ⊗ I_4)(I_2 ⊗ (WHT_2 ⊗ I_2)(I_2 ⊗ WHT_2))
Iterative: WHT_8 = (WHT_2 ⊗ I_4)(I_2 ⊗ WHT_2 ⊗ I_2)(I_4 ⊗ WHT_2)
6
WHT_8 = (WHT_2 ⊗ I_4)(I_2 ⊗ WHT_2 ⊗ I_2)(I_4 ⊗ WHT_2)
7
WHT Algorithms
–Recursive
–Iterative
–General
8
WHT Implementation
Definition/formula:
  N = N_1 N_2 ··· N_t, N_i = 2^(n_i)
  WHT_N = ∏_{i=1..t} ( I_{2^(n_1 + ··· + n_{i-1})} ⊗ WHT_{2^(n_i)} ⊗ I_{2^(n_{i+1} + ··· + n_t)} )
  x = WHT_N · x (computed in place)
  x(b, s, M) = (x(b), x(b+s), …, x(b+(M-1)s)) denotes the strided subvector
Implementation (nested loop):
  R = N; S = 1;
  for i = t, …, 1
    R = R / N_i
    for j = 0, …, R-1
      for k = 0, …, S-1
        x(j·N_i·S + k, S, N_i) = WHT_{N_i} · x(j·N_i·S + k, S, N_i)
    S = S · N_i
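The triple loop above can be sketched directly in Python; wht_naive (an O(M^2) base case built from the Hadamard sign pattern) and the list-based indexing are illustrative choices for this sketch, not the package's actual implementation:

```python
def wht_naive(v):
    """O(M^2) WHT of a small vector: H[a][b] = (-1)^popcount(a & b)."""
    M = len(v)
    return [sum(((-1) ** bin(a & b).count("1")) * v[b] for b in range(M))
            for a in range(M)]

def wht(x, ns):
    """Iterative WHT following the slide's triple loop.
    ns = [n1, ..., nt] with n1 + ... + nt = lg(len(x))."""
    x = list(x)
    N = len(x)
    assert N == 2 ** sum(ns)
    R, S = N, 1
    for ni in reversed(ns):          # i = t, ..., 1
        Ni = 2 ** ni
        R //= Ni
        for j in range(R):
            for k in range(S):
                b = j * Ni * S + k                       # base of x(b, S, Ni)
                sub = wht_naive([x[b + m * S] for m in range(Ni)])
                for m in range(Ni):
                    x[b + m * S] = sub[m]
        S *= Ni
    return x
```

Any split of n gives the same transform, e.g. wht(x, [1, 1, 1]), wht(x, [1, 2]), and wht(x, [3]) all agree for a length-8 input.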
9
Partition Trees (n = 4)
  Right recursive: 4 → (1, 3), 3 → (1, 2), 2 → (1, 1)
  Iterative: 4 → (1, 1, 1, 1)
  Left recursive: 4 → (3, 1), 3 → (2, 1), 2 → (1, 1)
  Balanced: 4 → (2, 2), each 2 → (1, 1)
Example (n = 9): 9 → (3, 4, 2), 4 → (1, 2, 1), 2 → (1, 1)
10
Ordered Partitions
There is a 1-1 mapping from ordered partitions of n onto (n-1)-bit binary numbers, so there are 2^(n-1) ordered partitions of n.
Example: 1+2+4+2 = 9 corresponds to the bar pattern 1|1 1|1 1 1 1|1 1, i.e. bars in gaps 1, 3, and 7 of the 8 gaps, giving the binary number 1 0 1 0 0 0 1 0 = 162.
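The bar/bit bijection can be made concrete; a sketch (reading the n-1 gap bits most-significant first, to match the 162 example above):

```python
def compositions(n):
    """Ordered partitions of n, one per (n-1)-bit number.
    Bit i (MSB first) set = a bar in gap i+1 between the n ones."""
    out = []
    for bits in range(2 ** (n - 1)):
        parts, run = [], 1
        for i in range(n - 1):
            if bits >> (n - 2 - i) & 1:   # bar after position i+1
                parts.append(run)
                run = 1
            else:
                run += 1
        parts.append(run)
        out.append(parts)
    return out
```

With this convention, entry 162 of compositions(9) is exactly the partition 1+2+4+2 from the slide.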
11
Enumerating Partition Trees
The 2-bit numbers enumerate the ordered partitions of 3: the partitions 3, 1+2, 2+1, and 1+1+1 correspond to 00, 01, 10, and 11; recursively enumerating the trees for each part enumerates all partition trees.
12
Counting Partition Trees
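The sequence from the title slide (1, 2, 6, 24, 112, 568, …) can be reproduced by a direct count; a sketch, assuming a tree of size n is either a leaf or an ordered split n = n_1 + ··· + n_t with t ≥ 2 independently chosen subtrees:

```python
from functools import lru_cache

def count_trees(n):
    """Number of WHT partition trees of size n."""
    @lru_cache(None)
    def f(m):
        # Sum over ALL compositions of m (t >= 1) of the product of tree counts.
        if m == 0:
            return 1
        return sum(t(k) * f(m - k) for k in range(1, m + 1))

    @lru_cache(None)
    def t(m):
        # A leaf, plus splits: first part k, remainder any composition of m-k.
        if m == 1:
            return 1
        return 1 + sum(t(k) * f(m - k) for k in range(1, m))

    return t(n)
```

For n = 1, …, 8 this yields 1, 2, 6, 24, 112, 568, 3032, 16768, matching the sequence on the title slide.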
13
WHT Package (Püschel & Johnson, ICASSP ’00)
Allows easy implementation of any of the possible WHT algorithms
Partition tree representation: W(n) = small[n] | split[W(n_1), …, W(n_t)]
Tools
–Measure runtime of any algorithm
–Measure hardware events (coupled with PCL)
–Search for good implementations
  Dynamic programming
  Evolutionary algorithm
14
Histogram (n = 16, 10,000 samples)
Wide range in performance despite an equal number of arithmetic operations (n·2^n flops)
Pentium III consumes more run time (more pipeline stages)
UltraSPARC II spans a larger range
15
Operation Count
Theorem. Let W_N be any WHT algorithm of size N = 2^n. Then the number of floating point operations (flops) used by W_N is N·lg(N).
Proof. By induction on the partition tree: a straight-line leaf of size M uses M·lg(M) flops, and a split with children of sizes N_1, …, N_t applies each WHT_{N_i} to N/N_i subvectors, costing Σ_i (N/N_i)·N_i·lg(N_i) = N·Σ_i lg(N_i) = N·lg(N) flops.
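The theorem can be checked mechanically on any partition tree; a sketch, encoding a tree as an int leaf or a list of subtrees (mirroring the package's small/split grammar):

```python
def size(t):
    """Size (exponent n) of a partition tree."""
    return t if isinstance(t, int) else sum(size(c) for c in t)

def flops(t):
    """Flop count of a partition tree: a straight-line leaf of size n costs
    n*2^n (n stages of 2^n adds/subtracts); a split of size n runs each
    child of size m 2^(n-m) times."""
    if isinstance(t, int):
        return t * 2 ** t
    n = size(t)
    return sum(2 ** (n - size(c)) * flops(c) for c in t)
```

Every tree of the same size gives the same count, e.g. flops(2), flops([1, 1]), and any other tree of size n all evaluate to n·2^n.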
16
Instruction Count Model
A(n) = number of calls to the WHT procedure
α = number of instructions outside the loops
A_l(n) = number of calls to the base case of size l
α_l = number of instructions in the base case of size l
L_i(n) = number of iterations of the outer (i = 1), middle (i = 2), and inner (i = 3) loop
β_i = number of instructions in the body of the outer (i = 1), middle (i = 2), and inner (i = 3) loop
17
Small[1]
.file "s_1.c"
.version "01.01"
gcc2_compiled.:
.text
.align 4
.globl apply_small1
.type apply_small1,@function
apply_small1:
movl 8(%esp),%edx    // load stride S into EDX
movl 12(%esp),%eax   // load x array's base address into EAX
fldl (%eax)          // st(0) = x[0]
fldl (%eax,%edx,8)   // st(0) = x[S], st(1) = x[0]
fld %st(1)           // st(0) = x[0], st(1) = x[S], st(2) = x[0]
fadd %st(1),%st      // st(0) = x[0]+x[S]
fxch %st(2)          // st(0) = x[0], st(1) = x[S], st(2) = x[0]+x[S]
fsubp %st,%st(1)     // st(0) = x[0]-x[S], st(1) = x[0]+x[S]
fxch %st(1)          // st(0) = x[0]+x[S], st(1) = x[0]-x[S]
fstpl (%eax)         // store x[0] = x[0]+x[S]
fstpl (%eax,%edx,8)  // store x[S] = x[0]-x[S]
ret
18
Recurrences
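The counts A(n) and A_l(n) in the model satisfy simple recurrences over the partition tree; a sketch, under the assumption (from the factorization) that each call to a split node of size n invokes a child of size m a total of 2^(n-m) times:

```python
def tree_size(t):
    """Size (exponent n) of a partition tree: int leaf or list of subtrees."""
    return t if isinstance(t, int) else sum(tree_size(c) for c in t)

def call_counts(t, mult=1, base=None):
    """Return (total WHT procedure calls, {leaf size: number of calls})
    for partition tree t executed mult times."""
    if base is None:
        base = {}
    if isinstance(t, int):                # base case small[t]
        base[t] = base.get(t, 0) + mult
        return mult, base
    n = tree_size(t)
    total = mult                          # the call to the split node itself
    for c in t:
        sub, _ = call_counts(c, mult * 2 ** (n - tree_size(c)), base)
        total += sub
    return total, base
```

For WHT_4 = (WHT_2 ⊗ I_2)(I_2 ⊗ WHT_2), the tree [1, 1] gives 5 calls in total, 4 of them to the size-1 base case; at 2 flops per size-1 call this is consistent with N·lg(N) = 8.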
20
Histogram using Instruction Model (P3)
α = 27; α_1 = 12, α_2 = 34, α_3 = 106 (base cases of sizes 1, 2, 3)
β_1 = 18, β_2 = 18, β_3 = 20
21
Algorithm Comparison
22
Dynamic Programming
T_n = min over n_1 + ··· + n_t = n of Cost(split[T_{n_1}, …, T_{n_t}]), where T_n is the optimal tree of size n.
This depends on the assumption that Cost only depends on the size of a tree and not on where it is located (true for IC, but false for runtime).
For IC, the optimal tree is iterative with appropriate leaves.
For runtime, DP is a good heuristic (used with binary trees).
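The DP can be sketched with a toy cost model; leaf_cost and split_cost below are hypothetical callbacks standing in for the package's measured or modeled costs:

```python
def compositions(m):
    """Ordered partitions of m, enumerated by choice of first part."""
    if m == 0:
        yield []
        return
    for first in range(1, m + 1):
        for rest in compositions(m - first):
            yield [first] + rest

def optimal_tree(n, leaf_cost, split_cost):
    """DP over partition trees, assuming a subtree's cost depends only on
    its size (true for IC, false for runtime, per the slide).
    Returns (cost, tree) with int leaves and list splits."""
    best = {}
    for m in range(1, n + 1):
        best[m] = (leaf_cost(m), m)               # option 1: a leaf
        for parts in compositions(m):
            if len(parts) < 2:
                continue                          # skip the trivial split
            c = split_cost(m, parts) + sum(best[p][0] for p in parts)
            if c < best[m][0]:
                best[m] = (c, [best[p][1] for p in parts])
    return best[n]
```

With a cost model where straight-line leaves grow quickly (say leaf_cost(m) = m·4^m) and each split adds a per-child overhead, the DP picks a flat, iterative-style tree, consistent with the IC observation above.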