What is the WHT anyway, and why are there so many ways to compute it?
Jeremy Johnson
1, 2, 6, 24, 112, 568, 3032, 16768, …
Walsh-Hadamard Transform
y = WHT_N · x,  N = 2^n
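Concretely, WHT_2 = [[1, 1], [1, -1]] and WHT_N is the n-fold Kronecker product WHT_2 ⊗ ⋯ ⊗ WHT_2. A direct O(N²) way to form and apply the transform, sketched with NumPy (the function name `wht_matrix` is illustrative, not part of the WHT package):

```python
import numpy as np

def wht_matrix(n):
    """WHT_{2^n} as the n-fold Kronecker product of WHT_2."""
    W2 = np.array([[1, 1], [1, -1]])
    W = np.array([[1]])
    for _ in range(n):
        W = np.kron(W, W2)
    return W

x = np.array([1.0, 2.0, 3.0, 4.0])
y = wht_matrix(2) @ x   # y = WHT_4 x = [10, -2, -4, 0]
```

The fast algorithms below compute the same y in N·lg(N) operations instead of N².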
WHT Algorithms
Factor WHT_N into a product of sparse structured matrices and compute y = (M_1 M_2 ⋯ M_t) x by
y_t = M_t x, …, y_2 = M_2 y_3, y = M_1 y_2
Factoring the WHT Matrix
(A ⊗ B)(C ⊗ D) = AC ⊗ BD
A_m ⊗ C_n = (A_m ⊗ I_n)(I_m ⊗ C_n)
WHT_4 = WHT_2 ⊗ WHT_2 = (WHT_2 ⊗ I_2)(I_2 ⊗ WHT_2)
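Both tensor-product identities, and the resulting factorization of WHT_4, can be checked numerically. A minimal sketch with NumPy (random test matrices; not package code):

```python
import numpy as np

rng = np.random.default_rng(0)
A, C = rng.standard_normal((2, 2)), rng.standard_normal((2, 2))
B, D = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))

# (A (x) B)(C (x) D) = AC (x) BD
ok_mult = np.allclose(np.kron(A, B) @ np.kron(C, D),
                      np.kron(A @ C, B @ D))

# A_m (x) C_n = (A_m (x) I_n)(I_m (x) C_n), here m = 2, n = 3
ok_split = np.allclose(np.kron(A, B),
                       np.kron(A, np.eye(3)) @ np.kron(np.eye(2), B))

# WHT_4 = WHT_2 (x) WHT_2 = (WHT_2 (x) I_2)(I_2 (x) WHT_2)
W2 = np.array([[1, 1], [1, -1]])
ok_wht = np.allclose(np.kron(W2, W2),
                     np.kron(W2, np.eye(2)) @ np.kron(np.eye(2), W2))
```

The second identity is the one doing the work: it turns a tensor product into a product of two sparse factors, each of which is cheap to apply.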
Recursive and Iterative Factorization
WHT_8 = (WHT_2 ⊗ I_4)(I_2 ⊗ WHT_4)
      = (WHT_2 ⊗ I_4)(I_2 ⊗ (WHT_2 ⊗ I_2)(I_2 ⊗ WHT_2))   [recursive]
WHT_8 = (WHT_2 ⊗ I_4)(I_2 ⊗ WHT_2 ⊗ I_2)(I_4 ⊗ WHT_2)     [iterative]
WHT Algorithms
– Recursive
– Iterative
– General
WHT Implementation
Definition/formula:
  N = N_1 N_2 ⋯ N_t,  N_i = 2^{n_i}
  x_{b,s}^{M} = (x(b), x(b+s), …, x(b+(M-1)s))
  WHT_{2^n} = ∏_{i=1}^{t} (I_{2^{n_1+⋯+n_{i-1}}} ⊗ WHT_{2^{n_i}} ⊗ I_{2^{n_{i+1}+⋯+n_t}})
Implementation (nested loop):
  R = N; S = 1;
  for i = t, …, 1
    R = R / N_i
    for j = 0, …, R-1
      for k = 0, …, S-1
        x_{jN_iS+k, S}^{N_i} = WHT_{N_i} · x_{jN_iS+k, S}^{N_i}
    S = S · N_i
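The triple loop above can be sketched directly in Python. `small_wht` and `wht_inplace` are illustrative names (the WHT package itself is in C); the schedule and indexing follow the pseudocode, with `S` the stride and `b = j·N_i·S + k` the base of each strided subvector:

```python
def small_wht(v):
    """Direct WHT of a small vector (length a power of 2), via the
    recursive split WHT_N = (WHT_2 (x) I_{N/2})(I_2 (x) WHT_{N/2})."""
    if len(v) == 1:
        return list(v)
    h = len(v) // 2
    a, b = small_wht(v[:h]), small_wht(v[h:])
    return [p + q for p, q in zip(a, b)] + [p - q for p, q in zip(a, b)]

def wht_inplace(x, ns):
    """Apply WHT_N to x in place; N = 2^(n_1+...+n_t), ns = [n_1, ..., n_t].
    Factor i applies WHT_{N_i}, N_i = 2^(n_i), to R*S subvectors of
    stride S starting at b = j*N_i*S + k."""
    N = len(x)
    assert N == 1 << sum(ns)
    R, S = N, 1
    for ni in reversed(ns):        # i = t, ..., 1 (rightmost factor first)
        Ni = 1 << ni
        R //= Ni
        for j in range(R):
            for k in range(S):
                b = j * Ni * S + k
                sub = small_wht([x[b + m * S] for m in range(Ni)])
                for m in range(Ni):
                    x[b + m * S] = sub[m]
        S *= Ni

x = [1.0, 2.0, 3.0, 4.0]
wht_inplace(x, [1, 1])   # x is now [10.0, -2.0, -4.0, 0.0]
```

Any split of n gives the same result; only the memory-access pattern (and hence the run time) changes.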
Partition Trees
– Right recursive
– Iterative
– Left recursive
– Balanced
Ordered Partitions
There is a 1-1 mapping from ordered partitions of n onto (n-1)-bit binary numbers, so there are 2^{n-1} ordered partitions of n: write n as a row of n ones and place a bar in a gap for each 1 bit. For example, with n = 4, 1|1 1|1 corresponds to the partition 1 + 2 + 1 and the binary number 101.
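The bijection is easy to turn into an enumerator. A sketch (the function name is illustrative): each (n-1)-bit number selects which gaps between the n ones get a bar.

```python
def ordered_partitions(n):
    """Enumerate ordered partitions of n via (n-1)-bit numbers:
    bit i set = a bar between position i and i+1 in a row of n ones."""
    parts = []
    for bits in range(1 << (n - 1)):
        p, run = [], 1
        for i in range(n - 1):
            if bits >> i & 1:   # bar: close the current block
                p.append(run)
                run = 1
            else:               # no bar: extend the block
                run += 1
        p.append(run)
        parts.append(p)
    return parts

print(len(ordered_partitions(4)))   # 2^(4-1) = 8
```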
Enumerating Partition Trees
Counting Partition Trees
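The sequence on the title slide (1, 2, 6, 24, 112, 568, 3032, 16768, …) is reproduced by the recurrence "a tree of size n is either a leaf or an ordered split into at least two subtrees". A sketch with a helper `forest` (my name) for the convolution over compositions:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def forest(n):
    """Sum over all ordered compositions of n of the product of tree counts."""
    if n == 0:
        return 1
    return sum(trees(k) * forest(n - k) for k in range(1, n + 1))

@lru_cache(maxsize=None)
def trees(n):
    """Number of partition trees of size n: a leaf, or a split whose first
    subtree has size k < n followed by a forest on the remaining n - k."""
    if n == 1:
        return 1
    return 1 + sum(trees(k) * forest(n - k) for k in range(1, n))

print([trees(n) for n in range(1, 9)])
# -> [1, 2, 6, 24, 112, 568, 3032, 16768]
```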
WHT Package (Püschel & Johnson, ICASSP ’00)
Allows easy implementation of any of the possible WHT algorithms.
Partition tree representation: W(n) = small[n] | split[W(n_1), …, W(n_t)]
Tools:
– Measure runtime of any algorithm
– Measure hardware events (coupled with PCL)
– Search for good implementations: dynamic programming, evolutionary algorithm
Histogram (n = 16, 10,000 samples)
– Wide range in performance despite an equal number of arithmetic operations (n·2^n flops)
– Pentium III consumes more run time (more pipeline stages)
– UltraSPARC II spans a larger range
Operation Count
Theorem. Let W_N be a WHT algorithm of size N. Then the number of floating point operations (flops) used by W_N is N·lg(N).
Proof. By induction: a leaf of size 2^m uses m·2^m flops, and a split into sizes N_1, …, N_t uses Σ_i (N/N_i)·N_i·lg(N_i) = N·lg(N_1 ⋯ N_t) = N·lg(N) flops.
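The induction can be checked mechanically over arbitrary partition trees. A sketch with my own tree encoding (an int m is a leaf WHT_{2^m}; a list is a split):

```python
def tree_flops(tree):
    """Flop count of a WHT partition tree.  A leaf of size 2^m costs
    m * 2^m flops; in a split of total size N = 2^n, the child transform
    of size 2^{n_i} is applied N / 2^{n_i} times.  Returns (n, flops)."""
    if isinstance(tree, int):
        return tree, tree * (1 << tree)
    children = [tree_flops(t) for t in tree]
    n = sum(ni for ni, _ in children)
    N = 1 << n
    return n, sum((N >> ni) * f for ni, f in children)

# every algorithm of size N = 2^n uses N * n flops, whatever the tree
for t in (4, [1, 3], [2, 2], [1, 1, 1, 1], [[1, 1], 2]):
    n, f = tree_flops(t)
    assert f == (1 << n) * n
```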
Instruction Count Model
A(n) = number of calls to the WHT procedure; α = number of instructions outside the loops
A_l(n) = number of calls to the base case of size l; α_l = number of instructions in the base case of size l
L_i = number of iterations of the outer (i = 1), middle (i = 2), and inner (i = 3) loop
β_i = number of instructions in the outer (i = 1), middle (i = 2), and inner (i = 3) loop body
Small[1]
  .file    "s_1.c"
  .version "01.01"
gcc2_compiled.:
  .text
  .align 4
  .globl apply_small1
  .type  apply_small1,@function
apply_small1:
  movl  8(%esp),%edx       // load stride S into EDX
  movl  12(%esp),%eax      // load x array's base address into EAX
  fldl  (%eax)             // st(0) = x[0]
  fldl  (%eax,%edx,8)      // st(0) = x[S], st(1) = x[0]
  fld   %st(1)             // st(0) = x[0], st(1) = x[S], st(2) = x[0]
  fadd  %st(1),%st         // st(0) = x[0] + x[S]
  fxch  %st(2)             // st(0) = x[0], st(2) = x[0] + x[S]
  fsubp %st,%st(1)         // st(0) = x[S] - x[0], st(1) = x[0] + x[S]
  fxch  %st(1)             // st(0) = x[0] + x[S], st(1) = x[S] - x[0]
  fstpl (%eax)             // store x[0] = x[0] + x[S]
  fstpl (%eax,%edx,8)      // store x[S] = x[S] - x[0]
  ret
Recurrences
Histogram using Instruction Model (P3)
α_1 = 12, α_2 = 34, α_3 = 106 (base cases); α = 27
β_1 = 18, β_2 = 18, β_3 = 20 (loop bodies)
Algorithm Comparison
Dynamic Programming
T_n = argmin_{n_1 + ⋯ + n_t = n} Cost(split[T_{n_1}, …, T_{n_t}]), where T_n is the optimal tree of size n.
This depends on the assumption that Cost depends only on the size of a tree and not on where it is located (true for IC, but false for runtime).
For IC, the optimal tree is iterative with appropriate leaves.
For runtime, DP is a good heuristic (used with binary trees).
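A minimal DP sketch over binary splits, as used for the runtime heuristic. Everything here is illustrative, not package code: `leaf_cost(m)` and `split_overhead(n1, n2)` are hypothetical cost-model hooks standing in for measured runtimes, and the DP is valid only under the stated assumption that cost depends on subtree size alone.

```python
from functools import lru_cache

def best_tree(n, leaf_cost, split_overhead):
    """Return (cost, tree) for the best binary partition tree of size n.
    Trees are encoded as an int m (leaf) or a pair (left, right)."""
    @lru_cache(maxsize=None)
    def best(m):
        choices = [(leaf_cost(m), m)]            # option 1: a leaf
        for n1 in range(1, m):                   # option 2: split m = n1 + n2
            n2 = m - n1
            c1, t1 = best(n1)
            c2, t2 = best(n2)
            # the size-2^{n1} child is applied 2^{n2} times, and vice versa
            cost = (1 << n2) * c1 + (1 << n1) * c2 + split_overhead(n1, n2)
            choices.append((cost, (t1, t2)))
        return min(choices, key=lambda c: c[0])
    return best(n)

# toy model: leaves larger than 2 heavily penalized, each split costs 10
cost, tree = best_tree(4,
                       lambda m: m * (1 << m) + (0 if m <= 2 else 10**6),
                       lambda n1, n2: 10)
# -> cost == 74, tree == (2, 2)
```

With real measured costs plugged in, this is the search the WHT package runs; IC admits an exact answer, runtime only a heuristic one.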