FFTW and Matlab*p. Richard Hu, 6.338 Project.

Outline: Background; Work Done and Algorithm; Results; Q&A

Background: Integrating FFTW into Matlab*p requires translating the matrix from a block-cyclic distribution to a row distribution and back. The goal is to minimize communication and delays during that translation.
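To make the two layouts concrete, here is a minimal sketch of how an element's owner could be computed under each one. It assumes a ScaLAPACK-style two-dimensional block-cyclic layout and a row distribution made of contiguous, nearly equal slabs; the function names and parameters are hypothetical and are not taken from Matlab*p or FFTW.

```python
def block_cyclic_owner(i, j, block_rows, block_cols, proc_rows, proc_cols):
    """Grid coordinates (pr, pc) of the process owning element (i, j).

    Assumes blocks of block_rows x block_cols dealt out cyclically over a
    proc_rows x proc_cols process grid.
    """
    return (i // block_rows) % proc_rows, (j // block_cols) % proc_cols


def row_owner(i, n_rows, n_procs):
    """Rank owning row i when rows are split into contiguous slabs.

    Assumes the first (n_rows % n_procs) ranks hold one extra row each.
    """
    base, extra = divmod(n_rows, n_procs)
    boundary = extra * (base + 1)  # first row held by a "short" slab
    if i < boundary:
        return i // (base + 1)
    return extra + (i - boundary) // base
```

An element must move whenever the two functions name different processes, which is why the translation step dominates the communication cost.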

Work Done: The program translates between virtually any block-cyclic distribution and any row distribution. Each processor needs to know the positions of the blocks and rows held by the other processors.

Algorithm:
1. Clump blocks in the same row together to achieve a "temporary" row distribution.
2. Shift rows between processors to achieve the "desired" row distribution.
3. Calculate the Fourier transform.
4. Shift rows between processors back to the "temporary" row distribution.
5. Divide the rows into blocks.
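The five steps can be traced with a single-process sketch that follows only the data layout. It is an illustration under assumptions, not the project's code: process ownership and message passing are left out, the desired row distribution is assumed to be contiguous slabs, and the `transform` argument is a hypothetical stand-in for the FFTW call (identity by default).

```python
def slab_bounds(n_rows, n_procs):
    """Contiguous row ranges [lo, hi) per process, as even as possible."""
    base, extra = divmod(n_rows, n_procs)
    bounds, lo = [], 0
    for p in range(n_procs):
        hi = lo + base + (1 if p < extra else 0)
        bounds.append((lo, hi))
        lo = hi
    return bounds


def redistribute_round_trip(matrix, mb, nb, n_procs, transform=lambda slabs: slabs):
    """Blocks -> strips -> slabs -> transform -> strips -> blocks."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    n_bi = -(-n_rows // mb)  # number of block rows (ceiling division)
    n_bj = -(-n_cols // nb)  # number of block columns
    # Starting point: blocks[(bi, bj)] is one tile of at most mb x nb entries.
    blocks = {(bi, bj): [row[bj * nb:(bj + 1) * nb]
                         for row in matrix[bi * mb:(bi + 1) * mb]]
              for bi in range(n_bi) for bj in range(n_bj)}
    # Step 1: clump the blocks of each block row into full-width strips.
    strips = []
    for bi in range(n_bi):
        height = len(blocks[(bi, 0)])
        strips.append([sum((blocks[(bi, bj)][r] for bj in range(n_bj)), [])
                       for r in range(height)])
    # Step 2: shift rows into the desired contiguous slabs.
    rows = [row for strip in strips for row in strip]
    slabs = [rows[lo:hi] for lo, hi in slab_bounds(n_rows, n_procs)]
    # Step 3: apply the transform (placeholder for the FFTW call).
    slabs = transform(slabs)
    # Step 4: shift rows back into block-row strips.
    rows = [row for slab in slabs for row in slab]
    strips = [rows[bi * mb:(bi + 1) * mb] for bi in range(n_bi)]
    # Step 5: divide each strip into blocks again.
    return {(bi, bj): [row[bj * nb:(bj + 1) * nb] for row in strips[bi]]
            for bi in range(n_bi) for bj in range(n_bj)}
```

With the default identity transform the output blocks equal the input blocks, which gives a quick check that the two layout conversions are inverses of each other.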

Algorithm (continued): Each processor gathers the positions of the blocks and rows held by the other processors and groups them into sorted arrays. It then iterates through every position in the matrix and determines the sending and receiving processor for that position.
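A rough sketch of that position-by-position pass is shown below. It assumes row-major rank numbering on the process grid and represents the target row distribution as a sorted array of each rank's first row, so the receiver can be found by binary search; `build_transfer_plan` and its parameters are hypothetical names, not the project's interface.

```python
from bisect import bisect_right
from collections import defaultdict


def build_transfer_plan(n_rows, n_cols, mb, nb, p_rows, p_cols, row_starts):
    """Map (sender, receiver) -> list of matrix positions to move.

    row_starts is a sorted list; rank r is assumed to hold rows
    row_starts[r] up to (but not including) row_starts[r + 1].
    """
    plan = defaultdict(list)
    for i in range(n_rows):
        # Receiver: last rank whose first row is <= i (binary search).
        receiver = bisect_right(row_starts, i) - 1
        for j in range(n_cols):
            # Sender: block-cyclic owner, flattened to a row-major rank.
            sender = ((i // mb) % p_rows) * p_cols + (j // nb) % p_cols
            if sender != receiver:
                plan[(sender, receiver)].append((i, j))
    return dict(plan)
```

Positions whose sender and receiver coincide are skipped, since they need no communication.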

Results

Variants: (1) One processor acts as the head node, directing all sends and receives. (2) Each node keeps track of the rows and columns it should send and receive, and all communication is done at the end.
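For the second variant, the bookkeeping each node might keep can be sketched as two tables, one for outgoing and one for incoming positions, which are filled first and acted on only once the whole matrix has been scanned. The record format and the name `per_node_schedule` are assumptions made for illustration.

```python
from collections import defaultdict


def per_node_schedule(transfers):
    """Group (sender, receiver, position) records into per-node tables.

    Returns (send, recv) where send[s][r] lists what node s sends to r
    and recv[r][s] lists what node r expects from s.
    """
    send = defaultdict(lambda: defaultdict(list))
    recv = defaultdict(lambda: defaultdict(list))
    for sender, receiver, pos in transfers:
        send[sender][receiver].append(pos)
        recv[receiver][sender].append(pos)
    return send, recv
```

Batching this way means each pair of nodes could exchange a single message at the end rather than one message per position.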

Questions