CS 584 Lecture 3. How is the assignment going?

Designing Parallel Algorithms. Designing a parallel algorithm is not easy: there is no recipe or magical ingredient, except creativity. Still, we can benefit from a methodical approach, a framework for algorithm design. Most problems have several parallel solutions, and the best one may be totally different from the best sequential algorithm.

PCAM Algorithm Design. There are four stages to designing a parallel algorithm: Partitioning, Communication, Agglomeration, and Mapping. P & C focus on concurrency and scalability; A & M focus on locality and performance.

PCAM Algorithm Design. Partitioning: computation and data are decomposed. Communication: coordinate task execution. Agglomeration: combine tasks for performance. Mapping: assign tasks to processors.

Partitioning. Ignore the number of processors and the target architecture; the goal is to expose opportunities for parallelism. Divide up both the computation and the data. There are two approaches: domain decomposition and functional decomposition.

Domain Decomposition. Start algorithm design by analyzing the data. Divide the data into small pieces, approximately equal in size, then partition the computation by associating it with the data. Communication issues may arise when one task needs data from another task.
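As a minimal sketch of the idea (serial, with made-up sizes, not from the slides): divide an n-element array into p roughly equal blocks and associate the computation with each block. A 3-point stencil at a block boundary needs a value owned by the neighboring task, which is exactly where communication arises.

```c
/* Hypothetical sketch: block ownership for an n-element array
 * divided among p tasks, and the neighbor data a 3-point stencil
 * at each block boundary would require. */
#include <stdio.h>

int main(void)
{
    long n = 16;
    int  p = 4;
    for (int id = 0; id < p; id++) {
        long lo = id * n / p;           /* first index task id owns */
        long hi = (id + 1) * n / p - 1; /* last index task id owns  */
        printf("task %d owns a[%ld..%ld]", id, lo, hi);
        if (id > 0)     printf(", needs a[%ld] from task %d", lo - 1, id - 1);
        if (id < p - 1) printf(", needs a[%ld] from task %d", hi + 1, id + 1);
        printf("\n");
    }
    return 0;
}
```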

Domain Decomposition. Example: evaluate the definite integral \(\int_0^1 \frac{4}{1+x^2}\,dx\).
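Evaluating it analytically shows why this integral is the classic test case: the exact answer is known, so a parallel result can be checked against it.

$$\int_0^1 \frac{4}{1+x^2}\,dx \;=\; \Big[\,4\arctan x\,\Big]_0^1 \;=\; 4\cdot\frac{\pi}{4} \;=\; \pi$$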

Split up the domain. [Figure: the area under the curve on [0, 1], divided into subdomains, one per task.]

Split up the domain. With four tasks, the subintervals are [0, 0.25], [0.25, 0.5], [0.5, 0.75], and [0.75, 1]. Each task simply evaluates the integral over its own range; all that is left is to sum up each task's answer for the total.
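A minimal sketch of the whole computation in C with MPI (the register of the course's Quinn text); the per-task midpoint count and the names are illustrative assumptions, not from the slides. Each task integrates its own subinterval by the midpoint rule, and a reduction sums the partial answers:

```c
/* pi_integral.c -- each task integrates 4/(1+x^2) over its own
 * subinterval of [0,1] by the midpoint rule; a reduction sums
 * the partial results on task 0.
 * Compile: mpicc pi_integral.c -o pi    Run: mpirun -np 4 ./pi */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int id, p;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &id);
    MPI_Comm_size(MPI_COMM_WORLD, &p);

    /* Task id owns the subdomain [a, b] of [0, 1]. */
    double a = (double) id      / p;
    double b = (double)(id + 1) / p;

    long   m = 1000000;               /* midpoints per task (illustrative) */
    double h = (b - a) / m;
    double local = 0.0;
    for (long i = 0; i < m; i++) {
        double x = a + (i + 0.5) * h; /* midpoint of the i-th strip */
        local += 4.0 / (1.0 + x * x) * h;
    }

    /* Sum every task's answer for the total. */
    double pi;
    MPI_Reduce(&local, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (id == 0) printf("pi ~= %.12f\n", pi);

    MPI_Finalize();
    return 0;
}
```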

Domain Decomposition. Consider dividing up a 3-D grid: what issues arise? What if your problem has more than one data structure? Different problem phases? Replication?
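One way to carve a 3-D grid into blocks is the id*n/p bounds formula, in the style of Quinn's block-decomposition macros; the grid sizes and the 2 x 2 x 2 task arrangement below are made up for illustration:

```c
/* Sketch: block decomposition of a 3-D grid across a
 * px x py x pz arrangement of tasks, each owning a sub-box. */
#include <stdio.h>

/* First and last index owned by task id out of p, for n items. */
long block_low (int id, int p, long n) { return (long)id * n / p; }
long block_high(int id, int p, long n) { return (long)(id + 1) * n / p - 1; }

int main(void)
{
    long nx = 100, ny = 80, nz = 60;  /* grid dimensions (illustrative) */
    int  px = 2,  py = 2,  pz = 2;    /* task arrangement (illustrative) */
    for (int i = 0; i < px; i++)
        for (int j = 0; j < py; j++)
            for (int k = 0; k < pz; k++)
                printf("task (%d,%d,%d): x %ld..%ld  y %ld..%ld  z %ld..%ld\n",
                       i, j, k,
                       block_low(i, px, nx), block_high(i, px, nx),
                       block_low(j, py, ny), block_high(j, py, ny),
                       block_low(k, pz, nz), block_high(k, pz, nz));
    return 0;
}
```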

Functional Decomposition. Focus on the computation: divide the computation into disjoint tasks, avoiding data dependencies among tasks. After dividing the computation, examine the data requirements of each task.
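A toy sketch of the idea (my example, not from the slides): here the computation, not the data, is split into two disjoint tasks. Both read the same array, so examining their data requirements shows no communication is needed until the results are combined.

```c
/* Functional decomposition sketch: one thread finds the minimum
 * of an array while another finds the maximum.
 * Compile: cc -pthread minmax.c -o minmax */
#include <pthread.h>
#include <stdio.h>

#define N 8
static double data[N] = {3, 1, 4, 1, 5, 9, 2, 6};
static double lo, hi;   /* each task writes a distinct result */

static void *find_min(void *arg) {
    (void)arg;
    lo = data[0];
    for (int i = 1; i < N; i++) if (data[i] < lo) lo = data[i];
    return NULL;
}
static void *find_max(void *arg) {
    (void)arg;
    hi = data[0];
    for (int i = 1; i < N; i++) if (data[i] > hi) hi = data[i];
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    /* Disjoint tasks: different functions over the same read-only data. */
    pthread_create(&t1, NULL, find_min, NULL);
    pthread_create(&t2, NULL, find_max, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("min = %g, max = %g\n", lo, hi);
    return 0;
}
```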

Functional Decomposition. Functional decomposition is not as natural as domain decomposition; consider search problems. It is often very useful at a higher level, though: climate modeling, for example, combines ocean simulation, hydrology, atmosphere models, and so on, each of which can become a component task.

Partitioning Checklist. Have you defined a LOT of tasks? Have you avoided redundant computation and storage? Are the tasks approximately equal in size? Does the number of tasks scale with the problem size? Have you identified several alternative partitioning schemes?