CS 584 Lecture 5 Assignment. Due NOW!!

Review
- Partition
- Communication
- Avoid centralized patterns
- Avoid sequential patterns
- Divide and conquer

Agglomeration
- The Partition and Communication steps were abstract; Agglomeration moves to the concrete.
- Combine tasks so that they execute efficiently on some particular parallel computer.
- Consider replication.

Agglomeration Goals
- Reduce communication costs by increasing the computation per task, i.e., by adjusting (usually coarsening) the granularity.
- Retain flexibility for mapping and scaling.
- Reduce software engineering costs.

Changing Granularity
- A large number of tasks does not necessarily produce an efficient algorithm; we must consider the communication costs.
- Reduce communication by having fewer tasks send fewer, larger messages (batching), as in the sketch below.
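As a minimal illustration of batching (assuming an MPI setting as in Quinn's text; the function and variable names here are my own, not from the slides), sending k values in one message pays the per-message startup latency once instead of k times:

```c
/* Hypothetical batching sketch: same data, one message instead of k.
 * Assumes MPI has been initialized and `dest` is a valid rank. */
#include <mpi.h>

void send_unbatched(double *values, int k, int dest) {
    /* k small messages: the per-message startup cost is paid k times */
    for (int i = 0; i < k; i++)
        MPI_Send(&values[i], 1, MPI_DOUBLE, dest, 0, MPI_COMM_WORLD);
}

void send_batched(double *values, int k, int dest) {
    /* one batched message: the startup cost is paid once */
    MPI_Send(values, k, MPI_DOUBLE, dest, 0, MPI_COMM_WORLD);
}
```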

Surface-to-Volume Effects
- Communication is proportional to the surface of the subdomain.
- Computation is proportional to the volume of the subdomain.
- Increasing computation per task will often decrease the relative cost of communication.
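To make the effect concrete, here is a small back-of-the-envelope sketch (my own illustration, with made-up sizes) comparing 1-D strip and 2-D block agglomerations of an n x n grid over p tasks: both give each task the same volume of computation, but the 2-D blocks have a smaller surface to communicate.

```c
/* Surface-to-volume illustration (hypothetical numbers, not from the slides).
 * Divide an n x n grid among p tasks and compare boundary points ("surface")
 * to interior work ("volume") for 1-D strip vs. 2-D block agglomeration. */
#include <math.h>
#include <stdio.h>

int main(void) {
    const double n = 1024.0, p = 16.0;

    double volume        = n * n / p;          /* grid points (computation) per task      */
    double strip_surface = 2.0 * n;            /* 1-D strips: two edges of length n       */
    double block_surface = 4.0 * n / sqrt(p);  /* 2-D blocks: four edges of length n/sqrt(p) */

    printf("computation per task    : %.0f points\n", volume);
    printf("1-D strip communication : %.0f boundary points\n", strip_surface);
    printf("2-D block communication : %.0f boundary points\n", block_surface);
    /* Same volume, smaller surface: the blockier agglomeration communicates
     * less relative to the computation it performs. */
    return 0;
}
```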

How many messages total? How much data is sent? (The question is posed for each of two example decompositions; the accompanying figures are not reproduced here.)

Replicating Computation
- Trade off replicated computation for reduced communication.
- Replication will often reduce execution time as well.

Summation of N Integers
- s = sum, b = broadcast
- How many steps?
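One natural reading of the s/b picture (a sketch, not taken from the slides): sum the partial values to a single task and then broadcast the result back out. With tree-based collectives each phase takes on the order of log2 p steps, about 2 log2 p in total.

```c
/* Minimal sum-then-broadcast sketch. Roughly log2(p) steps for the
 * reduction plus log2(p) for the broadcast. Variable names are my own. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int local = rank + 1;   /* each task's partial value (stand-in data) */
    int total = 0;

    MPI_Reduce(&local, &total, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD); /* s = sum       */
    MPI_Bcast(&total, 1, MPI_INT, 0, MPI_COMM_WORLD);                   /* b = broadcast */

    printf("rank %d sees total %d\n", rank, total);
    MPI_Finalize();
    return 0;
}
```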

Using Replication (Butterfly)

Using Replication: Butterfly to Hypercube
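In the butterfly (hypercube) version, every task keeps a replicated copy of the running sum, so all tasks finish with the result after only log2 p exchange steps, with no separate broadcast phase. A hedged sketch, assuming the number of tasks is a power of two:

```c
/* Sketch of the butterfly / hypercube all-reduce: in each step, task `rank`
 * exchanges its running sum with partner rank ^ mask and adds what it
 * receives. After log2(p) steps every task holds the full sum.
 * Assumes p is a power of two; illustrative only. */
#include <mpi.h>

int butterfly_sum(int local, MPI_Comm comm) {
    int rank, p;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &p);

    int sum = local;
    for (int mask = 1; mask < p; mask <<= 1) {
        int partner = rank ^ mask;   /* neighbor along this hypercube dimension */
        int received;
        MPI_Sendrecv(&sum, 1, MPI_INT, partner, 0,
                     &received, 1, MPI_INT, partner, 0,
                     comm, MPI_STATUS_IGNORE);
        sum += received;             /* replicated computation on both sides */
    }
    return sum;                      /* every task now has the total */
}
```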

Avoid Communication
- Look for tasks that cannot execute concurrently because of communication requirements.
- Replication can help accomplish two such tasks at the same time, for example summation and broadcast.

Preserve Flexibility
- Create more tasks than processors.
- Overlap communication and computation (see the sketch below).
- Don't incorporate unnecessary limits on the number of tasks.
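A common way to achieve the overlap (a sketch under my own naming; `compute_interior` and `compute_boundary` stand in for whatever work does not and does depend on the incoming data) is to post nonblocking sends and receives, compute on the data that is already local, and only then wait for the exchange to finish:

```c
/* Hypothetical overlap sketch: exchange a halo value with a neighbor while
 * computing on data that does not need it. All names are placeholders. */
#include <mpi.h>

void step(double *halo_out, double *halo_in, int neighbor, MPI_Comm comm,
          void (*compute_interior)(void), void (*compute_boundary)(void)) {
    MPI_Request reqs[2];

    MPI_Irecv(halo_in,  1, MPI_DOUBLE, neighbor, 0, comm, &reqs[0]);
    MPI_Isend(halo_out, 1, MPI_DOUBLE, neighbor, 0, comm, &reqs[1]);

    compute_interior();                        /* work that doesn't need halo_in   */

    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE); /* communication overlapped with it */
    compute_boundary();                        /* work that needed the received halo */
}
```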

Agglomeration Checklist
- Has communication cost been reduced by increasing locality?
- Do the benefits of replication outweigh its costs?
- Does replication compromise scalability?
- Does the number of tasks still scale with problem size?
- Is there still sufficient concurrency?

Assignment
- Exercises 2.1, 2.2, 2.4, and 2.12.
- When asked to design an algorithm, go through the PCAM design stages (Partitioning, Communication, Agglomeration, Mapping).