Thinking in Parallel – Implementing In Code New Mexico Supercomputing Challenge in partnership with Intel Corp. and NM EPSCoR.


Copyrights and Acknowledgments Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. New Mexico EPSCoR Program is funded in part by the National Science Foundation award # and the State of New Mexico. Any opinions, findings, conclusions, or recommendations expressed in the material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. For questions about the Supercomputing Challenge, a 501(c)3 organization, contact us at: challenge.nm.org

Agenda Introduction to threads Review of key concepts and issues in parallelism Implementation in computer programs (with examples in Java)

What is a Thread? A thread is a sequence of instructions that can be executed by one processor. Multiple threads: Can run simultaneously on multiple processors. Can take turns running on a single processor.
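A minimal sketch of creating threads in Java (class name and messages are illustrative, not from the slides): a thread can be built from a Runnable, or by subclassing Thread.

// Sketch: two common ways to create and start threads in Java.
public class ThreadDemo {
    public static void main(String[] args) throws InterruptedException {
        // Option 1: pass a Runnable (here, a lambda) to the Thread constructor.
        Thread worker = new Thread(
            () -> System.out.println("Hello from " + Thread.currentThread().getName()));
        worker.start();   // begins running concurrently with main
        worker.join();    // wait for it to finish

        // Option 2: subclass Thread and override run().
        Thread worker2 = new Thread() {
            @Override
            public void run() {
                System.out.println("Hello from " + getName());
            }
        };
        worker2.start();
        worker2.join();
    }
}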

Methodology Domain decomposition – Used when the same operation is performed on a large number of similar data items. Task decomposition – Used when some different operations can be performed at the same time. Pipelining – Used when there are sequential operations on a large amount of data.
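As an illustration of the first method, a domain-decomposition sketch in Java can give each thread its own slice of an array and apply the same operation to every element; the array size, thread count, and class name below are illustrative assumptions.

// Sketch: domain decomposition - each thread squares its own slice of the array.
public class DomainDecompositionDemo {
    public static void main(String[] args) throws InterruptedException {
        double[] data = new double[1_000_000];
        java.util.Arrays.fill(data, 2.0);

        int numThreads = 4;
        Thread[] threads = new Thread[numThreads];
        int chunk = data.length / numThreads;

        for (int t = 0; t < numThreads; t++) {
            final int start = t * chunk;
            // the last thread also takes any leftover elements
            final int end = (t == numThreads - 1) ? data.length : start + chunk;
            threads[t] = new Thread(() -> {
                for (int i = start; i < end; i++) {
                    data[i] = data[i] * data[i];   // same operation on many similar data items
                }
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            thread.join();   // wait for all slices to be processed
        }
        System.out.println("data[0] = " + data[0]);   // prints 4.0
    }
}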

Implementation Challenges Communication Inputs and outputs must be communicated between multiple threads. Synchronization Access to shared resources must be synchronized to respect dependencies and avoid collisions.

Most Commonly Used Communication Options Shared memory Multiple threads read and write to a common memory area. Message passing Multiple threads (possibly distributed over a network) send data to and from a master thread, or to peers.
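As a sketch of message passing between threads inside a single program, a java.util.concurrent.BlockingQueue can serve as the channel (threads distributed over a network would instead use sockets or a message-passing library such as MPI; the class name and queue size below are illustrative).

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch: a worker thread sends its result to the master thread as a message.
public class MessagePassingDemo {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Long> channel = new ArrayBlockingQueue<>(16);

        Thread worker = new Thread(() -> {
            long partialSum = 0;
            for (int i = 1; i <= 100; i++) {
                partialSum += i;
            }
            try {
                channel.put(partialSum);   // send the result as a message
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();

        long result = channel.take();   // the master blocks until a message arrives
        System.out.println("Received partial sum: " + result);   // 5050
        worker.join();
    }
}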

Shared Memory Shared memory is most often used for multiple processors in a single computer. Shared memory is very efficient; however, additional care must be taken to avoid synchronization issues.

Synchronization Issues Dependencies – Work on one task is required before another task can begin. Contention – Multiple threads require simultaneous or overlapping access to shared resources. Ex: Race conditions.

Race Condition – … occurs when the outcome of a program depends on the unpredictable order or timing in which multiple threads access shared data, so different runs can produce different results. Recall the problems in adding numbers in Domain Decomposition Activity 1. Main solution approaches: Critical section Reduction

Race Condition – Example long sum = 0; void update(byte value) { sum += value; } If multiple threads execute the update method at the same time, some updates to sum can be lost, because sum += value is a read-modify-write sequence rather than a single atomic operation.
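A runnable sketch of the race (thread count, iteration count, and class name are illustrative assumptions): several threads call update at the same time, and the printed total is usually smaller than expected because some increments are lost.

// Sketch: demonstrates lost updates when update() is not synchronized.
public class RaceDemo {
    static long sum = 0;

    static void update(byte value) {
        sum += value;   // read-modify-write: not atomic
    }

    public static void main(String[] args) throws InterruptedException {
        Thread[] threads = new Thread[4];
        for (int t = 0; t < threads.length; t++) {
            threads[t] = new Thread(() -> {
                for (int i = 0; i < 100_000; i++) {
                    update((byte) 1);
                }
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            thread.join();
        }
        // Expected 400000, but the printed value is usually smaller.
        System.out.println("sum = " + sum);
    }
}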

Critical Section – … can be used to resolve a race condition by preventing simultaneous access to a shared resource. In Domain Decomposition Activity 2, a critical section was created by using a single pen.

Critical Section – Example long sum = 0; synchronized void update(byte value) { sum += value; } The use of the synchronized keyword prevents multiple threads from executing the update method simultaneously.
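A runnable sketch of the same experiment with the critical section in place (the slide shows update as an instance method; this sketch makes it static for brevity, so synchronized locks the Class object rather than an individual instance).

// Sketch: the same experiment as RaceDemo, but with update() synchronized.
public class CriticalSectionDemo {
    static long sum = 0;

    // Only one thread at a time may execute this method.
    static synchronized void update(byte value) {
        sum += value;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread[] threads = new Thread[4];
        for (int t = 0; t < threads.length; t++) {
            threads[t] = new Thread(() -> {
                for (int i = 0; i < 100_000; i++) {
                    update((byte) 1);
                }
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            thread.join();
        }
        System.out.println("sum = " + sum);   // now reliably prints 400000
    }
}

Serializing every call to update is correct but can be slow under heavy contention; the reduction on the following slides reduces how often the lock must be taken.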

Reduction – … can be used to resolve a race condition by subdividing the problem domain into independent sections, each producing its own partial result. In Domain Decomposition Activity 3, reduction was implemented by computing subtotals.

Reduction – Example long sum = 0; long totalCount = 0; synchronized void update(int count, long value) { sum += value; totalCount += count; } Each thread computes its own subtotal independently and then makes a single call to update to merge it, so the synchronized method is entered only once per thread.
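A runnable sketch of the reduction (array contents, thread count, and class name are illustrative assumptions): each thread sums its own slice in a private variable and enters the synchronized update method only once, so lock contention drops from once per element to once per thread.

// Sketch: reduction - each thread computes a private subtotal, then merges it once.
public class ReductionDemo {
    static long sum = 0;
    static long totalCount = 0;

    static synchronized void update(int count, long value) {
        sum += value;
        totalCount += count;
    }

    public static void main(String[] args) throws InterruptedException {
        byte[] data = new byte[400_000];
        java.util.Arrays.fill(data, (byte) 1);

        int numThreads = 4;
        Thread[] threads = new Thread[numThreads];
        int chunk = data.length / numThreads;

        for (int t = 0; t < numThreads; t++) {
            final int start = t * chunk;
            final int end = (t == numThreads - 1) ? data.length : start + chunk;
            threads[t] = new Thread(() -> {
                long subtotal = 0;   // private to this thread: no race
                for (int i = start; i < end; i++) {
                    subtotal += data[i];
                }
                update(end - start, subtotal);   // one synchronized call per thread
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            thread.join();
        }
        System.out.println("sum = " + sum + ", totalCount = " + totalCount);
    }
}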

Dependence Graph and Task Dependencies By creating a flowchart showing the order of tasks in a program, we can identify the dependencies between tasks. In Task Decomposition Activity 2, you identified task dependencies. A food chain is an example of dependencies from another field of science.

Dependence Graph and Task Dependencies [Figure: dependence graph with tasks f, g, h, q, r, s] Edges (arrows) depict data dependencies between tasks. Is there a logical way to assign the tasks to separate CPUs?

Dependence Graph and Task Groups If we group tasks to minimize dependencies between groups, we can assign these groups to separate threads. In Task Decomposition Activity 4, you used grouping to identify sets of tasks that could be performed in parallel. A food web is an example of a more complex set of dependencies in another field of science.
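As a sketch of turning such a grouping into code, java.util.concurrent.CompletableFuture can express "run this task only after the tasks it depends on have finished". The dependencies below (f and g independent, h after f, q after both g and h) are hypothetical and chosen for illustration, not read from the figure.

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch: a hypothetical task graph run on a small thread pool.
public class TaskGraphDemo {
    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);

        CompletableFuture<Integer> f = CompletableFuture.supplyAsync(() -> 1, pool);
        CompletableFuture<Integer> g = CompletableFuture.supplyAsync(() -> 2, pool);
        CompletableFuture<Integer> h = f.thenApplyAsync(x -> x + 10, pool);        // runs after f
        CompletableFuture<Integer> q = g.thenCombineAsync(h, Integer::sum, pool);  // runs after g and h

        System.out.println("q = " + q.join());   // prints 13
        pool.shutdown();
    }
}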

Dependence Graph and Task Groups