Thinking in Parallel – Implementing In Code New Mexico Supercomputing Challenge in partnership with Intel Corp. and NM EPSCoR
Copyrights and Acknowledgments Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. New Mexico EPSCoR Program is funded in part by the National Science Foundation award # and the State of New Mexico. Any opinions, findings, conclusions, or recommendations expressed in the material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. For questions about the Supercomputing Challenge, a 501(c)3 organization, contact us at: challenge.nm.org
Agenda Introduction to threads Review of key concepts and issues in parallelism Implementation in computer programs (with examples in Java)
What is a Thread? A thread is a sequence of instructions executed in order by one processor at a time. Multiple threads: Can run simultaneously on multiple processors. Can take turns running on a single processor.
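As a minimal sketch of the idea above, the following Java program starts several threads and waits for all of them to finish. The class name ThreadBasics and the method runThreads are hypothetical names chosen for illustration; Thread, start, and join come from the standard library. The order of the printed lines may differ from run to run, because the threads may run simultaneously or take turns.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ThreadBasics {
    // Hypothetical helper: starts threadCount threads, waits for all of
    // them to finish, and returns how many of them ran.
    static int runThreads(int threadCount) {
        // AtomicInteger is a thread-safe counter from the standard library.
        AtomicInteger finished = new AtomicInteger(0);
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            final int id = i;
            threads[i] = new Thread(() -> {
                System.out.println("Hello from thread " + id);
                finished.incrementAndGet();
            });
            threads[i].start(); // the new thread begins running its instructions
        }
        for (Thread thread : threads) {
            try {
                thread.join(); // wait until this thread has finished
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RuntimeException(e);
            }
        }
        return finished.get();
    }

    public static void main(String[] args) {
        System.out.println(runThreads(4) + " threads finished");
    }
}
```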
Methodology Domain decomposition – Used when the same operation is performed on a large number of similar data items. Task decomposition – Used when some different operations can be performed at the same time. Pipelining – Used when there are sequential operations on a large amount of data.
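Domain decomposition can be sketched as follows, assuming the "same operation" is squaring each number in an array. The class DomainDecomposition and the method squareAll are hypothetical names for illustration. Each thread is given its own slice of the array, so no two threads ever write to the same element.

```java
class DomainDecomposition {
    // Hypothetical helper: squares every element of data in place,
    // splitting the array into one contiguous slice per thread.
    // Assumes threadCount is at least 1.
    static void squareAll(long[] data, int threadCount) {
        Thread[] threads = new Thread[threadCount];
        int chunk = (data.length + threadCount - 1) / threadCount; // ceiling division
        for (int t = 0; t < threadCount; t++) {
            final int start = Math.min(t * chunk, data.length);
            final int end = Math.min(start + chunk, data.length);
            threads[t] = new Thread(() -> {
                // Same operation, applied only to this thread's slice.
                for (int i = start; i < end; i++) {
                    data[i] = data[i] * data[i];
                }
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join(); // wait for every slice to be processed
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RuntimeException(e);
            }
        }
    }

    public static void main(String[] args) {
        long[] data = {1, 2, 3, 4, 5, 6, 7};
        squareAll(data, 3);
        System.out.println(java.util.Arrays.toString(data));
    }
}
```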
Implementation Challenges Communication Inputs and outputs must be communicated between multiple threads. Synchronization Access to shared resources must be synchronized to respect dependencies and avoid collisions.
Most Commonly Used Communication Options Shared memory Multiple threads read and write to a common memory area. Message passing Multiple threads (possibly distributed over a network) exchange data with a master thread or with peers.
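Message passing within a single Java program can be sketched with a queue that acts as the master thread's mailbox. This is only an illustration under stated assumptions: the class MessagePassing, the method collectFromWorkers, and the choice of message (each worker sends the square of its id) are hypothetical. Message passing over a network would need a different mechanism that is not shown here.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class MessagePassing {
    // Hypothetical helper: each worker sends one message (the square of
    // its id) to the master, which receives and adds up all messages.
    static long collectFromWorkers(int workerCount) {
        BlockingQueue<Long> mailbox = new LinkedBlockingQueue<>();
        for (int w = 1; w <= workerCount; w++) {
            final long id = w;
            // The worker communicates only by putting a message in the mailbox.
            new Thread(() -> mailbox.add(id * id)).start();
        }
        long total = 0;
        try {
            for (int received = 0; received < workerCount; received++) {
                total += mailbox.take(); // blocks until a message arrives
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("total = " + collectFromWorkers(4));
    }
}
```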
Shared Memory Shared memory is most often used for multiple processors in a single computer. Shared memory is very efficient; however, additional care must be taken to avoid synchronization issues.
Synchronization Issues Dependencies – Work on one task is required before another task can begin. Contention – Multiple threads require simultaneous or overlapping access to shared resources. Ex: Race conditions.
Race Condition – … occurs when the order in which multiple threads complete their tasks isn’t predictable, and this can result in different outcomes. Recall the problems in adding numbers in Domain Decomposition Activity 1. Main solution approaches: Critical section Reduction
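Race Condition – Example long sum = 0; void update(byte value) { sum += value; } If multiple threads execute the update method at the same time, updates can be lost: sum += value reads sum, adds value, and writes the result back, so one thread can overwrite another thread's update with a result computed from a stale value.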
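The race can be observed with a small driver program. In this sketch, the update method is the one from the slide; the class name RaceDemo and the run method are hypothetical additions. With 4 threads making 100,000 calls each, the expected total is 400,000, but the printed total is often smaller and can change from run to run.

```java
class RaceDemo {
    long sum = 0;

    // Unsynchronized, as in the slide example.
    void update(byte value) {
        sum += value; // read, add, write back: not a single atomic step
    }

    // Hypothetical driver: threadCount threads each call update(1)
    // callsPerThread times on one shared RaceDemo object.
    static long run(int threadCount, int callsPerThread) {
        RaceDemo shared = new RaceDemo();
        Thread[] threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            threads[t] = new Thread(() -> {
                for (int i = 0; i < callsPerThread; i++) {
                    shared.update((byte) 1);
                }
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RuntimeException(e);
            }
        }
        return shared.sum;
    }

    public static void main(String[] args) {
        long expected = 4L * 100_000;
        System.out.println("expected " + expected + ", got " + run(4, 100_000));
    }
}
```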
Critical Section – … can be used to resolve a race condition by preventing simultaneous access to a shared resource. In Domain Decomposition Activity 2, a critical section was created by using a single pen.
Critical Section – Example long sum = 0; synchronized void update(byte value) { sum += value; } The synchronized keyword prevents multiple threads from executing the update method on the same object simultaneously.
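A small driver program can check that the synchronized version does not lose updates. The update method is the one from the slide; the class name CriticalSectionDemo and the run method are hypothetical additions for illustration.

```java
class CriticalSectionDemo {
    long sum = 0;

    // Only one thread at a time may run this method on a given object.
    synchronized void update(byte value) {
        sum += value;
    }

    // Hypothetical driver: threadCount threads each call update(1)
    // callsPerThread times on one shared object.
    static long run(int threadCount, int callsPerThread) {
        CriticalSectionDemo shared = new CriticalSectionDemo();
        Thread[] threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            threads[t] = new Thread(() -> {
                for (int i = 0; i < callsPerThread; i++) {
                    shared.update((byte) 1);
                }
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join(); // after join, this thread's updates are visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RuntimeException(e);
            }
        }
        return shared.sum;
    }

    public static void main(String[] args) {
        System.out.println("total = " + run(4, 100_000));
    }
}
```

The price of correctness here is that every call to update must acquire the lock, so the threads spend much of their time waiting for one another.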
Reduction – … can be used to resolve a race condition, by subdividing the problem domain into independent sections. In Domain Decomposition Activity 3, reduction was implemented by computing subtotals.
Reduction – Example long sum = 0; long totalCount = 0; synchronized void update(int count, long value) { sum += value; totalCount += count; } Each thread now computes its own subtotal and count independently, then calls update once to add them to the shared totals, so the threads compete for the critical section only once each.
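A fuller sketch of reduction follows, assuming the data to be summed is an array of bytes. The update method and the two fields come from the slide; the class name ReductionDemo and the method sumInParallel are hypothetical additions. Each thread adds up its own slice in a local variable and touches the shared totals only once.

```java
class ReductionDemo {
    long sum = 0;
    long totalCount = 0;

    // Called once per thread with that thread's subtotal.
    synchronized void update(int count, long value) {
        sum += value;
        totalCount += count;
    }

    // Hypothetical helper: sums data using threadCount threads.
    // Assumes threadCount is at least 1.
    static ReductionDemo sumInParallel(byte[] data, int threadCount) {
        ReductionDemo totals = new ReductionDemo();
        Thread[] threads = new Thread[threadCount];
        int chunk = (data.length + threadCount - 1) / threadCount; // ceiling division
        for (int t = 0; t < threadCount; t++) {
            final int start = Math.min(t * chunk, data.length);
            final int end = Math.min(start + chunk, data.length);
            threads[t] = new Thread(() -> {
                long subtotal = 0; // private to this thread, so no race
                for (int i = start; i < end; i++) {
                    subtotal += data[i];
                }
                totals.update(end - start, subtotal); // one synchronized call
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RuntimeException(e);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        byte[] data = new byte[1000];
        java.util.Arrays.fill(data, (byte) 2);
        ReductionDemo r = sumInParallel(data, 4);
        System.out.println("sum = " + r.sum + ", count = " + r.totalCount);
    }
}
```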
Dependence Graph and Task Dependencies By creating a flowchart showing the order of tasks in a program, we can identify the dependencies between tasks. In Task Decomposition Activity 2, you identified task dependencies. A food chain is an example of dependencies from another field of science.
Dependence Graph and Task Dependencies [Figure: dependence graph with task nodes f, s, r, q, h, g.] Edges (arrows) depict data dependencies between tasks. Is there a logical way to assign the tasks to separate CPUs?
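Data dependencies between tasks can be expressed directly in Java with CompletableFuture. The sketch below does not reproduce the graph from the slide; it assumes a small hypothetical graph of four tasks (a, b, c, d) in which b and c each need the result of a, and d needs the results of both b and c. The class name and the arithmetic in each task are placeholders.

```java
import java.util.concurrent.CompletableFuture;

class DependenceGraphDemo {
    // Hypothetical graph: a -> b, a -> c, (b, c) -> d.
    static int run() {
        // Task a has no dependencies, so it can start right away.
        CompletableFuture<Integer> a = CompletableFuture.supplyAsync(() -> 2);

        // Tasks b and c both depend on a, but not on each other,
        // so they may run in parallel once a is done.
        CompletableFuture<Integer> b = a.thenApplyAsync(x -> x + 1);
        CompletableFuture<Integer> c = a.thenApplyAsync(x -> x * 10);

        // Task d depends on both b and c.
        CompletableFuture<Integer> d = b.thenCombine(c, (x, y) -> x + y);

        return d.join(); // wait for the final task and return its result
    }

    public static void main(String[] args) {
        System.out.println("result = " + run());
    }
}
```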
Dependence Graph and Task Groups If we group tasks to minimize dependencies between groups, we can assign these groups to separate threads. In Task Decomposition Activity 4, you used grouping to identify sets of tasks that could be performed in parallel. A food web is an example of a more complex set of dependencies in another field of science.
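Assigning task groups to separate threads can be sketched as follows. The two groups, their steps, and the class name TaskGroupsDemo are hypothetical and are not taken from the activity. Within each group the second step depends on the first, so the steps stay together on one thread; the groups do not depend on each other, so they can run in parallel.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class TaskGroupsDemo {
    static int run() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // Group 1: two dependent steps, run in order on one thread.
            Future<Integer> group1 = pool.submit(() -> {
                int first = 5;       // hypothetical step 1a
                return first * 2;    // hypothetical step 1b, needs 1a
            });
            // Group 2: independent of group 1, so it can run at the same time.
            Future<Integer> group2 = pool.submit(() -> {
                int first = 7;       // hypothetical step 2a
                return first + 1;    // hypothetical step 2b, needs 2a
            });
            // The only dependency between groups: the final sum needs both.
            return group1.get() + group2.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } catch (ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown(); // let the worker threads exit
        }
    }

    public static void main(String[] args) {
        System.out.println("result = " + run());
    }
}
```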
Dependence Graph and Task Groups