Lecture 20: Parallelism & Concurrency
CS 62 Spring 2013, Kim Bruce & Kevin Coogan
Some slides based on those from Dan Grossman, U. of Washington

Parallelism & Concurrency
Single-processor computers are going away. We want to use separate processors in parallel to speed up computing. We also have programs on a single processor running in multiple threads, and we want to control those threads so the program stays responsive to the user: concurrency. Threads often need concurrent access to shared data structures (e.g., the event queue), so we must ensure they don't interfere with each other.

History
Writing correct and efficient multithreaded code is more difficult than writing single-threaded (sequential) code. From roughly 1980 to 2005, desktop computers got exponentially faster at running sequential programs: about twice as fast every 18 months to 2 years.

More History
Nobody knows how to continue this trend. Increasing the clock rate generates too much heat, and the relative cost of memory access is too high. We can keep making wires exponentially smaller (Moore's "Law"), so manufacturers put multiple processors on the same chip ("multicore"). Now the number of cores doubles every 2 years!

What can you do with multiple cores?
Run multiple totally different programs at the same time. (Don't we already do that? Yes, but with time-slicing.) Or do multiple things at once in one program. The latter is our focus, and it is more difficult: it requires rethinking everything, from asymptotic complexity to how to implement data-structure operations.

Parallelism vs. Concurrency
Parallelism: use more resources to get an answer faster. Concurrency: correctly and efficiently allow simultaneous access to shared resources. The connection: many programmers use threads for both, and if parallel computations need access to shared resources, then something needs to manage the concurrency.

Analogy
The CS1 idea: writing a program is like writing a recipe for one cook, who does one thing at a time. Parallelism: hire helpers and hand out potatoes and knives. But not too many chefs, or you spend all your time coordinating (or someone gets hurt!). Concurrency: lots of cooks making different things, but only 4 stove burners. We want to allow simultaneous access to all 4 burners without causing spills or incorrect burner settings.

Models Change
Our model: shared memory with explicit threads. The old story, a program on a single processor, has: one call stack (with each stack frame holding local variables), one program counter (the current statement executing), static fields, and objects (created by new) in the heap (which has nothing to do with the heap data structure).

Multiple Threads/Processors
The new story: a set of threads, each with its own call stack and program counter. A thread has no access to another thread's local variables, but threads can (implicitly) share static fields and objects. To communicate, a thread writes somewhere that another thread reads, as in the sketch below.
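A minimal sketch of this kind of communication (not from the slides; the class name and field are invented for illustration). One thread writes a shared static field; the main thread waits for it, then reads the result:

// Hypothetical sketch: one thread communicates with another by writing
// a shared static field that the other later reads.
public class SharedFieldDemo {
    static int result = 0;   // static field: shared by all threads

    public static void main(String[] args) throws InterruptedException {
        Thread writer = new Thread() {
            public void run() {
                int local = 6 * 7;   // local variables are per-thread
                result = local;      // communicate via the shared field
            }
        };
        writer.start();
        writer.join();               // wait for the writer (join is explained below)
        System.out.println(result);  // prints 42
    }
}

Without the join (or some other synchronization), the main thread could read the field before the writer ever runs; this is exactly the kind of interference concurrency control must manage.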

Shared Memory
[Diagram: several threads, each with its own unshared call stack and current statement (pc, for "program counter"); local variables are numbers/null or heap references; a single heap, shared by all threads, holds all objects and static fields.]

Other Models
Message-passing: each thread has its own collection of objects, and communication is via explicit messages; the language has primitives for sending and receiving them. (Cooks working in separate kitchens, with telephones.) Dataflow: programmers write programs in terms of a DAG, and a node executes after all of its predecessors in the graph. (Cooks wait to be handed the results of previous steps.) Data parallelism: primitives for things like "apply a function to every element of an array in parallel."

Parallel Programming in Java
Creating a thread:
1. Define a class C extending Thread, overriding the public void run() method.
2. Create an object of class C.
3. Call that thread's start method, which creates a new thread and starts executing the run method. A direct call of run won't work, as it would just be a normal method call.
Alternatively, define a class implementing Runnable, create a thread with an instance of it as a parameter, and send that thread the start message. Both styles are sketched below.
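A minimal sketch of both styles (the names Greeter and GreeterTask are invented for illustration):

// Style 1: extend Thread and override run()
class Greeter extends Thread {
    public void run() { System.out.println("hi from the Thread subclass"); }
}

// Style 2: implement Runnable and pass an instance to Thread's constructor
class GreeterTask implements Runnable {
    public void run() { System.out.println("hi from the Runnable"); }
}

public class ThreadDemo {
    public static void main(String[] args) {
        new Greeter().start();                 // start(), not run()!
        new Thread(new GreeterTask()).start();
    }
}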

Parallelism Idea
Example: sum the elements of an array. Use 4 threads, each of which sums 1/4 of the array. Steps: create 4 thread objects, assigning each its portion of the work; call start() on each thread object to actually run it; wait for the threads to finish; and add together their 4 answers (ans0 + ans1 + ans2 + ans3) for the final result.

First Attempt

class SumThread extends Thread {
  int lo, hi;     // fields to know what to do
  int[] arr;
  int ans = 0;    // for communicating result
  SumThread(int[] a, int l, int h) { … }
  public void run() { … }
}

int sum(int[] arr) {
  int len = arr.length;
  int ans = 0;
  SumThread[] ts = new SumThread[4];
  for (int i = 0; i < 4; i++) {   // do parallel computations
    ts[i] = new SumThread(arr, i*len/4, (i+1)*len/4);
    ts[i].start();                // use start, not run
  }
  for (int i = 0; i < 4; i++)     // combine results
    ans += ts[i].ans;
  return ans;
}

What's wrong?

Correct Version

class SumThread extends Thread {
  int lo, hi;     // fields to know what to do
  int[] arr;
  int ans = 0;    // for communicating result
  SumThread(int[] a, int l, int h) { … }
  public void run() { … }
}

int sum(int[] arr) {
  int len = arr.length;
  int ans = 0;
  SumThread[] ts = new SumThread[4];
  for (int i = 0; i < 4; i++) {   // do parallel computations
    ts[i] = new SumThread(arr, i*len/4, (i+1)*len/4);
    ts[i].start();                // start, not run
  }
  for (int i = 0; i < 4; i++) {   // combine results
    ts[i].join();                 // wait for helper to finish!
    ans += ts[i].ans;
  }
  return ans;
}

See program ParallelSum
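The slides elide the constructor and run bodies; a possible reconstruction of the full ParallelSum program (the actual course code may differ) is:

// Hypothetical reconstruction of ParallelSum; the course's code may differ.
class SumThread extends Thread {
    int lo, hi;      // this thread sums the portion arr[lo..hi-1]
    int[] arr;
    int ans = 0;     // for communicating the result back

    SumThread(int[] a, int l, int h) { arr = a; lo = l; hi = h; }

    public void run() {
        for (int i = lo; i < hi; i++)
            ans += arr[i];
    }
}

public class ParallelSum {
    static int sum(int[] arr) {
        int len = arr.length;
        int ans = 0;
        SumThread[] ts = new SumThread[4];
        for (int i = 0; i < 4; i++) {
            ts[i] = new SumThread(arr, i * len / 4, (i + 1) * len / 4);
            ts[i].start();
        }
        for (int i = 0; i < 4; i++) {
            try {
                ts[i].join();   // wait for helper; may throw InterruptedException
            } catch (InterruptedException e) {
                // ignored in this sketch; see the next slide
            }
            ans += ts[i].ans;
        }
        return ans;
    }

    public static void main(String[] args) {
        int[] a = new int[1000];
        for (int i = 0; i < a.length; i++) a[i] = i + 1;  // 1, 2, ..., 1000
        System.out.println(sum(a));                       // prints 500500
    }
}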

Thread Class Methods
void start(), which calls void run(). void join(): blocks until the receiver thread is done. This style is called fork/join parallelism. Code needs a try-catch around join, as it can throw the exception InterruptedException (shown in the sketch above). There is some memory sharing: the lo, hi, arr, and ans fields. Later we'll learn how to protect shared data using synchronized.

Actually Not So Great
If you time it, it's slower than sequential!! We want code to be reusable and efficient as the core count grows. At a minimum, make the number of threads a parameter, as in the sketch below. We also want to effectively use the processors available now, i.e., those not being used by other programs, and that can change while your threads are running.
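One small step in that direction, as a sketch: replace the hard-coded 4 with a parameter, and let the caller ask the JVM how many processors it sees. Runtime.getRuntime().availableProcessors() is the standard query, though, as the next slide shows, "available" can change while the threads run.

// Sketch: the same sum as before, but with the thread count as a parameter.
static int sum(int[] arr, int numThreads) {
    int len = arr.length;
    int ans = 0;
    SumThread[] ts = new SumThread[numThreads];
    for (int i = 0; i < numThreads; i++) {
        ts[i] = new SumThread(arr, i * len / numThreads,
                              (i + 1) * len / numThreads);
        ts[i].start();
    }
    for (int i = 0; i < numThreads; i++) {
        try { ts[i].join(); } catch (InterruptedException e) { }
        ans += ts[i].ans;
    }
    return ans;
}

// A caller might choose, for example:
//   int answer = sum(arr, Runtime.getRuntime().availableProcessors());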

Problem
Suppose there are 4 processors on the computer, and suppose we have a problem of size n that 3 processors could solve in time t, each handling n/3 elements. Suppose the work is linear in the size of the problem. We try to use 4 threads, but one processor is busy playing music. The first 3 threads are scheduled and take time (n/4)/(n/3) * t = (3/4)t, while the 4th waits. After the first 3 finish, the 4th runs and takes another (3/4)t. Total time: 1.5t, which is 50% slower than with 3 threads!!!

Other Possible Problems
On some problems, different threads may take significantly different amounts of time to complete. Imagine applying a function f to all members of an array, where f applied to some elements takes a long time. If we're unlucky, all the slow elements may get assigned to the same thread. Then we certainly won't see an n-fold speedup with n threads, and it may be much worse. This is the load imbalance problem!