Comp 422: Parallel Programming Shared Memory Multithreading: Pthreads Synchronization

Reminder: Process Management pthread_create(): creates a new thread that executes a given function with a given argument; the thread identifier is returned through its first parameter. pthread_exit(): terminates the calling thread. pthread_join(): waits for the thread with a particular thread identifier to terminate.

General Program Structure Encapsulate parallel parts in functions. Use function arguments to parametrize what a particular thread does. Call pthread_create() with the function and arguments, and save the thread identifier it returns. Call pthread_join() with that thread identifier.
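
A minimal sketch of this structure, with illustrative names (work, tid) that are not from the slides:

    #include <pthread.h>
    #include <stdio.h>

    void* work(void *arg) {              /* encapsulates the parallel part */
        int id = (int)(long)arg;         /* the argument parametrizes what this thread does */
        printf("thread %d running\n", id);
        pthread_exit(NULL);              /* or simply: return NULL; */
    }

    int main(void) {
        pthread_t tid[4];
        for (long i = 0; i < 4; i++)
            pthread_create(&tid[i], NULL, work, (void*)i);   /* save the identifiers */
        for (int i = 0; i < 4; i++)
            pthread_join(tid[i], NULL);                      /* wait for each thread to finish */
        return 0;
    }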

Pthreads Synchronization Create/exit/join provide some form of synchronization, but only at a very coarse level, and they require thread creation/destruction. Finer-grain synchronization is needed: mutex locks, condition variables.

Mutex Locks (1 of 4) pthread_mutex_init( pthread_mutex_t *mutex, const pthread_mutexattr_t *attr); Initializes a new mutex lock. Attributes: pass NULL for the defaults (ignore for now).

Mutex Locks (2 of 4) pthread_mutex_destroy( pthread_mutex_t *mutex); Destroys the mutex specified by mutex.

Mutex Locks (3 of 4) pthread_mutex_lock( pthread_mutex_t *mutex); Acquires the lock specified by mutex. If the mutex is already locked, the calling thread blocks until the mutex is unlocked.

Mutex Locks (4 of 4) pthread_mutex_unlock( pthread_mutex_t *mutex); If the calling thread currently holds mutex, this unlocks it. If other threads are blocked waiting on this mutex, one of them unblocks and acquires the mutex; which one is determined by the scheduler.

Use of Mutex Locks To implement critical sections (as needed, e.g., in en_queue and de_queue in TSP). The basic Pthreads mutex is an exclusive lock; POSIX also defines read-write locks (pthread_rwlock_t) for shared-read, exclusive-write access.
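
A minimal sketch of a mutex protecting a critical section; the shared counter and function name are illustrative, not from the slides:

    #include <pthread.h>

    pthread_mutex_t count_mutex = PTHREAD_MUTEX_INITIALIZER;   /* static initialization */
    long shared_count = 0;                                      /* illustrative shared variable */

    void increment(void) {
        pthread_mutex_lock(&count_mutex);     /* enter critical section */
        shared_count++;                       /* only one thread at a time executes this */
        pthread_mutex_unlock(&count_mutex);   /* leave critical section */
    }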

Condition Variables (1 of 5) pthread_cond_init( pthread_cond_t *cond, const pthread_condattr_t *attr); Initializes a new condition variable cond. Attributes: pass NULL for the defaults (ignore for now).

Condition Variables (2 of 5) pthread_cond_destroy( pthread_cond_t *cond); Destroys the condition variable cond.

Condition Variables (3 of 5) pthread_cond_wait( pthread_cond_t *cond, pthread_mutex_t *mutex); Atomically unlocks mutex and blocks the calling thread, waiting on cond. When the thread is later woken up, it re-acquires mutex before pthread_cond_wait() returns.

Condition Variables (4 of 5) pthread_cond_signal( pthread_cond_t *cond); Unblocks one thread waiting on cond; which one is determined by the scheduler. If no thread is waiting, the signal is a no-op.

Condition Variables (5 of 5) pthread_cond_broadcast( pthread_cond_t *cond); Unblocks all threads waiting on cond. If no thread is waiting, the broadcast is a no-op.
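
The canonical usage pattern pairs a condition variable with a mutex and a shared predicate, and always re-checks the predicate in a while loop (this guards against spurious wakeups). A minimal sketch; the names ready, ready_mutex, and ready_cond are illustrative, not from the slides:

    #include <pthread.h>

    int ready = 0;                                     /* shared predicate */
    pthread_mutex_t ready_mutex = PTHREAD_MUTEX_INITIALIZER;
    pthread_cond_t  ready_cond  = PTHREAD_COND_INITIALIZER;

    void wait_until_ready(void) {
        pthread_mutex_lock(&ready_mutex);
        while (!ready)                                 /* re-check the predicate after every wakeup */
            pthread_cond_wait(&ready_cond, &ready_mutex);
        pthread_mutex_unlock(&ready_mutex);
    }

    void make_ready(void) {
        pthread_mutex_lock(&ready_mutex);
        ready = 1;                                     /* record the event so it is not "forgotten" */
        pthread_cond_signal(&ready_cond);
        pthread_mutex_unlock(&ready_mutex);
    }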

Use of Condition Variables To implement the signal-wait synchronization discussed in earlier examples. Important note: a signal is “forgotten” if no thread is already waiting on the condition variable when the signal occurs.

PIPE
    P1: for( i=0; i<num_pics, read(in_pic); i++ ) {
          int_pic_1[i] = trans1( in_pic );
          signal( event_1_2[i] );
        }
    P2: for( i=0; i<num_pics; i++ ) {
          wait( event_1_2[i] );
          int_pic_2[i] = trans2( int_pic_1[i] );
          signal( event_2_3[i] );
        }

PIPE Using Pthreads Replacing the original wait/signal with a Pthreads condition-variable wait/signal will not work: signals issued before the corresponding wait are forgotten, so we need a way to remember a signal.

How to Remember a Signal (1 of 2)
    semaphore_signal(i) {
      pthread_mutex_lock(&mutex_rem[i]);
      arrived[i] = 1;
      pthread_cond_signal(&cond[i]);
      pthread_mutex_unlock(&mutex_rem[i]);
    }

How to Remember a Signal (2 of 2)
    semaphore_wait(i) {
      pthread_mutex_lock(&mutex_rem[i]);
      while( arrived[i] == 0 ) {
        pthread_cond_wait(&cond[i], &mutex_rem[i]);
      }
      arrived[i] = 0;
      pthread_mutex_unlock(&mutex_rem[i]);
    }

PIPE with Pthreads
    P1: for( i=0; i<num_pics, read(in_pic); i++ ) {
          int_pic_1[i] = trans1( in_pic );
          semaphore_signal( event_1_2[i] );
        }
    P2: for( i=0; i<num_pics; i++ ) {
          semaphore_wait( event_1_2[i] );
          int_pic_2[i] = trans2( int_pic_1[i] );
          semaphore_signal( event_2_3[i] );
        }

Note Many shared-memory programming systems (other than Pthreads) provide semaphores as a basic primitive. If yours does, use it rather than constructing semaphores yourself; the built-in implementation may be more efficient than what you can do yourself.
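
In fact, POSIX itself offers counting semaphores through <semaphore.h> (a separate interface from Pthreads proper). A minimal sketch of the PIPE-style signal/wait with sem_t; the array name and size are illustrative:

    #include <semaphore.h>

    #define NUM_PICS 100                     /* illustrative size */
    sem_t event_1_2[NUM_PICS];

    void init_events(void) {                 /* call once before the threads start */
        for (int i = 0; i < NUM_PICS; i++)
            sem_init(&event_1_2[i], 0, 0);   /* pshared=0: threads of one process; initial count 0 */
    }

    void stage1_signal(int i) {
        sem_post(&event_1_2[i]);             /* the "signal" is remembered even if nobody waits yet */
    }

    void stage2_wait(int i) {
        sem_wait(&event_1_2[i]);             /* returns immediately if the post already happened */
    }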

Parallel TSP
    process i:
    while( (p = de_queue()) != NULL ) {
      for each expansion by one city {
        q = add_city(p);
        if complete(q) { update_best(q); }
        else en_queue(q);
      }
    }

Parallel TSP Need critical sections in update_best and in en_queue/de_queue. In de_queue: wait if the queue is empty; terminate if all processes are waiting. In en_queue: signal that the queue is no longer empty.

Parallel TSP: Mutual Exclusion
    en_queue() / de_queue() {
      pthread_mutex_lock(&queue);
      …;
      pthread_mutex_unlock(&queue);
    }
    update_best() {
      pthread_mutex_lock(&best);
      …;
      pthread_mutex_unlock(&best);
    }

Parallel TSP: Condition Synchronization
    de_queue() {
      while( (queue is empty) and (not done) ) {
        waiting++;
        if( waiting == p ) {
          done = true;
          pthread_cond_broadcast(&empty);
        } else {
          pthread_cond_wait(&empty, &queue);
          waiting--;
        }
      }
      if( done ) return null;
      else remove and return head of the queue;
    }
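
The slides do not show the matching en_queue side; a sketch of what it might look like, in the same pseudocode style and under the same assumptions (mutex &queue, condition &empty):

    en_queue(q) {
      pthread_mutex_lock(&queue);
      add q to the queue;
      pthread_cond_signal(&empty);     /* wake one waiter: the queue is no longer empty */
      pthread_mutex_unlock(&queue);
    }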

Pthreads SOR: main
    for some number of timesteps {
      for( i=0; i<p; i++ )
        pthread_create(&thrd[i], NULL, sor_1, (void *)i);
      for( i=0; i<p; i++ )
        pthread_join(thrd[i], NULL);
      for( i=0; i<p; i++ )
        pthread_create(&thrd[i], NULL, sor_2, (void *)i);
      for( i=0; i<p; i++ )
        pthread_join(thrd[i], NULL);
    }

Pthreads SOR: Parallel parts (1)
    void* sor_1(void *s) {
      int i, j;
      int slice = (int)s;
      int from = (slice * n) / p;
      int to = ((slice + 1) * n) / p;
      for( i=from; i<to; i++ )
        for( j=0; j<n; j++ )
          temp[i][j] = 0.25 * (grid[i-1][j] + grid[i+1][j] +
                               grid[i][j-1] + grid[i][j+1]);
    }

Pthreads SOR: Parallel parts (2)
    void* sor_2(void *s) {
      int i, j;
      int slice = (int)s;
      int from = (slice * n) / p;
      int to = ((slice + 1) * n) / p;
      for( i=from; i<to; i++ )
        for( j=0; j<n; j++ )
          grid[i][j] = temp[i][j];
    }

Reality bites... Create/exit/join is not so cheap. It would be more efficient if we could write the parallel program so that create/exit/join happens rarely (ideally once) and cheaper synchronization is used everywhere else. We need something that makes all threads wait until all have arrived: a barrier.

Barrier Synchronization A wait at a barrier causes a thread to wait until all threads have performed a wait at the barrier. At that point, they all proceed.

Implementing Barriers in Pthreads Count the number of arrivals at the barrier. Wait if this is not the last arrival. Make everyone unblock if this is the last arrival. Since the arrival count is a shared variable, enclose the whole operation in a mutex lock-unlock.

Implementing Barriers in Pthreads
    void barrier() {
      pthread_mutex_lock(&mutex_arr);
      arrived++;
      if (arrived < N) {
        pthread_cond_wait(&cond, &mutex_arr);
      } else {
        pthread_cond_broadcast(&cond);
        arrived = 0;    /* be prepared for next barrier */
      }
      pthread_mutex_unlock(&mutex_arr);
    }
    /* Note: this simple version assumes no spurious wakeups and that no thread
       re-enters the barrier before all waiters of the previous one have left;
       a production barrier would also track a phase/generation counter. */

Parallel SOR with Barriers (1 of 2)
    void* sor(void *arg) {
      int slice = (int)arg;
      int from = (slice * (n-1)) / p + 1;
      int to = ((slice+1) * (n-1)) / p + 1;
      for some number of iterations {
        …
      }
    }

Parallel SOR with Barriers (2 of 2)
    for (i=from; i<to; i++)
      for (j=1; j<n; j++)
        temp[i][j] = 0.25 * (grid[i-1][j] + grid[i+1][j] +
                             grid[i][j-1] + grid[i][j+1]);
    barrier();
    for (i=from; i<to; i++)
      for (j=1; j<n; j++)
        grid[i][j] = temp[i][j];
    barrier();

Parallel SOR with Barriers: main
    int main(int argc, char *argv[]) {
      pthread_t thrd[p];
      /* Initialize mutex and condition variables */
      for (i=0; i<p; i++)
        pthread_create(&thrd[i], &attr, sor, (void *)i);
      for (i=0; i<p; i++)
        pthread_join(thrd[i], NULL);
      /* Destroy mutex and condition variables */
    }

Note again Many shared-memory programming systems (other than Pthreads) provide barriers as a basic primitive. If yours does, use it rather than constructing barriers yourself; the built-in implementation may be more efficient than what you can do yourself.
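
POSIX itself defines a barrier interface as an option (pthread_barrier_t), widely available on Linux. A minimal sketch of how the SOR code might use it; the barrier name and wrapper functions are illustrative:

    #include <pthread.h>

    pthread_barrier_t sor_barrier;                     /* illustrative name */

    void barriers_init(int p) {                        /* call once before the worker threads start */
        pthread_barrier_init(&sor_barrier, NULL, p);   /* p = number of participating threads */
    }

    void barrier(void) {                               /* drop-in replacement for the hand-rolled barrier() */
        pthread_barrier_wait(&sor_barrier);
    }

    void barriers_destroy(void) {                      /* call after joining all workers */
        pthread_barrier_destroy(&sor_barrier);
    }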

Busy Waiting Not an explicit part of the Pthreads API; available in any general shared-memory programming environment.

Busy Waiting
    initially: flag = 0;
    P1: produce data;
        flag = 1;
    P2: while( !flag ) ;
        consume data;

Use of Busy Waiting On the surface, simple and efficient. In general, not a recommended practice. It often leads to messy and unreadable code (it blurs the distinction between data and synchronization). On some architectures it may be inefficient, or may not even work as intended (depending on the memory consistency model).
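
If busy waiting is used anyway, the flag should at least be an atomic with release/acquire ordering, so that neither the compiler nor the hardware can reorder the data accesses around it. A minimal sketch using C11 <stdatomic.h> (not part of the Pthreads API; the variable names are illustrative):

    #include <stdatomic.h>

    atomic_int flag = 0;
    int data;                                    /* illustrative shared data */

    void producer(void) {
        data = 42;                               /* produce data */
        atomic_store_explicit(&flag, 1, memory_order_release);
    }

    void consumer(void) {
        while (!atomic_load_explicit(&flag, memory_order_acquire))
            ;                                    /* spin until the flag is set */
        int x = data;                            /* the release/acquire pair makes the write visible */
        (void)x;
    }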

Private Data in Pthreads To make a variable private in Pthreads, you need to make an array out of it, indexed by a small integer thread id, typically the argument passed to pthread_create() (the handle returned by pthread_self() is opaque and not directly usable as an array index). Not very elegant or efficient.
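
A sketch of the array approach, using the integer id passed at thread creation as the index; the names are illustrative. (POSIX also offers thread-specific data via pthread_key_create()/pthread_setspecific() as an alternative.)

    #include <pthread.h>

    #define P 4                          /* illustrative number of threads */
    long private_sum[P];                 /* one "private" slot per thread */

    void* worker(void *arg) {
        int id = (int)(long)arg;         /* small integer id passed by the creator */
        private_sum[id] = 0;             /* each thread touches only its own slot */
        for (int k = 0; k < 1000; k++)
            private_sum[id] += k;
        return NULL;                     /* note: adjacent slots may share a cache line (false sharing) */
    }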

Other Primitives in Pthreads Set the attributes of a thread. Set the attributes of a mutex lock. Set scheduling parameters.