Auburn University http://www.eng.auburn.edu/~xqin COMP7330/7336 Advanced Parallel and Distributed Computing Pthread Mutex Dr. Xiao Qin Auburn University.

Slides:



Advertisements
Similar presentations
Operating Systems Semaphores II
Advertisements

Ch 7 B.
CS Lecture 4 Programming with Posix Threads and Java Threads George Mason University Fall 2009.
Ch. 7 Process Synchronization (1/2) I Background F Producer - Consumer process :  Compiler, Assembler, Loader, · · · · · · F Bounded buffer.
8-1 JMH Associates © 2004, All rights reserved Windows Application Development Chapter 10 - Supplement Introduction to Pthreads for Application Portability.
Pthread Synchronization Operating Systems Hebrew University Spring 2004.
Programming Shared Address Space Platforms Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar To accompany the text ``Introduction to Parallel.
Comp 422: Parallel Programming Shared Memory Multithreading: Pthreads Synchronization.
CSCI-6964: High Performance Parallel & Distributed Computing (HPDC) AE 216, Mon/Thurs 2-3:20 p.m. Pthreads (reading Chp 7 thru 7.9) Prof. Chris Carothers.
Jaehyuk Huh Computer Science, KAIST
Pthread (continue) General pthread program structure –Encapsulate parallel parts (can be almost the whole program) in functions. –Use function arguments.
Threads and Thread Control Thread Concepts Pthread Creation and Termination Pthread synchronization Threads and Signals.
Parallel Programming with PThreads. Threads  Sometimes called a lightweight process  smaller execution unit than a process  Consists of:  program.
ICS 145B -- L. Bic1 Project: Process/Thread Synchronization Textbook: pages ICS 145B L. Bic.
Programming Shared Address Space Platforms Carl Tropper Department of Computer Science.
Super computers By: Lecturer \ Aisha Dawood. Overview of Programming Models Programming models provide support for expressing concurrency and synchronization.
1 Pthread Programming CIS450 Winter 2003 Professor Jinhua Guo.
POSIX Synchronization Introduction to Operating Systems: Discussion Module 5.
Synchronizing Threads with Semaphores
IT 325 Operating systems Chapter6.  Threads can greatly simplify writing elegant and efficient programs.  However, there are problems when multiple.
1 CMSC421: Principles of Operating Systems Nilanjan Banerjee Principles of Operating Systems Acknowledgments: Some of the slides are adapted from Prof.
CS252: Systems Programming Ninghui Li Based on Slides by Prof. Gustavo Rodriguez-Rivera Topic 11: Thread-safe Data Structures, Semaphores.
U NIVERSITY OF M ASSACHUSETTS A MHERST Department of Computer Science Software Systems Advanced Synchronization Emery Berger and Mark Corner University.
1 Condition Variables CS 241 Prof. Brighten Godfrey March 16, 2012 University of Illinois.
Chapter 6 P-Threads. Names The naming convention for a method/function/operation is: – pthread_thing_operation(..) – Where thing is the object used (such.
Programming Shared Address Space Platforms Adapted from Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar ``Introduction to Parallel Computing'',
Super computers By: Lecturer \ Aisha Dawood. Overview of Programming Models Programming models provide support for expressing concurrency and synchronization.
Announcements. Cooperating Processes  Operating systems allow for the creation and concurrent execution of multiple processes & threads  eases program.
COMP7330/7336 Advanced Parallel and Distributed Computing Task Partitioning Dr. Xiao Qin Auburn University
COMP7330/7336 Advanced Parallel and Distributed Computing Task Partitioning Dynamic Mapping Dr. Xiao Qin Auburn University
COMP7330/7336 Advanced Parallel and Distributed Computing Pthread RW Locks - Implementation Dr. Xiao Qin Auburn University
Working with Pthreads. Operations on Threads int pthread_create (pthread_t *thread, const pthread_attr_t *attr, void * (*routine)(void*), void* arg) Creates.
Programming with Threads. Threads  Sometimes called a lightweight process  smaller execution unit than a process  Consists of:  program counter 
COMP7330/7336 Advanced Parallel and Distributed Computing OpenMP: Programming Model Dr. Xiao Qin Auburn University
pThread synchronization
COMP8330/7330/7336 Advanced Parallel and Distributed Computing Decomposition and Parallel Tasks (cont.) Dr. Xiao Qin Auburn University
COMP7330/7336 Advanced Parallel and Distributed Computing POSIX Thread API Dr. Xiao Qin Auburn University
COMP7330/7336 Advanced Parallel and Distributed Computing POSIX Thread API Dr. Xiao Qin Auburn University
Case Study: Pthread Synchronization Dr. Yingwu Zhu.
COMP7330/7336 Advanced Parallel and Distributed Computing Pthread RW Locks – An Application Dr. Xiao Qin Auburn University
COMP7330/7336 Advanced Parallel and Distributed Computing Thread Basics Dr. Xiao Qin Auburn University
Auburn University
Auburn University
Auburn University
Auburn University COMP7330/7336 Advanced Parallel and Distributed Computing Exploratory Decomposition Dr. Xiao Qin Auburn.
Auburn University COMP7330/7336 Advanced Parallel and Distributed Computing Odd-Even Sort Implementation Dr. Xiao Qin.
Auburn University COMP 3500 Introduction to Operating Systems Synchronization: Part 4 Classical Synchronization Problems.
Programming Shared-memory Machines
Auburn University COMP7330/7336 Advanced Parallel and Distributed Computing Thread Basics Dr. Xiao Qin Auburn University.
PThreads.
Principles of Operating Systems Lecture 11
Auburn University COMP7330/7336 Advanced Parallel and Distributed Computing Principles of Message-Passing Programming.
Auburn University COMP7330/7336 Advanced Parallel and Distributed Computing Mapping Techniques Dr. Xiao Qin Auburn University.
Auburn University COMP7330/7336 Advanced Parallel and Distributed Computing Improving Barrier Performance Dr. Xiao Qin.
Auburn University COMP7330/7336 Advanced Parallel and Distributed Computing A bug in the rwlock program Dr. Xiao Qin.
Auburn University COMP7330/7336 Advanced Parallel and Distributed Computing Parallel Odd-Even Sort Algorithm Dr. Xiao.
Auburn University COMP7330/7336 Advanced Parallel and Distributed Computing Data Partition Dr. Xiao Qin Auburn University.
Auburn University COMP 3500 Introduction to Operating Systems Project 3 – Synchronization Condition Variables Dr. Xiao.
Operating Systems CMPSC 473
Auburn University COMP7330/7336 Advanced Parallel and Distributed Computing Principles of Message-Passing Programming.
Dr. Xiao Qin Auburn University
Thread synchronization
Programming Shared Address Space Platforms
Lecture 14: Pthreads Mutex and Condition Variables
Programming Shared Address Space Platforms
PTHREADS AND SEMAPHORES
Lecture 14: Pthreads Mutex and Condition Variables
Programming with Shared Memory - 2 Issues with sharing data
“The Little Book on Semaphores” Allen B. Downey
POSIX Threads(pthreads)
Presentation transcript:

Auburn University http://www.eng.auburn.edu/~xqin COMP7330/7336 Advanced Parallel and Distributed Computing Pthread Mutex Dr. Xiao Qin Auburn University http://www.eng.auburn.edu/~xqin xqin@auburn.edu Slides are adopted from Drs. Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar

Producer-Consumer Using Locks The producer-consumer scenario imposes the following constraints: The producer thread must not overwrite the shared buffer when the previous task has not been picked up by a consumer thread. The consumer threads must not pick up tasks until there is something present in the shared data structure. Individual consumer threads should pick up tasks one at a time.

Ex1: How does the producer behave? pthread_mutex_t task_queue_lock; int task_available; ... main() { .... task_available = 0; pthread_mutex_init(&task_queue_lock, NULL); } void *producer(void *producer_thread_data) { while (!done()) { inserted = 0; create_task(&my_task); while (inserted == 0) { if (task_available == 0) { insert_into_queue(my_task); task_available = 1; inserted = 1;

How does the producer behave? pthread_mutex_t task_queue_lock; int task_available; ... main() { .... task_available = 0; pthread_mutex_init(&task_queue_lock, NULL); } void *producer(void *producer_thread_data) { while (!done()) { inserted = 0; create_task(&my_task); while (inserted == 0) { pthread_mutex_lock(&task_queue_lock); if (task_available == 0) { insert_into_queue(my_task); task_available = 1; inserted = 1; pthread_mutex_unlock(&task_queue_lock);

Ex1: How does the consumer behave? Why we need a mutex? void *consumer(void *consumer_thread_data) { int extracted; struct task my_task; /* local data structure declarations */ while (!done()) { extracted = 0; while (extracted == 0) { if (task_available == 1) { extract_from_queue(&my_task); task_available = 0; extracted = 1; } process_task(my_task);

How does the consumer behave? void *consumer(void *consumer_thread_data) { int extracted; struct task my_task; /* local data structure declarations */ while (!done()) { extracted = 0; while (extracted == 0) { pthread_mutex_lock(&task_queue_lock); if (task_available == 1) { extract_from_queue(&my_task); task_available = 0; extracted = 1; } pthread_mutex_unlock(&task_queue_lock); process_task(my_task);

Normal Mutex vs. Recursive Mutex Normal mutex deadlocks if a thread that already has a lock tries a second lock on it. A recursive mutex allows a single thread to lock a mutex as many times as it wants. It simply increments a count on the number of locks. A lock is relinquished by a thread when the count becomes zero. Three types: normal, recursive, and error-check.

Error-Check Mutexes An error check mutex reports an error when a thread with a lock tries to lock it again (as opposed to deadlocking in the first case, or granting the lock, as in the second case). The type of the mutex can be set in the attributes object before it is passed at time of initialization.

Overheads of Locking Locks represent serialization points. What happens if we encapsulate large segments of the program within locks? Reduce idling overhead: int pthread_mutex_trylock ( pthread_mutex_t *mutex_lock); Q1: Why pthread_mutex_trylock is typically much faster than pthread_mutex_lock on typical systems? It does not have to deal with queues associated with locks for multiple threads waiting on the lock. Locks represent serialization points. since critical sections must be executed by threads one after the other. Encapsulating large segments of the program within locks can lead to significant performance degradation. It is often possible to reduce the idling overhead associated with locks using an alternate function, pthread_mutex_trylock. int pthread_mutex_trylock ( pthread_mutex_t *mutex_lock); pthread_mutex_trylock is typically much faster than pthread_mutex_lock on typical systems since it does not have to deal with queues associated with locks for multiple threads waiting on the lock.

Ex2: Alleviating Locking Overhead: An Example Which variable is a shared data, which is local data? /* Finding k matches in a list */ void *find_entries(void *start_pointer) { /* This is the thread function */ struct database_record *next_record; int count; current_pointer = start_pointer; do { next_record = find_next_entry(current_pointer); count = output_record(next_record); } while (count < requested_number_of_records); } int output_record(struct database_record *record_ptr) { pthread_mutex_lock(&output_count_lock); output_count ++; count = output_count; pthread_mutex_unlock(&output_count_lock); if (count <= requested_number_of_records) print_record(record_ptr); return (count);

Ex3: Alleviating Locking Overhead: An Example /* rewritten output_record function */ int output_record(struct database_record *record_ptr) { int count; int lock_status; lock_status=pthread_mutex_trylock(&output_count_lock); if ((1)___________________ EBUSY) { insert_into_local_list(record_ptr); return(0); } else { count = output_count; output_count += number_on_local_list + 1; (2)_________________________________________; print_records(record_ptr, local_list, requested_number_of_records - count); return(count + number_on_local_list + 1); Compare output_record () with its earlier version!

Ex3: Alleviating Locking Overhead: An Example /* rewritten output_record function */ int output_record(struct database_record *record_ptr) { int count; int lock_status; lock_status=pthread_mutex_trylock(&output_count_lock); if (lock_status == EBUSY) { insert_into_local_list(record_ptr); return(0); } else { count = output_count; output_count += number_on_local_list + 1; pthread_mutex_unlock(&output_count_lock); print_records(record_ptr, local_list, requested_number_of_records - count); return(count + number_on_local_list + 1); Compare output_record () with its earlier version!

Condition Variables for Synchronization Allows a thread to block itself until specified data reaches a predefined state. A condition variable is associated with this predicate. When the predicate becomes true, the condition variable is used to signal one or more threads waiting on the condition. A single condition variable may be associated with more than one predicate. Has a mutex associated with it. (Why?) A thread locks this mutex and tests the predicate defined on the shared variable. If the predicate is not true, the thread waits on the condition variable associated with the predicate using the function pthread_cond_wait. A condition variable allows a thread to block itself until specified data reaches a predefined state. A condition variable is associated with this predicate. When the predicate becomes true, the condition variable is used to signal one or more threads waiting on the condition. A single condition variable may be associated with more than one predicate. A condition variable always has a mutex associated with it. A thread locks this mutex and tests the predicate defined on the shared variable. If the predicate is not true, the thread waits on the condition variable associated with the predicate using the function pthread_cond_wait.

Summary Producer-Consumer Using pthread locks What are the three mutex types: normal, recursive, and error-check. How to reduce locking overhead? pthread_mutex_trylock() A condition variable allows a thread to block itself until specified data reaches a predefined state. A condition variable is associated with this predicate. When the predicate becomes true, the condition variable is used to signal one or more threads waiting on the condition. A single condition variable may be associated with more than one predicate. A condition variable always has a mutex associated with it. A thread locks this mutex and tests the predicate defined on the shared variable. If the predicate is not true, the thread waits on the condition variable associated with the predicate using the function pthread_cond_wait.