By: Gal Nave and Dan Slov. Supervisor: Dmitri Perelman. Technion, Electrical Engineering Department, NSSL Laboratory.


Overview: Introduction; deadlock detection algorithm; algorithm implementation in Linux; summary.

Introduction: Parallel vs. serial computation. Parallel computing is an evolution of serial computing. It solves larger problems and provides concurrency: faster, bigger, better, and more principled.

Introduction (cont.): Parallel computing pitfalls: synchronization, ordering, and deadlocks. We will concentrate on deadlocks from now on.

Problem definition: A deadlock is a condition in which two or more processes (or threads) are each waiting for another to release a resource, thus forming a circular chain.

Project definition: Design and implement a deadlock detection mechanism in Linux. Support successful debugging by supplying a backtrace and a dependency description. Evaluate the performance overhead imposed by the solution.

The Algorithm: The deadlock detection algorithm by M. Herlihy and E. Koskinen, which they called DREADLOCKS.

Deadlock Detection Algorithm: The algorithm exploits the fact that no useful work is done during a busy wait. A thread that tries to lock a mutex checks whether it is already locked using the atomic test-and-set (TSL) operation. If the mutex is already locked, the thread waits for it to be unlocked (in other words, the thread spins on the lock), testing the mutex again with TSL from time to time. Test-and-set: if the lock has already been set by another thread, wait; repeat until TSL returns zero, which means the lock was free and the calling thread has now set it.
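
The plain spin lock described above can be sketched with C11 atomics. This is only an illustration of test-and-set spinning, not the glibc mutex code; the type `spin_mutex_t` and the `spin_*` function names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical spin lock type used only for this sketch. */
typedef struct {
    atomic_flag locked;
} spin_mutex_t;

/* Put the lock into the unlocked state. */
static void spin_init(spin_mutex_t *m)
{
    assert(m != NULL);
    atomic_flag_clear(&m->locked);
}

/* One test-and-set. atomic_flag_test_and_set returns the previous value,
   so "false" (zero) means the lock was free and now belongs to the caller. */
static bool spin_try_lock(spin_mutex_t *m)
{
    return !atomic_flag_test_and_set_explicit(&m->locked, memory_order_acquire);
}

/* Busy wait: repeat the test-and-set until it returns zero. */
static void spin_lock(spin_mutex_t *m)
{
    while (!spin_try_lock(m)) {
        /* nothing useful happens here; this is the time the
           detection algorithm wants to put to work */
    }
}

static void spin_unlock(spin_mutex_t *m)
{
    atomic_flag_clear_explicit(&m->locked, memory_order_release);
}
```

The empty loop body in `spin_lock` is exactly the idle time that the following slides propose to spend on deadlock detection.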

Deadlock Detection Algorithm (cont.): The idea: why not use the time between two consecutive test-and-sets to look for a deadlock? The basic assumption of the algorithm is that the time a thread spends spinning can be used for some useful work: deadlock detection.

Deadlock Detection Algorithm (cont.): The lock algorithm now looks like this:
1. Try to lock the mutex using test-and-set.
2. If the lock has already been set by another thread, try to detect a deadlock.
3. If there is no deadlock, try to lock the mutex again using test-and-set. If there is a deadlock, alert the user.
4. Repeat until TSL returns zero (the lock was free and is now set by this thread) or a deadlock is detected.

Deadlock Detection, how it gets done: Each thread has a list of the processes/threads it is waiting for in order to acquire some resource (mutex). Let's call this list the digest. A thread trying to acquire a mutex that is already locked checks the owner's digest for its own TID. Digest of A: {}. Digest of B: {}.

Deadlock Detection, how it gets done: If the TID is found in the mutex owner's digest, the thread is waiting for the mutex owner to release the lock while the owner is waiting for the thread itself to release another lock: a classic deadlock. Digest of A: {B}. Digest of B: {A}.

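Deadlock Detection, how it gets done: If the TID is not found, the thread sets its digest to the union of its own digest and that of the mutex owner. It keeps spinning until the lock is acquired or a deadlock is detected. In the example, A is waiting for a mutex owned by B, and B is not waiting for anything. Digest of A: {B}. Digest of B: {}.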
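
The digest check and union can be sketched with a bit set. This is a simplified illustration under stated assumptions, not the project's data structure (which the later slides describe as a linked list): thread IDs are assumed to be small integers in 0..63, the owner itself is added to the waiter's digest as the A/B example suggests, and the names `waiter_t`, `check_owner`, and `on_acquired` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_THREADS 64u   /* assumption: TIDs are 0..63 so a digest fits in one word */

/* Hypothetical per-thread record: bit i of digest is set when this thread
   waits, directly or transitively, for thread i. */
typedef struct {
    unsigned tid;
    uint64_t digest;
} waiter_t;

typedef enum { KEEP_SPINNING, DEADLOCK_FOUND } spin_result_t;

static bool digest_contains(uint64_t digest, unsigned tid)
{
    return ((digest >> tid) & 1u) != 0;
}

/* Called by a spinning thread between two test-and-sets. */
static spin_result_t check_owner(waiter_t *self, const waiter_t *owner)
{
    assert(self->tid < MAX_THREADS && owner->tid < MAX_THREADS);
    /* The owner already waits for us: the chain is circular. */
    if (digest_contains(owner->digest, self->tid))
        return DEADLOCK_FOUND;
    /* Otherwise we now wait for the owner and for everything it waits for. */
    self->digest |= (UINT64_C(1) << owner->tid) | owner->digest;
    return KEEP_SPINNING;
}

/* Once the lock is acquired the thread waits for nobody. */
static void on_acquired(waiter_t *self)
{
    self->digest = 0;
}
```

With threads A = 0 and B = 1, A spinning on a mutex owned by B yields the digest {B} for A; if B then spins on a mutex owned by A, it finds its own TID in A's digest and reports the deadlock.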

Algorithm Implementation for Linux: The implementation is based on glibc (the GNU C Library), version 2.6, and NPTL (the Native POSIX Thread Library). It applies to any Linux distribution that supports glibc 2.6.

Implementation Details: The three most important structures in the implementation are the thread struct, the mutex struct, and the digest struct. The details of each of these structs are on the following slides.

Thread struct modification: Each thread is described by a structure defined in descr.h. The structure includes fields such as the thread ID, attributes, and so on. We added a field to this structure to hold the digest entry (the list of TIDs the thread is waiting for). Initially the digest is empty. If a thread tries to acquire a mutex and sees that it is already locked, it scans the digest of the thread that owns the mutex for its own ID.

Digest struct: The dependency list is implemented as a linked list. The digest of a thread that is waiting for a mutex has a field pointing at the mutex owner's digest. (Figure: the digest of a thread that is waiting for a mutex points to the digest of the mutex-owning thread.) The mutex-owning thread's digest may in turn point to other digests.

Digest structure detailed: The digest structure is defined in pthread_digest.h:
typedef struct _digest_t {
    unsigned __tid;
    int __time_stamp;
    int __cnt;
    int __ref_cnt;
    int __is_alive;
    pthread_digest_p digest;
} pthread_digest_t;
An explanation of the purpose of each field follows.

Digest structure detailed:
__tid specifies the thread ID of the thread owning the digest.
__time_stamp specifies the time of the last update of the digest.
__cnt counts the number of mutexes the thread holds.
__ref_cnt specifies the number of threads pointing at the digest.
digest points at the next digest (NULL if the thread is not waiting for a mutex).
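
As a rough illustration of how such a chain of digests can be searched, the sketch below repeats the structure from the slides and adds a scan. Several things are assumptions: the slides do not show the definition of `pthread_digest_p` (it is taken here to be a pointer to the struct), the signature of `scan_digest` and its `max_hops` bound are hypothetical, and the double-underscore names are copied for fidelity even though such identifiers are reserved outside the C library itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: pthread_digest_p is a pointer to the digest struct. */
typedef struct _digest_t *pthread_digest_p;

typedef struct _digest_t {
    unsigned __tid;          /* thread owning this digest */
    int __time_stamp;        /* time of the last update */
    int __cnt;               /* number of mutexes the thread holds */
    int __ref_cnt;           /* number of threads pointing at this digest */
    int __is_alive;          /* cleared when the owning thread exits */
    pthread_digest_p digest; /* next digest, NULL if not waiting */
} pthread_digest_t;

/* Hypothetical scan: starting at the mutex owner's digest, follow the
   "waiting for" links and report whether thread `tid` appears.
   max_hops bounds the walk so a cycle that does not contain `tid`
   cannot make the scan run forever. */
static bool scan_digest(const pthread_digest_t *owner, unsigned tid, int max_hops)
{
    assert(max_hops >= 0);
    const pthread_digest_t *d = owner;
    for (int hops = 0; d != NULL && hops < max_hops; hops++) {
        if (d->__tid == tid)
            return true;     /* the chain leads back to the caller */
        d = d->digest;
    }
    return false;
}
```

For example, if thread 2 waits for thread 3 and thread 3 waits for thread 1, then thread 1 scanning from the digest of thread 2 finds its own ID after two hops.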

Implementation Details (cont.): Since the digest is an ADT, some methods should be added to allow stronger decoupling. The most important actions on a digest are: append another thread's digest; release the dependency list upon acquiring a mutex; update the mutex owner; compare the time stamps of two digests; print the dependency list and stack in case of a deadlock.

The usage of digest ADT methods in the algorithm:
1. Init_thread_digest()
2. Try to lock the mutex using test-and-set.
3. If the lock has already been set by another thread: append_owner_digest(), then compare_time_stamp(); if the time stamp is outdated, scan_digest(); otherwise do nothing.
4. If there is no deadlock, try to lock the mutex again using test-and-set. If there is a deadlock, print_dependency_list().
5. Repeat steps 2-4 until the lock is set or a deadlock is detected.
6. If the mutex is acquired: release_dependency_list(), then update_mutex_owner().
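
The steps above can be sketched as a single lock function. This is a hedged illustration of the control flow, not the modified glibc code: the method names come from the slides, but their signatures, the types `digest_t` and `dd_mutex_t`, the `dd_` prefix, the `max_spins` budget, and the EBUSY/EDEADLK return values are all assumptions made so the sketch can be followed and exercised in isolation.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct digest {
    unsigned tid;
    _Atomic int time_stamp;        /* bumped whenever next changes */
    struct digest *_Atomic next;   /* digest of the thread we wait for */
} digest_t;

typedef struct {
    atomic_flag locked;
    digest_t *_Atomic owner;       /* digest of the current owner, or NULL */
} dd_mutex_t;

static void append_owner_digest(digest_t *self, digest_t *owner)
{
    if (self->next != owner) { self->next = owner; self->time_stamp++; }
}

static bool scan_digest(digest_t *from, unsigned tid)
{
    int hops = 0;                  /* bound the walk in case of a foreign cycle */
    for (digest_t *d = from; d != NULL && hops < 1024; d = d->next, hops++)
        if (d->tid == tid) return true;
    return false;
}

static void release_dependency_list(digest_t *self)
{
    if (self->next != NULL) { self->next = NULL; self->time_stamp++; }
}

/* Returns 0 when the lock is taken, EDEADLK when a cycle is found, and
   EBUSY when the (hypothetical) spin budget runs out. */
static int dd_mutex_lock(dd_mutex_t *m, digest_t *self, long max_spins)
{
    int seen_stamp = -1;           /* forces a scan on the first failed attempt */
    for (long i = 0; i < max_spins; i++) {
        if (!atomic_flag_test_and_set(&m->locked)) {   /* step 2 */
            release_dependency_list(self);             /* step 6 */
            m->owner = self;                           /* update_mutex_owner() */
            return 0;
        }
        digest_t *owner = m->owner;
        if (owner == NULL) continue;                   /* owner not published yet */
        append_owner_digest(self, owner);              /* step 3 */
        if (owner->time_stamp != seen_stamp) {         /* compare_time_stamp() */
            seen_stamp = owner->time_stamp;
            if (scan_digest(owner, self->tid))
                return EDEADLK;                        /* step 4: caller reports */
        }
    }
    return EBUSY;
}

static void dd_mutex_unlock(dd_mutex_t *m)
{
    m->owner = NULL;
    atomic_flag_clear(&m->locked);
}
```

In a single-threaded walkthrough, let A take m1 and B take m2. A's attempt on m2 ends with A's digest pointing at B, and B's attempt on m1 then finds B's own ID behind A and returns EDEADLK. The spin budget exists only to make such a walkthrough terminate; the slides describe spinning until the lock is set or a deadlock is found.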

Mutex structure: How do we get a pointer to the digest of the thread that owns a mutex? The mutex struct needs to be modified. Each mutex is described by a structure defined in pthreadtypes.h. A field is added to the mutex_t structure to hold the owner's digest entry. When a thread acquires a mutex, it updates the digest owner field in the mutex structure to contain a pointer to its own digest.

What if deadlock is detected? The following is done: print the dependency list to stderr; print a backtrace to stderr; return the deadlock_found error code.
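
A minimal sketch of such a report, using the glibc functions `backtrace` and `backtrace_symbols_fd` from `<execinfo.h>`. The helper name `report_deadlock`, its parameters, the exact message format, and the choice of EDEADLK as the returned error code are assumptions; the slides only state that a deadlock_found code is returned.

```c
#include <assert.h>
#include <errno.h>
#include <execinfo.h>
#include <stdio.h>
#include <unistd.h>

#define MAX_FRAMES 32

/* Hypothetical helper: waiters[i] is waiting for owners[i]; `detector`
   is the thread that found the cycle. */
static int report_deadlock(const unsigned *waiters, const unsigned *owners,
                           size_t n, unsigned detector)
{
    assert(n == 0 || (waiters != NULL && owners != NULL));

    /* 1. Dependency list. */
    for (size_t i = 0; i < n; i++)
        fprintf(stderr, "Thread ID %u is waiting for thread ID %u\n",
                waiters[i], owners[i]);
    fprintf(stderr, "Thread ID %u has detected a deadlock...\nBacktrace\n",
            detector);
    fflush(stderr);

    /* 2. Backtrace of the detecting thread, written straight to stderr.
       backtrace_symbols_fd does not call malloc. */
    void *frames[MAX_FRAMES];
    int depth = backtrace(frames, MAX_FRAMES);
    backtrace_symbols_fd(frames, depth, STDERR_FILENO);

    /* 3. Error code for the caller (assumed here to be EDEADLK). */
    return EDEADLK;
}
```

Function names appear in the backtrace only for symbols the dynamic linker can see, so an application may need to be linked with `-rdynamic` for its own functions to be named.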

What if deadlock is detected? stderr example (real example from the test):
Thread ID is waiting for thread ID
Thread ID is waiting for thread ID
Thread ID has detected a deadlock...
Backtrace
/home/dmitri/deadlock_detection/glibc261_build/nptl/libpthread.so.0 [0x2afa80b53fbc]
/home/dmitri/deadlock_detection/glibc261_build/nptl/libpthread.so.0 [0x2afa80b54088]
/home/dmitri/deadlock_detection/glibc261_build/nptl/libpthread.so.0(pthread_mutex_lock+0x1db) [0x2afa80b4c6ab]
./simp_test(lock_mutex+0x15) [0x400d41]
./simp_test(thr_b_func+0x5b) [0x400e4c]

Problems and solutions: Memory leakage. Description: When a thread finishes its task and exits, all the memory it used is freed, but other threads may still point at its digest. Solution: Add two fields to the digest structure: __is_alive, to delete the digest logically, and __ref_cnt, to count how many threads reference the digest. Free the memory only if __is_alive is false and __ref_cnt is zero.
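
The freeing rule can be sketched as follows. This is an illustration only: the `digest_t` type and the `digest_*` helpers are hypothetical, the fields are reduced to the ones the rule needs, and no locking is shown, so a real implementation would have to protect the counters against concurrent updates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct digest {
    unsigned tid;
    int ref_cnt;            /* how many other digests point at this one */
    bool is_alive;          /* false once the owning thread has exited */
    struct digest *next;    /* digest this thread is waiting for */
} digest_t;

static digest_t *digest_new(unsigned tid)
{
    digest_t *d = calloc(1, sizeof *d);
    if (d != NULL) { d->tid = tid; d->is_alive = true; }
    return d;
}

/* Free only when logically deleted AND unreferenced. Returns true if freed. */
static bool digest_maybe_free(digest_t *d)
{
    if (!d->is_alive && d->ref_cnt == 0) { free(d); return true; }
    return false;
}

/* The waiter starts pointing at the owner's digest. */
static void digest_link(digest_t *waiter, digest_t *owner)
{
    assert(waiter->next == NULL);
    waiter->next = owner;
    owner->ref_cnt++;
}

/* The waiter stops pointing; the owner's digest is freed if that was
   the last reference to an already exited thread. */
static bool digest_unlink(digest_t *waiter)
{
    digest_t *owner = waiter->next;
    if (owner == NULL) return false;
    waiter->next = NULL;
    owner->ref_cnt--;
    return digest_maybe_free(owner);
}

/* Thread exit: delete logically, free physically only if unreferenced. */
static bool digest_thread_exit(digest_t *self)
{
    self->is_alive = false;
    return digest_maybe_free(self);
}
```

If B exits while A still points at B's digest, the digest survives until A drops its reference, which is the behavior the two extra fields are meant to provide.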

Problems and solutions: Memory leakage, take 2. Description: The thread does not necessarily free the digest memory upon exiting, but the operating system does: it has a maintenance daemon process that frees all leaked memory. Solution: Add a global hash table that references all digests. Delete entries from the hash table using the same principle as for freeing the digest memory. The standard hash table of glibc 2.6 is not suitable because it supports a very limited number of operations, so another hash table library was added: ghtlib.

Problems and solutions: The mutex struct is used by many (all?) processes. Description: The mutex struct is used by many, if not all, processes, and a change in its size requires changes in the kernel. Solution: Remove the changes from the mutex struct and add another hash table that contains all mutexes, using the mutex address as the key and a pointer to the owner's digest as the data. Performance degradation? Not really!
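
The idea of keying a side table by mutex address can be sketched with a small open-addressing table. This stands in for the hash table library the project used and does not reproduce its API; the table size, the hash function, and the names `owner_table_t`, `set_owner`, and `get_owner` are assumptions, and the sketch omits the locking a shared table would need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 1024u   /* assumption: a power of two, large enough */

typedef struct {
    const void *mutex;     /* key: address of the mutex, NULL if slot unused */
    void *owner_digest;    /* value: digest of the owner, NULL if unowned */
} slot_t;

typedef struct {
    slot_t slots[TABLE_SIZE];   /* must start zero-initialized */
} owner_table_t;

static size_t hash_addr(const void *p)
{
    uintptr_t x = (uintptr_t)p >> 4;   /* drop alignment bits */
    return (size_t)((x * 2654435761u) & (TABLE_SIZE - 1u));
}

/* Linear probing: returns the slot holding `mutex`, or the first unused
   slot on its probe path, or NULL if the table is full. */
static slot_t *find_slot(owner_table_t *t, const void *mutex)
{
    size_t i = hash_addr(mutex);
    for (size_t probes = 0; probes < TABLE_SIZE; probes++) {
        slot_t *s = &t->slots[i];
        if (s->mutex == mutex || s->mutex == NULL)
            return s;
        i = (i + 1u) & (TABLE_SIZE - 1u);
    }
    return NULL;
}

/* Record (or clear, with digest == NULL) the owner of a mutex. */
static int set_owner(owner_table_t *t, const void *mutex, void *digest)
{
    assert(mutex != NULL);
    slot_t *s = find_slot(t, mutex);
    if (s == NULL) return -1;
    s->mutex = mutex;
    s->owner_digest = digest;
    return 0;
}

/* Look up the owner's digest; NULL means unknown or unowned. */
static void *get_owner(owner_table_t *t, const void *mutex)
{
    slot_t *s = find_slot(t, mutex);
    return (s != NULL && s->mutex == mutex) ? s->owner_digest : NULL;
}
```

Keys are never removed in this sketch (an unlock stores a NULL owner instead), which keeps linear probing correct without tombstones.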

Problems and solutions: The thread struct is used by many (all?) processes. Description: The thread struct is used by many, if not all, processes, and a change in its size requires changes in the kernel. Solution: Remove the changes from the thread struct and use one of the fields that the glibc authors provided as a size buffer.

Verification: The following test suite was used. Controlled deadlock. Description: Deliberately create a deadlock using a small number of threads, so that it is easy to monitor the detection, and compare to standard glibc. Test result: Passed. The dependency list was printed, while standard glibc froze in a deadlock. Statistical deadlock. Description: Create a number of threads (from 10 to 150, depending on the test) that randomly lock mutexes, controlling the probability of deadlock through the number of locked mutexes and the number of threads. Test result: The number of detected deadlocks is proportional to the number of randomly locked mutexes.

Verification (cont.): Typical graph (50 threads locking a number of random mutexes).

Performance Evaluation: The following benchmarks were used to evaluate performance. 1. Locking performance: Create a fixed number of threads. Let each thread lock and unlock the same number of mutexes without doing any other task. Repeat this for different numbers of mutexes, for both standard and modified glibc.

Locking performance (50 threads locking a number of mutexes).

Performance Evaluation (cont.): 2. CPU-bound performance: Create a fixed number of threads (50, for instance). Let each thread lock and unlock the same number of mutexes while doing a heavy calculation in between. Repeat this for different numbers of mutexes, for both standard and modified glibc. The graph below shows a typical picture, where 50 mutexes were used.
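
Both benchmarks can be approximated by one small harness, sketched below with standard POSIX calls. It is not the project's benchmark: the names `bench_arg_t`, `bench_worker`, and `run_bench`, the `rounds` parameter, and the floating-point loop that stands in for the "heavy calculation" are assumptions. Setting `work_iters` to 0 gives the pure locking benchmark; a positive value gives the CPU-bound variant.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* clock_gettime under strict -std=c11 */
#endif
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <time.h>

typedef struct {
    pthread_mutex_t *mutexes;
    int n_mutexes;
    int rounds;
    long work_iters;   /* 0 = locking benchmark, > 0 = CPU-bound benchmark */
} bench_arg_t;

static void *bench_worker(void *p)
{
    const bench_arg_t *a = p;
    volatile double sink = 0.0;   /* keeps the work loop from being optimized away */
    for (int r = 0; r < a->rounds; r++) {
        for (int i = 0; i < a->n_mutexes; i++) {
            /* One mutex at a time, always in index order: no deadlock here. */
            pthread_mutex_lock(&a->mutexes[i]);
            pthread_mutex_unlock(&a->mutexes[i]);
            for (long k = 0; k < a->work_iters; k++)
                sink += (double)k * 0.5;
        }
    }
    return NULL;
}

/* Returns elapsed wall-clock seconds, or -1.0 if setup failed. */
static double run_bench(int n_threads, int n_mutexes, int rounds, long work_iters)
{
    assert(n_threads > 0 && n_mutexes > 0);
    pthread_t *threads = malloc((size_t)n_threads * sizeof *threads);
    pthread_mutex_t *mutexes = malloc((size_t)n_mutexes * sizeof *mutexes);
    if (threads == NULL || mutexes == NULL) {
        free(threads);
        free(mutexes);
        return -1.0;
    }
    for (int i = 0; i < n_mutexes; i++)
        pthread_mutex_init(&mutexes[i], NULL);

    bench_arg_t arg = { mutexes, n_mutexes, rounds, work_iters };
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    int started = 0;
    while (started < n_threads &&
           pthread_create(&threads[started], NULL, bench_worker, &arg) == 0)
        started++;
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    for (int i = 0; i < n_mutexes; i++)
        pthread_mutex_destroy(&mutexes[i]);
    free(threads);
    free(mutexes);
    if (started != n_threads)
        return -1.0;
    return (double)(t1.tv_sec - t0.tv_sec) + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Running the same binary once against standard glibc and once against the modified library, for several values of `n_mutexes`, would give the two curves the slides compare. On older glibc versions the program must be built with `-pthread`.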

Performance Evaluation (cont.): CPU-bound performance.

Usage Proposal: Compile the library using the provided makefile. Either install it or keep it as an alternative to mainstream glibc, selected through the LD_LIBRARY_PATH environment variable. It can be configured to do the following: a) quit the task and return the error code in case of deadlock; b) print the deadlock information to stderr and remain in deadlock, letting the user decide what to do; c) detect hotspots by printing the __ref_cnt field of the digest structure (this may easily be added).

Summary: Integrating a deadlock detection mechanism into Linux is definitely possible. Simple deadlocks can and should be detected. Programs that are not performance-critical and perform a relatively low number of locking operations could use the deadlock detection mechanism all the time without a major impact on performance. NPTL is not completely decoupled from the kernel, so some changes in the kernel are needed to make deadlock detection more effective.

Final thoughts: Most personal computers have more than one core. Parallel programming is becoming more and more essential. The deadlock problem becomes THE PROBLEM. Deadlock prevention is not in our hands. Deadlock avoidance demands operating system overhead. Deadlock detection may be effective and low-cost in particular cases. The GNU C Library allows code modification.