Lecture 12 CV
Last lecture Controlling interrupts Test and set (atomic exchange) Compare and swap Load linked and store conditional Fetch and add and ticket locks
typedef struct __lock_t { int flag; int guard; queue_t *q; } lock_t; void lock_init(lock_t *m) { m->flag = 0; m->guard = 0; queue_init(m->q); }
void lock(lock_t *m) { while (TestAndSet(&m->guard, 1) == 1) ; //acquire guard lock by spinning if (m->flag == 0) { m->flag = 1; // lock is acquired m->guard = 0; } else { queue_add(m->q, gettid()); setpark(); m->guard = 0; park(); } }
void unlock(lock_t *m) { while (TestAndSet(&m->guard, 1) == 1) ; //acquire guard lock by spinning if (queue_empty(m->q)) m->flag = 0; // let go of lock; no one wants it else // hold lock (for next thread!) unpark(queue_remove(m->q)); m->guard = 0; }
Different Support on Linux On Linux, OS provides two calls: futex_wait(address, expected) puts the calling thread to sleep, assuming the value at address is equal to expected. If it is not equal, the call returns immediately. futex_wake(address) wakes one thread that is waiting on the queue.
void lock(lock_t *m) { int v; /* Bit 31 was clear, we got the mutex (fastpath) */ if (atomic_bit_test_set (m, 31) == 0) return; atomic_increment (m); while (1) { if (atomic_bit_test_set (m, 31) == 0) { atomic_decrement (m); return; } /* We have to wait now. First make sure the futex value we are monitoring is truly negative (i.e. locked). */ v = *m; if (v >= 0) continue; futex_wait (m, v); }
void unlock(lock_t *m) { /* Adding 0x to the counter results in 0 if & only if there are not other interested threads */ if (atomic_add_zero (mutex, 0x )) return; /* There are other threads waiting for this mutex, wake one of them up. */ futex_wake (mutex); }
Lock Usage Examples Concurrent Counters Concurrent Linked Lists Concurrent Queues Concurrent Hash Table
Concurrency Objectives Mutual exclusion (e.g., A and B don’t run at same time) solved with locks Ordering (e.g., B runs after A) solved with condition variables
Condition Variables CV’s are more like channels than variables. B waits for a signal on channel before running. A sends signal when it is time for B to run. A CV also has a queue of waiting threads. A CV is usually PAIRED with some kind state variable.
wait and signal wait(cond_t *cv, mutex_t *lock) assumes the lock is held when wait() is called puts caller to sleep + releases the lock (atomically) when awoken, reacquires lock before returning signal(cond_t *cv) wake a single waiting thread (if >= 1 thread is waiting) if there is no waiting thread, just return w/o doing anything
Ordering Example: Join pthread_t p1, p2; printf("main: begin [balance = %d]\n", balance); Pthread_create(&p1, NULL, mythread, "A"); Pthread_create(&p2, NULL, mythread, "B"); // join waits for the threads to finish Pthread_join(p1, NULL); Pthread_join(p2, NULL); printf("main: done\n [balance: %d]\n [should: %d]\n", balance, max*2); return 0;
Implementing Join with CV’s (Correct) void thread_exit() { Mutex_lock(&m); done = 1; // a Cond_signal(&c); // b Mutex_unlock(&m); } void thread_join() { Mutex_lock(&m); // w while (done == 0) // x Cond_wait(&c, &m); // y Mutex_unlock(&m); // z }
Implementing Join with CV’s (wrong 1) void thread_exit() { done = 1; // a Cond_signal(&c); // b } void thread_join() { Mutex_lock(&m); // w if (done == 0) // x Cond_wait(&c, &m); // y Mutex_unlock(&m); // z }
Implementing Join with CV’s (wrong 2) void thread_exit() { Mutex_lock(&m); // a Cond_signal(&c); // b Mutex_unlock(&m); // c } void thread_join() { Mutex_lock(&m); // x Cond_wait(&c, &m); // y Mutex_unlock(&m); // z }
Implementing Join with CV’s (wrong 3) void thread_exit() { done = 1; // a Cond_signal(&c); // b } void thread_join() { if (done == 0) // x Cond_wait(&c, &m); // y }
Good Rule of Thumb Keep state in addition to CV’s! CV’s are used to nudge threads when state changes. If state is already as needed, don’t wait for a nudge! Always do wait and signal while holding the lock!
Implementing Join with CV’s (Correct?) void thread_exit() { Mutex_lock(&m); done = 1; // a Cond_signal(&c); // b Mutex_unlock(&m); } void thread_join() { Mutex_lock(&m); // w if (done == 0) // x Cond_wait(&c, &m); // y Mutex_unlock(&m); // z }
Implementing Join with CV’s (Correct?) void thread_exit() { Mutex_lock(&m); done = 1; // a Mutex_unlock(&m); Cond_signal(&c); // b } void thread_join() { Mutex_lock(&m); // w if (done == 0) // x Cond_wait(&c, &m); // y Mutex_unlock(&m); // z }
Implementing Join with CV’s (Correct?) void thread_exit() { done = 1; // a Mutex_lock(&m); Cond_signal(&c); // b Mutex_unlock(&m); } void thread_join() { Mutex_lock(&m); // w if (done == 0) // x Cond_wait(&c, &m); // y Mutex_unlock(&m); // z }
Producer/Consumer Problem Example: UNIX Pipes A pipe may have many writers and readers. Internally, there is a finite-sized buffer. Writers add data to the buffer. Readers remove data from the buffer. Implementation: reads/writes to buffer require locking when buffers are full, writers must wait when buffers are empty, readers must wait
Producer/Consumer Problem Producers generate data (like pipe writers). Consumers grab data and process it (like pipe readers). Producer/consumer problems are frequent in systems. Pipes Web servers Memory allocators Device I/O …
Queue get/put void put(int value) { assert(count == 0); count = 1; buffer = value; } int get() { assert(count == 1); count = 0; return buffer; }
Solution v0 void *consumer(int loops) { for (int i=0; i < loops; i++){ int tmp = get(i); printf("%d\n", tmp); } void *producer(int loops) { for (int i=0; i < loops; i++){ put(i); }
Solution v1 void *consumer(int loops) { for (int i=0; i < loops; i++){ Mutex_lock(&m); //c1 if (count == 0) //c2 Cond_wait(&C, &m); //c3 int tmp = get(); //c4 Cond_signal(&C); //c5 Mutex_unlock(&m); //c6 printf("%d\n", tmp); } void *producer(int loops) { for (int i=0; i < loops; i++){ Mutex_lock(&m); //p1 if (count == 1) //p2 Cond_wait(&C, &m); //p3 put(i); //p4 Cond_signal(&C); //p5 Mutex_unlock(&m); //p6 }
Solution v2 void *consumer(int loops) { for (int i=0; i < loops; i++){ Mutex_lock(&m); //c1 while (count == 0) //c2 Cond_wait(&C, &m); //c3 int tmp = get(); //c4 Cond_signal(&C); //c5 Mutex_unlock(&m); //c6 printf("%d\n", tmp); } void *producer(int loops) { for (int i=0; i < loops; i++){ Mutex_lock(&m); //p1 while (count == 1) //p2 Cond_wait(&C, &m); //p3 put(i); //p4 Cond_signal(&C); //p5 Mutex_unlock(&m); //p6 }
Better solution (usually): use two CVs Solution v3 void *consumer(int loops) { while (1) { Mutex_lock(&m); //c1 while (count == 0) //c2 Cond_wait(&F, &m); //c3 int tmp = get(); //c4 Cond_signal(&E); //c5 Mutex_unlock(&m); //c6 printf("%d\n", tmp); } void *producer(int loops) { for (int i=0; i < loops; i++){ Mutex_lock(&m); //p1 while (count == 1) //p2 Cond_wait(&E, &m); //p3 put(i); //p4 Cond_signal(&F); //p5 Mutex_unlock(&m); //p6 }
Summary: rules of thumb Keep state in addition to CV’s Always do wait/signal with lock held Whenever you acquire a lock, recheck state
Queue get/put void put(int value) { buffer[fill] = value; fill = (fill + 1) % max; count ++; } int get() { int tmp = buffer[use]; use = (use + 1) % max; count -; return tmp; }
Solution v4 (final) void *consumer(void *arg) { while (1) { Mutex_lock(&m); //c1 while (count == 0) //c2 Cond_wait(&F, &m); //c3 int tmp = get(); //c4 Cond_signal(&E); //c5 Mutex_unlock(&m); //c6 printf("%d\n", tmp); } void *producer(void *arg) { for (int i=0; i < loops; i++){ Mutex_lock(&m); //p1 while (count == max ) //p2 Cond_wait(&E, &m); //p3 put(i); //p4 Cond_signal(&F); //p5 Mutex_unlock(&m); //p6 }
Solution v5 void *consumer(void *arg) { while (1) { Mutex_lock(&m); //c1 while (numfull == 0) //c2 Cond_wait(&F, &m); //c3 Mutex_unlock(&m); //c3a int tmp = get(); //c4 Mutex_lock(&m); //c5a Cond_signal(&E); //c5 Mutex_unlock(&m); //c6 printf("%d\n", tmp); } void *producer(void *arg) { for (int i=0; i < loops; i++){ Mutex_lock(&m); //p1 while (numfull == max) //p2 Cond_wait(&E, &m); //p3 Mutex_unlock(&m); //p3a put(i); //p4 Mutex_lock(&m); //p5a Cond_signal(&F); //p5 Mutex_unlock(&m); //p6 }
How to wake the right thread? wait(cond_t *cv, mutex_t *lock) assumes the lock is held when wait() is called puts caller to sleep + releases the lock (atomically) when awoken, reacquires lock before returning signal(cond_t *cv) wake a single waiting thread (if >= 1 thread is waiting) if there is no waiting thread, just return, doing nothing broadcast(cond_t *cv) wake all waiting threads (if >= 1 thread is waiting) if there are no waiting thread, just return, doing nothing
Next: Concurrency bugs