COT 5611 Operating Systems Design Principles Spring 2012 Dan C. Marinescu Office: HEC 304 Office hours: M-Wd 5:00-6:00 PM
Lecture 19 – Wednesday March 21, 2012
Reading assignment: Chapter 9 from the on-line text
Last time
- All-or-nothing and before-or-after atomicity
- Atomicity and processor management
- Processes, threads, and address spaces
- Thread coordination with a bounded buffer: the naïve approach
- Thread management
- Address spaces and multi-level memories
- Kernel structures for the management of multiple cores/processors and threads/processes
3/21/2012 Lecture 19
Today
- Locks and before-or-after actions; hardware support for locks
- YIELD
- Conditions for thread coordination: Safety, Liveness, Bounded-Wait, Fairness
- Critical sections: a solution to the critical section problem
- Deadlocks
- Signals
- Semaphores
- Monitors
- Thread coordination with a bounded buffer: WAIT, NOTIFY, AWAIT, ADVANCE, SEQUENCE, TICKET
Locks; Before-or-After actions
- A lock is a shared variable that acts as a flag to coordinate access to shared data.
- Locks are manipulated with two primitives: ACQUIRE and RELEASE.
- Locks support the implementation of before-or-after actions; only one thread can acquire the lock, the others have to wait.
- All threads must obey the convention regarding the locks.
- The two operations ACQUIRE and RELEASE must be atomic.
- Hardware support for the implementation of locks:
  - RSM: Read and Set Memory
  - CMP: Compare and Swap
- RSM(mem) returns the old value of mem and always leaves mem=LOCKED:
  - If mem=LOCKED then RSM returns r=LOCKED and sets mem=LOCKED
  - If mem=UNLOCKED then RSM returns r=UNLOCKED and sets mem=LOCKED
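The way ACQUIRE and RELEASE can be built on RSM may be easier to see in code. The following is a minimal Python sketch, not the course's implementation: the class `MemoryCell`, the helper `run_counter_demo`, and the internal mutex `_hw` are hypothetical names, and the mutex only stands in for the hardware guarantee that RSM is indivisible.

```python
import threading

LOCKED, UNLOCKED = 1, 0


class MemoryCell:
    """One word of shared memory offering an atomic read-and-set (RSM)."""

    def __init__(self):
        self.value = UNLOCKED
        # Assumption: this mutex models the hardware that makes RSM atomic.
        self._hw = threading.Lock()

    def rsm(self):
        """Atomically return the old value and leave the cell LOCKED."""
        with self._hw:
            old = self.value
            self.value = LOCKED
            return old


def acquire(lock):
    # Spin until RSM reports the lock was UNLOCKED at the moment we set it.
    while lock.rsm() == LOCKED:
        pass


def release(lock):
    # A single store is assumed to be atomic.
    lock.value = UNLOCKED


def run_counter_demo(n_threads=4, increments=200):
    """Several threads increment a shared counter inside the lock."""
    lock = MemoryCell()
    counter = {"value": 0}

    def worker():
        for _ in range(increments):
            acquire(lock)
            counter["value"] += 1  # critical section
            release(lock)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]
```

Only the thread whose RSM call returns UNLOCKED enters the critical section; every other caller sees LOCKED and keeps spinning until RELEASE stores UNLOCKED.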
Important facts to remember
- Each thread has a unique ThreadId.
- Threads save their state on the stack.
- The stack pointer of a thread is stored in the thread table.
- To activate a thread, the registers of the processor are loaded with information from the thread state.
- What if no thread is able to run?
  - Create a dummy thread for each processor, called a processor_thread, which is scheduled to run when no other thread is available.
  - The processor_thread runs in the thread layer.
  - The SCHEDULER runs in the processor layer.
- We have a processor thread for each processor/core.
- We can use spin locks only if the two processes (the producer and the consumer) run on different CPUs; we need an active process to release a spin lock.
Switching threads with dynamic thread creation
- Switching from one user thread to another requires two steps:
  - Switch from the thread releasing the processor to the processor thread.
  - Switch from the processor thread to the new thread, which is going to have control of the processor.
- The last step requires the SCHEDULER to cycle through the thread_table until a thread ready to run is found.
- The boundary between the user layer threads and the processor layer thread is crossed twice.
- Example: switch from thread 0 to thread 6 using YIELD, ENTER_PROCESSOR_LAYER, and EXIT_PROCESSOR_LAYER.
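The scan of the thread_table can be sketched as follows. This is an illustrative Python model only: it tracks thread states but does not save or reload any stack pointer, and the function names `scheduler_pick` and `yield_processor` as well as the WAITING state are assumptions made for the example.

```python
RUNNING, RUNNABLE, WAITING = "RUNNING", "RUNNABLE", "WAITING"


def scheduler_pick(thread_table, current):
    """Scan thread_table circularly, starting just after `current`.

    Returns the index of the first RUNNABLE thread, or None if there is none.
    """
    n = len(thread_table)
    for step in range(1, n + 1):
        candidate = (current + step) % n
        if thread_table[candidate] == RUNNABLE:
            return candidate
    return None


def yield_processor(thread_table, current):
    """Model the two-step switch performed by YIELD (states only)."""
    # Step 1: the caller gives up the processor (RUNNING -> RUNNABLE) and
    # control passes to the processor thread, which runs the scheduler.
    thread_table[current] = RUNNABLE
    chosen = scheduler_pick(thread_table, current)
    # Step 2: the processor thread hands the processor to the chosen thread.
    thread_table[chosen] = RUNNING
    return chosen


# Thread 0 is running, threads 1..5 cannot run, thread 6 is ready.
table = [RUNNING] + [WAITING] * 5 + [RUNNABLE]
next_thread = yield_processor(table, 0)
```

With this table the scan skips threads 1 to 5 and selects thread 6, matching the example on the slide. Because the caller is marked RUNNABLE before the scan, it is selected again when no other thread is ready.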
The control flow when switching from one thread to another
- The control flow is not obvious, as some of the procedures reload the stack pointer (SP).
- When a procedure reloads the stack pointer, its return transfers control to the procedure whose SP was saved earlier and reloaded before the return, not to the procedure that called it.
- ENTER_PROCESSOR_LAYER
  - Changes the state of the thread calling YIELD from RUNNING to RUNNABLE.
  - Saves the state of the procedure calling it, YIELD, on the stack.
  - Loads the processor's registers with the state of the processor thread, thus starting the SCHEDULER.
- EXIT_PROCESSOR_LAYER
  - Saves the state of the processor thread into the corresponding PROCESSOR_TABLE entry and loads the state of the thread selected by the SCHEDULER to run (thread 6 in our example) into the processor's registers.
  - Loads the SP with the value saved by ENTER_PROCESSOR_LAYER.
Correction: in ENTER_PROCESSOR_LAYER, instead of SCHEDULER() there should be
    SP ← processor_table[processor].topstack
Implicit assumptions for the correctness of the implementation
- One sending and one receiving thread. Only one thread updates each shared variable.
- Sender and receiver threads run on different processors, to allow spin locks.
- in and out are implemented as integers large enough so that they do not overflow (e.g., 64-bit integers).
- The shared memory used for the buffer provides read/write coherence.
- The memory provides before-or-after atomicity for the shared variables in and out.
- The result of executing a statement becomes visible to all threads in program order. No compiler optimization is supported.
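To make the assumptions concrete, here is a sketch of a naïve bounded buffer of the kind they apply to: one sender, one receiver, no lock, and spinning on the shared variables in and out. It is written in Python for illustration and is not the code from the lecture figure; `N`, `BoundedBuffer`, and `run_demo` are hypothetical names, and `in_` is used because `in` is a Python keyword.

```python
import threading

N = 4  # buffer capacity (illustrative value)


class BoundedBuffer:
    def __init__(self):
        self.message = [None] * N
        self.in_ = 0   # number of items sent so far; written only by the sender
        self.out = 0   # number of items received so far; written only by the receiver


def send(buf, msg):
    while buf.in_ - buf.out == N:   # buffer full: spin until the receiver advances out
        pass
    buf.message[buf.in_ % N] = msg
    buf.in_ += 1                    # publish the item only after it is stored


def receive(buf):
    while buf.in_ == buf.out:       # buffer empty: spin until the sender advances in
        pass
    msg = buf.message[buf.out % N]
    buf.out += 1                    # free the slot only after it is read
    return msg


def run_demo(count=40):
    """One sender thread and one receiver thread exchange `count` items."""
    buf = BoundedBuffer()
    received = []

    def sender():
        for i in range(count):
            send(buf, i)

    def receiver():
        for _ in range(count):
            received.append(receive(buf))

    threads = [threading.Thread(target=sender), threading.Thread(target=receiver)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

The sketch is correct only because each of in and out has a single writer and the stores become visible in program order; with a second sender, two threads could both write slot `in_ % N` before either increments in.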
In practice…
- Threads run concurrently.
- Race conditions may occur: data in the buffer may be overwritten.
- Use a lock for the bounded buffer:
  - the producer acquires the lock before writing
  - the consumer acquires the lock before reading
We have to avoid deadlocks
- If a producer thread cannot write because the buffer is full, it has to release the lock to allow the consumer thread to acquire the lock to read; otherwise we have a deadlock.
- If a consumer thread cannot read because there is no new item in the buffer, it has to release the lock to allow the producer thread to acquire the lock to write; otherwise we have a deadlock.
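The release-and-retry rule can be sketched in Python as below. This is an illustration under assumptions, not the lecture's code: `LockedBuffer`, `send`, `receive`, and `run_demo` are hypothetical names, and Python's `threading.Lock` plays the role of the ACQUIRE/RELEASE lock.

```python
import threading

N = 4  # buffer capacity (illustrative value)


class LockedBuffer:
    def __init__(self):
        self.lock = threading.Lock()
        self.message = [None] * N
        self.in_ = 0
        self.out = 0


def send(buf, msg):
    while True:
        buf.lock.acquire()
        if buf.in_ - buf.out < N:
            buf.message[buf.in_ % N] = msg
            buf.in_ += 1
            buf.lock.release()
            return
        # Buffer full: release the lock so a consumer can make room, then retry.
        buf.lock.release()


def receive(buf):
    while True:
        buf.lock.acquire()
        if buf.in_ != buf.out:
            msg = buf.message[buf.out % N]
            buf.out += 1
            buf.lock.release()
            return msg
        # Buffer empty: release the lock so a producer can add an item, then retry.
        buf.lock.release()


def run_demo(per_producer=20):
    """Two producers and one consumer share one locked buffer."""
    buf = LockedBuffer()
    received = []

    def producer(base):
        for i in range(per_producer):
            send(buf, base + i)

    def consumer():
        for _ in range(2 * per_producer):
            received.append(receive(buf))

    threads = [
        threading.Thread(target=producer, args=(0,)),
        threading.Thread(target=producer, args=(1000,)),
        threading.Thread(target=consumer),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

If `send` kept the lock while waiting for space, `receive` could never acquire it to remove an item, and both threads would wait forever. Releasing before retrying removes that circular wait, at the cost of busy waiting.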
In practice…
- We have to ensure atomicity of some operations, e.g., updating the pointers.
One more pitfall of the previous implementation of the bounded buffer
- If in and out are long integers (64 or 128 bit), then a load requires two registers, e.g., R1 and R2.
    int “00000000FFFFFFFF”
    L R1,int      /* R1 ← 00000000 */
    L R2,int+1    /* R2 ← FFFFFFFF */
- Race conditions could affect a load or a store of the long integer: another thread may update the variable between the two loads, or between the two stores.
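The effect of such a race can be reproduced deterministically. The Python sketch below simulates one unlucky interleaving by hand rather than with real threads; it assumes the 64-bit integer is stored as two 32-bit words with the high word first, as in the example above, and the helper names `split`, `join`, and `torn_read_demo` are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def split(value):
    """Return the (high, low) 32-bit words of a 64-bit integer."""
    return (value >> 32) & MASK32, value & MASK32


def join(high, low):
    """Rebuild a 64-bit integer from its two 32-bit words."""
    return (high << 32) | low


def torn_read_demo():
    old = 0x00000000FFFFFFFF
    memory = list(split(old))        # memory[0] = high word, memory[1] = low word

    r1 = memory[0]                   # reader: L R1,int    -> old high word

    # The writer increments the integer between the reader's two loads.
    new = old + 1                    # 0x0000000100000000
    memory[0], memory[1] = split(new)

    r2 = memory[1]                   # reader: L R2,int+1  -> new low word

    return old, new, join(r1, r2)


old, new, observed = torn_read_demo()
```

The reader combines the old high word with the new low word and observes 0, a value the variable never held.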
In practice the threads may run on the same system…
- We cannot use spin locks for a thread to wait until an event occurs: while the waiting thread spins, the thread that would cause the event may never get the processor.
- That's why we have spent time on YIELD.
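A small model of waiting with YIELD on a single processor is sketched below. It uses Python generators as cooperative threads, where the `yield` statement stands in for YIELD and the loop in `run_on_one_processor` stands in for the processor thread; all function names and the capacity `N` are assumptions made for the example.

```python
from collections import deque

N = 2  # buffer capacity (illustrative value)


def producer(buf, items):
    for item in items:
        while len(buf) == N:
            yield                 # buffer full: give up the processor
        buf.append(item)


def consumer(buf, count, received):
    while len(received) < count:
        while not buf:
            yield                 # buffer empty: give up the processor
        received.append(buf.popleft())


def run_on_one_processor(threads):
    """Round-robin over cooperative threads until all have finished."""
    ready = deque(threads)
    while ready:
        thread = ready.popleft()
        try:
            next(thread)          # run the thread until its next YIELD
            ready.append(thread)  # still alive: back of the ready queue
        except StopIteration:
            pass                  # thread finished


buf = deque()
received = []
# The consumer is scheduled first and finds the buffer empty.
run_on_one_processor([consumer(buf, 5, received), producer(buf, range(5))])
```

If the consumer replaced `yield` with a plain busy loop, it would hold the only processor forever and the producer would never run; yielding lets the producer fill the buffer so that the consumer can make progress on its next turn.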