Multi-processor Scheduling

Presentation transcript:

Multi-processor Scheduling

Two implementation choices:
  Single, global ready queue
  Per-processor run queue
Which is better? Which choice is preferred in practice?

Queue-per-processor

Advantages of a queue per processor:
  Promotes processor affinity (better cache locality)
  Removes a centralized bottleneck: the single global queue lives in shared memory
Supported by default in Linux 2.6
Java 1.6 support: a double-ended queue (java.util.Deque)
  Use a bounded buffer per consumer
  If nothing is in a consumer's queue, steal work from somebody else
  If too much is in the queue, push work somewhere else
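The steal/push idea above can be sketched in Java. This is an illustrative toy, not any real scheduler's code: the names WorkQueue and Worker are made up, and a coarsely synchronized ArrayDeque stands in for the lock-free deques that production work-stealing schedulers (e.g., java.util.concurrent.ForkJoinPool) use.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch of per-worker queues with work stealing (illustrative only).
class WorkQueue {
    private final Deque<Runnable> tasks = new ArrayDeque<>();

    // The owner thread pushes and pops at the head (LIFO: good cache locality).
    synchronized void push(Runnable r) { tasks.addFirst(r); }
    synchronized Runnable pop()        { return tasks.pollFirst(); }

    // Thieves take from the tail (FIFO: the oldest, likely largest, tasks).
    synchronized Runnable steal()      { return tasks.pollLast(); }
}

class Worker {
    final WorkQueue mine;
    final WorkQueue[] all;   // every worker's queue, so we can steal

    Worker(WorkQueue mine, WorkQueue[] all) { this.mine = mine; this.all = all; }

    // Prefer local work; if the local queue is empty, steal from the others.
    Runnable nextTask() {
        Runnable r = mine.pop();
        if (r != null) return r;
        for (WorkQueue q : all) {
            if (q == mine) continue;
            r = q.steal();
            if (r != null) return r;
        }
        return null;  // no runnable work anywhere
    }
}
```

Because the owner works at one end of the deque and thieves at the other, contention between them is rare, which is exactly why a per-processor deque removes the centralized bottleneck of the global queue.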

Thread Implementation Issues Andrew Whitaker

Where do Threads Come From?

A few choices:
  The operating system
  A user-mode library
  Some combination of the two…
How expensive are system calls?

Option #1: Kernel Threads

Threads are implemented inside the OS:
  Thread operations (creation, deletion, yield) are system calls
  Scheduling is handled by the OS scheduler
Described as "one-to-one": one user thread is mapped to one kernel thread, so every invocation of Thread.start() creates a kernel thread.
[diagram: a process whose user threads each map to their own OS thread]

Option #2: User Threads

Implemented as a library inside a process:
  All operations (creation, destruction, yield) are normal procedure calls
Described as "many-to-one": many user-perceived threads map to a single OS process/thread.
[diagram: a process whose user threads all map to one OS thread]

Process Address Space Review

Every process has a user stack and a program counter. In addition, each process has a kernel stack and program counter (not shown here).
[diagram: address space with stack (SP), heap (dynamically allocated memory), static data (data segment), and code (text segment, PC)]

Threaded Address Space

In the user address space, every thread always has its own user stack and program counter; this holds for both user and kernel threads. For user threads, there is only a single kernel stack, program counter, PCB, etc.
[diagram: one address space with three thread stacks (SP for T1, T2, T3), heap, static data, and code, with a separate PC per thread]

User Threads vs. Kernel Threads

User threads are faster: operations do not pass through the OS. But user threads suffer from:
  Lack of physical parallelism: they run on only a single processor!
  Poor performance with I/O: a single blocking operation stalls the entire application
For these reasons, most (all?) major OSes provide some form of kernel threads.

When Would User Threads Be Useful?

The pi calculator? The web server? The Fibonacci GUI? This boils down to understanding why we were using threads for these applications in the first place. The pi calculator uses threads to take advantage of physical parallelism, so pure user threads are not very useful. Likewise, the web server relies on multiple processors and blocking I/O operations, both of which are handled poorly by user threads. Only the Fibonacci GUI stands to benefit from user threads: there, we are using threads to make the system more responsive.

Option #3: Two-level Model

The OS supports native multi-threading, and a user library maps multiple user threads to a single kernel thread ("many-to-many"). This potentially captures the best of both worlds:
  Cheap thread operations
  Parallelism
[diagram: a process whose many user threads map onto a smaller set of OS threads]

Problems with Many-to-Many Threads

Lack of coordination between the user and kernel schedulers: the left hand is not talking to the right. Specific problems:
  Poor performance: e.g., the OS preempts a thread holding a crucial lock
  Deadlock: given K kernel threads, at most K user threads can block; other runnable threads are starved out!

Scheduler Activations (UW, 1991)

Adds a layer of communication between the kernel and user schedulers. Examples:
  The kernel tells user mode that a task has blocked, so the user scheduler can re-use that execution context
  The kernel tells user mode that a task is ready to resume, allowing the user scheduler to alter the user-thread/kernel-thread mapping
Supported by the newest release of NetBSD.

Implementation Spilling Over into the Interface

In practice, programmers have learned to live with expensive kernel threads. For example, thread pools re-use a static set of threads throughout the lifetime of the program.
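The thread-pool pattern is built into Java's standard library. A minimal sketch: the pool's kernel threads are created once, and every submitted task re-uses one of them, so the system-call cost of thread creation is paid only at startup.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PoolDemo {
    public static void main(String[] args) throws Exception {
        // A fixed pool of 4 kernel threads, created once and reused
        // for every task submitted over the program's lifetime.
        ExecutorService pool = Executors.newFixedThreadPool(4);

        Future<Integer> result = pool.submit(() -> 6 * 7); // runs on a pooled thread
        System.out.println(result.get());                  // prints 42
        pool.shutdown();                                   // stop accepting new tasks
    }
}
```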

Locks

Used for implementing critical sections:

    interface Lock {
        // only one thread allowed between an acquire and a release
        public void acquire();
        public void release();
    }

Modern languages (Java, C#) implicitly acquire and release locks.

Two Varieties of Locks

Spin locks:
  Threads busy-wait until the lock is freed
  The thread stays in the ready/running state
Blocking locks:
  Threads yield the processor until the lock is freed
  The thread transitions to the blocked state

Why Use Spin Locks?

In general, busy waiting is bad. Why would we ever favor spin locks?
  Spin locks can be faster: no context switching is required
  Sometimes blocking is not an option, e.g., inside the kernel scheduler implementation
Spin locks are never used on a uniprocessor: spinning there only keeps the lock holder from running.

Bogus Spin Lock Implementation

    class SpinLock implements Lock {
        private volatile boolean isLocked = false;

        public void acquire() {
            while (isLocked) { ; }   // busy wait
            isLocked = true;
        }

        public void release() {
            isLocked = false;
        }
    }

Multiple threads can acquire this lock! (A thread can be preempted between seeing isLocked == false and setting it to true.)

Hardware Support for Locking

Problem: lack of atomicity in testing and setting the isLocked flag.
Solution: hardware-supported atomic instructions, e.g., atomic test-and-set. Java conveniently abstracts these primitives (AtomicInteger and friends).

Corrected Spin Lock

    class SpinLock implements Lock {
        private final AtomicBoolean isLocked = new AtomicBoolean(false);

        public void acquire() {
            // atomically get the old value and set a new value
            while (isLocked.getAndSet(true)) { ; }   // busy wait
        }

        public void release() {
            assert (isLocked.get() == true);
            isLocked.set(false);
        }
    }
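To see that the atomic getAndSet really closes the race, here is a small self-contained demo (the Lock interface is dropped so the snippet compiles on its own; SpinLockDemo and its thread/iteration counts are illustrative choices, not from the slides). Two threads increment a shared counter 100,000 times each under the spin lock; with the bogus lock the final count would usually fall short, but here it is always 200,000.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class SpinLockDemo {
    static final AtomicBoolean isLocked = new AtomicBoolean(false);
    static int counter = 0;   // shared state, protected by the spin lock

    static void acquire() { while (isLocked.getAndSet(true)) { ; } }  // busy wait
    static void release() { isLocked.set(false); }

    public static void main(String[] args) throws InterruptedException {
        Runnable work = () -> {
            for (int i = 0; i < 100_000; i++) {
                acquire();
                counter++;        // critical section: one thread at a time
                release();
            }
        };
        Thread t1 = new Thread(work), t2 = new Thread(work);
        t1.start(); t2.start();
        t1.join();  t2.join();
        System.out.println(counter);  // always 200000
    }
}
```

Note that counter itself needs no volatile: the volatile write in release() and the volatile read-modify-write in acquire() establish the happens-before ordering that publishes each increment to the next lock holder.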

Blocking Locks: Acquire Implementation

  Atomically test-and-set the locked status
  If the lock is already held:
    Set the thread state to blocked
    Add the PCB (task_struct) to a wait queue
    Invoke the scheduler
Problem: we must ensure thread-safe access to the wait queue!
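The steps above can be mimicked at user level in Java. This is only an analogy, under stated assumptions: a real kernel lock queues PCBs and calls the scheduler, whereas this hypothetical BlockingLock class queues Thread objects and lets LockSupport.park() play the role of blocking (unpark grants a one-shot permit, so a wakeup sent before park() is not lost). The retest loop handles both spurious wakeups and the wait-queue race the slide warns about.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

// User-level analogue of a blocking lock: atomic test-and-set,
// then enqueue the current thread and block it.
class BlockingLock {
    private final AtomicBoolean locked = new AtomicBoolean(false);
    private final Queue<Thread> waiters = new ConcurrentLinkedQueue<>();

    public void acquire() {
        waiters.add(Thread.currentThread());
        // Only the head of the queue may take the lock (FIFO fairness).
        // Re-test after every wakeup: park() can return spuriously, and
        // a release() may have slipped in before we actually parked.
        while (waiters.peek() != Thread.currentThread()
               || !locked.compareAndSet(false, true)) {
            LockSupport.park();
        }
        waiters.poll();  // we hold the lock now; leave the wait queue
    }

    public void release() {
        locked.set(false);
        Thread next = waiters.peek();
        if (next != null) LockSupport.unpark(next);  // wake the next waiter
    }
}
```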

Disabling Interrupts

Disabling interrupts prevents the processor from being interrupted, and so serves as a coarse-grained lock. It must be used with extreme care: no I/O or timers can be processed while interrupts are off.

Thread-safe Blocking Locks

  Atomically test-and-set the locked status
  If the lock is already held:
    Set the thread state to blocked
    Disable interrupts
    Add the PCB (task_struct) to a wait queue
    Invoke the scheduler
    The next task re-enables interrupts

Disabling Interrupts on a Multiprocessor

Disabling interrupts can be done locally or globally (for all processors), but global disabling is extremely heavyweight. Linux's spin_lock_irq instead:
  Disables interrupts on the local processor
  Grabs a spin lock to lock out the other processors

Preview For Next Week

    public class Example extends Thread {
        private static int x = 1;
        private static int y = 1;
        private static boolean ready = false;

        public static void main(String[] args) {
            Thread t = new Example();
            t.start();
            x = 2;
            y = 2;
            ready = true;
        }

        public void run() {
            while (!ready)
                Thread.yield();   // give up the processor
            System.out.println("x= " + x + " y= " + y);
        }
    }

What Does This Program Print?

Answer: it's a race condition, and many different outputs are possible:
  x=2, y=2
  x=1, y=2
  x=2, y=1
  x=1, y=1
Or the program may print nothing at all: without synchronization, the write to ready may never become visible to the second thread, so the ready loop runs forever.
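As a preview of the memory-model discussion, one plausible fix is a single keyword: declaring ready volatile. Under the Java Memory Model, the writes to x and y happen-before the volatile write to ready, which happens-before the volatile read that exits the loop, so the program must terminate and must print x=2 y=2. (The join() call is an addition so main waits for the output.)

```java
public class Example extends Thread {
    private static int x = 1;
    private static int y = 1;
    private static volatile boolean ready = false;  // the one-word fix

    public static void main(String[] args) throws InterruptedException {
        Thread t = new Example();
        t.start();
        x = 2;
        y = 2;
        ready = true;   // volatile write: publishes x and y as well
        t.join();       // wait for the child thread to print
    }

    public void run() {
        while (!ready)
            Thread.yield();   // give up the processor
        // The volatile read of ready above happens-after the write in
        // main, so x and y are guaranteed to be 2 here.
        System.out.println("x=" + x + " y=" + y);  // prints x=2 y=2
    }
}
```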