Parallel Programming I: Implementation in Java
References: the course "Parallel and Distributed Programming", Faculty of Computer Science, Technion; the course "Distributed Information Systems", Faculty of Industrial Engineering and Management, Technion.
Processes & Threads in Java
Definitions: A thread is a single sequential flow of control within a program. A process is a combination of one or more threads and an address space.
What Is a Thread? Sequential programs have a beginning, an execution sequence, and an end. At any given time during the runtime of the program there is a single point of execution. A thread is similar to the sequential programs: it has a beginning, a sequence, and an end, and at any given time during the runtime of the thread, there is a single point of execution. However, a thread itself is not a program. It runs within a program, and it cannot run on its own.
Thread vs. Process
Similarities:
- Each has a beginning, an execution sequence (of commands), and an end.
- Each has a program counter.
- Each has an execution stack.
Differences:
- Process: has registers and an address space in addition to the program counter and execution stack.
- Thread: has to run within a program (it is considered a lightweight process because it runs within the context of a program).
Importance: multiple threads can be used in a single program, running at the same time and performing different tasks.
Examples of Multithreaded Applications 1.A web browser. Within the browser you can scroll a page while it's downloading an image, play animation and sound concurrently, or print a page in the background while you download a new page 2.A word processor. The user interacts with the program while it is doing some internal processing, like printing a file "in the background".
Using Threads
1. Threads allow speedup due to interleaving of I/O tasks and computational tasks.
2. Threads are a natural and easy way to write programs that do several things concurrently, like printing a document and scrolling its view on screen in a word processor.
3. However, there is an overhead for context switching and synchronization.
Execution Order
Process execution is asynchronous: there is no global tick and no global clock. Each process has its own execution speed, which may change over time. To an observer, instructions appear along the time axis in the order in which they are executed; this is the execution order, and any such order is legal. The execution order of a single process is called program order.
[Figure: the instructions of P1 (x) and P2 (o) laid out along a time axis.]
Mutual Exclusion
N processes each perform an instruction sequence that is composed of a critical section and a non-critical section. Mutual exclusion property: instructions from the critical sections of two or more processes must not be interleaved in the (global observer's) execution order.
[Figure: P1 (x) and P2 (o) along a time axis, with each critical section shown in parentheses.]
Threads in Java The Java Virtual Machine allows an application to have multiple threads of execution running concurrently. From the programmer's point of view, a thread is a java object. It is created (like any other java object) from a Java class with new and an appropriate constructor. It has members and methods. It can be passed as a parameter, put in an array, etc. The Java Virtual Machine maps this Java runnable object to a system dependent thread implementation. The operating system allocates resources (including CPU time) to this thread implementation. Each thread object has a run method. The run method gives a thread something to do (its code implements the thread's running behavior). Usually, the run method contains a loop that is executed until the thread's task is finished.
Creating a Thread
There are two techniques for creating a new thread of execution and providing a run method for it:
- Subclassing the class Thread and overriding its run method.
- Implementing the Runnable interface.
A simple rule to help you decide which option to use: if your class must be derived from some other class (for example, Applet), then it should implement Runnable. Otherwise, it should extend Thread. Thread and Runnable are part of the java.lang package.
Subclassing java.lang.Thread Declare a class to be a subclass of Thread. Override the run method of class Thread in this subclass. Allocate an instance of the subclass. Start running the thread object.
Subclassing Thread and Overriding run
public class SimpleThread extends Thread {
    public SimpleThread(String str) {
        super(str);   // the string becomes the thread's name
    }
    public void run() {
        for (int i = 0; i < 10; i++) {
            System.out.println(i + " " + getName());
            try {
                sleep((long)(Math.random() * 1000));
            } catch (InterruptedException e) {}
        }
        System.out.println("DONE! " + getName());
    }
}

public class TwoThreadsDemo {
    public static void main (String[] args) {
        new SimpleThread("Jamaica").start();
        new SimpleThread("Fiji").start();
    }
}
Possible output: 0 Jamaica 0 Fiji 1 Fiji 1 Jamaica 2 Jamaica 2 Fiji 3 Fiji 3 Jamaica 4 Jamaica 4 Fiji 5 Jamaica 5 Fiji 6 Fiji 6 Jamaica 7 Jamaica 7 Fiji 8 Fiji 9 Fiji 8 Jamaica DONE! Fiji 9 Jamaica DONE! Jamaica
An example of a thread that computes primes larger than a stated value could be written as follows:
class PrimeThread extends Thread {
    long minPrime;
    long biggestPrimeSoFar;
    PrimeThread(long minPrime) {
        this.minPrime = minPrime;
    }
    public void run() {
        for(;;) {
            biggestPrimeSoFar = findNextPrim( );   // helper not shown on the slide
        }
    }
}
The following code would then create a thread and start it running:
PrimeThread p = new PrimeThread(143);
p.start();
.... // do some other stuff
System.out.println("Biggest prime so far is " + p.biggestPrimeSoFar );
Implementing the java.lang.Runnable interface Declare a class that implements the Runnable interface. Implement the run method in this class. Allocate an instance of the class, and pass it as an argument to the constructor of a new Thread object. Start running the thread object.
The same example in this style looks like the following:
class PrimeRun implements Runnable {
    long minPrime;
    PrimeRun(long minPrime) {
        this.minPrime = minPrime;
    }
    public void run() {
        // compute primes larger than minPrime...
    }
}
The following code would then create a thread and start it running:
PrimeRun p = new PrimeRun(143);
new Thread(p).start();
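The Runnable style can be sketched end to end as follows. This is a minimal illustration, not part of the original slides: the class SumTask, its method names, and the use of join() to wait for the worker are assumptions made for the example.

```java
// Hypothetical example class; SumTask is not from the original slides.
class SumTask implements Runnable {
    private final int limit;
    private long result;              // written by the worker, read after join()

    SumTask(int limit) {
        this.limit = limit;
    }

    @Override
    public void run() {
        long sum = 0;
        for (int i = 1; i <= limit; i++) {
            sum += i;
        }
        result = sum;
    }

    long getResult() {
        return result;
    }

    // Runs the task on a new thread and waits for it to finish.
    static long sumOnNewThread(int limit) {
        SumTask task = new SumTask(limit);
        Thread worker = new Thread(task, "sum-worker");
        worker.start();
        try {
            worker.join();            // after join() returns, the worker's writes are visible
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return task.getResult();
    }
}
```

Calling SumTask.sumOnNewThread(100) computes the sum 1 + 2 + ... + 100 on a separate thread. Keeping the task in its own Runnable object leaves the class free to extend something else, which is the rule of thumb given earlier.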
Parallel Execution of Threads The actual parallelism in a multithreaded program depends on the way the processor's time was allocated to the threads, and whether there is more than one processor or not.
To demonstrate this principle, consider the following program and its outputs:
class PrintThread implements Runnable {
    String str;
    public PrintThread (String str) {
        this.str = str;
    }
    public void run() {
        for (;;)
            System.out.print (str);
    }
}
class ConcurrencyTest {
    public static void main (String[] args) {
        new Thread(new PrintThread("A")).start();
        new Thread(new PrintThread("B")).start();
    }
}
The output of the program above should look something like this (on Windows NT and on multi-processor machines it will indeed be so):
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABBBBBBBBBBBBBBBBBBBBBBBBBBBBB
BBBBBBBBBBBBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABBBBBBBBBBB
BBBBBBBBBBBBBBBBBBBBBBBBBBBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAABBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBAAAAAAAAAAAAAAAAAAAAAA
AAAAABBBBBBBBBBBBBBBBBB...
The output has a fairly equal number of A's and B's.
Preemptive Versus Non-Preemptive Multithreading
Preemptive multithreading means that a running thread may be preempted by another thread of equal priority. The Java runtime itself will not preempt the currently running thread in favor of another thread of the same priority. However, the underlying operating system's implementation of threads may support preemption. The output of the previous example program on a SPARC/Solaris 2.5 machine is something like this:
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA...
This is because on Solaris (and on some other operating systems) thread scheduling is not preemptive for threads of equal priority.
Since not all systems that support multi-threading have a preemption mechanism, you should never rely on preemptive multi-thread scheduling. A thread is supposed to be well behaved and give up the CPU periodically in order for other threads to be able to run. If your thread does not give up the CPU by suspending itself, waiting for a condition, sleeping or doing I/O operations then it should relinquish the CPU periodically by invoking the thread class’s yield() method.
Here is a revised version of the PrintThread class that yields the CPU after each letter printed:
class PrintThread implements Runnable {
    String str;
    public PrintThread (String str) {
        this.str = str;
    }
    public void run() {
        for (;;) {
            System.out.print (str);
            Thread.yield();
        }
    }
}
The statement Thread.yield() calls a public static method of the Thread class that always applies to the currently running thread. Writing Thread.currentThread().yield() compiles to the same call, but it wrongly suggests that yield acts on the object it is invoked on. Note that yield is only a hint, which the scheduler is free to ignore. On a system where the hint is honored, the output of this example looks like this:
ABABABABABABABABABABABABABABABA
BABABABABABABABABABABABABABABAB
ABABABABABABABABABABABABABABABA
BABABABABABABABABABABABABABABAB
ABABABABABABABABABABABABABAB...
As a rule of thumb, threads should yield whenever possible, to allow others to run.
The Life Cycle of a Thread The following diagram shows the states that a Java thread can be in during its life. It also illustrates which method calls cause a transition to another state.
Creating a Thread A new Thread object is created by calling the Thread constructor.
Starting a Thread The start method creates the system resources necessary to run the thread, schedules the thread to run, and calls the thread's run method. After the start method has returned, the thread is in the Runnable state. The Java runtime system implements a scheduling scheme that shares the processor (or processors) between all the "running" threads. At any given time, a "running" thread actually may be waiting for its turn in the CPU.
Making a Thread Not Runnable
A thread becomes Not Runnable when one of these events occurs:
- Its sleep method is invoked.
- The thread calls the wait method to wait for a specific condition to be satisfied.
- The thread is blocking on I/O.
And making it run again (Runnable):
- If a thread has been put to sleep, then the specified number of milliseconds must elapse.
- If a thread is waiting for a condition, then another object must notify the waiting thread of a change in condition by calling notify or notifyAll.
- If a thread is blocked on I/O, then the I/O must complete.
The isAlive Method The API for the Thread class includes a method called isAlive: The isAlive method returns true if the thread has been started and has not died yet. If the isAlive method returns false, you know that the thread either hasn't started yet or is dead. You cannot differentiate between a new thread which hasn't been started yet and a dead thread. Nor can you differentiate between a Runnable thread and a Not Runnable thread.
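The behavior of isAlive can be observed with a short sketch. The class LifeCycleDemo and the method observe are hypothetical names chosen for this example. As an addition beyond the slide, the sketch also calls Thread.getState() (available since Java 5), which does distinguish a new thread from a finished one.

```java
// Hypothetical demo class; not part of the original slides.
class LifeCycleDemo {
    // Returns the isAlive() value and the state seen before start() and after join().
    static String observe() {
        Thread t = new Thread(() -> { /* nothing to do */ });

        boolean aliveBefore = t.isAlive();        // not started yet
        Thread.State stateBefore = t.getState();

        t.start();
        try {
            t.join();                             // wait until run() has returned
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }

        boolean aliveAfter = t.isAlive();         // run() has finished
        Thread.State stateAfter = t.getState();

        return aliveBefore + " " + stateBefore + " " + aliveAfter + " " + stateAfter;
    }
}
```

LifeCycleDemo.observe() returns "false NEW false TERMINATED": isAlive gives the same answer in both cases, while getState tells them apart.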
Thread properties Name Priority
Name Every thread has a name for identification purposes. More than one thread may have the same name. If a name is not specified when a thread is created (by passing it as a parameter to the thread's constructor), a new default name is generated for it. A thread's name can be read with the method getName.
Understanding Thread Priority The higher the integer, the higher the priority. At any given time, when multiple threads are ready to be executed, the runtime system chooses the runnable thread with the highest priority for execution. Only when that thread stops, yields, or becomes not runnable for some reason will a lower priority thread start executing. If two threads of the same priority are waiting for the CPU, the scheduler chooses one of them to run in a round-robin fashion. The chosen thread will run until one of the following conditions is true: A higher priority thread becomes runnable. It yields, or its run method exits. On systems that support time-slicing, its time allotment has expired.
Understanding Thread Priority (cont.)
Rule of thumb: at any given time, the highest-priority thread is running. However, this is not guaranteed: the thread scheduler may choose to run a lower-priority thread to avoid starvation, and the JVM may ignore priorities altogether. The Java runtime system's thread scheduling algorithm is also preemptive: it tries to favor higher-priority runnable threads over lower-priority runnable threads. Therefore, priority should be used only to affect scheduling policy for efficiency purposes; algorithm correctness should not depend on it. A thread's priority can be changed with setPriority.
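The name and priority properties can be read and set as in the following sketch. The class ThreadPropertiesDemo and the thread name "worker-1" are hypothetical. The constants Thread.MIN_PRIORITY, Thread.NORM_PRIORITY and Thread.MAX_PRIORITY (1, 5 and 10) come from the standard Thread class.

```java
// Hypothetical demo class; not part of the original slides.
class ThreadPropertiesDemo {
    static String describe() {
        // The second constructor argument sets the thread's name.
        Thread t = new Thread(() -> { /* nothing to do */ }, "worker-1");

        // A scheduling hint only: correctness must not depend on it.
        t.setPriority(Thread.MIN_PRIORITY);

        return t.getName() + " " + t.getPriority();
    }
}
```

ThreadPropertiesDemo.describe() returns "worker-1 1". The properties can be set before the thread is started, and a thread created without an explicit priority inherits the priority of the thread that created it.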
Critical Section
A critical section is a code segment within a program that is accessed from separate, concurrent threads. In the Java language, a critical section can be a block or a method, and it is identified with the synchronized keyword. The Java platform then associates a lock with every object that has synchronized code.
public class CubbyHole {
    private int contents;
    private boolean available = false;
    public synchronized int get() {... }
    public synchronized void put(int value) {... }
}
Synchronization If more than one thread operates on an object at the same time, its data may become corrupt. For example, consider deleting an element from a doubly linked list (all pointers must be updated atomically). Since a context switch may occur at any point in time, preventing concurrent access to an object is necessary even if there is only a single processor. The code segments within a program that access the same object from separate, concurrent threads are called critical sections. A mutual exclusion mechanism is needed, so that no more than one thread will be in a critical section. The basic synchronization mechanism in Java is the monitor.
Semaphores A semaphore is a special variable. After initialization, only two atomic operations are applicable: wait() and signal(). There are several kinds of semaphores: Busy-Wait Semaphore. Blocked-Set Semaphore. Binary Semaphore. …
Semaphores Semaphores can be used for mutual exclusion and thread synchronization. Instead of busy waiting and wasting CPU cycles, a thread can block on a semaphore (the operating system removes the thread from the CPU scheduling or "ready" queue) if it must wait to enter its critical section or if the resource it wants is not available.
Mutual exclusion pseudocode:
semaphore S = 1;
wait(S);
N = N + 1;
signal(S);
Java has implicit binary semaphores of the form
Object mutex = new Object();
/*...*/
synchronized (mutex) { /*...*/ }
that can be used for mutual exclusion. Only one thread at a time can be executing inside the synchronized block.
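The pseudocode above maps fairly directly onto java.util.concurrent.Semaphore, where acquire plays the role of wait(S) and release plays the role of signal(S). The following is a sketch under that assumption; the class SemaphoreCounter and its method names are hypothetical and do not appear in the slides.

```java
import java.util.concurrent.Semaphore;

// Hypothetical example class; not part of the original slides.
class SemaphoreCounter {
    private final Semaphore s = new Semaphore(1);   // semaphore S = 1
    private int n = 0;

    void increment() {
        s.acquireUninterruptibly();                 // wait(S)
        try {
            n = n + 1;                              // critical section
        } finally {
            s.release();                            // signal(S), even if the body throws
        }
    }

    // Two threads each increment the shared counter perThread times.
    static int countWithTwoThreads(int perThread) {
        SemaphoreCounter counter = new SemaphoreCounter();
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                counter.increment();
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return counter.n;
    }
}
```

With the semaphore in place, SemaphoreCounter.countWithTwoThreads(10000) returns 20000. Without it, the two read-modify-write sequences could interleave and lose updates.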
Policy for Programming with Semaphores
Use semaphores as little as possible: these are strong operations!
Define the role of each semaphore using a fixed relation between the semaphore's value and "something" in the program. Examples:
- Mutual exclusion: a process may enter the critical section iff S = 1.
- Bounded buffer (producer-consumer): S = number of free slots in the buffer.
Then do:
1. Justify each wait and signal by the role defined above for the semaphore.
2. Do the same for the semaphore's initialization.
3. Make sure each wait is eventually released.
Semaphores: a software engineering problem
1. An error in using a semaphore anywhere in the system manifests itself in other processes at other times. It is extremely hard to identify the sources of such bugs.
2. Semaphores are like gotos and pointers: they work, but they are mistake-prone and lack structure and "discipline".
For example, a disastrous typo that lets several processes into the critical section:
signal(S); criticalSection(); signal(S)
This one leads to deadlock:
wait(S); criticalSection(); wait(S)
Nested critical sections can lead to deadlock:
P1: wait(Q); wait(S); ... signal(S); signal(Q);
P2: wait(S); wait(Q); ... signal(Q); signal(S);
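The nested-critical-section deadlock above arises because P1 and P2 take Q and S in opposite orders. One common remedy, sketched here as an illustration rather than as something the slides prescribe, is to make every thread acquire the locks in the same order. The class OrderedLocks and its members are hypothetical names, and Java's synchronized blocks stand in for wait and signal.

```java
// Hypothetical example class; not part of the original slides.
class OrderedLocks {
    private final Object q = new Object();
    private final Object s = new Object();
    private int shared = 0;

    // Both tasks take q first and s second, so neither can hold s while waiting for q.
    void taskOne() {
        synchronized (q) {
            synchronized (s) {
                shared += 1;
            }
        }
    }

    void taskTwo() {
        synchronized (q) {          // same order as taskOne, unlike P2 in the slide
            synchronized (s) {
                shared += 2;
            }
        }
    }

    // Runs both tasks concurrently, rounds times each, and returns the final total.
    static int runBoth(int rounds) {
        OrderedLocks locks = new OrderedLocks();
        Thread p1 = new Thread(() -> {
            for (int i = 0; i < rounds; i++) locks.taskOne();
        });
        Thread p2 = new Thread(() -> {
            for (int i = 0; i < rounds; i++) locks.taskTwo();
        });
        p1.start();
        p2.start();
        try {
            p1.join();
            p2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return locks.shared;
    }
}
```

OrderedLocks.runBoth(1000) terminates and returns 3000. If taskTwo locked s before q, the same program could hang with each thread holding one lock and waiting for the other.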
Monitors
Idea: let's put all the code for handling shared variables in one place, in an object-oriented programming style. Let's make something which is:
1. An object.
2. Monolithic: a central core handling all requests.
Each monitor has its own mission and private data. Only a single process can be inside a monitor at any point in time.
Monitor
    (declarations of variables local to the monitor and global to the monitor procedures)
    Procedure name1 (…)
    Procedure name2 (…)
    …
Begin
    (initialization of the monitor's local variables)
End.
Monitors in Java In the Java language, a critical section can be a block or a method, and it is identified with the synchronized keyword. The Java platform associates a lock with every object. The acquisition and release of a lock are done automatically and atomically by the Java runtime system when a synchronized code block is entered and exited.
Race conditions and Data integrity: Whenever control enters a synchronized method, the thread that called the method locks the object whose method has been called. Other threads cannot execute a synchronized method on the same object until the object is unlocked. If they call a synchronized method while the object is locked, they are blocked. When the thread that holds the lock exits the synchronized method, it automatically releases the lock. One of the threads waiting for the lock on the object acquires it, and enters the synchronized method it called.
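The locking rules above can be illustrated with a small monitor-style class. This is a sketch with hypothetical names (Account, withdraw, drain); it is not taken from the slides. The point is that the check and the update inside withdraw happen under one lock, so no other thread can slip in between them.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical monitor-style class; not part of the original slides.
class Account {
    private int balance;

    Account(int initial) {
        balance = initial;
    }

    // The caller holds this object's lock for the whole check-then-update sequence.
    synchronized boolean withdraw(int amount) {
        if (balance < amount) {
            return false;
        }
        balance -= amount;
        return true;
    }

    synchronized int getBalance() {
        return balance;
    }

    // Several threads withdraw 1 unit at a time until the account is empty.
    // Returns the total number of successful withdrawals.
    static int drain(int initial, int threadCount) {
        Account account = new Account(initial);
        AtomicInteger successes = new AtomicInteger();
        Thread[] workers = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            workers[i] = new Thread(() -> {
                while (account.withdraw(1)) {
                    successes.incrementAndGet();
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return successes.get();
    }
}
```

Account.drain(1000, 4) returns exactly 1000: the balance never goes negative and no withdrawal is counted twice. Removing synchronized from withdraw would allow two threads to pass the balance check together.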
Making a method synchronized means the lock of the current object (this) must be acquired by a thread before it can enter the method. To increase parallelism, a block of code (instead of the entire method) may be synchronized. Synchronized blocks also allow the programmer to explicitly specify which object's lock should be acquired by a thread before the block's code can be executed. This can be any Java object. It may even be an object that is not used inside the synchronized block.
/** make all elements in the array nonnegative */
public static void abs(int[] values) {
    synchronized (values) {
        for (int i = 0; i < values.length; i++) {
            if (values[i] < 0)
                values[i] = -values[i];
        }
    }
}
public static int avg(int[] values) {
    int avg = 0;
    synchronized(values) {
        for (int i = 0; i < values.length; i++)
            avg = avg + values[i];
    }
    return avg/values.length;
}
Synchronizing Threads: The Producer/Consumer Problem
Problem definition
1. The producer is a thread which generates arbitrary items (encapsulated in Java objects). After each item is generated, the producer waits until the consumer consumes it, and then it proceeds to generate the next item.
2. The consumer waits until an object is produced, then it consumes it and waits for the next object.
First Try public class Storage { Object currItem; public void put(Object o) { currItem = o; } public Object get() { return currItem; } } public class ProducerConsumerTest { public static void main(String[] args) { Storage s = new Storage(); Producer p1 = new Producer(s, 1); Consumer c1 = new Consumer(s, 1); p1.start(); c1.start(); } }
public class Producer extends Thread {
    private Storage storage;
    private int ID;
    public Producer(Storage s, int ID) {
        storage = s;
        this.ID = ID;
    }
    public void run() {
        for (int i = 0; i < 10; i++) {
            String s = String.valueOf(i);   // new String(i) does not compile for an int
            System.out.println("Producer #" + this.ID + " put: " + s);
            storage.put( s );
            try {
                sleep((int)(Math.random() * 100));
            } catch (InterruptedException e) { }
        }
    }
}
public class Consumer extends Thread {
    private Storage storage;
    private int ID;
    public Consumer(Storage s, int ID) {
        storage = s;
        this.ID = ID;
    }
    public void run() {
        for (int i = 0; i < 10; i++) {
            Object value = storage.get();
            System.out.println("Consumer #" + this.ID + " got: " + value);
        }
    }
}
The Desired Output: Producer #1 put: 0 Consumer #1 got: 0 Producer #1 put: 1 Consumer #1 got: 1 Producer #1 put: 2 Consumer #1 got: 2 Producer #1 put: 3 Consumer #1 got: 3 Producer #1 put: 4 Consumer #1 got: 4 Producer #1 put: 5 Consumer #1 got: 5 Producer #1 put: 6 Consumer #1 got: 6 Producer #1 put: 7 Consumer #1 got: 7 Producer #1 put: 8 Consumer #1 got: 8 Producer #1 put: 9 Consumer #1 got: 9
What may go wrong?
Neither the Producer nor the Consumer makes any effort to ensure that the Consumer gets each value produced once and only once.
If the Producer is quicker than the Consumer, it may generate two numbers before the Consumer has a chance to consume the first one. The Consumer would then skip a number:
…
Consumer #1 got: 3
Producer #1 put: 4
Producer #1 put: 5
Consumer #1 got: 5
…
If the Consumer is quicker than the Producer, it may consume the same value twice. The Consumer would then print the same value twice:
…
Producer #1 put: 4
Consumer #1 got: 4
Consumer #1 got: 4
Producer #1 put: 5
…
Inconsistent Data Race conditions arise from multiple, asynchronously executing threads trying to access a single object at the same time and getting the wrong result. In our example there is no possibility of a race condition, because we access a single reference variable (Storage.currItem) and Java guarantees that reference accesses are atomic. However, if we had to change and read a double value, or multiple references at once, then we could get an inconsistent result from a mixture of the producer's updates.
Therefore, in the general case (complex data updates) The Consumer should not access the Storage when the Producer is changing it. The Producer should not modify it when the Consumer is getting the value. Conclusion: The put and get methods of Storage are the critical sections. They should be marked with the synchronized keyword. Remember: The system associates a unique lock with every instance of Storage (including the one shared by the Producer and the Consumer).
Here's a code skeleton for the Storage class: public class Storage { private Object currItem; public synchronized Object get(){... } public synchronized void put(Object value){... } }
When the Producer calls Storage's put method, it locks the Storage object, thereby preventing the Consumer from calling the Storage's get method. When the put method returns, the Producer unlocks the Storage.
public synchronized void put(Object value) {
    // Storage locked by the Producer
    ...
    // Storage unlocked by the Producer
}
When the Consumer calls Storage's get method, it locks the Storage, thereby preventing the Producer from calling put:
public synchronized Object get() {
    // Storage locked by the Consumer
    ...
    // Storage unlocked by the Consumer
}
Second Try
Suppose we try to coordinate the threads using this improved Storage class:
public class Storage {
    Object currItem;
    boolean avail;
    public synchronized Object get() {
        if (avail == true) {
            avail = false;
            return currItem;
        }
        return null; // return some default value
    }
    public synchronized void put(Object value) {
        if (avail == false) {
            avail = true;
            currItem = value;
        }
    }
}
What can go wrong here?
As implemented, these two methods won't work! Look at the get method. What happens if the Producer hasn't put anything in the Storage and avail isn't true? get returns the default value without waiting. Similarly, if the Producer calls put before the Consumer has got the value, put doesn't do anything and the new value is lost. We want the Consumer to wait until the Producer puts something in the Storage. The Producer must notify the Consumer when it has done so. Similarly, the Producer must wait until the Consumer takes a value (and notifies the Producer of its activities) before replacing it with a new value. The two threads must coordinate more fully, and can use Object's wait and notifyAll methods to do so.
The notifyAll, notify, wait methods wait() method The wait method makes the current thread wait until it is notified that it can continue running. wait must be called on a locked object. wait atomically puts the thread in a wait state and releases the object's lock. (what could have happened if those actions were not done atomically ??). The Object class contains two other versions of the wait method, that allow the thread to wake up if it is notified or if a timer expires.
notifyAll() method The notifyAll method wakes up all threads waiting on the object in question (in this case, the Storage). The awakened threads compete for the lock.
notify() method The Object class also defines the notify method, which arbitrarily wakes up exactly one of the threads waiting on this object. The programmer cannot choose which thread will be notified if more than one is waiting on the object. The programmer must also deal with spurious wakeups, i.e., wait can return even if the thread was not notified!
Note: wait, notify and notifyAll can only be called from within synchronized code (block or method), using the lock for the object on which they are invoked.
The usual way to wait for a condition
synchronized void doWhenCondition() throws InterruptedException {
    while (!condition)
        wait();
    //... do what needs doing when condition is true
}
The condition test should always be in a loop. Never assume that when the thread wakes up, the condition has been satisfied (i.e., don't change while to if). When wait suspends the thread, it also atomically releases the lock on the object. When the thread is restarted after being notified, the lock is reacquired. The wait methods can throw an InterruptedException, so the calling method must either declare it or catch it.
The usual way to change a condition
synchronized void changeCondition() {
    //... change some value used in a condition test
    notify();
}
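The two idioms above can be combined into one small class. This is a sketch with hypothetical names (Gate, awaitOpen, open, demo); it uses notifyAll rather than notify, on the assumption that more than one thread might be waiting.

```java
// Hypothetical example class; not part of the original slides.
class Gate {
    private boolean isOpen = false;

    // Waiting side: test the condition in a loop, as recommended above.
    synchronized void awaitOpen() throws InterruptedException {
        while (!isOpen) {
            wait();                 // releases the lock while waiting
        }
    }

    // Changing side: update the condition, then wake the waiters.
    synchronized void open() {
        isOpen = true;
        notifyAll();
    }

    // Starts a thread that waits at the gate, opens the gate, and reports
    // whether the waiting thread got through.
    static boolean demo() {
        Gate gate = new Gate();
        boolean[] passed = new boolean[1];
        Thread waiter = new Thread(() -> {
            try {
                gate.awaitOpen();
                passed[0] = true;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        waiter.start();
        gate.open();
        try {
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return passed[0];
    }
}
```

Gate.demo() returns true regardless of which thread runs first. If open() runs before the waiter reaches awaitOpen, the loop condition is already false and wait is never called, which is why the state flag is tested instead of relying on the notification alone.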
Third and Final Try
Here are the new implementations of get and put that wait on and notify each other of their activities:
public synchronized Object get() {
    while (avail == false) {
        try {
            // wait for Producer to put value
            wait();
        } catch (InterruptedException e) { }
    }
    avail = false;
    notifyAll(); // notify Producer that value has been retrieved
    return currItem;
}
public synchronized void put(Object value) {
    while (avail == true) {
        try {
            // wait for Consumer to get value
            wait();
        } catch (InterruptedException e) { }
    }
    currItem = value;
    avail = true;
    notifyAll(); // notify Consumer that value has been set
}
The code in the get method loops until the Producer has produced a new value. Each time through the loop, get calls the wait method. The wait method relinquishes the lock held by the Consumer on the Storage (thereby allowing the Producer to get the lock and update the Storage) and then waits for notification from the Producer. When the Producer puts something in the Storage, it notifies the Consumer by calling notifyAll. The Consumer then comes out of the wait state, avail is now true, the loop exits, and the get method returns the value in the Storage. The put method works in a similar fashion, waiting for the Consumer thread to consume the current value before allowing the Producer to produce a new one.
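The following sketch puts the final Storage together with a producer and a consumer so that the hand-off can be checked end to end. It is a variation on the slides, not the original code: get and put propagate InterruptedException instead of swallowing it, the threads are written as lambdas, and the class ProducerConsumerDemo and its run method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Storage as in the final version, except that interruption is propagated.
class Storage {
    private Object currItem;
    private boolean avail = false;

    public synchronized Object get() throws InterruptedException {
        while (!avail) {
            wait();                 // wait for the Producer to put a value
        }
        avail = false;
        notifyAll();                // tell the Producer the slot is free
        return currItem;
    }

    public synchronized void put(Object value) throws InterruptedException {
        while (avail) {
            wait();                 // wait for the Consumer to take the value
        }
        currItem = value;
        avail = true;
        notifyAll();                // tell the Consumer a value is ready
    }
}

// Hypothetical driver class; not part of the original slides.
class ProducerConsumerDemo {
    // Produces the strings "0" .. "count-1" and returns what the consumer received.
    static List<Object> run(int count) {
        Storage storage = new Storage();
        List<Object> received = new ArrayList<>();   // touched only by the consumer

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < count; i++) storage.put(String.valueOf(i));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < count; i++) received.add(storage.get());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        try {
            producer.join();
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return received;
    }
}
```

ProducerConsumerDemo.run(10) returns the values 0 through 9 in order, each exactly once, which matches the desired output shown earlier. Neither of the failure modes from the first try (a skipped value or a repeated value) can occur.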