1 Introduction to Concurrent Programming: Software & Hardware Basics. Slides by Ofer Givoli. Based on: The Art of Multiprocessor Programming, by Maurice Herlihy and Nir Shavit, 2008 (Appendix A – Software Basics; Appendix B – Hardware Basics).
Software Basics 2
3 Threads in Java. A thread executes a single, sequential program. Subclass of: java.lang.Thread
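As a minimal sketch of the idea on this slide, the following subclasses java.lang.Thread, starts a few threads, and waits for them with join(). The names HelloThreads, Worker, and runWorkers are hypothetical and chosen only for illustration.

```java
import java.util.Arrays;

// Hypothetical example: class and method names are illustrative, not from the slides.
class HelloThreads {
    // A thread defined by subclassing java.lang.Thread and overriding run().
    static class Worker extends Thread {
        private final int id;
        private final int[] results;

        Worker(int id, int[] results) {
            this.id = id;
            this.results = results;
        }

        @Override
        public void run() {
            results[id] = id * id; // each thread writes only its own slot
        }
    }

    // Starts n workers, waits for all of them, and returns their results.
    static int[] runWorkers(int n) {
        int[] results = new int[n];
        Thread[] threads = new Thread[n];
        for (int i = 0; i < n; i++) {
            threads[i] = new Worker(i, results);
            threads[i].start(); // start() executes run() in a new thread
        }
        for (Thread t : threads) {
            try {
                t.join(); // wait for the thread to finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(runWorkers(4)));
    }
}
```

Calling start() rather than run() is what creates the new thread; calling run() directly would execute the body in the caller's thread.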
4 … Taken from: The art of multiprocessor programming, by Maurice Herlihy and Nir Shavit, 2008 (modified)
5 Monitors: lock + waiting set. Every object is a monitor. Critical section: using the synchronized keyword. Waiting: using the wait() method. Waking up waiting threads: using the methods notify() and notifyAll().
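A small sketch of a critical section guarded by an object's monitor: several threads increment a shared counter through synchronized methods. The class SynchronizedCounter and the helper countWithThreads are assumptions made for this example.

```java
// Hypothetical example: a counter whose methods use the object's own monitor.
class SynchronizedCounter {
    private int count = 0;

    // Only one thread at a time can run a synchronized method on this object.
    public synchronized void increment() {
        count++;
    }

    public synchronized int get() {
        return count;
    }

    // Runs nThreads threads, each incrementing perThread times, and returns the total.
    static int countWithThreads(int nThreads, int perThread) {
        SynchronizedCounter counter = new SynchronizedCounter();
        Thread[] threads = new Thread[nThreads];
        for (int i = 0; i < nThreads; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(countWithThreads(4, 10000));
    }
}
```

Without the synchronized keyword, count++ is a read-modify-write sequence that can interleave between threads, so increments could be lost.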
6 A stack wrapper with no synchronization:

import java.util.Stack;

public class ConcurrentStack<T> {
    private Stack<T> innerStack = new Stack<T>();

    public void push(T obj) { innerStack.push(obj); }
    public T pop() { return innerStack.pop(); }
}

ConcurrentStack<Integer> s = ...
Two threads call concurrently: s.push(1); and ... = s.pop();
Solution: mutual exclusion
7 Mutual exclusion using a dedicated monitor object:

public class ConcurrentStack<T> {
    private Stack<T> innerStack = new Stack<T>();
    private Object monitor = new Object();

    public void push(T obj) {
        synchronized (monitor) { innerStack.push(obj); }
    }
    public T pop() {
        synchronized (monitor) { return innerStack.pop(); }
    }
}

... = s.pop(); BLOCKED (while another thread holds the lock on monitor)
9 The same, using the object itself as the monitor:

public class ConcurrentStack<T> {
    private Stack<T> innerStack = new Stack<T>();

    public void push(T obj) {
        synchronized (this) { innerStack.push(obj); }
    }
    public T pop() {
        synchronized (this) { return innerStack.pop(); }
    }
}
10 The same, using synchronized methods:

public class ConcurrentStack<T> {
    private Stack<T> innerStack = new Stack<T>();

    public synchronized void push(T obj) { innerStack.push(obj); }
    public synchronized T pop() { return innerStack.pop(); }
}

New feature: pop() waits while the stack is empty
11 public class ConcurrentStack<T> {
    private Stack<T> innerStack = new Stack<T>();

    public synchronized void push(T obj) { innerStack.push(obj); }
    public synchronized T pop() {
        while (innerStack.empty()) {}  // busy-wait inside the synchronized method
        return innerStack.pop();
    }
}

Problem?
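12 public class ConcurrentStack<T> {
    private Stack<T> innerStack = new Stack<T>();

    public synchronized void push(T obj) { innerStack.push(obj); }
    public synchronized T pop() {
        while (innerStack.empty()) {}  // spins while still holding the lock
        return innerStack.pop();
    }
}

... = s.pop(); spins on an empty stack without releasing the lock
s.push(1); BLOCKED, because it can never acquire the lock
Result: deadlock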
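14 Waiting with wait() and waking up with notifyAll():

public class ConcurrentStack<T> {
    private Stack<T> innerStack = new Stack<T>();

    public synchronized void push(T obj) {
        if (innerStack.empty()) notifyAll();
        innerStack.push(obj);
    }
    // wait() releases the lock while waiting and may throw InterruptedException
    public synchronized T pop() throws InterruptedException {
        while (innerStack.empty()) { wait(); }
        return innerStack.pop();
    }
}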
15 public class ConcurrentStack<T> {
    private Stack<T> innerStack = new Stack<T>();

    public synchronized void push(T obj) {
        if (innerStack.empty()) notifyAll();
        innerStack.push(obj);
    }
    public synchronized T pop() throws InterruptedException {
        while (innerStack.empty()) { wait(); }
        return innerStack.pop();
    }
}

... = s.pop(); s.push(1);
Thread-state labels on the slide: BLOCKED, WAITING, BLOCKED
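16 The same code with notify() instead of notifyAll():

public class ConcurrentStack<T> {
    private Stack<T> innerStack = new Stack<T>();

    public synchronized void push(T obj) {
        if (innerStack.empty()) notify();  // wakes at most one waiting thread
        innerStack.push(obj);
    }
    public synchronized T pop() throws InterruptedException {
        while (innerStack.empty()) { wait(); }
        return innerStack.pop();
    }
}

... = s.pop(); WAITING
... = s.pop(); WAITING
s.push(1); s.push(2);
Result: lost wakeup. The first push wakes one waiting thread; the second push finds the stack non-empty and notifies nobody, so the other thread can keep waiting although an item is available.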
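One possible way to avoid the lost wakeup, sketched under the assumption that waking every waiter on each push is acceptable: call notifyAll() unconditionally and let each woken thread re-check the condition in its while loop. The class name BlockingStack and the helper demo are hypothetical and not taken from the slides.

```java
import java.util.Stack;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical variant of the slide's stack: every push wakes all waiting threads.
class BlockingStack<T> {
    private final Stack<T> innerStack = new Stack<T>();

    public synchronized void push(T obj) {
        innerStack.push(obj);
        notifyAll(); // every waiter wakes up and re-checks the condition
    }

    public synchronized T pop() throws InterruptedException {
        while (innerStack.empty()) {
            wait(); // releases the lock while waiting
        }
        return innerStack.pop();
    }

    // Two consumers pop once each while the caller pushes 1 and 2.
    // Returns the sum of the popped values.
    static int demo() {
        BlockingStack<Integer> s = new BlockingStack<Integer>();
        AtomicInteger sum = new AtomicInteger(0);
        Runnable consumer = () -> {
            try {
                sum.addAndGet(s.pop());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        Thread c1 = new Thread(consumer);
        Thread c2 = new Thread(consumer);
        c1.start();
        c2.start();
        s.push(1);
        s.push(2);
        try {
            c1.join();
            c2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sum.get();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The while loop around wait() matters here: a thread woken by notifyAll() may find that another thread already took the item, in which case it simply waits again.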
17 Thread.yield(); Thread.sleep(t);
18 Thread-Local Objects

class ThreadLocalID extends ThreadLocal<Integer> {
    protected Integer initialValue() { return …; }
}

ThreadLocalID id = …;
id.set(…); … … = id.get();
id.set(…); … … = id.get();
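A minimal sketch of thread-local behavior, using the standard ThreadLocal.withInitial factory instead of the subclass shown on the slide. The class ThreadLocalDemo, the method demo, and the values 0, 7, and 100 are assumptions made for illustration.

```java
// Hypothetical example: each thread sees its own copy of a ThreadLocal value.
class ThreadLocalDemo {
    // Returns {value seen by a child thread, value seen by the calling thread}.
    static int[] demo() {
        ThreadLocal<Integer> value = ThreadLocal.withInitial(() -> 0);
        int[] seen = new int[2];

        value.set(100); // affects only the calling thread's copy

        Thread child = new Thread(() -> {
            seen[0] = value.get(); // the child's copy still has the initial value
            value.set(7);          // does not change the caller's copy
        });
        child.start();
        try {
            child.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        seen[1] = value.get(); // still the caller's own value
        return seen;
    }

    public static void main(String[] args) {
        int[] seen = demo();
        System.out.println(seen[0] + " " + seen[1]);
    }
}
```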
19 Synchronization in C# and Pthreads
Hardware Basics 20
22 CPU, L1 Cache, L2 Cache, L3 Cache, Memory (DRAM).
From L1 Cache to Memory (DRAM):
Speed: fastest to slowest
Size: smallest to biggest
Cost: highest to lowest
Power: highest to lowest
Taken from: Computer Structure 2014 slides, by Lihu Rappoport and Adi Yoaz (modified)
23 Processor 1 L1 cache Processor 2 L1 cache L2 cache (shared) Memory Taken from: Computer Structure 2014 slides, by Lihu Rappoport and Adi Yoaz
24 SMP (symmetric multiprocessing): not scalable. NUMA (non-uniform memory access). Taken from: The art of multiprocessor programming, by Maurice Herlihy and Nir Shavit, 2008 (modified)
25 Cache Coherence. Cache-line states: Modified, Exclusive, Shared, Invalid. False sharing. Taken from: The art of multiprocessor programming, by Maurice Herlihy and Nir Shavit, 2008 (modified)
26 Spinning SMP NUMA Taken from: The art of multiprocessor programming, by Maurice Herlihy and Nir Shavit, 2008 (modified)
27 Multi-Core and Multi-Threaded Architectures. Processors execute instructions out of order, in parallel, and speculatively. Write buffer. Reordering of reads and writes by the compiler. Memory barrier instruction (expensive). Reordering of reads and writes in Java. Volatile variables in Java. Taken from: The art of multiprocessor programming, by Maurice Herlihy and Nir Shavit, 2008 (modified)
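A sketch of how a volatile variable can be used in Java to publish data between threads. Under the Java memory model, a write to a volatile field happens-before a later read of that field that sees the written value, so the plain write to payload below is visible to the reader once it observes done == true. The names VolatileFlag, done, payload, and demo are hypothetical.

```java
// Hypothetical example: a volatile flag publishes an ordinary field to another thread.
class VolatileFlag {
    private static volatile boolean done = false;
    private static int payload = 0; // deliberately not volatile

    // Returns the payload value observed by the reader thread.
    static int demo() {
        done = false;
        payload = 0;
        int[] seen = new int[1];

        Thread reader = new Thread(() -> {
            while (!done) {
                Thread.yield(); // spin until the flag is set
            }
            seen[0] = payload; // ordered after the volatile read of done
        });
        reader.start();

        payload = 42; // ordinary write
        done = true;  // volatile write: payload is written before this becomes visible

        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

If done were not volatile, the reader would have no guarantee of ever seeing the update, and no guarantee about the value of payload if it did.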
28 Hardware Synchronization Instructions: compare-and-swap/set (CAS); load-linked & store-conditional (LL/SC)
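In Java, compare-and-set is exposed through the java.util.concurrent.atomic classes. The following is a minimal sketch of a CAS retry loop built on AtomicInteger.compareAndSet; the class CasCounter and the helper countWithThreads are assumptions made for this example, and AtomicInteger.incrementAndGet() already offers the same operation directly.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example: a lock-free counter built from a compare-and-set retry loop.
class CasCounter {
    private final AtomicInteger value = new AtomicInteger(0);

    // Reads the current value and tries to install current + 1.
    // compareAndSet fails if another thread changed the value in between; then retry.
    int increment() {
        while (true) {
            int current = value.get();
            int next = current + 1;
            if (value.compareAndSet(current, next)) {
                return next;
            }
        }
    }

    int get() {
        return value.get();
    }

    // Runs nThreads threads, each incrementing perThread times, and returns the total.
    static int countWithThreads(int nThreads, int perThread) {
        CasCounter counter = new CasCounter();
        Thread[] threads = new Thread[nThreads];
        for (int i = 0; i < nThreads; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(countWithThreads(4, 10000));
    }
}
```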
Thanks! 29