Problems with Locks
Andrew Whitaker
CSE451
Introduction
- Locks are hard to use correctly
  - Incorrect use can lead to safety, liveness, and performance problems
- Locks can’t always be used
  - e.g., in interrupt handlers
- Locks lead to poor software modularity…
Software Engineering Conundrum

    public interface ThreadSafeHashTable {
        public void insert(Object key, Object value);
        public void delete(Object key);
    }

- No good way to atomically move an entry between hash tables
  - Impossible if locking is done internally
  - Possible if locking is done on the hash table itself, but this violates modularity
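To make the second option concrete, here is a minimal sketch of caller-side locking. The moveEntry helper and the get method are hypothetical additions for illustration; they are not part of the interface above:

    public static void moveEntry(ThreadSafeHashTable from,
                                 ThreadSafeHashTable to,
                                 Object key) {
        // Caller-side locking: both tables stay locked for the whole
        // move, so no other thread can observe the entry in neither
        // table (or in both tables) at once.
        synchronized (from) {
            synchronized (to) {
                Object value = from.get(key);  // get() is assumed here
                from.delete(key);
                to.insert(key, value);
            }
        }
    }

Note that taking two locks at once introduces a deadlock risk unless all callers acquire them in a consistent order, which is yet another way locking details leak out to the clients of the interface.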
Potential Ways to Avoid Locking
- Cheat: omit locks when it is “obviously safe” to do so
- Non-blocking algorithms (see the sketch below)
- Transactional Memory (research!)
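As a taste of the non-blocking style, here is a minimal lock-free counter built on compare-and-set from java.util.concurrent.atomic. This is an illustration of the general technique, not an example from the slides:

    import java.util.concurrent.atomic.AtomicInteger;

    public class NonBlockingCounter {
        private final AtomicInteger value = new AtomicInteger(0);

        // Optimistic retry loop: read the current value, then try to
        // install current + 1 with an atomic compare-and-set. If another
        // thread raced ahead, retry; no thread ever holds a lock.
        public int increment() {
            while (true) {
                int current = value.get();
                if (value.compareAndSet(current, current + 1)) {
                    return current + 1;
                }
            }
        }
    }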
A (Seemingly) Simple Example

    public class VisibilityExample extends Thread {
        private static int x = 1;
        private static int y = 1;
        private static boolean ready = false;

        public static void main(String[] args) {
            Thread t = new VisibilityExample();
            t.start();
            // initialize some stuff…
            x = 2;
            y = 2;
            ready = true;
        }

        public void run() {
            while (!ready)
                Thread.yield(); // give up the processor
            System.out.println("x=" + x + " y=" + y);
        }
    }

What does this program print?
Answer
- It’s a race condition. Many different outputs are possible:
  - x=2, y=2
  - x=1, y=2
  - x=2, y=1
  - x=1, y=1
- Or, the program may print nothing: the ready loop runs forever
What’s Going on Here?
- Processor caches ($) can get out of sync
[Diagram: two CPUs, each with a private cache, connected to a shared memory]
A Mental Model
- Every thread/processor has its own copy of every variable
- Yikes!

    // Not real code; for illustration purposes only
    public class Example extends Thread {
        private static final int NUM_PROCESSORS = 4;
        private static int[] x = new int[NUM_PROCESSORS];
        private static int[] y = new int[NUM_PROCESSORS];
        private static boolean[] ready = new boolean[NUM_PROCESSORS];
        // …
    }
Simplified View of Cache Consistency Strategies
[Diagram: consistency models arranged along an axis of "amount of reordering": sequential consistency at the bottom, relaxed models ("fast and scalable") at the top. Java lives up near the relaxed end.]
Sequential Consistency
- All processors agree on a total order of memory accesses
- Reads and writes are propagated “immediately”
- Behaves like shuffling cards
- “Simple but slow”
Why Relaxed Consistency Models?
- Hardware perspective: consistency operations are expensive
  - The writing processor must invalidate all other processors
  - A reading processor must re-validate its cached state
- Compiler perspective: optimizations frequently re-arrange memory operations to hide latency
  - These optimizations are guaranteed to be transparent, but only on a single processor
Relaxed Consistency Models
- Better performance: updates are published lazily
- But, an incomprehensible programming model
Hardware Support: Memory Fences (Barriers)
- Limit the amount of reordering in the system
  - Memory operations cannot be moved across a fence
- Several variants:
  - Write fences
  - Read fences
  - Read/write (total) fences
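Fences are normally hidden behind locks and volatile, but modern Java (9+) does surface them through java.lang.invoke.VarHandle. The sketch below is only meant to show how the write/read variants map onto code; it reuses the x/ready pattern from the earlier example and is not a complete recipe for correct synchronization:

    import java.lang.invoke.VarHandle;

    public class FenceSketch {
        static int x = 1;
        static boolean ready = false;

        static void writer() {
            x = 2;
            VarHandle.storeStoreFence(); // write fence: the store to x
                                         // cannot be reordered below
                                         // the store to ready
            ready = true;
        }

        static void reader() {
            if (ready) {
                VarHandle.loadLoadFence(); // read fence: the load of x
                                           // below cannot be reordered
                                           // above the load of ready
                System.out.println("x = " + x);
            }
        }
    }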
Release Consistency
- Observation: concurrent programs usually use proper synchronization
  - “All shared, mutable state must be properly synchronized”
- Thus, it suffices to sync up memory during synchronization operations
- Big performance win: the number of cache coherence operations scales with the amount of synchronization, not with the number of loads and stores
Simple Example

    // Entering: fetch current values
    synchronized (this) {
        // Within the critical section, updates can be re-ordered
        x++;
        y++;
    }
    // Exiting: publish new values

- Without publication, updates may never be visible
Java Volatile Variables
- Java synchronized does double duty:
  - It provides mutual exclusion (atomicity)
  - It ensures safe publication of updates
- Volatile variables provide safe publication without mutual exclusion:

    volatile int x = 7;
More on Volatile
- Updates to volatile fields are propagated immediately
  - “Don’t cache me!”
- Effectively, this activates sequential consistency
- Volatile serves as a fence to the compiler and hardware
  - Memory operations are not re-ordered around a volatile
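Applying this to VisibilityExample from earlier, making ready volatile is the standard fix: the writes to x and y cannot be re-ordered past the volatile write to ready, and the reading thread is guaranteed to see the update:

    public class VisibilityExample extends Thread {
        private static int x = 1;
        private static int y = 1;
        private static volatile boolean ready = false;  // the fix

        public static void main(String[] args) {
            new VisibilityExample().start();
            x = 2;
            y = 2;
            ready = true;  // volatile write also publishes x and y
        }

        public void run() {
            while (!ready)
                Thread.yield();
            System.out.println("x=" + x + " y=" + y);  // prints x=2 y=2
        }
    }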
Rule #1, Revised
- All shared, mutable state must be properly synchronized
  - With a synchronized statement, an Atomic variable, or with volatile
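As a closing illustration (my summary, using a hypothetical class with simple shared fields), the three options look like this in code:

    import java.util.concurrent.atomic.AtomicInteger;

    public class ProperlySynchronized {
        // Option 1: a synchronized statement/method guards the field
        private int a = 0;
        public synchronized void incrementA() { a++; }
        public synchronized int getA() { return a; }

        // Option 2: an Atomic variable handles atomicity and publication
        private final AtomicInteger b = new AtomicInteger(0);
        public void incrementB() { b.incrementAndGet(); }
        public int getB() { return b.get(); }

        // Option 3: volatile gives safe publication only; fine for a
        // single-writer flag, but an increment on a volatile field
        // would NOT be atomic
        private volatile boolean done = false;
        public void markDone() { done = true; }
        public boolean isDone() { return done; }
    }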