Published by Rudolf Hawkins. Modified over 9 years ago.
1
Spin Locks and Contention
Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit
2
Kinds of Architectures
SISD (Uniprocessor)
– Single instruction stream
– Single data stream
SIMD (Vector)
– Single instruction stream
– Multiple data streams
MIMD (Multiprocessors)
– Multiple instruction streams
– Multiple data streams
3
Kinds of Architectures
SISD (Uniprocessor)
– Single instruction stream
– Single data stream
SIMD (Vector)
– Single instruction stream
– Multiple data streams
MIMD (Multiprocessors): our space
– Multiple instruction streams
– Multiple data streams
4
MIMD Architectures
[Diagrams: a shared-bus architecture (processors and memory on one bus) and a distributed architecture]
Memory Contention
Communication Contention
Communication Latency
5
What Should You Do If You Can’t Get a Lock?
Keep trying
– “spin” or “busy-wait”
– Good if delays are short
Give up the processor
– Good if delays are long
– Always good on uniprocessor
6
What Should You Do If You Can’t Get a Lock?
Keep trying (our focus)
– “spin” or “busy-wait”
– Good if delays are short
Give up the processor
– Good if delays are long
– Always good on uniprocessor
7
Basic Spin-Lock
[Diagram: threads pass through the spin lock into the critical section (CS), which resets the lock upon exit]
8
Basic Spin-Lock
[Diagram: spin lock, critical section (CS), lock reset upon exit]
…lock introduces sequential bottleneck
9
Basic Spin-Lock
[Diagram: spin lock, critical section (CS), lock reset upon exit]
…lock suffers from contention
10
Basic Spin-Lock
[Diagram: spin lock, critical section (CS), lock reset upon exit]
…lock suffers from contention
Sequential bottleneck: no parallelism
11
Basic Spin-Lock
[Diagram: spin lock, critical section (CS), lock reset upon exit]
…lock suffers from contention
Contention: ???
12
Test-and-Set
Boolean value
Test-and-set (TAS)
– Swap true with current value
– Return value tells if prior value was true or false
Can reset just by writing false
TAS aka “getAndSet”
13
Test-and-Set
    public class AtomicBoolean {
      boolean value;
      public synchronized boolean getAndSet(boolean newValue) {
        boolean prior = value;
        value = newValue;
        return prior;
      }
    }
14
Test-and-Set
    public class AtomicBoolean {
      boolean value;
      public synchronized boolean getAndSet(boolean newValue) {
        boolean prior = value;
        value = newValue;
        return prior;
      }
    }
Package java.util.concurrent.atomic
15
Test-and-Set
    public class AtomicBoolean {
      boolean value;
      public synchronized boolean getAndSet(boolean newValue) {
        boolean prior = value;   // remember the old value
        value = newValue;        // install the new value
        return prior;
      }
    }
Swap old and new values
16
Test-and-Set
    AtomicBoolean lock = new AtomicBoolean(false);
    …
    boolean prior = lock.getAndSet(true);
17
Test-and-Set
    AtomicBoolean lock = new AtomicBoolean(false);
    …
    boolean prior = lock.getAndSet(true);
Swapping in true is called “test-and-set” or TAS
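To make the return-value convention concrete, here is a minimal sketch using the standard java.util.concurrent.atomic.AtomicBoolean. The class name GetAndSetDemo and the method observe are hypothetical names chosen only for this illustration.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class: shows what getAndSet(true) returns in each situation.
class GetAndSetDemo {
    static boolean[] observe() {
        AtomicBoolean lock = new AtomicBoolean(false);
        boolean first = lock.getAndSet(true);   // prior value false: the flag was clear
        boolean second = lock.getAndSet(true);  // prior value true: the flag was already set
        lock.set(false);                        // reset just by writing false
        boolean third = lock.getAndSet(true);   // prior value false again after the reset
        return new boolean[] { first, second, third };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(observe())); // prints [false, true, false]
    }
}
```

A false result means the caller was the one that flipped the flag; a true result means someone else had already set it.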
18
Test-and-Set Locks
Locking
– Lock is free: value is false
– Lock is taken: value is true
Acquire lock by calling TAS
– If result is false, you win
– If result is true, you lose
Release lock by writing false
19
Test-and-set Lock
    import java.util.concurrent.atomic.AtomicBoolean;

    class TASlock {
      AtomicBoolean state = new AtomicBoolean(false);
      void lock() {
        while (state.getAndSet(true)) {}
      }
      void unlock() {
        state.set(false);
      }
    }
20
Test-and-set Lock
    class TASlock {
      AtomicBoolean state = new AtomicBoolean(false);   // the lock state
      void lock() {
        while (state.getAndSet(true)) {}
      }
      void unlock() {
        state.set(false);
      }
    }
Lock state is AtomicBoolean
21
Test-and-set Lock
    class TASlock {
      AtomicBoolean state = new AtomicBoolean(false);
      void lock() {
        while (state.getAndSet(true)) {}   // spin until TAS returns false
      }
      void unlock() {
        state.set(false);
      }
    }
Keep trying until lock acquired
22
Test-and-set Lock
    class TASlock {
      AtomicBoolean state = new AtomicBoolean(false);
      void lock() {
        while (state.getAndSet(true)) {}
      }
      void unlock() {
        state.set(false);                  // write false to release
      }
    }
Release lock by resetting state to false
23
Performance
Experiment
– n threads
– Increment shared counter 1 million times
How long should it take?
How long does it take?
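The experiment can be sketched roughly as follows. This is an assumed reconstruction, not the authors' benchmark harness: the class name TasCounterExperiment, the method run, the thread counts tried in main, and the choice to split the increments evenly across threads are all illustrative assumptions.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical harness: n threads share one counter guarded by a TAS lock.
class TasCounterExperiment {
    // Same test-and-set lock as on the slides.
    static final class TASLock {
        private final AtomicBoolean state = new AtomicBoolean(false);
        void lock() { while (state.getAndSet(true)) { } }
        void unlock() { state.set(false); }
    }

    // Performs totalIncrements increments split evenly over nThreads threads
    // (assumes totalIncrements is divisible by nThreads) and returns the final count.
    static long run(int nThreads, int totalIncrements) {
        final TASLock lock = new TASLock();
        final long[] counter = { 0 };            // only touched while holding the lock
        final int perThread = totalIncrements / nThreads;
        Thread[] threads = new Thread[nThreads];
        for (int i = 0; i < nThreads; i++) {
            threads[i] = new Thread(() -> {
                for (int k = 0; k < perThread; k++) {
                    lock.lock();
                    try {
                        counter[0]++;            // the critical section
                    } finally {
                        lock.unlock();
                    }
                }
            });
        }
        for (Thread t : threads) t.start();
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter[0];
    }

    public static void main(String[] args) {
        for (int n : new int[] { 1, 2, 4, 8 }) {
            long start = System.nanoTime();
            long count = run(n, 1_000_000);
            long ms = (System.nanoTime() - start) / 1_000_000;
            System.out.println(n + " threads: count=" + count + ", " + ms + " ms");
        }
    }
}
```

Because every increment happens inside the critical section, the total work is sequential, so the best one could hope for is that the elapsed time stays flat as n grows. Measured times depend heavily on the machine and the JVM, so no timings are shown here.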
24
Graph
[Graph: time versus number of threads, showing the ideal curve]
No speedup because of sequential bottleneck
25
Mystery #1
[Graph: time versus number of threads; curves: TAS lock, ideal]
What is going on?
26
Bus-Based Architectures
[Diagram: caches and memory attached to a bus]
27
Bus-Based Architectures
[Diagram: caches and memory attached to a bus]
Random access memory (10s of cycles)
28
Bus-Based Architectures
[Diagram: caches and memory attached to a bus]
Shared Bus
– Broadcast medium
– One broadcaster at a time
– Processors and memory all “snoop”
29
Bus-Based Architectures
[Diagram: caches and memory attached to a bus]
Per-Processor Caches
– Small
– Fast: 1 or 2 cycles
– Address & state information
30
Jargon Watch
Cache hit
– “I found what I wanted in my cache”
– Good Thing™
31
Jargon Watch
Cache hit
– “I found what I wanted in my cache”
– Good Thing™
Cache miss
– “I had to shlep all the way to memory for that data”
– Bad Thing™
32
Processor Issues Load Request
[Diagram: caches and memory on the bus; memory holds the data]
33
Processor Issues Load Request
[Diagram: a processor puts a load request on the bus]
“Gimme data”
34
Memory Responds
[Diagram: memory sends the data over the bus to the requesting cache]
“Got your data right here”
35
Processor Issues Load Request
[Diagram: one cache already holds the data; another processor requests it]
“Gimme data”
36
Processor Issues Load Request
[Diagram: the load request travels over the bus]
“Gimme data”
37
Processor Issues Load Request
[Diagram: the cache that holds the data answers the request]
“I got data”
38
Other Processor Responds
[Diagram: the cache that holds the data sends it over the bus]
“I got data”
39
Other Processor Responds
[Diagram: caches and memory on the bus after the data has been delivered]
40
Modify Cached Data
[Diagram: memory and caches hold copies of the data]
41
Modify Cached Data
[Diagram: memory and caches hold copies of the data]
42
Modify Cached Data
[Diagram: one cache modifies its copy of the data]
43
Modify Cached Data
[Diagram: one cache has modified its copy of the data]
What’s up with the other copies?
44
Cache Coherence
We have lots of copies of data
– Original copy in memory
– Cached copies at processors
Some processor modifies its own copy
– What do we do with the others?
– How to avoid confusion?
45
Write-Back Caches
Accumulate changes in cache
Write back when needed
– Need the cache for something else
– Another processor wants it
On first modification
– Invalidate other entries
– Requires non-trivial protocol …
46
Write-Back Caches
Cache entry has three states
– Invalid: contains raw seething bits
– Valid: I can read but I can’t write
– Dirty: Data has been modified
   Intercept other load requests
   Write back to memory before using the cache entry for something else
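The three states can be read as a small state machine for a single cache entry. The sketch below is a deliberately simplified toy model, not a real coherence protocol: the class name CacheLineStates, the four event names, and the exact transitions are assumptions chosen to match the bullets above (a remote load of a dirty entry is intercepted and leaves a readable copy; a remote write invalidates).

```java
// Hypothetical toy model of one cache entry in a write-back cache.
class CacheLineStates {
    enum State { INVALID, VALID, DIRTY }

    // Events as seen by this cache: its own processor's accesses,
    // and other processors' requests snooped from the bus.
    enum Event { LOCAL_READ, LOCAL_WRITE, REMOTE_READ, REMOTE_WRITE }

    static State next(State s, Event e) {
        switch (e) {
            case LOCAL_READ:
                // A miss loads the data; a hit changes nothing.
                return s == State.INVALID ? State.VALID : s;
            case LOCAL_WRITE:
                // First modification invalidates other copies; entry becomes dirty.
                return State.DIRTY;
            case REMOTE_READ:
                // A dirty owner intercepts the load and supplies the data;
                // afterwards it may read but no longer write.
                return s == State.DIRTY ? State.VALID : s;
            case REMOTE_WRITE:
                // Another cache claimed the entry: this copy is now invalid.
                return State.INVALID;
            default:
                throw new IllegalArgumentException("unknown event " + e);
        }
    }

    public static void main(String[] args) {
        State s = State.INVALID;
        for (Event e : Event.values()) {
            State t = next(s, e);
            System.out.println(s + " --" + e + "--> " + t);
            s = t;
        }
    }
}
```

Real protocols track more detail than this (for example, who must write the dirty data back to memory and when), which is the “non-trivial protocol” the slides allude to.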
47
Invalidate
[Diagram: caches and memory on the bus; cached copies of the data]
48
Invalidate
[Diagram: one cache broadcasts an invalidation on the bus]
“Mine, all mine!”
49
Invalidate
[Diagram: the other cache sees the invalidation]
“Uh, oh”
50
Invalidate
Other caches lose read permission
51
Invalidate
Other caches lose read permission
This cache acquires write permission
52
Invalidate
Memory provides data only if not present in any cache, so no need to change it now (expensive)
53
Another Processor Asks for Data
[Diagram: a load request goes out on the bus]
54
Owner Responds
[Diagram: the owning cache sends the data over the bus]
“Here it is!”
55
End of the Day …
[Diagram: both caches hold the data]
Reading OK, no writing
56
Mutual Exclusion
What do we want to optimize?
– Bus bandwidth used by spinning threads
– Release/Acquire latency
– Acquire latency for idle lock
57
Simple TASLock
TAS invalidates cache lines
Spinners
– Miss in cache
– Go to bus
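A back-of-the-envelope model shows why this hurts. The sketch below is a toy simulation under stated assumptions, not a measurement: it assumes every TAS needs exclusive ownership of the lock's cache line, that the lock stays held throughout, and that the spinners take turns in round-robin order. The class name TasBusTrafficModel and the method busTransactions are hypothetical.

```java
// Hypothetical toy model: counts bus transactions caused by spinning TAS calls.
class TasBusTrafficModel {
    static int busTransactions(int spinners, int rounds) {
        int owner = -1;  // index of the cache that currently owns the line, -1 if none
        int bus = 0;
        for (int r = 0; r < rounds; r++) {
            for (int p = 0; p < spinners; p++) {
                if (owner != p) {
                    // Cache miss: go to the bus, invalidate the previous owner's copy.
                    bus++;
                    owner = p;
                }
                // Otherwise the TAS hits in the local cache and stays off the bus.
            }
        }
        return bus;
    }

    public static void main(String[] args) {
        System.out.println(busTransactions(1, 1000)); // prints 1
        System.out.println(busTransactions(4, 1000)); // prints 4000
    }
}
```

With a single spinner only the first TAS reaches the bus, but as soon as two or more threads spin, each TAS steals the line from the previous spinner, so in this model every attempt becomes a bus transaction.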
58
NUMA Architectures
Acronym:
– Non-Uniform Memory Architecture
Illusion:
– Flat shared memory
Truth:
– No caches (sometimes)
– Some memory regions faster than others
59
NUMA Machines
Spinning on local memory is fast
60
NUMA Machines
Spinning on remote memory is slow