Good Programming in Transactional Memory
Game Theory Meets Multicore Architecture
Raphael Eidenbenz, Roger Wattenhofer
Raphael Eidenbenz, ETH Zurich. ISAAC 2009
Moore's Law
- Clock speed flattening sharply
- Transistor count still rising
- Advent of multi-core processors!
Multicore Architecture
- Parallel threads
- Communication through shared memory
- Explicit locking
  - Developer: explicit locking of shared resources
- Transactional Memory
  - Developer: mark critical sections
  - System: guarantee exclusive execution
Contention Management
Which transaction shall I abort?
Contention Managers
Priority based:
- Timestamp: oldest transaction wins
- Karma: transaction with most locked resources wins; priority is carried over to the next attempt when aborted
- Polka: Karma with exponential backoff
Non-priority based:
- Polite: exponential backoff
- Randomized: pick a random winner
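The winner-selection rules listed above can be sketched as small functions. This is a minimal illustration, not DSTM2 code: the `Tx` record and its fields `start_time` and `locked` are hypothetical names, ties are broken arbitrarily in favor of the first argument, and the backoff of Polite and Polka as well as Karma's carried-over priority are left out.

```python
import random
from dataclasses import dataclass


@dataclass
class Tx:
    """Hypothetical record of a running transaction (illustration only)."""
    name: str
    start_time: float  # when the transaction first started
    locked: int        # number of resources currently locked


def timestamp_winner(a: Tx, b: Tx) -> Tx:
    # Timestamp: the oldest transaction (smallest start time) wins.
    # Ties go to the first argument, an arbitrary choice for this sketch.
    return a if a.start_time <= b.start_time else b


def karma_winner(a: Tx, b: Tx) -> Tx:
    # Karma: the transaction with the most locked resources wins.
    return a if a.locked >= b.locked else b


def randomized_winner(a: Tx, b: Tx, rng: random.Random) -> Tx:
    # Randomized: pick one of the two conflicting transactions at random.
    return rng.choice([a, b])


old_small = Tx("old_small", start_time=1.0, locked=1)
new_big = Tx("new_big", start_time=5.0, locked=20)

print(timestamp_winner(old_small, new_big).name)  # old_small
print(karma_winner(old_small, new_big).name)      # new_big
```

The same pair of transactions gets a different winner under Timestamp and Karma, which is why the choice of contention manager matters for how programmers are rewarded.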
Is it a Game?
Yes:
- Players = programmers
- Strategy space = placing of transactions
- Their goal: fast execution ("My thread is the fastest!")
- Social goal: maximize system throughput
Desired Behavior
Transactions as short as possible!

    // Coarse: the whole ring traversal runs inside a single transaction.
    incRingCounters(Node start){
      var cur = start;
      transaction{
        while(cur.next != start){
          cur.doSomething();
          cur = cur.next;
        }
      }
    }

    // Good programming (GP): one short transaction per node.
    incRingCountersGP(Node start){
      var cur = start;
      while(cur.next != start){
        transaction{ cur.doSomething(); }
        cur = cur.next;
      }
    }

[Figure: two timelines over t showing when resources R1, R2, R3, ..., Rs are locked, one per variant]
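A rough way to see why short transactions are preferable is to count how long each resource stays locked. The sketch below assumes a simplified model that the slides do not spell out: each access takes one time unit, a resource is locked when it is first accessed, and all locks are released at commit. The function names are hypothetical.

```python
def lock_time_coarse(s: int) -> int:
    """Total lock-holding time when one transaction covers all s accesses.

    Resource k (1-based) is locked at step k and held until the commit
    after step s, i.e. for s - k + 1 time units (assumed model).
    """
    return sum(s - k + 1 for k in range(1, s + 1))


def lock_time_fine(s: int) -> int:
    """Total lock-holding time when each access is its own transaction.

    Every resource is held for exactly one time unit (assumed model).
    """
    return s


for s in (1, 5, 20):
    print(s, lock_time_coarse(s), lock_time_fine(s))
```

Under these assumptions the coarse variant holds locks for s(s+1)/2 resource-time units in total, against s for the fine-grained variant, so the window in which other threads can run into a conflict grows much faster.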
Simulation Setup
"Free-riding" threads in DSTM2
- 16 threads on 16 cores do random updates on a shared ordered list or red-black tree for 10 s
- Collaborative threads: granularity = 1
- Free-riders: coarse transaction granularity (≥ 20 accesses per transaction)
- 1 or 8 free-riders
- High contention
Simulation Results
[Figure: plots for Karma, Polka, Timestamp, and Randomized showing throughput of collaborators (updates/s) and throughput of free-riders (updates/s)]
Good Programming Incentives
A contention manager (CM) is good programming incentive (GPI) compatible iff it punishes unnecessary locking and rewards partitioning.
Priority-Based CM
- CM associates with each thread J_i a priority ω_i
- Thread with highest priority wins conflicts
- Rationale: "Don't discard the transaction that has done the most"
- Underlying assumption: priority measures the amount of work done
- E.g. Timestamp CM: the oldest transaction has done the most work
Theorem: Polite, Greedy, Karma, Timestamp and Polka are not GPI compatible.
What is wrong?
Snatching up resources pushes priority.
[Figure: two timelines over t showing when resources R1, R2, R3, ..., Rs are locked]
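The effect can be illustrated with the Karma rule stated earlier (the transaction with the most locked resources wins). A minimal sketch, assuming priority is simply the number of currently locked resources; `karma_priority`, `conflict_winner`, and the thread labels are hypothetical.

```python
def karma_priority(locked_resources: set) -> int:
    # Assumed Karma-style priority: number of resources currently locked.
    return len(locked_resources)


def conflict_winner(priorities: dict) -> str:
    # The thread with the highest priority wins the conflict.
    return max(priorities, key=priorities.get)


# A free-rider that wraps many accesses in one transaction has already
# locked R1..R20; a collaborator with granularity 1 holds one resource.
free_rider_locks = {f"R{i}" for i in range(1, 21)}
collaborator_locks = {"R21"}

priorities = {
    "free_rider": karma_priority(free_rider_locks),
    "collaborator": karma_priority(collaborator_locks),
}
print(conflict_winner(priorities))  # free_rider
```

In this toy setting the thread that locks more than it needs wins the conflict, which is the opposite of the incentive the system designer wants.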
More Results
Theorem 2: Quasi priority-accumulating CMs are not GPI compatible.
Theorem 3: Any priority-accumulating CM M is not GPI compatible if one of the following holds:
  i. M increases a job's priority on -events.
  ii. M increases relative priority on -events.
  iii. M schedules transactions gapless and increases priorities on -events.
  iv. M restarts aborted transactions immediately and increases priorities on -events.
Theorem 4: Any priority-accumulating CM that is based only on time is GPI compatible.
Randomized CM
- Not priority based
- "Choose random winner"
Lemma 3: Randomized CM is GPI compatible.
Proof intuition
- Unnecessary locks: stupid, because they only risk a conflict (no priority gain)
- Partitioning: [Figure: transaction T_i split into T_i1 and T_i2]
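The intuition for unnecessary locks can be made concrete with a small calculation. This is only a sketch under assumptions not stated on the slide: each conflict involves two transactions, the winner is chosen uniformly at random, and conflicts are independent. `survival_probability` is a hypothetical helper.

```python
from fractions import Fraction


def survival_probability(conflicts: int) -> Fraction:
    """Probability of winning every one of `conflicts` independent
    two-party conflicts when each winner is a fair coin flip (assumed)."""
    return Fraction(1, 2) ** conflicts


for k in (0, 1, 2, 3):
    print(k, survival_probability(k))
```

Under these assumptions every additional conflict halves the chance that the transaction commits without being aborted, and since Randomized awards no priority for holding more resources, an extra lock can only hurt.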
Conclusion & Open Problems
Further work:
- Relax GPI compatibility
- Trace effect in "real" software
Thank you!