Complexity Implications of Memory Models
Out-of-Order Execution
– Avoid with fences (and atomic operations)
[Figure: processes, reordering buffer, shared memory]
Hagit Attiya, Dagstuhl // January 2015
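A minimal sketch of why a fence matters, assuming a toy model in which each process has a private FIFO store buffer in front of shared memory. The names `Process` and `run` are hypothetical and the schedule is fixed by hand; this is a simulation, not a real hardware fence.

```python
from collections import deque

class Process:
    """A process whose stores pass through a private FIFO buffer (toy model)."""
    def __init__(self, memory):
        self.memory = memory      # shared dict: location -> value
        self.buffer = deque()     # pending (location, value) stores

    def store(self, loc, value):
        self.buffer.append((loc, value))   # not yet visible to other processes

    def load(self, loc):
        # A process sees its own latest buffered store first.
        for l, v in reversed(self.buffer):
            if l == loc:
                return v
        return self.memory[loc]

    def fence(self):
        # Drain the buffer: all earlier stores reach shared memory.
        while self.buffer:
            loc, value = self.buffer.popleft()
            self.memory[loc] = value

def run(use_fence):
    """Each process stores to its own flag, then loads the other's flag."""
    memory = {"x": 0, "y": 0}
    p, q = Process(memory), Process(memory)
    p.store("x", 1)
    q.store("y", 1)
    if use_fence:
        p.fence()
        q.fence()
    return p.load("y"), q.load("x")

print(run(use_fence=False))  # (0, 0): neither process sees the other's store
print(run(use_fence=True))   # (1, 1)
```

Without the fence, both loads return 0, which no interleaving of the four operations allows when stores take effect immediately.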
Memory Models
Abstract conditions on the way the reordering buffer is managed. E.g.:
– TSO does not allow reordering of stores
– PSO does not allow reordering of stores to the same location
[Figure: sequential consistency, total store ordering (TSO), partial store ordering (PSO), relaxed memory ordering (RMO)]
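The two buffer-management rules can be compared with a brute-force sketch. It assumes a single process and looks only at the order in which its buffered stores reach memory. The function `commit_orders` is a hypothetical helper, not part of any memory-model tool.

```python
from itertools import permutations

def commit_orders(stores, model):
    """Return the orders in which buffered stores may reach memory.

    stores: list of (location, value) pairs in program order.
    model:  "TSO" keeps all stores in program order;
            "PSO" keeps only stores to the same location in order.
    """
    allowed = []
    for perm in permutations(range(len(stores))):
        ok = True
        for a in range(len(perm)):
            for b in range(a + 1, len(perm)):
                i, j = perm[a], perm[b]   # store i commits before store j
                if i > j:                 # i is later in program order: a reordering
                    same_loc = stores[i][0] == stores[j][0]
                    if model == "TSO" or same_loc:
                        ok = False
        if ok:
            allowed.append([stores[k] for k in perm])
    return allowed

stores = [("x", 1), ("y", 1), ("x", 2)]
print(len(commit_orders(stores, "TSO")))  # 1
print(len(commit_orders(stores, "PSO")))  # 3
```

Under PSO the store to `y` may commit before, between, or after the two stores to `x`, but the two stores to `x` keep their order.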
One fence is necessary
– Holds for concurrent data types with non-commutative operations (queues, counters, …)
  Attiya, Guerraoui, Hendler, Kuznetsov, Michael, Vechev: Laws of order: expensive synchronization in concurrent algorithms cannot be eliminated. POPL 2011
– A mutex algorithm must have a fence (unless it has an atomic operation)
Not All Memory Accesses are Equal
– Accesses served from cache: “free”
– Remote Memory References (RMRs): “expensive”
Bakery algorithm needs O(1) fences, but O(n) reads, and they are remote
[Figure: processes, operation buffer, cache, interconnect, shared memory]
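A rough sketch of RMR counting, assuming a simplified cache-coherent model: a read is remote only when the reader holds no valid cached copy, and every write is charged as remote and invalidates other copies. `CCMemory` and `demo` are hypothetical names. The scan loop imitates only the read pattern of a Bakery-style entry section, not the algorithm itself.

```python
class CCMemory:
    """Toy cache-coherent memory that counts remote memory references."""
    def __init__(self):
        self.values = {}
        self.cached = {}   # location -> set of process ids holding a valid copy
        self.rmrs = {}     # process id -> number of RMRs charged so far

    def write(self, pid, loc, value):
        self.values[loc] = value
        self.cached[loc] = {pid}                     # invalidate all other copies
        self.rmrs[pid] = self.rmrs.get(pid, 0) + 1   # simplification: every write is remote

    def read(self, pid, loc):
        holders = self.cached.setdefault(loc, set())
        if pid not in holders:                       # cache miss: one RMR
            holders.add(pid)
            self.rmrs[pid] = self.rmrs.get(pid, 0) + 1
        return self.values.get(loc, 0)

def demo(n):
    mem = CCMemory()
    for i in range(n):
        mem.write(i, ("number", i), 0)   # each process owns one variable
    before = mem.rmrs[0]
    for j in range(n):                   # process 0 scans everyone's variable
        mem.read(0, ("number", j))
    scan_cost = mem.rmrs[0] - before
    mem.write(1, "flag", 0)
    before = mem.rmrs[0]
    for _ in range(100):                 # process 0 spins on one location
        mem.read(0, "flag")
    spin_cost = mem.rmrs[0] - before
    return scan_cost, spin_cost

print(demo(8))  # (7, 1)
```

Scanning the variables of the other n - 1 processes costs n - 1 RMRs, while 100 reads of one unchanged location cost a single RMR.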
Tournament-tree: entry section [Figure; label: store]
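A small sketch of the path a process climbs in a binary tournament tree. It assumes heap numbering of nodes (root = 1) and leaves padded to a power of two. `path_to_root` is a hypothetical helper. The two-process protocol run at each node is not modeled; if each level performs a store followed by a fence, the number of fences equals the path length.

```python
def path_to_root(pid, n):
    """Nodes visited by process pid (0-based) in a binary tournament tree.

    Internal nodes use heap numbering: root = 1, children of v are 2v and 2v+1.
    Returns (node, side) pairs from the lowest internal node up to the root,
    where side (0 or 1) says from which child the process arrives.
    """
    leaves = 1
    while leaves < n:          # pad the number of leaves to a power of two
        leaves *= 2
    v = leaves + pid           # heap index of this process's leaf
    path = []
    while v > 1:
        path.append((v // 2, v % 2))
        v //= 2
    return path

print(path_to_root(5, 8))            # [(6, 1), (3, 0), (1, 1)]
print(len(path_to_root(0, 1024)))    # 10 levels for n = 1024
```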
Can We Optimize Fences and RMRs?
– Tournament [Yang, Anderson]: Θ(log n) fences, Θ(log n) RMRs
– Bakery [Lamport]: O(1) fences, Θ(n) RMRs
– Without store reordering: O(1) fences, Θ(log n) RMRs
– With store reordering: NO. E.g., with PSO, O(1) fences implies O(n) RMRs
Attiya, Hendler, Levy: An O(1)-barriers optimal RMRs mutual exclusion algorithm. PODC 2013
Write Fewer Fences?
Make the tree shallower by increasing the branching factor
How Shallow?
Make the tree shallower by increasing the branching factor
Tradeoff, for every f = 1, …, log n
Make the tree shallower by increasing the fanout…
This is optimal
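A back-of-the-envelope sketch of the shallow-tree idea: for f levels, pick the smallest fanout b with b^f ≥ n. The RMR estimate f · b is an assumption made here for illustration (one fence per level, and a Bakery-like node costing about b RMRs); it is not the bound stated on the slide. `branching_factor` and `tradeoff_table` are hypothetical names.

```python
def branching_factor(n, f):
    """Smallest integer b such that a tree with f levels and fanout b has >= n leaves."""
    b = 1
    while b ** f < n:
        b += 1
    return b

def tradeoff_table(n):
    """Rows (f, b, f * b): levels/fences, fanout, assumed RMR estimate."""
    rows = []
    f = 1
    while True:
        b = branching_factor(n, f)
        rows.append((f, b, f * b))   # f * b is an illustrative assumption
        if b <= 2:                   # binary tree reached: f is about log n
            break
        f += 1
    return rows

for f, b, est in tradeoff_table(1024):
    print(f"f={f:2d}  fanout={b:4d}  assumed RMRs ~ {est}")
```

For n = 1024 the table runs from f = 1 (fanout 1024, a flat Bakery-like structure) to f = 10 (fanout 2, the binary tournament tree).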
Lower Bound on the Tradeoff with Store Reordering
Wrap-Up
– Separation below TSO
– More accurate models for shared-memory multiprocessors
  – E.g., # fences
– Relate to semantic properties of implemented objects & operations
[Figure: sequential consistency, total store ordering (TSO), partial store ordering (PSO), relaxed memory ordering (RMO)]
Backup Slides: Outline of the Tradeoff Lower-Bound Proof
Proof Strategy I (RADICON, July)
Proof Strategy II
Proof Strategy III
The Encoding: Main Idea
For each fence, handle writes with O(1) commands:
– Wait for earlier processes to cover my writes
– Wait for earlier processes to finish before my writes
– Proceed
Easy with a list of processes to wait for
– But too lengthy
Write their number, and figure out their ids when decoding
RADICON, July 2014
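A size comparison only, to illustrate why an explicit list is "too lengthy". It assumes each id is written with ⌈log₂ n⌉ bits and a count with ⌈log₂(k + 1)⌉ bits. How the decoder recovers the ids is part of the proof and is not modeled here. Both function names are hypothetical.

```python
from math import ceil, log2

def bits_for_id_list(k, n):
    """Bits to list k process ids explicitly, each chosen among n processes."""
    return k * ceil(log2(n))

def bits_for_count(k):
    """Bits to write only the number k of processes to wait for."""
    return max(1, ceil(log2(k + 1)))

n, k = 1024, 100
print(bits_for_id_list(k, n))  # 1000
print(bits_for_count(k))       # 7
```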