Tim Harris
Researcher, Microsoft Corporation
[Figure: transistor counts for Intel processors from the 486 through the Pentium, Pentium II, Pentium 4, and Itanium 2, illustrating the shift from increases in clock frequency and ILP towards parallelism]
How do we find things to run in parallel? How do we control sharing of data between concurrent threads?
- Server workloads often provide a natural source of parallelism: deal with different clients concurrently
- Numerical workloads are, of course, also frequently parallelisable: problems can be partitioned, or parallel building blocks composed
- Programmers aren’t good at thinking about thread interleavings…
- …and locks provide a complicated solution that discourages modularity
- Races: due to forgotten locks
- Deadlock: locks acquired in the “wrong” order
- Lost wakeups: forgotten notify to a condition variable
- Error recovery tricky: need to restore invariants and release locks in exception handlers
- Simplicity versus scalability tension
- Lack of progress guarantees
…but worst of all
Suppose we have a good concurrent hash table implementation:

  void Swap(int kx, int ky) {
      vx = ht.Remove(kx);
      vy = ht.Remove(ky);
      ht.Insert(kx, vy);
      ht.Insert(ky, vx);
  }
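The point of this slide can be made concrete in Java: even when every individual operation on the table is thread-safe, the composition of several operations is not. The sketch below (names and the use of `ConcurrentHashMap` are illustrative, not from the slides) has a window between the removes and the inserts in which a concurrent reader sees neither key.

```java
import java.util.concurrent.ConcurrentHashMap;

// Sketch: composing individually-atomic calls on a thread-safe table
// does not make the composition atomic.
public class SwapRace {
    static final ConcurrentHashMap<Integer, Integer> ht = new ConcurrentHashMap<>();

    // Assumes both keys are present (as in the slide's example).
    // Between remove() and put(), another thread can observe a state
    // in which kx and ky are both absent from the table.
    static void swap(int kx, int ky) {
        Integer vx = ht.remove(kx);
        Integer vy = ht.remove(ky);
        ht.put(kx, vy);   // race window: neither key visible just before this
        ht.put(ky, vx);
    }

    public static void main(String[] args) {
        ht.put(1, 10);
        ht.put(2, 20);
        swap(1, 2);       // single-threaded, so the result here is correct
        System.out.println(ht.get(1) + " " + ht.get(2));
    }
}
```

Run single-threaded the swap behaves as expected; the bug only shows up under concurrent access, which is exactly what makes it hard to test for.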
  void Swap(int kx, int ky) {
      if (kx < ky) { t = kx; kx = ky; ky = t; }
      ht.Lock(kx);
      ht.Lock(ky);
      vx = ht.Remove(kx);
      vy = ht.Remove(ky);
      ht.Insert(kx, vy);
      ht.Insert(ky, vx);
      ht.Unlock(kx);
      ht.Unlock(ky);
  }
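The ordered-locking idiom above can be sketched in runnable Java. Everything here is an assumption for illustration: the striped-lock array, the stripe count, and the backing map stand in for the hypothetical `ht.Lock`/`ht.Unlock` API on the slide. The key idea is preserved: both locks are always acquired in one global order (by stripe index), so two concurrent swaps cannot deadlock on each other.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Sketch of per-key ("striped") locking with a fixed global lock order.
public class OrderedLocking {
    static final int STRIPES = 16;
    static final ReentrantLock[] locks = new ReentrantLock[STRIPES];
    static final Map<Integer, Integer> table = new HashMap<>();
    static { for (int i = 0; i < STRIPES; i++) locks[i] = new ReentrantLock(); }

    static void swap(int kx, int ky) {
        int sx = Math.floorMod(kx, STRIPES);
        int sy = Math.floorMod(ky, STRIPES);
        // Acquire in ascending stripe order: every thread orders the
        // same pair of locks the same way, ruling out deadlock.
        ReentrantLock first = locks[Math.min(sx, sy)];
        ReentrantLock second = locks[Math.max(sx, sy)];
        first.lock();
        if (second != first) second.lock();   // both keys may share a stripe
        try {
            Integer vx = table.remove(kx);
            Integer vy = table.remove(ky);
            table.put(kx, vy);
            table.put(ky, vx);
        } finally {
            if (second != first) second.unlock();
            first.unlock();
        }
    }

    public static void main(String[] args) {
        table.put(1, 10);
        table.put(2, 20);
        swap(1, 2);
        System.out.println(table.get(1) + " " + table.get(2));
    }
}
```

Note how much machinery this takes: lock exposure in the table's API, a global ordering convention every caller must follow, and try/finally release. This is the "complicated solution that discourages modularity" the earlier slide refers to.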
“Write the obvious sequential code and wrap ‘atomic’ around it”

  void Swap(int kx, int ky) {
      atomic {
          vx = ht.Remove(kx);
          vy = ht.Remove(ky);
          ht.Insert(kx, vy);
          ht.Insert(ky, vx);
      }
  }
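Java has no `atomic` block, but the programmer-visible semantics of one can be modelled as if the block ran under a single global lock. The sketch below is only a semantic stand-in (my construction, not a TM implementation): a real transactional memory runs such blocks optimistically and rolls back on conflict rather than serialising everything, but the observable behaviour the programmer reasons about is the same.

```java
import java.util.HashMap;
import java.util.Map;

// Semantic model of "atomic { ... }": as-if executed under one global lock.
public class AtomicSwap {
    static final Object GLOBAL = new Object();       // stand-in for the TM runtime
    static final Map<Integer, Integer> ht = new HashMap<>();

    static void swap(int kx, int ky) {
        synchronized (GLOBAL) {                      // plays the role of atomic { ... }
            Integer vx = ht.remove(kx);
            Integer vy = ht.remove(ky);
            ht.put(kx, vy);
            ht.put(ky, vx);
        }
    }

    public static void main(String[] args) {
        ht.put(1, 10);
        ht.put(2, 20);
        swap(1, 2);
        System.out.println(ht.get(1) + " " + ht.get(2));
    }
}
```

Compared with the ordered-locking version, the body really is "the obvious sequential code": no lock ordering, no lock exposure in the table's API, and the composition of the four calls is atomic by construction.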
- Integration with other language features (exceptions, finalizers, class initialization, ...)
- Integration with other parallel programming abstractions (locks/atomic, OpenMP, ...)
- Interaction with non-transacted resources
- Correct usage and memory models
- Performance analysis and tuning
- Debugging
© 2007 Microsoft Corporation. All rights reserved.
Microsoft Research Faculty Summit 2007