OSDI ’10 Research Visions, 3 October
Epoch parallelism: One execution is not enough
Jessica Ouyang, Kaushik Veeraraghavan, Dongyoon Lee, Peter Chen, Jason Flinn, Satish Narayanasamy
University of Michigan

Motivation
- Write a single program that is both fast & correct
- Make it easier for programmers
  – Change approach to programming
  – Write program that is fast or correct, not both
- Combine multiple, specialized executions
  – Fast/buggy accelerates slow/correct
  – Slow/correct checks fast/buggy
[Figure labels: Fast & Correct, Slow & Correct, Fast & Buggy]

Epoch parallelism
1. Checkpoint state
2. Start epoch
3. Check state
4. Roll back & re-execute
[Figure: timeline of epochs E0 to E3 run by a fast & buggy execution and a slow & correct execution; end-of-epoch states are compared (==?), and E3 mismatches (!=)]

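The four steps above can be sketched in a few lines of Python. This is a minimal, sequential simulation under stated assumptions, not the authors' implementation: program state is a single integer, each execution is modelled as a hypothetical step function `(state, epoch) -> state`, and the names `run_epochs`, `fast_step`, and `slow_step` are invented for illustration. In a real system the slow epochs would run concurrently on separate cores.

```python
from typing import Callable, List, Tuple

State = int
Step = Callable[[State, int], State]  # (start state, epoch index) -> end state


def run_epochs(initial: State, n_epochs: int, fast: Step, slow: Step) -> Tuple[State, int]:
    """Simulate epoch parallelism; return (final state, number of rollbacks)."""
    state = initial      # last state verified by the slow & correct execution
    start = 0            # first epoch that has not been committed yet
    rollbacks = 0
    while start < n_epochs:
        # 1. Checkpoint state: the fast execution runs ahead and records a
        #    predicted state at every epoch boundary.
        checkpoints: List[State] = [state]
        for epoch in range(start, n_epochs):
            checkpoints.append(fast(checkpoints[-1], epoch))

        # 2. Start epochs: the slow execution runs each epoch from its
        #    predicted start state. The runs are independent of one another,
        #    so they could execute in parallel.
        results = [slow(checkpoints[i], start + i) for i in range(n_epochs - start)]

        # 3. Check state: each slow result must equal the next checkpoint.
        for i, result in enumerate(results):
            if result != checkpoints[i + 1]:
                # 4. Roll back & re-execute: keep the slow result of this
                #    epoch (its start state was verified) and discard all
                #    later speculative work.
                state = result
                start = start + i + 1
                rollbacks += 1
                break
        else:
            state = results[-1]
            start = n_epochs
    return state, rollbacks


def slow_step(state: State, epoch: int) -> State:
    """Hypothetical correct execution: add (epoch + 1) to the state."""
    return state + epoch + 1


def fast_step(state: State, epoch: int) -> State:
    """Hypothetical fast execution with a bug that shows up when state == 3."""
    return slow_step(state, epoch) + (100 if state == 3 else 0)


print(run_epochs(0, 4, fast_step, slow_step))  # (10, 1)
```

The bug corrupts only the checkpoint that ends epoch 2, so the check fails once, the verified result is kept, and the remaining epoch is re-executed from the corrected state.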
Uniprocessor execution
Nice properties of uniprocessor
- Fewer races
- Stronger memory consistency model
- Easier to replay
[Figure: multiprocessor vs. uniprocessor execution of epochs across CPU 0 to CPU 3; label: Performance]

Using epoch parallelism
[Figure: multi-threaded and single-threaded executions across CPU 0 to CPU 3, connected by a transform function]
Challenges
- Importing state to start epochs
- Checking state

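To make the role of a transform function concrete, here is a small hedged sketch. It assumes a hypothetical word-count program whose multi-threaded version keeps one dictionary of counts per thread while the single-threaded version keeps a single dictionary; the names `transform` and `states_match` are illustrative and do not come from the talk. The same mapping serves both challenges: it produces a state the single-threaded epoch can start from, and it puts both states in a comparable form for the check.

```python
from collections import Counter
from typing import Dict, List


def transform(per_thread: List[Dict[str, int]]) -> Dict[str, int]:
    """Map the multi-threaded state (one count table per thread) onto the
    single-threaded representation (one merged count table)."""
    merged: Counter = Counter()
    for counts in per_thread:
        merged.update(counts)  # Counter.update adds counts, it does not overwrite
    return dict(merged)


def states_match(per_thread: List[Dict[str, int]], single: Dict[str, int]) -> bool:
    """Check state: compare the two executions in the single-threaded form."""
    return transform(per_thread) == single


multi_threaded_state = [{"epoch": 1, "fast": 2}, {"epoch": 3}]
single_threaded_state = {"epoch": 4, "fast": 2}
print(states_match(multi_threaded_state, single_threaded_state))  # True
```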
Conclusion
- Rethink having a single program/execution be both fast & correct
- Use separate, specialized executions to achieve different goals

Related Work
- Master/Slave Speculative Parallelization
  – Zilles, Sohi, IEEE ’02
- Thread-Level Data Speculation
  – Steffan, Mowry, HPCA ’98
- Enhancing Software Reliability with Speculative Threads
  – Oplinger, Lam, ASPLOS ’02
- BASE
  – Castro, Rodriguez, Liskov, TOCS ’03
- GRACE
  – Berger, Yang, Liu, Novark, OOPSLA ’09

More uses of epoch parallelism
- Uniprocessor execution
  – Deterministic replay
  – Data race detection/avoidance
- Optimistic concurrency
  – Lock elision
  – Transactional memory
- Additional runtime checks
  – Assertions, bounds checking
  – Security checks

Programming effort
- Write one program
  – Compiler/runtime/hardware optimizes aggressively
  – Original program checks correctness
- Write 2 versions of same program
  – One with checks (assertions, security) and one without
- Write 2 versions + transform function
  – Arbitrary implementations

Programming effort
- Single-threaded & multi-threaded use case
  – Need additional transform function
  – Generate input to start epochs
- Is this really less work than 1 correct & fast multi-threaded program?

Redundancy & efficiency
- Baseline overhead is 2x throughput
- Acceptable for some applications
  – Core counts increasing
  – Using cores is hard
- Can make it more efficient
  – Remove redundant instructions
  – Use fast & buggy as software predictor for slow & correct (branch, load value)

Epoch parallelism
[Figure: timeline of epochs E0 to E3; a fast, buggy execution alongside a correct, slow execution]

Misspeculation in epoch parallelism
[Figure: timeline of epochs E0 to E3; a fast, buggy execution alongside a correct, slow execution, annotated as follows]
- Slow and correct E0 has completed
- Check thread-parallel checkpoint
- Checkpoint doesn’t match!
- Use result from epoch-parallel
- Restart execution of epoch 1
- Continue executing