Dynamic Runtime Testing for Cycle-Accurate Simulators
Saša Tomić, Adrián Cristal, Osman Unsal, Mateo Valero
Barcelona Supercomputing Center (BSC)
Universitat Politecnica de Catalunya (UPC)

Can we trust the simulator-based evaluations?
Typical simulator evaluation:
  Make a simulator
  REPEAT {
    Debug
    Simulate
  } UNTIL the results make sense (intuition!)
  Discard and ignore the failed simulations
Are there any bugs left?

Verifying the simulators
Verification is important!
– Industry invests significant resources
  Testing and verification: 50-70% of the costs
  Mission-critical applications: as much as 90% of the costs
– Academia invests fewer resources
Why do we have bugs?
– Simulators are complex
– Proposed extensions are often complex
– The extensions may uncover existing bugs

Simulator bugs
Timing bugs
– Incorrect estimation of the execution time
– Simulation terminates without obvious errors
– Need other types of testing and verification
Functional bugs (our target)
– Incorrect implementation of functional units
– Simulations may or may not terminate correctly

Outline
Examples of functional bugs
An overview of the Dynamic Simulator Testing methodology
Use cases of Dynamic Simulator Testing
Performance evaluation
Conclusions

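Example: a bug in the cache coherence protocol
[Figure: a simulator of multi-level coherent caches with two processors and a shared memory. Initially X = 0. Processor 1 executes X += 10 and processor 2 executes X += 20. In the simulator the two processors end up with X = 10 and X = 20; on the correct timeline X goes from 0 to 30.]
Bug: the final value should be X == 30
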
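Example: a bug in the HTM
Program: X = 0; Atomic { X += 10; }
[Figure: an HTM simulator with a processor and a shared memory. The processor executes X += 10 and sees X = 10, but nothing reaches the shared memory, which still holds X = 0.]
Bug: X = 10 is not committed
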
Detecting functional bugs
The functionality of the simulators is often simple
– It can be emulated with simple emulators
– The emulators can be fast and stable
Can we take advantage of the emulators?

Dynamic Testing Methodology
Add a simple, no-timing emulator
Execute each operation in the simulator and then in the emulator
Compare the executions
– We compared only the memory accesses
The executions must be identical during the entire simulation

An overview of dynamic testing
[Figure: the timing simulator and the simple no-timing emulator receive the same input, and their outputs are compared.]
Use the same input
Compare the outputs
Repeat for every operation!

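The loop described above can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: `ToySimulator`, `ReferenceMemory`, `run_checked`, the cycle costs, and the injected `drop_writes_to` bug are all hypothetical names and values made up for the example. A Python dict plays the role of the simple no-timing emulator.

```python
class ReferenceMemory:
    """No-timing emulator: a flat address -> value dictionary."""

    def __init__(self):
        self.data = {}

    def read(self, addr):
        return self.data.get(addr, 0)

    def write(self, addr, value):
        self.data[addr] = value


class ToySimulator:
    """Hypothetical stand-in for a timing simulator.

    It counts made-up cycle costs and can optionally drop all writes to
    one address, which models a functional bug.
    """

    def __init__(self, drop_writes_to=None):
        self.data = {}
        self.cycles = 0
        self.drop_writes_to = drop_writes_to

    def read(self, addr):
        self.cycles += 3  # arbitrary latency, for illustration only
        return self.data.get(addr, 0)

    def write(self, addr, value):
        self.cycles += 5  # arbitrary latency, for illustration only
        if addr == self.drop_writes_to:
            return  # injected functional bug: the write is lost
        self.data[addr] = value


def run_checked(simulator, ops):
    """Run every operation in the simulator, then in the emulator.

    Returns None if all reads agree, otherwise the first mismatch as
    (step, addr, simulated_value, expected_value).
    """
    reference = ReferenceMemory()
    for step, (kind, addr, value) in enumerate(ops):
        if kind == "write":
            simulator.write(addr, value)
            reference.write(addr, value)
        else:
            got = simulator.read(addr)
            expected = reference.read(addr)
            if got != expected:
                return (step, addr, got, expected)
    return None
```

Stopping at the first mismatch keeps the failure close to the operation that caused it, which is the property that makes this style of checking useful for debugging.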
Dynamic testing of a cache coherence protocol
[Figure: the timing simulator of multi-level coherent caches (processor 1 runs X += 10, processor 2 runs X += 20, both on a shared memory) is checked against an STL map. Each input and output value of X in the simulator is compared with the value held in the map.]
Check failed: should be X=10

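A small sketch of this check, under stated assumptions: `ToyCoherentCaches` is a hypothetical two-level model (private caches over one shared memory) written only for this example, and its `coherent=False` mode injects a bug in which stores stay in the private cache and are never propagated. A plain dict stands in for the STL map.

```python
class ToyCoherentCaches:
    """Hypothetical model: one private cache per processor plus shared memory."""

    def __init__(self, num_procs, coherent=True):
        self.memory = {}
        self.caches = [dict() for _ in range(num_procs)]
        self.coherent = coherent

    def load(self, proc, addr):
        cache = self.caches[proc]
        if addr not in cache:
            cache[addr] = self.memory.get(addr, 0)  # fill on miss
        return cache[addr]

    def store(self, proc, addr, value):
        self.caches[proc][addr] = value
        if self.coherent:
            self.memory[addr] = value  # write through to shared memory
            for other, cache in enumerate(self.caches):
                if other != proc:
                    cache.pop(addr, None)  # invalidate the other copies
        # With coherent=False the new value never leaves the private cache.


def checked_add(sim, reference, proc, addr, delta):
    """Perform X += delta on one processor and compare with the reference map.

    Returns None when the load agrees with the reference, otherwise a message.
    """
    got = sim.load(proc, addr)
    expected = reference.get(addr, 0)
    if got != expected:
        return f"proc {proc}: load returned {got}, should be {expected}"
    sim.store(proc, addr, got + delta)
    reference[addr] = expected + delta
    return None
```

With the injected bug, the second processor loads the stale X = 0 and the check fails on that load, before the wrong sum is ever written.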
Dynamic testing of an HTM
Program: TX_Begin; X += 10; TX_Commit;
[Figure: the timing simulator of an HTM (a processor and a shared memory) is checked against one STL map per transaction (TX). Inputs and outputs of X are compared with the map; after the commit the map holds X = 10 while the simulator's output does not match.]
Check failed: should commit X=10

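The per-transaction map can be sketched as follows. Everything here is an assumption made for illustration: `ToyHTM` is a hypothetical lazy-versioning model with an injected `lose_commit` bug, and `ReferenceTM` keeps one dict for committed memory plus one dict for the running transaction.

```python
class ToyHTM:
    """Hypothetical lazy-versioning HTM model with an optional commit bug."""

    def __init__(self, lose_commit=False):
        self.memory = {}
        self.write_set = {}
        self.lose_commit = lose_commit

    def tx_begin(self):
        self.write_set = {}

    def tx_read(self, addr):
        return self.write_set.get(addr, self.memory.get(addr, 0))

    def tx_write(self, addr, value):
        self.write_set[addr] = value

    def tx_commit(self):
        if not self.lose_commit:
            self.memory.update(self.write_set)
        self.write_set = {}

    def read(self, addr):
        return self.memory.get(addr, 0)


class ReferenceTM:
    """Reference emulator: committed memory plus one map per transaction."""

    def __init__(self):
        self.memory = {}
        self.tx = None

    def begin(self):
        self.tx = {}

    def read(self, addr):
        return self.tx.get(addr, self.memory.get(addr, 0))

    def write(self, addr, value):
        self.tx[addr] = value

    def commit(self):
        written = list(self.tx)
        self.memory.update(self.tx)
        self.tx = None
        return written  # addresses the transaction must have published


def checked_atomic_add(htm, ref, addr, delta):
    """Run Atomic { X += delta } in both models; return None or a message."""
    htm.tx_begin()
    ref.begin()
    got, expected = htm.tx_read(addr), ref.read(addr)
    if got != expected:
        return f"read of {addr:#x} returned {got}, should be {expected}"
    htm.tx_write(addr, got + delta)
    ref.write(addr, expected + delta)
    htm.tx_commit()
    for written in ref.commit():
        if htm.read(written) != ref.memory[written]:
            return (f"commit lost: {written:#x} is {htm.read(written)}, "
                    f"should be {ref.memory[written]}")
    return None
```

Checking the written addresses right at commit time is one possible design; it catches the lost commit immediately instead of at some later read.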
Other Use Cases
Out-of-order or pipelined processor
– With a processor emulator, e.g., QEMU
Complex memory hierarchy
– With an STL map
Incoherent multilevel memory hierarchy
– With multiple STL maps, one per memory hierarchy
System-on-Chip, routing protocols, etc.
– Simple emulators of the functionalities

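For the incoherent case, one reading of "multiple maps" is a separate reference dict for each independently addressed memory, with data moving between them only through explicit copies. The sketch below follows that reading; `MultiLevelReference` and the memory names `"scratchpad"` and `"dram"` are hypothetical and not taken from the slides.

```python
class MultiLevelReference:
    """Reference emulator with one address -> value map per memory."""

    def __init__(self, memories):
        self.maps = {name: {} for name in memories}

    def write(self, memory, addr, value):
        self.maps[memory][addr] = value

    def read(self, memory, addr):
        return self.maps[memory].get(addr, 0)

    def copy(self, src, dst, addr):
        # Data moves between memories only when software asks for it.
        self.maps[dst][addr] = self.read(src, addr)

    def check(self, memory, addr, simulated_value):
        """True if the simulator's value matches this memory's map."""
        return simulated_value == self.read(memory, addr)
```

Because the maps are never synchronized implicitly, a simulator that forwards a value between memories without an explicit copy would fail `check` on the destination memory.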
Performance Evaluation
Implemented on 4 HTMs with lazy and eager version management
Implemented for a directory-based cache-coherence protocol
Baseline: M5 full-system simulator

Performance evaluation: OS booting

Performance evaluation: applications

Our experience with Dynamic Testing
Reduced the time spent on writing tests
Faster debugging
– Detects most bugs “in minutes”
– Eliminating a bug takes tens of minutes instead of hours/days/weeks/…
Shortened the total simulator development from months to 3-4 months

Conclusions
Presented Dynamic Simulator Testing
It detects functional bugs in cycle-accurate simulators
Modest reduction of simulator performance

Thanks!
Sasha Tomić

Cache/HTM emulator implementation
STL map (dictionary): key = address, value = line data
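A minimal sketch of such a map, with a Python dict in place of the STL map. The 64-byte `LINE_SIZE`, the `LineMemory` name, and the byte-wise interface are assumptions for the example; the slide only states that the map goes from an address to line data.

```python
LINE_SIZE = 64  # assumed cache-line size in bytes


class LineMemory:
    """Emulator storage: line base address -> line data (bytearray)."""

    def __init__(self):
        self.lines = {}

    def _locate(self, addr):
        """Return (line, offset) for a byte address, creating a zeroed line if needed."""
        base = addr - addr % LINE_SIZE
        line = self.lines.setdefault(base, bytearray(LINE_SIZE))
        return line, addr - base

    def write(self, addr, data):
        """Write bytes starting at addr; the access may span several lines."""
        for i, byte in enumerate(data):
            line, offset = self._locate(addr + i)
            line[offset] = byte

    def read(self, addr, size):
        """Read size bytes starting at addr; untouched memory reads as zero."""
        out = bytearray()
        for i in range(size):
            line, offset = self._locate(addr + i)
            out.append(line[offset])
        return bytes(out)
```

Keying the map by line address keeps the emulator sparse: only lines that the simulated program touches are ever allocated.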