On Tuning Microarchitecture for Programs Daniel Crowell, Wenbin Fang, and Evan Samanas
Summary A flexible framework for microarchitecture adaptivity, which separates software policies from hardware mechanism Case study: adaptive cache Evaluation: SimpleScalar / Wattch / SPEC2000 / User program Conclusion: Microarchitecture adaptivity is awesome, and our framework is awesome too
Outline Motivation Adaptivity Framework Case study: Adaptive Cache Evaluation Conclusion
Motivation Optimizing for all is optimizing for nothing Software is increasingly complex, and much of it is closed source S/W and H/W codesign is infeasible for legacy software
One size doesn’t fit all Show the cache results from our preliminary benchmarking To back our motivation for this project To support our decision to do the case study on adaptive cache, rather than other components
Three Questions for Microarchitecture Adaptivity When to adapt? => Policy – Interval? Context switch? Function boundary? What goal(s)? => Policy – Performance first? Performance-power ratio first? How to adapt? => Mechanism – E.g., parameters of cache include block size, # of blocks, # of sets, replacement algorithm, …
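The cache parameters listed under "How to adapt?" are tied together by the standard relation capacity = sets × ways × block size, so changing one of them changes either the capacity or the mapping of addresses to sets. The sketch below illustrates this in Python; `CacheConfig` and its field names are hypothetical and are not part of the framework itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheConfig:
    """Hypothetical description of one cache configuration."""
    block_size: int           # bytes per block
    num_sets: int             # number of sets
    ways: int                 # blocks per set (associativity)
    replacement: str = "LRU"  # replacement algorithm, kept as a label only

    @property
    def num_blocks(self) -> int:
        return self.num_sets * self.ways

    @property
    def capacity(self) -> int:
        # Total data capacity in bytes.
        return self.num_blocks * self.block_size

    def set_index(self, addr: int) -> int:
        # Which set a byte address maps to.
        return (addr // self.block_size) % self.num_sets


# Example values are assumptions chosen for illustration.
full = CacheConfig(block_size=32, num_sets=256, ways=4)
half_sets = CacheConfig(block_size=32, num_sets=128, ways=4)
```

Halving the number of sets halves the capacity and also remaps addresses, which is one reason "how to adapt" is a mechanism question and not only a sizing question.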
Adaptivity Framework
Mechanism Basically, this is to list some related work on adaptivity, e.g., adaptive cache, adaptive TLB, adaptive processor, … And list some interesting findings during the course of this project, if we make any progress …
Policy Instruction 1: adapt_advise – Inspired by “madvise” in OS system calls – Used in software: OS, compiler, user programs – Operand: performance first or performance-power ratio first Instruction 2: adapt_setup – Privileged, only used by OS – Operand: whether user programs are allowed to use adapt_advise
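To make the intended interplay of the two instructions concrete, here is a toy software model in Python. It is only a sketch: the class `AdaptiveCore`, the `Goal` enum, the `privileged` flag, the default values, and the choice to silently ignore disallowed advice (in the spirit of a hint such as madvise) are all assumptions, not the actual instruction semantics.

```python
from enum import Enum


class Goal(Enum):
    PERFORMANCE = "performance first"
    PERF_POWER_RATIO = "performance-power ratio first"


class AdaptiveCore:
    """Toy model of adapt_setup / adapt_advise (hypothetical semantics)."""

    def __init__(self) -> None:
        self.user_advise_allowed = False  # assumed default
        self.goal = Goal.PERFORMANCE      # assumed default

    def adapt_setup(self, allow_user: bool, privileged: bool) -> None:
        # Privileged: only the OS may change who can issue adapt_advise.
        if not privileged:
            raise PermissionError("adapt_setup is privileged")
        self.user_advise_allowed = allow_user

    def adapt_advise(self, goal: Goal, privileged: bool = False) -> bool:
        """Return True if the advice was accepted."""
        if not privileged and not self.user_advise_allowed:
            return False  # assumption: disallowed advice is ignored
        self.goal = goal
        return True


core = AdaptiveCore()
rejected = core.adapt_advise(Goal.PERF_POWER_RATIO)   # user code, not yet allowed
core.adapt_setup(allow_user=True, privileged=True)    # OS enables user advice
accepted = core.adapt_advise(Goal.PERF_POWER_RATIO)   # now takes effect
```

Keeping the permission check in adapt_setup is what lets the OS remain the final authority while still letting compilers and user programs express a goal.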
Policy [OS] Interval / Predicted Interval [OS] Context switch / Application boundary [Compiler] Function boundary [User] User program
Case study: Adaptive Cache According to our experimental result, we find cache is more interesting than other components …
Selective set VS Selective way Why do we want to do selective set? Any interesting
Implementation detail Hopefully we can put a block diagram here, to make it look more professional to an architecture audience.
Evaluation Simulator – SimpleScalar 3.0 – Wattch Workload – 6 programs from SPEC 2000 – 3 microbenchmark programs Case study: Adaptive Cache
Microbenchmark Hong-Tai Chou, David J. DeWitt: An Evaluation of Buffer Management Strategies for Relational Database Systems. Algorithmica 1(3): (1986). Six data access patterns: 1. Straight Sequential (SS) References 2. Clustered Sequential (CS) References 3. Looping Sequential (LS) References 4. Independent Random (IR) References 5. Clustered Random (CR) References 6. Looping Hierarchical (LH) References
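As an illustration of what such microbenchmarks exercise, the sketch below generates block-reference streams for three of the six patterns (SS, LS, IR). The three are picked only for illustration and are not necessarily the ones used in the evaluation; the function names, parameters, and the fixed random seed are assumptions.

```python
import random


def straight_sequential(n_blocks):
    """SS: each block is referenced once, in order."""
    yield from range(n_blocks)


def looping_sequential(n_blocks, loops):
    """LS: the same sequential scan is repeated several times."""
    for _ in range(loops):
        yield from range(n_blocks)


def independent_random(n_blocks, n_refs, seed=0):
    """IR: each reference is drawn independently and uniformly."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    for _ in range(n_refs):
        yield rng.randrange(n_blocks)
```

A looping scan over more blocks than the cache holds is a classic worst case for LRU, which is the kind of behavior an adaptive cache policy would want to detect.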
Mechanism Use 3 microbenchmark programs and 6 programs from SPEC 2000 Use simple policy: e.g., application boundary Show effectiveness of adaptive cache – Figure 1: bar chart on performance – Figure 2: bar chart on performance-power ratio
Policy Use 3 microbenchmark programs – Don’t use SPEC2000, due to some limitations, e.g., SimpleScalar doesn’t support multi-process Use idealistic mechanism: best configuration Show the flexibility of software policies – Figure 1: bar chart on performance [x-axis: policies; y-axis: normalized performance] – Figure 2: bar chart on performance-power ratio [x-axis: policies; y-axis: normalized performance-power ratio]
Mechanism + Policy If time allows, work out this part to make the project complete.
Conclusion Adaptivity is useful A flexible adaptivity framework – Mechanism – Policy