Analytic Evaluation of Shared-Memory Systems with ILP Processors


1 Analytic Evaluation of Shared-Memory Systems with ILP Processors
D.J. Sorin, V.S. Pai, S.V. Adve, M.K. Vernon, D.A. Wood
Presented by Bogdan Romanescu

2 Introduction
- Motivation: Simulating shared-memory systems with ILP processors takes painfully long
- Hypothesis: It is possible to describe the system with a set of equations whose simple parameters capture the system details
- Method: View memory as a system of queues and delay centers
- Metric: Processor throughput

3 System under test
- Cache-coherent shared-memory multiprocessor
- Mesh interconnection
- Processor
  - multiple issue
  - out-of-order scheduling
  - non-blocking loads
  - speculative execution
- L1 and L2 caches
- State-tracking miss status holding registers (MSHRs)
- Interleaved memory and directory

4 Model parameters
Architecture parameters
- number of nodes
- number of MSHRs
- NI, bus and switch occupancies
Application parameters
- ILP parameters: , CV, fsynch-write, fM
- Directory coherence parameters: Pread, Pwrite, Pupgrade, Pwb, PL|x, PM|x,y, P3hop|x&not-memory, H, X

5 Estimating parameters
- Non-ILP-dependent: fast simulators for multiprocessors with single-issue, in-order processors
- ILP-dependent: FastILP simulator
  - Timestamping
  - “Eras” division
  - Trace-driven simulations
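As an illustration of what a trace-driven parameter estimate might look like, the sketch below derives a mean inter-request time and its coefficient of variation (CV) from a list of request issue times. The trace format (a sorted list of timestamps in cycles) and the function name `interrequest_stats` are assumptions made for this example; the slides do not describe FastILP's actual output format or estimation procedure.

```python
from statistics import mean, pstdev


def interrequest_stats(timestamps):
    """Return (mean gap, CV of gaps) for a sorted list of request issue times.

    Hypothetical helper: `timestamps` is assumed to hold the cycle at which
    each memory request left the processor, in increasing order.
    """
    if len(timestamps) < 3:
        raise ValueError("need at least three timestamps to estimate a CV")
    # Gaps between consecutive requests.
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    avg = mean(gaps)
    if avg <= 0:
        raise ValueError("timestamps must be increasing")
    # CV = standard deviation / mean; values above 1 indicate bursty traffic.
    return avg, pstdev(gaps) / avg


if __name__ == "__main__":
    print(interrequest_stats([0, 10, 20, 30]))  # evenly spaced requests
    print(interrequest_stats([0, 1, 2, 30]))    # a burst followed by a long gap
```

Both example traces have the same mean gap, but only the second has a CV above 1, which is the sense in which CV measures burstiness.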

6 Analytical model
- Output measure: system throughput (IPC) as f(input parameters, system architecture)
- Iterates between 2 models
  - Synchronous blocking model (SB): processor stalled on loads and read-modify-writes
  - MSHR blocking model (MB): processor stalled because the MSHRs are full
- MVA equations used to compute delays
- Synchronization (locks and barriers) accounted for separately
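For readers unfamiliar with mean value analysis (MVA), the following is a minimal sketch of the textbook exact MVA recursion for a single-class closed network of FIFO queues plus one delay center. It is not the customized model from the paper, which iterates between the SB and MB submodels and handles synchronization separately. The function name, the argument layout, and the mapping of "customers" to outstanding requests are assumptions for illustration only.

```python
def mva(service_times, think_time, n_customers):
    """Exact single-class MVA for a closed queueing network.

    service_times: mean service demand at each FIFO queue (e.g. NI, bus, memory)
    think_time:    mean time spent at the delay center between requests
    n_customers:   number of circulating customers
    Returns (throughput, residence_times, queue_lengths).
    """
    queue_len = [0.0] * len(service_times)
    residence = [0.0] * len(service_times)
    throughput = 0.0
    for n in range(1, n_customers + 1):
        # Arrival theorem: an arriving customer sees the queue lengths of the
        # same network with one fewer customer.
        residence = [s * (1.0 + q) for s, q in zip(service_times, queue_len)]
        # Little's law applied to the whole cycle (delay center plus queues).
        throughput = n / (think_time + sum(residence))
        # Little's law applied to each queue.
        queue_len = [throughput * r for r in residence]
    return throughput, residence, queue_len


if __name__ == "__main__":
    x, r, q = mva([1.0], think_time=1.0, n_customers=2)
    print(x, r, q)
```

The recursion needs only mean service demands and a population size, which is why an MVA-style model can be evaluated far faster than a detailed simulation.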

7 Equations
- Average round-trip time (SB model)
- Total average residence time at NI out queue
- Total mean delay for each type of synchronous transaction at local NI
- Utilization of local NI queue
- Average waiting time at local NI queue due to traffic from remote nodes

8 Model validations
- Better approximation for the residual life
- Account for significant fsynch-write
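As background on the residual-life term, the sketch below computes the standard renewal-theory mean residual life: a request arriving at a random instant while a server is busy sees, on average, E[S^2] / (2 E[S]) = S (1 + CV^2) / 2 of remaining service. This is the textbook formula only; the slides do not spell out the improved approximation used in the model, and the function name is hypothetical.

```python
def mean_residual_life(mean_service, cv):
    """Mean remaining service time seen by a random arrival at a busy server.

    mean_service: mean service time S
    cv:           coefficient of variation of the service time
    """
    if mean_service < 0 or cv < 0:
        raise ValueError("mean_service and cv must be non-negative")
    return mean_service * (1.0 + cv * cv) / 2.0


if __name__ == "__main__":
    print(mean_residual_life(10.0, 0.0))  # deterministic service: half the service time
    print(mean_residual_life(10.0, 1.0))  # exponential service: the full mean
```

The dependence on CV shows why a mean alone can understate waiting time when service times vary widely.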

9 Applications
- Insights into application behavior
  - fM: ability to exploit ILP to overlap read memory requests
  - CV: degree of burstiness
- Evaluation of the impact of the number of MSHRs
- Benefits of coupled/decoupled memory and directories
- Analysis of the impact of programmable coherence controllers

10 Questions
- Is “mean time” a representative measure? How misleading can it be?
- Residual life: even with interpolation, is it accurate enough?
- Why do the errors go up even after applying the 2 accuracy-increasing observations?

