1
Cost Effective Memory Dependence Prediction Using Speculation Levels and Color Sets
Soner Önder
Michigan Technological University, Houghton MI
2
Outline
Background: memory dependence prediction; pairing-based approach; store sets.
Color sets: notion of color sets; color set implementation; color set predictor; instruction window modifications.
Experimental evaluation: basic policy; aggressive policy.
3
Memory Dependence Prediction
Assume ST-2, ST-p and LD-s all access the same memory location. If we issue LD-s at this point in time, we’ll get a memory order violation. If we know that load LD-s is dependent on store ST-p, we can issue the load at the right time.
[Table: instruction window with columns Seq. (1, 2, 3, ..., p, p+1, p+2, p+3), Instruction (ST-1, ST-2, ST-3, ..., ST-p, ST-p+1, ST-p+2, LD-s) and Ready (No/Yes); LD-s is annotated with ST-p.]
4
Dynamic Memory Disambiguation
Problem: In the presence of unresolved stores in the instruction window, which load(s) must be held?
Ideal solution: Wait only for the producer store.
Simple solutions: Wait for all stores (no speculation), or issue blindly (blind speculation).
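The three options can be compared with a minimal sketch. The names `Store`, `may_issue_load`, the policy strings and the `producer_seq` argument are hypothetical and appear nowhere in the slides; an unresolved store is modeled simply as a flag.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Store:
    seq: int        # program-order sequence number
    resolved: bool  # True once the store is no longer unresolved


def may_issue_load(load_seq: int, stores: List[Store], policy: str,
                   producer_seq: Optional[int] = None) -> bool:
    """Decide whether a load may issue under one of the three policies."""
    # Only older, still-unresolved stores can cause a memory order violation.
    older_unresolved = [s for s in stores
                        if s.seq < load_seq and not s.resolved]
    if policy == "wait_for_all":   # no speculation
        return not older_unresolved
    if policy == "blind":          # blind speculation
        return True
    if policy == "ideal":          # wait only for the producer store
        return all(s.seq != producer_seq for s in older_unresolved)
    raise ValueError(f"unknown policy: {policy}")
```

The ideal policy needs to know the producer store in advance, which is exactly the information a memory dependence predictor tries to approximate.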
5
Memory dependence prediction (Moshovos et al. 1997-1998)
Earlier work concentrated mainly on predicting precise dependencies among pairs of load/store instructions, with two goals: to enable early issuing of loads through memory dependence prediction, and to streamline communication so that values can be passed directly from producers to consumers instead of through memory. The emphasis has been on identifying the precise store instruction a load may depend on.
6
Store-set Memory Dependence Predictor (Chrysos & Emer - 1998)
A store set is the set of all stores a load has been observed to depend on. Initially, loads are issued with blind speculation. Upon a memory order violation, a store set is created for the offending load and store. The next time the same load is encountered, it waits until the store issues. A store set may contain multiple stores: the stores are chained and the load is made dependent on the last store.
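The behavior described above can be sketched as follows. This is a simplified illustration, not the published design: it uses unbounded dictionaries instead of fixed-size hardware tables, and the set-merging rule in `on_violation` is a simplifying assumption. The class and method names are hypothetical.

```python
class StoreSetPredictor:
    """Toy store-set predictor: PC -> store set ID -> last fetched store."""

    def __init__(self):
        self.ssit = {}      # PC -> store set ID (SSID)
        self.lfst = {}      # SSID -> tag of the last fetched store
        self.next_ssid = 0

    def on_violation(self, load_pc, store_pc):
        """Put the offending load and store into the same store set."""
        ssid = self.ssit.get(load_pc, self.ssit.get(store_pc))
        if ssid is None:
            ssid = self.next_ssid
            self.next_ssid += 1
        self.ssit[load_pc] = ssid
        self.ssit[store_pc] = ssid

    def fetch_store(self, store_pc, tag):
        """Return the earlier store to chain behind, or None."""
        ssid = self.ssit.get(store_pc)
        if ssid is None:
            return None
        previous = self.lfst.get(ssid)
        self.lfst[ssid] = tag
        return previous

    def fetch_load(self, load_pc):
        """Return the store tag the load must wait for, or None."""
        ssid = self.ssit.get(load_pc)
        return None if ssid is None else self.lfst.get(ssid)

    def store_issued(self, store_pc, tag):
        """Clear the entry once the last fetched store has issued."""
        ssid = self.ssit.get(store_pc)
        if ssid is not None and self.lfst.get(ssid) == tag:
            del self.lfst[ssid]
```

A load with no recorded store set gets `None` from `fetch_load` and issues speculatively; after a violation it is made to wait for the most recently fetched store in its set.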
7
Store-set Implementation
[Figure: store-set tables, labeled PC, SSID and LFST.]
Dependence information is digested to create sets of colliding instructions. Each set tells exactly which stores a load should wait for. Sufficiently large tables yield the performance of an oracle.
8
Color Set predictor
Instead of predicting precise dependencies among pairs of loads/stores, or constructing sets of store and load instructions which collided in the past, we assign the processor and the load and store instructions speculation levels (colors), and predict the speculation level (i.e., the color) at which a load or store can be issued without a collision.
[Figure label: Predictor size]
9
Color Set predictor
Since we only try to predict the speculation level, we expect smaller storage for the predictor, better performance at smaller hardware budgets, faster implementations and power savings, but also more collisions.
10
So, it is something like this
[Figure: two color scales, 00 01 10 11, one for the processor and one for loads.]
The rules governing color changes are called policies. We investigate two policies: a basic policy and an aggressive policy.
11
Load instruction selection
[Figure: color scale 00 01 10 11, marking the current processor color and the eligible load instructions.]
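Load selection can be sketched as a comparison between each load's color and the current processor color. The slides show the rule only as a figure, so the comparison used here (a load is eligible when its color does not exceed the processor color) is an assumption, as is whether the real comparison is strict; `eligible_loads` is a hypothetical name.

```python
def eligible_loads(load_colors, processor_color):
    """Return the indices of loads that may issue at the current color.

    ASSUMPTION: a load is eligible when its color is at most the
    current processor color; the slides do not state the rule in text.
    """
    return [i for i, color in enumerate(load_colors)
            if color <= processor_color]
```

Under this assumption, raising the processor color widens the set of loads that may issue speculatively, and lowering it narrows the set.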
15
Instruction window extensions
[Figure: window details. Labels: inhibit color, global color, adders, a <= comparator producing the Issue? signal, and instructions entering the window.]
16
Collisions
[Figure: a load and a store with color labels 01 and 10, shown against the color scale 00 01 10 11; axis label: current processor color.]
17
Color Set Predictor Basic Policy
1. The basic policy gradually becomes aggressive when port utilization is low.
2. Upon a collision, the load instruction is given a higher color and the store instruction is given a lower color.
3. The processor runs at the smaller of the current processor color and the color of the store instructions.
4. Rules 2 and 3 together run the processor at a lower speculation level than the level at which the prior collision occurred.
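The four rules can be illustrated with a small sketch. The slides do not say by how much a color changes, so two things here are assumptions: colors move in single saturating steps, and the colliding store's new color is placed one step below the processor level at which the collision occurred, so that rule 4 holds. All function names are hypothetical.

```python
MAX_COLOR = 3  # colors 00, 01, 10, 11


def on_collision(load_color, store_color, level_at_collision):
    """Rule 2: raise the load's color and lower the store's color.

    ASSUMPTION: the store ends up one step below both its old color
    and the processor level at which the collision happened.
    """
    new_load = min(load_color + 1, MAX_COLOR)
    new_store = max(min(store_color, level_at_collision) - 1, 0)
    return new_load, new_store


def processor_color(current, store_colors_in_window):
    """Rule 3: the smaller of the current color and the stores' colors."""
    return min([current, *store_colors_in_window])


def on_low_port_utilization(current):
    """Rule 1: move gradually (one step) toward more speculation."""
    return min(current + 1, MAX_COLOR)
```

For example, a collision at level 2 between a load of color 1 and a store of color 3 yields colors 2 and 1; when that store is next in the window, the processor drops to color 1, below the level of the earlier collision.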
18
Color Set Predictor Aggressive Policy
1. The aggressive policy switches to the maximum speculation level when port utilization is low.
2. Upon a collision, the load instruction is given a higher color and the store instruction is specifically marked.
3. The processor decrements the current processor color when a colliding store is detected.
4. As a result, the processor runs at the highest speculation level that won’t result in a collision, and at a different color than the one it had during the collision.
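The processor-side behavior of the aggressive policy can be sketched as a small state holder. The starting color and the saturation at 0 are assumptions, and the class and method names are hypothetical.

```python
MAX_COLOR = 3  # colors 00, 01, 10, 11


class AggressiveProcessorColor:
    """Toy model of the processor color under the aggressive policy."""

    def __init__(self):
        # ASSUMPTION: start at the maximum speculation level.
        self.color = MAX_COLOR

    def store_enters_window(self, is_colliding_store):
        """Rule 3: decrement the color when a marked store is detected."""
        if is_colliding_store:
            self.color = max(self.color - 1, 0)
        return self.color

    def low_port_utilization(self):
        """Rule 1: jump straight to the maximum speculation level."""
        self.color = MAX_COLOR
        return self.color
```

In contrast to the basic policy, recovery from a low speculation level is immediate rather than gradual.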
19
Color Set Predictor
Accessed early in the pipeline using the load/store (L/S) PC.
Updated upon collision or successful speculation.
[Figure: table indexed by L/S PC, holding the L/S color (example entry: 10).]
Basic policy colors: 00 = no speculation, 01 = level 1, 10 = level 2, 11 = level 3.
Aggressive policy colors: 00 = no speculation, 01 = level 1, 10 = level 2, 11 = level 3 / colliding store.
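A PC-indexed table of 2-bit colors might look like the sketch below, shown for load entries only. The index function, the default table size, the initial color and the rule that a successful speculation lowers a load's color by one step are all assumptions; the slides state only that the table is accessed with the L/S PC and updated upon collision or successful speculation. `ColorSetPredictor` and its methods are hypothetical names.

```python
MAX_COLOR = 0b11


class ColorSetPredictor:
    """Toy direct-mapped table of 2-bit colors, indexed by PC."""

    def __init__(self, entries=128):
        # ASSUMPTION: every entry starts at color 00.
        self.colors = [0] * entries

    def _index(self, pc):
        # ASSUMPTION: drop the two low PC bits, then wrap to table size.
        return (pc >> 2) % len(self.colors)

    def lookup(self, pc):
        """Read the predicted color early in the pipeline."""
        return self.colors[self._index(pc)]

    def update_load(self, pc, collided):
        """Raise the color on a collision, lower it on success."""
        i = self._index(pc)
        if collided:
            self.colors[i] = min(self.colors[i] + 1, MAX_COLOR)
        else:
            self.colors[i] = max(self.colors[i] - 1, 0)
```

Because each entry holds only two bits and no tag, distinct PCs that map to the same entry share a color, which is one way such a predictor trades accuracy for storage.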
20
Processor’s colorful perspective
Basic policy: When port utilization is low, the processor moves on to the next color. The processor assumes the lowest-ranking store’s color.
[Figure: processor color states 00 01 10 11, with transitions labeled low port utilization and colliding stores.]
21
Processor’s colorful perspective
Aggressive policy: When a colliding store enters the window, the processor decrements its color. When port utilization is low, the processor switches to red.
[Figure: processor color states 00 01 10 11, with transitions labeled low port utilization and colliding stores.]
22
Load instruction color states
Both policies
[Figure: load color states 00 01 10 11, with transitions labeled collision and successful speculation.]
23
Simulation Framework
Aggressive out-of-order superscalar processor:
8 instructions/cycle fetch/dispatch
16 instructions/cycle retire width
64-entry centralized reservation station
8 symmetric functional units
Multi-block gshare fetch unit
2 memory ports (r/w)
Perfect D-cache
Simulated using cycle-accurate simulators generated automatically from ADL descriptions using the FAST system.
24
Performance Spec Fp Arithmetic Mean
25
Performance Spec Fp Harmonic Mean
26
Performance Spec Int Arithmetic Mean
27
Performance Spec Int Harmonic Mean
28
Individual benchmarks 128-Fp
29
Individual benchmarks 4096-Fp
30
Individual benchmarks 128-Int
31
Individual benchmarks 4096-Int
32
So ... Cost effective dependence prediction. Why does it work?
Design space: number of colors / number of entries; confidence mechanisms; other policies.
Power consumption: disable chunks of the predictor and use the basic policy; enable them and become aggressive.
33
Have a colorful evening
Soner Önder Michigan Technological University Antalya, Turkey