Multicast Snooping
E. Bilir, R. Dickson, Y. Hu, M. Plakal, D. Sorin, M. Hill, D. Wood
Presented by Derek Hower
Why Multicast?
- Goal: reduce communication overhead in cache-coherent multiprocessors
  - Scalable snooping
  - Reduced-latency directories
- Solution: hybrid snoop/directory protocol
What is it?
- Replace the snooping bus with a multicast address network
- Predict the participants of each snoop transaction
- Back up the speculation with a directory
- Back end is a point-to-point data network (like Starfire)
The Protocol
- Snooping communication only with the processors thought to be involved in the transaction
  - Assume the transaction is correct until told otherwise
- Incorrect predictions are handled via nack and semiack
- A small, predictive directory protocol backs up the speculative snooping
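The directory's backup role can be illustrated with a minimal sketch. The slides do not define the exact rules, so everything here is an assumption made for illustration: the `DirEntry` layout, the `check_mask` helper, the rule that a read needs the owner while a write needs the owner and all sharers, and the reading of semiack as "the owner was reached but some sharers were missed".

```python
from dataclasses import dataclass, field
from typing import Optional, Set, Tuple


@dataclass
class DirEntry:
    """Hypothetical directory state for one block."""
    owner: Optional[int] = None              # None: memory owns the block
    sharers: Set[int] = field(default_factory=set)


def check_mask(entry: DirEntry, requester: int, mask: Set[int],
               exclusive: bool) -> Tuple[str, Set[int]]:
    """Return (reply, missing_nodes) for a multicast with the given mask.

    Assumed rule: a read must reach the owner; a write must reach the
    owner and every sharer. The requester never needs to reach itself.
    """
    needed: Set[int] = set()
    if entry.owner is not None and entry.owner != requester:
        needed.add(entry.owner)
    if exclusive:
        needed |= entry.sharers - {requester}

    missing = needed - mask
    if not missing:
        return "ack", missing            # prediction was sufficient
    owner_reached = (entry.owner is not None
                     and entry.owner != requester
                     and entry.owner in mask)
    if exclusive and owner_reached:
        return "semiack", missing        # assumed: partial success
    return "nack", missing               # requester must retry


entry = DirEntry(owner=3, sharers={1, 5})
print(check_mask(entry, requester=0, mask={1, 3, 5}, exclusive=True))
print(check_mask(entry, requester=0, mask={1, 3}, exclusive=True))
print(check_mask(entry, requester=0, mask={1, 5}, exclusive=True))
```

Returning the missing nodes alongside the reply reflects the idea that the directory can tell the requester how to fix its mask on retry; whether the real protocol carries that information in the nack is not stated in the slides.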
Mask Prediction
- Node locality makes prediction feasible
  - Local data (stack, some parts of the heap)
  - Misses to the same block
- Sticky-Spatial(k) prediction
  - Tracks block accesses and the last invalidator
  - Introduces locality by using adjacent blocks in the prediction table
  - An unrelated block can influence a prediction
  - Memory corrects mistakes
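A rough sketch of a Sticky-Spatial(k)-style predictor follows. The table organization is assumed, not taken from the slides: an untagged, direct-mapped table of node sets indexed by block number, where a prediction is the union of the entry for the block and its k neighbors on each side, and training only ever adds nodes ("sticky"). The class and method names are hypothetical.

```python
class StickySpatialPredictor:
    """Illustrative mask predictor; table layout is an assumption."""

    def __init__(self, entries=1024, k=1):
        self.entries = entries
        self.k = k
        # One node set per table entry; no tags, so blocks can alias.
        self.table = [set() for _ in range(entries)]

    def _index(self, block):
        return block % self.entries

    def predict(self, block, requester):
        """Union the masks of the block's entry and its k neighbors."""
        mask = {requester}
        base = self._index(block)
        for offset in range(-self.k, self.k + 1):
            mask |= self.table[(base + offset) % self.entries]
        return mask

    def train(self, block, nodes):
        """Sticky update: add nodes (e.g. responder, last invalidator,
        or a corrected mask from memory) and never remove any."""
        self.table[self._index(block)] |= set(nodes)


p = StickySpatialPredictor(entries=8, k=1)
p.train(10, {4})        # block 10 maps to entry 2
p.train(11, {6})        # block 11 maps to entry 3
print(p.predict(10, requester=0))   # uses entries 1, 2, 3
print(p.predict(18, requester=0))   # block 18 aliases to entry 2
```

The second prediction shows how an unrelated block can influence the mask: block 18 was never trained, yet it inherits nodes 4 and 6 because it shares a table entry with block 10. Over-prediction of this kind costs bandwidth but not correctness, since memory catches any mask that is too small.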
Address Network
- Built as a fat tree (modified Isotach)
- Total ordering is accomplished with timestamps
  - No need for synchronized delivery
- Capable of multiple broadcasts in parallel
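The idea of ordering by timestamp rather than by synchronized delivery can be sketched as a reorder buffer at each endpoint. This is only an illustration under assumptions: the `OrderedEndpoint` class, the `(timestamp, source)` tie-break, and the `safe_time` watermark (a point up to which no earlier message can still arrive) are hypothetical and not taken from the slides.

```python
import heapq


class OrderedEndpoint:
    """Buffers address messages and processes them in timestamp order."""

    def __init__(self):
        self._heap = []          # min-heap of (timestamp, source, payload)
        self.delivered = []      # messages in the order they were processed

    def receive(self, timestamp, source, payload):
        # Messages may arrive in any order and at any physical time.
        heapq.heappush(self._heap, (timestamp, source, payload))

    def advance(self, safe_time):
        # Assumed guarantee: nothing with timestamp <= safe_time is
        # still in flight, so those messages can be processed now.
        while self._heap and self._heap[0][0] <= safe_time:
            self.delivered.append(heapq.heappop(self._heap))


a, b = OrderedEndpoint(), OrderedEndpoint()
msgs = [(2, 1, "GETX B"), (1, 3, "GETS A"), (2, 0, "GETS C")]
for m in msgs:
    a.receive(*m)
for m in reversed(msgs):         # different arrival order at endpoint b
    b.receive(*m)
a.advance(2)
b.advance(2)
print(a.delivered == b.delivered)
```

Because every endpoint sorts by the same key, two endpoints that see the same messages in different arrival orders still process them identically, which is the property a snooping protocol needs from its address network.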
Evaluation
- “Big picture” simulations
  - Mean number of sharers
  - Prediction capability
  - Mask set size
  - Network availability
- Simulated an MSI (not MOSI) protocol
  - This only hurts the results
Results
- Prediction accuracy: 73–95%
- Avg. nodes in multicast: 2.4–5.6 (out of 32)
- Avg. excess nodes predicted: 0.3–3.4
- Implementation is better than half of optimal
Deep Thinking
- Evaluation of specifics
  - Timing: what if the time to traverse the fat tree overwhelms the benefits of decreased communication?
  - Complexity: for what range of system sizes do the benefits of multicast networking outweigh the complexity?
- Much room for improvement
  - Better prediction
  - Smarter address network