Influence Of The Cache Size On The Bus Traffic Mohd Azlan bin Hj. Abd Rahman 710126-01-5985 M031010035.



A summary of the traces is shown below:

Figure 1(a) Miss Rates Versus Cache Size

Figure 1(b) Bus Traffic Versus Cache Size

The global miss rate of the system decreases as the cache size increases, because capacity and conflict misses are reduced. For large cache sizes the miss rate stabilises; the remaining misses are the compulsory and coherence misses, which are independent of cache size. Measurements show that shared data exhibit less spatial and temporal locality than other data types. That is, in general, parallel programs exhibit less spatial and temporal locality than serial programs, so it is expected that miss rates are higher for multiprocessor traces than for uniprocessor traces.
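The cache-size effect described above can be reproduced with a minimal sketch (not part of the original experiment; the block size, cache sizes, and trace are invented for illustration): a direct-mapped cache simulator run over a synthetic trace that sweeps a 4 KiB array twice. Once the cache holds the whole working set, only the first-pass compulsory misses remain and the miss rate stops improving.

```python
# Minimal sketch (assumed parameters): a direct-mapped cache simulator
# showing capacity/conflict misses shrinking as the cache grows, while
# compulsory misses stay constant.

BLOCK_SIZE = 16  # bytes per cache block (assumed)

def miss_count(addresses, cache_size):
    """Count misses of a direct-mapped cache of `cache_size` bytes."""
    num_sets = cache_size // BLOCK_SIZE
    tags = [None] * num_sets          # one resident block per set
    misses = 0
    for addr in addresses:
        block = addr // BLOCK_SIZE
        index = block % num_sets
        if tags[index] != block:      # block not resident: miss
            misses += 1
            tags[index] = block
    return misses

# Synthetic trace: word-by-word sweep of a 4 KiB array, done twice.
trace = [i * 4 for i in range(1024)] * 2

for size in (1024, 2048, 4096, 8192):
    m = miss_count(trace, size)
    print(f"cache {size:5d} B: miss rate = {m / len(trace):.3f}")
# Once the cache reaches 4096 B the array fits entirely, so the second
# pass hits and the miss rate flattens at the compulsory-miss floor.
```

The flat region for the 4096 B and 8192 B caches corresponds to the plateau in Figure 1(a): extra capacity cannot remove compulsory misses.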

Fig. 1(b) shows the bus traffic per memory access on the multiprocessor for the same experiment. The traffic is split into data traffic and address (including command) traffic. The bus traffic falls as the miss rate decreases for two fundamental reasons:
1. There are fewer data transfers from the shared main memory to the caches.
2. Because there are fewer misses, fewer bus transactions are needed to maintain the cache coherence protocol.
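The two components above can be sketched as back-of-envelope arithmetic (the block size, address/command size, and rates below are assumptions, not values from the experiment): data traffic scales with the miss rate times the block size, while address/command traffic scales with the total number of bus transactions, including address-only invalidations.

```python
# Back-of-envelope sketch (illustrative numbers, not from the slides):
# bus traffic per memory access = data traffic on misses
#                               + address/command traffic per transaction.

BLOCK_BYTES = 32   # data moved per miss (assumed)
ADDR_BYTES = 6     # address + command per bus transaction (assumed)

def traffic_per_access(miss_rate, inval_rate=0.0):
    """Bytes of bus traffic per memory access.

    miss_rate  -- misses per memory access
    inval_rate -- invalidation transactions per access (address-only)
    """
    data = miss_rate * BLOCK_BYTES                 # block transfers
    address = (miss_rate + inval_rate) * ADDR_BYTES  # all transactions
    return data + address

# As the cache grows and the miss rate falls, both components shrink:
for mr in (0.10, 0.05, 0.02):
    print(f"miss rate {mr:.2f}: {traffic_per_access(mr, 0.01):.2f} B/access")
```

Because both terms are proportional to the miss rate (plus a small invalidation term), any cache-size increase that lowers the miss rate lowers data traffic and address traffic together, which is the trend in Figure 1(b).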

Fig. 2

We can conclude that the greater the number of processors running a parallel application, the higher the miss rate and the bus traffic. This happens because, with an invalidation-based protocol such as MESI, the more processors there are, the more likely it is that several caches share the same block; on a write, one cache then forces the others to invalidate that block, producing new misses (coherence misses) and increasing the number of block transfers.
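The invalidation mechanism just described can be illustrated with a toy model (a deliberate simplification in the spirit of MESI, not a full implementation: states are reduced to Modified/Shared/invalid and the Modified-to-Shared downgrade on a remote read is omitted): two caches share a block, one writes, and the other's next read becomes a coherence miss.

```python
# Toy sketch (assumed, simplified) of an invalidation-based protocol:
# a write by one cache invalidates the block in the others, so their
# next read is a coherence miss.

class Bus:
    def __init__(self):
        self.caches = []
    def invalidate(self, writer, block):
        for c in self.caches:
            if c is not writer:
                c.state.pop(block, None)   # remote copies invalidated

class Cache:
    def __init__(self, bus):
        self.state = {}    # block -> 'M' or 'S'; absent means invalid
        self.misses = 0
        bus.caches.append(self)

    def read(self, bus, block):
        if block not in self.state:        # cold or coherence miss
            self.misses += 1
            self.state[block] = 'S'        # fetched, now Shared

    def write(self, bus, block):
        if self.state.get(block) != 'M':
            bus.invalidate(self, block)    # broadcast invalidation
        self.state[block] = 'M'            # now Modified

bus = Bus()
c0, c1 = Cache(bus), Cache(bus)

c0.read(bus, 0)    # cold miss in c0
c1.read(bus, 0)    # cold miss in c1; block now shared by both
c0.write(bus, 0)   # invalidates c1's copy
c1.read(bus, 0)    # coherence miss: the copy was invalidated
print(c0.misses, c1.misses)   # -> 1 2
```

c1's second miss exists only because of the invalidation, not because of limited capacity; these are exactly the coherence misses that grow with sharing.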

On the other hand, the greater the number of processors, the more bus transactions are needed to maintain cache coherence. In short, as the number of processors increases for a given problem size, each processor's working set starts to fit in its cache, and domination by local misses (mainly capacity misses) gives way to domination by coherence misses.
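The scaling claim above can be sketched with a contrived workload (the access pattern is an assumption chosen to isolate the effect): N processors repeatedly read one shared block and take turns writing it under an invalidation protocol, so each write invalidates the other N-1 copies and their next reads miss.

```python
# Illustrative sketch (assumed workload): each round, every processor
# reads a shared block and then one processor writes it, invalidating
# all other copies. Coherence misses grow with the processor count.

def coherence_misses(num_procs, rounds):
    """Count misses when num_procs processors share one block."""
    present = set()                # processors holding a valid copy
    misses = 0
    for r in range(rounds):
        for p in range(num_procs):
            if p not in present:
                misses += 1        # cold or coherence miss
                present.add(p)
        writer = r % num_procs
        present = {writer}         # write invalidates all other copies
    return misses

for n in (2, 4, 8):
    print(f"{n} processors: {coherence_misses(n, 10)} misses")
# Each extra processor adds roughly one more miss per round, so misses
# (and the bus transactions behind them) grow with the processor count.
```

After the first round, every round costs about N-1 coherence misses, matching the observation that coherence traffic, not capacity, dominates as processors are added.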

THANK YOU