Additional Material CEG 4131 Computer Architecture III

Slides:



Advertisements
Similar presentations
Chapter 5 Part I: Shared Memory Multiprocessors
Advertisements

L.N. Bhuyan Adapted from Patterson’s slides
CSCI 8150 Advanced Computer Architecture
1 Lecture 6: Directory Protocols Topics: directory-based cache coherence implementations (wrap-up of SGI Origin and Sequent NUMA case study)
Multi-core systems System Architecture COMP25212 Daniel Goodman Advanced Processor Technologies Group.
Cache Optimization Summary
Cache Coherent Distributed Shared Memory. Motivations Small processor count –SMP machines –Single shared memory with multiple processors interconnected.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Snoopy Caches II Steve Ko Computer Sciences and Engineering University at Buffalo.
Technical University of Lodz Department of Microelectronics and Computer Science Elements of high performance microprocessor architecture Shared-memory.
The University of Adelaide, School of Computer Science
Computer Architecture II 1 Computer architecture II Lecture 8.
CIS629 Coherence 1 Cache Coherence: Snooping Protocol, Directory Protocol Some of these slides courtesty of David Patterson and David Culler.
1 Multiprocessors. 2 Idea: create powerful computers by connecting many smaller ones good news: works for timesharing (better than supercomputer) bad.
EECC756 - Shaaban #1 lec # 10 Spring Shared Memory Multiprocessors Symmetric Memory Multiprocessors (SMPs): commonly 2-4 processors/node.
EECC756 - Shaaban #1 lec # 11 Spring Shared Memory Multiprocessors Symmetric Multiprocessors (SMPs): –Symmetric access to all of main memory.
CS252/Patterson Lec /23/01 CS213 Parallel Processing Architecture Lecture 7: Multiprocessor Cache Coherency Problem.
BusMultis.1 Review: Where are We Now? Processor Control Datapath Memory Input Output Input Output Memory Processor Control Datapath  Multiprocessor –
1 Lecture 1: Parallel Architecture Intro Course organization:  ~5 lectures based on Culler-Singh textbook  ~5 lectures based on Larus-Rajwar textbook.
1 Lecture 20: Coherence protocols Topics: snooping and directory-based coherence protocols (Sections )
1 Lecture 3: Snooping Protocols Topics: snooping-based cache coherence implementations.
1 Lecture 18: Coherence Protocols Topics: coherence protocols for symmetric and distributed shared-memory multiprocessors (Sections )
1 COMP 206: Computer Architecture and Implementation Montek Singh Mon, Nov 14, 2005 Topic: Cache Coherence.
CPE 731 Advanced Computer Architecture Snooping Cache Multiprocessors Dr. Gheith Abandah Adapted from the slides of Prof. David Patterson, University of.
CS252/Patterson Lec /28/01 CS 213 Lecture 10: Multiprocessor 3: Directory Organization.
Snoopy Coherence Protocols Small-scale multiprocessors.
Lecture 37: Chapter 7: Multiprocessors Today’s topic –Introduction to multiprocessors –Parallelism in software –Memory organization –Cache coherence 1.
1 Shared-memory Architectures Adapted from a lecture by Ian Watson, University of Machester.
Multiprocessor Cache Coherency
Spring 2003CSE P5481 Cache Coherency Cache coherent processors reading processor must get the most current value most current value is the last write Cache.
Graduate Computer Architecture I Lecture 10: Shared Memory Multiprocessors Young Cho.
1 Cache coherence CEG 4131 Computer Architecture III Slides developed by Dr. Hesham El-Rewini Copyright Hesham El-Rewini.
Tufts University Department of Electrical and Computer Engineering
Shared Address Space Computing: Hardware Issues Alistair Rendell See Chapter 2 of Lin and Synder, Chapter 2 of Grama, Gupta, Karypis and Kumar, and also.
CSIE30300 Computer Architecture Unit 15: Multiprocessors Hsin-Chou Chi [Adapted from material by and
EECS 252 Graduate Computer Architecture Lec 13 – Snooping Cache and Directory Based Multiprocessors David Patterson Electrical Engineering and Computer.
Cache Control and Cache Coherence Protocols How to Manage State of Cache How to Keep Processors Reading the Correct Information.
A Low-Overhead Coherence Solution for Multiprocessors with Private Cache Memories Also known as “Snoopy cache” Paper by: Mark S. Papamarcos and Janak H.
Ch4. Multiprocessors & Thread-Level Parallelism 2. SMP (Symmetric shared-memory Multiprocessors) ECE468/562 Advanced Computer Architecture Prof. Honggang.
Effects of wrong path mem. ref. in CC MP Systems Gökay Burak AKKUŞ Cmpe 511 – Computer Architecture.
OpenRISC 1000 Yung-Luen Lan, b Cache Model Perspective of the Programming Model. Hence, the hardware implementation details (cache organization.
.1 Intro to Multiprocessors. .2 The Big Picture: Where are We Now? Processor Control Datapath Memory Input Output Input Output Memory Processor Control.
S YMMETRIC S HARED M EMORY A RCHITECTURE Presented By: Rahul M.Tech CSE, GBPEC Pauri.
Cache Coherence Protocols A. Jantsch / Z. Lu / I. Sander.
Influence Of The Cache Size On The Bus Traffic Mohd Azlan bin Hj. Abd Rahman M
Computer Science and Engineering Copyright by Hesham El-Rewini Advanced Computer Architecture CSE 8383 March 20, 2008 Session 9.
1 Lecture 19: Scalable Protocols & Synch Topics: coherence protocols for distributed shared-memory multiprocessors and synchronization (Sections )
Computer Science and Engineering Copyright by Hesham El-Rewini Advanced Computer Architecture CSE 8383 February Session 13.
1 Lecture 3: Coherence Protocols Topics: consistency models, coherence protocol examples.
Cache Coherence CS433 Spring 2001 Laxmikant Kale.
The University of Adelaide, School of Computer Science
1 Lecture: Coherence Protocols Topics: snooping-based protocols.
Lecture 28Fall 2006 Computer Architecture Fall 2006 Lecture 28: Bus Connected Multiprocessors Adapted from Mary Jane Irwin ( )
1 Lecture 17: Multiprocessors Topics: multiprocessor intro and taxonomy, symmetric shared-memory multiprocessors (Sections )
1 Lecture: Coherence Topics: snooping-based coherence, directory-based coherence protocols (Sections )
Performance of Snooping Protocols Kay Jr-Hui Jeng.
CMSC 611: Advanced Computer Architecture Shared Memory Most slides adapted from David Patterson. Some from Mohomed Younis.
The University of Adelaide, School of Computer Science
22/12/2005 Distributed Shared-Memory Architectures by Seda Demirağ Distrubuted Shared-Memory Architectures by Seda Demirağ.
CSC/ECE 506: Architecture of Parallel Computers Bus-Based Coherent Multiprocessors 1 Lecture 12 (Chapter 8) Lecture 12 (Chapter 8)
COSC6385 Advanced Computer Architecture
CS 704 Advanced Computer Architecture
12.4 Memory Organization in Multiprocessor Systems
CMSC 611: Advanced Computer Architecture
Example Cache Coherence Problem
Cache Coherence Protocols:
Cache Coherence Protocols:
Multiprocessors - Flynn’s taxonomy (1966)
High Performance Computing
Lecture 25: Multiprocessors
Coherent caches Adapted from a lecture by Ian Watson, University of Machester.
Presentation transcript:

Additional Material CEG 4131 Computer Architecture III Cache coherence Additional Material CEG 4131 Computer Architecture III

Review Caches Address 0 1 5 19 25 9 1 13 22 4 21 22 23 5 999 401 Hit (Y/N) N Y N N N N Y N N Y N … Block Element1 Element2 Element3 Element4 0 4 1 1 2 3

Coherence with Write-through Caches Key extensions to uniprocessor: snooping, invalidating/updating caches no new states or bus transactions in this case invalidation- versus update-based protocols Write propagation: even in inval case, later reads will see new value inval causes miss on later access, and memory up-to-date via write- through

Write-Through Cache State Transitions R = Read, W = Write, Z = Replace i = local processor, j = other processor

Write-Back Cache

Goodman’s Write-Once Protocol State Diagram

Snoopy Bus Protocol Performance Write-invalidate protocol Better handles process migrations and synchronization than other protocols. Cache misses can result from invalidations sent by other processors before a cache access, which significantly increases bus traffic. Bus traffic may increase as block sizes increase. Write-invalidate facilities writing synchronization primitives. Average number of invalidated cache copies is small in a small multiprocessor. Write-update procotol Requires bus broadcast facility May update remote cached data that is never accessed again Can avoid the back and forth effect of the write-invalidate protocol for data shared among multiple caches Can’t be used with long write bursts Requires extensive tracing to identify actual behavior

References K. Hwang, Advanced Computer Architecture Parallelism, Scalability, Programmability,  McGraw-Hill 1993.