Simplescalar: Victim Cache (ACA2008 HW3) 林仲祥, 李苡嬋, CSIE, NTU, 11/03, 2008


Victim Cache
 ISCA’90 (Norman P. Jouppi), “Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers”

Victim Cache Operation (diagram slides)
 On an access, the general cache is searched first; a hit returns the data directly.
 On a general-cache miss, the victim cache is searched.
 Victim miss: the data is read from memory (or the lower-level cache); the block evicted from the general cache is placed into the victim cache, replacing its LRU entry.
 Victim hit: the block is swapped with the conflicting block in the general cache, avoiding a memory access.
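The flow above can be sketched in C. This is a minimal illustrative model (fixed sizes, tags only, no data payloads); the structures and names below are made up for the sketch and are not SimpleScalar's real ones.

```c
#include <stddef.h>

/* Minimal sketch: a direct-mapped general cache backed by a small
 * fully-associative victim cache with LRU replacement. */
#define L1_SETS    8
#define VC_ENTRIES 4

typedef struct { int valid; unsigned tag; } l1_line_t;
typedef struct { int valid; unsigned tag, set, age; } vc_line_t;

static l1_line_t l1[L1_SETS];
static vc_line_t vc[VC_ENTRIES];

/* Returns 1 on a general-cache hit, 2 on a victim-cache hit (the two
 * blocks are swapped), 0 on a full miss (block fetched from memory,
 * evicted block pushed into the victim cache, displacing the LRU). */
int cache_lookup(unsigned addr)
{
    unsigned set = addr % L1_SETS;
    unsigned tag = addr / L1_SETS;

    if (l1[set].valid && l1[set].tag == tag)
        return 1;                                   /* general-cache hit */

    for (size_t i = 0; i < VC_ENTRIES; i++) {
        if (vc[i].valid && vc[i].set == set && vc[i].tag == tag) {
            /* Victim hit: swap blocks instead of going to memory. */
            unsigned old_tag = l1[set].tag;
            int old_valid = l1[set].valid;
            l1[set].valid = 1;  l1[set].tag = tag;
            vc[i].valid = old_valid;  vc[i].tag = old_tag;
            vc[i].age = 0;                          /* mark MRU */
            return 2;
        }
    }

    /* Full miss: stash the block displaced from the general cache in
     * the LRU victim-cache slot, then refill from memory. */
    size_t lru = 0;
    for (size_t i = 1; i < VC_ENTRIES; i++)
        if (vc[i].age > vc[lru].age) lru = i;
    if (l1[set].valid) {
        vc[lru].valid = 1;  vc[lru].tag = l1[set].tag;
        vc[lru].set = set;  vc[lru].age = 0;
    }
    for (size_t i = 0; i < VC_ENTRIES; i++) vc[i].age++;  /* age all */
    l1[set].valid = 1;  l1[set].tag = tag;
    return 0;
}
```

A conflict pair that ping-pongs in a direct-mapped cache (two addresses mapping to the same set) turns every second miss into a cheap victim hit, which is exactly the case Jouppi's paper targets.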

HW3 How To
 Implement the victim cache in SimpleScalar

 Cache Software Architecture: cache.c / cache.h

 D-Cache L1 Hardware Architecture: load / store paths

sim-outorder.c
 ruu_issue(): for loads, first scan the LSQ to see if store-to-load forwarding is possible; if not, access the data cache.
 ruu_commit(): stores must retire their store value to the cache at commit.
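The load-issue scan can be sketched as below. These are hypothetical simplified structures for illustration, not the real RUU/LSQ stations in sim-outorder.c.

```c
#include <stdbool.h>

/* Illustrative LSQ entry; the real reservation stations carry much
 * more state (readiness, size, speculation flags, ...). */
typedef struct { bool is_store; unsigned addr; int value; } lsq_entry_t;

/* Scan older LSQ entries (youngest first) for a store to the load's
 * address. Returns true and forwards the value on a match; otherwise
 * the load must access the D-cache. */
bool lsq_forward(const lsq_entry_t *lsq, int load_idx,
                 unsigned addr, int *value)
{
    for (int i = load_idx - 1; i >= 0; i--) {
        if (lsq[i].is_store && lsq[i].addr == addr) {
            *value = lsq[i].value;      /* store-to-load forwarding */
            return true;
        }
    }
    return false;                        /* no match: go to the D-cache */
}
```

Scanning youngest-first matters: the load must see the most recent older store to its address, not just any.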

cache.c
 cache_create(): create and initialize a general cache structure
 cache_access(): access a cache

cache.c: cache_create()
 Check all cache parameters
 Allocate the cache structure
 Initialize user parameters
 Initialize cache stats
 Allocate data blocks
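The parameter-check step can be sketched as below; this is simplified (the real cache_create() also validates the replacement policy, latency, and more), and the names here are illustrative.

```c
#include <stdbool.h>

static bool is_pow2(int x) { return x > 0 && (x & (x - 1)) == 0; }

/* Basic geometry checks: set count and block size must be powers of
 * two so the set index and block offset can be extracted by masking. */
bool cache_params_ok(int nsets, int bsize, int assoc)
{
    return is_pow2(nsets) && is_pow2(bsize) && assoc > 0;
}
```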

cache.c: cache_access()
 Check for a hit
 Fast hit: the access is to the same block as the previous access
 Hit: access the cache; update the dirty bit (on a write hit); update the block list per the LRU policy
 Miss: select the appropriate block to replace; write back the replaced block's data; read the new data block; update the block tags; update the dirty status (if cmd = write)
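The hit/miss handling of a single set can be sketched as below, with simplified stand-in structures (not SimpleScalar's cache_blk_t) and LRU tracked by an age counter.

```c
#include <stdbool.h>

enum cmd { READ, WRITE };

typedef struct { bool valid, dirty; unsigned tag, age; } blk_t;

/* One access to a set of `assoc` ways; returns true on a hit.
 * `*writebacks` counts dirty blocks written back on replacement. */
bool access_set(blk_t *set, int assoc, unsigned tag, enum cmd cmd,
                int *writebacks)
{
    for (int i = 0; i < assoc; i++) set[i].age++;   /* age for LRU */

    for (int i = 0; i < assoc; i++) {
        if (set[i].valid && set[i].tag == tag) {
            if (cmd == WRITE) set[i].dirty = true;  /* update dirty */
            set[i].age = 0;                         /* mark MRU */
            return true;
        }
    }

    /* Miss: pick an invalid way if any, else the LRU way. */
    int victim = 0;
    for (int i = 1; i < assoc; i++) {
        if (!set[victim].valid) break;
        if (!set[i].valid || set[i].age > set[victim].age)
            victim = i;
    }
    if (set[victim].valid && set[victim].dirty)
        (*writebacks)++;                            /* write back old block */
    set[victim].valid = true;                       /* read new block */
    set[victim].tag = tag;                          /* update block tag */
    set[victim].dirty = (cmd == WRITE);             /* dirty if write */
    set[victim].age = 0;
    return false;
}
```

The victim-cache hook for this homework would go on the miss path, between victim selection and the memory refill.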

Other information…
 Most modifications in this homework: cache.c, cache.h
 Cache configuration (refer to hack_guide.pdf): <name>:<nsets>:<bsize>:<assoc>:<repl>
 Ex. dl1:8:32:2:l
  Name = dl1
  # of sets = 8
  Block size = 32 bytes
  # of ways (assoc) = 2
  Replacement policy = l (LRU)
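A configuration string in this format can be parsed with sscanf, roughly as below. This is a standalone illustration; the function name and struct are invented for the sketch, not taken from SimpleScalar's option-processing code.

```c
#include <stdio.h>

typedef struct {
    char name[16];
    int  nsets, bsize, assoc;
    char repl;              /* 'l' = LRU, 'f' = FIFO, 'r' = random */
} cache_cfg_t;

/* Parse "<name>:<nsets>:<bsize>:<assoc>:<repl>", e.g. "dl1:8:32:2:l".
 * Returns 1 on success, 0 on a malformed string. */
int parse_cache_cfg(const char *s, cache_cfg_t *cfg)
{
    return sscanf(s, "%15[^:]:%d:%d:%d:%c",
                  cfg->name, &cfg->nsets, &cfg->bsize,
                  &cfg->assoc, &cfg->repl) == 5;
}
```

For the example dl1:8:32:2:l, the total cache size is nsets × assoc × bsize = 8 × 2 × 32 = 512 bytes.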