Two Dimensional Highly Associative Level-Two Cache Design


Two Dimensional Highly Associative Level-Two Cache Design
Han Liu
Supervisor: Seok-Bum Ko
Electrical & Computer Engineering Department
2010-Mar-26

Outline
- Literature information
- Background
- Introduction to CAM
- Previous research
- 2D-Cache organization
- 2D-Cache operation
- Comparison

Literature Information
"Two Dimensional Highly Associative Level-Two Cache Design"
Chuanjun Zhang and Bing Xue
IEEE International Conference on Computer Design (ICCD), 2008.

Background
- High associativity is important for the L2 cache.
- Highly associative caches increase hardware cost: more comparators, more routing wires.
- Storing tags in CAM cells facilitates tag comparison, but CAM cells are power-hungry and consume a large area.
- LRU replacement is practical only in low-associativity caches.
- Random replacement needs only simple hardware, but has a higher miss rate than LRU.
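The LRU-vs-random trade-off above can be illustrated with a toy simulation. This is my own minimal sketch (a single fully associative set on a synthetic access pattern, nothing from the paper's experiments); the function name and trace are hypothetical:

```python
# Toy comparison of LRU vs. random replacement for one fully
# associative set. Illustrative only; not the paper's methodology.
import random
from collections import OrderedDict

def miss_rate(trace, ways, policy):
    cache, misses = OrderedDict(), 0
    for addr in trace:
        if addr in cache:
            if policy == "lru":
                cache.move_to_end(addr)  # refresh recency on a hit
        else:
            misses += 1
            if len(cache) == ways:
                if policy == "lru":
                    cache.popitem(last=False)              # evict least recently used
                else:
                    del cache[random.choice(list(cache))]  # evict a random block
            cache[addr] = True
    return misses / len(trace)

trace = [0, 1, 2, 3, 0, 1, 2, 4] * 50   # mostly-reusable working set
print("LRU:   ", miss_rate(trace, 4, "lru"))
print("random:", miss_rate(trace, 4, "random"))
```

On patterns with strong reuse, LRU typically keeps the hot blocks resident, while random replacement occasionally evicts them; the exact gap depends on the trace.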

Introduction to CAM
CAM: Content Addressable Memory. Where a RAM takes an address and returns the data stored there, a CAM takes a data word (here, a cache tag) and compares it against all stored entries in parallel, reporting which entries match.
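As a behavioral sketch (my own Python model, not the paper's circuit), a CAM can be viewed as an array of tags where a search compares the key against every valid entry at once; the class and method names below are hypothetical:

```python
# Behavioral CAM model: hardware compares all entries in one cycle;
# software models that with a scan over every stored tag.
class CAM:
    def __init__(self, num_entries):
        self.entries = [None] * num_entries  # stored tags (None = invalid)

    def write(self, index, tag):
        self.entries[index] = tag

    def search(self, tag):
        """Return the indices of all entries matching the search tag."""
        return [i for i, t in enumerate(self.entries) if t == tag]

cam = CAM(8)
cam.write(3, 0x1A)
cam.write(5, 0x1A)
print(cam.search(0x1A))  # -> [3, 5]
print(cam.search(0x2B))  # -> []
```

The parallel search is what makes CAM tags attractive for high associativity, and also why they cost so much power and area: every entry needs its own comparator.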

Previous Research
- CAM-based methods for low-associativity caches have been explored along several lines:
  - Low hardware complexity
  - Low miss rate
  - Low miss penalty
- Software-controlled fully associative caches offer better flexibility.
- This work focuses on an inexpensive highly associative L2 cache.

2D-Cache Organization
Cache blocks are arranged as a two-dimensional array, with tag lookup and replacement handled differently along the column and the row dimensions (see the operation slides that follow).

2D-Cache Operation (1/2)
- On a cache hit: search for the hit block. All columns are searched concurrently.
- On a cache miss: search for the victim block. Three cases can occur:
  - No CAM tag matches the desired address.
  - Each column has one CAM tag that matches the desired address.
  - Some columns have CAM tag hits while the remaining columns do not.
[Figure: 2D-Cache layout]
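The concurrent column search can be sketched as follows. This is a hypothetical behavioral model under my own naming (rows x columns of tags, one CAM tag per entry); the hardware searches all columns in parallel, while the model simply iterates:

```python
# Behavioral model of the 2-D lookup: every column is searched
# concurrently in hardware; here the columns are scanned in a loop.
class TwoDCache:
    def __init__(self, rows, cols):
        self.rows, self.cols = rows, cols
        # tags[r][c] is the tag stored at row r, column c (None = invalid)
        self.tags = [[None] * cols for _ in range(rows)]

    def lookup(self, tag):
        """Return (hit, (row, col)) for a matching block, else (False, None)."""
        for c in range(self.cols):       # all columns in parallel in hardware
            for r in range(self.rows):
                if self.tags[r][c] == tag:
                    return True, (r, c)
        return False, None

cache = TwoDCache(rows=4, cols=8)
cache.tags[2][5] = 0x7F
print(cache.lookup(0x7F))  # -> (True, (2, 5))
print(cache.lookup(0x10))  # -> (False, None)
```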

2D-Cache Operation (2/2)
Hybrid replacement: LRU within a column, random along the row dimension.
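One reading of "LRU in column, random in row" is: pick a column at random (the random step along the row dimension), then evict the least recently used entry within that column. The sketch below follows that reading; the function name and the age-counter bookkeeping are my own illustrative assumptions, not the paper's exact mechanism:

```python
# Hybrid victim selection sketch: random across columns, LRU within
# the chosen column. Age counters stand in for real LRU state.
import random

def choose_victim(lru_age, cols):
    """lru_age[r][c] = age counter (higher = older / less recently used)."""
    c = random.randrange(cols)                  # random step along the row
    rows = range(len(lru_age))
    r = max(rows, key=lambda r: lru_age[r][c])  # LRU within the column
    return r, c

ages = [[3, 0], [1, 7], [5, 2]]  # 3 rows x 2 columns
print(choose_victim(ages, cols=2))
```

The appeal of the hybrid is that full LRU state is only maintained per column (cheap, low associativity), while the random step keeps the cross-column hardware trivial.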

Benchmark Results (1/3)
26 SPEC2K benchmarks
CAM bits vs. miss rate

Benchmark Results (2/3)
Ways in row vs. miss rate (ways per column held constant at 8)

Benchmark Results (3/3)
Replacement policy vs. miss rate (Row-Column)

Comparison (1/3)
Energy and latency of different caches

Comparison (2/3)
IPC reduction over the 8-way caches

Comparison (3/3)
Energy reduction over the 8-way caches

Questions?
Thanks!