Lecture 6: Memory Hierarchy


CSCE 513 Computer Architecture
Topics: Cache overview
Readings: Appendix B
September 20, 2017

B.1 [10/10/10/15] <B.1> You are trying to appreciate how important the principle of locality is in justifying the use of a cache memory, so you experiment with a computer having an L1 data cache and a main memory (you focus exclusively on data accesses). The latencies (in CPU cycles) of the different kinds of accesses are as follows: cache hit, 1 cycle; cache miss, 105 cycles; main memory access with cache disabled, 100 cycles.

a. [10] <B.1> When you run a program with an overall miss rate of 5%, what will the average memory access time (in CPU cycles) be?
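One way to set up the computation for part (a), sketched with the standard average-memory-access-time identity (a worked illustration, not the textbook's official solution):

$$\text{AMAT} = (1 - m)\,t_{\text{hit}} + m\,t_{\text{miss}} = 0.95 \times 1 + 0.05 \times 105 = 6.2 \text{ cycles}$$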

b. [10] <B.1> Next, you run a program specifically designed to produce completely random data addresses with no locality. Toward that end, you use an array of size 256 MB (all of it fits in the main memory). Accesses to random elements of this array are made continuously (using a uniform random number generator to generate the element indices). If your data cache size is 64 KB, what will the average memory access time be?
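A minimal Python sketch of the expected-value computation, under the assumption that with uniformly random accesses the hit probability is roughly the fraction of the array that fits in the cache at any instant:

```python
# Assumption: uniformly random accesses to a 256 MB array mean the
# probability of hitting the 64 KB cache is about cache_size / array_size.
cache_bytes = 64 * 1024
array_bytes = 256 * 1024 * 1024
hit_cycles, miss_cycles = 1, 105

p_hit = cache_bytes / array_bytes              # 2**-12, about 0.0244%
amat = p_hit * hit_cycles + (1 - p_hit) * miss_cycles
print(f"p_hit = {p_hit:.6f}, AMAT = {amat:.2f} cycles")  # AMAT is ~104.97
```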

c. [10] <B.1> If you compare the result obtained in part (b) with the main memory access time when the cache is disabled, what can you conclude about the role of the principle of locality in justifying the use of cache memory?

d. [15] <B.1> You observed that a cache hit produces a gain of 99 cycles (1 cycle vs. 100), but it produces a loss of 5 cycles in the case of a miss (105 cycles vs. 100). In the general case, we can express these two quantities as G (gain) and L (loss). Using these two quantities (G and L), identify the highest miss rate beyond which using the cache would be disadvantageous.
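As a sketch of the break-even analysis (an illustration, not the official solution): the cache helps as long as the expected gain per access exceeds the expected loss, i.e.

$$(1 - m)\,G > m\,L \;\Longleftrightarrow\; m < \frac{G}{G + L},$$

so with $G = 99$ and $L = 5$ the break-even miss rate is $99/104 \approx 95.2\%$.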

B.2 [15/15] <B.1> For the purpose of this exercise, we assume that we have a 512-byte cache with 64-byte blocks. We will also assume that the main memory is 2 KB. We can regard the memory as an array of 64-byte blocks: M0, M1, …, M31. Figure B.30 sketches the memory blocks that can reside in different cache blocks if the cache were fully associative.

a. [15] <B.1> Show the contents of the table if the cache is organized as a direct-mapped cache.
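For orientation, the indexing arithmetic behind part (a), sketched from the given parameters: the cache holds $512/64 = 8$ block frames, so under direct mapping memory block $M_i$ can reside only in frame

$$\text{frame}(M_i) = i \bmod 8,$$

meaning, for example, that $M_0$, $M_8$, $M_{16}$, and $M_{24}$ all compete for frame 0.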

b. [15] <B.1> Repeat part (a) with the cache organized as a four-way set associative cache.
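A minimal Python sketch (an illustration, not a reproduction of Figure B.30) that enumerates where each memory block may reside under the two organizations:

```python
# Parameters from the exercise: 512-byte cache, 64-byte blocks, 2 KB memory.
NUM_CACHE_BLOCKS = 512 // 64    # 8 cache block frames
NUM_MEM_BLOCKS = 2048 // 64     # 32 memory blocks, M0..M31

def direct_mapped_frame(i: int) -> int:
    # Direct mapped: each memory block maps to exactly one frame.
    return i % NUM_CACHE_BLOCKS

def set_assoc_set(i: int, ways: int = 4) -> int:
    # Four-way set associative: 8 frames / 4 ways = 2 sets; a block may
    # occupy any of the 4 ways within its set.
    return i % (NUM_CACHE_BLOCKS // ways)

for i in range(NUM_MEM_BLOCKS):
    print(f"M{i:2d} -> frame {direct_mapped_frame(i)} (direct), "
          f"set {set_assoc_set(i)} (4-way)")
```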

C.6 [12/13/13/15/15] <C.1, C.2, C.3> We will now add support for register-memory ALU operations to the classic five-stage RISC pipeline. To offset this increase in complexity, all memory addressing will be restricted to register indirect (i.e., all addresses are simply a value held in a register; no offset or displacement may be added to the register value). For example, the register-memory instruction ADD R4, R5, (R1) means add the contents of register R5 to the contents of the memory location whose address equals the value in register R1, and put the sum in register R4. Register-register ALU operations are unchanged. The following items apply to the integer RISC pipeline:

a. [12] <C.1> List a rearranged order of the five traditional stages of the RISC pipeline that will support register-memory operations implemented exclusively by register indirect addressing.
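One commonly cited rearrangement, offered here as a sketch rather than the definitive answer, moves memory access ahead of execute; register indirect addressing needs no address computation, so MEM requires nothing from EX:

```
Classic:     IF  ID  EX   MEM  WB
Rearranged:  IF  ID  MEM  EX   WB
```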

b. [13] <C.2, C.3> Describe what new forwarding paths are needed for the rearranged pipeline by stating the source, destination, and information transferred on each needed new path.

c. [13] <C.2, C.3> For the reordered stages of the RISC pipeline, what new data hazards are created by this addressing mode? Give an instruction sequence illustrating each new hazard.
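For instance (an illustrative sketch, not the only possible sequence), in the rearranged pipeline an ALU result is produced in EX, now stage 4, yet may be needed as an address in MEM, now stage 3, of the very next instruction:

```
ADD R1, R2, R3      ; R1 produced in EX (stage 4)
ADD R4, R5, (R1)    ; R1 needed as the address in MEM (stage 3)
```

The consumer needs the value one stage earlier than the producer generates it, so no forwarding path can cover it without a stall.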

d. [15] <C.3> List all of the ways that the RISC pipeline with register-memory ALU operations can have a different instruction count for a given program than the original RISC pipeline. Give a pair of specific instruction sequences, one for the original pipeline and one for the rearranged pipeline, to illustrate each way.

e. [15] <C.3> Assume that all instructions take 1 clock cycle per stage. List all of the ways that the register-memory RISC can have a different CPI for a given program compared with the original RISC pipeline.