Lecture 41: Review Session #3

Reminders
– Office hours during final week
  TA as usual (Tuesday & Thursday 12:50pm-2:50pm)
  Hassan: Wednesday 1pm to 4pm or me for an appointment
– Final exam, Thursday 3:10pm Sloan 150
– Course evaluation (Blue Course Evaluation)
  Access through zzusis

Problem #9
How many total SRAM bits are required to implement a 256KB four-way set associative cache? The cache is physically indexed and has 64-byte blocks. Assume that there are 4 extra bits per entry: 1 valid bit, 1 dirty bit, and 2 LRU bits for the replacement policy. Assume that the physical address is 50 bits wide.

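Solution #9
The number of sets in the 256KB four-way set associative cache
– (256 * 2^10) / (4 * 64) = 1024
A set has four entries. Counting only the 4 extra bits and the data, each entry occupies 4 bits + 64*8 bits = 516 bits.
Each entry also stores a tag. With 64-byte blocks the byte offset takes 6 bits and 1024 sets need a 10-bit index, so the tag is 50 - 10 - 6 = 34 bits and a full entry occupies 516 + 34 = 550 bits.
The total number of SRAM bits required = 550 * 4 * 1024 = 2,252,800 (of which 516 * 4 * 1024 = 2,113,536 are data and extra bits).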
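
The arithmetic in Solution #9 can be checked with a short C sketch. The names `SramBudget`, `sram_budget` and `log2_exact` are hypothetical helpers introduced only for this illustration, and the sketch assumes the block size and the number of sets are powers of two. It reports the total both without and with the tag field so the two figures can be compared.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: log2 of an exact power of two. */
unsigned log2_exact(uint64_t v) {
    assert(v != 0 && (v & (v - 1)) == 0);   /* must be a power of two */
    unsigned n = 0;
    while (v > 1) {
        v >>= 1;
        n++;
    }
    return n;
}

/* Hypothetical result record for the storage estimate. */
typedef struct {
    uint64_t sets;               /* number of sets in the cache          */
    unsigned tag_bits;           /* tag width per entry                  */
    uint64_t bits_without_tags;  /* data + status bits only              */
    uint64_t bits_with_tags;     /* data + status + tag bits             */
} SramBudget;

SramBudget sram_budget(uint64_t cache_bytes, unsigned ways,
                       unsigned block_bytes, unsigned status_bits,
                       unsigned addr_bits) {
    SramBudget b;
    b.sets = cache_bytes / ((uint64_t)ways * block_bytes);

    unsigned offset_bits = log2_exact(block_bytes);
    unsigned index_bits = log2_exact(b.sets);
    b.tag_bits = addr_bits - index_bits - offset_bits;

    uint64_t entries = b.sets * ways;
    uint64_t data_and_status = (uint64_t)block_bytes * 8 + status_bits;
    b.bits_without_tags = entries * data_and_status;
    b.bits_with_tags = entries * (data_and_status + b.tag_bits);
    return b;
}
```

Calling `sram_budget(256 * 1024, 4, 64, 4, 50)` corresponds to the cache in Problem #9: 1024 sets and a 34-bit tag.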

Problem #10
Design a 128KB direct-mapped data cache that uses a 32-bit address and 16 bytes per block. Calculate the following:
(a) How many bits are used for the byte offset?
(b) How many bits are used for the set (index) field?
(c) How many bits are used for the tag?

Solution #10
(a) How many bits are used for the byte offset? 4 bits (16 bytes per block = 2^4)
(b) How many bits are used for the set (index) field? 13 bits (128KB / 16 bytes = 8192 = 2^13 blocks, one block per set)
(c) How many bits are used for the tag? 15 bits (32 - 13 - 4)
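
The field widths above follow one rule: offset = log2(block size), index = log2(number of sets), tag = whatever remains of the address. A minimal C sketch of that rule is given here; `AddrFields`, `addr_fields` and `ilog2_pow2` are hypothetical names, and the sketch assumes every quantity is a power of two. A direct-mapped cache is treated as the 1-way case.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: log2 of an exact power of two. */
unsigned ilog2_pow2(uint64_t v) {
    assert(v != 0 && (v & (v - 1)) == 0);
    unsigned n = 0;
    while (v > 1) {
        v >>= 1;
        n++;
    }
    return n;
}

/* Hypothetical record holding the three address field widths. */
typedef struct {
    unsigned offset_bits;
    unsigned index_bits;
    unsigned tag_bits;
} AddrFields;

/* num_blocks: total blocks in the cache; ways: 1 for direct-mapped. */
AddrFields addr_fields(uint64_t num_blocks, unsigned ways,
                       unsigned block_bytes, unsigned addr_bits) {
    AddrFields f;
    uint64_t sets = num_blocks / ways;
    f.offset_bits = ilog2_pow2(block_bytes);
    f.index_bits = ilog2_pow2(sets);
    f.tag_bits = addr_bits - f.index_bits - f.offset_bits;
    return f;
}
```

For the cache in Problem #10 the call would be `addr_fields(128 * 1024 / 16, 1, 16, 32)`, since the capacity has to be converted to a block count first.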

Problem #11
Design an 8-way set associative cache that has 16 blocks and 32 bytes per block. Assume a 32-bit address. Calculate the following:
(a) How many bits are used for the byte offset?
(b) How many bits are used for the set (index) field?
(c) How many bits are used for the tag?

Solution #11
(a) How many bits are used for the byte offset? 5 bits (32 bytes per block = 2^5)
(b) How many bits are used for the set (index) field? 1 bit (16 blocks / 8 ways = 2 sets)
(c) How many bits are used for the tag? 26 bits (32 - 1 - 5)
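
Once the field widths are known, splitting a concrete address is a matter of shifts and masks. The sketch below illustrates this for a 32-bit address; `SplitAddr` and `split_address` are hypothetical names, and the sample address 0x12345678 mentioned afterwards is an arbitrary value chosen for illustration, not one taken from the problem.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical record for the three parts of a 32-bit address. */
typedef struct {
    uint32_t tag;
    uint32_t index;
    uint32_t offset;
} SplitAddr;

SplitAddr split_address(uint32_t addr, unsigned offset_bits,
                        unsigned index_bits) {
    /* The tag must keep at least one bit, so the shifts stay below 32. */
    assert(offset_bits + index_bits < 32);

    SplitAddr s;
    /* Low bits select the byte within the block. */
    s.offset = addr & ((1u << offset_bits) - 1u);
    /* Middle bits select the set. */
    s.index = (addr >> offset_bits) & ((1u << index_bits) - 1u);
    /* Remaining high bits form the tag. */
    s.tag = addr >> (offset_bits + index_bits);
    return s;
}
```

With the widths from Solution #11 (5 offset bits, 1 index bit), `split_address(0x12345678, 5, 1)` yields offset 24, index 1 and tag 0x48D159.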

Problem #12
    int i;
    int a[1024*1024];
    int x=0;
    for(i=0;i<1024;i++) {
        x+=a[i]+a[1024*i];
    }
Consider the code snippet above. Suppose that it is executed on a system with a 2-way set associative 16KB data cache with 32-byte blocks, 32-bit words, and an LRU replacement policy. Assume that int is word-sized. Also assume that the address of ‘a’ is 0x0, that ‘i’ and ‘x’ are in registers, and that the cache is initially empty. How many data cache misses are there?

Solution #12
The number of sets in the cache = (16 * 2^10) / (2 * 32) = 256.
Since a word is 4 bytes, int is word-sized, and a cache block is 32 bytes, 8 ints fit in a cache block. Therefore all the ints in ‘a’ from a[0] to a[1023] map to the cache lines of sets 0 to 127, while all the ints from a[1024] to a[1024*2 - 1] map to sets 128 to 255. Similarly, the array elements a[1024*2] to a[1024*3 - 1] map to cache lines of sets 0 to 127, a[1024*3] to a[1024*4 - 1] map to cache lines of sets 128 to 255, and so on.
In the loop, every access to a[i] where ‘i’ is a multiple of 8 is a miss. Therefore the number of misses due to a[i] accesses inside the loop is 1024/8 = 128.
All accesses to a[1024*i] within the loop are misses except the very first one (a[0] has already been brought into the cache). This is because these accesses map alternately to sets 0 and 128, each touches a different block, and so each is a cold miss the first time it is referenced. That gives 1023 misses.
The total number of misses = 128 + 1023 = 1151.
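
The miss count can be cross-checked by replaying the loop's address stream through a tiny model of the cache. The following C sketch is only an illustration under the problem's assumptions (array base 0x0, 4-byte ints, a[i] read before a[1024*i] in each iteration); `CacheSet`, `access_cache` and `count_misses` are hypothetical names, and the one-field LRU bookkeeping works only because the cache is 2-way.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_SETS    256   /* 16KB / (2 ways * 32 bytes) */
#define NUM_WAYS    2
#define BLOCK_BYTES 32

typedef struct {
    int valid[NUM_WAYS];
    uint32_t tag[NUM_WAYS];
    int lru;              /* way to replace next (least recently used) */
} CacheSet;

static CacheSet cache[NUM_SETS];

/* Look up one byte address; returns 1 on a miss, 0 on a hit. */
int access_cache(uint32_t addr) {
    uint32_t block = addr / BLOCK_BYTES;
    uint32_t set = block % NUM_SETS;
    uint32_t tag = block / NUM_SETS;
    CacheSet *s = &cache[set];

    for (int w = 0; w < NUM_WAYS; w++) {
        if (s->valid[w] && s->tag[w] == tag) {
            s->lru = 1 - w;      /* the other way is now the LRU one */
            return 0;
        }
    }
    int victim = s->lru;         /* fill or replace the LRU way */
    s->valid[victim] = 1;
    s->tag[victim] = tag;
    s->lru = 1 - victim;
    return 1;
}

/* Replay the data accesses of the loop in Problem #12. */
int count_misses(void) {
    memset(cache, 0, sizeof cache);   /* cache starts empty */
    int misses = 0;
    for (uint32_t i = 0; i < 1024; i++) {
        misses += access_cache(4u * i);          /* a[i]      */
        misses += access_cache(4u * 1024u * i);  /* a[1024*i] */
    }
    return misses;
}
```

Under these assumptions `count_misses()` returns 1151, matching the 128 + 1023 breakdown above.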

Problem #13
Give a concise answer to each of the following questions. Limit your answers to words.
(a) What is memory mapped I/O?
(b) Why is DMA an improvement over CPU programmed I/O?
(c) When would DMA transfer be a poor choice?
(d) What are the two characteristics of program memory accesses that caches exploit?
(e) What are three types of cache misses?
(f) In what pipeline stage is the branch target buffer checked?
(g) What needs to be stored in a branch target buffer in order to eliminate the branch penalty for an unconditional branch?
  – Address of branch target
  – Address of branch target and branch prediction
  – Instruction at branch target

Problem #14
(True/False) A virtual cache access time is always faster than that of a physical cache.
(True/False) High associativity in a cache reduces compulsory misses.
(True/False) Both DRAM and SRAM must be refreshed periodically using a dummy read/write operation.
(True/False) A write-through cache typically requires less bus bandwidth than a write-back cache.
(True/False) Cache performance is of less importance in faster processors because the processor speed compensates for the high memory access time.
(True/False) Memory interleaving is a technique for reducing memory access time through increased bandwidth utilization of the data bus.

What else?
– Midterm 2 & midterm 1 questions
– Homework assignments
  Solutions for all assignments will be sent to your wsu by Monday!