COSC2410: LAB 19 INTRODUCTION TO MEMORY/CACHE DIRECT MAPPING


Introduction
- What is cache?
- What is main memory?
- Extra question: Why are we currently moving towards 64-bit processors instead of 32-bit?

Types of Cache
There is a broad range of caches:
- Disk cache
- Web cache
- CPU cache
- DNS caching
- Database caching, etc.
We are only going to be dealing with the CPU cache.

Types of Cache Mapping
- Direct
- Fully associative
- Set associative

Facts we are assuming
- RAM is divided into blocks of memory locations. Memory is grouped into 2^n-byte blocks, where n is the number of bits used to uniquely identify where a piece of data lies within a block.
- The cache is organized into lines, each containing enough space to store EXACTLY ONE block of data and a tag uniquely identifying where that block came from. (It may also include some extra bits, such as flags.)

What this means – Main Memory

[Figure: main memory drawn as a column of blocks, Block 0, Block 1, ..., Block 2^N - 2, Block 2^N - 1, each in a different color]

We can see that our memory is separated into different blocks (each a different color). N is the number of bits used to identify a particular block. If we have N = 4, we can have a total of 2^N = 16 blocks (i.e., block number 0 to block number 15). In this example, there are 4 DIFFERENT memory addresses belonging to the same block, which means we need 2 bits to figure out where an address is within the block.

What this means – Cont'd

Block          Address     Block identification bits   Offset
Block 0        0x00000     00...00                     00
               0x00001     00...00                     01
               0x00002     00...00                     10
               0x00003     00...00                     11
Block 1        0x00004     00...01                     00
               0x00005     00...01                     01
               0x00006     00...01                     10
               0x00007     00...01                     11
...and so on, until we get to the last row:
Block 2^n - 1  0xFFFFC     11...11                     00
               0xFFFFD     11...11                     01
               0xFFFFE     11...11                     10
               0xFFFFF     11...11                     11

Different Parts of an Address
An address is a set of bits that together point to a specific memory location. Generally, it is split into 3 parts:
1. Tag – identifies the block from among all the blocks that map to the same index
2. Index – identifies which line of the cache we write the block into
3. Offset – identifies where in the block our memory location is

| Tag | Index | Offset |   <- layout of an address
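To make the split concrete, here is a minimal C sketch of extracting the three fields with shifts and masks. The field widths (4 index bits, 2 offset bits) are example values chosen for illustration, not values fixed by this lab.

```c
#include <stdio.h>

/* Illustrative sketch: splitting an address into tag, index, and offset
 * with shifts and masks. The widths below are assumed example values. */
#define OFFSET_BITS 2
#define INDEX_BITS  4

int main(void) {
    unsigned addr = 0x1A7;

    unsigned offset = addr & ((1u << OFFSET_BITS) - 1);
    unsigned index  = (addr >> OFFSET_BITS) & ((1u << INDEX_BITS) - 1);
    unsigned tag    = addr >> (OFFSET_BITS + INDEX_BITS);

    printf("addr 0x%x -> tag 0x%x, index 0x%x, offset 0x%x\n",
           addr, tag, index, offset);
    return 0;
}
```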

Simplifying it
- Each block may hold more than one byte of data, which means many memory locations can be present in a single block of memory.
- For simplicity, we assume that a block is the smallest unit of memory (in the case of byte-addressable memory, 1 byte). This makes it easier to see how the cache works. (It also means we don't require any bits for the offset, so our memory address is split into just tag and index.)

Direct Mapping
Each memory block is assigned a specific line in the cache. If a line is already taken up by a block when a new block needs to be loaded, the old block is replaced.

Direct Mapping

[Figure: memory blocks colored by cache line; every red block maps only to the red line in the cache, every blue block only to the blue line, and so on]

If the cache contains 2^k blocks, the k least significant bits are used as the index (as we are assuming there is no offset). It's easy to find where a memory address i will go. We simply use: i mod 2^k.
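As a sanity check on that rule, here is a small C sketch; k, the number of index bits, is a parameter, and the k = 3 in main is just an example value.

```c
#include <stdio.h>

/* Sketch of the rule above: with no offset bits, memory block i maps to
 * cache line i mod 2^k, where k is the number of index bits. */
static unsigned cache_line_for_block(unsigned i, unsigned k) {
    return i % (1u << k);          /* same as i & ((1u << k) - 1) */
}

int main(void) {
    /* With k = 3 (8 cache lines), blocks 5, 13, 21, ... all collide. */
    printf("%u %u %u\n", cache_line_for_block(5, 3),
           cache_line_for_block(13, 3), cache_line_for_block(21, 3));
    return 0;
}
```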

Direct Mapping – Tags and Validity
- We need a way of checking whether the cache has a valid entry, so we use a flag. If the flag is set, the entry is valid; if it is not, the entry doesn't exist.
- Since multiple memory blocks can be written to the same cache line, we need a way to identify where a block came from. To do this, we introduce a tag.
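One way to picture a cache line in code is the struct below. This layout is an illustrative assumption for the sketch, not a hardware specification; BLOCK_SIZE is 1 byte, matching the simplification used in these slides.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative sketch of a direct-mapped cache line. */
#define BLOCK_SIZE 1   /* bytes per block; the slides assume 1 for now */

typedef struct {
    bool     valid;              /* flag: does this line hold real data? */
    uint32_t tag;                /* which memory block is cached here    */
    uint8_t  data[BLOCK_SIZE];   /* the cached block itself              */
} cache_line;
```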

Tags and Validity

[Figure: a direct-mapped cache showing, for each line, a valid bit, a tag, and the cached data]

Question:
If we increase the number of memory block locations in the previous slide from 16 to 32, but keep the cache size the same, how many bits are required for the tag?
A) The number of tag bits won't change. It will still be 2.
B) We needed 2 bits for 16; since we are adding 16 more blocks, we need another 2 bits. So 4.
C) We just need another bit to represent the additional 16 locations, so the answer is 3.
D) We need 2k bits to represent the new number of locations, so the answer should be 5.
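One way to check your reasoning is to compute tag bits = (block-number bits) - (index bits). A hedged C sketch, assuming a 4-line cache (an assumption, but consistent with the 2-bit tag that option A mentions):

```c
#include <stdio.h>

/* log2 of a power-of-two n */
static unsigned log2u(unsigned n) {
    unsigned bits = 0;
    while (n > 1) { n >>= 1; bits++; }
    return bits;
}

int main(void) {
    unsigned cache_lines = 4;   /* assumed: a 2-bit index, which is what
                                   makes the old tag 2 bits as in option A */
    printf("16 blocks -> %u tag bits\n", log2u(16) - log2u(cache_lines));
    printf("32 blocks -> %u tag bits\n", log2u(32) - log2u(cache_lines));
    return 0;
}
```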

Question 2:
Which has more blocks: the cache or main memory?
Ans: Main memory

Question 3:
Consider a byte-addressable machine with 16-bit addresses and a cache with the following characteristics:
- It is direct-mapped
- Each block holds exactly one byte
- The cache index is 4 bits long
How many blocks does the cache hold? How many bits of storage are required to build the cache?

Question 3a:
How many blocks does the cache hold?
Ans: 4-bit index -> 2^4 = 16 blocks

Question 3b:
How many bits of storage are required to build the cache?
Ans: Tag bits = 12 (16-bit address - 4-bit index)
(12 tag bits + 1 validity bit + 8 data bits) x 16 blocks = 21 bits x 16 blocks = 336 bits
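The same arithmetic as a runnable C check; all the values come straight from the question.

```c
#include <stdio.h>

/* Runnable check of Question 3: 16-bit addresses, 1-byte blocks,
 * 4-bit index, direct-mapped. */
int main(void) {
    int address_bits = 16, index_bits = 4, offset_bits = 0;
    int blocks    = 1 << index_bits;                          /* 16  */
    int tag_bits  = address_bits - index_bits - offset_bits;  /* 12  */
    int line_bits = tag_bits + 1 /* valid */ + 8 /* data */;  /* 21  */

    printf("%d blocks, %d bits per line, %d bits total\n",
           blocks, line_bits, blocks * line_bits);            /* 336 */
    return 0;
}
```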

Note: Using >1 byte/block
- If we consider a block to be exactly 1 byte in size, the block is the smallest unit of memory. In this case we don't need an offset; however, the number of bits required for the tag and index will increase.
- If we use 32-bit addressing, then: tag bits + index bits + offset bits = 32. (If a block is 1 byte, the number of offset bits is 0.)
- How do we get the number of bits for the offset? It is based on the block size. If a block is 4 bytes, that means 4 addresses belong to the block. How many bits will you need? (See the sketch below.)
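A short C sketch of that rule: the offset needs log2(block size in bytes) bits, because that many bits distinguish the addresses inside one block. The block_size of 4 is the example from the note above.

```c
#include <stdio.h>

/* Sketch: count how many bits are needed to address within one block. */
int main(void) {
    unsigned block_size = 4, offset_bits = 0;
    while ((1u << offset_bits) < block_size) offset_bits++;
    printf("a %u-byte block needs %u offset bits\n", block_size, offset_bits);
    return 0;
}
```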

Question 4: Multiple bytes/block
Suppose we have a 32-bit processor. Thus we have a main memory of 4 GB (2^32 bytes), with each byte directly addressable by a 32-bit address. If we divide our memory into 32-byte blocks, how many blocks do we have in memory? The answer can be a power of 2.
Ans: 32 = 2^5, thus the number of blocks = 2^32 / 2^5 = 2^27

Question 5:
Suppose we are using the memory from Question 4. We now have a cache of size 512 KB (2^19 bytes). How many cache lines do we have? The answer can be a power of 2.
Ans: A cache line is the same size as a block: 2^19 / 2^5 = 2^14 cache lines

Question 6:
Following the previous question, how many bits are required to represent the index?
Ans: 14

Question 7:
Following the previous question, how many memory blocks are mapped to the same position in the cache? The answer can be a power of 2.
Ans: Number of memory blocks / number of cache lines = 2^27 / 2^14 = 2^13

Question 8:
Following the previous questions, how many bits are required to represent the tag?
Ans: 13

Question 9:
Continuing from the previous problem: if we use a validity bit and 8 bits (1 byte) of data per byte in the block, how many bits long would each cache line be?
Ans: 13 tag bits + 1 validity bit + (8 data bits x 2^5 bytes in one block) = 270 bits
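Questions 4 through 9 are all arithmetic on bit counts, so a single C sketch can reproduce each answer. The parameter values are the ones given in Questions 4 and 5.

```c
#include <stdio.h>

/* Recap of Questions 4-9: 32-bit addresses, 32-byte blocks,
 * 512 KB direct-mapped cache. */
int main(void) {
    int addr_bits   = 32;
    int offset_bits = 5;                  /* 32-byte blocks = 2^5 */
    int cache_bits  = 19;                 /* 512 KB = 2^19 bytes  */

    int mem_block_bits  = addr_bits  - offset_bits;      /* 2^27 blocks */
    int index_bits      = cache_bits - offset_bits;      /* 2^14 lines  */
    int blocks_per_line = mem_block_bits - index_bits;   /* 2^13        */
    int tag_bits        = addr_bits - index_bits - offset_bits;   /* 13 */
    int line_bits       = tag_bits + 1 + 8 * (1 << offset_bits);  /* 270 */

    printf("memory blocks: 2^%d, cache lines: 2^%d\n",
           mem_block_bits, index_bits);
    printf("blocks per line: 2^%d, tag bits: %d, line: %d bits\n",
           blocks_per_line, tag_bits, line_bits);
    return 0;
}
```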

Spatial Locality
WHAT IS IT?
WHY IS IT IMPORTANT?
HOW CAN WE TAKE ADVANTAGE OF IT?

Taking advantage of Spatial Locality

[Figure: memory addresses laid out with their index and offset bits marked]

Taking advantage of Spatial Locality

[Figure: memory locations 16 and 17 sharing one block, with the cache lines labeled by tag and index]

Suppose we need to access memory location 17. We would write the entire block containing 17 to the corresponding cache line.

Taking advantage of Spatial Locality

Now when we check for address 16 (10000), we know our tag is going to be 1 bit (i.e., 1), since we use 1 bit for the offset and 3 bits for the index. We check for the tag at index location 000 (taken from the address). If the tag matches, we have a hit.
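The same decomposition as a runnable C sketch, using the 5-bit example addresses 16 and 17: 1 tag bit, 3 index bits, 1 offset bit.

```c
#include <stdio.h>

/* Addresses 16 = 0b10000 and 17 = 0b10001 share tag 1 and index 000
 * but differ in the offset bit. */
int main(void) {
    for (unsigned addr = 16; addr <= 17; addr++) {
        unsigned offset = addr & 0x1;         /* lowest bit        */
        unsigned index  = (addr >> 1) & 0x7;  /* next 3 bits       */
        unsigned tag    = addr >> 4;          /* remaining top bit */
        printf("addr %2u: tag=%u index=%u offset=%u\n",
               addr, tag, index, offset);
    }
    return 0;
}
```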

Taking advantage of Spatial Locality
- The important thing to notice is that each block consists of 2 different pieces of data (each one byte).
- Each individual byte still needs 5 bits to represent: 4 bits to determine which block it belongs to (and which cache line that block writes to), and 1 bit as the offset.
- We transfer the entire block into the cache line. As we can see, the cache is split the same way as the memory. (For example, if we need to put the data at address 2 into the cache, the data from both 2 and 3 is transferred. If we need to put the data at address 7 into the cache, the data from both 6 and 7 is transferred.)
- Thus, we do not need the offset to determine which line of the cache the block writes into, just the block identification bits (tag and index). The data from all memory locations in a block is written to the cache line. (A runnable sketch of this behaviour follows.)
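To tie the example together, here is a minimal direct-mapped cache simulation in C with the slide's toy parameters (5-bit addresses, 2-byte blocks, 8 lines, 1 tag bit). It is a sketch for intuition, not a model of real hardware; note how accessing 17 first makes the later access to 16 a hit.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Toy direct-mapped cache: 32 bytes of memory, 2-byte blocks
 * (1 offset bit), 8 cache lines (3 index bits), 1 tag bit. */
#define LINES      8
#define BLOCK_SIZE 2

typedef struct {
    bool    valid;
    uint8_t tag;
    uint8_t data[BLOCK_SIZE];
} line_t;

static uint8_t memory[32];
static line_t  cache[LINES];

/* Read one byte through the cache. On a miss the whole 2-byte block is
 * copied in -- that is the spatial-locality payoff. */
static uint8_t read_byte(unsigned addr) {
    unsigned offset = addr & 0x1;         /* lowest bit    */
    unsigned index  = (addr >> 1) & 0x7;  /* next 3 bits   */
    unsigned tag    = addr >> 4;          /* remaining bit */
    line_t *l = &cache[index];

    if (!l->valid || l->tag != tag) {     /* miss: fetch the block */
        unsigned base = addr & ~0x1u;
        memcpy(l->data, &memory[base], BLOCK_SIZE);
        l->tag   = (uint8_t)tag;
        l->valid = true;
        printf("miss at %u (loaded bytes %u and %u)\n", addr, base, base + 1);
    } else {
        printf("hit at %u\n", addr);
    }
    return l->data[offset];
}

int main(void) {
    for (unsigned i = 0; i < 32; i++) memory[i] = (uint8_t)i;
    read_byte(17);   /* miss: loads bytes 16 and 17 into line 000 */
    read_byte(16);   /* hit: 16 arrived together with 17          */
    return 0;
}
```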