CSC1016 Coursework Clarification
Derek Mortimer, March 2010

Contents
– Clarification of the specification
– Example Output
– Cache Layouts
– Caching Schemes
  – Direct Mapped
  – Fully Associative
– Implementation Notes

Clarification
You are not implementing a “complete” cache. You are simulating the hit and miss rates for a cache under the direct mapped and fully associative schemes:
– Example: if address 135 were used to access the cache, would it have been a hit or a miss? And so on…
The hit and miss counts for a given sequence of addresses will always be the same if your schemes are implemented correctly.

Clarification
You do not need to deal with “real” data in the cache, but you will need to keep track of the meta-data for the cache blocks, including:
– Cache tags
– Validity bits
– LRU information (for the fully associative cache)
For a cache with 32 blocks, you will need to keep track of 32 tags, 32 validity bits and the LRU information. You don’t need to model every byte in the cache; representing every block is enough.

Example Output
The specification states that for any memory address A, you may represent its contents in the cache as M(A).
Example: an 8-byte cache with 2-byte blocks means the cache has 4 blocks (0, 1, 2, 3). We access the cache using address 32, and suppose this maps to cache block 1; then we use address 24, mapping to block 3. At this point, blocks 1 and 3 have been accessed but 0 and 2 have not.

Example Output
Following from the previous slide, your output would look similar to:
Cache { [0]->EMPTY [1]->M(32) [2]->EMPTY [3]->M(24) }
as 32 and 24 were the addresses used, and they caused cache data to be stored in blocks 1 and 3.
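A minimal sketch of how that output could be produced, assuming the simulator keeps an array with one entry per block, where an unused block is null and a used block remembers the last address stored in it (the name printCache and this layout are illustrative, not part of the specification):

// Print the cache in the format shown above.
// 'blocks' has one entry per cache block: null means the block has never
// been filled, otherwise it holds the address whose data is cached there.
function printCache(blocks, hits, misses) {
    var parts = blocks.map(function (addr, i) {
        return "[" + i + "]->" + (addr === null ? "EMPTY" : "M(" + addr + ")");
    });
    console.log("Cache { " + parts.join(" ") + " }");
    console.log("Hits: " + hits + ", Misses: " + misses);
}

// The state reached above: 32 went into block 1, 24 into block 3 (two cold misses).
printCache([null, 32, null, 24], 0, 2);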

Cache Layouts
There are three key pieces of information describing a cache’s layout:
1. Size of the entire cache (in bytes)
2. Size of a single cache block (in bytes)
3. Number of blocks within the cache
Number of blocks = size of cache / size of block
For example: 128 bytes / 32 bytes = 4 blocks

Cache Layouts
When talking about cache size, block size and number of blocks, it is more useful to talk about them in terms of exponents of a given base.
– For a decimal cache, the base is always 10
– For a binary cache, the base is always 2
The equations are exactly the same whether you work in decimal or binary. However… computers work in binary, so you should too.

Cache Layouts
We use N to describe the size of the entire cache, so:
– 2^N = cache size, in bytes (binary)
– 10^N = cache size, in bytes (decimal)
We use M to describe the size of the blocks within the cache, so:
– 2^M = block size, in bytes (binary)
– 10^M = block size, in bytes (decimal)

Cache Layouts
We use I to describe the number of blocks within the cache, so:
– 2^I = number of blocks (binary)
– 10^I = number of blocks (decimal)
If we know N and M, we can always work out I:
– 2^N / 2^M = 2^I (binary)
– 10^N / 10^M = 10^I (decimal)
Math says that a^x / a^y is the same as a^(x-y), so we can say that I = N – M; this always holds.

Cache Layouts
To summarize, for any cache you want to simulate, you must specify:
– Whether you are working in binary or decimal, so you know whether to use 2 or 10 as the base
– N: 2^N or 10^N = how big your cache will be
– M: 2^M or 10^M = how big your blocks will be
– I: 2^I or 10^I = how many blocks you will have (I = N – M)
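As a rough sketch of how these layout values relate, assuming a small helper (hypothetically named cacheLayout) that is given the base and the two exponents:

// Work out the sizes a cache layout implies from its exponents.
// base is 2 for a binary cache, 10 for a decimal one.
function cacheLayout(base, N, M) {
    var I = N - M;                        // exponent for the number of blocks
    return {
        cacheSize: Math.pow(base, N),     // whole cache, in bytes
        blockSize: Math.pow(base, M),     // one block, in bytes
        numBlocks: Math.pow(base, I),     // blocks in the cache
        I: I
    };
}

console.log(cacheLayout(2, 5, 3));   // 32-byte binary cache, 8-byte blocks: 4 blocks
console.log(cacheLayout(10, 3, 2));  // 1000-byte decimal cache, 100-byte blocks: 10 blocks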

Example Layouts
A binary cache, 32 bytes big with 8-byte blocks. Work out N, M and I:
– N: 2^5 = 32, so N = 5
– M: 2^3 = 8, so M = 3
– I: N – M = 5 – 3, so I = 2
I is correct because 2^I = 2^2 = 4* and 2^N / 2^M = 2^5 / 2^3 = 32/8 = 4*
*NOTE: 4 is the number of blocks and thus the number of tags and validity bits we need to keep track of.

Example Layouts
A decimal cache, 1000 bytes big with 100-byte blocks. Work out N, M and I:
– N: 10^3 = 1000, so N = 3
– M: 10^2 = 100, so M = 2
– I: N – M = 3 – 2, so I = 1
I is correct because 10^I = 10^1 = 10* and 10^N / 10^M = 10^3 / 10^2 = 1000/100 = 10*
*NOTE: 10 is the number of blocks and thus the number of tags and validity bits we need to keep track of.

Block Sizes
If you have a cache with blocks larger than 1 byte (meaning M is greater than 0), each block within the cache will represent more than 1 byte of information, meaning more than 1 address maps to each block.
When given some address A to access the cache with, if we know M, we can work out the Group Address (GA):
– The group address is a common prefix that all addresses referencing the same block will share

Block Sizes
Example: a binary cache, 32 bytes big with 8-byte blocks (so 4 blocks).
N = 5, M = 3 and I = 2; 2^5 = 32, 2^3 = 8 and 2^2 = 4.
If address 27 is used to access a block within the cache, this means 7 other addresses also point to that block, as 8 (2^3) bytes would be contained within the block.

Block Sizes
How do we turn an address into a group address?
– We know that for N=5, M=3 and I=2, 8 addresses reference the same block. Because the example is binary, we will look at the addresses around 27 in binary:
24 = 11000    28 = 11100
25 = 11001    29 = 11101
26 = 11010    30 = 11110
27 = 11011    31 = 11111

Block Sizes
You will notice that the first 2 digits are the same for all of these addresses. This means we need to eliminate the last 3 digits from each address.
– Remember what M equals? 3. So we know that to turn any address into a group address we want to remove the last M digits from it (in binary or decimal, of course).

Block Sizes
How do we remove the last M digits from a number?
– Divide by 2^M (binary)
– Divide by 10^M (decimal)
To turn any address into its group address you simply do:
– GA = A / 2^M (binary)
– GA = A / 10^M (decimal)
NOTE: You do not need to “convert” your numbers to binary for the binary cache; dividing by 2^M will work as if you had. I wrote them out in binary only to illustrate the common prefix.
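In code this is a single integer division; a minimal sketch (the name groupAddress is illustrative), using Math.floor for the round-down:

// Drop the last M digits of an address to get its group address.
// base is 2 (binary) or 10 (decimal).
function groupAddress(address, M, base) {
    return Math.floor(address / Math.pow(base, M));
}

console.log(groupAddress(27, 3, 2));    // 3   (addresses 24..31 share group 3)
console.log(groupAddress(1234, 2, 10)); // 12  (addresses 1200..1299 share group 12)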

Block Sizes
Example: a decimal cache, 1000 bytes big with 100-byte blocks (so 10 blocks).
N = 3, M = 2 and I = 1; 10^3 = 1000, 10^2 = 100 and 10^1 = 10.
If address 1234 is used to access a block within the cache, this means 99 other addresses also point to that block, as 100 (10^2) bytes would be contained within the block.

Block Sizes
We know that M = 2 and 10^2 = 100, so to remove the last 2 digits from the address 1234, we divide it by 10^2: 1234 / 100 = 12 (note: integer division, round down).
This would turn every address from 1200 to 1299 (100 addresses) into the group address 12, the common prefix.

Block Sizes
To summarize:
– If working in binary, to turn any address into its group address you divide it by 2^M
– If working in decimal, to turn any address into its group address you divide it by 10^M
– Because this is integer division, always round the result down in dynamically typed languages
– Don’t worry if you don’t fully understand it yet; just remember you have to do it

Cache Access
For both schemes, the structure of your solution will be as follows:
– Given an address A, turn A into its corresponding GA
– Does the cache contain a block with GA’s data in it?
  Yes? Record a hit. No? Record a miss.
– Record that A was used to access the cache
– Repeat for another address A until done
– At the end, print out the contents of the cache, the number of hits and the number of misses
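A sketch of that outer loop, assuming the scheme-specific work is wrapped in an access function that returns true for a hit and false for a miss (the names here are illustrative, not from the specification):

// Feed every address to the cache, counting hits and misses.
// accessCache stands in for the direct mapped or fully associative routine.
function simulate(addresses, accessCache) {
    var hits = 0, misses = 0;
    addresses.forEach(function (address) {
        if (accessCache(address)) { hits += 1; } else { misses += 1; }
    });
    console.log("Hits: " + hits + ", Misses: " + misses);
}

// Dummy cache that never hits, just to show the shape of the loop:
simulate([24, 32, 24], function (address) { return false; });  // Hits: 0, Misses: 3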

Cache Access
What differs between the Direct Mapped and Fully Associative schemes?
1. How you check whether the cache contains GA
2. How and where you record GA within the cache

Direct Mapped
In a direct mapped scheme, an address A will always reference the same cache block. This is achieved through a sequence of steps:
1. Calculate the Group Address (GA)
2. Calculate the cache Index and Tag from the GA
3. Use the Tag and Validity Bit at the calculated index to check for a hit or miss
4. Update the cache data and try the next address
NOTE: The index you calculate refers to one of your cache blocks; it will always be between 0 (inclusive) and the number of blocks in the cache (exclusive), due to modulo arithmetic.

Direct Mapped
How do you work out the index and tag?
– For a given GA, the last I digits are the index; everything before that is the tag. This is why we need to know N and M: so we can work out I and then work out the index and tag for any address.
– Where we used division to get rid of digits for the group address, we can use modulo to extract them, meaning:
  Cache Index = GA % 2^I (binary)     Cache Tag = GA / 2^I (binary)
  Cache Index = GA % 10^I (decimal)   Cache Tag = GA / 10^I (decimal)
NOTE: As with previous examples, you must round the division down if using JavaScript.
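A small sketch of that split, assuming the group address has already been computed (indexAndTag is an illustrative name):

// Split a group address into index (last I digits) and tag (everything before).
// base is 2 or 10; Math.floor rounds the division down.
function indexAndTag(ga, I, base) {
    var blocks = Math.pow(base, I);
    return {
        index: ga % blocks,             // which block this address maps to
        tag: Math.floor(ga / blocks)    // what must be stored there for a hit
    };
}

// Binary cache with N=5, M=3, I=2: address 24 has GA 3, giving index 3, tag 0.
console.log(indexAndTag(3, 2, 2));  // { index: 3, tag: 0 }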

DM Example
Binary cache, N = 5, M = 3, thus I = 2. A 32-byte cache with 8-byte blocks gives 4 blocks.
4 blocks means we need to keep track of 4 tags and 4 validity bits (vBits). The tags are initially empty and the vBits are initialised to false.
We will ‘access’ the cache using a set of addresses and record whether each caused a hit or a miss. The initial output might look like this:
Cache { [0] -> EMPTY [1] -> EMPTY [2] -> EMPTY [3] -> EMPTY }
Hits: 0, Misses: 0

DM Example
Access with address = 24:
GA = A / 2^M = 24 / 8 = 3
Index = GA % 2^I = 3 % 4 = 3
Tag = GA / 2^I = 3 / 4 = 0 (rounded down)
Is the vBit set at the index?
– vBit[3] == false
If the vBit is set, do the tags match?
– vBit was false, so do not check
Hit if both were true; otherwise, miss.
– This is a miss
Log a miss and update the data* to say 24 was used to access the cache:
– vBit[3] = true, tag[3] = 0, address[3] = 24
*Set the vBit at 3 to true, store the tag at 3 and remember A=24 was used to access block 3 last.

DM Example
Following the previous slide, the output would now look like:
Cache { [0] -> EMPTY [1] -> EMPTY [2] -> EMPTY [3] -> M(24) }
Hits: 0, Misses: 1
as we used address 24, which resulted in a miss and new information being stored in block 3. Subsequent accesses may cause hits or misses and alter the contents of any of the blocks within the cache…

DM Example
If 24, or any of 25–31 (all addresses within the group), were used to access the cache again, the result would be a hit, until another address mapped into the same block with a different tag.
Misses can happen because nothing is in the block whose index you have just calculated (the vBit would be false), OR because the tag currently stored at that index (from a previous access) does not match the tag you just calculated (from the new address and group address).
– This happens when your addresses can be larger than your cache size. E.g. in a cache 32 bytes big, addresses 0, 32, 64, 96 and 128 will all map to the same block (modulo arithmetic), so we need to check tags as well as vBits at the calculated index.
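Putting the direct mapped steps together, here is one possible sketch. The cache state as parallel arrays of vBits, tags and last addresses, and the name dmAccess, are assumptions for illustration, not the required design:

// Direct mapped access: returns true on a hit, false on a miss, and
// records that this address was the latest to touch its block.
// cache = { vBits: [...], tags: [...], addresses: [...] }, one entry per block.
function dmAccess(cache, address, M, I, base) {
    var ga = Math.floor(address / Math.pow(base, M));   // group address
    var blocks = Math.pow(base, I);
    var index = ga % blocks;                            // block to look in
    var tag = Math.floor(ga / blocks);                  // tag that must match

    var hit = cache.vBits[index] && cache.tags[index] === tag;

    // On a hit the vBit/tag assignments change nothing; on a miss they refill the block.
    cache.vBits[index] = true;
    cache.tags[index] = tag;
    cache.addresses[index] = address;   // kept only so the cache can be printed as M(A)
    return hit;
}

// 32-byte binary cache, 8-byte blocks (4 blocks), initially empty:
var dm = { vBits: [false, false, false, false],
           tags: [null, null, null, null],
           addresses: [null, null, null, null] };
console.log(dmAccess(dm, 24, 3, 2, 2));  // false: miss, block 3 now holds tag 0
console.log(dmAccess(dm, 27, 3, 2, 2));  // true: hit, 27 is in the same group as 24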

Fully Associative
In a fully associative caching scheme, the steps are conceptually simpler. The idea with FA is to store the information in the first available space.
– This means addresses do not always end up in the same block, so you do not need to calculate a cache index

Fully Associative
Fully associative schemes work the following way:
1. Calculate the group address from the address
2. Use the entire group address as the tag
3. Scan the entire cache to see whether the tag is already in the cache
4. If the tag is found, record a hit
5. If the tag is not found, record a miss and store the tag in the first available block (the first block with vBit = false is empty)
6. If there are no spaces available, remove the LEAST recently used (LRU) block from the cache
7. Update the cache so the new tag is stored (if there was a miss) and remember this block is now the MOST recently used: set its vBit to true, store the tag in the block, and remember which address was used originally
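One possible sketch of fully associative access. It folds the “first available block” and LRU bookkeeping together by keeping the occupied blocks in an array ordered from most to least recently used (the first of the two LRU approaches described on the Least Recently Used slide below); the names and this layout are illustrative, not required:

// Fully associative access: returns true on a hit, false on a miss.
// cache = { capacity: <number of blocks>, entries: [{tag, address}, ...] },
// with entries ordered most recently used first.
function faAccess(cache, address, M, base) {
    var tag = Math.floor(address / Math.pow(base, M));  // whole GA is the tag

    // Scan every occupied block for the tag.
    var pos = cache.entries.findIndex(function (e) { return e.tag === tag; });

    if (pos !== -1) {
        // Hit: move this block to the front, making it the most recently used.
        var entry = cache.entries.splice(pos, 1)[0];
        entry.address = address;
        cache.entries.unshift(entry);
        return true;
    }

    // Miss: if the cache is full, the entry at the end is the LRU block; evict it.
    if (cache.entries.length === cache.capacity) {
        cache.entries.pop();
    }
    cache.entries.unshift({ tag: tag, address: address });
    return false;
}

// Same 32-byte, 8-byte-block cache (4 blocks):
var fa = { capacity: 4, entries: [] };
console.log(faAccess(fa, 24, 3, 2));  // false: miss, tag 3 stored
console.log(faAccess(fa, 27, 3, 2));  // true: hit, same group address as 24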

FA Example
Same cache as the DM example:
– The initial output would be the same: an empty cache with 0 hits and 0 misses
Access with A = 24:
GA = A / 2^M = 24 / 8 = 3
Tag = GA
Is 3 currently in the cache?
– No, it is not
Store 3 in the first available block (block 0, whose vBit is false) and set the vBit for block 0 to true.

FA Example
Following the previous slide, the output would now look like:
Cache { [0] -> M(24) [1] -> EMPTY [2] -> EMPTY [3] -> EMPTY }
Hits: 0, Misses: 1
as we used address 24, which resulted in a miss and new information being stored in the first available block, 0. Subsequent accesses may cause new tags to be placed into the cache; you will need to make sure that the least recently used blocks are the ones that get replaced.

Cache Data
For any cache with 2^I or 10^I blocks, you will need to keep track of 2^I or 10^I validity bits and tags.
2^I or 10^I will change depending on the cache size and block size (I changes as N and M do).
Collections (e.g. arrays and lists) are ideal constructs for storing this information.
In DM you will know the index by working it out from the group address; for FA you will need to scan the entire collection looking for the tag, an empty block or a block to replace.
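A minimal sketch of setting that up with parallel arrays, one entry per block (makeCacheState is an illustrative name, not part of the specification):

// Per-block bookkeeping for a cache with numBlocks blocks.
function makeCacheState(numBlocks) {
    return {
        vBits: new Array(numBlocks).fill(false),     // nothing is valid yet
        tags: new Array(numBlocks).fill(null),       // no tags stored yet
        addresses: new Array(numBlocks).fill(null)   // last address per block, for printing M(A)
    };
}

console.log(makeCacheState(4));  // ready for a 4-block cache (e.g. N=5, M=3)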

Least Recently Used
How do you keep track of the least recently used block for fully associative caches? Two usual ways:
1. Keep the blocks in order, from most recently used to least recently used. When you need to replace a block, you remove the old one from the end and add the new one at the beginning. When you get a hit, move the hit block to the beginning of the cache.
2. Store ‘time’ information for each block in the cache so you can find the least recently used one. This requires you to store more information and is easy to get wrong.
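The ordered-list approach is what the fully associative sketch above already does (move a block to the front on every access, evict from the end). For completeness, a sketch of the second approach, storing a ‘last used’ tick per block; the names are illustrative:

// Approach 2: remember when each block was last used and evict the smallest tick.
// lastUsed[i] is 0 for a block that has never been used.
var tick = 0;

function touch(lastUsed, index) {
    tick += 1;
    lastUsed[index] = tick;          // this block is now the most recently used
}

function lruIndex(lastUsed) {
    var victim = 0;
    for (var i = 1; i < lastUsed.length; i++) {
        if (lastUsed[i] < lastUsed[victim]) { victim = i; }
    }
    return victim;                   // smallest tick: least recently used (or still empty)
}

var lastUsed = [0, 0, 0, 0];
touch(lastUsed, 0); touch(lastUsed, 2); touch(lastUsed, 0);
console.log(lruIndex(lastUsed));     // 1: blocks 1 and 3 were never used, 1 comes first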

Summary
Use these slides in conjunction with the coursework specification and the examples in your CSC1016 notes.
– In the DM and FA examples I did not cover the other things that can happen (hits, and misses where the tags do not match)
– You should do some pen-and-paper examples to understand how both schemes work
– Your resulting code will be quite small, but it will require you to understand what is going on

Final Notes
Do NOT use string manipulation (including converting numbers to binary strings) to solve this coursework.
– Hardware caches work using arithmetic; you need to as well
It is possible to have your code deal with binary and decimal without changing your calculations (all that changes is the base, 10 or 2).