Computer Architecture Foundations for Graduate Level Students
Basic Paradigm
The next several slides step through a diagram of the memory hierarchy: hard disk (HD), main memory (MM), cache, and CPU, showing how data moves between the levels:
– Data is transferred between HD, MM, and the cache
– The CPU needs a particular piece of data
– If the required data is found in the cache, the CPU reads it directly
– When the required data is not in the cache, it is first brought into the cache; the CPU ultimately gets the data from the cache
– If the data is in neither the cache nor MM, it is brought from HD to MM and then into the cache
– If MM is full, or if the cache is full, existing contents must be replaced, swapping data between MM and the cache
Access Time
If every memory reference to the cache required the transfer of one word between MM and the cache, no increase in speed would be achieved. In fact, speed would drop, because on top of the MM access there is an additional access to the cache.
Suppose instead that a reference is repeated n times and that, after the first reference, the location is always found in the cache.
Cache Hit Ratio
The probability that a word will be found in the cache
Depends upon the program and on the size and organization of the cache
h = (number of times the required word is found in the cache) / (total number of references)
h: hit ratio
Access Time
t_a = average access time
t_c = cache access time
t_m = main memory access time
(1 - h) = miss ratio
With these definitions, the average access time is t_a = t_c + (1 - h)·t_m: the cache is always accessed, and a miss adds a main-memory access.
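As a quick sanity check on this formula, here is a minimal sketch in Python; the specific timing values below are illustrative assumptions, not from the slides:

# Average access time: the cache is always probed, and a miss
# (probability 1 - h) adds a main-memory access on top of it.
def average_access_time(t_c, t_m, h):
    return t_c + (1.0 - h) * t_m

# Illustrative numbers: 10 ns cache, 100 ns main memory, hit ratio 0.9
print(average_access_time(10, 100, 0.9))   # 20.0 ns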
Fetch Mechanisms
Demand fetch
– Fetch a block from memory only when it is needed and is not in the cache
Prefetch
– Fetch one or more blocks from memory before they are requested
Selective fetch
– Do not always fetch blocks; depending on some defined criterion, certain blocks are kept in MM rather than brought into the cache
Data in the cache must eventually be replaced with data from MM
Blocks (groups of memory addresses) are transferred from MM to the cache
The cache has a limited capacity, measured in page frames
Replacement Algorithms
When the word being requested by the CPU is not in the cache, it must be transferred from MM (or, similarly, from secondary memory into MM)
A page fault occurs when a page or block is not in the cache (or not in MM, in the case of secondary memory)
Replacement algorithms determine which page/block to remove or overwrite
Characteristics
Usage based or non-usage based
– Usage based: the choice of page/block to replace depends on how many times each page/block has been referenced
– Non-usage based: some other criterion is used for replacement
Assumptions
For a given page size, we only need to consider the page/block number
If we have a reference (hit) to a page p, then any immediately succeeding references to p do not cause a page fault
The size of memory/cache is expressed as the number of pages it can hold (page frames)
Example
Consider the following sequence of address references:
0110, 0432, 0101, 0612, 0102, 0103, 0104, 0101, 0611, 0102, 0103, 0302
At 100 bytes per page, this can be reduced to the following access string:
1, 4, 1, 6, 1, 6, 1, 3
This sequence of page requests is called a reference string.
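The reduction from addresses to a reference string can be sketched in a few lines. This is my own illustration in Python; the address list and the 100-byte page size come from the example above, while the function name is made up:

# Convert an address trace into a reference string: map each address to its
# page number and drop immediately repeated references to the same page.
def to_reference_string(addresses, page_size=100):
    pages = []
    for addr in addresses:
        page = addr // page_size
        if not pages or pages[-1] != page:
            pages.append(page)
    return pages

trace = [110, 432, 101, 612, 102, 103, 104, 101, 611, 102, 103, 302]
print(to_reference_string(trace))   # [1, 4, 1, 6, 1, 6, 1, 3]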
Replacement Policies
Random replacement
First-in first-out (FIFO) replacement
Optimal algorithm
Least recently used (LRU)
Least frequently used (LFU)
Most frequently used (MFU)
Random Replacement
A page is chosen randomly at page fault time
There is no relationship between the pages or their use
The choice is made by a random number generator
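A minimal sketch of random replacement in Python (my own illustration; it simply counts page faults):

import random

# Count page faults when the victim is chosen at random.
def random_faults(reference_string, n_frames):
    frames, faults = [], 0
    for page in reference_string:
        if page in frames:
            continue                                   # hit
        faults += 1
        if len(frames) < n_frames:
            frames.append(page)                        # a free frame is available
        else:
            frames[random.randrange(n_frames)] = page  # evict a randomly chosen page
    return faults

Because the victim is chosen at random, repeated runs on the same reference string can give different fault counts.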
FIFO
Memory is treated as a queue
– When a page comes in, it is inserted at the tail
– When a page is removed, the entry at the head of the queue is deleted
Easy to understand and program
Performance is not consistently good; it depends on the reference string
FIFO Example
Consider the following reference string: 7 0 1 2 0 3 0 4 2, with a page frame of 3.

Reference:  7   0   1   2   0   3   0   4   2
Fault:      *   *   *   *       *   *   *   *
            7   0   1   2   2   3   0   4   2
                7   0   1   1   2   3   0   4
                    7   0   0   1   2   3   0

(each column shows the pages in memory after that reference, most recently loaded on top)
An * indicates a miss (the page requested by the CPU is not in the cache or in MM); there are 8 page faults.
FIFO Example #2
Consider the following reference string: 1 2 3 4 1 2 5 1 2 3 4 5, with a page frame of 3.

Reference:  1   2   3   4   1   2   5   1   2   3   4   5
Fault:      *   *   *   *   *   *   *           *   *
            1   2   3   4   1   2   5   5   5   3   4   4
                1   2   3   4   1   2   2   2   5   3   3
                    1   2   3   4   1   1   1   2   5   5

We have 9 page faults.
Try performing this FIFO with a page frame of 4.
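The tables above can be reproduced with a short simulation. A minimal sketch in Python (my own illustration, not part of the slides):

from collections import deque

# Count page faults under FIFO replacement.
def fifo_faults(reference_string, n_frames):
    queue = deque()                                    # head = oldest resident page
    resident = set()
    faults = 0
    for page in reference_string:
        if page in resident:
            continue                                   # hit: FIFO order is not updated
        faults += 1
        if len(queue) == n_frames:
            resident.discard(queue.popleft())          # evict the oldest page
        queue.append(page)
        resident.add(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(refs, 3))   # 9, as in the table above
print(fifo_faults(refs, 4))   # 10: more frames, more faults (see Belady's anomaly below)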
Belady's Anomaly
An increase in the number of page frames does not necessarily mean a decrease in page faults
More formally, Belady's anomaly reflects the fact that, for some page-replacement algorithms, the page fault rate may increase as the number of allocated frames increases
Optimal Algorithm
The page that will not be used for the longest period of time is replaced
Guarantees the lowest page fault rate for a fixed number of frames
Difficult to implement because it requires future knowledge of the reference string
Optimal Algorithm Example
Consider the following reference string: 7 0 1 2 0 3 0 4 2, with a page frame of 3.
When 2 is referenced, we look ahead and see that 7 will not be used again, so we replace 7. Later, after our first hit, when 3 is referenced, we do not replace 0 but rather 1, because 1 will not be referenced again (0 and 2 will be, with 2 referenced last).

Reference:  7   0   1   2   0   3   0   4   2
Fault:      *   *   *   *       *       *
            7   0   1   2   2   3   3   4   4
                7   0   1   1   2   2   3   3
                    7   0   0   0   0   2   2

We have 6 page faults.
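A minimal sketch of the optimal policy in Python (my own illustration; it assumes the whole reference string is known in advance, which is exactly why the algorithm is impractical):

# Count page faults under the optimal policy: on a miss with full frames, evict the
# resident page whose next use lies farthest in the future (or never comes at all).
def opt_faults(reference_string, n_frames):
    frames, faults = [], 0
    for i, page in enumerate(reference_string):
        if page in frames:
            continue                                   # hit
        faults += 1
        if len(frames) < n_frames:
            frames.append(page)
            continue
        future = reference_string[i + 1:]
        def next_use(p):
            return future.index(p) if p in future else float("inf")
        victim = max(frames, key=next_use)
        frames[frames.index(victim)] = page
    return faults

print(opt_faults([7, 0, 1, 2, 0, 3, 0, 4, 2], 3))   # 6, matching the table above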
Least Recently Used
Approximates the optimal algorithm
Replaces the page that has not been used for the longest period of time
The pages can be kept in a list ordered by recency: whenever a page is referenced (including on a hit), it is placed at the tail to indicate it has been recently accessed; when all page frames are in use, the page at the head of the list is the one replaced
LRU Example
Consider the following reference string: 7 0 1 2 0 3 0 4 0 3 0 2, with a page frame of 3.

Reference:  7   0   1   2   0   3   0   4   0   3   0   2
Fault:      *   *   *   *       *       *               *
            7   0   1   2   0   3   0   4   0   3   0   2
                7   0   1   2   0   3   0   4   0   3   0
                    7   0   1   2   2   3   3   4   4   3

(each column shows the pages in memory after that reference, most recently used on top)
We have 7 page faults.
Try performing this LRU with a page frame of 4.
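A minimal sketch of LRU in Python, using an ordered dictionary as the recency list (my own illustration, not from the slides):

from collections import OrderedDict

# Count page faults under LRU: head = least recently used, tail = most recently used.
def lru_faults(reference_string, n_frames):
    frames = OrderedDict()
    faults = 0
    for page in reference_string:
        if page in frames:
            frames.move_to_end(page)          # hit: move to the tail (most recently used)
            continue
        faults += 1
        if len(frames) == n_frames:
            frames.popitem(last=False)        # evict the head (least recently used)
        frames[page] = None
    return faults

print(lru_faults([7, 0, 1, 2, 0, 3, 0, 4, 0, 3, 0, 2], 3))   # 7, matching the table above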
Least Frequently Used
Counts the number of references made to each page; when a page is accessed, its counter is incremented by one
The page with the smallest count is replaced
FIFO is used to resolve a tie
Rationale: a page with a bigger counter is an actively used page
Problem
– A page that is heavily used initially may never be used again, yet its large count keeps it in memory
– This can be solved by using a decaying counter
LFU Example
Consider the following reference string: 7 0 1 2 0 3 0 4 0 3 0 2, with a page frame of 3. Each entry below is shown as page:count.

Reference:  7    0    1    2    0    3    0    4    0    3    0    2
Fault:      *    *    *    *         *         *                   *
            7:1  0:1  1:1  2:1  2:1  3:1  3:1  4:1  4:1  4:1  4:1  2:1
                 7:1  0:1  1:1  1:1  2:1  2:1  3:1  3:1  3:2  3:2  3:2
                      7:1  0:1  0:2  0:2  0:3  0:3  0:4  0:4  0:5  0:5

We have 7 page faults.
Try performing this LFU with a page frame of 4.
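A minimal sketch of LFU with a FIFO tie-break in Python (my own illustration; as in the table above, a page's counter starts over at 1 if it is evicted and later reloaded):

# Count page faults under LFU: evict the resident page with the smallest reference
# count; among equal counts, the page that was loaded earliest (FIFO) is chosen.
def lfu_faults(reference_string, n_frames):
    count = {}        # page -> reference count while resident
    loaded_at = {}    # page -> time it was brought in (FIFO tie-break)
    faults = 0
    for time, page in enumerate(reference_string):
        if page in count:
            count[page] += 1                  # hit: bump the counter
            continue
        faults += 1
        if len(count) == n_frames:
            victim = min(count, key=lambda p: (count[p], loaded_at[p]))
            del count[victim]
            del loaded_at[victim]
        count[page] = 1
        loaded_at[page] = time
    return faults

print(lfu_faults([7, 0, 1, 2, 0, 3, 0, 4, 0, 3, 0, 2], 3))   # 7, matching the table above

MFU is the same sketch with min replaced by max on the count, so that the page with the highest count is the one evicted.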
Most Frequently Used
The opposite of LFU: the page with the highest count is replaced
A tie is resolved using FIFO
Based on the argument that the page with the smallest count has probably just been brought in and is yet to be used
Both LFU and MFU are uncommon, and their implementation is expensive
The Central Processing Unit
The operating hub and heart of every computer system
Composed of
– Control Unit
– Datapath
Each component inside the CPU has a specific role in executing a command
Communicates with the other components of the system
Control Unit (CU)
Regulates all activities inside the machine
Serves as the "nerve center" that sends control signals to other units and senses their status
Connected to all components in the CPU as well as to main memory
How The CU Is Connected
[Diagram: the Control Unit connects to the Registers and ALU within the CPU, and to Main Memory.]
Inside the CPU: The Datapath
[Diagram: the datapath consists of the Registers and the ALU.]
Registers
Components used for data storage (can be read from or written to)
High-speed memory locations used to store important information during CPU operations
Two types
– Special
– General-purpose
Special Registers
Registers used for specific purposes
Used heavily during the execution of CPU instructions
General Purpose Registers
CPU registers used as a "scratch pad" during execution of machine-level instructions
Their number varies between processors
Arithmetic Logic Unit (ALU)
Performs all mathematical and logical operations within the CPU
Operands not in the CPU would have to be retrieved from main memory
CPU-Memory Coordination
Bus: a group of wires that connects separate components
Types of bus:
– Control bus (control signals)
– Address bus (address information)
– Data bus (instructions/data)
CPU-Memory Coordination
The different buses facilitate communication between the CPU and main memory
The actions of the two components are highly synchronized to ensure efficient and timely execution of instructions
CPU Operations
Instructions do not reside in the CPU; they have to be fetched from memory
Each machine-level instruction is broken down into a logical sequence of smaller steps
CPU Operations
Instructions are carried out by performing one or more of the following functions in some pre-specified sequence
– Retrieving data from main memory
– Putting data into main memory
– Register data transfer
– ALU operation
How An Instruction Is Processed
Instruction is retrieved from memory
Analyze what the instruction is and how to execute it
Operands/parameters (if any) are fetched from main memory
Instruction is executed
Results are stored (CPU or MM)
Prepare for next instruction
Instruction Processing Example
Fetch the instruction from memory
Decode it (it turns out to be an ADD)
Get the two numbers to add from MM
Perform the addition
Store the result (where will it be stored?)
Prepare for the next instruction
Processing Data in Clusters
Information is organized into groups of fixed-size data that can be stored and retrieved in a single, basic operation
Each group of n bits is referred to as a word of information
Access to each word requires a distinct name (location/address)
"Word" can also refer to a characteristic of other components (e.g. the size of a bus)
Word Length
The size of a word, specified in bits, is known as the word length
Possible benefits of a large word length:
– Faster processing (more data and/or instructions can be fetched at a time)
– Greater numeric precision
– More powerful instructions (e.g. instructions can have more operands)
Machine Language
Composed of
– Instructions
– Data (instruction parameters)
Instructions and data are represented by a stream of 1s and 0s
Cumbersome to deal with when preparing programs, so programmers use hexadecimal numbers
In some computers, both instruction and data are stored in a single memory location
Assembly: An Improvement on Machine Language
Symbols, called mnemonics, are used to represent instructions
Sample instruction: add (105),(8), where "add" is the instruction and "(105),(8)" are the instruction parameters
Advantage:
– Easier recall of instructions
Disadvantage:
– Mnemonics and hexadecimal numbers still need to be converted back to binary
How Programs Are Loaded Into Main Memory
Programs are loaded as binary numbers
Assumptions:
– An instruction is represented by a 2-digit hexadecimal number (e.g. add by 1A, mov by A0)
– Word length of instruction parameters: 3 hex digits (12 bits)

Address:    100        101        102        103        104        105
Contents:   A01040F2   1A105008   A00F2200   01000000   00000009   …

The word at address 100 in binary is 1010 0000 0001 0000 0100 0000 1111 0010: the first byte (1010 0000 = A0) is the instruction, and the two 12-bit fields that follow (0001 0000 0100 = 104 and 0000 1111 0010 = 0F2) are the instruction parameters.
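Under the assumptions above (a 2-hex-digit instruction followed by two 3-hex-digit parameters in each word), a word such as A01040F2 can be pulled apart as sketched below. This is my own illustration in Python; the mnemonic table contains only the two opcodes mentioned in the slides:

MNEMONICS = {0xA0: "mov", 0x1A: "add"}    # the only opcodes given in the slides

# Split an 8-hex-digit word into its opcode and two 12-bit parameters.
def decode_word(word_hex):
    opcode = int(word_hex[0:2], 16)
    p1 = int(word_hex[2:5], 16)
    p2 = int(word_hex[5:8], 16)
    return MNEMONICS.get(opcode, "???"), p1, p2

print(decode_word("A01040F2"))    # ('mov', 260, 242), i.e. mov with parameters 104 and 0F2 hex
print(decode_word("1A105008"))    # ('add', 261, 8),   i.e. add with parameters 105 and 008 hex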
Executing Multiple Programs
Programs share processor time
Time slicing
Supported by modern CPUs