Published by Cody Lindsey. Modified over 9 years ago.
1
COSC 3330/6308 Second Review Session Fall 2012
2
Instruction Timings
For each of the following MIPS instructions, check the cycles that the instruction does not skip. (4×5 points: 5 points for each correct line)

Instruction      IF   ID/RR   ALU   MEM   WB
add r1, r2, r3   X    X       X           X
slt r1, r2, r3   X    X       X           X
ld r1, d(r2)     X    X       X     X     X
st r1, d(r2)     X    X       X     X
7
Conditional branch
What is missing in the following diagram sketching the datapaths of the non-pipelined version of the conditional branch instruction? (2×5 points)
[Diagram: datapaths of the non-pipelined conditional branch instruction]
Answer: the "Shift left 2" unit and the "Add" unit.
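In the standard MIPS datapath, the two units named on these slides compute the branch target: the sign-extended 16-bit offset is shifted left by 2 and added to PC + 4. The sketch below assumes those standard semantics, which the slides do not spell out; the function names are hypothetical.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(pc, offset16):
    """Branch target = (PC + 4) + (sign-extended offset shifted left by 2)."""
    shifted = sign_extend16(offset16) << 2      # role of the "Shift left 2" unit
    return (pc + 4 + shifted) & 0xFFFFFFFF      # role of the "Add" unit, 32-bit wraparound


print(hex(branch_target(0x00400000, 3)))        # forward branch, offset +3 words
print(hex(branch_target(0x00400010, 0xFFFE)))   # backward branch, offset -2 words
```

The shift converts a word offset into a byte offset, which is why the offset field can reach four times farther than its 16 bits would suggest.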
11
Immediate instructions
Remember that the MIPS instruction set has a variety of immediate instructions such as addi r1, r2, im, which stores into r1 the sum of the contents of register r2 and the immediate value im. Show on the following diagram what the datapaths for that instruction would be. (3×5 points)
[Diagram for addi r1, r2, im, with labels: Register file, Sign-extended immediate, ALU]
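A minimal sketch of what addi computes, matching the three labelled units: one operand is read from the register file, the immediate is sign-extended, and the ALU adds them. The register list and function names are hypothetical, and the overflow trap of the real addi instruction is ignored here (the result simply wraps to 32 bits).

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(imm):
    """Interpret the low 16 bits of imm as a signed two's-complement number."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def addi(regs, rd, rs, imm):
    """regs[rd] = regs[rs] + sign_extend(imm), truncated to 32 bits."""
    operand = regs[rs]                      # read from the register file
    immediate = sign_extend16(imm)          # sign-extended immediate
    regs[rd] = (operand + immediate) & MASK32   # ALU addition, written back


regs = [0] * 32
regs[2] = 100
addi(regs, 1, 2, 0xFFFF)   # immediate 0xFFFF is -1 after sign extension
print(regs[1])             # 99
```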
15
Pipelining
Consider the following pair of MIPS instructions:
sub r3, r1, r2
add r4, r3, r6
Show how the second instruction will proceed when bypassing is not implemented. (5 points)
16
Pipelining w/o bypassing
Steps: 1 2 3 4 5 6 7
sub r3, r1, r2: IF, ID/RR, ALU, WB
add r4, r3, r6: IF, ID/RR, ALU, WB
The add cannot perform its register read (ID/RR) before the new value of register r3 can be read.
18
Pipelining Show how the second instruction will proceed if bypassing is implemented.
19
Pipelining with bypassing
Steps: 1 2 3 4 5 6 7
sub r3, r1, r2: IF, ID/RR, ALU, WB
add r4, r3, r6: IF, ID/RR, ALU, WB
21
More pipelining
Consider the following pair of MIPS instructions:
lw r3, d(r1)
add r4, r3, r6
Show how the second instruction will proceed when bypassing is not implemented. (5 points)
22
Without bypassing
Steps: 1 2 3 4 5 6 7
lw r3, d(r1): IF, ID/RR, ALU, MEM, WB
add r4, r3, r6: IF, ID/RR, ALU
The add cannot perform its register read (ID/RR) before the new value of register r3 can be read.
24
More pipelining Show how the second instruction will proceed if bypassing is implemented.
25
With bypassing
Steps: 1 2 3 4 5 6 7
lw r3, d(r1): IF, ID/RR, ALU, MEM, WB
add r4, r3, r6: IF, ID/RR, ALU, WB
The add cannot perform its register read (ID/RR) before the new value of register r3 can be read.
27
A last word about data hazards
Which single MIPS instruction can cause the worst data hazards? (5 points)
lw (load word into register): it goes through all cycles before updating its register.
29
The comparator
The MIPS architecture we have discussed in class includes a small comparator that checks whether the two register read outputs are equal or not.
  Which MIPS instructions use this comparator? (5 points)
  Why do they use this comparator instead of the ALU? (5 points)
  How is this comparator implemented? (5 points)
30
The comparator
  The comparator is used by the beq and bne instructions.
  They use it so that the branch decision can be made one step earlier.
  It XORs the two 32-bit values, then ORs together all the bits of the result: the output is 0 exactly when the two values are equal.
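The XOR-then-OR construction can be sketched in software as follows. This only models the logic, not the hardware timing; the function name is hypothetical.

```python
def registers_equal(a, b):
    """Model of the comparator: XOR the two 32-bit values, then OR all result bits."""
    diff = (a ^ b) & 0xFFFFFFFF        # bitwise XOR: a 1 wherever the inputs differ
    any_bit_set = 0
    for i in range(32):                # OR-reduction over the 32 result bits
        any_bit_set |= (diff >> i) & 1
    return any_bit_set == 0            # 0 means no bit differs, so the values are equal


print(registers_equal(5, 5))            # True: beq would branch
print(registers_equal(5, 4))            # False: bne would branch
```

No carry chain is involved, which is consistent with the slide's point that this unit is much faster than a subtraction in the ALU.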
31
Without special unit
beq    IF   ID/RR   ALU     MEM     WB
next        IF      ID/RR   ABORT
next                IF      ABORT
dest                        IF      ID/RR   ALU
Must wait until the end of the ALU step of beq to know whether we will branch or not.
32
With special unit
beq    IF   ID/RR   ALU     MEM     WB
next        IF      ABORT
dest                IF      ID/RR   ALU
Since the special unit is very fast, we know whether we will branch or not by the end of the ID/RR step.
33
Disk reliability
What do we mean when we say that disk failure rates follow a bathtub curve? (5 points)
Disk failure rates are higher:
  For new disks (infant mortality)
  As disks wear down at the end of their useful lifetime
35
Caching
A small direct-mapped cache has 2,048 entries, with each entry containing four words. The computer memory is byte-addressable and all addresses are 32-bit addresses. (4×5 points)
What is the cache size (tags excluded) in bytes?
[Diagram of the cache: 2,048 lines, each with a tag and 4 words = 4 × 4 bytes]
2,048 × 4 × 4 = 32K bytes
38
Caching
What is the tag size?
32 - 4 - 11 = 17 bits
  Remove log2(16) = 4 bits since each entry is 16 bytes long.
  Remove log2(2,048) = 11 bits that are given by the address in the cache.
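The same arithmetic can be checked with a short script. The constants come from the problem statement; the variable names and the split_address helper are hypothetical, added only to show how a 32-bit address divides into tag, index, and byte offset.

```python
ADDRESS_BITS = 32
ENTRIES = 2048          # number of cache entries
WORDS_PER_ENTRY = 4
BYTES_PER_WORD = 4

entry_bytes = WORDS_PER_ENTRY * BYTES_PER_WORD     # 16 bytes per entry
cache_bytes = ENTRIES * entry_bytes                # data only, tags excluded
offset_bits = entry_bytes.bit_length() - 1         # log2(16) for a power of two
index_bits = ENTRIES.bit_length() - 1              # log2(2,048) for a power of two
tag_bits = ADDRESS_BITS - offset_bits - index_bits


def split_address(addr):
    """Split a 32-bit byte address into (tag, index, offset)."""
    offset = addr & (entry_bytes - 1)
    index = (addr >> offset_bits) & (ENTRIES - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset


print(cache_bytes)                   # 32768, i.e. 32K bytes
print(offset_bits, index_bits, tag_bits)
print(split_address(0x12345678))
```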
40
Caching
How could we increase the hit ratio of the cache without increasing its size?
By replacing it with a two-way set-associative cache that could store 1,024 pairs of four-word entries.
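A small simulation can illustrate why associativity helps at equal capacity. This is a sketch under stated assumptions: LRU replacement within each set, and a contrived access trace that alternates between two addresses exactly one cache size apart. Neither the replacement policy nor the trace comes from the slides, and count_hits is a hypothetical helper.

```python
from collections import OrderedDict

ENTRY_BYTES = 16        # four 4-byte words per entry
TOTAL_ENTRIES = 2048    # same total capacity in both organizations


def count_hits(addresses, ways):
    """Count hits for a cache of TOTAL_ENTRIES entries split into sets of `ways` entries (LRU)."""
    num_sets = TOTAL_ENTRIES // ways
    sets = [OrderedDict() for _ in range(num_sets)]
    hits = 0
    for addr in addresses:
        block = addr // ENTRY_BYTES
        index = block % num_sets
        tag = block // num_sets
        lines = sets[index]
        if tag in lines:
            hits += 1
            lines.move_to_end(tag)           # mark as most recently used
        else:
            if len(lines) == ways:
                lines.popitem(last=False)    # evict the least recently used tag
            lines[tag] = True
    return hits


# Two addresses 32 KB apart compete for the same direct-mapped entry.
trace = [0x0000, 0x8000] * 10
print(count_hits(trace, ways=1))   # direct-mapped: every access misses
print(count_hits(trace, ways=2))   # two-way: only the first two accesses miss
```

Real workloads are far less extreme than this trace, so the gain in practice is smaller, but the mechanism is the same: two blocks that collide can now coexist in one set.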
42
Caching
What would be the main disadvantage of your solution?
Set-associative caches are slower than direct-mapped caches.
44
Main memory organization
Assuming that a main memory access takes
  1 bus clock cycle to send the address,
  16 bus clock cycles to initiate a read,
  1 bus clock cycle to send a word of data,
how many clock cycles would it take to transfer 16 bytes to the cache if
  the data are stored in a single bank of memory? (5 points)
  the data are stored in a four-way interleaved memory? (5 points)
45
Single bank memory
16 bytes = 4 words of 4 bytes each
1 + 4 × (16 + 1) = 69 cycles
All operations are done sequentially.
46
Four-way interleaved memory
16 bytes = 4 words of 4 bytes each, one word per bank
1 + 16 + 4 × 1 = 21 cycles
The reads, but not the data transfers, are now performed in parallel.
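Both cycle counts can be reproduced with a short calculation. The constants are taken from the problem statement; the function names are hypothetical, and the interleaved version assumes at most one word per bank, as in this exercise.

```python
ADDRESS_CYCLES = 1     # send the address
READ_CYCLES = 16       # initiate a read
TRANSFER_CYCLES = 1    # send one word of data
WORD_BYTES = 4


def single_bank_cycles(num_bytes):
    """Every word pays the full read latency plus its transfer, one after another."""
    words = num_bytes // WORD_BYTES
    return ADDRESS_CYCLES + words * (READ_CYCLES + TRANSFER_CYCLES)


def interleaved_cycles(num_bytes, banks=4):
    """All banks read in parallel; the words are then sent over the bus one at a time."""
    words = num_bytes // WORD_BYTES
    assert words <= banks, "sketch assumes at most one word per bank"
    return ADDRESS_CYCLES + READ_CYCLES + words * TRANSFER_CYCLES


print(single_bank_cycles(16))    # 69
print(interleaved_cycles(16))    # 21
```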
47
Protecting page tables
How can we prevent user programs from modifying their own page tables? (5 points)
We must store page tables in the protected area of the operating system.
49
Caches and virtual memory
What would be a reasonable page size for a virtual memory system?
  4K bytes
Justify your answer in a few words.
  Because page faults are very costly, the system should try to bring in as much useful data as possible.
Would that be a reasonable block size for a cache?
  NO
Justify your answer in a few words.
  Cache block sizes are much smaller: 64 bytes is a good choice because larger block sizes create too many collisions.
54
Page table size How can we limit the size of page tables to 512KB in a 32-bit virtual system?
55
Answer
We do all the computations in reverse.
Desired page table size: 512 KB
Number of page table entries: 512 KB / 4 bytes = 128K
  Each page table entry occupies four bytes.
Number of bits occupied by the page number: log2(128K) = log2(2^17) = 17 bits
Number of bits occupied by the byte offset: 32 - 17 = 15 bits
Page size: 2^15 bytes = 32 KB
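The reverse computation can be written out as a short function. The 4-byte entry size and the 32-bit address come from the slides; the function name is hypothetical, and the sketch assumes the number of entries is a power of two.

```python
ADDRESS_BITS = 32
PTE_BYTES = 4          # each page table entry occupies four bytes


def page_size_for_table(table_bytes):
    """Return the page size (in bytes) that yields a page table of table_bytes."""
    entries = table_bytes // PTE_BYTES               # number of page table entries
    page_number_bits = entries.bit_length() - 1      # log2(entries) for a power of two
    offset_bits = ADDRESS_BITS - page_number_bits    # remaining bits are the byte offset
    return 1 << offset_bits


size = page_size_for_table(512 * 1024)
print(size, size // 1024)    # 32768 bytes, i.e. 32 KB
```

Running the same function on a 4 MB table gives back the 4K-byte page size mentioned on the earlier slide.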
61
TLB misses When comparing the hit ratios of two translation look-aside buffers, which question should we ask first?
62
Answer
Are TLB misses handled by the firmware or by the OS?
  If TLB misses are handled by the firmware, the cost of a TLB miss is one extra memory reference.
  If TLB misses are handled by the OS, the cost of a TLB miss is two context switches.
63
The dirty bit What is the purpose of the dirty bit?
64
Answer
The dirty bit tells whether a page has been modified since the last time it was brought into main memory.
It is used whenever a page must be expelled from main memory.
  If its dirty bit is ON, the page must be saved to disk before being expelled.
  If its dirty bit is OFF, there already is an exact copy of the page on disk.
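A minimal sketch of the eviction rule described here. The Page class, the write_to and evict functions, and the disk_writes list are hypothetical names used only for illustration; in a real system the dirty bit typically lives in the page table entry.

```python
class Page:
    """A resident page with its dirty bit, cleared when the page is brought in."""

    def __init__(self, number):
        self.number = number
        self.dirty = False


def write_to(page):
    """Any modification of the page turns its dirty bit ON."""
    page.dirty = True


def evict(page, disk_writes):
    """Expel a page: save it to disk only if its dirty bit is ON."""
    if page.dirty:
        disk_writes.append(page.number)   # stands in for the write-back to disk
        page.dirty = False
    # A clean page is simply dropped: the copy on disk is already exact.


disk_writes = []
clean, modified = Page(1), Page(2)
write_to(modified)
evict(clean, disk_writes)
evict(modified, disk_writes)
print(disk_writes)    # only page 2 had to be written back
```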
65
Page table organization What is the main advantage of hashed page tables?
66
Answer
Hashed page tables only keep track of the pages that are actually in main memory.
Their size is proportional to the size of the physical memory instead of the size of the virtual address space.
67
ALWAYS REMEMBER
One KILO is 2^10
One MEGA is 2^20
One GIGA is 2^30
In binary, 2^n is 1 followed by n zeroes
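These identities are easy to confirm interactively. The short check below uses only built-in Python; the helper name is hypothetical.

```python
KILO, MEGA, GIGA = 2**10, 2**20, 2**30


def power_of_two_in_binary(n):
    """Return the binary representation of 2**n as a string of digits."""
    return format(2**n, "b")


print(KILO, MEGA, GIGA)
print(power_of_two_in_binary(3))     # 1 followed by 3 zeroes
print(power_of_two_in_binary(10))    # 1 followed by 10 zeroes
```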