1
Today:
Control unit: a bit of review
How to implement the branch control unit?
Other instruction set architectures
11/23/2018 Instruction encoding
2
Instruction format review
We have three different instruction formats, each 16 bits long, with a seven-bit opcode and nine bits for source registers or constants.
The first three bits of the opcode determine the instruction category, while the other four bits indicate the exact instruction.
For ALU/shift instructions, the four bits choose an ALU operation.
For branches, the bits select one of eight branch conditions.
We support only one load, one store and one jump instruction.
3
Review: The whole processor
[Figure: the whole processor. Control unit: PC, instruction RAM, instruction decoder and branch control. Datapath: register file, ALU, Mux B, Mux D and data RAM. The decoder drives DA, AA, BA, MB, FS, MD, WR and MW; the status bits V, C, N and Z feed the branch control.]
4
Review: Generating DA, AA, BA
The register file addresses DA, AA and BA can be taken directly out of the 16-bit binary instructions.
Instruction bits 8-6 are the destination register, DA.
Bits 5-3 are fed directly to AA, the first register file source.
Bits 2-0 are connected directly to BA, the second source.
This clearly works for a register-format instruction, where bits 8-6, 5-3 and 2-0 are defined to hold the destination and source registers.
5
Review: Don’t-care conditions
In immediate-format instructions, bits 2-0 store a constant operand, not a second source register!
However, immediate instructions only use one source register, so the control signal BA would be a don’t care condition anyway.
Similarly, jump and branch instructions require neither a destination register nor a second source register.
So we can always take DA, AA and BA directly from the instruction.
    DA2 DA1 DA0 = I8 I7 I6
    AA2 AA1 AA0 = I5 I4 I3
    BA2 BA1 BA0 = I2 I1 I0
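The field wiring above can be mimicked in software with shifts and masks. This is only a sketch: the type RegFields and the function decode_reg_fields are hypothetical names, not part of the course material.

```c
#include <stdint.h>

/* Register-file addresses carried by one 16-bit instruction word. */
typedef struct {
    unsigned da; /* destination register, instruction bits 8-6 */
    unsigned aa; /* first source register, instruction bits 5-3 */
    unsigned ba; /* second source register, instruction bits 2-0 */
} RegFields;

/* Hypothetical helper: slice the three 3-bit fields out of the word.
   The opcode (bits 15-9) is ignored, just as the hardware ignores it
   when routing these bits to the register file. */
RegFields decode_reg_fields(uint16_t instr)
{
    RegFields f;
    f.da = (instr >> 6) & 0x7u;
    f.aa = (instr >> 3) & 0x7u;
    f.ba = instr & 0x7u;
    return f;
}
```

For an immediate, jump or branch instruction the function still returns values for every field; the unused ones are simply the don’t-cares described above.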
6
FS for branch instructions
FS would be don’t-cares for loads, stores and jumps, which do not involve the ALU.
However, FS is required for branch instructions, which depend on the ALU’s status bit outputs.
For example, in BZ R3, -24 the contents of R3 must go through the ALU so that Z will be set appropriately.
For our branches, we just need the ALU function “G = A” (for example, FS = 00111).
7
More about the branch control unit
Next, let’s see how to manage the control flow of a program.
The branch control unit needs a lot of information about the current instruction:
- Whether it’s a jump, a branch, or some other instruction.
- For branches and jumps, the target address.
- For branches, the specific branch condition.
All of this can be generated by the instruction decoder, which has to process the instruction words anyway.
8
Branch control unit inputs and outputs
Branch control inputs:
- PL, JB, BC and AD are output by the instruction decoder, and carry information about the current instruction.
- Status bits V, C, N and Z come from the datapath.
- The current PC is needed for PC-relative mode jumps and branches.
Branch control outputs:
- A Load signal for the PC.
- When Load = 1, the branch control unit also generates the target address to jump or branch to.
9
Branch control unit inputs
The decoder sends the following data to the branch control unit:
- PL and JB indicate the type of instruction.
- BC encodes the kind of branch.
- AD determines the jump or branch target address.
10
Generating PL and JB
The instruction decoder generates PL and JB from instruction opcodes.
Note that if PL = 0, then the value of JB doesn’t matter.
As expected, PL and JB only matter for jumps and branches.
From this table you could derive:
    PL = I15 · I14 (logical AND)
    JB = I13
11
Generating BC and AD
We defined the branch opcodes so that they already contain the branch type, so BC can come straight from the instruction.
AD can also be taken directly out of the instruction.
    BC2 BC1 BC0 = I11 I10 I9
    AD5 AD4 AD3 AD2 AD1 AD0 = I8 I7 I6 I2 I1 I0
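The decoder equations for PL, JB, BC and AD can be sketched in C as follows. The names BranchFields and decode_branch_fields are hypothetical, chosen only for this illustration.

```c
#include <stdint.h>
#include <stdbool.h>

/* Signals the instruction decoder sends to the branch control unit. */
typedef struct {
    bool pl;     /* PL = I15 AND I14: 1 for jumps and branches */
    bool jb;     /* JB = I13: 1 for a jump, 0 for a branch */
    unsigned bc; /* branch condition, instruction bits 11-9 */
    unsigned ad; /* raw 6-bit offset: bits 8-6 followed by bits 2-0 */
} BranchFields;

/* Hypothetical helper: derive the four signals from one instruction word. */
BranchFields decode_branch_fields(uint16_t instr)
{
    BranchFields f;
    f.pl = ((instr >> 15) & 1u) && ((instr >> 14) & 1u);
    f.jb = ((instr >> 13) & 1u) != 0;
    f.bc = (instr >> 9) & 0x7u;
    /* AD is split across the word; glue the two 3-bit halves together. */
    f.ad = (((instr >> 6) & 0x7u) << 3) | (instr & 0x7u);
    return f;
}
```

AD is returned as a raw bit pattern here; interpreting it as a signed offset is left to the branch control unit.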
12
Branch control unit
Now we’ve seen how the instruction decoder generates PL, JB, BC and AD.
How does the branch unit use these to control the PC?
There are three cases, depending on the values of PL and JB.
If PL = 0, the current instruction is not a jump or branch, so the branch control just needs to make the program counter increment, and execute the next instruction.
13
Jumps
If PL = 1 and JB = 1, the current instruction must be a jump.
We assume PC-relative addressing, so the jump “offset” (AD) must be added to the current PC value, and then stored back into the PC.
The branch control unit would contain an adder just for computing the target address.
Again, AD is signed so we can jump forwards or backwards.
14
Branches
If PL = 1 and JB = 0, the current instruction is a conditional branch.
The branch control unit first determines if the branch should be taken.
- It checks the type of branch (BC) and the status bits (VCNZ).
- For example, if BC = 011 (branch if zero) and Z = 1, then the branch condition is true and the branch should be taken.
Then the branch control unit sets the PC appropriately.
- If the branch is taken, AD is added to the PC, just as for jumps.
- Otherwise, the PC is incremented, just as for normal instructions.
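The three cases (ordinary instruction, jump, branch) can be put together in a small software model. This is a sketch under stated assumptions: the names Status, sign_extend6, condition_true and next_pc are hypothetical, and only the BC = 011 (branch if zero) encoding is filled in, because it is the only one these slides give.

```c
#include <stdint.h>
#include <stdbool.h>

/* ALU status bits coming from the datapath. */
typedef struct { bool v, c, n, z; } Status;

/* Interpret the 6-bit AD field as a signed two's-complement offset. */
static int sign_extend6(unsigned ad)
{
    ad &= 0x3Fu;
    return (ad & 0x20u) ? (int)ad - 64 : (int)ad;
}

/* Evaluate the branch condition selected by BC.
   Only 011 = "branch if zero" is taken from the slides; every other
   encoding is left as "not taken" because it is not specified here. */
static bool condition_true(unsigned bc, Status s)
{
    switch (bc & 0x7u) {
    case 3:  return s.z;
    default: return false;
    }
}

/* Compute the next PC value from the branch control inputs. */
uint16_t next_pc(uint16_t pc, bool pl, bool jb, unsigned bc, unsigned ad, Status s)
{
    /* Load = 1 for every jump, and for a branch whose condition holds. */
    bool load = pl && (jb || condition_true(bc, s));
    if (load)
        return (uint16_t)(pc + sign_extend6(ad)); /* PC-relative target */
    return (uint16_t)(pc + 1);                    /* normal increment */
}
```

With AD = 101000 (that is, -24) and Z = 1, a branch-if-zero at address 100 moves the PC to 76; with Z = 0 it falls through to 101.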
15
Summary of Control Unit
We saw an outline of the control unit hardware.
- The program counter points into a special instruction memory, which contains a machine language program.
- An instruction decoder looks at each instruction and generates the correct control signals for the datapath and a branching unit.
- The branch control unit handles instruction sequencing.
The control unit implementation depends on both the instruction set architecture and the datapath.
- Careful selection of opcodes and instruction formats can make the control unit simpler.
- In MP4 you’ll design the control unit for a slightly different CPU.
We now have a whole processor! This is the culmination of everything we did this semester, starting from those tiny little primitive gates.
16
Other ISAs
Next, we’ll first look at a longer example program, starting with some C code and translating it into our assembly language.
Then we discuss some alternative instruction set designs.
- Different ways of specifying memory addresses
- Different numbers and types of operands in ALU instructions
July 29, 2002 © Howard Huang
17
String manipulation example
A C-style string is represented as a special kind of array.
- Each element is a single byte, containing an ASCII code that represents one letter of the string.
- The last element of the array is a 0, or a “null” character.
For example, “The Godfather” would be a 14-byte array:
    84 104 101 32 71 111 100 102 97 116 104 101 114 0
We’ll write a program to convert a string to all uppercase characters:
- Lowercase letters in ASCII range from 97 to 122 (‘a’ to ‘z’).
- Uppercase letters range from 65 to 90 (‘A’ to ‘Z’).
- A lowercase letter can be converted to uppercase by subtracting 32 from its ASCII code (e.g., 97 - 32 = 65).
18
Two versions in C
Assume that variable R0 contains the address of the string, or its starting location in memory.
Both of these loop through a string, subtracting 32 from lowercase letters, until they reach the terminating 0.
The first one accesses letters by indexing the array R0, and incrementing the index on each loop iteration.

    int i = 0;
    while (R0[i] != 0) {
        if (R0[i] >= 97 && R0[i] <= 122)
            R0[i] = R0[i] - 32;
        i++;
    }

The second one uses a pointer which is dereferenced to produce a letter. The pointer is incremented once per iteration.

    while (*R0 != 0) {
        if (*R0 >= 97 && *R0 <= 122)
            *R0 = *R0 - 32;
        R0++;
    }
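To try both loops on a real string, they can be wrapped in functions. The function names upcase_indexed and upcase_pointer are made up for this sketch; the loop bodies follow the two versions above, with the parameter s playing the role of R0.

```c
#include <stddef.h>

/* Array version: index i walks the string. */
void upcase_indexed(char *s)
{
    size_t i = 0;
    while (s[i] != 0) {
        if (s[i] >= 97 && s[i] <= 122)   /* 'a' .. 'z' */
            s[i] = (char)(s[i] - 32);
        i++;
    }
}

/* Pointer version: the pointer itself walks the string. */
void upcase_pointer(char *s)
{
    while (*s != 0) {
        if (*s >= 97 && *s <= 122)       /* 'a' .. 'z' */
            *s = (char)(*s - 32);
        s++;
    }
}
```

Both functions modify the string in place, so they must be given a writable array such as char a[] = "The Godfather"; rather than a string literal.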
19
Array version
Here is a direct translation of the array version.
R0 contains the string address. R1 contains the loop index, i. R3 contains R0[i].
We also need the address of R0[i] to load the data; this is stored in R2.

            LD   R1, #0           // Use R1 as index i
    LOOP:   ADD  R2, R1, R0       // R2 = &R0[i]
            LD   R3, (R2)         // R3 = R0[i]
            BEQ  R3, #0, EXIT     // Exit if R3 = 0
            BLT  R3, #97, NEXT    // R3 < 97?
            BGT  R3, #122, NEXT   // R3 > 122?
            SUB  R3, R3, #32      // Convert R3 and put
            ST   (R2), R3         // it back in RAM
    NEXT:   ADD  R1, R1, #1       // i++
            JMP  LOOP             // Loop
    EXIT:   ...
20
Pointer version
Here is the pointer-based version of the same program.
R0 contains the string address. R3 contains *R0.
This version is a little shorter. We increment R0 once, instead of incrementing an index i and then adding i to the string’s base address. This saves one addition per iteration, and R2 isn’t needed anymore.

    LOOP:   LD   R3, (R0)         // R3 = *R0
            BEQ  R3, #0, EXIT     // Exit if R3 = 0
            BLT  R3, #97, NEXT    // R3 < 97?
            BGT  R3, #122, NEXT   // R3 > 122?
            SUB  R3, R3, #32      // Convert R3 and
            ST   (R0), R3         // store back to RAM
    NEXT:   ADD  R0, R0, #1       // R0++
            JMP  LOOP
    EXIT:   ...
21
Addressing modes
The first instruction set design issue we’ll see is addressing modes, which let you specify memory addresses in various ways.
- Each mode has its own assembly language notation.
- Different modes may be useful in different situations.
- The location that is actually used is called the effective address.
The addressing modes that are available will depend on the datapath.
- Our simple datapath only supports two forms of addressing.
- Older processors like the 8086 have zillions of addressing modes.
We’ll introduce some of the more common ones.
22
Immediate addressing
One of the simplest modes is immediate addressing, where the operand itself is accessed.
    LD R1, #1999        R1 ← 1999
This mode is a good way to specify initial values for registers.
We’ve already used immediate addressing several times.
- We introduced it on Monday with some short examples.
- It appears in the string conversion program you just saw.
23
Direct addressing
Another possible mode is direct addressing, where the operand is a constant that represents a memory address.
    LD R1, 500          R1 ← M[500]
Here the effective address is 500, the same as the operand.
This is useful for working with pointers.
- You can think of the constant as a pointer.
- The register gets loaded with the data at that address.
24
Register indirect addressing
We already saw register indirect mode, where the operand is a register that contains a memory address.
    LD R1, (R0)         R1 ← M[R0]
The effective address would be the value in R0.
This is also useful for working with pointers. In the example above,
- R0 is a pointer, and R1 is loaded with the data at that address.
- This is similar to R1 = *R0 in C or C++.
So what’s the difference between direct mode and this one?
- In direct mode, the address is a constant that is hard-coded into the program and cannot be changed.
- Here the contents of R0, and hence the address being accessed, can easily be changed.
25
Stepping through arrays
Register indirect mode makes it easy to access contiguous locations in memory, such as elements of an array.
If R0 is the address of the first element in an array, we can easily access the second element too:

    LD  R1, (R0)        // R1 contains the first element
    ADD R0, R0, #1
    LD  R2, (R0)        // R2 contains the second element

This is so common that some instruction sets can automatically increment the register for you:

    LD  R1, (R0)+       // R1 contains the first element
    LD  R2, (R0)+       // R2 contains the second element

Such instructions can be used within loops to access an entire array.
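C has a close analogue of the auto-increment load: the expression *p++ reads the value p points to and then advances p. The sketch below uses it to walk a whole array; sum_array is a hypothetical name used only for this illustration.

```c
/* Add up n array elements, reading each one with a post-increment
   dereference, much like a loop built around LD R1, (R0)+. */
int sum_array(const int *p, int n)
{
    int total = 0;
    while (n-- > 0)
        total += *p++;   /* load through p, then step p to the next element */
    return total;
}
```

On a machine with an auto-increment mode, a compiler may map each *p++ to a single instruction, although that depends on the compiler and target.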
26
Indexed addressing
Operands with indexed addressing include a constant and a register.
    LD R1, 500(R0)      R1 ← M[R0 + 500]
The effective address is the register data plus the constant. For instance, if R0 = 25, the effective address here would be 525.
We can use this addressing mode to access arrays also.
- The constant is the array address, while the register contains an index into the array.
- The example instruction above might be used to load the element at index 25 of an array that starts at memory location 500.
It’s possible to use negative constants too, which would let you index arrays backwards.
27
PC-relative addressing
We’ve seen PC-relative addressing already. The operand is a constant that is added to the program counter to produce the effective memory address.
    200: LD R1, $30     R1 ← M[201 + 30]
The PC usually points to the address of the next instruction, so the effective address here is 231 (assuming the LD instruction itself uses one word of memory).
This is similar to indexed addressing, except the PC is used instead of a regular register.
Relative addressing is often used in jump and branch instructions.
- For instance, JMP $30 lets you skip the next 30 instructions.
- A negative constant lets you jump backwards, which is common in writing loops.
28
Indirect addressing
The most complicated mode that we’ll look at is indirect addressing.
    LD R1, [360]        R1 ← M[M[360]]
The operand is a constant that specifies a memory location which refers to another location, whose contents are then accessed.
The effective address here is M[360].
Indirect addressing is useful for working with multi-level pointers, or “handles.”
- The constant represents a pointer to a pointer.
- In C, we might write something like R1 = **ptr.
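The modes described so far can be compared side by side in a small software model of a load instruction. This is only a sketch: the names Mode, Machine and load_operand are hypothetical, memory is modeled as an array of 16-bit words, and no bounds checking is done.

```c
#include <stdint.h>

enum Mode { IMMEDIATE, DIRECT, REG_INDIRECT, INDEXED, PC_RELATIVE, INDIRECT };

typedef struct {
    const uint16_t *mem; /* word-addressed memory M[] */
    uint16_t pc;         /* assumed to hold the address of the NEXT instruction */
} Machine;

/* Value loaded by "LD R1, <operand>" under each addressing mode.
   k is the constant in the operand; reg is the contents of the register
   named in the operand (ignored by modes that do not use one). */
uint16_t load_operand(const Machine *m, enum Mode mode, uint16_t k, uint16_t reg)
{
    switch (mode) {
    case IMMEDIATE:    return k;                               /* R1 <- k          */
    case DIRECT:       return m->mem[k];                       /* R1 <- M[k]       */
    case REG_INDIRECT: return m->mem[reg];                     /* R1 <- M[R0]      */
    case INDEXED:      return m->mem[(uint16_t)(reg + k)];     /* R1 <- M[R0 + k]  */
    case PC_RELATIVE:  return m->mem[(uint16_t)(m->pc + k)];   /* R1 <- M[PC + k]  */
    case INDIRECT:     return m->mem[m->mem[k]];               /* R1 <- M[M[k]]    */
    }
    return 0; /* unreachable for valid modes */
}
```

In every case except IMMEDIATE, the expression inside mem[...] is the effective address.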
29
Addressing mode summary
30
Number of operands
Another way to classify instruction sets is according to the number of operands that each data manipulation instruction can have.
Our example instruction set had three-address instructions, because each one had up to three operands: two sources and one destination.
    ADD R0, R1, R2      (operation: ADD; destination: R0; sources: R1 and R2)
    Register transfer instruction: R0 ← R1 + R2
This provides the most flexibility, but it’s also possible to have fewer than three operands.
31
Two-address instructions
In a two-address instruction, the first operand serves as both the destination and one of the source registers.
    ADD R0, R1          (operation: ADD; destination and source 1: R0; source 2: R1)
    Register transfer instruction: R0 ← R0 + R1
Some other examples and the corresponding C code:
    ADD R3, #1          R3 ← R3 + 1         R3++;
    MUL R1, #5          R1 ← R1 * 5         R1 *= 5;
    NOT R1              R1 ← R1’            R1 = ~R1;
32
One-address instructions
Some computers, like the old Apple II, have one-address instructions.
The CPU has a special register called an accumulator, which implicitly serves as the destination and one of the sources.
    ADD R0              (operation: ADD; source: R0)
    Register transfer instruction: ACC ← ACC + R0
Here is an example sequence which increments M[R0]:
    LD  (R0)            ACC ← M[R0]
    ADD #1              ACC ← ACC + 1
    ST  (R0)            M[R0] ← ACC
33
The ultimate: zero addresses
If the destination and sources are all implicit, then you don’t have to specify any operands at all!
This is possible with processors that use a stack architecture.
- HP calculators and their “reverse Polish notation” use a stack.
- The Java Virtual Machine is also stack-based.
How can you do calculations with a stack?
- Operands are pushed onto a stack. The most recently pushed element is at the “top” of the stack (TOS).
- Operations use the topmost stack elements as their operands. Those values are then replaced with the operation’s result.
34
Stack architecture example
Here are three stack instructions, and what the stack looks like after each one is executed (listed from top to bottom).
    PUSH R1             stack: R1
    PUSH R2             stack: R2, R1
    ADD                 stack: R1 + R2
This sequence of stack operations corresponds to one register transfer instruction:
    TOS ← R1 + R2
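The PUSH, PUSH, ADD sequence can be modeled with a small array-based stack. This is a minimal sketch; the names Stack, stack_push, stack_add and stack_top, and the fixed capacity of 16, are assumptions made for the example.

```c
#define STACK_MAX 16   /* arbitrary capacity chosen for this sketch */

typedef struct {
    int data[STACK_MAX];
    int size;          /* number of elements; data[size - 1] is the TOS */
} Stack;

/* PUSH: place a value on top of the stack. Returns -1 if the stack is full. */
int stack_push(Stack *s, int value)
{
    if (s->size >= STACK_MAX)
        return -1;
    s->data[s->size++] = value;
    return 0;
}

/* ADD: pop the two topmost values and push their sum.
   Returns -1 if fewer than two operands are available. */
int stack_add(Stack *s)
{
    if (s->size < 2)
        return -1;
    int b = s->data[--s->size];
    int a = s->data[--s->size];
    s->data[s->size++] = a + b;
    return 0;
}

/* Read the top of stack; the caller must make sure the stack is not empty. */
int stack_top(const Stack *s)
{
    return s->data[s->size - 1];
}
```

Notice that stack_add takes no operand arguments beyond the stack itself, which is what makes ADD a zero-address instruction.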
35
Data movement instructions
Finally, the types of operands allowed in data manipulation instructions are another way of characterizing instruction sets.
- So far, we’ve assumed that ALU operations can have only register and constant operands.
- Many real instruction sets allow memory-based operands as well.
We’ll use the book’s example and illustrate how the following operation can be translated into some different assembly languages.
    X = (A + B)(C + D)
Assume that A, B, C, D and X are really memory addresses.
36
Register-to-register architectures
Our programs so far assume a register-to-register, or load/store, architecture, which matches our datapath from last week nicely.
- Operands in data manipulation instructions must be registers.
- Other instructions are needed to move data between memory and the register file.
With a register-to-register, three-address instruction set, we might translate X = (A + B)(C + D) into:

    LD  R1, A           R1 ← M[A]           // Use direct addressing
    LD  R2, B           R2 ← M[B]
    ADD R3, R1, R2      R3 ← R1 + R2        // R3 = M[A] + M[B]
    LD  R1, C           R1 ← M[C]
    LD  R2, D           R2 ← M[D]
    ADD R1, R1, R2      R1 ← R1 + R2        // R1 = M[C] + M[D]
    MUL R1, R1, R3      R1 ← R1 * R3        // R1 has the result
    ST  X, R1           M[X] ← R1           // Store that into M[X]
37
Memory-to-memory architectures
In memory-to-memory architectures, all data manipulation instructions use memory addresses as operands.
With a memory-to-memory, three-address instruction set, we might translate X = (A + B)(C + D) into simply:

    ADD X, A, B         M[X] ← M[A] + M[B]
    ADD T, C, D         M[T] ← M[C] + M[D]  // T is temporary storage
    MUL X, X, T         M[X] ← M[X] * M[T]

How about with a two-address instruction set?

    MOVE X, A           M[X] ← M[A]         // Copy M[A] to M[X] first
    ADD  X, B           M[X] ← M[X] + M[B]  // Add M[B]
    MOVE T, C           M[T] ← M[C]         // Copy M[C] to M[T]
    ADD  T, D           M[T] ← M[T] + M[D]  // Add M[D]
    MUL  X, T           M[X] ← M[X] * M[T]  // Multiply
38
Register-to-memory architectures
Finally, register-to-memory architectures let the data manipulation instructions access both registers and memory.
With two-address instructions, we might do the following:

    LD  R1, A           R1 ← M[A]           // Load M[A] into R1 first
    ADD R1, B           R1 ← R1 + M[B]      // Add M[B]
    LD  R2, C           R2 ← M[C]           // Load M[C] into R2
    ADD R2, D           R2 ← R2 + M[D]      // Add M[D]
    MUL R1, R2          R1 ← R1 * R2        // Multiply
    ST  X, R1           M[X] ← R1           // Store
39
Size and speed
There are lots of tradeoffs in deciding how many and what kind of operands and addressing modes to support in a processor.
These decisions can affect the size of machine language programs.
- Memory addresses are long compared to register file addresses, so instructions with memory-based operands are typically longer than those with register operands.
- Permitting more operands also leads to longer instructions.
There is also an impact on the speed of the program.
- Memory accesses are much slower than register accesses.
- Longer programs require more memory accesses, just for loading the instructions!
Most newer processors use register-to-register designs.
- Reading from registers is faster than reading from RAM.
- Using register operands also leads to shorter instructions.
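To get a rough feel for the size tradeoff, one can count bits for the four translations of X = (A + B)(C + D) shown earlier. This is a back-of-the-envelope sketch, not a measurement: the 7-bit opcode and 3-bit register fields follow this lecture’s format, but the 16-bit memory operand width (MEM_BITS) is purely an assumption, and all names below are hypothetical.

```c
#include <stddef.h>

enum {
    OPCODE_BITS = 7,  /* from the instruction format in this lecture */
    REG_BITS    = 3,  /* from the 3-bit register addresses in this lecture */
    MEM_BITS    = 16  /* ASSUMPTION: width of one memory-address operand */
};

/* One instruction, described only by how many operands of each kind it has. */
typedef struct {
    int reg_operands;
    int mem_operands;
} Instr;

int instr_bits(Instr in)
{
    return OPCODE_BITS + in.reg_operands * REG_BITS + in.mem_operands * MEM_BITS;
}

int program_bits(const Instr *prog, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += instr_bits(prog[i]);
    return total;
}

/* Operand counts transcribed from the four example programs. */
const Instr REG_TO_REG[8] = {        /* LD LD ADD LD LD ADD MUL ST */
    {1, 1}, {1, 1}, {3, 0}, {1, 1}, {1, 1}, {3, 0}, {3, 0}, {1, 1}
};
const Instr MEM_TO_MEM_3ADDR[3] = {  /* ADD ADD MUL */
    {0, 3}, {0, 3}, {0, 3}
};
const Instr MEM_TO_MEM_2ADDR[5] = {  /* MOVE ADD MOVE ADD MUL */
    {0, 2}, {0, 2}, {0, 2}, {0, 2}, {0, 2}
};
const Instr REG_TO_MEM[6] = {        /* LD ADD LD ADD MUL ST */
    {1, 1}, {1, 1}, {1, 1}, {1, 1}, {2, 0}, {1, 1}
};
```

Calling program_bits on each table gives a total size under these assumed widths. A different MEM_BITS changes the totals and can change the ranking, and the count says nothing about data memory accesses, which also affect speed.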
40
Summary
Instruction sets can be classified along several lines.
- Addressing modes let instructions access memory in various ways.
- Data manipulation instructions can have from 0 to 3 operands.
- Those operands may be registers, memory addresses, or both.
Instruction set design is intimately tied to processor datapath design.
“The Godfather” is a 14-byte C string, counting the terminating null.