Published by Marcus Andrew Garrison. Modified over 9 years ago.
1
CSC 3210 Computer Organization and Programming. Chapter 7: Subroutines. D.M. Rasanjalee Himali
2
Outline: Introduction; Open Subroutines; Register Saving; Subroutine Linkage; Arguments to Subroutines; Examples; Leaf Subroutines; Pointers as Arguments to Subroutines
3
Introduction In programming there is frequently a need either to repeat a computation or to repeat the computation with different arguments. It is possible to repeat a computation by means of a subroutine. Subroutines may be either open or closed.
4
Introduction An open subroutine is handled by the text editor or by the macro preprocessor: the required code is inserted wherever it is needed in the program. A closed subroutine is one in which the code appears only once in the program; whenever it is needed, a jump to the code is executed, and when it completes, a return is made to the instruction occurring after the jump instruction. Arguments to closed subroutines may be placed in registers or on the stack.
5
Introduction Execution of the subroutine should not change the state of the machine, except possibly for the condition codes. That is, any registers that the subroutine uses must first be saved and then restored after the subroutine completes execution. Arguments to subroutines are normally local variables of the subroutine, and generally, the subroutine is free to change them.
6
Register Saving Almost any computation will involve the use of registers. The SPARC architecture provides for a register file with a mapping register that indicates the active registers. Typically, 128 registers are provided, with the programmer having access to the eight global registers, and only 24 of the mapped registers at any one time. The save instruction changes the register mapping so that new registers are provided. A similar instruction, restore, restores the register mapping on subroutine return.
7
S1() { save %sp, -96, %sp ---- S2() ---- restore } S2() { save %sp, -96, %sp ---- restore }
save: 1. Reserves a new set of 24 registers (8 in | 8 local | 8 out), in addition to the 8 global registers common to all subroutines. 2. Reserves stack memory (96 bytes in this case).
restore: 1. Restores/releases the reserved registers. 2. Releases the stack memory (96 bytes in this case).
[Figure: register file with 8 global registers and 8 * 16 = 128 windowed registers; one register set = 16 registers]
8
S1() { save %sp, -96, %sp ---- S2() ---- restore } S2() { save %sp, -64, %sp ---- restore }
[Figure: memory and register file (8 global + 8 * 16 = 128 registers) before execution; %sp marks the top of the stack]
9
S1() { save %sp, -96, %sp ---- S2() ---- restore } S2() { save %sp, -64, %sp ---- restore }
[Figure: execution, after the save in S1: 96 bytes are reserved on the stack between %fp and %sp; CWP selects the register set of S1]
10
S1() { save %sp, -96, %sp ---- S2() ---- restore } S2() { save %sp, -64, %sp ---- restore }
[Figure: execution, after the save in S2: a further 64 bytes are reserved below the 96 bytes of S1; %fp, %sp, and CWP now refer to the frame and register set of S2]
11
S1() { save %sp, -96, %sp ---- S2() ---- restore } S2() { save %sp, -64, %sp ---- restore }
[Figure: execution, after the restore in S2: the 64 bytes are released; %fp, %sp, and CWP refer to the 96-byte frame and register set of S1 again]
12
S1() { save %sp, -96, %sp ---- S2() ---- restore } S2() { save %sp, -64, %sp ---- restore }
[Figure: execution, after the restore in S1: the 96 bytes are released and %sp is back at its original position]
13
Register Saving The 32 registers are divided into four groups: in, local, out, and global. The eight global registers, %g0-%g7, are NOT mapped and are global to all subroutines. The in registers are used to pass arguments to closed subroutines. The local registers are for a subroutine's local variables. The out registers are used to pass arguments to subroutines that are called by the current subroutine. The in, local, and out registers are mapped.
14
Register Saving When the save instruction is executed the out registers become the in registers, and a new set of local and out registers is provided. The mapping pointer into the register file is changed by 16 registers
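The overlap between the caller's out registers and the callee's in registers can be modeled in a few lines of Python. This is only a sketch under assumptions that are not on the slides: the class name `RegisterFile`, eight register sets, the starting window, and the exact physical layout are all illustrative choices, and real hardware details (traps, the WIM) are left out.

```python
NWINDOWS = 8        # assumed number of register sets (implementation dependent)
REGS_PER_SET = 16   # a save moves the mapping by 16 registers


class RegisterFile:
    """Toy model: 8 globals plus NWINDOWS * 16 windowed registers."""

    def __init__(self):
        self.windowed = [0] * (NWINDOWS * REGS_PER_SET)
        self.global_regs = [0] * 8
        self.cwp = NWINDOWS - 1  # assumed starting window

    def _index(self, name):
        # Layout chosen for this sketch: out, then local, then in.
        # The in registers of window w fall on the out registers of window w + 1.
        offset = {"o": 0, "l": 8, "i": 16}[name[0]] + int(name[1])
        return (self.cwp * REGS_PER_SET + offset) % len(self.windowed)

    def read(self, name):
        if name[0] == "g":
            return self.global_regs[int(name[1])]
        return self.windowed[self._index(name)]

    def write(self, name, value):
        if name[0] == "g":
            if name != "g0":  # %g0 always reads as zero
                self.global_regs[int(name[1])] = value
        else:
            self.windowed[self._index(name)] = value

    def save(self):
        # The caller's out registers become the callee's in registers.
        self.cwp = (self.cwp - 1) % NWINDOWS

    def restore(self):
        self.cwp = (self.cwp + 1) % NWINDOWS


rf = RegisterFile()
rf.write("o0", 42)   # caller places an argument in %o0
rf.write("l0", 7)    # caller's local variable
rf.save()
print(rf.read("i0"), rf.read("l0"))  # callee sees the argument in %i0, fresh locals
```

Writing to `i0` in the callee and then calling `restore()` makes the value visible in the caller's `o0`, which is the same mechanism used later for return values.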
15
[Figure: register file mapping before and after a save; the 8 global registers are shared]
16
Register Saving The current register set is indicated by the current window pointer, "CWP," a machine register. The last free register set is marked by the window invalid bit in the "WIM," another machine register. Each register set contains 16 registers; the number of register sets is implementation dependent. In the implementation shown, there are really 8 x 16 hardware registers, and the set selected is controlled by the cwp. When the save instruction is executed, the prior subroutine's register contents remain unchanged until a restore instruction is executed, resetting the cwp.
17
Register Saving
18
If a further five subroutine calls are made without any returns, the situation in Figure 7.3 exists. The out registers being used are from the invalid register window marked by the WIM bit.
[Figure panels: one additional subroutine call; after a further 5 subroutine calls without return; after a further 6 subroutine calls without return (hardware trap)]
19
Register Saving If more than five additional subroutine calls are made, a hardware trap occurs. Its effect is to move the 16 registers from window set seven onto the stack, where the stack pointer of register window seven is pointing. The trap handler may use the local registers of the invalid window. The cwp and wim pointers are moved as shown in Figure 7.2.
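The overflow behavior can be sketched as a small Python model. The starting values (cwp at window 7, window 0 marked invalid) and the class name `WindowState` are assumptions chosen so that the numbers line up with the slide text; an actual trap handler does considerably more than this.

```python
NWINDOWS = 8  # assumed number of register sets


class WindowState:
    """Toy model of cwp, wim, and window overflow on save."""

    def __init__(self):
        self.cwp = NWINDOWS - 1  # assumed: the main program runs in window 7
        self.wim = 0             # assumed: window 0 is marked invalid
        self.spilled = []        # windows whose 16 registers were moved to the stack

    def save(self):
        """Enter a new window; return True if an overflow trap was taken."""
        target = (self.cwp - 1) % NWINDOWS
        trapped = target == self.wim
        if trapped:
            # The oldest window in use is written to its stack frame
            # and becomes the new invalid window.
            oldest = (self.wim - 1) % NWINDOWS
            self.spilled.append(oldest)
            self.wim = oldest
        self.cwp = target
        return trapped


state = WindowState()
state.save()                                  # one subroutine call from main
traps = [state.save() for _ in range(6)]      # six further calls without return
print(traps, state.spilled)
```

Under these assumptions the first five further calls proceed without a trap and the sixth spills window seven, matching the description above.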
20
Register Saving Register window mapping explains the process by which the stack pointer becomes the frame pointer. The stack pointer is register %o6, which, after a save, becomes %i6, the frame pointer.
21
Register Saving The save and restore instructions are both also add instructions. However, the source registers are always from the current register set, and the destination register is always in the new register set. Thus the instruction save %sp, -64, %sp subtracts 64 (= 4 bytes per register * 16 registers per register set) from the current stack pointer but stores the result into the new stack pointer, leaving the old stack pointer contents unchanged. After the save instruction is executed, the old, unchanged stack pointer becomes the new frame pointer.
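The effect of save on the two pointers can be written as plain arithmetic. The function name `save` and the starting address `0x1000` below are illustrative assumptions, not values from the slides.

```python
def save(old_sp, immediate):
    """Model of `save %sp, immediate, %sp` for the two pointers only.

    The add reads %sp from the caller's register set and writes the
    result to %sp in the new register set; the caller's %sp (%o6) is
    then visible in the new set as %fp (%i6).
    """
    new_sp = old_sp + immediate
    new_fp = old_sp
    return new_sp, new_fp


sp, fp = save(0x1000, -64)   # hypothetical caller stack pointer
print(hex(sp), hex(fp))      # new %sp is 64 bytes lower; %fp is the old %sp
```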
22
Register Saving The restore instruction restores the register window set. In doing so, a register window can underflow if the cwp is moved to the window marked by the wim. When this happens, the window trap routine restores the registers from the stack and resets the pointers. The restore instruction is also an add instruction and is frequently used as the final add instruction in a subroutine.
23
Subroutine Linkage To branch to the first instruction of a subroutine, a ba instruction might be used. Unfortunately, if it is used there is no way of returning to the point where the subroutine was called. The SPARC architecture supports two instructions for linking to subroutines: jmpl and call. Both instructions may be used to store the address of the instruction that called the subroutine into register %o7. Question: What is the return address of the subroutine with no save instruction executed at the beginning? Question: What is the return address of the subroutine with a save instruction executed at the beginning?
24
Subroutine Linkage Because the delay slot instruction following the calling instruction is also executed, the return from a subroutine is to %o7 + 8 (the call is at %o7 and its delay slot at %o7 + 4), which is the address of the next instruction to be executed in the main program. If a save instruction is executed at the beginning of the subroutine, the contents of %o7 will become the contents of %i7 and the return will have to be to %i7 + 8.
25
Subroutine Linkage If the subroutine name is known at assembly time, the call instruction may be used to link to a subroutine. The call instruction has as operand the label at the entry to the subroutine and transfers control to that address. It also stores the current value of the program counter, %pc, into %o7. Like any instruction that changes the %pc, the call instruction is always followed by a delay slot instruction. The delay slot instruction of a call may not be annulled.
26
Subroutine Linkage If the address of the subroutine is computed, it must be loaded into a register. If this is done, the jmpl instruction is used to call the subroutine. Like most other instructions, the jmpl instruction has two source arguments and a destination register. The source may be a register and a constant or two registers. The address of the subroutine is the sum of the register contents or the sum of the register and the constant. It is this address to which the transfer takes place.
27
Subroutine Linkage Like all branching instructions, jmpl is followed by a delay slot instruction. The address of the jmpl instruction is stored in the destination register. Thus, to call a subroutine whose address is in register %o0, storing the return address into %o7, we would write: jmpl %o0, %o7. Here %o0 holds the destination address (called subroutine) and %o7 receives the return address (calling subroutine). The assembler recognizes call %o0 as jmpl %o0, %o7, so you may use call for both types of subroutine calls.
28
Subroutine Linkage The return from a subroutine also makes use of the jmpl instruction. In this case we need to return to %i7 + 8, and the assembler recognizes the mnemonic ret for: jmpl %i7 + 8, %g0. The call to a subroutine is then a call instruction followed by its delay slot instruction; at the entry of the subroutine there is a save instruction; and the return is a ret followed by a restore. The save instruction in the called subroutine causes %o6 in the calling subroutine to map to %i6 in the called subroutine.
29
Subroutine Linkage The ret instruction is expanded by the assembler to jmpl %i7 + 8, %g0. The restore instruction is normally used to fill the delay slot of the ret instruction. It restores the register window set and can serve as the final add instruction in a subroutine.
30
Arguments to Subroutines Arguments to subroutines can follow in-line after the call instruction, be on the stack, or be located in registers.
31
Arguments to Subroutines 1. Arguments follow in-line after the call instruction: For example, a Fortran routine to add two numbers, 3 and 4, together would be called by a call instruction, its delay slot instruction, and then the two argument words, and handled by subroutine code that loads the arguments relative to %i7. Note that the return is to %i7 + 16, jumping over the arguments: the call is at %i7, its delay slot at %i7 + 4, and the two arguments at %i7 + 8 and %i7 + 12. This type of argument passing is very efficient but is limited. Recursive calls are not possible, nor is it possible to compute any of the arguments.
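The address arithmetic for in-line arguments can be checked with a toy memory model. The dictionary-based memory, the function name `add_inline`, and the call address `0x2000` are assumptions made for illustration only.

```python
def add_inline(memory, i7):
    """Model of a subroutine whose two arguments follow the call in-line.

    i7 is the address of the call instruction (as seen after a save).
    Layout assumed: call at i7, delay slot at i7 + 4,
    arguments at i7 + 8 and i7 + 12.
    """
    first = memory[i7 + 8]
    second = memory[i7 + 12]
    return_address = i7 + 16   # skip the call, delay slot, and both arguments
    return first + second, return_address


call_address = 0x2000                                        # hypothetical
memory = {call_address + 8: 3, call_address + 12: 4}         # the arguments 3 and 4
result, return_address = add_inline(memory, call_address)
print(result, hex(return_address))
```

Because the arguments are part of the instruction stream, they are fixed at assembly time, which is why they cannot be computed and why recursion does not work with this scheme.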
32
Arguments to Subroutines 2. Arguments placed on the stack: Placing arguments onto the stack is very general but time consuming. Each argument must be stored on the stack before the subroutine may be called. It allows us complete flexibility to compute arguments, pass any number of arguments, and support recursive calls.
33
Arguments to Subroutines 3. Arguments placed in registers: We can use registers %o0 to %o5 (six registers) to pass six values to the new subroutine. After execution of a save instruction the arguments will be in the first six in registers, %i0-%i5. Any further arguments have to be stored on the stack; hence the save instruction at the start of the function will have to be modified accordingly.
34
Arguments to Subroutines The convention established in the SPARC architecture is to pass the first six arguments in the first six out registers, %o0—%o5, with any additional arguments placed on the stack. However, space is always reserved for the first six arguments on the stack even though they are not there (similar to reserved space for register saving). In fact, the space is reserved even if there are no arguments at all. Each argument occupies ONE WORD on the stack or register, so that when passing byte arguments to subroutines, they must be moved into word quantities before passing.
35
Arguments to Subroutines The arguments are located on the stack, after the 64 bytes reserved for register window saving. However, immediately after the 64 bytes reserved for register window saving, there is a pointer to where a structure may be returned (this is discussed in Section 7.7). Thus, the structure return pointer will be at %sp + 64 and the first argument, if it were on the stack, at %sp + 68.
36
Arguments to Subroutines As we have seen in the previous examples, 64 bytes are reserved on the stack for register window saving. A further 4 bytes are now needed for a pointer to an address where a structure may be returned by the function. After that, 24 bytes are reserved by convention for the first six arguments. After that, more space can be reserved for local variables on the stack. The typical save instruction will now have to reserve at least 64 + 4 + 24 = 92 bytes, plus the space for any local variables.
37
Arguments to Subroutines The save instruction provides: space for saving the register window set, if necessary; a structure pointer; a place to save six arguments; and space for any local variables, while keeping the %sp aligned on a double-word boundary.
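The size passed to save can be computed the same way the assembler evaluates an expression such as (-92 + -264) & -8. The sketch below assumes the helper name `save_immediate`; the three constants are the ones listed above.

```python
WINDOW_SAVE_BYTES = 64   # 16 registers * 4 bytes for register window saving
STRUCT_PTR_BYTES = 4     # pointer to where a structure may be returned
ARG_SAVE_BYTES = 24      # space for the first six arguments, 4 bytes each


def save_immediate(local_bytes):
    """Return the (negative) immediate for `save %sp, imm, %sp`.

    ANDing with -8 clears the low three bits, which rounds the negative
    value down to a multiple of 8 and keeps %sp double-word aligned.
    """
    fixed = WINDOW_SAVE_BYTES + STRUCT_PTR_BYTES + ARG_SAVE_BYTES
    return (-fixed - local_bytes) & -8


print(save_immediate(0))     # no local variables
print(save_immediate(264))   # 264 bytes of local variables
```

With no local variables the 92 bytes round to 96, which is why the earlier slides use save %sp, -96, %sp.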
38
Arguments to Subroutines If we had a subroutine vector with local variables: then the save instruction would be: resulting in 104 bytes being subtracted from the stack pointer.
39
Arguments to Subroutines Structure pointer and space to save the called routine’s arguments are all accessed positively with respect to the stack pointer The subroutine’s arguments are located positively with respect to the frame pointer. Local variables are accessed negatively with respect to the frame pointer.
40
Arguments to Subroutines The argument offsets are logically defined as follows. Notice the positive offsets (w.r.t. %sp)!
define(arg1_s, 68)
define(arg2_s, 72)
define(arg3_s, 76)
define(arg4_s, 80)
define(arg5_s, 84)
define(arg6_s, 88)
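These offsets follow a simple pattern: 64 bytes for the register window, 4 bytes for the structure return pointer, then one word per argument. A short sketch, with the function name `arg_offset` as an assumption:

```python
def arg_offset(n):
    """Stack offset of the n-th argument (1-based), relative to %sp in the
    caller or %fp in the called subroutine after a save."""
    if n < 1:
        raise ValueError("arguments are numbered from 1")
    return 64 + 4 + 4 * (n - 1)


for n in range(1, 9):
    print(n, arg_offset(n))
```

The same formula gives the positions used for a seventh and eighth argument, 92 and 96.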
41
Example – Called Subroutine Let us look at an example. We will express the algorithm in C as follows :
42
! incoming arguments
define(a_r, i0)
define(b_r, i1)
define(c_r, i2)
! automatic variables
define(x_s, -4)
define(y_s, -8)
define(ary_s, -264)
! register variables
define(i_r, l0)
define(j_r, l1)
        .global example
example: save   %sp, (-92 + -264) & -8, %sp
        add     %a_r, %b_r, %o0         ! x = a + b
        st      %o0, [%fp + x_s]
        add     %c_r, 64, %i_r          ! i = c + 64
        add     %a_r, %c_r, %o0         ! ary[i] = c + a
        sll     %i_r, 1, %o1            ! byte offset of a 2-byte element
        add     %fp, ary_s, %o2         ! base address of ary
        sth     %o0, [%o1 + %o2]
        ld      [%fp + x_s], %o0        ! y = x * a
        call    .mul
        mov     %a_r, %o1               ! delay slot: second operand
        st      %o0, [%fp + y_s]
        ld      [%fp + x_s], %o0        ! j = x + i
        add     %i_r, %o0, %j_r
        ld      [%fp + x_s], %o0        ! return x + y
        ld      [%fp + y_s], %o1
end_example:
        ret
        restore %o0, %o1, %o0           ! result in o0
……
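The computation performed by this assembly code can be restated in Python, following the comments on each instruction. This is a reconstruction from those comments, not the original C source; the array length of 128 two-byte elements is an assumption inferred from the offsets (ary_s = -264 leaves 256 bytes after x and y), and the 16-bit mask mirrors the sth store.

```python
def example(a, b, c):
    """Python restatement of the assembly example, read from its comments."""
    ary = [0] * 128               # assumed: 128 halfword elements (256 bytes)
    x = a + b                     # x = a + b
    i = c + 64                    # i = c + 64
    ary[i] = (c + a) & 0xFFFF     # ary[i] = c + a, stored as a halfword
    y = x * a                     # y = x * a (done by .mul in the assembly)
    j = x + i                     # j = x + i (kept in a register, not returned)
    return x + y                  # return x + y


print(example(1, 2, 3))
```

As in the assembly version, nothing checks that i stays within the array, so c is assumed to lie between -64 and 63.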
43
Return Values Subroutines that return a value are called functions. A function in C and C++ can also return a structure. The value returned by a function or subroutine is always returned in register %o0 of the calling program. If a save instruction has been executed in called function, %o0 will be %i0 before the restore instruction is executed.
44
Subroutines with Many Arguments Arguments beyond the sixth are passed on the stack. In this case we must first make room for the arguments by subtracting from the stack pointer. For example, to call a subroutine with eight arguments, foo(1,2,3,4,5,6,7,8), which returns the sum a1+a2+a3+a4+a5+a6+a7+a8, we first have to make room for arguments seven and eight, which will go on the stack, making sure that the stack is still double-word aligned.
45
Subroutines with Many Arguments Calling Subroutine: The seventh and eighth arguments will go onto the stack at %sp + 92 and at %sp + 96, respectively. We can then pass the arguments as follows. Notice the positive offsets of the additional arguments w.r.t. %sp.
Calling sub() { -- foo(1,2,3,4,5,6,7,8) -- } (two additional arguments)
! Make space for two additional args on stack for foo
        add     %sp, -2*4 & -8, %sp
! Load additional args to stack
        mov     7, %o0                  ! load arg 7 with its value
        st      %o0, [%sp + 92]
        mov     8, %o0                  ! load arg 8 with its value
        st      %o0, [%sp + 96]
! Load first 6 args going to in registers
        mov     6, %o5
        mov     5, %o4
        mov     4, %o3
        mov     3, %o2
        mov     2, %o1
        mov     1, %o0
! Call foo subroutine
        call    foo
        nop
! Release space on stack reserved for additional args
        sub     %sp, -2*4 & -8, %sp
46
Subroutines with Many Arguments Called Subroutine: Inside foo the arguments may be accessed as follows. Notice the positive offsets of the additional arguments w.r.t. %fp.
foo(int: a1,a2,a3,a4,a5,a6,a7,a8) { return a1+a2+a3+a4+a5+a6+a7+a8 }
! define incoming argument offsets
define(a1_r, i0)
define(a2_r, i1)
define(a3_r, i2)
define(a4_r, i3)
define(a5_r, i4)
define(a6_r, i5)
define(a7_s, 92)
define(a8_s, 96)
        .global foo
foo:    save    %sp, -96, %sp
        ld      [%fp + a8_s], %o0       ! 8th argument
        ld      [%fp + a7_s], %o1       ! 7th argument
        add     %o0, %o1, %o0
        add     %a6_r, %o0, %o0         ! 6th argument
        add     %a5_r, %o0, %o0         ! 5th argument
        add     %a4_r, %o0, %o0         ! 4th argument
        add     %a3_r, %o0, %o0         ! 3rd argument
        add     %a2_r, %o0, %o0         ! 2nd argument
        ret
        restore %a1_r, %o0, %o0         ! 1st argument
47
[Figure: register file and memory before calling foo, in the calling subroutine (after placing the arguments in registers and memory): arguments 1 to 6 are in the calling subroutine's out registers, arguments 7 and 8 are in memory at %sp+92 and %sp+96]
[Figure: register file and memory after calling foo, before foo returns: arguments 1 to 6 are in foo's in registers, arguments 7 and 8 are at %fp+92 and %fp+96, and foo has its own local and out registers and its own stack area below the new %sp]
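The way the same memory words are reached through %sp in the caller and %fp in foo can be traced with a small Python model. The dictionary memory, the starting stack pointer `0x4000`, and the function names are illustrative assumptions; register windows are reduced to passing a list.

```python
def foo(memory, caller_sp, out_registers):
    """Model of foo after its save: the caller's %sp is foo's %fp, and the
    caller's out registers are foo's in registers."""
    fp = caller_sp
    in_registers = out_registers
    total = memory[fp + 96] + memory[fp + 92]   # 8th and 7th arguments
    for value in reversed(in_registers):        # 6th down to 1st argument
        total += value
    return total


def calling_sub(memory, sp):
    """Model of the calling sequence for foo(1,2,3,4,5,6,7,8)."""
    sp += (-2 * 4) & -8          # make space for two additional arguments
    memory[sp + 92] = 7          # 7th argument
    memory[sp + 96] = 8          # 8th argument
    out_registers = [1, 2, 3, 4, 5, 6]
    result = foo(memory, sp, out_registers)
    sp -= (-2 * 4) & -8          # release the space again
    return result, sp


result, final_sp = calling_sub({}, 0x4000)   # hypothetical initial %sp
print(result, hex(final_sp))
```

The stack pointer ends where it started, and the two extra arguments are never copied: foo reads them from the caller's frame.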
48
Leaf Subroutines A leaf routine is one that does not call any other routines. For a leaf routine the register usage is restricted as follows: The leaf routine may only use the first six out registers and the global registers %g0 and %g1. A leaf routine does not execute either a save or a restore instruction but simply uses the calling subroutine's register set, observing the restrictions listed above. The elimination of register saving and restoring makes calling a leaf routine very efficient. The .mul routine is a leaf routine.
49
Leaf Subroutines A leaf routine is called in the same manner as a regular subroutine, placing the return address into %o7. As a save instruction is not executed, the return address for a leaf routine is %o7 + 8, not %i7 + 8. To return from a leaf subroutine, we use the retl instruction. The assembler recognizes retl for: jmpl %o7 + 8, %g0
50
Leaf Subroutines The subroutine foo should have been written as a leaf routine as follows:
! define incoming argument offsets
define(a1_r, o0)
define(a2_r, o1)
define(a3_r, o2)
define(a4_r, o3)
define(a5_r, o4)
define(a6_r, o5)
define(a7_s, 92)
define(a8_s, 96)
        .global foo
foo:    add     %a2_r, %a1_r, %o0       ! o0 = 1st + 2nd
        add     %a3_r, %o0, %o0         ! o0 += 3rd
        add     %a4_r, %o0, %o0         ! o0 += 4th
        add     %a5_r, %o0, %o0         ! o0 += 5th
        add     %a6_r, %o0, %o0         ! o0 += 6th
        ld      [%sp + a7_s], %o1
        add     %o1, %o0, %o0           ! o0 += 7th
        ld      [%sp + a8_s], %o1
        add     %o1, %o0, %o0           ! o0 += 8th
end_foo:
        retl
        nop