Copyright 2014 – Noah Mendelsohn UM Macro Assembler Functions Noah Mendelsohn Tufts University Web: COMP 40: Machine Structure and Assembly Language Programming (Fall 2014)
© 2010 Noah Mendelsohn 2 Today Creating and calling functions using UMASM Recursion and using the stack Invariants
© 2010 Noah Mendelsohn 3 Review x86-64 Standard Function Linkage
© 2010 Noah Mendelsohn Why have a standard AMD 64 “linkage” for calling functions? (Review) Functions are compiled separately and linked together We need to standardize enough that function calls can be freely mixed We may “cheat” when caller and callee are in the same source file 4
© 2010 Noah Mendelsohn Function calls on Linux/AMD 64 Caller “pushes” return address on stack Where practical, arguments passed in registers Exceptions: –Structs, etc. –Too many –What can’t be passed in registers is at known offsets from stack pointer! Return values –In register, typically %rax for integers and pointers –Exception: structures Each function gets a stack frame –Leaf functions that make no calls may skip setting one up Caller vs. callee saved registers 5
© 2010 Noah Mendelsohn AMD 64 The stack – general case (review) 6 Before call ???? %rsp Arguments Return address After callq %rsp Arguments %rsp If callee needs frame ???? Return address Args to next call? Callee vars sub $0x{framesize},%rsp Arguments framesize
© 2010 Noah Mendelsohn AMD 64 arguments/return values in registers 7 Arguments and return values passed in registers when types are suitable and when there aren’t too many Return values usually in %rax, %eax, etc. Callee may change these and some other registers! MMX and FP 87 registers used for floating point Read the specifications for full details! Operand Size Argument Number %rdi%rsi%rdx%rcx%r8%r9 32%edi%esi%edx%ecx%r8d%r9d 16%di%si%dx%cx%r8w%r9w 8%dil%sil%dl%cl%r8b%r9b
© 2010 Noah Mendelsohn fact:.LFB2: pushq %rbx.LCFI0: movq %rdi, %rbx movl $1, %eax testq %rdi, %rdi je.L4 leaq -1(%rdi), %rdi call fact imulq %rbx, %rax.L4: popq %rbx ret AMD 64 Factorial Revisited 8 int fact(int n) { if (n == 0) return 1; else return n * fact(n - 1); }
© 2010 Noah Mendelsohn 9 The UMASM Standard Function Linkage
© 2010 Noah Mendelsohn Why have a standard UMASM function linkage? Functions are compiled separately and linked together We need to standardize enough that function calls can be freely mixed It’s tricky, so doing it the same way every time minimizes bugs and makes code clearer We may “cheat” when caller and callee are in the same source file 10
© 2010 Noah Mendelsohn Summarizing calling conventions (review) r0 is always 0 (points to segment 0, used for goto and for the call stack) r1 is the return-address register and the result register r2 is the stack pointer and must have same value after return (nonvolatile) r3, r4 are nonvolatiles r5 is a helpful volatile register (good to copy return address) r6, r7 are volatile and are normally.temps (but up to each procedure) Arguments are on the stack, with first argument at the “bottom” (where r2 points) 11 EXCEPT FOR VOLATILES AND RESULT REGISTER CALLEE MUST LEAVE REGISTERS AND STACK UNCHANGED!
© 2010 Noah Mendelsohn Startup code with stack (REVIEW) Goals of startup code: –Set up useful registers 12 init (instructions) start.section init.temps r6, r7.section stk.space endstack:.section init.zero r0 start: r0 := 0 r2 := endstack goto main linking r1 halt
© 2010 Noah Mendelsohn Startup code with stack (REVIEW) Goals of startup code: –Set up useful registers 13 init (instructions) start.section init.temps r6, r7.section stk.space endstack:.section init.zero r0 start: r0 := 0 r2 := endstack goto main linking r1 halt
© 2010 Noah Mendelsohn Startup code with stack (REVIEW) Goals of startup code: –Set up useful registers –Set up a stack 14 init (instructions) start stk words endstack (r2).section init.temps r6, r7.section stk.space endstack:.section init.zero r0 start: r0 := 0 r2 := endstack goto main linking r1 halt
© 2010 Noah Mendelsohn Startup code with stack (REVIEW) Goals of startup code: –Set up useful registers –Set up a stack –Let UMASM know about useful registers 15 init (instructions) start stk words endstack (r2).section init.temps r6, r7.section stk.space endstack:.section init.zero r0 start: r0 := 0 r2 := endstack goto main linking r1 halt
© 2010 Noah Mendelsohn Startup code with stack (REVIEW) Goals of startup code: –Set up useful registers –Set up a stack –Let UMASM know about useful registers –Invoke or drop through to program code 16 init (instructions) start stk words endstack (r2).section init.temps r6, r7.section stk.space endstack:.section init.zero r0 start: r0 := 0 r2 := endstack goto main linking r1 halt
© 2010 Noah Mendelsohn Summarizing calling conventions r0 is always 0 (points to segment 0, used for goto and for the call stack) r1 is the return-address register and the result register r2 is the stack pointer and must have same value after return (nonvolatile) r3, r4 are nonvolatiles r5 is a helpful volatile register (good to copy return address) r6, r7 are volatile and are normally.temps (but up to each procedure) Arguments are on the stack, with first argument at the “bottom” (where r2 points) 17 EXCEPT FOR VOLATILES AND RESULT REGISTER CALLEE MUST LEAVE REGISTERS AND STACK UNCHANGED!
© 2010 Noah Mendelsohn 18 Calling a Function with No Arguments
© 2010 Noah Mendelsohn Calling a function with no arguments 19 r1 := return_addr // Offset in seg 0 r7 := somefunc // Offset in seg 0 goto *r7 in program m[r0] // Call the function return_addr: // Function returns here …back in caller… call somefunc();
© 2010 Noah Mendelsohn Calling a function with no arguments 20 r1 := return_addr // Offset in seg 0 r7 := somefunc // Offset in seg 0 goto *r7 in program m[r0] // Call the function return_addr: // Function returns here …back in caller… call somefunc(); Use of r1 for return address is required
© 2010 Noah Mendelsohn Calling a function with no arguments 21 r1 := return_addr // Offset in seg 0 r7 := somefunc // Offset in seg 0 goto *r7 in program m[r0] // Call the function return_addr: // Function returns here …back in caller… call somefunc(); Use of r7 for function address is programmer choice
© 2010 Noah Mendelsohn Calling a function with no arguments 22 r1 := return_addr // Offset in seg 0 r7 := somefunc // Offset in seg 0 goto *r7 in program m[r0] // Call the function return_addr: // Function returns here …back in caller… call somefunc(); goto somefunc linking r1 // goto main, setting r1 to the return address UMASM provides this built-in macro to do the above for you!
© 2010 Noah Mendelsohn Calling a function with no arguments 23 r1 := return_addr // Offset in seg 0 r7 := somefunc // Offset in seg 0 goto *r7 in program m[r0] // Call the function return_addr: // Function returns here …back in caller… call somefunc(); goto somefunc linking r1 // goto main, setting r1 to the return address UMASM provides this built-in macro to do the above for you! Also remember: r1 is both the linking and the result register. If somefunc calls another function, then r1 must be saved.
© 2010 Noah Mendelsohn 24 Organizing Startup Code
© 2010 Noah Mendelsohn Language startup routines The C language uses a startup/closeup routine called crt0 to prepare arguments and invoke main –Sets up argv and argc –Sets up exception handling –Gathers return/exit value from main –Special purpose versions possible: e.g. gcrt0 used when profiling So…startup code, run before main(), is shared by all programs. Can we do the same for UMSAM? –Can we package our UMASM programs into a main() function? …Yes! 25
© 2010 Noah Mendelsohn urt0: A reuseable UMASM startup routine 26.zero r0.temps r7.section stk.space endstack:.section init _ustart: r0 := 0 r2 := endstack goto main linking r1 halt // Finis urt0.ums
© 2010 Noah Mendelsohn urt0: A reuseable UMASM startup routine 27.zero r0.temps r7.section stk.space endstack:.section init _ustart: r0 := 0 r2 := endstack goto main linking r1 halt // Finis Establish r0 as always containing 0
© 2010 Noah Mendelsohn urt0: A reuseable UMASM startup routine 28.zero r0.temps r7.section stk.space endstack:.section init _ustart: r0 := 0 r2 := endstack goto main linking r1 halt // Finis Build stack and set up r2 as stack pointer
© 2010 Noah Mendelsohn urt0: A reuseable UMASM startup routine 29.zero r0.temps r7.section stk.space endstack:.section init _ustart: r0 := 0 r2 := endstack goto main linking r1 halt // Finis Invoke main() as a function! Trick: umasm urt0.ums mainprog.ums otherfile.ums Concatenates all source files before assembling.
© 2010 Noah Mendelsohn Startup code with stack (REVIEW) Goals of startup code: –Set up useful registers –Set up a stack –Let UMASM know about register assignments –Invoke or drop through to user code 30.section init.temps r6, r7.section stk.space endstack:.section init.zero r0 start: r0 := 0 r2 := endstack goto main linking r1 halt init (instructions) start stk words endstack (r2) WE’VE NOW MADE A COMMON VERSION OF THIS IN urt0.ums!!
© 2010 Noah Mendelsohn 31 Writing A Function That Takes No Arguments We will use main() as an example
© 2010 Noah Mendelsohn Pushing and popping (Review) Push macro Pop macro 32.zero r0.temps r6, r7 push r3 on stack r2 r6 := -1 // macro instruction (not shown) r2 := r2 + r6 m[r0][r2] := r3 Expands to: pop r3 off stack r2 r3 := m[r0][r2] r6 := 1 r2 := r2 + r6 Expands to: These instructions can be use to save and restore registers and to put function arguments on the stack!
© 2010 Noah Mendelsohn Our program can go in the “main” function 33.zero r0.temps r6,r7 main: push r1 on stack r2 // Address main returns to # Can skip one or both of the next two if registers not needed push r3 on stack r2 // Callee saves non-volatile reg push r4 on stack r2 // Callee saves non-volatile reg...DO REAL WORK HERE... output "Hello world\n" pop r4 off stack r2 // restore caller's r4 pop r3 off stack r2 // restore caller's r3 pop r5 off stack r2 // get return address r1 := 0 // EXIT_SUCCESS is the result goto r5
© 2010 Noah Mendelsohn Now our program can go in the “main” function 34.zero r0.temps r6,r7 main: push r1 on stack r2 // Address main returns to # Can skip one or both of the next two if registers not needed push r3 on stack r2 // Callee saves non-volatile reg push r4 on stack r2 // Callee saves non-volatile reg...DO REAL WORK HERE... output "Hello world\n" pop r4 off stack r2 // restore caller's r4 pop r3 off stack r2 // restore caller's r3 pop r5 off stack r2 // get return address r1 := 0 // EXIT_SUCCESS is the result goto r5 Remember…we’re assuming urt0 has set up stack and r0
© 2010 Noah Mendelsohn Now our program can go in the “main” function 35.zero r0.temps r6,r7 main: push r1 on stack r2 // Address main returns to # Can skip one or both of the next two if registers not needed push r3 on stack r2 // Callee saves non-volatile reg push r4 on stack r2 // Callee saves non-volatile reg...DO REAL WORK HERE... output "Hello world\n" pop r4 off stack r2 // restore caller's r4 pop r3 off stack r2 // restore caller's r3 pop r5 off stack r2 // get return address r1 := 0 // EXIT_SUCCESS is the result goto r5 Here’s where our useful work will go.
© 2010 Noah Mendelsohn Now our program can go in the “main” function 36.zero r0.temps r6,r7 main: push r1 on stack r2 // Address main returns to # Can skip one or both of the next two if registers not needed push r3 on stack r2 // Callee saves non-volatile reg push r4 on stack r2 // Callee saves non-volatile reg...DO REAL WORK HERE... output "Hello world\n" pop r4 off stack r2 // restore caller's r4 pop r3 off stack r2 // restore caller's r3 pop r5 off stack r2 // get return address r1 := 0 // EXIT_SUCCESS is the result goto r5 Return address saved from r1..but popped into R5
© 2010 Noah Mendelsohn Now our program can go in the “main” function 37.zero r0.temps r6,r7 main: push r1 on stack r2 // Address main returns to # Can skip one or both of the next two if registers not needed push r3 on stack r2 // Callee saves non-volatile reg push r4 on stack r2 // Callee saves non-volatile reg...DO REAL WORK HERE... output "Hello world\n" pop r4 off stack r2 // restore caller's r4 pop r3 off stack r2 // restore caller's r3 pop r5 off stack r2 // get return address r1 := 0 // EXIT_SUCCESS is the result goto r5 Only save non-volatiles if necessary
© 2010 Noah Mendelsohn Now our program can go in the “main” function 38.zero r0.temps r6,r7 main: push r1 on stack r2 // Address main returns to # Can skip one or both of the next two if registers not needed push r3 on stack r2 // Callee saves non-volatile reg push r4 on stack r2 // Callee saves non-volatile reg...DO REAL WORK HERE... output "Hello world\n" pop r4 off stack r2 // restore caller's r4 pop r3 off stack r2 // restore caller's r3 pop r5 off stack r2 // get return address r1 := 0 // EXIT_SUCCESS is the result goto r5 umasm urt0.ums hello.ums...more ums here if needed... > hello Remember to combine all the pieces when you build
© 2010 Noah Mendelsohn 39 UMASM Function Arguments and the Call Stack
© 2010 Noah Mendelsohn UMASM function arguments (Review) At entry to a function, the stack must look like this m[r0][r2 + 0] // first parameter m[r0][r2 + 1] // second parameter m[r0][r2 + 2] // third parameter The called function may push more data on the stack which will change r2… …when this happens the parameters will appear to be at larger offsets from the (modified) r2! 40 At function entry ???? r2 Third parm Second parm First parm
© 2010 Noah Mendelsohn 41 Recursion: Factorial
© 2010 Noah Mendelsohn Can we build UMASM code for this? 42 int fact(int n) { if (n == 0) return 1; else return n * fact(n - 1); } REMEMBER THE CONTRACT BETWEEN CALLER AND CALLEE!
© 2010 Noah Mendelsohn Summarizing calling conventions r0 is always 0 (points to segment 0, used for goto and for the call stack) r1 is the return-address register and the result register r2 is the stack pointer and must have same value after return (nonvolatile) r3, r4 are nonvolatiles r5 is a helpful volatile register (good to copy return address) r6, r7 are volatile and are normally.temps (but up to each procedure) Arguments are on the stack, with first argument at the “bottom” (where r2 points) 43 EXCEPT FOR VOLATILES AND RESULT REGISTER CALLEE MUST LEAVE REGISTERS AND STACK UNCHANGED!
© 2010 Noah Mendelsohn Factorial 44.section init.zero r0.temp r6,r7 fact: push r1 on stack r2 // save return address push r3 on stack r2 // save nonvolatile register r3 := m[r0][r2 + 2] // first parameter 'n' // Handle base case: if n == 0 if (r3 == 0) goto base_case // recursive call r5 := r3 - 1 // OK to step on volatile register push r5 on stack r2 // now (n - 1) is a paraemter goto fact linking r1 // make the recursive call pop stack r2 // pop parameter r1 := r3 * r1 // n * fact(n-1) goto finish_fact base_case: r1 := 1 finish_fact: // What are invariants here? pop r3 off stack r2 // restore saved register pop r5 off stack r2 // put return address in r5 goto r5
© 2010 Noah Mendelsohn Factorial 45.section init.zero r0.temp r6,r7 fact: push r1 on stack r2 // save return address push r3 on stack r2 // save nonvolatile register r3 := m[r0][r2 + 2] // first parameter 'n' // Handle base case: if n == 0 if (r3 == 0) goto base_case // recursive call r5 := r3 - 1 // OK to step on volatile register push r5 on stack r2 // now (n - 1) is a paraemter goto fact linking r1 // make the recursive call pop stack r2 // pop parameter r1 := r3 * r1 // n * fact(n-1) goto finish_fact base_case: r1 := 1 finish_fact: // What are invariants here? pop r3 off stack r2 // restore saved register pop r5 off stack r2 // put return address in r5 goto r5 What are the invariants here?
© 2010 Noah Mendelsohn fact:.LFB2: pushq %rbx.LCFI0: movq %rdi, %rbx movl $1, %eax testq %rdi, %rdi je.L4 leaq -1(%rdi), %rdi call fact imulq %rbx, %rax.L4: popq %rbx ret Compare with the AMD 64 version 46 int fact(int n) { if (n == 0) return 1; else return n * fact(n - 1); }
© 2010 Noah Mendelsohn Main program to call factorial 47 void main(void) { for (int i = 10; i != 0; i--) { print_d(i); puts("! == "); print_d(fact(i)); putchar('\n'); } return EXIT_SUCCESS; }
© 2010 Noah Mendelsohn Main program to call factorial 48.zero r0.temps r6,r7 main: push r1 on stack r2 // Address main returns to push r3 on stack r2 // Callee saves non-volatile reg r3 := 10 // Using r3 for i - loop var goto main_test main_loop: push r3 on stack r2 // Pass i as first arg to function goto print_d linking r1 // Call print_d pop stack r2 // Function leaves arg on stack output "! == " // puts() is inlined push r3 on stack r2 // Pass i as first arg to function goto fact linking r1 // Call fact push r1 on stack r2 // Pass fact's return value as arg goto print_d linking r1 // Call print_d r2 := r2 + 2 // pop 2 parameters, 1 for each call output '\n' // putc() inlined --- it's a C macro r3 := r3 - 1 // i-- main_test: if (r3 != 0) goto main_loop // while (i != 0) pop r3 off stack r2 // restore caller's r3 pop r5 off stack r2 // get return address r1 := 0 // EXIT_SUCCESS is the result goto r5 // return