Assembly Language Programming II: C Compiler Calling Sequences
Homework Reading Machine Project Labs PAL, pp 201-216, 297-312 Starting on mp2warmup? Labs Continue labs with your assigned section
Coding and Calling Functions An assembly language programmer handles a lot of details to coordinate the code for calling a function and the code in the function itself There are two mechanisms in the instruction set for calling and returning from functions: Linux system calls and returns int $0x80 and iret (hardware saves eip, eflag) C library style function calls and returns call and ret (hardware saves eip)
Coding and Calling Functions A really “old school” way to pass data back and forth between assembly language functions is to leave all data in “global memory” This was really very efficient back when CPU’s were not very powerful and some did not have hardware supported stack mechanisms Today we understand the software maintenance problem that this choice creates and the CPU’s are powerful enough for us to not need to do it
Coding and Calling Functions A somewhat “old school” way to call functions: Load up registers with input values before call Unload return values from registers after return (if any) This is still in use in Linux system calls, such as: # <unistd> write as a Linux system call movl $4, %eax # system call value movl $1, %ebx # file descriptor movl $output, %ecx # *buffer movl $len, %edx # length int $0x80 # call to system
Coding and Calling Functions We won’t use the Linux system call and return mechanism in this course, but: I feel that you should be aware of it and recognize it when the textbook uses it in an example We’ll use the iret instruction later with hardware interrupts We will use the call and ret mechanism as is typically used for C library function calls
main function in C Example of a caller function in C: extern int mycode(int x, int y); int main(void){ … z = mycode(x, y); }
Compiled output for main() # C compiler generated code for main: .text . . . pushl y # put arg y on stack pushl x # put arg x on stack call _mycode # call function mycode addl $8, %esp # purge args from stack movl %eax, z # save return value .data z: .long 0 # location for variable z
Function Written in C Example of a callee function in C : int mycode(int x, int y) { /* automatic variables */ int i; int j; . . . return result; }
Function Written in Assembly calling sequence discussed in later slides .text .globl _mycode _mycode: # entry point label pushl %ebp # set up stack frame movl %esp,%ebp # save %esp in %ebp subl $n,%esp # automatic variables .. . # code as needed movl xxx, %eax # set return value movl %ebp, %esp # restore %esp from %ebp popl %ebp # restore %ebp ret # return to caller .end
Writing C functions in Assembly- General Coding Conventions Use same function name as used in the calling C program except add a leading underscore ‘_’ Setup C compiler stack frame (optional) Use only %eax, %ecx, and %edx as scratch registers (see http://wiki.osdev.org/Calling_Conventions) Save/restore any other registers on stack if used Put return value in %eax Remove C compiler stack frame (optional) Return
Writing C functions in Assembly- Coding Conventions for callee If function has arguments or automatic variables (that require n bytes), include this optional code Assembly language after entry point label (enter): pushl %ebp # set up stack frame movl %esp,%ebp # save %esp in %ebp subl $n,%esp # automatic variables Assembly language before ret (leave): movl %ebp, %esp # restore %esp from %ebp popl %ebp # restore %ebp
Stack Pointer For C Function Call State of the stack during function execution: %esp before setting up call Points to previous stack frame %esp %ebp Lower level Function Calls %esp before call _func %ebx j i %ebp %eip x y Automatic Variables i = -4(%ebp) j = -8(%ebp) Return Address Argument Variables Lower Level Function Returns x = 8(%ebp) Low address High address y = 12(%ebp)
C Compiler Reserved Registers The C compiler assumes it can keep data in certain registers (%ebx, %ebp) when it generates code If assembly code uses compiler’s reserved registers, it must save and restore the values for the calling C code Example: . . . # we can’t use %ebx yet pushl %ebx # save register contents . . . # we can use %ebx now popl %ebx # restore %ebx . . . # we can’t use %ebx any more ret Matching pair
Code Demo 1 # demo_fun1.s:add 2 integers 1 /* demo_fun1c.c: 2 # Need to preserve ebp 3 # Ron Cheung 02/28/2015 4 # 5 .text 6 .globl _add2 7 _add2: 8 # enter (setup stack frame) 9 pushl %ebp 10 movl %esp, %ebp 11 subl $8, %esp # pointer for local automatic 12 movl 8(%ebp), %eax # move a to eax 13 addl 12(%ebp), %eax # add b to eax 14 done: 15 # leave (remove stack frame) 16 movl %ebp, %esp 17 popl %ebp 18 ret 19 .end 1 /* demo_fun1c.c: 2 Demo program for a C main calling an assembly function 3 C compiler expects ebp, ebx to be preserved 4 Ron Cheung 02/28/2015 5 */ 6 7 #include <stdio.h> 8 extern int add2(int a, int b); 9 int abc; 10 int a=0x10, b=0x20; 11 12 int main(void) 13 { 14 abc = add2(a, b); 15 printf("The sum is %d\n", abc); 16 return 0; 17 }
Disassembled Output _main 22 00000038 </groups/ulab/pcdev/lib/startup0.opc-1000c8> pushl %ebp 23 00000039 </groups/ulab/pcdev/lib/startup0.opc-1000c7> movl %esp,%ebp 24 0000003b </groups/ulab/pcdev/lib/startup0.opc-1000c5> call 00000638 </groups/ulab/pcdev/lib/startup0.opc-ffac8> 25 00000040 </groups/ulab/pcdev/lib/startup0.opc-1000c0> movl 0x100fbc,%eax 26 00000045 </groups/ulab/pcdev/lib/startup0.opc-1000bb> pushl %eax 27 00000046 </groups/ulab/pcdev/lib/startup0.opc-1000ba> movl 0x100fb8,%eax 28 0000004b </groups/ulab/pcdev/lib/startup0.opc-1000b5> pushl %eax 29 0000004c </groups/ulab/pcdev/lib/startup0.opc-1000b4> call 00000078 </groups/ulab/pcdev/lib/startup0.opc-100088> 30 00000051 </groups/ulab/pcdev/lib/startup0.opc-1000af> addl $0x8,%esp 31 00000054 </groups/ulab/pcdev/lib/startup0.opc-1000ac> movl %eax,%eax 32 00000056 </groups/ulab/pcdev/lib/startup0.opc-1000aa> movl %eax,0x100fd0 33 0000005b </groups/ulab/pcdev/lib/startup0.opc-1000a5> movl 0x100fd0,%eax 34 00000060 </groups/ulab/pcdev/lib/startup0.opc-1000a0> pushl %eax 35 00000061 </groups/ulab/pcdev/lib/startup0.opc-10009f> pushl $0x100128 36 00000066 </groups/ulab/pcdev/lib/startup0.opc-10009a> call 00000640 </groups/ulab/pcdev/lib/startup0.opc-ffac0> 37 0000006b </groups/ulab/pcdev/lib/startup0.opc-100095> addl $0x8,%esp 38 0000006e </groups/ulab/pcdev/lib/startup0.opc-100092> xorl %eax,%eax 39 00000070 </groups/ulab/pcdev/lib/startup0.opc-100090> jmp 00000074 </groups/ulab/pcdev/lib/startup0.opc-10008c> 40 00000072 </groups/ulab/pcdev/lib/startup0.opc-10008e> nop 41 00000073 </groups/ulab/pcdev/lib/startup0.opc-10008d> nop 42 00000074 </groups/ulab/pcdev/lib/startup0.opc-10008c> leave 43 00000075 </groups/ulab/pcdev/lib/startup0.opc-10008b> ret 44 00000076 </groups/ulab/pcdev/lib/startup0.opc-10008a> nop 45 00000077 </groups/ulab/pcdev/lib/startup0.opc-100089> nop
Disassembled Output (cont’d) _add2 46 00000078 </groups/ulab/pcdev/lib/startup0.opc-100088> pushl %ebp 47 00000079 </groups/ulab/pcdev/lib/startup0.opc-100087> movl %esp,%ebp 48 0000007b </groups/ulab/pcdev/lib/startup0.opc-100085> subl $0x8,%esp 49 0000007e </groups/ulab/pcdev/lib/startup0.opc-100082> movl 0x8(%ebp),%eax 50 00000081 </groups/ulab/pcdev/lib/startup0.opc-10007f> addl 0xc(%ebp),%eax 51 00000084 </groups/ulab/pcdev/lib/startup0.opc-10007c> movl %ebp,%esp 52 00000086 </groups/ulab/pcdev/lib/startup0.opc-10007a> popl %ebp 53 00000087 </groups/ulab/pcdev/lib/startup0.opc-100079> ret
Printing From Assembler The C calling routine (helloc.c according to our convention) to get things going is: extern void hello(); int main(int argc, char ** argv) { hello(); return 0; }
Printing From Assembler Assembly code to print Hello: .globl _hello .text _hello: pushl $hellostr # pass string argument call _printf # print the string addl $4, %esp # restore stack ret .data hellostr: .asciz “Hello\n” # printf format string .end
Printing from Assembler Assembly code to use a format statement and variable: . . . pushl x # x is a 32-bit integer pushl $format # pointer to format call _printf # call C printf routine addl $8, %esp # purge the arguments x: .long 0x341256 format: .asciz “x is: %d”
Turning It Around Calling a C function from Assembly Language Can use printf to help debug assembly code (although it’s better to use either tutor or gdb as a debugger ) Assume C functions “clobber” the contents of the %eax, %ecx, and %edx registers If you need to save them across a C function call: Push them on the stack before the call Pop them off the stack after the return
Preserving Compiler Scratch Registers C compiler assumes that it can use certain registers when it generates code (%eax, %ecx, and %edx) A C function may or may not clobber the value of these registers If assembly code needs to preserve the values of these registers across a C function call, it must save/restore their: . . . # if ecx is in use pushl %ecx # save %ecx call _cFunction # may clobber ecx popl %ecx # restore %ecx . . . # ecx is OK again
Function Calling Summary Passing information to functions in assembly In global memory (Horrors!) In registers (“old school” for assembly calling assembly) On the stack (for C compatible functions) For C compatible functions, the stack is used to hold: Arguments Return address Previous frame’s %ebp value Automatic variables Saved values for other compiler used registers (if needed) Passing information from function back to caller In registers (%eax is C compiler return value convention)