Functions/Methods in Assembly Computer Architecture
Methods/Functions
- A logical collection of frequently used instructions
- Also called subprograms
- Enable code reuse and minimize reinvention
- Ease development of larger programs
- Enable concurrent development
x86 Support
- The x86 architecture provides some basic support for developing functions
  - Using the CALL and RET instructions
- Involves the use of a stack
  - Used to store the address for resuming after a function call
  - Used to pass parameters
  - Used to store local variables
    - Variables that are defined within the scope of the function
- You must have a good understanding of stack operation in order to use functions effectively!
Stack
- The stack is a Last-In First-Out (LIFO) data structure
  - Rather loosely specified in the x86 architecture
- Some part of memory is reserved for stack use
- The Stack Segment (SS) register indicates the beginning of the memory (RAM) reserved as stack space
  - Set up by the OS
- The Stack Pointer (SP, or ESP in 32-bit code) register indicates the top of the stack
  - Initialized by the OS
  - Manipulated directly or indirectly using various instructions
Stack Operations
- x86 includes 2 instructions for stack operations
- PUSH (16- or 32-bit value)
  - Operand can be a register, symbol, or immediate value
  - PUSH decrements ESP by the size of the operand
  - And then copies the data onto the stack
- POP (16- or 32-bit value)
  - Operand can be a register or symbol
  - Copies the specified number of bytes from the top of the stack into the operand
  - Increments ESP by the size of the operand
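Stack Operation Example
Example assembly, with the effect of each instruction:

    pushl $0x12345678    /* ESP -= 4; Mem(ESP) = 0x12345678 */
    popw  %ax            /* AX = Mem(ESP) = 0x5678; ESP += 2 */
    popw  %bx            /* BX = Mem(ESP) = 0x1234; ESP += 2 */

Memory after the pushl (x86 is little-endian, so the low byte is stored at the lowest address):

    Address   Byte   SP
    0x101
    0x102     78     SP after pushl (0x102)
    0x103     56
    0x104     34     SP after the first popw (0x104)
    0x105     12
    0x106            SP before pushl and after the second popw (0x106); stack space that is already used up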
Direct Stack Inspection
- The stack can be directly inspected and modified
  - It is just another memory location

    /* Stack Operations */
    .text
    .global _start
    _start:
        /* simulate push operation */
        sub  $2, %esp
        movw $0x1234, (%esp)    /* $ marks an immediate value */
        /* simulate pop operation */
        movb (%esp), %al        /* low byte */
        inc  %esp
        movb (%esp), %ah        /* high byte */
        inc  %esp               /* ESP is back to its original value */
Caution!
- No matter how you do it, you must always undo any changes you make to the stack pointer!
  - Whether in your main method
  - Or in any other subprogram
- Other points to note
  - The data type pushed and the data type popped don’t have to match
  - The number of pushes doesn’t have to match the number of pops
  - Only the total number of bytes pushed must equal the total number of bytes popped
Functions
- There is no clear delineation of functions in assembly
  - You just use a label to denote the start of a function
  - It conceptually becomes a function
- Invoke the function using the CALL instruction
  - You have to suitably arrange the parameters to the function on the stack or in registers
- The method returns using the RET instruction
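Example

    .text
    .global _start
    _start:
        /* Example does eax -= 3 */
        call _decEAX
        call _decEAX
        call _decEAX
        /* Exit (Linux) so execution does not fall through into _decEAX */
        movl $1, %eax
        movl $0, %ebx
        int  $0x80

    /* Function to decrement EAX */
    _decEAX:
        dec %eax
        ret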
Working of CALL & RET
- CALL <destAddress> does the following:
  - Pushes EIP (the address of the next instruction) onto the stack
    - Conceptually a "PUSH %EIP" (EIP cannot actually be named as a PUSH operand)
    - This causes the return address to be stored on the stack
    - Sufficient to handle even complex recursion!
  - Jumps to the specified <destAddress>
- RET does the following:
  - Pops 4 bytes off the stack into EIP
    - Popping EIP causes the instruction after the CALL to this method to be executed next
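The two steps of CALL and the single step of RET can be spelled out by hand. The following is a minimal sketch, assuming 32-bit Linux and GNU as (for example `as --32 call.s -o call.o && ld -m elf_i386 call.o -o call`); the label `resume` and the use of ECX as a scratch register are illustrative choices, not part of the original example. A real RET does not clobber any register.

```asm
    .text
    .global _start
_start:
    movl  $5, %eax

    /* Hand-written equivalent of "call _decEAX" */
    pushl $resume           /* step 1: push the return address   */
    jmp   _decEAX           /* step 2: jump to the function      */
resume:
    /* eax is now 4; use it as the exit status (Linux sys_exit) */
    movl  %eax, %ebx
    movl  $1, %eax
    int   $0x80

_decEAX:
    dec   %eax
    /* Hand-written equivalent of "ret" (ecx is a scratch register here) */
    popl  %ecx              /* pop the return address off the stack */
    jmp   *%ecx             /* continue at that address             */
```

Running the program and then `echo $?` should print 4, the same result as the CALL/RET version.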
Example Working

    0x200: call _decEAX
    0x205: /* … */

    0x300: _decEAX:
    0x300:     dec %eax
    0x301:     ret

Stack while _decEAX is running (the 4-byte return address 0x00000205 is stored low byte first):

    Address   Byte   ESP
    0x101
    0x102     05     ESP during the call to _decEAX
    0x103     02
    0x104     00
    0x105     00
    0x106            ESP before CALL and after RET; stack space that is already used up
Parameters on the Stack
- Parameters are typically passed to methods using the stack
- Requires some agreement or convention between the caller and the called method (callee) as to how parameters are handled
  - In what order are the parameters going to be stored?
  - Are the parameters going to be removed from the stack?
  - How are values returned by the method?
- In this course we are going to follow the C/C++ (GNU) calling conventions
Java view of Methods
Here is a Java view of the method call that I am going to discuss further on:

    public class Tester {
        static int doSomething(int x, int y, int z) {
            return x + y + z;
        }
        public static void main(String[] args) {
            int a = 10, b = 20, c = 30;
            int result = doSomething(a, b, c);
        }
    }
Caller Conventions via Example
- Consider invoking a function doSomething(int a, int b, int c);
- The caller pushes the parameters from right to left, so the call is encoded as:

    pushl c
    pushl b
    pushl a
    call  doSomething
    /* Clean up stack */
    addl  $12, %esp
    /* Process return value */
    /* in eax register */
    movl  %eax, result

Stack layout just after the call to the doSomething method (lowest address first):

    ESP ->  return addr. (4 bytes)
            a (4 bytes)
            b (4 bytes)
            c (4 bytes)
Called Method Conventions
- The following tasks must be performed
- Must preserve the values in the %ebp, %ebx, %esi, and %edi registers
  - If you change these registers in a method, push them onto the stack and then pop them off!
- Use %eax for return values
- Use ebp to directly access parameters on the stack if the method requires parameters!
  - First save the ebp register on the stack
  - Set ebp to the current value of esp
  - Use ebp to access parameters
  - Pop ebp off the stack just before returning from the method
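As an illustration of the register-preservation rule, here is a minimal sketch of a method that uses %ebx internally and restores it before returning. The function name `twice` and the values 7 and 21 are hypothetical; the sketch assumes 32-bit Linux and GNU as (`as --32`, `ld -m elf_i386`).

```asm
    .text
    .global _start
_start:
    movl  $7, %ebx          /* caller keeps a value in ebx            */
    pushl $21               /* parameter x                            */
    call  twice             /* eax = 42                               */
    addl  $4, %esp          /* caller removes the parameter           */
    subl  %ebx, %eax        /* ebx is still 7, so eax = 42 - 7 = 35   */
    movl  %eax, %ebx        /* exit status = 35                       */
    movl  $1, %eax          /* Linux sys_exit                         */
    int   $0x80

/* int twice(int x) { return x + x; } */
twice:
    pushl %ebp              /* save caller's ebp                      */
    movl  %esp, %ebp        /* ebp now addresses the parameters       */
    pushl %ebx              /* ebx must be preserved, so save it      */
    movl  8(%ebp), %ebx     /* ebx = x                                */
    movl  %ebx, %eax
    addl  %ebx, %eax        /* eax = x + x (return value)             */
    popl  %ebx              /* restore ebx                            */
    popl  %ebp              /* restore ebp                            */
    ret
```

If the `pushl %ebx` / `popl %ebx` pair were removed, the caller would see 21 in ebx after the call and the exit status would be 21 instead of 35.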
Called method by Example

    /*
    static int doSomething(int x, int y, int z) {
        return x + y + z;
    }
    */
    doSomething:
        pushl %ebp
        movl  %esp, %ebp
        movl  8(%ebp), %eax    /* eax = x */
        addl  12(%ebp), %eax   /* eax += y */
        addl  16(%ebp), %eax   /* eax += z */
        popl  %ebp
        ret
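To see the caller and the called method working together, the pieces can be combined into one program. This is a sketch under the assumption of 32-bit Linux and GNU as (`as --32`, `ld -m elf_i386`); the constants 10, 20, and 30 come from the Java view, and passing the result out as the exit status is only a convenient way to observe it. Each parameter is 4 bytes, and the saved ebp and the return address occupy the first 8 bytes above ebp, so x, y, and z sit at offsets 8, 12, and 16.

```asm
    .text
    .global _start
_start:
    pushl $30               /* c (pushed first: right to left)   */
    pushl $20               /* b                                 */
    pushl $10               /* a                                 */
    call  doSomething       /* eax = 10 + 20 + 30                */
    addl  $12, %esp         /* clean up the three parameters     */
    movl  %eax, %ebx        /* exit status = result (60)         */
    movl  $1, %eax          /* Linux sys_exit                    */
    int   $0x80

doSomething:
    pushl %ebp
    movl  %esp, %ebp
    movl  8(%ebp), %eax     /* eax = x                           */
    addl  12(%ebp), %eax    /* eax += y                          */
    addl  16(%ebp), %eax    /* eax += z                          */
    popl  %ebp
    ret
```

After running the program, `echo $?` should print 60.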
Local Variables
- There is no explicit notion of local variables in assembly
- Simply allocate some space on the stack and use it
  - Decrement ESP by the amount of space you need
  - Restore ESP after you have finished using the space
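A minimal sketch of a method with one 4-byte local variable follows. The function `sumOfSquares` and its local `t` are hypothetical names chosen for illustration; the sketch assumes 32-bit Linux and GNU as (`as --32`, `ld -m elf_i386`). With ebp set up as in the calling convention, parameters are at positive offsets from ebp and locals at negative offsets.

```asm
    .text
    .global _start
_start:
    pushl $4                /* y                                   */
    pushl $3                /* x                                   */
    call  sumOfSquares      /* eax = 3*3 + 4*4 = 25                */
    addl  $8, %esp
    movl  %eax, %ebx        /* exit status = 25                    */
    movl  $1, %eax          /* Linux sys_exit                      */
    int   $0x80

/* int sumOfSquares(int x, int y) { int t = x * x; return t + y * y; } */
sumOfSquares:
    pushl %ebp
    movl  %esp, %ebp
    subl  $4, %esp          /* allocate 4 bytes: t lives at -4(%ebp) */
    movl  8(%ebp), %eax
    imull %eax, %eax        /* eax = x * x                          */
    movl  %eax, -4(%ebp)    /* t = x * x                            */
    movl  12(%ebp), %eax
    imull %eax, %eax        /* eax = y * y                          */
    addl  -4(%ebp), %eax    /* eax = t + y * y                      */
    movl  %ebp, %esp        /* restore ESP: releases the local      */
    popl  %ebp
    ret
```

Restoring ESP from ebp releases all locals at once, however many bytes were allocated.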
Pass by value or reference
- Pass by value
  - Push values on the stack

        push variable1
        push length

- Pass by reference
  - Push addresses on the stack (the $ prefix yields the address of the symbol)

        push $variable1
        push $length
        push $stringName
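The difference matters when the called method needs to modify the caller's data. Below is a sketch, assuming 32-bit Linux and GNU as (`as --32`, `ld -m elf_i386`); the names `counter` and `increment` are hypothetical. The method receives the address of a variable and changes the variable through that address.

```asm
    .data
counter:
    .long 41

    .text
    .global _start
_start:
    pushl $counter          /* pass by reference: push the address  */
    call  increment
    addl  $4, %esp
    movl  counter, %ebx     /* counter is now 42; use as exit status */
    movl  $1, %eax          /* Linux sys_exit                        */
    int   $0x80

/* void increment(int *p) { (*p)++; } */
increment:
    pushl %ebp
    movl  %esp, %ebp
    movl  8(%ebp), %ecx     /* ecx = p (the address that was pushed) */
    incl  (%ecx)            /* modify the caller's variable          */
    popl  %ebp
    ret
```

Had the caller used `pushl counter` instead, the method would have received the value 41 rather than an address, and could not have updated the variable.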
What is an Interrupt
- An interrupt is a more elaborate function call
  - It is typically used to transfer control from one program to another
    - One program is typically yours
    - The other program is either the operating system or the BIOS
- Interrupts have a slightly different API
  - All parameters are passed via registers
    - The stack is not used
  - Registers contain return values, as with normal functions
  - None of the registers are guaranteed to be preserved
How to invoke an interrupt?
- Interrupts (or elaborate function calls) are invoked using an interrupt number
  - For example, invoking interrupt 0x80 is done by: int $0x80
  - Interrupt numbers range from 0x00 to 0xFF
    - Maximum of 256 interrupts
- What does the interrupt number signify?
  - Some function that has been associated with that number
- How are functions associated with interrupt numbers?
  - Using a table called an Interrupt Vector Table (IVT)
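As a concrete case of parameters in registers, here is a sketch that uses interrupt 0x80 on 32-bit Linux to print a message and then exit. It assumes the 32-bit Linux system call numbers (4 for write, 1 for exit) and GNU as (`as --32`, `ld -m elf_i386`); the label `msg` and the symbol `msgLen` are illustrative names.

```asm
    .data
msg:
    .ascii "Hello, interrupt!\n"
    .equ msgLen, . - msg    /* length of the message in bytes      */

    .text
    .global _start
_start:
    /* write(1, msg, msgLen): every parameter goes in a register */
    movl  $4, %eax          /* eax = system call number (write)    */
    movl  $1, %ebx          /* ebx = file descriptor (stdout)      */
    movl  $msg, %ecx        /* ecx = address of the buffer         */
    movl  $msgLen, %edx     /* edx = number of bytes               */
    int   $0x80             /* return value comes back in eax      */

    /* exit(0) */
    movl  $1, %eax          /* eax = system call number (exit)     */
    movl  $0, %ebx          /* ebx = exit status                   */
    int   $0x80
```

Note that the same interrupt number serves both calls; the value in eax selects which service the OS performs.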
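Interrupt Vector Table (IVT)
- The IVT is a simple data structure
  - Shared by all programs
  - Occupies the first part of memory, starting at address 0x0
  - 256 entries in the IVT, one per interrupt number
  - Each entry (4 bytes) holds a segment:offset address (CS:IP in real mode)
- Various programs fill in the entries with the addresses of their functions
  - These functions are called Interrupt Service Routines (ISRs)

[Figure: memory map from 00000 to FFFFF. The IVT starts at 00000; its entries hold ISR addresses such as F1500 and F2150, and entry 80 holds F2550. Your program is loaded at 07C00. ISRs reside at F1500 and F2550; the one at F2550 is the ISR for the Linux OS.]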
Working of Interrupts
- When the microprocessor executes INT $0x80:
  - It pushes the EFLAGS, CS, and then EIP registers onto the stack, in this order
  - It jumps to the address specified in the IVT for 0x80
    - In our case it is F2550
- The ISR starts running
  - It is a standard assembly program
- The ISR uses the IRET instruction to return
  - IRET pops EIP, CS, and EFLAGS off the stack!
  - Execution resumes at the instruction following the INT in the interrupted program
CALL vs. INT

| CALL | INT |
|---|---|
| The CALL instruction directly includes and uses the address of the function called. | INT uses the interrupt number to look up the IVT to determine the actual address of the function. |
| CALL pushes only the EIP register onto the stack. | INT pushes the EFLAGS, CS, and EIP registers onto the stack. |
| Functions invoked via CALL must use RET (pops EIP only) to exit the function. | ISRs invoked via INT must use IRET (pops EIP, CS, and EFLAGS) to exit from the ISR. |
| Functions can use the stack for parameter passing. | ISRs cannot use the caller’s stack because they are in a different program! They have to use registers to pass parameters. |
Why are Interrupts needed?
- Interrupts enable relocation of BIOS or OS ISRs in memory while providing a consistent API
  - Different manufacturers may place the BIOS at different places in memory
    - However, you consistently call BIOS ISRs using predefined interrupt numbers
  - Different versions of the OS may place ISRs at different locations in memory
    - May depend on how much physical memory the machine has
    - May depend on the location of the ROM-BIOS
    - However, you consistently use INT $0x80 to invoke the OS’s ISR!
- Without this feature it is practically impossible to develop operating systems or any other software that can be installed on a variety of hardware platforms