Published by Arleen Mosley. Modified over 8 years ago.
1
Intel XScale® Assembly Language and C
2
The Intel XScale® Programmer’s Model (1)
(We will not be using the Thumb instruction set.)
Memory Formats
  –We will be using the little-endian format: the lowest-numbered byte of a word is the word’s least significant byte, and the highest-numbered byte is the most significant byte.
Instruction Length
  –All instructions (ARM instructions) are 32 bits long.
Data Types
  –8-bit bytes and 32-bit words.
Processor Modes (of interest)
  –User: the “normal” program execution mode.
  –IRQ: used for general-purpose interrupt handling.
  –Supervisor: a protected mode for the operating system.
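To make the little-endian rule above concrete, here is a minimal C sketch that stores a 32-bit word one byte at a time. The helper names `store_le32` and `load_le32` are hypothetical (they do not come from the slides or from any XScale toolchain), and the code is written so that it behaves the same on any host.

```c
#include <stdint.h>

/* Store a 32-bit word in little-endian order: the least significant
   byte goes to the lowest-numbered byte (buf[0]), the most significant
   byte to the highest-numbered byte (buf[3]). */
void store_le32(uint8_t buf[4], uint32_t word)
{
    buf[0] = (uint8_t)(word & 0xFFu);
    buf[1] = (uint8_t)((word >> 8) & 0xFFu);
    buf[2] = (uint8_t)((word >> 16) & 0xFFu);
    buf[3] = (uint8_t)((word >> 24) & 0xFFu);
}

/* Reassemble the word from its little-endian byte sequence. */
uint32_t load_le32(const uint8_t buf[4])
{
    return (uint32_t)buf[0]
         | ((uint32_t)buf[1] << 8)
         | ((uint32_t)buf[2] << 16)
         | ((uint32_t)buf[3] << 24);
}
```

Storing 0x12345678 this way places 0x78 in byte 0 and 0x12 in byte 3.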
3
The Intel XScale® Programmer’s Model (2)
The Intel XScale® Register Set
  –Registers R0-R15 + CPSR (Current Program Status Register)
  –R13: Stack Pointer
  –R14: Link Register
  –R15: Program Counter, where bits 1:0 are ignored
Program Status Registers
  –CPSR (Current Program Status Register)
      holds information about the most recently performed ALU operation: contains the N (negative), Z (zero), C (carry) and V (overflow) bits
      controls the enabling and disabling of interrupts
      sets the processor operating mode
  –SPSR (Saved Program Status Registers)
      used by exception handlers
Exceptions
  –reset, undefined instruction, SWI, IRQ.
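As an illustration of what the N, Z, C and V bits record, the following C sketch computes them for a 32-bit addition, in the way a flag-setting add would. It is only a model under stated assumptions: the `Flags` type and `add_flags` function are hypothetical names, and only addition is covered.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical container for the four condition bits of the CPSR. */
typedef struct {
    bool n;  /* negative: bit 31 of the result */
    bool z;  /* zero: result is 0 */
    bool c;  /* carry: unsigned carry out of bit 31 */
    bool v;  /* overflow: signed overflow */
} Flags;

/* Add x and y as 32-bit values and report the resulting flags.
   The sum is written to *result when result is non-null. */
Flags add_flags(uint32_t x, uint32_t y, uint32_t *result)
{
    uint32_t r = x + y;  /* wraps modulo 2^32 */
    Flags f;

    f.n = (r >> 31) != 0;
    f.z = (r == 0);
    f.c = (r < x);       /* wrapped around, so a carry occurred */
    /* Signed overflow: both operands differ in sign from the result. */
    f.v = (((x ^ r) & (y ^ r)) >> 31) != 0;

    if (result) {
        *result = r;
    }
    return f;
}
```

For example, adding 1 to 0x7FFFFFFF sets N and V, while adding 1 to 0xFFFFFFFF sets Z and C.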
4
Intro to Intel XScale® Assembly Language
“Load/store” architecture
32-bit instructions
32-bit and 8-bit data types
32-bit addresses
37 registers (30 general-purpose registers, 6 status registers and a PC)
  –only a subset is accessible at any point in time
Load and store multiple instructions
No instruction to move a 32-bit constant to a register (why?)
Conditional execution
Barrel shifter
  –scaled addressing, multiplication by a small constant, and ‘constant’ generation
Co-processor instructions (we will not use these)
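One way to approach the “(why?)” above: a 32-bit instruction has no room for a full 32-bit constant next to its opcode and register fields. ARM data-processing instructions instead encode an immediate as an 8-bit value rotated right by an even number of bits, which is the ‘constant’ generation role of the barrel shifter. The C sketch below tests whether a constant fits that encoding; `rotl32` and `is_arm_immediate` are hypothetical helper names used only for illustration.

```c
#include <stdint.h>
#include <stdbool.h>

/* Rotate a 32-bit value left by n bits (n is taken modulo 32). */
static uint32_t rotl32(uint32_t v, unsigned n)
{
    n &= 31u;
    return n == 0 ? v : (v << n) | (v >> (32u - n));
}

/* Return true if value can be written as an 8-bit constant rotated
   right by an even amount (0, 2, 4, ..., 30). Undoing the rotation
   means rotating left by the same amount and checking that only the
   low 8 bits remain set. */
bool is_arm_immediate(uint32_t value)
{
    for (unsigned rot = 0; rot < 32; rot += 2) {
        if (rotl32(value, rot) <= 0xFFu) {
            return true;
        }
    }
    return false;
}
```

Under this rule 0xFF, 0x100 and 0xFF000000 are encodable, while 0x101 and 0x12345678 are not, so such constants have to be built in several steps or loaded from memory.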
5
Intel XScale® Assembly Language Basics
Conditional Execution
The Intel XScale® Barrel Shifter
Loading Constants into Registers
Loading Addresses into Registers
Jump Tables
Using the Load and Store Multiple Instructions
Check out Chapters 1 through 5 of the ARM Architecture Reference Manual.
6
Generating Assembly Language Code from C
Use the command-line option -S.
  –When you compile a .c file, you get a .s file.
  –This .s file contains the assembly language code generated by the compiler.
When assembled, this code can potentially be linked and loaded as an executable.
7
Register Names and Use
Register #   APCS Name   APCS Role
R0           a1          argument 1
R1           a2          argument 2
R2           a3          argument 3
R3           a4          argument 4
R4..R8       v1..v5      register variables
R9           sb/v6       static base/register variable
R10          sl/v7       stack limit/register variable
R11          fp          frame pointer
R12          ip          scratch register/new-sb in inter-link-unit calls
R13          sp          low end of current stack frame
R14          lr          link address/scratch register
R15          pc          program counter
8
“Frame Pointer”
    foo:
            MOV     ip, sp
            STMDB   sp!, {a1-a3, fp, ip, lr, pc}
            LDMDB   fp, {fp, sp, pc}
[Figure: stack layout for addresses 0x90 down to 0x70 holding, from highest to lowest address, the saved pc, lr, ip, fp, a3, a2 and a1, with markers for ip, fp and SP.]
The frame pointer (fp) points to the top of the stack for the function.
9
The Frame Pointer
fp points to the top of the stack area for the current function
  –or is zero if not being used
By using the frame pointer and storing it at the same offset for every function call, the code creates a singly linked list of activation records.
Creating the stack “backtrace” structure:
            MOV     ip, sp
            STMFD   sp!, {a1-a4, v1-v5, sb, sl, fp, ip, sp, lr, pc}
            SUB     fp, ip, #4
[Figure: stack layout for addresses 0x90 down to 0x50 holding, from highest to lowest address, pc, lr, sp, ip, fp, v7, v6, v5, v4, v3, v2, v1, a4, a3, a2 and a1, with markers for SP before, SP after and FP after.]
10
The Frame Pointer
fp points to the top of the stack area for the current function
  –or is zero if not being used
By using the frame pointer and storing it at the same offset for every function call, the code creates a singly linked list of activation records.
  –The fp register points to the stack backtrace structure for the currently executing function.
  –The saved fp value is (zero or) a pointer to a stack backtrace structure created by the function which called the current function.
  –The saved fp value in that structure is a pointer to the stack backtrace structure for the function that called the function that called the current function, and so on back to the first function.
[Figure: stack layout for addresses 0x90 down to 0x50 holding (saved) pc, (saved) lr, (saved) sb, (saved) ip, (saved) fp, v7, v6, v5, v4, v3, v2, v1, a4, a3, a2 and a1, with markers for SP before, SP current and FP current.]
11
Example Backtrace
If main calls foo, which calls bar:
[Figure: three backtrace structures labelled bar’s frame, foo’s frame and main’s frame. Each holds (saved) pc, (saved) lr, (saved) sb, (saved) ip, (saved) fp, v7, v6, v5, v4, v3, v2, v1, a4, a3, a2 and a1. fp points to bar’s frame.]
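The chain of saved fp values can be followed like any singly linked list. The C sketch below simulates this with an array of words standing in for stack memory. Everything in it is an assumption made for illustration: the `SimStack` type, the function names, the use of word indices instead of byte addresses, and the small four-word frame {fp, ip, lr, pc} (the form used in the later examples) in place of the full register set shown in the figure. Small integer tags stand in for the saved pc values.

```c
#include <stdint.h>
#include <stddef.h>

#define STACK_WORDS 64

/* Simulated stack: "addresses" are word indices into mem. */
typedef struct {
    uint32_t mem[STACK_WORDS];
    uint32_t sp;  /* full descending: points at the last word pushed */
    uint32_t fp;  /* 0 means "no frame" */
} SimStack;

void sim_init(SimStack *s)
{
    for (size_t i = 0; i < STACK_WORDS; i++) {
        s->mem[i] = 0;
    }
    s->sp = STACK_WORDS;
    s->fp = 0;
}

/* Models: MOV ip, sp ; STMFD sp!, {fp, ip, lr, pc} ; SUB fp, ip, #4
   (one word here corresponds to 4 bytes there). */
void sim_enter(SimStack *s, uint32_t lr, uint32_t pc)
{
    uint32_t ip = s->sp;
    s->mem[ip - 1] = pc;     /* highest address: saved pc */
    s->mem[ip - 2] = lr;
    s->mem[ip - 3] = ip;
    s->mem[ip - 4] = s->fp;  /* lowest address: caller's fp */
    s->sp = ip - 4;
    s->fp = ip - 1;          /* fp points at the saved pc */
}

/* Walk the saved-fp chain, recording each frame's saved pc.
   Returns the number of frames visited (at most max). */
size_t sim_backtrace(const SimStack *s, uint32_t *pcs, size_t max)
{
    size_t n = 0;
    uint32_t fp = s->fp;
    while (fp != 0 && n < max) {
        pcs[n++] = s->mem[fp];  /* saved pc identifies the function */
        fp = s->mem[fp - 3];    /* saved fp: link to the caller's frame */
    }
    return n;
}
```

Calling `sim_enter` three times (for main, foo and bar) and then `sim_backtrace` visits bar’s frame first, then foo’s, then main’s, and stops at the zero fp saved by main.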
12
Creating the “backtrace” structure
            MOV     ip, sp
            STMFD   sp!, {a1-a4, v1-v5, sb, fp, ip, lr, pc}
            SUB     fp, ip, #4
            …
            LDMFD   fp, {fp, sp, sb, pc}
[Figure: stack layout for addresses 0x90 down to 0x50 holding (saved) pc, (saved) lr, (saved) sb, (saved) ip, (saved) fp, v7, v6, v5, v4, v3, v2, v1, a4, a3, a2 and a1, with markers for SP before, SP current and FP after.]
13
How Does STM Place Things into Memory?
            STMDB   sp!, {r0-r15}
The XScale processor uses a bit vector in the instruction, with one bit per register, to represent the registers to be saved.
The architecture places the lowest-numbered register at the lowest address.
For a full descending stack, STMFD == STMDB (a plain STM with no suffix assembles as STMIA).
[Figure: stack layout for addresses 0x90 down to 0x50 holding, from highest to lowest address, pc, lr, sp, ip, fp, v7, v6, v5, v4, v3, v2, v1, a4, a3, a2 and a1, with markers for SP before and SP after.]
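The two rules above (a bit vector selects the registers, and the lowest-numbered register lands at the lowest address) can be sketched in C as a model of a decrement-before store multiple with writeback. The function name `stmdb_writeback` and the word-array memory model are assumptions made for this illustration, not part of any real API.

```c
#include <stdint.h>

/* Model of "STMDB base!, {reglist}".
   mem      word-addressed memory; byte address A lives in mem[A / 4]
   base     byte address held in the base register before the store
   reglist  bit i set means register Ri is stored
   regs     current values of R0..R15
   Returns the written-back base (the new, lower stack pointer). */
uint32_t stmdb_writeback(uint32_t mem[], uint32_t base,
                         uint16_t reglist, const uint32_t regs[16])
{
    unsigned count = 0;
    for (int r = 0; r < 16; r++) {
        if (reglist & (1u << r)) {
            count++;
        }
    }

    /* Decrement before: the block occupies base - 4*count .. base - 4. */
    uint32_t start = base - 4u * count;
    uint32_t addr = start;

    /* Ascending register numbers go to ascending addresses, whatever
       order the registers were written in the source line. */
    for (int r = 0; r < 16; r++) {
        if (reglist & (1u << r)) {
            mem[addr / 4] = regs[r];
            addr += 4;
        }
    }
    return start;
}
```

With base 0x90 and the list {fp, ip, lr, pc} (R11, R12, R14, R15), fp is stored at 0x80, ip at 0x84, lr at 0x88 and pc at 0x8c, and the returned base is 0x80.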
14
Example 1: A Simple Program
C source:
    int a, b;
    int main()
    {
        a = 3;
        b = 4;
    } /* end main() */
Generated assembly:
            .text                       /* section declaration */
            .align  2
            .global main                /* export entry point */
            .type   main, %function
    main:
            mov     ip, sp
            stmfd   sp!, {fp, ip, lr, pc}
            sub     fp, ip, #4
            ldr     r2, .L2
            mov     r3, #3
            str     r3, [r2, #0]        /* a=3 */
            ldr     r2, .L2+4
            mov     r3, #4
            str     r3, [r2, #0]        /* b=4 */
            mov     r0, r3
            ldmfd   sp, {fp, sp, pc}    /* return */
    .L3:
            .align  2
    .L2:
            .word   a
            .word   b
STMFD: store multiple, full descending
    sp <- sp - 4;  mem[sp] = pc    ; program counter
    sp <- sp - 4;  mem[sp] = lr    ; link register
    sp <- sp - 4;  mem[sp] = ip    ; new stack base
    sp <- sp - 4;  mem[sp] = fp    ; frame pointer
LDMFD: load multiple, full descending (no writeback here; addr starts at sp)
    fp = mem[addr]   (saved fp)    ; frame pointer
    addr <- addr + 4
    sp = mem[addr]   (saved ip)    ; stack pointer
    addr <- addr + 4
    pc = mem[addr]   (saved lr)    ; program counter
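The STMFD and LDMFD steps listed above can be checked with a small C model of the prologue and epilogue used in this example. The `Regs` type, the function names and the word-array memory are assumptions made for this sketch; only the five registers involved are modelled.

```c
#include <stdint.h>

/* Only the registers touched by the prologue and epilogue. */
typedef struct {
    uint32_t fp, ip, sp, lr, pc;
} Regs;

/* Models: mov ip, sp ; stmfd sp!, {fp, ip, lr, pc} ; sub fp, ip, #4
   Byte address A lives in mem[A / 4]. */
void prologue(Regs *r, uint32_t mem[])
{
    r->ip = r->sp;
    r->sp -= 4; mem[r->sp / 4] = r->pc;  /* highest address */
    r->sp -= 4; mem[r->sp / 4] = r->lr;
    r->sp -= 4; mem[r->sp / 4] = r->ip;
    r->sp -= 4; mem[r->sp / 4] = r->fp;  /* lowest address */
    r->fp = r->ip - 4;                   /* fp points at the saved pc */
}

/* Models: ldmfd sp, {fp, sp, pc}
   Three words are read starting at sp. The saved ip becomes the new sp
   and the saved lr becomes the new pc, which performs the return. */
void epilogue(Regs *r, const uint32_t mem[])
{
    uint32_t addr = r->sp;
    uint32_t new_fp = mem[addr / 4];
    uint32_t new_sp = mem[addr / 4 + 1];
    uint32_t new_pc = mem[addr / 4 + 2];
    r->fp = new_fp;
    r->sp = new_sp;
    r->pc = new_pc;
}
```

Running `prologue` and then `epilogue` restores the caller's fp and sp and leaves pc equal to the link register value. The saved pc word is never reloaded; it only marks the frame.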
15
Example 2: Calling A Function
C source:
    int tmp, a, b;
    void swap(int a, int b);
    int main()
    {
        a = 3;
        b = 4;
        swap(a, b);
    } /* end main() */
    void swap(int a, int b)
    {
        tmp = a;
        a = b;
        b = tmp;
    } /* end swap() */
Generated assembly for main:
            .global main                /* export entry point */
            .type   main, %function
    main:
            mov     ip, sp
            stmfd   sp!, {fp, ip, lr, pc}
            sub     fp, ip, #4
            ldr     r2, .L2
            mov     r3, #3
            str     r3, [r2, #0]        /* a=3 */
            ldr     r2, .L2+4
            mov     r3, #4
            str     r3, [r2, #0]        /* b=4 */
            ldr     r3, .L2
            ldr     r2, .L2+4
            ldr     r0, [r3, #0]        /* a */
            ldr     r1, [r2, #0]        /* b */
            bl      swap                /* function call */
            mov     r0, r3
            ldmfd   sp, {fp, sp, pc}    /* return */
    .L3:
            .align  2
    .L2:
            .word   a
            .word   b
16
Example 2: Calling A Function (Cont’d)
C source:
    void swap(int a, int b)
    {
        tmp = a;
        a = b;
        b = tmp;
    } /* end swap() */
Generated assembly for swap:
            .global swap
            .type   swap, %function
    swap:
            mov     ip, sp
            stmfd   sp!, {fp, ip, lr, pc}
            sub     fp, ip, #4
            sub     sp, sp, #8
            str     r0, [fp, #-16]      /* a */
            str     r1, [fp, #-20]      /* b */
            ldr     r2, .L5             /* r2 = &tmp */
            ldr     r3, [fp, #-16]      /* r3 = a */
            str     r3, [r2, #0]        /* tmp = a */
            ldr     r2, [fp, #-20]      /* r2 = b */
            str     r3, [fp, #-16]      /* a */
            ldr     r3, .L5
            ldr     r3, [r3, #0]        /* tmp */
            ldr     r3, [fp, #-20]      /* r3 = b */
            str     r3, [fp, #-16]      /* a = b */
            ldr     r3, .L5
            ldr     r3, [r3, #0]        /* tmp */
            str     r3, [fp, #-20]      /* b = tmp */
            sub     sp, fp, #12
            ldmfd   sp, {fp, sp, pc}    /* return */
    .L6:
            .align  2
    .L5:
            .word   tmp
17
Example 3: Manipulating Pointers
C source:
    int tmp;
    int a, b;
    void swap(int *a, int *b);
    int main()
    {
        a = 3;
        b = 4;
        swap(&a, &b);
    } /* end main() */
    void swap(int *a, int *b)
    {
        tmp = *a;
        *a = *b;
        *b = tmp;
    } /* end swap() */
Generated assembly for main:
            .global main                /* export entry point */
            .type   main, %function
    main:
            mov     ip, sp
            stmfd   sp!, {fp, ip, lr, pc}
            sub     fp, ip, #4
            ldr     r2, .L2
            mov     r3, #3
            str     r3, [r2, #0]        /* a=3 */
            ldr     r2, .L2+4
            mov     r3, #4
            str     r3, [r2, #0]        /* b=4 */
            ldr     r0, .L2             /* r0 = &a (first argument) */
            ldr     r1, .L2+4           /* r1 = &b (second argument) */
            bl      swap                /* function call */
            mov     r0, r3
            ldmfd   sp, {fp, sp, pc}    /* return */
    .L3:
            .align  2
    .L2:
            .word   a
            .word   b
18
Example 3 (cont’d)
C source:
    void swap(int *a, int *b)
    {
        tmp = *a;
        *a = *b;
        *b = tmp;
    } /* end swap() */
Generated assembly for swap:
            .global swap
            .type   swap, %function
    swap:
            mov     ip, sp
            stmfd   sp!, {fp, ip, lr, pc}
            sub     fp, ip, #4
            sub     sp, sp, #8
            str     r0, [fp, #-16]      /* &a */
            str     r1, [fp, #-20]      /* &b */
            ldr     r2, .L5             /* r2 = &tmp */
            ldr     r3, [fp, #-16]      /* r3 = &a */
            ldr     r3, [r3, #0]        /* r3 = a */
            str     r3, [r2, #0]        /* tmp = a */
            ldr     r2, [fp, #-16]      /* r2 = &a */
            ldr     r3, [fp, #-20]      /* r3 = &b */
            ldr     r3, [r3, #0]        /* r3 = b */
            str     r3, [r2, #0]        /* a = b */
            ldr     r2, [fp, #-20]      /* r2 = &b */
            ldr     r3, .L5
            ldr     r3, [r3, #0]        /* r3 = tmp */
            str     r3, [r2, #0]        /* b = tmp */
            sub     sp, fp, #12
            ldmfd   sp, {fp, sp, pc}    /* return */
    .L6:
            .align  2
    .L5:
            .word   tmp
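The difference between Examples 2 and 3 shows up when the two versions of swap are placed side by side. The sketch below renames them `swap_by_value` and `swap_by_pointer` (names chosen here, not taken from the slides) so that both can live in one file. In the first, the stores go to the callee's own stack copies at [fp, #-16] and [fp, #-20]; in the second, the stores go through the addresses passed in.

```c
int tmp;

/* Example 2 style: a and b are copies that live in the callee's frame,
   so the caller's variables are unchanged. Only tmp is affected. */
void swap_by_value(int a, int b)
{
    tmp = a;
    a = b;
    b = tmp;
    (void)a;  /* silence "set but not used" warnings */
    (void)b;
}

/* Example 3 style: a and b are addresses, so the stores reach the
   caller's variables. */
void swap_by_pointer(int *a, int *b)
{
    tmp = *a;
    *a = *b;
    *b = tmp;
}
```

Starting from a = 3 and b = 4, the by-value version leaves a and b unchanged and sets tmp to 3, while the pointer version ends with a = 4 and b = 3.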
19
Example 4: Dealing with Lots of Arguments
C source:
    int tmp;
    void test(int a, int b, int c, int d, int *e);
    int main()
    {
        int a, b, c, d, e;
        a = 3;
        b = 4;
        c = 5;
        d = 6;
        e = 7;
        test(a, b, c, d, &e);
    } /* end main() */
    void test(int a, int b, int c, int d, int *e)
    {
        tmp = a;
        a = b;
        b = tmp;
        c = b;
        b = d;
        *e = d;
    } /* end test() */
Generated assembly for main:
    main:
            mov     ip, sp
            stmfd   sp!, {fp, ip, lr, pc}
            sub     fp, ip, #4
            sub     sp, sp, #24
            mov     r3, #3
            str     r3, [fp, #-16]
            mov     r3, #4
            str     r3, [fp, #-20]
            mov     r3, #5
            str     r3, [fp, #-24]
            mov     r3, #6
            str     r3, [fp, #-28]
            mov     r3, #7
            str     r3, [fp, #-32]
            sub     r3, fp, #32
            str     r3, [sp, #0]        /* &e */
            ldr     r0, [fp, #-16]      /* a */
            ldr     r1, [fp, #-20]      /* b */
            ldr     r2, [fp, #-24]      /* c */
            ldr     r3, [fp, #-28]      /* d */
            bl      test
            mov     r0, r3
            sub     sp, fp, #12
            ldmfd   sp, {fp, sp, pc}
20
Example 4 (cont’d)
Generated assembly for test:
    test:
            mov     ip, sp
            stmfd   sp!, {fp, ip, lr, pc}
            sub     fp, ip, #4
            sub     sp, sp, #16
            str     r0, [fp, #-16]
            str     r1, [fp, #-20]
            str     r2, [fp, #-24]
            str     r3, [fp, #-28]
            ldr     r2, .L3             /* tmp */
            ldr     r3, [fp, #-16]
            str     r3, [r2, #0]        /* tmp = a */
            ldr     r3, [fp, #-20]
            str     r3, [fp, #-16]      /* a = b */
            ldr     r3, .L3
            ldr     r3, [r3, #0]
            str     r3, [fp, #-20]      /* b = tmp */
            ldr     r3, [fp, #-20]
            str     r3, [fp, #-24]      /* c = b */
            ldr     r3, [fp, #-28]
            str     r3, [fp, #-20]      /* b = d */
            ldr     r2, [fp, #4]
            ldr     r3, [fp, #-28]
            str     r3, [r2, #0]        /* *e = d */
            sub     sp, fp, #12
            ldmfd   sp, {fp, sp, pc}
Stack layout inside test (addresses in hex):
    Address   Contents   Pointer
    9c        e          ip
    98        pc         fp
    94        lr
    90        ip
    8c        fp
    88        a
    84        b
    80        c
    7c        d          sp
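The listing shows where each argument of test ends up: the first four arrive in r0 to r3, and the fifth is read from [fp, #4], one word above the saved pc. The C sketch below generalizes that observation under several assumptions: every argument is one word, the callee uses the prologue shown in these examples (so fp is the entry sp minus 4), and any argument beyond the fifth would follow at the next higher word. The `ArgLocation` type and `arg_location` function are hypothetical names.

```c
#include <stdbool.h>

/* Where the callee finds a word-sized argument. */
typedef struct {
    bool in_register;  /* true: passed in a1..a4 (r0..r3) */
    int  reg;          /* register number 0..3, or -1 if on the stack */
    int  fp_offset;    /* byte offset from the callee's fp, if on the stack */
} ArgLocation;

/* index is zero-based: 0 is the first argument. */
ArgLocation arg_location(int index)
{
    ArgLocation loc = { false, -1, 0 };
    if (index < 4) {
        loc.in_register = true;
        loc.reg = index;
    } else {
        /* The caller stores the fifth argument at its sp; after the
           prologue that word sits at fp + 4. */
        loc.fp_offset = 4 + 4 * (index - 4);
    }
    return loc;
}
```

For the call test(a, b, c, d, &e), index 4 (the pointer e) maps to offset 4 from fp, which matches the `ldr r2, [fp, #4]` in the listing.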
21
Mixing C and Assembly Language
[Figure: build flow with C source code, XScale assembly code and the C library as inputs, the compiler, assembler and linker as tools, and an XScale executable as the output.]
22
Interfacing C and Assembly Language
ARM (the company @ www.arm.com) has developed a standard called the “ARM Procedure Call Standard” (APCS), which defines:
  –constraints on the use of registers
  –stack conventions
  –format of a stack backtrace data structure
  –argument passing and result return
  –support for the ARM shared library mechanism
Compiler-generated code conforms to the APCS
  –It is just a standard, not an architectural requirement
  –Cannot avoid the standard when interfacing C and assembly code
  –Can avoid the standard when just writing assembly code, or when writing assembly code that isn't called by C code
23
Multiply
The multiply instruction can take multiple cycles
  –Can convert Y * Constant into a series of adds and shifts
  –Y * 9 = Y * 8 + Y * 1
  –Assume R1 holds Y and R2 will hold the result
            ADD     R2, R1, R1, LSL #3  ; multiplication by 9: (Y * 8) + (Y * 1)
            RSB     R2, R1, R1, LSL #3  ; multiplication by 7: (Y * 8) - (Y * 1)
    (RSB: reverse subtract; the operands of the subtraction are reversed)
Another example: Y * 105
  –105 = 128 - 23 = 128 - (16 + 7) = 128 - (16 + (8 - 1))
            RSB     r2, r1, r1, LSL #3  ; r2 <- Y*7 = Y*8 - Y*1 (assume r1 holds Y)
            ADD     r2, r2, r1, LSL #4  ; r2 <- r2 + Y*16 (r2 held Y*7; now holds Y*23)
            RSB     r2, r2, r1, LSL #7  ; r2 <- (Y*128) - r2 (r2 now holds Y*105)
Or Y * 105 = Y * (15 * 7) = Y * (16 - 1) * (8 - 1)
            RSB     r2, r1, r1, LSL #4  ; r2 <- (r1 * 16) - r1
            RSB     r3, r2, r2, LSL #3  ; r3 <- (r2 * 8) - r2
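The shift-and-add decompositions on this slide can be checked in C, where each statement mirrors one shifted-operand instruction. The function names are hypothetical, and unsigned 32-bit arithmetic is assumed so that results wrap modulo 2^32 in the same way register arithmetic does.

```c
#include <stdint.h>

/* Y*9 = (Y*8) + Y: one add with a shifted operand. */
uint32_t times9(uint32_t y)
{
    return y + (y << 3);
}

/* Y*7 = (Y*8) - Y: one reverse subtract with a shifted operand. */
uint32_t times7(uint32_t y)
{
    return (y << 3) - y;
}

/* Y*105 = Y*128 - (Y*16 + Y*7): three steps. */
uint32_t times105_three_steps(uint32_t y)
{
    uint32_t r2 = (y << 3) - y;  /* r2 = Y*7   */
    r2 = r2 + (y << 4);          /* r2 = Y*23  */
    r2 = (y << 7) - r2;          /* r2 = Y*105 */
    return r2;
}

/* Y*105 = Y * (16 - 1) * (8 - 1): two steps. */
uint32_t times105_two_steps(uint32_t y)
{
    uint32_t r2 = (y << 4) - y;    /* r2 = Y*15  */
    uint32_t r3 = (r2 << 3) - r2;  /* r3 = Y*105 */
    return r3;
}
```

The two-step form trades one instruction for an extra register, since r3 receives the final result while r2 holds the intermediate Y*15.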