Assembly Language: IA-32 Instructions

Slides:



Advertisements
Similar presentations
Fabián E. Bustamante, Spring 2007 Machine-Level Programming II: Control Flow Today Condition codes Control flow structures Next time Procedures.
Advertisements

Machine/Assembler Language Putting It All Together Noah Mendelsohn Tufts University Web:
COMP 2003: Assembly Language and Digital Logic
Computer Architecture and Operating Systems CS 3230 :Assembly Section Lecture 2 Department of Computer Science and Software Engineering University of Wisconsin-Platteville.
Department of Computer Science and Software Engineering
ACOE2511 Assembly Language Arithmetic and Logic Instructions.
1 Computer Architecture and Assembly Language COS 217.
1 ICS 51 Introductory Computer Organization Fall 2006 updated: Oct. 2, 2006.
Flow Control Instructions
1 Assembly Language: Overview. 2 If you’re a computer, What’s the fastest way to multiply by 5? What’s the fastest way to divide by 5?
1 Homework Reading –PAL, pp Machine Projects –MP2 due at start of Class 12 Labs –Continue labs with your assigned section.
CEG 320/520: Computer Organization and Assembly Language ProgrammingIntel Assembly 1 Intel IA-32 vs Motorola
An Introduction to 8086 Microprocessor.
Dr. José M. Reyes Álamo 1.  The 80x86 memory addressing modes provide flexible access to memory, allowing you to easily access ◦ Variables ◦ Arrays ◦
Low Level Programming Lecturer: Duncan Smeed Overview of IA-32 Part 1.
INSTRUCTION SET AND ASSEMBLY LANGUAGE PROGRAMMING
Faculty of Engineering, Electrical Department,
The x86 Architecture Lecture 15 Fri, Mar 4, 2005.
CSC 2400 Computer Systems I Lecture 9 Deeper into Assembly.
1 ICS 51 Introductory Computer Organization Fall 2009.
26-Nov-15 (1) CSC Computer Organization Lecture 6: Pentium IA-32.
CNIT 127: Exploit Development Ch 1: Before you begin.
Assembly Language. Symbol Table Variables.DATA var DW 0 sum DD 0 array TIMES 10 DW 0 message DB ’ Welcome ’,0 char1 DB ? Symbol Table Name Offset var.
1 Carnegie Mellon Assembly and Bomb Lab : Introduction to Computer Systems Recitation 4, Sept. 17, 2012.
3.4 Addressing modes Specify the operand to be used. To generate an address, a segment register is used also. Immediate addressing: the operand is a number.
1 Assembly Language Part 2 Professor Jennifer Rexford COS 217.
תרגול 5 תכנות באסמבלי, המשך
Assembly 06. Outline cmp (review) Jump commands test mnemonic bt mnemonic Addressing 1.
Microprocessor & Assembly Language Arithmetic and logical Instructions.
Jumps, Loops and Branching. Unconditional Jumps Transfer the control flow of the program to a specified instruction, other than the next instruction in.
K.K. Leung Fall 2008Introductory Pentium Programming1 Pentium Architecture: Introductory Programming Kin K. Leung
Intel MP Organization. Registers - storage locations found inside the processor for temporary storage of data 1- Data Registers (16-bit) AX, BX, CX, DX.
Machine-Level Programming 2 Control Flow Topics Condition Codes Setting Testing Control Flow If-then-else Varieties of Loops Switch Statements.
Precept 7: Introduction to IA-32 Assembly Language Programming
Machine-Level Programming 2 Control Flow
Data Transfers, Addressing, and Arithmetic
Homework Reading Labs PAL, pp
Aaron Miller David Cohen Spring 2011
EE3541 Introduction to Microprocessors
Morgan Kaufmann Publishers Computer Organization and Assembly Language
Computer Architecture
Basic Assembly Language
Assembly IA-32.
Assembly Language Programming Part 2
# include < stdio.h > v oid main(void) { long NUM1[5]; long SUM; long N; NUM1[0] = 17; NUM1[1] = 3; NUM1[2] =  51; NUM1[3] = 242; NUM1[4] = 113; SUM =
ADDRESSING MODES.
Processor Processor characterized by register set (state variables)
Machine-Level Programming II: Arithmetic & Control
Chapter 3 Machine-Level Representation of Programs
Machine-Level Representation of Programs II
Computer Architecture adapted by Jason Fritts then by David Ferry
Y86 Processor State Program Registers
BIC 10503: COMPUTER ARCHITECTURE
Assembly Language: IA-32 Instructions
Instructor: David Ferry
Condition Codes Single Bit Registers
Machine-Level Programming 2 Control Flow
CS 301 Fall 2002 Computer Organization
Machine-Level Programming 2 Control Flow
Machine-Level Representation of Programs III
Machine-Level Programming 2 Control Flow
Homework Reading Machine Projects Labs PAL, pp
Computer Architecture CST 250
Chapter 3 Machine-Level Representation of Programs
Machine-Level Programming II: Control Flow
X86 Assembly Review.
Carnegie Mellon Ithaca College
CS201- Lecture 8 IA32 Flow Control
Chapter 8: Instruction Set 8086 CPU Architecture
Computer Architecture and Assembly Language
Presentation transcript:

Assembly Language: IA-32 Instructions

Assume the text segment is not read-only Assume a von Neuman architecture What would happen if you wrote into the text segment? What would happen if you executed what you wrote? Could you use this to your advantage? Can you see problems with this approach?

Goals of Today’s Lecture Help you learn… To manipulate data of various sizes To leverage more sophisticated addressing modes To use condition codes and jumps to change control flow Focusing on the assembly-language code Rather than the layout of memory for storing data Why? Know the features of the IA-32 architecture Write more efficient assembly-language programs Understand the relationship to data types and common programming constructs in higher-level languages Note that precept will cover the full details of specifying the layout in memory, explaining assembler directives and the like. Here, lecture is just covering the code itself.

Variable Sizes in High-Level Language C data types vary in size Character: 1 byte Short, int, and long: varies, depending on the computer Float and double: varies, depending on the computer Pointers: typically 4 bytes Programmer-created types Struct: arbitrary size, depending on the fields Arrays Multiple consecutive elements of some fixed size Where each element could be a struct

Supporting Different Sizes in IA-32 Three main data sizes Byte (b): 1 byte Word (w): 2 bytes Long (l): 4 bytes Separate assembly-language instructions E.g., addb, addw, and addl Separate ways to access (parts of) a register E.g., %ah or %al, %ax, and %eax Larger sizes (e.g., struct) Manipulated in smaller byte, word, or long units

Byte Order in Multi-Byte Entities Intel is a little endian architecture Least significant byte of multi-byte entity is stored at lowest memory address “Little end goes first” Some other systems use big endian Most significant byte of multi-byte entity is stored at lowest memory address “Big end goes first” 1000 00000101 1001 00000000 The int 5 at address 1000: 1002 00000000 1003 00000000 1000 00000000 1001 00000000 The int 5 at address 1000: 1002 00000000 1003 00000101

Little Endian Example int main(void) { int i=0x003377ff, j; unsigned char *p = (unsigned char *) &i; for (j=0; j<4; j++) printf("Byte %d: %x\n", j, p[j]); } Note: this code is *not* portable. Different output on a big-endian machine. Byte 0: ff Byte 1: 77 Byte 2: 33 Byte 3: 0 Output on a little-endian machine

IA-32 General Purpose Registers 16-bit 32-bit 31 15 8 7 EAX EBX ECX EDX ESI EDI AH AL AX BX CX DX BH BL CH CL DH DL SI DI General-purpose registers EAX: accumulator for operands and results EBX: pointer to data in DS segment ECX: counter for string in loop operations EDX: IO pointer ESI: pointer to data in DS segment, string source pointer EDI: pointer to data in ES segment, string destination pointer EBP: pointer to data on stack (in SS segment) ESP: stack pointer in SS segment CS: code segment DS: data segment SS: stack segment ES, GS, FS: data segment

C Example: One-Byte Data Global char variable i is in %al, the lower byte of the “A” register. char i; … if (i > 5) { i++; else i--; } cmpb $5, %al jle else incb %al jmp endif else: decb %al endif:

C Example: Four-Byte Data Global int variable i is in %eax, the full 32 bits of the “A” register. int i; … if (i > 5) { i++; else i--; } cmpl $5, %eax jle else incl %eax jmp endif else: decl %eax endif:

Loading and Storing Data Processors have many ways to access data Known as “addressing modes” Two simple ways seen in previous examples Immediate addressing Example: movl $0, %ecx Data (e.g., number “0”) embedded in the instruction Initialize register ECX with zero Register addressing Example: movl %edx, %ecx Choice of register(s) embedded in the instruction Copy value in register EDX into register ECX

Accessing Memory Variables are stored in memory Global and static local variables in Data or BSS section Dynamically allocated variables in the heap Function parameters and local variables on the stack Need to be able to load from and store to memory To manipulate the data directly in memory Or copy the data between main memory and registers IA-32 has many different addressing modes Corresponding to common programming constructs E.g., accessing a global variable, dereferencing a pointer, accessing a field in a struct, or indexing an array

Direct Addressing Load or store from a particular memory location Memory address is embedded in the instruction Instruction reads from or writes to that address IA-32 example: movl 2000, %ecx Four-byte variable located at address 2000 Read four bytes starting at address 2000 Load the value into the ECX register Useful when the address is known in advance Global variables in the Data or BSS sections Can use a label for (human) readability E.g., “i” to allow “movl i, %eax”

Indirect Addressing Load or store from a previously-computed address Register with the address is embedded in the instruction Instruction reads from or writes to that address IA-32 example: movl (%eax), %ecx EAX register stores a 32-bit address (e.g., 2000) Read long-word variable stored at that address Load the value into the ECX register Useful when address is not known in advance Dynamically allocated data referenced by a pointer The “(%eax)” essentially dereferences a pointer

Base Pointer Addressing Load or store with an offset from a base address Register storing the base address Fixed offset also embedded in the instruction Instruction computes the address and does access IA-32 example: movl 8(%eax), %ecx EAX register stores a 32-bit base address (e.g., 2000) Offset of 8 is added to compute address (e.g., 2008) Read long-word variable stored at that address Load the value into the ECX register Useful when accessing part of a larger variable Specific field within a “struct” E.g., if “age” starts at the 8th byte of “student” record

Indexed Addressing Load or store with an offset and multiplier Fixed based address embedded in the instruction Offset computed by multiplying register with constant Instruction computes the address and does access IA-32 example: movl 2000(,%eax,4), %ecx Index register EAX (say, with value of 10) Multiplied by a multiplier of 1, 2, 4, or 8 (say, 4) Added to a fixed base of 2000 (say, to get 2040) Useful to iterate through an array (e.g., a[i]) Base is the start of the array (i.e., “a”) Register is the index (i.e., “i”) Multiplier is the size of the element (e.g., 4 for “int”)

Indexed Addressing Example global variable int a[20]; … int i, sum=0; for (i=0; i<20; i++) sum += a[i]; movl $0, %eax movl $0, %ebx sumloop movl a(,%eax,4), %ecx addl %ecx, %ebx incl %eax cmpl $19, %eax jle sumloop EAX: i EBX: sum ECX: temporary

Effective Address: More Generally eax ebx ecx edx esp ebp esi edi eax ebx ecx edx esp ebp esi edi None 8-bit 16-bit 32-bit 1 2 4 8 Offset = + * + Base Index scale displacement Displacement movl foo, %ebx Base movl (%eax), %ebx Base + displacement movl foo(%eax), %ebx movl 1(%eax), %ebx (Index * scale) + displacement movl (,%eax,4), %ebx Base + (index * scale) + displacement movl foo(%edx,%eax,4),%ebx

Data Access Methods: Summary Immediate addressing: data stored in the instruction itself movl $10, %ecx Register addressing: data stored in a register movl %eax, %ecx Direct addressing: address stored in instruction movl foo, %ecx Indirect addressing: address stored in a register movl (%eax), %ecx Base pointer addressing: includes an offset as well movl 4(%eax), %ecx Indexed addressing: instruction contains base address, and specifies an index register and a multiplier (1, 2, 4, or 8) movl 2000(,%eax,1), %ecx

Control Flow Common case Sometimes need to change control flow Execute code sequentially One instruction after another Sometimes need to change control flow If-then-else Loops Switch Two key ingredients Testing a condition Selecting what to run next based on result cmpl $5, %eax jle else incl %eax jmp endif else: decl %eax endif:

Condition Codes 1-bit registers set by arithmetic & logic instructions ZF: Zero Flag SF: Sign Flag CF: Carry Flag OF: Overflow Flag Example: “addl Src, Dest” (“t = a + b”) ZF: set if t == 0 SF: set if t < 0 CF: set if carry out from most significant bit Unsigned overflow OF: set if two’s complement overflow (a>0 && b>0 && t<0) || (a<0 && b<0 && t>=0) Note that the distinction between signed and unsigned is in the eyes of the beholder. The computer doesn’t know how you are interpreting the bits.

Condition Codes (continued) Example: “cmpl Src2,Src1” (compare b,a) Like computing a-b without setting destination ZF: set if a == b SF: set if (a-b) < 0 CF: set if carry out from most significant bit Used for unsigned comparisons OF: set if two’s complement overflow (a>0 && b<0 && (a-b)<0) || (a<0 && b>0 && (a-b)>0) Flags are not set by lea, inc, or dec instructions Hint: this is useful for the extra-credit part of the assembly-language programming assignment! 

Example Five-Bit Comparisons Comparison: cmp $6, $12 Not zero: ZF=0 (diff is not 00000) Positive: SF=0 (first bit is 0) No carry: CF=0 (unsigned diff is correct) No overflow: OF=0 (signed diff is correct) Comparison: cmp $12, $6 Negative: SF=1 (first bit is 1) Carry: CF=1 (unsigned diff is wrong) Comparison: cmp $-6, $-12 Carry: CF=1 (unsigned diff of 20 and 28 is wrong) 01100 - 00110 ?? 01100 +11010 00110 00110 - 01100 ?? 00110 +10100 11010 10100 - 11010 ?? 10100 +00110 11010

Jumps after Comparison (cmpl) Equality Equal: je (ZF) Not equal: jne (~ZF) Below/above (e.g., unsigned arithmetic) Below: jb (CF) Above or equal: jae (~CF) Below or equal: jbe (CF | ZF) Above: ja (~(CF | ZF)) Less/greater (e.g., signed arithmetic) Less: jl (SF ^ OF) Greater or equal: jge (~(SF ^ OF)) Less or equal: jle ((SF ^ OF) | ZF) Greater: jg (!((SF ^ OF) | ZF)) Note that the distinction between signed and unsigned is in the eyes of the beholder. The computer doesn’t know how you are interpreting the bits.

Branch Instructions Conditional jump Unconditional jump j{l,g,e,ne,...} target if (condition) {eip = target} Unconditional jump jmp target jmp *register Comparison Signed Unsigned  e “equal”  ne “not equal” > g a “greater,above”  ge ae “...-or-equal” < l b “less,below”  le be overflow/carry o c no ovf/carry no nc

Jumping Simple model of a “goto” statement Go to a particular place in the code Based on whether a condition is true or false Can represent if-the-else, switch, loops, etc. Pseudocode example: If-Then-Else if (!Test) jump to Else; then-body; jump to Done; Else: else-body; Done: if (Test) { then-body; } else { else-body;

Jumping (continued) Pseudocode example: Do-While loop Pseudocode example: While loop do { Body; } while (Test); loop: Body; if (Test) then jump to loop; jump to middle; loop: Body; middle: if (Test) then jump to loop; while (Test) Body;

Jumping (continued) Pseudocode example: For loop for (Init; Test; Update) Body Init; if (!Test) jump to done; loop: Body; Update; if (Test) jump to loop; done:

Arithmetic Instructions Simple instructions add{b,w,l} source, dest dest = source + dest sub{b,w,l} source, dest dest = dest – source Inc{b,w,l} dest dest = dest + 1 dec{b,w,l} dest dest = dest – 1 neg{b,w,l} dest dest = ~dest + 1 cmp{b,w,l} source1, source2 source2 – source1 Multiply mul (unsigned) or imul (signed) mull %ebx # edx, eax = eax * ebx Divide div (unsigned) or idiv (signed) idiv %ebx # edx = edx,eax / ebx Many more in Intel manual (volume 2) adc, sbb, decimal arithmetic instructions

Bitwise Logic Instructions Simple instructions and{b,w,l} source, dest dest = source & dest or{b,w,l} source, dest dest = source | dest xor{b,w,l} source, dest dest = source ^ dest not{b,w,l} dest dest = ~dest sal{b,w,l} source, dest (arithmetic) dest = dest << source sar{b,w,l} source, dest (arithmetic) dest = dest >> source Many more in Intel Manual (volume 2) Logic shift Rotation shift Bit scan Bit test Byte set on conditions

Data Transfer Instructions mov{b,w,l} source, dest General move instruction push{w,l} source pushl %ebx # equivalent instructions subl $4, %esp movl %ebx, (%esp) pop{w,l} dest popl %ebx # equivalent instructions movl (%esp), %ebx addl $4, %esp Many more in Intel manual (volume 2) Type conversion, conditional move, exchange, compare and exchange, I/O port, string move, etc. esp esp esp esp

Conclusions Accessing data Control flow Manipulating data Next time Byte, word, and long-word data types Wide variety of addressing modes Control flow Common C control-flow constructs Condition codes and jump instructions Manipulating data Arithmetic and logic operations Next time Calling functions, using the stack