Ithaca College 1 Machine-Level Programming X: inline Assembly Comp 21000: Introduction to Computer Systems & Assembly Lang On-Line resources* * See see.

Presentation transcript:


Today
- Inline assembly
- Example

Why use assembly?
Assembly can express very low-level things:
- You can access machine-dependent registers and I/O.
- You can control the exact code behavior in critical sections that might otherwise involve deadlock between multiple software threads or hardware devices.
- You can break the conventions of your usual compiler, which might allow some optimizations (like temporarily breaking rules about memory allocation, threading, calling conventions, etc.).
- You can build interfaces between code fragments using incompatible conventions (e.g. produced by different compilers, or separated by a low-level interface).
- You can get access to unusual programming modes of your processor (e.g. 16-bit mode to interface startup, firmware, or legacy code on Intel PCs).
- You can produce reasonably fast code for tight loops to cope with a bad non-optimizing compiler (but then, there are free optimizing compilers available!).
- You can produce hand-optimized code perfectly tuned for your particular hardware setup, though not for someone else's.
- You can write some code for your new language's optimizing compiler (something very few people will ever do, and even they not often).
- In short, you can be in complete control of your code.
From the Linux Assembly HOWTO.

Why use assembly? Speed
- Be careful! Optimizing compilers are almost always better.
- Inline assembly is useful when you know an assembly language instruction that can replace a library call.
  - Example: transcendental function computation
  - ... has macros for some inline assembly sequences
  - Example: if a program spends most of its time in a loop computing the sine and cosine of the same angles, it could use the fsincos instruction
- We'll see an example later.

inline assembly code*
Example: mov instruction

    /* put this line in your C program */
    asm("assembly code");

    /* alternative syntax */
    __asm__("assembly code");

    asm("movl %ebx, %eax");        /* copies the contents of the ebx register into eax */
    __asm__("movb %ch, (%ebx)");   /* stores the byte in ch at the memory address held in ebx */

* see

inline assembly
More sophisticated assembly
- For more than one assembly instruction, put a semicolon at the end of each instruction of the inserted code.
- See the example on the next slide.

Example

    #include <stdio.h>

    /* Basic asm (no operand lists) does not tell gcc that %eax and %ebx
       change, so treat this program as a gdb tracing exercise only. */
    int main()
    {
        /* Add 10 and 20 and store result into register %eax */
        __asm__ ( "movl $10, %eax;"
                  "movl $20, %ebx;"
                  "addl %ebx, %eax;" ) ;

        /* Subtract 20 from 10 and store result into register %eax */
        __asm__ ( "movl $10, %eax;"
                  "movl $20, %ebx;"
                  "subl %ebx, %eax;" ) ;

        /* Multiply 10 and 20 and store result into register %eax */
        __asm__ ( "movl $10, %eax;"
                  "movl $20, %ebx;"
                  "imull %ebx, %eax;" ) ;

        return 0 ;
    }

- Compile with the -m32 -O1 flags.
- Trace in gdb to see the registers change.

Extended inline assembly
Idea
- In extended assembly, we can also specify the operands: the output operands, the input operands, and a list of clobbered registers.
- If there are no output operands but there are input operands, the output field is left empty, so two consecutive colons surround the place where the output operands would go.
- The clobber list can be omitted when the code changes nothing except its output operands; gcc then allocates the operand registers itself. Any other register the code overwrites must be listed, because gcc does not inspect the assembly text.
- Inside extended assembly, a literal register name is written with two percent signs (%%eax), because a single % introduces an operand such as %0.

    asm ( "assembly code"
          : output operands               /* optional */
          : input operands                /* optional */
          : list of clobbered registers   /* optional */
        );

Example 1
- The variable "val" is kept in a register.
  - val is a C variable that must be declared earlier in the C program.
- The value in register %eax is copied into that register, and "val" is then updated in memory from that register.
- Note that eax is preceded by two percent signs.
  - This distinguishes a register name from an asm operand such as %0 (the asm template works like a printf format string).

    asm ("movl %%eax, %0;"    /* assembly instruction */
         : "=r" ( val ));     /* output operands */

see ~barr/Student/Comp210/Resources/inline directory for all examples

Example 1 (continued)
- %0 refers to the first operand of asm; it is associated with the first parameter, i.e. val (similar to printf).
- "=r" is a register constraint.
  - See the chart on the next page for all possible register specifications.
  - "r" indicates that gcc may keep the variable in any available general-purpose register.
  - "=" indicates write-only mode: the operand is an output and its previous value is discarded.

    asm ("movl %%eax, %0;"
         : "=r" ( val ));

see ~barr/Student/Comp210/Resources/inline directory for all examples

register specifiers

    Specifier   Meaning
    r           register (any available general-purpose register)
    R           general register (EAX, EBX, ECX, EDX, ESI, EDI, EBP, ESP)
    q           general register for data (EAX, EBX, ECX, EDX)
    f           floating-point register
    a           %eax, %ax, %al
    b           %ebx, %bx, %bl
    c           %ecx, %cx, %cl
    d           %edx, %dx, %dl
    S           %esi, %si
    D           %edi, %di

Example 2
- Two variables are declared in the C code, no and val.
- %0 is the first operand to asm and thus refers to the C variable val (the output variable).
- %1 is the second operand and thus refers to the C variable no (the input variable).
- "=r" and "r" say that gcc can use any register to store the corresponding variable (either val or no).
- The clobbered register is %ebx, so gcc will not place an operand in it or keep a live value in it across the asm.

    int no = 100, val ;
    asm ("movl %1, %%ebx;"
         "movl %%ebx, %0;"
         : "=r" ( val )    /* output */
         : "r" ( no )      /* input */
         : "%ebx"          /* clobbered register */
        );

Example 3
- The C code declares three variables: arg1, arg2, add.
- The input variables will use %eax (for arg1) and %ebx (for arg2).
- The output variable is add and will use register %eax.
- No clobber list is needed: the only register the code changes is %eax, which is already declared as the output.

    int arg1, arg2, add ;
    arg1 = 10;
    arg2 = 25;
    __asm__ ( "addl %%ebx, %%eax;"
              : "=a" (add)
              : "a" (arg1), "b" (arg2) ) ;

Example 4

    #include <stdio.h>

    int main()
    {
        int arg1, arg2, add, sub, mul, quo, rem ;

        printf( "Enter two integer numbers : " );
        scanf( "%d%d", &arg1, &arg2 );

        /* Perform Addition, Subtraction, Multiplication & Division */
        __asm__ ( "addl %%ebx, %%eax;"
                  : "=a" (add) : "a" (arg1), "b" (arg2) ) ;
        __asm__ ( "subl %%ebx, %%eax;"
                  : "=a" (sub) : "a" (arg1), "b" (arg2) ) ;
        __asm__ ( "imull %%ebx, %%eax;"
                  : "=a" (mul) : "a" (arg1), "b" (arg2) ) ;

        /* cltd sign-extends %eax into %edx:%eax, as a signed divide needs.
           "=&d" keeps gcc from placing the divisor in %edx. */
        __asm__ ( "cltd;"
                  "idivl %3;"
                  : "=a" (quo), "=&d" (rem)
                  : "a" (arg1), "rm" (arg2) ) ;

        printf( "%d + %d = %d\n", arg1, arg2, add ) ;
        printf( "%d - %d = %d\n", arg1, arg2, sub ) ;
        printf( "%d * %d = %d\n", arg1, arg2, mul ) ;
        printf( "%d / %d = %d\n", arg1, arg2, quo ) ;
        printf( "%d %% %d = %d\n", arg1, arg2, rem ) ;

        return 0 ;
    }

    idivl S    # Signed divide
               # R[%edx] <- R[%edx]:R[%eax] mod S
               # R[%eax] <- R[%edx]:R[%eax] / S

Example 5: why use inline assembly? (no assembly in this code)

    #include <stdlib.h>

    int main (int argc, char* argv[])
    {
        long max = atoi (argv[1]);
        long number;
        long i;
        unsigned position;
        volatile unsigned result;

        /* Repeat the operation for a large number of values. */
        for (number = 1; number <= max; ++number) {
            /* Repeatedly shift the number to the right, until the result is
               zero. Keep count of the number of shifts this requires. */
            for (i = (number >> 1), position = 0; i != 0; ++position)
                i >>= 1;
            /* The position of the most significant set bit is the number of
               shifts we needed after the first one. */
            result = position;
        } // end outer for loop

        return 0;
    } // end main

    % gcc -O2 -o bit-pos-loop bit-pos-loop.c
    % time ./bit-pos-loop
    19.51user 0.00system 0:20.40elapsed 95%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (73major+11minor)pagefaults 0swaps

Example 5: why use inline assembly? (assembly used for inner loop)

    #include <stdlib.h>

    int main (int argc, char* argv[])
    {
        long max = atoi (argv[1]);
        long number;
        unsigned position;
        volatile unsigned result;

        /* Repeat the operation for a large number of values. */
        for (number = 1; number <= max; ++number) {
            /* Compute the position of the most significant set bit using
               the bsrl assembly instruction. number is a long, so this
               assumes a 32-bit target where long matches the l suffix. */
            asm ("bsrl %1, %0" : "=r" (position) : "r" (number));
            result = position;
        } // end for loop

        return 0;
    }

    % gcc -O2 -o bit-pos-asm bit-pos-asm.c
    % time ./bit-pos-asm
    ...user 0.00system 0:03.32elapsed 95%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (73major+11minor)pagefaults 0swaps

Compare to the 19.51user of the previous C-only example!

BSR scans the source operand for the most significant set bit. If the source is nonzero, ZF is cleared and the destination is loaded with the index of that bit. If the source is zero, ZF is set and the destination is undefined. BSF scans forward across the bit pattern (0 to n) while BSR scans in reverse (n to 0).

Volatile
If our assembly statement must execute where we put it (e.g. it must not be deleted, or moved out of a loop, as an optimization), put the keyword "volatile" or "__volatile__" after "asm" or "__asm__" and before the parentheses.

    asm volatile ( "...;"
                   "...;" : ... );

    __asm__ __volatile__ ( "...;"
                           "...;" : ... ) ;

Example 5.5: compute GCD with Euclid's algorithm (gcd.c)

    #include <stdio.h>

    int gcd( int a, int b )
    {
        int result ;
        /* Compute Greatest Common Divisor using Euclid's Algorithm */
        __asm__ __volatile__ ( "movl %1, %%eax;"
                               "movl %2, %%ebx;"
                               "CONTD: cmpl $0, %%ebx;"
                               "je DONE;"
                               "cltd;"    /* sign-extend %eax into %edx */
                               "idivl %%ebx;"
                               "movl %%ebx, %%eax;"
                               "movl %%edx, %%ebx;"
                               "jmp CONTD;"
                               "DONE: movl %%eax, %0;"
                               : "=g" (result)
                               : "g" (a), "g" (b)
                               : "%eax", "%ebx", "%edx" /* overwritten */
                             ) ;
        return result ;
    }

    int main()
    {
        int first, second ;
        printf( "Enter two integers : " ) ;
        scanf( "%d%d", &first, &second );
        printf( "GCD of %d & %d is %d\n", first, second, gcd(first, second) ) ;
        return 0 ;
    }

- Note that we can put labels in the code and jump to them!
- Note that we can put several asm instructions in one __asm__ statement.

Example 6
Convert the for loop to assembly code (it should be no more than 11 lines of assembly code).

    #include <stdio.h>

    int main()
    {
        int x, y, rslt, rtnval;
        int i;

        printf("Enter two integers\n");
        rslt = scanf("%d%d", &x, &y);

        /* change into assembly code */
        rtnval = 0;
        for(i = 0; i < y; i++) {
            rtnval += x;
        }
        /* end change */

        printf("%d * %d = %d\n", x, y, rtnval);
        return 0;
    }