Presentation is loading. Please wait.

Presentation is loading. Please wait.

Lecture 5 Assembly Language

Similar presentations


Presentation on theme: "Lecture 5 Assembly Language"— Presentation transcript:

1 Lecture 5 Assembly Language
CSCE 212 Computer Architecture Lecture 5 Assembly Language Topics Assembly Language Lab 2 - January 27, 2011

2 Overview Last Time New Next Time:
Covered through slides 11… of Lecture 4 Floating point: Review, rounding to even, multiplication, addition Compilation steps New Architecture (Fred Brooks): Assembly Programmer’s View Address Modes Swap Next Time: Lab02 - Datalab

3 Pop Quiz - denormals What is the representation of the largest denormalized IEEE float (in binary)? Denormal  expField = Largest denormal  all frac bits are 1, ie., frac = …1111 Largest denormal representation = ….1 In hex? 0x007FFFF What is its value as an expression, i.e., (-1)sign m * 2exp Largest denormal’s value = … 1111 x 2-BIAS+2 How many floats are there between 1.0 and 2.0?

4 What is a/the representation of minus infinity?
expField=0xFF, sign bit =1, frac=0x (23 zeroes) -infinity = 0xFF In C are there more ints or doubles? #doubles = (distinct exp)*(number of doubles with same exp) #doubles = (211 – 2)*(252) = 263 – 253 (note this does not count NaN or +/- infinity as a double (This only counts positives? Ignores 0) In Math are there more rationals than integers ? Argument for No: the sets have the same cardinality. There are both countably infinite, where the Reals are uncountable. Argument for yes: every integer is a rational and ½ is a rational that is not an integer. So actually the way the question is worded what is the best answer? Extra credit for pop quiz 1: what is aleph-0?

5 Pop Quiz – FP multiplication
If x=1.5 what is the representation of x as a float (in hex) And if y=(2-ε)*237 Note 1.111…1 =(2-ε) Then what is the frac field of the float z = x*y And what is the exponent (not the exponent field) of z? What is the largest gap between consecutive floats? Note 1.11…11 X_______1.1__ 111…11 (24 bits) 111…11 (24 bits) 10.11…101 (26 bits?)

6 Printf conversion specifications
% -#0 12 .4 L d Start specification Flags Minimum field width conversion type Size modifier Precision Examples Figure taken from page 368 of “C a Reference Manual” by Harbison and Steele

7 CYGWIN Unix Emulation under Windows Others CH, … Other Direction: Wine
Provides a bash window .bash_profile Others CH, … Other Direction: Wine Downloading CYGWIN Google CYGWIN startxwin – run a windows emulator under GYGWIN Virtual Machines: Virtual Box, VMware

8 Homework 1 problem 2.90 /usr/include/math.h
# define M_PI /* pi */ Hex rep given in problem pi = 0x40490fdb Binary rep sign +, ExpField =  Exp = 128-BIAS = 1 Binary val = * 21, Now 22/7

9 Setting Variables and aliases in .bash_profile
PATH=$HOME/bin:${PATH:-/usr/bin:.} PATH=$PATH:/usr/local/simplescalar/bin:/usr/local/simplescalar/simplesim-3.0 # list of directories separated by colons, used to specify where to find commands export PATH PS1="`hostname`> w5=/class/csce /web/ w=/class/csce /Code/ alias h=history alias lsl="ls -lrt | grep ^d" # later you can use the variables in commands like “ls $w”

10 Intel Registers figure 3.2
Intel microprocessor evolution 4004 8008  8080 8086  80x86 Backward compatibilty Registers of 8080 A: AH – AL C: CH -- CL D: DH – DL B: BH -- BL Si,di,sp,bp

11 Homework page 105 of text 2.56 2.57 2.58 give hex representations and the value as an expression of the form 1.x-1x-2…x-n * 2 exp

12 2.56 Fill in the return value for the following procedure that tests whether its first argument is greater than or equal to its second. Assume the function f2u return an unsigned 32-bit number having the same bit representation as its floating-point argument. You can assume that neither argument is NaN. The two flavors of zero: +0 and -0 are considered equal. int float-qe(float x, float y){ unsigned ux = f2u(x); unsigned uy = f2u(y); /* Get the sign bits */ unsigned sx = ux >> 31; unsiqned sy = uu >> 31; /* Give an expression using only ux, uy, sx and sy */ return /* … */ ; }

13 2.57 Given a floating point format with a k-bit exponent and an n-bit fraction, write formulas for the exponent E, significand M, the fraction f, and the value V for the quantities that follow. In addition, describe the bit representation. The number 5.0. The largest odd integer that can be represented exactly. The reciprocal of the smallest positive normalized value.

14 2.58’ - changed table columns
Intel-compatible processors also support an "extended precision" floating-point format with an 80-bit word divided into a sign bit, k = 15 exponent bits, a single integer bit, and n = 63 fraction bits. The integer bit is an explicit copy of the implied bit in the IEEE, floating-point representation. That is, it equals 1- for normalized values and 0 for denormalized values. Fill in the following table giving the appropri-ate values of some "interesting" numbers in this format: Description Representation Value as Expression Smallest denormalized Smallest normalized Largest normalized

15 New Species: IA64 Name Date Transistors Itanium 2001 10M
Extends to IA64, a 64-bit architecture Radically new instruction set designed for high performance Will be able to run existing IA32 programs On-board “x86 engine” Joint project with Hewlett-Packard Itanium M Big performance boost

16 Assembly Programmer’s View
CPU Memory Addresses Registers E I P Object Code Program Data OS Data Data Condition Codes Instructions Stack Programmer-Visible State EIP Program Counter Address of next instruction Register File Heavily used program data Condition Codes Store status information about most recent arithmetic operation Used for conditional branching Memory Byte addressable array Code, user data, (some) OS data Includes stack used to support procedures

17 Moving Data Moving Data Operand Types movl Source,Dest:
%eax %edx %ecx %ebx %esi %edi %esp %ebp Moving Data movl Source,Dest: Move 4-byte (“long”) word Lots of these in typical code Operand Types Immediate: Constant integer data Like C constant, but prefixed with ‘$’ E.g., $0x400, $-533 Encoded with 1, 2, or 4 bytes Register: One of 8 integer registers But %esp and %ebp reserved for special use Others have special uses for particular instructions Memory: 4 consecutive bytes of memory Various “address modes”

18 Simple Addressing Modes
Normal (R) Mem[Reg[R]] Register R specifies memory address movl (%ecx),%eax Displacement D(R) Mem[Reg[R]+D] Register R specifies start of memory region Constant displacement D specifies offset movl 8(%ebp),%edx

19 Using Simple Addressing Modes
swap: pushl %ebp movl %esp,%ebp pushl %ebx movl 12(%ebp),%ecx movl 8(%ebp),%edx movl (%ecx),%eax movl (%edx),%ebx movl %eax,(%edx) movl %ebx,(%ecx) movl -4(%ebp),%ebx movl %ebp,%esp popl %ebp ret Set Up void swap(int *xp, int *yp) { int t0 = *xp; int t1 = *yp; *xp = t1; *yp = t0; } Body Finish

20 Understanding Swap Stack yp xp Rtn adr Old %ebp %ebp 4 8 12 Offset •
4 8 12 Offset Old %ebx -4 void swap(int *xp, int *yp) { int t0 = *xp; int t1 = *yp; *xp = t1; *yp = t0; } Stack Register Variable %ecx yp %edx xp %eax t1 %ebx t0 movl 12(%ebp),%ecx # ecx = yp movl 8(%ebp),%edx # edx = xp movl (%ecx),%eax # eax = *yp (t1) movl (%edx),%ebx # ebx = *xp (t0) movl %eax,(%edx) # *xp = eax movl %ebx,(%ecx) # *yp = ebx

21 Understanding Swap Address 123 0x124 456 0x120 0x11c %eax %edx %ecx
%ebx %esi %edi %esp %ebp 0x104 0x118 Offset 0x114 yp 12 0x120 0x110 xp 8 0x124 0x10c 4 Rtn adr 0x108 %ebp 0x104 -4 0x100 movl 12(%ebp),%ecx # ecx = yp movl 8(%ebp),%edx # edx = xp movl (%ecx),%eax # eax = *yp (t1) movl (%edx),%ebx # ebx = *xp (t0) movl %eax,(%edx) # *xp = eax movl %ebx,(%ecx) # *yp = ebx

22 Understanding Swap Address 123 0x124 456 0x120 0x11c %eax %edx %ecx
%ebx %esi %edi %esp %ebp 0x120 0x104 0x118 Offset 0x114 yp 12 0x120 0x110 xp 8 0x124 0x10c 4 Rtn adr 0x108 %ebp 0x104 -4 0x100 movl 12(%ebp),%ecx # ecx = yp movl 8(%ebp),%edx # edx = xp movl (%ecx),%eax # eax = *yp (t1) movl (%edx),%ebx # ebx = *xp (t0) movl %eax,(%edx) # *xp = eax movl %ebx,(%ecx) # *yp = ebx

23 Understanding Swap Address 123 0x124 456 0x120 0x11c %eax %edx %ecx
%ebx %esi %edi %esp %ebp 0x124 0x120 0x104 0x118 Offset 0x114 yp 12 0x120 0x110 xp 8 0x124 0x10c 4 Rtn adr 0x108 %ebp 0x104 -4 0x100 movl 12(%ebp),%ecx # ecx = yp movl 8(%ebp),%edx # edx = xp movl (%ecx),%eax # eax = *yp (t1) movl (%edx),%ebx # ebx = *xp (t0) movl %eax,(%edx) # *xp = eax movl %ebx,(%ecx) # *yp = ebx

24 Understanding Swap Address 123 0x124 456 0x120 0x11c %eax %edx %ecx
%ebx %esi %edi %esp %ebp 456 0x124 0x120 0x104 0x118 Offset 0x114 yp 12 0x120 0x110 xp 8 0x124 0x10c 4 Rtn adr 0x108 %ebp 0x104 -4 0x100 movl 12(%ebp),%ecx # ecx = yp movl 8(%ebp),%edx # edx = xp movl (%ecx),%eax # eax = *yp (t1) movl (%edx),%ebx # ebx = *xp (t0) movl %eax,(%edx) # *xp = eax movl %ebx,(%ecx) # *yp = ebx

25 Understanding Swap Address 123 0x124 456 0x120 0x11c %eax %edx %ecx
%ebx %esi %edi %esp %ebp 456 0x124 0x120 123 0x104 0x118 Offset 0x114 yp 12 0x120 0x110 xp 8 0x124 0x10c 4 Rtn adr 0x108 %ebp 0x104 -4 0x100 movl 12(%ebp),%ecx # ecx = yp movl 8(%ebp),%edx # edx = xp movl (%ecx),%eax # eax = *yp (t1) movl (%edx),%ebx # ebx = *xp (t0) movl %eax,(%edx) # *xp = eax movl %ebx,(%ecx) # *yp = ebx

26 Understanding Swap Address 456 0x124 456 0x120 0x11c %eax %edx %ecx
%ebx %esi %edi %esp %ebp 456 0x124 0x120 123 0x104 0x118 Offset 0x114 yp 12 0x120 0x110 xp 8 0x124 0x10c 4 Rtn adr 0x108 %ebp 0x104 -4 0x100 movl 12(%ebp),%ecx # ecx = yp movl 8(%ebp),%edx # edx = xp movl (%ecx),%eax # eax = *yp (t1) movl (%edx),%ebx # ebx = *xp (t0) movl %eax,(%edx) # *xp = eax movl %ebx,(%ecx) # *yp = ebx

27 Understanding Swap Address 456 0x124 123 0x120 0x11c %eax %edx %ecx
%ebx %esi %edi %esp %ebp 456 0x124 0x120 123 0x104 0x118 Offset 0x114 yp 12 0x120 0x110 xp 8 0x124 0x10c 4 Rtn adr 0x108 %ebp 0x104 -4 0x100 movl 12(%ebp),%ecx # ecx = yp movl 8(%ebp),%edx # edx = xp movl (%ecx),%eax # eax = *yp (t1) movl (%edx),%ebx # ebx = *xp (t0) movl %eax,(%edx) # *xp = eax movl %ebx,(%ecx) # *yp = ebx

28 Indexed Addressing Modes
Most General Form D(Rb,Ri,S) Refers to Address Mem[Reg[Rb]+S*Reg[Ri]+ D] D: Constant “displacement” 1, 2, or 4 bytes Rb: Base register: Any of 8 integer registers Ri: Index register: Any, except for %esp Unlikely you’d use %ebp, either S: Scale: 1, 2, 4, or 8 Special Cases (Rb,Ri) Mem[Reg[Rb]+Reg[Ri]] D(Rb,Ri) Mem[Reg[Rb]+Reg[Ri]+D] (Rb,Ri,S) Mem[Reg[Rb]+S*Reg[Ri]]

29 Address Computation Examples
%edx 0xf000 %ecx 0x100 Expression Computation Address 0x8(%edx) (%edx,%ecx) (%edx,%ecx,4) 0x80(,%edx,2)

30 Address Computation Instruction
leal Src,Dest Src is address mode expression Set Dest to address denoted by expression Uses Computing address without doing memory reference E.g., translation of p = &x[i]; Computing arithmetic expressions of the form x + k*y k = 1, 2, 4, or 8.

31 Some Arithmetic Operations
Format Computation Two Operand Instructions addl Src,Dest Dest = Dest + Src subl Src,Dest Dest = Dest - Src imull Src,Dest Dest = Dest * Src sall Src,Dest Dest = Dest << Src Also called shll sarl Src,Dest Dest = Dest >> Src Arithmetic shrl Src,Dest Dest = Dest >> Src Logical xorl Src,Dest Dest = Dest ^ Src andl Src,Dest Dest = Dest & Src orl Src,Dest Dest = Dest | Src

32 Some Arithmetic Operations
Format Computation One Operand Instructions incl Dest Dest = Dest + 1 decl Dest Dest = Dest - 1 negl Dest Dest = - Dest notl Dest Dest = ~ Dest

33 Using leal for Arithmetic Expressions
pushl %ebp movl %esp,%ebp movl 8(%ebp),%eax movl 12(%ebp),%edx leal (%edx,%eax),%ecx leal (%edx,%edx,2),%edx sall $4,%edx addl 16(%ebp),%ecx leal 4(%edx,%eax),%eax imull %ecx,%eax movl %ebp,%esp popl %ebp ret Set Up int arith (int x, int y, int z) { int t1 = x+y; int t2 = z+t1; int t3 = x+4; int t4 = y * 48; int t5 = t3 + t4; int rval = t2 * t5; return rval; } Body Finish

34 Understanding arith Stack int arith (int x, int y, int z) {
int t1 = x+y; int t2 = z+t1; int t3 = x+4; int t4 = y * 48; int t5 = t3 + t4; int rval = t2 * t5; return rval; } y x Rtn adr Old %ebp %ebp 4 8 12 Offset Stack z 16 movl 8(%ebp),%eax # eax = x movl 12(%ebp),%edx # edx = y leal (%edx,%eax),%ecx # ecx = x+y (t1) leal (%edx,%edx,2),%edx # edx = 3*y sall $4,%edx # edx = 48*y (t4) addl 16(%ebp),%ecx # ecx = z+t1 (t2) leal 4(%edx,%eax),%eax # eax = 4+t4+x (t5) imull %ecx,%eax # eax = t5*t2 (rval)

35 Understanding arith # eax = x movl 8(%ebp),%eax # edx = y
movl 12(%ebp),%edx # ecx = x+y (t1) leal (%edx,%eax),%ecx # edx = 3*y leal (%edx,%edx,2),%edx # edx = 48*y (t4) sall $4,%edx # ecx = z+t1 (t2) addl 16(%ebp),%ecx # eax = 4+t4+x (t5) leal 4(%edx,%eax),%eax # eax = t5*t2 (rval) imull %ecx,%eax int arith (int x, int y, int z) { int t1 = x+y; int t2 = z+t1; int t3 = x+4; int t4 = y * 48; int t5 = t3 + t4; int rval = t2 * t5; return rval; }

36 Another Example logical: pushl %ebp Set movl %esp,%ebp
movl 8(%ebp),%eax xorl 12(%ebp),%eax sarl $17,%eax andl $8185,%eax movl %ebp,%esp popl %ebp ret Set Up int logical(int x, int y) { int t1 = x^y; int t2 = t1 >> 17; int mask = (1<<13) - 7; int rval = t2 & mask; return rval; } Body Finish 213 = 8192, 213 – 7 = 8185 movl 8(%ebp),%eax eax = x xorl 12(%ebp),%eax eax = x^y (t1) sarl $17,%eax eax = t1>>17 (t2) andl $8185,%eax eax = t2 & 8185

37 CISC Properties Instruction can reference different operand types
Immediate, register, memory Arithmetic operations can read/write memory Memory reference can involve complex computation Rb + S*Ri + D Useful for arithmetic expressions, too Instructions can have varying lengths IA32 instructions can range from 1 to 15 bytes

38 Summary: Abstract Machines
Machine Models Data Control C 1) char 2) int, float 3) double 4) struct, array 5) pointer 1) loops 2) conditionals 3) switch 4) Proc. call 5) Proc. return mem proc Assembly 1) byte 2) 2-byte word 3) 4-byte long word 4) contiguous byte allocation 5) address of initial byte 3) branch/jump 4) call 5) ret mem regs alu processor Stack Cond. Codes

39 Pentium Pro (P6) History Features Announced in Feb. ‘95
Basis for Pentium II, Pentium III, and Celeron processors Pentium 4 similar idea, but different details Features Dynamically translates instructions to more regular format Very wide, but simple instructions Executes operations in parallel Up to 5 at once Very deep pipeline 12–18 cycle latency

40 PentiumPro Block Diagram
Microprocessor Report 2/16/95

41 PentiumPro Operation Translates instructions dynamically into “Uops”
118 bits wide Holds operation, two sources, and destination Executes Uops with “Out of Order” engine Uop executed when Operands available Functional unit available Execution controlled by “Reservation Stations” Keeps track of data dependencies between uops Allocates resources Consequences Indirect relationship between IA32 code & what actually gets executed Tricky to predict / optimize performance at assembly level

42 Whose Assembler? Intel/Microsoft Format GAS/Gnu Format
lea eax,[ecx+ecx*2] sub esp,8 cmp dword ptr [ebp-8],0 mov eax,dword ptr [eax*4+100h] leal (%ecx,%ecx,2),%eax subl $8,%esp cmpl $0,-8(%ebp) movl $0x100(,%eax,4),%eax Intel/Microsoft Differs from GAS Operands listed in opposite order mov Dest, Src movl Src, Dest Constants not preceded by ‘$’, Denote hex with ‘h’ at end 100h $0x100 Operand size indicated by operands rather than operator suffix sub subl Addressing format shows effective address computation [eax*4+100h] $0x100(,%eax,4)

43 Overview Last Time New Lecture 03 – slides 1-14, 16?
Denormalized floats Special floats, Infinity, NaN Tiny Floats Error in show bytes code!!! New Finish denormals from last time Rounding, multiplication, addition Lab 1 comments Libraries Masks Unions Assembly Language

44 211 Review of Base-r to Decimal Conversions
Converting base-r to decimal by definition dndn-1dn-2…d 2d 1d 0(base r) = dnrn + dn-1rn-1… d2r2 +d 1r1 + d 0r0 Example 4F0C.A16 = 4*163 + F* *161 + C*160 +A*16-1 = 4* * *1 + 10/16 = /8 =

45 211 Review of Decimal to Base-r Conversion
Repeated division algorithm Justification: dndn-1dn-2…d 2d 1d 0 = dnrn + dn-1rn-1… d2r2 +d 1r1 + d 0r0 Dividing each side by r yields (dndn-1dn-2…d 2d 1d 0) / r = dnrn-1 + dn-1rn-2… d2r1+d 1r0 + d 0r-1 So d 0 is the remainder of the first division ((q1) / r = dnrn-2 + dn-1rn-3… d3r1+d 2r0 + d 1r-1 So d 1 is the remainder of the next division and d 2 is the remainder of the next division

46 211 Review of Decimal to Base-r Conversion Example
Repeated division algorithm Example Convert 4343 to hex 4343/16 = remainder = 7 271/16 = remainder = 15 16/16 = remainder = 0 1/16 = remainder = 1 So = 10F716 To check the answer convert back to decimal 10F7 = 1* *16 + 7*1 = = 4343

47 211 Review of Decimal Fractions to Hex
Repeated multiplication of base 16 time the fractional portion to generate the digits .884 * 16 =  (since 14 = E in hex) ~ .E16 .144 * 16 =  ~.E216 .304 * 16 =  ~ .E2416 .864 *16 =  ~ .E24D16 .824 * 16 =  ~ .E24DD16 .184 * 16 = 2.94 so if we are rounding to five hex digits since 2 < ½ *16 we round down and = .E24DD16

48 Convert to IEEE 754 float = 10F7. E24DD16, now converting to binary *20, = *212 , So ExpField = = 138 = = Sign bit = 1 (to represent a negative) and the fraction is the firs 110t 23 bits above almost because we round. ^ The rounding rule usually is (round to even if exactly ½ ) In this case the next digit is a 0 so we round down Rep = = = 0x C B F

49 Pop Quiz – Normal floats
Value Float F = 212; = Significand M = frac = Exponent E = Bias = Exp = = Floating Point Representation: Hex: Binary: exponent: 212:

50 Jan 25 Pop Quiz - denormals
What is the representation of the largest denormalized IEEE float (in binary)? In hex? What is its value as an expression, i.e., (-1)sign m * 2exp How many floats are there between 1.0 and 2.0? What is a/the representation of minus infinity? In C are there more ints or doubles? In Math are there more rationals than integers ? Extra credit for pop quiz 1: what is aleph-0?

51 Lab01 msb.c – extract and print most significant byte
Unions Pointers Masks and such sin.c – using math library gcc sin.c -lm

52 label_show_bytes void label_show_bytes(char *label, pointer start, int len) { int i; printf("%s ", label); for (i = 0; i < len; i++) printf("0x%p\t0x%.2x", start+i, start[i]); printf("\n"); }

53 Unions and such float f, pi; union { float fl; unsigned int ui; } un;
pi = ; /* what precision!*/ un.fl = -1*pi; printf("float %f assigned to unsigned %ud\n", pi, un.ui); label_show_bytes("un.fl", (pointer)&un.fl, 4); label_show_bytes("un.ui", (pointer)&un.ui, 4);

54 Pointers Declarations Dereferences Address-of operator
Explicit Casting

55 Masks and such

56 Math library /usr/lib ar t /usr/lib/libm.a gcc sin.c -lm

57 FP Multiplication Operands Exact Result Fixing Implementation
(–1)s1 M1 2E1 * (–1)s2 M2 2E2 Exact Result (–1)s M 2E Sign s: s1 ^ s2 Significand M: M1 * M2 Exponent E: E1 + E2 Fixing If M ≥ 2, shift M right, increment E If E out of range, overflow Round M to fit frac precision Implementation Biggest chore is multiplying significands

58 FP Addition Operands Exact Result Fixing (–1)s1 M1 2E1 (–1)s2 M2 2E2
Assume E1 > E2 Exact Result (–1)s M 2E Sign s, significand M: Result of signed align & add Exponent E: E1 Fixing If M ≥ 2, shift M right, increment E if M < 1, shift M left k positions, decrement E by k Overflow if E out of range Round M to fit frac precision (–1)s1 M1 (–1)s2 M2 E1–E2 + (–1)s M

59 Floating Point in C C Guarantees Two Levels Conversions
float single precision double double precision Conversions Casting between int, float, and double changes numeric values Double or float to int Truncates fractional part Like rounding toward zero Not defined when out of range Generally saturates to TMin or TMax int to double Exact conversion, as long as int has ≤ 53 bit word size int to float Will round according to rounding mode

60 IEEE 754 Rounding Algorithms
Round to nearest, ties to even – rounds to the nearest value; if the number falls midway it is rounded to the nearest value with an even (zero) least significant bit, which occurs 50% of the time; this is the default algorithm for binary floating-point and the recommended default for decimal Round to nearest, ties away from zero – rounds to the nearest value; if the number falls midway it is rounded to the nearest value above (for positive numbers) or below (for negative numbers) Round toward 0 – directed rounding towards zero (also called truncation) Round toward – directed rounding towards positive infinity Round toward – directed rounding towards negative infinity.

61 Ariane 5 Why Exploded 37 seconds after liftoff
Cargo worth $500 million Why Computed horizontal velocity as floating point number Converted to 16-bit integer Worked OK for Ariane 4 Overflowed for Ariane 5 Used same software

62 IA32 Processors Totally Dominate Computer Market Evolutionary Design
Starting in 1978 with 8086 Added more features as time goes on Still support old features, although obsolete Complex Instruction Set Computer (CISC) Many different instructions with many different formats But, only small subset encountered with Linux programs Hard to match performance of Reduced Instruction Set Computers (RISC) But, Intel has done just that!

63 X86 Evolution: Programmer’s View
Name Date Transistors K 16-bit processor. Basis for IBM PC & DOS Limited to 1MB address space. DOS only gives you 640K K Added elaborate, but not very useful, addressing scheme Basis for IBM PC-AT and Windows K Extended to 32 bits. Added “flat addressing” Capable of running Unix Linux/gcc uses no instructions introduced in later models

64 X86 Evolution: Programmer’s View
Name Date Transistors M Pentium M Pentium/MMX M Added special collection of instructions for operating on 64-bit vectors of 1, 2, or 4 byte integer data PentiumPro M Added conditional move instructions Big change in underlying microarchitecture

65 X86 Evolution: Programmer’s View
Name Date Transistors Pentium III M Added “streaming SIMD” instructions for operating on 128-bit vectors of 1, 2, or 4 byte integer or floating point data Our fish machines Pentium M Added 8-byte formats and 144 new instructions for streaming SIMD mode

66 X86 Evolution: Clones Advanced Micro Devices (AMD) Historically
AMD has followed just behind Intel A little bit slower, a lot cheaper Recently Recruited top circuit designers from Digital Equipment Corp. Exploited fact that Intel distracted by IA64 Now are close competitors to Intel Developing own extension to 64 bits

67 X86 Evolution: Clones Transmeta Recent start-up
Employer of Linus Torvalds Radically different approach to implementation Translates x86 code into “Very Long Instruction Word” (VLIW) code High degree of parallelism Shooting for low-power market

68 New Species: IA64 Name Date Transistors Itanium 2001 10M
Extends to IA64, a 64-bit architecture Radically new instruction set designed for high performance Will be able to run existing IA32 programs On-board “x86 engine” Joint project with Hewlett-Packard Itanium M Big performance boost

69 Assembly Programmer’s View
CPU Memory Addresses Registers E I P Object Code Program Data OS Data Data Condition Codes Instructions Stack Programmer-Visible State EIP Program Counter Address of next instruction Register File Heavily used program data Condition Codes Store status information about most recent arithmetic operation Used for conditional branching Memory Byte addressable array Code, user data, (some) OS data Includes stack used to support procedures

70 Turning C into Object Code
Code in files p1.c p2.c Compile with command: gcc -O p1.c p2.c -o p Use optimizations (-O) Put resulting binary in file p text C program (p1.c p2.c) Compiler (gcc -S) Asm program (p1.s p2.s) text Assembler (gcc or as) Object program (p1.o p2.o) Static libraries (.a) binary Linker (gcc or ld) binary Executable program (p)

71 Compiling Into Assembly
C Code Generated Assembly int sum(int x, int y) { int t = x+y; return t; } _sum: pushl %ebp movl %esp,%ebp movl 12(%ebp),%eax addl 8(%ebp),%eax movl %ebp,%esp popl %ebp ret Obtain with command gcc -O -S code.c Produces file code.s

72 Assembly Characteristics
Minimal Data Types “Integer” data of 1, 2, or 4 bytes Data values Addresses (untyped pointers) Floating point data of 4, 8, or 10 bytes No aggregate types such as arrays or structures Just contiguously allocated bytes in memory Primitive Operations Perform arithmetic function on register or memory data Transfer data between memory and register Load data from memory into register Store register data into memory Transfer control Unconditional jumps to/from procedures Conditional branches

73 Object Code Assembler Code for sum Linker Translates .s into .o
Binary encoding of each instruction Nearly-complete image of executable code Missing linkages between code in different files Linker Resolves references between files Combines with static run-time libraries E.g., code for malloc, printf Some libraries are dynamically linked Linking occurs when program begins execution Code for sum 0x <sum>: 0x55 0x89 0xe5 0x8b 0x45 0x0c 0x03 0x08 0xec 0x5d 0xc3 Total of 13 bytes Each instruction 1, 2, or 3 bytes Starts at address 0x401040

74 Machine Instruction Example
C Code Add two signed integers Assembly Add 2 4-byte integers “Long” words in GCC parlance Same instruction whether signed or unsigned Operands: x: Register %eax y: Memory M[%ebp+8] t: Register %eax Return function value in %eax Object Code 3-byte instruction Stored at address 0x401046 int t = x+y; addl 8(%ebp),%eax Similar to expression x += y 0x401046:

75 Disassembling Object Code
Disassembled <_sum>: 0: push %ebp 1: 89 e mov %esp,%ebp 3: 8b 45 0c mov 0xc(%ebp),%eax 6: add 0x8(%ebp),%eax 9: 89 ec mov %ebp,%esp b: 5d pop %ebp c: c ret d: 8d lea 0x0(%esi),%esi Disassembler objdump -d p Useful tool for examining object code Analyzes bit pattern of series of instructions Produces approximate rendition of assembly code Can be run on either a.out (complete executable) or .o file

76 Alternate Disassembly
Disassembled Object 0x <sum>: push %ebp 0x <sum+1>: mov %esp,%ebp 0x <sum+3>: mov 0xc(%ebp),%eax 0x <sum+6>: add 0x8(%ebp),%eax 0x <sum+9>: mov %ebp,%esp 0x40104b <sum+11>: pop %ebp 0x40104c <sum+12>: ret 0x40104d <sum+13>: lea 0x0(%esi),%esi 0x401040: 0x55 0x89 0xe5 0x8b 0x45 0x0c 0x03 0x08 0xec 0x5d 0xc3 Within gdb Debugger gdb p disassemble sum Disassemble procedure x/13b sum Examine the 13 bytes starting at sum

77 What Can be Disassembled?
% objdump -d WINWORD.EXE WINWORD.EXE: file format pei-i386 No symbols in "WINWORD.EXE". Disassembly of section .text: <.text>: : push %ebp : 8b ec mov %esp,%ebp : 6a ff push $0xffffffff : push $0x a: dc 4c 30 push $0x304cdc91 Anything that can be interpreted as executable code Disassembler examines bytes and reconstructs assembly source

78 Moving Data Moving Data Operand Types movl Source,Dest:
%eax %edx %ecx %ebx %esi %edi %esp %ebp Moving Data movl Source,Dest: Move 4-byte (“long”) word Lots of these in typical code Operand Types Immediate: Constant integer data Like C constant, but prefixed with ‘$’ E.g., $0x400, $-533 Encoded with 1, 2, or 4 bytes Register: One of 8 integer registers But %esp and %ebp reserved for special use Others have special uses for particular instructions Memory: 4 consecutive bytes of memory Various “address modes”

79 movl Operand Combinations
Source Destination C Analog Reg movl $0x4,%eax temp = 0x4; Imm Mem movl $-147,(%eax) *p = -147; Reg movl %eax,%edx temp2 = temp1; movl Reg Mem movl %eax,(%edx) *p = temp; Mem Reg movl (%eax),%edx temp = *p; Cannot do memory-memory transfers with single instruction

80 Simple Addressing Modes
Normal (R) Mem[Reg[R]] Register R specifies memory address movl (%ecx),%eax Displacement D(R) Mem[Reg[R]+D] Register R specifies start of memory region Constant displacement D specifies offset movl 8(%ebp),%edx

81 Using Simple Addressing Modes
swap: pushl %ebp movl %esp,%ebp pushl %ebx movl 12(%ebp),%ecx movl 8(%ebp),%edx movl (%ecx),%eax movl (%edx),%ebx movl %eax,(%edx) movl %ebx,(%ecx) movl -4(%ebp),%ebx movl %ebp,%esp popl %ebp ret Set Up void swap(int *xp, int *yp) { int t0 = *xp; int t1 = *yp; *xp = t1; *yp = t0; } Body Finish

82 Understanding Swap Stack yp xp Rtn adr Old %ebp %ebp 4 8 12 Offset •
4 8 12 Offset Old %ebx -4 void swap(int *xp, int *yp) { int t0 = *xp; int t1 = *yp; *xp = t1; *yp = t0; } Stack Register Variable %ecx yp %edx xp %eax t1 %ebx t0 movl 12(%ebp),%ecx # ecx = yp movl 8(%ebp),%edx # edx = xp movl (%ecx),%eax # eax = *yp (t1) movl (%edx),%ebx # ebx = *xp (t0) movl %eax,(%edx) # *xp = eax movl %ebx,(%ecx) # *yp = ebx

83 Understanding Swap Address 123 0x124 456 0x120 0x11c %eax %edx %ecx
%ebx %esi %edi %esp %ebp 0x104 0x118 Offset 0x114 yp 12 0x120 0x110 xp 8 0x124 0x10c 4 Rtn adr 0x108 %ebp 0x104 -4 0x100 movl 12(%ebp),%ecx # ecx = yp movl 8(%ebp),%edx # edx = xp movl (%ecx),%eax # eax = *yp (t1) movl (%edx),%ebx # ebx = *xp (t0) movl %eax,(%edx) # *xp = eax movl %ebx,(%ecx) # *yp = ebx

84 Understanding Swap Address 123 0x124 456 0x120 0x11c %eax %edx %ecx
%ebx %esi %edi %esp %ebp 0x120 0x104 0x118 Offset 0x114 yp 12 0x120 0x110 xp 8 0x124 0x10c 4 Rtn adr 0x108 %ebp 0x104 -4 0x100 movl 12(%ebp),%ecx # ecx = yp movl 8(%ebp),%edx # edx = xp movl (%ecx),%eax # eax = *yp (t1) movl (%edx),%ebx # ebx = *xp (t0) movl %eax,(%edx) # *xp = eax movl %ebx,(%ecx) # *yp = ebx

85 Understanding Swap Address 123 0x124 456 0x120 0x11c %eax %edx %ecx
%ebx %esi %edi %esp %ebp 0x124 0x120 0x104 0x118 Offset 0x114 yp 12 0x120 0x110 xp 8 0x124 0x10c 4 Rtn adr 0x108 %ebp 0x104 -4 0x100 movl 12(%ebp),%ecx # ecx = yp movl 8(%ebp),%edx # edx = xp movl (%ecx),%eax # eax = *yp (t1) movl (%edx),%ebx # ebx = *xp (t0) movl %eax,(%edx) # *xp = eax movl %ebx,(%ecx) # *yp = ebx

86 Understanding Swap Address 123 0x124 456 0x120 0x11c %eax %edx %ecx
%ebx %esi %edi %esp %ebp 456 0x124 0x120 0x104 0x118 Offset 0x114 yp 12 0x120 0x110 xp 8 0x124 0x10c 4 Rtn adr 0x108 %ebp 0x104 -4 0x100 movl 12(%ebp),%ecx # ecx = yp movl 8(%ebp),%edx # edx = xp movl (%ecx),%eax # eax = *yp (t1) movl (%edx),%ebx # ebx = *xp (t0) movl %eax,(%edx) # *xp = eax movl %ebx,(%ecx) # *yp = ebx

87 Understanding Swap Address 123 0x124 456 0x120 0x11c %eax %edx %ecx
%ebx %esi %edi %esp %ebp 456 0x124 0x120 123 0x104 0x118 Offset 0x114 yp 12 0x120 0x110 xp 8 0x124 0x10c 4 Rtn adr 0x108 %ebp 0x104 -4 0x100 movl 12(%ebp),%ecx # ecx = yp movl 8(%ebp),%edx # edx = xp movl (%ecx),%eax # eax = *yp (t1) movl (%edx),%ebx # ebx = *xp (t0) movl %eax,(%edx) # *xp = eax movl %ebx,(%ecx) # *yp = ebx

88 Understanding Swap Address 456 0x124 456 0x120 0x11c %eax %edx %ecx
%ebx %esi %edi %esp %ebp 456 0x124 0x120 123 0x104 0x118 Offset 0x114 yp 12 0x120 0x110 xp 8 0x124 0x10c 4 Rtn adr 0x108 %ebp 0x104 -4 0x100 movl 12(%ebp),%ecx # ecx = yp movl 8(%ebp),%edx # edx = xp movl (%ecx),%eax # eax = *yp (t1) movl (%edx),%ebx # ebx = *xp (t0) movl %eax,(%edx) # *xp = eax movl %ebx,(%ecx) # *yp = ebx

89 Understanding Swap Address 456 0x124 123 0x120 0x11c %eax %edx %ecx
%ebx %esi %edi %esp %ebp 456 0x124 0x120 123 0x104 0x118 Offset 0x114 yp 12 0x120 0x110 xp 8 0x124 0x10c 4 Rtn adr 0x108 %ebp 0x104 -4 0x100 movl 12(%ebp),%ecx # ecx = yp movl 8(%ebp),%edx # edx = xp movl (%ecx),%eax # eax = *yp (t1) movl (%edx),%ebx # ebx = *xp (t0) movl %eax,(%edx) # *xp = eax movl %ebx,(%ecx) # *yp = ebx

90 Indexed Addressing Modes
Most General Form D(Rb,Ri,S) Mem[Reg[Rb]+S*Reg[Ri]+ D] D: Constant “displacement” 1, 2, or 4 bytes Rb: Base register: Any of 8 integer registers Ri: Index register: Any, except for %esp Unlikely you’d use %ebp, either S: Scale: 1, 2, 4, or 8 Special Cases (Rb,Ri) Mem[Reg[Rb]+Reg[Ri]] D(Rb,Ri) Mem[Reg[Rb]+Reg[Ri]+D] (Rb,Ri,S) Mem[Reg[Rb]+S*Reg[Ri]]

91 Address Computation Examples
%edx 0xf000 %ecx 0x100 Expression Computation Address 0x8(%edx) 0xf x8 0xf008 (%edx,%ecx) 0xf x100 0xf100 (%edx,%ecx,4) 0xf *0x100 0xf400 0x80(,%edx,2) 2*0xf x80 0x1e080


Download ppt "Lecture 5 Assembly Language"

Similar presentations


Ads by Google