Ch. 2
  Two’s Complement
    How to represent a negative number in 2’s complement
    Exception: -Tmin = Tmin
  Boolean vs. Logical
    &, |, ~, ^ -- bit-wise operations
    &&, ||, ! -- logical operations
  Floating Point
Ch. 3
  IA32 registers
  Memory addressing modes
  Control structures
    If .. Else
    While
    Case
      Jump table
  Linked list
  Data storage
    Word boundary alignment
    Union
    struct
Ch. 4
  Logic Design
    Truth table
  Y86
    Machine code, assembly
  CPU Control Unit for Y86
    5 instruction execution cycles (phases)
    Control logic (gray boxes)
    Seq vs. Pipelined
Chap. 2 Info Rep & Manipulation
  ASCII: ‘L’ = 0x4c, ‘o’ = 0x6f
  What is 0x70 6f 6f 4c ?
  Ans: No idea! It depends on how the bits are interpreted:
    ASCII --> ‘Loop’
    int --> …
    Float --> ~ 1.11 (binary) x 2^97
    Machine code --> (Y86) jmp *??4c6f6f
    Base 5 -->
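The point that one bit pattern has several readings can be tried directly. The following is a minimal sketch, not part of the original slides: the helper names `word_as_text` and `word_as_float` are made up for illustration, and the ‘Loop’ reading assumes a little-endian machine (such as IA32) with IEEE single-precision floats.

```c
#include <stdint.h>
#include <string.h>

/* Copy the 4 bytes of a 32-bit word into a NUL-terminated string,
   in the order they sit in memory on this machine.
   (Hypothetical helper; on a little-endian machine the low byte comes first.) */
void word_as_text(uint32_t w, char out[5]) {
    memcpy(out, &w, 4);
    out[4] = '\0';
}

/* Reinterpret the same 32 bits as a float without changing any bit.
   Assumes float is a 32-bit IEEE single-precision value. */
float word_as_float(uint32_t w) {
    float f;
    memcpy(&f, &w, sizeof f);
    return f;
}
```

Calling both helpers on 0x706f6f4c gives a short string and a very large float from the same four bytes; printing the word with %d gives a third reading.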
Unsigned vs. Signed
  Expression Evaluation
    If unsigned and signed values are mixed in a single expression, the signed values are implicitly cast to unsigned
    127U < -128 (8-bit): -128 is read as 128U, so the comparison is true
  Right Shift: x >> y
    Arithmetic shift
      Replicate most significant bit on left
      Useful with two’s complement integer representation
  [Figure: unsigned vs. two’s complement]
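Both rules above can be observed in a few lines of C. This is a sketch under stated assumptions: the function names are hypothetical, `int` is taken to be 32 bits, and right-shifting a negative signed value is implementation-defined in C (GCC and Clang on x86 shift arithmetically, which is what the sketch relies on).

```c
#include <stdint.h>

/* Mixed comparison: s is converted to unsigned before the compare,
   so a negative s behaves like a very large unsigned number. */
int mixed_less_than(int s, unsigned u) {
    return s < u;   /* really compares (unsigned)s with u */
}

/* Logical right shift: unsigned operands are zero-filled on the left. */
uint32_t shift_logical(uint32_t x, int k) {
    return x >> k;
}

/* Arithmetic right shift: on the assumed compilers, signed operands
   have their most significant bit replicated on the left. */
int32_t shift_arith(int32_t x, int k) {
    return x >> k;
}
```

For example, `mixed_less_than(-1, 0u)` is false even though -1 < 0 mathematically, because -1 is compared as the largest unsigned value.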
Boolean vs. Logical Operators
  &, |, ~, ^ -- bit-wise operations
    ~0x41 --> 0xBE            ~01000001 (binary) --> 10111110 (binary)
    0x69 & 0x55 --> 0x41      01101001 & 01010101 --> 01000001
    0x69 | 0x55 --> 0x7D      01101001 | 01010101 --> 01111101
  &&, ||, ! -- logical operations (result is always 0 or 1)
    !0x41 --> 0x00
    !0x00 --> 0x01
    !!0x41 --> 0x01
    0x69 && 0x55 --> 0x01
    0x69 || 0x55 --> 0x01
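The slide’s examples can be checked mechanically. A small sketch, with hypothetical wrapper names; the operands are held in `unsigned char` so that `~` gives the 8-bit results shown above.

```c
/* Bit-wise operators work on each bit position independently. */
unsigned char bit_not(unsigned char a) { return (unsigned char)~a; }
unsigned char bit_and(unsigned char a, unsigned char b) { return a & b; }
unsigned char bit_or(unsigned char a, unsigned char b)  { return a | b; }

/* Logical operators treat any nonzero operand as true
   and always produce 0 or 1. */
int log_not(int a)        { return !a; }
int log_and(int a, int b) { return a && b; }
int log_or(int a, int b)  { return a || b; }
```

Note the contrast on the same inputs: `bit_and(0x69, 0x55)` keeps the common bits (0x41), while `log_and(0x69, 0x55)` only reports that both operands are nonzero (1).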
Floating Point (example format: 1 sign bit, 3 exponent bits, 2 fraction bits; bias 3)

  s exp frac    E     Value
  Denormalized numbers
  0 000 00     -2     0
  0 000 01     -2     1/4*1/4 = 1/16          closest to zero
  0 000 10     -2     2/4*1/4 = 2/16
  0 000 11     -2     3/4*1/4 = 3/16          largest denorm
  Normalized numbers
  0 001 00     -2     4/4*1/4 = 4/16          smallest norm
  0 001 01     -2     5/4*1/4 = 5/16
  …
  0 010 11     -1     7/4*1/2 = 14/16
  0 011 00      0     4/4*1 = 16/16 = 1
  0 011 01      0     5/4*1 = 20/16 = 1.25
  0 011 10      0     6/4*1 = 24/16 = 1.5
  0 011 11      0     7/4*1 = 28/16 = 1.75
  0 100 00      1     4/4*2 = 2
  …
  0 110 10      3     6/4*8 = 12
  0 110 11      3     7/4*8 = 14              largest norm
  0 111 00     n/a    inf
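The table rows follow from three cases (denormalized, normalized, special), which can be written out as a decoder. This is a sketch for the 6-bit teaching format only; the function names are hypothetical, the bias of 3 is inferred from the E column, and the NaN case (exp = 111, frac != 0) is the usual IEEE convention rather than something the slide shows.

```c
#include <math.h>   /* only for the INFINITY and NAN macros */

/* 2^e for small integer e, computed by repeated scaling. */
static double pow2(int e) {
    double r = 1.0;
    for (; e > 0; e--) r *= 2.0;
    for (; e < 0; e++) r /= 2.0;
    return r;
}

/* Decode a 6-bit value laid out as s (1 bit), exp (3 bits), frac (2 bits). */
double tiny_float_value(unsigned bits) {
    unsigned s    = (bits >> 5) & 0x1;
    unsigned exp  = (bits >> 2) & 0x7;
    unsigned frac = bits & 0x3;
    const int bias = 3;                 /* assumption: 2^(3-1) - 1 */
    double sign = s ? -1.0 : 1.0;

    if (exp == 7)                       /* special values */
        return frac == 0 ? sign * INFINITY : NAN;
    if (exp == 0)                       /* denormalized: E = 1 - bias, no implied 1 */
        return sign * (frac / 4.0) * pow2(1 - bias);
    /* normalized: E = exp - bias, implied leading 1 */
    return sign * (1.0 + frac / 4.0) * pow2((int)exp - bias);
}
```

The denormalized case uses E = 1 - bias rather than 0 - bias, which is why the largest denorm (3/16) and the smallest norm (4/16) are evenly spaced.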
Integer Addition
  Operands: w bits                u, v
  True Sum: w+1 bits              u + v
  Discard Carry: w bits           TAddw(u, v)
  TAdd(u, v) can overflow in either direction
    u, v < 0: NegOver (negative overflow, "underflow")
    u, v > 0: PosOver (positive overflow)
Computer Arithmetic vs. Math Theorems
    int x = random();
    float f = random();
    double d = random();
  Are these expressions always true?
    (d + f) - d == f
    x * x >= 0
    (x & 0xF) != 0xF || (x << 28) < 0
    x > 0 || -x >= 0
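One way to explore these questions is to evaluate each expression on chosen inputs. The sketch below is not from the slides: the function names are hypothetical, `int` is modeled as `int32_t`, and the wrapping arithmetic is done in unsigned and converted back (signed overflow is undefined behavior in C, and the conversion back is implementation-defined; GCC and Clang wrap as two’s complement).

```c
#include <stdint.h>

/* (d + f) - d == f : can fail when f is lost to rounding in d + f. */
int float_identity_holds(double d, float f) {
    return (d + f) - d == f;
}

/* x * x >= 0 : can fail when the 32-bit product wraps around. */
int square_nonneg(int32_t x) {
    int32_t sq = (int32_t)((uint32_t)x * (uint32_t)x);
    return sq >= 0;
}

/* (x & 0xF) != 0xF || (x << 28) < 0 : if the low 4 bits are all 1,
   bit 3 lands in the sign position after the shift. */
int low_bits_check(int32_t x) {
    int32_t shifted = (int32_t)((uint32_t)x << 28);
    return (x & 0xF) != 0xF || shifted < 0;
}

/* x > 0 || -x >= 0 : fails for Tmin, since -Tmin = Tmin. */
int pos_or_negation_nonneg(int32_t x) {
    int32_t neg = (int32_t)(0u - (uint32_t)x);
    return x > 0 || neg >= 0;
}
```

Counterexamples worth trying: `float_identity_holds(1e30, 1.0f)`, `square_nonneg(50000)`, and `pos_or_negation_nonneg(INT32_MIN)` all return 0, while `low_bits_check` returns 1 for every input.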
Chap. 3 Assembly Programmer’s View
  [Figure: CPU (Registers, PC, Condition Codes) sends Addresses to Memory and exchanges Data and Instructions with it; Memory holds Object Code, Program Data, OS Data, and the Stack]
  Programmer-Visible State
    Program Counter (PC): EIP (IA32) or RIP (x86-64)
      Address of next instruction
    Register File
      Heavily used program data
    Condition Codes
      Store status information about most recent arithmetic operation
      Used for conditional branching
    Memory
      Byte addressable array
      Code, user data, (some) OS data
      Includes stack used to support procedures
IA32 Machine Basics
Indexed Addressing Modes
  Most General Form
    D(Rb,Ri,S)      Mem[Reg[Rb] + S*Reg[Ri] + D]
      D:  Constant “displacement” of 1, 2, or 4 bytes
      Rb: Base register: any of 8 integer registers
      Ri: Index register: any, except for %esp
          Unlikely you’d use %ebp, either
      S:  Scale: 1, 2, 4, or 8
  Special Cases
    (Rb,Ri)         Mem[Reg[Rb] + Reg[Ri]]
    D(Rb,Ri)        Mem[Reg[Rb] + Reg[Ri] + D]
    (Rb,Ri,S)       Mem[Reg[Rb] + S*Reg[Ri]]
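The address formula above is plain 32-bit arithmetic, so it can be modeled as a one-line function. This is an illustrative sketch: `effective_address` is a hypothetical name, and the register values used in the usage note are made up.

```c
#include <stdint.h>

/* Address computed for D(Rb,Ri,S): Reg[Rb] + S*Reg[Ri] + D.
   Pass 0 for a missing base or index and 1 for a missing scale.
   Arithmetic wraps modulo 2^32, as on IA32. */
uint32_t effective_address(uint32_t d, uint32_t rb, uint32_t ri, uint32_t s) {
    return rb + s * ri + d;
}
```

With the hypothetical values %edx = 0xf000 and %ecx = 0x100, `0x8(%edx)` is `effective_address(0x8, 0xf000, 0, 1)` = 0xf008, `(%edx,%ecx,4)` is `effective_address(0, 0xf000, 0x100, 4)` = 0xf400, and `0x80(,%edx,2)` is `effective_address(0x80, 0, 0xf000, 2)` = 0x1e080.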
3.6.7. Switch Statements

    typedef enum {ADD, MULT, MINUS, DIV, MOD, BAD} op_type;

    char unparse_symbol(op_type op)
    {
      switch (op) {
      case ADD :  return '+';
      case MULT:  return '*';
      case MINUS: return '-';
      case DIV:   return '/';
      case MOD:   return '%';
      case BAD:   return '?';
      }
    }

  Implementation Options
    Series of conditionals
      Good if few cases
      Slow if many
    Jump Table
      Lookup branch target
      Avoids conditionals
      Possible when cases are small integer constants
    GCC
      Picks one based on case structure
  Bug in example code
    No default given
Jump Table Structure
  Switch Form
      switch(op) {
        case val_0:   Block 0
        case val_1:   Block 1
          • • •
        case val_n-1: Block n–1
      }
  Jump Table
      jtab:  Targ0
             Targ1
             Targ2
               •
             Targn-1
  Jump Targets
      Targ0:    Code Block 0
      Targ1:    Code Block 1
      Targ2:    Code Block 2
        •
      Targn-1:  Code Block n–1
  Approx. Translation
      target = JTab[op];
      goto *target;
Jump Table Targets & Completion
  Targets & Completion
      .L51:
          movl $43,%eax    # ’+’
          jmp .L49
      .L52:
          movl $42,%eax    # ’*’
          jmp .L49
      .L53:
          movl $45,%eax    # ’-’
          jmp .L49
      .L54:
          movl $47,%eax    # ’/’
          jmp .L49
      .L55:
          movl $37,%eax    # ’%’
          jmp .L49
      .L56:
          movl $63,%eax    # ’?’
          # Fall Through to .L49
  Table Contents
      .section .rodata
          .align 4
      .L57:
          .long .L51    # Op = 0
          .long .L52    # Op = 1
          .long .L53    # Op = 2
          .long .L54    # Op = 3
          .long .L55    # Op = 4
          .long .L56    # Op = 5
  Enumerated Values
      ADD    0
      MULT   1
      MINUS  2
      DIV    3
      MOD    4
      BAD    5
Approx. Translation
      target = JTab[op];
      goto *target;

  unparse_symbol:
      pushl %ebp              # Setup
      movl %esp,%ebp          # Setup
      movl 8(%ebp),%eax       # eax = op
      cmpl $5,%eax            # Compare op : 5
      ja .L49                 # If > 5 goto done
      jmp *.L57(,%eax,4)      # goto Table[op]
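The bounds check followed by an indexed indirect jump can be imitated in portable C. Standard C has no `goto *target`, so this sketch swaps in a table of function pointers, with an indirect call standing in for the indirect jump; the names `jtab`, `handler_t`, and `unparse_symbol_jt` are hypothetical. Returning '?' for out-of-range values is an assumption of the sketch: the assembly above simply jumps to the exit.

```c
typedef enum { ADD, MULT, MINUS, DIV, MOD, BAD } op_type;

typedef char (*handler_t)(void);

/* One small "code block" per case. */
static char do_add(void)   { return '+'; }
static char do_mult(void)  { return '*'; }
static char do_minus(void) { return '-'; }
static char do_div(void)   { return '/'; }
static char do_mod(void)   { return '%'; }
static char do_bad(void)   { return '?'; }

/* Table indexed by the enumerated value, like .L57 in the assembly. */
static const handler_t jtab[] = {
    do_add, do_mult, do_minus, do_div, do_mod, do_bad
};

char unparse_symbol_jt(op_type op) {
    /* A single unsigned compare rejects negative and too-large values,
       mirroring cmpl $5 followed by ja. */
    if ((unsigned)op > 5u)
        return '?';              /* assumption: treat out-of-range as BAD */
    return jtab[op]();           /* indirect call through the table */
}
```

The cost of the lookup does not depend on the number of cases, which is the advantage over a series of conditionals.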
Procedure Call Example
      804854e:  e8 3d 06 00 00    call 8048b90 <main>
      8048553:  50                pushl %eax

                  Before call       After call 8048b90
      0x110
      0x10c
      0x108       123               123
      0x104                         0x8048553 (return address)

      %esp        0x108             0x104
      %eip        0x804854e         0x8048b90

  %eip is program counter
Recursive Factorial

    int rfact(int x)
    {
      int rval;
      if (x <= 1)
        return 1;
      rval = rfact(x-1);
      return rval * x;
    }

          .globl rfact
          .type rfact,@function
      rfact:
          pushl %ebp
          movl %esp,%ebp
          pushl %ebx
          movl 8(%ebp),%ebx
          cmpl $1,%ebx
          jle .L78
          leal -1(%ebx),%eax
          pushl %eax
          call rfact
          imull %ebx,%eax
          jmp .L79
          .align 4
      .L78:
          movl $1,%eax
      .L79:
          movl -4(%ebp),%ebx
          movl %ebp,%esp
          popl %ebp
          ret

  Registers
    %eax used without first saving
    %ebx used, but saved at beginning & restored at end
Data Alignment

    struct S1 {
      char c;
      int i[2];
      double v;
    } *p;

  Windows (including Cygwin): K = 8, due to double element
      c      at p+0
      i[0]   at p+4     (multiple of 4)
      i[1]   at p+8
      v      at p+16    (multiple of 8)
      end    at p+24    (multiple of 8)
  Linux: K = 4; double treated like a 4-byte data type
      c      at p+0
      i[0]   at p+4     (multiple of 4)
      i[1]   at p+8
      v      at p+12    (multiple of 4)
      end    at p+20    (multiple of 4)
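Which of the two layouts a given compiler uses can be checked with `offsetof`. A minimal sketch, assuming a 4-byte `int` and an 8-byte `double`; the accessor function names are hypothetical. On a modern 64-bit Linux toolchain the result is expected to match the K = 8 layout, since the K = 4 rule above describes 32-bit IA32 Linux.

```c
#include <stddef.h>

struct S1 {
    char c;
    int i[2];
    double v;
};

/* Offsets chosen by the compiler for this platform's alignment rules. */
size_t s1_offset_c(void) { return offsetof(struct S1, c); }
size_t s1_offset_i(void) { return offsetof(struct S1, i); }
size_t s1_offset_v(void) { return offsetof(struct S1, v); }

/* Total size, including any padding before v and at the end. */
size_t s1_size(void) { return sizeof(struct S1); }
```

An offset of 16 for `v` with a size of 24 corresponds to the K = 8 picture; an offset of 12 with a size of 20 corresponds to the K = 4 picture.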
Union Allocation
  Principles
    Overlay union elements
    Allocate according to largest element
    Can only use one field at a time

    union U1 {
      char c;
      int i[2];
      double v;
    } *up;

      c, i[0], and v all start at up+0; i[1] at up+4; end at up+8

    struct S1 {
      char c;
      int i[2];
      double v;
    } *sp;

      (Windows alignment)
      c at sp+0, i[0] at sp+4, i[1] at sp+8, v at sp+16, end at sp+24
Union to Access Bit Patterns

    typedef union {
      float f;
      unsigned u;
    } bit_float_t;

    float bit2float(unsigned u) {
      bit_float_t arg;
      arg.u = u;
      return arg.f;
    }

    unsigned float2bit(float f) {
      bit_float_t arg;
      arg.f = f;
      return arg.u;
    }

  [Figure: u and f overlaid in the same 4 bytes]
  Get direct access to bit representation of float
  bit2float generates float with given bit pattern
    NOT the same as (float) u
  float2bit generates bit pattern from float
    NOT the same as (unsigned) f
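The difference between reinterpreting bits and converting values shows up clearly on the bit pattern of 1.0. The sketch below repeats the slide’s two functions and adds two hypothetical helpers, `cast2float` and `cast2unsigned`, for the value conversions; it assumes a 32-bit `unsigned` and IEEE single-precision `float`.

```c
typedef union {
    float f;
    unsigned u;
} bit_float_t;

/* Same bits, read as a float. */
float bit2float(unsigned u) {
    bit_float_t arg;
    arg.u = u;
    return arg.f;
}

/* Same bits, read as an unsigned. */
unsigned float2bit(float f) {
    bit_float_t arg;
    arg.f = f;
    return arg.u;
}

/* Value conversions: the number is preserved, the bits change. */
float cast2float(unsigned u)    { return (float)u; }
unsigned cast2unsigned(float f) { return (unsigned)f; }
```

For instance, `bit2float(0x3f800000)` is 1.0, whereas `cast2float(0x3f800000)` is the number 1065353216.0; in the other direction, `float2bit(1.0f)` is 0x3f800000 while `cast2unsigned(1.0f)` is 1.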
Chap. 4. Y86 Instruction Set

  Instruction          Byte 0    Byte 1    Bytes 2-5
  nop                  0 0
  halt                 1 0
  rrmovl rA, rB        2 0       rA rB
  irmovl V, rB         3 0       8  rB     V
  rmmovl rA, D(rB)     4 0       rA rB     D
  mrmovl D(rB), rA     5 0       rA rB     D
  OPl rA, rB           6 fn      rA rB
  jXX Dest             7 fn      Dest (bytes 1-4)
  call Dest            8 0       Dest (bytes 1-4)
  ret                  9 0
  pushl rA             A 0       rA 8
  popl rA              B 0       rA 8

  OPl function codes (fn)
    addl  6 0
    subl  6 1
    andl  6 2
    xorl  6 3

  jXX function codes (fn)
    jmp   7 0
    jle   7 1
    jl    7 2
    je    7 3
    jne   7 4
    jge   7 5
    jg    7 6
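To see how the nibble layout turns into bytes, here is a rough encoder for two of the instructions. It is a sketch under assumptions that the slide does not state: the function names are hypothetical, register identifiers are passed as raw 4-bit numbers (the slide does not list the register numbering), the low nibble of byte 0 is taken to be 0 for irmovl, and the immediate V is stored least significant byte first.

```c
#include <stddef.h>
#include <stdint.h>

enum { Y86_NO_REG = 0x8 };   /* the "8" placeholder nibble in irmovl, pushl, popl */

/* irmovl V, rB  ->  3 0 | 8 rB | V (4 bytes, little-endian assumed). */
size_t y86_encode_irmovl(uint32_t v, unsigned rb, uint8_t out[6]) {
    out[0] = 0x30;
    out[1] = (uint8_t)((Y86_NO_REG << 4) | (rb & 0xF));
    for (int k = 0; k < 4; k++)
        out[2 + k] = (uint8_t)(v >> (8 * k));
    return 6;                /* instruction length in bytes */
}

/* OPl rA, rB  ->  6 fn | rA rB, with fn = 0..3 for addl, subl, andl, xorl. */
size_t y86_encode_opl(unsigned fn, unsigned ra, unsigned rb, uint8_t out[2]) {
    out[0] = (uint8_t)(0x60 | (fn & 0xF));
    out[1] = (uint8_t)(((ra & 0xF) << 4) | (rb & 0xF));
    return 2;
}
```

Under those assumptions, an irmovl of the constant 0x100 into register number 3 encodes as the six bytes 30 83 00 01 00 00, and a subl (fn = 1) from register 0 to register 3 encodes as 61 03.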
Building Blocks
  Combinational Logic
    Compute Boolean functions of inputs
    Continuously respond to input changes
    Operate on data and implement control
    [Figure: ALU (inputs A, B, fun), MUX, and equality (=) blocks]
  Storage Elements
    Store bits
    Addressable memories
    Non-addressable registers
    Loaded only as clock rises
    [Figure: register file with read ports A (srcA, valA) and B (srcB, valB), write port W (dstW, valW), and Clock input]
SEQ Hardware Key
  Blue boxes: predesigned hardware blocks
    E.g., memories, ALU
  Gray boxes: control logic
  White ovals: labels for signals
  Thick lines: 32-bit word values
  Thin lines: 4-8 bit values
  Dotted lines: 1-bit values