COMP 2130 Introduction to Computer Systems
Computing Science, Thompson Rivers University
Ref.: Carnegie Mellon, CSPP, Intel, AMD
Contents
◦ What is an ISA (Instruction Set Architecture)?
◦ A brief history of Intel processors and architectures
◦ C, assembly, machine code
◦ x86 basics: registers
How can you improve the performance?
Pipeline: user program in C (.c file), C compiler, assembler, executable file, hardware
Phases: code time, compile time, run time
Performance Issues 4 The time required to execute a program depends on: ◦ The program (as written in C, for instance) ◦ The compiler: what set of assembler instructions it translates the C program into ◦ The instruction set architecture (ISA): what set of instructions it makes available to the compiler ◦ The hardware implementation: how much time it takes to execute an instruction
Instruction Set Architectures 5 The ISA defines: ◦ The system’s state (e.g. registers, memory, program counter) ◦ The instructions the CPU can execute ◦ The effect that each of these instructions will have on the system state CPU Memory PC Registers
General ISA Design Decisions 6 Instructions ◦ What instructions are available? What do they do? ◦ How are they encoded? Registers ◦ How many registers are there? ◦ How wide are they? Memory ◦ How do you specify a memory location? ◦ Addressing modes?
x86
Processors that implement the x86 ISA dominate the server, desktop and laptop markets
Evolutionary design
◦ Backwards compatible all the way back to the 8086, introduced in 1978
◦ More features added over time
Complex instruction set computer (CISC)
◦ Many different instructions with many different formats
◦ But only a small subset is encountered in Linux programs
◦ Contrast with Reduced Instruction Set Computers (RISC), which use simpler instructions
Intel x86 Evolution: Milestones
◦ 8087: The really interesting issues for Intel floating-point functions begin with the 8087 coprocessor.
◦ 286: Introduced in 1982, the 80286 (or 286) added memory functionality and expanded the external data bus to 16 bits.
◦ 386: The 386 introduced both a new logical memory organization and 32-bit processing to the personal computing world, yet did so in such a way as to retain compatibility with the 286 and 8088.
Intel x86 Processors
Machine evolution: Pentium, Pentium/MMX, PentiumPro, Pentium III, Pentium 4, Core 2 Duo, Core i7
Added features
◦ Instructions to support multimedia operations (parallel operations on 1-, 2-, and 4-byte data)
◦ Instructions to enable more efficient conditional operations
◦ More cores!
Figure: Intel Core i7
Intel x86 Processors 10 4th Generation Intel® Core™ i7 Processors
x86 Clones: Advanced Micro Devices (AMD)
Historically
◦ AMD has followed just behind Intel
◦ A little bit slower, a lot cheaper
Then
◦ Recruited top circuit designers from Digital Equipment and other downward-trending companies
◦ Built Opteron: tough competitor to Pentium 4
◦ Developed x86-64, their own extension of x86 to 64 bits
AMD Opteron™ 6300 Series Processors
Definitions
Architecture (also instruction set architecture, or ISA): the parts of a processor design that one needs to understand to write assembly code
◦ “What is directly visible to software”
Microarchitecture: implementation of the architecture
Is cache size “architecture”? How about core frequency? And number of registers?
Assembly Programmer’s View
Program counter
◦ Address of next instruction
◦ Called “EIP” (Extended Instruction Pointer) in IA32 or “RIP” in x86-64
Register file
◦ Heavily used program data
Condition codes
◦ Store status information about the most recent arithmetic operation
◦ Used for conditional branching
Memory
◦ Byte-addressable array
◦ Code, user data, (some) OS data
◦ Includes the stack used to support procedures
Figure labels: Addresses, Data, Instructions
Introduction to Assembly Language
Assembly is an intermediate language between machine code and high-level languages.
It sits toward the low-level end because it follows the norm of “one language instruction for one machine instruction”.
Advantages over machine code
◦ Better human understanding
◦ Easier to write and debug
◦ Mnemonics for instructions
◦ Reserves memory locations for data
Advantage over high-level languages
◦ Allows more efficient/optimized programs
Why should we spend our time learning machine code?
1. Even though compilers do most of the work in generating assembly code, being able to read and understand it is an important skill for serious programmers.
2. By reading assembly code, we can:
◦ understand the optimization capabilities of the compiler and analyze the underlying inefficiencies in the code;
◦ understand the function invocation mechanism;
◦ understand how computer systems (HW) and operating systems (SW) run programs.
Assembly Language - Basic Syntax
An assembly program can be divided into three sections:
◦ data section: used for declaring initialized data or constants
◦ bss section (Block Started by Symbol): used for declaring variables
◦ text section: used for keeping the actual code
Assembly Language - Basic Syntax
The syntax for declaring the data section is:
    section .data
The syntax for declaring the bss section is:
    section .bss
The syntax for declaring the text section is:
    section .text
    global _start
    _start:
Comments
An assembly language comment begins with a semicolon (;). It may contain any printable character, including blanks. It can appear on a line by itself, like:
    ; This program displays a message on screen
Assembly Language - Basic Syntax 22 Assembly language statements are entered one statement per line. Each statement follows the following format: [label] mnemonic [operands] [;comment]
Assembly Language - Basic Syntax
Some examples of typical assembly language statements:
    INC COUNT        ; Increment the memory variable COUNT
    MOV TOTAL, 48    ; Transfer the value 48 into the memory variable TOTAL
    ADD AH, BH       ; Add the content of the BH register into the AH register
    AND MASK1, 128   ; Perform AND operation on the variable MASK1 and 128
    ADD MARKS, 10    ; Add 10 to the variable MARKS
    MOV AL, 10       ; Transfer the value 10 to the AL register
Assembly Language - Instruction Syntax 24
Assembly Language - Instruction Syntax 25 There are two conventions about assembly instruction syntax and representations: Intel and AT&T.
Assembly Language - Instruction Syntax 26
Assembly Language - Basic Syntax 27 Intel
Assembly Language - Basic Syntax 28 Intel
Assembly Language - Basic Syntax 29 AT&T
Assembly Language - Basic Syntax 30 AT&T You can use -m32 to generate the equivalent 32-bit x86 assembly language.
Compiling Into Assembly
Code in files p1.c p2.c
Compile with command: gcc -O1 p1.c p2.c -o p
◦ Use basic optimizations ( -O1 )
◦ Put resulting binary in file p
Pipeline
◦ C program (p1.c p2.c), text: Compiler (gcc -S)
◦ Asm program (p1.s p2.s), text: Assembler (gcc or as)
◦ Object program (p1.o p2.o), binary: Linker (gcc or ld), combined with static libraries (.a)
◦ Executable program (p), binary
Compiling Into Assembly
C code:
    int sum(int x, int y)
    {
        int t = x+y;
        return t;
    }
Generated IA32 assembly:
    sum:
        pushl %ebp
        movl %esp,%ebp
        movl 12(%ebp),%eax
        addl 8(%ebp),%eax
        popl %ebp
        ret
Obtain with command
    $ gcc -O1 -S code.c
Produces a file code.s (AT&T syntax is the default), or
    gcc -S -masm=intel code.c
Produces a file code.s (Intel syntax).
For Mac OS X:
    clang++ -S -mllvm --x86-asm-syntax=intel code.c
Three Basic Kinds of Instructions
◦ Perform arithmetic function on register or memory data
◦ Transfer data between memory and register: load data from memory into register; store register data into memory
◦ Transfer control: unconditional jumps to/from procedures; conditional branches
Assembly Characteristics: Data Types
◦ “Integer” data of 1, 2, 4 (IA32), or 8 (just in x86-64) bytes: data values; addresses (untyped pointers)
◦ Floating point data of 4, 8, or 10 bytes
◦ What about “aggregate” types such as arrays or structs? No aggregate types, just contiguously allocated bytes in memory
Object Code
Code for sum
    0x :
    0x55 0x89 0xe5 0x8b 0x45 0x0c 0x03 0x45 0x08 0x89 0xec 0x5d 0xc3
◦ Total of 13 bytes
◦ Each instruction 1, 2, or 3 bytes
◦ Starts at address 0x
◦ Not at all obvious where each instruction starts and ends
Assembler
◦ Translates .s into .o
◦ Binary encoding of each instruction
◦ Nearly-complete image of executable code
◦ Missing links between code in different files
Linker
◦ Resolves references between object files and (re)locates their data
◦ Combines with static run-time libraries, e.g. code for malloc, printf
◦ Some libraries are dynamically linked: linking occurs when the program begins execution
Machine Instruction Example
C code: add two signed integers
    int t = x+y;
Assembly: add two 4-byte integers
    addl 8(%ebp),%eax
◦ Similar to expression: x += y
◦ More precisely: int eax; int *ebp; eax += ebp[2]
◦ Operands: x: register %eax; y: memory M[%ebp+8]; t: register %eax (return function value in %eax)
Object code
    0x401046: 03 45 08
◦ 3-byte instruction
◦ Stored at address 0x401046
Disassembling Object Code
Disassembled:
    0:  55        push %ebp
    1:  89 e5     mov  %esp,%ebp
    3:  8b 45 0c  mov  0xc(%ebp),%eax
    6:  03 45 08  add  0x8(%ebp),%eax
    9:  89 ec     mov  %ebp,%esp
    b:  5d        pop  %ebp
    c:  c3        ret
Disassembler
    objdump -d p
◦ Useful tool for examining object code ( man 1 objdump )
◦ Analyzes bit pattern of series of instructions (delineates instructions)
◦ Produces near-exact rendition of assembly code
◦ Can be run on either p (complete executable) or p1.o / p2.o file
What Is A Register?
◦ A register is a small, volatile storage area inside the CPU that holds the data instructions operate on
◦ There are 8 “general purpose” registers
◦ There is 1 “instruction pointer” that points to the next instruction to execute
◦ Of the 8, six are commonly used for general data, whereas the other two (ESP and EBP) are reserved for the stack and rarely used for anything else
Registers
◦ EAX – stores the value returned from a function; also used as an accumulator
◦ EBX – base pointer to the data section
◦ ECX – counter register for loops and string operations
◦ EDX – I/O pointer
◦ ESI – source index
◦ EDI – destination index
◦ ESP – stack pointer
◦ EBP – stack frame base pointer (where the stack starts for a specific function)
◦ EIP – pointer to the next instruction to execute
32 bit to 64 bit assembly
All registers can be accessed in 16-bit and 32-bit modes. In 16-bit mode, a register is identified by its two-letter abbreviation (the names in the list above without the leading 'E'), e.g. AX. In 32-bit mode, this two-letter abbreviation is prefixed with an 'E' (extended), e.g. EAX.
32 bit to 64 bit assembly
For example, 'EAX' is the accumulator register as a 32-bit value. In the 64-bit version, the 'E' is replaced with an 'R', so the 64-bit version of 'EAX' is called 'RAX'.
Register nesting: %rax (64 bits) contains %eax (its low 32 bits), which contains %ax (its low 16 bits), which is split into %ah (high byte) and %al (low byte). The 16-bit virtual registers exist for backwards compatibility.
Types of data storage 42 Caller-saved registers Caller must save/restore these registers when live across call Callee is free to use them
Types of data storage 43 Callee-saved registers Callee must save/restore these registers when it uses them Caller expects callee to not change them
Types of data storage
Caller-saved registers – EAX, ECX & EDX
◦ The calling function is responsible for saving these registers if it needs their values after the call.
Callee-saved registers – EBP, EBX, ESI & EDI
◦ The called function is responsible for saving the values of these registers before using them and restoring those values before it returns.
EFLAGS register 45
EFLAGS register
A special register that holds many single-bit flags.
◦ ZERO FLAG (ZF): set if the result of the instruction is zero; cleared otherwise.
◦ SIGN FLAG (SF): set equal to the most significant bit of the result.
◦ OVERFLOW FLAG (OF): indicates that a signed arithmetic operation overflowed the high-order (leftmost) bit of the data.
◦ DIRECTION FLAG (DF): determines the left or right direction for moving or comparing string data. When DF is 0, the string operation proceeds left to right; when DF is 1, it proceeds right to left.
◦ INTERRUPT FLAG (IF): determines whether external interrupts, such as keyboard entry, are ignored or processed. External interrupts are disabled when IF is 0 and enabled when IF is 1.
◦ TRAP FLAG (TF): puts the processor in single-step mode. The DEBUG program we used sets the trap flag, so we could step through the execution one instruction at a time.
Instructions
NOP – does nothing and changes no values
◦ May be used for delay
◦ It is actually an exchange of a register with itself: XCHG EAX, EAX
PUSH – pushes a word, double-word or quad-word onto the stack
◦ Automatically decrements the stack pointer ESP (by 4 for a double-word)
POP – pops data from the stack
◦ Automatically increments ESP
EQU – assembler directive that gives a name to a constant value
HLT – halts execution (the processor stops until the next interrupt)
Operation Suffixes
GAS assembly instructions are generally suffixed with the letters "b", "s", "w", "l", "q" or "t" to indicate the size of the operand being manipulated.
◦ b = byte (8 bit)
◦ s = short (16 bit integer) or single (32-bit floating point)
◦ w = word (16 bit)
◦ l = long (32 bit integer or 64-bit floating point)
◦ q = quad (64 bit)
◦ t = ten bytes (80-bit floating point)
Moving Data: IA32
Moving data (data transfer operations)
    movl Source, Dest
Operand types
◦ Immediate: constant integer data, e.g. $0x400, $-533. Like a C constant, but prefixed with ‘$’. Encoded with 1, 2, or 4 bytes.
◦ Register: one of 8 integer registers (%eax, %ecx, %edx, %ebx, %esi, %edi, %esp, %ebp), e.g. %eax, %edx. But %esp and %ebp are reserved for special use, and others have special uses for particular instructions.
◦ Memory: 4 consecutive bytes of memory at the address given by a register. Simplest example: (%eax). Various other “address modes” exist.
Addressing Modes
Memory may be addressed in the following ways:
a. Direct memory addressing (register)
b. Indirect memory addressing
c. Offset addressing (register, offset)
Memory Addressing
Assume the following values are stored at the indicated memory addresses and registers:
Address   Value
0x100     0xFF
0x104     0xAB
0x108     0x13
0x10C     0x11

Register  Value
%eax      0x100
%ecx      0x01
%edx      0x03
Assembler Types
There are three main assemblers:
◦ MASM – the Microsoft Assembler. It outputs OMF files (but Microsoft's linker can convert them to win32 format).
◦ GAS – the GNU assembler. It uses AT&T-style syntax by default, which many people dislike; however, it can be configured to use and understand Intel-style syntax. It was designed to be part of the back end of the GNU Compiler Collection (gcc). It is run with the command as, and it is the one followed in the book.
◦ NASM – the "Netwide Assembler". It is free, small, and can output many different types of object files. The language is much more sensible than MASM in many respects.
Questions? 59