Presentation is loading. Please wait.

Presentation is loading. Please wait.

Assembly Language for Intel-Based Computers, 5th Edition

Similar presentations


Presentation on theme: "Assembly Language for Intel-Based Computers, 5th Edition"— Presentation transcript:

0 IA-32 Processor Architecture
Department of Computer Science National Tsing Hua University

1 Assembly Language for Intel-Based Computers, 5th Edition
Kip Irvine Chapter 2: IA-32 Processor Architecture Slides prepared by the author Revision date: June 4, 2006 (c) Pearson Education, All rights reserved. You may modify and copy this slide show for your personal use, or for use in the classroom, as long as this copyright statement, the author's name, and the title are not changed.

2 Chapter Overview Goal: Understand IA-32 architecture
Basic Concepts of Computer Organization Instruction execution cycle Basic computer organization Data storage in memory How programs run IA-32 Processor Architecture IA-32 Memory Management Components of an IA-32 Microcomputer Input-Output System

3 Recall: Computer Model for ASM
MOV AX, a ADD AX, b MOV x, AX x  a + b Memory a Register CPU AX b BX PC ... x + - ALU

4 Meanings of the Code (assumed)
Assembly code Machine code MOV AX, a (Take the data stored in memory address ‘a’, and move it to register AX) ADD AX, b (Take the data stored in memory address ‘b’, and add it to register AX) MOV x, AX (Take the data stored in register AX, and move it to memory address ‘x’) MOV register address AX memory a ADD

5 Another Computer Model for ASM
Processor Stored program architecture Register Memory AX BX x a data b MOV AX, a ADD AX, b MOV x, AX ALU IR PC PC: program counter IR: instruction register address

6 Step 1: Fetch (MOV AX, a) Register Memory ALU AX x BX a data b
BX x a data b MOV AX, a ADD AX, b MOV x, AX ALU IR PC address

7 Step 2: Decode (MOV AX,a) Register Memory ALU AX x BX a data b
BX x a data Controller b MOV AX, a ADD AX, b MOV x, AX ALU clock IR PC address

8 Step 3: Execute (MOV AX,a)
Register Memory AX BX x a data Controller b MOV AX, a ADD AX, b MOV x, AX ALU clock IR PC address

9 Step 1: Fetch (ADD AX,b) Register Memory ALU AX x BX a data b
BX x a data b MOV AX, a ADD AX, b MOV x, AX ALU IR PC address

10 Step 2: Decode (ADD AX,b) Register Memory ALU AX x BX a data b
BX x a data Controller b MOV AX, a ADD AX, b MOV x, AX ALU clock IR PC address

11 Step 3a: Execute (ADD AX,b)
Register Memory AX BX x a data Controller b MOV AX, a + ADD AX, b MOV x, AX ALU clock IR PC address

12 Step 3b: Write Back (ADD AX,b)
Register Memory AX BX x a data Controller b MOV AX, a ADD AX, b MOV x, AX ALU clock IR PC address

13 Basic Computer Organization
Clock synchronizes CPU operations Control unit (CU) coordinates execution sequence ALU performs arithmetic and bitwise processing

14 Clock Operations in a computer are triggered and thus synchronized by a clock Clock tells “when”: (no need to ask each other!!) When to put data on output lines When to read data from input lines Clock cycle measures time of a single operation Must long enough to allow signal propagation

15 Instruction/Data for Operations
Where are the instructions needed for computer operations from? Stored-program architecture: The whole program is stored in main memory, including program instructions (code) and data CPU loads the instructions and data from memory for execution Don’t worry about the disk for now Where are the data needed for execution? Registers (inside the CPU, discussed later) Memory Constant encoded in the instructions

16 Memory Organized like mailboxes, numbered 0, 1, 2, 3,…, 2n-1.
Each box can hold 8 bits (1 byte) So it is called byte-addressing Address of mailboxes: 16-bit address is enough for up to 64K 20-bit for 1M 32-bit for 4G Most servers need more than 4G!! That’s why we need 64-bit CPUs like Alpha (DEC/Compaq/HP) or Merced (Intel)

17 Storing Data in Memory Character String: Integer:
So how are strings like “Hello, World!” are stored in memory? ASCII Code! (or Unicode…etc.) Each character is stored as a byte Review: how is “1234” stored in memory? Integer: A byte can hold an integer number: between 0 and 255 (unsigned) or between –128 and 127 (2’s complement) How to store a bigger number? Review: how is 1234 stored in memory?

18 Big or Little Endian? Example: 1234 is stored in 2 bytes.
= in binary = 04 D2 in hexadecimal Do you store 04 or D2 first? Big Endian: 04 first Little Endian: D2 first Intel’s choice Reason: more consistent for variable length (e.g., 2 bytes, 4 bytes, 8 bytes…etc.)

19 Cache Memory High-speed expensive static RAM both inside and outside the CPU. Level-1 cache: inside the CPU chip Level-2 cache: often outside the CPU chip Cache hit: when data to be read is already in cache memory Cache miss: when data to be read is not in cache memory

20 How a Program Runs?

21 Load and Execute Process
OS searches for program’s filename in current directory and then in directory path If found, OS reads information from directory OS loads file into memory from disk OS allocates memory for program information OS executes a branch to cause CPU to execute the program. A running program is called a process Process runs by itself. OS tracks execution and responds to requests for resources When the process ends, its handle is removed and memory is released How? OS is only a program!

22 Multitasking OS can run multiple programs at same time
Multiple threads of execution within the same program Scheduler utility assigns a given amount of CPU time to each running program Rapid switching of tasks Gives illusion that all programs are running at the same time Processor must support task switching What supports are needed from hardware?

23 What's Next General Concepts IA-32 Processor Architecture
Modes of operation Basic execution environment Floating-point unit Intel microprocessor history IA-32 Memory Management Components of an IA-32 Microcomputer Input-Output System

24 Modes of Operation Virtual-8086 mode Protected mode Real-address mode
native mode (Windows, Linux) Programs are given separate memory areas named segments Real-address mode native MS-DOS System management mode power management, system security, diagnostics Virtual-8086 mode hybrid of Protected each program has its own 8086 computer

25 Basic Execution Environment
Address space: Protected mode 4 GB 32-bit address Real-address and Virtual-8086 modes 1 MB space 20-bit address

26 Basic Execution Environment
Program execution registers: named storage locations inside the CPU, optimized for speed Z N Register Memory PC IR ALU clock Controller

27 General Purpose Registers
Used for arithmetic and data movement Addressing: AX, BX, CX, DX: 16 bits Split into H and L parts, 8 bits each Extended into E?X to become 32-bit register (i.e., EAX, EBX,…etc.)

28 Index and Base Registers
Some registers have only a 16-bit name for their lower half:

29 Some Specialized Register Uses
General purpose registers EAX: accumulator, automatically used by multiplication and division instructions ECX: loop counter ESP: stack pointer ESI, EDI: index registers (source, destination) for memory transfer, e.g. a[i,j] EBP: frame pointer to reference function parameters and local variables on stack EIP: instruction pointer (i.e. program counter)

30 Some Specialized Register Uses
Segment registers In real-address mode: indicate base addresses of preassigned memory areas named segments In protected mode: hold pointers to segment descriptor tables CS: code segment DS: data segment SS: stack segment ES, FS, GS: additional segments EFLAGS Status and control flags (single binary bits) Control the operation of the CPU or reflect the outcome of some CPU operation

31 Status Flags (EFLAGS) Reflect the outcomes of arithmetic and logical operations performed by the CPU Carry: unsigned arithmetic out of range Overflow: signed arithmetic out of range Sign: result is negative Zero: result is zero Auxiliary Carry: carry from bit 3 to bit 4 Parity: sum of 1 bits is an even number Z N Register Memory PC IR ALU clock Controller

32 System Registers Application programs cannot access system registers
IDTR (Interrupt Descriptor Table Register) GDTR (Global Descriptor Table Register) LDTR (Local Descriptor Table Register) Task Register Debug Registers Control registers CR0, CR2, CR3, CR4 Model-Specific Registers

33 Floating-Point, MMX, XMM Reg.
Eight 80-bit floating-point data registers ST(0), ST(1), , ST(7) arranged in a stack used for all floating-point arithmetic Eight 64-bit MMX registers Eight 128-bit XMM registers for single-instruction multiple-data (SIMD) operations

34 Intel Microprocessors
Early microprocessors: Intel 8080: 64K addressable RAM, 8-bit registers CP/M operating system S-100 BUS architecture 8-inch floppy disks! Intel 8086/8088 IBM-PC used 8088 1 MB addressable RAM, 16-bit registers 16-bit data bus (8-bit for 8088) separate floating-point unit (8087) This is where “real-address mode” comes from!

35 Intel Microprocessors
The IBM-AT Intel 80286 16 MB addressable RAM Protected memory Introduced IDE bus architecture 80287 floating point unit Intel IA-32 Family Intel386: 4 GB addressable RAM, 32-bit registers, paging (virtual memory) Intel486: instruction pipelining Pentium: superscalar, 32-bit address bus, 64-bit internal data path

36 Intel Microprocessors
Intel P6 Family Pentium Pro: advanced optimization techniques in microcode Pentium II: MMX (multimedia) instruction set Pentium III: SIMD (streaming extensions) instructions Pentium 4 and Xeon: Intel NetBurst micro-architecture, tuned for multimedia

37 What’s Next General Concepts of Computer Architecture
IA-32 Processor Architecture IA-32 Memory Management Real-address mode Calculating linear addresses Protected mode Multi-segment model Paging Components of an IA-32 Microcomputer Input-Output System Understand it from the view point of the processor

38 Real-Address Mode Programs have 1 MB RAM maximum addressable with 20-bit addresses Application programs can access any area of the 1MB memory Single tasking: one program at a time, but CPU can momentarily interrupt that program to process requests (called interrupts) from peripherals MS-DOS runs in real-address mode

39 Ancient History IBM PC XT (Intel 8088/8086) is a so-called 16-bit machine Each register has 16 bits 216 = = 64K Want to use more memory (640K, 1M)… How to hold 20-bit addresses with 16-bit registers in the 8086 processor? Solution: segmented memory All of memory is divided into 64KB units called segments  16 segments in total One 16-bit register for segment value and another for 16-bit offset within the segment

40 Segmented Memory linear addresses
Segmented memory addressing: absolute (linear) address is a combination of a 16-bit segment value (in CS, DS, SS, or ES) added to a 16-bit offset linear addresses one segment represented as 8000 0000 segment value offset

41 Calculating Linear Addresses
Given a segment address, multiply it by 16 (add a hexadecimal zero), and add it to the offset  all done by the processor Example: convert 08F1:0100 to a linear address Adjusted Segment value: 0 8 F 1 0 Add the offset: Linear address:

42 Protected Mode Designed for multitasking
Each process (running program) is assigned a total of 4GB of addressable RAM Two parts: Segmentation: provides a mechanism of isolating individual code, data, and stack so that multiple programs can run without interfering one another Paging: provides demand-paged virtual memory where sections of a program’s execution environ. are moved into physical memory as needed  Give segmentation the illusion that it has 4GB of physical memory

43 Segmentation in Protected Mode
Segment: a logical unit of storage (not the same as the “segment” in real-address mode) e.g., code/data/stack of a program, system data structures Variable size Processor hardware provides protection All segments in the system are in the processor’s linear address space (physical space if without paging) Need to specify: base address, size, type, …  segment descriptor & descriptor table  linear address = base address + offset

44 Flat Segment Model Use a single global descriptor table (GDT)
All segments (at least 1 code and 1 data) mapped to entire 32-bit address space

45 Multi-Segment Model Local descriptor table (LDT) for each program
One descriptor for each segment located in a system segment of LDT type

46 Segmentation Addressing
Program references a memory location with a logical address: segment selector + offset Segment selector: provides an offset into the descriptor table CS/DS/SS points to descriptor table for code/data/stack segment

47 Convert Logical to Linear Address
Segment selector points to a segment descriptor, which contains base address of the segment. The 32-bit offset from the logical address is added to the segment’s base address, generating a 32-bit linear address

48 Paging Supported directly by the processor
Divides each segment into 4096-byte blocks called pages Part of running program is in memory, part is on disk Sum of all programs can be larger than physical memory Virtual memory manager (VMM): An OS utility that manages loading and unloading of pages Page fault: issued by processor when a page must be loaded from disk

49 What's Next General Concepts IA-32 Processor Architecture
IA-32 Memory Management Components of an IA-32 Microcomputer Skipped … Input-Output System

50 What's Next General Concepts IA-32 Processor Architecture
IA-32 Memory Management Components of an IA-32 Microcomputer Input-Output System How to access I/O systems?

51 Different Access Levels of I/O
Call a HLL library function (C++, Java) easy to do; abstracted from hardware slowest performance Call an operating system function specific to one OS; device-independent medium performance Call a BIOS (basic input-output system) function may produce different results on different systems knowledge of hardware required usually good performance Communicate directly with the hardware May not be allowed by some operating systems

52 Displaying a String of Characters
When a HLL program displays a string of characters, the following steps take place: Calls an HLL library function to write the string to standard output Library function (Level 3) calls an OS function, passing a string pointer OS function (Level 2) calls a BIOS subroutine, passing ASCII code and color of each character BIOS subroutine (Level 1) maps the character to a system font, and sends it to a hardware port attached to the video controller card Video controller card (Level 0) generates timed hardware signals to video display to display pixels

53 Summary Central Processing Unit (CPU) Arithmetic Logic Unit (ALU)
Instruction execution cycle Multitasking Floating Point Unit (FPU) Complex Instruction Set Real mode and Protected mode Motherboard components Memory types Input/Output and access levels


Download ppt "Assembly Language for Intel-Based Computers, 5th Edition"

Similar presentations


Ads by Google