The Pentium Processor (Chapter 7). To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.


The Pentium Processor

Pentium Family
• Intel introduced its first microprocessor, the 4-bit 4004, in 1971
• 8-bit microprocessors
  » 8080
  » 8085
• 16-bit processors
  » 8086 introduced in 1978
    – 20-bit address bus, 16-bit data bus
  » 8088 is a less expensive version
    – Uses 8-bit data bus
  » Can address up to 4 segments of 64 KB
  » Referred to as the real mode

Pentium Family (cont’d)
• 80186
  » A faster version of 8086
  » 16-bit data bus and 20-bit address bus
  » Improved instruction set
• 80286 was introduced in 1982
  » 24-bit address bus
  » 16 MB address space
  » Enhanced with memory protection capabilities
  » Introduced protected mode
    – Segmentation in protected mode is different from the real mode
  » Backwards compatible

Pentium Family (cont’d)
• 80386 was introduced in 1985
  » First 32-bit processor
  » 32-bit data bus and 32-bit address bus
  » 4 GB address space
  » Segmentation can be turned off (flat model)
  » Introduced paging
• 80486 was introduced in 1989
  » Improved version of 386
  » Combined coprocessor functions for performing floating-point arithmetic
  » Added parallel execution capability to instruction decode and execution units
    – Achieves scalar execution of 1 instruction/clock
  » Later versions introduced energy savings for laptops

Pentium Family (cont’d)
• Pentium (80586) was introduced in 1993
  » Similar to 486 but with 64-bit data bus
  » Wider internal datapaths
    – 128- and 256-bit wide
  » Added second execution pipeline
    – Superscalar performance
    – Two instructions/clock
  » Doubled on-chip L1 cache
    – 8 KB data
    – 8 KB instruction
  » Added branch prediction

Pentium Family (cont’d)
• Pentium Pro was introduced in 1995
  » Three-way superscalar
    – 3 instructions/clock
  » 36-bit address bus
    – 64 GB address space
  » Introduced dynamic execution
    – Out-of-order execution
    – Speculative execution
  » In addition to the L1 cache
    – Has 256 KB L2 cache

Pentium Family (cont’d)
• Pentium II was introduced in 1997
  » Introduced multimedia (MMX) instructions
  » Doubled on-chip L1 cache
    – 16 KB data
    – 16 KB instruction
  » Introduced comprehensive power management features
    – Sleep
    – Deep sleep
  » In addition to the L1 cache
    – Has 256 KB L2 cache
• Pentium III, Pentium IV, …

Pentium Family (cont’d)
• Itanium processor
  » RISC design
    – Previous designs were CISC
  » 64-bit processor
  » Uses 64-bit address bus
  » 128-bit data bus
  » Introduced several advanced features
    – Speculative execution
    – Predication to eliminate branches
    – Branch prediction

Pentium Processor

Pentium Processor (cont’d)
Data bus (D0 – D63)
• 64-bit data bus
Address bus (A3 – A31)
• Only 29 lines
  » No A0 – A2 (due to 8-byte wide data bus)
Byte enable (BE0# – BE7#)
• Identifies the set of bytes to read or write
  » BE0#: least significant byte (D0 – D7)
  » BE1#: next byte (D8 – D15)
  » …
  » BE7#: most significant byte (D56 – D63)
• Any combination of bytes can be specified
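The split between the 29 address lines and the eight byte enables can be sketched in a few lines of Python. This is only an illustrative model, not a description of the bus protocol: the function name `bus_signals` is hypothetical, and the sketch assumes the access fits inside one aligned 8-byte quadword.

```python
def bus_signals(address, size):
    """Model which address-line value and byte enables one access would use.

    Hypothetical helper for illustration; assumes the access does not
    cross an 8-byte boundary.
    """
    lane = address & 0x7                 # byte position inside the 8-byte quadword
    if size < 1 or lane + size > 8:
        raise ValueError("access must fit inside one aligned 8-byte quadword")
    quadword = address >> 3              # value carried on address lines A31..A3
    enables = set(range(lane, lane + size))  # n in the set means BEn# is asserted
    return quadword, enables
```

For example, a 2-byte access at address 1002H selects quadword 200H and asserts BE2# and BE3#, which is why the low three address bits never need to leave the chip.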

Pentium Processor (cont’d)
Data parity (DP0 – DP7)
• Even parity for 8 bytes of data
  » DP0: D0 – D7
  » DP1: D8 – D15
  » …
  » DP7: D56 – D63
Parity check (PCHK#)
• Indicates the parity check result on data read
• Parity is checked only for valid bytes
  » Indicated by BE# signals
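Even parity means each parity bit is chosen so that the data byte plus its parity bit contain an even number of 1s. A minimal Python sketch of that rule follows; the function names are hypothetical and the model ignores bus timing entirely.

```python
def even_parity_bit(byte):
    """Parity bit that makes the total number of 1s (data + parity) even."""
    ones = bin(byte & 0xFF).count("1")
    return ones & 1          # 1 only when the data byte has an odd number of 1s


def data_parity(quadword):
    """Return [DP0, DP1, ..., DP7] for a 64-bit value, one bit per byte lane."""
    return [even_parity_bit((quadword >> (8 * n)) & 0xFF) for n in range(8)]
```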

Pentium Processor (cont’d)
Parity enable (PEN#)
• Determines whether parity check should be used
Address parity (AP)
• Bad address parity during inquire cycles
Memory/IO (M/IO#)
• Defines bus cycle: memory or I/O
Write/Read (W/R#)
• Distinguishes between write and read cycles
Data/Code (D/C#)
• Distinguishes between data and code

Pentium Processor (cont’d)
Cacheability (CACHE#)
• Read cycle: indicates internal cacheability
• Write cycle: burst write-back
Bus lock (LOCK#)
• Used in read-modify-write cycle
• Useful in implementing semaphores
Interrupt (INTR)
• External interrupt signal
Nonmaskable interrupt (NMI)
• External NMI signal

Pentium Processor (cont’d)
Clock (CLK)
• System clock signal
Bus ready (BRDY#)
• Used to extend the bus cycle
  » Introduces wait states
Bus request (BREQ)
• Used in bus arbitration
Backoff (BOFF#)
• Aborts all pending bus cycles and floats the bus
• Useful to resolve deadlock between two bus masters

Pentium Processor (cont’d)
Bus hold (HOLD)
• Completes outstanding bus cycles and floats bus
• Asserts HLDA to give control of bus to another master
Bus hold acknowledge (HLDA)
• Indicates the Pentium has given control to another local master
• Pentium continues execution from its internal caches
Cache enable (KEN#)
• If asserted, the current cycle is transformed into cache line fill

Pentium Processor (cont’d)
Write-back/Write-through (WB/WT#)
• Determines the cache write policy to be used
Reset (RESET)
• Resets the processor
• Starts execution at FFFFFFF0H
• Invalidates all internal caches
Initialization (INIT)
• Similar to RESET but internal caches and FP registers are not flushed
• After powerup, use RESET (not INIT)

Pentium Registers
Four 32-bit data registers can be used as
• Four 32-bit registers (EAX, EBX, ECX, EDX)
• Four 16-bit registers (AX, BX, CX, DX)
• Eight 8-bit registers (AH, AL, BH, BL, CH, CL, DH, DL)
Some registers have special use
• ECX for count in loop instructions
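The 16-bit and 8-bit names are views onto parts of the same 32-bit register rather than separate storage. A small Python sketch of that aliasing, using hypothetical helper names and EAX as the example:

```python
def low16(e):
    """AX view: lower 16 bits of EAX."""
    return e & 0xFFFF


def high8(e):
    """AH view: bits 8..15 of EAX."""
    return (e >> 8) & 0xFF


def low8(e):
    """AL view: bits 0..7 of EAX."""
    return e & 0xFF


def write_low8(e, value):
    """Writing AL changes only the low byte; the rest of EAX is preserved."""
    return (e & 0xFFFFFF00) | (value & 0xFF)
```

With EAX holding 12345678H, AX reads as 5678H, AH as 56H and AL as 78H; writing FFH to AL leaves 123456FFH in EAX.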

Pentium Registers (cont’d)
Two index registers
• 16- or 32-bit registers
• Used in string instructions
  » Source (SI) and destination (DI)
• Can be used as general-purpose data registers
Two pointer registers
• 16- or 32-bit registers
• Used exclusively to maintain the stack


Pentium Registers (cont’d)
Control registers
• (E)IP
  » Program counter
• (E)FLAGS
  » Status flags
    – Record status information about the result of the last arithmetic/logical instruction
  » Direction flag
    – Forward/backward direction for data copy
  » System flags
    – IF: interrupt enable
    – TF: trap flag (useful in single-stepping)

Pentium Registers (cont’d)
Segment registers
• Six 16-bit registers
• Support segmented memory architecture
• At any time, only six segments are accessible
• Segments contain distinct contents
  » Code
  » Data
  » Stack

Real Mode Architecture
Pentium supports two modes
• Real mode
  » Uses 16-bit addresses
  » Runs 8086 programs
  » Pentium acts as a faster 8086
• Protected mode
  » 32-bit mode
  » Native mode of Pentium
  » Supports segmentation and paging

Real Mode Architecture (cont’d)
Segmented organization
• 16-bit wide segments
• Two components
  » Base (16 bits)
  » Offset (16 bits)
Two-component specification is called logical address
• Also called effective address
20-bit physical address

Real Mode Architecture (cont’d)
Conversion from logical to physical addresses
• Append a 0 hex digit to the 16-bit base (i.e., shift it left by 4 bits)
• Add the 16-bit offset
• The sum is the 20-bit physical address: physical address = base × 16 + offset

Real Mode Architecture (cont’d)
Two logical addresses map to the same physical address
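The real-mode translation, and the fact that different segment:offset pairs can name the same byte, can be checked with a short Python sketch. The function name and the sample addresses are illustrative assumptions, not values taken from the slides.

```python
def physical_address(segment, offset):
    """Real-mode translation: shift the 16-bit base left 4 bits, add the offset."""
    return ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)


# Two different logical addresses (illustrative values) ...
first = physical_address(0x1000, 0x20A9)
second = physical_address(0x1200, 0x00A9)
# ... refer to the same physical location.
same_location = first == second
```

Both pairs resolve to 120A9H: moving the base up by 200H is cancelled by lowering the offset by 2000H.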

Real Mode Architecture (cont’d)
Programs can access up to six segments at any time
Two of these are for
• Data
• Code
Another segment is typically used for
• Stack
Other segments can be used for
• data, code, …


Protected Mode Architecture
Supports sophisticated segmentation
Segment unit translates 32-bit logical address to 32-bit linear address
Paging unit translates 32-bit linear address to 32-bit physical address
• If no paging is used
  » Linear address = physical address

Protected Mode Architecture (cont’d)
Figure: address translation

Protected Mode Architecture (cont’d)
Segment selector fields
Index
• Selects a descriptor from one of two descriptor tables
  » Local
  » Global
Table Indicator (TI)
• Selects the descriptor table to be used
  » 0 = Global descriptor table
  » 1 = Local descriptor table
Requestor Privilege Level (RPL)
• Privilege level to provide protected access to data
  » The smaller the RPL, the higher the privilege level
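A 16-bit selector packs the three fields as a 13-bit index, one TI bit and a 2-bit RPL. The Python sketch below decodes them using the IA-32 layout (RPL in bits 0–1, TI in bit 2 with 0 meaning the global table, index in bits 3–15); the function name and the sample selector values are illustrative.

```python
def decode_selector(selector):
    """Split a 16-bit segment selector into index, table and RPL."""
    return {
        "index": (selector >> 3) & 0x1FFF,            # bits 3..15
        "table": "LDT" if selector & 0x4 else "GDT",  # TI bit: 0 = GDT, 1 = LDT
        "rpl": selector & 0x3,                        # bits 0..1; 0 is most privileged
    }
```

For instance, selector 0023H decodes to descriptor 4 in the GDT with RPL 3.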

Protected Mode Architecture (cont’d)
Segment registers have two parts
• Visible part
  » Instructions to load segment selector
    – mov, pop, lds, les, lss, lgs, lfs
• Invisible part
  » Automatically loaded from a descriptor table when the visible part is loaded

Protected Mode Architecture (cont’d)
Figure: segment descriptor

Protected Mode Architecture (cont’d)
Base address
• 32-bit segment starting address
Granularity (G)
• Indicates whether the segment size is in
  » 0 = bytes, or
  » 1 = 4 KB
Segment Limit
• 20-bit value specifies the segment size
  » G = 0: 1 byte to 1 MB
  » G = 1: 4 KB to 4 GB, in increments of 4 KB
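How the 20-bit limit and the G bit combine into a segment size, and how a base and offset form a linear address, can be sketched as follows. This is a simplified model under stated assumptions: the function names are hypothetical, only ordinary (expand-up) segments are considered, and a `ValueError` stands in for the protection fault the processor would raise.

```python
def segment_size(limit, g):
    """Segment size in bytes from the 20-bit limit field and the G bit."""
    units = (limit & 0xFFFFF) + 1        # a limit of 0 still means one unit
    return units * 4096 if g else units  # G = 1 counts in 4 KB units


def linear_address(base, offset, limit, g):
    """Base + offset, after checking the offset against the segment size."""
    if offset >= segment_size(limit, g):
        raise ValueError("offset is outside the segment")
    return (base + offset) & 0xFFFFFFFF
```

A flat-model segment uses base 0, limit FFFFFH and G = 1, which yields the full 4 GB range.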

Protected Mode Architecture (cont’d)
D/B bit
• Code segment
  » D bit: default size of operands and offset value
    – D = 0: 16-bit values
    – D = 1: 32-bit values
• Data segment
  » B bit: controls the size of the stack and stack pointer
    – B = 0: SP is used with an upper bound of FFFFH
    – B = 1: ESP is used with an upper bound of FFFFFFFFH
• Cleared for real mode
• Set for protected mode

Protected Mode Architecture (cont’d)
S bit
• Identifies whether
  » System segment, or
  » Application segment
Descriptor privilege level (DPL)
• Defines segment privilege level
Type
• Identifies type of segment
  » Data segment: read-only, read-write, …
  » Code segment: execute-only, execute/read-only, …
P bit
• Indicates whether the segment is present

Protected Mode Architecture (cont’d)
Three types of segment descriptor tables
• Global descriptor table (GDT)
  » Only one in the system
  » Contains OS code and data
  » Available to all tasks
• Local descriptor table (LDT)
  » Several LDTs
  » Contains descriptors of a program
• Interrupt descriptor table (IDT)
  » Used in interrupt processing
  » Details in Chapter 20

Protected Mode Architecture (cont’d)
Segmentation Models
• Pentium can turn off segmentation
• Flat model
  » Consists of one segment of 4 GB
  » E.g., used by UNIX
• Multisegment model
  » Up to six active segments
  » Can have more than six segments
    – Descriptors must be in the descriptor table
  » A segment becomes active by loading its descriptor into one of the segment registers


Mixed-Mode Operation
Pentium allows mixed-mode operation
• Possible to combine 16-bit and 32-bit operands and addresses
• D/B bit indicates the default size
  » 0 = 16-bit mode
  » 1 = 32-bit mode
• Pentium provides two override prefixes
  » One for operands
  » One for addresses
• Details and examples in Chapter 11

Default Segments
Pentium uses default segments depending on the purpose of the memory reference
• Instruction fetch
  » CS register
• Stack operations
  » SS register
  » 16-bit mode: SP
  » 32-bit mode: ESP
• Accessing data
  » DS register
  » Offset depends on the addressing mode

Overview of Assembly Language

Assembly Language Statements
Three different classes
• Instructions
  » Tell CPU what to do
  » Executable instructions with an op-code
• Directives (or pseudo-ops)
  » Provide information to assembler on various aspects of the assembly process
  » Non-executable
    – Do not generate machine language instructions
• Macros
  » A shorthand notation for a group of statements
  » A sophisticated text substitution mechanism with parameters

Assembly Language Statements (cont’d)
Assembly language statement format:
    [label] mnemonic [operands] [;comment]
• Typically one statement per line
• Fields in [ ] are optional
• label serves two distinct purposes:
  » To label an instruction
    – Can transfer program execution to the labeled instruction
  » To label an identifier or constant
• mnemonic identifies the operation (e.g., add, or)
• operands specify the data required by the operation
  » Executable instructions can have zero to three operands

Assembly Language Statements (cont’d)
• comments
  » Begin with a semicolon (;) and extend to the end of the line
Examples
    repeat: inc result      ; increment result
    CR      EQU 0DH         ; carriage return character
White space can be used to improve readability
    repeat:
            inc result

Data Allocation
Variable declaration in a high-level language such as C
    char   response;
    int    value;
    float  total;
    double average_value;
specifies
  » Amount of storage required (1 byte, 2 bytes, …)
  » Label to identify the storage allocated (response, value, …)
  » Interpretation of the bits stored (signed, floating point, …)
    – Bit pattern 1000 1101 1011 1001 is interpreted as
      −29,255 as a signed number
      36,281 as an unsigned number
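The same 16 bits give both numbers quoted above, depending only on whether the top bit is read as a sign. A short Python sketch of the two interpretations follows (helper names are hypothetical; the pattern 8DB9H is the 16-bit encoding of 36,281).

```python
PATTERN = 0b1000110110111001   # 8DB9H, the 16-bit encoding of 36,281


def as_unsigned16(bits):
    """Read 16 bits as an unsigned number."""
    return bits & 0xFFFF


def as_signed16(bits):
    """Read 16 bits as a two's-complement signed number."""
    bits &= 0xFFFF
    return bits - 0x10000 if bits & 0x8000 else bits
```

The two readings differ by exactly 65,536 (2^16), as expected for a pattern whose most significant bit is set.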

Data Allocation (cont’d)
In assembly language, we use the define directive
• Define directive can be used
  » To reserve storage space
  » To label the storage space
  » To initialize
  » But no interpretation is attached to the bits stored
    – Interpretation is up to the program code
• Define directive goes into the .DATA part of the assembly language program
Define directive format
    [var-name] D? init-value [,init-value],...

Data Allocation (cont’d)
Five define directives
    DB   Define Byte         ;allocates 1 byte
    DW   Define Word         ;allocates 2 bytes
    DD   Define Doubleword   ;allocates 4 bytes
    DQ   Define Quadword     ;allocates 8 bytes
    DT   Define Ten bytes    ;allocates 10 bytes
Examples
    sorted    DB  'y'
    response  DB  ?     ;no initialization
    value     DW
    float1    DD
    float2    DQ

Data Allocation (cont’d)
Multiple definitions can be abbreviated
Example
    message  DB  'B'
             DB  'y'
             DB  'e'
             DB  0DH
             DB  0AH
can be written as
    message  DB  'B','y','e',0DH,0AH
More compactly as
    message  DB  'Bye',0DH,0AH

Data Allocation (cont’d)
Multiple definitions can be cumbersome to initialize data structures such as arrays
Example
To declare and initialize an integer array of 8 elements
    marks  DW  0,0,0,0,0,0,0,0
What if we want to declare and initialize to zero an array of 200 elements?
• There is a better way of doing this than repeating zero 200 times in the above statement
  » Assembler provides a directive to do this (DUP directive)

Data Allocation (cont’d)
Multiple initializations
• The DUP assembler directive allows multiple initializations to the same value
• Previous marks array can be compactly declared as
    marks  DW  8 DUP (0)
Examples
    table1   DW  10 DUP (?)      ;10 words, uninitialized
    message  DB  3 DUP ('Bye!')  ;12 bytes, initialized
                                 ; as Bye!Bye!Bye!
    Name1    DB  30 DUP ('?')    ;30 bytes, each
                                 ; initialized to ?

Data Allocation (cont’d)
The DUP directive may also be nested
Example
    stars  DB  4 DUP(3 DUP ('*'),2 DUP ('?'),5 DUP ('!'))
Reserves 40 bytes of space and initializes it as
    ***??!!!!!***??!!!!!***??!!!!!***??!!!!!
Example
    matrix  DW  10 DUP (5 DUP (0))
defines a 10 × 5 matrix and initializes its elements to 0
This declaration can also be done by
    matrix  DW  50 DUP (0)
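Nested DUP is plain repetition applied from the inside out, which a few lines of Python can mimic. The helper `dup` below is a hypothetical stand-in for the assembler's expansion, handling single-element initializers only, and is meant just to confirm the counts stated above.

```python
def dup(count, *items):
    """Expand `count DUP (items...)` into a flat list of initial values."""
    flat = []
    for item in items:
        # A nested dup(...) result is already a list; a plain value is one element.
        flat.extend(item if isinstance(item, list) else [item])
    return flat * count


stars = dup(4, dup(3, "*"), dup(2, "?"), dup(5, "!"))
matrix = dup(10, dup(5, 0))
```

Here `stars` has 40 elements and `matrix` has 50, the same storage that `50 DUP (0)` would reserve.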

Data Allocation (cont’d)
Symbol Table
• Assembler builds a symbol table so we can refer to the allocated storage space by the associated label
Example
    .DATA                              name      offset
    value    DW  0                     value     0
    sum      DD  0                     sum       2
    marks    DW  10 DUP (?)            marks     6
    message  DB  'The grade is:',0     message   26
    char1    DB  ?                     char1     40
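The offsets in the symbol table are running totals of the storage reserved so far. A minimal Python sketch of that bookkeeping is given below; the function name and the tuple layout (name, directive, element count) are assumptions made for illustration, not the assembler's internal format.

```python
SIZES = {"DB": 1, "DW": 2, "DD": 4, "DQ": 8, "DT": 10}  # bytes per element


def build_symbol_table(declarations):
    """Assign each label the offset of the first byte it names."""
    table = {}
    offset = 0
    for name, directive, count in declarations:
        table[name] = offset
        offset += SIZES[directive] * count
    return table


declarations = [
    ("value", "DW", 1),
    ("sum", "DD", 1),
    ("marks", "DW", 10),     # 10 DUP (?)
    ("message", "DB", 14),   # 13 characters plus the terminating 0
    ("char1", "DB", 1),
]
symbols = build_symbol_table(declarations)
```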

Data Allocation (cont’d)
Correspondence to C Data Types
    Directive   C data type
    DB          char
    DW          int, unsigned
    DD          float, long
    DQ          double
    DT          internal intermediate float value

Data Allocation (cont’d)
LABEL Directive
• LABEL directive provides another way to name a memory location
• Format:
    name LABEL type
  type can be
    BYTE     1 byte
    WORD     2 bytes
    DWORD    4 bytes
    QWORD    8 bytes
    TWORD    10 bytes

Data Allocation (cont’d)
LABEL Directive
Example
    .DATA
    count     LABEL  WORD
    Lo_count  DB  0
    Hi_count  DB  0
    .CODE
    ...
    mov  Lo_count,AL
    mov  Hi_count,CL
• count refers to the 16-bit value
• Lo_count refers to the low byte
• Hi_count refers to the high byte
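The effect of the LABEL example is that `count` and `Lo_count` name the same address, viewed at different widths. The Python sketch below models the two data bytes with a `bytearray` and relies on the little-endian byte order of the Pentium; the constants and helper names are hypothetical.

```python
LO_COUNT = 0   # offset of Lo_count
HI_COUNT = 1   # offset of Hi_count
COUNT = 0      # count LABEL WORD: same offset as Lo_count, read as 16 bits


def store_byte(memory, offset, value):
    """Model of `mov <byte label>, <8-bit register>`."""
    memory[offset] = value & 0xFF


def load_word(memory, offset):
    """Read 16 bits starting at offset, low byte first (little-endian)."""
    return int.from_bytes(memory[offset:offset + 2], "little")
```

Storing 34H through `Lo_count` and 12H through `Hi_count` makes a word read through `count` return 1234H.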

Where Are the Operands?
Operands required by an operation can be specified in a variety of ways
A few basic ways are:
• operand in a register
  – register addressing mode
• operand in the instruction itself
  – immediate addressing mode
• operand in memory
  – variety of addressing modes
    direct and indirect addressing modes
• operand at an I/O port

Where Are the Operands? (cont’d)
Register addressing mode
• Operand is in an internal register
Examples
    mov EAX,EBX   ; 32-bit copy
    mov BX,CX     ; 16-bit copy
    mov AL,CL     ; 8-bit copy
• The mov instruction
    mov destination,source
  copies data from source to destination

Where Are the Operands? (cont’d)
Register addressing mode (cont’d)
• Most efficient way of specifying an operand
  » No memory access is required
• Instructions using this mode tend to be shorter
  » Fewer bits are needed to specify the register
Compilers use this mode to optimize code
    total := 0
    for (i = 1 to 400)
        total = total + marks[i]
    end for
• Mapping total and i to registers during the for loop optimizes the code

Where Are the Operands? (cont’d)
Immediate addressing mode
• Data is part of the instruction
  » Operand is located in the code segment along with the instruction
  » Efficient as no separate operand fetch is needed
  » Typically used to specify a constant
Example
    mov AL,75
• This instruction uses register addressing mode for the destination and immediate addressing mode for the source

Where Are the Operands? (cont’d)
Direct addressing mode
• Data is in the data segment
  » Need a logical address to access data
    – Two components: segment:offset
  » Various addressing modes to specify the offset component
    – offset part is called effective address
• The offset is specified directly as part of the instruction
• We write assembly language programs using memory labels (e.g., declared using DB, DW, LABEL, ...)
  » Assembler computes the offset value for the label
    – Uses symbol table to compute the offset of a label

Where Are the Operands? (cont’d)
Direct addressing mode (cont’d)
Examples
    mov AL,response
  » Assembler replaces response by its effective address (i.e., its offset value from the symbol table)
    mov table1,56
  » table1 is declared as
    table1 DW 20 DUP (0)
  » Since the assembler replaces table1 by its effective address, this instruction refers to the first element of table1
    – In C, it is equivalent to table1[0] = 56

Where Are the Operands? (cont’d)
Direct addressing mode (cont’d)
Problem with direct addressing
• Useful only to specify simple variables
• Causes serious problems in addressing data types such as arrays
  » As an example, consider adding elements of an array
    – Direct addressing does not facilitate using a loop structure to iterate through the array
    – We have to write an instruction to add each element of the array
Indirect addressing mode remedies this problem

Where Are the Operands? (cont’d)
Indirect addressing mode
The offset is specified indirectly via a register
• Sometimes called register indirect addressing mode
• For 16-bit addressing, the offset value can be in one of the three registers: BX, SI, or DI
• For 32-bit addressing, all 32-bit registers can be used
Example
    mov AX,[BX]
• Square brackets [ ] are used to indicate that BX is holding an offset value
  » BX contains a pointer to the operand, not the operand itself

Where Are the Operands? (cont’d)
Using indirect addressing mode, we can process arrays using loops
Example: Summing array elements
• Load the starting address (i.e., offset) of the array into BX
• Loop for each element in the array
  » Get the value using the offset in BX
    – Use indirect addressing
  » Add the value to the running total
  » Update the offset in BX to point to the next element of the array
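The summing loop can be traced with a small Python model in which a `bytes` object plays the data segment and an integer plays BX. This is a sketch of the addressing pattern only: the function name, the memory contents and the assumption of 16-bit (DW) elements are illustrative.

```python
def sum_words(memory, start_offset, count):
    """Sum `count` 16-bit elements, walking the array through a pointer."""
    bx = start_offset                  # BX := offset of the array
    total = 0
    for _ in range(count):
        # Fetch the word that BX points to (the [BX] operand), little-endian
        value = int.from_bytes(memory[bx:bx + 2], "little")
        total += value                 # add the value to the running total
        bx += 2                        # advance BX to the next 2-byte element
    return total


# Illustrative data segment: 4 unrelated bytes, then the words 10, 20, 30
data_segment = bytes(4) + b"".join(n.to_bytes(2, "little") for n in (10, 20, 30))
```

Because the loop body never names an element directly, the same code handles an array of any length.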

Where Are the Operands? (cont’d)
Loading offset value into a register
Suppose we want to load BX with the offset value of table1
We cannot write
    mov BX,table1
because this copies the contents of table1, not its offset
Two ways of loading offset value
  » Using OFFSET assembler directive
    – Executed only at the assembly time
  » Using lea instruction
    – This is a processor instruction
    – Executed at run time

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 66 Where Are the Operands? (cont’d) Loading offset value into a register (cont’d) Using OFFSET assembler directive  The previous example can be written as mov BX,OFFSET table1 Using lea (load effective address) instruction  The format of lea instruction is lea register,source  The previous example can be written as lea BX,table1

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 67 Where Are the Operands? (cont’d) Loading offset value into a register (cont’d) Which one to use -- OFFSET or lea ?  Use OFFSET if possible »OFFSET incurs only one-time overhead (at assembly time) »lea incurs run time overhead (every time you run the program)  May have to use lea in some instances »When the needed data is available at run time only –An index passed as a parameter to a procedure »We can write lea BX,table1[SI] to load BX with the address of an element of table1 whose index is in SI register »We cannot use the OFFSET directive in this case

Default Segments In register indirect addressing mode • 16-bit addresses »The effective address in BX, SI, or DI is taken as the offset into the data segment (relative to DS) »For the BP and SP registers, the offset is taken to refer to the stack segment (relative to SS) • 32-bit addresses »The effective address in EAX, EBX, ECX, EDX, ESI, or EDI is relative to DS »The effective address in EBP or ESP is relative to SS • push and pop are always relative to SS

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 69 Default Segments (cont’d) Default segment override  Possible to override the defaults by using override prefixes »CS, DS, SS, ES, FS, GS  Example 1 »We can use add AX,SS:[BX] to refer to a data item on the stack  Example 2 »We can use add AX,DS:[BP] to refer to a data item in the data segment

Data Transfer Instructions We will look at three instructions • mov (move) »Actually a copy • xchg (exchange) »Exchanges two operands • xlat (translate) »Translates byte values using a translation table Other data transfer instructions include movsx (move sign extended) and movzx (move zero extended)

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 71 Data Transfer Instructions (cont’d) The mov instruction  The format is mov destination,source »Copies the value from source to destination »source is not altered as a result of copying »Both operands should be of same size »source and destination cannot both be in memory –Most Pentium instructions do not allow both operands to be located in memory –Pentium provides special instructions to facilitate memory-to-memory block copying of data

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 72 Data Transfer Instructions (cont’d) The mov instruction  Five types of operand combinations are allowed: Instruction type Example mov register,register mov DX,CX mov register,immediate mov BL,100 mov register,memory mov BX,count mov memory,register mov count,SI mov memory,immediate mov count,23  The operand combinations are valid for all instructions that require two operands

Data Transfer Instructions (cont’d) Ambiguous moves: PTR directive For the following data definitions
    .DATA
    table1 DW 20 DUP (0)
    status DB 7 DUP (1)
the last two mov instructions are ambiguous
    mov BX,OFFSET table1
    mov SI,OFFSET status
    mov [BX],100
    mov [SI],100
• It is not clear whether the assembler should use the byte or word equivalent of 100

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 74 Data Transfer Instructions (cont’d) Ambiguous moves: PTR directive The PTR assembler directive can be used to clarify The last two mov instructions can be written as mov WORD PTR [BX],100 mov BYTE PTR [SI],100  WORD and BYTE are called type specifiers We can also use the following type specifiers: DWORD for doubleword values QWORD for quadword values TWORD for ten byte values

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 75 Data Transfer Instructions (cont’d) The xchg instruction The syntax is xchg operand1,operand2 Exchanges the values of operand1 and operand2 Examples xchg EAX,EDX xchg response,CL xchg total,DX Without the xchg instruction, we need a temporary register to exchange values using only the mov instruction

Data Transfer Instructions (cont’d) The xchg instruction The xchg instruction is useful for conversion of 16-bit data between little endian and big endian forms • Example: xchg AL,AH converts the data in AX into the other endian form Pentium provides the bswap instruction to do a similar conversion on 32-bit data bswap 32-bit register • bswap works only on data located in a 32-bit register
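The two endian conversions can be checked with a small Python sketch. It only models the byte movement performed by xchg AL,AH and bswap; the function names are hypothetical.

```python
def xchg_al_ah(ax):
    """Swap the low and high bytes of a 16-bit value, as xchg AL,AH does."""
    al = ax & 0xFF
    ah = (ax >> 8) & 0xFF
    return (al << 8) | ah


def bswap(eax):
    """Reverse the four bytes of a 32-bit value, as bswap does."""
    return int.from_bytes(eax.to_bytes(4, "little"), "big")


print(hex(xchg_al_ah(0x1234)))
print(hex(bswap(0x12345678)))
```

Applying either conversion twice returns the original value, so the same instruction converts in both directions.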

Data Transfer Instructions (cont’d) The xlat instruction The xlat instruction translates bytes The format is xlatb To use the xlat instruction »BX should be loaded with the starting address of the translation table »AL must contain an index into the table –Index value starts at zero »The instruction reads the byte at this index in the translation table and stores this value in AL –The index value in AL is lost »Translation table can have at most 256 entries (due to AL)

Data Transfer Instructions (cont’d) The xlat instruction Example: Encrypting digits Input digits: … Encrypted digits: …
    .DATA
    xlat_table DB '…'
    ...
    .CODE
    mov BX,OFFSET xlat_table
    GetCh AL
    sub AL,'0' ; converts input character to index
    xlatb ; AL = encrypted digit character
    PutCh AL
    ...
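The table lookup performed by xlatb can be sketched in Python. The table contents below are an assumption chosen for illustration (any permutation of the ten digits would do), and encrypt_digit is a hypothetical helper mirroring the GetCh / sub / xlatb / PutCh sequence.

```python
# Hypothetical translation table: entry i is the encrypted form of digit i.
XLAT_TABLE = b"4695031872"


def xlat(table, al):
    """Model of xlatb: AL is replaced by the byte at index AL in the table."""
    return table[al]


def encrypt_digit(ch):
    al = ord(ch)                 # GetCh AL
    al -= ord("0")               # sub AL,'0' turns the character into an index
    al = xlat(XLAT_TABLE, al)    # xlatb
    return chr(al)               # PutCh AL


encrypted = "".join(encrypt_digit(c) for c in "0123456789")
print(encrypted)
```

Because AL is a single byte, the index can never exceed 255, which is where the 256-entry limit on the table comes from.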

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 79 Pentium Assembly Instructions Pentium provides several types of instructions Brief overview of some basic instructions:  Arithmetic instructions  Jump instructions  Loop instruction  Logical instructions  Shift instructions  Rotate instructions These instructions allow you to write reasonable assembly language programs

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 80 Arithmetic Instructions INC and DEC instructions  Format: inc destination dec destination  Semantics: destination = destination +/- 1 »destination can be 8-, 16-, or 32-bit operand, in memory or register  No immediate operand Examples inc BX dec value

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 81 Arithmetic Instructions (cont’d) Add instructions  Format: add destination,source  Semantics: destination = destination + source Examples add EBX,EAX add value,35  inc EAX is better than add EAX,1 –inc takes less space –Both execute at about the same speed

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 82 Arithmetic Instructions (cont’d) Add instructions  Addition with carry  Format: adc destination,source  Semantics: destination = destination + source + CF Example: 64-bit addition add EAX,ECX ; add lower 32 bits adc EBX,EDX ; add upper 32 bits with carry  64-bit result in EBX:EAX
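The add/adc pair can be traced with a short Python sketch that keeps the carry flag explicit. The helper names add32 and add64 are hypothetical; the register roles follow the example above (first operand in EBX:EAX, second in EDX:ECX).

```python
MASK32 = 0xFFFFFFFF


def add32(dest, src, carry_in=0):
    """Return (32-bit result, carry flag) for dest + src + carry_in."""
    total = dest + src + carry_in
    return total & MASK32, total >> 32


def add64(ebx, eax, edx, ecx):
    """Add the 64-bit value EDX:ECX to EBX:EAX; the result is in EBX:EAX."""
    eax, cf = add32(eax, ecx)          # add EAX,ECX  (lower 32 bits, sets CF)
    ebx, cf = add32(ebx, edx, cf)      # adc EBX,EDX  (upper 32 bits plus CF)
    return ebx, eax


# 0x00000001_FFFFFFFF + 0x00000002_00000001 = 0x00000004_00000000
print([hex(r) for r in add64(0x1, 0xFFFFFFFF, 0x2, 0x1)])
```

Replacing adc with a plain add would drop the carry out of the lower half and give 3 instead of 4 in the upper half.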

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 83 Arithmetic Instructions (cont’d) Subtract instructions  Format: sub destination,source  Semantics: destination = destination - source Examples sub EBX,EAX sub value,35  dec EAX is better than sub EAX,1 –dec takes less space –Both execute at about the same speed

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 84 Arithmetic Instructions (cont’d) Subtract instructions  Subtract with borrow  Format: sbb destination,source  Semantics: destination = destination - source - CF  Like the adc, sbb is useful in dealing with more than 32-bit numbers Negation neg destination  Semantics: destination = 0 - destination

Arithmetic Instructions (cont’d) CMP instruction • Format: cmp destination,source • Semantics: destination - source • destination and source are not altered • Useful to test the relationship (<, >, =) between two operands • Used in conjunction with conditional jump instructions for decision making purposes Examples cmp EBX,EAX cmp count,100

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 86 Unconditional Jump  Format: jmp label  Semantics: »Execution is transferred to the instruction identified by label Target can be specified in one of two ways  Directly »In the instruction itself  Indirectly »Through a register or memory

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 87 Unconditional Jump (cont’d) Example Two jump instructions  Forward jump jmp CX_init_done  Backward jump jmp repeat1 Programmer specifies target by a label Assembler computes the offset using the symbol table... mov CX,10 jmp CX_init_done init_CX_20: mov CX,20 CX_init_done: mov AX,CX repeat1: dec CX... jmp repeat1...

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 88 Unconditional Jump (cont’d) Address specified in the jump instruction is not the absolute address  Uses relative address »Specifies relative byte displacement between the target instruction and the instruction following the jump instruction »Displacement is w.r.t the instruction following jmp –Reason: IP points to this instruction after reading jump  Execution of jmp involves adding the displacement value to current IP  Displacement is a signed 16-bit number »Negative value for backward jumps »Positive value for forward jumps

Target Location Inter-segment jump • Target is in another segment CS = target-segment (2 bytes) IP = target-offset (2 bytes) »Called far jumps (needs five bytes to encode jmp) Intra-segment jumps • Target is in the same segment IP = IP + relative-displacement (1 or 2 bytes) • Uses 1-byte displacement if target is within −128 to +127 »Called short jumps (needs two bytes to encode jmp) • If target is outside this range, uses 2-byte displacement »Called near jumps (needs three bytes to encode jmp)

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 90 Target Location (cont’d) In most cases, the assembler can figure out the type of jump  For backward jumps, assembler can decide whether to use the short jump form or not For forward jumps, it needs a hint from the programmer  Use SHORT prefix to the target label  If such a hint is not given »Assembler reserves three bytes for jmp instruction »If short jump can be used, leaves one byte of nop (no operation) –See the next example for details

Example
    0005 EB 0C       jmp SHORT CX_init_done    ; 0013 - 0007 = 0C
    0007 B9 000A     mov CX,10
    000A EB 07 90    jmp CX_init_done          ; 0013 - 000C = 07, third byte is a nop
                 init_CX_20:
    000D B9 0014     mov CX,20
    0010 E9 00D0     jmp near_jump             ; 00E3 - 0013 = D0
                 CX_init_done:
    0013 8B C1       mov AX,CX

Example (cont’d)
                 repeat1:
    0015 49          dec CX
    0016 EB FD       jmp repeat1               ; 0015 - 0018 = -3 = FDH
    ...
    00DB EB 03       jmp SHORT short_jump      ; 00E0 - 00DD = 3
    00DD B9 FF00     mov CX,0FF00H
                 short_jump:
    00E0 BA 0020     mov DX,20H
                 near_jump:
    00E3 E9 FF27     jmp init_CX_20            ; 000D - 00E6 = -217 = FF27H
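The displacement arithmetic in the listing can be reproduced with a few lines of Python. This is a sketch under the rule stated earlier (displacement = target minus the address of the instruction following jmp); jump_displacement is a hypothetical helper, and the addresses are the ones that appear in the listing.

```python
def jump_displacement(jmp_address, jmp_length, target, width_bits):
    """Return (signed displacement, encoded two's-complement field)."""
    next_ip = jmp_address + jmp_length          # IP already points past the jmp
    disp = target - next_ip
    low = -(1 << (width_bits - 1))
    high = (1 << (width_bits - 1)) - 1
    if not low <= disp <= high:
        raise ValueError(f"displacement {disp} does not fit in {width_bits} bits")
    return disp, disp & ((1 << width_bits) - 1)


# Near jump at 00E3 (3 bytes long) back to init_CX_20 at 000D.
print(jump_displacement(0x00E3, 3, 0x000D, 16))
# Short jump at 00DB (2 bytes long) forward to short_jump at 00E0.
print(jump_displacement(0x00DB, 2, 0x00E0, 8))
```

The range check is the same test the assembler applies when it decides whether a short jump can be used.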

Conditional Jumps (cont’d) Format: j<cond> lab –Execution is transferred to the instruction identified by lab only if <cond> is met Example: Testing for carriage return read_char: ... cmp AL,0DH ; 0DH = ASCII carriage return je CR_received inc CL jmp read_char ... CR_received:

Conditional Jumps (cont’d) • Some conditional jump instructions –Treat the operands of the CMP instruction as signed numbers
    je   jump if equal
    jg   jump if greater
    jl   jump if less
    jge  jump if greater or equal
    jle  jump if less or equal
    jne  jump if not equal

Conditional Jumps (cont’d) • Conditional jump instructions can also test values of the individual flags
    jz   jump if zero (i.e., if ZF = 1)
    jnz  jump if not zero (i.e., if ZF = 0)
    jc   jump if carry (i.e., if CF = 1)
    jnc  jump if not carry (i.e., if CF = 0)
• jz is a synonym for je • jnz is a synonym for jne
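How a cmp followed by a signed conditional jump reaches its decision can be sketched in Python. The flag conditions used here (jg: ZF = 0 and SF = OF; jl: SF ≠ OF) are the standard x86 definitions rather than something stated on these slides, and cmp16 and the predicate functions are hypothetical names.

```python
def cmp16(dest, src):
    """Model of cmp on 16-bit operands: compute dest - src and keep only flags."""
    result = (dest - src) & 0xFFFF
    zf = int(result == 0)
    sf = (result >> 15) & 1
    cf = int(dest < src)                                  # unsigned borrow
    of = (((dest ^ src) & (dest ^ result)) >> 15) & 1     # signed overflow
    return {"ZF": zf, "SF": sf, "CF": cf, "OF": of}


def je(flags):
    return flags["ZF"] == 1


def jg(flags):
    return flags["ZF"] == 0 and flags["SF"] == flags["OF"]


def jl(flags):
    return flags["SF"] != flags["OF"]


# 0xFFFF is -1 when read as a signed word, so it is "less" than 1.
print(jl(cmp16(0xFFFF, 0x0001)))
# 0x7FFF (32767) compared with 0xFFFF (-1): greater, despite the overflow.
print(jg(cmp16(0x7FFF, 0xFFFF)))
```

The second case shows why jg cannot look at the sign flag alone: the subtraction overflows, and the overflow flag corrects the reading.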

A Note on Conditional Jumps All conditional jumps are encoded using 2 bytes • They are treated as short jumps What if the target is outside this range? In the code target: ... cmp AX,BX je target mov CX,10 ... target is out of range for a short jump Use this code to get around the problem: target: ... cmp AX,BX jne skip1 jmp target skip1: mov CX,10 ...

Loop Instructions Unconditional loop instruction • Format: loop target • Semantics: »Decrements CX and jumps to target if CX ≠ 0 –CX should be loaded with a loop count value Example: Executes loop body 50 times mov CX,50 repeat: <loop body> loop repeat ...

Loop Instructions (cont’d) The previous example is equivalent to mov CX,50 repeat: <loop body> dec CX jnz repeat ... • Surprisingly, dec CX jnz repeat executes faster than loop repeat

Loop Instructions (cont’d) Conditional loop instructions • loope/loopz »Loop while equal/zero CX = CX – 1 if (CX ≠ 0 and ZF = 1) jump to target • loopne/loopnz »Loop while not equal/not zero CX = CX – 1 if (CX ≠ 0 and ZF = 0) jump to target
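The semantics of the loop family can be written out as a small Python model. It is a sketch only: loop_step and count_iterations are hypothetical names, and CX is treated as a 16-bit register that wraps around.

```python
def loop_step(cx, kind="loop", zf=0):
    """Return (new CX, jump taken?) for one loop, loope, or loopne instruction."""
    cx = (cx - 1) & 0xFFFF              # CX = CX - 1
    if kind == "loop":
        taken = cx != 0
    elif kind == "loope":               # same as loopz
        taken = cx != 0 and zf == 1
    elif kind == "loopne":              # same as loopnz
        taken = cx != 0 and zf == 0
    else:
        raise ValueError(f"unknown instruction: {kind}")
    return cx, taken


def count_iterations(initial_cx):
    """Count how often the body of 'repeat: <body> loop repeat' executes."""
    cx, iterations = initial_cx, 0
    while True:
        iterations += 1                 # the loop body runs once
        cx, taken = loop_step(cx)
        if not taken:
            return iterations


print(count_iterations(50))
```

Since the decrement happens before the test, entering the loop with CX = 0 wraps CX to FFFFH and runs the body 65536 times, which is why CX should always be loaded with a nonzero count first.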

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 100 Logical Instructions  Format: and destination,source or destination,source xor destination,source not destination  Semantics: »Performs the standard bitwise logical operations –result goes to destination  test is a non-destructive and instruction test destination,source  Performs logical AND but the result is not stored in destination (like the CMP instruction)

Logical Instructions (cont’d) Example: ... and AL,01H ; test the least significant bit jz bit_is_zero <code for when the bit is one> jmp skip1 bit_is_zero: <code for when the bit is zero> skip1: ... • Using test instead of and is better here, because test leaves AL unchanged

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 102 Shift Instructions Two types of shifts »Logical »Arithmetic  Logical shift instructions Shift left shl destination,count shl destination,CL Shift right shr destination,count shr destination,CL  Semantics: »Performs left/right shift of destination by the value in count or CL register – CL register contents are not altered

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 103 Shift Instructions (cont’d) Logical shift  Bit shifted out goes into the carry flag »Zero bit is shifted in at the other end

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 104 Shift Instructions (cont’d)  count is an immediate value shl AX,5  Specification of count greater than 31 is not allowed »If a greater value is specified, only the least significant 5 bits are used  CL version is useful if shift count is known at run time » Ex: when the shift count value is passed as a parameter in a procedure call »Only the CL register can be used  Shift count value should be loaded into CL mov CL,5 shl AX,CL
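A logical left shift with the count rule described above can be modeled in Python. This is a sketch for a 16-bit destination such as AX; shl16 is a hypothetical helper, and the carry flag is returned as None when the masked count is zero, because no bit is shifted out in that case.

```python
def shl16(value, count):
    """Model of shl on a 16-bit operand; returns (result, carry flag)."""
    count &= 0x1F                       # only the low 5 bits of the count are used
    cf = None
    for _ in range(count):
        cf = (value >> 15) & 1          # bit leaving the operand goes to CF
        value = (value << 1) & 0xFFFF   # a zero bit enters on the right
    return value, cf


print(shl16(0x0801, 5))     # shl AX,5
print(shl16(0x0001, 33))    # a count of 33 behaves like a count of 1
```

The one-bit-at-a-time loop mirrors the way the carry flag ends up holding the last bit shifted out.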

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 105 Shift Instructions (cont’d) Arithmetic shift  Two versions as in logical shift sal/sar destination,count sal/sar destination,CL

Double Shift Instructions Double shift instructions work on 32- or 64-bit values formed by pairing two 16- or 32-bit operands Format • Takes three operands shld dest,src,count ; left shift shrd dest,src,count ; right shift • dest can be in memory or register • src must be a register • count can be an immediate value or in CL as in other shift instructions

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 107 Double Shift Instructions (cont’d)  src is not modified by doubleshift instruction  Only dest is modified  Shifted out bit goes into the carry flag

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 108 Rotate Instructions  Two types of ROTATE instructions  Rotate without carry »rol (ROtate Left) »ror (ROtate Right)  Rotate with carry »rcl (Rotate through Carry Left) »rcr (Rotate through Carry Right)  Format of ROTATE instructions is similar to the SHIFT instructions »Supports two versions –Immediate count value –Count value in CL register

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 109 Rotate Instructions (cont’d)  Bit shifted out goes into the carry flag as in SHIFT instructions

Rotate Instructions (cont’d) Example: Shifting 64-bit numbers • Multiplies a 64-bit value in EDX:EAX by 16 »Rotate version mov CX,4 shift_left: shl EAX,1 rcl EDX,1 loop shift_left »Doubleshift version shld EDX,EAX,4 shl EAX,4 Division can be done in a similar way
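That the rotate version and the doubleshift version produce the same 64-bit result can be checked with a Python sketch. The helper names are hypothetical, and only the data movement is modeled (shld here assumes a count from 1 to 31).

```python
MASK32 = 0xFFFFFFFF


def shl32(value):
    """shl reg,1: returns (result, carry flag)."""
    return (value << 1) & MASK32, (value >> 31) & 1


def rcl32(value, cf):
    """rcl reg,1: the old carry enters at bit 0, bit 31 becomes the new carry."""
    return ((value << 1) | cf) & MASK32, (value >> 31) & 1


def mul16_rotate(edx, eax):
    for _ in range(4):                  # mov CX,4 ... loop shift_left
        eax, cf = shl32(eax)            # shl EAX,1
        edx, cf = rcl32(edx, cf)        # rcl EDX,1
    return edx, eax


def shld(dest, src, count):
    """shld dest,src,count: dest shifts left, refilled from the top bits of src."""
    return ((dest << count) | (src >> (32 - count))) & MASK32


def mul16_doubleshift(edx, eax):
    edx = shld(edx, eax, 4)             # shld EDX,EAX,4 (EAX itself is unchanged)
    eax = (eax << 4) & MASK32           # shl EAX,4
    return edx, eax


print([hex(r) for r in mul16_rotate(0x01234567, 0x89ABCDEF)])
print([hex(r) for r in mul16_doubleshift(0x01234567, 0x89ABCDEF)])
```

The order matters in the doubleshift version: shld must run before shl EAX,4, because it needs the original upper bits of EAX.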

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 112 Defining Constants Assembler provides two directives: »EQU directive –No reassignment –String constants can be defined »= directive –Can be reassigned –No string constants Defining constants has two advantages:  Improves program readability  Helps in software maintenance »Multiple occurrences can be changed from a single place Convention »We use all upper-case letters for names of constants

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 113 Defining Constants (cont’d) The EQU directive Syntax: name EQU expression  Assigns the result of expression to name  The expression is evaluated at assembly time  Similar to #define in C Examples NUM_OF_ROWS EQU 50 NUM_OF_COLS EQU 10 ARRAY_SIZE EQU NUM_OF_ROWS * NUM_OF_COLS  Can also be used to define string constants JUMP EQU jmp

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 114 Defining Constants (cont’d) The = directive Syntax: name = expression  Similar to EQU directive  Two key differences: »Redefinition is allowed count = 0... count = 99 is valid »Cannot be used to define string constants or to redefine keywords or instruction mnemonics Example: JUMP = jmp is not allowed

Macros Macros can be defined with MACRO and ENDM Format
    macro_name MACRO [parameter1, parameter2, ...]
        macro body
    ENDM
A macro can be invoked using
    macro_name [argument1, argument2, ...]
Example: Definition
    multAX_by_16 MACRO
        sal AX,4
    ENDM
Invocation
    ...
    mov AX,27
    multAX_by_16
    ...

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 116 Macros (cont’d) Macros can be defined with parameters »More flexible »More useful Example mult_by_16 MACRO operand sal operand,4 ENDM  To multiply a byte in DL register mult_by_16 DL  To multiply a memory variable count mult_by_16 count

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 117 Macros (cont’d) Example: To exchange two memory words  Memory-to-memory transfer Wmxchg MACRO operand1, operand2 xchg AX,operand1 xchg AX,operand2 xchg AX,operand1 ENDM

Illustrative Examples Five examples in this chapter • Conversion of ASCII to binary representation (BINCHAR.ASM) • Conversion of ASCII to hexadecimal by character manipulation (HEX1CHAR.ASM) • Conversion of ASCII to hexadecimal using the XLAT instruction (HEX2CHAR.ASM) • Conversion of lowercase letters to uppercase by character manipulation (TOUPPER.ASM) • Sum of individual digits of a number (ADDIGITS.ASM)

Procedures and the Stack

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 120 What is a Stack? Stack is a last-in-first-out (LIFO) data structure  If we view the stack as a linear array of elements, both insertion and deletion operations are restricted to one end of the array  Only the element at the top-of-stack (TOS) is directly accessible Two basic stack operations  push »Insertion  pop »Deletion

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 121 What is a Stack? (cont’d) Example  Insertion of data items into the stack »Arrow points to the top-of-stack

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 122 What is a Stack? (cont’d) Example  Deletion of data items from the stack »Arrow points to the top-of-stack

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 123 Pentium Implementation of the Stack Stack segment is used to implement the stack  Registers SS and (E)SP are used  SS:(E)SP represents the top-of-stack Pentium stack implementation characteristics are  Only words (i.e., 16-bit data) or doublewords (i.e., 32- bit data) are saved on the stack, never a single byte  Stack grows toward lower memory addresses »Stack grows “downward”  Top-of-stack (TOS) always points to the last data item placed on the stack

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 124 Pentium Stack Example - 1

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 125 Pentium Stack Example - 2

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 126 Pentium Stack Instructions Pentium provides two basic instructions: push source pop destination  source and destination can be a »16- or 32-bit general register »a segment register »a word or doubleword in memory  source of push can also be an immediate operand of size 8, 16, or 32 bits

Pentium Stack Instructions: Examples On an empty stack created by .STACK 100H the following sequence of push instructions push 21ABH push 7FBD329AH results in the stack state shown in (a) in the last figure On this stack, executing pop EBX results in the stack state shown in (b) in the last figure and the register EBX gets the value 7FBD329AH
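The push and pop sequence above can be followed with a small Python model of a downward-growing stack. It is a sketch under the assumption that the stack pointer starts at 100H for the empty 100H-byte stack; the Stack class is hypothetical.

```python
class Stack:
    """Downward-growing stack: SP always points at the last item pushed."""

    def __init__(self, size):
        self.mem = bytearray(size)
        self.sp = size                          # empty stack: SP just past the end

    def push(self, value, nbytes):
        self.sp -= nbytes                       # the stack grows toward lower addresses
        self.mem[self.sp:self.sp + nbytes] = value.to_bytes(nbytes, "little")

    def pop(self, nbytes):
        value = int.from_bytes(self.mem[self.sp:self.sp + nbytes], "little")
        self.sp += nbytes
        return value


stack = Stack(0x100)                            # .STACK 100H
stack.push(0x21AB, 2)                           # push 21ABH       (a word)
stack.push(0x7FBD329A, 4)                       # push 7FBD329AH   (a doubleword)
sp_after_pushes = stack.sp
ebx = stack.pop(4)                              # pop EBX
print(hex(sp_after_pushes), hex(ebx), hex(stack.sp))
```

After the pop, the word 21ABH is again at the top of the stack, two bytes below the initial stack pointer.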

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 128 Additional Pentium Stack Instructions Stack Operations on Flags push and pop instructions cannot be used on the Flags register Two special instructions for this purpose are pushf (push 16-bit flags) popf (pop 16-bit flags) No operands are required Use pushfd and popfd for 32-bit flags (EFLAGS)

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 129 Additional Pentium Stack Instructions (cont’d) Stack Operations on 8 General-Purpose Registers pusha and popa instructions can be used to save and restore the eight general-purpose registers AX, CX, DX, BX, SP, BP, SI, and DI pusha pushes these eight registers in the above order (AX first and DI last) popa restores these registers except that SP value is not loaded into the SP register Use pushad and popad for saving and restoring 32-bit registers

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 130 Uses of the Stack Three main uses »Temporary storage of data »Transfer of control »Parameter passing Temporary Storage of Data Example: Exchanging value1 and value2 can be done by using the stack to temporarily hold data push value1 push value2 pop value1 pop value2

Uses of the Stack (cont’d) Often used to free a set of registers ;save EBX & ECX registers on the stack push EBX push ECX ... <EBX and ECX can now be used> ... ;restore EBX & ECX from the stack pop ECX pop EBX

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 132 Uses of the Stack (cont’d) Transfer of Control In procedure calls and interrupts, the return address is stored on the stack  Our discussion on procedure calls clarifies this particular use of the stack Parameter Passing Stack is extensively used for parameter passing  Our discussion later on parameter passing describes how the stack is used for this purpose

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 133 Assembler Directives for Procedures Assembler provides two directives to define procedures: PROC and ENDP To define a NEAR procedure, use proc-name PROC NEAR  In a NEAR procedure, both calling and called procedures are in the same code segment A FAR procedure can be defined by proc-name PROC FAR  Called and calling procedures are in two different segments in a FAR procedure

Assembler Directives for Procedures (cont’d) If FAR or NEAR is not specified, NEAR is assumed (i.e., NEAR is the default) We focus on NEAR procedures A typical NEAR procedure definition proc-name PROC <procedure body> proc-name ENDP proc-name should match in PROC and ENDP

Pentium Instructions for Procedures Pentium provides two instructions: call and ret The call instruction is used to invoke a procedure The format is call proc-name where proc-name is the procedure name Actions taken during a near procedure call SP = SP − 2 (SS:SP) = IP IP = IP + relative displacement

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 136 Pentium Instructions for Procedures (cont’d) ret instruction is used to transfer control back to the calling procedure How will the processor know where to return?  Uses the return address pushed onto the stack as part of executing the call instruction  Important that TOS points to this return address when ret instruction is executed Actions taken during the execution of ret are: IP = (SS:SP) SP = SP + 2

Pentium Instructions for Procedures (cont’d) We can specify an optional integer in the ret instruction • The format is ret optional-integer • Example: ret 6 Actions taken on ret with optional-integer are: IP = (SS:SP) SP = SP + 2 + optional-integer

How Is Program Control Transferred?
    Offset(hex)  machine code(hex)
                                   main PROC
                                       ...
    cs:000A      E8000C                call sum
    cs:000D      8BD8                  mov BX,AX
                                       ...
                                   main ENDP
                                   sum PROC
    cs:0019      55                    push BP
                                       ...
                                   sum ENDP
                                   avg PROC
                                       ...
    cs:0028      E8FFEE                call sum
    cs:002B      8BD0                  mov DX,AX
                                       ...
                                   avg ENDP
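Both call instructions in the listing can be traced with a Python sketch of the near call and ret actions. A Python list stands in for the stack, and near_call and near_ret are hypothetical helpers; the offsets and displacements are the ones shown in the listing.

```python
def near_call(call_offset, disp16, stack):
    """Model of a near call: push the return address, then add the displacement."""
    return_address = call_offset + 3            # E8 opcode plus 2-byte displacement
    stack.append(return_address)                # SP = SP - 2; (SS:SP) = IP
    signed = disp16 - 0x10000 if disp16 & 0x8000 else disp16
    return (return_address + signed) & 0xFFFF   # IP = IP + relative displacement


def near_ret(stack):
    """Model of ret: IP = (SS:SP); SP = SP + 2."""
    return stack.pop()


stack = []
ip = near_call(0x000A, 0x000C, stack)           # call sum in main
print(hex(ip), hex(near_ret(stack)))

ip = near_call(0x0028, 0xFFEE, stack)           # call sum in avg
print(hex(ip), hex(near_ret(stack)))
```

The forward call uses a positive displacement and the backward call a negative one, yet both land on the same procedure and each returns to the instruction after its own call.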

Parameter Passing Parameter passing is different from, and more complicated than, parameter passing in a high-level language In assembly language »First place all required parameters in a mutually accessible storage area »Then call the procedure Type of storage area used »Registers (general-purpose registers are used) »Memory (stack is used) Two common methods of parameter passing »Register method »Stack method

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 140 Parameter Passing: Register Method Calling procedure places the necessary parameters in the general-purpose registers before invoking the procedure through the call instruction Examples:  PROCEX1.ASM »call-by-value using the register method »a simple sum procedure  PROCEX2.ASM »call-by-reference using the register method »string length procedure

2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer,  S. Dandamudi Chapter 7: Page 141 Pros and Cons of the Register Method Advantages  Convenient and easier  Faster Disadvantages  Only a few parameters can be passed using the register method –Only a small number of registers are available  Often these registers are not free –Freeing them by pushing their values onto the stack negates the second advantage

Parameter Passing: Stack Method
All parameter values are pushed onto the stack before calling the procedure
Example:
        push number1
        push number2
        call sum

Accessing Parameters on the Stack
Parameter values are buried inside the stack
We cannot use
        mov BX,[SP+2] ;illegal
to access number2 in the previous example
We can use
        mov BX,[ESP+2] ;valid
Problem: The ESP value changes with push and pop operations
    » Relative offset depends on the stack operations performed
    » Not desirable

Accessing Parameters on the Stack (cont’d)
We can also use
        add SP,2
        mov BX,[SP] ;valid
Problem: cumbersome
    » We have to remember to update SP to point to the return address on the stack before the end of the procedure
Is there a better alternative?
  • Use the BP register to access parameters on the stack

Using BP Register to Access Parameters
Preferred method of accessing parameters on the stack is
        mov BP,SP
        mov BX,[BP+2]
to access number2 in the previous example
Problem: BP contents are lost!
  • We have to preserve the contents of BP
  • Use the stack (caution: offset value changes)
        push BP
        mov BP,SP

Clearing the Stack Parameters
[Figure: stack state after push BP, after pop BP, and after executing ret]

Clearing the Stack Parameters (cont’d)
Two ways of clearing the unwanted parameters on the stack:
  • Use the optional-integer in the ret instruction
    » Use ret 4 in the previous example
  • Add the constant to SP in calling procedure (C uses this method)
        push number1
        push number2
        call sum
        add SP,4
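The stack layout behind these slides can be traced with a toy model. The following Python sketch assumes 16-bit words and a stack that grows toward lower addresses; the class `Stack16`, the starting SP, and the pushed values are all illustrative assumptions. It shows that once BP has been saved with push BP, number2 sits at [BP+4] and number1 at [BP+6], and that ret 4 leaves SP where it was before the parameters were pushed.

```python
class Stack16:
    """Toy model of a 16-bit stack that grows toward lower addresses."""

    def __init__(self, sp=0x0100):
        self.sp = sp
        self.mem = {}                      # address -> 16-bit word

    def push(self, word):
        self.sp = (self.sp - 2) & 0xFFFF
        self.mem[self.sp] = word & 0xFFFF

    def pop(self):
        word = self.mem[self.sp]
        self.sp = (self.sp + 2) & 0xFFFF
        return word

    def ret(self, optional_integer=0):
        ip = self.pop()                    # IP = (SS:SP); SP = SP + 2
        self.sp = (self.sp + optional_integer) & 0xFFFF
        return ip

s = Stack16()
start_sp = s.sp
s.push(25)                # push number1
s.push(17)                # push number2
s.push(0x000D)            # call sum pushes the return address (made-up value)
s.push(0x1234)            # push BP (old BP, made-up value)
bp = s.sp                 # mov BP,SP
number2 = s.mem[bp + 4]   # mov BX,[BP+4]
number1 = s.mem[bp + 6]   # mov AX,[BP+6]
old_bp = s.pop()          # pop BP
return_ip = s.ret(4)      # ret 4 also discards the two parameters
```

With a plain ret instead of ret 4, SP would end up 4 bytes below `start_sp`, which is the case where the caller has to execute add SP,4.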

Housekeeping Issues
Who should clean up the stack of unwanted parameters?
  • Calling procedure
    » Need to update SP with every procedure call
    » Not really needed if procedures use fixed number of parameters
    » C uses this method because C allows variable number of parameters
  • Called procedure
    » Code becomes modular (parameter clearing is done in only one place)
    » Cannot be used with variable number of parameters

Housekeeping Issues (cont’d)
Need to preserve the state across a procedure call
    » Stack is used for this purpose
Which registers should be saved?
  • Save those registers that are used by the calling procedure but are modified by the called procedure
    » Might cause problems
  • Save all registers (brute force method)
    » Done by using pusha
    » Increased overhead
      – pusha takes 5 clocks as opposed to 1 clock to save a single register

Housekeeping Issues (cont’d)
[Figure: stack state after pusha]

Housekeeping Issues (cont’d)
Who should preserve the state of the calling procedure?
  • Calling procedure
    » Need to know the registers used by the called procedure
    » Need to include instructions to save and restore registers with every procedure call
    » Causes program maintenance problems
  • Called procedure
    » Preferred method as the code becomes modular (state preservation is done only once and in one place)
    » Avoids the program maintenance problems mentioned

Stack Frame Instructions
ENTER instruction
  • Facilitates stack frame (discussed later) allocation
        enter bytes,level
    bytes = local storage space
    level = nesting level (we use 0)
  • Example
        enter XX,0
    Equivalent to
        push BP
        mov BP,SP
        sub SP,XX

Stack Frame Instructions (cont’d)
LEAVE instruction
  • Releases stack frame
        leave
    » Takes no operands
    » Equivalent to
        mov SP,BP
        pop BP
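The two equivalences can be played through on a tiny register and memory model. This is a Python sketch only: the dictionary-based `state`, the starting register values, and the function names `enter` and `leave` are assumptions made for illustration, and only nesting level 0 is modeled.

```python
def enter(state, nbytes):
    """enter nbytes,0  ==  push BP / mov BP,SP / sub SP,nbytes"""
    state["sp"] -= 2
    state["mem"][state["sp"]] = state["bp"]   # push BP
    state["bp"] = state["sp"]                 # mov BP,SP
    state["sp"] -= nbytes                     # sub SP,nbytes

def leave(state):
    """leave  ==  mov SP,BP / pop BP"""
    state["sp"] = state["bp"]                 # mov SP,BP
    state["bp"] = state["mem"][state["sp"]]   # pop BP
    state["sp"] += 2

state = {"sp": 0x00FA, "bp": 0x1234, "mem": {}}
before = (state["sp"], state["bp"])
enter(state, 4)                    # reserve 4 bytes of local storage
frame_bp = state["bp"]             # frame pointer while the procedure runs
locals_base = state["sp"]          # lowest address of the local storage
leave(state)
after = (state["sp"], state["bp"])
```

After leave, both SP and BP hold the values they had before enter, so the following ret finds the return address on top of the stack.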

A Typical Procedure Template
proc-name PROC
        enter XX,0
        ...
        leave
        ret YY
proc-name ENDP

Stack Parameter Passing: Examples
PROCEX3.ASM
  • call-by-value using the stack method
  • a simple sum procedure
PROCSWAP.ASM
  • call-by-reference using the stack method
  • first two characters of the input string are swapped
BBLSORT.ASM
  • implements bubble sort algorithm
  • uses pusha and popa to save and restore registers

Variable Number of Parameters
For most procedures, the number of parameters is fixed
  • Same number of arguments in each call
In procedures that can have a variable number of parameters
  • Number of arguments can vary from call to call
  • C supports procedures with a variable number of parameters
Easy to support a variable number of parameters using the stack method

Variable Number of Parameters (cont’d)
To implement variable number of parameter passing:
  • Parameter count should be one of the parameters passed
  • This count should be the last parameter pushed onto the stack
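Pushing the count last puts it at a fixed position relative to the top of the stack, so the called procedure can find it without knowing how many parameters follow. A minimal Python sketch of the idea, using a list as the stack (end of list = top of stack); the function names are hypothetical, and the return address and saved BP that would sit above the count on a real stack are left out for brevity.

```python
def variable_sum(stack):
    """Callee: read the count from its fixed position (top of stack),
    then add up that many parameters found just below it."""
    count = stack[-1]
    total = 0
    for k in range(count):
        total += stack[-2 - k]
    return total

def call_sum(*values):
    """Caller: push the values, push the count last, call, then clear."""
    stack = []
    for v in values:
        stack.append(v)              # push each parameter
    stack.append(len(values))        # push the parameter count last
    result = variable_sum(stack)
    del stack[-(len(values) + 1):]   # caller clears count + parameters
    return result, stack
```

The caller does the clean-up here because only the caller knows, at assembly time, how many values it pushed for this particular call.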

Local Variables
Local variables are dynamic in nature
  • Come into existence when the procedure is invoked
  • Disappear when the procedure terminates
Cannot reserve space for these variables in the data segment for two reasons:
    » Such space allocation is static
      – Remains active even when the procedure is not
    » It does not work with recursive procedures
Space for local variables is reserved on the stack

Local Variables (cont’d)
Example
N and temp
  • Two local variables
  • Each requires two bytes of storage

Local Variables (cont’d)
The information stored in the stack
    » parameters
    » return address
    » old BP value
    » local variables
is collectively called the stack frame
In high-level languages, the stack frame is also referred to as the activation record
    » Each procedure activation requires all this information
The BP value is referred to as the frame pointer
    » Once the BP value is known, we can access all the data in the stack frame

Local Variables: Examples
PROCFIB1.ASM
  • For simple procedures, registers can also be used for local variable storage
  • Uses registers for local variable storage
  • Outputs the largest Fibonacci number that is less than the given input number
PROCFIB2.ASM
  • Uses the stack for local variable storage
  • Performance implications of using registers versus stack are discussed later

Multiple Module Programs
In multi-module programs, a single program is split into multiple source files
Advantages
    » If a module is modified, only that module needs to be reassembled (not the whole program)
    » Several programmers can share the work
    » Making modifications is easier with several short files
    » Unintended modifications can be avoided
To facilitate separate assembly, two assembler directives are provided:
    » PUBLIC and EXTRN

PUBLIC Assembler Directive
The PUBLIC directive makes the associated labels public
    » Makes these labels available for other modules of the program
The format is
        PUBLIC label1, label2,...
Almost any label can be made public including
    » procedure names
    » variable names
    » equated labels
In the PUBLIC statement, it is not necessary to specify the type of label

Example: PUBLIC Assembler Directive
        .....
        PUBLIC error_msg, total, sample
        .....
        .DATA
error_msg   DB "Out of range!",0
total       DW 0
        .....
        .CODE
        .....
sample  PROC
        .....
sample  ENDP
        .....

EXTRN Assembler Directive
The EXTRN directive tells the assembler that certain labels are not defined in the current module
  • The assembler leaves “holes” in the OBJ file for the linker to fill in later on
The format is
        EXTRN label:type
where label is a label made public by a PUBLIC directive in some other module and type is the type of the label

EXTRN Assembler Directive (cont’d)
Type      Description
UNKNOWN   Undetermined or unknown type
BYTE      Data variable (size is 8 bits)
WORD      Data variable (size is 16 bits)
DWORD     Data variable (size is 32 bits)
QWORD     Data variable (size is 64 bits)
FWORD     Data variable (size is 6 bytes)
TBYTE     Data variable (size is 10 bytes)
PROC      A procedure name (NEAR or FAR according to .MODEL)
NEAR      A near procedure name
FAR       A far procedure name

EXTRN Assembler Directive (cont’d)
Example
        .MODEL SMALL
        ....
        EXTRN error_msg:BYTE, total:WORD
        EXTRN sample:PROC
        ....
Note: EXTRN (not EXTERN)
Example
        module1.asm (main procedure)
        module2.asm (string-length procedure)

Addressing Modes

Addressing Modes
Addressing mode refers to the specification of the location of data required by an operation
Pentium supports three fundamental addressing modes:
  • Register mode
  • Immediate mode
  • Memory mode
Specification of operands located in memory can be done in a variety of ways
  • Mainly to support high-level language constructs and data structures

Pentium Addressing Modes (32-bit Addresses)
[Figure: Pentium addressing modes for 32-bit addresses]

Memory Addressing Modes (16-bit Addresses)
[Figure: memory addressing modes for 16-bit addresses]

Simple Addressing Modes
Register addressing mode
  • Operands are located in registers
  • It is the most efficient addressing mode
Immediate addressing mode
  • Operand is stored as part of the instruction
    » This mode is used mostly for constants
  • It imposes several restrictions
  • Efficient as the data comes with the instructions
    » Instructions are generally prefetched
Both addressing modes were discussed before

Memory Addressing Modes
Pentium offers several addressing modes to access operands located in memory
    » Primary reason: To efficiently support high-level language constructs and data structures
Available addressing modes depend on the address size used
  • 16-bit modes (shown before)
    » same as those supported by 8086
  • 32-bit modes (shown before)
    » supported by Pentium
    » more flexible set

32-Bit Addressing Modes
These addressing modes use 32-bit registers
        Segment + Base + (Index * Scale) + displacement

Segment   Base   Index   Scale   Displacement
CS        EAX    EAX     1       no displacement
SS        EBX    EBX     2       8-bit displacement
DS        ECX    ECX     4       32-bit displacement
ES        EDX    EDX     8
FS        ESI    ESI
GS        EDI    EDI
          EBP    EBP
          ESP

Differences between 16- and 32-bit Modes
                 16-bit addressing   32-bit addressing
Base register    BX, BP              EAX, EBX, ECX, EDX, ESI, EDI, EBP, ESP
Index register   SI, DI              EAX, EBX, ECX, EDX, ESI, EDI, EBP
Scale factor     None                1, 2, 4, 8
Displacement     0, 8, 16 bits       0, 8, 32 bits
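The 32-bit rules in this table can be written down as a small checker. The Python sketch below computes only the offset part, base + (index * scale) + displacement, and leaves the segment out; the function name `effective_address` and the register-dictionary interface are assumptions made for illustration.

```python
REGS32 = ("EAX", "EBX", "ECX", "EDX", "ESI", "EDI", "EBP", "ESP")

def effective_address(regs, base=None, index=None, scale=1, disp=0):
    """Offset = base + (index * scale) + displacement, wrapped to 32 bits.
    regs maps register names to their current values."""
    if base is not None and base not in REGS32:
        raise ValueError("invalid base register")
    if index is not None and (index not in REGS32 or index == "ESP"):
        raise ValueError("index must be a 32-bit register other than ESP")
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale factor must be 1, 2, 4, or 8")
    ea = disp
    if base is not None:
        ea += regs[base]
    if index is not None:
        ea += regs[index] * scale
    return ea & 0xFFFFFFFF

regs = {"EBX": 0x1000, "ESI": 3}
# [EBX + ESI*2]: base 1000H plus the element with subscript 3 of a word array
word_element = effective_address(regs, base="EBX", index="ESI", scale=2)
```

A negative displacement works as well, since the displacement is signed and the sum is wrapped to 32 bits.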

16-bit or 32-bit Addressing Mode?
How does the processor know?
Uses the D bit in the CS segment descriptor
        D = 0
    » default size of operands and addresses is 16 bits
        D = 1
    » default size of operands and addresses is 32 bits
We can override these defaults
  • Pentium provides two size override prefixes
        66H   operand size override prefix
        67H   address size override prefix
Using these prefixes, we can mix 16- and 32-bit data and addresses

Examples: Override Prefixes
Our default mode is 16-bit data and addresses
Example 1: Data size override
        mov AX,123           ==> B8 007B
        mov EAX,123          ==> 66 | B8 0000007B
Example 2: Address size override
        mov AX,[EBX+ESI*2]   ==> 67 | 8B0473
Example 3: Address and data size override
        mov EAX,[EBX+ESI*2]  ==> 66 | 67 | 8B0473
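These encodings can be reproduced with a few lines of Python. The sketch below is an illustration under stated assumptions, not an assembler: `encode_mov_acc_imm` handles only mov AX/EAX with an immediate (opcode B8H), and `decode_sib` ignores the special cases of the SIB byte (index field 100 meaning "no index", base field 101 with mod 00). Both function names are hypothetical. Note that the slide prints the immediate as a number (007B), whereas the bytes in memory are stored least significant byte first.

```python
def encode_mov_acc_imm(value, operand_bits, default_bits=16):
    """Bytes for mov AX,imm16 or mov EAX,imm32 (opcode B8H).
    A 66H prefix is emitted when the operand size differs from the default."""
    if operand_bits not in (16, 32) or default_bits not in (16, 32):
        raise ValueError("sizes must be 16 or 32")
    prefix = b"\x66" if operand_bits != default_bits else b""
    mask = (1 << operand_bits) - 1
    imm = (value & mask).to_bytes(operand_bits // 8, "little")
    return prefix + b"\xB8" + imm

def decode_sib(sib):
    """Split a SIB byte into (base, index, scale): scale is bits 7-6,
    index is bits 5-3, base is bits 2-0."""
    regs = ("EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI")
    scale = 1 << (sib >> 6)
    index = regs[(sib >> 3) & 7]
    base = regs[sib & 7]
    return base, index, scale

mov_ax = encode_mov_acc_imm(123, 16)     # default 16-bit operand, no prefix
mov_eax = encode_mov_acc_imm(123, 32)    # 32-bit operand, needs 66H
sib_fields = decode_sib(0x73)            # last byte of 8B0473
```

Decoding 73H gives base EBX, index ESI, and scale 2, which is the memory operand used in Examples 2 and 3.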

Memory Addressing Modes
Direct addressing mode
  • Offset is specified as part of the instruction
      – Assembler replaces variable names by their offset values
      – Useful to access only simple variables
Example
        total_marks = assign_marks + test_marks + exam_marks
translated into
        mov EAX,assign_marks
        add EAX,test_marks
        add EAX,exam_marks
        mov total_marks,EAX

Memory Addressing Modes (cont’d)
Register indirect addressing mode
  • Effective address is placed in a general-purpose register
  • In 16-bit segments
    » only BX, SI, and DI are allowed to hold an effective address
        add AX,[BX]    is valid
        add AX,[CX]    is NOT allowed
  • In 32-bit segments
    » any of the eight 32-bit registers can hold an effective address
        add AX,[ECX]   is valid

Memory Addressing Modes (cont’d)
Default Segments
  • 16-bit addresses
    » BX, SI, DI: data segment
    » BP, SP: stack segment
  • 32-bit addresses
    » EAX, EBX, ECX, EDX, ESI, EDI: data segment
    » EBP, ESP: stack segment
Possible to override these defaults
  • Pentium provides segment override prefixes

Based Addressing
Effective address is computed as
        base + signed displacement
  • Displacement:
      – 16-bit addresses: 8- or 16-bit number
      – 32-bit addresses: 8- or 32-bit number
Useful to access fields of a structure or record
    » Base register ==> points to the base address of the structure
    » Displacement ==> relative offset within the structure
Useful to access arrays whose element size is not 2, 4, or 8 bytes
    » Displacement ==> points to the beginning of the array
    » Base register ==> relative offset of an element within the array

Based Addressing (cont’d)
[Figure: based addressing]

Indexed Addressing
Effective address is computed as
        (index * scale factor) + signed displacement
  • 16-bit addresses:
      – displacement: 8- or 16-bit number
      – scale factor: none (i.e., 1)
  • 32-bit addresses:
      – displacement: 8- or 32-bit number
      – scale factor: 2, 4, or 8
Useful to access elements of an array (particularly if the element size is 2, 4, or 8 bytes)
    » Displacement ==> points to the beginning of the array
    » Index register ==> selects an element of the array (array index)
    » Scaling factor ==> size of the array element

Indexed Addressing (cont’d)
Examples
        add AX,[DI+20]
      – We have seen similar usage to access parameters off the stack (in Chapter 10)
        add AX,marks_table[ESI*4]
      – Assembler replaces marks_table by a constant (i.e., supplies the displacement)
      – Each element of marks_table takes 4 bytes (the scale factor value)
      – ESI needs to hold the element subscript value
        add AX,table1[SI]
      – SI needs to hold the element offset in bytes
      – When we use the scale factor we avoid such byte counting

Based-Indexed Addressing
Based-indexed addressing with no scale factor
Effective address is computed as
        base + index + signed displacement
Useful in accessing two-dimensional arrays
    » Displacement ==> points to the beginning of the array
    » Base and index registers point to a row and an element within that row
Useful in accessing arrays of records
    » Displacement ==> represents the offset of a field in a record
    » Base and index registers hold a pointer to the base of the array and the offset of an element relative to the base of the array

Based-Indexed Addressing (cont’d)
Useful in accessing arrays passed on to a procedure
    » Base register ==> points to the beginning of the array
    » Index register ==> represents the offset of an element relative to the base of the array
Example
Assuming BX points to table1
        mov AX,[BX+SI]
        cmp AX,[BX+SI+2]
compares two successive elements of table1

Based-Indexed Addressing (cont’d)
Based-indexed addressing with scale factor
Effective address is computed as
        base + (index * scale factor) + signed displacement
Useful in accessing two-dimensional arrays when the element size is 2, 4, or 8 bytes
    » Displacement ==> points to the beginning of the array
    » Base register ==> holds offset to a row (relative to start of array)
    » Index register ==> selects an element of the row
    » Scaling factor ==> size of the array element

Illustrative Examples
Insertion sort
  • ins_sort.asm
  • Sorts an integer array using insertion sort algorithm
    » Inserts a new number into the sorted array in its right place
Binary search
  • bin_srch.asm
  • Uses binary search to locate a data item in a sorted array
    » Efficient search algorithm

Arrays
One-Dimensional Arrays
Array declaration in HLL (such as C)
        int test_marks[10];
specifies a lot of information about the array:
    » Name of the array (test_marks)
    » Number of elements (10)
    » Element size (2 bytes)
    » Interpretation of each element (int, i.e., signed integer)
    » Index range (0 to 9 in C)
You get very little help in assembly language!

Arrays (cont’d)
In assembly language, a declaration such as
        test_marks DW 10 DUP (?)
only assigns a name and allocates storage space.
  • You, as the assembly language programmer, have to “properly” access the array elements by taking into account the element size and the range of subscripts.
Accessing an array element requires its displacement or offset relative to the start of the array in bytes

Arrays (cont’d)
To compute displacement, we need to know how the array is laid out
    » Simple for 1-D arrays
Assuming C style subscripts
        displacement = subscript * element size in bytes
If the element size is 2, 4, or 8 bytes
  • a scale factor can be used to avoid counting displacement in bytes

Multidimensional Arrays
We focus on two-dimensional arrays
    » Our discussion can be generalized to higher dimensions
A 5 × 3 array can be declared in C as
        int class_marks[5][3];
Two dimensional arrays can be stored in one of two ways:
  • Row-major order
      – Array is stored row by row
      – Most HLL including C and Pascal use this method
  • Column-major order
      – Array is stored column by column
      – FORTRAN uses this method

Multidimensional Arrays (cont’d)
[Figure: two-dimensional array storage layout]

Multidimensional Arrays (cont’d)
Why do we need to know the underlying storage representation?
    » In a HLL, we really don’t need to know
    » In assembly language, we need this information as we have to calculate displacement of element to be accessed
In assembly language,
        class_marks DW 5*3 DUP (?)
allocates 30 bytes of storage
There is no support for using row and column subscripts
    » Need to translate these subscripts into a displacement value

Multidimensional Arrays (cont’d)
Assuming C language subscript convention, we can express displacement of an element in a 2-D array at row i and column j as
        displacement = (i * COLUMNS + j) * ELEMENT_SIZE
where
        COLUMNS = number of columns in the array
        ELEMENT_SIZE = element size in bytes
Example: Displacement of the class_marks[3][1] element is
        (3*3 + 1) * 2 = 20
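The formula is easy to check in code. The Python sketch below implements the row-major formula from the slide; the column-major variant is not given in the slides and is included only as the standard counterpart for comparison. Both function names are hypothetical.

```python
def row_major_displacement(i, j, columns, element_size):
    """Byte displacement of element [i][j] when rows are stored one after another."""
    return (i * columns + j) * element_size

def column_major_displacement(i, j, rows, element_size):
    """Same element when columns are stored one after another (FORTRAN style)."""
    return (j * rows + i) * element_size

# class_marks is 5 x 3 with 2-byte elements
row_major = row_major_displacement(3, 1, columns=3, element_size=2)
col_major = column_major_displacement(3, 1, rows=5, element_size=2)
```

The same subscripts land on different displacements in the two layouts, which is why the assembly language programmer has to know which one is in use.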

Examples of Arrays
Example 1
One-dimensional array
    » Computes array sum (each element is 4 bytes long, e.g., long integers)
    » Uses scale factor 4 to access elements of the array by using a 32-bit addressing mode (uses ESI rather than SI)
    » Also illustrates the use of predefined location counter $
Example 2
Two-dimensional array
    » Finds sum of a column
    » Uses “based-indexed addressing with scale factor” to access elements of a column

Recursion
A recursive procedure calls itself
  • Directly, or
  • Indirectly
Some applications can be naturally expressed using recursion
        factorial(0) = 1
        factorial(n) = n * factorial(n − 1) for n > 0
From implementation viewpoint
  • Very similar to any other procedure call
    » Activation records are stored on the stack

Recursion (cont’d)
[Figure: recursion]

Recursion (cont’d)
Example 1
  • Factorial
    » Discussed before
Example 2
  • Quicksort (on an N-element array)
  • Basic algorithm
    » Selects a partition element x
    » Assume that the final position of x is array[i]
    » Moves elements less than x into array[0]…array[i − 1]
    » Moves elements greater than x into array[i+1]…array[N − 1]
    » Applies quicksort recursively to sort these two subarrays
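The basic algorithm can be sketched directly from these steps. In the Python version below, choosing the last element of the range as the partition element x is an assumption (the slides do not say how x is selected), and the function and parameter names are hypothetical.

```python
def quicksort(array, lo=0, hi=None):
    """Sort array[lo..hi] in place, following the steps listed above."""
    if hi is None:
        hi = len(array) - 1
    if lo >= hi:
        return                       # zero or one element: already sorted
    x = array[hi]                    # partition element (assumed choice)
    i = lo                           # array[lo..i-1] holds elements < x
    for k in range(lo, hi):
        if array[k] < x:
            array[i], array[k] = array[k], array[i]
            i += 1
    array[i], array[hi] = array[hi], array[i]   # x reaches its final position array[i]
    quicksort(array, lo, i - 1)      # sort the subarray of smaller elements
    quicksort(array, i + 1, hi)      # sort the subarray of the remaining elements

marks = [55, 12, 97, 3, 55, 40]
quicksort(marks)
```

Each recursive call gets its own lo, hi, x, and i, which corresponds to a separate activation record on the stack.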

Selected Pentium Instructions

Status Flags
[Figure: status flags]

Status Flags (cont’d)
Status flags are updated to indicate certain properties of the result
  • Example: If the result is zero, zero flag is set
Once a flag is set, it remains in that state until another instruction that affects the flags is executed
Not all instructions affect all status flags
  • add and sub affect all six flags
  • inc and dec affect all but the carry flag
  • mov, push, and pop do not affect any flags

Status Flags (cont’d)
Example
        ; initially, assume ZF = 0
        mov AL,55H   ; ZF is still zero
        sub AL,55H   ; result is 0
                     ; ZF is set (ZF = 1)
        push BX      ; ZF remains 1
        mov BX,AX    ; ZF remains 1
        pop DX       ; ZF remains 1
        mov CX,0     ; ZF remains 1
        inc CX       ; result is 1
                     ; ZF is cleared (ZF = 0)

Status Flags (cont’d)
Zero Flag
  • Indicates zero result
      – If the result is zero, ZF = 1
      – Otherwise, ZF = 0
  • Zero can result in several ways (e.g. overflow)
        mov AL,0FH       mov AX,0FFFFH    mov AX,1
        add AL,0F1H      inc AX           dec AX
    » All three examples result in zero result and set ZF
  • Related instructions
        jz    jump if zero (jump if ZF = 1)
        jnz   jump if not zero (jump if ZF = 0)

Status Flags (cont’d)
Uses of zero flag
  • Two main uses of zero flag
    » Testing equality
      – Often used with cmp instruction
        cmp char,’$’   ; ZF = 1 if char is $
        cmp AX,BX
    » Counting to a preset value
      – Initialize a register with the count value
      – Decrement it using dec instruction
      – Use jz/jnz to transfer control

Status Flags (cont’d)
Consider the following code
        sum := 0
        for (i = 1 to M)
            for (j = 1 to N)
                sum := sum + 1
            end for
        end for
Assembly code
        sub AX,AX    ; AX := 0
        mov DX,M
    outer_loop:
        mov CX,N
    inner_loop:
        inc AX
        loop inner_loop
        dec DX
        jnz outer_loop
    exit_loops:
        mov sum,AX

Status Flags (cont’d)
Two observations
  • For the outer loop, the pair
        dec DX
        jnz outer_loop
    does the job of a loop instruction (loop itself always uses CX as the counter)
    » This two-instruction sequence is more efficient than the loop instruction (takes less time to execute)
    » loop instruction does not affect any flags!
  • This two-instruction sequence is better than initializing DX = 1 and executing
        inc DX
        cmp DX,M
        jle outer_loop

Status Flags (cont’d)
Carry Flag
  • Records the fact that the result of an arithmetic operation on unsigned numbers is out of range
  • The carry flag is set in the following examples
        mov AL,0FH       mov AX,12AEH
        add AL,0F1H      sub AX,12AFH
  • Range of 8-, 16-, and 32-bit unsigned numbers
        size      range
        8 bits    0 to 255 (2^8 − 1)
        16 bits   0 to 65,535 (2^16 − 1)
        32 bits   0 to 4,294,967,295 (2^32 − 1)

Status Flags (cont’d)
  • Carry flag is not set by inc and dec instructions
    » The carry flag is not set in the following examples
        mov AL,0FFH      mov AX,0
        inc AL           dec AX
  • Related instructions
        jc    jump if carry (jump if CF = 1)
        jnc   jump if no carry (jump if CF = 0)
  • Carry flag can be manipulated directly using
        stc   set carry flag (set CF to 1)
        clc   clear carry flag (clears CF to 0)
        cmc   complement carry flag (inverts CF value)

Status Flags (cont’d)
Uses of carry flag
  • To propagate carry/borrow in multiword addition/subtraction
                      1   <== carry from lower 32 bits
        x = 3710 26A8 1257 9AE7H
        y = 489B A321 FE60 4213H
            7FAB C9CA 10B7 DCFAH
  • To detect overflow/underflow condition
    » In the last example, carry out of leftmost bit indicates overflow
  • To test a bit using the shift/rotate instructions
    » Bit shifted/rotated out is captured in the carry flag
    » We can use jc/jnc to test whether this bit is 1 or 0
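Carry propagation in a multiword addition can be modeled by adding the lower halves first and feeding the carry into the upper halves, which is what an add followed by adc does. In the Python sketch below, the function name is hypothetical and the two operands are illustrative values chosen so that the sum comes out as 7FAB C9CA 10B7 DCFAH.

```python
MASK32 = 0xFFFFFFFF

def add64_using_32bit_halves(x, y):
    """Add two 64-bit values 32 bits at a time.
    Returns (sum, carry out of lower half, carry out of upper half)."""
    lo = (x & MASK32) + (y & MASK32)          # add on the lower 32 bits
    carry_low = lo >> 32                      # CF after the first add
    hi = (x >> 32) + (y >> 32) + carry_low    # adc on the upper 32 bits
    carry_high = hi >> 32                     # CF after the adc
    return ((hi & MASK32) << 32) | (lo & MASK32), carry_low, carry_high

x = 0x371026A812579AE7
y = 0x489BA321FE604213
total, carry_low, carry_high = add64_using_32bit_halves(x, y)
```

Here the lower halves produce a carry that is added into the upper halves, while the upper halves themselves produce no carry out, so the 64-bit unsigned result is in range.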

Status Flags (cont’d)
Overflow flag
  • Indicates out-of-range result on signed numbers
      – Signed number counterpart of the carry flag
  • The following code sets the overflow flag but not the carry flag
        mov AL,72H   ; 72H = 114D
        add AL,0EH   ; 0EH = 14D
  • Range of 8-, 16-, and 32-bit signed numbers
        size      range
        8 bits    −128 to +127                         −2^7 to (2^7 − 1)
        16 bits   −32,768 to +32,767                   −2^15 to (2^15 − 1)
        32 bits   −2,147,483,648 to +2,147,483,647     −2^31 to (2^31 − 1)

Status Flags (cont’d)
Unsigned interpretation
        mov AL,72H
        add AL,0EH
        jc overflow
    no_overflow:
        (no overflow code here)
        ....
    overflow:
        (overflow code here)
        ....
Signed interpretation
        mov AL,72H
        add AL,0EH
        jo overflow
    no_overflow:
        (no overflow code here)
        ....
    overflow:
        (overflow code here)
        ....
Signed or unsigned: How does the system know?
  • The processor does not know the interpretation
  • It sets carry and overflow under each interpretation
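Since the processor sets both flags on every addition, the two interpretations can be compared side by side. A minimal Python sketch of an 8-bit add that reports CF, OF, ZF, and SF; the function name `add8` and the dictionary return format are assumptions made for illustration.

```python
def add8(a, b):
    """8-bit add returning (result, flags) for both interpretations."""
    a &= 0xFF
    b &= 0xFF
    raw = a + b
    result = raw & 0xFF
    flags = {
        "CF": raw > 0xFF,                                  # unsigned out of range
        # signed out of range: operands share a sign that the result lacks
        "OF": ((a ^ result) & (b ^ result) & 0x80) != 0,
        "ZF": result == 0,
        "SF": (result & 0x80) != 0,
    }
    return result, flags

result_1, flags_1 = add8(0x72, 0x0E)   # 114 + 14: overflow only if signed
result_2, flags_2 = add8(0x0F, 0xF1)   # 15 + 241: carry only if unsigned
```

For 72H + 0EH the result 80H is 128 when read as unsigned (in range, CF = 0) but −128 when read as signed (out of range, OF = 1); the program picks jc or jo depending on which reading it intends.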

Status Flags (cont’d)
  • Related instructions
        jo    jump if overflow (jump if OF = 1)
        jno   jump if no overflow (jump if OF = 0)
  • There is a special software interrupt instruction
        into  interrupt on overflow
    Details on this instruction in Chapter 20
Uses of overflow flag
  • Main use
    » To detect out-of-range result on signed numbers

Status Flags (cont’d)
Sign flag
  • Indicates the sign of the result
      – Useful only when dealing with signed numbers
      – Simply a copy of the most significant bit of the result
  • Examples
        mov AL,15                   mov AL,15
        add AL,97                   sub AL,97
    clears the sign flag as         sets the sign flag as
    the result is 112               the result is −82
    (or 01110000 in binary)         (or 10101110 in binary)
  • Related instructions
        js    jump if sign (jump if SF = 1)
        jns   jump if no sign (jump if SF = 0)

Status Flags (cont’d)
Consider the count down loop:
        for (i = M downto 0)
            <loop body>
        end for
If we don’t use the jns, we need cmp as shown below:
        cmp CX,0
        jge for_loop
The count down loop can be implemented as
        mov CX,M
    for_loop:
        <loop body>
        dec CX
        jns for_loop
Usage of sign flag
  • To test the sign of the result
  • Also useful to efficiently implement countdown loops

Status Flags (cont’d)

Auxiliary flag
  Indicates whether an operation produced a carry or borrow in the low-order 4 bits (nibble) of 8-, 16-, or 32-bit operands (i.e., operand size doesn’t matter)
  Example 1: carry from lower 4 bits
      mov   AL,43        43D  = 00101011B
      add   AL,94        94D  = 01011110B
                         137D = 10001001B
    » As there is a carry from the lower nibble (1011B + 1110B), auxiliary flag is set

Status Flags (cont’d)

  Related instructions
    » No conditional jump instructions with this flag
    » Arithmetic operations on BCD numbers use this flag
        aaa    ASCII adjust for addition
        aas    ASCII adjust for subtraction
        aam    ASCII adjust for multiplication
        aad    ASCII adjust for division
        daa    Decimal adjust for addition
        das    Decimal adjust for subtraction
      – Appendix I has more details on these instructions
  Usage
    » Main use is in performing arithmetic operations on BCD numbers

Status Flags (cont’d)

Parity flag
  Indicates even parity of the low 8 bits of the result
    – PF is set if the lower 8 bits contain an even number of 1 bits
    – For 16- and 32-bit values, only the least significant 8 bits are considered for computing parity value
  Example
      mov   AL,53        53D  = 00110101B
      add   AL,89        89D  = 01011001B
                         142D = 10001110B
    » As the result has an even number of 1 bits (four), parity flag is set
  Related instructions
      jp     jump on even parity (jump if PF = 1)
      jnp    jump on odd parity (jump if PF = 0)
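The auxiliary and parity examples can be checked with a short Python model. This is only a sketch: the helper names aux_flag_add and parity_flag are assumptions for illustration, and each covers only what the slides describe (a nibble carry on addition, and even parity of the low 8 bits).

```python
def aux_flag_add(a, b):
    """AF after an add: 1 if the low nibbles produce a carry. Hypothetical helper."""
    return 1 if (a & 0x0F) + (b & 0x0F) > 0x0F else 0


def parity_flag(result):
    """PF: 1 if the low 8 bits of the result hold an even number of 1 bits."""
    ones = bin(result & 0xFF).count("1")
    return 1 if ones % 2 == 0 else 0


# 43 + 94: low nibbles 1011B + 1110B carry into bit 4, so AF = 1.
print(aux_flag_add(43, 94))
# 53 + 89 = 142 = 10001110B, which has four 1 bits, so PF = 1.
print(parity_flag(53 + 89))
```

Because parity_flag masks to 8 bits, passing a 16- or 32-bit result behaves as the slide states: the upper bits never influence PF.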

Status Flags (cont’d)

  Usage of parity flag
    » Useful in writing data encoding programs
    » Example: Encodes the byte in AL (MSB is the parity bit)

  parity_encode PROC
      shl   AL,1             ; drop old MSB; PF reflects the 7 data bits
      jp    parity_zero
      stc                    ; CF = 1
      jmp   move_parity_bit
  parity_zero:
      clc                    ; CF = 0
  move_parity_bit:
      rcr   AL,1             ; rotate CF into the MSB
      ret
  parity_encode ENDP
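To see what the encoding procedure produces, here is a hedged Python model of the same three steps (shift left, test parity, rotate the chosen bit back in). The function name parity_encode mirrors the slide's label, but the Python body is an illustration, not the author's code.

```python
def parity_encode(al):
    """Model of the slide's routine: return AL with an even-parity bit in the MSB."""
    shifted = (al << 1) & 0xFF              # shl AL,1: the old MSB is discarded
    pf = 1 if bin(shifted).count("1") % 2 == 0 else 0
    cf = 0 if pf else 1                     # jp -> clc (CF = 0), otherwise stc (CF = 1)
    return (cf << 7) | (shifted >> 1)       # rcr AL,1: CF enters bit 7


print(hex(parity_encode(0x41)))   # 'A' has two 1 bits, parity bit stays 0
print(hex(parity_encode(0x43)))   # 'C' has three 1 bits, parity bit becomes 1
```

In this model every encoded byte ends up with an even number of 1 bits, and the low seven data bits are unchanged.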

Arithmetic Instructions

Pentium provides several arithmetic instructions that operate on 8-, 16- and 32-bit operands
    » Addition: add, adc, inc
    » Subtraction: sub, sbb, dec, neg, cmp
    » Multiplication: mul, imul
    » Division: div, idiv
    » Related instructions: cbw, cwd, cdq, cwde, movsx, movzx
  There are a few other instructions such as aaa, aas, etc. that operate on decimal numbers
    » See Appendix I for details

Arithmetic Instructions (cont’d)

Multiplication
  More complicated than add / sub
    » Produces double-length results
      – E.g., multiplying two 8-bit numbers produces a 16-bit result
    » Cannot use a single multiply instruction for signed and unsigned numbers
      – add and sub instructions work both on signed and unsigned numbers
      – For multiplication, we need separate instructions
          mul     for unsigned numbers
          imul    for signed numbers

Arithmetic Instructions (cont’d)

Unsigned multiplication
      mul   source
    » Depending on the source operand size, the location of the other source operand and destination are selected

      Source size    Other operand    Destination
      8 bits         AL               AX
      16 bits        AX               DX:AX
      32 bits        EAX              EDX:EAX

Arithmetic Instructions (cont’d)

  Example
      mov   AL,10
      mov   DL,25
      mul   DL
    produces 250D in AX register (result fits in AL)

The imul instruction can use the same syntax
    » Also supports other formats
  Example
      mov   DL,0FFH      ; DL = -1
      mov   AL,0BEH      ; AL = -66
      imul  DL
    produces 66D in AX register (again, result fits in AL)
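The need for two multiply instructions shows up when the same bit patterns are multiplied under each interpretation. The Python sketch below models only the 8-bit forms; mul8, imul8, and to_signed8 are names assumed for this illustration.

```python
def to_signed8(x):
    """Interpret the low 8 bits of x as a two's-complement value."""
    x &= 0xFF
    return x - 0x100 if x & 0x80 else x


def mul8(al, src):
    """Unsigned 8-bit multiply; returns the 16-bit pattern left in AX."""
    return ((al & 0xFF) * (src & 0xFF)) & 0xFFFF


def imul8(al, src):
    """Signed 8-bit multiply; returns the 16-bit pattern left in AX."""
    return (to_signed8(al) * to_signed8(src)) & 0xFFFF


print(mul8(10, 25))            # 250, the slide's first example
print(imul8(0xBE, 0xFF))       # -66 * -1 = 66, the slide's second example
print(hex(mul8(0xBE, 0xFF)))   # same bit patterns, unsigned: 190 * 255
```

The last two calls use identical operands yet leave different values in AX, which is the reason add and sub can be shared but mul and imul cannot.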

Arithmetic Instructions (cont’d)

Division instruction
  Even more complicated than multiplication
    » Produces two results
      – Quotient
      – Remainder
    » In multiplication, using a double-length register, there will not be any overflow
      – In division, divide overflow is possible
          Pentium provides a special software interrupt when a divide overflow occurs
  Two instructions as in multiplication
      div   source     for unsigned numbers
      idiv  source     for signed numbers

Arithmetic Instructions (cont’d)

Dividend is twice the size of the divisor
Dividend is assumed to be in
  AX (8-bit divisor)
  DX:AX (16-bit divisor)
  EDX:EAX (32-bit divisor)


Arithmetic Instructions (cont’d)

Example
      mov   AX,251
      mov   CL,12
      div   CL
  produces 20D in AL and 11D as remainder in AH

Example
      sub   DX,DX        ; clear DX
      mov   AX,141BH     ; AX = 5147D
      mov   CX,012CH     ; CX = 300D
      div   CX
  produces 17D in AX and 47D as remainder in DX

Arithmetic Instructions (cont’d)

Signed division requires some help
    » We extended an unsigned 16-bit number to 32 bits by placing zeros in the upper 16 bits
    » This will not work for signed numbers
      – To extend signed numbers, you have to copy the sign bit into those upper bit positions
  Pentium provides three instructions to aid sign extension
    » All three take no operands
        cbw    converts byte to word (extends AL into AH)
        cwd    converts word to doubleword (extends AX into DX)
        cdq    converts doubleword to quadword (extends EAX into EDX)

Arithmetic Instructions (cont’d)

  Some additional related instructions
    » Sign extension
        cwde   converts word to doubleword (extends AX into EAX)
    » Two move instructions
        movsx  dest,src    (move sign-extended src to dest)
        movzx  dest,src    (move zero-extended src to dest)
    » For both move instructions, dest has to be a register
    » The src operand can be in a register or memory
      – If src is 8 bits, dest must be either a 16- or 32-bit register
      – If src is 16 bits, dest must be a 32-bit register

Arithmetic Instructions (cont’d)

Example
      mov   AL,-95
      cbw                ; AH = FFH
      mov   CL,12
      idiv  CL
  produces -7D in AL and -11D as remainder in AH

Example
      mov   AX,-5147
      cwd                ; DX := FFFFH
      mov   CX,300
      idiv  CX
  produces -17D in AX and -47D as remainder in DX
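The signed results above follow from idiv truncating the quotient toward zero, so the remainder takes the sign of the dividend. A small Python sketch makes the contrast with floor division visible. The function name idiv_model is an assumption, and register widths and divide overflow are deliberately not modeled.

```python
def idiv_model(dividend, divisor):
    """Truncating signed division: returns (quotient, remainder). Hypothetical helper."""
    quotient = abs(dividend) // abs(divisor)
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient
    remainder = dividend - quotient * divisor   # same sign as the dividend
    return quotient, remainder


print(idiv_model(-95, 12))      # matches the slide: quotient -7, remainder -11
print(idiv_model(-5147, 300))   # matches the slide: quotient -17, remainder -47
print(divmod(-95, 12))          # Python's floor division gives a different pair
```

The last line is included only as a caution: a high-level language's built-in division may round differently from the hardware instruction.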

Arithmetic Instructions (cont’d)

Use of Shifts for Multiplication and Division
  Shifts are more efficient
  Example: Multiply AX by 32
      mov   CX,32
      imul  CX
    takes 12 clock cycles
  Using
      sal   AX,5
    takes just one clock cycle

Application Examples

PutInt8 procedure
  To display a number, repeatedly divide it by 10 and display the remainders obtained

                  quotient    remainder
      108/10         10           8
       10/10          1           0
        1/10          0           1

  To display digits, they must be converted to their character form
    » This means simply adding the ASCII code for zero
        line 24:   add   AH,'0'
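The repeated-division idea can be sketched in Python as follows. This is an illustration of the technique only, not the PutInt8 listing from the text; the function name digits_by_division is an assumption and the sketch handles non-negative values only.

```python
def digits_by_division(n):
    """Convert a non-negative integer to its decimal string by repeated division by 10."""
    chars = []
    while True:
        n, remainder = divmod(n, 10)
        # Adding the code for '0' turns the digit value into its character form.
        chars.append(chr(remainder + ord("0")))
        if n == 0:
            break
    # Remainders come out least significant digit first, so reverse them.
    return "".join(reversed(chars))


print(digits_by_division(108))
```

The reversal step matters: for 108 the remainders appear in the order 8, 0, 1, so they must be buffered or pushed on a stack before being displayed.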

Application Examples (cont’d)

GetInt8 procedure
  To read a number, read each digit character
    » Convert to its numeric equivalent
    » Multiply the running total by 10 and add this digit
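A minimal Python sketch of the running-total method is given below. It is not the GetInt8 listing; the name parse_digits is an assumption, and the sketch assumes the input holds only the characters '0' to '9' (no sign handling and no 8-bit range check).

```python
def parse_digits(text):
    """Build an integer from digit characters using total = total * 10 + digit."""
    total = 0
    for ch in text:
        digit = ord(ch) - ord("0")   # convert the character to its numeric equivalent
        total = total * 10 + digit
    return total


print(parse_digits("108"))
```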

Indirect Jumps

Direct jump
  Target address is encoded in the instruction itself
Indirect jump
  Introduces a level of indirection
    » Address is specified either through memory or a general-purpose register
  Example
      jmp   CX
    jumps to the address in CX
  Address is absolute
    » Not relative as in direct jumps

Indirect Jumps (cont’d)

  switch (ch)
  {
      case '0': count[0]++; break;
      case '1': count[1]++; break;
      case '2': count[2]++; break;
      case '3': count[3]++; break;
      default:  count[4]++;
  }

Indirect Jumps (cont’d)

Turbo C assembly code for the switch statement

  _main PROC NEAR
      ...
      mov   AL,ch
      cbw
      sub   AX,48        ; 48 = ASCII for 0
      mov   BX,AX
      cmp   BX,3
      ja    default
      shl   BX,1         ; BX = BX * 2
      jmp   WORD PTR CS:jump_table[BX]    ; indirect jump

Indirect Jumps (cont’d)

  case_0:
      inc   WORD PTR [BP-10]
      jmp   SHORT end_switch
  case_1:
      inc   WORD PTR [BP-8]
      jmp   SHORT end_switch
  case_2:
      inc   WORD PTR [BP-6]
      jmp   SHORT end_switch
  case_3:
      inc   WORD PTR [BP-4]
      jmp   SHORT end_switch
  default:
      inc   WORD PTR [BP-2]
  end_switch:
      ...
  _main ENDP

Indirect Jumps (cont’d)

Jump table for the indirect jump

  jump_table LABEL WORD
      DW    case_0
      DW    case_1
      DW    case_2
      DW    case_3
      ...

  Indirect jump uses this table to jump to the appropriate case routine
  The indirect jump instruction uses segment override prefix to refer to the jump_table in the CODE segment
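The dispatch logic of the compiled switch (subtract the code for '0', range-check with an unsigned comparison, then index a table of targets) can be modeled in Python. This is only a sketch: JUMP_TABLE, make_case, and classify are names assumed here, functions stand in for code addresses, and the five-slot count list stands in for the five words at [BP-10] through [BP-2].

```python
def make_case(i):
    """Return a routine that increments count[i], standing in for label case_i."""
    def case(count):
        count[i] += 1
    return case


# Table of targets indexed by (character code - 48), like the DW entries.
JUMP_TABLE = [make_case(i) for i in range(4)]


def classify(ch, count):
    index = (ord(ch) - 48) & 0xFFFF    # sub AX,48, kept to 16 bits
    if index > 3:                      # cmp BX,3 / ja default (unsigned compare)
        count[4] += 1                  # default counter
    else:
        JUMP_TABLE[index](count)       # indirect jump through the table


count = [0] * 5
for ch in "0123x/":
    classify(ch, count)
print(count)
```

Because ja is an unsigned test, a character below '0' wraps to a large index and also reaches the default case, so a single comparison guards both ends of the table.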

Conditional Jumps

Three types of conditional jumps
  Jumps based on the value of a single flag
    » Arithmetic flags such as zero, carry can be tested using these instructions
  Jumps based on unsigned comparisons
    » Operands of cmp instruction are treated as unsigned numbers
  Jumps based on signed comparisons
    » Operands of cmp instruction are treated as signed numbers

Jumps Based on Single Flags

Testing for zero
      jz      jump if zero         jumps if ZF = 1
      je      jump if equal        jumps if ZF = 1
      jnz     jump if not zero     jumps if ZF = 0
      jne     jump if not equal    jumps if ZF = 0
      jcxz    jump if CX = 0       jumps if CX = 0 (flags are not tested)

Jumps Based on Single Flags (cont’d)

Testing for carry
      jc      jump if carry           jumps if CF = 1
      jnc     jump if no carry        jumps if CF = 0
Testing for overflow
      jo      jump if overflow        jumps if OF = 1
      jno     jump if no overflow     jumps if OF = 0
Testing for sign
      js      jump if negative        jumps if SF = 1
      jns     jump if not negative    jumps if SF = 0

Jumps Based on Single Flags (cont’d)

Testing for parity
      jp      jump if parity            jumps if PF = 1
      jpe     jump if parity is even    jumps if PF = 1
      jnp     jump if not parity        jumps if PF = 0
      jpo     jump if parity is odd     jumps if PF = 0

Jumps Based on Unsigned Comparisons

      Mnemonic    Meaning                       Condition
      je          jump if equal                 ZF = 1
      jz          jump if zero                  ZF = 1
      jne         jump if not equal             ZF = 0
      jnz         jump if not zero              ZF = 0
      ja          jump if above                 CF = 0 and ZF = 0
      jnbe        jump if not below or equal    CF = 0 and ZF = 0

Jumps Based on Unsigned Comparisons (cont’d)

      Mnemonic    Meaning                       Condition
      jae         jump if above or equal        CF = 0
      jnb         jump if not below             CF = 0
      jb          jump if below                 CF = 1
      jnae        jump if not above or equal    CF = 1
      jbe         jump if below or equal        CF = 1 or ZF = 1
      jna         jump if not above             CF = 1 or ZF = 1

Jumps Based on Signed Comparisons

      Mnemonic    Meaning                      Condition
      je          jump if equal                ZF = 1
      jz          jump if zero                 ZF = 1
      jne         jump if not equal            ZF = 0
      jnz         jump if not zero             ZF = 0
      jg          jump if greater              ZF = 0 and SF = OF
      jnle        jump if not less or equal    ZF = 0 and SF = OF

Jumps Based on Signed Comparisons (cont’d)

      Mnemonic    Meaning                         Condition
      jge         jump if greater or equal        SF = OF
      jnl         jump if not less                SF = OF
      jl          jump if less                    SF ≠ OF
      jnge        jump if not greater or equal    SF ≠ OF
      jle         jump if less or equal           ZF = 1 or SF ≠ OF
      jng         jump if not greater             ZF = 1 or SF ≠ OF
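The flag conditions in these tables can be exercised with a small Python model of an 8-bit cmp. This is a sketch under stated assumptions: cmp8 and the predicate functions are names invented for the example, and only ZF, CF, SF, and OF are modeled.

```python
def cmp8(a, b):
    """Model cmp a,b on 8-bit operands; returns the flags as a dict."""
    a &= 0xFF
    b &= 0xFF
    result = (a - b) & 0xFF
    return {
        "ZF": int(result == 0),
        "CF": int(a < b),                                  # borrow out of bit 7
        "SF": result >> 7,
        # Overflow: operands differ in sign and the result's sign differs from a's.
        "OF": ((a ^ b) & (a ^ result) & 0x80) >> 7,
    }


def ja(f):  return f["CF"] == 0 and f["ZF"] == 0   # unsigned above
def jb(f):  return f["CF"] == 1                     # unsigned below
def jg(f):  return f["ZF"] == 0 and f["SF"] == f["OF"]   # signed greater
def jl(f):  return f["SF"] != f["OF"]               # signed less


flags = cmp8(0xFF, 0x01)       # 255 vs 1 unsigned, -1 vs 1 signed
print(ja(flags), jg(flags))    # above as unsigned, but not greater as signed
print(jb(flags), jl(flags))    # not below as unsigned, but less as signed
```

One cmp sets all four flags. The choice between ja and jg is where the programmer states whether the operands are unsigned or signed.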

Implementing HLL Decision Structures

High-level language decision structures can be implemented in a straightforward way
See Section 12.4 for examples that implement
  if-then-else
  if-then-else with a relational operator
  if-then-else with logical operator AND
  if-then-else with logical operator OR
  while loop
  repeat-until loop
  for loops

Logical Expressions in HLLs

Representation of Boolean data
  Only a single bit is needed to represent Boolean data
  Usually a single byte is used
    » For example, in C
      – All zero bits represent false
      – A non-zero value represents true
Logical expressions
  Logical instructions AND, OR, etc. are used
Bit manipulation
  Logical, shift, and rotate instructions are used

Evaluation of Logical Expressions

Two basic ways
  Full evaluation
    » Entire expression is evaluated before assigning a value
    » PASCAL uses full evaluation
  Partial evaluation
    » Assigns as soon as the final outcome is known without blindly evaluating the entire logical expression
    » Two rules help:
      – cond1 AND cond2: if cond1 is false, no need to evaluate cond2
      – cond1 OR cond2: if cond1 is true, no need to evaluate cond2

Evaluation of Logical Expressions (cont’d)

Partial evaluation
  Used by C
Useful in certain cases to avoid run-time errors
Example
      if ((X > 0) AND (Y/X > 100))
  If X is 0, full evaluation results in divide error
  Partial evaluation will not evaluate (Y/X > 100) if X = 0
Partial evaluation is used to test if a pointer value is NULL before accessing the data it points to
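The difference between the two strategies can be demonstrated in Python, whose `and` operator also uses partial (short-circuit) evaluation. Both function names below are assumptions made for this sketch; full evaluation is imitated by computing both conditions before combining them.

```python
def partial_evaluation(x, y):
    """(X > 0) AND (Y/X > 100) with short-circuiting: Y/X is skipped when X <= 0."""
    return x > 0 and y / x > 100


def full_evaluation(x, y):
    """Both conditions are computed first, as a full-evaluation language would."""
    left = x > 0
    right = y / x > 100      # raises ZeroDivisionError when x == 0
    return left and right


print(partial_evaluation(0, 5))      # the division is never attempted
try:
    full_evaluation(0, 5)
except ZeroDivisionError:
    print("full evaluation hit a divide error")
```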

Bit Instructions

Bit Test and Modify Instructions
  Four bit test instructions
  Each takes the position of the bit to be tested

      Instruction                      Effect on the selected bit
      bt  (Bit Test)                   No effect
      bts (Bit Test and Set)           selected bit ← 1
      btr (Bit Test and Reset)         selected bit ← 0
      btc (Bit Test and Complement)    selected bit ← NOT(selected bit)

Bit Instructions (cont’d)

All four instructions have the same format
We use bt to illustrate the format
      bt    operand,bit_pos
  operand is word or doubleword
    » Can be in a register or memory
  bit_pos indicates the position of the bit to be tested
    » Can be an immediate value or in a 16/32-bit register
Instructions in this group affect only the carry flag
    » Other five flags are undefined

Bit Scan Instructions

These instructions scan the operand for a 1 bit and return the bit position in a register
Two instructions
      bsf   dest_reg,operand     ; bit scan forward
      bsr   dest_reg,operand     ; bit scan reverse
    » operand can be a word or doubleword in a register or memory
    » dest_reg receives the bit position
      – Must be a 16- or 32-bit register
  Only ZF is updated (other five flags undefined)
      – ZF = 1 if all bits of operand are 0
      – ZF = 0 otherwise (position of first 1 bit in dest_reg)
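A Python sketch of the two scans may help fix the direction of each: bsf reports the lowest set bit and bsr the highest. The function names reuse the mnemonics for readability, but the code is an illustration only. Returning None for the destination when the operand is zero is an assumption of this model, chosen to signal that the register should not be relied on in that case.

```python
def bsf(operand, width=32):
    """Bit scan forward: returns (ZF, position of the least significant 1 bit)."""
    operand &= (1 << width) - 1
    if operand == 0:
        return 1, None                       # ZF = 1, no position to report
    lowest = operand & -operand              # isolates the lowest set bit
    return 0, lowest.bit_length() - 1


def bsr(operand, width=32):
    """Bit scan reverse: returns (ZF, position of the most significant 1 bit)."""
    operand &= (1 << width) - 1
    if operand == 0:
        return 1, None
    return 0, operand.bit_length() - 1


print(bsf(0b101000))   # lowest 1 bit is at position 3
print(bsr(0b101000))   # highest 1 bit is at position 5
```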

Illustrative Examples

Example 1
  Linear search of an integer array
Example 2
  Selection sort on an integer array
Example 3
  Multiplication using shift and add operations
    » Multiplies two unsigned 8-bit numbers
      – Uses a loop that iterates 8 times
Example 4
  Multiplication using bit instructions
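The shift-and-add method named in Example 3 can be sketched in Python as below. This is not the listing from the text, and the function name shift_add_multiply is an assumption. It follows the slide's description: two unsigned 8-bit operands and a loop that runs exactly 8 times.

```python
def shift_add_multiply(multiplicand, multiplier):
    """Multiply two unsigned 8-bit numbers using only shifts and adds."""
    multiplicand &= 0xFF
    multiplier &= 0xFF
    product = 0
    for _ in range(8):                  # one iteration per multiplier bit
        if multiplier & 1:              # lowest multiplier bit set?
            product += multiplicand     # add the shifted multiplicand
        multiplicand <<= 1              # next bit is worth twice as much
        multiplier >>= 1
    return product & 0xFFFF             # the product always fits in 16 bits


print(shift_add_multiply(10, 25))
```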

String Representation

Two types
  Fixed-length
  Variable-length
Fixed-length strings
  Each string uses the same length
    » Shorter strings are padded (e.g., by blank characters)
    » Longer strings are truncated
  Selection of string length is critical
    » Too large ==> inefficient
    » Too small ==> truncation of larger strings

String Representation (cont’d)

Variable-length strings
  Avoids the pitfalls associated with fixed-length strings
Two ways of representation
  Explicitly storing string length (used in PASCAL)
      string    DB   'Error message'
      str_len   DW   $-string
      – $ represents the current value of the location counter
          $ points to the byte after the last character of string
  Using a sentinel character (used in C)
    » Uses NULL character
      – Such NULL-terminated strings are called ASCIIZ strings

String Instructions

Five string instructions
      LODS    LOaD String       source
      STOS    STOre String      destination
      MOVS    MOVe String       source & destination
      CMPS    CoMPare String    source & destination
      SCAS    SCAn String       destination
Specifying operands
  32-bit segments:
      DS:ESI = source operand
      ES:EDI = destination operand
  16-bit segments:
      DS:SI = source operand
      ES:DI = destination operand

String Instructions (cont’d)

Each string instruction
  Can operate on 8-, 16-, or 32-bit operands
  Updates index register(s) automatically
    » Byte operands: increment/decrement by 1
    » Word operands: increment/decrement by 2
    » Doubleword operands: increment/decrement by 4
Direction flag
  DF = 0: Forward direction (increments index registers)
  DF = 1: Backward direction (decrements index registers)
Two instructions to manipulate DF
      std    set direction flag (DF = 1)
      cld    clear direction flag (DF = 0)

Repetition Prefixes

String instructions can be repeated by using a repetition prefix
Two types
  Unconditional repetition
      rep            REPeat
  Conditional repetition
      repe/repz      REPeat while Equal / REPeat while Zero
      repne/repnz    REPeat while Not Equal / REPeat while Not Zero

Repetition Prefixes (cont’d)

  rep
      while (CX ≠ 0)
          execute the string instruction
          CX := CX - 1
      end while

CX register is first checked
  If zero, string instruction is not executed at all
  More like the JCXZ instruction

Repetition Prefixes (cont’d)

  repe/repz
      while (CX ≠ 0)
          execute the string instruction
          CX := CX - 1
          if (ZF = 0)
          then
              exit loop
          end if
      end while

Useful with cmps and scas string instructions

Repetition Prefixes (cont’d)

  repne/repnz
      while (CX ≠ 0)
          execute the string instruction
          CX := CX - 1
          if (ZF = 1)
          then
              exit loop
          end if
      end while

String Move Instructions

Three basic instructions
  movs, lods, and stos
Move a string (movs)
Format
      movs   dest_string,source_string
      movsb          ; operands are bytes
      movsw          ; operands are words
      movsd          ; operands are doublewords
First form is not used frequently
  Source and destination pointed by DS:(E)SI and ES:(E)DI, respectively

String Move Instructions (cont’d)

  movsb --- move a byte string
      ES:DI := (DS:SI)       ; copy a byte
      if (DF = 0)            ; forward direction
      then
          SI := SI+1
          DI := DI+1
      else                   ; backward direction
          SI := SI-1
          DI := DI-1
      end if
  Flags affected: none

String Move Instructions (cont’d)

Example
  .DATA
  string1   DB    'The original string',0
  strLen    EQU   $ - string1
  string2   DB    80 DUP (?)
  .CODE
  .STARTUP
      mov   AX,DS            ; set up ES
      mov   ES,AX            ; to the data segment
      mov   CX,strLen        ; strLen includes NULL
      mov   SI,OFFSET string1
      mov   DI,OFFSET string2
      cld                    ; forward direction
      rep   movsb

String Move Instructions (cont’d)

Load a String (LODS)
Copies the value from the source string at DS:(E)SI to
  AL (lodsb)
  AX (lodsw)
  EAX (lodsd)
Repetition prefix does not make sense
  It leaves only the last value in AL, AX, or EAX register

String Move Instructions (cont’d)

  lodsb --- load a byte string
      AL := (DS:SI)          ; copy a byte
      if (DF = 0)            ; forward direction
      then
          SI := SI+1
      else                   ; backward direction
          SI := SI-1
      end if
  Flags affected: none

String Move Instructions (cont’d)

Store a String (STOS)
Performs the complementary operation
Copies the value in
    » AL (stosb)
    » AX (stosw)
    » EAX (stosd)
  to the destination string at ES:(E)DI
Repetition prefix can be used to initialize a block of memory

String Move Instructions (cont’d)

  stosb --- store a byte string
      ES:DI := AL            ; copy a byte
      if (DF = 0)            ; forward direction
      then
          DI := DI+1
      else                   ; backward direction
          DI := DI-1
      end if
  Flags affected: none

String Move Instructions (cont’d)

Example: Initializes array1 with -1
  .DATA
  array1    DW    100 DUP (?)
  .CODE
  .STARTUP
      mov   AX,DS            ; set up ES
      mov   ES,AX            ; to the data segment
      mov   CX,100
      mov   DI,OFFSET array1
      mov   AX,-1
      cld                    ; forward direction
      rep   stosw

String Move Instructions (cont’d)

Apart from block initialization with stos, repeat prefixes are generally not useful with lods and stos
These two instructions are typically used in a loop to do conversions while copying
      mov   CX,strLen
      mov   SI,OFFSET string1
      mov   DI,OFFSET string2
      cld                    ; forward direction
  loop1:
      lodsb
      or    AL,20H           ; set bit 5 (maps uppercase letters to lowercase)
      stosb
      loop  loop1
  done:
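The load, convert, store pattern of this loop can be modeled in Python. The sketch assumes the strings are ASCII bytes, and the function name copy_lowercase is invented for the example. It reproduces the unconditional or with 20H, including its side effect on bytes that are not letters.

```python
def copy_lowercase(src):
    """Model of the lodsb / or AL,20H / stosb loop over a byte string."""
    dest = bytearray(len(src))
    for i, al in enumerate(src):     # lodsb: fetch the next source byte
        al |= 0x20                   # or AL,20H: force bit 5 on
        dest[i] = al                 # stosb: store into the destination
    return bytes(dest)


print(copy_lowercase(b"HeLLo"))
print(copy_lowercase(b"\x00"))       # a NULL terminator would become a blank
```

The second call is a caution rather than a feature: if the count includes the terminating NULL, the unconditional or turns it into a space, so a careful routine would convert only bytes known to be uppercase letters.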

String Compare Instruction

  cmpsb --- compare two byte strings
      Compare two bytes at DS:SI and ES:DI and set flags
      if (DF = 0)            ; forward direction
      then
          SI := SI+1
          DI := DI+1
      else                   ; backward direction
          SI := SI-1
          DI := DI-1
      end if
  Flags affected: As per cmp instruction, (DS:SI) - (ES:DI)

String Compare Instruction (cont’d)

  .DATA
  string1   DB    'abcdfghi',0
  strLen    EQU   $ - string1
  string2   DB    'abcdefgh',0
  .CODE
  .STARTUP
      mov   AX,DS            ; set up ES
      mov   ES,AX            ; to the data segment
      mov   CX,strLen
      mov   SI,OFFSET string1
      mov   DI,OFFSET string2
      cld                    ; forward direction
      repe  cmpsb
      dec   SI               ; leaves SI & DI pointing to the
      dec   DI               ; first character that differs

String Compare Instruction (cont’d)

  .DATA
  string1   DB    'abcdfghi',0
  strLen    EQU   $ - string1 - 1
  string2   DB    'abcdefgh',0
  .CODE
  .STARTUP
      mov   AX,DS            ; set up ES
      mov   ES,AX            ; to the data segment
      mov   CX,strLen
      mov   SI,OFFSET string1 + strLen - 1
      mov   DI,OFFSET string2 + strLen - 1
      std                    ; backward direction
      repne cmpsb
      inc   SI               ; leaves SI & DI pointing to the first character
      inc   DI               ; that matches in the backward direction

String Scan Instruction

  scasb --- scan a byte string
      Compare AL to the byte at ES:DI and set flags
      if (DF = 0)            ; forward direction
      then
          DI := DI+1
      else                   ; backward direction
          DI := DI-1
      end if
  Flags affected: As per cmp instruction, AL - (ES:DI)

scasw uses AX and scasd uses EAX registers instead of AL

String Scan Instruction (cont’d)

Example 1
  .DATA
  string1   DB    'abcdefgh',0
  strLen    EQU   $ - string1
  .CODE
  .STARTUP
      mov   AX,DS            ; set up ES
      mov   ES,AX            ; to the data segment
      mov   CX,strLen
      mov   DI,OFFSET string1
      mov   AL,'e'           ; character to be searched
      cld                    ; forward direction
      repne scasb
      dec   DI               ; leaves DI pointing to e in string1
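The interaction of repne with scasb in this example can be traced with a small Python model. It is a sketch under assumptions: repne_scasb is a name invented here, memory is a plain bytes object indexed by DI, the direction flag is taken to be clear, and ZF is reported as 0 when CX starts at zero (the real instruction would leave ZF unchanged in that case).

```python
def repne_scasb(memory, di, cx, al):
    """Model repne scasb with DF = 0; returns (DI, CX, ZF) after the instruction."""
    zf = 0
    while cx != 0:
        zf = int(memory[di] == al)   # scasb: compare AL with the byte at ES:DI
        di += 1                      # the index is advanced after every compare
        cx -= 1
        if zf == 1:                  # repne stops as soon as a match sets ZF
            break
    return di, cx, zf


string1 = b"abcdefgh\x00"
di, cx, zf = repne_scasb(string1, 0, len(string1), ord("e"))
print(di, cx, zf)
print(chr(string1[di - 1]))          # dec DI points back at the matching byte
```

The model shows why the dec DI is needed: the index register has already moved one byte past the match by the time the repetition stops.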

String Scan Instruction (cont’d)

Example 2
  .DATA
  string1   DB    ' abc',0
  strLen    EQU   $ - string1
  .CODE
  .STARTUP
      mov   AX,DS            ; set up ES
      mov   ES,AX            ; to the data segment
      mov   CX,strLen
      mov   DI,OFFSET string1
      mov   AL,' '           ; character to be searched
      cld                    ; forward direction
      repe  scasb
      dec   DI               ; leaves DI pointing to the first non-blank character a

Illustrative Examples

LDS and LES instructions
String pointer can be loaded into DS/SI or ES/DI register pair by using lds or les instructions
Syntax
      lds   register,source
      les   register,source
  register should be a 16-bit register
  source is a pointer to a 32-bit memory operand
register is typically SI in lds and DI in les

Illustrative Examples (cont’d)

Actions of lds and les
  lds
      register := (source)
      DS := (source+2)
  les
      register := (source)
      ES := (source+2)
Pentium also supports lfs, lgs, and lss to load the other segment registers

Illustrative Examples (cont’d)

Seven popular string processing routines are given as examples in string.asm (listed in the text)
  str_len
  str_mov
  str_cpy
  str_cat
  str_cmp
  str_chr
  str_cnv

Indirect Procedure Call

Direct procedure calls specify the offset of the first instruction of the called procedure
In indirect procedure call, the offset is specified through memory or a register
  If BX contains pointer to the procedure, we can use
      call  BX
  If the word in memory at target_proc_ptr contains the offset of the called procedure, we can use
      call  target_proc_ptr
These are similar to direct and indirect jumps