1 ECE 371 Microprocessors Chapter 5 Microprocessor Assembly Language 2 Herbert G. Mayer, PSU Status 10/2/2015 For use at CCUT Fall 2015.

2 Syllabus
Motivation
Integer Multiply
Integer Divide
Conditional Branch
Loop Constructs
Memory Access
Call and Return
Procedure PutDec
Summary
Appendix
Bibliography

3 Motivation
In another handout about x86 assembly language we cover modules, character and string output, and writing assembler procedures.
Here we cover integer arithmetic and loops, and develop a more complex program that outputs signed integer numbers.
Since integer multiplication can generate results that are twice as long in bits as either source operand, the machine instructions for integer multiply (and, conversely, for integer divide) must make special provisions for the length of their operands.

4 X86 Integer Multiply and Divide

5 Integer Multiply
Our first project is 16-bit signed integer multiplication.
To track every detail of the result, including overflow and the sign of the result, we use the small x86 machine model, which uses 16-bit operands.
In that model, the most negative integer is -32768 and the largest is 32767.
The same principles apply to the newer model with 64-bit precision.

6 Integer Multiply
Under Microsoft's assembler the opcodes are mul for unsigned and imul for signed integer multiplication.
One operand is the ax register; it is always implied.
The other operand may be a memory location or another register.
A literal operand is not permitted in the small mode, i.e. on the 16-bit version of the architecture; it is permitted on 32-bit.
The product is in the register pair ax and dx, with the low-order 16 bits in ax and the high-order 16 bits in dx.
There is also a byte version of the multiply: the implied operand is in al, the other operand is a byte memory location or a byte register, and the product is in ax.
In the code sample below we multiply the literal 10, moved into register bx, with the contents of the second, implied operand: register ax.

7 Integer Multiply
; integer multiplication on x86, small mode:
; multiply literal 10 with contents of ax
; ax holds a copy of memory location MAX
        mov     bx, 10          ; a literal is in bx
        mov     ax, MAX         ; signed word at location MAX
        imul    bx              ; product is in ax + dx
                                ; hi order 16 bits in dx
...
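
To make the register-pair convention concrete, here is a minimal C sketch (not part of the original slides) that models what the one-operand imul leaves in ax and dx. The names RegPair, imul16, and product_fits_in_word are hypothetical and chosen only for illustration; the sketch assumes a C99 compiler with <stdint.h>.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the dx:ax register pair after a one-operand imul. */
typedef struct {
    uint16_t ax;   /* low-order 16 bits of the product  */
    uint16_t dx;   /* high-order 16 bits of the product */
} RegPair;

/* Multiply two signed words and keep the full 32-bit product, split the
 * same way the slides describe: low half in ax, high half in dx. */
RegPair imul16(int16_t ax, int16_t operand)
{
    int32_t product = (int32_t)ax * (int32_t)operand;
    uint32_t bits = (uint32_t)product;     /* two's complement bit pattern */
    RegPair r;
    r.ax = (uint16_t)(bits & 0xFFFFu);
    r.dx = (uint16_t)(bits >> 16);
    /* Sanity check: the two halves recombine to the full product. */
    assert((((uint32_t)r.dx << 16) | r.ax) == bits);
    return r;
}

/* Nonzero when the product fits in ax alone, i.e. when dx is nothing but
 * the sign extension of ax. */
int product_fits_in_word(RegPair r)
{
    uint16_t sign_fill = (r.ax & 0x8000u) ? 0xFFFFu : 0x0000u;
    return r.dx == sign_fill;
}
```

With MAX = 32767 the product 327670 no longer fits in one word, so dx holds 4 and ax holds 0fff6h; with a small value such as -3 the product -30 fits, and dx is just 0ffffh, the sign extension of ax.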

8 Integer Divide: cwd
Just as the integer multiply creates a signed double-word result in the register pair ax and dx, the integer divide instruction assumes the numerator to be in the register pair ax and dx.
If the numerator happens to be a single-precision operand, it must be extended explicitly.
The denominator may be in a register or in memory.
To create a sign-extended double-register operand in the ax-dx pair from the single-precision operand in ax, the x86 architecture provides the convert-word-to-double-word instruction cwd.
cwd has no explicit operand.
The assumed operand is the value in ax; ax is unchanged.
The sign of ax is extended into dx.

9 Integer Divide: cwd
; memory location B_word holds operand
; that operand is copied, i.e. moved, into register ax
; to be used as numerator in divide
; but first convert single- to double-precision
        mov     ax, B_word      ; signed word at B_word in ax
        cwd                     ; convert word to double-word
                                ; sign of ax extended in dx
; ditto with byte-sized operands
        mov     al, a_byte      ; signed byte a_byte into al
        cbw                     ; convert byte to word
; now the numerator can be used as operand in divide
...

10 Integer Divide
Integer divide needs 2 operands.
The numerator is in ax, extended into dx to form a double word.
The other operand may be a memory location or a register.
Opcode div is for unsigned and idiv for signed integer division.
In the example, assume the numerator is in memory location A_wd.
The denominator is at memory location B_wd.
The quotient ends up in ax, and the remainder in dx.

11 Integer Divide
; signed integer divide on x86:
; assume operands to be in locations A_wd and B_wd
;
        mov     ax, A_wd        ; signed word at A_wd in ax
        cwd                     ; sign of A_wd 16 times in dx
        idiv    B_wd            ; quotient A_wd/B_wd in ax
                                ; remainder A_wd/B_wd in dx
                                ; flags are undefined after idiv:
                                ; test ax explicitly to see: negative?
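
The cwd and idiv steps can be mirrored in a short C sketch (an illustration under stated assumptions, not code from the slides). The names DivResult, cwd_high_word, and idiv16 are hypothetical. The sketch relies on the C99 rule that integer division truncates toward zero and that the remainder takes the sign of the numerator, which is also how idiv behaves.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the result registers of a word-sized idiv. */
typedef struct {
    int16_t quotient;    /* ends up in ax */
    int16_t remainder;   /* ends up in dx */
} DivResult;

/* cwd: the value that dx receives when the sign of ax is extended into it. */
uint16_t cwd_high_word(int16_t ax)
{
    return (ax < 0) ? 0xFFFFu : 0x0000u;
}

/* idiv of a sign-extended word numerator by a word denominator. */
DivResult idiv16(int16_t numerator, int16_t denominator)
{
    /* On real hardware both cases below raise a divide-error exception. */
    assert(denominator != 0);
    assert(!(numerator == INT16_MIN && denominator == -1)); /* 32768 does not fit in ax */

    DivResult r;
    r.quotient  = (int16_t)(numerator / denominator);  /* truncates toward zero */
    r.remainder = (int16_t)(numerator % denominator);  /* sign of the numerator */
    return r;
}
```

For example, -7 divided by 2 yields quotient -3 and remainder -1, while 7 divided by -2 yields quotient -3 and remainder 1.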

12 Memory Access On the x86 Microprocessor

13 Memory Access
Key components of any computer architecture are the processor and memory.
Memory is referenced implicitly and explicitly by instructions that read data from and write data to memory.
Explicit accesses are called loads (for reading) and stores (for writing).
Assemblers provide explicit instructions for these operations.
Implicit memory accesses occur in machine instructions whose operands may be memory cells.
On RISC systems these implicit references generally do not exist; instead, all memory traffic is funneled through loads and stores.

14 Memory Access
In an assembler program, memory locations (both for data and code) are generally referred to symbolically.
This improves readability and allows for relocation, i.e. the linker and loader have a certain degree of freedom of placement in physical memory.
However, explicit memory addressing via hard-coded numbers is also possible; for example, on a hypothetical machine
        ld      r1, 1000
could mean: load the word in memory location 1000 into register r1.
Some assembly languages provide syntax to render the indirection explicit; for example, the load operation
        ld      r1, (1000)
uses parentheses to allude to this indirection.

15 Memory Access
A common paradigm is to reference symbolic memory names (labels) indirectly.
This means that the label value (the memory address) is not what is wanted, but the contents of memory at that label location.
For example, if the offset of the data name foo is 10000, then the operation
        ld      r2, foo
does generally not mean to load the value 10000 into a register.
Instead, foo is used indirectly: the word at that address is referenced and loaded into register r2.
When the address itself is wanted, the IBM 370 architecture, for example, uses a special type of load, called load address, while the masm assembler for the x86 architecture uses the seg or offset operator to indicate that indirection is not wanted.
Instead, the segment portion of the address (seg) or the offset portion of the address (offset) is wanted.

16 Memory Access
During indirect memory references it is sometimes desirable to index.
Indexing means modifying an otherwise fixed memory address.
Typically, such a modifier resides in a register.
If the value in that register changes from iteration to iteration, the indexing operation can access memory in sequential order, say in increasing (or decreasing) fashion.
The fixed step between such sequentially accessed addresses is known as the stride.
For example, if r2 is a register loaded with a value (say, 2), then the instruction
        ld      r1, foo[r2]
means: fetch the word located 2 bytes further in memory than the offset expressed by foo, and load that word into register r1.

17 Memory Access
In addition to indexing through a register, many architectures (and thus their assemblers) allow the offset to be modified by an additional literal index.
The literal value is encoded into the instruction and is referred to as an immediate operand.
Immediate values are usually small, since the architectures often provide just a few bits to hold them.
On some architectures this immediate operand may be signed; on others only unsigned literal modifiers are possible.

18 Memory Access
Memory holds the data being manipulated.
Intermediate results must also be stored somewhere.
Registers usually are in short supply, contrasted with the size of memory.
Before a computation can complete, data must be brought from peripherals to memory.
After the computation, data must be sent from memory to peripherals, e.g. printers.
Often a cache helps overcome the speed bottleneck of memory accesses.

19 Memory Access Indexing on x86
Indirect memory references are the default semantics in Microsoft's assembler (masm) for the x86 architecture.
In both nasm and masm, indirection can also be expressed explicitly via the [ ] operator pair.
For example, the move instruction (this mov is really a load):
data_seg1 segment
foo     dw      -1999, 0, ...
data_seg1 ends
...
        mov     ax, foo         ; indirection implied in masm
        mov     ax, [foo]       ; explicit indirection in masm

20 Memory Access Indexing on x86
The above mov code loads the word at data segment location foo into register ax, regardless of whether the [ ] operator is used.
In the nasm assembler the instruction
        mov     ax, foo         ; load offset of address in nasm
loads the address of the memory location into register ax, while the nasm instruction
        mov     ax, [foo]       ; loads contents at address in nasm
loads the contents of memory location foo into ax; assembler differences can be very subtle!

21 Memory Access Indexing on x86
A handy programming tool that makes indexing convenient is the ability to modify address labels by registers, literals, or a combination of both.
Clearly, the underlying computer architecture must support this, i.e. there must be instructions that allow indexed or multiply indexed load and store operations.
Some architectures (including IBM 360 and x86) allow multiple registers to be used to modify (to index) the address label.
These registers are referred to as base registers and index registers.
Note that the term base register often means that the base address sits in that register.

22 Memory Access Indexing on x86
However, in the x86 architecture, as long as an address expression includes a data memory label, that label is the base address, with the following provisos.
If l1, l2 are address labels, and c1, c2 are numeric literal constants, then:
l1 + c1         ; is address of location l1 plus c1
l1 - c2         ; is address of location l1 minus c2
l1 - l2         ; is a pure numeric value: l1 - l2
[l2 + c1]       ; is the memory content at l2 + c1
[l1 - c2]       ; is the memory content at l1 - c2
l1 + l2         ; is illegal on x86

23 Memory Access Indexing on x86
On orthogonally designed architectures, any user-visible register is usable as an index (or base) register.
Practical limitations often forced compromises.
For example, on the x86 architecture, only certain registers can be used for indexing, listed below:
address expression + one of ( bx, bp, si, di ) on x86
An address expression indexed by one (or even 2) of these registers may also contain a literal modifier, making the indexing operation practical and easy to use for array indexing.
Note that it is possible to use up to 2 index and base registers in a single address expression, but only with the following restriction:
address expression + ( bx or bp ) + ( si or di ) on x86

24 Memory Access Indexing on x86
An address expression such as [min_data+bx+si+2] is allowed, while the expression [min_data+bx+bx] is not permitted due to multiple uses of the bx register.
These samples assume that min_data is a legal label in the data segment.
A complete expression with all typical arithmetic operators is allowed in x86 assemblers, as long as the literal part is computable at the time of assembly and reducible to a single numeric value.
Thus, an expression like [chars+bx+si+2*3+4] is legal, provided that chars is a legal data label.

25 Memory Access Implicit Segment Register
The data declared in the data segment below are the hex digits '0'..'f'.
The user-designed Put_Char macro uses system service call 02h for single-character output.
bx is used as index register.
Note that only base and index registers can be used for this purpose, e.g. not cx.
Memory operands (data labels) are used indirectly.
Indirection is expressed explicitly via the [ and ] operators.
The brackets are not strictly needed for memory operands in Microsoft software, as indirection is the most common case.
Since they are needed in nasm and on Unix systems, we recommend using the [ ] operator.

26 Memory Access Implicit Segment Register
Explicit brackets that allude to indirection, such as [chars], also improve readability.
Note that an indirect offset and an index register are both allowed.
Either, both, or neither may be modified by an immediate operand.
Immediate operands are limited to 16 bits in size.
The order of offset and index is arbitrary.
The output of the program below is: hm02012452267

27 Memory Access
; Purpose: memory references, indexing
; HM for use at CCUT
start   macro                   ; no parameters
        mov     ax, @data       ; @data predefined macro
        mov     ds, ax          ; now data segment reg set
        endm                    ; end macro: start
termin  macro   ret_code        ; 1 parameter: return code, 0 = no error
        mov     ah, Term_Code   ; we want to terminate, ah + al
        mov     al, ret_code    ; any errors? If /= 0
        int     21h             ; call DOS for help
        endm                    ; end macro: termin
Put_Char macro  char            ; output passed character
        mov     ah, Cout_Code   ; tell DOS: Char out
        mov     dl, char        ; char into required byte reg
        int     21h             ; and call DOS
        endm                    ; end macro Put_Char
Cout_Code = 2h
Term_Code = 4ch
        .model  small
        .data
chars   db      "0123456789abcdef"

28 Memory Access
        .code
main:   start
        mov     bx, 2           ; index char '2' in chars
        mov     cl, 'h'
        Put_Char cl             ; o.k. since cl holds char
        Put_Char 'm'
        Put_Char chars          ; not good programming
        Put_Char chars[bx]      ; shows partial indirection
        Put_Char [chars]        ; explicit
        Put_Char [chars+1]      ; explicit
        Put_Char [chars+bx]
        Put_Char chars[bx+2]
        Put_Char [chars+bx+3]
        Put_Char [bx][chars]
        Put_Char [chars]+[bx]
        Put_Char [bx+4][chars]
        Put_Char [bx+3][chars+2]
done:   termin  0               ; no errors if we reach
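
To see why the program prints 02012452267 after the two literal characters, the address arithmetic can be replayed in C. This is a hedged sketch, not part of the original slides: it assumes every operand form reduces to the sum label displacement + bx + literal, and the names operand_at and expected_output are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Same contents as the chars data label in the listing above. */
static const char chars[] = "0123456789abcdef";

/* Effective offset = displacement from chars + index register + literal. */
char operand_at(size_t displacement, size_t bx, size_t literal)
{
    size_t offset = displacement + bx + literal;
    assert(offset < sizeof chars - 1);   /* stay inside the 16 digits */
    return chars[offset];
}

/* Fills out[] with the characters printed by the 11 memory-operand
 * Put_Char calls, in program order, followed by a terminating NUL. */
void expected_output(char out[12])
{
    const size_t bx = 2;                 /* mov bx, 2 */
    size_t i = 0;
    out[i++] = operand_at(0, 0, 0);      /* chars            */
    out[i++] = operand_at(0, bx, 0);     /* chars[bx]        */
    out[i++] = operand_at(0, 0, 0);      /* [chars]          */
    out[i++] = operand_at(1, 0, 0);      /* [chars+1]        */
    out[i++] = operand_at(0, bx, 0);     /* [chars+bx]       */
    out[i++] = operand_at(0, bx, 2);     /* chars[bx+2]      */
    out[i++] = operand_at(0, bx, 3);     /* [chars+bx+3]     */
    out[i++] = operand_at(0, bx, 0);     /* [bx][chars]      */
    out[i++] = operand_at(0, bx, 0);     /* [chars]+[bx]     */
    out[i++] = operand_at(0, bx, 4);     /* [bx+4][chars]    */
    out[i++] = operand_at(2, bx, 3);     /* [bx+3][chars+2]  */
    out[i] = '\0';
}
```

The point of the exercise is that the different bracket placements all reduce to the same sum, so the assembler notation is a matter of readability rather than semantics.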

29 Memory Access Explicit Segment Register
Again the data in the data segment is the character string "0123456789abcdef".
Macros as in the earlier example.
Use bx as index register again.
Note: no implicit segment register is used.
Instead, ds is used explicitly.
Note the syntax: seg:offset
The output of the program below is: h02012452267

30 Memory Access
; Source file: mem2.asm; use explicit segment reg
; Purpose: memory ref, indexing with explicit ds:
        .model  small
        .data
chars   db      "0123456789abcdef"
        .code
main:   start
        mov     bx, 2           ; index '2' in chars

31 Memory Access
        Put_Char 'h'
        Put_Char ds:chars       ; not good programming
        Put_Char ds:chars[bx]   ; only partial indirect
        Put_Char ds:[chars]     ; explicit
        Put_Char ds:[chars+1]   ; explicit
        Put_Char ds:[chars+bx]
        Put_Char ds:chars[bx+2]
        Put_Char ds:[chars+bx+3]
        Put_Char ds:[bx][chars]
        Put_Char ds:[chars]+[bx]
        Put_Char ds:[bx+4][chars]
        Put_Char ds:[bx+3][chars+2]
...

32 Word Access
Goal: to reference memory as words.
Output these integers as decimal numbers.
Use the yet to be designed PutDec() assembler procedure to print decimal numbers.
Macros start and termin as before.
Use register bx again as index register.
The data segment defines some decimal and some hex literals.
Data label nums defines an array of integer words.

33 Word Access
Observe that the index register is modified in steps of 2.
The stride of a word is 2 on x86!
Note that words initialized via hex literals are still printed as signed integers.
Intended output shown below:
Output: 511 512 512 513 1025 -8531 -8531 -17730 -17730

34 Word Access
; Purpose: word memory references, indexing
start   macro                   ; no parameters
        mov     ax, @data       ; @data predefined macro
        mov     ds, ax          ; now data segment reg set
        endm                    ; end macro: start
termin  macro   ret_code        ; 1 parameter: return code, 0 = no error
        mov     ah, Term_Code   ; terminate, ah + al
        mov     al, ret_code    ; any errors? If /= 0
        int     21h             ; call DOS for help
        endm                    ; end macro: termin
Term_Code = 4ch
        .model  small
        .data
nums    dw      511, 512, 513, 1023, 1024, 1025
w1      dw      0deadh
w2      dw      0beefh
w3      dw      0c0edh
w4      dw      0babeh

35 Word Access, Cont'd
        .code
        extrn   PutDec : near
main:   start
        mov     bx, 2           ; use bx as index register
        mov     ax, nums
        call    PutDec          ; output is: 511
        mov     ax, [nums + 2]
        call    PutDec          ; output is: 512
        mov     ax, [nums + bx]
        call    PutDec          ; output is: 512
        mov     ax, [nums][bx + 2]
        call    PutDec          ; output is: 513
        mov     ax, [nums+2][bx+6]
        call    PutDec          ; output is: 1025

36 Word Access, Cont'd
        mov     nums, 0deadh
        mov     ax, nums
        call    PutDec          ; output is: -8531
        mov     ax, w1
        call    PutDec          ; output is: -8531
        mov     ax, [w2+bx+2]
        call    PutDec          ; output is: -17730
        mov     ax, [w1+6]
        call    PutDec          ; output is: -17730
done:   termin  0               ; no errors if we reach
        end     main            ; start here!
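
The word-sized accesses can be checked with a small C model (a sketch under stated assumptions, not code from the slides). It assumes the data segment lays out nums, w1, w2, w3, w4 contiguously in declaration order, which the program itself relies on when it reaches w4 via [w1+6]. The names data_words, word_at, and the *_OFFSET constants are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout of the data segment: six nums words, then w1..w4. */
static const uint16_t data_words[] = {
    511, 512, 513, 1023, 1024, 1025,     /* nums          */
    0xdead, 0xbeef, 0xc0ed, 0xbabe       /* w1 w2 w3 w4   */
};

#define NUMS_OFFSET 0u    /* byte offset of nums */
#define W1_OFFSET   12u   /* 6 words * 2 bytes   */
#define W2_OFFSET   14u

/* Read the word at a byte offset and interpret it as a signed 16-bit
 * integer, the way PutDec prints it. */
int16_t word_at(size_t byte_offset)
{
    assert(byte_offset % 2 == 0);        /* word stride on x86 is 2 bytes */
    assert(byte_offset / 2 < sizeof data_words / sizeof data_words[0]);
    uint16_t raw = data_words[byte_offset / 2];
    /* Two's complement: patterns with the top bit set are negative. */
    return (int16_t)(raw >= 0x8000u ? (int32_t)raw - 65536 : (int32_t)raw);
}
```

With bx = 2, the operand [nums+2][bx+6] is byte offset 10, the sixth word (1025); [w2+bx+2] and [w1+6] both land on w4, whose bit pattern 0babeh reads as -17730, and 0deadh reads as -8531.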

37 Loop Constructs

38 Comparison
By default, a machine executes one instruction after another, in sequence.
That sequence can be changed via branches.
Branches are also known as jumps, depending on the manufacturer.
Unconditional branches transfer control to their defined destination.
Conditional branches make this change in control flow only if their associated condition holds.

39 Comparison
How does the microprocessor "know" when or whether a condition is true?
The CPU has flags that record this condition, and instructions that test for the condition.
Typical conditions are zero, negative, positive, overflow, carry, etc.
Symbolic flag names include CF, ZF, and OF.
Conditional branches, conditional calls, etc. test these flags.

40 Conditional Branch
-- high-level source program snippet
if a > b then
  max := a;
else
  max := b;
end if;
; corresponding x86 assembler snippet:
        mov     ax, [a]         ; memory location a
        cmp     ax, [b]         ; memory location b
        jle     b_is_max        ; jump to b_is_max if a <= b
        mov     [max], ax
        jmp     end_if          ; jump around else
b_is_max:                       ; this is else
        mov     ax, [b]
        mov     [max], ax
end_if: ...
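
How cmp and jle cooperate through the flags can be sketched in C. This is an illustrative model, not part of the original slides: the names Flags, cmp16, jle_taken, and max16 are hypothetical. It uses the documented x86 rule that jle is taken when ZF is set or when SF differs from OF.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the four flags that a 16-bit cmp sets. */
typedef struct {
    int zf;   /* zero:     a - b == 0                      */
    int sf;   /* sign:     top bit of the 16-bit result    */
    int of;   /* overflow: signed result did not fit       */
    int cf;   /* carry:    unsigned borrow (a < b)         */
} Flags;

/* cmp a, b: subtract b from a, keep only the flags. */
Flags cmp16(int16_t a, int16_t b)
{
    uint16_t ua = (uint16_t)a, ub = (uint16_t)b;
    uint16_t diff = (uint16_t)(ua - ub);
    Flags f;
    f.zf = (diff == 0);
    f.sf = (diff & 0x8000u) != 0;
    f.cf = (ua < ub);
    /* Signed overflow: operands differ in sign and the result's sign
     * differs from the sign of a. */
    f.of = (((ua ^ ub) & (ua ^ diff)) & 0x8000u) != 0;
    return f;
}

/* jle: taken when a <= b in the signed sense. */
int jle_taken(Flags f)
{
    return f.zf || (f.sf != f.of);
}

/* The if/else from the slide, driven by the flag model. */
int16_t max16(int16_t a, int16_t b)
{
    int16_t result = jle_taken(cmp16(a, b)) ? b : a;
    assert(result >= a && result >= b);
    return result;
}
```

The extreme pair -32768 and 32767 shows why SF alone is not enough: the subtraction overflows, and only the comparison of SF with OF gives the right answer.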

41 Loops
Operations are performed repeatedly via loops.
In higher level languages, loops are hand-manufactured via conditions and branches (If Statement and Gotos) or use language-defined structured loop statements.
The latter include Repeat, While, and For Statements.
We introduce the x86 loop instruction.
In our first example the loop body is repeated until a particular value (sentinel) is found.
A loop body entered unconditionally is akin to a Repeat Statement.

42 x86 Loop
Another assembler example knows the iteration count at the time of assembly, hence the loop instruction provided by the x86 can be used.
A sample x86 loop instruction follows:
        loop    next            ; is executed: if --cx then goto next;
This loop can be characterized as a For Statement.
The third example does not know the number of iterations at the time of assembly.
Hence, before entering the loop body the first time, a check must be made whether the loop count is 0.
If so, the body is bypassed; else the body is entered and executed a countable number of times.
Thus, the loop resembles a C-style For Statement.

43 x86 Loop
As we saw, loops allow the repeated execution of their bodies, based on a condition or on a defined number of steps, which in effect defines that condition.
On the x86 architecture, the cx register functions as the counter for counted loops executed with the loop opcode.
The loop instruction assumes the loop count to be in cx.
Each execution of the loop opcode first decrements the value in cx by 1.
If cx is then not 0, execution continues at the loop label.
Else execution continues at the next instruction after the loop opcode.

44 x86 Loop
; demonstrate the x86 "loop" instruction
; assumes count to be in cx
; when loop is executed: decrement cx
; once cx is 0, continue at instruction after loop
; else branch to label
; place 10 into cx to define loop steps
        mov     cx, 10
again:                          ; a label! Note the colon :
        mov     ax, cx          ; print value in ax
        call    PutDec          ; via PutDec procedure
        loop    again           ; check, if need to loop more
; prints the numbers 10 down to 1, but NOT 0

45 First Loop
We define a string in the data segment, all '0'..'f' digits.
The data area is named chars and is used as address (data offset).
The sentinel for loop termination is '#'.
Register bx is used as index register.
Note that only bx, si, di, and bp can be used for indexing on x86.
Practice the cmp instruction, which compares by subtracting and then sets flags.
Get to know the conditional (jcc) and unconditional (jmp) jumps.
See the use of labels as destinations of jumps.
Output of program is: 0123456789abcdef

46 First Loop
; Source file: loop1.asm
; Purpose: use, syntax of indexing array w. sentinel
Start   macro                   ; no parameters
        mov     ax, @data       ; @data predefined macro
        mov     ds, ax          ; now data segment reg set
        endm                    ; end macro: start
Termin  macro   ret_code        ; 1 parameter: return code
        mov     ah, 4ch         ; terminate: set ah + al
        mov     al, ret_code    ; any errors? If /= 0
        int     21h             ; call sys sw for help
        endm                    ; end macro: termin
Char_Out = 2h
Sentin  = '#'
        .model  small
        .data
chars   db      "0123456789abcdef", Sentin

47 First Loop
        .code
main:   start
        mov     ah, Char_Out    ; set up ah for sys
        mov     bx, 0           ; to index string, init 0
next:   mov     dl, chars[bx]   ; find next char
        inc     bx              ; increment index reg bx
        cmp     dl, Sentin      ; found sentinel?
        je      done            ; yep, so stop
        int     21h             ; nope, so print it
        jmp     next            ; try next; could be sentinel
done:   termin  0               ; no errors if we reach
        end     main            ; start here!

48 Second Loop
Again we define a character string in the data segment, all '0'..'f' hex digits.
This time we use no sentinel.
Assume that the loop is executed exactly 16 times, and that this count is known a priori, i.e. a countable loop.
Again we use register bx as index register.
Learn the loop instruction, which combines tracking the loop count with a conditional branch.
The loop instruction on x86 subtracts 1 from cx each time it is executed.
If cx is then 0, fall through; else branch to the target, which is part of the instruction.
Output of program is: 0123456789abcdef

49 Second Loop
; Source file: loop2.asm
; Purpose: use, syntax of indexing char array
; loop is "countable": we know # of elements
; before start of loop; we know at assembly time
... same macros start, termin
Char_Out = 2h
Num_El  = 10h                   ; 16 elements in chars array[]
        .model  small
        .data
chars   db      "0123456789abcdef"

50 Second Loop
        .code                   ; abbreviation
main:   start
        mov     ah, Char_Out    ; set up ah for system call
        mov     bx, 0           ; initial index off 'chars'
        mov     cx, Num_El      ; know # iterations a priori
next:   mov     dl, chars[bx]   ; find next char
        inc     bx              ; increment index register
        int     21h             ; print it
        loop    next            ; try next one; could be 0: end

51 Third Loop
Again we define a character string in the data segment, all '0'..'f' hex digits, with no sentinel.
Assume the iteration count is not known a priori.
Again use register bx as index register.
We must check whether cx is less than or equal to zero.
Caution: if cx were negative, this would be bad news, as looping would be excessive: the loop instruction only tests cx for zero and does not care about its sign.
Good that x86 provides a special opcode, jcxz.
The loop instruction on x86 subtracts 1 from cx; cx should start with a positive value.
New instruction jcxz: if cx is already zero at the start, branch and don't enter the loop body.
Output of program is: 0123456789abcdef

52 Third Loop
; Source file: loop3.asm
; Purpose: use, syntax of indexing char array
        .model  small
        .data
chars   db      "0123456789abcdef"
        .code
main:   start
        mov     ah, Char_Out    ; set up ah for DOS call
        mov     bx, 0           ; initial index off 'chars'
; assume that # read at run time
; fake this reading by brute-force setting
; but the point is: The # could be non-positive!

53 Third Loop
        mov     cx, 16          ; pretend we read value of cx
        cmp     cx, 0           ; then test if cx < 0
        jl      done_neg        ; if it is, jump
        jcxz    done_zero       ; if it is zero, jump also
; if we reach this: cx is positive
next:   mov     dl, [chars][bx] ; find next char
        inc     bx              ; increment index register
        int     21h             ; output next character
        loop    next            ; try next one; could be end
done:   termin  0               ; no errors if we reach
done_neg: termin 1              ; another error code. Not 0
done_zero: termin 2             ; and yet another error
        end     main            ; start here!
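
Why the guards before the loop matter can be shown with a small C model of the loop instruction (a sketch, not code from the slides). The function names loop_iterations and guarded_loop_iterations are hypothetical. The model assumes the body always runs before the first loop opcode is reached, as in the listing above.

```c
#include <assert.h>
#include <stdint.h>

/* Counts how often the body runs when the x86 loop instruction is entered
 * with the given value in cx: decrement first, branch back while cx != 0. */
uint32_t loop_iterations(uint16_t cx)
{
    uint32_t body_runs = 0;
    do {
        body_runs++;                 /* the loop body                     */
        cx = (uint16_t)(cx - 1);     /* loop: decrement cx ...            */
    } while (cx != 0);               /* ... and branch back while cx != 0 */
    assert(body_runs <= 65536u);     /* a 16-bit counter cannot do more   */
    return body_runs;
}

/* The guarded version from loop3.asm: a negative or zero count bypasses
 * the body entirely. */
uint32_t guarded_loop_iterations(int16_t count)
{
    if (count < 0)                   /* cmp cx, 0 / jl done_neg  */
        return 0;
    if (count == 0)                  /* jcxz done_zero           */
        return 0;
    return loop_iterations((uint16_t)count);
}
```

Entering the unguarded loop with cx = 0 wraps the counter around and runs the body 65536 times; a negative word such as -1 (0ffffh) runs it 65535 times. The two guards reduce both cases to zero iterations.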

54 X86 Call and Return

55 Call and Return
High level programming requires logical (and physical) modularization to render the overall programming task manageable.
The key tool for logical modularization is the creation of procedures (in some languages called subroutines, functions, etc.) with their associated calls and returns.
This section introduces calling and returning, also known as context switching.
We'll use the term procedure generically to mean procedure, function, or subroutine, unless the particular meaning is needed.

56 Call and Return
It is not feasible to express a complete program as a single procedure when the program is large.
Logical modules reduce the complexity of the programming task.
Parameterization allows re-use and reincarnation of the same procedure.
A High Level language should hide the detail of the call/return mechanism; not so in assembler.
For example, the manipulation of the stack through push and pop operations should be hidden.
However, some aspects of the context switch should be reflected in a High Level language, in particular the call and return.

57 Call and Return
As in High-Level language programs, procedures are a key syntax tool to modularize.
Physical modules (procedures) encapsulate data and actions that belong together.
Physical modules, delineated by the proc and endp keywords, are the language tool to define modules.
Procedures can be called via the call opcode, parameterized by the procedure name, e.g.:
        call    PutDec
Procedures return via the ret instruction.
If they return a result to the calling environment, we refer to them as functions.
A return ends up at the instruction after the call.

58 Call and Return Stack Frame
The Stack Pointer identifies the top of the current stack, and also the top of the current Stack Frame.
The Stack Pointer may vary often during an invocation.
The Stack Pointer changes upon call, return, push, pop, and explicit assignments.
The Base Pointer does not vary during a call.
The Base Pointer is set up only once, at the start of a call.
The Base Pointer is changed again at return, to the value of the previous Base Pointer, the dynamic link.
Parameters can be addressed relative to the Base Pointer in one direction.

59 Call and Return Stack Frame
Locals (and temps) can be addressed relative to the Base Pointer in the other direction.
It is possible to save, i.e. to do without, the Base Pointer, which is useful when registers are scarce, as on x86.
However, this scheme is difficult, since the compiler (or human programmer) must keep the dynamic size of the stack in mind at every moment of code generation; it is not discussed here.

60 Call and Return Stack Frame

61 Call and Return Before Call
Push the actual parameters: this changes the stack.
Track the size of the actual parameters pushed.
In most languages the actual size is fixed; not so in C.
The Base Pointer still points to the Stack Marker of the caller.
After the last actual parameter is pushed, one flexible part of the Stack Frame is complete.

62 Call and Return Call
Push the instruction pointer (ip).
The address of the instruction after the call must be saved as return address.
This identifies the beginning of the Stack Marker.
Set the instruction pointer (ip, AKA pc) to the address of the destination (callee).
The x86 architecture has 24 flavors of call instructions.

63 Call and Return Procedure Entry
Push the Base Pointer; this is the dynamic link.
Set the Base Pointer to the value of the Stack Pointer.
Now the new Stack Frame is being addressed.
The fixed part of the stack, the Stack Marker, is being built.
Allocate space for local variables, if any.
This establishes another area of the Stack Frame that is variable in size.

64 Call and Return Return
Pop locals and temps off the stack.
This frees the second variable-size area from the Stack Frame.
Pop the registers to be restored.
Pop the top of stack value into the Base Pointer (bp).
This uses the Dynamic Link to reactivate the previous Stack Frame.
Pop the top of stack value into the instruction pointer.

65 Call and Return Return
This sets the ip register back to the instruction after the call.
The return instruction does this!
Either the caller or a suitable argument of the return instruction frees the space allocated for actual parameters.
Note that the x86 architecture allows an argument to the ret instruction, which frees that number of bytes from the stack.

66 Call and Return Code
1a. Procedure Entry, No Locals, Save Regs
        push    bp              ; save dyn link in Stack Marker
        mov     bp, sp          ; establish new Frame: point to S.M.
        push    ax              ; save ax if needed by callee, opt.
        push    bx              ; ditto for bx

67 Call and Return Code
1b. Procedure Exit, No Locals, Restore Regs
        pop     bx              ; restore bx if was used by callee
        pop     ax              ; ditto for ax
        pop     bp              ; must find back old Stack Frame
        ret     args            ; ip to instruction after call
                                ; free args

68 Call and Return Code
2a. Procedure Entry, With Locals, No Regs
        push    bp              ; save dyn link in Stack Marker
        mov     bp, sp          ; establish new Frame: point to S.M.
        sub     sp, 24          ; allocate 24 bytes uninitialized
                                ; space for locals

69 Call and Return Code
2b. Procedure Exit, With Locals, No Regs
        mov     sp, bp          ; free all locals and temps
        pop     bp              ; must find old S.F., RA on top
        ret     args            ; ip to instruction after call
                                ; free args

70 Call and Return Code
3a. Procedure Entry, With Locals, Save Regs
        push    bp              ; save dyn link in Stack Marker
        mov     bp, sp          ; establish new Frame: point to S.M.
        sub     sp, 24          ; allocate 24 bytes uninitialized
                                ; space for locals
        push    ax              ; save ax if needed by callee, opt.
        push    bx              ; ditto for bx

71 Call and Return Code
3b. Procedure Exit, With Locals, Restore Regs
        pop     bx              ; restore bx if was used by callee
        pop     ax              ; ditto for ax
        mov     sp, bp          ; free all locals and temps
        pop     bp              ; must find back old S.F., RA on top
        ret     args            ; ip to instruction after call
                                ; free args

72 Call and Return Recursive Factorial in C
// source: fact.c
...
unsigned fact( unsigned arg )
{ // fact
  if ( arg <= 1 ) {
    return 1;
  }else{
    return fact( arg - 1 ) * arg;
  } //end if
} //end fact

73 Call and Return Recursive Factorial in x86
; Source file: fact.asm
penter  macro
        push    bp
        mov     bp, sp
        push    bx
        push    cx
        push    dx
        endm
pexit   macro   args
        pop     dx
        pop     cx
        pop     bx
        pop     bp
        ret     args
        endm
Errcode = 4ch
MAX     = 9d
        .model  small
        .stack  100h
        .data
arg     dw      0

74 Call and Return Recursive Factorial in x86
        .code
        extrn   uPutDec : near
; assume arg on tos
; return fact( int arg ) in ax
rfact   proc
        penter
        mov     ax, [bp+4]      ; arg is 4 bytes before dyn link
        cmp     ax, 1           ; argument > 1?
        jg      recurse         ; if so: recursive call
base:   mov     ax, 1           ; No: then 0!=1!=1
        pexit   2               ; done, free 2 bytes = arg
recurse:
        mov     ax, [bp+4]      ; recurse; get next arg
        dec     ax              ; but decrement first
        push    ax              ; and pass on stack
        call    rfact           ; recurse!
        mov     cx, [bp+4]      ; product in ax, * arg
        mul     cx              ; product in ax
        pexit   2               ; and done
rfact   endp

75 Call and Return Recursive Factorial in x86
drive_r proc
        mov     arg, 0          ; initial memory
        mov     ax, 0           ; initial value again
        mov     bp, sp          ; no space for locals needed
again_r:
        cmp     arg, MAX
        jge     done_r
; ax holds argument to be factorialized :-)
        push    ax              ; argument on stack
        call    rfact
; now ax holds factorial value
        call    uPutDec         ; print next result
        inc     arg             ; compute next fact(arg)
        mov     ax, arg         ; pass in ax
        jmp     again_r
done_r: ret
drive_r endp
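
The driver stops before MAX = 9, and a short C sketch suggests a plausible reason: rfact keeps only the low word of each product in ax, and 9! no longer fits in 16 bits. The sketch is an illustration, not part of the original slides; the names fact16 and largest_exact_arg are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Recursive factorial that keeps only the low 16 bits of each product,
 * as rfact does when it uses ax and ignores dx after mul. */
uint16_t fact16(uint16_t arg)
{
    if (arg <= 1)
        return 1;
    return (uint16_t)((uint32_t)fact16((uint16_t)(arg - 1)) * arg);
}

/* Largest argument whose factorial is still exact in an unsigned word. */
unsigned largest_exact_arg(void)
{
    unsigned n = 0;
    uint32_t exact = 1;                      /* n! without truncation */
    while (exact * (n + 1) <= 0xFFFFu) {
        n++;
        exact *= n;
    }
    assert(fact16((uint16_t)n) == exact);    /* both agree up to here */
    return n;
}
```

The exact results end at 8! = 40320; the truncated value for 9 would be 35200 instead of 362880, so looping while arg is less than 9 keeps every printed value correct.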

76 Design Asm Procedure PutDec

77 Design PutDec
Goal Definition
Design an assembly language procedure that prints a passed integer value in decimal notation.
Values are passed in a machine register.
Values may be positive or negative.
Use x86 small arithmetic, i.e. 16-bit integer precision, to easily track overflow and the minimum and maximum integer values.
We'll proceed stepwise:
- Printing a character
- Printing a decimal digit, given an integer value 0..9
- Finally printing the complete integer

78 Design PutDec
Define Macro Put_Ch to print one character
; character is passed in dl
; fiddle with ax, dx; restore before finishing
Put_Ch  macro   char            ; 'char' is char to be printed
        push    ax              ; save ax
        push    dx              ; ditto for dx; use only dl
        mov     dl, char        ; move into formal parameter
        mov     ah, 02h         ; tell system SW whom to call
        int     021h            ; call system SW, e.g. DOS
        pop     dx              ; restore
        pop     ax              ; ditto
        endm

79 Design PutDec
Print integer value 0..9 in dl as a character
; assume integer 0..9 to be in dl
; convert to ASCII character
; simple: just add '0'
;
p_char: add     dl, '0'         ; convert int to char
        Put_Ch  dl              ; previously defined macro

80 Design PutDec
Print rightmost digit of number in ax in decimal
; ax holds non-negative integer value
; but is a binary number, i.e. binary 0..9; need ASCII
        mov     bx, 10          ; base 10 is in bx
        sub     dx, dx          ; make ax-dx a double word
        div     bx              ; unsigned divide ax by 10
; remainder is in dx; known to be < 10, so dl holds it
        add     dl, '0'         ; make int a printable char
        Put_Ch  dl              ; print that char

81 Asm Source For Procedure PutDec

82 PutDec Asm Code: Macros
; Purpose: print various signed 16-bit numbers
start   macro
        mov     ax, @data       ; typical for MS system SW
        mov     ds, ax
        endm
finish  macro                   ; also MS system SW
        mov     ax, 4c00h
        int     21h
        endm
Put_Ch  macro   char            ; 'char' char is printed
        push    ax              ; save because ax is overwritten
        push    dx              ; ditto for dx
        mov     dl, char        ; move character into parameter
        mov     ah, 02h         ; tell DOS who
        int     021h            ; call DOS
        pop     dx              ; restore
        pop     ax              ; ditto
        endm

83 PutDec Asm Code: Macros
Put_Str macro   str_addr        ; print string at 'str_addr'
        push    ax              ; save
        push    dx              ; save
        mov     dx, offset str_addr
        mov     ah, 09h         ; DOS proc id
        int     021h            ; call DOS
        pop     dx              ; restore
        pop     ax              ; ditto
        endm
base_10 = 10
        .model  small
        .stack  500
        .data
min_num db      '-32768$'       ; end strings with '$'
num_is  db      'the number is: $'
cr_lf   db      10, 13, '$'     ; magic numbers for lf, cr

84 PutDec Asm Code: Body
        .code
; ax value printed as a decimal integer number
PutDec  proc
; special case -32768 cannot be negated
        cmp     ax, -32768      ; is it special case?
        jne     do_work         ; nope, so do your real job
        Put_Str min_num         ; yep: so print it and be done
        ret                     ; done. Printed -32768
do_work:                        ; ax NOT -32768; is negative?
        push    ax
        push    bx
        push    cx
        push    dx
        cmp     ax, 0           ; negative number?
        jge     positive        ; if negative: invert sign, print -
        neg     ax              ; here the inversion
        Put_Ch  '-'             ; print - sign before any digit
positive:
        sub     cx, cx          ; cx counts steps = # digits
        mov     bx, base_10     ; divisor is 10
; now we know number in ax is non-negative

85 PutDec Asm Code: Body

            ; continue with non-negative number
    push_m: sub   dx, dx        ; make a double word
            div   bx            ; unsigned divide o.k.
            add   dl, '0'       ; make number a char
            push  dx            ; save; order reversed
            inc   cx            ; count steps
            cmp   ax, 0         ; finally done?
            jne   push_m        ; if not, do next step
            ; now all chars are stored on stack in l-2-r order
    pop_m:  pop   dx            ; pop to dx; dl of interest
            Put_Ch dl           ; print it as char
            loop  pop_m         ; more work? If so, do again
    done:   pop   dx            ; restore what you messed up
            pop   cx            ; ditto
            pop   bx
            pop   ax
            ret                 ; return to caller
    PutDec  endp
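As a cross-check of the algorithm, here is a minimal C model of PutDec. It is a sketch under assumptions: the name put_dec_model is hypothetical, the output goes into a caller-supplied buffer of at least 7 bytes instead of DOS calls, and a small local array plays the role of the hardware stack.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical C model of PutDec: writes the decimal text of a signed
   16-bit value into out (at least 7 bytes, including the terminator). */
void put_dec_model(int16_t value, char *out)
{
    char digits[5];                 /* a 16-bit magnitude has at most 5 digits */
    int count = 0;                  /* plays the role of cx */
    uint16_t magnitude;

    if (value == INT16_MIN) {       /* special case, handled as in the slides */
        strcpy(out, "-32768");
        return;
    }
    if (value < 0) {
        *out++ = '-';               /* corresponds to: Put_Ch '-' */
        value = (int16_t)-value;    /* corresponds to: neg ax */
    }
    magnitude = (uint16_t)value;

    do {                            /* push_m: rightmost digit comes first */
        assert(count < 5);
        digits[count++] = (char)('0' + magnitude % 10u);
        magnitude /= 10u;
    } while (magnitude != 0);

    while (count > 0) {             /* pop_m: emit in reverse order */
        *out++ = digits[--count];
    }
    *out = '\0';
}
```

Because digits are produced from right to left but printed from left to right, some last-in, first-out storage is needed; the assembly uses the runtime stack for it, and the model uses an array indexed by count.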

86 PutDec Asm Code: Driver

    ; output readable string. Print #, carriage-return
    next_n  proc
            Put_Str num_is      ; print message
            call  PutDec        ; print #
            Put_Str cr_lf       ; cr lf
            ret
    next_n  endp                ; repeat label before endp

    num     macro val           ; just to practice macros
            mov   ax, val       ; PutDec expects # in ax
            call  next_n        ; message, print #, cr lf
            endm

87 PutDec Asm Code: Main

    main    proc                ; entry point under Windows
            start               ; set up for OS
            ; exercise all kinds of cases, including corner cases
            num   -32768        ; all macro expansions
            num   -32767        ; ditto
            num   32767         ; put this # into ax
            num   100
            num   1
            num   -1
            num   0
            num   0ffh
            finish
    main    endp
            end   main          ; this IDs the entry point
                                ; can be different name

88 Appendix: Some Definitions

89 Definitions: Activation Record
Synonym for Stack Marker.

90 Definitions: Base Address
Memory address of an aggregate area. Usually a segment register, AKA base register, is used to hold a base address. Addressing can then proceed relative to such a base address.

91 Definitions: Base Pointer
An address pointer (often implemented via a dedicated register) that identifies an agreed-upon area in the Stack Frame of an executing procedure. On the x86 architecture this is implemented via the dedicated bp register.

92 Definitions: Binding
Procedures may have parameters. Formal parameters express attributes such as type and name. At the place of call, these formal parameters receive initial values through so-called actual parameters. Sometimes an actual parameter is solely the address of the true object referenced during the call. The association of actual to formal parameter is referred to as parameter binding.

93 Definitions: Branch
Transfer of control to a destination that is generally not the instruction following the branch. Synonym: Jump. The destination is an explicit or implicit operand of the branch instruction.

94 Definitions: Call
Transfer of control (a.k.a. context switch) to the operand of the call instruction. A call expects that after completion, the program resumes execution at the place after the call instruction.

95 Definitions: Countable Loop
Loop in which the number of iterations can be computed (is known) before the loop body starts. Thus the loop must include code to change the remaining loop count, and a check to test whether the final count has been reached.
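A countable loop can be sketched in C with a count-down counter, mirroring how cx drives the x86 loop instruction. The function name sum_counted is hypothetical. The assertion reflects a known property of loop: it decrements cx before testing it, so starting with cx = 0 would iterate 65536 times.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical example: sum the first n array elements with a count-down
   counter, the way cx drives the x86 loop instruction. */
uint32_t sum_counted(const uint16_t *data, uint16_t n)
{
    assert(n > 0);              /* cx = 0 would wrap around on x86 */
    uint32_t sum = 0;
    uint16_t cx = n;            /* iteration count is known before the loop starts */
    do {
        sum += *data++;         /* loop body: the actual work */
        cx--;                   /* loop overhead: update the remaining count */
    } while (cx != 0);          /* loop overhead: test for the final count */
    return sum;
}
```

The same shape appears in the pop_m loop of PutDec, where cx was set to the number of digits pushed and is therefore always at least 1.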

96 Definitions: Dynamic Link
Location in the Stack Marker pointing to the Stack Frame of the calling procedure. This caller is temporarily dormant; i.e. it is the callee's stack frame that is active. Since the caller also has a Dynamic Link object, all currently incomplete Stack Frames are linked together via this data structure.

97 Definitions: For Loop
High-level construct implementing a countable loop. The x86 loop instruction is a key component for writing countable loops.

98 Definitions: Frame Pointer
Synonym for Base Pointer.

99 Definitions: Hand-Manufactured Loop
Most general type of loop: the number of iterations cannot be computed before, not even during, the execution of the loop. Generally, the number of iterations depends on data that are input via read operations. Also, the number of steps may depend on the precision of a computed (floating-point) result and thus is not known until the end.

100 Definitions: Immediate Operand
Operand encoded as part of the instruction. No load is needed to get the immediate value; instead, it is immediately available in the instruction proper. Since instructions have a limited number of bits, the size of immediate operands usually is limited to a fraction of a natural machine instruction, or word.

101 Definitions: Load
Operation to move (read) data from memory to the processor. Usually the destination is a register. The source address is communicated in an immediate operand, or in another register, or indirectly through a register.

102 Definitions: Loop Body
Program portion executed repeatedly. This is the actual work to be accomplished. The rest is loop overhead. The goal is to minimize that overhead.

103 Definitions: Offset
A distance in memory units away from the base address. On a byte-addressable microprocessor an offset is a distance in units of bytes. Offset is frequently defined as a distance from a base register, on x86 from a segment register.

104 Definitions: Pop
Stack operation that frees data from the stack. Often, the data popped off are assigned to some other object. Other times, data are just popped because they are no longer needed, in which case only the stack space is freed. This can also be accomplished by changing the value of the stack pointer. Often the memory location is not overwritten by a pop, i.e. the data just stay. But the memory area is not considered to be part of the active stack anymore.

105 Definitions: Push
Stack operation that reserves temporary space on the stack. Generally, the space reserved on the stack through a push is initialized with the argument of the push operation. Other times, a push just reserves space on the stack for data to be initialized at a later time. Note that on the x86 architecture a push decreases the top of stack pointer (sp value).
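The push and pop behavior described above can be illustrated with a small C model of a downward-growing stack of 16-bit words. The type WordStack, the size STACK_WORDS, and the function names are assumptions made for this sketch; sp is modeled as an array index rather than a real address.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 8               /* hypothetical tiny stack of 16-bit words */

typedef struct {
    uint16_t mem[STACK_WORDS];
    int sp;                         /* index of the last pushed word;
                                       STACK_WORDS when the stack is empty */
} WordStack;

void stack_init(WordStack *s)
{
    s->sp = STACK_WORDS;            /* empty stack: sp sits past the highest slot */
}

void stack_push(WordStack *s, uint16_t value)
{
    assert(s->sp > 0);              /* no room left would be an overflow */
    s->sp -= 1;                     /* push first decreases sp ... */
    s->mem[s->sp] = value;          /* ... then stores at the new top */
}

uint16_t stack_pop(WordStack *s)
{
    assert(s->sp < STACK_WORDS);    /* nothing to pop would be an underflow */
    uint16_t value = s->mem[s->sp]; /* read the top; memory is not cleared */
    s->sp += 1;                     /* free the slot by increasing sp */
    return value;
}
```

After a pop the old value is still present in mem, but it lies below the active stack and will be overwritten by the next push, which matches the remark that popped data "just stay".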

106 Definitions: Repeat Loop
Loop in which the body is entered unconditionally, and thus executed at least once. The number of iterations is generally not known until the loop terminates. The termination condition is computed logically at the physical end of the loop.

107 Definitions: Return
Transfer of control after completion of a call. Usually, this is accomplished through a return instruction. The return instruction assumes the return address to be saved in a fixed part of the stack frame, called the Stack Marker.

108 Definitions: Return Value
The value returned by a function call. If the return value is a composite data structure, then the location set aside for the function return value is generally a pointer to the actual data. When no value is returned, we refer to the callee as a procedure.

109 Definitions: Stack
AKA runtime stack. Run time data structure that grows and shrinks during program execution. It generally holds data (parameters, locals, temps) and control information (return addresses, links). Operations that change the stack include push, pop, call, return, and the like.

110 Definitions: Stack Frame
Run time data structure associated with an active procedure or function. A Stack Frame is composed of the procedure parameters, the Stack Marker, local data, and space for temporary data, including saved registers.

111 Definitions: Stack Marker
Run time data structure on the stack associated with a Stack Frame. The Stack Marker holds fixed information, whose structure is known a priori. This includes the return address, the static link, and the dynamic link. In some implementations, the Stack Marker also holds an entry for the function return value and the saved registers.

112 Definitions: Stack Pointer
AKA top of stack pointer. A pointer (typically implemented via a register) that addresses the last element allocated (pushed) on top of the stack. On the x86 architecture this is implemented via the sp register. It is also possible to have the Stack Pointer refer to the next free location (if any) on the stack in case another push operation needs stack space.

113 Definitions: Static Link
An address in the Stack Marker that points to the Frame Pointer of the last invocation of the procedure which lexically surrounds the one executing currently. This is necessary only for high-level languages that allow statically nested scopes, such as Ada, Algol, and Pascal. This is not needed in more restricted languages such as C or C++.

114 Definitions: Store
Operation to move data to memory. Such moves are named writes or stores. Usually the source is a register, holding the data to be stored. The target is a memory location, whose address is held in some register. Some architectures allow the target address to be an immediate operand; not so on RISC architectures.

115 Definitions: Stride
Distance in number of bytes from one element to the next of the same type. For example, the stride of an integer array on the x86 architecture is 2 for signed and unsigned words. Note that x86 calls a unit of 2 bytes a word; most architectures have 4-byte words. The stride is 4 for double words on x86.
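A short C sketch can make the stride concrete. The helper names element_offset and word_array_stride are hypothetical; uint16_t stands in for an x86 word, and the sketch assumes the usual 8-bit byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: byte offset of element 'index' from the array base,
   given the stride in bytes. */
size_t element_offset(size_t index, size_t stride)
{
    return index * stride;
}

/* Stride of a word array, measured as the byte distance between neighbors. */
size_t word_array_stride(void)
{
    uint16_t words[2] = {0, 0};
    size_t stride = (size_t)((const char *)&words[1] - (const char *)&words[0]);
    assert(stride == sizeof(uint16_t));     /* array elements are contiguous */
    return stride;
}
```

With a stride of 2, element 3 of a word array sits 6 bytes past the base address; with double words (stride 4) the same index sits 12 bytes past it.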

116 Definitions: Top of Stack
Stack location of the last allocated (pushed) object.

117 Definitions: While Loop
Loop in which the body is entered after first checking whether the condition for execution is true. If false, the body is not executed. This is also used as the termination criterion. The number of iterations is generally not known until the loop terminates.
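The difference between a Repeat Loop and a While Loop shows up in where the test is placed. The two C functions below are illustrative only; their names count_digits and skip_spaces are assumptions, not part of the slides.

```c
#include <assert.h>
#include <stddef.h>

/* Repeat loop: the body runs at least once, so 0 correctly has one digit. */
int count_digits(unsigned value)
{
    int count = 0;
    do {
        count++;
        value /= 10u;
    } while (value != 0);       /* termination test at the physical end */
    assert(count >= 1);
    return count;
}

/* While loop: the test comes first, so the body may run zero times. */
size_t skip_spaces(const char *text)
{
    size_t i = 0;
    while (text[i] == ' ') {
        i++;
    }
    return i;                   /* index of the first non-space character */
}
```

The push_m loop of PutDec is a repeat loop for the same reason as count_digits: the value 0 must still produce one digit.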

118 Bibliography
Jan's Linux and Assembler: http://www.janw.easynet.be/eng.html
Webster Assembly Language: http://webster.cs.ucr.edu/
Nasm assembler under Unix: http://www.int80h.org/bsdasm/
