Principles of Computers 18th Lecture


Principles of Computers 18th Lecture Pavel Ježek, Ph.D. pavel.jezek@d3s.mff.cuni.cz

6502 Registers (Accumulator Architecture)
  A    bits 7..0    accumulator
  X    bits 7..0
  Y    bits 7..0
  S    bits 7..0    stack pointer; the high byte of the stack address is fixed at 0000 0001
  P    bits 7..0    flags
  PC   bits 15..0
  6502: 8-bit CPU with 16-bit logical and physical address spaces (1:1 mapping between logical and physical addresses, i.e. logical address = physical address)

Load and Store & Setting Flags
  LDA #$xx        A := xx
  LDA $xxxx       A := ($xxxx)^
  LDA $xxxx,X     A := ($xxxx + X)^
  LDA $xxxx,Y     A := ($xxxx + Y)^
  LDX imm/addr    X := imm/addr
  LDY imm/addr    Y := imm/addr
  STA $xxxx       ($xxxx)^ := A
  STA $xxxx,X     ($xxxx + X)^ := A
  STA $xxxx,Y     ($xxxx + Y)^ := A
  STX addr        addr := X
  STY addr        addr := Y
  TAX             X := A
  TXA             A := X
  TAY             Y := A
  TYA             A := Y
  TSX             X := S
  TXS             S := X
  Flags set (P register, bits 7654 3210 = N... ..Z.):
    P.Negative := target.7
    if target = 0 then P.Zero := 1 else P.Zero := 0;
  PHP             push P (flags)
  PLP             pop P (flags)
  PHA             push A
  PLA             pop A
  CLC             P.Carry := 0
  SEC             P.Carry := 1

Bitwise Operations
  ORA imm/addr    A := A BitwiseOr imm/addr
  AND imm/addr    A := A BitwiseAnd imm/addr
  EOR imm/addr    A := A BitwiseXor imm/addr
  NOT?            no dedicated instruction; use EOR #$FF
  ASL A           A := A shl 1
  LSR A           A := A shr 1
  ROL A           A := A rol 1
  ROR A           A := A ror 1
  Flags set:
    P.Negative := A.7
    if A = 0 then P.Zero := 1 else P.Zero := 0;

Integer Operations (Adding 8-bit Numbers)
  ADC imm/addr
    result := A + imm/addr + P.Carry
    P.Carry := result.8
    A := result.7 … result.0
    P.Negative := A.7
    if A = 0 then P.Zero := 1 else P.Zero := 0;
  Flags affected (P register, bits 7654 3210 = N... ..ZC)
  Example, C := A + B:
    LSB of A stored at $A000:               A7 A6 A5 A4 A3 A2 A1 A0
    LSB of B stored at $B000:             + B7 B6 B5 B4 B3 B2 B1 B0
    LSB of C stored at $C000:  C8 (carry) = C7 C6 C5 C4 C3 C2 C1 C0
    LDA $A000
    CLC
    ADC $B000
    STA $C000
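The ADC rule on this slide can be modelled in a few lines. The following is a minimal Python sketch, not 6502 code: the function names `adc` and `add16` are made up for illustration, decimal mode is ignored, and the 16-bit helper is an assumed extension showing why the carry is chained from the low byte into the high byte.

```python
def adc(a, operand, carry):
    """Model of ADC (binary mode): returns (new_a, carry, negative, zero)."""
    result = a + operand + carry       # 9-bit intermediate result
    new_a = result & 0xFF              # A := result.7 ... result.0
    new_carry = (result >> 8) & 1      # P.Carry := result.8
    negative = (new_a >> 7) & 1        # P.Negative := A.7
    zero = 1 if new_a == 0 else 0      # P.Zero
    return new_a, new_carry, negative, zero


def add16(x, y):
    """Add two 16-bit values byte by byte (hypothetical helper)."""
    lo, carry, _, _ = adc(x & 0xFF, y & 0xFF, 0)   # CLC, then ADC on low bytes
    hi, carry, _, _ = adc(x >> 8, y >> 8, carry)   # ADC on high bytes, carry kept
    return (hi << 8) | lo, carry
```

Calling `adc(0xFF, 0x01, 0)` wraps A around to 0 and sets both the carry and the zero flag, which is the case the CLC before the first ADC is meant to keep under control.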

Integer Operations – Subtraction? Subtract with Borrow
  ADC imm/addr
    result := A + imm/addr + P.Carry
    P.Carry := result.8
    A := result.7 … result.0
    P.Negative := A.7
    if A = 0 then P.Zero := 1 else P.Zero := 0;
  SBC imm/addr
    result := A – imm/addr – not(P.Carry)
    P.Carry := not(result.8)
    A := result.7 … result.0
    P.Negative := A.7
    if A = 0 then P.Zero := 1 else P.Zero := 0;
  INX             X := X + 1
  INY             Y := Y + 1
  DEX             X := X – 1
  DEY             Y := Y – 1
    P.Negative := X/Y.7
    if X/Y = 0 then P.Zero := 1 else P.Zero := 0;
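As a rough illustration of "subtract with borrow", the sketch below models SBC in Python. It assumes binary mode and treats the carry flag as an inverted borrow (carry set means no borrow occurred); the function name `sbc` is hypothetical.

```python
def sbc(a, operand, carry):
    """Model of SBC (binary mode): returns (new_a, carry, negative, zero)."""
    borrow = 1 - carry                   # not(P.Carry)
    result = a - operand - borrow        # may become negative
    new_a = result & 0xFF                # keep the low 8 bits
    new_carry = 1 if result >= 0 else 0  # carry stays set when no borrow occurred
    negative = (new_a >> 7) & 1          # P.Negative := A.7
    zero = 1 if new_a == 0 else 0        # P.Zero
    return new_a, new_carry, negative, zero
```

Under this model a plain 8-bit subtraction starts with SEC (carry = 1), mirroring the CLC used before a plain addition. The same result can be obtained by adding the bitwise complement of the operand together with the carry.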

No Division (div), No Multiplication (*) Instructions on 6502
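Because the CPU has no multiply instruction, a product has to be computed by a routine built from shifts and additions. The sketch below shows the classic shift-and-add method in Python; the name `runtime_multiply` and the choice to truncate the product to 8 bits are assumptions made for illustration, not details taken from the lecture.

```python
def runtime_multiply(a, b):
    """Multiply two 8-bit values using only shifts, adds and bit tests.

    The product is truncated to 8 bits (an assumption of this sketch).
    """
    product = 0
    for _ in range(8):
        if b & 1:                        # lowest bit of the multiplier set?
            product = (product + a) & 0xFF
        a = (a << 1) & 0xFF              # shift multiplicand left (like ASL)
        b >>= 1                          # shift multiplier right (like LSR)
    return product
```

For example, `runtime_multiply(6, 7)` returns 42, while `runtime_multiply(16, 16)` returns 0 because 256 does not fit into 8 bits.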

How To Write/Compile Complex Expressions?
  x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x;

Rewriting Complex Expressions – Step 1: Function Calls As Separate Commands, Each Having Variables and Constants As Arguments Only (i.e. No Expressions As Arguments)
  x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x;
  Introducing temporary variables to hold intermediate results of inner subexpressions:
    temp11 := b + c;
    temp12 := F1(temp11);
    temp21 := F3(2, 3);
    temp22 := d + e + temp21;
    temp23 := F4(4);
    temp24 := f + temp23;
    temp25 := F2(temp22, temp24);
    temp31 := F5(5);
    x := 1 + a + temp12 * temp25 + temp31 + x;

Step 2: Rewrite To Have a := b op c Or d := F(…) Only
  x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x;
  Result of Step 1:
    temp11 := b + c;
    temp12 := F1(temp11);
    temp21 := F3(2, 3);
    temp22 := d + e + temp21;
    temp23 := F4(4);
    temp24 := f + temp23;
    temp25 := F2(temp22, temp24);
    temp31 := F5(5);
    x := 1 + a + temp12 * temp25 + temp31 + x;
  After Step 2:
    temp11 := b + c;
    temp12 := F1(temp11);
    temp21 := F3(2, 3);
    temp22 := d + e;
    temp22 := temp22 + temp21;
    temp23 := F4(4);
    temp24 := f + temp23;
    temp25 := F2(temp22, temp24);
    temp31 := F5(5);
    tempX := 1 + a;
    temp4 := temp12 * temp25;
    tempX := tempX + temp4;
    tempX := tempX + temp31;
    x := tempX + x;
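The first two rewriting steps can be sketched as a small recursive function that walks an expression tree and emits one temporary per operation or call. This is only an illustrative Python sketch: the tuple-based tree format, the function name `lower`, and the simple `temp1, temp2, ...` numbering are assumptions. It also evaluates strictly left to right and never reuses a temporary, so its output differs in naming and ordering from the slide.

```python
import itertools


def lower(expr, code, counter):
    """Return a name or constant for expr, appending assignments to code.

    expr is a variable name (str), a constant (int),
    a tuple (op, left, right), or a tuple ('call', name, [args]).
    """
    if isinstance(expr, (str, int)):
        return str(expr)                      # already a simple operand
    if expr[0] == 'call':
        args = [lower(arg, code, counter) for arg in expr[2]]
        rhs = f"{expr[1]}({', '.join(args)})"  # call with simple arguments only
    else:
        op, left, right = expr
        lhs_name = lower(left, code, counter)
        rhs_name = lower(right, code, counter)
        rhs = f"{lhs_name} {op} {rhs_name}"    # a := b op c
    temp = f"temp{next(counter)}"
    code.append(f"{temp} := {rhs};")
    return temp


code = []
result = lower(('+', 'a', ('call', 'F1', [('+', 'b', 'c')])), code, itertools.count(1))
```

After this runs, `code` holds three statements (`temp1 := b + c;`, `temp2 := F1(temp1);`, `temp3 := a + temp2;`) and `result` names the temporary that holds the value of the whole expression.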

Step 3 (If Target CPU Does Not Support a := b op c): a := a + b
  x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x;
  Result of Step 2:
    temp11 := b + c;
    temp12 := F1(temp11);
    temp21 := F3(2, 3);
    temp22 := d + e;
    temp22 := temp22 + temp21;
    temp23 := F4(4);
    temp24 := f + temp23;
    temp25 := F2(temp22, temp24);
    temp31 := F5(5);
    tempX := 1 + a;
    temp4 := temp12 * temp25;
    tempX := tempX + temp4;
    tempX := tempX + temp31;
    x := tempX + x;
  After Step 3:
    temp11 := b;
    temp11 := temp11 + c;
    temp12 := F1(temp11);
    temp21 := F3(2, 3);
    temp22 := d;
    temp22 := temp22 + e;
    temp22 := temp22 + temp21;
    temp23 := F4(4);
    temp24 := f;
    temp24 := temp24 + temp23;
    temp25 := F2(temp22, temp24);
    temp31 := F5(5);
    { x := 1 + a + temp12 * temp25 + temp31 + x; }
    tempX := 1
    tempX := tempX + a;
    temp4 := temp12;
    temp4 := temp4 * temp25;
    tempX := tempX + temp4;
    tempX := tempX + temp31;
    tempX := tempX + x;
    x := tempX;
    (OR, instead of the last two statements: x := x + tempX;)
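The Step 3 rewrite can be expressed as a small rule: copy the left operand into the destination, then apply the operation in place. A minimal Python sketch follows; the function name `to_two_address` and the set of commutative operators are assumptions for illustration.

```python
COMMUTATIVE = {"+", "*"}


def to_two_address(dest, left, op, right):
    """Split dest := left op right into statements of the forms a := b and a := a op c."""
    if dest == right and dest != left:
        # dest := left op dest: swapping operands is only safe for commutative ops
        if op not in COMMUTATIVE:
            raise ValueError("non-commutative case needs an extra temporary")
        left, right = right, left
    code = []
    if dest != left:
        code.append(f"{dest} := {left};")      # a := b
    code.append(f"{dest} := {dest} {op} {right};")  # a := a op c
    return code
```

`to_two_address("temp11", "b", "+", "c")` yields the two statements `temp11 := b;` and `temp11 := temp11 + c;`, while `to_two_address("x", "tempX", "+", "x")` collapses to the single statement `x := x + tempX;`, which corresponds to the alternative shown at the end of the slide.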

Step 4: Rewriting Function Calls – Expecting Calling Convention With Arguments Passed On Stack
  x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x;
    temp11 := b;
    temp11 := temp11 + c;
    push temp11
    call F1
    temp12 := returnValue;
    push 3
    push 2
    call F3
    temp21 := returnValue;
    temp22 := d;
    temp22 := temp22 + e;
    temp22 := temp22 + temp21;
    push 4
    call F4
    temp23 := returnValue;
    temp24 := f;
    temp24 := temp24 + temp23;
    push temp24
    push temp22
    call F2
    temp25 := returnValue;
    push 5
    call F5
    temp31 := returnValue;
    tempX := 1
    tempX := tempX + a;
    temp4 := temp12;
    temp4 := temp4 * temp25;
    tempX := tempX + temp4;
    tempX := tempX + temp31;
    tempX := tempX + x;
    x := tempX;
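In the listing above the arguments are pushed in reverse order (3 before 2 for F3(2, 3), temp24 before temp22 for F2(temp22, temp24)), so the first argument ends up on top of the stack. A minimal Python sketch of that rewriting rule, with a hypothetical function name `lower_call`:

```python
def lower_call(dest, func, args):
    """Rewrite dest := func(args...) into push / call / fetch-result steps."""
    code = [f"push {arg}" for arg in reversed(args)]  # last argument is pushed first
    code.append(f"call {func}")
    code.append(f"{dest} := returnValue;")
    return code
```

`lower_call("temp21", "F3", ["2", "3"])` produces `push 3`, `push 2`, `call F3`, `temp21 := returnValue;`, matching the corresponding lines of the slide. Who removes the arguments from the stack afterwards is a separate decision that this sketch leaves open.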

Step 5: Rewrite Operations Unsupported by Target CPU into Calls to Runtime
  x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x;
    temp11 := b;
    temp11 := temp11 + c;
    push temp11
    call F1
    temp12 := returnValue;
    push 3
    push 2
    call F3
    temp21 := returnValue;
    temp22 := d;
    temp22 := temp22 + e;
    temp22 := temp22 + temp21;
    push 4
    call F4
    temp23 := returnValue;
    temp24 := f;
    temp24 := temp24 + temp23;
    push temp24
    push temp22
    call F2
    temp25 := returnValue;
    push 5
    call F5
    temp31 := returnValue;
    tempX := 1
    tempX := tempX + a;
    push temp12              { replaces: temp4 := temp12; }
    push temp25              { replaces: temp4 := temp4 * temp25; }
    call runtimeMultiply
    temp4 := returnValue
    tempX := tempX + temp4;
    tempX := tempX + temp31;
    tempX := tempX + x;
    x := tempX;

Target CPU: 6502 x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x; temp11 := b; temp11 := temp11 + c; push temp11 call F1 temp12 := returnValue; push 3 push 2 call F3 temp21 := returnValue; temp22 := d; temp22 := temp22 + e; temp22 := temp22 + temp21; push 4 call F4 temp23 := returnValue; temp24 := f; temp24 := temp24 + temp23; push temp24 push temp22 call F2 temp25 := returnValue; push 5 call F5 temp31 := returnValue; tempX := 1 tempX := tempX + a; temp4 := temp12; temp4 := temp4 * temp25; call Pascal runtime temp4 := returnValue tempX := tempX + temp4; tempX := tempX + temp31; tempX := tempX + x; x := tempX;

Target CPU: 6502 x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x; temp11 := b; temp11 := temp11 + c; push temp11 call F1 temp12 := returnValue; LDA #3 push 3 PHA LDA #2 push 2 JSR &F3 call F3 temp21 := returnValue; temp22 := d; temp22 := temp22 + e; temp22 := temp22 + temp21; push 4 call F4 temp23 := returnValue; temp24 := f; temp24 := temp24 + temp23; push temp24 push temp22 call F2 temp25 := returnValue; push 5 call F5 temp31 := returnValue; tempX := 1 tempX := tempX + a; temp4 := temp12; temp4 := temp4 * temp25; call Pascal runtime temp4 := returnValue tempX := tempX + temp4; tempX := tempX + temp31; tempX := tempX + x; x := tempX;

Passing all return values in A register Target CPU: 6502 x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x; Passing all return values in A register temp11 := b; temp11 := temp11 + c; push temp11 call F1 temp12 := returnValue; LDA #3 push 3 PHA LDA #2 push 2 JSR &F3 call F3 temp21 := returnValue; temp22 := d; temp22 := temp22 + e; temp22 := temp22 + temp21; push 4 call F4 temp23 := returnValue; temp24 := f; temp24 := temp24 + temp23; push temp24 push temp22 call F2 temp25 := returnValue; push 5 call F5 temp31 := returnValue; tempX := 1 tempX := tempX + a; temp4 := temp12; temp4 := temp4 * temp25; call Pascal runtime temp4 := returnValue tempX := tempX + temp4; tempX := tempX + temp31; tempX := tempX + x; x := tempX;

Passing all return values in A register Target CPU: 6502 x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x; Passing all return values in A register temp11 := b; temp11 := temp11 + c; push temp11 call F1 temp12 := returnValue; LDA #3 push 3 PHA LDA #2 push 2 JSR &F3 call F3 STA &temp21 temp21 := returnValue; temp22 := d; temp22 := temp22 + e; temp22 := temp22 + temp21; push 4 call F4 temp23 := returnValue; temp24 := f; temp24 := temp24 + temp23; push temp24 push temp22 call F2 temp25 := returnValue; push 5 call F5 temp31 := returnValue; tempX := 1 tempX := tempX + a; temp4 := temp12; temp4 := temp4 * temp25; call Pascal runtime temp4 := returnValue tempX := tempX + temp4; tempX := tempX + temp31; tempX := tempX + x; x := tempX;

Passing all return values in A register Target CPU: 6502 x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x; Passing all return values in A register temp11 := b; temp11 := temp11 + c; push temp11 call F1 temp12 := returnValue; LDA #3 push 3 PHA LDA #2 push 2 JSR &F3 call F3 STA &temp21 temp21 := returnValue; temp22 := d; temp22 := temp22 + e; temp22 := temp22 + temp21; LDA #4 push 4 JSR &F3 call F4 STA &temp23 temp23 := returnValue; temp24 := f; temp24 := temp24 + temp23; push temp24 push temp22 call F2 temp25 := returnValue; LDA #5 push 5 PHA JSR &F5 call F5 STA &temp31 temp31 := returnValue; tempX := 1 tempX := tempX + a; temp4 := temp12; temp4 := temp4 * temp25; call Pascal runtime temp4 := returnValue tempX := tempX + temp4; tempX := tempX + temp31; tempX := tempX + x; x := tempX;

x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x; Target CPU: 6502 x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x; Passing all return values in A register temp11 := b; temp11 := temp11 + c; push temp11 call F1 temp12 := returnValue; LDA #3 push 3 PHA LDA #2 push 2 JSR &F3 call F3 STA &temp21 temp21 := returnValue; temp22 := d; temp22 := temp22 + e; temp22 := temp22 + temp21; LDA #4 push 4 JSR &F3 call F4 STA &temp23 temp23 := returnValue; temp24 := f; temp24 := temp24 + temp23; push temp24 push temp22 call F2 temp25 := returnValue; LDA #5 push 5 PHA JSR &F5 call F5 STA &temp31 temp31 := returnValue; tempX := 1 tempX := tempX + a; temp4 := temp12; temp4 := temp4 * temp25; JSR &runtimeMultiply call Pascal runtime STA &temp4 temp4 := returnValue tempX := tempX + temp4; tempX := tempX + temp31; tempX := tempX + x; x := tempX; Arguments of the runtimeMultiply function call

x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x; Target CPU: 6502 x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x; Passing all return values in A register temp11 := b; temp11 := temp11 + c; push temp11 call F1 temp12 := returnValue; LDA #3 push 3 PHA LDA #2 push 2 JSR &F3 call F3 STA &temp21 temp21 := returnValue; temp22 := d; temp22 := temp22 + e; temp22 := temp22 + temp21; LDA #4 push 4 JSR &F3 call F4 STA &temp23 temp23 := returnValue; temp24 := f; temp24 := temp24 + temp23; push temp24 push temp22 call F2 temp25 := returnValue; LDA #5 push 5 PHA JSR &F5 call F5 STA &temp31 temp31 := returnValue; tempX := 1 tempX := tempX + a; LDA &temp12 temp4 := temp12; LDA &temp25 temp4 := temp4 * temp25; JSR &runtimeMultiply call Pascal runtime STA &temp4 temp4 := returnValue tempX := tempX + temp4; tempX := tempX + temp31; tempX := tempX + x; x := tempX; Arguments of the runtimeMultiply function call

Passing all return values in A register Target CPU: 6502 x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x; Passing all return values in A register temp11 := b; temp11 := temp11 + c; push temp11 call F1 temp12 := returnValue; LDA #3 push 3 PHA LDA #2 push 2 JSR &F3 call F3 STA &temp21 temp21 := returnValue; LDA &d temp22 := d; ADC &e temp22 := temp22 + e; ADC &temp21 temp22 := temp22 + temp21; STA &temp22 LDA #4 push 4 JSR &F3 call F4 STA &temp23 temp23 := returnValue; temp24 := f; temp24 := temp24 + temp23; push temp24 push temp22 call F2 temp25 := returnValue; LDA #5 push 5 PHA JSR &F5 call F5 STA &temp31 temp31 := returnValue; tempX := 1 tempX := tempX + a; LDA &temp12 temp4 := temp12; LDA &temp25 temp4 := temp4 * temp25; JSR &runtimeMultiply call Pascal runtime STA &temp4 temp4 := returnValue tempX := tempX + temp4; tempX := tempX + temp31; tempX := tempX + x; x := tempX; If temp* is part of arithmetic operation, it needs to be loaded into A

Passing all return values in A register Target CPU: 6502 x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x; Passing all return values in A register LDA &b temp11 := b; ADC &c temp11 := temp11 + c; PHA push temp11 JSR &F1 call F1 STA &temp12 temp12 := returnValue; LDA #3 push 3 PHA LDA #2 push 2 JSR &F3 call F3 STA &temp21 temp21 := returnValue; LDA &d temp22 := d; ADC &e temp22 := temp22 + e; ADC &temp21 temp22 := temp22 + temp21; STA &temp22 LDA #4 push 4 JSR &F3 call F4 STA &temp23 temp23 := returnValue; temp24 := f; temp24 := temp24 + temp23; push temp24 push temp22 call F2 temp25 := returnValue; LDA #5 push 5 PHA JSR &F5 call F5 STA &temp31 temp31 := returnValue; tempX := 1 tempX := tempX + a; LDA &temp12 temp4 := temp12; LDA &temp25 temp4 := temp4 * temp25; JSR &runtimeMultiply call Pascal runtime STA &temp4 temp4 := returnValue tempX := tempX + temp4; tempX := tempX + temp31; tempX := tempX + x; x := tempX; If temp* is part of arithmetic operation, it needs to be loaded into A

Passing all return values in A register Target CPU: 6502 x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x; Passing all return values in A register LDA &b temp11 := b; ADC &c temp11 := temp11 + c; PHA push temp11 JSR &F1 call F1 STA &temp12 temp12 := returnValue; LDA #3 push 3 PHA LDA #2 push 2 JSR &F3 call F3 STA &temp21 temp21 := returnValue; LDA &d temp22 := d; ADC &e temp22 := temp22 + e; ADC &temp21 temp22 := temp22 + temp21; STA &temp22 LDA #4 push 4 JSR &F3 call F4 STA &temp23 temp23 := returnValue; LDA &f temp24 := f; ADC &temp23 temp24 := temp24 + temp23; PHA push temp24 LDA &temp22 push temp22 JSR &f2 call F2 STA &temp25 temp25 := returnValue; LDA #5 push 5 PHA JSR &F5 call F5 STA &temp31 temp31 := returnValue; tempX := 1 tempX := tempX + a; LDA &temp12 temp4 := temp12; LDA &temp25 temp4 := temp4 * temp25; JSR &runtimeMultiply call Pascal runtime STA &temp4 temp4 := returnValue tempX := tempX + temp4; tempX := tempX + temp31; tempX := tempX + x; x := tempX;

Passing all return values in A register Target CPU: 6502 x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x; Passing all return values in A register LDA &b temp11 := b; ADC &c temp11 := temp11 + c; PHA push temp11 JSR &F1 call F1 STA &temp12 temp12 := returnValue; LDA #3 push 3 PHA LDA #2 push 2 JSR &F3 call F3 STA &temp21 temp21 := returnValue; LDA &d temp22 := d; ADC &e temp22 := temp22 + e; ADC &temp21 temp22 := temp22 + temp21; STA &temp22 LDA #4 push 4 JSR &F3 call F4 STA &temp23 temp23 := returnValue; LDA &f temp24 := f; ADC &temp23 temp24 := temp24 + temp23; PHA push temp24 LDA &temp22 push temp22 JSR &f2 call F2 STA &temp25 temp25 := returnValue; LDA #5 push 5 PHA JSR &F5 call F5 STA &temp31 temp31 := returnValue; tempX := 1 tempX := tempX + a; LDA &temp12 temp4 := temp12; LDA &temp25 temp4 := temp4 * temp25; JSR &runtimeMultiply call Pascal runtime STA &temp4 temp4 := returnValue tempX := tempX + temp4; tempX := tempX + temp31; tempX := tempX + x; x := tempX; x := 1 + a + temp12 * temp25 + temp31 + x;

Passing all return values in A register Target CPU: 6502 x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x; Passing all return values in A register LDA &b temp11 := b; ADC &c temp11 := temp11 + c; PHA push temp11 JSR &F1 call F1 STA &temp12 temp12 := returnValue; LDA #3 push 3 PHA LDA #2 push 2 JSR &F3 call F3 STA &temp21 temp21 := returnValue; LDA &d temp22 := d; ADC &e temp22 := temp22 + e; ADC &temp21 temp22 := temp22 + temp21; STA &temp22 LDA #4 push 4 JSR &F3 call F4 STA &temp23 temp23 := returnValue; LDA &f temp24 := f; ADC &temp23 temp24 := temp24 + temp23; PHA push temp24 LDA &temp22 push temp22 JSR &f2 call F2 STA &temp25 temp25 := returnValue; LDA #5 push 5 PHA JSR &F5 call F5 STA &temp31 temp31 := returnValue; LDA #1 tempX := 1 ADC &a tempX := tempX + a; LDA &temp12 temp4 := temp12; LDA &temp25 temp4 := temp4 * temp25; JSR &runtimeMultiply call Pascal runtime STA &temp4 temp4 := returnValue ADC &temp4 tempX := tempX + temp4; ADC &temp31 tempX := tempX + temp31; ADC &x tempX := tempX + x; STA &x x := tempX; x := 1 + a + temp12 * temp25 + temp31 + x;

Target CPU: 6502
Passing all return values in A register
  x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x;
    LDA &b                  temp11 := b;
    ADC &c                  temp11 := temp11 + c;
    PHA                     push temp11
    JSR &F1                 call F1
    STA &temp12             temp12 := returnValue;
    LDA #3                  push 3
    PHA
    LDA #2                  push 2
    JSR &F3                 call F3
    STA &temp21             temp21 := returnValue;
    LDA &d                  temp22 := d;
    ADC &e                  temp22 := temp22 + e;
    ADC &temp21             temp22 := temp22 + temp21;
    STA &temp22
    LDA #4                  push 4
    JSR &F4                 call F4
    STA &temp23             temp23 := returnValue;
    LDA &f                  temp24 := f;
    ADC &temp23             temp24 := temp24 + temp23;
    PHA                     push temp24
    LDA &temp22             push temp22
    JSR &F2                 call F2
    STA &temp25             temp25 := returnValue;
    LDA #5                  push 5
    PHA
    JSR &F5                 call F5
    STA &temp31             temp31 := returnValue;
                            { x := 1 + a + temp12 * temp25 + temp31 + x; }
    LDA #1                  tempX := 1
    ADC &a                  tempX := tempX + a;
    LDA &temp12             temp4 := temp12;
    LDA &temp25             temp4 := temp4 * temp25;
    JSR &runtimeMultiply    call Pascal runtime
    STA &temp4              temp4 := returnValue
    ADC &temp4              tempX := tempX + temp4;
    ADC &temp31             tempX := tempX + temp31;
    ADC &x                  tempX := tempX + x;
    STA &x                  x := tempX;
  Problem: at ADC &temp4, A ≠ tempX, because A was overwritten by the current value of temp12, then temp25, then temp4.

Passing all return values in A register Target CPU: 6502 x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x; Passing all return values in A register LDA &b temp11 := b; ADC &c temp11 := temp11 + c; PHA push temp11 JSR &F1 call F1 STA &temp12 temp12 := returnValue; LDA #3 push 3 PHA LDA #2 push 2 JSR &F3 call F3 STA &temp21 temp21 := returnValue; LDA &d temp22 := d; ADC &e temp22 := temp22 + e; ADC &temp21 temp22 := temp22 + temp21; STA &temp22 LDA #4 push 4 JSR &F3 call F4 STA &temp23 temp23 := returnValue; LDA &f temp24 := f; ADC &temp23 temp24 := temp24 + temp23; PHA push temp24 LDA &temp22 push temp22 JSR &f2 call F2 STA &temp25 temp25 := returnValue; LDA #5 push 5 PHA JSR &F5 call F5 STA &temp31 temp31 := returnValue; LDA #1 tempX := 1 ADC &a tempX := tempX + a; TAY save tempX into Y reg. LDA &temp12 temp4 := temp12; LDA &temp25 temp4 := temp4 * temp25; JSR &runtimeMultiply call Pascal runtime STA &temp4 temp4 := returnValue TYA restore tempX from Y ADC &temp4 tempX := tempX + temp4; ADC &temp31 tempX := tempX + temp31; ADC &x tempX := tempX + x; STA &x x := tempX; x := 1 + a + temp12 * temp25 + temp31 + x;
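The effect of the TAY / TYA pair can be shown with a tiny register model. The Python sketch below is an illustration only: the function name and parameters are made up, the carry is assumed to be clear, and it assumes the multiply routine leaves Y untouched, which the slides do not state.

```python
def sum_with_product(a_val, product, save_in_y):
    """Model of: tempX := 1 + a; temp4 := product (via a call); tempX := tempX + temp4."""
    A = (1 + a_val) & 0xFF       # LDA #1 / ADC &a      -> A holds tempX
    Y = A if save_in_y else 0    # TAY                  -> keep a copy of tempX
    A = product                  # JSR &runtimeMultiply -> A now holds the product
    temp4 = A                    # STA &temp4
    if save_in_y:
        A = Y                    # TYA                  -> A holds tempX again
    A = (A + temp4) & 0xFF       # ADC &temp4 (carry assumed clear)
    return A
```

With `a_val = 2` and `product = 10`, the saved variant returns 13 (the intended 1 + 2 + 10), whereas the unsaved variant returns 20 because the product is added to itself.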

Fix 8-bit Additions
  x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x;
    LDA &b                  temp11 := b;
    CLC
    ADC &c                  temp11 := temp11 + c;
    PHA                     push temp11
    JSR &F1                 call F1
    STA &temp12             temp12 := returnValue;
    LDA #3                  push 3
    PHA
    LDA #2                  push 2
    JSR &F3                 call F3
    STA &temp21             temp21 := returnValue;
    LDA &d                  temp22 := d;
    ADC &e                  temp22 := temp22 + e;
    ADC &temp21             temp22 := temp22 + temp21;
    STA &temp22
    LDA #4                  push 4
    JSR &F4                 call F4
    STA &temp23             temp23 := returnValue;
    LDA &f                  temp24 := f;
    ADC &temp23             temp24 := temp24 + temp23;
    PHA                     push temp24
    LDA &temp22             push temp22
    JSR &F2                 call F2
    STA &temp25             temp25 := returnValue;
    LDA #5                  push 5
    PHA
    JSR &F5                 call F5
    STA &temp31             temp31 := returnValue;
    LDA #1                  tempX := 1
    CLC
    ADC &a                  tempX := tempX + a;
    TAY                     save tempX into Y reg.
    LDA &temp12             temp4 := temp12;
    LDA &temp25             temp4 := temp4 * temp25;
    JSR &runtimeMultiply    call Pascal runtime
    STA &temp4              temp4 := returnValue
    TYA                     restore tempX from Y
    ADC &temp4              tempX := tempX + temp4;
    ADC &temp31             tempX := tempX + temp31;
    ADC &x                  tempX := tempX + x;
    STA &x                  x := tempX;
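ADC always adds the carry flag, so whether CLC appears before each ADC or only before the first one in a chain matters as soon as an intermediate sum overflows 8 bits. The Python sketch below compares both variants; the function names are hypothetical and the values are chosen only to trigger an overflow.

```python
def adc(a, operand, carry):
    """8-bit add with carry: returns (new_a, new_carry)."""
    result = a + operand + carry
    return result & 0xFF, result >> 8


def add_chain(values, clc_before_each):
    """Sum 8-bit values with one ADC per value after the first."""
    a, carry = values[0], 0          # LDA first value, CLC
    for v in values[1:]:
        if clc_before_each:
            carry = 0                # CLC before every ADC
        a, carry = adc(a, v, carry)
    return a
```

For `[200, 100, 5]` the true sum modulo 256 is 49. `add_chain([200, 100, 5], True)` returns 49, while `add_chain([200, 100, 5], False)` returns 50, because the carry out of 200 + 100 is added into the next ADC. When no intermediate sum overflows, both variants agree.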

Allocate space for 7 temporary variables on stack LDA &b temp11 := b; CLC ADC &c temp11 := temp11 + c; PHA push temp11 JSR &F1 call F1 STA &temp12 temp12 := returnValue; LDA #3 push 3 PHA LDA #2 push 2 JSR &F3 call F3 STA &temp21 temp21 := returnValue; LDA &d temp22 := d; ADC &e temp22 := temp22 + e; ADC &temp21 temp22 := temp22 + temp21; STA &temp22 LDA #4 push 4 JSR &F3 call F4 STA &temp23 temp23 := returnValue; LDA &f temp24 := f; ADC &temp23 temp24 := temp24 + temp23; PHA push temp24 LDA &temp22 push temp22 JSR &f2 call F2 STA &temp25 temp25 := returnValue; Allocate space for 7 temporary variables on stack x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x; LDA #5 push 5 PHA JSR &F5 call F5 STA &temp31 temp31 := returnValue; LDA #1 tempX := 1 CLC ADC &a tempX := tempX + a; TAY save tempX into Y reg. LDA &temp12 temp4 := temp12; LDA &temp25 temp4 := temp4 * temp25; JSR &runtimeMultiply call Pascal runtime STA &temp4 temp4 := returnValue TYA restore tempX from Y ADC &temp4 tempX := tempX + temp4; ADC &temp31 tempX := tempX + temp31; ADC &x tempX := tempX + x; STA &x x := tempX; 6 1 2 7 3 4 5

Allocate space for 7 temporary variables on stack S := S - 7 LDA &b temp11 := b; CLC ADC &c temp11 := temp11 + c; PHA push temp11 JSR &F1 call F1 STA &temp12 temp12 := returnValue; LDA #3 push 3 PHA LDA #2 push 2 JSR &F3 call F3 STA &temp21 temp21 := returnValue; LDA &d temp22 := d; ADC &e temp22 := temp22 + e; ADC &temp21 temp22 := temp22 + temp21; STA &temp22 LDA #4 push 4 JSR &F3 call F4 STA &temp23 temp23 := returnValue; LDA &f temp24 := f; ADC &temp23 temp24 := temp24 + temp23; PHA push temp24 LDA &temp22 push temp22 JSR &f2 call F2 STA &temp25 temp25 := returnValue; Allocate space for 7 temporary variables on stack x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x; LDA #5 push 5 PHA JSR &F5 call F5 STA &temp31 temp31 := returnValue; LDA #1 tempX := 1 CLC ADC &a tempX := tempX + a; TAY save tempX into Y reg. LDA &temp12 temp4 := temp12; LDA &temp25 temp4 := temp4 * temp25; JSR &runtimeMultiply call Pascal runtime STA &temp4 temp4 := returnValue TYA restore tempX from Y ADC &temp4 tempX := tempX + temp4; ADC &temp31 tempX := tempX + temp31; ADC &x tempX := tempX + x; STA &x x := tempX; S := S + 7 6 1 2 7 3 4 5

Allocate space for 7 temporary variables on stack
  x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x;
    TSX                     allocate 7 one byte temps
    TXA
    SEC
    SBC #7
    TAX
    TXS
    LDA &b                  temp11 := b;
    CLC
    ADC &c                  temp11 := temp11 + c;
    PHA                     push temp11
    JSR &F1                 call F1
    STA &temp12             temp12 := returnValue;
    LDA #3                  push 3
    PHA
    LDA #2                  push 2
    JSR &F3                 call F3
    STA &temp21             temp21 := returnValue;
    LDA &d                  temp22 := d;
    ADC &e                  temp22 := temp22 + e;
    ADC &temp21             temp22 := temp22 + temp21;
    STA &temp22
    LDA #4                  push 4
    JSR &F4                 call F4
    STA &temp23             temp23 := returnValue;
    LDA &f                  temp24 := f;
    ADC &temp23             temp24 := temp24 + temp23;
    PHA                     push temp24
    LDA &temp22             push temp22
    JSR &F2                 call F2
    STA &temp25             temp25 := returnValue;
    LDA #5                  push 5
    PHA
    JSR &F5                 call F5
    STA &temp31             temp31 := returnValue;
    LDA #1                  tempX := 1
    CLC
    ADC &a                  tempX := tempX + a;
    TAY                     save tempX into Y reg.
    LDA &temp12             temp4 := temp12;
    LDA &temp25             temp4 := temp4 * temp25;
    JSR &runtimeMultiply    call Pascal runtime
    STA &temp4              temp4 := returnValue
    TYA                     restore tempX from Y
    ADC &temp4              tempX := tempX + temp4;
    ADC &temp31             tempX := tempX + temp31;
    ADC &x                  tempX := tempX + x;
    STA &x                  x := tempX;
    TSX                     free 7 one byte temps
    TXA
    ADC #7
    TAX
    TXS
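The 6502 cannot add to or subtract from S directly, so the value takes a detour through X and A. The sketch below models the net effect of both sequences on the 8-bit stack pointer in Python; the function names are hypothetical and the starting value $E9 is only an example. Note that the freeing sequence on the slide has no CLC before ADC #7, so in this model the carry left over from earlier arithmetic is added as well.

```python
def allocate(s, count):
    """Net effect of TSX / TXA / SEC / SBC #count / TAX / TXS on S."""
    return (s - count) & 0xFF          # SEC means no borrow is subtracted


def free(s, count, carry):
    """Net effect of TSX / TXA / ADC #count / TAX / TXS on S."""
    return (s + count + carry) & 0xFF  # ADC also adds the current carry flag
```

Starting from S = $E9, `allocate(0xE9, 7)` gives $E2. `free(0xE2, 7, 0)` restores $E9, but `free(0xE2, 7, 1)` gives $EA, one byte too many, which suggests a CLC would be needed if the preceding addition can leave the carry set.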

Removing arguments from stack Passing all return values in A register TSX allocate 7 one byte temps TXA SEC SBC #7 TAX TXS LDA &b temp11 := b; CLC ADC &c temp11 := temp11 + c; PHA push temp11 JSR &F1 call F1 TSX remove arguments from stack INX STA &temp12 temp12 := returnValue; LDA #3 push 3 PHA LDA #2 push 2 JSR &F3 call F3 TSX remove arguments from stack STA &temp21 temp21 := returnValue; LDA &d temp22 := d; ADC &e temp22 := temp22 + e; ADC &temp21 temp22 := temp22 + temp21; STA &temp22 LDA #4 push 4 JSR &F3 call F4 TSX remove arguments from stack STA &temp23 temp23 := returnValue; LDA &f temp24 := f; ADC &temp23 temp24 := temp24 + temp23; PHA push temp24 LDA &temp22 push temp22 JSR &f2 call F2 STA &temp25 temp25 := returnValue; x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x; LDA #5 push 5 PHA JSR &F5 call F5 TSX remove arguments from stack INX TXS STA &temp31 temp31 := returnValue; LDA #1 tempX := 1 CLC ADC &a tempX := tempX + a; TAY save tempX into Y reg. LDA &temp12 temp4 := temp12; LDA &temp25 temp4 := temp4 * temp25; JSR &runtimeMultiply call Pascal runtime STA &temp4 temp4 := returnValue TSX remove arguments from stack TYA restore tempX from Y ADC &temp4 tempX := tempX + temp4; ADC &temp31 tempX := tempX + temp31; ADC &x tempX := tempX + x; STA &x x := tempX; TSX free 7 one byte temps TXA ADC #7 TAX

x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x; TSX allocate 7 one byte temps TXA SEC SBC #7 TAX TXS LDA &b temp11 := b; CLC ADC &c temp11 := temp11 + c; PHA push temp11 JSR &F1 call F1 TSX INX remove arguments from stack STA &temp12 temp12 := returnValue; LDA #3 push 3 PHA LDA #2 push 2 JSR &F3 call F3 TSX remove arguments from stack INX STA &temp21 temp21 := returnValue; LDA &d temp22 := d; ADC &e temp22 := temp22 + e; ADC &temp21 temp22 := temp22 + temp21; STA &temp22 LDA #4 push 4 JSR &F3 call F4 TSX remove arguments from stack STA &temp23 temp23 := returnValue; LDA &f temp24 := f; ADC &temp23 temp24 := temp24 + temp23; PHA push temp24 LDA &temp22 push temp22 JSR &f2 call F2 TSX remove arguments from stack STA &temp25 temp25 := returnValue; Passing return value on stack for F1, and in accumulator for F2, F3, F4, F5 x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x; LDA #5 push 5 PHA JSR &F5 call F5 TSX remove arguments from stack INX TXS STA &temp31 temp31 := returnValue; LDA #1 tempX := 1 CLC ADC &a tempX := tempX + a; TAY save tempX into Y reg. LDA &temp12 temp4 := temp12; LDA &temp25 temp4 := temp4 * temp25; JSR &runtimeMultiply call Pascal runtime STA &temp4 temp4 := returnValue TSX remove arguments from stack TYA restore tempX from Y ADC &temp4 tempX := tempX + temp4; ADC &temp31 tempX := tempX + temp31; ADC &x tempX := tempX + x; STA &x x := tempX; TSX free 7 one byte temps TXA ADC #7 TAX

x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x; TSX allocate 7 one byte temps TXA SEC SBC #7 TAX TXS LDA &b temp11 := b; CLC ADC &c temp11 := temp11 + c; PHA push temp11 PHA reserve space for return value by pushing dummy value JSR &F1 call F1 TSX ?? save return value into A INX remove return value from stack INX remove arguments from stack STA &temp12 temp12 := returnValue; LDA #3 push 3 PHA LDA #2 push 2 JSR &F3 call F3 TSX remove arguments from stack INX STA &temp21 temp21 := returnValue; LDA &d temp22 := d; ADC &e temp22 := temp22 + e; ADC &temp21 temp22 := temp22 + temp21; STA &temp22 LDA #4 push 4 JSR &F3 call F4 TSX remove arguments from stack STA &temp23 temp23 := returnValue; LDA &f temp24 := f; ADC &temp23 temp24 := temp24 + temp23; PHA push temp24 LDA &temp22 push temp22 JSR &f2 call F2 TSX remove arguments from stack STA &temp25 temp25 := returnValue; Passing return value on stack for F1, and in accumulator for F2, F3, F4, F5 x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x; LDA #5 push 5 PHA JSR &F5 call F5 TSX remove arguments from stack INX TXS STA &temp31 temp31 := returnValue; LDA #1 tempX := 1 CLC ADC &a tempX := tempX + a; TAY save tempX into Y reg. LDA &temp12 temp4 := temp12; LDA &temp25 temp4 := temp4 * temp25; JSR &runtimeMultiply call Pascal runtime STA &temp4 temp4 := returnValue TSX remove arguments from stack TYA restore tempX from Y ADC &temp4 tempX := tempX + temp4; ADC &temp31 tempX := tempX + temp31; ADC &x tempX := tempX + x; STA &x x := tempX; TSX free 7 one byte temps TXA ADC #7 TAX

6502 stack pointer register: TSX allocate 7 one byte temps TXA SEC SBC #7 TAX TXS LDA &b temp11 := b; CLC ADC &c temp11 := temp11 + c; PHA push temp11 PHA reserve space for return value by pushing dummy value JSR &F1 call F1 TSX ?? save return value into A INX remove return value from stack INX remove arguments from stack STA &temp12 temp12 := returnValue; LDA #3 push 3 PHA LDA #2 push 2 JSR &F3 call F3 TSX remove arguments from stack INX STA &temp21 temp21 := returnValue; LDA &d temp22 := d; ADC &e temp22 := temp22 + e; ADC &temp21 temp22 := temp22 + temp21; STA &temp22 LDA #4 push 4 JSR &F3 call F4 TSX remove arguments from stack STA &temp23 temp23 := returnValue; LDA &f temp24 := f; ADC &temp23 temp24 := temp24 + temp23; PHA push temp24 LDA &temp22 push temp22 JSR &f2 call F2 TSX remove arguments from stack STA &temp25 temp25 := returnValue; Passing return value on stack for F1, and in accumulator for F2, F3, F4, F5 x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x; LDA #5 push 5 PHA JSR &F5 call F5 TSX remove arguments from stack INX TXS STA &temp31 temp31 := returnValue; LDA #1 tempX := 1 CLC ADC &a tempX := tempX + a; TAY save tempX into Y reg. LDA &temp12 temp4 := temp12; LDA &temp25 temp4 := temp4 * temp25; JSR &runtimeMultiply call Pascal runtime STA &temp4 temp4 := returnValue TSX remove arguments from stack TYA restore tempX from Y ADC &temp4 tempX := tempX + temp4; ADC &temp31 tempX := tempX + temp31; ADC &x tempX := tempX + x; STA &x x := tempX; TSX free 7 one byte temps TXA ADC #7 TAX ... ?? $01EC $01EB $01EA $01E9 ↔ temp $01E8 ↔ temp $01E7 ↔ temp $01E6 ↔ temp $01E5 ↔ temp $01E4 ↔ temp $01E3 ↔ temp a1 $01E2 ↔ 1st argument r $01E1 ↔ return value S($E0) → $01E0 $01DF 6502 stack pointer register: 0000 0001 7 0 S

6502 stack pointer register: TSX allocate 7 one byte temps TXA SEC SBC #7 TAX TXS LDA &b temp11 := b; CLC ADC &c temp11 := temp11 + c; PHA push temp11 PHA reserve space for return value by pushing dummy value JSR &F1 call F1 TSX ?? save return value into A INX remove return value from stack INX remove arguments from stack STA &temp12 temp12 := returnValue; LDA #3 push 3 PHA LDA #2 push 2 JSR &F3 call F3 TSX remove arguments from stack INX STA &temp21 temp21 := returnValue; LDA &d temp22 := d; ADC &e temp22 := temp22 + e; ADC &temp21 temp22 := temp22 + temp21; STA &temp22 LDA #4 push 4 JSR &F3 call F4 TSX remove arguments from stack STA &temp23 temp23 := returnValue; LDA &f temp24 := f; ADC &temp23 temp24 := temp24 + temp23; PHA push temp24 LDA &temp22 push temp22 JSR &f2 call F2 TSX remove arguments from stack STA &temp25 temp25 := returnValue; Passing return value on stack for F1, and in accumulator for F2, F3, F4, F5 x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x; LDA #5 push 5 PHA JSR &F5 call F5 TSX remove arguments from stack INX TXS STA &temp31 temp31 := returnValue; LDA #1 tempX := 1 CLC ADC &a tempX := tempX + a; TAY save tempX into Y reg. LDA &temp12 temp4 := temp12; LDA &temp25 temp4 := temp4 * temp25; JSR &runtimeMultiply call Pascal runtime STA &temp4 temp4 := returnValue TSX remove arguments from stack TYA restore tempX from Y ADC &temp4 tempX := tempX + temp4; ADC &temp31 tempX := tempX + temp31; ADC &x tempX := tempX + x; STA &x x := tempX; TSX free 7 one byte temps TXA ADC #7 TAX ... ?? $01EC $01EB $01EA $01E9 ↔ temp $01E8 ↔ temp $01E7 ↔ temp $01E6 ↔ temp $01E5 ↔ temp $01E4 ↔ temp $01E3 ↔ temp $100+S+2 a1 $01E2 ↔ 1st argument $100+S+1 r $01E1 ↔ return value S($E0) → $01E0 $01DF 6502 stack pointer register: 0000 0001 7 0 S

6502 stack pointer register: TSX allocate 7 one byte temps TXA SEC SBC #7 TAX TXS LDA &b temp11 := b; CLC ADC &c temp11 := temp11 + c; PHA push temp11 PHA reserve space for return value by pushing dummy value JSR &F1 call F1 TSX ?? save return value into A INX remove return value from stack INX remove arguments from stack STA &temp12 temp12 := returnValue; LDA #3 push 3 PHA LDA #2 push 2 JSR &F3 call F3 TSX remove arguments from stack INX STA &temp21 temp21 := returnValue; LDA &d temp22 := d; ADC &e temp22 := temp22 + e; ADC &temp21 temp22 := temp22 + temp21; STA &temp22 LDA #4 push 4 JSR &F3 call F4 TSX remove arguments from stack STA &temp23 temp23 := returnValue; LDA &f temp24 := f; ADC &temp23 temp24 := temp24 + temp23; PHA push temp24 LDA &temp22 push temp22 JSR &f2 call F2 TSX remove arguments from stack STA &temp25 temp25 := returnValue; Passing return value on stack for F1, and in accumulator for F2, F3, F4, F5 x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x; LDA #5 push 5 PHA JSR &F5 call F5 TSX remove arguments from stack INX TXS STA &temp31 temp31 := returnValue; LDA #1 tempX := 1 CLC ADC &a tempX := tempX + a; TAY save tempX into Y reg. LDA &temp12 temp4 := temp12; LDA &temp25 temp4 := temp4 * temp25; JSR &runtimeMultiply call Pascal runtime STA &temp4 temp4 := returnValue TSX remove arguments from stack TYA restore tempX from Y ADC &temp4 tempX := tempX + temp4; ADC &temp31 tempX := tempX + temp31; ADC &x tempX := tempX + x; STA &x x := tempX; TSX free 7 one byte temps TXA ADC #7 TAX ... ?? $01EC $01EB $01EA $01E9 ↔ temp $01E8 ↔ temp $01E7 ↔ temp $01E6 ↔ temp $01E5 ↔ temp $01E4 ↔ temp $01E3 ↔ temp $102+S a1 $01E2 ↔ 1st argument $101+S r $01E1 ↔ return value S($E0) → $01E0 $01DF 6502 stack pointer register: 0000 0001 7 0 S

Not a valid 6502 instruction TSX allocate 7 one byte temps TXA SEC SBC #7 TAX TXS LDA &b temp11 := b; CLC ADC &c temp11 := temp11 + c; PHA push temp11 PHA reserve space for return value by pushing dummy value JSR &F1 call F1 TSX LDA $101 + S save return value into A INX remove return value from stack INX remove arguments from stack STA &temp12 temp12 := returnValue; LDA #3 push 3 PHA LDA #2 push 2 JSR &F3 call F3 TSX remove arguments from stack INX STA &temp21 temp21 := returnValue; LDA &d temp22 := d; ADC &e temp22 := temp22 + e; ADC &temp21 temp22 := temp22 + temp21; STA &temp22 LDA #4 push 4 JSR &F3 call F4 TSX remove arguments from stack STA &temp23 temp23 := returnValue; LDA &f temp24 := f; ADC &temp23 temp24 := temp24 + temp23; PHA push temp24 LDA &temp22 push temp22 JSR &f2 call F2 TSX remove arguments from stack STA &temp25 temp25 := returnValue; Passing return value on stack for F1, and in accumulator for F2, F3, F4, F5 x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x; LDA #5 push 5 PHA JSR &F5 call F5 TSX remove arguments from stack INX TXS STA &temp31 temp31 := returnValue; LDA #1 tempX := 1 CLC ADC &a tempX := tempX + a; TAY save tempX into Y reg. LDA &temp12 temp4 := temp12; LDA &temp25 temp4 := temp4 * temp25; JSR &runtimeMultiply call Pascal runtime STA &temp4 temp4 := returnValue TSX remove arguments from stack TYA restore tempX from Y ADC &temp4 tempX := tempX + temp4; ADC &temp31 tempX := tempX + temp31; ADC &x tempX := tempX + x; STA &x x := tempX; TSX free 7 one byte temps TXA ADC #7 TAX Not a valid 6502 instruction ... ?? $01EC $01EB $01EA $01E9 ↔ temp $01E8 ↔ temp $01E7 ↔ temp $01E6 ↔ temp $01E5 ↔ temp $01E4 ↔ temp $01E3 ↔ temp $102+S a1 $01E2 ↔ 1st argument $101+S r $01E1 ↔ return value S($E0) → $01E0 $01DF 6502 stack pointer register: 0000 0001 7 0 S

Not a valid 6502 instruction TSX allocate 7 one byte temps TXA SEC SBC #7 TAX TXS LDA &b temp11 := b; CLC ADC &c temp11 := temp11 + c; PHA push temp11 PHA reserve space for return value by pushing dummy value JSR &F1 call F1 TSX LDA $101 + S save return value into A INX remove return value from stack INX remove arguments from stack STA &temp12 temp12 := returnValue; LDA #3 push 3 PHA LDA #2 push 2 JSR &F3 call F3 TSX remove arguments from stack INX STA &temp21 temp21 := returnValue; LDA &d temp22 := d; ADC &e temp22 := temp22 + e; ADC &temp21 temp22 := temp22 + temp21; STA &temp22 LDA #4 push 4 JSR &F3 call F4 TSX remove arguments from stack STA &temp23 temp23 := returnValue; LDA &f temp24 := f; ADC &temp23 temp24 := temp24 + temp23; PHA push temp24 LDA &temp22 push temp22 JSR &f2 call F2 TSX remove arguments from stack STA &temp25 temp25 := returnValue; Passing return value on stack for F1, and in accumulator for F2, F3, F4, F5 x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x; LDA #5 push 5 PHA JSR &F5 call F5 TSX remove arguments from stack INX TXS STA &temp31 temp31 := returnValue; LDA #1 tempX := 1 CLC ADC &a tempX := tempX + a; TAY save tempX into Y reg. LDA &temp12 temp4 := temp12; LDA &temp25 temp4 := temp4 * temp25; JSR &runtimeMultiply call Pascal runtime STA &temp4 temp4 := returnValue TSX remove arguments from stack TYA restore tempX from Y ADC &temp4 tempX := tempX + temp4; ADC &temp31 tempX := tempX + temp31; ADC &x tempX := tempX + x; STA &x x := tempX; TSX free 7 one byte temps TXA ADC #7 TAX 6502 indexed load into A: LDA $xxxx,X LDA $xxxx,Y A := ($xxxx + X)^ A := ($xxxx + Y)^ Not a valid 6502 instruction ... ?? $01EC $01EB $01EA $01E9 ↔ temp $01E8 ↔ temp $01E7 ↔ temp $01E6 ↔ temp $01E5 ↔ temp $01E4 ↔ temp $01E3 ↔ temp $102+S a1 $01E2 ↔ 1st argument $101+S r $01E1 ↔ return value S($E0) → $01E0 $01DF 6502 stack pointer register: 0000 0001 7 0 S

We already have current content of S register stored in X register! TSX allocate 7 one byte temps TXA SEC SBC #7 TAX TXS LDA &b temp11 := b; CLC ADC &c temp11 := temp11 + c; PHA push temp11 PHA reserve space for return value by pushing dummy value JSR &F1 call F1 TSX LDA $101 + S save return value into A INX remove return value from stack INX remove arguments from stack STA &temp12 temp12 := returnValue; LDA #3 push 3 PHA LDA #2 push 2 JSR &F3 call F3 TSX remove arguments from stack INX STA &temp21 temp21 := returnValue; LDA &d temp22 := d; ADC &e temp22 := temp22 + e; ADC &temp21 temp22 := temp22 + temp21; STA &temp22 LDA #4 push 4 JSR &F3 call F4 TSX remove arguments from stack STA &temp23 temp23 := returnValue; LDA &f temp24 := f; ADC &temp23 temp24 := temp24 + temp23; PHA push temp24 LDA &temp22 push temp22 JSR &f2 call F2 TSX remove arguments from stack STA &temp25 temp25 := returnValue; Passing return value on stack for F1, and in accumulator for F2, F3, F4, F5 x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x; We already have current content of S register stored in X register! LDA #5 push 5 PHA JSR &F5 call F5 TSX remove arguments from stack INX TXS STA &temp31 temp31 := returnValue; LDA #1 tempX := 1 CLC ADC &a tempX := tempX + a; TAY save tempX into Y reg. LDA &temp12 temp4 := temp12; LDA &temp25 temp4 := temp4 * temp25; JSR &runtimeMultiply call Pascal runtime STA &temp4 temp4 := returnValue TSX remove arguments from stack TYA restore tempX from Y ADC &temp4 tempX := tempX + temp4; ADC &temp31 tempX := tempX + temp31; ADC &x tempX := tempX + x; STA &x x := tempX; TSX free 7 one byte temps TXA ADC #7 TAX 6502 indexed load into A: LDA $xxxx,X LDA $xxxx,Y A := ($xxxx + X)^ A := ($xxxx + Y)^ Not a valid 6502 instruction ... ?? 
Stack contents (page $01):
  $01EC
  $01EB
  $01EA
  $01E9 ↔ temp
  $01E8 ↔ temp
  $01E7 ↔ temp
  $01E6 ↔ temp
  $01E5 ↔ temp
  $01E4 ↔ temp
  $01E3 ↔ temp
  $01E2 ↔ a1, 1st argument ($102+S)
  $01E1 ↔ r, return value ($101+S)
  $01E0 ← S ($E0)
  $01DF
6502 stack pointer register: 16-bit stack address = 0000 0001 (fixed high byte) followed by S (bits 7 to 0)

We already have current content of S register stored in X register! TSX allocate 7 one byte temps TXA SEC SBC #7 TAX TXS LDA &b temp11 := b; CLC ADC &c temp11 := temp11 + c; PHA push temp11 PHA reserve space for return value by pushing dummy value JSR &F1 call F1 TSX LDA $101,X save return value into A INX remove return value from stack INX remove arguments from stack STA &temp12 temp12 := returnValue; LDA #3 push 3 PHA LDA #2 push 2 JSR &F3 call F3 TSX remove arguments from stack INX STA &temp21 temp21 := returnValue; LDA &d temp22 := d; ADC &e temp22 := temp22 + e; ADC &temp21 temp22 := temp22 + temp21; STA &temp22 LDA #4 push 4 JSR &F3 call F4 TSX remove arguments from stack STA &temp23 temp23 := returnValue; LDA &f temp24 := f; ADC &temp23 temp24 := temp24 + temp23; PHA push temp24 LDA &temp22 push temp22 JSR &f2 call F2 TSX remove arguments from stack STA &temp25 temp25 := returnValue; Passing return value on stack for F1, and in accumulator for F2, F3, F4, F5 x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x; We already have current content of S register stored in X register! LDA #5 push 5 PHA JSR &F5 call F5 TSX remove arguments from stack INX TXS STA &temp31 temp31 := returnValue; LDA #1 tempX := 1 CLC ADC &a tempX := tempX + a; TAY save tempX into Y reg. LDA &temp12 temp4 := temp12; LDA &temp25 temp4 := temp4 * temp25; JSR &runtimeMultiply call Pascal runtime STA &temp4 temp4 := returnValue TSX remove arguments from stack TYA restore tempX from Y ADC &temp4 tempX := tempX + temp4; ADC &temp31 tempX := tempX + temp31; ADC &x tempX := tempX + x; STA &x x := tempX; TSX free 7 one byte temps TXA ADC #7 TAX LDA $101 + S 6502 indexed load into A: LDA $xxxx,X LDA $xxxx,Y A := ($xxxx + X)^ A := ($xxxx + Y)^ Not a valid 6502 instruction ... ?? 
Stack contents (page $01):
  $01EC
  $01EB
  $01EA
  $01E9 ↔ temp
  $01E8 ↔ temp
  $01E7 ↔ temp
  $01E6 ↔ temp
  $01E5 ↔ temp
  $01E4 ↔ temp
  $01E3 ↔ temp
  $01E2 ↔ a1, 1st argument ($102+S)
  $01E1 ↔ r, return value ($101+S)
  $01E0 ← S ($E0)
  $01DF
6502 stack pointer register: 16-bit stack address = 0000 0001 (fixed high byte) followed by S (bits 7 to 0)
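The TSX / LDA $101,X idiom from the slides can be traced with a small model of the 6502 stack page. This is only a sketch, not a CPU emulator: the register dictionary, the helper names and the starting value S = $E2 are assumptions chosen so that the two pushes land at $01E2 and $01E1 as in the stack diagram. JSR and RTS are not modelled (they push and pop the return address, leaving S unchanged after the call), and the callee's result ($63) is a made-up value.

```python
# Minimal model of the 6502 hardware stack (page $01); not a full CPU emulator.
memory = bytearray(0x10000)            # 64 KiB address space
regs = {"A": 0, "X": 0, "S": 0xE2}     # assumed start: two pushes later S = $E0


def pha():
    """PHA: store A at $0100+S, then decrement S."""
    memory[0x0100 + regs["S"]] = regs["A"]
    regs["S"] = (regs["S"] - 1) & 0xFF


def tsx():
    """TSX: copy the stack pointer S into X."""
    regs["X"] = regs["S"]


def lda_abs_x(base):
    """LDA base,X: A := byte at address (base + X)."""
    regs["A"] = memory[(base + regs["X"]) & 0xFFFF]


regs["A"] = 0x2A      # hypothetical argument value (b + c)
pha()                 # push temp11 -> stored at $01E2
pha()                 # reserve space for the return value -> $01E1
memory[0x01E1] = 0x63 # stand-in for F1 writing its return value into the slot

tsx()                 # X := S ($E0)
lda_abs_x(0x0101)     # A := ($0101 + X)^ = ($01E1)^, i.e. "$101 + S"
print(hex(regs["X"]), hex(regs["A"]))
```

Because the 6502 has no S-relative addressing mode, copying S into X and using the ordinary absolute,X mode is what makes "$101 + S" expressible at all.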

Assign memory locations to temporary variables TSX allocate 7 one byte temps TXA SEC SBC #7 TAX TXS LDA &b temp11 := b; CLC ADC &c temp11 := temp11 + c; PHA push temp11 PHA reserve space for return value by pushing dummy value JSR &F1 call F1 TSX LDA $101,X save return value into A INX remove return value from stack INX remove arguments from stack STA &temp12 temp12 := returnValue; LDA #3 push 3 PHA LDA #2 push 2 JSR &F3 call F3 TSX remove arguments from stack INX STA &temp21 temp21 := returnValue; LDA &d temp22 := d; ADC &e temp22 := temp22 + e; ADC &temp21 temp22 := temp22 + temp21; STA &temp22 LDA #4 push 4 JSR &F3 call F4 TSX remove arguments from stack STA &temp23 temp23 := returnValue; LDA &f temp24 := f; ADC &temp23 temp24 := temp24 + temp23; PHA push temp24 LDA &temp22 push temp22 JSR &f2 call F2 TSX remove arguments from stack STA &temp25 temp25 := returnValue; Assign memory locations to temporary variables Passing return value on stack for F1, and in accumulator for F2, F3, F4, F5 x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x; LDA #5 push 5 PHA JSR &F5 call F5 TSX remove arguments from stack INX TXS STA &temp31 temp31 := returnValue; LDA #1 tempX := 1 CLC ADC &a tempX := tempX + a; TAY save tempX into Y reg. LDA &temp12 temp4 := temp12; LDA &temp25 temp4 := temp4 * temp25; JSR &runtimeMultiply call Pascal runtime STA &temp4 temp4 := returnValue TSX remove arguments from stack TYA restore tempX from Y ADC &temp4 tempX := tempX + temp4; ADC &temp31 tempX := tempX + temp31; ADC &x tempX := tempX + x; STA &x x := tempX; TSX free 7 one byte temps TXA ADC #7 TAX ... ?? $01EC $01EB $01EA +7 $01E9 ↔ temp12 +6 $01E8 ↔ temp21 +5 $01E7 ↔ temp22 +4 $01E6 ↔ temp23 +3 $01E5 ↔ temp25 +2 $01E4 ↔ temp31 +1 $01E3 ↔ temp4 S($E0) → a1 $01E2 r $01E1 $01E0 $01DF

Assign memory locations to temporary variables TSX allocate 7 one byte temps TXA SEC SBC #7 TAX TXS LDA &b temp11 := b; CLC ADC &c temp11 := temp11 + c; PHA push temp11 PHA reserve space for return value by pushing dummy value JSR &F1 call F1 TSX LDA $101,X save return value into A INX remove return value from stack INX remove arguments from stack STA $107,X temp12 := returnValue; LDA #3 push 3 PHA LDA #2 push 2 JSR &F3 call F3 TSX remove arguments from stack INX STA $106,X temp21 := returnValue; LDA &d temp22 := d; ADC &e temp22 := temp22 + e; ADC $106,X temp22 := temp22 + temp21; STA $105,X LDA #4 push 4 JSR &F3 call F4 TSX remove arguments from stack STA $104,X temp23 := returnValue; LDA &f temp24 := f; ADC $104,X temp24 := temp24 + temp23; PHA push temp24 LDA $105,X push temp22 JSR &f2 call F2 TSX remove arguments from stack STA $103,X temp25 := returnValue; Assign memory locations to temporary variables x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x; LDA #5 push 5 PHA JSR &F5 call F5 TSX remove arguments from stack INX TXS STA $102,X temp31 := returnValue; LDA #1 tempX := 1 CLC ADC &a tempX := tempX + a; TAY save tempX into Y reg. LDA $107,X temp4 := temp12; LDA $103,X temp4 := temp4 * temp25; JSR &runtimeMultiply call Pascal runtime STA $101,X temp4 := returnValue TSX remove arguments from stack TYA restore tempX from Y ADC $101,X tempX := tempX + temp4; ADC $102,X tempX := tempX + temp31; ADC &x tempX := tempX + x; STA &x x := tempX; TSX free 7 one byte temps TXA ADC #7 TAX ... ?? $01EC $01EB $01EA +7 $01E9 ↔ temp12 +6 $01E8 ↔ temp21 +5 $01E7 ↔ temp22 +4 $01E6 ↔ temp23 +3 $01E5 ↔ temp25 +2 $01E4 ↔ temp31 +1 $01E3 ↔ temp4 S($E0) → a1 $01E2 r $01E1 $01E0 $01DF

Final variant = 89 instructions in 6502 CPU machine code TSX allocate 7 one byte temps TXA SEC SBC #7 TAX TXS LDA &b temp11 := b; CLC ADC &c temp11 := temp11 + c; PHA push temp11 PHA reserve space for return value by pushing dummy value JSR &F1 call F1 TSX LDA $101,X save return value into A INX remove return value from stack INX remove arguments from stack STA $107,X temp12 := returnValue; LDA #3 push 3 PHA LDA #2 push 2 JSR &F3 call F3 TSX remove arguments from stack INX STA $106,X temp21 := returnValue; LDA &d temp22 := d; ADC &e temp22 := temp22 + e; ADC $106,X temp22 := temp22 + temp21; STA $105,X LDA #4 push 4 JSR &F3 call F4 TSX remove arguments from stack STA $104,X temp23 := returnValue; LDA &f temp24 := f; ADC $104,X temp24 := temp24 + temp23; PHA push temp24 LDA $105,X push temp22 JSR &f2 call F2 TSX remove arguments from stack STA $103,X temp25 := returnValue; Single command in Pascal: x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x; LDA #5 push 5 PHA JSR &F5 call F5 TSX remove arguments from stack INX TXS STA $102,X temp31 := returnValue; LDA #1 tempX := 1 CLC ADC &a tempX := tempX + a; TAY save tempX into Y reg. LDA $107,X temp4 := temp12; LDA $103,X temp4 := temp4 * temp25; JSR &runtimeMultiply call Pascal runtime STA $101,X temp4 := returnValue TSX remove arguments from stack TYA restore tempX from Y ADC $101,X tempX := tempX + temp4; ADC $102,X tempX := tempX + temp31; ADC &x tempX := tempX + x; STA &x x := tempX; TSX free 7 one byte temps TXA ADC #7 TAX Final variant = 89 instructions in 6502 CPU machine code

X86-* Registers: More Registers → More Freedom For Programmer/Compiler EAX AX AH AL 31 16 15 8 7 0 EAX EBX 31 16 15 8 7 0 ECX 31 16 15 8 7 0 EDX 31 16 15 8 7 0 ESI 31 0 EDI 31 0 EBP 31 0 x86 (IA-32)

x86: OP target, source
  MOV target, source → target := source
  OP target, source → target := target OP source

x86: OP target, source
  MOV target, source → target := source
  OP target, source → target := target OP source
  6502 instruction    x86 counterpart
  LD? #$xx            MOV r, imm
  LD? $xxxx           MOV r, [addr]
  LD? $xxxx,?         MOV r2, [r1 + addr]
  ST? $xxxx           MOV [addr], r
  ST? $xxxx,X         MOV [r1 + addr], r2
  T??                 MOV r2, r1
  PHA                 PUSH r
  PLA                 POP r

x86: OP target, source
  Most complex x86 addressing mode: [r1 + (r2 * scale) + imm]
  scale = immediate value: 1, 2, 4, 8

x86: Accessing Array of Records (pArr^[1].b?)
  type TRecord = record
    a : longword; { offset 0 }
    b : longword; { offset 4 }
    c : byte;     { offset 8 }
  end;
  var pArr : ^array[0..15] of TRecord;
  Memory layout if the Pascal compiler does not align records in the array (record size = 9 bytes):
    Field   Offset   Address
    [0].a   00       $00B91000 ← EAX = pArr
    [0].b   04       $00B91004
    [0].c   08       $00B91008
    [1].a   09       $00B91009   (+ECX = 1 * 9)
    [1].b   0D       $00B9100D   (+4)
    [1].c   11       $00B91011
    [2].a   12       $00B91012
    [2].b   16       $00B91016
    ...
  Most complex x86 addressing mode: [r1 + (r2 * scale) + imm], scale = immediate value: 1, 2, 4, 8
  The record size 9 is not a valid scale, so the index is multiplied explicitly:
  MOV ECX, 1                { 1 = index into array }
  IMUL ECX, 9               { 9 = size of record }
  MOV EBX, [EAX + ECX + 4]  { 4 = offset of b }

x86: Accessing Array of Records (pArr^[1].b?)
  type TRecord = record
    a : longword; { offset 0 }
    b : longword; { offset 4 }
  end;
  var pArr : ^array[0..15] of TRecord;
  Memory layout (record size = 8 bytes):
    Field   Offset   Address
    [0].a   00       $00B91000 ← EAX = pArr
    [0].b   04       $00B91004
    [1].a   08       $00B91008   (+8 = 1 * 8)
    [1].b   0C       $00B9100C   (+4)
    [2].a   10       $00B91010
    [2].b   14       $00B91014
    ...
  Most complex x86 addressing mode: [r1 + (r2 * scale) + imm], scale = immediate value: 1, 2, 4, 8
  MOV ECX, 1                    { 1 = index into array }
  MOV EBX, [EAX + ECX * 8 + 4]  { 8 = size of record, 4 = offset of b }
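The address arithmetic behind both record layouts can be checked with a few lines of Python. This is an illustrative sketch only: the function name effective_address and its parameters are made up for this example, and the base address is the value of pArr shown in the memory diagrams.

```python
def effective_address(base, index=0, scale=1, disp=0):
    """Address of the x86 operand [base + index*scale + disp], wrapped to 32 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("x86 only encodes scale factors 1, 2, 4 and 8")
    return (base + index * scale + disp) & 0xFFFFFFFF


P_ARR = 0x00B91000   # EAX = pArr, taken from the slides
OFFSET_B = 4         # offset of field b inside TRecord

# 8-byte records: the record size fits the scale field, so a single MOV is enough.
aligned = effective_address(P_ARR, index=1, scale=8, disp=OFFSET_B)

# 9-byte records: 9 is not a valid scale, so the index is multiplied
# beforehand (IMUL ECX, 9) and used with scale 1.
ecx = 1 * 9
packed = effective_address(P_ARR, index=ecx, scale=1, disp=OFFSET_B)

print(hex(aligned))  # 0xb9100c
print(hex(packed))   # 0xb9100d
```

Passing scale=9 raises an error in this sketch, which mirrors why the unaligned layout needs the extra IMUL instruction.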

x86: OP target, source
  MOV target, source → target := source
  OP target, source → target := target OP source
  ADD r, reg/imm/addr     → r := r + reg/imm/addr; EFlags.Carry := result.32
  ADC r, reg/imm/addr     → r := r + reg/imm/addr + EFlags.Carry
  SUB r, reg/imm/addr
  SBB r, reg/imm/addr
  IMUL r, reg/imm/addr
  IDIV r, reg/imm/addr    (schematic: the actual IDIV takes a single reg/addr operand and divides EDX:EAX by it)
  OR r, reg/imm/addr
  AND r, reg/imm/addr
  XOR r, reg/imm/addr
  NOT r
  SHR reg/addr, imm/CL
  SAR reg/addr, imm/CL
  SHL reg/addr, imm/CL

x86: Flags
  ADD r, reg/imm/addr → r := r + reg/imm/addr; EFlags.Carry := result.32
  ADC r, reg/imm/addr → r := r + reg/imm/addr + EFlags.Carry
  After an operation on target:
    EFlags.Sign := target.31
    if target = 0 then EFlags.Zero := 1 else EFlags.Zero := 0;
  CLC → EFlags.Carry := 0
  STC → EFlags.Carry := 1
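The flag rules above can be modelled in Python as follows. This is a sketch under stated assumptions: add32 is a hypothetical helper, operands are treated as unsigned 32-bit values, and only the Carry, Sign and Zero flags from the slide are computed.

```python
MASK32 = 0xFFFFFFFF


def add32(target, source, carry_in=0):
    """Model of a 32-bit ADD (carry_in=0) or ADC (carry_in=EFlags.Carry).

    Returns the truncated result and the resulting flags.
    """
    full = target + source + carry_in
    result = full & MASK32
    flags = {
        "Carry": (full >> 32) & 1,        # bit 32 of the untruncated sum
        "Sign": (result >> 31) & 1,       # bit 31 of the result
        "Zero": 1 if result == 0 else 0,
    }
    return result, flags


# 64-bit addition $00000000FFFFFFFF + 1 built from two 32-bit halves:
# ADD on the low halves, then ADC on the high halves.
low, low_flags = add32(0xFFFFFFFF, 0x00000001)
high, high_flags = add32(0x00000000, 0x00000000, low_flags["Carry"])
print(hex(high), hex(low), low_flags)
```

The low half wraps to zero and sets Carry, and ADC then propagates that carry into the high half, which is the usual reason ADC exists alongside ADD.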

Same Example Targeting x86 Architecture
  x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x;
  temp11 := b;
  temp11 := temp11 + c;
  push temp11
  call F1
  temp12 := returnValue;
  push 3
  push 2
  call F3
  temp21 := returnValue;
  temp22 := d;
  temp22 := temp22 + e;
  temp22 := temp22 + temp21;
  push 4
  call F4
  temp23 := returnValue;
  temp24 := f;
  temp24 := temp24 + temp23;
  push temp24
  push temp22
  call F2
  temp25 := returnValue;
  push 5
  call F5
  temp31 := returnValue;
  tempX := 1
  tempX := tempX + a;
  temp4 := temp12;
  temp4 := temp4 * temp25;
  tempX := tempX + temp4;
  tempX := tempX + temp31;
  tempX := tempX + x;
  x := tempX;
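As a sanity check, the sequence of temporaries above can be replayed in Python and compared with direct evaluation of the Pascal expression. The bodies of F1 to F5 are invented for this sketch (the slides never define them), and the push order is read as right-to-left, so "push temp24, push temp22, call F2" corresponds to F2(temp22, temp24).

```python
# Hypothetical function bodies; the lecture does not define F1..F5.
def F1(p): return p + 1
def F2(p, q): return p * 2 + q
def F3(p, q): return p - q
def F4(p): return p * 3
def F5(p): return p + 10


def direct(a, b, c, d, e, f, x):
    """The single Pascal statement, evaluated as one expression."""
    return 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x


def lowered(a, b, c, d, e, f, x):
    """The same computation, one operation per step, as in the slide."""
    temp11 = b
    temp11 = temp11 + c
    temp12 = F1(temp11)
    temp21 = F3(2, 3)
    temp22 = d
    temp22 = temp22 + e
    temp22 = temp22 + temp21
    temp23 = F4(4)
    temp24 = f
    temp24 = temp24 + temp23
    temp25 = F2(temp22, temp24)
    temp31 = F5(5)
    tempX = 1
    tempX = tempX + a
    temp4 = temp12
    temp4 = temp4 * temp25
    tempX = tempX + temp4
    tempX = tempX + temp31
    tempX = tempX + x
    return tempX


args = (1, 2, 3, 4, 5, 6, 7)
print(direct(*args) == lowered(*args))
```

Python integers do not overflow, so this checks only the order of operations, not the byte-sized or 32-bit arithmetic of the real targets.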

x86 – With “Dynamic Temporary Variable Allocation”
  x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x;
  MOV EAX, [&b]     ; temp11 := b;
  ADD EAX, [&c]     ; temp11 := temp11 + c;
  PUSH EAX          ; push temp11
  CALL &F1          ; call F1
  ADD ESP, 4        ; remove argument from stack
  PUSH EAX          ; temp12 := returnValue; (temp12 lives on the stack)
  PUSH 3            ; push 3
  PUSH 2            ; push 2
  CALL &F3          ; call F3
  ADD ESP, 8        ; remove arguments from stack
  NOP               ; temp21 := returnValue; (stays in EAX)
  MOV EBX, [&d]     ; temp22 := d;
  ADD EBX, [&e]     ; temp22 := temp22 + e;
  ADD EBX, EAX      ; temp22 := temp22 + temp21;
  PUSH 4            ; push 4
  CALL &F4          ; call F4
  ADD ESP, 4        ; remove argument from stack
  NOP               ; temp23 := returnValue; (stays in EAX)
  MOV ECX, [&f]     ; temp24 := f;
  ADD ECX, EAX      ; temp24 := temp24 + temp23;
  PUSH ECX          ; push temp24
  PUSH EBX          ; push temp22
  CALL &F2          ; call F2
  ADD ESP, 8        ; remove arguments from stack
  PUSH EAX          ; temp25 := returnValue; (temp25 lives on the stack)
  PUSH 5            ; push 5
  CALL &F5          ; call F5
  ADD ESP, 4        ; remove argument from stack
  NOP               ; temp31 := returnValue; (stays in EAX)
  MOV EBX, 1        ; tempX := 1
  ADD EBX, [&a]     ; tempX := tempX + a;
  POP ECX           ; ECX := temp25
  POP EDX           ; temp4 := temp12;
  IMUL EDX, ECX     ; temp4 := temp4 * temp25;
  ADD EBX, EDX      ; tempX := tempX + temp4;
  ADD EBX, EAX      ; tempX := tempX + temp31;
  ADD EBX, [&x]     ; tempX := tempX + x;
  MOV [&x], EBX     ; x := tempX;

x86 – “temporary variable preallocation”
  x := 1 + a + F1(b + c) * F2(d + e + F3(2, 3), f + F4(4)) + F5(5) + x;
  SUB ESP, 8            ; allocate two 4-byte temps: temp12 at [ESP + 4], temp25 at [ESP]
  MOV EAX, [&b]         ; temp11 := b;
  ADD EAX, [&c]         ; temp11 := temp11 + c;
  PUSH EAX              ; push temp11
  CALL &F1              ; call F1
  ADD ESP, 4            ; remove argument from stack
  MOV [ESP + 4], EAX    ; temp12 := returnValue;
  PUSH 3                ; push 3
  PUSH 2                ; push 2
  CALL &F3              ; call F3
  ADD ESP, 8            ; remove arguments from stack
  NOP                   ; temp21 := returnValue; (stays in EAX)
  MOV EBX, [&d]         ; temp22 := d;
  ADD EBX, [&e]         ; temp22 := temp22 + e;
  ADD EBX, EAX          ; temp22 := temp22 + temp21;
  PUSH 4                ; push 4
  CALL &F4              ; call F4
  ADD ESP, 4            ; remove argument from stack
  NOP                   ; temp23 := returnValue; (stays in EAX)
  MOV ECX, [&f]         ; temp24 := f;
  ADD ECX, EAX          ; temp24 := temp24 + temp23;
  PUSH ECX              ; push temp24
  PUSH EBX              ; push temp22
  CALL &F2              ; call F2
  ADD ESP, 8            ; remove arguments from stack
  MOV [ESP], EAX        ; temp25 := returnValue;
  PUSH 5                ; push 5
  CALL &F5              ; call F5
  ADD ESP, 4            ; remove argument from stack
  NOP                   ; temp31 := returnValue; (stays in EAX)
  MOV EBX, 1            ; tempX := 1
  ADD EBX, [&a]         ; tempX := tempX + a;
  MOV EDX, [ESP + 4]    ; temp4 := temp12;
  IMUL EDX, [ESP]       ; temp4 := temp4 * temp25;
  ADD EBX, EDX          ; tempX := tempX + temp4;
  ADD EBX, EAX          ; tempX := tempX + temp31;
  ADD EBX, [&x]         ; tempX := tempX + x;
  MOV [&x], EBX         ; x := tempX;
  ADD ESP, 8            ; free the two temps

x86 vs. 6502 for the same Pascal statement (x86 listing as in the preallocation variant above):
  Final variant = 36 instructions (excluding NOPs) in x86 CPU machine code
  Final variant = 89 instructions in 6502 CPU machine code

Saving Contents of Registers as Part of a Calling Convention (save in prolog, restore in epilog): on x86 save at least EBP; on 6502 save at least X

Typical x86 Prolog and Epilog
  Pascal:
    procedure P1;
    var a : longword;
        b : longword;
        c : word;
    begin
    end;
  Prolog:
    PUSH EBP
    MOV EBP, ESP
    SUB ESP, 10 ; 10 = total size of all local and temp variables
  Function/procedure body:
    EBP + x: function/procedure arguments
    EBP + 4: return address
    EBP    : old EBP (value saved for the caller)
    EBP - x: local variables/temporary variables
  Epilog:
    MOV ESP, EBP
    POP EBP
    RET
  Some compilers use a shorter variant of the epilog on x86: LEAVE (in place of MOV ESP, EBP and POP EBP)
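The frame layout created by the prolog can be traced with a small Python model of the stack. Everything concrete here is an assumption made for illustration: the starting ESP and EBP values, the argument 42, the return address $401000, and the use of a dictionary as sparse stack memory. The first argument appears at EBP + 8 because the return address and the saved EBP each occupy 4 bytes.

```python
stack = {}                               # address -> 32-bit value (sparse stack memory)
regs = {"ESP": 0x1000, "EBP": 0xAAAA}    # arbitrary starting values


def push(value):
    """PUSH: decrement ESP by 4, then store the value at [ESP]."""
    regs["ESP"] -= 4
    stack[regs["ESP"]] = value


def pop():
    """POP: load the value at [ESP], then increment ESP by 4."""
    value = stack[regs["ESP"]]
    regs["ESP"] += 4
    return value


# Caller side: one argument, then CALL pushes the return address.
push(42)                     # hypothetical argument
push(0x401000)               # hypothetical return address, as pushed by CALL

# Prolog
push(regs["EBP"])            # PUSH EBP
regs["EBP"] = regs["ESP"]    # MOV EBP, ESP
regs["ESP"] -= 10            # SUB ESP, 10  (locals a, b, c of P1)

# Body: everything is addressed relative to EBP.
argument = stack[regs["EBP"] + 8]
return_slot = stack[regs["EBP"] + 4]
saved_ebp = stack[regs["EBP"]]

# Epilog
regs["ESP"] = regs["EBP"]    # MOV ESP, EBP  (drops all locals at once)
regs["EBP"] = pop()          # POP EBP
return_address = pop()       # RET

print(argument, hex(return_slot), hex(saved_ebp), hex(regs["ESP"]))
```

After the epilog EBP holds the caller's value again and ESP points at the argument, which the caller (or a RET with an operand, depending on the calling convention) still has to remove.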