6. Program Translation CS100: The World of Computing John Dougherty Haverford College
Overview
  The Problem
  The “Source”: high-level code
  The Target: low-level (machine) code
  Types of translation
  The translation algorithm/process
  PIPPIN
The Problem
  People communicate in ambiguous, high-level languages, drawing on experience and context, and can ask for clarification interactively
    e.g., Thoreau threw through the tunnel.
  Machines have no “sense” of context or experience, and need unambiguous instructions
The Source
  High-level programming languages are close to natural language (but not quite)
    Alice, JavaScript, C++, Java, C#
  Known as source code
  Each instruction implies many lower-level instructions (as we’ll see …)
The Target
  Low-level instructions that are clear and simple: typically fixed in size, with a command (opcode) and some reference to data (operand)
    [ Opcode | Operand ]
  Known as machine code, binary code, or an executable
Program Translation: From High- to Low-Level
  Recall “divide and conquer” in programming
    Input -> Process -> Output … then details of Process, then details of …
  Typically many low-level operations per high-level instruction
  From source code to machine/binary code
  Two ways to translate …
Interpretation
  Always works with the source
  Translates and executes “on the fly”
    Like a language translator at the UN
  Easier to debug
  Executes more slowly
Compilation
  Works with an executable
  Translates the entire program from source to machine code once
  Executes the machine code as many times as needed
  Recompile often during development
  Executes substantially faster
  Most software (except open source) is distributed as a compiled executable
  Hides the algorithm
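The contrast between the two approaches can be sketched in Python. This is only an analogy: Python's built-in compile() produces bytecode rather than machine code, and the names source, run_interpreted, and run_compiled are made up for this illustration.

```python
# Sketch only: Python bytecode stands in for machine code here.
source = "x + y"  # a tiny "program" in source form


def run_interpreted(x, y):
    # Interpretation: work from the source every time, translating on the fly.
    return eval(source, {"x": x, "y": y})


# Compilation: translate the whole program once, ahead of time ...
translated = compile(source, "<example>", "eval")


def run_compiled(x, y):
    # ... then execute the already-translated form as many times as needed.
    return eval(translated, {"x": x, "y": y})
```

Both functions return the same answer for the same inputs; the difference is whether the translation work is repeated on each run or done once.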
Phases of Translation
  Scanning: breaking the text sequence into tokens (i.e., meaningful chunks)
    “while”, “=”, “For all together”
  Parsing: organizing the tokens to discover the meaning of the program
  Code Generation: writing the sequence of machine-level operations
    Opcodes, operands
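The scanning phase can be illustrated with a minimal sketch in Python. The token categories, the keyword list, and the function name scan are assumptions made for this example; they do not come from any particular compiler.

```python
import re

# Hypothetical token categories for a tiny language (illustration only).
TOKEN_PATTERN = re.compile(r"""
    (?P<STRING>"[^"]*")        # a quoted string literal
  | (?P<NUMBER>\d+)            # an integer
  | (?P<NAME>[A-Za-z_]\w*)     # a variable name or keyword
  | (?P<OP>[=+\-*/()<>])       # a single-character operator
  | (?P<SKIP>\s+)              # whitespace, which is discarded
""", re.VERBOSE)

KEYWORDS = {"while", "if", "else"}  # assumed keyword list


def scan(text):
    """Break a text sequence into (kind, value) tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_PATTERN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        kind = match.lastgroup
        if kind != "SKIP":
            value = match.group()
            if kind == "NAME" and value in KEYWORDS:
                kind = "KEYWORD"
            tokens.append((kind, value))
        pos = match.end()
    return tokens
```

For instance, scan('while x = "For all together"') yields one KEYWORD token, one NAME token, one OP token, and one STRING token; the parser would then work on that list rather than on raw characters.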
Language Levels
  High-level: one-to-many relation to machine language (e.g., z = x + y is 4 PIPPIN ops)
  Assembly language: one-to-one (roughly) relation to machine language (PIPPIN)
  Low-level: machine, or binary, language of 0s and 1s
Arithmetic Instructions
  To demonstrate this process, we’ll look at standard arithmetic expressions and statements in a high-level language
  Expressions have a pattern, or (recursively-defined) form
    Var = exp
    where exp = value | exp + exp | exp - exp | …
  (demonstration of Rosetta)
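A minimal sketch of parsing and code generation for this pattern, in Python, under several assumptions: values are restricted to variable names, tokens are separated by spaces, operators are applied left to right, and SUB is an assumed opcode name for subtraction. The function names are hypothetical, and this is not how Rosetta itself is implemented.

```python
OPCODES = {"+": "ADD", "-": "SUB"}  # SUB is an assumed mnemonic


def parse_statement(text):
    """Parse 'Var = exp' into (target, tree); a tree is a name or (op, left, right)."""
    target, expression = (part.strip() for part in text.split("=", 1))
    tokens = expression.split()
    if len(tokens) % 2 == 0:
        raise ValueError("expected: value, then zero or more 'op value' pairs")
    tree = tokens[0]
    # Fold the "op value" pairs left to right: X + Y - W becomes (-, (+, X, Y), W).
    for i in range(1, len(tokens), 2):
        tree = (tokens[i], tree, tokens[i + 1])
    return target, tree


def generate(tree):
    """Walk the tree recursively, emitting accumulator-style operations."""
    if isinstance(tree, str):
        return [f"LOD {tree}"]  # base case: a single value
    op, left, right = tree
    return generate(left) + [f"{OPCODES[op]} {right}"]


def translate(text):
    target, tree = parse_statement(text)
    return generate(tree) + [f"STO {target}", "HLT"]
```

With these assumptions, translate("Z = X + Y") produces the four operations LOD X, ADD Y, STO Z, HLT, which shows the one-to-many relation between one high-level statement and its low-level operations.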
PIPPIN instruction layout
  [ opcode | operand ]
  Each box contains a byte
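The two-byte layout can be sketched in Python as follows. The numeric opcode values below are invented purely for illustration and should not be read as the real PIPPIN encoding; encode and as_bits are hypothetical helper names.

```python
# Hypothetical opcode numbers, chosen only for illustration.
OPCODE_NUMBERS = {"HLT": 0, "LOD": 1, "STO": 2, "ADD": 3}


def encode(mnemonic, operand_address=0):
    """Pack one instruction into two bytes: the opcode byte, then the operand byte."""
    if not 0 <= operand_address <= 255:
        raise ValueError("operand must fit in one byte")
    return bytes([OPCODE_NUMBERS[mnemonic], operand_address])


def as_bits(instruction):
    """Show an encoded instruction as the 0s and 1s the machine would see."""
    return " ".join(f"{byte:08b}" for byte in instruction)
```

Under these made-up numbers, as_bits(encode("ADD", 129)) gives "00000011 10000001": one byte for the opcode and one for the operand.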
Sample PIPPIN Opcodes
  LOD (load from X)
  STO (store to X)
  HLT (halt execution)
  ADD (acc = acc + X)
Example PIPPIN program
  ; PIPPIN code for Z = X + Y
  LOD X   ; acc <= X
  ADD Y   ; acc <= acc + Y
  STO Z   ; acc => Z
  HLT     ; halt
  ; other examples AE pp
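To see what the machine does with this program, here is a minimal accumulator-machine sketch in Python. It covers only the four opcodes shown in these slides and uses a dictionary of named cells in place of real memory addresses; the function name run and the sample values for X and Y are assumptions for the example.

```python
def run(program, memory):
    """Execute a list of (opcode, operand) pairs against a dict of named memory cells."""
    acc = 0  # the accumulator
    pc = 0   # index of the next instruction to execute
    while True:
        opcode, operand = program[pc]
        pc += 1
        if opcode == "LOD":
            acc = memory[operand]        # acc <= X
        elif opcode == "ADD":
            acc = acc + memory[operand]  # acc <= acc + X
        elif opcode == "STO":
            memory[operand] = acc        # acc => X
        elif opcode == "HLT":
            return memory                # halt execution
        else:
            raise ValueError(f"unknown opcode {opcode!r}")


# Z = X + Y, with sample values chosen for the example
program = [("LOD", "X"), ("ADD", "Y"), ("STO", "Z"), ("HLT", None)]
result = run(program, {"X": 2, "Y": 3, "Z": 0})
```

After the run, result["Z"] holds 5, the sum of the sample values in X and Y.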
Programming Paradigms
  Imperative: procedures as abstractions, details of how to do a task (e.g., FORTRAN, Pascal)
  Functional: mathematical approach of input-process-return value; functions can be composed of other functions (including themselves), and can be evaluated (e.g., LISP)
  Declarative: describe the information, but not the way it is processed (e.g., Prolog)
  Object-Oriented: interacting objects (e.g., Java, C++, C#, Smalltalk, JavaScript, Alice)