Chapter 2 Language Processors Fall 2013
Chart 2
o Translators and Compilers
o Interpreters
o Real and Abstract Machines
o Interpretive Compilers
o Portable Compilers
o Bootstrapping
o Case Study: The Triangle Language Processor
Chart 3 Translator: a program that accepts any text expressed in one language (the translator’s source language) and generates a semantically equivalent text expressed in another language (its target language)
o Chinese-into-English
o Java-into-C
o Java-into-x86
o x86 assembler
Chart 4 Assembler: translates from an assembly language into the corresponding machine code
o Generates one machine-code instruction per source instruction
Compiler: translates from a high-level language into a low-level language
o Generates several machine-code instructions per source command
Chart 5 Disassembler: translates machine code into the corresponding assembly language
Decompiler: translates a low-level language into a high-level language
Question: Why would you want a disassembler or decompiler?
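The one-to-one mapping between assembly instructions and machine-code instructions can be sketched as a toy assembler and disassembler pair. This is a minimal sketch: the three mnemonics, their opcode values, and the two-byte instruction format are invented for illustration and do not correspond to any real machine.

```python
# Hypothetical machine: each instruction is an opcode byte plus one operand byte.
OPCODES = {"LOAD": 0x01, "ADD": 0x02, "STORE": 0x03}
MNEMONICS = {code: name for name, code in OPCODES.items()}

def assemble(source: str) -> bytes:
    """Translate assembly text into machine code, one instruction per line."""
    code = bytearray()
    for line in source.strip().splitlines():
        mnemonic, operand = line.split()
        code += bytes([OPCODES[mnemonic], int(operand)])
    return bytes(code)

def disassemble(code: bytes) -> str:
    """Translate machine code back into assembly text."""
    lines = []
    for i in range(0, len(code), 2):
        lines.append(f"{MNEMONICS[code[i]]} {code[i + 1]}")
    return "\n".join(lines)

program = "LOAD 10\nADD 11\nSTORE 12"
machine_code = assemble(program)
round_trip = disassemble(machine_code)
```

Because each source line produces exactly one machine instruction, the disassembler can recover the original text. A decompiler has no such guarantee, since a compiler emits several instructions per source command.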
Chart 6 Source Program: the source language text
Object Program: the target language text
[Diagram: Source Program -> Compiler -> Object Program. Labels inside the compiler: Syntax Check, Context Constraints, Generate Object Code, Semantic Analysis]
If the source program is well-formed, the object program is semantically equivalent to the source program.
Chart 7 Why would you want a:
o Java-into-C translator
o C-into-Java translator
o Assembly-language-into-Pascal decompiler
Chart 8 [Tombstone diagram: program P expressed in language L, placed on machine M]
P = Program Name
L = Implementation Language
M = Target Machine
For this to work, L must equal M; that is, the implementation language must be the same as the machine language.
[Tombstone diagram: S-into-T translator expressed in language L]
S = Source Language
T = Target Language
L = Translator’s Implementation Language
An S-into-T translator is itself a program that runs on machine L.
Chart 9 Translating a source program P
o Expressed in language S
o Using an S-into-T translator
o Running on machine M
[Tombstone diagram: P in S -> S-into-T translator expressed in M, running on machine M -> P in T]
Chart 10 Translating a source program sort
o Expressed in language Java
o Using a Java-into-x86 translator
o Running on an x86 machine
[Tombstone diagram: sort in Java -> Java-into-x86 translator expressed in x86, running on x86 -> sort in x86; sort in x86 then runs on x86]
The object program runs on the same machine as the compiler.
Chart 11 Cross Compiler: the object program runs on a different machine than the compiler
Translating a source program sort
o Expressed in language Java
o Using a Java-into-PPC translator
o Running on an x86 machine
o Downloaded to a PPC machine
[Tombstone diagram: sort in Java -> Java-into-PPC translator expressed in x86, running on x86 -> sort in PPC; sort in PPC is downloaded to and runs on PPC]
Chart 12 Two-stage Compiler: the source program is translated to another language before being translated into the object program
Translating a source program sort
o Expressed in language Java
o Using a Java-into-C translator
o Running on an x86 machine
Then translating the C program
o Using a C-into-x86 compiler
o Running on an x86 machine
o Into an x86 object program
[Tombstone diagram: sort in Java -> Java-into-C translator on x86 -> sort in C -> C-into-x86 compiler on x86 -> sort in x86]
Chart 13 Translator Rules
o A translator can run on machine M only if it is expressed in machine code M
o The source program must be expressed in the translator’s source language S
o The object program is expressed in the translator’s target language T
o The object program is semantically equivalent to the source program
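The translator rules can be checked mechanically by treating each tombstone as a small record. The sketch below is one possible model, not part of the original slides: the class names `Program` and `Translator` and the function `translate` are hypothetical, and semantic equivalence is assumed rather than checked.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Program:
    name: str
    lang: str      # language the program is expressed in

@dataclass(frozen=True)
class Translator:
    source: str    # S
    target: str    # T
    impl: str      # language the translator itself is expressed in

def translate(program: Program, translator: Translator, machine: str) -> Program:
    """Apply the translator rules; semantic equivalence is assumed, not checked."""
    if translator.impl != machine:
        raise ValueError(f"translator expressed in {translator.impl} cannot run on {machine}")
    if program.lang != translator.source:
        raise ValueError(f"{program.name} is in {program.lang}, expected {translator.source}")
    return Program(program.name, translator.target)

sort_java = Program("sort", "Java")

# Cross compilation: compile on x86, run the result on PPC.
sort_ppc = translate(sort_java, Translator("Java", "PPC", "x86"), machine="x86")

# Two-stage compilation: Java -> C -> x86, both stages on x86.
sort_c = translate(sort_java, Translator("Java", "C", "x86"), machine="x86")
sort_x86 = translate(sort_c, Translator("C", "x86", "x86"), machine="x86")

# Rule violation: a translator expressed in x86 cannot run on a PPC machine.
try:
    translate(sort_java, Translator("Java", "PPC", "x86"), machine="PPC")
    violation = None
except ValueError as err:
    violation = str(err)
```

The same three fields (source, target, implementation language) are enough to express the native, cross, and two-stage compilation cases from the earlier charts.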
Chart 14 Interpreter: accepts any program (the source program) expressed in a particular language (the source language) and runs that source program immediately
o Does not translate the source program into object code prior to execution
Chart 15 [Flowchart of the interpreter: the Source Program enters a loop of Fetch Instruction, Analyze Instruction, Execute Instruction, repeated until Program Complete]
The source program starts to run as soon as the first instruction is analyzed.
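The fetch, analyze, execute cycle can be sketched for a tiny line-oriented language. The three commands (`let`, `add`, `print`) are invented for this example; the point is only that each instruction is executed as soon as it has been analyzed, with no object program produced.

```python
def interpret(source: str) -> list:
    """Run a tiny hypothetical language one instruction at a time."""
    variables = {}
    output = []
    lines = source.strip().splitlines()
    pc = 0
    while pc < len(lines):                 # program complete?
        instruction = lines[pc]            # fetch
        op, *args = instruction.split()    # analyze
        if op == "let":                    # execute
            variables[args[0]] = int(args[1])
        elif op == "add":
            variables[args[0]] += int(args[1])
        elif op == "print":
            output.append(str(variables[args[0]]))
        else:
            raise ValueError(f"unknown instruction: {instruction}")
        pc += 1
    return output

result = interpret("let x 5\nadd x 3\nprint x")
```

An instruction inside a loop would be analyzed again on every pass, which is the main source of the slowdown compared with compiled code.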
Chart 16 When to Use Interpretation
o Interactive mode: want to see the results of an instruction before entering the next instruction
o The program is used only once
o Each instruction is expected to be executed only once
o Instructions have simple formats
Disadvantages
o Slow: up to 100 times slower than running machine code
Chart 17 Examples
o Basic
o Lisp
o Unix Command Language (shell)
o SQL
Chart 18 [Tombstone diagram: S interpreter expressed in language L]
[Tombstone diagram: program P expressed in language S, using an S interpreter expressed in M, running on machine M]
[Tombstone diagram: program graph written in Basic, running on a Basic interpreter expressed in x86, executed on an x86 machine]
Chart 19 Hardware emulation: using software to execute the machine code of one machine on another machine
o Can measure everything about the new machine except its speed
o Abstract machine: emulator
o Real machine: actual hardware
An abstract machine is functionally equivalent to a real machine if they both implement the same language L.
Chart 20 [Tombstone diagram: New Machine Instruction (nmi) interpreter written in C]
[Tombstone diagram: C-into-M compiler expressed in M, used to translate a C program into M machine code]
The nmi interpreter written in C is translated into machine code M using the C compiler, giving an nmi interpreter expressed in machine code M.
[Tombstone diagram: program P running on the nmi interpreter expressed in M, running on machine M]
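An emulator for a new machine is simply an interpreter for its machine code. The sketch below invents a four-opcode accumulator machine (the opcode values and instruction format are assumptions, not a real instruction set) and counts executed instructions, which is the kind of measurement an emulator can provide even though its own running time says nothing about the speed of the real hardware.

```python
# Hypothetical opcodes for an invented "new machine"; not a real instruction set.
HALT, LOAD, ADD, STORE = 0, 1, 2, 3

def emulate(code, memory):
    """Run the new machine's code in software and count executed instructions."""
    memory = list(memory)          # work on a copy of the initial memory
    acc = 0                        # single accumulator register
    pc = 0                         # program counter
    executed = 0
    while True:
        opcode, operand = code[pc], code[pc + 1]
        pc += 2
        executed += 1
        if opcode == HALT:
            break
        elif opcode == LOAD:
            acc = memory[operand]
        elif opcode == ADD:
            acc += memory[operand]
        elif opcode == STORE:
            memory[operand] = acc
        else:
            raise ValueError(f"unknown opcode {opcode}")
    return memory, executed

code = [LOAD, 0, ADD, 1, STORE, 2, HALT, 0]
final_memory, instruction_count = emulate(code, [2, 3, 0])
```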
Chart 21 Interpretive compiler: a combination of compiler and interpreter
o Translates the source program into an intermediate language
o The intermediate language is intermediate in level between the source language and ordinary machine code
o Its instructions have simple formats, and therefore can be analyzed easily and quickly
o Translation from the source language into the intermediate language is easy and fast
An interpretive compiler combines fast compilation with tolerable running speed.
Chart 22 [Tombstone diagrams: Java-into-JVM translator running on machine M; JVM-code interpreter running on machine M]
[Tombstone diagram: P in Java -> Java-into-JVM translator on M -> P in JVM; P in JVM then runs on the JVM-code interpreter on M]
A Java program P is first translated into JVM-code, and then the JVM-code object program is interpreted.
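The compile-then-interpret arrangement can be sketched with a toy intermediate language. This is not JVM-code: the `PUSH`, `ADD`, and `MUL` instructions are invented stack-machine instructions, and Python's standard `ast` module is used only as a convenient parser for integer expressions with `+` and `*`.

```python
import ast

def compile_expr(source: str) -> list:
    """Translate an arithmetic expression into code for a toy stack machine."""
    code = []

    def emit(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            code.append(("PUSH", node.value))
        elif isinstance(node, ast.BinOp) and isinstance(node.op, (ast.Add, ast.Mult)):
            emit(node.left)
            emit(node.right)
            code.append(("ADD",) if isinstance(node.op, ast.Add) else ("MUL",))
        else:
            raise ValueError("unsupported construct")

    emit(ast.parse(source, mode="eval").body)
    return code

def run(code: list) -> int:
    """Interpret the intermediate code; every instruction has a simple format."""
    stack = []
    for instr in code:
        if instr[0] == "PUSH":
            stack.append(instr[1])
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(left + right if instr[0] == "ADD" else left * right)
    return stack.pop()

intermediate = compile_expr("2 + 3 * 4")
value = run(intermediate)
```

The translation step is a single tree walk, and the interpreter only has to dispatch on a fixed instruction tag, which is why both steps stay fast.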
Chart 23 A program is portable if it can be compiled and run on any machine, without change
o A portable program is more valuable than an unportable one, because its development cost can be spread over more copies
o Portability is measured by the proportion of code that remains unchanged when it is moved to a dissimilar machine
Language affects portability
o Assembly language: 0% portable
o High-level language: approaches 100% portability
Chart 24 Language Processors
o Valuable and widely used programs
o Typically written in a high-level language: Pascal, C, Java
o Part of a language processor is machine dependent: the code generation part
o A language processor is only about 50% portable
o A compiler that generates intermediate code is more portable than a compiler that generates machine code
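Chart 25 1. Start with the following
[Tombstone diagrams: Java-into-JVM translator expressed in Java; Java-into-JVM translator expressed in JVM-code; JVM interpreter expressed in Java]
2. Rewrite the interpreter in C
Note: a C-into-M compiler exists; rewrite the JVM interpreter from Java to C
3. Compile the interpreter using the C-into-M compiler
4. Java program P is translated into JVM program P and run using the JVM interpreter
[Tombstone diagram: P in Java -> Java-into-JVM translator on the JVM interpreter on M -> P in JVM; P in JVM then runs on the JVM interpreter on M]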
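The four steps can be traced with a small tombstone model. The classes and the helpers `runs_on` and `compile_with` below are hypothetical names introduced for this sketch; they track only which language each item is expressed in, not the items' actual behavior.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Program:
    name: str
    impl: str          # language the program is expressed in

@dataclass(frozen=True)
class Translator:
    source: str
    target: str
    impl: str          # language the translator is expressed in

@dataclass(frozen=True)
class Interpreter:
    lang: str          # language being interpreted
    impl: str          # language the interpreter is expressed in

def runs_on(lang, machine, interpreters=()):
    """True if code in `lang` runs on `machine` directly or via one interpreter."""
    return lang == machine or any(
        i.lang == lang and i.impl == machine for i in interpreters)

def compile_with(item, compiler, machine, interpreters=()):
    """Use `compiler` on `machine` to re-express `item` in the compiler's target."""
    if not runs_on(compiler.impl, machine, interpreters):
        raise ValueError(f"compiler expressed in {compiler.impl} cannot run on {machine}")
    if item.impl != compiler.source:
        raise ValueError(f"item is in {item.impl}, compiler accepts {compiler.source}")
    return replace(item, impl=compiler.target)

# Step 1: the kit (only the JVM-code version of the compiler is used below).
java_to_jvm = Translator("Java", "JVM", impl="JVM")
# Step 2: the interpreter, rewritten by hand in C.
jvm_interpreter_c = Interpreter("JVM", impl="C")
# Step 3: compile the interpreter with the existing C-into-M compiler.
c_to_m = Translator("C", "M", impl="M")
jvm_interpreter_m = compile_with(jvm_interpreter_c, c_to_m, "M")
# Step 4: translate P with the compiler running on the interpreter, then interpret P.
p_java = Program("P", impl="Java")
p_jvm = compile_with(p_java, java_to_jvm, "M", [jvm_interpreter_m])
p_runs = runs_on(p_jvm.impl, "M", [jvm_interpreter_m])
```

Only the interpreter had to be rewritten for machine M; the compiler itself is reused unchanged, which is what makes the kit portable.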
Chart 26 Bootstrapping: the language processor is used to process itself
o The implementation language is the source language
Bootstrapping a portable compiler
o A portable compiler can be bootstrapped to make a true compiler, one that generates machine code, by writing an intermediate-language-into-machine-code translator
Full bootstrap
o Writing the compiler in itself
o Using each version to compile the next version
Half bootstrap
o Compiler expressed in itself but targeted for another machine
Bootstrapping to improve efficiency
o Upgrade the compiler to optimize code generation as well as to improve compile efficiency
Chart 27 Bootstrap an interpretive compiler to generate machine code
1. First, write a JVM-code-into-M translator in Java
2. Next, compile the translator using the existing interpretive compiler (the Java-into-JVM translator running on the JVM interpreter)
3. Use the translator to translate itself into machine code M
4. Translate the Java-into-JVM-code translator into machine code
5. The result is a two-stage Java-into-M compiler
[Tombstone diagram for step 5: P in Java -> Java-into-JVM translator expressed in M -> P in JVM -> JVM-into-M translator expressed in M -> P in M]
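The five steps can be traced by tracking which language each translator is expressed in. This is a sketch under stated assumptions: the classes and the helpers `runs_on` and `compile_with` are hypothetical, and the starting point (a Java-into-JVM translator expressed in JVM-code plus a JVM interpreter expressed in M) is assumed from the interpretive compiler described earlier.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Program:
    name: str
    impl: str          # language the program is expressed in

@dataclass(frozen=True)
class Translator:
    source: str
    target: str
    impl: str          # language the translator is expressed in

@dataclass(frozen=True)
class Interpreter:
    lang: str
    impl: str

def runs_on(lang, machine, interpreters=()):
    """True if code in `lang` runs on `machine` directly or via one interpreter."""
    return lang == machine or any(
        i.lang == lang and i.impl == machine for i in interpreters)

def compile_with(item, compiler, machine, interpreters=()):
    """Use `compiler` on `machine` to re-express `item` in the compiler's target."""
    if not runs_on(compiler.impl, machine, interpreters):
        raise ValueError(f"compiler expressed in {compiler.impl} cannot run on {machine}")
    if item.impl != compiler.source:
        raise ValueError(f"item is in {item.impl}, compiler accepts {compiler.source}")
    return replace(item, impl=compiler.target)

# Assumed starting point: the existing interpretive compiler on machine M.
jvm_interpreter = Interpreter("JVM", impl="M")
java_to_jvm = Translator("Java", "JVM", impl="JVM")

# Step 1: write a JVM-code-into-M translator in Java.
jvm_to_m_java = Translator("JVM", "M", impl="Java")
# Step 2: compile it with the existing compiler, running on the interpreter.
jvm_to_m_jvm = compile_with(jvm_to_m_java, java_to_jvm, "M", [jvm_interpreter])
# Step 3: use the translator to translate itself (still running on the interpreter).
jvm_to_m = compile_with(jvm_to_m_jvm, jvm_to_m_jvm, "M", [jvm_interpreter])
# Step 4: translate the Java-into-JVM translator into machine code.
java_to_jvm_m = compile_with(java_to_jvm, jvm_to_m, "M")
# Step 5: two-stage compilation of P with no interpreter involved.
p_jvm = compile_with(Program("P", "Java"), java_to_jvm_m, "M")
p_m = compile_with(p_jvm, jvm_to_m, "M")
```

After step 4 both stages are expressed in M, so the final two calls pass no interpreter at all.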
Chart 28 Full bootstrap
1. Write the Ada-S compiler in C (v1)
2. Compile v1 using the C compiler
3. Convert the C version of the Ada-S compiler into an Ada-S version (v2)
4. Use v1 to compile v2
5. Extend the Ada-S compiler to a (full) Ada compiler (v3)
6. Compile the full Ada compiler using the Ada-S compiler
[Tombstone diagrams: v1 is Ada-S-into-M expressed in C, compiled by the C-into-M compiler; v2 is Ada-S-into-M expressed in Ada-S, compiled by v1; v3 is Ada-into-M expressed in Ada-S, compiled by v2]
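The six steps can be followed by recording, for each compiler version, its source language, target language, and implementation language. The `Translator` record and `compile_with` helper are hypothetical names for this sketch, and it assumes that step 6 uses the compiled v2.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Translator:
    source: str
    target: str
    impl: str          # language the translator is expressed in

def compile_with(item, compiler, machine):
    """Run `compiler` on `machine` to re-express `item` in the compiler's target."""
    if compiler.impl != machine:
        raise ValueError(f"compiler expressed in {compiler.impl} cannot run on {machine}")
    if item.impl != compiler.source:
        raise ValueError(f"item is in {item.impl}, compiler accepts {compiler.source}")
    return replace(item, impl=compiler.target)

c_to_m = Translator("C", "M", impl="M")            # existing C compiler

v1_c = Translator("Ada-S", "M", impl="C")          # step 1: Ada-S compiler in C
v1 = compile_with(v1_c, c_to_m, "M")               # step 2: compile v1 with the C compiler
v2_adas = Translator("Ada-S", "M", impl="Ada-S")   # step 3: rewritten in Ada-S
v2 = compile_with(v2_adas, v1, "M")                # step 4: use v1 to compile v2
v3_adas = Translator("Ada", "M", impl="Ada-S")     # step 5: extended to full Ada
v3 = compile_with(v3_adas, v2, "M")                # step 6: compile with the Ada-S compiler

# The C compiler cannot compile the Ada-S version, so v1 is needed exactly once.
try:
    compile_with(v2_adas, c_to_m, "M")
    c_is_enough = True
except ValueError:
    c_is_enough = False
```

From step 3 onward the C version can be discarded; every later version is maintained in the compiler's own language.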
Chart 29 Half bootstrap
1. Ada compiler that generates machine code for machine HM, expressed in Ada
2. Ada compiler that generates machine code for machine HM, expressed in HM
3. Rewrite the Ada compiler that generates HM code into an Ada compiler that generates machine code for machine TM
4. Use the existing Ada compiler for HM to compile the new compiler; the result runs on HM but generates code for TM
5. Make sure the compiler works properly
6. Use the compiler to compile itself. Now have an Ada compiler that generates TM code and runs on machine TM
[Tombstone diagrams: Ada-into-HM expressed in Ada and in HM; Ada-into-TM expressed in Ada, then in HM, then in TM; test program P compiled on HM and run on TM]
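The half bootstrap can be traced with the same kind of bookkeeping. The record types and `compile_with` are hypothetical names for this sketch; note that both compilations in steps 4 and 6 happen on the host machine HM, and only the results move to TM.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Program:
    name: str
    impl: str          # language the program is expressed in

@dataclass(frozen=True)
class Translator:
    source: str
    target: str
    impl: str          # language the translator is expressed in

def compile_with(item, compiler, machine):
    """Run `compiler` on `machine` to re-express `item` in the compiler's target."""
    if compiler.impl != machine:
        raise ValueError(f"compiler expressed in {compiler.impl} cannot run on {machine}")
    if item.impl != compiler.source:
        raise ValueError(f"item is in {item.impl}, compiler accepts {compiler.source}")
    return replace(item, impl=compiler.target)

# Steps 1 and 2: the existing host compiler, in source and executable form.
ada_hm_source = Translator("Ada", "HM", impl="Ada")
ada_hm = Translator("Ada", "HM", impl="HM")

# Step 3: retarget the code generator; the compiler is still written in Ada.
ada_tm_source = Translator("Ada", "TM", impl="Ada")
# Step 4: compile it on the host; the result is a cross compiler.
ada_tm_on_hm = compile_with(ada_tm_source, ada_hm, "HM")
# Step 5: test by cross-compiling a program P on HM for TM.
p_tm = compile_with(Program("P", "Ada"), ada_tm_on_hm, "HM")
# Step 6: the cross compiler compiles its own source, still on HM.
ada_tm = compile_with(ada_tm_source, ada_tm_on_hm, "HM")
```

The final `ada_tm` is expressed in TM code, so it can be moved to the target machine and used there without the host.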
Chart 30 Bootstrap to improve efficiency
1. Start with v1, targeted to M-slow, written in Ada and compiled to run on M-slow
2. Rewrite v1 in Ada so that it is targeted to M-fast; this is v2
3. Use v1 to compile v2
4. Compile a program P using v2; it will compile slowly, but P will run fast
5. Use v2 to compile v2 to produce v3, which will be the fast compiler
[Tombstone diagrams: v1 is Ada-into-Ms expressed in Ada and in Ms; v2 is Ada-into-Mf expressed in Ada and, after step 3, in Ms; v3 is Ada-into-Mf expressed in Mf]
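The key observation is that a compiled compiler inherits the code quality of whichever compiler built it. The sketch below models only that one property; the `AdaCompiler` record, its fields, and the `build` helper are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class AdaCompiler:
    version: str
    emits: str     # quality of the code it generates: "slow" or "fast"
    form: str      # "Ada source", or the quality of its own machine code

def build(source: AdaCompiler, using: AdaCompiler, version: str) -> AdaCompiler:
    """Compile a compiler's Ada source; the result is as fast as `using` can make it."""
    if source.form != "Ada source":
        raise ValueError("only Ada source can be compiled")
    if using.form == "Ada source":
        raise ValueError("the compiling compiler must itself be compiled first")
    return replace(source, version=version, form=using.emits)

# Step 1: v1 generates slow code and is itself slow code.
v1 = AdaCompiler("v1", emits="slow", form="slow")
# Step 2: v2 is v1 rewritten in Ada to generate fast code.
v2_source = AdaCompiler("v2", emits="fast", form="Ada source")
# Step 3: use v1 to compile v2; v2 runs slowly but generates fast code (step 4).
v2 = build(v2_source, using=v1, version="v2")
# Step 5: use v2 to compile its own source; v3 both runs fast and generates fast code.
v3 = build(v2_source, using=v2, version="v3")
```

Two compilations of the same source are needed: the first makes the improved code generator available, and the second applies it to the compiler itself.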