Review (Chapter 9)

Presentation transcript:

Review (Chapter 9) 1 Course Overview
PART I: overview material
    1 Introduction
    2 Language processors (tombstone diagrams, bootstrapping)
    3 Architecture of a compiler
PART II: inside a compiler
    4 Syntax analysis
    5 Contextual analysis
    6 Runtime organization
    7 Code generation
PART III: conclusion
    8 Interpretation
    9 Review

Review (Chapter 9) 2 Levels of Programming Languages
High-level program:
    class Triangle {
        ...
        float area( ) { return b*h/2; }
    }
Low-level program:
    LOAD r1,b
    LOAD r2,h
    MUL r1,r2
    DIV r1,#2
    RET
Executable machine code

Review (Chapter 9) 3 Compilers and other translators
Examples:
– Chinese => English
– Java => JVM byte codes
– Scheme => C
– C => Scheme
– x86 Assembly Language => x86 binary codes
Other non-traditional examples:
– disassembler
– decompiler (e.g. JVM => Java)

Review (Chapter 9) 4 Tombstone Diagrams
What are they?
– diagrams consisting of a set of “puzzle pieces” that we can use to reason about language processors and programs
– different kinds of pieces
– combination rules (not all diagrams are “well formed”)
Kinds of pieces:
– M: machine implemented in hardware
– S --> T over L: translator implemented in L
– M over L: language interpreter in L
– P over L: program P implemented in L

Review (Chapter 9) 5 Syntax Specification
Syntax is specified using “Context Free Grammars” (CFGs):
– A finite set of terminal symbols
– A finite set of nonterminal symbols
– A start symbol
– A finite set of production rules
CFGs are often written in “Backus-Naur Form” (BNF) notation.
Each production rule in BNF notation is written as:
    N ::= α
where N is a nonterminal and α is a sequence of terminals and nonterminals.
    N ::= α | β | ...
is an abbreviation for several rules with N as left-hand side.

Review (Chapter 9) 6 Concrete and Abstract Syntax
The grammar specifies the concrete syntax of a programming language. The concrete syntax is important for the programmer, who needs to know exactly how to write syntactically well-formed programs.
The abstract syntax omits irrelevant syntactic details and only specifies the essential structure of programs.
Example: different concrete syntaxes for an assignment
    v := e
    (set! v e)
    e -> v
    v = e

Review (Chapter 9) 7 Context-Free Grammars
[Figure: an example grammar and an example string]

Review (Chapter 9) 8 Context-Free Grammars (continued)
The given string has 2 parse trees (concrete syntax trees). So the grammar is ambiguous.
[Figure: two parse trees over the nonterminal E, with the operators + and * and the terminal id]

Review (Chapter 9) 9 Abstract Syntax Trees
Abstract Syntax Tree for: d:=d+10*n
AssignmentCmd
    SimpleVName
        Ident d
    BinaryExpression
        BinaryExpression
            VNameExp
                SimpleVName
                    Ident d
            Op +
            IntegerExp
                Int-Lit 10
        Op *
        VNameExp
            SimpleVName
                Ident n
Note: Triangle does not have precedence levels like C++, so d+10*n is parsed as (d+10)*n.

Review (Chapter 9) 10 Contextual Constraints
Syntax rules alone are not enough to specify the format of well-formed programs.
Example 1:
    let const m~2
    in putint(m + x)
x is undefined! (Scope rules)
Example 2:
    let const m~2 ;
        var n:Boolean
    in begin
        n := m<4;
        n := n+1
    end
n+1 is a type error! (Type rules)

Review (Chapter 9) 11 Semantics
Specification of semantics is concerned with specifying the “meaning” of well-formed programs.
Terminology:
– Expressions are evaluated and yield values (and may or may not perform side effects).
– Commands are executed and perform side effects.
– Declarations are elaborated to produce bindings.
Side effects:
– change the values of variables
– perform input/output

Review (Chapter 9) 12 Phases of a Compiler
A compiler’s phases are steps in transforming source code into object code.
The different phases correspond roughly to the different parts of the language specification:
– Syntax analysis <-> Syntax
– Contextual analysis <-> Contextual constraints
– Code generation <-> Semantics

Review (Chapter 9) 13 Compiler Passes
A pass is a complete traversal of the source program, or a complete traversal of some internal representation of the source program (such as the syntax tree).
A pass can correspond to a “phase”, but it does not have to! Sometimes a single pass corresponds to several phases that are interleaved in time.
Which passes a compiler makes over the source program, and how many, is an important design decision.

Review (Chapter 9) 14 Syntax Analysis
Dataflow chart:
    Source Program --(stream of characters)--> Scanner --(stream of “tokens”)--> Parser --> Abstract Syntax Tree
The scanner and the parser each produce error reports.

Review (Chapter 9) 15 Regular Expressions
REs are a notation for expressing a set of strings of terminal symbols.
Different kinds of RE:
    ε        The empty string
    t        Generates only the string t
    X Y      Generates any string xy such that x is generated by X and y is generated by Y
    X | Y    Generates any string which is generated either by X or by Y
    X*       The concatenation of zero or more strings generated by X
    (X)      For grouping

Review (Chapter 9) 16 Language Defined by a Regular Expression
Recall: language = set of strings
Language defined by a regular expression = set of strings that match the expression
    Regular Expression    Corresponding Set of Strings
    ε                     {""}
    a                     {"a"}
    a b c                 {"abc"}
    a | b | c             {"a", "b", "c"}
    (a | b | c)*          {"", "a", "b", "c", "aa", "ab", ..., "bccabb", ...}
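Membership in the language of (a | b | c)* can be checked mechanically. The following is a minimal sketch using the standard java.util.regex library; the class and method names are made up for illustration and do not come from the slides.

```java
import java.util.regex.Pattern;

class RegexLanguageDemo {
    // The regular expression (a | b | c)* written in java.util.regex syntax.
    static final Pattern ABC_STAR = Pattern.compile("(a|b|c)*");

    // True if the whole string s belongs to the language defined by (a | b | c)*.
    static boolean inLanguage(String s) {
        return ABC_STAR.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"", "a", "bccabb", "abd"}) {
            System.out.println("\"" + s + "\" in language: " + inLanguage(s));
        }
    }
}
```

Note that matches() tests the entire string, which is what "the string is in the language" means; find() would only look for a matching substring.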

Review (Chapter 9) 17 FSM and the implementation of Scanners
– Regular expressions, NFSM’s, and DFSM’s are all equivalent formalisms in terms of what languages can be defined with them.
– Regular expressions are a convenient notation for describing the “tokens” of programming languages.
– Regular expressions can be converted into NFSM’s (the algorithm for conversion into DFSM is straightforward).
– DFSM’s can be easily implemented as computer programs.

Review (Chapter 9) 18 DFSM Example: Integer Literals
Here is a DFSM that accepts integer literals with an optional + or – sign:
[Figure: DFSM with states S, A, and B and transitions labeled +, –, and digit]
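A DFSM of this kind translates almost directly into a program. The sketch below assumes one plausible reading of the diagram: S is the start state, a sign leads from S to A, a digit leads from S or A to B, further digits stay in B, and B is the only accepting state. The ERROR state and all Java names are additions for illustration, and the ASCII character '-' stands in for the minus sign.

```java
class IntLiteralDfsm {
    enum State { S, A, B, ERROR }

    // One transition of the machine; any undefined transition goes to ERROR.
    static State step(State state, char c) {
        boolean digit = c >= '0' && c <= '9';
        boolean sign = c == '+' || c == '-';
        switch (state) {
            case S: return digit ? State.B : sign ? State.A : State.ERROR;
            case A: return digit ? State.B : State.ERROR;
            case B: return digit ? State.B : State.ERROR;
            default: return State.ERROR;
        }
    }

    // Runs the machine over the whole input and accepts only if it ends in B.
    static boolean accepts(String input) {
        State state = State.S;
        for (int i = 0; i < input.length(); i++) {
            state = step(state, input.charAt(i));
        }
        return state == State.B;
    }
}
```

Under these assumptions accepts("42"), accepts("+7"), and accepts("-305") hold, while the empty string and a lone sign are rejected because the machine stops in S or A.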

Review (Chapter 9) 19 Parsing
Parsing == Recognition + determining syntax structure (for example by generating AST)
– Different types of parsing strategies
    bottom up
    top down
– Recursive descent parsing
    What is it
    How to implement one given an EBNF specification

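Review (Chapter 9) 20 Top-down parsing
Input: The cat sees a rat.
[Figure: parse tree grown from the root downward. Sentence expands to Subject Verb Object "."; Subject expands to "The" Noun, with Noun "cat"; Verb is "sees"; Object expands to "a" Noun, with Noun "rat".]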

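Review (Chapter 9) 21 Bottom up parsing
Input: The cat sees a rat.
[Figure: parse tree grown from the leaves upward. "cat" is reduced to Noun and "The" Noun to Subject; "sees" to Verb; "rat" to Noun and "a" Noun to Object; finally Subject Verb Object "." is reduced to Sentence.]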

Review (Chapter 9) 22 Development of Recursive Descent Parser
(1) Express grammar in EBNF
(2) Grammar transformations: left factorization and left recursion elimination
(3) Create a parser class with
    – private variable currentToken
    – methods to call the scanner: accept and acceptIt
(4) Implement a public method for the main function to call:
    – public parse method that
        fetches the first token from the scanner
        calls parseS (where S is the start symbol of the grammar)
        verifies that the scanner next produces the end-of-file token
(5) Implement private parsing methods:
    – add a private parseN method for each nonterminal N
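The steps above can be sketched for a tiny grammar. Everything specific here is an assumption made for illustration: the grammar (Expr ::= Term (('+' | '-') Term)* and Term ::= digit | '(' Expr ')'), the use of single characters as tokens in place of a real scanner, and the helper isWellFormed. Only currentToken, accept, acceptIt, parse, and the parseN naming follow the slide.

```java
class MiniParser {
    private static final char EOT = '\0';   // stands in for the end-of-file token
    private final String input;
    private int pos = 0;
    private char currentToken;              // step (3): the current token

    MiniParser(String input) { this.input = input; }

    // Stand-in for the scanner: every character is one token.
    private char scan() {
        return pos < input.length() ? input.charAt(pos++) : EOT;
    }

    // Step (3): acceptIt consumes the current token unconditionally.
    private void acceptIt() { currentToken = scan(); }

    // Step (3): accept consumes the current token only if it is the expected one.
    private void accept(char expected) {
        if (currentToken != expected) {
            throw new IllegalStateException("expected '" + expected + "'");
        }
        acceptIt();
    }

    // Step (4): fetch the first token, parse the start symbol, check for end of input.
    public void parse() {
        currentToken = scan();
        parseExpr();
        if (currentToken != EOT) {
            throw new IllegalStateException("unexpected token '" + currentToken + "'");
        }
    }

    // Step (5): Expr ::= Term (('+' | '-') Term)*
    private void parseExpr() {
        parseTerm();
        while (currentToken == '+' || currentToken == '-') {
            acceptIt();
            parseTerm();
        }
    }

    // Step (5): Term ::= digit | '(' Expr ')'
    private void parseTerm() {
        if (Character.isDigit(currentToken)) {
            acceptIt();
        } else {
            accept('(');
            parseExpr();
            accept(')');
        }
    }

    // Hypothetical convenience wrapper: true if s parses without a syntax error.
    static boolean isWellFormed(String s) {
        try {
            new MiniParser(s).parse();
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }
}
```

This sketch only recognizes the input. A parser that determines syntax structure would have each parseN method return the AST node it has built.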

Review (Chapter 9) 23 LL(1) Grammars
The presented algorithm to convert EBNF into a parser does not work for all possible grammars. It only works for so-called “LL(1)” grammars.
Basically, an LL(1) grammar is a grammar which can be parsed with a top-down parser with a lookahead (in the input stream of tokens) of one token.
What grammars are LL(1)? How can we recognize that a grammar is (or is not) LL(1)?
=> We can deduce the necessary conditions from the parser generation algorithm.

Review (Chapter 9) 24 Contextual Analysis --> Decorated AST
[Figure: decorated AST of a Program consisting of a LetCommand. Its SequentialDeclaration holds two VarDecl nodes (n of SimpleT Integer, c of SimpleT Char); its SequentialCommand holds two AssignCommand nodes (c := ‘&’ and n := n+1).]
Annotations:
– links between identifiers and declarations: result of identification
– :type labels (here :char and :int): result of type checking

Review (Chapter 9) 25 Nested Block Structure
A language exhibits nested block structure if blocks may be nested one within another (typically with no upper bound on the level of nesting that is allowed).
There can be any number of scope levels (depending on the level of nesting of blocks).
Typical scope rules:
– no identifier may be declared more than once within the same block (at the same level).
– for any applied occurrence there must be a corresponding declaration, either within the same block or in a block in which it is nested.
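Both scope rules can be enforced with an identification table that keeps one map per open block. This is a minimal sketch under simplifying assumptions: attributes are plain strings, and the names IdTable, openScope, closeScope, enter, and retrieve are illustrative choices rather than the course's actual API.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

class IdTable {
    // One map per open block; the innermost block is at the head of the deque.
    private final Deque<Map<String, String>> scopes = new ArrayDeque<Map<String, String>>();

    IdTable() { openScope(); }   // the outermost (global) block

    void openScope() { scopes.push(new HashMap<String, String>()); }

    void closeScope() { scopes.pop(); }

    // Rule 1: returns false if id is already declared in the current block.
    boolean enter(String id, String attr) {
        Map<String, String> current = scopes.peek();
        if (current.containsKey(id)) {
            return false;
        }
        current.put(id, attr);
        return true;
    }

    // Rule 2: searches from the innermost block outward; null means "undeclared".
    String retrieve(String id) {
        for (Map<String, String> scope : scopes) {
            String attr = scope.get(id);
            if (attr != null) {
                return attr;
            }
        }
        return null;
    }
}
```

Redeclaring an identifier in an inner block is allowed and hides the outer declaration until closeScope is called, which is the usual behavior for block-structured languages.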

Review (Chapter 9) 26 Type Checking
For most statically typed programming languages, type checking is a bottom-up algorithm over the AST:
– Types of expression AST leaves are known immediately:
    literals => obvious
    variables => from the ID table
    named constants => from the ID table
– Types of internal nodes are inferred from the types of the children and the type rule for that kind of expression
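The bottom-up scheme can be sketched with a small expression type whose check method returns the inferred type. The sketch assumes only two types (INT and BOOL), integer-only operands for every binary operator, and an ID table represented as a plain map; all names are illustrative.

```java
import java.util.Map;

class TypeCheckSketch {
    enum Type { INT, BOOL, ERROR }

    // An expression node that can infer its own type given the ID table.
    interface Expr {
        Type check(Map<String, Type> idTable);
    }

    // Leaf: the type of a literal is obvious.
    static Expr intLit(int value) {
        return idTable -> Type.INT;
    }

    // Leaf: the type of a variable or named constant comes from the ID table.
    static Expr varRef(String name) {
        return idTable -> idTable.containsKey(name) ? idTable.get(name) : Type.ERROR;
    }

    // Internal node: check the children first, then apply the operator's type rule.
    // Assumed rule: both operands must be INT; "<" yields BOOL, other operators INT.
    static Expr binary(String op, Expr left, Expr right) {
        return idTable -> {
            Type l = left.check(idTable);
            Type r = right.check(idTable);
            if (l != Type.INT || r != Type.INT) {
                return Type.ERROR;
            }
            return op.equals("<") ? Type.BOOL : Type.INT;
        };
    }
}
```

With m declared as INT and n as BOOL, the expression m<4 checks to BOOL while n+1 checks to ERROR, matching the type error in the contextual constraints example earlier in the review.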

Review (Chapter 9) 27 Runtime organization
– Data Representation: how to represent values of the source language on the target machine.
    Primitives, arrays, structures, unions, pointers
– Expression Evaluation: how to organize computing the values of expressions (taking care of intermediate results).
    Register machine vs. stack machine
– Storage Allocation: how to organize storage for variables (considering various lifetimes of global, local, and heap variables).
    Activation records, static/dynamic links, dynamic allocation
– Routines: how to implement procedures and functions (and how to pass their parameters and return values).
    Value vs. reference parameters, closures, recursion
– Object Orientation: runtime organization for OO languages.
    Method tables

Review (Chapter 9) 28 Java Virtual Machine
The JVM is an abstract machine in the truest sense of the word.
The JVM specification does not give implementation details (these can depend on target OS/platform, performance requirements, etc.).
The JVM specification defines a machine-independent “class file format” that all JVM implementations must support.
[Figure: .class files, the external representation (platform independent), are loaded by the JVM into an internal representation (implementation dependent) of objects, classes, methods, arrays, strings, and primitive types.]

Review (Chapter 9) 29 Inspecting JVM code
    % javac Factorial.java
    % javap -c -verbose Factorial
    Compiled from Factorial.java
    class Factorial extends java.lang.Object {
        Factorial();
            /* Stack=1, Locals=1, Args_size=1 */
        int fac(int);
            /* Stack=2, Locals=4, Args_size=2 */
    }
    Method Factorial()
       0 aload_0
       1 invokespecial #1
       4 return

Review (Chapter 9) 30 Compiling and Disassembling...
    // address:
    Method int fac(int)    // stack: this n result i
       0 iconst_1          // stack: this n result i 1
       1 istore_2          // stack: this n result i
       2 iconst_2          // stack: this n result i 2
       3 istore_3          // stack: this n result i
       4 goto 14
       7 iload_2           // stack: this n result i result
       8 iload_3           // stack: this n result i result i
       9 imul              // stack: this n result i result*i
      10 istore_2          // stack: this n result i
      11 iinc 3 1          // stack: this n result i
      14 iload_3           // stack: this n result i i
      15 iload_1           // stack: this n result i i n
      16 if_icmplt 7       // stack: this n result i
      19 iload_2           // stack: this n result i result
      20 ireturn
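The slides do not show Factorial.java itself. The following is a plausible reconstruction read off the bytecode above, not the original source: local slot 1 holds n, slot 2 holds result, slot 3 holds i, and the loop test at addresses 14 to 16 is i < n.

```java
class Factorial {
    // Reconstructed from the disassembly; the variable names follow the stack comments.
    int fac(int n) {
        int result = 1;              // iconst_1, istore_2
        for (int i = 2; i < n; i++) {    // iconst_2, istore_3, goto 14 ... if_icmplt 7
            result = result * i;     // iload_2, iload_3, imul, istore_2
        }
        return result;               // iload_2, ireturn
    }
}
```

Because the listing uses if_icmplt, the loop stops before i reaches n, so this method returns the product 2 * 3 * ... * (n-1); for example fac(5) is 24. A loop written with i <= n would show if_icmple at address 16 instead.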

Review (Chapter 9) 31 Code Generation
Source program:
    let var n: integer;
        var c: char
    in begin
        c := ‘&’;
        n := n+1
    end
Target program:
    PUSH 2
    LOADL 38
    STORE 1[SB]
    LOAD 0[SB]
    LOADL 1
    CALL add
    STORE 0[SB]
    POP 2
    HALT
Source and target program must be “semantically equivalent”.
Semantic specification of the source language is structured in terms of phrases in the SL: expressions, commands, etc.
=> Code generation follows the same “inductive” structure.

Review (Chapter 9) 32 Specifying Code Generation with Code Templates
The code generation functions for Mini Triangle:
    Syntax class    Function       Effect of the generated code
    Program         run P          Run program P then halt. Start and finish with empty stack.
    Command         execute C      Execute command C. May update variables but does not shrink or grow the stack!
    Expression      evaluate E     Evaluate expression E. Net result is pushing the value of E onto the stack.
    V-name          fetch V        Push the value of constant or variable onto the stack.
    V-name          assign V       Pop value from stack and store in variable V.
    Declaration     elaborate D    Elaborate declaration D. Make space on the stack for constants and variables in D.

Review (Chapter 9) 33 Code Generation with Code Templates
While command:
    execute [ while E do C ] =
            JUMP h
        g:  execute [ C ]
        h:  evaluate [ E ]
            JUMPIF(1) g
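A code generator can realize the while template by emitting the forward jump with a placeholder and patching it once the address h is known. The sketch below makes several assumptions: instructions are plain strings, addresses are indices into the instruction list, and the Phrase interface is a hypothetical stand-in for the generator's execute [ C ] and evaluate [ E ] functions.

```java
import java.util.ArrayList;
import java.util.List;

class WhileTemplate {
    // The generated object code, one instruction per entry.
    final List<String> code = new ArrayList<String>();

    // Hypothetical stand-in for a sub-phrase that knows how to emit its own code.
    interface Phrase {
        void emitInto(WhileTemplate gen);
    }

    void emit(String instruction) {
        code.add(instruction);
    }

    // execute [ while E do C ] = JUMP h; g: execute [C]; h: evaluate [E]; JUMPIF(1) g
    void executeWhile(Phrase condition, Phrase body) {
        int jumpAddr = code.size();
        emit("JUMP ?");                      // target h is not known yet
        int g = code.size();
        body.emitInto(this);                 // g: execute [ C ]
        int h = code.size();
        code.set(jumpAddr, "JUMP " + h);     // backpatch the forward jump
        condition.emitInto(this);            // h: evaluate [ E ]
        emit("JUMPIF(1) " + g);              // loop again while E is true
    }
}
```

Placing the test after the body costs one extra JUMP on entry but leaves only a single conditional jump per iteration.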

Review (Chapter 9) 34 Two Kinds of Interpreters
– Iterative interpretation: well suited for quite simple languages, and fast (at most 10 times slower than compiled languages)
– Recursive interpretation: well suited for more complex languages, but slower (up to 100 times slower than compiled languages)

Review (Chapter 9) 35 Hypo: a Hypothetical Abstract Machine
– 4096-word code store and 4096-word data store
– PC: program counter (register), initially 0
– ACC: general purpose accumulator (register), initially 0
– 4-bit opcode and 12-bit operand
Instruction set:
    Opcode    Instruction    Meaning
    0         STORE d        word at address d := ACC
    1         LOAD d         ACC := word at address d
    2         LOADL d        ACC := d
    3         ADD d          ACC := ACC + word at address d
    4         SUB d          ACC := ACC – word at address d
    5         JUMP d         PC := d
    6         JUMPZ d        if ACC = 0 then PC := d
    7         HALT           stop execution
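Hypo is simple enough to be interpreted iteratively with a fetch, analyze, execute loop. The following is a sketch under stated assumptions: words are 16 bits wide (4-bit opcode in the high bits plus 12-bit operand), data words are signed Java shorts, and the helper instr is added here for assembling test programs.

```java
class HypoInterpreter {
    static final int STORE = 0, LOAD = 1, LOADL = 2, ADD = 3,
                     SUB = 4, JUMP = 5, JUMPZ = 6, HALT = 7;

    final short[] code = new short[4096];   // code store
    final short[] data = new short[4096];   // data store
    int pc = 0;                             // program counter, initially 0
    short acc = 0;                          // accumulator, initially 0

    // Hypothetical helper: packs a 4-bit opcode and a 12-bit operand into one word.
    static short instr(int opcode, int operand) {
        return (short) ((opcode << 12) | (operand & 0xFFF));
    }

    void run() {
        while (true) {
            int word = code[pc++] & 0xFFFF;   // fetch (as an unsigned 16-bit value)
            int op = word >>> 12;             // analyze: opcode
            int d = word & 0xFFF;             // analyze: operand
            switch (op) {                     // execute
                case STORE: data[d] = acc; break;
                case LOAD:  acc = data[d]; break;
                case LOADL: acc = (short) d; break;
                case ADD:   acc += data[d]; break;
                case SUB:   acc -= data[d]; break;
                case JUMP:  pc = d; break;
                case JUMPZ: if (acc == 0) pc = d; break;
                case HALT:  return;
                default: throw new IllegalStateException("undefined opcode " + op);
            }
        }
    }
}
```

Loading the program LOADL 5, STORE 0, LOADL 7, ADD 0, STORE 1, HALT into the code store and calling run() leaves 12 in data[1].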

Review (Chapter 9) 36 Mini-Basic Interpreter
Mini-Basic abstract machine:
– Data store: array of 26 floating-point values
– Code store: array of commands
– Possible representations for each command:
    Character string (yields slowest execution)
    Sequence of tokens (good compromise)
    AST (yields longest response time)

Review (Chapter 9) 37 Recursive Interpretation
– Recursively defined languages cannot be interpreted iteratively (fetch-analyze-execute), because each command can contain any number of other commands.
– Both analysis and execution must be recursive (similar to the parsing phase when compiling a high-level language).
– Hence, the entire analysis must precede the entire execution:
    Step 1: Fetch and analyze (recursively)
    Step 2: Execute (recursively)
– Execution is a traversal of the decorated AST, hence we can use a new visitor.
– Values (variables and constants) are handled internally.

Review (Chapter 9) 38 Code optimization (improvement)
The code generated by our compiler is not efficient:
– It computes some values at runtime that could be known at compile time
– It computes some values more times than necessary
We can do better!
– Constant folding
– Common sub-expression elimination
– Code motion
– Dead code elimination
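Constant folding, the first technique listed, can be sketched as a bottom-up rewrite of an expression tree. The node classes below are a simplified stand-in for a real AST, and the choice to fold only +, -, and * (leaving division alone so that a division by zero is not evaluated at compile time) is an assumption of this sketch.

```java
class ConstantFolding {
    static abstract class Expr { }

    static final class IntLit extends Expr {
        final int value;
        IntLit(int value) { this.value = value; }
    }

    static final class Var extends Expr {
        final String name;
        Var(String name) { this.name = name; }
    }

    static final class Binary extends Expr {
        final char op;
        final Expr left, right;
        Binary(char op, Expr left, Expr right) {
            this.op = op; this.left = left; this.right = right;
        }
    }

    // Folds the children first, then replaces "literal op literal" by its value.
    static Expr fold(Expr e) {
        if (!(e instanceof Binary)) {
            return e;
        }
        Binary b = (Binary) e;
        Expr l = fold(b.left);
        Expr r = fold(b.right);
        if (l instanceof IntLit && r instanceof IntLit) {
            int x = ((IntLit) l).value;
            int y = ((IntLit) r).value;
            switch (b.op) {
                case '+': return new IntLit(x + y);
                case '-': return new IntLit(x - y);
                case '*': return new IntLit(x * y);
            }
        }
        return new Binary(b.op, l, r);   // not foldable: keep the (folded) children
    }

    // Prints an expression with explicit parentheses, for inspecting results.
    static String show(Expr e) {
        if (e instanceof IntLit) return Integer.toString(((IntLit) e).value);
        if (e instanceof Var) return ((Var) e).name;
        Binary b = (Binary) e;
        return "(" + show(b.left) + " " + b.op + " " + show(b.right) + ")";
    }
}
```

Folding is only safe if compile-time arithmetic agrees with the target machine's arithmetic; this sketch uses Java's wrapping int arithmetic, which a real compiler would have to match against the target's overflow behavior.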

Review (Chapter 9) 39 Optimization implementation
– Is the optimization correct or safe?
– Is the optimization really an improvement?
– What sort of analyses do we need to perform to get the required information?
    Local
    Global