CS 426 Compiler Construction 1. Introduction. Prolog.


CS 426 Compiler Construction 1. Introduction

Prolog

›This is a course about designing and building programming language analyzers and translators.
›A fascinating topic:
 –Compilers, as translators, facilitate programming (increase productivity) by
  ›Presenting a high-level interface (a high-level language)
  ›Enabling target-system independence for portability
  ›Detecting errors and defects
  ›Applying optimizations. When these work, programmers struggle less to tune the program to the target machine. Acceptance of a programming language in some cases depends on compiler effectiveness.
›A bridge across areas
 –Languages
 –Machines
 –Theory, decidability, complexity

›Program translation and analysis are among the oldest Computer Science subjects and have numerous applications.
 –Implementation of compilers that translate high-level languages into machine language. A crucial part of computing since the early days.
 –Implementation of efficient interpreters.
 –Program analysis for
  ›Optimization
  ›Parallelization/vectorization
  ›Refactoring for readability
  ›Static error detection
  ›Security
 –Binary translation across platforms to increase the availability of software
 –Hardware synthesis, to translate from notations like Verilog and VHDL into RTL (register transfer language)
 –Database query implementation and optimization

›Performance today
 –Crucial for some applications:
  ›Real-time systems
  ›Games
  ›Computational sciences
 –Not so much for others:
  ›Highly interactive programs that wait for the user some of the time and for I/O the rest of the time.
  ›Computations for which interactivity is more important than speed (e.g., MATLAB, R, …)

The first commercial compiler

A little bit of compiler history
In the beginning there was FORTRAN
›The compiler (ca. 1956) was a momentous accomplishment
›The top compiler algorithm of the century (IEEE Computing in Science and Engineering, Jan 2000)
›The accomplishment by John Backus and his small team is even more impressive given how little was known then.

Fortran I: Subscript evaluation
›The address of array element A(I,J,c3*K+6) is
   base_A + (I-1) + (J-1)*DI + (c3*K+6-1)*DI*DJ
 where DI and DJ are the sizes of the first two dimensions of A.
›There was no strength reduction, induction variable analysis, or data flow analysis.
›They used pattern matching so that every time “K is increased by n (under the control of a DO), the index quantity is increased by c3 DI DJ n, giving the correct value” (Backus, Western Joint Computer Conference, 1957)
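The increment quoted on this slide can be checked by evaluating the address formula directly and comparing it with a value that is updated by a constant on every step. The following is a minimal sketch in C, assuming column-major storage and 1-based subscripts; the names `element_offset` and `increments_match` are illustrative and do not come from the slides or from the Fortran I compiler.

```c
#include <assert.h>

/* Offset of A(I, J, c3*K+6) from base_A for an array whose first two
 * dimensions have sizes DI and DJ (column-major, 1-based subscripts). */
long element_offset(long I, long J, long K, long c3, long DI, long DJ)
{
    assert(DI > 0 && DJ > 0);
    return (I - 1) + (J - 1) * DI + (c3 * K + 6 - 1) * DI * DJ;
}

/* Steps K through k0, k0+n, k0+2n, ... as a DO loop would, adding the
 * constant c3*DI*DJ*n to the offset instead of recomputing it.
 * Returns 1 if the incremental value always equals the direct formula. */
int increments_match(long I, long J, long k0, long n, long steps,
                     long c3, long DI, long DJ)
{
    long offset = element_offset(I, J, k0, c3, DI, DJ);
    long delta = c3 * DI * DJ * n;   /* loop-invariant increment */
    long K = k0;

    for (long s = 0; s < steps; s++) {
        K += n;
        offset += delta;             /* one addition per iteration */
        if (offset != element_offset(I, J, K, c3, DI, DJ))
            return 0;
    }
    return 1;
}
```

Replacing the multiplications in the subscript with one addition per iteration is the effect that later compilers obtain through strength reduction; here it follows from the DO-controlled pattern alone.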

Fortran I: Induction variable analysis
›“… it was not practical to track down and identify linear changes in subscripts resulting from assignment statements. Thus, the sole criterion … for efficient handling of array references was to be that the subscripts involved were being controlled by DO statements”

Fortran I: Operator precedence
›A big deal.
›“The lack of operator priority (often called precedence …) in the IT language was the most frequent single cause of errors by the users of that compiler” (Donald Knuth)
›The Fortran I algorithm:
 –Replace + and - with ))+(( and ))-((, respectively
 –Replace * and / with )*( and )/(, respectively
 –Add (( at the beginning of each expression and after each left parenthesis in the original expression.
 –Add )) at the end of the expression and before each right parenthesis in the original expression.
›“The resulting formula is properly parenthesized, believe it or not” (D. Knuth)
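The four replacement rules are purely textual, so they can be tried out with a short string rewriter. The sketch below is in C and is only an illustration of the rules as listed on this slide: the function names `append` and `parenthesize` are made up here, spaces are dropped, and only the binary operators +, -, * and / are handled (unary minus and exponentiation are not).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Appends s to out (capacity cap, terminator included).
 * Returns 0 on success, -1 if s does not fit. */
static int append(char *out, size_t cap, const char *s)
{
    size_t used = strlen(out), add = strlen(s);
    if (used + add + 1 > cap)
        return -1;
    memcpy(out + used, s, add + 1);
    return 0;
}

/* Rewrites expr character by character using the replacement rules.
 * No parsing is done. Returns 0 on success, -1 if out is too small. */
int parenthesize(const char *expr, char *out, size_t cap)
{
    char one[2] = {0, 0};

    assert(cap > 0);
    out[0] = '\0';
    if (append(out, cap, "((") != 0)         /* "((" at the beginning */
        return -1;
    for (const char *p = expr; *p != '\0'; p++) {
        const char *piece;
        switch (*p) {
        case '+': piece = "))+(("; break;
        case '-': piece = "))-(("; break;
        case '*': piece = ")*(";   break;
        case '/': piece = ")/(";   break;
        case '(': piece = "(((";   break;    /* original "(" followed by "((" */
        case ')': piece = ")))";   break;    /* "))" before the original ")" */
        case ' ': continue;                  /* spaces are skipped */
        default:  one[0] = *p; piece = one; break;
        }
        if (append(out, cap, piece) != 0)
            return -1;
    }
    return append(out, cap, "))");           /* "))" at the end */
}
```

For the input `a+b*c` the rewriter produces `((a))+((b)*(c))`: the multiplication ends up inside one parenthesized group, while the addition separates two groups, which is how the extra parentheses encode the usual precedence.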

Fortran I: Register allocation
›Extremely complex
›Used to manage the three index registers of the 704
›“… much of it was carried along into Fortran II and still in use in the 705/9/90. In many programs it still contributes to the production of better code than can be achieved on the new Fortran IV compiler.” (Saul Rosen)

Fortran I: A difficult chore
›“… didn’t really work when it was delivered.”
›“At first people thought it would never be done.”
›“Then when it was in field test, with many bugs …, many thought it would never work.”
›“Fortran is now almost taken for granted, as if it were built into the computer hardware.” (Saul Rosen, 1967)

Fortran I: The challenge then
›“It was our belief that if FORTRAN, during its first months, were to translate any reasonable “scientific” source program into an object program only half as fast as its hand coded counterpart, then acceptance of our system would be in serious danger.” (John Backus)
›How close did they come to this goal? Hard to tell.
›But we know they succeeded, and this conference is a clear testimony of their success.

Language processors

›Languages can be
 –Translated
  ›Compiler
  ›Source-to-source
 –Interpreted
 –Processed by a combination of these two approaches
›Translation
 [Diagram: source program → compiler → target program → linker → executable; input → executable → output]

Language processors (Cont.)
›Translation (Cont.)
 [Diagram: source program (language A) → source-to-source translator → source program (language B) → compiler → target program → linker → executable; input → executable → output]

Language processors (Cont.)
›Translation (Cont.)
 [Diagram: source program → translator → byte code → virtual machine; input → virtual machine → output; byte code → just-in-time compiler → executable]

The inside of a compiler
[Diagram: character stream → Lexical analyzer → token stream → Syntax analyzer → abstract syntax tree → Semantic analyzer → abstract syntax tree → High-level optimizer → abstract syntax tree → Intermediate code generator → intermediate representation → Low-level optimizer → intermediate representation → Code generator → target machine code → Machine-specific optimizer. Alongside the phases: Symbol table.]

The inside of a compiler
[Diagram, shorter pipeline: character stream → Lexical analyzer → token stream → Syntax analyzer → Semantic analyzer → intermediate representation → Low-level optimizer → intermediate representation → Code generator → target machine code → Machine-specific optimizer.]

The inside of a compiler
[Diagram: the full pipeline from Lexical analyzer to Machine-specific optimizer, annotated “Source to source optimizer”. Part of the pipeline is crossed out (X) and the output is labeled “High level language”.]

The inside of a compiler
[Diagram: the full pipeline from Lexical analyzer to Machine-specific optimizer, annotated “Translator (for interpreter)”. Part of the pipeline is crossed out (X) and the output is labeled “Byte code”.]

The inside of a compiler
[Diagram: the full pipeline from Lexical analyzer to Machine-specific optimizer, with the “Front end” marked.]

The inside of a compiler
[Diagram: the shorter pipeline (Lexical analyzer → Syntax analyzer → Semantic analyzer → Low-level optimizer → Code generator → Machine-specific optimizer), with the “Front end” marked.]

The front end

›It accepts the input language, including comments, pragmas, and macros.
›It translates text into data that is more easily manipulable by the compiler:
 –Abstract syntax tree, or
 –Intermediate representation
›It detects and reports syntactic and semantic errors.
›It is built based on a description of the source language:
 –Formal for the syntax.
 –Informal (typically) for the semantics (although much has been done to formalize the semantics).

Backus-Naur Form (BNF)
›Introduced by John Backus to formally describe IAL [J. W. Backus, The syntax and semantics of the proposed international algebraic language of the Zürich ACM-GAMM conference. ICIP Paris, June 1959.]
›Adopted to represent ALGOL 60.
›Widely, but not universally, used to describe syntax today (with some extensions).
›A formal description enables automatic (or semi-automatic) generation of lexers and parsers.

BNF of simple syntactic objects

BNF of a part of C

Example 1 of modified BNF (Modula 2)

Example 2 of Modified BNF (Also Modula 2)

Example 3 of modified BNF (Fortran 95)

Example 4 of modified BNF (Java)

Parsing
›Parsing is the process used to
 –Determine whether a string of characters belongs to the language described by the BNF
 –Create the parse tree (not to be confused with the syntax tree in the textbook, which is called the abstract syntax tree in these slides)
›The parse tree is seldom explicitly computed; the syntax analyzer typically generates an abstract syntax tree or intermediate code directly.

Example of parse tree A_1 123

Example of parse tree 1 * k + x / 5

Formal notion of a grammar
›The BNF description of syntax involves four concepts
 –Nonterminals: syntactic categories from which elements of the language can be derived. These are all the symbols on the left-hand side (LHS) of the rules.
 –Terminals: the actual elements of the language that are not expanded further. They do not appear on the left-hand side of any rule (e.g., +, A, …)
  ›Note: For practical reasons, parsing is typically done in two phases. First, some objects are recognized by the lexical scanning phase. Then the rest of the language is parsed, treating the objects recognized by lexical scanning as terminals.
 –Productions: the rules of the language, relating nonterminals and terminals.
 –The root: a distinguished nonterminal that will be the root of all parse trees for elements of the language.
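The four concepts map directly onto a small data structure. The sketch below is in C and uses a hypothetical expression grammar (the rules, the symbol names, and the helper functions are assumptions made for illustration, not the grammar from these slides). It classifies a symbol the way the definitions do: a nonterminal is any symbol that appears on the left-hand side of some rule, and a terminal is a symbol that is used in a rule but never appears on a left-hand side.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_RHS 4

struct production {
    const char *lhs;            /* a nonterminal */
    const char *rhs[MAX_RHS];   /* right-hand side symbols, NULL-terminated */
};

/* Hypothetical grammar: expr -> expr + term | term, and so on. */
static const struct production rules[] = {
    {"expr",   {"expr", "+", "term", NULL}},
    {"expr",   {"term", NULL}},
    {"term",   {"term", "*", "factor", NULL}},
    {"term",   {"factor", NULL}},
    {"factor", {"id", NULL}},
};
static const size_t rule_count = sizeof rules / sizeof rules[0];

/* The root: the nonterminal at the top of every parse tree. */
const char *const root = "expr";

/* A symbol is a nonterminal if it is the LHS of at least one rule. */
bool is_nonterminal(const char *symbol)
{
    for (size_t i = 0; i < rule_count; i++)
        if (strcmp(rules[i].lhs, symbol) == 0)
            return true;
    return false;
}

/* A symbol is a terminal if it occurs in some RHS and is never an LHS. */
bool is_terminal(const char *symbol)
{
    if (is_nonterminal(symbol))
        return false;
    for (size_t i = 0; i < rule_count; i++)
        for (size_t j = 0; j < MAX_RHS && rules[i].rhs[j] != NULL; j++)
            if (strcmp(rules[i].rhs[j], symbol) == 0)
                return true;
    return false;
}
```

In this sketch `id` is a terminal, which reflects the two-phase arrangement described in the note: the lexical scanner would recognize identifiers, and the parser would see each one as a single terminal symbol.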

Formal notion of a grammar

Classes of grammars

Formal notion of a grammar

Multiple grammars, single language
›Different grammars can be equivalent, i.e., they generate the same language.
›Grammars can be modified to remove “undesirable properties”.
›For example, it is better for the grammar not to be ambiguous; that is, it should not allow multiple parse trees for a given element of the language.

Example of ambiguous grammar (1/3)

Example of ambiguous grammar (2/3) * * 5 + 6

Example of ambiguous grammar (3/3)
›The examples above also show that we need the right grammar to represent
 –Associativity (Sec )
 –Precedence of operators (Sec )
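Why ambiguity matters can be made concrete by evaluating the two trees that an ambiguous grammar would allow for an expression of the form a * b + c. The sketch below is in C; the `node` structure, the helper names, and the sample values are assumptions chosen for illustration and are not taken from the slides.

```c
#include <stddef.h>

/* Expression tree node: op is '+' or '*', or 0 for a leaf holding value. */
struct node {
    char op;
    int value;
    const struct node *left, *right;
};

int eval(const struct node *n)
{
    if (n->op == 0)
        return n->value;
    int l = eval(n->left), r = eval(n->right);
    return n->op == '+' ? l + r : l * r;
}

/* Tree for (a * b) + c: multiplication binds tighter than addition. */
int eval_mul_first(int a, int b, int c)
{
    struct node na = {0, a, NULL, NULL};
    struct node nb = {0, b, NULL, NULL};
    struct node nc = {0, c, NULL, NULL};
    struct node mul = {'*', 0, &na, &nb};
    struct node add = {'+', 0, &mul, &nc};
    return eval(&add);
}

/* Tree for a * (b + c): the other parse an ambiguous grammar permits. */
int eval_add_first(int a, int b, int c)
{
    struct node na = {0, a, NULL, NULL};
    struct node nb = {0, b, NULL, NULL};
    struct node nc = {0, c, NULL, NULL};
    struct node add = {'+', 0, &nb, &nc};
    struct node mul = {'*', 0, &na, &add};
    return eval(&mul);
}
```

With a = 2, b = 5, c = 6 the first tree evaluates to 16 and the second to 22, so a compiler working from an ambiguous grammar could generate code for either result. A grammar that encodes precedence and associativity admits only one of the two trees.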

Left recursive grammar

A very simple compiler

Our first compiler

›Thus, for the assignment A = -A + 5 * B / (B-1)
›The compiler should generate
 ›LIT A
 ›LOAD
 ›NEG
 ›LIT 5
 ›LIT B
 ›LOAD
 ›MUL
 ›LIT B
 ›LOAD
 ›LIT 1
 ›NEG
 ›ADD
 ›DIV
 ›ADD
 ›STORE
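The listing can be traced with a small stack-machine interpreter. The sketch below is in C and rests on assumptions, because the machine definition is not part of this transcript: LIT is taken to push either a constant or the address of a variable, LOAD to replace an address on top of the stack with the value stored there, and NEG, ADD, MUL, DIV to work on the top of the stack with integer arithmetic. The final STORE is left out because its operand convention is not given here. All names (`run`, `rhs_code`, `ADDR_A`, `ADDR_B`) are illustrative.

```c
#include <assert.h>

enum op { LIT, LOAD, NEG, ADD, MUL, DIV };

struct instr {
    enum op op;
    int arg;            /* used by LIT only: a constant or an address */
};

#define STACK_MAX 32

/* Runs code[0..n-1] against memory[] and returns the single value left
 * on the stack. */
int run(const struct instr *code, int n, const int *memory)
{
    int stack[STACK_MAX];
    int sp = 0;         /* number of values currently on the stack */

    for (int i = 0; i < n; i++) {
        switch (code[i].op) {
        case LIT:  assert(sp < STACK_MAX); stack[sp++] = code[i].arg; break;
        case LOAD: assert(sp >= 1); stack[sp - 1] = memory[stack[sp - 1]]; break;
        case NEG:  assert(sp >= 1); stack[sp - 1] = -stack[sp - 1]; break;
        case ADD:  assert(sp >= 2); sp--; stack[sp - 1] += stack[sp]; break;
        case MUL:  assert(sp >= 2); sp--; stack[sp - 1] *= stack[sp]; break;
        case DIV:  assert(sp >= 2); sp--; stack[sp - 1] /= stack[sp]; break;
        }
    }
    assert(sp == 1);
    return stack[0];
}

enum { ADDR_A = 0, ADDR_B = 1 };    /* assumed addresses of A and B */

/* The listed code for -A + 5 * B / (B-1), without the final STORE. */
const struct instr rhs_code[] = {
    {LIT, ADDR_A}, {LOAD, 0}, {NEG, 0},
    {LIT, 5}, {LIT, ADDR_B}, {LOAD, 0}, {MUL, 0},
    {LIT, ADDR_B}, {LOAD, 0}, {LIT, 1}, {NEG, 0}, {ADD, 0},
    {DIV, 0}, {ADD, 0},
};
const int rhs_len = sizeof rhs_code / sizeof rhs_code[0];
```

Note how the subtraction B-1 appears as LIT 1, NEG, ADD: under these assumptions the machine needs no subtract instruction. With A = 3 and B = 6 stored in memory, `run(rhs_code, rhs_len, memory)` leaves 3 on the stack, which is -3 + 30 / 5.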

Grammar

Recursive descent compiler
›There is only one variable: token, which has the value of the next character in the input line.
›The main program is as follows:
    char token;
    token = nextchar();   // nextchar() skips spaces
    assignment();

Recursive descent compiler
    identifier(){
        print("LIT");
        print(token);
        token = nextchar();
    }
    integer(){
        print("LIT");
        print(token);
        token = nextchar();
    }

Recursive descent compiler
    // process a sequence of +/- operators
    while (token == '-' || token == '+') {
        char t = token;
        token = nextchar();
        term();
        if (t == '-')
            emit("NEG");   // subtraction is negate, then add
        emit("ADD");
    }
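The fragments on these slides can be assembled into a complete one-character-token compiler. The sketch below is in C and fills in what the transcript does not show, so the following are assumptions: the grammar (assignment → identifier = expression; expression → optional sign, then terms separated by + or -; term → factors separated by * or /; factor → identifier, digit, or parenthesized expression), the `compile` entry point, the output buffer, and the error handling. It also emits the address of the assignment target before the expression so that STORE has a destination, which is one more LIT than the target-code listing on the earlier slide shows.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

static const char *input;   /* unread part of the input line */
static char token;          /* next non-space character, '\0' at the end */
static char *out;           /* emitted code, one instruction per line */
static size_t out_cap;
static int failed;          /* set on a syntax error or a full buffer */

static void nextchar(void)  /* skips spaces, then advances token */
{
    while (*input == ' ') input++;
    token = *input;
    if (token != '\0') input++;
}

static void emit(const char *text)
{
    size_t used = strlen(out), add = strlen(text);
    if (used + add + 2 > out_cap) { failed = 1; return; }
    memcpy(out + used, text, add);
    out[used + add] = '\n';
    out[used + add + 1] = '\0';
}

static void emit_lit(char operand)
{
    char line[8];
    snprintf(line, sizeof line, "LIT %c", operand);
    emit(line);
}

static void expression(void);

static void factor(void)
{
    if (isalpha((unsigned char)token)) {         /* variable: address, then value */
        emit_lit(token); emit("LOAD"); nextchar();
    } else if (isdigit((unsigned char)token)) {  /* one-digit constant */
        emit_lit(token); nextchar();
    } else if (token == '(') {
        nextchar();
        expression();
        if (token == ')') nextchar(); else failed = 1;
    } else {
        failed = 1;
    }
}

static void term(void)
{
    factor();
    while (!failed && (token == '*' || token == '/')) {
        char t = token;
        nextchar();
        factor();
        emit(t == '*' ? "MUL" : "DIV");
    }
}

static void expression(void)
{
    char sign = '+';
    if (token == '-' || token == '+') { sign = token; nextchar(); }
    term();
    if (sign == '-') emit("NEG");
    while (!failed && (token == '-' || token == '+')) {
        char t = token;
        nextchar();
        term();
        if (t == '-') emit("NEG");               /* a - b is a + (-b) */
        emit("ADD");
    }
}

/* Compiles one assignment into buf. Returns 0 on success, -1 on error. */
int compile(const char *line, char *buf, size_t cap)
{
    input = line; out = buf; out_cap = cap; failed = 0;
    if (cap == 0) return -1;
    buf[0] = '\0';
    nextchar();
    if (!isalpha((unsigned char)token)) return -1;
    emit_lit(token);                             /* target address (assumption) */
    nextchar();
    if (token != '=') return -1;
    nextchar();
    expression();
    emit("STORE");
    return (failed || token != '\0') ? -1 : 0;
}
```

Each nonterminal of the assumed grammar becomes one function, and each function emits code as soon as it has recognized its operands, so no parse tree is ever built. Calling `compile("A = -A + 5 * B / (B-1)", buf, sizeof buf)` fills `buf` with LIT A, LIT A, LOAD, NEG, LIT 5, LIT B, LOAD, MUL, LIT B, LOAD, LIT 1, NEG, ADD, DIV, ADD, STORE, one instruction per line.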

Recursive descent compiler