Lecture 2: General Structure of a Compiler

Slides:



Advertisements
Similar presentations
Semantic analysis Parsing only verifies that the program consists of tokens arranged in a syntactically-valid combination, we now move on to semantic analysis,
Advertisements

Yu-Chen Kuo1 Chapter 1 Introduction to Compiling.
Reference Book: Modern Compiler Design by Grune, Bal, Jacobs and Langendoen Wiley 2000.
From Cooper & Torczon1 Implications Must recognize legal (and illegal) programs Must generate correct code Must manage storage of all variables (and code)
Compiler Construction
1.3 Executing Programs. How is Computer Code Transformed into an Executable? Interpreters Compilers Hybrid systems.
CPSC 388 – Compiler Design and Construction Lecture: MWF 11:00am-12:20pm, Room 106 Colton.
COP4020 Programming Languages
Chapter 1. Introduction.
Chapter 1 Introduction Dr. Frank Lee. 1.1 Why Study Compiler? To write more efficient code in a high-level language To provide solid foundation in parsing.
Parser-Driven Games Tool programming © Allan C. Milne Abertay University v
Chapter 10: Compilers and Language Translation Invitation to Computer Science, Java Version, Third Edition.
Lexical Analysis - An Introduction Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at.
Compiler course 1. Introduction. Outline Scope of the course Disciplines involved in it Abstract view for a compiler Front-end and back-end tasks Modules.
1 COMP 3438 – Part II-Lecture 1: Overview of Compiler Design Dr. Zili Shao Department of Computing The Hong Kong Polytechnic Univ.
Unit-1 Introduction Prepared by: Prof. Harish I Rathod
1.  10% Assignments/ class participation  10% Pop Quizzes  05% Attendance  25% Mid Term  50% Final Term 2.
Compiler design Lecture 1: Compiler Overview Sulaimany University 2 Oct
Chapter 1 Introduction. Chapter 1 - Introduction 2 The Goal of Chapter 1 Introduce different forms of language translators Give a high level overview.
Introduction Lecture 1 Wed, Jan 12, The Stages of Compilation Lexical analysis. Syntactic analysis. Semantic analysis. Intermediate code generation.
Introduction to Compilers. Related Area Programming languages Machine architecture Language theory Algorithms Data structures Operating systems Software.
Topic #1: Introduction EE 456 – Compiling Techniques Prof. Carl Sable Fall 2003.
Overview of Previous Lesson(s) Over View  A program must be translated into a form in which it can be executed by a computer.  The software systems.
1 Compiler Design (40-414)  Main Text Book: Compilers: Principles, Techniques & Tools, 2 nd ed., Aho, Lam, Sethi, and Ullman, 2007  Evaluation:  Midterm.
Chapter 1 Introduction Study Goals: Master: the phases of a compiler Understand: what is a compiler Know: interpreter,compiler structure.
Introduction to Compiling
Compiler Design Introduction 1. 2 Course Outline Introduction to Compiling Lexical Analysis Syntax Analysis –Context Free Grammars –Top-Down Parsing –Bottom-Up.
Intermediate Code Representations
Compiler Introduction 1 Kavita Patel. Outlines 2  1.1 What Do Compilers Do?  1.2 The Structure of a Compiler  1.3 Compilation Process  1.4 Phases.
The Model of Compilation Natawut Nupairoj, Ph.D. Department of Computer Engineering Chulalongkorn University.
1 A Simple Syntax-Directed Translator CS308 Compiler Theory.
What is a compiler? –A program that reads a program written in one language (source language) and translates it into an equivalent program in another language.
1 Compiler Construction Vana Doufexi office CS dept.
Objective of the course Understanding the fundamentals of the compilation technique Assist you in writing you own compiler (or any part of compiler)
CHAPTER 1 INTRODUCTION TO COMPILER SUNG-DONG KIM, DEPT. OF COMPUTER ENGINEERING, HANSUNG UNIVERSITY.
Chapter 1. Introduction.
Lecture 9 Symbol Table and Attributed Grammars
Advanced Computer Systems
Compiler Design (40-414) Main Text Book:
Introduction Chapter : Introduction.
PRINCIPLES OF COMPILER DESIGN
Chapter 1 Introduction.
Introduction to Compiler Construction
Lexical and Syntax Analysis
A Simple Syntax-Directed Translator
Compiler Construction (CS-636)
Introduction.
Chapter 1 Introduction.
课程名 编译原理 Compiling Techniques
Chapter 1: Introduction to Compiling (Cont.)
Compiler Lecture 1 CS510.
CS 536 / Fall 2017 Introduction to programming languages and compilers
Compiler Construction
CS416 Compiler Design lec00-outline September 19, 2018
CSE 3302 Programming Languages
Introduction to Compiler Construction
Course supervisor: Lubna Siddiqui
Introduction CI612 Compiler Design CI612 Compiler Design.
Lecture 5: Lexical Analysis III: The final bits
Lecture 7: Introduction to Parsing (Syntax Analysis)
Compiler Construction
Compiler design.
CS416 Compiler Design lec00-outline February 23, 2019
Introduction to Compiler Construction
Chapter 10: Compilers and Language Translation
Compiler Construction
Lec00-outline May 18, 2019 Compiler Design CS416 Compiler Design.
Introduction Chapter : Introduction.
Introduction to Compiler Construction
Faculty of Computer Science and Information System
Presentation transcript:

Lecture 2: General Structure of a Compiler Source code Object code Errors (from last lecture) The compiler:- must generate correct code. must recognise errors. analyses and synthesises. In today’s lecture: more details about the compiler’s structure. 13-Nov-18 COMP36512 Lecture 2

Conceptual Structure:two major phases Front-End Back-End Source code Intermediate Representation Target code Front-end performs the analysis of the source language: Recognises legal and illegal programs and reports errors. “understands” the input program and collects its semantics in an IR. Produces IR and shapes the code for the back-end. Much can be automated. Back-end does the target language synthesis: Chooses instructions to implement each IR operation. Translates IR into target code. Needs to conform with system interfaces. Automation has been less successful. Typically front-end is O(n), while back-end is NP-complete. What is the implication of this separation (front-end: analysis; back-end:synthesis) in building a compiler for, say, a new language? A problem which we don’t know how to solve in less than exponential time 13-Nov-18 COMP36512 Lecture 2

mn compilers with m+n components! target 1 Fortran Front-end Back-end Smalltalk target 2 Front-end Back-end I.R. C target 3 Front-end Back-end Java target 4 Front-end Back-end All language specific knowledge must be encoded in the front-end All target specific knowledge must be encoded in the back-end But: in practice, this strict separation is not free of charge. 13-Nov-18 COMP36512 Lecture 2

General Structure of a compiler Source Lexical Analysis I.C. Optimisation tokens IR Syntax Analysis Code Generation I.R. Abstract Syntax Tree (AST) symbolic instructions Semantic Analysis Target code Optimisation Annotated AST optimised symbolic instr. Intermediate code generat. Target code Generation Target front-end back-end 13-Nov-18 COMP36512 Lecture 2

Lexical Analysis (Scanning) Reads characters in the source program and groups them into words (basic unit of syntax) Produces words and recognises what sort they are. The output is called token and is a pair of the form <type, lexeme> or <token_class, attribute> E.g.: a=b+c becomes <id,a> <=,> <id,b> <+,> <id,c> Needs to record each id attribute: keep a symbol table. Lexical analysis eliminates white space, etc… Speed is important - use a specialised tool: e.g., flex - a tool for generating scanners: programs which recognise lexical patterns in text; for more info: % man flex 13-Nov-18 COMP36512 Lecture 2

Syntax (or syntactic) Analysis (Parsing) Imposes a hierarchical structure on the token stream. This hierarchical structure is usually expressed by recursive rules. Context-free grammars formalise these recursive rules and guide syntax analysis. Example: expression  expression ‘+’ term | expression ‘-’ term | term term  term ‘*’ factor | term ‘/’ factor | factor factor  identifier | constant | ‘(‘ expression ‘)’ (this grammar defines simple algebraic expressions) 13-Nov-18 COMP36512 Lecture 2

Parsing: parse tree for b*b-4*a*c expression expression - term * term factor term * <id,c> term factor term factor * <id,b> <id,a> factor factor Useful to recognise a valid sentence! Contains a lot of unneeded information! <id,b> <const, 4> 13-Nov-18 COMP36512 Lecture 2

AST for b*b-4*a*c - * * <id,b> <id,b> * <id,c> <const, 4> <id,a> An Abstract Syntax Tree (AST) is a more useful data structure for internal representation. It is a compressed version of the parse tree (summary of grammatical structure without details about its derivation) ASTs are one form of IR 13-Nov-18 COMP36512 Lecture 2

Semantic Analysis (context handling) Collects context (semantic) information, checks for semantic errors, and annotates nodes of the tree with the results. Examples: type checking: report error if an operator is applied to an incompatible operand. check flow-of-controls. uniqueness or name-related checks. 13-Nov-18 COMP36512 Lecture 2

Intermediate code generation Translate language-specific constructs in the AST into more general constructs. A criterion for the level of “generality”: it should be straightforward to generate the target code from the intermediate representation chosen. Example of a form of IR (3-address code): tmp1=4 tmp2=tmp1*a tmp3=tmp2*c tmp4=b*b tmp5=tmp4-tmp3 13-Nov-18 COMP36512 Lecture 2

Code Optimisation The goal is to improve the intermediate code and, thus, the effectiveness of code generation and the performance of the target code. Optimisations can range from trivial (e.g. constant folding) to highly sophisticated (e.g, in-lining). For example: replace the first two statements in the example of the previous slide with: tmp2=4*a Modern compilers perform such a range of optimisations, that one could argue for: Front-End Middle-End (optimiser) Back-End Target code Source code IR IR 13-Nov-18 COMP36512 Lecture 2

Code Generation Phase Map the AST onto a linear list of target machine instructions in a symbolic form: Instruction selection: a pattern matching problem. Register allocation: each value should be in a register when it is used (but there is only a limited number): NP-Complete problem. Instruction scheduling: take advantage of multiple functional units: NP-Complete problem. Target, machine-specific properties may be used to optimise the code. Finally, machine code and associated information required by the Operating System are generated. 13-Nov-18 COMP36512 Lecture 2

Some historical notes... Emphasis of compiler construction research: 1945-1960: code generation need to “prove” that high-level programming can produce efficient code (“automatic programming”). 1960-1975: parsing proliferation of programming languages study of formal languages reveals powerful techniques. 1975-...: code generation and code optimisation Knuth (1962) observed that “in this field there has been an unusual amount of parallel discovery of the same technique by people working independently” 13-Nov-18 COMP36512 Lecture 2

Historical Notes: the Move to Higher-Level Programming Languages Machine Languages (1st generation) Assembly Languages (2nd generation) – early 1950s High-Level Languages (3rd generation) – later 1950s 4th generation higher level languages (SQL, Postscript) 5th generation languages (logic based, eg, Prolog) Other classifications: Imperative (how); declarative (what) Object-oriented languages Scripting languages See exercise 1.3.1 in Aho2 13-Nov-18 COMP36512 Lecture 2

Finally... Summary: the structure of a typical compiler was described. Parts of a compiler can be generated automatically using generators based on formalisms. E.g.: Scanner generators: flex Parser generators: bison Summary: the structure of a typical compiler was described. Next time: Introduction to lexical analysis. Reading: Aho2, Sections 1.2, 1.3; Aho1, pp. 1-24; Hunter, pp. 1-15 (try the exercises); Grune [rest of Chapter 1 up to Section 1.8] (try the exercises); Cooper & Torczon (1st edition), Sections 1.4, 1.5. 13-Nov-18 COMP36512 Lecture 2