1 Chapter 2 A Simple Compiler. 2 Outlines 2.1 The Structure of a Micro Compiler 2.2 A Micro Scanner 2.3 The Syntax of Micro 2.4 Recursive Descent Parsing.

Slides:



Advertisements
Similar presentations
CPSC Compiler Tutorial 9 Review of Compiler.
Advertisements

176 Formal Languages and Applications: We know that Pascal programming language is defined in terms of a CFG. All the other programming languages are context-free.
1 Contents Introduction A Simple Compiler Scanning – Theory and Practice Grammars and Parsing LL(1) Parsing LR Parsing Lex and yacc Semantic Processing.
1 Chapter 5 Compilers Source Code (with macro) Macro Processor Expanded Code Compiler or Assembler obj.
CS 330 Programming Languages 09 / 13 / 2007 Instructor: Michael Eckmann.
Chapter 3 Describing Syntax and Semantics Sections 1-3.
Context-Free Grammars Lecture 7
Chapter 3 Program translation1 Chapt. 3 Language Translation Syntax and Semantics Translation phases Formal translation models.
Chapter 2 A Simple Compiler
Yu-Chen Kuo1 Chapter 2 A Simple One-Pass Compiler.
Dr. Muhammed Al-Mulhem 1ICS ICS 535 Design and Implementation of Programming Languages Part 1 Fundamentals (Chapter 4) Compilers and Syntax.
Chapter 2 A Simple Compiler
1 1 Chapter 2 A Simple Compiler Prof Chung. 1 NTHU SSLAB7/2/2015.
1 Chapter 3 Context-Free Grammars and Parsing. 2 Parsing: Syntax Analysis decides which part of the incoming token stream should be grouped together.
UMBC Introduction to Compilers CMSC 431 Shon Vick 01/28/02.
(2.1) Grammars  Definitions  Grammars  Backus-Naur Form  Derivation – terminology – trees  Grammars and ambiguity  Simple example  Grammar hierarchies.
1 Scanning Aaron Bloomfield CS 415 Fall Parsing & Scanning In real compilers the recognizer is split into two phases –Scanner: translate input.
Chapter 2 Syntax A language that is simple to parse for the compiler is also simple to parse for the human programmer. N. Wirth.
Compiler Construction 1. Objectives Given a context-free grammar, G, and the grammar- independent functions for a recursive-descent parser, complete the.
1 Syntax and Semantics The Purpose of Syntax Problem of Describing Syntax Formal Methods of Describing Syntax Derivations and Parse Trees Sebesta Chapter.
2.2 A Simple Syntax-Directed Translator Syntax-Directed Translation 2.4 Parsing 2.5 A Translator for Simple Expressions 2.6 Lexical Analysis.
ICS611 Introduction to Compilers Set 1. What is a Compiler? A compiler is software (a program) that translates a high-level programming language to machine.
CPSC 388 – Compiler Design and Construction Parsers – Context Free Grammars.
Syntax Analysis (Chapter 4) 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture.
Syntax and Semantics Structure of programming languages.
Chapter 1 Introduction Dr. Frank Lee. 1.1 Why Study Compiler? To write more efficient code in a high-level language To provide solid foundation in parsing.
1 Chapter 5 LL (1) Grammars and Parsers. 2 Naming of parsing techniques The way to parse token sequence L: Leftmost R: Righmost Top-down  LL Bottom-up.
1 Semantic Analysis Aaron Bloomfield CS 415 Fall 2005.
Winter 2007SEG2101 Chapter 71 Chapter 7 Introduction to Languages and Compiler.
Introduction to Parsing Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University.
1 Chapter 3 Describing Syntax and Semantics. 3.1 Introduction Providing a concise yet understandable description of a programming language is difficult.
Context-Free Grammars
CS Describing Syntax CS 3360 Spring 2012 Sec Adapted from Addison Wesley’s lecture notes (Copyright © 2004 Pearson Addison Wesley)
PART I: overview material
Lesson 3 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.
ISBN Chapter 3 Describing Syntax and Semantics.
Review 1.Lexical Analysis 2.Syntax Analysis 3.Semantic Analysis 4.Code Generation 5.Code Optimization.
Unit-1 Introduction Prepared by: Prof. Harish I Rathod
1 Syntax In Text: Chapter 3. 2 Chapter 3: Syntax and Semantics Outline Syntax: Recognizer vs. generator BNF EBNF.
Syntax and Semantics Structure of programming languages.
Chapter 2. Design of a Simple Compiler J. H. Wang Sep. 21, 2015.
1 Chapter 1 Introduction. 2 Outlines 1.1 Overview and History 1.2 What Do Compilers Do? 1.3 The Structure of a Compiler 1.4 The Syntax and Semantics of.
CPS 506 Comparative Programming Languages Syntax Specification.
C Chuen-Liang Chen, NTUCS&IE / 11 A SIMPLE COMPILER Chuen-Liang Chen Department of Computer Science and Information Engineering National Taiwan University.
D Goforth COSC Translating High Level Languages.
D Goforth COSC Translating High Level Languages Note error in assignment 1: #4 - refer to Example grammar 3.4, p. 126.
Syntax The Structure of a Language. Lexical Structure The structure of the tokens of a programming language The scanner takes a sequence of characters.
Muhammad Idrees, Lecturer University of Lahore 1 Top-Down Parsing Top down parsing can be viewed as an attempt to find a leftmost derivation for an input.
Compiler Design Introduction 1. 2 Course Outline Introduction to Compiling Lexical Analysis Syntax Analysis –Context Free Grammars –Top-Down Parsing –Bottom-Up.
Compiler Introduction 1 Kavita Patel. Outlines 2  1.1 What Do Compilers Do?  1.2 The Structure of a Compiler  1.3 Compilation Process  1.4 Phases.
1 Compiler & its Phases Krishan Kumar Asstt. Prof. (CSE) BPRCE, Gohana.
1 A Simple Syntax-Directed Translator CS308 Compiler Theory.
CSC 4181 Compiler Construction
Overview of Compilation Prepared by Manuel E. Bermúdez, Ph.D. Associate Professor University of Florida Programming Language Principles Lecture 2.
Copyright © 2006 Addison-Wesley. All rights reserved.1-1 ICS 410: Programming Languages Chapter 3 : Describing Syntax and Semantics Syntax.
CS510 Compiler Lecture 1. Sources Lecture Notes Book 1 : “Compiler construction principles and practice”, Kenneth C. Louden. Book 2 : “Compilers Principles,
CC410: System Programming Dr. Manal Helal – Fall 2014 – Lecture 12–Compilers.
Chapter 3 – Describing Syntax CSCE 343. Syntax vs. Semantics Syntax: The form or structure of the expressions, statements, and program units. Semantics:
Syntax and Semantics Structure of programming languages.
Introduction to Parsing
Chapter 3 – Describing Syntax
A Simple Syntax-Directed Translator
Programming Languages Translator
Compiler Lecture 1 CS510.
R.Rajkumar Asst.Professor CSE
Context–free Grammar CFG is also called BNF (Backus-Naur Form) grammar
Compiler design.
High-Level Programming Language
Faculty of Computer Science and Information System
Presentation transcript:

1 Chapter 2 A Simple Compiler

2 Outlines 2.1 The Structure of a Micro Compiler 2.2 A Micro Scanner 2.3 The Syntax of Micro 2.4 Recursive Descent Parsing 2.5 Translating Micro

3 Micro Micro: a very simple language –Only integers –No declarations –Variables consist of A..Z, 0..9, and at most 32 characters long. –Comments begin with -- and end with end-of-line. –Three kinds of statements: assignments, e.g., a := b + c read(list of ids), e.g., read(a, b) write(list of exps), e.g., write(a+b) –Begin, end, read, and write are reserved words. –Tokens may not extend to the following line.

4 The Structure of a Micro compiler One-pass type, no explicit intermediate representations used –See P. 9, Fig. 1.3 The interface –Parser is the main routine. –Parser calls scanner to get the next token. –Parser calls semantic routines at appropriate times. –Semantic routines produce output in assembly language. –A simple symbol table is used by the semantic routines.

5 ScannerParser Semantic Routines Source Program (Character Stream) TokensSyntactic Structure Target Machine Code Symbol and Attribute Tables (Used by all Phases of The Compiler)

6 A Micro Scanner The Micro Scanner will be a function of no arguments that returns token values –There are 14 tokens. typedef enum token_types { BEGIN, END, READ, WRITE, ID, INTLITERAL, LPAREN, RPAREN, SEMICOLON, COMMA, ASSIGNOP, PLUSOP, MINUSOP, SCANEOF } token; Extern token scanner(void);

7 A Program Example of Micro Begin A:=BB-314+A; end SCANEOF

8 A Program Example of Micro Begin A:=BB-314+A; end SCANEOF

9 A Micro Scanner (Cont’d) The scanner returns the longest string that constitutes a token, e.g., in abcdef ab, abc, abcdef are all valid tokens. The scanner will return the longest one (i.e., abcdef).

10 continue: skip one iteration getchar(), isspace(), isalpha(), isalnum(), isdigital() ungetc(): push back one character

11

12 A Micro Scanner (Cont’d) How to handle RESERVED words? –Reserved words are similar to identifiers. Two approaches: –Use a separate table of reserved words –Put all reserved words into symbol table initially.

13 A Micro Scanner (Cont’d) Provision for saving the characters of a token as they are scanned –token_buffer, buffer_char(), clear_buffer(), check_reserved() Handle end of file –feof(stdin)

14 Complete Scanner Function for Micro

15

16 The Syntax of Micro Micro's syntax is defined by a context-free grammar (CFG) –CFG is also called BNF (Backus-Naur Form) grammar CFG consists of a set of production rules, A  B C D  Z LHS must be a single nonterminal RHS consists 0 or more terminals or nonterminals LHS RHS

17 The Syntax of Micro (Cont’d) Two kinds of symbols –Nonterminals Delimited by Represent syntactic structures –Terminals Represent tokens E.g.  begin end Start or goal symbol : empty or null string

18 The Syntax of Micro (Cont’d) E.g. 

19 The Syntax of Micro (Cont’d) Extended BNF: some abbreviations 1. optional: [ ] 0 or 1  if then  if then else can be written as  if then [ else ] 2. repetition: { } 0 or more  can be written as  { }

20 The Syntax of Micro (Cont’d) Extended BNF: some abbreviations 3. alternative: | or  can be written as  | Extended BNF == BNF –Either can be transformed to the other. –Extended BNF is more compact and readable

21 The Syntax of Micro (Cont’d)

22 The derivation of begin ID:= ID + (INTLITERAL – ID); end              

23 The Syntax of Micro (Cont’d) A CFG defines a language, which is a set of sequences of tokens Syntax errors & semantic errors A:=‘X’+True;

24 The Syntax of Micro (Cont’d) Associativity A-B-C Operator precedence A+B*C

25 A grammar fragment defines such a precedence relationship

26 With parentheses, the desired grouping can be forced

27 Recursive Descent Parsing There are many parsing techniques. –Recursive descent is one of the simplest parsing techniques Basic idea –Each nonterminal has a parsing procedure –For symbol on the RHS : a sequence of matching To Match a nonterminal A –Call the parsing procedure of A To match a terminal symbol t –Call match(t) »match(t) calls the scanner to get the next token. If is it t, everything is correct. If it is not t, we have found a syntax error.

28 Recursive Descent Parsing If a nonterminal has several productions, choose an appropriate one based on the next input token. Parser is started by invoking system_goal().

29 next_token() : a function that returns the next token. It does not call scanner(void). 處理 { } 不出 現的情形

30 必出現一次

31

32

33

34 Translating Micro Target language: 3-addr code (quadruple) –OP A, B, C Note that we did not worry about registers at this time. –temporaries: Sometimes we need to hold temporary values. E.g. A+B+C ADD A,B,TEMP&1 ADD TEMP&1,C,TEMP&2

35 Translating Micro (Cont’d) Action Symbols –The bulk of a translation is done by semantic routine –Action symbols can be added to a grammar to specify when semantic processing should take place Be placed anywhere in the RHS of a production –translated into procedure call in the parsing procedures #add corresponds to a semantic routine named add() –No impact on the languages recognized by a parser driven by a CFG

36 ScannerParser Semantic Routines Source Program (Character Stream) TokensSyntactic Structure Target Machine Code Symbol and Attribute Tables (Used by all Phases of The Compiler)

37

38 Semantic Information Semantic routines need certain information to do their work. –These information is stored in semantic records. –Each kind of grammar symbol has a semantic record

39 Previous parsing procedure P. 37 New parsing procedure which involved semantic routines  { } #gen_infix

40 Semantic Information (Cont’d) Subroutines for symbol table and temporaries

41 Semantic Information (Cont’d) Semantic routines

42

43

44

45

46