

Presentation transcript:

Core

Core: a simple programming language for which you will write an interpreter as your project.

- First, define the Core grammar.
- Next, look at the details of how an interpreter for Core may be written.
- Approach used in the interpreter: recursive descent (also called "syntax directed").

CSE 3341/655; Part 2

BNF for Core

<prog>     ::= program <decl seq> begin <stmt seq> end          (1)
<decl seq> ::= <decl> | <decl> <decl seq>                       (2)
<stmt seq> ::= <stmt> | <stmt> <stmt seq>                       (3)
<decl>     ::= int <id list>;                                   (4)
<id list>  ::= <id> | <id>, <id list>                           (5)
<stmt>     ::= <assign> | <if> | <loop> | <in> | <out>          (6)
<assign>   ::= <id> = <exp>;                                    (7)
<if>       ::= if <cond> then <stmt seq> end;                   (8)
             | if <cond> then <stmt seq> else <stmt seq> end;
<loop>     ::= while <cond> loop <stmt seq> end;                (9)
<in>       ::= read <id list>;                                  (10)
<out>      ::= write <id list>;                                 (11)

BNF for Core (contd.)

<cond>    ::= <comp> | !<cond>                                  (12)
            | [<cond> && <cond>] | [<cond> or <cond>]
<comp>    ::= (<op> <comp op> <op>)                             (13)
<exp>     ::= <fac> | <fac>+<exp> | <fac>-<exp>                 (14)
<fac>     ::= <op> | <op> * <fac>                               (15)
<op>      ::= <int> | <id> | (<exp>)                            (16)
<comp op> ::= != | == | < | > | <= | >=                         (17)
<id>      ::= <let> | <let><id> | <let><int>                    (18)
<let>     ::= A | B | C | ... | X | Y | Z                       (19)
<int>     ::= <digit> | <digit><int>                            (20)
<digit>   ::= 0 | 1 | 2 | 3 | ... | 9                           (21)

Notes:
- Problem with <exp>: consider 9-5+4; fix?
- -5 is not a legal <int>; fix?
- Productions (18)-(21) have no semantic significance.
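The first note can be made concrete. Production (14) is right-recursive, so a parser that follows it literally groups 9-5+4 as 9-(5+4) rather than the conventional (9-5)+4. The sketch below is only an illustration in Python: the function names are hypothetical, a <fac> is simplified to a single integer token, and the left-to-right loop is one common remedy, not necessarily the fix intended in the course.

```python
def eval_right_recursive(tokens):
    """Evaluate by following <exp> ::= <fac> | <fac>+<exp> | <fac>-<exp> literally.

    Assumption for brevity: every <fac> is a single integer token.
    """
    value = int(tokens[0])
    if len(tokens) == 1:
        return value
    op = tokens[1]
    rest = eval_right_recursive(tokens[2:])  # the whole remainder is one <exp>
    return value + rest if op == "+" else value - rest


def eval_left_to_right(tokens):
    """Evaluate the same tokens, applying each operator as soon as it is seen."""
    value = int(tokens[0])
    for i in range(1, len(tokens), 2):
        operand = int(tokens[i + 1])
        value = value + operand if tokens[i] == "+" else value - operand
    return value


tokens = ["9", "-", "5", "+", "4"]
print(eval_right_recursive(tokens))  # groups as 9 - (5 + 4)
print(eval_left_to_right(tokens))    # groups as (9 - 5) + 4
```

The two functions disagree on this input, which is exactly the problem the note points at: the grammar accepts the right strings but suggests the wrong grouping for subtraction.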

Parse Tree for a simple program

program int X; begin X = 25; write X; end

<prog>
    program
    <decl seq>
        <decl>
            int
            <id list>
                <id>
                    <let> X
            ;
    begin
    <stmt seq>
        <stmt>
            <assign>
                <id>
                    <let> X
                =
                <exp> <...>
                ;
        <stmt seq>
            <stmt>
                <output>
                    write
                    <id list>
                        <id>
                            <let> X
                    ;
    end

Concrete vs. Abstract Parse Trees

program int X; begin X = 25; write X; end

[Figure: the concrete parse tree for this program, showing nodes such as <prog>, program, <decl seq>, begin, <stmt seq>, end, <decl>, int, <id list>, ;, <stmt>, <assign>, <id>, <let> X, =, <exp> <...>, <output>, and write. The figure also carries three "?" marks.]

Abstract Parse Tree

program int X; begin X = 25; write X; end

[Figure: abstract parse tree for this program. Node labels: <prog>, <decl seq>, <stmt seq>, <decl>, <stmt>, <id list>, <id> X, <assign>, <oper>, <fac>, <int> 25, <output>.]

1. What if we had declared Y instead of X?
2. What if we had exchanged the two statements?

Core Interpreter

- Tokenizer: takes the Core program as input and produces a stream of tokens.
- Parser: consumes the stream of tokens and produces the abstract parse tree (PT).
- Printer: given the PT, prints the original program in a pretty format.
- Executor: given the PT, executes the program.

The Parser, Printer, and Executor all use the recursive descent approach.

Related tools: Lex, YACC, Flex, Bison, ANTLR, ...

Slide 16 notes

How to do this in pure BNF? Using ε it is easy. Without it, the number of productions increases quite a bit. But using ε can cause problems for compilers. In homeworks, exams, etc., you may use it unless I say otherwise.

Relation to book: So far, mostly chapter 1 and sections 3.1, 3.2, 3.3; the rest of chapter 3 is not included. We will move to chapter 4; that will lead us to the project. A lot of this should be familiar from 321 and 625, but going over it again should make it easier to see how it relates to PLs and language implementations. The project also has some relation to 560.

Tokenizer

Tokens:
- Reserved words: program, begin, end, int, if, then, else, while, loop, read, write
- Operators/special symbols: ; , = ! [ ] && or ( ) + - * != == < > <= >=
- Integers (unsigned)
- Identifiers (start with an uppercase letter, followed by zero or more uppercase letters, followed by zero or more digits)

Tokenizer methods ...

- getToken(): returns (info about) the current token. Repeated calls to getToken() return the same token.
- skipToken(): skips the current token; the next token becomes the current token, so the next call to getToken() returns the new token.
- intVal(): returns the value of the current (integer) token. (What if the current token is not an integer? Error!)
- idName(): returns the name (string) of the current (id) token. (What if the current token is not an id? Error!)
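The four methods can be sketched as follows. This is a minimal Python illustration, not the project's required design: the snake_case names (get_token, skip_token, int_val, id_name) stand in for the slide's getToken, skipToken, intVal, and idName; the regular expression, the choice to tokenize the whole input up front, and the "eof" sentinel are all assumptions made for the example.

```python
import re

RESERVED = {"program", "begin", "end", "int", "if", "then", "else",
            "while", "loop", "read", "write"}

# One alternative per token class; two-character symbols come first so that
# "==" is not read as "=" followed by "=".
TOKEN_RE = re.compile(r"""\s*(?:
      (?P<word>[a-z]+)
    | (?P<id>[A-Z]+[0-9]*)
    | (?P<int>[0-9]+)
    | (?P<sym>&&|!=|==|<=|>=|[;,=!\[\]()+\-*<>])
    )""", re.VERBOSE)


class Tokenizer:
    def __init__(self, text):
        self.tokens = []
        text = text.rstrip()
        pos = 0
        while pos < len(text):
            m = TOKEN_RE.match(text, pos)
            if m is None:
                raise SyntaxError(f"illegal input near position {pos}")
            kind, lexeme = m.lastgroup, m.group(m.lastgroup)
            # Lowercase words must be reserved words or the operator "or".
            if kind == "word" and lexeme not in RESERVED and lexeme != "or":
                raise SyntaxError(f"unknown word {lexeme!r}")
            self.tokens.append((kind, lexeme))
            pos = m.end()
        self.tokens.append(("eof", ""))  # sentinel: there is always a current token
        self.current = 0

    def get_token(self):
        """Return the current token's text; repeated calls return the same token."""
        return self.tokens[self.current][1]

    def skip_token(self):
        """Advance to the next token (stays on the sentinel at end of input)."""
        if self.current < len(self.tokens) - 1:
            self.current += 1

    def int_val(self):
        kind, lexeme = self.tokens[self.current]
        if kind != "int":
            raise ValueError("current token is not an integer")
        return int(lexeme)

    def id_name(self):
        kind, lexeme = self.tokens[self.current]
        if kind != "id":
            raise ValueError("current token is not an id")
        return lexeme


t = Tokenizer("program int X; begin X = 25; write X; end")
print(t.get_token())   # the first token; calling again returns the same one
t.skip_token()
print(t.get_token())   # now the second token
```

Separating "look" (get_token) from "advance" (skip_token) is what lets a recursive descent parser inspect the next token to choose an alternative without consuming it.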

Recursive Descent

Key idea:
- A single procedure P_N corresponds to each non-terminal N.
- P_N is responsible for every occurrence of N, and only occurrences of N.
- We will use this approach for parsing, printing, and execution.

Details:
- Obtain the abstract parse tree.
- Pass the root node to P_S (S is the starting non-terminal).
- Each P_N gets most of its work done by the procedures corresponding to the children of the node it receives as argument.

Recursive Descent (contd.)

Example: an <if> node with children <cond>, <stmt seq>, <stmt seq>.

    void execIf( ?? ) {
        bool b = evalCond( ?? );
        if (b) { execSS( ?? ); return; }
        else if ( ?alt? ) { execSS( ?? ); return; }
        else return;
    }

So, we need:
1. The non-terminal at the current node
2. The alternative at the current node
3. A way to move to the children nodes

A (bad!) representation of PTs

An array representation of parse trees:
- Each node in the tree ↔ a row in the array.
- Each row has 5 columns:
    - the number corresponding to the non-terminal at the node;
    - the number corresponding to the alternative used;
    - the row numbers of the children nodes.

Representation of the <if> statement on the previous page: ...

Recursive Descent (contd)

    void execIf( int n ) {               // n is the row no. of the <if> node
        bool b = evalCond( PT[n,3] );    // PT is the parse tree array
        if (b) { execSS( PT[n,4] ); return; }
        else if (PT[n,2] == 2) { execSS( PT[n,5] ); return; }
        else return;
    }

Why do we need PT[n,1]? Why 5 columns in a row? What about <int>? What about <id>?
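To see the array representation in action, here is a small runnable sketch in Python. It is an illustration under several assumptions: columns are 0-based (so the slide's PT[n,1] is PT[n][0] here), the node kinds STUB_COND and STUB_SS are hypothetical stand-ins for real <cond> and <stmt seq> subtrees, and the trace list exists only so the effect can be observed.

```python
IF = 8            # production number of <if> on the BNF slide
STUB_COND = 100   # hypothetical node kind: alternative 1 means true, 2 means false
STUB_SS = 101     # hypothetical node kind: alternative column holds a tag to record

# Each row: [non-terminal, alternative, child 1, child 2, child 3]; -1 = no child.
PT = [
    [IF,        2,  1,  2,  3],   # row 0: if <cond> then <SS> else <SS> end;
    [STUB_COND, 2, -1, -1, -1],   # row 1: a condition that is false
    [STUB_SS,  10, -1, -1, -1],   # row 2: the "then" branch
    [STUB_SS,  20, -1, -1, -1],   # row 3: the "else" branch
]

trace = []  # records which statement sequences were executed


def eval_cond(n):
    assert PT[n][0] == STUB_COND, "row is not a condition node"
    return PT[n][1] == 1


def exec_ss(n):
    assert PT[n][0] == STUB_SS, "row is not a statement-sequence node"
    trace.append(PT[n][1])


def exec_if(n):
    # The first column lets the procedure verify it was handed an <if> node.
    assert PT[n][0] == IF, "row is not an <if> node"
    if eval_cond(PT[n][2]):
        exec_ss(PT[n][3])
    elif PT[n][1] == 2:          # alternative 2 is the form with an else part
        exec_ss(PT[n][4])


exec_if(0)
print(trace)  # only the else branch's tag is recorded, since the condition is false
```

The assertions hint at one answer to the question about the first column: without it, a procedure has no way to detect that it was handed the wrong kind of node.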

Recursive Descent (contd)

    void printIf( int n ) {    // n: row no. of <if> node
        // check PT[n,1] to see if this is an <if> node
        write("if");
        printCond( PT[n,3] );  // don't we have to evaluate the condition?
        write("then");
        printSS( PT[n,4] );    // what if it was not an <SS>?
        if (PT[n,2] == 2) {
            write("else");
            printSS( PT[n,5] );
        }
        write("end;");
    }

Recursive Descent (contd)

    void printAssign( int n ) {    // n: row no. of <assign> node
        // check PT[n,1] to see if this is an <assign> node
        printId( PT[n,3] );
        write("=");
        printExp( PT[n,4] );
    }    // bug in this code!
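One plausible reading of the "bug" comment is that production (7), <assign> ::= <id> = <exp>;, ends with a semicolon that the printer never writes. The Python sketch below prints an assignment with the semicolon included. It is hedged throughout: columns are 0-based, LEAF and LEAF_TEXT are hypothetical devices standing in for real <id> and <exp> subtrees (the plain 5-column array has no obvious place for names or values), and write() collects output in a list so it can be inspected.

```python
ASSIGN = 7   # production number of <assign> on the BNF slide
LEAF = 0     # hypothetical node kind whose text lives in LEAF_TEXT

# Each row: [non-terminal, alternative, child 1, child 2, child 3]; -1 = no child.
PT = [
    [ASSIGN, 1,  1,  2, -1],   # row 0: <assign> with children <id> and <exp>
    [LEAF,   1, -1, -1, -1],   # row 1: stands in for the <id> subtree
    [LEAF,   1, -1, -1, -1],   # row 2: stands in for the <exp> subtree
]
LEAF_TEXT = {1: "X", 2: "25"}  # hypothetical side table holding leaf text

out = []  # pieces of printed output, in order


def write(s):
    out.append(s)


def print_leaf(n):
    assert PT[n][0] == LEAF, "row is not a leaf stand-in"
    write(LEAF_TEXT[n])


def print_assign(n):
    assert PT[n][0] == ASSIGN, "row is not an <assign> node"
    print_leaf(PT[n][2])   # left-hand side
    write("=")
    print_leaf(PT[n][3])   # right-hand side
    write(";")             # the terminating semicolon required by production (7)


print_assign(0)
print(" ".join(out))
```

A useful check for any printer is that its output can be tokenized and parsed again into the same tree; a missing semicolon would fail that check immediately.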

Recursive Descent (contd)

    void execAssign( int n ) {         // n: row no. of <assign> node
        // check PT[n,1] to see if this is an <assign> node
        int x = evalExp( PT[n,4] );    // don't we have to first take care of PT[n,3]?
        assignIdVal( PT[n,3], x );     // what about PT[n,2]? PT[n,5]?
    }

Parser

Parsing is harder: there is no tree to descend!

The trick: build the tree *as* you descend!

Approach: the calling procedure creates an "empty" node, by grabbing the next free row from the PT array, and passes it to the appropriate parse procedure.

Recursive Descent Parsing

(Note: "t" is the (global) Tokenizer.)

    void parseIf( int n ) {          // node created by *caller* - who?
        PT[n,1] = 8;                 // why?
        string s = t.getToken();     // if s != "if" error!
        PT[n,3] = nextRow++;         // next free row; initialize?
        parseCond( PT[n,3] );        // bug!
        PT[n,4] = nextRow++;
        parseSS( PT[n,4] );          // bug!
        s = t.getToken();
        if (s != "else") { return; } // bug! bug!
        t.skipToken();
        PT[n,5] = nextRow++;
        parseSS( PT[n,5] );
        return;                      // not so fast!
    }
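The "bug" comments appear to point at token bookkeeping: the keywords if and then and the closing end; are never skipped, and the alternative number is never recorded. The Python sketch below shows one way the procedure might look with those steps filled in. It is an illustration, not the reference solution: TokenList is a hypothetical minimal stand-in for the Tokenizer, parse_stub is a placeholder for parseCond and parseSS that consumes exactly one token, columns are 0-based, and the kind numbers reuse the production numbers from the BNF slides.

```python
IF, COND, SS = 8, 12, 3   # production numbers of <if>, <cond>, <stmt seq>


class TokenList:
    """Hypothetical stand-in for the Tokenizer: get_token peeks, skip_token advances."""

    def __init__(self, tokens):
        self.tokens = list(tokens) + ["EOF"]
        self.pos = 0

    def get_token(self):
        return self.tokens[self.pos]

    def skip_token(self):
        if self.pos < len(self.tokens) - 1:
            self.pos += 1


PT = []  # parse tree array; each row is [non-terminal, alternative, c1, c2, c3]


def new_row():
    """Grab the next free row, initialized so that missing children are visible."""
    PT.append([0, 0, -1, -1, -1])
    return len(PT) - 1


def expect(t, word):
    """Check that the current token is `word`, then consume it."""
    if t.get_token() != word:
        raise SyntaxError(f"expected {word!r}, found {t.get_token()!r}")
    t.skip_token()


def parse_stub(t, n, kind):
    """Placeholder for parseCond/parseSS: marks the row and consumes one token."""
    PT[n][0] = kind
    t.skip_token()


def parse_if(t, n):
    PT[n][0] = IF
    expect(t, "if")                      # consume the keyword before the condition
    PT[n][2] = new_row()
    parse_stub(t, PT[n][2], COND)
    expect(t, "then")                    # consume "then" before the statements
    PT[n][3] = new_row()
    parse_stub(t, PT[n][3], SS)
    if t.get_token() == "else":
        t.skip_token()
        PT[n][1] = 2                     # alternative with an else part
        PT[n][4] = new_row()
        parse_stub(t, PT[n][4], SS)
    else:
        PT[n][1] = 1                     # alternative without an else part
    expect(t, "end")                     # both alternatives close with "end" ";"
    expect(t, ";")


t = TokenList(["if", "C", "then", "S1", "else", "S2", "end", ";"])
root = new_row()                         # the caller creates the node
parse_if(t, root)
print(PT[root])
```

Funnelling every keyword through expect keeps the "check, then skip" pair in one place, which makes it harder to forget either half.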