Copyright © 2006 The McGraw-Hill Companies, Inc. Programming Languages 2nd edition Tucker and Noonan Chapter 3 Lexical and Syntactic Analysis Syntactic.

Slides:



Advertisements
Similar presentations
Parsing 4 Dr William Harrison Fall 2008
Advertisements

Chapter 2 Syntax A language that is simple to parse for the compiler is also simple to parse for the human programmer. N. Wirth.
Translator Architecture Code Generator ParserTokenizer string of characters (source code) string of tokens abstract program string of integers (object.
ICE1341 Programming Languages Spring 2005 Lecture #5 Lecture #5 In-Young Ko iko.AT. icu.ac.kr iko.AT. icu.ac.kr Information and Communications University.
Copyright © 2006 The McGraw-Hill Companies, Inc. Programming Languages 2nd edition Tucker and Noonan Chapter 2 Syntax A language that is simple to parse.
Copyright © 2006 The McGraw-Hill Companies, Inc. Programming Languages 2nd edition Tucker and Noonan Chapter 3 Lexical and Syntactic Analysis Syntactic.
Copyright © 2006 The McGraw-Hill Companies, Inc. Programming Languages 2nd edition Tucker and Noonan Chapter 2 Syntax A language that is simple to parse.
Copyright © 2006 The McGraw-Hill Companies, Inc. Programming Languages 2nd edition Tucker and Noonan Chapter 2 Syntax A language that is simple to parse.
Programming Languages 2nd edition Tucker and Noonan
Slide 1 Chapter 2-b Syntax, Semantics. Slide 2 Syntax, Semantics - Definition The syntax of a programming language is the form of its expressions, statements.
Copyright © 2006 The McGraw-Hill Companies, Inc. Programming Languages 2nd edition Tucker and Noonan Chapter 3 Lexical and Syntactic Analysis Syntactic.
Chapter 2 A Simple Compiler
Copyright © 2006 The McGraw-Hill Companies, Inc. Programming Languages 2nd edition Tucker and Noonan Chapter 6 Type Systems I was eventually persuaded.
Copyright © 2006 The McGraw-Hill Companies, Inc. Programming Languages 2nd edition Tucker and Noonan Chapter 2 Syntax A language that is simple to parse.
Chapter 2 A Simple Compiler
Copyright © 2006 The McGraw-Hill Companies, Inc. Programming Languages 2nd edition Tucker and Noonan Chapter 3 Lexical and Syntactic Analysis Syntactic.
CSC3315 (Spring 2009)1 CSC 3315 Lexical and Syntax Analysis Hamid Harroud School of Science and Engineering, Akhawayn University
1 Syntax and Semantics The Purpose of Syntax Problem of Describing Syntax Formal Methods of Describing Syntax Derivations and Parse Trees Sebesta Chapter.
Building lexical and syntactic analyzers
CPSC 388 – Compiler Design and Construction Parsers – Context Free Grammars.
Syntax Analysis (Chapter 4) 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture.
Syntax and Semantics Structure of programming languages.
Parsing Chapter 4 Parsing2 Outline Top-down v.s. Bottom-up Top-down parsing Recursive-descent parsing LL(1) parsing LL(1) parsing algorithm First.
1 Chapter 2 A Simple Compiler. 2 Outlines 2.1 The Structure of a Micro Compiler 2.2 A Micro Scanner 2.3 The Syntax of Micro 2.4 Recursive Descent Parsing.
Syntax – Intro and Overview CS331. Syntax Syntax defines what is grammatically valid in a programming language –Set of grammatical rules –E.g. in English,
1 Top Down Parsing. CS 412/413 Spring 2008Introduction to Compilers2 Outline Top-down parsing SLL(1) grammars Transforming a grammar into SLL(1) form.
CS Describing Syntax CS 3360 Spring 2012 Sec Adapted from Addison Wesley’s lecture notes (Copyright © 2004 Pearson Addison Wesley)
Context-Free Grammars and Parsing
Copyright © 2006 The McGraw-Hill Companies, Inc. Programming Languages 2nd edition Tucker and Noonan Chapter 2 Syntax A language that is simple to parse.
PART I: overview material
1 Syntax In Text: Chapter 3. 2 Chapter 3: Syntax and Semantics Outline Syntax: Recognizer vs. generator BNF EBNF.
Dr. Philip Cannata 1 Lexical and Syntactic Analysis Chomsky Grammar Hierarchy Lexical Analysis – Tokenizing Syntactic Analysis – Parsing Hmm Concrete Syntax.
Syntax and Semantics Structure of programming languages.
CPS 506 Comparative Programming Languages Syntax Specification.
D Goforth COSC Translating High Level Languages Note error in assignment 1: #4 - refer to Example grammar 3.4, p. 126.
Chapter 4 Top-Down Parsing Recursive-Descent Gang S. Liu College of Computer Science & Technology Harbin Engineering University.
Syntax (2).
Chapter 3 Context-Free Grammars and Parsing. The Parsing Process sequence of tokens syntax tree parser Duties of parser: Determine correct syntax Build.
Syntax Analysis (Chapter 4) 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture.
Top-Down Parsing CS 671 January 29, CS 671 – Spring Where Are We? Source code: if (b==0) a = “Hi”; Token Stream: if (b == 0) a = “Hi”; Abstract.
Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. C H A P T E R T W O Syntax.
1 A Simple Syntax-Directed Translator CS308 Compiler Theory.
Copyright © 2006 The McGraw-Hill Companies, Inc. Programming Languages 2nd edition Tucker and Noonan Chapter 2 Syntax A language that is simple to parse.
CSE 5317/4305 L3: Parsing #11 Parsing #1 Leonidas Fegaras.
C H A P T E R T W O Linking Syntax And Semantics Programming Languages – Principles and Paradigms by Allen Tucker, Robert Noonan.
CPSC 388 – Compiler Design and Construction Parsers – Syntax Directed Translation.
1 Introduction to Parsing. 2 Outline l Regular languages revisited l Parser overview Context-free grammars (CFG ’ s) l Derivations.
1 Topic #4: Syntactic Analysis (Parsing) CSC 338 – Compiler Design and implementation Dr. Mohamed Ben Othman ( )
Bernd Fischer RW713: Compiler and Software Language Engineering.
MiniJava Compiler A multi-back-end JIT compiler of Java.
Copyright © 2006 The McGraw-Hill Companies, Inc. Programming Languages 2nd edition Tucker and Noonan Chapter 8 Semantic Interpretation To understand a.
Chapter 3 – Describing Syntax CSCE 343. Syntax vs. Semantics Syntax: The form or structure of the expressions, statements, and program units. Semantics:
Syntax and Semantics Structure of programming languages.
Chapter 3 – Describing Syntax
Parsing #1 Leonidas Fegaras.
Programming Languages 2nd edition Tucker and Noonan
Programming Languages Translator
Chapter 3 Context-Free Grammar and Parsing
Introduction to Parsing (adapted from CS 164 at Berkeley)
Chapter 3 – Describing Syntax
Chapter 4 Top-Down Parsing Part-1 September 8, 2018
Compiler Construction (CS-636)
Programming Languages 2nd edition Tucker and Noonan
Lecture 4: Lexical Analysis & Chomsky Hierarchy
Programming Languages 2nd edition Tucker and Noonan
The Recursive Descent Algorithm
High-Level Programming Language
Programming Languages 2nd edition Tucker and Noonan
COMPILER CONSTRUCTION
Presentation transcript:

Copyright © 2006 The McGraw-Hill Companies, Inc. Programming Languages 2nd edition Tucker and Noonan Chapter 3 Lexical and Syntactic Analysis Syntactic sugar causes cancer of the semicolon. A. Perlis

Copyright © 2006 The McGraw-Hill Companies, Inc. Contents 3.1 Chomsky Hierarchy 3.2 Lexical Analysis 3.3 Syntactic Analysis

Copyright © 2006 The McGraw-Hill Companies, Inc. Syntactic Analysis Phase also known as: parser Purpose is to recognize source structure Input: tokens Output: parse tree or abstract syntax tree A recursive descent parser is one in which each nonterminal in the grammar is converted to a function which recognizes input derivable from the nonterminal.

Copyright © 2006 The McGraw-Hill Companies, Inc. Program Structure consists of: Expressions: x + 2 * y Assignment Statement: z = x + 2 * y Loop Statements: while (i < n) a[i++] = 0; Function definitions Declarations: int i;

Copyright © 2006 The McGraw-Hill Companies, Inc. Assignment → Identifier = Expression Expression → Term { AddOp Term } AddOp → + | - Term → Factor { MulOp Factor } MulOp → * | / Factor → [ UnaryOp ] Primary UnaryOp → - | ! Primary → Identifier | Literal | ( Expression )

Copyright © 2006 The McGraw-Hill Companies, Inc. First Set Augmented form: S – start symbol S' → S $ First(X) is set of leftmost terminal symbols derivable from X First(X) = { a  T | X →* a w, w  (N  T)* } Since there are no production rules for terminal symbols: First(a) = a, a  T

Copyright © 2006 The McGraw-Hill Companies, Inc. For w = X1... Xn V... First(w) = First(X1) ...  First(Xn)  First(V) where X1,..., Xn are nullable and V is not nullable A is nullable if it derives the empty string.

Copyright © 2006 The McGraw-Hill Companies, Inc. Nullable Algorithm Set nullable = new Set( ); do { oldSize = nullable.size( ); for (Production p : grammar.productions( )) if (p.length( ) == 0) nullable.add(p.nonterminal( )); else > } while (nullable.size( ) > oldSize);

Copyright © 2006 The McGraw-Hill Companies, Inc. Check Righthand Side { boolean allNull = true; for (Symbol t : p.rule( )) if (! nullable.contains(t)) allNull = false; if (allNull) nullable.add(p.nonterminal( )); }

Copyright © 2006 The McGraw-Hill Companies, Inc. Rewrite Grammar Augmented form Abbreviate symbols Replace meta constructs with nonterminals

Copyright © 2006 The McGraw-Hill Companies, Inc. Transformed Grammar S → A $ A → i = E ; E → T E' E' → | AO T E' AO → + | - T → F T' T' → MO F T' MO → * | / F → F' P F' → | UO UO → - | ! P → i | l | ( E )

Copyright © 2006 The McGraw-Hill Companies, Inc. Compute Nullable Symbols PassNullable 1 E' T' F' 2 E' T' F'

Copyright © 2006 The McGraw-Hill Companies, Inc. Left Dependency Graph For each production: A → V1... Vn X w V1,... Vn nullable draw an arc from A to X. Note: you also get arcs from A to V1,..., A to Vn

Copyright © 2006 The McGraw-Hill Companies, Inc.

NonterminalFirstNonterminalFirst AiUO! - E! - i l (Pi l ( E’+ - AO+ - T! - i l ( T’* / MO* / F! - i l ( F’! -

Copyright © 2006 The McGraw-Hill Companies, Inc. Recursive Descent Parsing Method/function for each nonterminal Recognize longest sequence of tokens derivable from the nonterminal Need an algorithm for converting productions to code Based on EBNF

Copyright © 2006 The McGraw-Hill Companies, Inc. T(EBNF) = Code: A → w 1 If w is nonterminal, call it. 2 If w is terminal, match it against given token. 3 If w is { w' }: while (token in First(w')) T(w') 4 If w is: w1 |... | wn, switch (token) { case First(w1): T(w1); break;... case First(wn): T(wn); break;

Copyright © 2006 The McGraw-Hill Companies, Inc. 5 Switch (cont.): If some wi is empty, use: default: break; Otherwise default: error(token); 6 If w = [ w' ], rewrite as ( | w' ) and use rule 4. 7 If w = X1... Xn, T(w) = T(X1);... T(Xn);

Copyright © 2006 The McGraw-Hill Companies, Inc. Augmentation Production Gets first token Calls method corresponding to original start symbol Checks to see if final token is end token E.g.: end of file token

Copyright © 2006 The McGraw-Hill Companies, Inc. private void match (int t) { if (token.type() == t) token = lexer.next(); else error(t); }

Copyright © 2006 The McGraw-Hill Companies, Inc. private void error(int tok) { System.err.println( “Syntax error: expecting” + tok + “; saw: ” + token); System.exit(1); }

Copyright © 2006 The McGraw-Hill Companies, Inc. private void assignment( ) { // Assignment → Identifier = Expression ; match(Token.Identifier); match(Token.Assign); expression( ); match(Token.Semicolon); }

Copyright © 2006 The McGraw-Hill Companies, Inc. private void expression( ) { // Expression → Term { AddOp Term } term( ); while (isAddOp()) { token = lexer.next( ); term( ); }

Copyright © 2006 The McGraw-Hill Companies, Inc. Linking Syntax and Semantics Output: parse tree is inefficient One nonterminal per precedence level Shape of parse tree that is important

Copyright © 2006 The McGraw-Hill Companies, Inc. Discard from Parse Tree Separator/punctuation terminal symbols All trivial root nonterminals Replace remaining nonterminal with leaf terminal

Copyright © 2006 The McGraw-Hill Companies, Inc. Abstract Syntax Example Assignment = Variable target; Expression source Expression = Variable | Value | Binary | Unary Binary = Operator op; Expression term1, term2 Unary = Operator op; Expression term Variable = String id Value = Integer value Operator = + | - | * | / | !

Copyright © 2006 The McGraw-Hill Companies, Inc. abstract class Expression { } class Binary extends Expression { Operator op; Expression term1, term2; } class Unary extends Expression { Operator op; Expression term; }

Copyright © 2006 The McGraw-Hill Companies, Inc. Modify T(A → w) to Return AST 1.Make A the return type of function, as defined by abstract syntax. 2. If w is a nonterminal, assign returned value. 3. If w is a non-punctuation terminal, capture its value. 4. Add a return statement that constructs the appropriate object.

Copyright © 2006 The McGraw-Hill Companies, Inc. private Assignment assignment( ) { // Assignment → Identifier = Expression ; Variable target = match(Token.Identifier); match(Token.Assign); Expression source = expression( ); match(Token.Semicolon); return new Assignment(target, source); }

Copyright © 2006 The McGraw-Hill Companies, Inc. private String match (int t) { String value = Token.value( ); if (token.type( ) == t) token = lexer.next( ); else error(t); return value; }