Programming Language Concepts (CIS 635) Elsa L Gunter 4303 GITC NJIT, www.cs.njit.edu/~elsa/635

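Copyright 2002 Elsa L. Gunter
Sample Grammar
<expr> ::= <term> | <term> + <expr> | <term> - <expr>
<term> ::= <factor> | <factor> * <term> | <factor> / <term>
<factor> ::= <id> | ( <expr> )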

Tokens as SML Datatypes
<id> + - * / ( )
Becomes an SML datatype
    datatype token = Id_token of string
                   | Left_parenthesis | Right_parenthesis
                   | Times_token | Divide_token
                   | Plus_token | Minus_token

Parsing Token Streams
We will create three mutually recursive parsing functions. Each takes the current token paired with a function that fetches the next token, and returns a success flag together with the remaining token stream:
    expr   : (token option * (unit -> token option)) -> (bool * (token option * (unit -> token option)))
    term   : (token option * (unit -> token option)) -> (bool * (token option * (unit -> token option)))
    factor : (token option * (unit -> token option)) -> (bool * (token option * (unit -> token option)))

Parsing an Expression
<expr> ::= <term> [( + | - ) <expr>]
    fun expr tokens =
      (case term tokens of
         (true, tokens_after_term) =>
           (case tokens_after_term of
              (SOME Plus_token, tokens_after_plus) =>

Parsing a Plus Expression
<expr> ::= <term> + <expr>
    fun expr tokens =
      (case term tokens of
         (true, tokens_after_term) =>
           (case tokens_after_term of
              (SOME Plus_token, tokens_after_plus) =>

Parsing a Plus Expression
<expr> ::= <term> + <expr>
                (case expr (tokens_after_plus(), tokens_after_plus) of
                   (true, tokens_after_expr) => (true, tokens_after_expr)

What If No Expression After Plus
<expr> ::= <term> + <expr>
                 | (false, rem_tokens) => (false, rem_tokens))
Code for Minus_token is almost identical

What If No Plus or Minus
<expr> ::= <term>
            | _ => (true, tokens_after_term))

What If No Term
<expr> ::= <term> [( + | - ) <expr>]
       | (false, rem_tokens) => (false, rem_tokens))
Code for term is the same as for expr, except that multiplication replaces addition and division replaces subtraction

Parsing Factor as Id
<factor> ::= <id>
    and factor (SOME (Id_token id_name), tokens) = (true, (tokens(), tokens))

Parsing Factor as Parenthesized Expression
<factor> ::= ( <expr> )
      | factor (SOME Left_parenthesis, tokens) =
          (case expr (tokens(), tokens) of
             (true, tokens_after_expr) =>

Parsing Factor as Parenthesized Expression
<factor> ::= ( <expr> )
               (case tokens_after_expr of
                  (SOME Right_parenthesis, tokens_after_rparen) =>
                    (true, (tokens_after_rparen(), tokens_after_rparen))

What If No Right Parenthesis
<factor> ::= ( <expr> )
                | _ => (false, tokens_after_expr))

What If No Expression After Left Parenthesis
<factor> ::= ( <expr> )
           | (false, rem_tokens) => (false, rem_tokens))

What If No Id or Left Parenthesis
<factor> ::= <id> | ( <expr> )
      | factor tokens = (false, tokens)

Parsing - in C
Assume global variable nextToken that holds the latest token removed from the token stream
Assume subroutine lex( ) to analyze the character stream, find the next token at the head of that stream and update nextToken with that token
Assume subroutine error( ) to raise an exception

Parsing expr – in C
<expr> ::= <term> [( + | - ) <expr>]
    void expr ( ) {
      term ( );
      if (nextToken == PLUS_CODE) {
        lex ( );
        expr ( );
      } else if (nextToken == MINUS_CODE) {
        lex ( );
        expr ( );
      }
    }

SML Code
    fun expr tokens =
      (case term tokens of
         (true, tokens_after_term) =>
           (case tokens_after_term of
              (SOME Plus_token, tokens_after_plus) =>
                (case expr (tokens_after_plus(), tokens_after_plus) of
                   (true, tokens_after_expr) => (true, tokens_after_expr)

Parsing expr – in C (optimized)
<expr> ::= <term> [( + | - ) <expr>]
    void expr ( ) {
      term ( );
      while (nextToken == PLUS_CODE || nextToken == MINUS_CODE) {
        lex ( );
        term ( );
      }
    }

Parsing factor – in C
<factor> ::= <id>
    void factor ( ) {
      if (nextToken == ID_CODE)
        lex ( );

Parsing Factor as Id
<factor> ::= <id>
    and factor (SOME (Id_token id_name), tokens) = (true, (tokens(), tokens))

Parsing factor – in C
<factor> ::= ( <expr> )
      else if (nextToken == LEFT_PAREN_CODE) {
        lex ( );
        expr ( );
        if (nextToken == RIGHT_PAREN_CODE)
          lex ( );

Comparable SML Code
      | factor (SOME Left_parenthesis, tokens) =
          (case expr (tokens(), tokens) of
             (true, tokens_after_expr) =>
               (case tokens_after_expr of
                  (SOME Right_parenthesis, tokens_after_rparen) =>
                    (true, (tokens_after_rparen(), tokens_after_rparen))

Parsing factor – in C
        else
          error ( ); /* Right parenthesis missing */
      }
      else
        error ( ); /* Neither <id> nor ( was found at start */
    }

Error cases in SML
                (* No right parenthesis *)
                | _ => (false, tokens_after_expr))
           (* No expression found *)
           | (false, rem_tokens) => (false, rem_tokens))
      (* Neither <id> nor left parenthesis found *)
      | factor tokens = (false, tokens)
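
The C fragments above can be assembled into one small recognizer. What follows is a minimal sketch under stated assumptions, not the course's own code: the toy lex( ), the END_CODE and BAD_CODE token codes, the failed flag that stands in for raising an exception, and the parses( ) driver are all hypothetical names introduced for illustration. term( ) is written in the same loop style as the optimized expr( ).

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Token codes named as on the slides; END_CODE and BAD_CODE are assumptions. */
enum { ID_CODE, PLUS_CODE, MINUS_CODE, TIMES_CODE, DIVIDE_CODE,
       LEFT_PAREN_CODE, RIGHT_PAREN_CODE, END_CODE, BAD_CODE };

static const char *input;  /* characters not yet tokenized */
static int nextToken;      /* latest token removed from the stream */
static bool failed;        /* set by error(); stands in for an exception */

static void error(void) { failed = true; }

/* Toy lexer: skips blanks, reads one token, stores its code in nextToken. */
static void lex(void) {
    while (*input == ' ') input++;
    if (*input == '\0') { nextToken = END_CODE; return; }
    if (isalpha((unsigned char)*input)) {
        while (isalnum((unsigned char)*input)) input++;
        nextToken = ID_CODE;
        return;
    }
    switch (*input++) {
        case '+': nextToken = PLUS_CODE; break;
        case '-': nextToken = MINUS_CODE; break;
        case '*': nextToken = TIMES_CODE; break;
        case '/': nextToken = DIVIDE_CODE; break;
        case '(': nextToken = LEFT_PAREN_CODE; break;
        case ')': nextToken = RIGHT_PAREN_CODE; break;
        default:  nextToken = BAD_CODE; break;
    }
}

static void expr(void);

/* <factor> ::= <id> | ( <expr> ) */
static void factor(void) {
    if (nextToken == ID_CODE) {
        lex();
    } else if (nextToken == LEFT_PAREN_CODE) {
        lex();
        expr();
        if (nextToken == RIGHT_PAREN_CODE) lex();
        else error();  /* right parenthesis missing */
    } else {
        error();       /* neither <id> nor ( found at start */
    }
}

/* <term> ::= <factor> [( * | / ) <term>], written as a loop */
static void term(void) {
    factor();
    while (nextToken == TIMES_CODE || nextToken == DIVIDE_CODE) {
        lex();
        factor();
    }
}

/* <expr> ::= <term> [( + | - ) <expr>], written as a loop */
static void expr(void) {
    term();
    while (nextToken == PLUS_CODE || nextToken == MINUS_CODE) {
        lex();
        term();
    }
}

/* Returns true when the whole string is a well-formed <expr>. */
bool parses(const char *text) {
    assert(text != NULL);
    input = text;
    failed = false;
    lex();
    expr();
    return !failed && nextToken == END_CODE;
}
```

Under these assumptions, parses("a + b * (c - d)") succeeds, while parses("(a + b") fails in factor( ) because the right parenthesis is missing.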

Lexers – Simple Parsers
Lexers are parsers driven by regular grammars
Use character codes and arithmetic comparisons rather than case analysis to determine syntactic category for each character
Often some semantic action must be taken
  – Compute a number or build a string and record it in a symbol table

Example
<pos> = <digit> <pos> | <digit>
<digit> = 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
    fun digit c =
      (case Char.ord c of n =>
         if n >= Char.ord #"0" andalso n <= Char.ord #"9"
         then SOME (n - Char.ord #"0")
         else NONE)

Example
    fun pos [] = (NONE, [])
      | pos (chars as ch::rem_chars) =
          (case digit ch of
             NONE => (NONE, chars)
           | SOME n =>
               (case pos rem_chars of
                  (NONE, more_chars) => (SOME (10, n), more_chars)
                | (SOME (p, m), more_chars) => (SOME (10*p, (p*n)+m), more_chars)))
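
For comparison with the C parser, the same lexing idea can be sketched in C. This is only an illustration: digit_value and scan_pos are hypothetical names, the number is accumulated left to right instead of through the (power, value) pair that pos returns, and overflow on very long digit strings is not handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns the value 0-9 of a decimal digit, or -1 for any other character.
   As in the SML digit function, the category is decided by comparing
   character codes, not by one case per digit. */
int digit_value(char c) {
    return (c >= '0' && c <= '9') ? c - '0' : -1;
}

/* Scans the longest run of digits at the start of s.
   Stores the number in *value and the count of characters used in *length.
   Returns false when s does not begin with a digit. */
bool scan_pos(const char *s, unsigned long *value, size_t *length) {
    assert(s != NULL && value != NULL && length != NULL);
    size_t i = 0;
    unsigned long n = 0;
    while (digit_value(s[i]) >= 0) {
        /* semantic action: compute the number while recognizing it */
        n = n * 10 + (unsigned long)digit_value(s[i]);
        i++;
    }
    *value = n;
    *length = i;
    return i > 0;
}
```

Called on "123+x", scan_pos stores 123 in *value and 3 in *length, leaving "+x" for the next token.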

Problems for Recursive-Descent Parsing
Left Recursion: A ::= Aw translates to a subroutine that loops forever
Indirect Left Recursion: A ::= Bw and B ::= Av cause the same problem
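
One common way around direct left recursion is to rewrite the rule so that the subroutine consumes input before it repeats. The sketch below assumes a base case that the slide does not state: A ::= A w | b, with b and w standing for single characters. That grammar generates one b followed by any number of w, so it can be recognized as A ::= b { w } with a loop. parse_A is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Recognizes a prefix of s generated by A ::= A w | b, using the
   equivalent non-left-recursive form A ::= b { w }.
   Returns the number of characters consumed, or -1 if s does not start with b.
   A direct translation of A ::= A w would call itself before consuming
   anything and never terminate. */
int parse_A(const char *s) {
    assert(s != NULL);
    int pos = 0;
    if (s[pos] != 'b') return -1;   /* base case must come first */
    pos++;
    while (s[pos] == 'w') pos++;    /* each repetition consumes one w */
    return pos;
}
```

The optimized C version of expr( ) uses the same shape: one call to term( ) followed by a loop.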

Problems for Recursive-Descent Parsing
Parser must always be able to choose the next action based only on the very next token
Pairwise Disjointedness Test: Can we always determine which rule (in the non-extended BNF) to choose based on just the first token?

Pairwise Disjointedness Test
For each rule A ::= y, calculate
    FIRST(y) = {a | y =>* aw} ∪ {ε | if y =>* ε}
For each pair of rules A ::= y and A ::= z, require
    FIRST(y) ∩ FIRST(z) = { }
Test too strong: Can't handle <expr> ::= <term> [ ( + | - ) <expr> ]
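
The test can be tried on the non-extended form of the sample grammar. The following is a rough sketch under several assumptions: rules are encoded as strings "L=rhs", E, T and F stand for <expr>, <term> and <factor>, the character i stands for <id>, and no rule derives the empty string, so FIRST of a right-hand side depends only on its first symbol. The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Assumed encoding: uppercase letters are nonterminals, every other
   character is a terminal. RULES[r][0] is the left side, RULES[r][2]
   is the first symbol of the right side. */
static const char *RULES[] = {
    "E=T", "E=T+E", "E=T-E",
    "T=F", "T=F*T", "T=F/T",
    "F=i", "F=(E)"
};
enum { NUM_RULES = sizeof RULES / sizeof RULES[0] };

/* first_nt[X][c] is true when terminal c can begin a string derived from X. */
static bool first_nt[128][128];

static bool is_nonterminal(char c) { return c >= 'A' && c <= 'Z'; }

/* True when terminal c is in FIRST of the right-hand side of rule r. */
static bool in_first_of_rule(int r, int c) {
    unsigned char head = (unsigned char)RULES[r][2];
    return is_nonterminal((char)head) ? first_nt[head][c] : (c == head);
}

/* Fills first_nt by iterating until no set grows any further. */
void compute_first(void) {
    memset(first_nt, 0, sizeof first_nt);
    bool changed = true;
    while (changed) {
        changed = false;
        for (int r = 0; r < NUM_RULES; r++) {
            unsigned char lhs = (unsigned char)RULES[r][0];
            for (int c = 0; c < 128; c++) {
                if (in_first_of_rule(r, c) && !first_nt[lhs][c]) {
                    first_nt[lhs][c] = true;
                    changed = true;
                }
            }
        }
    }
}

/* Pairwise disjointness for one nonterminal: no terminal may begin the
   right-hand sides of two different rules. Call compute_first() first. */
bool pairwise_disjoint(char nonterminal) {
    assert(is_nonterminal(nonterminal));
    for (int r1 = 0; r1 < NUM_RULES; r1++) {
        for (int r2 = r1 + 1; r2 < NUM_RULES; r2++) {
            if (RULES[r1][0] != nonterminal || RULES[r2][0] != nonterminal)
                continue;
            for (int c = 0; c < 128; c++)
                if (in_first_of_rule(r1, c) && in_first_of_rule(r2, c))
                    return false;
        }
    }
    return true;
}
```

After compute_first( ), pairwise_disjoint('F') holds because the two <factor> rules start with different tokens, while pairwise_disjoint('E') fails because every <expr> rule starts with <term>. This is the sense in which the test is too strong for the sample grammar.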

Example
Grammar:
    ::= a b
    ::= b | b
    ::= a | a
FIRST ( b) = {b}
Rules for not pairwise disjoint