CPS 506 Comparative Programming Languages Syntax Specification.


1 CPS 506 Comparative Programming Languages Syntax Specification

2 Compiling Process Steps
Program → Lexical Analysis
– Convert characters into a stream of tokens
Lexical Analysis → Syntactic Analysis
– Use the tokens to build an abstract representation, or parse tree

3 Compiling Process Steps (cont'd)
Syntactic Analysis → Semantic Analysis
– Check the parse tree for semantic consistency and transform it to run efficiently on the target architecture (optimization)
Semantic Analysis → Machine Code
– Convert the abstract representation to executable machine code using code generation

4 Formal Methods and Language Processing
Meta-language
– A language used to define other languages
BNF (Backus-Naur Form)
– A set of rewriting rules ρ
– A set of terminal symbols Σ
– A set of non-terminal symbols Ν
– A start symbol S ∈ Ν
– ρ : Α → ω
– Α ∈ Ν and ω ∈ (Ν ∪ Σ)*
– Left-hand side: a non-terminal symbol
– Right-hand side: a sequence of terminal and non-terminal symbols

5 BNF (cont'd)
The words in Ν: grammatical categories
– Identifier, Expression, Loop, Program, …
– S: the principal grammatical category
Symbols in Σ: the basic alphabet
– Example 1:
  binaryDigit → 0
  binaryDigit → 1
  or
  binaryDigit → 0 | 1
– Example 2:
  Integer → Digit | Integer Digit
  Digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
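The four parts of a BNF grammar (Σ, Ν, ρ, S) can be held in ordinary collections. The following is a minimal sketch in Java, not part of the course material: the class name BnfGrammar, its fields, and the isWellFormed check are assumptions made for illustration. It stores Example 2 and verifies that every right-hand-side symbol belongs to Ν ∪ Σ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical container for a BNF grammar; names are illustrative only.
final class BnfGrammar {
    final Set<String> terminals;                  // Sigma: the basic alphabet
    final Map<String, List<List<String>>> rules;  // rho: each key is a non-terminal in N
    final String start;                           // S: the principal grammatical category

    BnfGrammar(Set<String> terminals, Map<String, List<List<String>>> rules, String start) {
        this.terminals = terminals;
        this.rules = rules;
        this.start = start;
    }

    // True when S is in N and every right-hand-side symbol is in N or Sigma.
    boolean isWellFormed() {
        if (!rules.containsKey(start)) {
            return false;
        }
        for (List<List<String>> alternatives : rules.values()) {
            for (List<String> rightHandSide : alternatives) {
                for (String symbol : rightHandSide) {
                    if (!rules.containsKey(symbol) && !terminals.contains(symbol)) {
                        return false;
                    }
                }
            }
        }
        return true;
    }

    // Example 2: Integer -> Digit | Integer Digit, Digit -> 0 | 1 | ... | 9.
    static BnfGrammar integerGrammar() {
        Set<String> sigma = new LinkedHashSet<>();
        List<List<String>> digitAlternatives = new ArrayList<>();
        for (char c = '0'; c <= '9'; c++) {
            sigma.add(String.valueOf(c));
            digitAlternatives.add(Arrays.asList(String.valueOf(c)));
        }
        Map<String, List<List<String>>> rho = new LinkedHashMap<>();
        rho.put("Integer", Arrays.asList(
                Arrays.asList("Digit"),
                Arrays.asList("Integer", "Digit")));
        rho.put("Digit", digitAlternatives);
        return new BnfGrammar(sigma, rho, "Integer");
    }
}
```

Keying the rules by left-hand side makes the set Ν implicit: a symbol is a non-terminal exactly when it has at least one rule.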

6 BNF (cont'd)
Derivation
Integer ⇒ Integer Digit ⇒ Integer Digit Digit ⇒ Digit Digit Digit ⇒ 2 Digit Digit ⇒ 28 Digit ⇒ 281
Parse Tree
[Figure: parse tree for 281. The root Integer expands to Integer Digit; the leaves, read left to right, are 2, 8, 1.]
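The derivation of 281 follows a fixed pattern, so it can be generated mechanically. The sketch below (Java; the class and method names are assumptions) reproduces the same order of steps: grow the string with Integer → Integer Digit, finish with Integer → Digit, then replace each Digit from left to right.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that lists the derivation steps for a string of digits.
final class IntegerDerivation {
    // Returns text repeated count times (written out so it also runs on Java 8).
    private static String repeat(String text, int count) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < count; i++) {
            out.append(text);
        }
        return out.toString();
    }

    static List<String> derive(String digits) {
        if (!digits.matches("[0-9]+")) {
            throw new IllegalArgumentException("Not an Integer: " + digits);
        }
        int n = digits.length();
        List<String> steps = new ArrayList<>();
        // Apply Integer -> Integer Digit until n - 1 Digit symbols have been produced.
        for (int k = 0; k < n; k++) {
            steps.add("Integer" + repeat(" Digit", k));
        }
        // Apply Integer -> Digit to remove the last Integer.
        steps.add("Digit" + repeat(" Digit", n - 1));
        // Replace the leftmost remaining Digit with the next input character.
        for (int k = 1; k <= n; k++) {
            steps.add(digits.substring(0, k) + repeat(" Digit", n - k));
        }
        return steps;
    }
}
```

Calling IntegerDerivation.derive("281") returns the seven sentential forms of the derivation, from Integer to 281.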

7 BNF (cont'd)
Lexeme: the lowest-level syntactic unit
Tokens: the grammatical categories that classify strings of non-blank characters (lexical syntax)
– Identifier (variable names, function names, …)
– Literal (integer and decimal numbers, …)
– Operator (+, -, *, /, …)
– Separator (; . ( ) { } …)
– Keyword (int, if, for, where, …)

8 BNF (cont'd)
// comments …
void main ( ) {
  float p;
  p = 3.14 ;
}
Token categories marked on the code: Comment, Keyword, Identifier, Operator, Separator, Literal
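A lexer assigns each lexeme in such a fragment to one token category. The following sketch uses java.util.regex; the class name TinyLexer, the keyword set, and the exact character sets for operators and separators are assumptions chosen to cover this fragment only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical lexer covering only the token categories named on the slide.
final class TinyLexer {
    private static final Set<String> KEYWORDS =
            new HashSet<>(Arrays.asList("void", "float", "int", "if", "for"));

    // Alternatives are tried in order, so comments win over the '/' operator
    // and decimal literals win over the '.' separator.
    private static final Pattern TOKEN = Pattern.compile(
            "(?<comment>//[^\\n]*)"
            + "|(?<literal>[0-9]+(\\.[0-9]+)?)"
            + "|(?<word>[A-Za-z][A-Za-z0-9]*)"
            + "|(?<operator>[-+*/=])"
            + "|(?<separator>[;.(){},])"
            + "|(?<space>\\s+)");

    // Returns one "Category lexeme" string per token; whitespace is dropped.
    static List<String> tokenize(String source) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(source);
        int pos = 0;
        while (pos < source.length()) {
            m.region(pos, source.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("Unexpected character at index " + pos);
            }
            String lexeme = m.group();
            if (m.group("comment") != null) {
                tokens.add("Comment " + lexeme);
            } else if (m.group("literal") != null) {
                tokens.add("Literal " + lexeme);
            } else if (m.group("word") != null) {
                tokens.add((KEYWORDS.contains(lexeme) ? "Keyword " : "Identifier ") + lexeme);
            } else if (m.group("operator") != null) {
                tokens.add("Operator " + lexeme);
            } else if (m.group("separator") != null) {
                tokens.add("Separator " + lexeme);
            }
            pos = m.end();
        }
        return tokens;
    }
}
```

Keywords and identifiers share one pattern and are told apart by a table lookup, which is a common design because keywords are lexically just reserved identifiers.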

9 BNF (cont'd)

10 Regular Expressions
An alternative to BNF for defining the lexical rules of a language
– x : a character
– "abc" : a literal string
– A | B : A or B
– A B : concatenation of A and B
– A* : zero or more occurrences of A
– A+ : one or more occurrences of A
– A? : zero or one occurrence of A
– [a-zA-Z] : any alphabetic character
– [0-9] : any digit
– . : any single character
Example
Integer : [0-9]+
Identifier : [a-zA-Z][a-zA-Z0-9]*
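The two example patterns can be tried directly with Java's java.util.regex package. This is a small sketch; the class and method names are assumptions. Matcher.matches() requires the whole string to match, which is what a lexical rule for a single token needs.

```java
import java.util.regex.Pattern;

// Hypothetical wrapper around the two example patterns.
final class LexicalPatterns {
    static final Pattern INTEGER = Pattern.compile("[0-9]+");
    static final Pattern IDENTIFIER = Pattern.compile("[a-zA-Z][a-zA-Z0-9]*");

    // True when the whole string is one or more digits.
    static boolean isInteger(String text) {
        return INTEGER.matcher(text).matches();
    }

    // True when the whole string is a letter followed by letters or digits.
    static boolean isIdentifier(String text) {
        return IDENTIFIER.matcher(text).matches();
    }
}
```

For instance, isInteger accepts 281 but rejects the empty string, and isIdentifier accepts x1 but rejects 1x.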

11 Syntactic Analysis
Primary tool: BNF
Input: tokens from lexical analysis
Output: parse tree
Syntactic categories
– Program
– Declaration
– Assignment
– Expression
– Loop
– Function definition

12 Syntactic Analysis (cont'd)
Example
Arithmetic Expression → Term | Arithmetic Expression + Term | Arithmetic Expression - Term
Term → Factor | Term * Factor | Term / Factor
Factor → Identifier | Literal | ( Arithmetic Expression )

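13 Syntactic Analysis (cont'd)
Example: 2 * a - 3
[Figure: parse tree for 2 * a - 3. The root Arithmetic Expression expands to Arithmetic Expression - Term. The left Arithmetic Expression derives Term * Factor, which yields the Literal (Integer) 2 and the Identifier (Letter) a; the right Term derives a Factor, which yields the Literal 3.]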
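The parse tree for 2 * a - 3 can also be written down as a data structure. In the sketch below (Java; ParseNode and its methods are assumptions, and the Integer and Letter levels under Literal and Identifier are omitted for brevity), interior nodes carry grammatical categories and leaves carry the input symbols. Reading the leaves from left to right recovers the parsed string.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical parse-tree node: interior nodes are categories, leaves are input symbols.
final class ParseNode {
    final String label;
    final List<ParseNode> children;

    ParseNode(String label, ParseNode... children) {
        this.label = label;
        this.children = Arrays.asList(children);
    }

    // Concatenates the leaves left to right, separated by spaces.
    String leafString() {
        if (children.isEmpty()) {
            return label;
        }
        StringBuilder text = new StringBuilder();
        for (ParseNode child : children) {
            if (text.length() > 0) {
                text.append(' ');
            }
            text.append(child.leafString());
        }
        return text.toString();
    }

    // Counts the nodes in this subtree that carry the given label.
    int count(String category) {
        int total = label.equals(category) ? 1 : 0;
        for (ParseNode child : children) {
            total += child.count(category);
        }
        return total;
    }

    // The tree for 2 * a - 3 under the Arithmetic Expression grammar.
    static ParseNode example() {
        ParseNode two = new ParseNode("Term",
                new ParseNode("Factor", new ParseNode("Literal", new ParseNode("2"))));
        ParseNode a = new ParseNode("Factor",
                new ParseNode("Identifier", new ParseNode("a")));
        ParseNode product = new ParseNode("Term", two, new ParseNode("*"), a);
        ParseNode left = new ParseNode("ArithmeticExpression", product);
        ParseNode three = new ParseNode("Term",
                new ParseNode("Factor", new ParseNode("Literal", new ParseNode("3"))));
        return new ParseNode("ArithmeticExpression", left, new ParseNode("-"), three);
    }
}
```

Because the multiplication sits deeper in the tree than the subtraction, the tree itself records that 2 * a is computed first.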

14 Syntactic Analysis (cont'd)
BNF limitations
– Has an identifier been declared?
– Has an identifier been given an initial value?
In statically typed languages
– The type system addresses the first problem
– Violations are detected at compile time or at run time

15 Ambiguous Grammar
A grammar is ambiguous if some string can be parsed into two or more different trees
Example
Exp → Identifier | Literal | Exp - Exp
Input: A - B - C
Output:
1- A - (B - C)
2- (A - B) - C
Another example is the "dangling else"
Ambiguity can be resolved
– Using BNF rules
– Using extra-grammatical rules
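The ambiguity can be made concrete by enumerating every parse tree that the rule Exp → Exp - Exp allows for a list of operands. The following Java sketch (class and method names are assumptions) prints each tree as a fully parenthesized string; for A - B - C it finds exactly the two trees listed above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical enumerator of parse trees for Exp -> Identifier | Exp - Exp.
final class AmbiguityDemo {
    // Returns one fully parenthesized string per distinct parse tree.
    static List<String> parses(List<String> operands) {
        return parses(operands, 0, operands.size() - 1);
    }

    private static List<String> parses(List<String> operands, int lo, int hi) {
        List<String> result = new ArrayList<>();
        if (lo == hi) {
            // Exp -> Identifier: a single operand has exactly one tree.
            result.add(operands.get(lo));
            return result;
        }
        // Exp -> Exp - Exp: any "-" between lo and hi may be the root operator.
        for (int split = lo; split < hi; split++) {
            for (String left : parses(operands, lo, split)) {
                for (String right : parses(operands, split + 1, hi)) {
                    result.add("(" + left + " - " + right + ")");
                }
            }
        }
        return result;
    }
}
```

With three operands there are two trees; with four there are five. The count grows quickly, which is why ambiguous rules are rewritten or supplemented with extra-grammatical rules.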

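16 Operator Precedence
<expr> → <id> + <expr> | <id> * <expr> | ( <expr> ) | <id>
A = B + C * A → A = B + (C * A)
A = B * C + A → A = B * (C + A)
Solution
<expr> → <expr> + <term> | <term>
<term> → <term> * <factor> | <factor>
<factor> → ( <expr> ) | <id>
A = B + C * A → A = B + (C * A)
A = B * C + A → A = (B * C) + A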
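The Solution grammar gives each precedence level its own rule, with * one level below +. A minimal Java sketch of that layering follows; it handles only the right-hand side of the assignment, treats every letter as an identifier, and uses hypothetical class and method names. It prints the grouping that the grammar imposes.

```java
// Hypothetical parser with one method per precedence level.
final class PrecedenceParser {
    private final String input;
    private int pos = 0;

    private PrecedenceParser(String text) {
        this.input = text.replace(" ", "");
    }

    // Returns the expression with the grouping made explicit by parentheses.
    static String parenthesize(String expression) {
        PrecedenceParser parser = new PrecedenceParser(expression);
        String result = parser.expr();
        if (parser.pos != parser.input.length()) {
            throw new IllegalArgumentException("Unexpected input at index " + parser.pos);
        }
        return result;
    }

    // Lowest precedence: sums of terms.
    private String expr() {
        String left = term();
        while (peek() == '+') {
            pos++;
            left = "(" + left + " + " + term() + ")";
        }
        return left;
    }

    // Higher precedence: products of factors, fully built before any '+' is applied.
    private String term() {
        String left = factor();
        while (peek() == '*') {
            pos++;
            left = "(" + left + " * " + factor() + ")";
        }
        return left;
    }

    // Highest precedence: a parenthesized expression or a single-letter identifier.
    private String factor() {
        char c = peek();
        if (c == '(') {
            pos++;
            String inner = expr();
            if (peek() != ')') {
                throw new IllegalArgumentException("')' expected at index " + pos);
            }
            pos++;
            return inner;
        }
        if (Character.isLetter(c)) {
            pos++;
            return String.valueOf(c);
        }
        throw new IllegalArgumentException("Identifier expected at index " + pos);
    }

    private char peek() {
        return pos < input.length() ? input.charAt(pos) : '\0';
    }
}
```

PrecedenceParser.parenthesize("B * C + A") groups the product first, and parenthesize("B + C * A") does the same, regardless of where the * appears in the input.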

17 Associativity of Operators
A + B + C, A * B * C, A / B / C, …
Left Associativity
– Left recursive: in a grammar rule, the LHS also appears at the beginning of its RHS
  <expr> → <expr> + <term> | <term>
  A + B + C → (A + B) + C
Right Associativity
– Right recursive: in a grammar rule, the LHS also appears at the end of its RHS
  <factor> → <exp> ** <factor> | <exp>
  <exp> → ( <expr> ) | <id>
  A + B ** C → A + (B ** C)
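In a hand-written parser the two kinds of recursion show up as two coding patterns: a loop for a left-associative operator and a recursive call on the right operand for a right-associative one. The Java sketch below illustrates this for + and **; it accepts single-letter operands only, omits parentheses, and its names are assumptions.

```java
// Hypothetical parser contrasting left-associative + with right-associative **.
final class AssociativityParser {
    private final String input;
    private int pos = 0;

    private AssociativityParser(String text) {
        this.input = text.replace(" ", "");
    }

    // Returns the expression with the grouping made explicit by parentheses.
    static String parenthesize(String expression) {
        AssociativityParser parser = new AssociativityParser(expression);
        String result = parser.sum();
        if (parser.pos != parser.input.length()) {
            throw new IllegalArgumentException("Unexpected input at index " + parser.pos);
        }
        return result;
    }

    // Left associative: each new operand is folded into the tree built so far.
    private String sum() {
        String left = power();
        while (pos < input.length() && input.charAt(pos) == '+') {
            pos++;
            left = "(" + left + " + " + power() + ")";
        }
        return left;
    }

    // Right associative: the right operand is parsed by a recursive call.
    private String power() {
        String base = operand();
        if (input.startsWith("**", pos)) {
            pos += 2;
            return "(" + base + " ** " + power() + ")";
        }
        return base;
    }

    private String operand() {
        if (pos < input.length() && Character.isLetter(input.charAt(pos))) {
            return String.valueOf(input.charAt(pos++));
        }
        throw new IllegalArgumentException("Operand expected at index " + pos);
    }
}
```

The loop in sum() nests earlier operands deeper, while the recursion in power() nests later operands deeper.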

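18 Extended BNF (EBNF)
Optional part of an RHS: [ ]
  <if_stmt> → if ( <expr> ) <stmt> [ else <stmt> ]
Repetition (in place of recursion) in an RHS: { }
  <ident_list> → <identifier> {, <identifier>}
Multiple-choice option in an RHS: ( | )
  <term> → <term> ( * | / | % ) <factor>
Optional use of * and +
  <ident_list> → <identifier> {, <identifier>}*
  <integer> → {0 | … | 9}+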
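EBNF braces translate naturally into a loop in a hand-written parser. The following is a minimal sketch in Java, assuming a comma-separated identifier list of the form identifier {, identifier}; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser for: identifier {, identifier}
final class IdentListParser {
    private final String input;
    private int pos = 0;

    private IdentListParser(String input) {
        this.input = input;
    }

    static List<String> parse(String text) {
        IdentListParser parser = new IdentListParser(text);
        List<String> names = new ArrayList<>();
        names.add(parser.identifier());      // the mandatory first identifier
        while (parser.nextIs(',')) {         // {, identifier}: zero or more repetitions
            parser.pos++;
            names.add(parser.identifier());
        }
        if (parser.pos != parser.input.length()) {
            throw new IllegalArgumentException("Unexpected input at index " + parser.pos);
        }
        return names;
    }

    // Skips blanks, then reports whether the next character is c.
    private boolean nextIs(char c) {
        skipSpaces();
        return pos < input.length() && input.charAt(pos) == c;
    }

    // Reads a letter followed by letters or digits.
    private String identifier() {
        skipSpaces();
        int start = pos;
        if (pos < input.length() && Character.isLetter(input.charAt(pos))) {
            while (pos < input.length() && Character.isLetterOrDigit(input.charAt(pos))) {
                pos++;
            }
        }
        if (pos == start) {
            throw new IllegalArgumentException("Identifier expected at index " + start);
        }
        return input.substring(start, pos);
    }

    private void skipSpaces() {
        while (pos < input.length() && input.charAt(pos) == ' ') {
            pos++;
        }
    }
}
```

An optional part in square brackets would become a single if statement in the same style, instead of a while loop.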

19 Extended BNF (EBNF) (cont'd)
The "opt" subscript
  Conditional Statement → if ( Expr ) Statement { else Statement }opt
Syntax Diagram
[Figure: syntax diagram for Term, built from Factor with * and / branches.]

20 Case Study
– A BNF or EBNF for one construct, such as Expression, the different Literals, or the if Statement, in Java, C, C++, or Pascal
– A BNF or EBNF for floating-point numbers in Java, C, or C++
– A BNF or EBNF for the loop statements of one language

21 Abstract Syntax
Consider the following code:
Pascal
  while i < 10 do begin i := i + 1; end;
C or Java
  while (i < 10) { i = i + 1; }
Although the syntax differs, the two loops are essentially equivalent
Abstract syntax shows only the essential elements of a language construct

22 Abstract Syntax (cont'd)
General form
  Abstract Syntax Class = list of essential components
Example
  Loop = Expression test; Statement body
A Java class for the abstract syntax of Loop
  class Loop extends Statement { Expression test; Statement body; }
Callouts on the slide: Member, Element
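One way to see that abstract syntax keeps only the essential elements is to store a loop once and print it in two concrete syntaxes. The sketch below reuses the field names test, body, target, and source; the outer class LoopSyntaxSketch, the toPascal and toC methods, and the choice to keep expressions as plain text are assumptions made for illustration.

```java
// Hypothetical sketch: one abstract Loop, two concrete renderings.
final class LoopSyntaxSketch {
    abstract static class Statement {
        abstract String toPascal();
        abstract String toC();
    }

    static final class Expression {
        final String text;  // kept as plain text to keep the sketch short
        Expression(String text) { this.text = text; }
    }

    static final class Variable {
        final String name;
        Variable(String name) { this.name = name; }
    }

    static final class Assignment extends Statement {
        final Variable target;
        final Expression source;

        Assignment(Variable target, Expression source) {
            this.target = target;
            this.source = source;
        }

        @Override String toPascal() { return target.name + " := " + source.text + ";"; }
        @Override String toC() { return target.name + " = " + source.text + ";"; }
    }

    static final class Loop extends Statement {
        final Expression test;
        final Statement body;

        Loop(Expression test, Statement body) {
            this.test = test;
            this.body = body;
        }

        @Override String toPascal() {
            return "while " + test.text + " do begin " + body.toPascal() + " end;";
        }

        @Override String toC() {
            return "while (" + test.text + ") { " + body.toC() + " }";
        }
    }
}
```

Keywords such as do, begin, and end, and punctuation such as braces, appear only in the rendering methods; the Loop object itself holds nothing but the test and the body.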

23 Abstract Syntax (cont'd)
More examples
  Assignment = Variable target; Expression source
A Java class for the abstract syntax of Assignment
  class Assignment extends Statement { Variable target; Expression source; }
Callouts on the slide: Member, Element

24 Abstract Syntax Tree
A tree that shows the abstract syntax of a construct
Example: x = 2; or x := 2;
  Assignment = Variable target; Expression source
[Figure: abstract syntax tree. Statement → Assignment, whose children are Variable x and Expression → Value 2.]
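Both spellings of the assignment should produce the same abstract syntax tree. The following Java sketch shows this for the simple case of a variable assigned an integer value; the outer class AssignmentAst, the regular expression, and the describe methods are assumptions, while Assignment, Variable, Expression, Value, target, and source follow the names used above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: "x = 2;" and "x := 2;" map to one abstract syntax tree.
final class AssignmentAst {
    abstract static class Statement { }

    abstract static class Expression {
        abstract String describe();
    }

    static final class Variable {
        final String name;
        Variable(String name) { this.name = name; }
    }

    static final class Value extends Expression {
        final int value;
        Value(int value) { this.value = value; }
        @Override String describe() { return "Value(" + value + ")"; }
    }

    static final class Assignment extends Statement {
        final Variable target;
        final Expression source;

        Assignment(Variable target, Expression source) {
            this.target = target;
            this.source = source;
        }

        String describe() {
            return "Assignment(target=Variable(" + target.name + "), source="
                    + source.describe() + ")";
        }
    }

    // Accepts an identifier, then "=" or ":=", then an integer and a semicolon.
    private static final Pattern ASSIGNMENT = Pattern.compile(
            "\\s*([A-Za-z][A-Za-z0-9]*)\\s*(?::=|=)\\s*([0-9]+)\\s*;\\s*");

    static Assignment parse(String text) {
        Matcher m = ASSIGNMENT.matcher(text);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a simple assignment: " + text);
        }
        return new Assignment(new Variable(m.group(1)), new Value(Integer.parseInt(m.group(2))));
    }
}
```

The assignment operator and the semicolon are consumed by the parser but never stored, so the two inputs are indistinguishable once the tree has been built.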

25 Recursive Descent Parser
– A top-down parser that verifies the syntax of a stream of text, reading from left to right
– It contains several recursive methods, each of which implements one rule of the grammar
– More details and parsing algorithms are covered in the Compiler course
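As an illustration, here is a minimal recursive descent recognizer in Java for the Arithmetic Expression grammar, with one method per rule. Two points are assumptions of this sketch rather than part of the slides: the left-recursive rules are rewritten as loops (Term followed by zero or more "+ Term" or "- Term"), because a method that called itself first would never terminate, and the recognizer works directly on characters instead of a token stream. All names are hypothetical.

```java
// Hypothetical recognizer: returns true when the text is a valid Arithmetic Expression.
final class ExpressionRecognizer {
    private final String input;
    private int pos = 0;

    private ExpressionRecognizer(String input) {
        this.input = input;
    }

    static boolean accepts(String text) {
        ExpressionRecognizer r = new ExpressionRecognizer(text);
        try {
            r.arithmeticExpression();
            r.skipSpaces();
            return r.pos == r.input.length();  // the whole input must be consumed
        } catch (IllegalStateException syntaxError) {
            return false;
        }
    }

    // Arithmetic Expression: Term, then any number of "+ Term" or "- Term".
    private void arithmeticExpression() {
        term();
        while (nextIs('+') || nextIs('-')) {
            pos++;
            term();
        }
    }

    // Term: Factor, then any number of "* Factor" or "/ Factor".
    private void term() {
        factor();
        while (nextIs('*') || nextIs('/')) {
            pos++;
            factor();
        }
    }

    // Factor: Identifier, Literal, or a parenthesized Arithmetic Expression.
    private void factor() {
        if (nextIs('(')) {
            pos++;
            arithmeticExpression();
            if (!nextIs(')')) {
                throw new IllegalStateException("')' expected at index " + pos);
            }
            pos++;
            return;
        }
        int start = pos;  // nextIs has already skipped leading blanks
        if (pos < input.length() && Character.isDigit(input.charAt(pos))) {
            while (pos < input.length() && Character.isDigit(input.charAt(pos))) {
                pos++;    // Literal: a run of digits
            }
        } else if (pos < input.length() && Character.isLetter(input.charAt(pos))) {
            while (pos < input.length() && Character.isLetterOrDigit(input.charAt(pos))) {
                pos++;    // Identifier: a letter followed by letters or digits
            }
        }
        if (pos == start) {
            throw new IllegalStateException("Factor expected at index " + pos);
        }
    }

    private boolean nextIs(char c) {
        skipSpaces();
        return pos < input.length() && input.charAt(pos) == c;
    }

    private void skipSpaces() {
        while (pos < input.length() && input.charAt(pos) == ' ') {
            pos++;
        }
    }
}
```

ExpressionRecognizer.accepts("2 * a - 3") returns true, while inputs such as "2 * - 3" or "(a + b" are rejected.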

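26 Exercises
1. Modify the following grammar to add a unary minus operator that has higher precedence than either + or *.
<assign> → <id> = <expr>
<id> → A | B | C
<expr> → <expr> + <term> | <term>
<term> → <term> * <factor> | <factor>
<factor> → ( <expr> ) | <id>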

27 Exercises
2. Consider the following grammar:
<S> → <A> a <B> b
<A> → <A> b | b
<B> → a <B> | a
Which of the following sentences are in the language generated by this grammar?
1. baab
2. bbbab
3. bbaaaaa
4. bbaab

28 Exercises
3. Convert the following EBNF to BNF:
S → A { bA }
A → a [b]A
4. Using the grammar in question 1, add the ++ and -- unary operators of Java.
5. Using the grammar in question 1, show a parse tree and a leftmost derivation for each of the following statements:
a) A = (A + B) * C
b) A = B * (C * (A + B))

29 Exercises
6. Rewrite the BNF in question 1 to give + precedence over *, and force + to be right associative.
7. Using BNF, write a grammar for the language consisting of strings a^n b^n, where n > 0, such as ab, aabb, …. Can you write this using regular expressions?

