
Chapter 3 – Describing Syntax


1 Chapter 3 – Describing Syntax
CSCE 343

2 Syntax vs. Semantics
Syntax: the (correct) form of the expressions, statements, and program units (blocks, functions, methods).
    x = x + / 2 % 3;   // not the correct form
Semantics: the meaning of the expressions, statements, and program units.
    x = 2 + x % 3;     // assignment, precedence, x < 0
Syntax + semantics = language definition
Used by: other language designers (evaluators), implementers, programmers
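
The two statements above can be tried out in a small sketch. Python stands in here for the slide's C-like language (an assumption, and the trailing semicolons are dropped): `ast.parse` plays the role of the syntax check, and running the well-formed statement shows its semantics. The helper name `is_well_formed` is hypothetical.

```python
import ast


def is_well_formed(source):
    """Return True if the source has correct syntax (form only, not meaning)."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


bad = "x = x + / 2 % 3"   # not the correct form
good = "x = 2 + x % 3"    # correct form

# Semantics: % binds tighter than +, and the result for x < 0 depends on
# how the language defines % for negative operands.
env = {"x": -5}
exec(good, {}, env)
```

In Python, `-5 % 3` is `1`, so `x` ends up as `3`. C truncates toward zero, so the same statement would give `0` there: same syntax, different semantics.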

3 Terminology
Sentence: a sequence (string) of characters over some alphabet
Language: a set of legal sentences (for a programming language, all of its programs)
Lexeme (loosely, a token): the lowest-level syntactic unit (e.g. +, (, {, while, 1.2)
Token (token class): a category of lexemes (e.g. identifier, open brace, int literal)
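
The lexeme / token-class distinction can be sketched with a small regex-based scanner. The token class names and patterns below are illustrative assumptions, not a definition from the slides.

```python
import re

# Hypothetical token classes; each regular expression defines one class.
TOKEN_SPEC = [
    ("float_literal", r"\d+\.\d+"),
    ("int_literal", r"\d+"),
    ("keyword", r"\b(?:while|if|else)\b"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("operator", r"[+\-*/%=]"),
    ("open_paren", r"\("),
    ("close_paren", r"\)"),
    ("open_brace", r"\{"),
    ("close_brace", r"\}"),
    ("semicolon", r";"),
    ("skip", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return (token_class, lexeme) pairs for the input string."""
    pairs = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "skip":  # whitespace separates lexemes but is not one
            pairs.append((match.lastgroup, match.group()))
        pos = match.end()
    return pairs
```

For example, `tokenize("y = 1.2 + 3;")` pairs the lexeme `1.2` with the class `float_literal` and the lexeme `y` with the class `identifier`.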

4 Chomsky’s 1950s Classes of Grammars
Type 0: unrestricted grammars
Type 1: context-sensitive grammars
Type 2: context-free grammars
Type 3: regular grammars
Type 2 is used to describe the syntax of a language.
Type 3 is used to define token classes.

5 Formal Languages
Definition of a context-free grammar:
    Σ   alphabet        terminals
    Γ   abstractions    nonterminals
    P   rules           productions
    S   start symbol    a special nonterminal
Examples

6 Rules / Productions
A rule has a left-hand side (LHS) and a right-hand side (RHS).
The LHS is a single non-terminal.
The RHS consists of terminals and non-terminals.
A non-terminal can be on the LHS of several rules.
Recursion is used for lists:
    <ident_list> → identifier | identifier , <ident_list>
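
How the recursive rule produces lists of any length can be sketched as follows. The dictionary layout and the helper name `expand_list` are assumptions made for illustration.

```python
# Productions: each non-terminal maps to its alternative right-hand sides.
RULES = {
    "<ident_list>": [
        ["identifier"],                       # base alternative
        ["identifier", ",", "<ident_list>"],  # recursive alternative
    ],
}


def expand_list(n):
    """Apply the recursive alternative n-1 times, then the base alternative."""
    if n < 1:
        raise ValueError("an <ident_list> has at least one identifier")
    form = ["<ident_list>"]
    for step in range(n):
        rhs = RULES["<ident_list>"][0 if step == n - 1 else 1]
        form = form[:-1] + rhs  # the non-terminal is always the last symbol
    return form
```

Calling `expand_list(3)` yields the symbols of `identifier , identifier , identifier`.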

7 Recognizers / Generators
[Diagram: recognizers and generators for a formal language]
Compilers have a recognizer component: the syntax analyzer (parser).
Generators are better at describing the language at a level that is useful to programmers.
Theorem: if L is a context-free language, then recognizers and generators for L are equivalent.
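
As a rough illustration of the two views, here is a generator and a recognizer for the same small language, the `<ident_list>` lists from the previous slide. Both function names are hypothetical, and this is a sketch of the idea rather than a proof of the theorem.

```python
from itertools import islice


def generate():
    """Generator: yield the sentences of <ident_list>, shortest first."""
    sentence = ["identifier"]
    while True:
        yield list(sentence)
        sentence += [",", "identifier"]


def recognize(tokens):
    """Recognizer: return True if tokens form a sentence of <ident_list>."""
    if not tokens or tokens[0] != "identifier":
        return False
    rest = tokens[1:]
    if not rest:
        return True
    return rest[0] == "," and recognize(rest[1:])


# Every sentence the generator produces should be accepted by the recognizer.
first_five = list(islice(generate(), 5))
```

The recognizer answers yes or no for a given input, while the generator enumerates the language; both describe the same set of sentences.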

8 BNF
John Backus used a grammar notation to describe the syntax of Algol 58.
Backus and Naur used BNF to describe Algol 60.
BNF is equivalent to context-free grammars.
BNF is a metalanguage (a language used to describe other languages).
Backus-Naur Form (BNF) and context-free grammars are the most widely known method for describing programming language syntax.
Extended BNF (EBNF) is a shorthand notation that improves the readability and writability of BNF.

9 Derivations
Derivation: repeated application of rules, beginning with the start symbol (a non-terminal) and ending with a sentence (all terminal symbols).
Each string in the derivation is called a sentential form.
Many different derivations can result in the same sentence.
Leftmost derivation: the leftmost non-terminal is replaced at each step.
Rightmost derivation: the rightmost non-terminal is replaced at each step.

10 Example Grammar
Grammar:
    <program> → <stmts>
    <stmts> → <stmt> | <stmt> ; <stmts>
    <stmt> → <var> = <expr>
    <var> → a | b | c | d
    <expr> → <var> + <var> | <var> - <var>
Derivation:
    <program> => <stmts>
              => <stmt>
              => <var> = <expr>
              => a = <expr>
              => a = <var> + <var>
              => a = b + <var>
              => a = b + c
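
The derivation above always replaces the leftmost non-terminal, so it can be replayed mechanically. In this sketch the grammar is stored as a dictionary and each step is given as the index of the alternative to apply; that encoding and the function name `leftmost_derivation` are assumptions.

```python
# The example grammar: non-terminal -> list of alternative right-hand sides.
GRAMMAR = {
    "<program>": [["<stmts>"]],
    "<stmts>": [["<stmt>"], ["<stmt>", ";", "<stmts>"]],
    "<stmt>": [["<var>", "=", "<expr>"]],
    "<var>": [["a"], ["b"], ["c"], ["d"]],
    "<expr>": [["<var>", "+", "<var>"], ["<var>", "-", "<var>"]],
}


def leftmost_derivation(choices, start="<program>"):
    """Return the sentential forms produced by applying each chosen
    alternative to the leftmost non-terminal of the current form."""
    form = [start]
    steps = [" ".join(form)]
    for choice in choices:
        index = next(i for i, symbol in enumerate(form) if symbol in GRAMMAR)
        form = form[:index] + GRAMMAR[form[index]][choice] + form[index + 1:]
        steps.append(" ".join(form))
    return steps


# Alternative indices that reproduce the derivation of "a = b + c".
steps = leftmost_derivation([0, 0, 0, 0, 0, 1, 2])
```

Each entry of `steps` is one sentential form; the last one contains only terminals and is therefore a sentence.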

11 Parse tree: a hierarchical representation of a derivation
<program>
    <stmts>
        <stmt>
            <var>
                a
            =
            <expr>
                <var>
                    b
                +
                <var>
                    c

12 Parse Trees
A parse tree is a hierarchical representation of a derivation.
Internal nodes of the tree are non-terminals.
Leaf nodes of the tree are terminals.
Several different derivations may have the same parse tree.
If a sentence has two or more distinct parse trees, the grammar is ambiguous.
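
A parse tree can be held as nested data. The sketch below hand-builds the tree for `a = b + c` from the example grammar (the tuple layout is an assumption) and shows that reading the leaves left to right gives back the sentence.

```python
# Internal node: (non_terminal, [children]); leaf: a terminal string.
TREE = ("<program>", [
    ("<stmts>", [
        ("<stmt>", [
            ("<var>", ["a"]),
            "=",
            ("<expr>", [("<var>", ["b"]), "+", ("<var>", ["c"])]),
        ]),
    ]),
])


def leaves(node):
    """Collect the terminals at the leaves, left to right."""
    if isinstance(node, str):
        return [node]
    _, children = node
    return [leaf for child in children for leaf in leaves(child)]


def render(node, depth=0):
    """Return the tree as indented lines, one node per line."""
    pad = "  " * depth
    if isinstance(node, str):
        return [pad + node]
    label, children = node
    lines = [pad + label]
    for child in children:
        lines.extend(render(child, depth + 1))
    return lines
```

`" ".join(leaves(TREE))` gives `a = b + c`, and `print("\n".join(render(TREE)))` prints the hierarchy with non-terminals as internal nodes.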

13 Parse Trees and Semantics
Compilers generate code by traversing parse trees.
Semantics are derived from the “shape” of the trees.
Example: math expressions. Operations lower in the tree occur first, so the grammar can determine operator precedence.
    3.2: the rightmost operator has precedence
    3.3: ambiguous
    3.4: normal precedence order

14 Ambiguous Grammars
    <expr> → <expr> <op> <expr> | <id>
    <op> → * | +
    <id> → a | b | c | d
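
One way to see the ambiguity is to count the distinct parse trees this grammar allows for a sentence. The sketch below does that by trying every operator as the root `<op>`; the function name `count_trees` is hypothetical.

```python
from functools import lru_cache

IDS = {"a", "b", "c", "d"}
OPS = {"*", "+"}


def count_trees(tokens):
    """Count parse trees for tokens under <expr> -> <expr> <op> <expr> | <id>."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(lo, hi):
        # Number of ways tokens[lo:hi] can be derived from <expr>.
        if hi - lo == 1:
            return 1 if tokens[lo] in IDS else 0
        total = 0
        for mid in range(lo + 1, hi - 1):
            if tokens[mid] in OPS:  # tokens[mid] is the <op> at the root
                total += count(lo, mid) * count(mid + 1, hi)
        return total

    return count(0, len(tokens)) if tokens else 0
```

`count_trees("a + b * c".split())` returns 2: one tree groups `a + b` first, the other groups `b * c` first. A count above 1 for any sentence means the grammar is ambiguous.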

15 Ambiguous Grammars
Get rid of multiple recursion to create an unambiguous grammar:
    <expr> → <expr> + <term> | <term>
    <term> → <term> / <id> | <id>
    <id> → a | b | c | d

16 Operator Associativity
Associativity determines the evaluation order for operators of equal precedence, as in A / B * C.
Associativity is indicated by recursion:
    <expr> → <expr> + <term>    left recursion → left associative (evaluated left to right)
    <expr> → <term> + <expr>    right recursion → right associative
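
The effect of associativity on A / B * C can be checked numerically. This sketch evaluates a chain of equal-precedence operators both ways; the helper names `eval_left` and `eval_right` and the sample values are assumptions.

```python
import operator

OPS = {"*": operator.mul, "/": operator.truediv}


def eval_left(operands, ops):
    """Left associative: ((A op B) op C), as a left-recursive rule groups it."""
    result = operands[0]
    for op, value in zip(ops, operands[1:]):
        result = OPS[op](result, value)
    return result


def eval_right(operands, ops):
    """Right associative: (A op (B op C)), as a right-recursive rule groups it."""
    if not ops:
        return operands[0]
    return OPS[ops[0]](operands[0], eval_right(operands[1:], ops[1:]))


left = eval_left([8, 4, 2], ["/", "*"])    # (8 / 4) * 2
right = eval_right([8, 4, 2], ["/", "*"])  # 8 / (4 * 2)
```

With A = 8, B = 4, C = 2 the left-associative reading gives 4.0 and the right-associative reading gives 1.0, so the choice of recursion in the grammar changes the meaning.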

17 Extended BNF
Optional parts are placed in brackets [ ]:
    <proc_call> → <ident> ( [<expr_list>] )
Alternatives are placed in parentheses and separated by |:
    <term> → <term> (+|-) <const>
Repetitions (0 or more) are placed in braces { }:
    <ident> → letter { letter | digit }
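
The braces in the `<ident>` rule map directly onto a loop. A minimal sketch, assuming that "letter" means an ASCII letter and "digit" an ASCII digit (the slide does not say):

```python
import string

LETTERS = set(string.ascii_letters)  # assumption: ASCII letters only
DIGITS = set(string.digits)


def is_ident(text):
    """Check <ident> -> letter { letter | digit }."""
    if not text or text[0] not in LETTERS:  # exactly one leading letter
        return False
    for ch in text[1:]:                     # { letter | digit }: zero or more
        if ch not in LETTERS and ch not in DIGITS:
            return False
    return True
```

`is_ident("x1")` is accepted, while `is_ident("1x")` is rejected because the first symbol must be a letter.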

18 BNF and EBNF
BNF:
    <expr> → <expr> + <term> | <expr> - <term> | <term>
    <term> → <term> * <factor> | <term> / <factor> | <factor>
EBNF:
    <expr> → <term> { ( + | - ) <term> }
    <term> → <factor> { ( * | / ) <factor> }
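
The EBNF form translates almost line for line into a recursive-descent parser, with each `{ ... }` becoming a `while` loop. The sketch below assumes that a `<factor>` is a single identifier and that tokens are separated by spaces; neither is stated on the slide.

```python
def parse_expr(tokens, pos=0):
    """<expr> -> <term> { ( + | - ) <term> }"""
    node, pos = parse_term(tokens, pos)
    while pos < len(tokens) and tokens[pos] in ("+", "-"):
        op = tokens[pos]
        right, pos = parse_term(tokens, pos + 1)
        node = (op, node, right)  # building on the left keeps it left associative
    return node, pos


def parse_term(tokens, pos):
    """<term> -> <factor> { ( * | / ) <factor> }"""
    node, pos = parse_factor(tokens, pos)
    while pos < len(tokens) and tokens[pos] in ("*", "/"):
        op = tokens[pos]
        right, pos = parse_factor(tokens, pos + 1)
        node = (op, node, right)
    return node, pos


def parse_factor(tokens, pos):
    """Assumption: a <factor> is a single identifier token."""
    if pos >= len(tokens) or not tokens[pos].isidentifier():
        raise SyntaxError(f"expected an identifier at token {pos}")
    return tokens[pos], pos + 1


def parse(text):
    """Parse a space-separated expression into nested (op, left, right) tuples."""
    tokens = text.split()
    tree, pos = parse_expr(tokens)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree
```

`parse("a + b * c")` returns `("+", "a", ("*", "b", "c"))`: the multiplication sits lower in the tree, so it is performed first.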

19 Limits to BNF
BNF has a difficult time with some language rules, e.g. type compatibility.
Some language rules cannot be specified with BNF, e.g. a variable must be declared before it is used.
These are known as static semantic rules: rules that can be checked at compile time.
Attribute grammars are used to check these rules.
Dynamic semantic rules correspond to the meaning of expressions, statements, and program units.
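
The declare-before-use rule is easy to check with a pass over the program, even though BNF cannot express it. The sketch below uses a made-up program representation, a list of `("declare", name)` and `("use", name)` pairs, and is a plain symbol-table check rather than an attribute grammar.

```python
def undeclared_uses(statements):
    """Return (statement_number, name) for each use that precedes its declaration."""
    declared = set()
    errors = []
    for number, (kind, name) in enumerate(statements, start=1):
        if kind == "declare":
            declared.add(name)
        elif kind == "use" and name not in declared:
            errors.append((number, name))
    return errors


program = [
    ("declare", "x"),
    ("use", "x"),
    ("use", "y"),      # y is used before it is declared
    ("declare", "y"),
]
```

`undeclared_uses(program)` reports statement 3, the use of `y`. The check needs no program execution, which is what makes the rule a static one.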

