1
Chapter 3 – Describing Syntax
CSCE 343
2
Syntax vs. Semantics
Syntax: the (correct) form of the expressions, statements, and program units (blocks, functions, methods).
  x = x + / 2 % 3; // not the correct form
Semantics: the meaning of the expressions, statements, and program units.
  x = 2 + x % 3; // assignment, precedence, x < 0
Syntax + semantics = language definition
Used by:
  Other language designers (evaluators)
  Implementers
  Programmers
3
Terminology
Sentence: a sequence (string) of characters over some alphabet
Language: the set of legal sentences (all programs)
Lexeme (Tokens): the lowest-level syntactic unit (e.g. +, (, {, while, 1.2, etc.)
Token (Token Class): a category of lexemes (e.g. identifier, open brace, int literal, etc.)
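The lexeme/token distinction can be sketched as a small scanner that pairs each lexeme with its token class. This is a minimal illustration, not part of the slides: the class names in `TOKEN_SPEC` and the function `tokenize` are hypothetical, and only a handful of lexemes are covered.

```python
import re

# Hypothetical token classes for illustration; the names are not from the slides.
TOKEN_SPEC = [
    ("FLOAT_LITERAL", r"\d+\.\d+"),
    ("INT_LITERAL",   r"\d+"),
    ("KEYWORD",       r"\bwhile\b"),
    ("IDENTIFIER",    r"[A-Za-z_]\w*"),
    ("PLUS",          r"\+"),
    ("OPEN_PAREN",    r"\("),
    ("OPEN_BRACE",    r"\{"),
    ("SKIP",          r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(text):
    """Return a list of (token_class, lexeme) pairs, skipping whitespace."""
    pairs = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if match.lastgroup != "SKIP":
            pairs.append((match.lastgroup, match.group()))
        pos = match.end()
    return pairs
```

Each token class here is described by a regular expression, which is the usual way token classes are specified.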
4
Chomsky’s Classes of Grammars (1950s)
Type-0: Unrestricted Grammars
Type-1: Context-Sensitive Grammars
Type-2: Context-Free Grammars
Type-3: Regular Grammars
Type-2 grammars are used to describe the syntax of a language.
Type-3 grammars are used to define token classes.
5
Formal Languages
Definition of a Context-Free Grammar
  ∑: alphabet (terminals)
  Γ: abstractions (nonterminals)
  P: rules (productions)
  S: start symbol (a special nonterminal)
6
Rules / Productions
A rule has a left-hand side (LHS) and a right-hand side (RHS).
The LHS is a single non-terminal.
The RHS consists of terminals and non-terminals.
A non-terminal can be on the LHS of several rules; the alternatives are separated by |.
Recursion for lists:
  <ident_list> → identifier | identifier , <ident_list>
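The recursive list rule maps directly onto a recursive function. The sketch below assumes the input is already split into tokens and treats anything accepted by Python's `str.isidentifier` as an identifier; the function name `is_ident_list` is hypothetical.

```python
def is_ident_list(tokens):
    """Recognize <ident_list> -> identifier | identifier , <ident_list>."""
    # Assumption: an identifier is whatever Python's str.isidentifier accepts.
    if not tokens or not tokens[0].isidentifier():
        return False
    if len(tokens) == 1:
        return True  # first alternative: a single identifier
    # Second alternative: identifier , <ident_list> (the recursive case).
    return tokens[1] == "," and is_ident_list(tokens[2:])
```

For example, `is_ident_list(["a", ",", "b"])` is accepted, while a trailing comma as in `["a", ","]` is rejected because the recursive call receives an empty list.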
7
Recognizers / Generators
Formal Languages
[Diagram: recognizers and generators]
Compilers have a recognizer component: the syntax analyzer (parser).
Generators are better at describing the language at a level that is useful to programmers.
Theorem: If L is a context-free language, then recognizers and generators for L are equivalent.
8
BNF
John Backus used a grammar notation to describe the syntax of Algol 58.
Backus and Naur used BNF to describe Algol 60.
BNF is equivalent to context-free grammars.
BNF is a metalanguage (a language used to describe other languages).
Backus-Naur Form (BNF) and Context-Free Grammars
  Most widely known method for describing programming language syntax
Extended BNF (shorthand notation)
  Improves readability and writability of BNF
9
Derivations
Derivation: repeated application of rules, beginning with the start symbol (a non-terminal) and ending with a sentence (all terminal symbols).
Each string in the derivation is called a sentential form.
Many different derivations can result in the same sentence.
Leftmost derivation: at each step, the leftmost non-terminal in the sentential form is replaced.
Rightmost derivation: at each step, the rightmost non-terminal is replaced.
10
Example Grammar
Grammar:
  <program> → <stmts>
  <stmts> → <stmt> | <stmt> ; <stmts>
  <stmt> → <var> = <expr>
  <var> → a | b | c | d
  <expr> → <var> + <var> | <var> - <var>
Derivation:
  <program> => <stmts>
            => <stmt>
            => <var> = <expr>
            => a = <expr>
            => a = <var> + <var>
            => a = b + <var>
            => a = b + c
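The derivation above can be reproduced mechanically. The following sketch encodes the example grammar as a dictionary and rewrites the leftmost non-terminal at each step; the names `GRAMMAR` and `leftmost_derivation`, and the idea of passing the chosen alternative as an index, are assumptions made for illustration.

```python
# The example grammar: each non-terminal maps to its list of alternatives.
GRAMMAR = {
    "<program>": [["<stmts>"]],
    "<stmts>":   [["<stmt>"], ["<stmt>", ";", "<stmts>"]],
    "<stmt>":    [["<var>", "=", "<expr>"]],
    "<var>":     [["a"], ["b"], ["c"], ["d"]],
    "<expr>":    [["<var>", "+", "<var>"], ["<var>", "-", "<var>"]],
}

def leftmost_derivation(choices, start="<program>"):
    """Apply one rule alternative per step to the leftmost non-terminal.

    `choices` lists, for each step, the index of the alternative to use.
    Returns every sentential form, beginning with the start symbol.
    """
    form = [start]
    forms = [" ".join(form)]
    for choice in choices:
        # Find the leftmost non-terminal in the current sentential form.
        index = next(i for i, sym in enumerate(form) if sym in GRAMMAR)
        form = form[:index] + GRAMMAR[form[index]][choice] + form[index + 1:]
        forms.append(" ".join(form))
    return forms
```

Calling `leftmost_derivation([0, 0, 0, 0, 0, 1, 2])` yields the eight sentential forms of the derivation of `a = b + c`.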
11
parse tree -- hierarchical representation of a derivation
Parse tree for a = b + c (indentation shows parent and child):
<program>
  <stmts>
    <stmt>
      <var>
        a
      =
      <expr>
        <var>
          b
        +
        <var>
          c
12
Parse Trees A hierarchical representation of a derivation
Internal nodes of the tree are non-terminals.
Leaf nodes of the tree are terminals.
Several different derivations may have the same parse tree.
If a sentence has two or more distinct parse trees, the grammar is ambiguous.
13
Parse Trees and Semantics
Compilers generate code by traversing parse trees.
Semantics are derived from the “shape” of the trees.
Example: math expressions
  Operations lower in the tree occur first.
  The grammar can determine operator precedence:
    3.2: rightmost operator has precedence
    3.3: ambiguous
    3.4: normal precedence order
14
Ambiguous Grammars
  <expr> → <expr> <op> <expr> | <id>
  <op> → * | +
  <id> → a | b | c | d
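One way to see the ambiguity is to count how many parse trees this grammar allows for a sentence. The sketch below is an illustration only: the function name `count_parse_trees` is hypothetical, and it assumes the input is a sequence of single-character tokens.

```python
from functools import lru_cache

def count_parse_trees(tokens):
    """Count parse trees for tokens under <expr> -> <expr> <op> <expr> | <id>."""
    tokens = tuple(tokens)
    ids, ops = set("abcd"), set("*+")

    @lru_cache(maxsize=None)
    def count(lo, hi):
        # Number of ways tokens[lo:hi] can be derived from <expr>.
        if hi - lo == 1:
            return 1 if tokens[lo] in ids else 0
        total = 0
        # Try every operator position as the root <op> of the tree.
        for mid in range(lo + 1, hi - 1):
            if tokens[mid] in ops:
                total += count(lo, mid) * count(mid + 1, hi)
        return total

    return count(0, len(tokens))
```

Here `count_parse_trees("a+b*c")` returns 2: one tree groups `b*c` first and the other groups `a+b` first, so the grammar does not fix the precedence of `*` over `+`.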
15
Ambiguous Grammars
Get rid of multiple recursion to create an unambiguous grammar:
  <expr> → <expr> + <term> | <term>
  <term> → <term> / <id> | <id>
  <id> → a | b | c | d
16
Operator Associativity
Order of evaluation for A / B * C: when / and * have equal precedence, associativity decides.
Associativity is indicated by recursion:
  <expr> → <expr> + <term>   left recursive, so left associative (operators group left to right)
  <expr> → <term> + <expr>   right recursive, so right associative
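The effect of associativity is easiest to see with an operator where grouping changes the result. The sketch below uses subtraction rather than the + of the rules above, purely as an assumption for illustration; both function names are hypothetical.

```python
from functools import reduce

def eval_left_assoc(operands):
    """Evaluate a - b - c as ((a - b) - c), the grouping left recursion gives."""
    return reduce(lambda acc, x: acc - x, operands)

def eval_right_assoc(operands):
    """Evaluate a - b - c as (a - (b - c)), the grouping right recursion gives."""
    *init, last = operands
    # Fold from the right: start with the last operand and work backwards.
    return reduce(lambda acc, x: x - acc, reversed(init), last)
```

For the operands 10, 4, 3 the left-associative grouping gives (10 - 4) - 3 = 3, while the right-associative grouping gives 10 - (4 - 3) = 9.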
17
Extended BNF
Optional parts are placed in brackets [ ]
  <proc_call> → <ident> ( [<expr_list>] )
Alternatives are placed in parentheses and separated by |
  <term> → <term> (+|-) <const>
Repetitions (0 or more) are placed in braces { }
  <ident> → letter { letter | digit }
18
BNF and EBNF
BNF
  <expr> → <expr> + <term>
         | <expr> - <term>
         | <term>
  <term> → <term> * <factor>
         | <term> / <factor>
         | <factor>
EBNF
  <expr> → <term> { ( + | - ) <term> }
  <term> → <factor> { ( * | / ) <factor> }
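The EBNF rules translate almost line for line into a recursive-descent evaluator, with each { } repetition becoming a while loop. This is a sketch under assumptions: the slides do not define <factor>, so it is taken here to be a single integer literal, and the name `parse_expr` is hypothetical.

```python
def parse_expr(tokens):
    """Evaluate a token list using the EBNF rules for <expr> and <term>."""
    pos = 0

    def factor():
        # Assumption: <factor> is a single integer literal.
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isdigit():
            raise SyntaxError(f"expected a number at position {pos}")
        value = int(tokens[pos])
        pos += 1
        return value

    def term():
        # <term> -> <factor> { (* | /) <factor> }
        nonlocal pos
        value = factor()
        while pos < len(tokens) and tokens[pos] in ("*", "/"):
            op = tokens[pos]
            pos += 1
            rhs = factor()
            value = value * rhs if op == "*" else value / rhs
        return value

    def expr():
        # <expr> -> <term> { (+ | -) <term> }
        nonlocal pos
        value = term()
        while pos < len(tokens) and tokens[pos] in ("+", "-"):
            op = tokens[pos]
            pos += 1
            rhs = term()
            value = value + rhs if op == "+" else value - rhs
        return value

    result = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return result
```

Because <term> is nested inside <expr>, multiplication and division bind tighter than addition and subtraction, and each loop accumulates from the left, so the operators are left associative.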
19
Limits to BNF
BNF has a difficult time with some language rules, e.g. type compatibility.
Some language rules cannot be specified with BNF at all, e.g. a variable must be declared before it is used.
These are known as static semantic rules: rules that can be checked at compile time.
Attribute grammars are used to check these rules.
Dynamic semantic rules correspond to the meaning of expressions, statements, and program units.
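A declare-before-use check is a typical static semantic rule: it needs a record of earlier declarations, which a context-free rule cannot carry. The sketch below is not an attribute grammar; it is a simplified stand-in that walks a flat list of hypothetical ("declare", name) and ("use", name) pairs in place of a real parse tree.

```python
def undeclared_uses(statements):
    """Return the variables that are used before any declaration.

    Each statement is a ("declare", name) or ("use", name) pair; this
    representation is a hypothetical stand-in for a real parse tree.
    """
    declared = set()
    errors = []
    for kind, name in statements:
        if kind == "declare":
            declared.add(name)
        elif kind == "use" and name not in declared:
            errors.append(name)
    return errors
```

The `declared` set plays the role of a symbol table: the check at each use depends on context accumulated from earlier statements.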