Published by Percival Small. Modified over 9 years ago.
1
Chapter 2 Syntax
2
Syntax
The syntax of a programming language specifies the structure of the language.
The lexical structure specifies how words can be formed from characters.
The syntactic structure specifies how sentences can be formed from words.
3
Lexical Structure
The tokens of a programming language are the basic grammatical categories that form the building blocks of its syntax.
A program is viewed as a stream of tokens.
4
Standard Token Categories
Keywords, such as if and while
Literals or constants, such as 42 (a numeric literal) or "hello" (a string literal)
Special symbols, such as ";", "<=", or "+"
Identifiers, such as x24, putchar, or monthly_balance
5
White Spaces and Comments
White space and comments are ignored except where they act as delimiters between tokens.
Typical white space: newlines, tabs, spaces
Comments:
/* … */ and // … \n (C, C++, Java)
-- … \n (Ada, Haskell)
(* … *) (Pascal, ML)
; … \n (Scheme)
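As a rough illustration of "ignored except as delimiters", a scanner typically steps over white space and comments before it looks for the next token. The sketch below handles only the two C comment styles listed above; the name skip_blanks is a hypothetical helper, not part of any library.

```c
#include <ctype.h>

/* Hypothetical helper: returns a pointer to the first character of s that is
   neither white space nor part of a C-style comment. */
const char *skip_blanks(const char *s)
{
    for (;;) {
        if (isspace((unsigned char)*s)) {
            s++;
        } else if (s[0] == '/' && s[1] == '*') {
            s += 2;
            /* Scan to the closing delimiter or to the end of the input. */
            while (*s != '\0' && !(s[0] == '*' && s[1] == '/'))
                s++;
            if (*s != '\0')
                s += 2;   /* step over the closing delimiter */
        } else if (s[0] == '/' && s[1] == '/') {
            /* A line comment ends at the newline, which is then skipped as white space. */
            while (*s != '\0' && *s != '\n')
                s++;
        } else {
            return s;
        }
    }
}
```

For the input "  /* c */ x = 1;" the function returns a pointer to the text starting at x.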
6
C tokens
There are six classes of tokens: identifiers, keywords, constants, string literals, operators, and other separators.
Blanks, horizontal and vertical tabs, newlines, formfeeds, and comments as described below (collectively, "white space") are ignored except as they separate tokens.
Some white space is required to separate otherwise adjacent identifiers, keywords, and constants.
If the input stream has been separated into tokens up to a given character, the next token is the longest string of characters that could constitute a token.
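The longest-match rule in the last sentence can be sketched in a few lines. The function below is an illustration only: token_length is a hypothetical name, and it models just identifiers, decimal integers, and a handful of one- and two-character operators rather than the full C token set.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: length of the longest token that starts at s,
   or 0 at the end of the input. Only a few token classes are modeled. */
size_t token_length(const char *s)
{
    static const char *two_char_ops[] = { "<=", ">=", "==", "!=", "++", "&&", "||" };
    size_t i;

    if (*s == '\0')
        return 0;
    if (isalpha((unsigned char)*s) || *s == '_') {
        size_t n = 1;
        /* Keep extending the identifier as long as possible. */
        while (isalnum((unsigned char)s[n]) || s[n] == '_')
            n++;
        return n;
    }
    if (isdigit((unsigned char)*s)) {
        size_t n = 1;
        while (isdigit((unsigned char)s[n]))
            n++;
        return n;
    }
    /* Prefer a two-character operator over its one-character prefix. */
    for (i = 0; i < sizeof two_char_ops / sizeof two_char_ops[0]; i++)
        if (strncmp(s, two_char_ops[i], 2) == 0)
            return 2;
    return 1;
}
```

Under this rule "<=10" starts with the two-character token <=, and "forward" is read as one identifier rather than the keyword for followed by ward.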
7
An Example
/* This program counts from 1 to 10. */
#include <stdio.h>

int main(void)
{
    int i;
    for (i = 1; i <= 10; i++) {
        printf("%d\n", i);
    }
    return 0;
}
8
Backus-Naur Form (BNF)
BNF is a notation widely used in the formal definition of syntactic structure.
A BNF grammar consists of a set of rewriting rules P, a set of terminal symbols Σ, a set of nonterminal symbols N, and a "start symbol" S ∈ N.
Each rule in P has the form A → ω, where A ∈ N and ω ∈ (N ∪ Σ)*.
9
Backus-Naur Form
The terminals in Σ form the basic alphabet (tokens) from which programs are constructed.
The nonterminals in N identify grammatical categories such as Identifier, Integer, Expression, Statement, Function, and Program.
The start symbol S identifies the principal grammatical category being defined by the grammar.
10
Examples
1. binaryDigit → 0
   binaryDigit → 1
   or, combined into one rule: binaryDigit → 0 | 1
2. Integer → Digit | Integer Digit
   Digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Metasymbols: | means "or"; juxtaposition, as in Integer Digit, means concatenation.
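The Integer rule describes any string of one or more digits, so a recognizer for it can be sketched as a simple loop. The function name is_integer is an assumption made for this illustration.

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical recognizer for: Integer -> Digit | Integer Digit */
bool is_integer(const char *s)
{
    if (!isdigit((unsigned char)*s))
        return false;            /* the base case needs at least one Digit */
    while (isdigit((unsigned char)*s))
        s++;                     /* each further Digit extends the Integer */
    return *s == '\0';           /* nothing but digits may remain */
}
```

With this sketch, "352" is accepted, while the empty string and "35a" are rejected.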
11
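Derivation
Integer ⇒ Integer Digit ⇒ Integer Digit Digit ⇒ Digit Digit Digit ⇒ 3 Digit Digit ⇒ 3 5 Digit ⇒ 3 5 2
Every string in the derivation is a sentential form; the final string 3 5 2, which contains only terminals, is a sentence.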
12
Parse Tree Sentential form
13
Example: Expression
Assignment → Identifier = Expression
Expression → Term | Expression + Term | Expression - Term
Term → Factor | Term * Factor | Term / Factor
Factor → Identifier | Literal | ( Expression )
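Because Term sits below Expression in this grammar, * and / bind more tightly than + and -, and the left recursion makes each operator group to the left. The sketch below makes that grouping visible by printing a fully parenthesized form. It is an illustration under assumptions: the left-recursive rules are implemented as loops, input is assumed to be well formed, and all names (parenthesize, combine, MAXLEN) are hypothetical.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define MAXLEN 256                 /* assumed upper bound on output length */

static const char *cur;            /* next unread input character */
static int overflow;               /* set when the output does not fit */

static void skip(void) { while (*cur == ' ') cur++; }
static void expression(char *out);

/* Replaces left with "(left op right)". */
static void combine(char *left, char op, const char *right)
{
    char joined[2 * MAXLEN + 8];
    int n = snprintf(joined, sizeof joined, "(%s %c %s)", left, op, right);
    if (n < 0 || n >= MAXLEN) { overflow = 1; return; }
    strcpy(left, joined);
}

/* Factor -> Identifier | Literal | ( Expression ) */
static void factor(char *out)
{
    skip();
    if (*cur == '(') {
        cur++;
        expression(out);
        skip();
        if (*cur == ')') cur++;
    } else {
        size_t n = 0;
        while (isalnum((unsigned char)*cur) && n < MAXLEN - 1)
            out[n++] = *cur++;
        out[n] = '\0';
    }
}

/* Term -> Factor | Term * Factor | Term / Factor */
static void term(char *out)
{
    char right[MAXLEN];
    factor(out);
    for (skip(); *cur == '*' || *cur == '/'; skip()) {
        char op = *cur++;
        factor(right);
        combine(out, op, right);
    }
}

/* Expression -> Term | Expression + Term | Expression - Term */
static void expression(char *out)
{
    char right[MAXLEN];
    term(out);
    for (skip(); *cur == '+' || *cur == '-'; skip()) {
        char op = *cur++;
        term(right);
        combine(out, op, right);
    }
}

/* Writes the grouped form of src into out (capacity MAXLEN); returns 0 on success. */
int parenthesize(const char *src, char *out)
{
    cur = src;
    overflow = 0;
    expression(out);
    return overflow ? -1 : 0;
}
```

For the input x + 2 * y the result is (x + (2 * y)), and a - b - c becomes ((a - b) - c).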
14
Example: Expression x + 2 * y
15
Syntax for a Subset of C
Program → void main ( ) { Declarations Statements }
Declarations → ε | Declarations Declaration
Declaration → Type Identifiers ;
Type → int | boolean
Identifiers → Identifier | Identifiers , Identifier
Statements → ε | Statements Statement
Statement → ; | Block | Assignment | IfStatement | WhileStatement
Block → { Statements }
Assignment → Identifier = Expression ;
IfStatement → if ( Expression ) Statement | if ( Expression ) Statement else Statement
WhileStatement → while ( Expression ) Statement
16
Syntax for a Subset of C
Expression → Conjunction | Expression || Conjunction
Conjunction → Relation | Conjunction && Relation
Relation → Addition | Relation < Addition | Relation <= Addition | Relation > Addition | Relation >= Addition | Relation == Addition | Relation != Addition
Addition → Term | Addition + Term | Addition - Term
Term → Negation | Term * Negation | Term / Negation
Negation → Factor | ! Factor
Factor → Identifier | Literal | ( Expression )
17
Example: Program void main ( ) { int x; x = 1;}....
18
Ambiguity
A grammar is ambiguous if it permits some string to be parsed into two or more different parse trees.
AmbExp → Integer | AmbExp - AmbExp
Example string: 2 - 3 - 4
19
An Example
The string 2 - 3 - 4 has two parse trees, one grouping it as 2 - (3 - 4) and the other as (2 - 3) - 4.
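The two groupings are not just different pictures: they give different values, which is why the ambiguity matters. The sketch below evaluates both and also counts how many parse trees AmbExp allows for a chain of n integers. The function names are hypothetical, and the counting argument (pick which minus sign is the root, then count each side independently) is an addition for illustration.

```c
/* Value of a - b - c under each of the two parse trees. */
int group_left(int a, int b, int c)  { return (a - b) - c; }
int group_right(int a, int b, int c) { return a - (b - c); }

/* Number of distinct parse trees for n integers joined by n - 1 minus signs:
   choose the minus sign at the root, then multiply the counts of both sides. */
unsigned long count_trees(unsigned n)
{
    unsigned long total = 0;
    unsigned k;

    if (n <= 1)
        return n;   /* one tree for a single Integer, none for an empty string */
    for (k = 1; k < n; k++)
        total += count_trees(k) * count_trees(n - k);
    return total;
}
```

For 2 - 3 - 4 the left grouping yields -5 and the right grouping yields 3, and count_trees(3) is 2, matching the two trees. C itself groups subtraction to the left.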
20
The Dangling Else Problem if ( x < 0 ) if ( y < 0 ) y = y – 1; else y = 0;
21
The Dangling Else Problem if ( x < 0 ) if ( y < 0 ) y = y – 1; else y = 0;
22
The Dangling Else Problem
Solution I: use a special closing keyword, such as fi, to explicitly terminate every if statement (Ada, for example, closes each if with end if):
IfStatement → if ( E ) S fi | if ( E ) S else S fi
Solution II: use an explicit rule outside the BNF syntax. For example, in C, every else clause is associated with the closest preceding unmatched if.
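Solution II can be observed directly in C. In the sketch below, nearest_if contains the dangling-else statement from the earlier slides, and outer_if uses braces to force the other reading. Both function names are hypothetical, and the pragma is only there because GCC and Clang warn about exactly this construct.

```c
#pragma GCC diagnostic ignored "-Wdangling-else"

/* The else binds to the closest preceding unmatched if, here (y < 0). */
int nearest_if(int x, int y)
{
    if (x < 0)
        if (y < 0)
            y = y - 1;
        else
            y = 0;
    return y;
}

/* Braces attach the else to the outer if instead. */
int outer_if(int x, int y)
{
    if (x < 0) {
        if (y < 0)
            y = y - 1;
    } else {
        y = 0;
    }
    return y;
}
```

With x = -1 and y = 5, nearest_if returns 0 while outer_if returns 5, so the two readings really are different programs.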
23
Extended BNF (EBNF)
EBNF adds three kinds of brackets to BNF:
{ } denotes repetition (zero or more occurrences), which simplifies the specification of recursion.
[ ] denotes an optional part.
( ) denotes grouping.
24
An Example
EBNF:
Expression → Term { ( + | - ) Term }
Term → Factor { ( * | / ) Factor }
Factor → [ + | - ] number
Equivalent BNF:
Expression → Term | Expression + Term | Expression - Term
Term → Factor | Term * Factor | Term / Factor
Factor → + number | - number | number
Here ( ) marks grouping, { } marks zero or more occurrences, and [ ] marks an optional part.
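One practical benefit of the EBNF form is that each bracket maps onto a control structure: { } becomes a loop and [ ] becomes an if. The evaluator below is a minimal sketch of that mapping for the three EBNF rules above. It assumes well-formed input with decimal numbers, skips division by zero silently, and uses hypothetical names throughout.

```c
#include <ctype.h>

static const char *p;   /* next unread input character */

static void skip_spaces(void) { while (*p == ' ') p++; }

/* Factor -> [ + | - ] number */
static long factor(void)
{
    long sign = 1, value = 0;
    skip_spaces();
    if (*p == '+' || *p == '-') {          /* [ ] : the optional sign */
        if (*p == '-') sign = -1;
        p++;
    }
    while (isdigit((unsigned char)*p))
        value = value * 10 + (*p++ - '0');
    return sign * value;
}

/* Term -> Factor { ( * | / ) Factor } */
static long term(void)
{
    long value = factor();
    for (skip_spaces(); *p == '*' || *p == '/'; skip_spaces()) {   /* { } : a loop */
        char op = *p++;
        long rhs = factor();
        if (op == '*') value *= rhs;
        else if (rhs != 0) value /= rhs;   /* division by zero is ignored in this sketch */
    }
    return value;
}

/* Expression -> Term { ( + | - ) Term } */
static long expression(void)
{
    long value = term();
    for (skip_spaces(); *p == '+' || *p == '-'; skip_spaces()) {
        char op = *p++;
        long rhs = term();
        value = (op == '+') ? value + rhs : value - rhs;
    }
    return value;
}

/* Entry point: evaluates a whole expression string. */
long evaluate(const char *text)
{
    p = text;
    return expression();
}
```

Because the loop folds each new operand into the running value, operators group to the left: evaluate("8 - 3 - 2") gives 3, the same result the left-recursive BNF describes.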
25
Abstract Syntax
The abstract syntax of a language identifies the essential syntactic elements in a program without describing how they are concretely constructed.
Pascal: while i < n do begin i := i + 1 end
C: while (i < n) { i = i + 1; }
26
Example: Loop
Thinking of a loop abstractly, the essential elements are a test expression for continuing the loop and a body, which is the statement to be repeated.
All other elements constitute nonessential "syntactic sugar".
The complete syntax is usually called the concrete syntax.
27
Example: Loop
[Figure: abstract syntax tree with root loop, whose two children are the test i < n and the body i = i + 1]
Pascal: while i < n do begin i := i + 1 end
C: while (i < n) { i = i + 1; }
28
Example: Expression x + 2 * y
29
Example: Expression x + 2 * y x 2 y * +
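An abstract syntax tree for x + 2 * y keeps only the operators and operands, with + at the root and * as its right child. The sketch below builds such trees from a hypothetical Node structure and walks them in postorder, which for this tree yields the same sequence x 2 y * + that appears on the slide. The structure and function names are assumptions for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical AST node: a leaf has no children, an operator has two. */
struct Node {
    const char *label;
    const struct Node *left, *right;
};

/* Appends the labels of the tree rooted at n to out in postorder,
   separated by spaces. out must hold a NUL-terminated string of capacity cap. */
void postorder(const struct Node *n, char *out, size_t cap)
{
    size_t used;

    if (n == NULL)
        return;
    postorder(n->left, out, cap);     /* operands first ... */
    postorder(n->right, out, cap);
    used = strlen(out);               /* ... then the operator itself */
    snprintf(out + used, cap - used, used ? " %s" : "%s", n->label);
}
```

A tree with root +, left child x, and right child * (over 2 and y) produces "x 2 y * +" when traversed this way.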
30
Parser
A parser for a language accepts or rejects strings according to whether they are legal strings in the language.
In a recursive-descent parser, each nonterminal is implemented as a function, and each terminal is handled by matching it against the current token.
31
Example: Calculator
command → expr '\n'
expr → term { '+' term }
term → factor { '*' factor }
factor → number | '(' expr ')'
number → digit { digit }
digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
32
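Example: Calculator
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>

int token;    /* current input character */
int pos = 0;  /* number of characters read so far */

void parse(void);
void getToken(void);
void match(char c);
void error(void);
void command(void);
void expr(void);
void term(void);
void factor(void);
void number(void);
void digit(void);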
33
Example: Calculator
int main(void)
{
    parse();
    return 0;
}

void parse(void)
{
    getToken();
    command();
}

/* Reads the next character, skipping blanks. */
void getToken(void)
{
    token = getchar();
    pos++;
    while (token == ' ') {
        token = getchar();
        pos++;
    }
}
34
Example: Calculator
/* command → expr '\n' */
void command(void)
{
    expr();
    match('\n');
}

/* Consumes the current token if it equals c; otherwise reports an error. */
void match(char c)
{
    if (token == c)
        getToken();
    else
        error();
}
35
Example: Calculator
/* expr → term { '+' term } */
void expr(void)
{
    term();
    while (token == '+') {
        match('+');
        term();
    }
}

/* term → factor { '*' factor } */
void term(void)
{
    factor();
    while (token == '*') {
        match('*');
        factor();   /* the rule repeats factor, not term */
    }
}
36
Example: Calculator
/* factor → number | '(' expr ')' */
void factor(void)
{
    if (token == '(') {
        match('(');
        expr();
        match(')');
    } else {
        number();
    }
}

/* number → digit { digit } */
void number(void)
{
    digit();
    while (isdigit(token))
        digit();
}
37
Example: Calculator
void digit(void)
{
    if (isdigit(token))
        match(token);
    else
        error();
}

void error(void)
{
    printf("parse error: position %d: character %c\n", pos, token);
    exit(1);
}
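The functions above only accept or reject the input. A small variation, sketched below, lets each parsing function return the value of what it parsed, so the recognizer becomes an actual calculator. This is an assumed extension rather than part of the slides: it reads from a string instead of standard input, records errors in a flag instead of calling exit, and uses the hypothetical entry point calculate.

```c
#include <ctype.h>
#include <stdbool.h>

static const char *input;   /* text being parsed */
static int token;           /* current character token */
static bool failed;         /* set on a parse error instead of exiting */

/* Reads the next character, skipping blanks; stays on '\0' at the end. */
static void get_token(void)
{
    do {
        token = (unsigned char)*input;
        if (token != '\0')
            input++;
    } while (token == ' ');
}

static void match(int c)
{
    if (token == c) get_token();
    else failed = true;
}

static long expr(void);

/* factor -> number | '(' expr ')' */
static long factor(void)
{
    long value = 0;
    if (token == '(') {
        match('(');
        value = expr();
        match(')');
    } else if (isdigit(token)) {
        while (isdigit(token)) {            /* number -> digit { digit } */
            value = value * 10 + (token - '0');
            get_token();
        }
    } else {
        failed = true;
    }
    return value;
}

/* term -> factor { '*' factor } */
static long term(void)
{
    long value = factor();
    while (token == '*') {
        match('*');
        value *= factor();
    }
    return value;
}

/* expr -> term { '+' term } */
static long expr(void)
{
    long value = term();
    while (token == '+') {
        match('+');
        value += term();
    }
    return value;
}

/* command -> expr '\n'. Returns true and stores the value on success. */
bool calculate(const char *line, long *result)
{
    long value;
    input = line;
    failed = false;
    get_token();
    value = expr();
    match('\n');
    if (!failed)
        *result = value;
    return !failed;
}
```

For example, calculate("2 + 3 * 4\n", &r) succeeds with r equal to 14, while "2 + * 4\n" is rejected. As in the slide version, blanks are skipped between characters, so digits separated by spaces are read as a single number.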