1 Chapter 2 Syntax

2 Syntax
The syntax of a programming language specifies the structure of the language.
The lexical structure specifies how words can be constituted from characters.
The syntactic structure specifies how sentences can be constituted from words.

3 Lexical Structure
The tokens of a programming language consist of the set of all basic grammatical categories that are the building blocks of syntax.
A program is viewed as a stream of tokens.

4 Standard Token Categories
Keywords, such as if and while
Literals or constants, such as 42 (a numeric literal) or "hello" (a string literal)
Special symbols, such as ";", "<=", or "+"
Identifiers, such as x24, putchar, or monthly_balance

5 White Spaces and Comments
White spaces and comments are ignored, except that they function as delimiters.
Typical white spaces: newlines, tabs, spaces
Comments:
/* … */ and // … \n (C, C++, Java)
-- … \n (Ada, Haskell)
(* … *) (Pascal, ML)
; … \n (Scheme)

6 C tokens There are six classes of tokens: identifiers, keywords, constants, string literals, operators, and other separators. Blanks, horizontal and vertical tabs, newlines, formfeeds, and comments as described below (collectively, "white space") are ignored except as they separate tokens. Some white space is required to separate otherwise adjacent identifiers, keywords, and constants. If the input stream has been separated into tokens up to a given character, the next token is the longest string of characters that could constitute a token.
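The "longest string of characters" rule can be sketched with a toy scanner. This is only an illustration under stated assumptions, not a full C lexer: the function name next_token and the short operator table are hypothetical, and comments, string literals, and most operators are left out.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Copies the longest token that starts at src (after any leading white
   space) into buf and returns how many characters of src were consumed.
   Returns 0 at end of input. Assumes cap >= 3; an identifier or constant
   longer than cap - 1 characters would be split, which a real lexer
   would have to handle. */
size_t next_token(const char *src, char *buf, size_t cap)
{
    /* Hypothetical, deliberately incomplete table of two-character operators. */
    static const char *const two_char_ops[] = {
        "<=", ">=", "==", "!=", "++", "--", "&&", "||"
    };
    size_t i = 0, n = 0, k;

    buf[0] = '\0';
    while (isspace((unsigned char)src[i]))      /* white space only separates tokens */
        i++;
    if (src[i] == '\0')
        return 0;

    if (isalpha((unsigned char)src[i]) || src[i] == '_') {
        /* identifier or keyword: keep going while the token can be extended */
        while ((isalnum((unsigned char)src[i]) || src[i] == '_') && n + 1 < cap)
            buf[n++] = src[i++];
    } else if (isdigit((unsigned char)src[i])) {
        /* integer constant */
        while (isdigit((unsigned char)src[i]) && n + 1 < cap)
            buf[n++] = src[i++];
    } else {
        /* operator: prefer a two-character operator over a one-character one */
        for (k = 0; k < sizeof two_char_ops / sizeof two_char_ops[0]; k++) {
            if (strncmp(src + i, two_char_ops[k], 2) == 0) {
                buf[n++] = src[i++];            /* first character of the pair */
                break;
            }
        }
        buf[n++] = src[i++];                    /* last (or only) character */
    }
    buf[n] = '\0';
    return i;
}
```

Calling next_token repeatedly on the text i+++j yields i, ++, +, j: after i, the scanner takes ++ because it is the longest operator available at that point, even though + followed by ++ would also cover the input.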

7 An Example
/* This program counts from 1 to 10. */
#include <stdio.h>

int main(void)
{
    int i;
    for (i = 1; i <= 10; i++) {
        printf("%d\n", i);
    }
    return 0;
}

8 Backus-Naur Form (BNF)
BNF is a notation widely used in the formal definition of syntactic structure.
A BNF grammar consists of a set of rewriting rules P, a set of terminal symbols Σ, a set of nonterminal symbols N, and a "start symbol" S ∈ N.
Each rule in P has the form A → ω, where A ∈ N and ω ∈ (N ∪ Σ)*.

9 Backus-Naur Form
The terminals in Σ form the basic alphabet (tokens) from which programs are constructed.
The nonterminals in N identify grammatical categories like Identifier, Integer, Expression, Statement, Function, Program.
The start symbol S identifies the principal grammatical category being defined by the grammar.

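10 Examples
1. binaryDigit → 0
   binaryDigit → 1
   or, combined into one rule: binaryDigit → 0 | 1
2. Integer → Digit | Integer Digit
   Digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
The symbol | is the metasymbol "or"; writing two symbols side by side, as in Integer Digit, is the metasymbol for concatenation.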

11 Derivation
Integer ⇒ Integer Digit ⇒ Integer Digit Digit ⇒ Digit Digit Digit ⇒ 3 Digit Digit ⇒ 3 5 Digit ⇒ 3 5 2
Each string in the derivation is a sentential form; the final string, 3 5 2, which contains only terminals, is a sentence.

12 Parse Tree
[Figure: parse tree; annotation: sentential form]

13 Example: Expression
Assignment → Identifier = Expression
Expression → Term | Expression + Term | Expression - Term
Term → Factor | Term * Factor | Term / Factor
Factor → Identifier | Literal | ( Expression )

14 Example: Expression x + 2 * y

15 Syntax for a Subset of C
Program → void main ( ) { Declarations Statements }
Declarations → ε | Declarations Declaration
Declaration → Type Identifiers ;
Type → int | boolean
Identifiers → Identifier | Identifiers , Identifier
Statements → ε | Statements Statement
Statement → ; | Block | Assignment | IfStatement | WhileStatement
Block → { Statements }
Assignment → Identifier = Expression ;
IfStatement → if ( Expression ) Statement | if ( Expression ) Statement else Statement
WhileStatement → while ( Expression ) Statement

16 Syntax for a Subset of C
Expression → Conjunction | Expression || Conjunction
Conjunction → Relation | Conjunction && Relation
Relation → Addition | Relation < Addition | Relation <= Addition | Relation > Addition | Relation >= Addition | Relation == Addition | Relation != Addition
Addition → Term | Addition + Term | Addition - Term
Term → Negation | Term * Negation | Term / Negation
Negation → Factor | ! Factor
Factor → Identifier | Literal | ( Expression )

17 Example: Program void main ( ) { int x; x = 1;}....

18 Ambiguity
A grammar is ambiguous if it permits a string to be parsed into two or more different parse trees.
AmbExp → Integer | AmbExp - AmbExp
Example string: 2 - 3 - 4

19 An Example
[Figure: two parse trees for the same string, one grouping it as 2 - (3 - 4) and the other as (2 - 3) - 4]
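Why the choice of parse tree matters can be seen by evaluating each grouping. The sketch below is only an illustration; the function names are hypothetical, and each function hard-codes one of the two tree shapes.

```c
/* Value obtained when the parse tree groups to the left: (a - b) - c. */
int left_grouping(int a, int b, int c)
{
    return (a - b) - c;
}

/* Value obtained when the parse tree groups to the right: a - (b - c). */
int right_grouping(int a, int b, int c)
{
    return a - (b - c);
}
```

For 2 - 3 - 4, left_grouping(2, 3, 4) gives -5 while right_grouping(2, 3, 4) gives 3, so an ambiguous grammar leaves the meaning of the string undetermined.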

20 The Dangling Else Problem
if ( x < 0 ) if ( y < 0 ) y = y - 1; else y = 0;

21 The Dangling Else Problem
if ( x < 0 ) if ( y < 0 ) y = y - 1; else y = 0;

22 The Dangling Else Problem
Solution I: use a special keyword, such as fi, to explicitly close every if statement (Algol 68 closes with fi; Ada closes with end if). For example:
IfStatement → if ( E ) S fi | if ( E ) S else S fi
Solution II: use an explicit rule outside the BNF syntax. For example, in C, every else clause is associated with the closest preceding if that does not already have an else.
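The C rule can be checked directly. In this sketch the function names are hypothetical: the first function contains the statement from the slides unchanged, and the second uses braces to force the other reading. Some compilers warn about the unbraced form for exactly this reason.

```c
/* The statement from the slides: the else pairs with "if (y < 0)",
   the closest preceding if, regardless of indentation. */
int dangling_else(int x, int y)
{
    if (x < 0)
        if (y < 0)
            y = y - 1;
        else
            y = 0;
    return y;
}

/* Braces force the other reading: the else now pairs with "if (x < 0)". */
int braced_outer_else(int x, int y)
{
    if (x < 0) {
        if (y < 0)
            y = y - 1;
    } else {
        y = 0;
    }
    return y;
}
```

With x = 1 and y = 5, dangling_else returns 5 (the whole nested statement is skipped), while braced_outer_else returns 0. With x = -1 and y = 5 the results swap: 0 and 5.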

23 Extended BNF (EBNF)
EBNF introduces three kinds of brackets:
{ } denotes repetition (zero or more occurrences), which simplifies the specification of recursion
[ ] denotes an optional part
( ) is used for grouping

24 An Example
EBNF:
Expression → Term { ( + | - ) Term }
Term → Factor { ( * | / ) Factor }
Factor → [ + | - ] number
Equivalent BNF:
Expression → Term | Expression + Term | Expression - Term
Term → Factor | Term * Factor | Term / Factor
Factor → + number | - number | number
Here ( ) marks grouping, { } marks zero or more occurrences, and [ ] marks the optional part.
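One practical consequence of the EBNF form is that each { } maps onto a loop and each [ ] onto an if. The evaluator below is a minimal sketch under stated assumptions: it reads from a string rather than standard input, number is taken to be an unsigned decimal integer, there is no error reporting, and the names eval_expression and cursor are hypothetical.

```c
#include <ctype.h>

/* Hypothetical helper state: position in the text being evaluated. */
static const char *cursor;

static void skip_spaces(void)
{
    while (*cursor == ' ')
        cursor++;
}

/* Factor -> [ + | - ] number */
static long factor(void)
{
    long sign = 1, value = 0;

    skip_spaces();
    if (*cursor == '+' || *cursor == '-') {   /* [ ]: optional, taken at most once */
        if (*cursor == '-')
            sign = -1;
        cursor++;
        skip_spaces();
    }
    while (isdigit((unsigned char)*cursor)) {
        value = value * 10 + (*cursor - '0');
        cursor++;
    }
    return sign * value;
}

/* Term -> Factor { ( * | / ) Factor } */
static long term(void)
{
    long value = factor();

    for (;;) {                                /* { }: zero or more repetitions */
        skip_spaces();
        if (*cursor == '*') {
            cursor++;
            value *= factor();
        } else if (*cursor == '/') {
            cursor++;
            value /= factor();                /* assumes a nonzero divisor */
        } else {
            return value;
        }
    }
}

/* Expression -> Term { ( + | - ) Term } */
long eval_expression(const char *text)
{
    long value;

    cursor = text;
    value = term();
    for (;;) {
        skip_spaces();
        if (*cursor == '+') {
            cursor++;
            value += term();
        } else if (*cursor == '-') {
            cursor++;
            value -= term();
        } else {
            return value;
        }
    }
}
```

Because each loop folds its operands into value from left to right, eval_expression("2 - 3 - 4") yields -5, the left-associative reading, and eval_expression("2 + 3 * 4") yields 14 because Term is handled one level below Expression.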

25 Abstract Syntax
The abstract syntax of a language identifies the essential syntactic elements in a program without describing how they are concretely constructed.
Pascal: while i < n do begin i := i + 1 end
C: while (i < n) { i = i + 1; }

26 Example: Loop
Thinking about a loop abstractly, the essential elements are a test expression for continuing the loop and a body, which is the statement to be repeated.
All other elements constitute nonessential "syntactic sugar".
The complete syntax is usually called concrete syntax.

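27 Example: Loop
[Figure: abstract syntax tree with root loop; the test subtree is < with operands i and n, and the body subtree is = with target i and value + applied to i and 1]
Pascal: while i < n do begin i := i + 1 end
C: while (i < n) { i = i + 1; }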

28 Example: Expression x + 2 * y

29 Example: Expression
x + 2 * y
[Figure: abstract syntax tree with node labels x, 2, y, *, +]
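An abstract syntax tree for x + 2 * y can be sketched as a small C data structure. The Node type and function names below are hypothetical, and the tree shape (+ at the root, with * as its right child) is an assumption based on the usual precedence of * over +. A postorder walk of that tree visits the nodes in the order x 2 y * +.

```c
#include <stddef.h>
#include <string.h>

/* One node of the abstract syntax tree: only the essential elements
   (operators and operands) are kept; there are no parentheses or other
   concrete details. */
typedef struct Node {
    const char *label;            /* operator symbol, identifier, or literal */
    const struct Node *left;      /* NULL for leaves */
    const struct Node *right;     /* NULL for leaves */
} Node;

/* Appends the postorder (postfix) listing of the tree to out, separating
   labels with single spaces. out must already hold a valid string and
   cap is its total capacity. */
void to_postfix(const Node *n, char *out, size_t cap)
{
    if (n == NULL)
        return;
    to_postfix(n->left, out, cap);
    to_postfix(n->right, out, cap);
    if (out[0] != '\0')
        strncat(out, " ", cap - strlen(out) - 1);
    strncat(out, n->label, cap - strlen(out) - 1);
}

/* Builds the tree for x + 2 * y and writes its postfix form into out. */
void example_postfix(char *out, size_t cap)
{
    static const Node x   = { "x", NULL, NULL };
    static const Node two = { "2", NULL, NULL };
    static const Node y   = { "y", NULL, NULL };
    static const Node mul = { "*", &two, &y };   /* 2 * y binds tighter */
    static const Node add = { "+", &x, &mul };   /* root of the tree */

    if (cap == 0)
        return;
    out[0] = '\0';
    to_postfix(&add, out, cap);
}
```

After example_postfix(buf, sizeof buf) with a buffer of 32 characters, buf holds the string x 2 y * +.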

30 Parser
A parser for a language accepts or rejects strings based on whether they are legal strings in the language.
In a recursive-descent parser, each nonterminal is implemented as a function, and each terminal is handled by matching it against the current token.

31 Example: Calculator
command → expr '\n'
expr → term { '+' term }
term → factor { '*' factor }
factor → number | '(' expr ')'
number → digit { digit }
digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

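32 Example: Calculator
#include <ctype.h>   /* isdigit */
#include <stdio.h>   /* getchar, printf */
#include <stdlib.h>  /* exit */

int token;      /* current input character */
int pos = 0;    /* number of characters read so far */

/* helper functions */
void parse(void);
void getToken(void);
void match(char c);
void error(void);

/* one function per nonterminal */
void command(void);
void expr(void);
void term(void);
void factor(void);
void number(void);
void digit(void);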

33 Example: Calculator
int main(void)
{
    parse();
    return 0;
}

void parse(void)
{
    getToken();
    command();
}

/* Reads the next character into token, skipping blanks. */
void getToken(void)
{
    token = getchar();
    pos++;
    while (token == ' ') {
        token = getchar();
        pos++;
    }
}

34 Example: Calculator
Grammar rule: command → expr '\n'

void command(void)
{
    expr();
    match('\n');
}

/* Consumes the current token if it equals c; otherwise reports an error. */
void match(char c)
{
    if (token == c)
        getToken();
    else
        error();
}

35 Example: Calculator
Grammar rules: expr → term { '+' term } and term → factor { '*' factor }

void expr(void)
{
    term();
    while (token == '+') {
        match('+');
        term();
    }
}

void term(void)
{
    factor();
    while (token == '*') {
        match('*');
        factor();   /* the rule repeats factor, not term */
    }
}

36 Example: Calculator
Grammar rules: factor → number | '(' expr ')' and number → digit { digit }

void factor(void)
{
    if (token == '(') {
        match('(');
        expr();
        match(')');
    } else {
        number();
    }
}

void number(void)
{
    digit();
    while (isdigit(token))
        digit();
}

37 Example: Calculator
void digit(void)
{
    if (isdigit(token))
        match(token);
    else
        error();
}

/* Reports where parsing failed and stops the program. */
void error(void)
{
    printf("parse error: position %d: character %c\n", pos, token);
    exit(1);
}

