
Chapter 2 Syntax

Syntax
The syntax of a programming language specifies the structure of the language.
The lexical structure specifies how words can be constituted from characters.
The syntactic structure specifies how sentences can be constituted from words.

Lexical Structure
The tokens of a programming language consist of the set of all basic grammatical categories that are the building blocks of syntax.
A program is viewed as a stream of tokens.

Standard Token Categories
Keywords, such as if and while
Literals or constants, such as 42 (a numeric literal) or "hello" (a string literal)
Special symbols, such as ";", "<=", or "+"
Identifiers, such as x24, putchar, or monthly_balance

White Spaces and Comments
White space and comments are ignored except when they function as delimiters.
Typical white space: newlines, tabs, spaces
Comments:
/* … */ and // … \n (C, C++, Java)
-- … \n (Ada, Haskell)
(* … *) (Pascal, ML)
; … \n (Scheme)

C tokens
There are six classes of tokens: identifiers, keywords, constants, string literals, operators, and other separators.
Blanks, horizontal and vertical tabs, newlines, formfeeds, and comments (collectively, "white space") are ignored except as they separate tokens. Some white space is required to separate otherwise adjacent identifiers, keywords, and constants.
If the input stream has been separated into tokens up to a given character, the next token is the longest string of characters that could constitute a token.
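The longest-match rule above can be sketched as a small scanner. This is a minimal illustration, not a full C lexer: the names `token_kind` and `next_token` are hypothetical, keywords are not distinguished from identifiers, and only a few two-character operators are recognized.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Simplified token classes (an assumption; real C has six classes). */
enum token_kind { TK_IDENT, TK_CONST, TK_OP, TK_END };

/*
 * Scan one token starting at *p, skipping leading white space.
 * Copies the token text into buf (capacity cap, which must be > 0)
 * and advances *p past the token. Each branch consumes the longest
 * run of characters that can still form a token of that class.
 */
enum token_kind next_token(const char **p, char *buf, size_t cap)
{
    const char *s = *p;
    size_t n = 0, copy;
    enum token_kind kind;

    while (isspace((unsigned char)*s))
        s++;                                  /* white space only separates tokens */

    if (*s == '\0') {
        kind = TK_END;
    } else if (isalpha((unsigned char)*s) || *s == '_') {
        kind = TK_IDENT;
        while (isalnum((unsigned char)s[n]) || s[n] == '_')
            n++;
    } else if (isdigit((unsigned char)*s)) {
        kind = TK_CONST;
        while (isdigit((unsigned char)s[n]))
            n++;
    } else {
        kind = TK_OP;
        n = 1;
        /* Prefer "<=", ">=", "==", "!=" over their one-character prefixes. */
        if (s[1] == '=' && strchr("<>=!", s[0]) != NULL)
            n = 2;
        /* Prefer "++", "--", "&&", "||" over a single character. */
        else if ((s[0] == '+' || s[0] == '-' || s[0] == '&' || s[0] == '|')
                 && s[1] == s[0])
            n = 2;
    }

    copy = n < cap ? n : cap - 1;             /* truncate the copy, not the scan */
    memcpy(buf, s, copy);
    buf[copy] = '\0';
    *p = s + n;
    return kind;
}
```

Calling `next_token` repeatedly on the text `i<=10` yields the identifier `i`, the operator `<=` (not `<` followed by `=`), and the constant `10`, even though no white space separates them.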

An Example
/* This program counts from 1 to 10. */
#include <stdio.h>
int main(void)
{
    int i;
    for (i = 1; i <= 10; i++) {
        printf("%d\n", i);
    }
    return 0;
}

Backus-Naur Form (BNF)
BNF is a notation widely used in the formal definition of syntactic structure.
A BNF grammar consists of a set of rewriting rules P, a set of terminal symbols Σ, a set of nonterminal symbols N, and a "start symbol" S ∈ N.
Each rule in P has the form A → ω, where A ∈ N and ω ∈ (N ∪ Σ)*.

Backus-Naur Form
The terminals in Σ form the basic alphabet (tokens) from which programs are constructed.
The nonterminals in N identify grammatical categories like Identifier, Integer, Expression, Statement, Function, Program.
The start symbol S identifies the principal grammatical category being defined by the grammar.

Examples
1. binaryDigit → 0
   binaryDigit → 1
   or, combined into one rule: binaryDigit → 0 | 1
2. Integer → Digit | Integer Digit
   Digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
The metasymbol | means "or"; writing symbols side by side, as in Integer Digit, means concatenation.
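As a rough illustration of what the Integer grammar accepts, the sketch below checks whether a string can be derived from Integer. The function names are hypothetical, and the left-recursive rule is applied as a loop rather than as literal recursion.

```c
/* Digit -> 0 | 1 | ... | 9 */
int is_digit_symbol(char c)
{
    return c >= '0' && c <= '9';
}

/*
 * Integer -> Digit | Integer Digit
 * Returns 1 if s is one or more digits and nothing else, 0 otherwise.
 */
int is_integer(const char *s)
{
    if (!is_digit_symbol(*s))
        return 0;                  /* the base case Integer -> Digit needs one digit */
    s++;
    while (is_digit_symbol(*s))    /* each extra digit is one use of Integer -> Integer Digit */
        s++;
    return *s == '\0';             /* reject anything left over */
}
```

Under this sketch, `is_integer("352")` succeeds while the empty string and `"35x"` are rejected, matching the fact that every Integer derives at least one Digit.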

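Derivation
Integer ⇒ Integer Digit ⇒ Integer Digit Digit ⇒ Digit Digit Digit ⇒ 3 Digit Digit ⇒ 3 5 Digit ⇒ …
Each intermediate string in a derivation is a sentential form; the final string, which contains only terminals, is a sentence.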

Parse Tree
[Figure: parse tree, with a sentential form marked]

Example: Expression
Assignment → Identifier = Expression
Expression → Term | Expression + Term | Expression - Term
Term → Factor | Term * Factor | Term / Factor
Factor → Identifier | Literal | ( Expression )

Example: Expression x + 2 * y

Syntax for a Subset of C
Program → void main ( ) { Declarations Statements }
Declarations → ε | Declarations Declaration
Declaration → Type Identifiers ;
Type → int | boolean
Identifiers → Identifier | Identifiers , Identifier
Statements → ε | Statements Statement
Statement → ; | Block | Assignment | IfStatement | WhileStatement
Block → { Statements }
Assignment → Identifier = Expression ;
IfStatement → if ( Expression ) Statement | if ( Expression ) Statement else Statement
WhileStatement → while ( Expression ) Statement

Syntax for a Subset of C
Expression → Conjunction | Expression || Conjunction
Conjunction → Relation | Conjunction && Relation
Relation → Addition | Relation < Addition | Relation <= Addition | Relation > Addition | Relation >= Addition | Relation == Addition | Relation != Addition
Addition → Term | Addition + Term | Addition - Term
Term → Negation | Term * Negation | Term / Negation
Negation → Factor | ! Factor
Factor → Identifier | Literal | ( Expression )

Example: Program void main ( ) { int x; x = 1;}....

Ambiguity
A grammar is ambiguous if it permits a string to be parsed into two or more different parse trees.
AmbExp → Integer | AmbExp - AmbExp

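An Example
Under AmbExp, the string 2 - 3 - 4 has two parse trees: one groups it as 2 - (3 - 4), which evaluates to 3, and the other as (2 - 3) - 4, which evaluates to -5.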
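The practical cost of the ambiguity is that the two parse trees compute different values. A minimal sketch, using two hypothetical helper functions that hard-code each grouping:

```c
/* Grouping from the parse tree that nests the right operand: a - (b - c). */
int right_grouped(int a, int b, int c)
{
    return a - (b - c);
}

/* Grouping from the parse tree that nests the left operand: (a - b) - c. */
int left_grouped(int a, int b, int c)
{
    return (a - b) - c;
}
```

C itself treats subtraction as left-associative, so the expression `2 - 3 - 4` agrees with `left_grouped(2, 3, 4)`.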

The Dangling Else Problem
if ( x < 0 ) if ( y < 0 ) y = y - 1; else y = 0;
The IfStatement rule lets the else attach to either if, so this statement has two parse trees.

The Dangling Else Problem
Solution I: use a special keyword to explicitly close every if statement, such as fi (Algol 68) or end if (Ada). With fi:
IfStatement → if ( E ) S fi | if ( E ) S else S fi
Solution II: use an explicit rule outside the BNF syntax. For example, in C, every else clause is associated with the closest preceding unmatched if.
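Solution II can be observed directly in C. The sketch below wraps the fragment from the earlier slide in a hypothetical function, `dangling_else`, and deliberately omits braces; some compilers warn about exactly this pattern.

```c
/* Returns the value of y after running the dangling-else fragment. */
int dangling_else(int x, int y)
{
    if (x < 0)
        if (y < 0)
            y = y - 1;
        else            /* C pairs this with "if (y < 0)", the closest preceding if */
            y = 0;
    return y;
}
```

When `x` is not negative, the whole inner if-else is skipped and `y` is returned unchanged; had the else belonged to the outer if, `y` would have been set to 0 in that case.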

Extended BNF (EBNF)
EBNF introduces three kinds of brackets:
{ } denotes repetition, which simplifies the specification of recursion
[ ] denotes an optional part
( ) is used for grouping

An Example
EBNF:
Expression → Term { ( + | - ) Term }
Term → Factor { ( * | / ) Factor }
Factor → [ + | - ] number
Equivalent BNF:
Expression → Term | Expression + Term | Expression - Term
Term → Factor | Term * Factor | Term / Factor
Factor → + number | - number | number
In the EBNF version, ( ) is grouping, { } means zero or more occurrences, and [ ] marks an optional part.
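One reason EBNF is convenient is that its brackets map directly onto control flow: `{ }` becomes a loop and `[ ]` becomes an if. The following is a sketch of an evaluator for the EBNF grammar above, under some assumptions: all function names are hypothetical, numbers are unsigned decimal integers, the input is well formed, and divisors are nonzero.

```c
#include <ctype.h>

static const char *cur;   /* next unread character of the input */

static void skip_spaces(void)
{
    while (*cur == ' ')
        cur++;
}

/* Factor -> [ + | - ] number : the optional sign becomes an if. */
static int factor(void)
{
    int sign = 1, value = 0;

    skip_spaces();
    if (*cur == '+' || *cur == '-') {
        if (*cur == '-')
            sign = -1;
        cur++;
    }
    while (isdigit((unsigned char)*cur))
        value = value * 10 + (*cur++ - '0');
    return sign * value;
}

/* Term -> Factor { ( * | / ) Factor } : the repetition becomes a loop. */
static int term(void)
{
    int value = factor();

    for (;;) {
        skip_spaces();
        if (*cur == '*') {
            cur++;
            value *= factor();
        } else if (*cur == '/') {
            cur++;
            value /= factor();   /* assumes a nonzero divisor */
        } else {
            return value;
        }
    }
}

/* Expression -> Term { ( + | - ) Term } */
static int expression(void)
{
    int value = term();

    for (;;) {
        skip_spaces();
        if (*cur == '+') {
            cur++;
            value += term();
        } else if (*cur == '-') {
            cur++;
            value -= term();
        } else {
            return value;
        }
    }
}

/* Evaluate a whole expression held in a string. */
int eval_expression(const char *text)
{
    cur = text;
    return expression();
}
```

Because each loop folds its operands from left to right, `eval_expression("10 - 4 - 3")` gives 3, the left-associative reading, and the layering of `expression` over `term` gives `*` and `/` higher precedence than `+` and `-`.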

Abstract Syntax
The abstract syntax of a language identifies the essential syntactic elements in a program without describing how they are concretely constructed.
Pascal: while i < n do begin i := i + 1 end
C: while (i < n) { i = i + 1; }

Example: Loop
Thinking about a loop abstractly, the essential elements are a test expression for continuing the loop and a body, which is the statement to be repeated.
All other elements constitute nonessential "syntactic sugar".
The complete syntax is usually called concrete syntax.

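Example: Loop
Pascal: while i < n do begin i := i + 1 end
C: while (i < n) { i = i + 1; }
[Figure: the abstract syntax tree shared by both loops: a loop node whose children are the test < (i, n) and the body = (i, + (i, 1))]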

Example: Expression x + 2 * y

Example: Expression
x + 2 * y
[Figure: abstract syntax tree for x + 2 * y, with leaves x, 2, y and operator nodes * and +]
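An abstract syntax tree for x + 2 * y can be represented with a small node type. This is one possible encoding, not the one used in the slides: the type `struct node`, its fields, and both functions are hypothetical, and only the variables x and y and the operators + and * are supported.

```c
#include <stddef.h>

enum node_kind { NODE_NUM, NODE_VAR, NODE_BINOP };

struct node {
    enum node_kind kind;
    int value;                        /* NODE_NUM: value of the literal */
    char name;                        /* NODE_VAR: 'x' or 'y' */
    char op;                          /* NODE_BINOP: '+' or '*' */
    const struct node *left, *right;  /* NODE_BINOP: operands */
};

/* Evaluate a tree given values for the variables x and y. */
int eval(const struct node *n, int x, int y)
{
    switch (n->kind) {
    case NODE_NUM:
        return n->value;
    case NODE_VAR:
        return n->name == 'x' ? x : y;
    case NODE_BINOP: {
        int l = eval(n->left, x, y);
        int r = eval(n->right, x, y);
        return n->op == '+' ? l + r : l * r;
    }
    }
    return 0;   /* not reached for well-formed trees */
}

/* Build the tree for x + 2 * y and evaluate it. */
int eval_example(int x, int y)
{
    struct node vx  = { NODE_VAR,   0, 'x', 0,   NULL, NULL };
    struct node two = { NODE_NUM,   2, 0,   0,   NULL, NULL };
    struct node vy  = { NODE_VAR,   0, 'y', 0,   NULL, NULL };
    struct node mul = { NODE_BINOP, 0, 0,   '*', &two, &vy  };  /* 2 * y */
    struct node add = { NODE_BINOP, 0, 0,   '+', &vx,  &mul };  /* x + (2 * y) */

    return eval(&add, x, y);
}
```

The tree records only operators and operands. Precedence is captured by its shape (the * node sits below the + node), so no parentheses or grammar nonterminals such as Term and Factor need to be stored.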

Parser
A parser for a language accepts or rejects strings based on whether they are legal strings in the language.
In a recursive-descent parser, each nonterminal is implemented as a function, and each terminal is matched against the current token.

Example: Calculator
command → expr '\n'
expr → term { '+' term }
term → factor { '*' factor }
factor → number | '(' expr ')'
number → digit { digit }
digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

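Example: Calculator
#include <stdio.h>   /* getchar, printf */
#include <stdlib.h>  /* exit */
#include <ctype.h>   /* isdigit */
int token;     /* current input character */
int pos = 0;   /* number of characters read so far */
void parse(void);
void getToken(void);
void match(char c);
void error(void);
void command(void);
void expr(void);
void term(void);
void factor(void);
void number(void);
void digit(void);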

Example: Calculator
int main(void)
{
    parse();
    return 0;
}
void parse(void)
{
    getToken();
    command();
}
/* Read the next non-blank character into token. */
void getToken(void)
{
    token = getchar();
    pos++;
    while (token == ' ') {
        token = getchar();
        pos++;
    }
}

Example: Calculator
/* command → expr '\n' */
void command(void)
{
    expr();
    match('\n');
}
/* Consume the current token if it equals c; otherwise report an error. */
void match(char c)
{
    if (token == c)
        getToken();
    else
        error();
}

Example: Calculator
/* expr → term { '+' term } */
void expr(void)
{
    term();
    while (token == '+') {
        match('+');
        term();
    }
}
/* term → factor { '*' factor } */
void term(void)
{
    factor();
    while (token == '*') {
        match('*');
        factor();   /* the rule repeats factor, not term */
    }
}

Example: Calculator
/* factor → number | '(' expr ')' */
void factor(void)
{
    if (token == '(') {
        match('(');
        expr();
        match(')');
    } else {
        number();
    }
}
/* number → digit { digit } */
void number(void)
{
    digit();
    while (isdigit(token))
        digit();
}

Example: Calculator
/* digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 */
void digit(void)
{
    if (isdigit(token))
        match(token);
    else
        error();
}
/* Report where parsing failed and stop. */
void error(void)
{
    printf("parse error: position %d: character %c\n", pos, token);
    exit(1);
}
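The calculator above reads from standard input and exits on the first error, which makes it awkward to try on several inputs. The following is a sketch of the same recursive-descent structure reworked to parse a string and return accept or reject. It is a variation under stated assumptions, not the slides' program: `struct parser`, `accepts`, and the helpers are hypothetical names, the end of the string stands in for the '\n' terminator, and blanks are not allowed between the digits of a number.

```c
#include <ctype.h>

struct parser {
    const char *text;   /* input being parsed */
    int pos;            /* index of the current character */
    int ok;             /* cleared on the first mismatch */
};

static int peek(const struct parser *p)
{
    return (unsigned char)p->text[p->pos];
}

static void skip_blanks(struct parser *p)
{
    while (p->text[p->pos] == ' ')
        p->pos++;
}

/* Consume terminal c if it is the current character; otherwise record failure. */
static void match(struct parser *p, int c)
{
    if (p->ok && peek(p) == c) {
        p->pos++;
        skip_blanks(p);
    } else {
        p->ok = 0;
    }
}

static void expr(struct parser *p);

/* factor -> number | '(' expr ')', with number -> digit { digit } inlined */
static void factor(struct parser *p)
{
    if (peek(p) == '(') {
        match(p, '(');
        expr(p);
        match(p, ')');
    } else if (isdigit(peek(p))) {
        while (isdigit(peek(p)))
            p->pos++;
        skip_blanks(p);
    } else {
        p->ok = 0;
    }
}

/* term -> factor { '*' factor } */
static void term(struct parser *p)
{
    factor(p);
    while (p->ok && peek(p) == '*') {
        match(p, '*');
        factor(p);
    }
}

/* expr -> term { '+' term } */
static void expr(struct parser *p)
{
    term(p);
    while (p->ok && peek(p) == '+') {
        match(p, '+');
        term(p);
    }
}

/* Returns 1 if line is a legal expression of the calculator grammar, else 0. */
int accepts(const char *line)
{
    struct parser p = { line, 0, 1 };

    skip_blanks(&p);
    expr(&p);
    return p.ok && peek(&p) == '\0';
}
```

Keeping the state in a struct instead of globals lets the parser be called repeatedly, and the `ok` flag replaces `exit(1)` so that a rejected string is reported to the caller rather than ending the program.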