Chapter 3 Describing Syntax and Semantics

Slides:



Advertisements
Similar presentations
Defining Program Syntax Chapter TwoModern Programming Languages, 2nd ed.1.
Advertisements

Grammars, constituency and order A grammar describes the legal strings of a language in terms of constituency and order. For example, a grammar for a fragment.
Chapter Chapter Summary Languages and Grammars Finite-State Machines with Output Finite-State Machines with No Output Language Recognition Turing.
ICE1341 Programming Languages Spring 2005 Lecture #5 Lecture #5 In-Young Ko iko.AT. icu.ac.kr iko.AT. icu.ac.kr Information and Communications University.
ICE1341 Programming Languages Spring 2005 Lecture #4 Lecture #4 In-Young Ko iko.AT. icu.ac.kr iko.AT. icu.ac.kr Information and Communications University.
ISBN Chapter 3 Describing Syntax and Semantics.
CS 330 Programming Languages 09 / 13 / 2007 Instructor: Michael Eckmann.
Chapter 3 Describing Syntax and Semantics Sections 1-3.
Chapter 3 Describing Syntax and Semantics Sections 1-3.
Slide 1 Chapter 2-b Syntax, Semantics. Slide 2 Syntax, Semantics - Definition The syntax of a programming language is the form of its expressions, statements.
1 CSE305 Programming Languages Syntax What is it? How is it specified? Who uses it? Why is it needed?
Chapter 3 Describing Syntax and Semantics Sections 1-3.
Chapter 2-a Defining Program Syntax
Dr. Muhammed Al-Mulhem 1ICS ICS 535 Design and Implementation of Programming Languages Part 1 Fundamentals (Chapter 4) Compilers and Syntax.
Chapter 3: Formal Translation Models
COP4020 Programming Languages
ISBN Chapter 3 Describing Syntax and Semantics.
S YNTAX. Outline Programming Language Specification Lexical Structure of PLs Syntactic Structure of PLs Context-Free Grammar / BNF Parse Trees Abstract.
(2.1) Grammars  Definitions  Grammars  Backus-Naur Form  Derivation – terminology – trees  Grammars and ambiguity  Simple example  Grammar hierarchies.
Chapter 2 Syntax A language that is simple to parse for the compiler is also simple to parse for the human programmer. N. Wirth.
Describing Syntax and Semantics
1 Syntax and Semantics The Purpose of Syntax Problem of Describing Syntax Formal Methods of Describing Syntax Derivations and Parse Trees Sebesta Chapter.
The College of Saint Rose CIS 433 – Programming Languages David Goldschmidt, Ph.D. from Concepts of Programming Languages, 9th edition by Robert W. Sebesta,
Chapter TwoModern Programming Languages1 Defining Program Syntax.
Chpater 3. Outline The definition of Syntax The Definition of Semantic Most Common Methods of Describing Syntax.
CPSC 388 – Compiler Design and Construction Parsers – Context Free Grammars.
CS 355 – PROGRAMMING LANGUAGES Dr. X. Topics Introduction The General Problem of Describing Syntax Formal Methods of Describing Syntax.
Winter 2007SEG2101 Chapter 71 Chapter 7 Introduction to Languages and Compiler.
1 Chapter 3 Describing Syntax and Semantics. 3.1 Introduction Providing a concise yet understandable description of a programming language is difficult.
Syntax: 10/18/2015IT 3271 Semantics: Describe the structures of programs Describe the meaning of programs Programming Languages (formal languages) -- How.
A sentence (S) is composed of a noun phrase (NP) and a verb phrase (VP). A noun phrase may be composed of a determiner (D/DET) and a noun (N). A noun phrase.
CS Describing Syntax CS 3360 Spring 2012 Sec Adapted from Addison Wesley’s lecture notes (Copyright © 2004 Pearson Addison Wesley)
Copyright © 2006 The McGraw-Hill Companies, Inc. Programming Languages 2nd edition Tucker and Noonan Chapter 2 Syntax A language that is simple to parse.
Grammars CPSC 5135.
PART I: overview material
Chapter 3 Part I Describing Syntax and Semantics.
3-1 Chapter 3: Describing Syntax and Semantics Introduction Terminology Formal Methods of Describing Syntax Attribute Grammars – Static Semantics Describing.
C H A P T E R TWO Syntax and Semantic.
ISBN Chapter 3 Describing Syntax and Semantics.
TextBook Concepts of Programming Languages, Robert W. Sebesta, (10th edition), Addison-Wesley Publishing Company CSCI18 - Concepts of Programming languages.
1 Syntax In Text: Chapter 3. 2 Chapter 3: Syntax and Semantics Outline Syntax: Recognizer vs. generator BNF EBNF.
CMSC 330: Organization of Programming Languages Context-Free Grammars.
The College of Saint Rose CIS 433 – Programming Languages David Goldschmidt, Ph.D. from Concepts of Programming Languages, 9th edition by Robert W. Sebesta,
CPS 506 Comparative Programming Languages Syntax Specification.
D Goforth COSC Translating High Level Languages.
Chapter 3 Describing Syntax and Semantics. Chapter 3: Describing Syntax and Semantics - Introduction - The General Problem of Describing Syntax - Formal.
C HAPTER 3 Describing Syntax and Semantics. T OPICS Introduction The General Problem of Describing Syntax Formal Methods of Describing Syntax Attribute.
Copyright © 2006 Addison-Wesley. All rights reserved. Ambiguity in Grammars A grammar is ambiguous if and only if it generates a sentential form that has.
Syntax The Structure of a Language. Lexical Structure The structure of the tokens of a programming language The scanner takes a sequence of characters.
ISBN Chapter 3 Describing Syntax and Semantics.
LESSON 04.
Syntax and Grammars.
Syntax and Semantics Form and Meaning of Programming Languages Copyright © by Curt Hill.
Chapter 3 Context-Free Grammars Dr. Frank Lee. 3.1 CFG Definition The next phase of compilation after lexical analysis is syntax analysis. This phase.
Copyright © 2006 The McGraw-Hill Companies, Inc. Programming Languages 2nd edition Tucker and Noonan Chapter 2 Syntax A language that is simple to parse.
C H A P T E R T W O Syntax and Semantic. 2 Introduction Who must use language definitions? Other language designers Implementors Programmers (the users.
1 CS Programming Languages Class 04 September 5, 2000.
Copyright © 2006 Addison-Wesley. All rights reserved.1-1 ICS 410: Programming Languages Chapter 3 : Describing Syntax and Semantics Syntax.
Chapter 3 – Describing Syntax CSCE 343. Syntax vs. Semantics Syntax: The form or structure of the expressions, statements, and program units. Semantics:
Describing Syntax and Semantics Chapter 3: Describing Syntax and Semantics Lectures # 6.
Chapter 3: Describing Syntax and Semantics
Chapter 3 – Describing Syntax
Describing Syntax and Semantics
PROGRAMMING LANGUAGES
Chapter 3 – Describing Syntax
What does it mean? Notes from Robert Sebesta Programming Languages
Syntax versus Semantics
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
Chapter 3 Describing Syntax and Semantics.
COMPILER CONSTRUCTION
Presentation transcript:

Chapter 3 Describing Syntax and Semantics The syntax of a programming language is the part of the language definition that says how programs look: their form and structure, such as its expressions, statements, and so on. The semantics of a programming language is the part of the language definition that says what programs do: the behavior and meaning of those expressions, statements. For example: The syntax of a C if statement is if (<expr>) <statement> The semantics of this statement form means if the current value of the expression <expr> is true, the embedded statement <statement> is selected for execution.

2. A Grammar Example for English 2.1 <A> for an article (a, the) and express our definition: <A>  a | the 2.2 <N> for a noun (dog, cat, or rat): <N>  dog | cat | rat 2.3 <NP> for a noun phrase (an article followed by a noun): <NP>  <A> <N> 2.4 <V> for a verb (loves, hates, or eats) <V>  loves | hates | eats 2.5 <S> for a sentence (a noun phrase, followed by a verb, followed by another noun phrase: <S>  <NP> <V> <NP>

2.6 Grammar defining a small subset of unpunctuated English <S>  <NP> <V> <NP> <NP>  <A> <N> <V>  loves | hates | eats <N>  dog | cat | rat <A>  a | the How does such a grammar define a language? Think of the grammar as a set of rules that say how to build a tree.

<S> <NP> <V> <A> <N> the dog loves cat <S> <NP> <V> <A> <N> a cat eats the rat

3 The Basic Concepts of Describing Syntax 3.1 Sentences (statements) – The strings of a language. The syntax rules of the language specify which strings of characters in the language. 3.2 lexemes – The small units (identifier, literals, operators, and special words). 3.3 Token – A category of the language lexemes. 3.4 For example: Statement index = 2 * count + 17; Lexemes Tokens index identifier = equal_sign 2 int_literal * mult_op count identifier + plus_op 17 int_literal ; semicolon

Recursive Grammar 4 Formal Methods of Describing Syntax 4.1 A Grammar Example for a Programming Language <exp>  <exp>+<exp> | <exp>*<exp> | ( <exp> ) | a | b | c expression: a expression: a + b expression: a + b * c expression: ( ( a + b ) * c ) 4.2 Parse trees: Hierarchical syntactic structures <exp> ( <exp> ) Recursive Grammar <exp> <exp> * c ( <exp> ) <exp> + <exp> a b

4.3 Backus-Naur Form and Context-Free Grammars John Backus and Noam Chomsky (The middle to late 1950’s) proposed a method (grammar) for describing syntax. Start symbol <S>  <NP> <V> <NP> production <NP>  <A> <N> <V>  loves | hates | eats Non-terminal <N>  dog | cat | rat symbols <A>  a | the tokens (Lexemes): terminal symbols

The grammar has four important parts: The Non-terminal symbols are strings enclosed in angle brackets, such as <NP>. The Non-terminal symbols of a grammar often correspond to different kinds of language constructs The grammar designates one of the non-terminal symbols as the root of the parse tree: the start symbol. A production (rule) consists of a left-hand side (LHS) : a single non-terminal symbol a arrow :  a right-hand side (RHS) : a sequence of one or more things, each of which can be either a token or a non-terminal symbol. The special symbol | is used to separate the right-hand sides. Tokens and Lexemes : Terminal symbols (the smallest units of syntax).

The example: The grammar for a simple language of expressions with three variables. <exp>  <exp>+<exp> | <exp>*<exp> | ( <exp> ) | a | b | c This grammar can be written in a different way, without the | notation: <exp>  <exp>+<exp> <exp>  <exp>*<exp> <exp>  ( <exp> ) <exp>  a <exp>  b <exp>  c The special non-terminal symbol <empty> : Generate an empty string – a string of no tokens if – then statement with an optional else part might be defined like this: <if-stmt>  if <expr> them <stmt> <else-part> <else-part>  else <stmt> | <empty>

Context-free Grammars: The grammars are context-free grammars, the children of a node in the parse tree depend only on that node’s non-terminal symbol; they do not depend on the context of neighboring nodes in the tree. Metalanguage: a language is used to describe another language. BNF is a metalanguage.

4.4 An example for writing Grammars: float a; boolean (bool)b, c, d; int e=1, f, g = 1+2; Step 1: Divide the problem into smaller pieces. The major components: the type name the list of variables the final semicolon Non-terminal symbols <var-dec> <type-name> <declarator-list> <var-dec>  <type-name> <declarator-list>;

Step 2: List the primitive types <type-name>  boolean (bool) | byte | short | int | | long | char | float | double Step 3: Define the declarator list A declarator list is a list of one or more declarators, followed by a comma, followed by a (smaller) declarator list. <declarator-list>  <declartor> | <declarator> , <declarator-list> Step 4: Define a declarator It is a variable name followed by, optionally, by a equal sign and an expression. <declarator>  <variable-name> | <variable-name> = <expr>

4.5 Examples: A Grammar for a Small Language <program>  begin <stmt_list> end <stmt_list>  <stmt> | <stmt> ; <stmt_list> <stmt>  <var> = <expression> <var>  A | B | C <expression>  <var> + <var> | <var> - <var> | <var> Remarks: Only one kind of statement: assignment. A program consists of begin and end A list of assignment statements separated by semicolons An expression: a variable, add two variables, or subtract one variable from another one. Only three variable names: A, B, C.

Leftmost derivations: <program> => begin <stmt_list> end => begin <stmt> ; <stmt_list> end => begin <var> = <expression>; <stmt_list> end => begin A = <expression>; <stmt_list> end => begin A = <var> + <var>; <stmt_list> end => begin A = B + <var>; <stmt_list> end => begin A = B + C; <stmt_list> end => begin A = B + C; <stmt> end => begin A = B + C; <var> = <expression> end => begin A = B + C; B = <expression> end => begin A = B + C; B = C end In this derivation, the replaced non-terminal is always left-most non-terminal. (right-most non-terminal or neither left-most nor right-most)

A Grammar for Simple Assignment Statements <assign>  <id> = <expr> <id>  A | B | C <expr>  <id> + <expr> | <id> * <expr> | ( <expr> ) | <id> A = B * ( A + C ) is generated by the leftmost derivation: <assign> => <id> = <expr> => A = <id> * <expr> => A = B * <expr> => A = B * ( <expr> ) => A = B * ( <id> + <expr> ) => A = B * ( A + <expr> ) => A = B * ( A + <id> ) => A = B * ( A + C )

A parse tree for the simple statement A = B * (A + C)

The statement: A = B + C * A has two distinct parse trees. Ambiguity: A grammar that generates a sentence for which there are two or more distinct parse trees is said to be ambiguous. An ambiguous Grammar for Simple Assignment Statements <assign>  <id> = <expr> <id>  A | B | C <expr>  <expr> + <expr> | <expr> * <expr> | ( <expr> ) | <id> The statement: A = B + C * A has two distinct parse trees. By using Operator Precedence and Associativity of Operator, we solve the ambiguous problem.

Two distinct parse trees for the same sentence, A = B + C * A

5.1 Operator Precedence: An operator in an expression is generated lower in the parse tree (and therefore must be evaluated first) has precedence over an operator produced higher up in the tree. The left parse tree indicates The multiplication operator has precedence over the addition operator. The right parse tree indicates The addition operator has precedence over the multiplication operator. A grammar can be written to define different operators (the addition and multiplication) in a higher to lower ordering.

Derive the statement A = B + C * A by using the grammar An unambiguous Grammar for Simple Assignment Statements <assign>  <id> = <expr> <id>  A | B | C <expr>  <expr> + <term> | <term> <term>  <term> * <factor> | <factor> <factor>  ( <expr> ) | <id> Derive the statement A = B + C * A by using the grammar

The leftmost derivation: <assign> => <id> = <expr> => A = <expr> => A = <expr> + <term> => A = <term> + <term> => A = <factor> + <term> => A = <id> + <term> => A = B + <term> => A = B + <term> * <factor> => A = B + <factor> * <factor> => A = B + <id> * <factor> => A = B + C * <factor> => A = B + C * <id> => A = B + C * A

The rightmost derivation: <assign> => <id> = <expr> => <id> = <expr> + <term> => <id> = <expr> + <term> * <factor> => <id> = <expr> + <term> * <id> => <id> = <expr> + <term> * A => <id> = <expr> + <factor> * A => <id> = <expr> + <id> * A => <id> = <expr> + C * A => <id> = <term> + C * A => <id> = <factor> + C * A => <id> = <id> + C * A => <id> = B + C * A => A = B + C * A

A unique parse tree for A = B + C * A using an unambiguous grammar

5.2 Associativity of Operators: The parse trees for expressions with two or more adjacent occurrences of operators with equal precedence have those occurrences in proper hierarchical order. Example: Derive this statement A = B + C + A by using the grammar (the Leftmost) and generate its parse tree. This parse tree shows the left addition is lower than the right addition (the left associative) A = B + C + A

The leftmost derivation: <assign> => <id> = <expr> => A = <expr> => A = <expr> + <term> => A = <expr> + <term> + <term> => A = <term> + <term> + <term> => A = <factor> + <term> + <term> => A = <id> + <term> + <term> => A = B + <term> + <term> => A = B + <factor> + <term> => A = B + <id> + <term> => A = B + C + <term> => A = B + C + <factor> => A = B + C + <id> => A = B + C + A

A parse tree for A = (B + C) + A illustrating the left associativity of addition The leftmost derivation:

The rightmost derivation: <assign> => <id> = <expr> => <id> = <expr> + <term> => <id> = <expr> + <factor> => <id> = <expr> + <id> => <id> = <expr> + A => <id> = <expr> + <term> + A => <id> = <expr> + <factor> + A => <id> = <expr> + <id> + A => <id> = <expr> + C + A => <id> = <term> + C + A => <id> = <factor> + C + A => <id> = <id> + C + A => <id> = B + C + A => A = B + C + A

A parse tree for A = (B + C) + A illustrating the left associativity of addition The rightmost derivation:

Left recursive: when a BNF rule has its LHS also appearing at the beginning of its RHS. An unambiguous Grammar for Simple Assignment Statements <assign>  <id> = <expr> <id>  A | B | C <expr>  <expr> + <term> | <term> (Left associative) <term>  <term> * <factor> | <factor> (Left associative) <factor>  ( <expr> ) | <id>

Right recursive: when a BNF rule has its LHS also appearing at the end of its RHS. Rules for exponentiation as a right-associative operator <factor>  <exp> ** <factor> | <exp> (Right associative) <exp>  ( <exp> ) | <id>

5.3 An unambiguous grammar for if-then-else Rules for if-then-else <if_stmt>  if <logic_expr> then <stmt> | if <logic_expr> then <stmt> else <stmt> Deriving if <logic_expr> then if <logic_expr> then <stmt> else <stmt> by the rules above

Two distinct parse trees for the same sentential form

The unambiguous grammar for if-then-else Rules for if-then-else <stmt>  <matched> | <unmatched> <matched>  if <logic_expr> then <matched> else <matched> | any non-if statement <unmatched>  if <logic_expr> then <stmt> | if <logic_expr> then <matched> else <unmached> Deriving if <logic_expr> then if <logic_expr> then <stmt> else <stmt> by the rules above