Software II: Principles of Programming Languages

Slides:



Advertisements
Similar presentations
ICE1341 Programming Languages Spring 2005 Lecture #5 Lecture #5 In-Young Ko iko.AT. icu.ac.kr iko.AT. icu.ac.kr Information and Communications University.
Advertisements

Chapter 2 Syntax. Syntax The syntax of a programming language specifies the structure of the language The lexical structure specifies how words can be.
ISBN Chapter 3 Describing Syntax and Semantics.
CSE 3302 Programming Languages Chengkai Li, Weimin He Spring 2008 Syntax Lecture 2 - Syntax, Spring CSE3302 Programming Languages, UT-Arlington ©Chengkai.
CS 330 Programming Languages 09 / 13 / 2007 Instructor: Michael Eckmann.
Chapter 3 Describing Syntax and Semantics Sections 1-3.
PZ02A - Language translation
Chapter 3 Describing Syntax and Semantics Sections 1-3.
Slide 1 Chapter 2-b Syntax, Semantics. Slide 2 Syntax, Semantics - Definition The syntax of a programming language is the form of its expressions, statements.
Chapter 3 Program translation1 Chapt. 3 Language Translation Syntax and Semantics Translation phases Formal translation models.
Chapter 3 Describing Syntax and Semantics Sections 1-3.
Dr. Muhammed Al-Mulhem 1ICS ICS 535 Design and Implementation of Programming Languages Part 1 Fundamentals (Chapter 4) Compilers and Syntax.
Chapter 3: Formal Translation Models
COP4020 Programming Languages
ISBN Chapter 3 Describing Syntax and Semantics.
S YNTAX. Outline Programming Language Specification Lexical Structure of PLs Syntactic Structure of PLs Context-Free Grammar / BNF Parse Trees Abstract.
(2.1) Grammars  Definitions  Grammars  Backus-Naur Form  Derivation – terminology – trees  Grammars and ambiguity  Simple example  Grammar hierarchies.
Chapter 2 Syntax A language that is simple to parse for the compiler is also simple to parse for the human programmer. N. Wirth.
Describing Syntax and Semantics
1 Syntax and Semantics The Purpose of Syntax Problem of Describing Syntax Formal Methods of Describing Syntax Derivations and Parse Trees Sebesta Chapter.
Syntax & Semantic Introduction Organization of Language Description Abstract Syntax Formal Syntax The Way of Writing Grammars Formal Semantic.
CS 355 – PROGRAMMING LANGUAGES Dr. X. Topics Introduction The General Problem of Describing Syntax Formal Methods of Describing Syntax.
Winter 2007SEG2101 Chapter 71 Chapter 7 Introduction to Languages and Compiler.
CS 326 Programming Languages, Concepts and Implementation Instructor: Mircea Nicolescu Lecture 2.
Syntax and Backus Naur Form
1 Chapter 3 Describing Syntax and Semantics. 3.1 Introduction Providing a concise yet understandable description of a programming language is difficult.
Syntax: 10/18/2015IT 3271 Semantics: Describe the structures of programs Describe the meaning of programs Programming Languages (formal languages) -- How.
CS Describing Syntax CS 3360 Spring 2012 Sec Adapted from Addison Wesley’s lecture notes (Copyright © 2004 Pearson Addison Wesley)
Grammars CPSC 5135.
PART I: overview material
3-1 Chapter 3: Describing Syntax and Semantics Introduction Terminology Formal Methods of Describing Syntax Attribute Grammars – Static Semantics Describing.
ProgrammingLanguages Programming Languages Language Syntax This lecture introduces the the lexical structure of programming languages; the context-free.
C H A P T E R TWO Syntax and Semantic.
ISBN Chapter 3 Describing Semantics -Attribute Grammars -Dynamic Semantics.
ISBN Chapter 3 Describing Syntax and Semantics.
Course: ICS313 Fundamentals of Programming Languages. Instructor: Abdul Wahid Wali Lecturer, College of Computer Science and Engineering.
TextBook Concepts of Programming Languages, Robert W. Sebesta, (10th edition), Addison-Wesley Publishing Company CSCI18 - Concepts of Programming languages.
March 5, ICE 1341 – Programming Languages (Lecture #4) In-Young Ko Programming Languages (ICE 1341) Lecture #4 Programming Languages (ICE 1341)
CPS 506 Comparative Programming Languages Syntax Specification.
D Goforth COSC Translating High Level Languages.
Chapter 3 Part II Describing Syntax and Semantics.
Chapter 3 Describing Syntax and Semantics. Chapter 3: Describing Syntax and Semantics - Introduction - The General Problem of Describing Syntax - Formal.
C HAPTER 3 Describing Syntax and Semantics. T OPICS Introduction The General Problem of Describing Syntax Formal Methods of Describing Syntax Attribute.
Copyright © 2006 Addison-Wesley. All rights reserved. Ambiguity in Grammars A grammar is ambiguous if and only if it generates a sentential form that has.
D Goforth COSC Translating High Level Languages Note error in assignment 1: #4 - refer to Example grammar 3.4, p. 126.
Chapter 3 Describing Syntax and Semantics
Syntax The Structure of a Language. Lexical Structure The structure of the tokens of a programming language The scanner takes a sequence of characters.
1 Language translation Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Sections
ISBN Chapter 3 Describing Syntax and Semantics.
1 / 48 Formal a Language Theory and Describing Semantics Principles of Programming Languages 4.
Syntax and Grammars.
Syntax and Semantics Form and Meaning of Programming Languages Copyright © by Curt Hill.
C H A P T E R T W O Syntax and Semantic. 2 Introduction Who must use language definitions? Other language designers Implementors Programmers (the users.
Copyright © 2006 Addison-Wesley. All rights reserved.1-1 ICS 410: Programming Languages Chapter 3 : Describing Syntax and Semantics Syntax.
BNF A CFL Metalanguage Some Variations Particular View to SLK Copyright © 2015 – Curt Hill.
Chapter 3 – Describing Syntax CSCE 343. Syntax vs. Semantics Syntax: The form or structure of the expressions, statements, and program units. Semantics:
Chapter 3 – Describing Syntax
Describing Syntax and Semantics
PROGRAMMING LANGUAGES
Describing Syntax and Semantics
Chapter 3 – Describing Syntax
Language translation Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Sections
Chapter 3 Describing Syntax and Semantics.
Language translation Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Sections
Language translation Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Sections
Language translation Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Sections
Language translation Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Sections
Language translation Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Sections
Language translation Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Sections
COMPILER CONSTRUCTION
Presentation transcript:

Software II: Principles of Programming Languages Lecture 3 – Formal Descriptions of a Programming Language 1

Lexics vs. Syntax Vs. Semantics Lexics refers to issues regarding the assembly of words that comprise a statement. Syntax refers to issues regarding the grammar of a statement. Semantics refers to issues regarding the meaning of a statement. 2

Lexical Structure of Programming Languages It was believed in the early days of programming language development that it was sufficient to be able specify the syntax of a programming language. We now know that this is not enough. This led to the development of context-free grammars and Backus-Naur Form. 3

Programming Language Syntax Syntax is defined as “the arrangement of words as elements in a sentence to show their relationship.” Syntax provides a great deal of information that we need to understand a program and to guide its translation. Example 2 + 3 ´ 4 = 14 (not 20 – multiplication takes precedence) 4

Programming Language Semantics Semantics is defined as “the meaning of a symbol or set of symbols.” This is equally important in translating a programming correctly and may be more difficult to express unambiguously. 5

Tokens The lexical structure of program consists of sequence of characters that are assembled into character strings called lexemes which have directly related to tokens, the element of a languages grammar to which they correspond. Tokens fall into several distinct categories: reserved words literals or constants special symbols such as < = + identifiers, such as x24, average, balance 6

Reserved Words and Standard Identifiers Reserved words serve a special purpose within the syntax of a language; for this reason, they are generally not allowed to be used as user-defined identifiers. Reserved words are sometimes confused with standard identifiers, which are identifiers defined by the language, but serve no special syntactic purpose. The standard data types are standard identifiers in Pascal and Ada. 7

Free- and Fixed-Field Formats Fixed-field format is a holdover from the day of punch cards A fixed field syntax uses the positioning within a line to convey information. E g., FORTRAN, COBOL and RPG use fixed-field formats. SNOBOL4 uses the first character on the line to distinguish between statement labels, continuations and comments Free-field formats allow program statements to be written anywhere on a line without regard to position on the line or to line breaks. 17 8

Delimiting Lexemes Most languages work with lexemes of differing length; this could create problems. If the input is doif is the lexeme doif or are there two lexemes do and if? The easiest way to handle this is to use the principle of longest substring, i.e., the longest possible string is the lexeme. As a result, we typically use white space as a delimiter separating lexemes in a source file. 9

Scanning FORTRAN FORTRAN breaks many of the rules of lexical analysis FORTRAN ignores white space, which leads to: DO 99 I = 1, 10 vs DO 99 I = 1.10 FORTRAN allows keywords to be defined as variables: IF = 2 IF (IF.LT.0) IF = IF + 1 ELSE IF = IF + 2 10

Regular Expressions The lexemes of a programming languages are described formally by the use of regular expressions, where there are 3 operations, concatentation, repetition and selection: a|b denotes a or b. ab denotes a followed by b (ab)* denotes a followed by b zero or more times (a|b)c denotes a or b followed by c 11

Extending Regular Expressions There are other operators that we can add to regular expression notations that make them easier to write: [a-z] any character from a through z r+ one or more occurrences of r ? An optional term . Any one character Examples [0-9]+ describes an integer [0-9]+(\.[0-9]+)? describes an unsigned real 12

What Is A Grammar? The grammar of a language is expressed formally as G = (T, N, S, P) where T is a set of terminals (the basic, atomic symbols of a language). N is a set of nonterminals (symbols which denote particular arrangements of terminals). S is the start symbol (a special nonterminal which denotes the program as a whole). P is the set of productions (rules showing how terminals and nonterminal can be arranged to form other nonterminals. 18

An Example Of A Grammar? We can describe the manner in which sentences in English are composed: 1. sentence ® noun-phrase verb-phrase . 2. noun-phrase ® article noun 3. article ® a | the 4. noun ® girl | dog 5. verb-phrase ® verb noun-phrase 6. verb ® sees | pets Start symbol Non- terminals Terminals 19

Parsing A Sentence Let’s examine the sentence “the girl sees a dog. sentence Þ noun-phrase verb-phrase . (Rule 1) sentence Þ article noun verb-phrase . (Rule 2) sentence Þ the noun verb-phrase . (Rule 3) sentence Þ the girl verb-phrase . (Rule 4) sentence Þ the girl verb noun-phrase . (Rule 5) sentence Þ the girl sees noun-phrase . (Rule 6) sentence Þ the girl sees article noun . (Rule 2) sentence Þ the girl sees a noun . (Rule 3) sentence Þ the girl sees a dog . (Rule 3) 20

Context –Free Grammars and BNFs Context-Free grammars are grammars where non-terminals (collections of tokens in a language) always are deconstructed the same way, regardless of the context in which they are used. BNF (Backus-Naur form) is the standard notation or metalanguage used to specify the grammar of the language. 4 21

Backus-Naur Form BNF (Backus-Naur Form) is a metalanguage for describing a context-free grammar. The symbol ::= (or → ) is used for may derive. The symbol | separates alternative strings on the right-hand side. Example E ::= E + T | T T ::= T * F | F F ::= id | constant | (E) where E is Expression, T is Term, and F is Factor 22

Syntax We can use BNF to specify the syntax of a programming language, and determine if we have a syntactically correct program. Syntactic correctness does not mean that a program is semantically correct. We could write: The home/ ran/ girl and recognize that this is nonsensical even if the grammar is correct. A language is a set of finite-length strings with characters chosen from the language’s alphabet. This includes the set of all programs written in <fill in you favorite programming language> . 47 23

Grammar For Simple Assignment Statements <assignment statement> ::= <variable> = <arithmetic expression> <arithmetic expression> ::=<term> | <arithmetic expression> + <term> | <arithmetic expression> - <term> <term> ::= <primary> | <term> ´ <primary> | <term> / <primary> <primary> ::= <variable> | <number> | (<arithmetic expression>) <variable> ::= <identifier> | <identifier> [<subscript list>] <subscript list> ::= <arithmetic expression> | <subscript list>, <arithmetic expression> 48 24

Generating Strings S ® SS | ( S ) | ( ) To generate strings that belong to our language, we use a single-replacement rule: the generation of strings in our language all begin with a single symbol which we replace with the right-hand side of a production: S ® SS | ( S ) | ( ) We can generate the string: ( ( ) ( ) ) S Þ( S ) Þ ( S S ) Þ ( ( ) S ) Þ ( ( ) ( ) ) sentential forms sentence 50 25

Parse Tree For (()()) S ( S ) S S ( ) ( ) 26

Parse Tree for “The girl sees a dog” sentence noun-phrase verb-phrase . article noun verb noun-phrase article noun the girl sees a dog 27

Parse Tree for an Assignment Statement nonterminals <arithmetic expression> <variable> = <term> <primary> ´ <term> <identifier> ( ) <arithmetic expression> <primary> <arithmetic expression> + <term> W <variable> <term> <primary> <primary> <variable> <identifier> <variable> <identifier> <identifier> Y U V terminals 51 28

Using BNF To Specify Regular Expressions We can also use BNF to specify how we assemble the words that comprise our language: <digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 <unsigned integer> ::= <digit> | <unsigned integer> <digit> These strings are much simple that the ones that comprise programs and are called regular expressions. 49 29

Example - Another Expression Grammar Let’s take a look at another simple expression grammar: E ® E + T | T T ® T * F | F F ® (E) | number Let’s parse the expressions 3 + 4 + 5 and 3 + (4 + 5) 30

Expression Parse Trees + + T T E + T E E + + T T F T F T T F number F E ( ) number number number number E + T T number number 31

Abstract Syntax Trees for Expression + + + + number number number number number number 32

Parsing 3 + 4 * 5 E E + T T T * F F F number number number 33

Parsing (3 + 4) * 5 E T T * F F ( E ) E + T F T F number number number 34

Parse Tree For If-Then-Else if-statement if ( expression ) statement else statement if-statement ® if ( expression ) statement else statement 35

Abstract Syntax Tree For If-Then-Else if-statement expression statement statement if-statement ® if ( expression ) statement else statement 36

Extended Backus-Naur Form EBNF (Extended Backus-Naur Form) adds a few additional metasymbols whose main advantage is replacing recursion with iteration. {a} means that a is occur zero or more times. [a] means that a appears once or not at all. Example Our expression grammar can become: E ::= T { + T } T ::= F { * F } F ::= id | constant | (E) 37

Left and right derivations Remember our grammar: S ::= A B c A ::= a A | b B ::= A b | a How do we parse the string abbbc? S S A B c A B c a A A b Left derivation Right derivation 38

Ambiguity Any grammar that accurately describes the language is equally valid. Sometimes, there may be more than one way to parse a program correctly. If this is the case, the grammar is said to be ambiguous. They /are flying / planes. They are/ flying planes. Ambiguity (which is NOT desirable) is usually a property of the grammar and not of the language itself. 53 39

Ambiguous grammars While there may be an infinite number of grammars that describe a given language, their parse trees may be very different. A grammar capable of producing two different parse trees for the same sentence is called ambiguous. Ambiguous grammars are highly undesireable. 40

Is it IF-THEN or IF-THEN-ELSE? The IF-THEN=ELSE ambiguity is a classical example of an ambiguous grammar. Statement ::= if Expression then Statement else Statement | if Expression then Statement How would you parse the following string? IF x > 0 THEN IF y > 0 THEN z := x + y ELSE z := x; 41

Is it IF-THEN or IF-THEN-ELSE? (continued) There are two possible parse trees: Statement if Expression then Statement if Expression then Statement else Statement Statement if Expression then Statement else Statement if Expression then Statement 42

Is it IF-THEN or IF-THEN-ELSE? (continued) Statement ::= if Expression then Statement ElseClause ElseClause ::= else Statement | e Statement if Expression then Statement ElseClause e if Expression then Statement ElseClause else Statement 43

Ambiguous Languages If every grammar in a language is ambiguous, we say that the language is inherently ambiguous. If we have two grammars: G1: S ® SS | 0 | 1 G2: T ® 0T | 1T | 0 | 1 G1 is ambiguous; G2 is not; therefore the language is NOT inherently ambiguous 54 44

Ambiguity in Grammars T S S T S S S S S S T S S 1 1 1 1 1 1 G1 : Ambiguous Grammar G2: Unambiguous Grammar 55 45

Syntax Charts for Simple Assignment Statements variable = arithmetic expression arithmetic expression term + - 59 46

Syntax Charts for Simple Assignment Statements (continued) variable [ ] identifier subscript list subscript list arithmetic expression , 58 47

Syntax Charts for Simple Assignment Statements (continued) term primary ´ / 60 48

Syntax Charts for Simple Assignment Statements (continued) primary variable number ( arithmetic expression ) 61 49

What is an Attribute Grammar? An attribute grammar is an extension to a context-free grammar that is used to describe features of a programming language that cannot be described in BNF or can only be described in BNF with great difficulty. Examples Describing the rule that float variables can be assigned integer values but the reverse is not true is difficult to describe complete in BNF. The rule requiring that all variable must be declared before being used is impossible to describe in BNF. 50

Static vs. Dynamic Semantics The static semantics of a language is indirectly related to the meaning of programs during execution. Its names comes from the fact that these specifications can be checked at compile time. Dynamic semantics refers to meaning of expressions, statements and other program units. Unlike static semantics, these cannot be checked at compile time and can only be checked at runtime. 51

What is an Attribute? An attribute is a property whose value is assigned to a grammar symbol. Attribute computation functions (or semantic functions) are associated with the productions of a grammar and are used to compute the values of an attribute. Predicate functions state some of the syntax and static semantics rules of the grammar. 52

Definition of an Attribute Grammar An attribute grammar is a grammar with the following added features: Each symbol X has a set of attributes A(X). A(X) has two disjoint subsets: S(X), synthesized attributes, which are passed up the parse tree I(X), inherited attributes which are passed down the parse tree Each production of the grammar has a set of semantic functions and a set of predicate functions (which may be an empty set). Intrinsic attributes are synthesized attributes who properties are found outside the grammar (e.g., symbol table) 53

An Attribute Grammar for Assignments Syntax rule: <assign> ® <var> = <expr> Semantic rule: <expr>.expected_type ¬ <var>.actual_type Syntax rule: <expr> ®<var>[2]+ <var>[3] Semantic rule: <expr>.actual_type ¬ if (<var>[2].actual_type = int) and if (<var>[3].actual_type = int) then int else real end if Predicate: <expr>.actual_type = <expr>.expected_type Syntax rule: <expr> ®<var> Semantic rule: <expr>.actual_type ¬ <var>.actual_type 4. Syntax rule: <var> ® A | B | C Semantic rule: <var>actual_type ¬ look-up(<var>.string) 54

Parse Tree for A = A + B <assign> <var> = <expr> 55

Derivation of Attributes <assign> actual_type <var> = <expr> expected_type actual_type <var>[2] + <var>[3] actual_type actual_type A A B 56

Fully Attributed Parse Tree <assign> expected_type = real actual_type = real <var> = <expr> expected_type actual_type = int <var>[2] + <var>[3] expected_type = real actual_type = real expected_type = real actual_type = real A A B 57