Programming Languages An Introduction to Grammars Oct 18th 2002.


Context Free Grammars
- One or more non-terminal symbols
  - Lexically distinguished, e.g. upper case
- Terminal symbols are the actual characters of the language
  - Or, in practice, they can be tokens
- One non-terminal is the distinguished start symbol

Grammar Rules
- Non-terminal ::= sequence
- Where sequence can be made of non-terminals or terminals
- At least some rules must have ONLY terminals on the right side

Example of Grammar
- S ::= (S)
- S ::= <S>
- S ::= empty
- This is the language D2, the language of two kinds of balanced parens
- E.g. ((<>))
- Well, not quite D2, since that should allow things like (())<>

Example, continued
- So add the rule
  - S ::= SS
- And that is indeed D2
- But this is ambiguous
- ()<>() can be parsed two ways
  - ()<> is an S and () is an S
  - () is an S and <>() is an S
- Nothing wrong with ambiguous grammars
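The ambiguity above can be made concrete with a minimal sketch. It assumes a stack-based recognizer for D2 rather than a parser driven by the grammar itself, and the names `is_d2` and `top_level_splits` are hypothetical helpers introduced only for this illustration. The second function lists every way a string can be cut into two non-empty D2 strings, which is exactly the choice the rule S ::= SS leaves open.

```python
# Closing bracket -> the opening bracket it must match.
PAIRS = {")": "(", ">": "<"}

def is_d2(s):
    """Return True if s is a balanced string over the two bracket kinds () and <>."""
    stack = []
    for ch in s:
        if ch in "(<":
            stack.append(ch)
        elif ch in PAIRS:
            # A closer must match the most recently opened bracket.
            if not stack or stack.pop() != PAIRS[ch]:
                return False
        else:
            return False  # any other character is not in the alphabet
    return not stack

def top_level_splits(s):
    """List every way to cut s into two non-empty D2 strings (the S ::= SS choice)."""
    return [(s[:i], s[i:]) for i in range(1, len(s))
            if is_d2(s[:i]) and is_d2(s[i:])]

print(top_level_splits("()<>()"))
```

For `()<>()` this reports two splits, matching the two readings listed on the slide. A string such as `((<>))` has no split at all, so the rule S ::= SS plays no part at its top level.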

BNF (Backus Naur/Normal Form)
- Properly attributed to Sanskrit scholars
- An extension of CFG with
  - Optional constructs in []
  - Sequences {} = 0 or more
  - Alternation |
- All these are just shorthands

BNF Shorthands
Each shorthand rule stands for the plain rules listed under it.
- IF ::= if EXPR then STM [else STM] fi
  - IF ::= if EXPR then STM fi
  - IF ::= if EXPR then STM else STM fi
- STM ::= IF | WHILE
  - STM ::= IF
  - STM ::= WHILE
- STMSEQ ::= STM {;STM}
  - STMSEQ ::= STM
  - STMSEQ ::= STM ; STMSEQ

Programming Language Syntax
- Expressed as a CFG where the grammar is closely related to the semantics
- For example
  - EXPR ::= PRIMARY {OP PRIMARY}
  - OP ::= + | *
- Not good, better is
  - EXPR ::= TERM | EXPR + TERM
  - TERM ::= PRIMARY | TERM * PRIMARY
- This implies associativity (both operators group to the left) and precedence (* binds tighter than +)
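How the EXPR/TERM grammar encodes precedence and associativity can be seen in a small recursive-descent sketch. It makes several assumptions that are not in the slides: PRIMARY is taken to be an unsigned integer literal, the left-recursive rules are implemented as loops (a standard way to keep left associativity without infinite recursion), and trees are plain nested tuples. All function names are illustrative.

```python
import re

def tokenize(text):
    """Split text into integer and operator tokens; anything else (e.g. spaces) is skipped."""
    return re.findall(r"[0-9]+|[+*]", text)

def parse_primary(tokens, pos):
    # Assumption: PRIMARY is an integer literal.
    if pos >= len(tokens) or not tokens[pos].isdigit():
        raise SyntaxError(f"expected a primary at token {pos}")
    return int(tokens[pos]), pos + 1

def parse_term(tokens, pos):
    # TERM ::= PRIMARY | TERM * PRIMARY, with the left recursion turned into a loop.
    tree, pos = parse_primary(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "*":
        right, pos = parse_primary(tokens, pos + 1)
        tree = ("*", tree, right)  # earlier operands end up deeper: left associative
    return tree, pos

def parse_expr(tokens, pos=0):
    # EXPR ::= TERM | EXPR + TERM, handled the same way.
    tree, pos = parse_term(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "+":
        right, pos = parse_term(tokens, pos + 1)
        tree = ("+", tree, right)
    return tree, pos

def parse(text):
    tokens = tokenize(text)
    tree, pos = parse_expr(tokens)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree

print(parse("1+2*3"))
print(parse("1+2+3"))
```

Because a TERM can only contain `*`, the product in `1+2*3` is built as a unit before the sum sees it, and because each loop folds new operands onto the existing tree, `1+2+3` groups as `(1+2)+3`.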

PL Syntax Continued
- No point in using BNF for tokens, since no semantics involved
  - ID ::= LETTER | LETTER ID
- This is actively confusing, since the BC of ABC is not an identifier, and anyway there is no tree structure here
- Better to regard ID as a terminal symbol. In other words, the grammar is a grammar of tokens, not characters
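Treating ID as a terminal usually means a separate lexical pass hands whole identifiers to the parser. The following is a minimal sketch of such a pass, assuming identifiers are runs of letters (as in the ID rule above) and that the only other tokens are `+` and `*`. The token names and the `tokens` function are hypothetical.

```python
import re

# One pattern per token kind; leading whitespace is skipped before each token.
TOKEN_RE = re.compile(r"\s*(?:(?P<ID>[A-Za-z]+)|(?P<OP>[+*]))")

def tokens(text):
    """Turn a character string into a list of (kind, spelling) tokens."""
    text = text.rstrip()
    pos, out = 0, []
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        kind = m.lastgroup  # name of the alternative that matched
        out.append((kind, m.group(kind)))
        pos = m.end()
    return out

print(tokens("ABC + X"))
```

The parser then sees `ABC` as a single ID token, so the question of whether `BC` is an identifier never arises.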

Grammars and Trees
- A grammar with a start symbol naturally indicates a tree representation of the program
- The non-terminal on the left side of a rule is the root of a tree node
- The symbols on the right-hand side are its children
- The leaves, read left to right, are the terminals that give the tokens of the program
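The relationship between a parse tree and the token sequence can be sketched in a few lines. The representation is an assumption made for illustration: an interior node is a pair of a non-terminal name and a list of children, and a leaf is a plain string. The sample tree uses the EXPR/TERM rules from the earlier slide and assumes that PRIMARY can derive a single identifier token.

```python
def leaves(node):
    """Collect the terminals of a parse tree from left to right."""
    if isinstance(node, str):
        return [node]  # a leaf is a terminal
    _label, children = node
    out = []
    for child in children:
        out.extend(leaves(child))
    return out

# Parse tree for a + b * c under
#   EXPR ::= TERM | EXPR + TERM
#   TERM ::= PRIMARY | TERM * PRIMARY
tree = ("EXPR", [
    ("EXPR", [("TERM", [("PRIMARY", ["a"])])]),
    "+",
    ("TERM", [
        ("TERM", [("PRIMARY", ["b"])]),
        "*",
        ("PRIMARY", ["c"]),
    ]),
])

print(leaves(tree))
```

Reading the leaves back gives the original token sequence, while the interior nodes record the grouping that the grammar imposed on it.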

The Parsing Problem
- Given a grammar of tokens
- And a sequence of tokens
- Construct the corresponding parse tree
- Giving good error messages

General Parsing
- Not known to be easier than matrix multiplication
- Cubic, or more properly n**k, where k is the matrix-multiplication exponent (whatever that unlikely constant is)
- In practice almost always linear
- In any case not a significant amount of time
- Hardest part by far is to give good messages
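The cubic bound can be illustrated with the CYK algorithm, which is one well-known general method and not necessarily the one the slides have in mind. The sketch below is a recognizer only. It assumes the grammar has been put in Chomsky normal form, and the rule tables describe non-empty D2 strings using the hypothetical helper non-terminals `L`, `R`, `M`, `Q`, `A` and `B`.

```python
# Chomsky-normal-form rules for non-empty D2 strings (helper names are illustrative).
UNARY = [("L", "("), ("R", ")"), ("M", "<"), ("Q", ">")]
BINARY = [
    ("S", "S", "S"),                   # S ::= SS
    ("S", "L", "R"), ("S", "L", "A"),  # S ::= ( ) | ( S )
    ("A", "S", "R"),
    ("S", "M", "Q"), ("S", "M", "B"),  # S ::= < > | < S >
    ("B", "S", "Q"),
]

def cyk(tokens, unary=UNARY, binary=BINARY, start="S"):
    """Return True if the start symbol derives the token sequence."""
    n = len(tokens)
    if n == 0:
        return False  # this CNF grammar has no empty rule
    # table[i][length] holds the non-terminals deriving tokens[i:i+length].
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, tok in enumerate(tokens):
        table[i][1] = {lhs for lhs, term in unary if term == tok}
    for length in range(2, n + 1):            # span length
        for i in range(n - length + 1):       # span start
            for split in range(1, length):    # split point inside the span
                left = table[i][split]
                right = table[i + split][length - split]
                for lhs, b, c in binary:
                    if b in left and c in right:
                        table[i][length].add(lhs)
    return start in table[0][n]

print(cyk("(())<>"), cyk("(<)>"))
```

The three nested loops over span length, start position and split point are where the cubic growth in the input length comes from; the grammar size only contributes a constant factor.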

Regular Grammars
- Also called type 3
- At most one non-terminal on the right side
- Right side is either
  - Terminal
  - Terminal Non-terminal
- Correspond to regular expressions
- Used to express grammar of tokens
  - Real_Literal ::= {digit}* . {digit}* [E [+|-] integer_literal]
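Since regular grammars correspond to regular expressions, the Real_Literal rule can be transcribed almost symbol for symbol. This is a sketch under one assumption: `integer_literal`, which the slide does not define, is taken to be one or more decimal digits. The names `REAL_LITERAL` and `is_real_literal` are illustrative.

```python
import re

# {digit}* . {digit}* [E [+|-] integer_literal]
# Assumption: integer_literal is one or more decimal digits.
REAL_LITERAL = re.compile(r"[0-9]*\.[0-9]*(?:E[+-]?[0-9]+)?")

def is_real_literal(text):
    """Return True if the whole of text matches the Real_Literal rule."""
    return REAL_LITERAL.fullmatch(text) is not None

for sample in ["3.14", "1.5E-10", "2.", "1E5", "."]:
    print(sample, is_real_literal(sample))
```

Transcribed literally, the rule accepts a lone `.` because both digit sequences may be empty, and it rejects `1E5` because the point is mandatory. A real language definition would likely require a digit on at least one side.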

Context Sensitive Grammars
- Also called type 1
- Left side can contain multiple symbols
  - JUNK digit => digit JUNK
- But the right side cannot be shorter than the left side
- Can express powerful rules
- But hard to parse

Type 0 Grammars
- No restrictions on left/right sides
- Arbitrary mixture of terminals/non-terminals on either side
- Very powerful (Turing powerful)
- Impossible to parse in general

In Practice …
- Context free grammars used to express syntax
- Other rules (formal English, VDL) used to express static semantics
- Possible to do more in grammars
  - E.g. Algol-68 W-grammars
- But too difficult to deal with