Syntax The Structure of a Language. Lexical Structure The structure of the tokens of a programming language The scanner takes a sequence of characters.

Presentation transcript:

Syntax The Structure of a Language

Lexical Structure The structure of the tokens of a programming language The scanner takes a sequence of characters and collects them into tokens

Tokens Reserved words (keywords) –if while Literals or constants –3.14 “Fred” Special symbols –+ = Identifiers

Principle of Longest Substring At each point, the longest possible string is collected into a single token Natural token separators –Token separators ; + = –White space Spaces and tabs Newlines Comments
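A minimal sketch of the longest-substring rule in Python. The token classes, the keyword set, and the function name `scan` are assumptions made for illustration and do not come from the slides. Python's `re` alternation tries alternatives in order rather than choosing the longest one, so this sketch relies on greedy quantifiers and on listing two-character symbols before their one-character prefixes.

```python
import re

# Hypothetical token classes for a tiny C-like language (illustrative only).
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("SYMBOL", r"<=|>=|==|[-+*/=<>;()]"),  # two-character symbols listed first
    ("SKIP", r"\s+"),                       # white space only separates tokens
]
KEYWORDS = {"if", "while"}
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def scan(text):
    """Collect characters into (kind, lexeme) tokens, longest match first."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        kind, lexeme = m.lastgroup, m.group()
        pos = m.end()
        if kind == "SKIP":
            continue
        # A keyword is recognized only when it is the whole identifier.
        if kind == "IDENT" and lexeme in KEYWORDS:
            kind = "KEYWORD"
        tokens.append((kind, lexeme))
    return tokens
```

With this sketch, `scan("ifx <= 3.14;")` yields one identifier `ifx` rather than the keyword `if` followed by `x`, and `<=` is one symbol rather than `<` followed by `=`.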

FORTRAN violates these rules DO 99 I = 1.10 –Assigns 1.10 to the variable DO99I DO 99 I = 1,10 –Sets up a loop with loop counter I going from 1 to 10 FORTRAN has no reserved words at all

C token conventions Six classes of tokens –Identifiers –Keywords –Constants –String literals –Operators –Other separators White space characters are ignored except as they separate tokens Adheres to the principle of longest substring

Regular Expressions Regular expressions were invented by Stephen Kleene and appeared in a Rand Corporation report in about 1950 Regular expressions represent a form of language definition Each regular expression E denotes a language L(E) defined over the alphabet of the language

Rules defining REs Empty –ε is an RE Atom –Any symbol from the alphabet is an RE Alternation –If a and b are REs then so is a|b –All strings identified by a and all those identified by b Concatenation –If a and b are REs then so is ab –All strings formed by concatenating a string identified by b to the end of one identified by a

More rules for REs Kleene Closure –If a is an RE then so is a* –All strings formed by concatenating zero or more strings identified by a Positive Closure –If a is an RE then so is a+ –All strings formed by concatenating one or more strings identified by a

Examples of REs (a|b)c –Recognizes ac and bc but no others (a|b)*c –Recognizes c ac bc aac abc abac (a|b)+c –Does not recognize c but all the others above
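The three example expressions can be checked with Python's `re` module, whose syntax for alternation, `*`, and `+` matches the notation used here. This is a small sketch; the helper name `recognizes` and the variable names are illustrative. `re.fullmatch` is used because an RE recognizes a string only when the whole string matches.

```python
import re


def recognizes(pattern, s):
    """True when the whole string s is in the language of pattern."""
    return re.fullmatch(pattern, s) is not None


samples = ["c", "ac", "bc", "aac", "abc", "abac"]

exactly_one = [s for s in samples if recognizes(r"(a|b)c", s)]
zero_or_more = [s for s in samples if recognizes(r"(a|b)*c", s)]
one_or_more = [s for s in samples if recognizes(r"(a|b)+c", s)]
```

Here `exactly_one` holds only `ac` and `bc`, `zero_or_more` holds all six samples, and `one_or_more` holds every sample except `c`.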

Extensions
[ ] – any one of a set of characters
–[A-Z] – any capital letter
–[0-9] – any digit
? – an optional item (0 or 1 of these)
–[A-Z][0-9]? – a single capital letter, or a single capital letter followed by a single digit
. (period) – any character

More Examples [0-9]+ –Simple integer constants [0-9]+(\.[0-9]+)? –Simple floating-point constants
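The extended notation carries over directly to Python's `re` module. The sketch below is illustrative only: the constant names and the helper `full` are assumptions, and the floating-point pattern is written to allow one or more digits after the decimal point.

```python
import re

LETTER_OPT_DIGIT = re.compile(r"[A-Z][0-9]?")  # capital letter, optional digit
INTEGER = re.compile(r"[0-9]+")                # one or more digits
FLOAT = re.compile(r"[0-9]+(\.[0-9]+)?")       # digits, optional fractional part


def full(pattern, s):
    """True when the compiled pattern matches the entire string s."""
    return pattern.fullmatch(s) is not None


checks = {
    "A": full(LETTER_OPT_DIGIT, "A"),
    "A7": full(LETTER_OPT_DIGIT, "A7"),
    "A77": full(LETTER_OPT_DIGIT, "A77"),
    "2024": full(INTEGER, "2024"),
    "3.14": full(FLOAT, "3.14"),
    "3.": full(FLOAT, "3."),
}
```

Only `A77` and `3.` are rejected: the first has two digits where at most one is allowed, and the second has a decimal point with no digit after it.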

Context-Free Grammars (CFGs) Context-free grammars were developed by Noam Chomsky as a way to specify language Rules are generally specified in Backus-Naur Form (BNF) or in Extended BNF (EBNF)

What makes up a CFG? A set N of non-terminal symbols A set T of terminal symbols A set P of production rules A special non-terminal symbol S called the start symbol (or goal symbol)

Sample CFG
sentence → noun-phrase verb-phrase .
noun-phrase → article noun
article → a | the
noun → girl | dog
verb-phrase → verb noun-phrase
verb → sees | pets

Parts of the grammar Non-terminal symbols: {sentence, noun-phrase, article, noun, verb-phrase, verb} Terminal symbols: {., a, the, girl, dog, sees, pets} Production rules: the previous slide provides these Start symbol: sentence

Notes on CFG Non-terminal symbols are those that appear on the left-hand side (lhs) of the production rules Terminal symbols are those that appear only on the right-hand side (rhs) of the production rules → and | are meta-symbols
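The four parts of a CFG map naturally onto a small data structure. The following Python sketch encodes the sample sentence grammar as a dictionary (this representation and the names `GRAMMAR`, `START`, `nonterminals`, and `terminals` are assumptions for illustration) and derives the two symbol sets using the rule stated above: non-terminals are the left-hand sides, and terminals are the symbols that appear only on right-hand sides.

```python
# Each non-terminal maps to its list of alternatives; each alternative is a
# list of symbols. The meta-symbol | becomes the separation between list items.
GRAMMAR = {
    "sentence": [["noun-phrase", "verb-phrase", "."]],
    "noun-phrase": [["article", "noun"]],
    "article": [["a"], ["the"]],
    "noun": [["girl"], ["dog"]],
    "verb-phrase": [["verb", "noun-phrase"]],
    "verb": [["sees"], ["pets"]],
}
START = "sentence"

# Non-terminals: every symbol that appears on a left-hand side.
nonterminals = set(GRAMMAR)

# Terminals: symbols that appear on some right-hand side but never on the left.
rhs_symbols = {sym for alts in GRAMMAR.values() for rhs in alts for sym in rhs}
terminals = rhs_symbols - nonterminals
```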

(Left-Most) Derivation
sentence ⇒ noun-phrase verb-phrase .
⇒ article noun verb-phrase .
⇒ the noun verb-phrase .
⇒ the girl verb-phrase .
⇒ the girl verb noun-phrase .
⇒ the girl sees noun-phrase .
⇒ the girl sees article noun .
⇒ the girl sees a noun .
⇒ the girl sees a dog .
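A left-most derivation can be reproduced mechanically: at each step, find the left-most non-terminal in the current sentential form and replace it with one of its alternatives. The Python sketch below does this for the sample grammar. The dictionary encoding, the function name `leftmost_derivation`, and the idea of passing the chosen alternatives as a list of indices are all assumptions made for illustration.

```python
GRAMMAR = {
    "sentence": [["noun-phrase", "verb-phrase", "."]],
    "noun-phrase": [["article", "noun"]],
    "article": [["a"], ["the"]],
    "noun": [["girl"], ["dog"]],
    "verb-phrase": [["verb", "noun-phrase"]],
    "verb": [["sees"], ["pets"]],
}


def leftmost_derivation(start, choices):
    """Return every sentential form produced by expanding the left-most
    non-terminal with the alternative index given in choices, in order."""
    form = [start]
    steps = [" ".join(form)]
    for choice in choices:
        # Index of the left-most symbol that is still a non-terminal.
        i = next(k for k, sym in enumerate(form) if sym in GRAMMAR)
        form[i:i + 1] = GRAMMAR[form[i]][choice]
        steps.append(" ".join(form))
    return steps


# Alternative indices that yield "the girl sees a dog ."
steps = leftmost_derivation("sentence", [0, 0, 1, 0, 0, 0, 0, 0, 1])
```

The resulting `steps` list has ten entries, from `sentence` through `the girl sees a dog .`, one for each sentential form in the derivation.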

Corresponding Parse Tree
sentence
  noun-phrase
    article: the
    noun: girl
  verb-phrase
    verb: sees
    noun-phrase
      article: a
      noun: dog
  .

Ambiguous Grammars A grammar is ambiguous if some sentence has two distinct left-most derivations or, equivalently, two distinct parse trees

Grammar for expressions
expr → expr + expr | expr * expr | ( expr ) | number
number → number digit | digit
digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

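Parse trees for 3 + 5 * 7
(Figure: two parse trees. In one, + is at the root and the subtree for 5 * 7 is its right operand, grouping the input as 3 + (5 * 7). In the other, * is at the root and the subtree for 3 + 5 is its left operand, grouping the input as (3 + 5) * 7. In both trees each operand is derived as expr → number → digit.)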
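The ambiguity can be made concrete by enumerating every parse tree that the rules expr → expr + expr | expr * expr | number allow for a token list. The Python sketch below is illustrative: the function names are assumptions, parenthesized sub-expressions are left out for brevity, and the input 3 + 5 * 7 is taken from the operands 3, 5, and 7 in the slide's trees. The input is assumed to alternate numbers and operators.

```python
def parse_trees(tokens):
    """All parse trees for tokens under expr -> expr + expr | expr * expr | number.
    A tree is either a number string or a tuple (operator, left, right)."""
    if len(tokens) == 1:
        return [tokens[0]]
    trees = []
    # Any operator (at an odd index) may sit at the root of the tree.
    for i in range(1, len(tokens), 2):
        for left in parse_trees(tokens[:i]):
            for right in parse_trees(tokens[i + 1:]):
                trees.append((tokens[i], left, right))
    return trees


def evaluate(tree):
    """Compute the value that a given parse tree assigns to the expression."""
    if isinstance(tree, str):
        return int(tree)
    op, left, right = tree
    if op == "+":
        return evaluate(left) + evaluate(right)
    return evaluate(left) * evaluate(right)


trees = parse_trees(["3", "+", "5", "*", "7"])
values = [evaluate(t) for t in trees]
```

Two trees come back, and they evaluate to 38 and 56 respectively, which is why the choice of parse tree matters and not only whether the string is accepted.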

Handling Ambiguity The grammar rules for expressions can be modified to eliminate the ambiguity that precedence should take care of Introduce a new non-terminal that forces the higher-precedence operator lower in the parse tree

Precedence handled
expr → expr + expr | term
term → term * term | ( expr ) | number
number → number digit | digit
digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

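Associativity This grammar is still ambiguous There are two parse trees for any expression that applies the same operator twice in a row, because expr + expr and term * term can group either to the left or to the right This may be ok for addition & multiplication, but not for subtraction & division, which are left-associative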

Revised Grammar (not ambiguous)
expr → expr + term | term
term → term * factor | factor
factor → ( expr ) | number
number → number digit | digit
digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

EBNFs Extended BNF adds more metasymbols { } – a repeated item (0 or more times) [ ] – an optional item (0 or 1 time)

Expression Grammar in EBNF
expr → term { + term }
term → factor { * factor }
factor → ( expr ) | number
number → digit { digit }
digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

EBNF for if-statement
if-statement → if ( expression ) statement [ else statement ]

Syntax Diagrams Syntax diagrams are an alternative to EBNF Study the diagrams on pp and observe the direct relationship of each to the EBNF grammar rules for expressions

Parsers The simplest parser is a recognizer Accepts or rejects strings based on whether they are legal strings in the language More general parsers Build parse trees (or abstract syntax trees) May calculate values of expressions, etc.

Bottom-up Parsers Attempts to match the input with the RHSs of the grammar rules When a match occurs, the RHS is replaced by the non-terminal on the LHS of the rule (called a reduce) Sometimes called shift-reduce parsing
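The shift and reduce steps can be sketched in Python for the sample sentence grammar. This is a deliberately naive recognizer, not a production technique: it reduces greedily whenever the top of the stack matches some RHS, which happens to work for this small conflict-free grammar, whereas practical shift-reduce parsers (such as LR parsers) consult tables and lookahead to decide between shifting and reducing. The names `RULES` and `shift_reduce_recognize` are assumptions.

```python
RULES = [
    ("sentence", ("noun-phrase", "verb-phrase", ".")),
    ("noun-phrase", ("article", "noun")),
    ("verb-phrase", ("verb", "noun-phrase")),
    ("article", ("a",)),
    ("article", ("the",)),
    ("noun", ("girl",)),
    ("noun", ("dog",)),
    ("verb", ("sees",)),
    ("verb", ("pets",)),
]


def shift_reduce_recognize(words):
    """Return (accepted, trace) for a list of input words."""
    stack, trace = [], []
    for word in words:
        stack.append(word)                 # shift the next input token
        trace.append(("shift", word))
        reduced = True
        while reduced:                     # reduce for as long as any RHS matches
            reduced = False
            for lhs, rhs in RULES:
                n = len(rhs)
                if tuple(stack[-n:]) == rhs:
                    stack[-n:] = [lhs]     # replace the RHS by the LHS non-terminal
                    trace.append(("reduce", lhs))
                    reduced = True
                    break
    return stack == ["sentence"], trace


accepted, trace = shift_reduce_recognize("the girl sees a dog .".split())
rejected, _ = shift_reduce_recognize("girl the sees a dog .".split())
```

The input is accepted when the stack ends up holding only the start symbol. Reading the reduce entries of `trace` in reverse gives a right-most derivation of the sentence.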

Top-down Parsers Non-terminals are expanded to match incoming tokens and the parser directly constructs a derivation

Recursive-Descent Parsing A program made up of a collection of mutually recursive procedures, one for each non-terminal.
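A minimal recursive-descent sketch in Python for the EBNF expression grammar (expr, term, factor), with one method per non-terminal. Each EBNF repetition { ... } becomes a `while` loop. The class and function names, the tokenizer, and the decision to compute a value rather than build a tree are assumptions made for illustration, and the number and digit rules are handled by the tokenizer rather than by separate procedures.

```python
import re


class Parser:
    def __init__(self, text):
        # Numbers become single tokens; every other non-space character is
        # its own token, so illegal characters surface as syntax errors.
        self.tokens = re.findall(r"\d+|\S", text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expr(self):
        # expr -> term { + term }
        value = self.term()
        while self.peek() == "+":
            self.pos += 1
            value += self.term()
        return value

    def term(self):
        # term -> factor { * factor }
        value = self.factor()
        while self.peek() == "*":
            self.pos += 1
            value *= self.factor()
        return value

    def factor(self):
        # factor -> ( expr ) | number
        tok = self.peek()
        if tok == "(":
            self.pos += 1
            value = self.expr()
            if self.peek() != ")":
                raise SyntaxError("expected ')'")
            self.pos += 1
            return value
        if tok is not None and tok.isdecimal():
            self.pos += 1
            return int(tok)
        raise SyntaxError(f"unexpected token {tok!r}")


def evaluate_expression(text):
    """Parse text starting from expr and require that all input is consumed."""
    parser = Parser(text)
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value
```

Because `term` is called from inside `expr`, multiplication ends up lower in the implied parse tree and binds more tightly: `evaluate_expression("3 + 5 * 7")` gives 38, while `evaluate_expression("(3 + 5) * 7")` gives 56.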