Syntax Analysis Sections 4.1-4.4


Syntax Analysis - Parsing
- An overview of parsing: Functions & Responsibilities
- Context Free Grammars: Concepts & Terminology
- Writing and Designing Grammars
- Resolving Grammar Problems / Difficulties
- Top-Down Parsing: Recursive Descent & Predictive LL
- Bottom-Up Parsing: LR & LALR
- Concluding Remarks / Looking Ahead

An Overview of Parsing
Why are grammars that formally describe languages important?
- They give precise, easy-to-understand representations.
- Compiler-writing tools can take a grammar and generate a parser from it.
- They allow a language to evolve (new statements, changes to statements, etc.). Languages are not static; they are constantly upgraded to add new features or fix "old" ones. Examples: ADA → ADA9x; C++ adds templates, exceptions, etc.
How do grammars relate to the parsing process?

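Parsing During Compilation
[Figure: source program → lexical analyzer → parser → rest of front end → intermediate representation. The parser asks the lexical analyzer to "get next token" and receives a token; the parser passes a parse tree to the rest of the front end. The lexical analyzer is driven by regular expressions, all three components consult the symbol table, and errors are reported.]
Parser:
- uses a grammar to check the structure of tokens
- produces a parse tree
- handles syntactic errors and recovery
- recognizes correct syntax
- reports errors
Rest of front end:
- also technically part of parsing
- includes augmenting info on tokens in source, type checking, semantic analysis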
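The "get next token" hand-off between lexical analyzer and parser can be sketched in Python with a generator. This is only an illustration under assumed names: the token classes, their regular expressions, and the Token type are hypothetical, chosen to fit a small assignment statement.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str
    text: str


# Hypothetical token classes, each described by a regular expression.
# REAL is listed before INTEGER so that "3.5" is not split at the dot.
TOKEN_SPEC = [
    ("REAL", r"\d+\.\d+"),
    ("INTEGER", r"\d+"),
    ("ID", r"[A-Za-z_]\w*"),
    ("ASSIGN", r":="),
    ("OP", r"[+\-]"),
    ("SEMI", r";"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokens(source: str) -> Iterator[Token]:
    """Yield tokens one at a time; a parser pulls them with next()."""
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"lexical error at position {pos}: {source[pos]!r}")
        pos = match.end()
        if match.lastgroup != "SKIP":  # whitespace never reaches the parser
            yield Token(match.lastgroup, match.group())


stream = tokens("x := y + 3.5 - 2;")
first = next(stream)                 # one "get next token" request
rest = [tok.kind for tok in stream]  # the remaining requests
```

Because the generator is lazy, the lexical analyzer does no work until the parser asks for a token, which mirrors the request/reply arrows in the figure.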

Parsing Responsibilities
Syntax Error Identification / Handling
Recall typical error types:
- Lexical: misspellings
- Syntactic: omission, wrong order of tokens
- Semantic: incompatible types
- Logical: infinite loop / recursive call
The majority of error processing occurs during syntax analysis.
NOTE: Not all errors are identifiable! Which ones?

Key Issues – Error Processing
- Detecting errors
- Finding the position at which they occur
- Clear / accurate presentation
- Recovering (passing over the error) to continue and find later errors
- Not impacting the compilation of "correct" programs

What are some Typical Errors?

    #include<stdio.h>
    int f1(int v)
    { int i,j=0;
      for (i=1;i<5;i++)
        { j=v+f2(i) }
      return j; }
    int f2(int u)
    { int j;
      j=u+f1(u*u)
    int main()
      for (i=1;i<10;i++)
        { j=j+i*I;
          printf("%d\n",i);
      printf("%d\n",f1(j));
      return 0;
    }

As reported by MS VC++:
- 'f2' undefined; assuming extern returning int
- syntax error : missing ';' before '}'
- syntax error : missing ';' before 'return'
- fatal error : unexpected end of file found
Which are "easy" to recover from? Which are "hard"?

Error Recovery Strategies
Panic Mode: discard tokens until a "synchro" (synchronizing) token is found (end, ";", "}", etc.)
- The choice of synchronizing tokens is the designer's decision
- Problems: skipping input can miss a declaration, causing more errors; errors in the skipped material are missed
- Advantages: simple; suited to one error per statement
Phrase Level: local correction on the input
- Example: "," found where ";" is expected: delete ",", insert ";"
- Also the designer's decision
- Not suited to all situations
- Used in conjunction with panic mode to allow less input to be skipped
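Panic mode can be sketched in a few lines of Python. The statement form (id := id ;), the token representation (plain strings), and the function name are assumptions made for illustration; a real parser would work over a full grammar.

```python
SYNC = {";", "}"}  # synchronizing tokens: a designer's choice


def parse_statements(toks):
    """Parse statements of the hypothetical form  id := id ;  with panic-mode recovery.

    Returns (number of statements accepted, list of error messages).
    """
    errors = []
    accepted = 0
    i = 0
    expected = ["id", ":=", "id", ";"]
    while i < len(toks):
        for want in expected:
            tok = toks[i] if i < len(toks) else "<eof>"
            ok = tok.isidentifier() if want == "id" else tok == want
            if not ok:
                errors.append(f"token {i}: expected {want}, found {tok}")
                # Panic mode: discard input up to and including a synchronizing token.
                while i < len(toks) and toks[i] not in SYNC:
                    i += 1
                i += 1
                break
            i += 1
        else:
            accepted += 1  # all four parts of the statement matched
    return accepted, errors


good_after_bad = parse_statements("a := b ; c d ; e := f ;".split())
missing_semicolon = parse_statements("a := b c := d ;".split())
```

In the second call the missing ";" makes the parser skip all of c := d ; while looking for a synchronizing token, so a whole statement goes unchecked. That is the "skipped input" problem listed above.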

Error Recovery Strategies – (2)
Error Productions:
- Augment the grammar with rules that describe common errors
- The augmented grammar is used for parser construction / generation
- Example: add a rule for := in C assignment statements; report the error but continue compiling
- Self correction + diagnostic messages
Global Correction:
- Adding / deleting / replacing symbols is chancy; it may make many changes!
- Algorithms are available to minimize the changes, but they are costly, which is the key issue
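The := example can be sketched as follows. This is a hand-written stand-in for an error production, not generated parser code; the statement shape (id = id ;), the diagnostic wording, and the function name are all assumptions.

```python
def parse_assignment(toks):
    """Parse  id = id ;  and, via an error production,  id := id ;

    Returns (statement, diagnostics). All names here are illustrative.
    """
    diagnostics = []
    well_shaped = (
        len(toks) == 4
        and toks[0].isidentifier()
        and toks[1] in ("=", ":=")   # ":=" is the extra, erroneous alternative
        and toks[2].isidentifier()
        and toks[3] == ";"
    )
    if not well_shaped:
        raise SyntaxError("not an assignment statement")
    if toks[1] == ":=":
        # Error production matched: report it, repair it, keep compiling.
        diagnostics.append("':=' is not C assignment; '=' assumed")
    return (toks[0], "=", toks[2]), diagnostics


pascal_style = parse_assignment("x := y ;".split())
c_style = parse_assignment("x = y ;".split())
```

Both calls produce the same statement; only the first carries a diagnostic, so compilation can continue with a precise message instead of a generic syntax error.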

Motivating Grammars
Regular Expressions
- Basis of lexical analysis
- Represent regular languages
Context Free Grammars
- Basis of parsing
- Represent language constructs
- Characterize context free languages
[Figure: the regular languages drawn inside the CFLs]
EXAMPLE: a^n b^n, n ≥ 1: Is it regular?
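For the example: a^n b^n (n ≥ 1) is not regular, since a finite automaton cannot count an unbounded number of a's, but the context free grammar S → a S b | a b generates it. A minimal Python sketch of a recognizer that mirrors that grammar (the function name is illustrative):

```python
def is_anbn(s: str) -> bool:
    """Recursive recognizer mirroring the grammar  S -> a S b | a b."""
    if s == "ab":                      # S -> a b
        return True
    # S -> a S b: peel one 'a' from the front and one 'b' from the back.
    return len(s) >= 4 and s[0] == "a" and s[-1] == "b" and is_anbn(s[1:-1])


checks = {w: is_anbn(w) for w in ["ab", "aaabbb", "aab", "abab", ""]}
```

The recursion depth plays the role of the counter that a finite automaton lacks.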

Context Free Grammars: Concepts & Terminology
Definition: A Context Free Grammar, CFG, is described by T, NT, S, PR, where:
- T: Terminals / tokens of the language
- NT: Non-terminals, which denote sets of strings generated by the grammar and in the language
- S: Start symbol, S ∈ NT, which defines all strings of the language
- PR: Production rules, which indicate how T and NT are combined to generate valid strings of the language. PR: NT → (T | NT)*
Like a Regular Expression / DFA / NFA, a Context Free Grammar is a mathematical model.
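The four components map directly onto a small data structure. The Python sketch below is one possible encoding, with hypothetical field names and a tiny made-up grammar; the check enforces S ∈ NT and the production shape NT → (T | NT)*.

```python
from typing import NamedTuple


class CFG(NamedTuple):
    terminals: frozenset      # T
    nonterminals: frozenset   # NT
    start: str                # S
    productions: tuple        # PR: pairs (lhs, rhs), rhs a tuple of symbols


def is_well_formed(g: CFG) -> bool:
    """Check S in NT and that every rule has the shape NT -> (T | NT)*."""
    symbols = g.terminals | g.nonterminals
    return (
        g.terminals.isdisjoint(g.nonterminals)
        and g.start in g.nonterminals
        and all(
            lhs in g.nonterminals and all(sym in symbols for sym in rhs)
            for lhs, rhs in g.productions
        )
    )


g = CFG(
    terminals=frozenset({"id", "+"}),
    nonterminals=frozenset({"expr", "term"}),
    start="expr",
    productions=(
        ("expr", ("expr", "+", "term")),
        ("expr", ("term",)),
        ("term", ("id",)),
    ),
)
broken = g._replace(start="id")  # a terminal cannot be the start symbol
```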

Context Free Grammars: A First Look
assign_stmt → id := expr ;
expr → expr operator term
expr → term
term → id
term → real
term → integer
operator → +
operator → -
What do "blue" symbols represent?
Derivation: A sequence of grammar rule applications and substitutions that transform a starting non-terminal into a sequence of terminals / tokens.
Simply stated: Grammars / production rules allow us to "rewrite" and "identify" correct syntax.

Derivation
Let's derive: id := id + real - integer ;

Production used                  Result
                                 assign_stmt
assign_stmt → id := expr ;       ⇒ id := expr ;
expr → expr operator term        ⇒ id := expr operator term ;
expr → expr operator term        ⇒ id := expr operator term operator term ;
expr → term                      ⇒ id := term operator term operator term ;
term → id                        ⇒ id := id operator term operator term ;
operator → +                     ⇒ id := id + term operator term ;
term → real                      ⇒ id := id + real operator term ;
operator → -                     ⇒ id := id + real - term ;
term → integer                   ⇒ id := id + real - integer ;
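Every step of this derivation rewrites the leftmost non-terminal, so it can be replayed mechanically. A minimal Python sketch, with a hypothetical helper name and the grammar symbols stored as plain strings:

```python
NONTERMINALS = {"assign_stmt", "expr", "term", "operator"}


def leftmost_step(form, lhs, rhs):
    """Replace the leftmost nonterminal of `form`, which must be `lhs`, with `rhs`."""
    for i, sym in enumerate(form):
        if sym in NONTERMINALS:
            if sym != lhs:
                raise ValueError(f"leftmost nonterminal is {sym}, not {lhs}")
            return form[:i] + rhs + form[i + 1:]
    raise ValueError("no nonterminal left to rewrite")


# The productions applied in the derivation, in order.
steps = [
    ("assign_stmt", "id := expr ;"),
    ("expr", "expr operator term"),
    ("expr", "expr operator term"),
    ("expr", "term"),
    ("term", "id"),
    ("operator", "+"),
    ("term", "real"),
    ("operator", "-"),
    ("term", "integer"),
]

form = ["assign_stmt"]
trace = []
for lhs, rhs in steps:
    form = leftmost_step(form, lhs, rhs.split())
    trace.append(" ".join(form))
```

After the loop, trace holds the nine intermediate strings of the derivation, ending with the token sequence being derived.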

Example Grammar
expr → expr op expr
expr → ( expr )
expr → - expr
expr → id
op → +
op → -
op → *
op → /
op → ↑
Black: NT. Blue: T. expr: S. 9 production rules.
To simplify / standardize notation, we offer a synopsis of terminology.

Example Grammar - Terminology
- Terminals: a, b, c, +, -, punc, 0, 1, …, 9, blue strings
- Non Terminals: A, B, C, S, black strings
- T or NT: X, Y, Z
- Strings of Terminals: u, v, …, z in T*
- Strings of T / NT: α, β in (T ∪ NT)*
- Alternatives of production rules: A → α1; A → α2; …; A → αk is written A → α1 | α2 | … | αk
- The first NT on the LHS of the 1st production rule is designated as the start symbol!
E → E A E | ( E ) | -E | id
A → + | - | * | / | ↑

Grammar Concepts
A step in a derivation is a single action that replaces a NT with the RHS of one of its production rules.
EXAMPLE: E ⇒ -E (the ⇒ means "derives" in one step), using the production rule E → -E
EXAMPLE: E ⇒ E A E ⇒ E * E ⇒ E * ( E )
DEFINITION:
- ⇒ derives in one step
- ⇒+ derives in ≥ one step
- ⇒* derives in ≥ zero steps
EXAMPLES:
- α A β ⇒ α γ β if A → γ is a production rule
- α1 ⇒ α2 ⇒ … ⇒ αn means α1 ⇒* αn
- α ⇒* α for all α
- If α ⇒* β and β ⇒ γ then α ⇒* γ
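The one-step relation ⇒ and its closure ⇒* can be sketched as a breadth-first search over sentential forms. In this Python sketch the dictionary encoding and function names are assumptions, and only the four arithmetic operators +, -, *, / are included for A to keep the search small. The length bound is sound here because no production in this grammar shrinks a form.

```python
from collections import deque

GRAMMAR = {
    "E": [["E", "A", "E"], ["(", "E", ")"], ["-", "E"], ["id"]],
    "A": [["+"], ["-"], ["*"], ["/"]],
}


def one_step(form):
    """All forms reachable from `form` (a tuple) by exactly one production application."""
    out = set()
    for i, sym in enumerate(form):
        for rhs in GRAMMAR.get(sym, []):
            out.add(form[:i] + tuple(rhs) + form[i + 1:])
    return out


def derives(src, dst):
    """True if src =>* dst (zero or more steps)."""
    seen, queue = {src}, deque([src])
    while queue:
        form = queue.popleft()
        if form == dst:
            return True
        for nxt in one_step(form):
            # Forms longer than dst can never shrink back to it.
            if len(nxt) <= len(dst) and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


slide_example = derives(("E",), ("E", "*", "(", "E", ")"))
zero_steps = derives(("E",), ("E",))
impossible = derives(("E",), ("id", "id"))
```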

How does this relate to Languages?
Let G be a CFG with start symbol S. The set of all W with S ⇒+ W (where W has no non-terminals) is the language generated by G, denoted L(G).
So W ∈ L(G) ⇔ S ⇒+ W.
W is a sentence of G.
When S ⇒* α (and α may have NTs), α is called a sentential form of G.
EXAMPLE: id * id is a sentence. Here's the derivation:
E ⇒ E A E ⇒ E * E ⇒ id * E ⇒ id * id, so E ⇒* id * id
Every string in this derivation is a sentential form.
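The distinction between sentences and sentential forms can be made concrete by enumerating a small part of L(G). The Python sketch below is an illustration under assumptions: the grammar encoding and function names are hypothetical, only the operators +, -, *, / are included for A, and the search is cut off at three tokens.

```python
from collections import deque

GRAMMAR = {
    "E": [["E", "A", "E"], ["(", "E", ")"], ["-", "E"], ["id"]],
    "A": [["+"], ["-"], ["*"], ["/"]],
}


def is_sentence(form):
    """A sentential form with no nonterminals left is a sentence."""
    return all(sym not in GRAMMAR for sym in form)


def sentences(start="E", max_len=3):
    """Sentences of L(G) with at most max_len tokens, found by breadth-first derivation."""
    seen, queue, found = {(start,)}, deque([(start,)]), set()
    while queue:
        form = queue.popleft()
        if is_sentence(form):
            found.add(" ".join(form))
            continue
        # Expand only the leftmost nonterminal: every sentence has a leftmost derivation.
        i = next(k for k, sym in enumerate(form) if sym in GRAMMAR)
        for rhs in GRAMMAR[form[i]]:
            nxt = form[:i] + tuple(rhs) + form[i + 1:]
            # No production shrinks a form, so over-long forms can be pruned.
            if len(nxt) <= max_len and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return found


short_sentences = sentences()
```

Forms such as id * E are visited during the search but never reported, because they still contain a non-terminal: they are sentential forms, not sentences.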