Published by Darrell Knight. Modified over 9 years ago.
Parser: CFG, BNF

Backus-Naur Form (BNF) is a notational variant of Context-Free Grammar (CFG).
– Invented to specify the syntax of ALGOL in the late 1950s
– Uses ::= to indicate rewrite

CFG/BNF benefits:
– precise and comprehensible specifications of syntactic structure
– parser construction can be automated
– automation has grammar-debugging benefits
– facilitates translation into code
– languages can be extended over time

http://csiweb.ucd.ie/staff/acater/comp30330.html Compiler Construction
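To make the "::= indicates rewrite" idea concrete, here is a minimal Python sketch of a grammar stored as rewrite rules and applied step by step. The toy grammar S ::= a S b | ε and the helper names `rewrite_leftmost` and `derive` are assumptions chosen for illustration; they do not come from the slides.

```python
# Hypothetical toy grammar: S ::= a S b | ε
# Each nonterminal maps to a list of alternatives; each alternative is a
# list of symbols, and the empty list stands for ε.
GRAMMAR = {
    "S": [["a", "S", "b"], []],
}


def rewrite_leftmost(form, grammar, alt_index):
    """Replace the leftmost nonterminal in `form` by one of its alternatives."""
    for i, symbol in enumerate(form):
        if symbol in grammar:
            return form[:i] + grammar[symbol][alt_index] + form[i + 1:]
    raise ValueError("no nonterminal left to rewrite")


def derive(start, grammar, choices):
    """Return every sentential form produced by applying `choices` in order."""
    form = [start]
    steps = [form]
    for alt_index in choices:
        form = rewrite_leftmost(form, grammar, alt_index)
        steps.append(form)
    return steps


if __name__ == "__main__":
    for step in derive("S", GRAMMAR, [0, 0, 1]):
        print(" ".join(step) or "ε")
```

Each step of the derivation applies one rule, which is all that ::= asserts: the left-hand symbol may be rewritten as any one of the right-hand alternatives.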

Left-to-right parsers

A [compiler’s] parser identifies how the lexemes of a sentence [correct program] conform to a grammar [the syntactic rules of the language].
– This may result in the creation of an explicit syntax tree.
– Constructing an actual tree is not necessary, however; the tree may instead be implicit.

Efficient parsers may operate either top-down (predictively) or bottom-up, but they always operate left-to-right.

Certain subclasses of CFG, especially ‘LL’ and ‘LR’ grammars,
– are expressive enough for most constructs in programming languages
– can form the basis of very efficient parsers (low computational complexity)
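As a rough illustration of a top-down, predictive, left-to-right parser whose syntax tree stays implicit, the following Python sketch recognises sums and products of `id` tokens. The small LL(1) grammar in the comments, the `Recognizer` class, and the `accepts` helper are assumptions made for this example. No tree is built; the nesting of method calls plays the role of the tree.

```python
class ParseError(Exception):
    """Raised when the next token cannot continue any rule."""


class Recognizer:
    # Assumed grammar (LL(1)):
    #   E  ::= T E'      E' ::= + T E' | ε
    #   T  ::= F T'      T' ::= * F T' | ε
    #   F  ::= ( E ) | id
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]  # "$" marks end of input
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def expect(self, token):
        if self.peek() != token:
            raise ParseError(
                f"expected {token!r}, found {self.peek()!r} at token {self.pos}"
            )
        self.pos += 1

    def expr(self):
        self.term()
        # E' is tail-recursive, so it is written here as a loop.
        while self.peek() == "+":
            self.expect("+")
            self.term()

    def term(self):
        self.factor()
        # T' is handled the same way as E'.
        while self.peek() == "*":
            self.expect("*")
            self.factor()

    def factor(self):
        # One token of lookahead is enough to choose the alternative.
        if self.peek() == "(":
            self.expect("(")
            self.expr()
            self.expect(")")
        else:
            self.expect("id")


def accepts(tokens):
    """Return True if the whole token list is a sentence of the grammar."""
    recognizer = Recognizer(tokens)
    try:
        recognizer.expr()
        recognizer.expect("$")
        return True
    except ParseError:
        return False
```

The input is consumed strictly left to right, and each decision looks only at the next token, which is what makes this style of parser cheap to run.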

Simple expression grammar

E ::= E + T | T
T ::= T * F | F
F ::= ( E ) | id

This grammar for expressions, terms & factors
– can be extended to have extra operators and levels of precedence
– is interesting compared to, say, ‘While …’ or ‘Real …’ because the first lexeme of an expression is not a total giveaway of its syntactic structure
– is LR and so suitable for bottom-up parsing
– is not suitable for top-down parsing because it has left-recursive rules; it needs to be transformed:

E ::= T E’
E’ ::= + T E’ | ε
T ::= F T’
T’ ::= * F T’ | ε
F ::= ( E ) | id

– unambiguously describes the same language as

E ::= E + E | E * E | ( E ) | id
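The transformation from the left-recursive grammar to the top-down-friendly one can be sketched in Python as follows. This handles immediate left recursion only (rules of the form A ::= A α | β) and assumes that no non-recursive alternative is ε and that the primed names are not already in use. The function name and the dictionary representation are illustrative choices, and the code writes the prime as an ASCII apostrophe.

```python
EPSILON = "ε"


def eliminate_immediate_left_recursion(grammar):
    """Rewrite A ::= A α | β  as  A ::= β A'  and  A' ::= α A' | ε.

    `grammar` maps each nonterminal to a list of alternatives, where each
    alternative is a list of symbols.
    """
    result = {}
    for head, alternatives in grammar.items():
        # α parts: what follows the head in each left-recursive alternative.
        recursive = [alt[1:] for alt in alternatives if alt and alt[0] == head]
        # β parts: alternatives that do not start with the head.
        others = [alt for alt in alternatives if not alt or alt[0] != head]
        if not recursive:
            result[head] = [list(alt) for alt in alternatives]
            continue
        tail = head + "'"
        result[head] = [alt + [tail] for alt in others]
        result[tail] = [alt + [tail] for alt in recursive] + [[EPSILON]]
    return result


LEFT_RECURSIVE = {
    "E": [["E", "+", "T"], ["T"]],
    "T": [["T", "*", "F"], ["F"]],
    "F": [["(", "E", ")"], ["id"]],
}

if __name__ == "__main__":
    for head, alts in eliminate_immediate_left_recursion(LEFT_RECURSIVE).items():
        print(head, "::=", " | ".join(" ".join(alt) for alt in alts))
```

Applied to the E/T/F grammar, this yields the E, E', T, T', F rules. F is copied unchanged because it has no left-recursive alternative.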

Error handling

Parsers are also given incorrect programs, so error handling is necessary.
Language specifications usually say nothing about how to handle errors.

Compiler designers choose among:
– panic mode: advance to the next “synchronising” token with a clear role, such as ‘;’
– phrase-level recovery: edits of token strings, cf. lexer and character strings
– error productions: provide specialised diagnostics for common errors
– global correction: smallest set of edits; a very costly and rather pointless approach

Error types: lexical, syntactic, semantic, logical

Some parsing methods (e.g. LL and LR methods) have the viable-prefix property, i.e. they detect an error as soon as they see a prefix that cannot legally be completed.
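Panic-mode recovery can be sketched in a few lines of Python. The statement form `id = id ;`, the function name `parse_statements`, and the message format are assumptions made for this example; the only idea taken from the slide is that, after an error, the parser discards tokens until it has passed the next synchronising ‘;’ and then carries on.

```python
SYNC = ";"  # synchronising token


def parse_statements(tokens):
    """Check statements of the assumed form  id = id ;  with panic-mode recovery.

    Returns (number of well-formed statements, list of error messages).
    """
    expected = ["id", "=", "id", ";"]
    pos, well_formed, errors = 0, 0, []
    while pos < len(tokens):
        for want in expected:
            found = tokens[pos] if pos < len(tokens) else "<end>"
            if found != want:
                errors.append(f"token {pos}: expected {want!r}, found {found!r}")
                # Panic mode: discard tokens up to the next ';' ...
                while pos < len(tokens) and tokens[pos] != SYNC:
                    pos += 1
                pos += 1  # ... and step past it (or past the end of input)
                break
            pos += 1
        else:
            # All four tokens matched without an error.
            well_formed += 1
    return well_formed, errors


if __name__ == "__main__":
    sample = ["id", "=", "id", ";", "id", "id", ";", "id", "=", "id", ";"]
    print(parse_statements(sample))
```

One bad statement produces one diagnostic and the statements after it are still checked, which is the main attraction of panic mode despite its crudeness.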

Notational conventions in Dragon Book grammars

Terminals: early lowercase letters; operators except ‘|’; punctuation; digits; boldface strings
– a b c = ; , . 0 1 9 id begin while

Nonterminals: early uppercase letters; italic lowercase names; S is usually the ‘start symbol’
– A B C expr paramlist S

Single symbols, either terminal or nonterminal: late uppercase letters
– U V W X Y Z

Strings of terminals: late lowercase letters
– u v w x y z

Strings of symbols, possibly empty, either terminal or nonterminal: Greek lowercase letters
– α β γ δ θ ρ σ (but not ε, which denotes the empty string)