Published by William Cole. Modified over 9 years ago.
1
Winter 2007, SEG2101 Chapter 7
Chapter 7: Introduction to Languages and Compilers
2
Contents
Computer architecture
Compiler
Grammars
Formal languages
Parse trees
Ambiguity
Regular expressions
3
Von Neumann Architecture
4
Compiler
A compiler is a program that reads a program written in one language (the source language) and translates it into an equivalent program in another language (the target language).
5
The Compilation Process
6
Grammars
A grammar is defined as a 4-tuple (Σ, N, P, S): the alphabet Σ, the nonterminals N, the productions P, and a goal symbol S.
Σ, N, and P are sets; S is a particular element of the set N.
7
Alphabets and Strings
Σ is the alphabet, or set of terminals. It is a finite set consisting of all the input characters or symbols that can be arranged to form sentences in the language.
English: A to Z plus, in our definition, punctuation and space symbols.
Programming language: usually some well-defined computer character set such as ASCII.
8
Alphabets and Strings (II)
A compiler is usually defined with two grammars.
The alphabet for the scanner grammar is ASCII or some subset of it.
The alphabet for the parser grammar is the set of tokens generated by the scanner, not ASCII at all.
9
An Example of Strings
Σ = {a, b, c, d}
Possible strings of terminals from Σ include aaa, aabbccdd, d, cba, abab, ccccccccccacccc, and so on.
10
Formal Languages
Σ: the alphabet, a finite set consisting of all input characters or symbols.
Σ*: the closure of the alphabet, the set of all possible strings over Σ, including the empty string ε.
A (formal) language is some specified subset of Σ*.
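The closure of an alphabet is infinite, so it can only be enumerated up to some length bound. The following is a minimal sketch of that idea, using the four-symbol alphabet from the earlier string example. The helper name `strings_up_to` and the sample language ("strings that start with a") are illustrative assumptions, not part of the slides.

```python
from itertools import product

def strings_up_to(alphabet, max_len):
    """Yield every string over `alphabet` of length 0..max_len, shortest first."""
    symbols = sorted(alphabet)
    for n in range(max_len + 1):
        # product(..., repeat=0) yields one empty tuple: the empty string
        for combo in product(symbols, repeat=n):
            yield "".join(combo)

SIGMA = {"a", "b", "c", "d"}

# A finite prefix of the closure: all strings of length 0, 1, and 2
prefix = list(strings_up_to(SIGMA, 2))

# A hypothetical language: the subset of strings that start with "a"
language = [s for s in prefix if s.startswith("a")]
```

A language is then just a membership rule applied to this set; the grammar slides that follow describe a more systematic way to specify such a rule.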
11
Nonterminals
The nonterminal set N is a finite set of symbols not in the alphabet.
A particular nonterminal, the goal symbol S, represents exactly all the strings in the language.
The goal symbol is also often called the start symbol because derivations start with it.
The set of terminals and the set of nonterminals, taken together, are called the vocabulary of the grammar.
12
Productions
The productions P of a grammar are a set of rewriting rules, each written as two strings of symbols separated by an arrow.
The symbols on each side of the arrow may be drawn from both terminals and nonterminals, subject to certain restrictions imposed by the form of the grammar.
13
An Example Grammar
G1 = ({a, b, c}, {A, B}, {A → aB, A → bB, A → cB, B → a, B → b, B → c}, A)
The grammar generates 9 two-letter strings: three choices for the first letter times three choices for the second.
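The claim that G1 generates nine strings can be checked mechanically. The sketch below encodes G1 as a 4-tuple and repeatedly rewrites the leftmost nonterminal until only terminals remain. The function name `generate` and the one-character-per-symbol encoding are assumptions made for brevity, and the loop only terminates for grammars with finitely many derivations, such as G1.

```python
from collections import deque

# G1 as a 4-tuple: alphabet, nonterminals, productions, goal symbol
SIGMA = {"a", "b", "c"}
N = {"A", "B"}
P = [("A", "aB"), ("A", "bB"), ("A", "cB"), ("B", "a"), ("B", "b"), ("B", "c")]
S = "A"

def generate(productions, nonterminals, start):
    """Return all sentences derivable from `start`, in sorted order.

    Assumes each symbol is a single character and that the grammar
    has finitely many derivations (otherwise this never returns).
    """
    sentences = set()
    queue = deque([start])
    while queue:
        form = queue.popleft()
        # Position of the leftmost nonterminal, or None if all terminals
        pos = next((i for i, ch in enumerate(form) if ch in nonterminals), None)
        if pos is None:
            sentences.add(form)
            continue
        for lhs, rhs in productions:
            if lhs == form[pos]:
                queue.append(form[:pos] + rhs + form[pos + 1:])
    return sorted(sentences)

G1_SENTENCES = generate(P, N, S)
```

Each sentence takes exactly two rewriting steps, for example A ⇒ aB ⇒ ab.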
14
Syntax and Semantics
Syntax: the syntax of a programming language is the form of its expressions, statements, and program units.
Semantics: the meaning of those expressions, statements, and program units.
If ( )
15
Sentences, Lexemes, Tokens
Sentences: the strings of a language are called sentences or statements.
Lexemes: the lexemes of a programming language include its identifiers, literals, operators, and special words.
Token: a token of a language is a category of its lexemes.
16
Lexeme and Token
Statement: index = 2 * count + 17;

Lexemes    Tokens
index      identifier
=          equal_sign
2          int_literal
*          multi_op
count      identifier
+          plus_op
17         int_literal
;          semicolon
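The lexeme-to-token mapping above can be reproduced with a small scanner. This is a sketch only: the token names come from the slide, but the regular expressions chosen for each token, the `skip` category for whitespace, and the function name `tokenize` are assumptions.

```python
import re

# (token name, pattern) pairs; the patterns are illustrative assumptions
TOKEN_SPEC = [
    ("identifier", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("int_literal", r"\d+"),
    ("equal_sign", r"="),
    ("multi_op", r"\*"),
    ("plus_op", r"\+"),
    ("semicolon", r";"),
    ("skip", r"\s+"),  # whitespace separates lexemes but is not a token
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(text):
    """Return a list of (lexeme, token) pairs for `text`."""
    pairs = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "skip":
            pairs.append((match.group(), match.lastgroup))
        pos = match.end()
    return pairs

for lexeme, token in tokenize("index = 2 * count + 17;"):
    print(f"{lexeme:<8}{token}")
```

The parser never sees the characters themselves, only the token column, which is why the parser grammar's alphabet is the set of tokens rather than ASCII.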
17
The Role of Grammars
The grammar of a language defines the correct form for sentences in that language.
Grammars are the formal language-generation mechanisms commonly used to describe the syntax of programming languages.
18
BNF: Backus-Naur Form
Backus presented a new formal notation for specifying programming language syntax.
Naur modified the notation slightly.
The result is known as Backus-Naur Form, or BNF.
BNF is a very natural notation for describing syntax.
The terms BNF and context-free grammar (or simply grammar) are used interchangeably.
19
BNF
Metalanguage: a language used to describe another language. BNF is a metalanguage for programming languages.
Abstraction: the symbol on the left-hand side of the arrow.
Definition: the text to the right of the arrow.
Rule (production): the abstraction and its definition together are called a rule.
20
BNF Description (A simple C assignment statement)
21
Nonterminal and Terminal
Nonterminal symbol: an abstraction in a BNF description or grammar.
Terminal symbol: the lexemes and tokens of the rules.
A BNF description or grammar is simply a collection of rules.
Nonterminals can have two or more distinct definitions.
Multiple definitions can be written as a single rule, with the different definitions separated by |, meaning logical OR.
if then | if then else
22
List of Syntactic Elements
BNF does not include an ellipsis (…) for describing lists.
BNF uses recursion instead.
A rule is recursive if its LHS appears in its RHS.
e.g., identifier | identifier,
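A recursive rule maps directly onto a recursive function. The sketch below recognizes comma-separated identifier lists under an assumed rule of the form ident_list → identifier | identifier , ident_list. The nonterminal name `ident_list`, the identifier pattern, and both function names are hypothetical, since the slide text does not preserve the rule's left-hand side.

```python
import re

IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def parse_ident_list(tokens, pos=0):
    """Return the position just past an ident_list starting at `pos`.

    Assumed rule: ident_list -> identifier | identifier , ident_list
    Raises SyntaxError if no identifier is found where one is required.
    """
    if pos >= len(tokens) or not IDENT.fullmatch(tokens[pos]):
        raise SyntaxError(f"identifier expected at token {pos}")
    pos += 1
    if pos < len(tokens) and tokens[pos] == ",":
        # Second alternative: the rule's own LHS appears again in its RHS
        return parse_ident_list(tokens, pos + 1)
    return pos  # First alternative: a single identifier

def is_ident_list(text):
    """True if the whole of `text` is a comma-separated identifier list."""
    tokens = re.findall(r"[A-Za-z_][A-Za-z0-9_]*|\S", text)
    try:
        return parse_ident_list(tokens) == len(tokens)
    except SyntaxError:
        return False
```

The recursive call plays the role that an ellipsis would play in an informal description: it lets one rule cover lists of any length.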
23
A Grammar
24
A Derivation of a Program
25
Another Grammar
26
A Derivation of a Statement
27
Parse Tree
Grammars naturally describe the hierarchical syntactic structure of the sentences of the languages they define.
These hierarchical structures are called parse trees.
28
Ambiguous Grammar
A grammar that generates a sentence for which there are two or more distinct parse trees is said to be ambiguous.
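One way to see ambiguity concretely is to count parse trees. The sketch below does this for a hypothetical expression grammar, E → E + E | E * E | id, chosen here for illustration; it is not necessarily the grammar shown on the slides. The function name `count_parse_trees` is likewise an assumption.

```python
from functools import lru_cache

def count_parse_trees(tokens):
    """Count distinct parse trees for `tokens` under the assumed grammar
    E -> E + E | E * E | id
    """
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of ways E can derive tokens[i:j]
        if j - i == 1:
            return 1 if tokens[i] == "id" else 0
        total = 0
        # Try every operator as the root of the tree for this span
        for k in range(i + 1, j - 1):
            if tokens[k] in ("+", "*"):
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))

# Two trees: one groups id + id first, the other groups id * id first
trees = count_parse_trees(["id", "+", "id", "*", "id"])
```

Any sentence with a count of two or more is enough to show that the grammar is ambiguous; a count of one for a particular sentence proves nothing about the grammar as a whole.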
29
Ambiguity
30
Regular Expressions
A regular expression is a method of describing strings.
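As a small illustration, the nine two-letter strings generated by the example grammar G1 can also be described by a single regular expression. The pattern `[abc][abc]` is an assumption chosen to match that language, and the check below simply filters a brute-force list of candidate strings.

```python
import re
from itertools import product

# Assumed pattern: one of a, b, c followed by one of a, b, c
TWO_LETTERS = re.compile(r"[abc][abc]")

# All candidate strings over {a, b, c} of length 0 to 3
candidates = [
    "".join(combo)
    for n in range(4)
    for combo in product("abc", repeat=n)
]

# fullmatch requires the entire string to fit the pattern
matches = [s for s in candidates if TWO_LETTERS.fullmatch(s)]
```

Using `fullmatch` rather than `search` matters here: the pattern is meant to describe whole strings, not substrings of longer ones.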