LANGUAGE AND GRAMMARS © University of LiverpoolCOMP 319slide 1.



Contents
- Languages and Grammars
- Formal languages
- Formal grammars
- Generative grammars
- Analytic grammars
- Context-free grammars
- LL parsers
- LR parsers
- Rewrite systems
- L-systems

Software Engineering Foundation
Software engineering concerns the construction of programs to solve problems. It has three parts:
- Construction/engineering, and methods
- Problems, and problem solving
- Programs

Languages and grammar
- Languages are spoken and written (linguistics)
- To be effective they must be based on a shared set of rules: a grammar
- Grammars are introspective: they are based on and couched in language
- Natural language grammars are constantly shifting and locally negotiated
- A grammar is a formal language whose subject and aim are the rules of discourse

Formal language concepts
- The concept emerges from the need to define rules (for language)
- Formally, formal languages are collections of words composed of smaller, atomic units
- Issues of concern are:
  - the number and nature of the atomic units
  - the precision level required
  - the completeness of the formalism

Examples of formal languages
- The set of all words over {a, b}
- The set {a^n : n is a prime number}
- The set of syntactically correct programs in a given computer programming language
- The set of inputs upon which a certain Turing machine halts

Formal language specification
There are many ways in which a formal language can be specified, e.g.
- strings produced by a formal grammar
- strings matched by regular expressions
- strings accepted by automata
- logic and other formalisms
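As a small illustration of the point above, the same language can be specified in two of the listed ways. The sketch below assumes an example language that is not in the slides (words over {a, b} containing an even number of a's) and specifies it once as a regular expression and once as a two-state automaton. The names `accepts_regex`, `accepts_dfa` and `words` are hypothetical helpers.

```python
import re
from itertools import product

# Specification 1: a regular expression for "even number of a's"
# (the language itself is an assumed example, not taken from the slides).
EVEN_AS = re.compile(r"(b*ab*a)*b*")

def accepts_regex(word: str) -> bool:
    return EVEN_AS.fullmatch(word) is not None

# Specification 2: a deterministic finite automaton with two states.
DFA = {
    ("even", "a"): "odd",
    ("even", "b"): "even",
    ("odd", "a"): "even",
    ("odd", "b"): "odd",
}

def accepts_dfa(word: str) -> bool:
    state = "even"                  # start state, also the accepting state
    for symbol in word:
        state = DFA[(state, symbol)]
    return state == "even"

def words(alphabet: str, max_len: int):
    """Enumerate every word over the alphabet up to max_len symbols."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

# The two specifications agree on every word up to length 6.
agree = all(accepts_regex(w) == accepts_dfa(w) for w in words("ab", 6))
```

Checking agreement on short words is only a sanity test, not a proof that the two specifications define the same language.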

Language Production Operations
- Concatenation of strings drawn from the two languages
- Intersection (strings common to both languages) or union (strings in either language)
- Complement of one language
- Right quotient of one by the other
- Kleene star operation on one language
- Reverse of a language
- Shuffle combination of languages
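For finite languages these operations can be tried out directly with Python sets. The following is a minimal sketch under that assumption: real formal languages are usually infinite, so the Kleene star and the complement are cut off at a chosen maximum length here. All function names are illustrative.

```python
from itertools import product

def concat(l1, l2):
    """Every string of l1 followed by every string of l2."""
    return {u + v for u in l1 for v in l2}

def reverse(lang):
    return {w[::-1] for w in lang}

def right_quotient(l1, l2):
    """All u such that u + v is in l1 for some v in l2."""
    return {w[:len(w) - len(v)] for w in l1 for v in l2 if w.endswith(v)}

def kleene_star(lang, max_len):
    """Zero or more concatenations, truncated at max_len symbols."""
    result = {""}
    frontier = {""}
    while frontier:
        frontier = {u + v for u in frontier for v in lang
                    if len(u + v) <= max_len} - result
        result |= frontier
    return result

def complement(lang, alphabet, max_len):
    """Complement relative to all words up to max_len symbols."""
    universe = {"".join(t) for n in range(max_len + 1)
                for t in product(alphabet, repeat=n)}
    return universe - lang

def shuffle(u, v):
    """All interleavings of two words that keep each word's own order."""
    if not u or not v:
        return {u + v}
    return ({u[0] + w for w in shuffle(u[1:], v)}
            | {v[0] + w for w in shuffle(u, v[1:])})

L1 = {"a", "ab"}
L2 = {"b", "ab"}
union, intersection = L1 | L2, L1 & L2
```

Union and intersection need no helper because they are the ordinary set operations.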

Formal Grammars
Noam Chomsky
- Linguist, philosopher at MIT
- 1956, papers on information and grammar
Types of formal grammar
- Generative grammar
- Analytical grammar

Generative formal grammars
Generative grammars: a set of rules by which all possible strings in the language being described can be generated, by successively rewriting strings starting from a designated start symbol.
In effect a generative grammar formalises an algorithm that generates the strings in the language.

Analytic formal grammars
Analytic grammars: a set of rules that takes an arbitrary string as input and successively reduces or analyses that string to yield a final boolean "yes/no", indicating whether the string is a member of the language described by the grammar.
In effect an analytic grammar is a parser or recogniser for a language.
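A recogniser in this sense can be very small. The sketch below assumes an example language, { a^n b^n : n >= 0 }, and reduces the input step by step until it can answer yes or no. The function name `is_member` is hypothetical.

```python
def is_member(word: str) -> bool:
    """Recogniser for { a^n b^n : n >= 0 } (an assumed example language)."""
    # Reduction step: strip one leading 'a' together with one trailing 'b'.
    while word.startswith("a") and word.endswith("b"):
        word = word[1:-1]
    # The input is in the language exactly when it reduces to the empty string.
    return word == ""
```

For instance, `is_member("aabb")` reduces "aabb" to "ab" and then to the empty string, while "abab" reduces to "ba" and is rejected.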

Generative grammar components
Chomsky's definition, essentially for linguistics but perfect for formal computing grammars, consists of the following components:
- A finite set N of nonterminal symbols
- A finite set Σ of terminal symbols, disjoint from N
- A finite set P of production rules, where a rule is of the form: string in (Σ ∪ N)* → string in (Σ ∪ N)*
- A symbol S in N that is identified as the start symbol

Generative grammar definition
The language of a formal grammar G = (N, Σ, P, S) is denoted by L(G).
It is defined as all those strings over Σ that can be generated by starting from the symbol S and then applying the rules in P until no more nonterminal symbols are present.

A generative formal grammar
Given the terminals {a, b} and the nonterminals {S, A, B}, where S is the special start symbol, and the productions:
S → ABS
S → ε (the empty string)
BA → AB
BS → b
Bb → bb
Ab → ab
Aa → aa
This defines all the words of the form a^n b^n (i.e. n copies of a followed by n copies of b).
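The claim about this grammar can be checked mechanically for short words. The sketch below is one possible approach, not part of the slides: a breadth-first search over sentential forms that applies every production at every position and collects the forms containing no nonterminals. The names `PRODUCTIONS` and `generate` are illustrative, and the length bound relies on the observation that no production in this grammar shortens a form by more than one symbol.

```python
from collections import deque

# Productions of the grammar above; "" stands for the empty string.
PRODUCTIONS = [
    ("S", "ABS"), ("S", ""), ("BA", "AB"), ("BS", "b"),
    ("Bb", "bb"), ("Ab", "ab"), ("Aa", "aa"),
]

def generate(max_len: int) -> set:
    """All terminal strings of length <= max_len derivable from S."""
    limit = max_len + 1          # forms carry at most one extra symbol (S)
    seen = {"S"}
    queue = deque(["S"])
    found = set()
    while queue:
        form = queue.popleft()
        if form == "" or form.islower():
            found.add(form)      # only terminals left: a word of L(G)
            continue
        for lhs, rhs in PRODUCTIONS:
            start = form.find(lhs)
            while start != -1:   # rewrite each occurrence of lhs in turn
                nxt = form[:start] + rhs + form[start + len(lhs):]
                if len(nxt) <= limit and nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
                start = form.find(lhs, start + 1)
    return found

short_words = generate(6)
```

Some derivations get stuck with nonterminals that can no longer be rewritten (for example ABab); these dead ends are explored but contribute no words.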

Context Free Grammars
- Theoretical basis of most programming languages
- Easy to generate a parser using a compiler-compiler
- Two main approaches exist: top-down parsing, e.g. LL parsers, and bottom-up parsing, e.g. LR parsers

LL parser
- Table based, top-down parser for a subset of the context-free grammars (LL grammars)
- Parsing is Left to right, and constructs a Leftmost derivation of the sentence
- LL(k) parsers use k tokens of look-ahead to parse a sentence of an LL(k) grammar
- LL(1) grammars are popular and fast to parse because only the next token is considered in parsing decisions

Table based LL parsing

Architecture (diagram not reproduced): input buffer, stack, parsing table, output.

Consider the grammar
1. S → F
2. S → ( S + F )
3. F → 1

This has the parsing table

      (    )    1    +    $
S     2    -    1    -    -
F     -    -    3    -    -

e.g. input 1 with S on top of the stack selects rule 1, i.e. S on the stack is replaced with F and the rule number 1 is output.
Stack top and input the same = delete
Stack top and input different = error

Example input: ( 1 + 1 ) $

input   stack          action               output
(       S $            parse ( S : rule 2   2
(       ( S + F ) $    ( ( delete           2
1       S + F ) $      parse 1 S : rule 1   2 1
1       F + F ) $      parse 1 F : rule 3   2 1 3
1       1 + F ) $      1 1 delete           2 1 3
+       + F ) $        + + delete           2 1 3
1       F ) $          parse 1 F : rule 3   2 1 3 3
1       1 ) $          1 1 delete           2 1 3 3
)       ) $            ) ) delete           2 1 3 3
$       $              stop                 2 1 3 3
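The table-driven procedure above can be sketched in a few lines of Python. The grammar and table entries are the ones from the slide; the names `RULES`, `TABLE` and `ll_parse`, and the choice to treat each input character as one token, are assumptions made for this illustration.

```python
# Rule number -> (left-hand side, right-hand side symbols).
RULES = {
    1: ("S", ["F"]),
    2: ("S", ["(", "S", "+", "F", ")"]),
    3: ("F", ["1"]),
}

# Parsing table: (nonterminal on the stack, next input token) -> rule number.
TABLE = {("S", "("): 2, ("S", "1"): 1, ("F", "1"): 3}
NONTERMINALS = {"S", "F"}

def ll_parse(text: str) -> list:
    """Return the list of rule numbers applied, in order."""
    tokens = list(text) + ["$"]      # one character per token, plus end marker
    stack = ["$", "S"]               # top of the stack is the end of the list
    pos = 0
    output = []
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:
            rule = TABLE.get((top, look))
            if rule is None:
                raise SyntaxError(f"no rule for {top} on input {look!r}")
            output.append(rule)
            # Push the right-hand side so its first symbol ends up on top.
            stack.extend(reversed(RULES[rule][1]))
        elif top == look:
            pos += 1                 # stack and input the same: delete both
        else:
            raise SyntaxError(f"expected {top!r}, found {look!r}")
    return output

trace_output = ll_parse("(1+1)")
```

Running it on the example input reproduces the output column of the trace, the rule sequence 2 1 3 3.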

Parse Tree

Left Right (LR) Parser
- Bottom-up parser for context-free grammars, used by many programming language compilers
- Parsing is Left to right, and produces a Rightmost derivation (in reverse)
- LR(k) parsers use k tokens of look-ahead
- LR(1) is the most common type of parser used for programming languages
- Almost always generated using a parser generator, which constructs the parsing table; variants include Simple LR (SLR), Look-Ahead LR (LALR, e.g. Yacc) and Canonical LR

Left Right parser example
Rules:
(1) E → E * B
(2) E → E + B
(3) E → B
(4) B → 0
(5) B → 1
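A shift-reduce parser for these five rules can be sketched as follows. The action and goto tables on the original slide are not part of this transcript, so the tables below are an assumption: they are the LR(0) tables obtained by constructing the item sets for this grammar by hand. The names `ACTION`, `GOTO` and `lr_parse` are illustrative.

```python
# Rule number -> (left-hand side, number of right-hand side symbols).
RULES = {1: ("E", 3), 2: ("E", 3), 3: ("E", 1), 4: ("B", 1), 5: ("B", 1)}
TERMINALS = "*+01$"

def reduce_all(rule):
    """An LR(0) reduce state reduces whatever the next token is."""
    return {t: ("r", rule) for t in TERMINALS}

# Assumed LR(0) tables for the grammar above (derived, not from the slides).
ACTION = {
    0: {"0": ("s", 1), "1": ("s", 2)},
    1: reduce_all(4),
    2: reduce_all(5),
    3: {"*": ("s", 5), "+": ("s", 6), "$": ("acc", 0)},
    4: reduce_all(3),
    5: {"0": ("s", 1), "1": ("s", 2)},
    6: {"0": ("s", 1), "1": ("s", 2)},
    7: reduce_all(1),
    8: reduce_all(2),
}
GOTO = {(0, "E"): 3, (0, "B"): 4, (5, "B"): 7, (6, "B"): 8}

def lr_parse(text: str) -> list:
    """Return the rule numbers in the order the reductions are performed."""
    tokens = list(text) + ["$"]
    states = [0]                      # stack of parser states
    pos = 0
    output = []
    while True:
        action = ACTION[states[-1]].get(tokens[pos])
        if action is None:
            raise SyntaxError(f"unexpected {tokens[pos]!r} at position {pos}")
        kind, arg = action
        if kind == "s":               # shift: consume a token, enter state arg
            states.append(arg)
            pos += 1
        elif kind == "r":             # reduce by rule arg
            lhs, length = RULES[arg]
            del states[-length:]
            states.append(GOTO[(states[-1], lhs)])
            output.append(arg)
        else:                         # accept
            return output

reductions = lr_parse("1+1")
```

Read backwards, the reduction sequence for "1+1" gives the rightmost derivation E → E + B → E + 1 → B + 1 → 1 + 1.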

Left Right parser example

Re-writing
- Rewriting is a general process involving strings and alphabets
- Classified according to what is rewritten, e.g. strings, terms, graphs, etc.
- A rewrite system is a set of equations that characterises a system of computation; it provides one method of automating theorem proving and is based on the use of rewrite rules
- Practical systems that use this approach include the software Mathematica

Re-writing logic example
!!A = A // eliminate double negative
!(A AND B) = !A OR !B // De Morgan
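The two rules above can be applied automatically by a small term rewriter. The sketch below assumes a representation that is not in the slides: variables are strings, and compound terms are tuples such as ("not", x), ("and", x, y) and ("or", x, y). The functions `rewrite_once` and `normalise` are hypothetical names.

```python
def rewrite_once(term):
    """Apply a rule at the root if one matches, otherwise rewrite the subterms."""
    if isinstance(term, str):
        return term                               # a variable: nothing to do
    if term[0] == "not" and isinstance(term[1], tuple):
        inner = term[1]
        if inner[0] == "not":
            return inner[1]                       # !!A = A
        if inner[0] == "and":                     # !(A AND B) = !A OR !B
            return ("or", ("not", inner[1]), ("not", inner[2]))
    return (term[0],) + tuple(rewrite_once(t) for t in term[1:])

def normalise(term):
    """Rewrite repeatedly until no rule applies any more."""
    while True:
        rewritten = rewrite_once(term)
        if rewritten == term:
            return term
        term = rewritten

# !( !!A AND B )  rewrites to  !A OR !B
example = normalise(("not", ("and", ("not", ("not", "A")), "B")))
```

The loop stops because each step either removes a double negation or moves a negation further inward.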

Re-writing in Mathematica (Wolfram)

L-systems
- Named after Aristid Lindenmayer, a Hungarian theoretical biologist and botanist who worked at the University of Utrecht (Netherlands)
- A formal grammar used to model the growth and morphology of plants and animals
- In plant and animal modelling a special form, the parametric L-system, is used; it is based on rewriting
- Because of their recursive, parallel, and unlimited nature they lead to concepts of self-similarity, fractional dimension and fractal-like forms

L-system structure
The basic system has the same components as a formal grammar: G = {V, S, Ω, P}, where
- G is the grammar being defined
- V (the alphabet) is a set of symbols that can be replaced (variables)
- S is a set of symbols that remain fixed (constants)
- Ω (start, axiom or initiator) is a string of symbols from V, the initial state
- P is a set of rules or productions defining the ways variables can be replaced by constants and other variables; each rule consists of a LHS (predecessor) and a RHS (successor)

Example 1: Fibonacci numbers
V: A B
C: none
Ω: A
P: p1: A → B
   p2: B → AB
N=0 A
N=1 → B
N=2 → AB
N=3 → BAB
N=4 → ABBAB
N=5 → BABABBAB
N=6 → ABBABBABABBAB
N=7 → BABABBAB...
Counting lengths we get: 1, 1, 2, 3, 5, 8, 13, 21, ... (the Fibonacci numbers)
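The derivation above can be reproduced with a few lines of Python. This is a minimal sketch of a deterministic, context-free L-system in which every symbol is rewritten in parallel at each step. The function name `lsystem` and the dictionary encoding of the productions are assumptions made for the illustration.

```python
def lsystem(axiom: str, rules: dict, generations: int):
    """Yield the axiom followed by each successive generation."""
    current = axiom
    yield current
    for _ in range(generations):
        # Every symbol is replaced at once; symbols without a rule
        # (constants) are copied unchanged.
        current = "".join(rules.get(symbol, symbol) for symbol in current)
        yield current

# Example 1: A -> B, B -> AB, starting from A.
fibonacci_words = list(lsystem("A", {"A": "B", "B": "AB"}, 7))
lengths = [len(word) for word in fibonacci_words]
```

The same function covers other rule sets; for instance `lsystem("A", {"A": "AB", "B": "A"}, 4)` yields A, AB, ABA, ABAAB, ABAABABA.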

Example 2: Algal growth
V: A B
C: none
Ω: A
P: p1: A → AB
   p2: B → A
N=0 A
N=1 → AB
N=2 → ABA
N=3 → ABAAB
N=4 → ABAABABA

Example 3: Koch snowflake
V: F
C: + -
Ω: F
P: p1: F → F+F-F-F+F
N=0 F
N=1 → F+F-F-F+F
N=2 → F+F-F-F+F+F...
N=3 etc.
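To turn such a string into a curve, the symbols are usually read as drawing commands. The slide does not state the interpretation, so the sketch below assumes a common one: F moves forward one unit while drawing, + turns left by 90 degrees and - turns right by 90 degrees. The names `expand` and `turtle_points` are hypothetical.

```python
def expand(axiom: str, rules: dict, generations: int) -> str:
    """Rewrite all symbols in parallel, the given number of times."""
    for _ in range(generations):
        axiom = "".join(rules.get(symbol, symbol) for symbol in axiom)
    return axiom

def turtle_points(commands: str) -> list:
    """Trace the commands on an integer grid and return the visited points."""
    # Heading in multiples of 90 degrees: 0 = east, 1 = north, 2 = west, 3 = south.
    dx = [1, 0, -1, 0]
    dy = [0, 1, 0, -1]
    x = y = heading = 0
    points = [(x, y)]
    for command in commands:
        if command == "F":                    # assumed: draw one unit forward
            x += dx[heading]
            y += dy[heading]
            points.append((x, y))
        elif command == "+":                  # assumed: turn left 90 degrees
            heading = (heading + 1) % 4
        elif command == "-":                  # assumed: turn right 90 degrees
            heading = (heading - 1) % 4
    return points

KOCH = {"F": "F+F-F-F+F"}
first = turtle_points(expand("F", KOCH, 1))
second = turtle_points(expand("F", KOCH, 2))
```

Under these assumptions each generation replaces every unit segment by five segments spanning three units, so the curve's end point moves from (3, 0) to (9, 0) between the first and second generations.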

Example 4: 3D Hilbert curve

Example 5: Branching