Cse321, Programming Languages and Compilers. Lecture #11, Feb. 19, 2007: ml-Yacc


Cse321, Programming Languages and Compilers, 6/30/2015
Lecture #11, Feb. 19, 2007
– ml-Yacc
– Actions when reducing
– Making ml-yacc work with ml-lex
– Boiler plate

Assignments
Reading
– Chapter 4, Sections
  » 4.1 Context Sensitive Analysis
  » 4.2 Intro to Type Systems
– Pages
– Quiz on Wednesday?
Homework #9 is due Wednesday.
Project 2 is assigned today. It is posted on the web site.

Sml-yacc parser generator
Sml-yacc specifications contain 3 parts, separated by lines containing %%:
    (* user declarations *)
    %%
    (* declarations about the grammar *)
    %%
    (* grammar rules *)

Declarations about the grammar
All begin with a single % followed by a keyword.
Some declarations are required!
– You MUST name the specification.
  » %name XXX
– You MUST describe the nonterminals and terminals of the grammar.
  » %term ....
  » %nonterm ...
The description of terminals and nonterminals requires that you give the type of any attribute that they may have.
You MUST have a %pos declaration. This declares the type of "positions" (more about this later).

%term and %nonterm
These things look like algebraic datatype declarations.
We will build an example parser for Regular Expressions.
    (* user declarations *)
    datatype Re
      = empty of int
      | simple of string * int
      | concat of Re * Re
      | closure of Re
      | union of Re * Re;
    val count = ref 0;
    fun next() = (count := (!count)+1; !count);
    %%
    (* declarations about the grammar *)
    %name XXX
    %term EOF | STAR | BAR | LP | RP | HASH | SINGLE of string
    %nonterm exp of Re
    %pos int
    %%
    (* grammar rules *)
    ...

Description of Example
Terminal symbols are represented by EOF, BAR, ... None of them has an attribute except SINGLE, which has a string attribute representing the single character that we want to recognize.
There is only one non-terminal, exp, and it has one attribute, which is of type Re. Note that the Re type is defined in the user declarations section.
The %pos declaration says that a position is an integer. This is for error reporting.

Recall how a Bottom up Parse Works
    E ::= E + T    (1)
        | T        (2)
    T ::= T * F    (3)
        | F        (4)
    F ::= ( E )    (5)
        | id       (6)

    stack     input     action
              x + y     shift
    x         + y       reduce 6
    F         + y       reduce 4
    T         + y       reduce 2
    E         + y       shift
    E + y               reduce 6
    E + F               reduce 4
    E + T               reduce 1
    E                   accept
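The trace above can be reproduced with a small hand-written sketch in SML. This is not the code ml-yacc generates (generated parsers are table-driven); the names sym, action, step, and run are made up for illustration, and ID stands for both x and y. The sketch records every shift separately, so it reports two consecutive shifts (for + and then y) before the second reduce 6.

```sml
(* Grammar symbols: terminals first, then the nonterminals E, T, F. *)
datatype sym = ID | PLUS | STAR | LP | RP | E | T | F

datatype action = Shift | Reduce of int | Accept | Error

(* One parser step. The stack is a list with its top at the head.
   Returns the action taken plus the new stack and remaining input. *)
fun step (stack, input) =
  case (stack, input) of
      (ID :: rest, _)             => (Reduce 6, F :: rest, input)   (* F ::= id    *)
    | (RP :: E :: LP :: rest, _)  => (Reduce 5, F :: rest, input)   (* F ::= ( E ) *)
    | (F :: STAR :: T :: rest, _) => (Reduce 3, T :: rest, input)   (* T ::= T * F *)
    | (F :: rest, _)              => (Reduce 4, T :: rest, input)   (* T ::= F     *)
    | (T :: _, STAR :: more)      => (Shift, STAR :: stack, more)   (* * binds tighter *)
    | (T :: PLUS :: E :: rest, _) => (Reduce 1, E :: rest, input)   (* E ::= E + T *)
    | (T :: rest, _)              => (Reduce 2, E :: rest, input)   (* E ::= T     *)
    | ([E], [])                   => (Accept, stack, input)
    | (_, tok :: more)            => (Shift, tok :: stack, more)
    | (_, [])                     => (Error, stack, input)

(* Run steps until the parser accepts or gets stuck, collecting the actions. *)
fun run (stack, input) =
  case step (stack, input) of
      (Accept, _, _) => [Accept]
    | (Error, _, _)  => [Error]
    | (a, s, i)      => a :: run (s, i)

(* The input x + y from the slide. *)
val trace = run ([], [ID, PLUS, ID])
```

Evaluating trace should give the list Shift, Reduce 6, Reduce 4, Reduce 2, Shift, Shift, Reduce 6, Reduce 4, Reduce 1, Accept.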

Grammar Rules Section
The grammar rules describe the grammar that is to be recognized. They also tell what to do whenever a "reduce" action is encountered.
A grammar rule has the form:
– lhs : rhs ( action )
– For example:
    exp: SINGLE ( simple(SINGLE,next()) )
       | HASH   ( empty(next()) )
The "action" is a value that is associated with the lhs of the production when it is pushed on the stack. It can depend upon the values of the symbols in the rhs (which are already on the stack).
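Because an action is an ordinary SML expression, its effect can be tried out by hand at the sml prompt. The sketch below repeats the user declarations of the example and evaluates, one at a time, the action expressions that the reductions for an input such as (a|#)* would use. The names a, h, u, p, and result are made up for illustration, and the counter values reflect this hand evaluation order only; a generated parser may evaluate its actions at a different time.

```sml
(* User declarations from the example. *)
datatype Re = empty of int
            | simple of string * int
            | concat of Re * Re
            | closure of Re
            | union of Re * Re

val count = ref 0
fun next () = (count := !count + 1; !count)

(* exp: SINGLE        ( simple(SINGLE,next()) ), with SINGLE carrying "a" *)
val a = simple ("a", next ())

(* exp: HASH          ( empty(next()) ) *)
val h = empty (next ())

(* exp: exp BAR exp   ( union(exp1,exp2) ) *)
val u = union (a, h)

(* exp: LP exp RP     ( exp ), the value passes through unchanged *)
val p = u

(* exp: exp STAR      ( closure exp ) *)
val result = closure p
```

Here result should be closure (union (simple ("a", 1), empty 2)): each reduction builds its value from the values already computed for the symbols on its right-hand side.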

Example Showing Grammar Rules
    (* user declarations (Re) omitted here *)
    %%
    (* declarations about the grammar *)
    %name XXX
    %term EOF | STAR | BAR | LP | RP | HASH | SINGLE of string
    %nonterm exp of Re
    %pos int
    %%
    exp: SINGLE       ( simple(SINGLE,next()) )
       | HASH         ( empty(next()) )
       | LP exp RP    ( exp )
       | exp STAR     ( closure exp )
       | exp exp      ( concat(exp1,exp2) )
       | exp BAR exp  ( union(exp1,exp2) )

Complete Example
    datatype Re
      = empty of int
      | simple of string * int
      | concat of Re * Re
      | closure of Re
      | union of Re * Re;
    %%
    %name XXX
    %term EOF | STAR | DUMMY | BAR | LP | RP | HASH | SINGLE of string
    %nonterm go of Re | exp of Re
    %pos int
    %start go
    %eop EOF
    %verbose
    %left LP SINGLE HASH
    %left BAR
    %left DUMMY
    %right STAR

Complete Example continued
    %%
    go: exp EOF                ( exp )
    exp: SINGLE                ( simple(SINGLE,next()) )
       | HASH                  ( empty(next()) )
       | LP exp RP             ( exp )
       | exp STAR              ( closure exp )
       | exp exp %prec DUMMY   ( concat(exp1,exp2) )
       | exp BAR exp           ( union(exp1,exp2) )

Boiler Plate
To get this all to work we need a lexical analyzer that can produce terminal symbols with the correct attributes for the %term directive. We can use sml-lex to do this, but instead of defining our own token type we will use the one that is automatically defined by the %term declaration in sml-yacc.
To do this we need the following BOILER-PLATE in the user declarations part of the sml-lex source file.
    type pos = int
    type svalue = Tokens.svalue
    type ('a,'b) token = ('a,'b) Tokens.token
    type lexresult = (svalue,pos) token
    open Tokens
    val lineno = ref 0
    val reset_lineno = fn () => lineno := 1
    val eof = fn () => EOF(!lineno,!lineno)
    fun error (e,l : int,_) =...
(Boiler plate in Sml-Lex source file)

More Boiler Plate
We must also place the following as the FIRST line in the ML-lex definitions section.
    %header (functor XXXLexFun (structure Tokens: XXX_TOKENS));
It is very important that the "type pos = int" be the same type as the %pos declaration in the sml-yacc source file, and that the "XXX" in the %header declaration in the sml-lex source file BE THE SAME as the %name declaration in the sml-yacc source file.
lexfile:
    type pos = int
    %header (functor XXXLexFun (structure Tokens: XXX_TOKENS))
    %%
yacc file:
    %pos int
    %name XXX
    %%

Tying it all together
The file "XXX.cm" ties all the pieces together.
This file has many occurrences of the string XXX; they must all be changed to the same string as in the %name directive of the sml-yacc source file.
To build a parser we do the following:
– Start up sml and then use the compile-manager as follows

New Boiler plate for Parser
CommonTypes.sml:
    structure CommonTypes =
    struct
    (* Put type declarations here that you *)
    (* want to appear in both the parser *)
    (* and lexer. You can open this structure *)
    (* elsewhere inside your application as well *)
    end;
XXX.cm:
    group is
    CommonTypes.sml
    XXX.lex
    XXX.grm
    driver.sml
    (* Other user defined sml files go here *)
    $/basis.cm (* system library files *)
    $/smlnj-lib.cm
    $/ml-yacc-lib.cm
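As a minimal sketch of how CommonTypes might be filled in, assume the Re datatype and the next counter of the regular-expression example are what the lexer and parser need to share (the slides do not say where the complete example actually puts them). The structure Client is hypothetical and only stands in for any other part of the application that opens CommonTypes.

```sml
structure CommonTypes =
struct
  (* Assumed shared declarations: the attribute type of the example grammar. *)
  datatype Re = empty of int
              | simple of string * int
              | concat of Re * Re
              | closure of Re
              | union of Re * Re

  val count = ref 0
  fun next () = (count := !count + 1; !count)
end;

(* Hypothetical client: opening the structure brings Re, its
   constructors, and next into scope without qualification. *)
structure Client =
struct
  open CommonTypes
  val r = union (simple ("a", next ()), empty (next ()))
end;
```

Since SML evaluates tuple components left to right, Client.r should be union (simple ("a", 1), empty 2). Writing "open CommonTypes;" at the top of the .lex and .grm user declarations gives both generated files the same view of these names.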

The Driver file (Driver.sml)
    (* ************** Driver file **************** *)
    structure Driver =
    struct
    (* ******* Tie all the libraries together ******** *)
    structure regexpLrVals = regexpLrValsFun(structure Token = LrParser.Token);
    structure regexpLex = regexpLexFun(structure Tokens = regexpLrVals.Tokens);
    structure regexpParser =
        Join(structure ParserData = regexpLrVals.ParserData
             structure Lex = regexpLex
             structure LrParser = LrParser);
    (* ******** Build a lexer and Parser *************** *)
    val verboselex = ref false;
    fun parse s fromfile =...
    end (* struct Driver *)

The .lex file (XXX.lex)
    open CommonTypes;
    type pos = int
    type svalue = Tokens.svalue
    (* the type token is from the %term in XXX.grm *)
    type ('a,'b) token = ('a,'b) Tokens.token
    type lexresult = (svalue,pos) token
    (* Defines constructor functions for "token" *)
    open Tokens
    val lineno = ref 0
    val reset_lineno = fn () => lineno := 1
    ...
    (* YOUR USER DECLARATIONS (if any) GO HERE *)
    %%
    %header (functor XXXLexFun(structure Tokens:XXX_TOKENS));
    (* YOUR Lex-Definitions (if any) GO HERE *)
    %%
    (* YOUR RULES GO HERE *)

The .grm file (XXX.grm)
    open CommonTypes;
    (* YOUR USER DECLARATIONS (if any) GO HERE *)
    %%
    (* declarations about the grammar *)
    %name XXX
    %term EOF |...
    %nonterm go of ? |...
    %pos int
    %start go
    %eop EOF
    %verbose
    (* YOUR GRAMMAR DECLARATIONS LIKE %left ETC. (if any) GO HERE *)
    %%
    go:... EOF (... )
    (* YOUR ADDITIONAL GRAMMAR RULES GO HERE *)

Putting it all together
Start sml in the directory where all the files are.
Then type:
    CM.make "XXX.cm";
Then open the Driver library.
– This imports the function
– parse : string -> bool -> answer_type

    Standard ML of New Jersey v [built: Mon Nov 21 21:46: ]
    - CM.make "regexp.cm";
    [scanning regexp.cm]
    [D:\programs\SML110.57\bin\ml-lex regexp.lex]
    Number of states = 12
    Number of distinct rows = 2
    Approx. memory size of trans. table = 258 bytes
    [parsing (regexp.cm):regexp.lex.sml]
    [library $/ml-yacc-lib.cm is stable]
    [library $SMLNJ-ML-YACC-LIB/ml-yacc-lib.cm is stable]
    [loading (regexp.cm):regexp.grm.sig]
    [loading (regexp.cm):CommonTypes.sml]
    [loading (regexp.cm):regexp.grm.sml]
    [compiling (regexp.cm):regexp.lex.sml]
    [code: 9617, data: 705, env: 1871 bytes]
    [loading (regexp.cm):driver.sml]
    [New bindings added.]
    val it = true : bool
    - open Driver;
    opening Driver
      val parse : string -> bool -> Driver.regexpParser.result
      val verboselex : bool ref
    end
    -

Boiler Plate Files
The final BOILER PLATE files, which you can fill in by replacing XXX with the name of your parser and filling in the ...'s with some code or rules, can be found in the directory:
    SML-version/boilerplate
You will find the 5 files:
– "XXX.lex"
– "XXX.grm"
– "XXX.cm"
– CommonTypes.sml
– Driver.sml
The outline of these files is included here for your convenience.
The complete example is in the file