JAMOOS
An Object-Oriented Language for Grammars
Yuri Tsoglin
Supervised by: Dr. Yossi Gil

    MODULE A
      A → X Y Z;
      X → “JAM”;
      Y → {“O” … }++
      Z → “S”;
    END

A simple example
How do we write a grammar for a Pascal program in Yacc?

    Program: program Name semicolon Decls Body;
    Decls: Decls Decl | /* empty */ ;
    ……
    Conditional: if Exp then Statement OptElse;
    OptElse: else Statement | /* empty */ ;
    ……

Note: even this is not sufficient. A separate lexical file must define that semicolon means “;”.

Problems with Yacc
– List and optional elements are defined in an unnatural way.
– Tokens must be defined in a separate file.
– All productions must have semantic features of the same type, which is defined separately.
– Errors are handled using a special error token.
– No support for language libraries.
– No internal symbol table handling.

What do we have in JAMOOS
Equivalence between programs and grammars.
Lingual features:
– Class definitions in EBNF form
– Default fields
– Automatic field naming
– Type OK
– Error handling and error types
– Tree computation metaphor
– Modular definitions and generic modules
– Dictionaries

Grammatical features:
– Extended BNF grammars
– Three predefined kinds of tokens
– Generic (parametrized) grammars
– Language embedding
– Improved parse error handling
– Internal symbol table handling

Class definitions
– Each class definition also defines a grammar production.
– Extended BNF: structures such as lists, optional components, and choices are represented as such.
– No need for a separate “lexical” file: all tokens are written as they are within the grammar definition.
The above Yacc example can be redefined this way:

    Program → program Name “;” {Decls …} Body;
    ……
    Conditional → if Exp then Statement [else Statement];

Class ≡ Rule ≡ Procedure
Every definition in JAMOOS can be read and understood as all of the following:
– Rule (in a BNF)
– Class (as in OO)
– Procedure (as in imperative programming) with local variables, input, output, and input-output arguments.
Example: A → X Y;
– Rule: Symbol A can derive an X followed by a Y.
– Class: Class A has two components, X and Y.
– Procedure: Procedure A calls procedures X and Y.
A definition has fields …

Classification of Fields
Every field represents a value to be computed, a syntactic or semantic element, or a component of a class.
Properties of a field:

    _:Name:Type := Initializer

– Type: almost always present. It can be a primitive type, a class, or a compound type.
– Name: optional (automatic naming can be used).
– Initializer: optional.
– Perishability prefix: optional.

Kinds of Fields

Field kind | Constructor argument? | Lifetime begins | Lifetime ends | Has initializer? | Procedural equivalent
Component | YES | CTOR invocation | With object | NO | IN-OUT argument
Perishable | YES | CTOR invocation | CTOR return | NO | IN argument
Attribute | NO | Initialization by CTOR | With object | YES | OUT argument
Temporary | NO | Initialization by CTOR | CTOR return | YES | Local variable

Detailed Example

    Addition → Expression “+” Expression
    FEATURES
      value:INTEGER := [[ return $Expression#1.value + $Expression#2.value ; ]]
    END

This can be understood in the following three ways:
– Addition is a production of two Expressions with a “+” between them, having an integer semantic feature value whose value is computed using the given C++ code.
– Addition is a class consisting of three fields: two unnamed fields of type Expression and one of type INTEGER named value. The constructor of this type gets two parameters of type Expression, assigns their values to the first two fields, and assigns the result of the C++ computation to the third field.
– Addition is a procedure which gets two IN-OUT parameters of type Expression and assigns to a third OUT parameter a value computed using the C++ code.
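The class reading of Addition can be sketched in Python, used here only as a stand-in for the generated code. This is a minimal sketch under assumptions: the field names left and right are hypothetical (the two Expression fields are unnamed in the definition), and Expression is reduced to a single integer value.

```python
from dataclasses import dataclass, field


@dataclass
class Expression:
    # Simplified stand-in: an expression that already knows its value.
    value: int


@dataclass
class Addition:
    # Components: constructor arguments that live as long as the object.
    left: Expression
    right: Expression
    # Attribute: not a constructor argument; filled in by its initializer.
    value: int = field(init=False)

    def __post_init__(self) -> None:
        # Plays the role of [[ return $Expression#1.value + $Expression#2.value ; ]]
        self.value = self.left.value + self.right.value
```

Constructing Addition(Expression(2), Expression(3)) takes the two components as arguments and computes value as a side effect of construction, which is also the procedural reading: two inputs and one output.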

The Special Return Field
Let us slightly change the above example:

    Addition → Expression “+” Expression
    FEATURES
      value:INTEGER := [[ return $1.value + $2.value ; ]]

Why not write just $1+$2, as in Yacc?
A field named return is a default field. To refer to it, its name can be omitted.

    Addition → Expression “+” Expression
    FEATURES
      return:INTEGER := [[ return $1 + $2 ; ]]  -- assuming Expression also has a return field

    Program → program _:Name “;” decls:{ Decl … } Body
    FEATURES
      num_vars: INTEGER := decls.num_vars;
    END

    VarsDeclaration → var _:vars:{ (var_list:{ Name “,” … }+ “:” Type “;”) … }+
    FEATURES
      variables: { (Variable Type) … }+ := [[
        for (int i=0; i++)
          for (int j=0; j++)
            ADD (vars [i].var_list [j] vars [i].Type);
      ]]
    END

Notice how JAMOOS and C++ are mutually embedded.

Internal Classes
We have no methods!
– There are no methods as such; internal classes can be used as methods.
– A constructor call for an internal class is like a method call.
– If a method needs local variables, these are fields of the internal class.
– The return field may be used to “return” only the necessary value.
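The idea of an internal class standing in for a method can be illustrated with a nested Python class. This is only an analogy: Samples, Mean, total and result are hypothetical names, and result merely imitates the role of the return field.

```python
class Samples:
    def __init__(self, values):
        self.values = list(values)

    class Mean:
        # Constructing Mean plays the role of calling a method on Samples.
        def __init__(self, samples):
            # A field that acts like a local variable of the "method".
            self.total = sum(samples.values)
            # A field that acts like the return value.
            self.result = self.total / len(samples.values)
```

Here Samples.Mean(Samples([2, 4, 6])).result reads like a method call that returns the mean.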

Tree Computation
An execution of a JAMOOS program is nothing but:
– a nested chain of constructor calls, or
– an execution of a bottom-up or top-down parser, or
– a nested execution of procedures and functions.
Each constructor call builds an object, which becomes a node in the abstract syntax tree.
The constructor computes all the attributes by executing their initializers, in the order of their appearance in the definition.
When parsing, constructor calls are made implicitly by the parser.
At the start, the constructor of class Main is called.
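The tree computation metaphor can be sketched as a chain of nested constructor calls. The Python below is an illustration, not JAMOOS output: Num, Sum, the calls log and the token layout are assumptions made for the example, and only the class name Main comes from the slide.

```python
calls = []  # records the order in which constructors start running


class Num:
    def __init__(self, text):
        calls.append("Num")
        self.value = int(text)


class Sum:
    def __init__(self, left, right):
        calls.append("Sum")
        self.left, self.right = left, right
        # Attributes are computed in the order they appear in the definition.
        self.value = left.value + right.value
        self.is_even = self.value % 2 == 0  # may use the attribute above


class Main:
    def __init__(self, tokens):
        calls.append("Main")
        # Each nested constructor call builds one node of the tree.
        self.body = Sum(Num(tokens[0]), Num(tokens[2]))
```

Calling Main(["1", "+", "3"]) starts at Main, builds the two leaves, and only then builds the Sum node, so the tree and its attributes are complete when the outermost constructor returns.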

Summary of the 3 Aspects of Definitions

Grammatical aspect | OO aspect | Procedural aspect
Grammar production | Class | Procedure
Right-hand side components | Fields | Procedure arguments
Parsing | Constructor calls | Procedure calls
Semantic actions / features | Attributes | OUT arguments
Syntax / semantic errors | Error types | Exception throwing
Embedded languages | Modular definitions |
Generic grammars | Generic classes | Generic procedures
Default user action | Type OK | Imperative code
Tokens (terminals) | Primitive types |
Variables (non-terminals) | Class names | Procedure names
Selection in right-hand side | Abstract class | ---
Semantic value of a symbol | Default field | Return value (of a function)

Four kinds of compound types:
– List (similar to arrays)
– Optional (similar to pointers)
– Choice (as in C’s union or Pascal’s variant records)
– Sequence (as in C’s struct)
More examples:

    CompoundStatement → begin { Statement “;” … } end;
    ForLoop → for Var “:=” lower:Exp up OF to | down OF downto upper:Exp do Statement;
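For readers who think in terms of a general-purpose language, the four compound types have rough counterparts in Python's type annotations. The sketch below is an analogy under assumptions: the field names, and the Up and Down marker classes for the to / downto choice, are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class Statement:
    text: str


@dataclass
class Up:
    pass


@dataclass
class Down:
    pass


# Choice: exactly one of the alternatives is present.
Direction = Union[Up, Down]


@dataclass
class ForLoop:
    # Sequence: a fixed series of fields, like a struct.
    var: str
    lower: int
    direction: Direction
    upper: int
    body: Statement


@dataclass
class CompoundStatement:
    # List: zero or more elements of one type.
    statements: List[Statement]


@dataclass
class Conditional:
    then_branch: Statement
    # Optional: the element may be absent.
    else_branch: Optional[Statement] = None
```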

Tokens
There are three types of tokens:
– Keyword: any sequence of letters and digits beginning with a letter, e.g. if, begin, abc345.
– String: any quoted sequence of characters, e.g. “(”, “…”, “A^”.
– Regular expression.
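The keyword and string rules above can be written down as regular expressions. The Python sketch below is illustrative only: token_kind is a hypothetical helper, strings are assumed to be delimited by plain double quotes, and everything else is lumped together as "other" rather than modelling user-defined regular-expression tokens.

```python
import re

# Keyword: letters and digits, beginning with a letter.
KEYWORD = re.compile(r"[A-Za-z][A-Za-z0-9]*")
# String: any characters between double quotes (assumed delimiter).
STRING = re.compile(r'"[^"]*"')


def token_kind(text: str) -> str:
    """Classify a token spelling as 'keyword', 'string', or 'other'."""
    if KEYWORD.fullmatch(text):
        return "keyword"
    if STRING.fullmatch(text):
        return "string"
    return "other"
```

With these patterns, abc345 is a keyword, a quoted parenthesis is a string, and 345abc is neither because it begins with a digit.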

Primitive Types
Tokens define objects of primitive types (STRING by default).
JAMOOS primitive types are: INTEGER, REAL, BOOLEAN, CHARACTER, STRING, OK.

Unit Type
– The unit type is called OK.
– It is used primarily to designate imperative code fragments (usually in C++).
– An expression of this type may appear at any place within a constructor argument list.

    Program → program _:Name “;” decls:{ Decl … } Body
    FEATURES
      print_num_vars: OK := [[ cout << $decls.num_vars; ]]
    END

Error Types
– Both syntax and semantic errors are handled using error types.
– An object of an error type can “legally” be in an illegal state.

    Procedure → Header? Body;  -- Header can be illegal
    VariableName → Id
    FEATURES
      type:Type? := …  -- Type can be illegal
    END

– A special case is the type OK?, which can be used to define assertions.
– Errors are generated by the special ERROR command.
– Any object can be tested for being in an illegal state.
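One way to picture an error type is as a wrapper whose value may legitimately be missing, together with a test for that state. The Python sketch below is an analogy, not the JAMOOS mechanism: Checked, is_illegal and lookup_type are hypothetical names, and the error message format is made up.

```python
from dataclasses import dataclass
from typing import Dict, Generic, Optional, TypeVar

T = TypeVar("T")


@dataclass
class Checked(Generic[T]):
    # Either a value or an error description; mirrors a type written T?.
    value: Optional[T] = None
    error: Optional[str] = None

    @property
    def is_illegal(self) -> bool:
        # Any object can be tested for being in an illegal state.
        return self.error is not None


def lookup_type(name: str, known: Dict[str, str]) -> Checked[str]:
    """Return the declared type of name, or an illegal object if undeclared."""
    if name in known:
        return Checked(value=known[name])
    # Plays the role of the ERROR command.
    return Checked(error=f"undeclared identifier: {name}")
```

The caller keeps going with the illegal object and decides later whether to report it, instead of aborting at the first failure.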

What about inheritance?
Abstract class: the right-hand side defines all the subclasses.

    Statement → Assignment | Loop | Conditional | Compound | ProcCall;
    Loop → ForLoop | WhileLoop | RepeatLoop;

Grammatically, this is just a selection element of EBNF.

Field Inheritance
Fields of an abstract class can be inherited or overridden in a subclass. When overridden, a field can be made either a component or an attribute.

    Loop → StepLoop | CondLoop;
    CondLoop → WhileLoop | RepeatLoop
    FEATURES
      cond: Expression;
    END
    WhileLoop → do Statement;
    RepeatLoop → repeat { Statement “;” … }
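In object-oriented terms, the CondLoop example corresponds to a base class whose field is inherited by each alternative. The Python sketch below shows only the inheritance case, under assumptions: expressions and statements are simplified to strings, the body field names are hypothetical, and Loop is abstract by convention only.

```python
from dataclasses import dataclass
from typing import List


class Loop:
    """Abstract by convention: only its alternatives are instantiated."""


@dataclass
class CondLoop(Loop):
    # Declared once on the abstract class, inherited by every alternative.
    cond: str


@dataclass
class WhileLoop(CondLoop):
    body: str


@dataclass
class RepeatLoop(CondLoop):
    body: List[str]
```

Both WhileLoop and RepeatLoop accept cond as their first constructor argument without declaring it themselves.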

Dictionaries
– Each field can depend on the fields of the descendants (so-called “generated features”).
– There are also “inherited features”!
– Symbol tables can help in most practical cases.
– In JAMOOS they are called dictionaries.

Dictionaries (cont.)
– A dictionary is a mapping from strings to some type. So, to define a dictionary, we define the type of its elements.
– There is a stack of dictionaries for each dictionary type.
– Three operations on a dictionary:
  – INSERT (a_string, an_element)
  – SEARCH (a_string): searches only the current dictionary
  – FERRET (a_string): searches through the stack
– A class can be assigned a dictionary; the dictionary will be pushed on the stack each time an object of that class is accessed.
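The three operations can be sketched as a small stack-of-scopes class. The Python below is a minimal sketch under assumptions: DictionaryStack and its push and pop methods are hypothetical names, FERRET is assumed to search from the innermost dictionary outwards, and a failed lookup is assumed to yield None.

```python
class DictionaryStack:
    """A stack of string-keyed dictionaries, one per open scope."""

    def __init__(self):
        self._stack = [{}]  # start with a single outermost dictionary

    def push(self):
        # Assumed to happen when an object of the owning class is entered.
        self._stack.append({})

    def pop(self):
        self._stack.pop()

    def insert(self, name, element):
        # INSERT: add to the current (topmost) dictionary.
        self._stack[-1][name] = element

    def search(self, name):
        # SEARCH: look only in the current dictionary.
        return self._stack[-1].get(name)

    def ferret(self, name):
        # FERRET: look through the whole stack, innermost first.
        for scope in reversed(self._stack):
            if name in scope:
                return scope[name]
        return None
```

A name inserted in an outer scope is invisible to search from an inner scope but is still found by ferret, which is the usual behaviour of nested symbol tables.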

Dictionaries (cont.)
Example:

    DICTIONARY Identifiers;
    ……
    ProcDecl → procedure Name Identifiers “(” {Param “,”} “)”;

The place of Identifiers within the definition defines when the dictionary must be constructed. In this case, the dictionary is constructed after procedure and Name are matched.

Modularity and Genericity
Definitions can be modularized. Each module can have type parameters.

    MODULE Expression (Op)
      Expression → Unary | Binary | Parenthesized;
      Unary → Op Expression;
      Binary → Expression Op Expression;
      Parenthesized → “(” Expression “)”;
    END

Similar to templates! Each class in the module is a template class parametrized by Op.

Now, any other module can use this module.

    PascalExp → Expression (Op=PascalOp);
    PascalOp → Arith | Bool;
    Arith → plus OF “+” | minus OF “-” | mult OF “*” | div OF “/”;
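The template analogy can be made concrete with a generic class. The Python sketch below is an illustration under assumptions: only Binary is modelled, operands are left untyped, and PascalOp is reduced to the four Arith alternatives (Bool is omitted because its alternatives are not shown).

```python
from dataclasses import dataclass
from enum import Enum
from typing import Generic, TypeVar

Op = TypeVar("Op")  # the module's type parameter


@dataclass
class Binary(Generic[Op]):
    # Binary → Expression Op Expression, with operands left untyped here.
    left: object
    op: Op
    right: object


class PascalOp(Enum):
    PLUS = "+"
    MINUS = "-"
    MULT = "*"
    DIV = "/"


# Instantiating the generic class, in the spirit of Expression (Op=PascalOp).
PascalBinary = Binary[PascalOp]
```

PascalBinary(1, PascalOp.PLUS, 2) then builds a Binary node whose operator is drawn from the Pascal operator set.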

Calls between grammars
Sometimes we need one parser to call another parser. For example: a version of Pascal allowing embedded code in Assembler.
The PARSE command is used to call another parser.

    EmbeddedAssemler → PARSE(“\””,Assembly,“\””);
    EmbeddedCPP → PARSE(“[[”,CPP,“]]”);
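The effect of PARSE can be imitated by a host parser that hands the text between two delimiters to another parser. The Python sketch below is an analogy only: parse_embedded and parse_cpp are hypothetical functions, and the "embedded parser" simply wraps the fragment instead of really parsing C++.

```python
def parse_embedded(text, open_delim, parser, close_delim):
    """Run parser on the text between the delimiters.

    Returns the embedded parser's result and the position just past
    the closing delimiter, where the host parser would resume.
    """
    start = text.index(open_delim) + len(open_delim)
    end = text.index(close_delim, start)
    return parser(text[start:end]), end + len(close_delim)


def parse_cpp(fragment):
    # Stand-in for a real embedded-language parser.
    return ("CPP", fragment.strip())
```

For the input x := [[ cout << 1; ]] ; the call parse_embedded(text, "[[", parse_cpp, "]]") delegates the fragment to parse_cpp and reports where the host grammar continues.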

THE END