The TXL Programming Language Filippo Ricca ITC-Irst Istituto per la ricerca Scientifica e Tecnologica


What is TXL?
- TXL is a programming language specifically designed to support software analysis and program transformation.
- TXL is a functional, rule-based language.
Example of a functional rule-based program:
    functions:
        F(x) = x-1;
        G(x) = x+1;
    rules:
        If x<5 then x:=G(x);
        If x>5 then x:=F(x);
        If x=5 then stop;
Execution flow is not sequential!
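The toy rule-based program above can be mimicked in Python to show what "not sequential" means: at each step the rule whose condition holds is applied, whatever its position in the list. This is only a sketch; the names `RULES` and `run` are illustrative and not part of TXL.

```python
def F(x):
    return x - 1


def G(x):
    return x + 1


# Each rule is a (condition, action) pair. The conditions are mutually
# exclusive, so the order of the list does not influence the result.
RULES = [
    (lambda x: x < 5, G),
    (lambda x: x > 5, F),
]


def run(x):
    """Apply whichever rule matches until no rule fires (x == 5: stop)."""
    while True:
        for condition, action in RULES:
            if condition(x):
                x = action(x)
                break
        else:
            return x  # no rule fired: this is the 'stop' case


print(run(0), run(9))
```

Starting below or above 5, the value is driven step by step to 5, where no rule applies any more.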

Program Transformations
- Program transformation is the act of changing one program into another: a program P in a source language L is transformed into a program P' in a target language L'.
- L is different from L' ----> translation
- L is equal to L' ----> rephrasing

What is TXL good for (1)?
- Translation
[Figure: translations among an aspect language, high-level language 1, high-level language 2, and a low-level language; the arrows are labeled analysis, migration, reverse engineering, and synthesis.]

Translations
- Program synthesis: compilation, code generation, …
- Program migration: porting a Pascal program to C, translation between dialects (Fortran77 to Fortran90), …
- Reverse engineering: architecture extraction, design recovery, …
- Program analysis: measurement, clone detection, type inference, call graph, control flow graph, …

Reverse engineering
Example: design recovery
[Figure: reverse engineering turns Java code (Class A; Class B; Class C) into a class diagram with the classes A, B, and C.]

Program analysis
- Example: clone analysis
    …
    20: FOR I=1 TO 10
    30: V[I] = V[I] +1;
    40: PRINT V[I]
    50: ENDFOR
    60: PRINT X;
    70: CALL F;
    …
    100: FOR J=1 TO …
    110: W[J] = W[J] +1;
    120: PRINT W[J]
    130: ENDFOR
    …
Clones: the FOR loop starting at line 20 and the FOR loop starting at line 100.

What is TXL good for (2)?
Rephrasing:
- Normalization: reduces a program to a program in a sub-language to decrease its syntactic complexity (e.g. Pascal to "core Pascal").
- Optimization: improves the run-time and/or space performance of a program (e.g. code motion optimization).
- Refactoring or restructuring: changes the structure of the program to make it easier to understand.
- Renovation: error repair (e.g. Year 2000) and changed requirements (e.g. "lira" to "euro"). Does not preserve semantics.

Optimization
Example: code motion optimization moves all loop-independent assignment statements outside of loops.
Before:
    Loop
        x := a + b;
        y := x;
        a := y + 3;
        x := b + c;
        y := x - 2;
        z := x + a * y;
    End loop
After code motion optimization:
    x4 := b + c;
    y2 := x4 - 2;
    Loop
        x := a + b;
        y := x;
        a := y + 3;
        z := x4 + a * y2;
    End loop

Restructuring
- Example: goto elimination
Before:
    f := 0;
    A_0: if x > n goto B_3;
    x := x - 1;
    f := f * x;
    goto A_0;
    B_3: print f
After goto elimination:
    f := 0;
    while x <= n do
        x := x - 1;
        f := f * x;
    end while
    print f

TXL Components
Each TXL program has two components:
- A description of the structure to be transformed, specified as an EBNF grammar in context-free, possibly ambiguous, form.
- A set of transformation rules specified by example, using pattern/replacement pairs.

Syntax definition
- A grammar G describes the syntax of a language.
- L(G) is the language defined by the grammar G.
- Usually a grammar G is given in (E)BNF notation.
Example: E --> E + E | E * E | 0 | 1 | 2
    E is a non-terminal; +, *, 0, 1, and 2 are terminals.
    0+1*2 is in L(E)
    3+0 is not in L(E)
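Membership in the language of the example grammar can be checked mechanically. Every sentence derived from E is one of the digits 0, 1, 2 followed by zero or more operator-digit pairs, so the sketch below simply checks that alternation. The function name `in_language` is illustrative, and the checker ignores whitespace as a convenience.

```python
DIGITS = {"0", "1", "2"}
OPERATORS = {"+", "*"}


def in_language(text):
    """Return True if text is derivable from E --> E + E | E * E | 0 | 1 | 2."""
    tokens = [ch for ch in text if not ch.isspace()]
    if len(tokens) % 2 == 0:
        return False  # sentences have an odd number of tokens (also rejects "")
    for index, token in enumerate(tokens):
        expected = DIGITS if index % 2 == 0 else OPERATORS
        if token not in expected:
            return False
    return True


print(in_language("0+1*2"), in_language("3+0"))
```

The two calls reproduce the slide's examples: the first string is accepted, the second is rejected because 3 is not a terminal of the grammar.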

BNF vs. EBNF
BNF:
    List --> List Element ;
    List --> Element ;
    Element --> number
    Element --> word
    Element --> word sign word
EBNF:
    List --> ((word [sign word] | number) ; )*

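Parse Tree vs. AST
E --> E + E | E * E | 0 | 1 | 2
[Figure: parse tree and abstract syntax tree of 0*1+2. The parse tree keeps the E non-terminal nodes; the abstract syntax tree keeps only the operators + and * and the operands 0, 1, and 2.]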

Ambiguity
- A grammar that produces more than one parse tree for some term is said to be ambiguous.
Example: E --> E + E | E * E | 0 | 1 | 2 is ambiguous: 0+1*2 has two parse trees.
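The claim that 0+1*2 has two parse trees can be checked by counting derivations. The sketch below counts, for each operator position, the ways to split the input into a left and a right E; `count_parse_trees` is an illustrative name, and the input is assumed to consist of single-character tokens.

```python
from functools import lru_cache


def count_parse_trees(text):
    """Count parse trees of text under E --> E + E | E * E | 0 | 1 | 2."""
    tokens = tuple(ch for ch in text if not ch.isspace())

    @lru_cache(maxsize=None)
    def count(lo, hi):
        # Number of ways tokens[lo:hi] can be derived from E.
        if hi - lo == 1:
            return 1 if tokens[lo] in "012" else 0
        total = 0
        for k in range(lo + 1, hi - 1):
            if tokens[k] in "+*":  # use tokens[k] as the top-level operator
                total += count(lo, k) * count(k + 1, hi)
        return total

    return count(0, len(tokens)) if tokens else 0


print(count_parse_trees("0+1*2"))
```

One tree groups the input as 0+(1*2), the other as (0+1)*2; with more operators the count grows quickly.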

Transformation rules
- A transformation rule is a rule of the form: Lhs ----> Rhs if cond, where Lhs and Rhs are term patterns and the condition is optional.
- The application of a rule to a term succeeds if the term matches the Lhs pattern (pattern matching) and the condition is true.
- The result is the instantiation of the Rhs pattern.
For example, if we have the term […] and apply the rule x […] ----> x to it, the result is 3 (the pattern variable x matches the number 3).
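Matching a term against Lhs, checking the condition, and instantiating Rhs can be sketched in a few lines of Python. Everything here is an assumption made for illustration: terms are nested tuples, pattern variables are strings starting with "?", and the rule x + 0 ----> x is a hypothetical example, not necessarily the one shown on the slide.

```python
def match(pattern, term, bindings):
    """Match term against pattern; return the extended bindings or None."""
    if isinstance(pattern, str) and pattern.startswith("?"):
        if pattern in bindings:
            return bindings if bindings[pattern] == term else None
        return {**bindings, pattern: term}
    if isinstance(pattern, tuple) and isinstance(term, tuple):
        if len(pattern) != len(term):
            return None
        for sub_pattern, sub_term in zip(pattern, term):
            bindings = match(sub_pattern, sub_term, bindings)
            if bindings is None:
                return None
        return bindings
    return bindings if pattern == term else None


def instantiate(pattern, bindings):
    """Build the result term by substituting the bound pattern variables."""
    if isinstance(pattern, str) and pattern.startswith("?"):
        return bindings[pattern]
    if isinstance(pattern, tuple):
        return tuple(instantiate(part, bindings) for part in pattern)
    return pattern


def apply_rule(lhs, rhs, term, cond=lambda bindings: True):
    """Apply Lhs ----> Rhs if cond; return term unchanged if it does not apply."""
    bindings = match(lhs, term, {})
    if bindings is None or not cond(bindings):
        return term
    return instantiate(rhs, bindings)


# Hypothetical rule: x + 0 ----> x, applied to the term 3 + 0.
print(apply_rule(("+", "?x", 0), "?x", ("+", 3, 0)))
```

When the term does not match, or the optional condition is false, the term is returned unchanged.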

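The three phases of TXL
Command: txl "input file" "txl file"
- Parse: input text ----> parse tree
- Transform: parse tree ----> transformed parse tree
- Unparse: transformed parse tree ----> output text
[Figure: the input text "blue fish" is parsed into a [words] tree made of [word] blue followed by a [words] subtree with [word] fish and [empty]; the transformed parse tree is a [words] tree with [word] marlin and [empty]; the output text is "marlin".]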

First example: 'expr' grammar
    % BNF: Expr --> Num | Expr+Expr | Expr*Expr | (Expr)

    % Part I. Syntax specification
    define program
        [Expr]
    end define

    define Expr
        [number]
      | [Expr] + [Expr]
      | [Expr] * [Expr]
      | ([Expr])
    end define

First example: rules
N + 0 ----> N
    rule removePlusZero
        replace [Expr]
            N [number] + '0
        by
            N
    end rule
(N) ----> N
    rule resolveBracketedExpressions
        replace [Expr]
            ( N [number] )
        by
            N
    end rule

First example: main rule
    % Part 2: main rule
    rule main
        replace [Expr]
            E [Expr]
        construct NewE [Expr]
            E [removePlusZero] [resolveBracketedExpressions]
        where not
            NewE [= E]
        by
            NewE
    end rule
[Figure: parse trees of E and NewE. E is a * node over a + node (with the numbers 9 and 0) and the number 1; NewE is a * node over two numbers.]

First example: parsing
    txl -Dparse es.txt Expr.Grm
Input: 9+0
[Figure: input parse tree and output parse tree; the input tree has a program node over an Expr node whose children are number, +, and number.]

First example: transforming
Input: (9+0) * 1
    txl -Dapply es.txt Expr.txl
    Transforming … ==> 9 [removePlusZero]
    (9) ==> 9 [resolveBracketedExpressions]
    (9 + 0) * 1 ==> 9 * 1 [main]
    9 * 1 ==> 9 [removeMultiplicationByOne]
    9 * 1 ==> 9 [main]
    9

First example: unparsing
- [NL] forces a new line of output.
- [IN] indents all following output lines by four spaces.
- [EX] exdents all following output lines by four spaces.
Example:
    define Procedure
        [id] [Parameters] [NL] [IN]
        'begin [NL] [IN]
            [body] [NL] [EX]
        'end [NL] [EX]
    end define

Anatomy of a TXL program
- Base grammar
- Grammar overrides
- Transformation rules
The base grammar defines the lexical forms (tokens or terminals) and the syntactic forms (non-terminals). The optional grammar overrides redefine non-terminals of the base grammar. The rule set defines the transformation rules and functions.

Anatomy of a TXL program (example: Expr grammar)
- Base grammar:
    include "Expr.Grammar"
- Grammar overrides:
    redefine expr
        …
      | exp([number], [number])
    include "Expr-exp.Grammar"
- Transformation rules:
    rule main
    rule one
    rule two

Specifying Lexical Forms
- Lexical forms specify how the input is partitioned into tokens.
- Predefined defaults include identifiers [id] (e.g. ABC, rt789), integer and float numbers [number] (e.g. 123, 3e22), and strings [stringlit] (e.g. "hi there").
- The tokens statement gives regular expressions for each class of token in the input language.
Example:
    tokens
        hexnumber "0[xX][\dABCDEFabcdef]+"
    end tokens

Specifying Lexical Forms (cont'd)
    tokens
        name "regular expression"
    end tokens
Regular expression:
- Any single character (other than [ and ]) not preceded by a \ or # simply represents itself.
- Single-character patterns: e.g. \d (digits), \a (alphabetic characters).
- Regular expression operators: [PQR] (any one of), (PQR) (sequence of), P*, P+, P?.

Specifying Lexical Forms (cont'd)
- The keys statement specifies that certain identifiers are to be treated as unique special symbols.
- The compounds statement specifies character sequences to be treated as a single terminal.
- The comments statement specifies the commenting conventions of the input language. By default, comments are ignored by TXL.
    keys
        procedure repeat 'program
    end keys

    compounds
        := >= <=
    end compounds

    comments
        /* */
    end comments

Specifying Syntactic Forms
- The general form of a non-terminal definition is:
    define name
        alternative1
      | alternative2
        …
      | alternativeN
    end define
- Each alternative is any sequence of terminals and non-terminals (N.B.: non-terminals are enclosed in square brackets).
- The special type [program] describes the structure of the entire input.

Specifying Syntactic Forms (cont'd)
- Extended BNF-like sequence notation:
    [repeat X]    sequence of zero or more (X*)
    [list X]      comma-separated list
    [opt X]       optional (zero or one)
Example:
    define statements
        [repeat statement+]
    end define
and
    define statements
        [statement]
      | [statement] [statements]
    end define
… are equivalent.

Specifying Syntactic Forms (cont'd)
    keys
        procedure begin 'end int bool
    end keys

    define proc
        'procedure [id] [formalParameters]
        'begin
            [body]
        'end
    end define

    define formalParameters
        '( [list formalParameter+] ')
      | [empty]
    end define

    define formalParameter
        [id] ': [type]
    end define

    define type
        'int | 'bool
    end define

Ambiguity
- TXL resolves ambiguities by choosing the first alternative of each non-terminal that can match the input.
Example: T-language
    define T
        [number]
      | ([T])
      | + [T]
      | + + [T]
    end define
[Figure: the two possible parse trees of the input ++2, one using + [T] twice and one using + + [T] once.]
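The "first alternative wins" policy can be illustrated with a small ordered-choice parser for the T-language. This is a simplified sketch, not TXL's actual parsing algorithm: it tries the alternatives in the order they are written and commits to the first one that succeeds, tokens are single characters, and the node labels are made up for the example.

```python
def parse_T(tokens, pos=0):
    """Return (tree, next_pos) for the first alternative that matches, else None."""
    if pos >= len(tokens):
        return None
    token = tokens[pos]
    # Alternative 1: [number]
    if token.isdigit():
        return ("number", int(token)), pos + 1
    # Alternative 2: ( [T] )
    if token == "(":
        inner = parse_T(tokens, pos + 1)
        if inner and inner[1] < len(tokens) and tokens[inner[1]] == ")":
            return ("paren", inner[0]), inner[1] + 1
    # Alternative 3: + [T]
    if token == "+":
        inner = parse_T(tokens, pos + 1)
        if inner:
            return ("plus", inner[0]), inner[1]
    # Alternative 4: + + [T]  (only tried if alternative 3 failed)
    if token == "+" and pos + 1 < len(tokens) and tokens[pos + 1] == "+":
        inner = parse_T(tokens, pos + 2)
        if inner:
            return ("plusplus", inner[0]), inner[1]
    return None


def parse(text):
    """Parse a whole input string as a T, or return None."""
    tokens = [ch for ch in text if not ch.isspace()]
    result = parse_T(tokens)
    if result is None or result[1] != len(tokens):
        return None
    return result[0]


print(parse("++2"))
```

For ++2 the parser builds two nested "plus" nodes: the third alternative matches first, so the fourth is never used for this input.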

Transformation rules
- TXL has two kinds of transformation rules, rules and functions, which are distinguished by whether they should transform only one (for functions) or many (for rules) occurrences of their pattern.
- Rules search their scope for the first instance of their target type matching their pattern, transform it, and then reapply to the entire scope until no more matches are found.
- Functions do not search, but attempt to match only their entire scope to their pattern, transforming it if it matches.

Rules and functions
    function 2To42
        replace [number]
            2
        by
            42
    end function

    rule 2To42
        replace [number]
            2
        by
            42
    end rule
Rules search for the pattern!
[Figure: example applications of the function and of the rule.]

Searching functions
    function 2To42
        replace * [number]
            2
        by
            42
    end function
Note: the only change is the * after replace.
[Figure: example application of the searching function.]
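The difference between a function, a searching function, and a rule can be sketched in Python on a toy scope. The assumptions are loud ones: a scope is modeled as a number or a nested list of numbers, and the helper names are invented for this example. A function looks only at the whole scope, a searching function replaces the first match it finds, and a rule keeps replacing until nothing matches.

```python
def apply_function(scope, old=2, new=42):
    """Function: transform the scope only if the entire scope matches."""
    return new if scope == old else scope


def apply_searching_function(scope, old=2, new=42):
    """Searching function: replace the first match found in the scope, once."""
    return _replace_first(scope, old, new)[0]


def apply_rule(scope, old=2, new=42):
    """Rule: search, replace, and reapply until no more matches are found."""
    changed = True
    while changed:
        scope, changed = _replace_first(scope, old, new)
    return scope


def _replace_first(scope, old, new):
    """Return (new_scope, changed), replacing the first occurrence of old."""
    if scope == old:
        return new, True
    if isinstance(scope, list):
        for index, item in enumerate(scope):
            replaced, changed = _replace_first(item, old, new)
            if changed:
                return scope[:index] + [replaced] + scope[index + 1:], True
    return scope, False


sample = [1, 2, [2, 3]]
print(apply_function(sample))
print(apply_searching_function(sample))
print(apply_rule(sample))
```

On the sample scope the function changes nothing, the searching function rewrites only the first 2, and the rule rewrites both.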

Syntax of rules and functions
Simplified and given in TXL:
    'rule [ruleid] [repeat formalArgument]
        [repeat construct_deconstruct_where]
        'replace [type]
            [pattern]
        [repeat construct_deconstruct_where]
        'by
            [replacement]
    'end rule
The same holds for functions!
N.B. If the where-condition is false, the rule cannot be applied and the result is the input AST.

Built-in functions
    rule resolveAdd
        replace [expr]
            N1 [number] + N2 [number]
        by
            N1 [add N2]
    end rule

    function add
        …
    end function
and
    rule resolveAdd
        replace [expr]
            N1 [number] + N2 [number]
        by
            N1 [+ N2]
    end rule
… are equivalent!

Built-in functions (cont'd)
    rule sort
        replace [repeat number]
            N1 [number] N2 [number] Rest [repeat number]
        where
            N1 [> N2]
        by
            N2 N1 Rest
    end rule
[Figure: example sequence of numbers before and after applying the sort rule.]
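The sort rule above amounts to sorting by repeated local rewriting: find the first adjacent pair that is out of order, swap it, and start over until no pair matches. A Python sketch of that reading follows; `sort_by_rewriting` is an illustrative name.

```python
def sort_by_rewriting(numbers):
    """Swap the first out-of-order adjacent pair until none is left."""
    numbers = list(numbers)
    while True:
        for i in range(len(numbers) - 1):
            if numbers[i] > numbers[i + 1]:  # where N1 [> N2]
                # by N2 N1 Rest
                numbers[i], numbers[i + 1] = numbers[i + 1], numbers[i]
                break  # reapply the rule to the whole scope
        else:
            return numbers  # no match anywhere: the rule stops


print(sort_by_rewriting([3, 1, 2]))
```

Each swap removes one inversion, so the loop terminates with the numbers in ascending order.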

Recursive functions
    function fact
        replace [number]
            n [number]
        construct nMinusOne [number]
            n [- 1]
        where
            n [> 1]
        construct factMinusOne [number]
            nMinusOne [fact]
        by
            n [* factMinusOne]
    end function
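For comparison, here is a direct Python transcription of the recursive function above. It assumes the behavior described in the slides for a failed condition: when the where clause does not hold, the function does not apply and its scope is returned unchanged.

```python
def fact(n):
    """Mirror of the TXL function: recurse only while n > 1."""
    if not n > 1:
        return n  # 'where n [> 1]' fails: the scope is left unchanged
    n_minus_one = n - 1                    # construct nMinusOne
    fact_minus_one = fact(n_minus_one)     # construct factMinusOne
    return n * fact_minus_one              # by n [* factMinusOne]


print(fact(5))
```

Under this reading the function leaves 0 as 0, which differs from the mathematical convention 0! = 1.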

Using rule parameters
    rule resolveConstants
        replace [repeat statement]
            'const C [id] = V [expr]
            RestOfScope [repeat statement]
        by
            RestOfScope [replaceByValue C V]
    end rule

    rule replaceByValue ConstName [id] Value [expr]
        replace [primary]
            ConstName
        by
            (Value)
    end rule
Example:
    Const Pi = 3.14;
    Area := r*r*Pi;
becomes
    Area := r*r*3.14;

Deconstruct and searching functions
    rule vectorizeScalarAssignments
        replace [repeat statement]
            C1 [statement]
            C2 [statement]
            rest [repeat statement]
        deconstruct C1
            V1 [var] := E1 [expr];
        deconstruct C2
            V2 [var] := E2 [expr];
        where not
            E2 [reference V1]
        where not
            E1 [reference V2]
        construct Passign [statement]
            := ;
        by
            Passign
            rest
    end rule

    function reference V [variable]
        match * [variable]
            V
    end function
Example:
    x:=x+1; y:=t+4;    ---->    := ;
    x:=x+1; y:=x+4;    No!

Working with Global Variables
- Global variables are a rich and powerful feature that can be used for many distinct purposes, including:
    - global tables.
    - multiple results from a rule.
    - "deep" parameters.
    - "message-passing" communication between rules in a rule set (e.g., avoiding interference).

Setting Global Table
- Global tables can be set up using an export clause before the replace clause in the main rule of a program.
Example:
    define table_entry
        [stringlit] -> [stringlit]
    end define

    function main
        export Table [repeat table_entry]
            "Veggie" -> "Cabbage"
            "Fruit" -> "Apple"
            "Fruit" -> "Orange"
        replace [program]
            P [program]
        by
            P [R1]
    end function

Adding Table Entry
- Global tables can be modified by exporting a new binding for the table based on the imported original binding.
Example:
    function addTableEntry
        import Table [repeat table_entry]
        …
        construct NewEntry [table_entry]
            …
        export Table
            Table [. NewEntry]
        …
    end function

Searching in a Table
- Global tables can be easily queried using searching deconstructors.
Example:
    deconstruct * [table_entry] Table
        Kind [stringlit] -> "Orange"
- The binding for Kind will be the [stringlit] "Fruit". If no match were found, the deconstructor would fail.

Avoiding interference between rules
We want: 1 ----> 2 and 2 ----> 3, but a 1 that has just become 2 must not be turned into 3.
    function shiftByOne
        export Flag [id]
            'not_found
        replace [number]
            N [number]
        by
            N [replaceOneByTwo] [replaceTwoByThree]
    end function

    function replaceOneByTwo
        replace [number]
            1
        export Flag
            'found
        by
            2
    end function

    function replaceTwoByThree
        import Flag [id]
        deconstruct Flag
            'not_found
        replace [number]
            2
        by
            3
    end function
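The flag technique above can be mirrored in Python, with a small dictionary standing in for the global variable Flag. This is only an analogy for how the exported value blocks the second function; the names are transliterations of the TXL ones.

```python
def shift_by_one(n):
    """Apply 1 -> 2 and then 2 -> 3, guarded by a shared flag."""
    state = {"flag": "not_found"}  # export Flag [id] 'not_found

    def replace_one_by_two(x):
        if x == 1:
            state["flag"] = "found"  # export Flag 'found
            return 2
        return x

    def replace_two_by_three(x):
        if state["flag"] != "not_found":  # deconstruct Flag 'not_found fails
            return x
        return 3 if x == 2 else x

    # N [replaceOneByTwo] [replaceTwoByThree]
    return replace_two_by_three(replace_one_by_two(n))


print(shift_by_one(1), shift_by_one(2))
```

Without the flag, the input 1 would be rewritten to 2 and then immediately to 3; with it, the second function is skipped once the first one has fired.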

Counting items in TXL
- TXL can be used for counting items (e.g. LOCs, number of cycles, etc.).
For example: given a tag language, count the number of tags and end-tags.
    Input:  uno due
    Output: tags: 2
            end tags: 1

    % Tags grammar
    define program
        [repeat element]
    end define

    define element
        [Tag] | [endTag] | [id]
    end define

    define Tag
    end define

    define endTag
    end define

    % Count number of tags
    function main
        replace [program]
            P [program]
        construct ListTags [repeat Tag]
            _ [^ P]
        construct NumberTags [number]
            _ [countTag each ListTags] [printf]
        by
    end function
R1 [^ X1]: replaces R1, of type [repeat T], with a sequence consisting of every subtree of type [T] contained in X1.

    function countTag A [Tag]
        replace [number]
            N [number]
        by
            N [+ 1]
    end function

    function printf
        match [number]
            N [number]
        construct PrintObj [number]
            N [print]
    end function
print is a built-in function!

Using attributes
TXL allows a grammar definition to have attributes associated with it:
1. Attributes act like optional non-terminals, but normally do not appear in the output (txl -attr forces all attributes to be printed).
2. Attributes may be added to the parse tree during transformations.
3. Attributes are denoted in the grammar by the non-terminal modifier attr.

    define type
        'int | 'string
    end define

    define typed_id
        [id] [attr type]
    end define

    function InferType expr [expression]
        replace [typed_id]
            Id [id]
        deconstruct expr
            f [number] op [operator] s [number]
        by
            Id 'int
    end function
The attribute 'type' is optional.

Remark
1. Several functions (or rules) may be applied to a scope in succession. For example: X [f][g][h] means h(g(f(X))).
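The left-to-right reading of X [f][g][h] corresponds to folding a list of functions over a starting value. A minimal Python sketch follows; the helper name `apply_in_succession` and the three sample functions are made up for the illustration.

```python
from functools import reduce


def apply_in_succession(x, *functions):
    """Apply the functions left to right: X [f][g][h] is h(g(f(X)))."""
    return reduce(lambda value, fn: fn(value), functions, x)


def f(x):
    return x + 1


def g(x):
    return x * 2


def h(x):
    return x - 3


print(apply_in_succession(5, f, g, h) == h(g(f(5))))
```

The first function listed is the innermost call, so the written order matches the order of application.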