1
The TXL Programming Language Filippo Ricca ITC-Irst Istituto per la ricerca Scientifica e Tecnologica ricca@itc.it
2
What is TXL?
TXL is a programming language specifically designed to support software analysis and program transformation.
The TXL programming language is a functional, rule-based language.
Example of a functional rule-based program:
    functions:
        F(x) = x-1;
        G(x) = x+1;
    rules:
        If x<5 then x:=G(x);
        If x>5 then x:=F(x);
        If x=5 then stop;
Flow of execution is not sequential!
3
Program Transformations
Program transformation is the act of changing one program into another: a program P in a source language L is transformed into a program P' in a target language L'.
    L is different from L'  ---->  translation
    L is equal to L'        ---->  rephrasing
4
What is TXL good for (1)? Translation
[Diagram: Aspect Language, High-level Language 1, High-level Language 2 and Low-level Language, connected by translations labelled analysis, migration, reverse engineering and synthesis.]
5
Translations
- Program synthesis: compilation, code generation, …
- Program migration: porting a Pascal program to C, translation between dialects (Fortran77 to Fortran90), …
- Reverse engineering: architecture extraction, design recovery, …
- Program analysis: measurement, clone detection, type inference, call graph, control flow graph, …
6
Reverse engineering
Example: design recovery
[Diagram: Java code (Class A; Class B; Class C) is reverse engineered into a class diagram relating A, B and C, with multiplicities 1, 1, 1 and *.]
7
Program analysis
Example: clone analysis
    …
    20: FOR I=1 TO 10
    30: V[I] = V[I] +1;
    40: PRINT V[I]
    50: ENDFOR
    60: PRINT X;
    70: CALL F;
    …
    100: FOR J=1 TO 10
    110: W[J] = W[J] +1;
    120: PRINT W[J]
    130: ENDFOR
    …
Clones: lines 20-50 and 100-130; …
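The clone pair in this listing can be found mechanically. The following is a minimal Python sketch, not TXL: the keyword set, the helper names `normalize` and `find_clones`, and the fixed window size are all assumptions made for illustration. It renames every identifier to a placeholder and reports statement windows that become identical.

```python
import re
from collections import defaultdict

# Assumed keyword set for the toy language used on the slide.
KEYWORDS = {"FOR", "TO", "PRINT", "ENDFOR", "CALL"}

def normalize(stmt):
    """Replace every non-keyword identifier with ID and drop spaces."""
    renamed = re.sub(
        r"[A-Za-z_]\w*",
        lambda m: m.group(0) if m.group(0) in KEYWORDS else "ID",
        stmt,
    )
    return renamed.replace(" ", "")

def find_clones(program, window):
    """Group start line numbers whose `window` consecutive statements
    are identical after normalization."""
    lines = sorted(program)
    seen = defaultdict(list)
    for i in range(len(lines) - window + 1):
        chunk = tuple(normalize(program[n]) for n in lines[i:i + window])
        seen[chunk].append(lines[i])
    return [starts for starts in seen.values() if len(starts) > 1]

program = {
    20: "FOR I=1 TO 10", 30: "V[I] = V[I] +1;", 40: "PRINT V[I]", 50: "ENDFOR",
    60: "PRINT X;", 70: "CALL F;",
    100: "FOR J=1 TO 10", 110: "W[J] = W[J] +1;", 120: "PRINT W[J]", 130: "ENDFOR",
}
print(find_clones(program, window=4))  # [[20, 100]]
```

A real clone detector would work on parse trees rather than on text lines; the sketch only shows why the two loops count as clones once variable names are abstracted away.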
8
What is TXL good for (2)
Rephrasing:
- Normalization: reduces a program to a program in a sub-language to decrease its syntactic complexity (e.g. Pascal to "core Pascal").
- Optimization: improves the run-time and/or space performance of a program (e.g. code motion optimization).
- Refactoring or restructuring: changes the structure of the program to make it easier to understand.
- Renovation: error repair (e.g. Year 2000) and changed requirements (e.g. "lira" to "euro"). Does not preserve semantics.
9
Optimization
Example: code motion optimization moves all loop-independent assignment statements outside of loops.
Before:
    Loop
        x := a + b;
        y := x;
        a := y + 3;
        x := b + c;
        y := x - 2;
        z := x + a * y;
    End loop
After code motion optimization:
    x4 := b + c;
    y2 := x4 - 2;
    Loop
        x := a + b;
        y := x;
        a := y + 3;
        z := x4 + a * y2;
    End loop
10
Restructuring
Example: goto elimination
Before:
    f := 0;
    A_0:
    if x > n goto B_3;
    x := x - 1;
    f := f * x;
    goto A_0;
    B_3:
    print f
After goto elimination:
    f := 0;
    while x <= n do
        x := x - 1;
        f := f * x;
    end while
    print f
11
TXL Components
Each TXL program has two components:
- A description of the structure to be transformed, specified as an EBNF grammar in context-free, ambiguous form.
- A set of transformation rules specified by example, using pattern/replacement pairs.
12
Syntax definition
A grammar G describes the syntax of a language. L(G) is the language defined by the grammar G.
Usually a grammar G is given in (E)BNF notation.
Example: E --> E + E | E * E | 0 | 1 | 2
Here E is a non-terminal; +, *, 0, 1 and 2 are terminals.
    0+1*2 is in L(G)
    3+0 is not in L(G)
13
BNF vs. EBNF
BNF:
    List --> List Element ;
    List --> Element ;
    Element --> number
    Element --> word
    Element --> word sign word
EBNF:
    List --> ((word [sign word] | number) ; )*
14
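Parse Tree vs. AST
E --> E + E | E * E | 0 | 1 | 2
[Figure: the parse tree and the abstract syntax tree of 0*1+2. The parse tree contains the E nodes; the abstract syntax tree contains only the operators + and * and the operands 0, 1 and 2.]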
15
Ambiguity
A grammar that produces more than one parse tree for some term is said to be ambiguous.
Example: E --> E + E | E * E | 0 | 1 | 2 is ambiguous, because 0+1*2 has two parse trees.
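The two parse trees can be enumerated directly. The following is a small Python sketch, not TXL; the function name `parse_trees` and the tuple encoding of trees are assumptions made for illustration. It tries every operator position as the top-level split of E --> E + E | E * E | 0 | 1 | 2.

```python
TERMINALS = ("0", "1", "2")
OPERATORS = ("+", "*")

def parse_trees(tokens):
    """Return every parse tree of `tokens` for E --> E + E | E * E | 0 | 1 | 2.
    A tree is a terminal string or a tuple (operator, left, right)."""
    if len(tokens) == 1:
        return [tokens[0]] if tokens[0] in TERMINALS else []
    trees = []
    for i, tok in enumerate(tokens):
        if tok in OPERATORS:
            # Use this operator as the root and parse both sides recursively.
            for left in parse_trees(tokens[:i]):
                for right in parse_trees(tokens[i + 1:]):
                    trees.append((tok, left, right))
    return trees

for tree in parse_trees(list("0+1*2")):
    print(tree)
# ('+', '0', ('*', '1', '2'))
# ('*', ('+', '0', '1'), '2')
```

The same function returns an empty list for `3+0`, since 3 is not a terminal of the grammar.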
16
Transformation rules
- A transformation rule is a rule of the form:
      Lhs ----> Rhs if cond
  where Lhs and Rhs are term patterns and the condition is optional.
- The application of a rule to a term succeeds if the term matches the Lhs pattern (pattern matching) and the condition is true.
- The result is the instantiation of the Rhs pattern.
For example, applying the rule x + 0 ---> x to the term 3 + 0 gives the result 3 (the pattern variable x matches the number 3).
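Matching and instantiation can be sketched in a few lines of Python. This is an illustration of the general idea, not of TXL's implementation: the `Var` class, the tuple encoding of terms and the helper names are assumptions.

```python
class Var:
    """A pattern variable such as x in the rule x + 0 ---> x."""
    def __init__(self, name):
        self.name = name

def match(pattern, term, bindings):
    """Return extended bindings if `term` matches `pattern`, else None."""
    if isinstance(pattern, Var):
        if pattern.name in bindings:
            return bindings if bindings[pattern.name] == term else None
        return {**bindings, pattern.name: term}
    if isinstance(pattern, tuple):
        if not isinstance(term, tuple) or len(pattern) != len(term):
            return None
        for sub_pattern, sub_term in zip(pattern, term):
            bindings = match(sub_pattern, sub_term, bindings)
            if bindings is None:
                return None
        return bindings
    return bindings if pattern == term else None

def instantiate(pattern, bindings):
    """Build the result term by substituting the bound variables."""
    if isinstance(pattern, Var):
        return bindings[pattern.name]
    if isinstance(pattern, tuple):
        return tuple(instantiate(p, bindings) for p in pattern)
    return pattern

def apply_rule(lhs, rhs, term, cond=lambda bindings: True):
    """Apply Lhs ----> Rhs if cond; return the term unchanged on failure."""
    bindings = match(lhs, term, {})
    if bindings is None or not cond(bindings):
        return term
    return instantiate(rhs, bindings)

x = Var("x")
print(apply_rule(("+", x, 0), x, ("+", 3, 0)))  # 3
```

Here the term 3 + 0 is encoded as `("+", 3, 0)`; a term such as `("+", 3, 1)` does not match the pattern and is returned unchanged.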
17
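The three phases of TXL
Command: txl "input file" "txl file"
Phases: Parse ----> Transform ----> Unparse
Data: input text ----> parse tree ----> transformed parse tree ----> output text
Example: the input text "blue fish" is parsed into the tree [words]([word] blue, [words]([word] fish, [empty])), transformed into the tree [words]([word] marlin, [empty]), and unparsed into the output text "marlin".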
18
First example: 'expr' grammar
% BNF: Expr --> Num | Expr+Expr | Expr*Expr | (Expr)
% Part I. Syntax specification
define program
    [Expr]
end define
define Expr
    [number]
  | [Expr] + [Expr]
  | [Expr] * [Expr]
  | ( [Expr] )
end define
19
First example: rules
    N + 0 -------> N
    (N)   -------> N
rule removePlusZero
    replace [Expr]
        N [number] + '0
    by
        N
end rule
rule resolveBracketedExpressions
    replace [Expr]
        ( N [number] )
    by
        N
end rule
20
First example: main rule
% Part 2: main rule
rule main
    replace [Expr]
        E [Expr]
    construct NewE [Expr]
        E [removePlusZero] [resolveBracketedExpressions]
    where not
        NewE [= E]
    by
        NewE
end rule
[Figure: the parse trees bound to E and to NewE, with nodes labelled *: Expr, +: Expr and 9, 0, 1: number.]
21
First example: parsing
txl -Dparse es.txt Expr.Grm
Input: 9+0
------ Input Parse Tree ------
[Figure: parse tree with the nodes Program, Expr, number 9, + and number 0.]
------ Output Parse Tree ------
22
First example: transforming
txl -Dapply es.txt Expr.txl
Input: (9+0) * 1
Transforming …
9 + 0 ==> 9 [removePlusZero]
(9) ==> 9 [resolveBracketedExpressions]
(9 + 0) * 1 ==> 9 * 1 [main]
9 * 1 ==> 9 [removeMultiplicationByOne]
9 * 1 ==> 9 [main]
9
23
First example: unparsing
[NL] forces a new line of output.
[IN] indents all following output lines by four spaces.
[EX] exdents all following output lines by four spaces.
Example:
define Procedure
    [id] [Parameters] [NL]
    [IN] 'begin [NL]
    [IN] [body] [NL]
    [EX] 'end [NL]
    [EX]
end define
24
Anatomy of a TXL program
A TXL program consists of a base grammar, grammar overrides and transformation rules.
- The base grammar defines the lexical forms (tokens or terminals) and the syntactic forms (non-terminals).
- The optional grammar overrides extend or modify non-terminals of the base grammar.
- The ruleset defines the set of transformation rules and functions.
25
Anatomy of a TXL program
Example: Expr grammar
Base grammar:
    include "Expr.Grammar"
Grammar overrides:
    redefine expr
        …
      | exp([number], [number])
    include "Expr-exp.Grammar"
Transformation rules:
    rule main
    rule one
    rule two
26
Specifying Lexical Forms
Lexical forms specify how the input is partitioned into tokens.
Predefined defaults include identifiers [id] (e.g. ABC, rt789), integer and float numbers [number] (e.g. 123, 123.23, 3e22), and strings [stringlit] (e.g. "hi there").
The tokens statement gives regular expressions for each class of token in the input language.
Example:
    tokens
        hexnumber "0[xX][\dABCDEFabcdef]+"
    end tokens
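The hexnumber pattern reads the same way as a conventional regular expression. As a rough Python analogue (TXL's pattern language is its own dialect, so the translation below and the helper name `is_hexnumber` are assumptions), the same token class can be tested with the `re` module:

```python
import re

# Python rendering of the TXL pattern "0[xX][\dABCDEFabcdef]+":
# a 0, then x or X, then one or more hexadecimal digits.
HEXNUMBER = re.compile(r"0[xX][\dABCDEFabcdef]+")

def is_hexnumber(text):
    """True if the whole of `text` is one hexnumber token."""
    return HEXNUMBER.fullmatch(text) is not None

print(is_hexnumber("0x1F"))  # True
print(is_hexnumber("0x"))    # False: at least one hex digit is required
```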
27
Specifying lexical Forms (cont'd)
Regular expression:
    tokens
        name "regular expression"
    end tokens
Any single char (other than [ and ]) not preceded by a \ or # simply represents itself.
Single char patterns: e.g. \d (digit), \a (alphabetic char).
Regular expression operators: [PQR] (any one of), (PQR) (sequence of), P*, P+, P?.
28
Specifying lexical Forms (cont'd)
The keys statement specifies that certain identifiers are to be treated as unique special symbols.
    keys
        procedure repeat 'program
    end keys
The compounds statement specifies char sequences to be treated as a single terminal.
    compounds
        := >= <=
    end compounds
The comments statement specifies the commenting conventions of the input language. By default comments are ignored by TXL.
    comments
        /* */
    end comments
29
Specifying Syntactic Forms
The general form of a non-terminal definition is:
    define name
        alternative1
      | alternative2
        …
      | alternativeN
    end define
where each alternative is any sequence of terminals and non-terminals (N.B. non-terminals are enclosed in square brackets).
The special type [program] describes the structure of the entire input.
30
Specifying Syntactic Forms (cont'd)
Extended BNF-like sequence notation:
    [repeat X]   sequence of zero or more (X*)
    [repeat X+]  sequence of one or more (X+)
    [list X]     comma-separated list
    [opt X]      optional (zero or one)
The definitions
    define statements
        [repeat statement+]
    end define
and
    define statements
        [statement]
      | [statement] [statements]
    end define
are equivalent.
31
Specifying Syntactic Forms (cont'd)
keys
    procedure begin 'end int bool
end keys
define proc
    'procedure [id] [formalParameters]
    'begin
        [body]
    'end
end define
define formalParameters
    '( [list formalParameter+] ')
  | [empty]
end define
define formalParameter
    [id] ': [type]
end define
define type
    'int | 'bool
end define
32
Ambiguity
TXL resolves ambiguities by choosing the first alternative of each non-terminal that can match the input.
Example: T-language
define T
    [number]
  | ( [T] )
  | + [T]
  | + + [T]
end define
[Figure: the two candidate parse trees for the input ++2, namely T(+ T(+ T(2))) and T(+ + T(2)).]
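The effect of "first alternative wins" on the T-language can be sketched with an ordered-choice recursive-descent parser in Python. This simplifies what TXL actually does (the function `parse_T`, the token-list input and the tuple encoding of trees are assumptions), but it shows why the fourth alternative is never chosen for ++2.

```python
def parse_T(tokens, pos=0):
    """Try the alternatives of T in declaration order and return
    (tree, next_position) for the first that matches, else None."""
    # Alternative 1: [number]
    if pos < len(tokens) and tokens[pos].isdigit():
        return ("T", tokens[pos]), pos + 1
    # Alternative 2: ( [T] )
    if pos < len(tokens) and tokens[pos] == "(":
        inner = parse_T(tokens, pos + 1)
        if inner and inner[1] < len(tokens) and tokens[inner[1]] == ")":
            return ("T", "(", inner[0], ")"), inner[1] + 1
    # Alternative 3: + [T]
    if pos < len(tokens) and tokens[pos] == "+":
        inner = parse_T(tokens, pos + 1)
        if inner:
            return ("T", "+", inner[0]), inner[1]
    # Alternative 4: + + [T]  (reached only if alternative 3 failed)
    if tokens[pos:pos + 2] == ["+", "+"]:
        inner = parse_T(tokens, pos + 2)
        if inner:
            return ("T", "+", "+", inner[0]), inner[1]
    return None

tree, end = parse_T(["+", "+", "2"])
print(tree)  # ('T', '+', ('T', '+', ('T', '2')))
```

Because `+ [T]` is listed before `+ + [T]`, the input ++2 is always parsed as two nested applications of the third alternative.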
33
Transformation rules
TXL has two kinds of transformation rules, rules and functions, which are distinguished by whether they transform only one occurrence of their pattern (functions) or many occurrences (rules).
Rules search their scope for the first instance of their target type that matches their pattern, transform it, and then reapply to the entire scope until no more matches are found.
Functions do not search; they attempt to match their entire scope against their pattern and transform it if it matches.
34
Rules and functions
function 2To42
    replace [number]
        2
    by
        42
end function
    2 ----> 42
    3 2 6 2 78 4 2 ----> 3 2 6 2 78 4 2 (no match: the scope as a whole is not the number 2)
rule 2To42
    replace [number]
        2
    by
        42
end rule
    2 ----> 42
    3 2 6 2 78 4 2 ----> 3 42 6 42 78 4 42
Rules search for the pattern!
35
Searching functions
function 2To42
    replace * [number]
        2
    by
        42
end function
Note: the only change is the *.
    2 ----> 42
    3 2 6 2 78 4 2 ----> 3 42 6 2 78 4 2
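The difference between a function, a searching function and a rule can be mimicked in Python on a flat list of numbers. This sketch is an analogy rather than TXL semantics in full; the three helper names are assumptions, and a scope is modelled as either a single number or a list of numbers.

```python
def run_function(scope):
    """Function: rewrite only if the entire scope is the number 2."""
    return 42 if scope == 2 else scope

def run_searching_function(scope):
    """Searching function (replace *): rewrite the first 2 found, once."""
    if scope == 2:
        return 42
    if isinstance(scope, list):
        for i, item in enumerate(scope):
            if item == 2:
                return scope[:i] + [42] + scope[i + 1:]
    return scope

def run_rule(scope):
    """Rule: search, rewrite, and reapply until nothing matches."""
    while True:
        rewritten = run_searching_function(scope)
        if rewritten == scope:
            return scope
        scope = rewritten

numbers = [3, 2, 6, 2, 78, 4, 2]
print(run_function(numbers))            # [3, 2, 6, 2, 78, 4, 2]
print(run_searching_function(numbers))  # [3, 42, 6, 2, 78, 4, 2]
print(run_rule(numbers))                # [3, 42, 6, 42, 78, 4, 42]
```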
36
Syntax of rules and functions
Simplified and given in TXL:
    'rule [ruleid] [repeat formalArgument]
        [repeat construct_deconstruct_where]
        'replace [type]
            [pattern]
        [repeat construct_deconstruct_where]
        'by
            [replacement]
    'end rule
The same holds for functions!
N.B. If the where-condition is false, the rule cannot be applied and the result is the input AST.
37
Built-in functions
rule resolveAdd
    replace [expr]
        N1 [number] + N2 [number]
    by
        N1 [add N2]
end rule
function add
    …
end function
and
rule resolveAdd
    replace [expr]
        N1 [number] + N2 [number]
    by
        N1 [+ N2]
end rule
are equivalent!
38
Built-in functions (cont'd)
rule sort
    replace [repeat number]
        N1 [number] N2 [number] Rest [repeat number]
    where
        N1 [> N2]
    by
        N2 N1 Rest
end rule
    22 4 2 15 1 ------> …. ------> 1 2 4 15 22
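The sort rule works by repeatedly swapping an adjacent out-of-order pair until no such pair remains. A Python sketch of that fixed-point behaviour follows; the function name `sort_by_rewriting` is an assumption, and the search order is simplified to "leftmost match first".

```python
def sort_by_rewriting(numbers):
    """Swap the first adjacent pair with N1 > N2, then search again
    from the start; stop when no pair matches (the rule's fixed point)."""
    numbers = list(numbers)
    while True:
        for i in range(len(numbers) - 1):
            if numbers[i] > numbers[i + 1]:  # where N1 [> N2]
                numbers[i], numbers[i + 1] = numbers[i + 1], numbers[i]
                break  # one rewrite done; reapply to the whole scope
        else:
            return numbers  # no match anywhere: the rule terminates

print(sort_by_rewriting([22, 4, 2, 15, 1]))  # [1, 2, 4, 15, 22]
```

Each swap removes one inversion, so the loop always terminates.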
39
Recursive functions
function fact
    replace [number]
        n [number]
    construct nMinusOne [number]
        n [- 1]
    where
        n [> 1]
    construct factMinusOne [number]
        nMinusOne [fact]
    by
        n [* factMinusOne]
end function
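The where clause is what stops the recursion: when it fails, the function does not apply and its scope is left unchanged. A Python rendering of the same control flow (an analogy only; TXL functions rewrite trees rather than return values) could look like this:

```python
def fact(n):
    """Python analogue of the TXL function fact."""
    if not n > 1:
        # 'where n [> 1]' fails: the function does not apply,
        # so the scope n is returned unchanged.
        return n
    n_minus_one = n - 1                   # construct nMinusOne
    fact_minus_one = fact(n_minus_one)    # construct factMinusOne
    return n * fact_minus_one             # by n [* factMinusOne]

print(fact(5))  # 120
```

One consequence of this failure semantics is that an input of 0 comes back as 0 rather than 1, because the function simply does not apply.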
40
Using rule parameters
rule resolveConstants
    replace [repeat statement]
        'const C [id] = V [expr]
        RestOfScope [repeat statement]
    by
        RestOfScope [replaceByValue C V]
end rule
rule replaceByValue ConstName [id] Value [expr]
    replace [primary]
        ConstName
    by
        ( Value )
end rule
Example:
    Const Pi = 3.14;
    Area := r*r*Pi;
  ---->
    Area := r*r*3.14;
41
Deconstruct and searching functions
rule vectorizeScalarAssignments
    replace [repeat statement]
        C1 [statement] C2 [statement] rest [repeat statement]
    deconstruct C1
        V1 [variable] := E1 [expr];
    deconstruct C2
        V2 [variable] := E2 [expr];
    where not
        E2 [reference V1]
    where not
        E1 [reference V2]
    construct Passign [statement]
        < V1, V2 > := < E1, E2 > ;
    by
        Passign rest
end rule
function reference V [variable]
    match * [variable]
        V
end function
Example:
    x:=x+1; y:=t+4;  ---->  <x,y> := <x+1,t+4>;
    x:=x+1; y:=x+4;  ---->  No!
42
Working with Global Variables Global variables are a rich and powerful feature that can be used for many distinct purposes, including: - global tables. - multiple results from a rule. - “deep” parameters. - “message-passing” communication between rules in a rule set (e.g, avoiding interference).
43
Setting Global Table
Global tables can be set up using an export clause before the replace clause in the main rule of a program.
Example:
define table_entry
    [stringlit] -> [stringlit]
end define
function main
    export Table [repeat table_entry]
        "Veggie" -> "Cabbage"
        "Fruit" -> "Apple"
        "Fruit" -> "Orange"
    replace [program]
        P [program]
    by
        P [R1]
end function
44
Adding Table Entry
Global tables can be modified by exporting a new binding for the table based on the imported original binding.
Example:
function addTableEntry
    import Table [repeat table_entry]
    …
    construct NewEntry [table_entry]
        …
    export Table
        Table [. NewEntry]
    …
end function
45
Searching in a Table
Global tables can be easily queried using searching deconstructors.
Example:
    deconstruct * [table_entry] Table
        Kind [stringlit] -> "Orange"
The binding for Kind will be the [stringlit] "Fruit". If no match is found, the deconstructor fails.
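A global table of this kind behaves much like a list of pairs that is searched for the first matching entry. The Python sketch below is an analogy only; `find_kind` and `add_entry` are hypothetical helper names, and a failing deconstructor is modelled by returning `None`.

```python
# The table from the earlier slide, as (kind, item) pairs.
table = [("Veggie", "Cabbage"), ("Fruit", "Apple"), ("Fruit", "Orange")]

def find_kind(entries, item):
    """Analogue of a searching deconstruct: return the kind of the
    first entry whose item matches, or None if the search fails."""
    for kind, name in entries:
        if name == item:
            return kind
    return None

def add_entry(entries, kind, item):
    """Analogue of re-exporting the table with one entry appended."""
    return entries + [(kind, item)]

print(find_kind(table, "Orange"))  # Fruit
print(find_kind(table, "Kiwi"))    # None
```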
46
Avoiding interference between rules
We want 1 ---> 2 and 2 ---> 3, without the chained rewrite 1 ---> 2 ---> 3.
function shiftByOne
    export Flag [id]
        'not_found
    replace [number]
        N [number]
    by
        N [replaceOneByTwo] [replaceTwoByThree]
end function
function replaceOneByTwo
    replace [number]
        1
    export Flag
        'found
    by
        2
end function
function replaceTwoByThree
    import Flag [id]
    deconstruct Flag
        'not_found
    replace [number]
        2
    by
        3
end function
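The flag protocol can be traced with a small Python analogue. This is a sketch under the assumption that a dictionary stands in for the TXL global variable Flag; the function names mirror the TXL ones but are otherwise illustrative.

```python
def shift_by_one(n):
    """Apply replaceOneByTwo then replaceTwoByThree, using a shared
    flag so that a freshly produced 2 is not rewritten again."""
    state = {"Flag": "not_found"}  # export Flag [id] 'not_found

    def replace_one_by_two(x):
        if x == 1:
            state["Flag"] = "found"  # export Flag 'found
            return 2
        return x

    def replace_two_by_three(x):
        # deconstruct Flag 'not_found: applies only if the flag is unset
        if state["Flag"] == "not_found" and x == 2:
            return 3
        return x

    return replace_two_by_three(replace_one_by_two(n))

print(shift_by_one(1))  # 2
print(shift_by_one(2))  # 3
```

Without the flag check, the input 1 would be rewritten to 2 and then immediately to 3.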
47
Counting items in TXL
TXL can be used for counting items (e.g. LOCs, number of cycles, etc.). For example: given a tag language, count the number of tags and end tags. uno due tags: 2 end Tags: 1
48
% Tags grammar
define program
    [repeat element]
end define
define element
    [Tag] | [endTag] | [id]
end define
define Tag end define
define endTag end define
% Count number of tags
function main
    replace [program]
        P [program]
    construct ListTags [repeat Tag]
        _ [^ P]
    construct NumberTags [number]
        _ [countTag each ListTags] [printf]
    by
end function
R1 [^ X1]: replace R1 of type [repeat T] with a sequence consisting of every subtree of type [T] contained in X1.
49
function countTag A [Tag]
    replace [number]
        N [number]
    by
        N [+ 1]
end function
function printf
    match [number]
        N [number]
    construct PrintObj [number]
        N [print]
end function
print is a built-in function!
50
Using attributes
TXL allows a grammar definition to have attributes associated with it:
1. Attributes act like optional non-terminals, but normally do not appear in the output (txl -attr forces all attributes to be printed).
2. Attributes may be added to the parse tree during transformations.
3. Attributes are denoted in the grammar by the non-terminal modifier attr.
51
define type
    'int | 'string
end define
define typed_id
    [id] [attr type]
end define
function InferType expr [expression]
    replace [typed_id]
        Id [id]
    deconstruct expr
        f [number] op [operator] s [number]
    by
        Id 'int
end function
The attribute 'type' is optional.
52
Remark 1.Several functions (or rules) may be applied to a scope in succession. For example: X [f][g][h] (the meaning is: h(g(f(X))) )
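The left-to-right reading of X [f][g][h] can be checked with a short Python sketch. The helper `apply_in_succession` and the three sample functions are assumptions chosen only to make the order of application visible.

```python
from functools import reduce

def apply_in_succession(x, *funcs):
    """Apply funcs left to right, as in X [f][g][h]."""
    return reduce(lambda acc, fn: fn(acc), funcs, x)

def f(n):
    return n + 1

def g(n):
    return n * 2

def h(n):
    return n - 3

print(apply_in_succession(5, f, g, h))  # 9, the same as h(g(f(5)))
```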