Compilation 2007
Domain-Specific Languages
Syntax Extensions
Michael I. Schwartzbach
BRICS, University of Aarhus

GPL Problem Solving

The General Purpose Language (GPL) approach:
- analyze the problem domain
- express the conceptual model as an OO design
- program a framework

Pros:
- predictable and familiar result
- (relatively) low cost of implementation

Cons:
- difficult to fully exploit domain-specific knowledge
- only available to general programmers

DSL Problem Solving

The DSL approach:
- analyze the problem domain
- express the conceptual model as a language design
- implement a compiler or interpreter

Pros:
- possible to exploit all domain-specific knowledge
- also available to domain experts

Cons:
- (relatively) high cost of implementation
- risk of Babylonian confusion

Variations of DSLs

A stand-alone DSL:
- a novel language with unique syntax and features
- example: LaTeX

An embedded DSL:
- an existing GPL extended with DSL features
- example: JSP

An external DSL:
- a stand-alone DSL invoked from a GPL
- example: SQL invoked from Java (JDBC)

From DSL to GPL

A stand-alone DSL may evolve into a GPL:
- Fortran = Formula Translation
- Algol = Algorithmic Language
- Cobol = Common Business Oriented Language
- Lisp = List Processing Language
- Simula = Simulation Language

A (successful) DSL design should plan for growth

Using Domain-Specific Knowledge

Domain-specific syntax:
- directly denote high-level concepts

Domain-specific analysis:
- consider global properties of the application
- domain-specific syntax clarifies the behavior

Domain-specific optimization:
- exploit domain-specific analysis results

GPL frameworks cannot provide these benefits

The Joos Peephole Language

A stand-alone DSL:
- no general-purpose computing is required

Domain concepts:
- bytecodes
- patterns
- templates

Implemented using:
- a parser
- a static checker
- an interpreter

DSL Syntax for Peepholes

    pattern dup_istore_pop x:
      x ~ dup istore (i0) pop
      -> 3 istore (i0)

GPL Syntax Alternative

    boolean dup_istore_pop(InstructionList x) {
      int i0;
      if (is_dup(x) && is_istore(x.next) && is_pop(x.next.next)) {
        i0 = (int)x.next.getArg();
        // singletonList yields a one-element list; chaining add() would pass a boolean
        x = replace(x,3,java.util.Collections.singletonList(new Iistore(i0)));
        return true;
      }
      return false;
    }

- Much harder to write correctly
- Fixed implementation strategy

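To make the amount of hidden bookkeeping concrete, here is a minimal C++ sketch of the same hand-coded strategy. It is not the Joos implementation: the `Op` enum, the `Instr` struct, and the vector-based representation of the instruction list are assumptions made purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical instruction representation (not taken from Joos).
enum class Op { Dup, Istore, Pop, Iload };

struct Instr {
    Op op;
    int arg;  // only meaningful for Istore and Iload in this sketch
};

// Hand-coded version of the pattern
//   dup; istore i0; pop  ->  istore i0
// applied at index pos. Returns true if the pattern matched.
bool dup_istore_pop(std::vector<Instr> &code, std::size_t pos) {
    // The bounds check is one of the details the DSL pattern hides.
    if (pos + 3 > code.size()) return false;
    if (code[pos].op != Op::Dup ||
        code[pos + 1].op != Op::Istore ||
        code[pos + 2].op != Op::Pop) {
        return false;
    }
    const int i0 = code[pos + 1].arg;
    const auto first = code.begin() + static_cast<std::ptrdiff_t>(pos);
    // Replace the three matched instructions by a single istore.
    const auto next = code.erase(first, first + 3);
    code.insert(next, Instr{Op::Istore, i0});
    return true;
}
```

Every check and index in this function is something the programmer must get right by hand, which is the point the slide makes about the GPL alternative.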
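DSL Analysis for Peepholes

Formal type and scope rules:

\[
\frac{\vdash E : \mathit{bytecodes}[\Gamma \to \Gamma'] \qquad \vdash P[\Gamma' \to \Gamma'']}
     {\vdash E \sim P : \mathit{boolean}[\Gamma \to \Gamma'']}
\]

\[
\frac{\vdash E : \mathit{boolean}[\Gamma \to \Gamma']}
     {\vdash\ !E : \mathit{boolean}[\Gamma \to \Gamma]}
\]

\[
\frac{\vdash E_1 : \mathit{boolean}[\Gamma \to \Gamma'] \qquad \vdash E_2 : \mathit{boolean}[\Gamma' \to \Gamma'']}
     {\vdash E_1\ \&\&\ E_2 : \mathit{boolean}[\Gamma \to \Gamma'']}
\]

This is checked by a phase in the DSL interpreter
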
GPL Analysis Alternative

Lots of yellow Post-it notes:
- don't assign to the same argument variable twice
- remember to return a boolean telling whether the pattern clicked
- always use the right kinds of arguments to the instructions

These cannot be checked by the Java compiler

The JWIG Language

An embedded DSL (in Java):
- lots of general-purpose computing is required

Domain concepts:
- XML templates
- Web services
- sessions

Implemented using:
- a syntax extension
- a static analysis
- a framework

DSL Syntax for JWIG

    public class test extends Service {
      String userid;
      public class Login extends Session {
        XML wrap = [[ ]];
        public void main() {
          XML login = [[ Userid: ]];
          show wrap<[contents = login];
          userid = receive userid;
          show wrap<[contents = "Welcome "+userid];
        }
      }
    }

GPL Syntax Alternative

    XML login = XML.make(" \nUserid: \n \ ");
    show(wrap.plug("contents",login));
    userid = receive("userid");

- The DSL syntax maps directly to method calls in an underlying Java framework
- Avoiding escapes makes the syntax more legible
- But this is just a thin layer of syntactic sugar

DSL Analysis for JWIG

A static analysis that at compile time guarantees:
- only well-formed and valid XML is ever generated
- only existing form fields are ever received
- only existing gaps are ever plugged

This is a DSL analysis that is performed on the resulting compiled class files

JWIG Implementation Model

[Figure: pipeline diagram with the labels JWIG syntax, jwigc, Java syntax, JWIG framework, javac, .class files, jwiga, analysis results]

Syntax Extensions

Programmers may want to extend the syntax of their programming language:
- introduce domain-specific syntax
- abbreviate common idioms
- define language extensions
- ensure consistency

Such extensions are introduced through macros

Macros

- Macros are as old as programming
- They are used as an orthogonal abstraction mechanism
- Two different flavors:
  - lexical macros
  - syntactic macros

Main Entry: 2 macro
Pronunciation: 'ma-(")krO
Function: noun
Inflected Form(s): plural macros
Etymology: short for macroinstruction
Date: 1959
"a single computer instruction that stands for a sequence of operations"

Lexical Macros

- Operate on sequences of tokens
- Are handled by a preprocessor
- Are independent of the host language syntax

Examples:
- CPP
- TeX

CPP - The C Preprocessor

- Integrated into C compilers
- Also works as a stand-alone expander
- Intercepts directives such as:
    #define
    #undef
    #ifdef
    #if
    #include

Lexical Macro Example

CPP macro to square a number:

    #define square(X) X * X

    square(z + 1)  =>  z + 1 * z + 1

Lexical Macro Example

CPP macro to square a number:

    #define square(X) X * X

    square(z + 1)  =>  z + (1 * z) + 1

Adding parentheses as a hack:

    #define square(X) (X) * (X)

    square(z + 1)  =>  (z + 1)*(z + 1)

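The precedence problem can be observed directly. The following is a small C++ sketch; the macro names `SQUARE_NAIVE` and `SQUARE_PAREN` and the two wrapper functions are illustrative names, not part of the slides.

```cpp
#include <cassert>

// Naive macro from the slide: the argument is pasted in as raw tokens.
#define SQUARE_NAIVE(X) X * X

// The parenthesized variant from the slide.
#define SQUARE_PAREN(X) (X) * (X)

// Expands to: z + 1 * z + 1, which parses as z + (1 * z) + 1.
int naive_square_of_successor(int z) { return SQUARE_NAIVE(z + 1); }

// Expands to: (z + 1) * (z + 1).
int paren_square_of_successor(int z) { return SQUARE_PAREN(z + 1); }
```

For z = 3 the naive version computes 3 + 3 + 1 = 7, while the parenthesized version computes 4 * 4 = 16. The parenthesized macro is still only a hack: the expansion as a whole is unparenthesized, and an argument with side effects is evaluated twice.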
Parsing Problem

    #define swap(X,Y) { int t=X; X=Y; Y=t; }

    if (a > b) swap(a,b);
    else b=0;

    *** test.c:3: parse error before 'else'

Parsing Problem Hack

    #define swap(X,Y) do { int t=X; X=Y; Y=t; } while (0)

    if (a > b) swap(a,b);
    else b=0;

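The brace-only macro fails because `swap(a,b);` expands to a block followed by an empty statement, so the `else` no longer attaches to the `if`. A minimal C++ sketch of the `do { ... } while (0)` idiom follows; the function name `swap_or_reset` is an illustrative assumption.

```cpp
#include <cassert>

// Statement-like macro wrapped in do { ... } while (0): the expansion
// plus the caller's trailing ';' forms exactly one statement.
#define SWAP(X, Y) do { int t = X; X = Y; Y = t; } while (0)

// Mirrors the if/else from the slide: swap when a > b,
// otherwise reset b to 0.
void swap_or_reset(int &a, int &b) {
    if (a > b)
        SWAP(a, b);
    else
        b = 0;
}
```

With this definition the `if`/`else` parses as intended, since the loop body runs exactly once and the semicolon after the macro call terminates the `do` statement.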
Expansion Time

    #define A 87
    #define B A
    #undef A
    #define A 42

    B  =>  ???

Eager expansion (definition time):
    B  =>  87

Lazy expansion (invocation time):
    B  =>  A  =>  42

CPP is lazy

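The lazy behavior is easy to check with a compiler. This short C++ sketch reuses the slide's definitions; the function `value_of_b` is an illustrative name added here.

```cpp
#include <cassert>

#define A 87
#define B A      // the body of B is the token A, not the number 87
#undef A
#define A 42

// B is expanded here, at the point of use: B => A => 42.
int value_of_b() { return B; }

// Clean up so the single-letter macro names do not leak into later code.
#undef A
#undef B
```

Calling `value_of_b()` yields 42, which confirms that the body of `B` was stored unexpanded at definition time.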
Expansion Order

    #define id(X) X
    #define one(X) id(X)
    #define two a,b

    one(two)  =>  ???

Inner (call-by-value):
    one(two)  =>  one(a,b)  =>  *** arity error 'one'

Outer (call-by-name):
    one(two)  =>  id(two)  =>  two  =>  a,b

Expansion Order in CPP

CPP uses a pragmatic "argument prescan":

    one(two)  =>  id(a,b)  =>  *** arity error 'id'

Useful for composing macros:

    #define succ(X) ((X)+1)
    #define call7(X) X(7)

    call7(succ)  =>  succ(7)  =>  ((7)+1)

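The macro composition can be tried out as follows. This is a small C++ sketch with the macros renamed to upper case; `call7_succ` is an illustrative wrapper that does not appear in the slides.

```cpp
#include <cassert>

#define SUCC(X) ((X) + 1)
#define CALL7(X) X(7)

// The argument SUCC is a function-like macro name that is not followed
// by '(', so it is passed along unexpanded. After substitution the
// result SUCC(7) is rescanned and expands to ((7) + 1).
int call7_succ() { return CALL7(SUCC); }

// By contrast, one(two) from the slide cannot be compiled: the argument
// two is expanded to a,b before substitution, so id receives two arguments.
```

`call7_succ()` returns 8, showing that a macro name can be passed as an argument and invoked by another macro.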
Recursive Expansion

    #define x 1+x

    x  =>  ???

Definition time:
    *** recursive definition

Invocation time:
    x  =>  1+x  =>  1+1+x  =>  ...

Recursive Expansion in CPP

CPP uses a pragmatic "intercept-and-ignore":

    int x = 2;
    #define x 1+x

    x  =>  1+x

- Maintain a stack of macro invocations
- Ignore invocations of macros already on the stack
- At runtime the expression x evaluates to 3

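A compact C++ sketch of this behavior is shown below. The function `self_reference_demo` and the local variable `result` are illustrative additions; the macro itself is the one from the slide.

```cpp
#include <cassert>

int self_reference_demo() {
    int x = 2;
// While x is being expanded, the inner x is not expanded again.
#define x 1 + x
    int result = x;  // expands exactly once, to: 1 + x
#undef x
    return result;
}
```

The function returns 3: the macro is expanded a single time and the remaining `x` refers to the variable holding 2.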
TeX Macros

    \def \vector #1[#2..#3] {
      $({#1}_{#2},\ldots,{#1}_{#3})$
    }

    \vector \phi[0..n-1]  =>  $({\phi}_{0},\ldots,{\phi}_{n-1})$

- Flexible invocation syntax
- Parsing ambiguities (chooses shortest invocation)
- Expansion is lazy and outer
- Recursion is permitted (conditions allowed)

Syntactic Macros

- Operate on sequences of ASTs
- Are handled by the parser
- Are integrated with the host language syntax

Examples:
- C++ templates
- Jakarta Tool Suite

C++ Templates

- Integrated into C++ compilers
- Are intended as a genericity mechanism
- But are often used as a macro language
- Macros accept ASTs for:
  - identifiers
  - constants
  - types
- The result is always an AST for a declaration

Syntactic Macro Example

    template <class T>
    T GetMax(T x, T y) { return (x>y?x:y); }

    int i,j,max;
    max = GetMax<int>(i,j);

- Template bodies are parsed at definition time (unlike CPP macros)
- Templates are syntactically expanded
- Heavy use of templates yields bloated code (unlike Java generics that are not macros)

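A compilable form of the template, with the template parameter list and the conditional operator written out, might look like this minimal C++ sketch.

```cpp
#include <cassert>

// One generic definition; the compiler generates a separate function
// for each type the template is instantiated with.
template <class T>
T GetMax(T x, T y) {
    return (x > y ? x : y);
}
```

Each distinct instantiation, such as `GetMax<int>` and `GetMax<double>`, produces its own copy of the function body, which is the source of the code bloat mentioned above.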
Metaprogramming

C++ templates:
- perform compile time constant folding of arguments
- allow multiple template definitions and pattern matching

This combination enables metaprogramming:
- Turing-complete computations during compilation

Template libraries exist for:
- booleans
- control structures
- functions
- variables
- data structures

Metaprogramming Example

    template <int X, int Y>
    struct pow { static const int n = X*pow<X,Y-1>::n; };

    template <int X>
    struct pow<X,0> { static const int n = 1; };

    const int z = pow<5,3>::n;

The value 125 is assigned to z at compile time

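The compile-time evaluation can be verified with `static_assert`. In this C++ sketch the struct is renamed to `Pow`, an assumption made here only to avoid a clash with the `pow` function from the C math library; the template arguments 5 and 3 are inferred from the stated result 125.

```cpp
#include <cassert>

// General case: X^Y = X * X^(Y-1).
template <int X, int Y>
struct Pow {
    static const int n = X * Pow<X, Y - 1>::n;
};

// Base case, selected by pattern matching on Y == 0: X^0 = 1.
template <int X>
struct Pow<X, 0> {
    static const int n = 1;
};

// Fails to compile unless the value really is computed during compilation.
static_assert(Pow<5, 3>::n == 125, "5^3 should be folded at compile time");

const int z = Pow<5, 3>::n;
```

The recursion is unfolded entirely by the compiler through template instantiation, so no multiplication remains at runtime.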
Metaprogramming for Specialization

    template <int I>
    inline float dot(float *a, float *b) {
      return dot<I-1>(a,b) + a[I]*b[I];
    }

    template <>
    inline float dot<0>(float *a, float *b) {
      return a[0]*b[0];
    }

    float x[3], y[3];
    float z = dot<2>(x,y);

is specialized to:

    float z = x[0]*y[0] + x[1]*y[1] + x[2]*y[2];

The overhead of control structures is removed

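The unrolled dot product can be written as a compilable C++ sketch as follows. The `const` qualifiers are an addition made here, and the template argument is the index of the last element, which is an inference from the three-term expansion shown on the slide.

```cpp
#include <cassert>

// Recursive case: the dot product over indices 0..I.
template <int I>
inline float dot(const float *a, const float *b) {
    return dot<I - 1>(a, b) + a[I] * b[I];
}

// Base case: index 0 only. This explicit specialization ends the recursion.
template <>
inline float dot<0>(const float *a, const float *b) {
    return a[0] * b[0];
}
```

A call such as `dot<2>(x, y)` instantiates `dot<2>`, `dot<1>`, and `dot<0>`; after inlining, a compiler can reduce this to a single expression without a loop, though whether it does so depends on the optimization settings.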
Jakarta Tool Suite

JTS extends Java with simple syntactic macros

Macros accept ASTs for:
- AST_QualifiedName
- AST_Exp
- AST_Stm
- AST_FieldDecl
- AST_Class
- AST_TypeName

The result is an AST specified as:
- exp{ ... }exp
- stm{ ... }stm
- mth{ ... }mth
- cls{ ... }cls

Hygienic Macros

    macro swap(AST_QualifiedName x, AST_QualifiedName y)
      local temp
      stm{ int temp = x; x = y; y = temp; }stm

    int temp = 42;
    int tump = 87;
    #swap(temp,tump);

Potential name clash problem:

    int temp = temp; temp = tump; tump = temp;

But local names are renamed uniquely:

    int temp143 = temp; temp = tump; tump = temp143;

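CPP has no built-in hygiene, but the renaming idea can be approximated by hand. The following C++ sketch generates the temporary's name from the `__LINE__` macro; all macro names here (`CONCAT`, `UNIQUE`, `HYGIENIC_SWAP`, and so on) and the prefix `swap_tmp_` are assumptions for illustration, and this is not how JTS implements renaming.

```cpp
#include <cassert>

// Two-level concatenation so that __LINE__ is expanded before pasting.
#define CONCAT_INNER(P, Q) P##Q
#define CONCAT(P, Q) CONCAT_INNER(P, Q)
#define UNIQUE(PREFIX) CONCAT(PREFIX, __LINE__)

// The temporary's name is passed in, so all three uses agree.
#define SWAP_IMPL(X, Y, TMP) do { int TMP = X; X = Y; Y = TMP; } while (0)

// The temporary is named swap_tmp_<line number> instead of a fixed name.
#define HYGIENIC_SWAP(X, Y) SWAP_IMPL(X, Y, UNIQUE(swap_tmp_))

// Uses the variable names from the slide, including the problematic temp.
void swap_temp_and_tump(int &temp, int &tump) {
    HYGIENIC_SWAP(temp, tump);
}
```

This only makes a clash unlikely rather than impossible, because a caller could still use a variable whose name matches the generated one. True hygiene, as in JTS, requires the macro processor to know which names are local to the macro.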
The MetaFront System

Macros are special cases of transformations:

[Figure: metafront takes an input program (program.a) in input language A and a transformation x: A => B, and produces an output program (program.b) in output language B]

Inductive transformations allow:
- arbitrary nonterminals
- arbitrary invocation syntax

MetaFront Example

    language Maybe extends Java
      stm[maybe] -> maybe ;

    transformation Maybe2Java: Maybe => Java {
      transformer Xstm: stm => stm;
      Xstm[maybe](S) S.xstm => xS ==> >>
    }