Agile and Efficient Domain-Specific Languages using Multi-stage Programming in Java Mint Edwin Westbrook, Mathias Ricken, and Walid Taha.

Slides:



Advertisements
Similar presentations
CPSC 388 – Compiler Design and Construction
Advertisements

Objects and Classes David Walker CS 320. Advanced Languages advanced programming features –ML data types, exceptions, modules, objects, concurrency,...
Type Checking, Inference, & Elaboration CS153: Compilers Greg Morrisett.
1 Mooly Sagiv and Greta Yorsh School of Computer Science Tel-Aviv University Modern Compiler Design.
The Interface Definition Language for Fail-Safe C Kohei Suenaga, Yutaka Oiwa, Eijiro Sumii, Akinori Yonezawa University of Tokyko.
Tim Sheard Oregon Graduate Institute Lecture 8: Operational Semantics of MetaML CSE 510 Section FSC Winter 2005 Winter 2005.
Functional Programming. Pure Functional Programming Computation is largely performed by applying functions to values. The value of an expression depends.
INF 212 ANALYSIS OF PROG. LANGS Type Systems Instructors: Crista Lopes Copyright © Instructors.
Mint: A Multi-stage Extension of Java COMP 600 Mathias Ricken Rice University February 8, 2010.
Generic programming in Java
Tim Sheard Oregon Graduate Institute Lecture 4: Staging Interpreters CSE 510 Section FSC Winter 2004 Winter 2004.
Compiler Construction
Java Generics.
Control Flow Analysis Mooly Sagiv Tel Aviv University Sunday Scrieber 8 Monday Schrieber.
Cse321, Programming Languages and Compilers 1 6/19/2015 Lecture #18, March 14, 2007 Syntax directed translations, Meanings of programs, Rules for writing.
1 Control Flow Analysis Mooly Sagiv Tel Aviv University Textbook Chapter 3
Environments and Evaluation
CSE S. Tanimoto Syntax and Types 1 Representation, Syntax, Paradigms, Types Representation Formal Syntax Paradigms Data Types Type Inference.
Register Allocation and Spilling via Graph Coloring G. J. Chaitin IBM Research, 1982.
1.3 Executing Programs. How is Computer Code Transformed into an Executable? Interpreters Compilers Hybrid systems.
Query Processing Presented by Aung S. Win.
Abstract Interpretation (Cousot, Cousot 1977) also known as Data-Flow Analysis.
CSE 341, S. Tanimoto Concepts 1- 1 Programming Language Concepts Formal Syntax Paradigms Data Types Polymorphism.
Chapter 1 Introduction Dr. Frank Lee. 1.1 Why Study Compiler? To write more efficient code in a high-level language To provide solid foundation in parsing.
P Object type and wrapper classes p Object methods p Generic classes p Interfaces and iterators Generic Programming Data Structures and Other Objects Using.
Adapted from Prof. Necula UCB CS 1641 Overview of COOL ICOM 4029 Lecture 2 ICOM 4029 Fall 2008.
Agile and Efficient Domain-Specific Languages using Multi-stage Programming in Java Mint Edwin Westbrook, Mathias Ricken, and Walid Taha.
Basics of Java IMPORTANT: Read Chap 1-6 of How to think like a… Lecture 3.
Hello.java Program Output 1 public class Hello { 2 public static void main( String [] args ) 3 { 4 System.out.println( “Hello!" ); 5 } // end method main.
Object-Oriented Program Development Using Java: A Class-Centered Approach, Enhanced Edition.
Interpretation Environments and Evaluation. CS 354 Spring Translation Stages Lexical analysis (scanning) Parsing –Recognizing –Building parse tree.
Formal Semantics Chapter Twenty-ThreeModern Programming Languages, 2nd ed.1.
Unit-1 Introduction Prepared by: Prof. Harish I Rathod
Mint: A Multi-stage Extension of Java Purdue University - Computer Science Colloquia Mathias Ricken Rice University March 15, 2010.
Functional Programming With examples in F#. Pure Functional Programming Functional programming involves evaluating expressions rather than executing commands.
CMP-MX21: Lecture 4 Selections Steve Hordley. Overview 1. The if-else selection in JAVA 2. More useful JAVA operators 4. Other selection constructs in.
Java Basics.  To checkout, use: svn co scb07f12/UTORid  Before starting coding always use: svn update.
Introduction to Generics
Chapter 1 Introduction Study Goals: Master: the phases of a compiler Understand: what is a compiler Know: interpreter,compiler structure.
CS536 Semantic Analysis Introduction with Emphasis on Name Analysis 1.
Edwin Westbrook, Mathias Ricken, Jun Inoue, Yilong Yao, Tamer Abdlatif, Walid Taha.
12/18/2015© Hal Perkins & UW CSEG-1 CSE P 501 – Compilers Intermediate Representations Hal Perkins Winter 2008.
VISUAL C++ PROGRAMMING: CONCEPTS AND PROJECTS Chapter 2A Reading, Processing and Displaying Data (Concepts)
Duke CPS From C++ to Java l Java history: Oak, toaster-ovens, internet language, panacea l What it is ä O-O language, not a hybrid (cf. C++)
Agile and Efficient Domain-Specific Languages using Multi-stage Programming in Java Mint Edwin Westbrook, Mathias Ricken, and Walid Taha.
Java Annotations for Types and Expressions Mathias Ricken October 24, 2008 COMP 617 Seminar.
Advanced Functional Programming Tim Sheard 1 Lecture 17 Advanced Functional Programming Tim Sheard Oregon Graduate Institute of Science & Technology Lecture:
Chapter 1 Java Programming Review. Introduction Java is platform-independent, meaning that you can write a program once and run it anywhere. Java programs.
Announcements Assignment 2 Out Today Quiz today - so I need to shut up at 4:25 1.
Strings Robin Burke IT 130. Outline Objects Strings methods properties Basic methods Form validation.
ECE 750 Topic 8 Meta-programming languages, systems, and applications Automatic Program Specialization for J ava – U. P. Schultz, J. L. Lawall, C. Consel.
IST 210: PHP Basics IST 210: Organization of Data IST2101.
JAVA GENERICS Lecture 16 CS2110 – Spring 2016 Photo credit: Andrew Kennedy.
Type Checking and Type Inference
Representation, Syntax, Paradigms, Types
Generics, Lambdas, Reflections
Data Types.
Lecture 23 Pages : Separating Syntactic Analysis from Execution. We omit many details so you have to read the section in the book. The halting.
Arrays We often want to organize objects or primitive data in a way that makes them easy to access and change. An array is simple but powerful way to.
Representation, Syntax, Paradigms, Types
Coding Concepts (Basics)
Representation, Syntax, Paradigms, Types
Adapted from slides by Nicholas Shahan and Dan Grossman
6.001 SICP Variations on a Scheme
Introduction to Data Structure
Representation, Syntax, Paradigms, Types
Recursive Procedures and Scopes
Subtype Substitution Principle
ICOM 4029 Fall 2003 Lecture 2 (Adapted from Prof. Necula UCB CS 164)
Rehearsal: Lazy Evaluation Infinite Streams in our lazy evaluator
Presentation transcript:

Agile and Efficient Domain-Specific Languages using Multi-stage Programming in Java Mint Edwin Westbrook, Mathias Ricken, and Walid Taha

Domain-Specific Languages Increase Productivity public int f () {... } 2 Time + Effort

DSLs Must Be Agile and Efficient 3 Result

How to Implement DSLs? InterpretersCompilers + Easy to write/modify- Complex - Slow + Efficient code Why can’t we have both? 4

Multi-Stage Programming 5 Interpreter Compiler MSP

Compiler 6 DSL Parser High-Level Optimizations Low-Level Optimizations Code Gen binary

Staged Interpreter in Mint DSL Parser High-Level Optimizations Java Compiler Java Compiler binary.java Guaranteed to be well-formed 7 Re-usable techniques

Outline Background: a simple interpreter What is MSP? Writing a staged interpreter Dynamic type inference / unboxing in MSP Staged interpreter for an image processing DSL Automatic loop parallelization in MSP 8

A Simple Dynamically-Typed PL (Defun fact (x) (If (Equals (Var x) (Val (IntValue 0))) (Val (IntValue 1)) (Mul (Var x) (App fact (Sub (Var x) (Val (IntValue 1)))))) 9

The Value Type 10 abstract class Value { int intValue(); boolean booleanValue(); } abstract class Value { int intValue(); boolean booleanValue(); } class IntValue { int i; int intValue() { return i; } boolean booleanValue() { throw Error; } class IntValue { int i; int intValue() { return i; } boolean booleanValue() { throw Error; } class BooleanValue { boolean b; // … similar … } class BooleanValue { boolean b; // … similar … }

Expressions and eval 11 interface Exp { Value eval (Env e, FEnv f); } interface Exp { Value eval (Env e, FEnv f); } class Val class Add class If class App …

Example Expression Classes 12 class Var implements Exp { String x; Value eval (Env e, FEnv f) { return e.get (x); } class Add implements Exp { Exp left, right; Value eval (Env e, FEnv f) { return new IntValue ( left.eval(e,f).intValue() + right.eval(e,f).intValue() ); }

MSP in Mint Code has type Code Code built with brackets Code spliced with escapes `e Code compiled and run with run() method Code x = ; Code y = ; Integer z = y.run(); // z = 9 13 (1 + 2)

Staging the Interpreter 14 Input Program OutputInterpreter Input OutputRuntime Program Staged Interpreter Code Add and `.run()

Staging the Interpreter 15 interface Exp { separable Code eval (Env e, FEnv f); } interface Exp { separable Code eval (Env e, FEnv f); } class Val class Add class If class App …

Staging the Expression Classes 16 class Var implements Exp { String x; separable Code eval (Env e, FEnv f) { return e.get (x); } class Add implements Exp { Exp left, right; separable Code eval (Env e, FEnv f) { return <| new IntValue (`(left.eval (e,f)).intValue () + `(right.eval (e,f)).intValue ()) ) |>; }

Side Note: Weak Separability Escapes must be weakly separable – Prevents scope extrusion [Westbrook et al. ‘10] Limits the allowed assignments: – Variables with code must be block-local Can only call separable methods 17

Preventing Scope Extrusion 18 int i; Code d; separable Code foo (Code c) { i += 2;// OK Code c2 = ;// OK d = ;// BAD return c; } <| new A() { int bar (int y) { `(foo ( )) } } |>; =

Side Note: Cross-Stage Persistance 19 class Val implements Exp { Value v; separable Code eval(Env e, FEnv f) { return ; } v “persists” in generated code

Side Note: Cross-Stage Persistance CSP only allowed for code-free classes – Primitive types OR implementers of CodeFree Mint prohibits use of Code in CodeFree classes Prevents scope extrusion; see paper for details 20 abstract class Value implements CodeFree { int intValue(); boolean booleanValue(); } abstract class Value implements CodeFree { int intValue(); boolean booleanValue(); }

DSL Implementation Toolchain Abstract Compiler and Runner classes – Template method design pattern – ~25 lines each, easily updated for new DSLs – Compiles code to a.jar file using Mint’s save() method – Command-line: mint Compiler prog.dsl output.jar mint Runner output.jar 21

Results Generated code (factorial): 3x speedup relative to unstaged (x = 10) – (2.67 GHz Intel i7, RHEL 5, 4 GB RAM) 22 public Value apply(final Value x) { return (new BoolValue(x.valueEq(new IntValue(0))).boolValue()) ? new IntValue(1) : new IntValue (x.intValue() * apply(new IntValue(x.intValue() - new IntValue(1).intValue())).intValue()); }

Overhead of Unnecessary Boxing Problem: eval() always returns Code : class Add { separable Code eval(Env e, FEnv f) { return <| new IntValue(`(left.eval(e,f)).intValue() + `(right.eval(e,f)).intValue()) |>; } IntValue always built at runtime! 23

Solution: Dynamic Type Inference Statically determine the type of an expression Elide coercions to/from boxed Value type Can be expressed as a dataflow problem: 24 Unknown Any IntegerBoolean

Dynamic Type Inference in MSP 25 Exp Code parse Code.java.eval().run() DSL

Dynamic Type Inference in MSP 26 Exp T = Code IR parse Code.java.analyze().eval().run() IR associated with Java type T DSL

Typed Intermediate Representation 27 abstract class IR { DSLType tp; separable T eval (Env e, FEnv f); } abstract class IR { DSLType tp; separable T eval (Env e, FEnv f); } class ValIR class AddIR class IfIR class AppIR …

Example: AddIR 28 class AddIR extends IR > { IR > left, right; separable Code eval (Env e, FEnv f) { return ; } No unnecessary boxing here!

Building IR Nodes: Analysis 29 abstract class Exp { IR analyze (TEnv e, TFEnv f); } abstract class Exp { IR analyze (TEnv e, TFEnv f); } Types of variables and functions IR that produces some Java type

Example: Add 30 class Add extends Exp { Exp left, right; separable IR > eval (TEnv e, TFEnv f) { return new AddIR (coerceToInt (left.analyze (e,f)), coerceToInt (right.analyze (e,f))); } Coerce IR to IR >

DSL Types  class DSLType 31 Unknown Any IntegerBoolean class DSLTypeTop extends DSLType class DSLTypeTop extends DSLType class DSLTypeInteger extends DSLType > class DSLTypeInteger extends DSLType > class DSLTypeBoolean extends DSLType > class DSLTypeBoolean extends DSLType > class DSLTypeBottom extends DSLType > class DSLTypeBottom extends DSLType >

interface Visitor { R visitTop (IR ir); R visitInteger (IR > ir); R visitBoolean (IR > ir); R visitBottom (IR > ir); } abstract class DSLType { R accept (IR ir, Visitor v); } Visitor Pattern = Typecase on IR Return an R for any IR type 32 Pass ir to the correct method class DSLTypeInteger { R accept (IR > ir, Visitor v) { v.visitInteger (ir); }}

IR > coerceToInt (IR ir) { return ir.tp.accept (ir, new Visitor >>() { IR > visitTop (IR ir) { return new BoxAnyInt (ir); } IR > visitInteger (IR > ir) { return ir; } IR > visitBoolean (IR > ir) { return new ErrorIR (ir) } IR > visitBottom (IR > ir) { return new UnboxIntIR (ir); } }); } Example: Coercing to Integers 33 Cast AnyCode to Integer Just return Integer Throw errors for Boolean Conditionally unbox Values

Staging Join Points 34 If e 1 e 2 e 3 IfIR > IR >.analyze()

Staging Join Points 35 If e 1 e 2 e 3 IfIR > IR >.analyze()

Staging Join Points 36 If e 1 e 2 e 3 IfIR IR IR >.analyze() Written with visitor pattern

Per-Datatype Functions 37 (Defun fact (x) … ) new Fun() { applyInt (int x) { … } applyBool (boolean b) { … } applyValue (Value x) { /* applyInt or applyBool */ }

(Defun fact ( x) (If (Equals (Var x) (Val (IntValue 0))) (Val (IntValue 1)) (Mul (Var x) (App fact (Sub (Var x) (Val (IntValue 1)))))) T T I I I I I I Iterated Dataflow for Back Edges 38 I I I I I I I I I I B B T T I I I I

(Defun fact ( x) (If (Equals (Var x) (Val (IntValue 0))) (Val (IntValue 1)) (Mul (Var x) (App fact (Sub (Var x) (Val (IntValue 1)))))) I I I I I I I I Iterated Dataflow for Back Edges 39 I I I I I I I I I I B B I I I I I I

Results (Dynamic Type Inference): Generated code (factorial): 12x speedup relative to unstaged (x = 10) (2.67 GHz Intel i7, RHEL 5, 4 GB RAM) 40 Integer applyInt (final int x) { final /* type omitted */ tthis = this; return (x == 0) ? 1 : (x * tthis.applyInt (x - 1)); } Boolean applyBool (final boolean x) { final /* type omitted */ tthis = this; return (let l = x; let r = 0; false) ? 1 : /* ERROR */ }

An Image Processing Language Double-precision floating point arithmetic Arrays – Allocate, get, set For loop Loading/displaying raster images 41

Adding New Types 42 Unknown Any Integer Boolean Double Array

Automatic Loop Parallelization for i = 0 to n-1 do B[i] = f(A[i]) for i = 0 to (n-1)/2 do B[i] = f(A[i]) for i = (n-1)/2 +1 to n-1 do B[i] = f(A[i]) 43

Not That Easy… for i = 0 to n-1 do B[i] = f(B[i-1]) for i = 0 to (n-1)/2 do B[i] = f(B[i-1]) for i = (n-1)/2+1 to n-1 do B[i] = f(B[i-1]) Could set B[(n-1)/2 + 1] before B[(n-1)/2]! Could set B[(n-1)/2 + 1] before B[(n-1)/2]! 44

Embarrassingly Parallel Loops for i = 0 to n-1 do … = R[i] W[i] = … { Reads R } ∩ { Writes W } = ∅ 45

Adding R/W Sets to the Analysis 46 class IR { DSLType tp; RWSet rwSet; separable T eval (Env e, FEnv f); } class IR { DSLType tp; RWSet rwSet; separable T eval (Env e, FEnv f); }

Staged Parallel For 47 public class ForIR extends IR > { //… public Code eval (Env e, FEnv f) { return <| LoopRunner r = /* loop code */; `(rwOverlaps (e, _body.rwSet())) ? runner.runSerial(…) : runner.runParallel(…) |>); } Inserts runtime check for whether { Reads R } ∩ { Writes W } = ∅ Inserts runtime check for whether { Reads R } ∩ { Writes W } = ∅

Test Program: Parallel Array Set (Let a (AllocArray (Var x) (Val (IntValue 1))) (Let _ (For i (Var x) (ASet (Var a) (Var i) (Rand))) (Var a))) 3.3x speedup relative to unstaged (x = 10 5 ) 2.4x speedup relative to staged serial – 2.67 GHz Intel i7, RHEL 5, 4 GB RAM 48

Test Results Mandelbrot (200x200, max 256 iterations) 58x speedup relative to unstaged 2x speedup relative to staged serial Gaussian Blur (2000x2000) 33x speedup relative to unstaged 1.5x speedup relative to staged serial (2.67 GHz Intel i7, RHEL 5, 4 GB RAM) 49

Conclusion MSP for DSL implementation – Agile: as simple as writing an interpreters – Efficient: performance of compiled code Helpful toolchain for implementing DSLs Dataflow-based compiler optimizations Available for download: 50

Future Work: Iteration Spaces 51 for i from 1 to N do for j from 1 to i do i j A[i,j] = f (A[i-1,j], A[i,j-1])

Iteration Spaces in MSP 52 abstract class Exp { IR analyze (TEnv e, TFEnv f, IterSpace i); } abstract class Exp { IR analyze (TEnv e, TFEnv f, IterSpace i); } i  [1, N) j  [1, i) { (i-1,j), (i,j-1) } abstract class IR { DSLType tp; Deps deps; } abstract class IR { DSLType tp; Deps deps; }

53 Thank You!

Results (Loop Parallelization): Generated code (array set): 54 public Value apply(final int size) { return let final Value arr = new ArrayValue( allocArray(size, new IntValue(1))); let final int notUsed = let LoopRunner r = new LoopRunner() { /* next slide */ }; let int bound = size; let boolean runSerial = false; runSerial ? r.runSerial(bound) : r.runParallel(bound); arr; }

Results (Loop Parallelization): Generated code (array set): let LoopRunner r = new LoopRunner() { public int run(int lower, int upper) { for (int i = lower; i < upper; ++i) { Value unused = new IntValue( setArray(arr.arrayValue(), i, new DoubleValue(getRandom()))); } return 0; }... }

Test Program: Mandelbrot 56 x and y loops can run in parallel k loop cannot run in parallel (Let … (For y (Var n) (For x (Var n) … (For k (Val (IntValue 256)) … (ASet (Var acc) … (Add (Val (IntValue 1)) (AGet (Var acc) …))) … (ASet (Var out) …)

interface AnyCode { separable Code cast (Class c); } The AnyCode Class 57 Returns Code for any X

MSP Applications Sparse matrix multiplication Array Views – Removal of index math Killer example: Staged interpreters 58