Programming by Sketching Armando Solar-Lezama, Liviu Tancau, Gilad Arnold, Rastislav Bodik, Sanjit Seshia UC Berkeley, Rodric Rabbah MIT, Kemal Ebcioglu,

Presentation transcript:

Programming by Sketching Armando Solar-Lezama, Liviu Tancau, Gilad Arnold, Rastislav Bodik, Sanjit Seshia UC Berkeley, Rodric Rabbah MIT, Kemal Ebcioglu, Vijay Saraswat, Vivek Sarkar IBM

2 Merge sort looks simple to code, but there is a bug

    int[] mergeSort (int[] input, int n) {
        return merge(mergeSort (input[0::n/2]),
                     mergeSort (input[n/2+1::n]), n);
    }

    int[] merge (int[] a, int b[], int n) {
        int j=0, k=0;
        for (int i = 0; i < n; i++)
            if ( a[j] < b[k] ) {
                result[i] = a[j++];
            } else {
                result[i] = b[k++];
            }
        return result;
    }

3 Merge sort

    int[] mergeSort (int[] input, int n) {
        return merge(mergeSort (input[0::n/2]),
                     mergeSort (input[n/2+1::n]), n);
    }

    int[] merge (int[] a, int b[], int n) {
        int j, k;
        for (int i = 0; i < n; i++)
            if ( j<n && ( !(k<n) || a[j] < b[k]) ) {
                result[i] = a[j++];
            } else {
                result[i] = b[k++];
            }
        return result;
    }

4 The sketching experience

[Diagram: sketch + spec (specification) → implementation (completed sketch)]

5 The spec: bubble sort

    int[] sort (int[] input, int n) {
        for (int i=0; i<n; ++i)
            for (int j=i+1; j<n; ++j)
                if (input[j] < input[i])
                    swap(input, j, i);
        return input;
    }

6 Merge sort: sketched hole

    int[] mergeSort (int[] input, int n) {
        return merge(mergeSort (input[0::n/2]),
                     mergeSort (input[n/2+1::n]), n);
    }

    int[] merge (int[] a, int b[], int n) {
        int j, k;
        for (int i = 0; i < n; i++)
            if ( expression( ||, &&, <, !, [] ) ) {
                result[i] = a[j++];
            } else {
                result[i] = b[k++];
            }
        return result;
    }

7 Merge sort: synthesized

    int[] mergeSort (int[] input, int n) {
        return merge(mergeSort (input[0::n/2]),
                     mergeSort (input[n/2::n]) );
    }

    int[] merge (int[] a, int b[], int n) {
        int j, k;
        for (int i = 0; i < n; i++)
            if ( j<n && ( !(k<n) || a[j] < b[k]) ) {
                result[i] = a[j++];
            } else {
                result[i] = b[k++];
            }
        return result;
    }

8 Sketching: spec vs. sketch

Specification
– executable: easy to debug, serves as a prototype
– a reference implementation: simple and sequential
– written by domain experts: crypto, bio, MPEG committee

Sketched implementation
– program with holes: filled in by synthesizer
– programmer sketches strategy: machine provides details
– written by performance experts: vector wizard; cache guru

9 How sketching fits into autotuning

Autotuning: two methods for obtaining code variants
1. optimizing compiler: transform a “spec” in various ways
2. custom generator: for a specific algorithm

We seek to simplify the second approach.

Scenario 1: library of variants stores resolved sketches
– as if written by hand

Scenario 2: library has unresolved, flexible sketches
– sketch works for a variety of specifications: e.g., a class of stencils

10 SKETCH

A language with support for sketching-based synthesis
– like C without pointers
– two simple synthesis constructs

Restricted to finite programs:
– input size known at compile time, terminates on all inputs

Most high-performance kernels are finite:
– matrix multiply: yes
– binary search tree: no

We’re already working on relaxing the finiteness restriction
– later in this talk

11 Ex1: Isolate rightmost 0-bit

    bit[W] isolate0 (bit[W] x) {          // W: word size
        bit[W] ret = 0;
        for (int i = 0; i < W; i++)
            if (!x[i]) { ret[i] = 1; break; }
        return ret;
    }

    bit[W] isolate0Fast (bit[W] x) implements isolate0 {
        return ~x & (x+1);
    }

    bit[W] isolate0Sketched (bit[W] x) implements isolate0 {
        return ~(x + ??) & (x + ??);
    }
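The `implements` relation in this example can be illustrated by an exhaustive equivalence check. The following is a minimal Python sketch, not SKETCH code: it assumes an 8-bit word, treats bit 0 as the least significant bit, and the helper name `first_mismatch` is hypothetical.

```python
W = 8                      # assumed word size for this illustration
MASK = (1 << W) - 1        # keeps results within W bits


def isolate0(x):
    """Spec: scan from bit 0 upward and set only the first 0-bit found."""
    for i in range(W):
        if not (x >> i) & 1:
            return 1 << i
    return 0               # x has no 0-bit


def isolate0_fast(x):
    """Candidate implementation: ~x & (x+1), truncated to W bits."""
    return ~x & (x + 1) & MASK


def first_mismatch(impl, spec):
    """Return the first input on which impl and spec differ, or None."""
    for x in range(1 << W):
        if impl(x) != spec(x):
            return x
    return None


if __name__ == "__main__":
    print(first_mismatch(isolate0_fast, isolate0))
```

When the check returns `None`, the candidate agrees with the specification on every W-bit input, which is the property the `implements` clause asks the synthesizer to establish.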

12 Programmer’s view of sketches

– the ?? operator is replaced with a suitable constant, as directed by the implements clause
– the ?? operator introduces non-determinism; the implements clause constrains it

13 Meaning of sketches

Programs with ?? have many meanings:

    ~(x + ??) & (x + ??);

means:

    ~(x + 0) & (x + 1);
    ~(x - 1) & (x + 0);
    …

“Counted loops” are unrolled:

    x = ??; loop (x) { y = y + ??; }

means:

    x = 2; y = y + 4; y = y + 0;
    x = 3; y = y + 2; y = y + 4; y = y + 17;
    …

f implements g:
– synthesizer “selects” the meaning of f that is functionally equivalent to g

14 Beyond synthesis of literals

Synthesizing values of ?? is already very useful
– parallelization machinery: bitmasks, tables in crypto codes
– array indices: A[i+??,j+??]

We can synthesize more than constants
– semi-permutations: functions that select and shuffle bits
– polynomials: over one or more variables
– actually, arbitrary expressions, programs

15 Example 3: IP from DES

    bit[64] IPsketched (bit[64] x) implements IP {
        bit[64] result;
        bit[32] table[8][16] = ??;
        x = (x>>??) {|} (x<<??) {|} x;
        for (int i=0; i<8; ++i) {
            result[0:31] |= table[i][x[i*4::4]];
            result[32:63]|= table[i][x[32+i*4::4]];
        }
        return result;
    }

[Slide annotations: “32 bits”; “table[i][permutation(x)];”]

16 Template for an arbitrary permutation

    bit[N] permutation(bit[N] x) {
        bit[N] result;
        int i=0;
        loop (??) {
            result ^= x>>i & ??;
            result ^= x<<i & ??;
            ++i;
        }
        return result;
    }

17 Synthesizing polynomials

    int spec (int x) {
        return 2*x*x*x*x + 3*x*x*x + 7*x*x + 10;
    }

    int p (int x) implements spec {
        return (x+1)*(x+2)*poly(3,x);
    }

    int poly(int n, int x) {
        if (n==0) return ??;
        else return x * poly(n-1, x) + ??;
    }
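The recursive `poly` template can be mimicked in Python by passing the hole values explicitly. This is an illustrative sketch only: the quadratic target `spec`, the hole bound, and the test inputs are assumptions chosen to keep the search small, and the brute-force search stands in for the real synthesizer.

```python
from itertools import product


def spec(x):
    # Hypothetical target polynomial, chosen only for this illustration.
    return 2 * x * x + 3 * x + 7


def poly(n, x, holes):
    """Horner-style template: each recursion level reads its own hole."""
    if n == 0:
        return holes[0]
    return x * poly(n - 1, x, holes) + holes[n]


def synthesize(degree=2, bound=10, inputs=range(-3, 4)):
    """Try every hole assignment in [0, bound] and return the first that
    agrees with spec on all test inputs, or None if none does."""
    for holes in product(range(bound + 1), repeat=degree + 1):
        if all(poly(degree, x, holes) == spec(x) for x in inputs):
            return holes
    return None


if __name__ == "__main__":
    print(synthesize())
```

Because a degree-2 polynomial is determined by three points, agreement on the seven test inputs pins the holes down to the coefficients of the target.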

18 Karatsuba’s multiplication

Split the operands: $x = x_1 b + x_0$, $y = y_1 b + y_0$, $b = 2^k$

Schoolbook product:
$x y = b^2 x_1 y_1 + b\,(x_1 y_0 + x_0 y_1) + x_0 y_0$

Sketch:
$x y = \mathrm{poly}(??, b)\, x_1 y_1 + \mathrm{poly}(??, b)\, \mathrm{poly}(1, x_1, x_0, y_1, y_0)\, \mathrm{poly}(1, x_1, x_0, y_1, y_0) + \mathrm{poly}(??, b)\, x_0 y_0$

Synthesized:
$x y = (b^2 + b)\, x_1 y_1 - b\,(x_1 - x_0)(y_1 - y_0) + (b + 1)\, x_0 y_0$
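The three-multiplication identity behind this slide can be checked numerically. The Python sketch below compares the schoolbook expansion with the standard Karatsuba form in which the middle product is subtracted; the function names, the choice of k = 8, and the random sampling are assumptions made for illustration.

```python
import random


def schoolbook(x1, x0, y1, y0, b):
    """Four-multiplication expansion of (x1*b + x0) * (y1*b + y0)."""
    return b * b * x1 * y1 + b * (x1 * y0 + x0 * y1) + x0 * y0


def karatsuba_form(x1, x0, y1, y0, b):
    """Three-multiplication form: x1*y1, (x1-x0)*(y1-y0), and x0*y0."""
    return ((b * b + b) * x1 * y1
            - b * (x1 - x0) * (y1 - y0)
            + (b + 1) * x0 * y0)


def check(trials=1000, k=8, seed=0):
    """Compare both forms with the direct product on random k-bit halves."""
    rng = random.Random(seed)
    b = 1 << k
    for _ in range(trials):
        x1, x0, y1, y0 = (rng.randrange(b) for _ in range(4))
        direct = (x1 * b + x0) * (y1 * b + y0)
        if direct != schoolbook(x1, x0, y1, y0, b):
            return False
        if direct != karatsuba_form(x1, x0, y1, y0, b):
            return False
    return True


if __name__ == "__main__":
    print(check())
```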

19 Sketch of Karatsuba

    bit[N*2] k (bit[N] x, bit[N] y) implements mult {
        if (N<=1) return x*y;
        bit[N/2]   x1 = x[0:N/2-1];
        bit[N/2+1] x2 = x[N/2:N-1];
        bit[N/2]   y1 = y[0:N/2-1];
        bit[N/2+1] y2 = y[N/2:N-1];
        bit[2*N] t11 = x1 * y1;
        bit[2*N] t12 = poly(1, x1, x2, y1, y2) * poly(1, x1, x2, y1, y2);
        bit[2*N] t22 = x2 * y2;
        return multPolySparse (2, N/2, t11)    // log b = N/2
             + multPolySparse (2, N/2, t12)
             + multPolySparse (2, N/2, t22);
    }

    bit[2*N] poly (int n, bit[N] x0, x1, x2, x3) {
        if (n<=0) return ??;
        else return (??*x0 + ??*x1 + ??*x2 + ??*x3) * poly (n-1, x0, x1, x2, x3);
    }

    bit[2*N] multPolySparse (int n, int x, bit[N] y) {
        if (n<=0) return 0;
        else return y (n-1, x, y);
    }

20 Semantic view of sketches

A sketch represents a set of functions:
– the ?? operator is modeled as reading from an oracle

    int f (int y) {
        x = ??;
        loop (x) {
            y = y + ??;
        }
        return y;
    }

    int f (int y, bit[][K] oracle) {
        x = oracle[0];
        loop (x) {
            y = y + oracle[1];
        }
        return y;
    }

Synthesizer must find oracle satisfying f implements g
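The oracle reading of ?? can be mimicked directly in Python. In this illustrative sketch the specification `g`, the bound on oracle values, and the test inputs are all assumptions; the point is only that one sketch denotes a set of functions, and that several oracles may satisfy the same specification.

```python
from itertools import product


def f(y, oracle):
    """The sketch, with each ?? replaced by a read from the oracle."""
    x = oracle[0]
    for _ in range(x):          # loop (x) { ... }
        y = y + oracle[1]
    return y


def g(y):
    # Hypothetical specification chosen for this illustration.
    return y + 6


def find_oracles(bound=8, inputs=range(-5, 6)):
    """List every oracle with entries in [0, bound) that makes f agree
    with g on all test inputs."""
    return [o for o in product(range(bound), repeat=2)
            if all(f(y, o) == g(y) for y in inputs)]


if __name__ == "__main__":
    print(find_oracles())
```

Every pair whose product is 6 works here, so the synthesizer is free to return any of them.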

21 Synthesis algorithm: overview

1. translation: represent spec and sketch as circuits
2. synthesis: find suitable oracle
3. code generation: specialize sketch with respect to the oracle

22 Ex: Population count

    int pop (bit[W] x) {
        int count = 0;
        for (int i = 0; i < W; i++) {
            if (x[i]) count++;
        }
        return count;
    }

[Figure: circuit for F(x), a chain of mux and + stages over x and count, starting from count = 0]

23 Synthesis as generalized SAT

The sketch synthesis problem is an instance of 2QBF:

    $\exists o.\ \forall x.\ P(x) = S(x, o)$

Counter-example driven solver:

    I = {}
    x = random()
    do
        I = I U {x}
        c = synthesizeForSomeInputs(I)
        if c = nil then exit(“buggy sketch”)
        x = verifyForAllInputs(c)    // x: counter-example
    while x != nil
    return c

synthesizeForSomeInputs(I), with $I = \{x_1, x_2, \ldots, x_k\}$, solves $S(x_1, c) = P(x_1) \wedge \ldots \wedge S(x_k, c) = P(x_k)$ for c.
verifyForAllInputs(c) searches for an x such that $S(x, c) \neq P(x)$.
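The counter-example driven loop can be sketched in Python on the isolate-rightmost-0-bit example. This is an illustration under assumptions, not the SKETCH solver: both sub-steps use brute-force enumeration in place of SAT solving, the word size is fixed at 6 bits, and all function names are hypothetical.

```python
import random

W = 6                      # assumed word size, kept small for enumeration
MASK = (1 << W) - 1


def spec(x):
    """P(x): word with only the rightmost 0-bit of x set."""
    for i in range(W):
        if not (x >> i) & 1:
            return 1 << i
    return 0


def sketch(x, c):
    """S(x, c): ~(x + ??) & (x + ??) with the two holes taken from c."""
    c1, c2 = c
    return ~(x + c1) & (x + c2) & MASK


def synthesize_for_some_inputs(inputs):
    """Find hole values that match the spec on the collected inputs only."""
    for c1 in range(1 << W):
        for c2 in range(1 << W):
            if all(sketch(x, (c1, c2)) == spec(x) for x in inputs):
                return (c1, c2)
    return None


def verify_for_all_inputs(c):
    """Return a counter-example input, or None if c is correct everywhere."""
    for x in range(1 << W):
        if sketch(x, c) != spec(x):
            return x
    return None


def cegis(seed=0):
    rng = random.Random(seed)
    inputs = []
    x = rng.randrange(1 << W)
    while x is not None:
        inputs.append(x)
        c = synthesize_for_some_inputs(inputs)
        if c is None:
            raise ValueError("buggy sketch")
        x = verify_for_all_inputs(c)
    return c, inputs


if __name__ == "__main__":
    print(cegis())
```

Each counter-example is an input the current candidate gets wrong, so it is never one already collected, and the loop typically needs far fewer inputs than the full input space before the candidate verifies.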

24 Case study

Implemented AES
– the modern block-cipher standard
– 14 rounds: each has table lookup, permutation, GF-multiply
– a good implementation collapses each round into table lookups

Our results
– we synthesized a 32Kbit oracle!
– synthesis time: about 1 hour
– counterexample-driven synthesizer iterated 655 times
– performance of synthesized code within 10% of hand-tuned

25 Finite programs

In theory, SKETCH is complete for all finite programs:
– specification can specify any finite program
– sketch can describe any implementation over given instructions
– synthesizer can resolve any sketch

In practice, SKETCH scales for small finite programs
– small finite programs: block ciphers, small kernels
– large finite: big-integer multiplication, matrix multiplication

Solution:
– synthesize for a small input size
– prove (or examine) that result of synthesis works for bigger inputs

26 Stencil computations: an example

    grid sten(grid in, real a, real b)
        for (i,j) in [(1,1), (n-2,n-2)]
            out[i,j] = a*in[i-1,j] + b*in[i,j-1] + b*in[i+1,j] + a*in[i,j+1]
        return out;

27 An implementation idea

Expression b*in[i+1,j] + a*in[i,j+1] can be reused

    grid sten(grid in, real a, real b)
        for i in [??,n-??]
            for j in [??,n-??]
                t = a*in[i+??,j+??] + b*in[i+??,j+??];
                out[i+??,j+??] += t;
                out[i+??,j+??] = t;
        return out

    grid sten(grid in, real a, real b)
        for i in [??,n-??]
            for j in [??,n-??]
                t = a*in[i+??,j+??] + b*in[i+??,j+??];
                if (expression(i,j,n)) out[i+??,j+??] += t;
                if (expression(i,j,n)) out[i+??,j+??] = t;
        return out
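One possible resolution of such a sketch can be written out in Python and checked against the stencil. This is a hand-written illustration, not synthesizer output: it assumes a square n×n grid stored as nested lists, a zero-initialized output so that both updates can use `+=`, and integer data so that the comparison is exact. The reuse rests on the observation that a*in[i-1,j] + b*in[i,j-1] is the same expression as b*in[i+1,j] + a*in[i,j+1] evaluated at (i-1, j-1).

```python
import random


def sten_spec(grid, a, b):
    """Reference stencil: four-point sum over the interior cells."""
    n = len(grid)
    out = [[0] * n for _ in range(n)]
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            out[i][j] = (a * grid[i - 1][j] + b * grid[i][j - 1]
                         + b * grid[i + 1][j] + a * grid[i][j + 1])
    return out


def sten_reuse(grid, a, b):
    """Compute t once per (i, j) and add it to the two cells that need it."""
    n = len(grid)
    out = [[0] * n for _ in range(n)]
    for i in range(0, n - 1):
        for j in range(0, n - 1):
            t = b * grid[i + 1][j] + a * grid[i][j + 1]
            if i >= 1 and j >= 1:                 # t is the second half of out[i][j]
                out[i][j] += t
            if i + 1 <= n - 2 and j + 1 <= n - 2:  # and the first half of out[i+1][j+1]
                out[i + 1][j + 1] += t
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    g = [[rng.randrange(100) for _ in range(6)] for _ in range(6)]
    print(sten_reuse(g, 3, 5) == sten_spec(g, 3, 5))
```

The two guards play the role of the `expression(i,j,n)` holes: they keep writes inside the interior region that the specification defines.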

28 Lossless abstraction

Problem
– does result of synthesis for a small matrix work for all matrices?

Approach
– spec, sketch have unbounded-input/output
– abstract them into finite functions, with the same abstraction
– synthesize
– obtained oracle works for original sketch

Stencil kernels
– concrete: matrix A[N] → matrix B[N]
– abstract: A[e(i)], i, N → B[i]

29 Example: divide and conquer parallelization

Parallel algorithm:
– data rearrangement + parallel computation

spec:
– sequential version of the program

sketch:
– parallel computation

automatically synthesized:
– rearranging the data (dividing the data structure)