Dimensions in Synthesis Sumit Gulwani Microsoft Research, Redmond May 2012.



Program Synthesis
Synthesize a program in some underlying language from user intent using some search technique.
Why today?
– Variety of (cheap) computational devices and platforms
  – Billions of non-experts have access to these devices!
– Enabling technology is now available
  – Better search algorithms
  – Faster machines (good application for multi-cores)


Dimensions in Synthesis
Concept Language (Application)
– Programs
  – Straight-line programs
– Automata
– Queries
– Sequences
User Intent (Ambiguity)
– Logic, Natural Language
– Examples, Demonstrations/Traces
Search Technique (Algorithm)
– SAT/SMT solvers (Formal Methods)
– A*-style goal-directed search (AI)
– Version space algebras (Machine Learning)
PPDP 2010: “Dimensions in Program Synthesis”, Gulwani.

Compilers vs. Synthesizers
Dimension | Compilers | Synthesizers
Concept Language | Executable Program | Variety of concepts: Program, Automata, Query, Sequence
User Intent | Structured language | Variety/mixed form of constraints: logic, examples, traces
Search Technique | Syntax-directed translation (No new algorithmic insights) | Uses some kind of search (Discovers new algorithmic insights)

Potential Users of Synthesis Technology
– Students and Teachers
– End-Users
– Algorithm Designers
– Software Developers
Figure labels: Most Transformational Target, Most Useful Target
Vision for End-users: Enable people to have (automated) personal assistants.
Vision for Education: Enable every student to have access to free & high-quality education.

Organization
Lecture 1: Algorithms
Synthesis of Straight-line Programs from Logic
– Bit-vector Algorithms
– Geometry Constructions
Lecture 2: Applications
Intelligent Tutoring Systems
Lecture 3: Ambiguity
Synthesis from Examples & Keywords

Lab
Intelligent Tutoring Systems
Technical Goals:
– Identify a useful task that can be formalized as a synthesis problem.
– Propose an appropriate user interaction model.
– Propose an appropriate search technique.

Synthesizing Bitvector Algorithms PLDI 2011: Gulwani, Jha, Tiwari, Venkatesan


Bitvector Algorithms
Straight-line programs that use
– Arithmetic Operators: +,-,*,/
– Logical Operators: Bitwise and/or/not, Shift left/right

Examples of Bitvector Algorithms
Turn-off rightmost 1-bit: Z -> Z & (Z-1)

Examples of Bitvector Algorithms
Turn-off rightmost contiguous sequence of 1-bits: Z -> Z & (1 + (Z | (Z-1)))
Ceil of average of two integers without overflowing: (Y|Z) - ((Y ⊕ Z) >> 1)
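The three identities above can be checked by brute force on a small word size. The following is a minimal sketch, assuming 8-bit unsigned values (the slides do not fix a width); the `ref_*` functions are hypothetical bit-by-bit reference implementations written only for this check.

```python
WIDTH = 8                      # assumed bit width; the slides do not fix one
MASK = (1 << WIDTH) - 1

def turn_off_rightmost_one(z):
    # Z & (Z-1)
    return z & (z - 1) & MASK

def turn_off_rightmost_run(z):
    # Z & (1 + (Z | (Z-1)))
    return z & (1 + (z | (z - 1))) & MASK

def ceil_average(y, z):
    # (Y|Z) - ((Y xor Z) >> 1); no intermediate value exceeds WIDTH bits
    return (y | z) - ((y ^ z) >> 1)

def ref_turn_off_rightmost_one(z):
    # Reference: scan upward and clear the first set bit.
    for i in range(WIDTH):
        if (z >> i) & 1:
            return z & ~(1 << i)
    return z

def ref_turn_off_rightmost_run(z):
    # Reference: skip trailing zeros, then clear the run of ones that follows.
    i = 0
    while i < WIDTH and not (z >> i) & 1:
        i += 1
    while i < WIDTH and (z >> i) & 1:
        z &= ~(1 << i)
        i += 1
    return z

def identities_hold():
    values = range(1 << WIDTH)
    return (
        all(turn_off_rightmost_one(z) == ref_turn_off_rightmost_one(z) for z in values)
        and all(turn_off_rightmost_run(z) == ref_turn_off_rightmost_run(z) for z in values)
        and all(ceil_average(y, z) == (y + z + 1) // 2 for y in values for z in values)
    )
```

The ceiling-average reference uses Python's unbounded integers, so `(y + z + 1) // 2` cannot overflow; the bit trick reaches the same value without ever forming `y + z`.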

Examples of Bitvector Algorithms
Higher order half of product of x and y
  o1 := and(x,0xFFFF);
  o2 := shr(x,16);
  o3 := and(y,0xFFFF);
  o4 := shr(y,16);
  o5 := mul(o1,o3);
  o6 := mul(o2,o3);
  o7 := mul(o1,o4);
  o8 := mul(o2,o4);
  o9 := shr(o5,16);
  o10 := add(o6,o9);
  o11 := and(o10,0xFFFF);
  o12 := shr(o10,16);
  o13 := add(o7,o11);
  o14 := shr(o13,16);
  o15 := add(o14,o12);
  res := add(o15,o8);
Round up to next highest power of 2
  o1 := sub(x,1);
  o2 := shr(o1,1);
  o3 := or(o1,o2);
  o4 := shr(o3,2);
  o5 := or(o3,o4);
  o6 := shr(o5,4);
  o7 := or(o5,o6);
  o8 := shr(o7,8);
  o9 := or(o7,o8);
  o10 := shr(o9,16);
  o11 := or(o9,o10);
  res := add(o11,1);
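As a sanity check, the two straight-line programs can be transcribed into Python and compared with direct definitions. This is a sketch under the assumption of 32-bit unsigned arithmetic (suggested by the 16-bit halves and shifts, but not stated on the slide). In the round-up program the final step adds 1 to the fully smeared value, that is, the result of the last `or`.

```python
M32 = 0xFFFFFFFF               # assumption: all arithmetic is modulo 2**32

def high_half_of_product(x, y):
    # Line-by-line transcription of the 16-step program.
    o1 = x & 0xFFFF
    o2 = x >> 16
    o3 = y & 0xFFFF
    o4 = y >> 16
    o5 = (o1 * o3) & M32
    o6 = (o2 * o3) & M32
    o7 = (o1 * o4) & M32
    o8 = (o2 * o4) & M32
    o9 = o5 >> 16
    o10 = (o6 + o9) & M32
    o11 = o10 & 0xFFFF
    o12 = o10 >> 16
    o13 = (o7 + o11) & M32
    o14 = o13 >> 16
    o15 = (o14 + o12) & M32
    return (o15 + o8) & M32

def round_up_to_power_of_2(x):
    # Smear the highest set bit of x-1 into every lower position, then add 1.
    o = (x - 1) & M32
    for shift in (1, 2, 4, 8, 16):
        o |= o >> shift
    return (o + 1) & M32
```

For inputs in the range 1 to 2**31 the second function agrees with `1 << (x - 1).bit_length()`; outside that range the 32-bit result wraps around to 0.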

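Problem Definition
Given:
– Specification of desired functionality
– Specification of library components
Synthesize a straight-line program over the library components, where each argument of the component on line j is either a program input or the output of some line k where k<j, and the order in which the n components are placed on lines is a permutation of 1...n, such that the program meets the desired specification.
Verification Constraint: (formula not preserved in this transcript)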

Problem Definition: Turn-off rightmost 1 bit
Specification of desired functionality: (formula not preserved in this transcript)
Specification of library components: (formula not preserved in this transcript)

Synthesis Constraint
Verification Constraint: (formula not preserved in this transcript)
Synthesis Constraint: (formula not preserved in this transcript)

Idea # 1: Reduce Second-order Quantification in Synthesis Constraint to First Order
A candidate program is determined by which component goes on which location (line #) and from which location it gets its input arguments. We encode this by location variables L.

Example: Possible programs that use 2 components and their Representation using Location Variables
(figure not preserved in this transcript)

Encoding Well-formedness of Programs
Consistency Constraint: Every line in the program should have at most one component.
Acyclicity Constraint: A variable should be initialized before being used.
The following constraint ensures that L assignments correspond to well-formed programs. (constraint formula not preserved in this transcript)
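The two well-formedness conditions can be illustrated by enumerating location assignments directly instead of handing them to a solver. This is a minimal sketch under assumed conventions that the transcript does not spell out: locations `0..num_inputs-1` hold the program inputs, the remaining locations are program lines, each component has one output location, and each of its arguments has one input location. The names `well_formed` and `enumerate_programs` are hypothetical.

```python
from itertools import product

def well_formed(out_locs, in_locs):
    # Consistency: no two components share an output line.
    consistent = len(set(out_locs)) == len(out_locs)
    # Acyclicity: every argument is read from a location strictly before
    # the line on which the component itself is placed.
    acyclic = all(
        loc < out_locs[i] for i, args in enumerate(in_locs) for loc in args
    )
    return consistent and acyclic

def enumerate_programs(num_inputs, arities):
    """Yield every well-formed (output locations, input locations) assignment."""
    k = len(arities)
    lines = range(num_inputs, num_inputs + k)       # locations usable as lines
    all_locs = range(num_inputs + k)                # inputs plus lines
    for out_locs in product(lines, repeat=k):
        for flat in product(all_locs, repeat=sum(arities)):
            in_locs, pos = [], 0
            for arity in arities:                   # split flat argument list
                in_locs.append(flat[pos:pos + arity])
                pos += arity
            if well_formed(out_locs, in_locs):
                yield out_locs, tuple(in_locs)

# One program input, a unary component (e.g. decrement) and a binary one (e.g. and).
programs = list(enumerate_programs(1, [1, 2]))
```

With one input, a unary component and a binary component, the assignment `((1, 2), ((0,), (0, 1)))` places the unary component on line 1 reading the input and the binary component on line 2 reading the input and line 1, which is the shape of `Z & (Z-1)`.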

Encoding data-flow
The following constraint describes connections between inputs and outputs of various components. (constraint formula not preserved in this transcript)

Idea # 1: Reduce Second-order Quantification in Synthesis Constraint to First Order
(formula not preserved in this transcript)

Idea # 2: Using CEGIS style procedure to solve the Synthesis Constraint
Synthesis constraint is of the form: ∃L ∀Y F(L,Y)
1. Choose some values y1,..,yn for Y.
2. Finite Synthesis Step: solve ∃L F(L,y1) ∧ … ∧ F(L,yn).
   – No solution: Failure.
   – Solution L = S: go to the Verification Step.
3. Verification Step: does ∀Y F(S,Y) hold? Equivalently, is ∃Y ¬F(S,Y) unsatisfiable?
   – No solution: return S.
   – Solution Y = y(n+1): add it to the chosen values and repeat the Finite Synthesis Step.
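The loop can be sketched in a few lines of Python. This is an illustration of the counterexample-guided structure only, under loud assumptions: the candidate space is a hypothetical set of 12 two-line programs (`result = binary(z, unary(z))`) over 8-bit values, finite synthesis is plain enumeration, and verification is an exhaustive check over all 256 inputs. Neither step uses an SMT solver here. The specification is "turn off the rightmost 1-bit".

```python
from itertools import product

WIDTH = 8                       # assumed bit width
MASK = (1 << WIDTH) - 1

# Hypothetical component library for this sketch.
UNARY = {
    "not": lambda a: ~a & MASK,
    "neg": lambda a: -a & MASK,
    "inc": lambda a: (a + 1) & MASK,
    "dec": lambda a: (a - 1) & MASK,
}
BINARY = {
    "xor": lambda a, b: a ^ b,
    "or": lambda a, b: a | b,
    "and": lambda a, b: a & b,
}

def spec(z):
    # Reference behaviour: clear the lowest set bit.
    for i in range(WIDTH):
        if (z >> i) & 1:
            return z & ~(1 << i)
    return z

def run(prog, z):
    unary, binary = prog
    return BINARY[binary](z, UNARY[unary](z))

def finite_synthesis(examples):
    # Find some program that is correct on the finitely many chosen inputs.
    for prog in product(UNARY, BINARY):
        if all(run(prog, y) == spec(y) for y in examples):
            return prog
    return None

def verify(prog):
    # Return a counterexample input, or None if the program meets the spec.
    for y in range(1 << WIDTH):
        if run(prog, y) != spec(y):
            return y
    return None

def cegis(initial_examples):
    examples = list(initial_examples)
    while True:
        prog = finite_synthesis(examples)
        if prog is None:
            return None, examples           # Failure
        counterexample = verify(prog)
        if counterexample is None:
            return prog, examples           # verified on every input
        examples.append(counterexample)     # refine and try again

program, used_examples = cegis([0])
```

Starting from the single example 0, the first candidate that fits is `z & ~z`, which verification refutes with the input 3; with examples 0 and 3 the only remaining fit in this library is `("dec", "and")`, that is, `z & (z-1)`.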

Experiments: Comparison with Brute-force Search
[Table not recoverable from this transcript. Columns: Program (Name, lines), Brahma (iters, time), AHA time. Some rows show X in the AHA time column.]

Synthesizing Geometry Constructions PLDI 2011: Gulwani, Korthikanti, Tiwari.


Ruler/Compass based Geometry Constructions
Given a triangle XYZ, construct circle C such that C passes through X, Y, and Z.

Other Examples of Geometry Constructions
– Draw a regular hexagon given a side.
– Given 3 parallel lines, draw an equilateral triangle whose vertices lie on the parallel lines.
– Given 4 points, draw a square whose sides contain those points.

Significance
Good platform for teaching logical reasoning.
– Visual Nature: Makes it more accessible. Exercises both logical/visual abilities of left/right brain.
– Fun Aspect: Ruler/compass restrictions make it fun, as in sports.
Application in dynamic geometry or animations.
– “Constructive” geometry macros (unlike numerical methods) enable fast re-computation of derived objects from free (moving) objects.

Programming Language for Geometry Constructions
Types: Point, Line, Circle
Methods:
– Ruler(Point, Point) -> Line
– Compass(Point, Point) -> Circle
– Intersect(Circle, Circle) -> Pair of Points
– Intersect(Line, Circle) -> Pair of Points
– Intersect(Line, Line) -> Point
Geometry Program: A straight-line composition of the above methods.

Example Problem: Program
Given a triangle XYZ, construct circle C such that C passes through X, Y, and Z.
1. C1 = Compass(X,Y);
2. C2 = Compass(Y,X);
3. <P1,P2> = Intersect(C1,C2);
4. L1 = Ruler(P1,P2);
5. D1 = Compass(Z,X);
6. D2 = Compass(X,Z);
7. <R1,R2> = Intersect(D1,D2);
8. L2 = Ruler(R1,R2);
9. N = Intersect(L1,L2);
10. C = Compass(N,X);
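To make the ten steps concrete, here is a small numeric sketch of the primitives and the program in Python. It rests on assumptions the slides do not state: `Compass(A, B)` is read as the circle centered at A passing through B, points are `(x, y)` float tuples, a line is a pair of points, and a circle is a `(center, radius)` pair. The slide's overloaded `Intersect` is split into the hypothetical names `intersect_circles` and `intersect_lines`; the line-circle case is omitted because this program does not need it.

```python
import math

def ruler(p, q):
    # Line through p and q, kept as the pair of points.
    return (p, q)

def compass(center, through):
    # Assumption: circle centered at `center` passing through `through`.
    return (center, math.dist(center, through))

def intersect_circles(c1, c2):
    # Both intersection points of two circles that are assumed to cross.
    (x1, y1), r1 = c1
    (x2, y2), r2 = c2
    d = math.hypot(x2 - x1, y2 - y1)
    a = (r1 * r1 - r2 * r2 + d * d) / (2 * d)   # distance from c1 to the chord
    h = math.sqrt(max(r1 * r1 - a * a, 0.0))    # half the chord length
    mx, my = x1 + a * (x2 - x1) / d, y1 + a * (y2 - y1) / d
    ox, oy = h * (y2 - y1) / d, h * (x2 - x1) / d
    return (mx + ox, my - oy), (mx - ox, my + oy)

def intersect_lines(l1, l2):
    # Intersection of two non-parallel lines, each given by two points.
    (x1, y1), (x2, y2) = l1
    (x3, y3), (x4, y4) = l2
    den = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    t = ((x1 - x3) * (y3 - y4) - (y1 - y3) * (x3 - x4)) / den
    return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))

def circumcircle_program(X, Y, Z):
    C1 = compass(X, Y)
    C2 = compass(Y, X)
    P1, P2 = intersect_circles(C1, C2)
    L1 = ruler(P1, P2)              # perpendicular bisector of XY
    D1 = compass(Z, X)
    D2 = compass(X, Z)
    R1, R2 = intersect_circles(D1, D2)
    L2 = ruler(R1, R2)              # perpendicular bisector of XZ
    N = intersect_lines(L1, L2)
    return compass(N, X)

center, radius = circumcircle_program((0.0, 0.0), (4.0, 0.0), (0.0, 3.0))
```

For the right triangle used here the circumcenter is the midpoint of the hypotenuse, so the result should be close to center (2, 1.5) and radius 2.5, up to floating-point error.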

Specification Language for Geometry Programs
Conjunction of predicates over arithmetic expressions
Predicates p := e1 = e2 | e1 ≠ e2 | e1 ≤ e2
Arithmetic Expressions e := Distance(Point, Point) | Slope(Point, Point) | e1 ± e2 | c

Example Problem: Precondition/Postcondition
Given a triangle XYZ, construct circle C such that C passes through X, Y, and Z.
Precondition: Slope(X,Y) ≠ Slope(X,Z) ∧ Slope(X,Y) ≠ Slope(Z,X)
Postcondition: LiesOn(X,C) ∧ LiesOn(Y,C) ∧ LiesOn(Z,C)
where LiesOn(X,C) ≡ Distance(X,Center(C)) = Radius(C)

Verification/Synthesis Problem for Geometry Programs
Let P be a geometry program that computes outputs O from inputs I.
Verification Problem: Check the validity of the following Hoare triple.
  Assume Pre(I);
  P
  Assert Post(I,O);
Synthesis Problem: Given Pre(I), Post(I,O), find P such that the above Hoare triple is valid.

Approaches to Verification Problem
Pre(I), P, Post(I,O)
a) Symbolic decision procedures are complex.

Randomized Polynomial Identity Testing
Problem: Given two polynomials P1 and P2, determine whether they are equivalent.
The naïve deterministic algorithm of expanding polynomials to compare them term-wise is exponential.
A simple randomized test is probabilistically sufficient:
– Choose random values r for polynomial variables x
– If P1(r) ≠ P2(r), then P1 is not equivalent to P2.
– Otherwise P1 is equivalent to P2 with high probability.
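The randomized test takes only a few lines. The sketch below is one common way to set it up, not something the slide specifies: polynomials are given as black-box Python callables over integers, evaluation is done modulo the prime 2^31 - 1, and several independent trials are run. By the Schwartz-Zippel lemma, a single trial misses a real difference with probability at most d/p for total degree d and field size p. The name `probably_equal` and its parameters are hypothetical.

```python
import random

PRIME = 2_147_483_647           # 2**31 - 1, an assumed choice of field size

def probably_equal(p1, p2, num_vars, trials=20, rng=None):
    """Return False if a witness of inequality is found, True otherwise."""
    rng = rng or random.Random()
    for _ in range(trials):
        r = [rng.randrange(PRIME) for _ in range(num_vars)]
        if p1(*r) % PRIME != p2(*r) % PRIME:
            return False        # definitely not equivalent
    return True                 # equivalent with high probability

def square_of_sum(x, y):
    return (x + y) ** 2

def expanded(x, y):
    return x * x + 2 * x * y + y * y

def missing_cross_term(x, y):
    return x * x + y * y
```

A `False` answer is always correct because a differing evaluation is a concrete witness; only a `True` answer carries a (small) error probability.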

Approaches to Verification Problem
Pre(I), P, Post(I,O)
a) Symbolic decision procedures are complex.
b) New efficient approach: Random Testing!
   1. Choose I’ randomly from the set { I | Pre(I) }.
   2. Compute O’ := P(I’).
   3. If O’ satisfies Post(I’,O’) output “Verified”.
Correctness Proof of (b):
– Objects constructed by P can be described using polynomial ops (+,-,*), square-root & division operator.
– The randomized polynomial identity testing algorithm lifts to square-root and division operators as well!
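Approach (b) can be sketched as follows for the circumcircle problem. Several details are assumptions made for illustration: the precondition is checked as non-collinearity via a cross product rather than via slopes, the program under test is the standard closed-form circumcenter (built from +, -, * and division), equality in the postcondition is checked up to a floating-point tolerance, and a few trials are run instead of one. The centroid is included as a deliberately wrong program.

```python
import math
import random

def pre(X, Y, Z):
    # Assumed stand-in for the slope precondition: X, Y, Z are not collinear
    # (with a margin so the example stays numerically well conditioned).
    cross = (Y[0] - X[0]) * (Z[1] - X[1]) - (Y[1] - X[1]) * (Z[0] - X[0])
    return abs(cross) > 1.0

def post(X, Y, Z, N, tol=1e-6):
    # N must be equidistant from X, Y and Z (it is the center of circle C).
    d = math.dist(X, N)
    return abs(math.dist(Y, N) - d) < tol and abs(math.dist(Z, N) - d) < tol

def circumcenter(X, Y, Z):
    (ax, ay), (bx, by), (cx, cy) = X, Y, Z
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return (ux, uy)

def centroid(X, Y, Z):
    # Wrong program: coincides with the circumcenter only for special triangles.
    return ((X[0] + Y[0] + Z[0]) / 3, (X[1] + Y[1] + Z[1]) / 3)

def random_test(program, rng, trials=5):
    for _ in range(trials):
        while True:                                  # 1. sample I' with Pre(I')
            inputs = tuple(
                (rng.uniform(-10, 10), rng.uniform(-10, 10)) for _ in range(3)
            )
            if pre(*inputs):
                break
        output = program(*inputs)                    # 2. O' := P(I')
        if not post(*inputs, output):                # 3. check Post(I', O')
            return "Counterexample"
    return "Verified"
```

The wrong program passes only if a sampled triangle happens to be one where the centroid is equidistant from all three vertices, which mirrors the argument that a random triangle is almost never equilateral.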

Idea 1 (from Theory): Symbolic Reasoning -> Concrete
Synthesis Algorithm:
// First obtain a random input-output example.
1. Choose I’ randomly from the set { I | Pre(I) }.
2. Compute O’ s.t. Post(I’,O’) using numerical methods.
// Now obtain a construction that can generate O’ from I’ (using exhaustive search).
3. S := I’;
4. While (S does not contain O’)
5.   S := S ∪ { M(O1,O2) | Oi ∈ S, M ∈ Methods }
6. Output construction steps for O’.

Error Probability of the algorithm is extremely low.
Given a triangle XYZ, construct circle C such that C passes through X, Y, and Z.
  …
  L1 = Ruler(P1,P2);
  …
  L2 = Ruler(R1,R2);
  N = Intersect(L1,L2);
  C = Compass(N,X);
For an equilateral △XYZ, incenter coincides with circumcenter N. But what are the chances of choosing a random △XYZ to be an equilateral one?

Idea 2 (from PL): High-level Abstractions
Synthesis algorithm times out because programs are large.
Identify a library of commonly used patterns (pattern = “sequence of geometry methods”)
– E.g., perpendicular/angular bisector, midpoint, tangent, etc.
Replace
  S := S ∪ { M(O1,O2) | Oi ∈ S, M ∈ Methods }
with
  S := S ∪ { M(O1,O2) | Oi ∈ S, M ∈ LibMethods }
Two key advantages:
– Search space: large depth -> small depth
– Easier to explain solutions to students.

Use of high-level abstractions reduces program size
Given a triangle XYZ, construct circle C such that C passes through X, Y, and Z.
Using primitive methods:
1. C1 = Compass(X,Y);
2. C2 = Compass(Y,X);
3. <P1,P2> = Intersect(C1,C2);
4. L1 = Ruler(P1,P2);
5. D1 = Compass(Z,X);
6. D2 = Compass(X,Z);
7. <R1,R2> = Intersect(D1,D2);
8. L2 = Ruler(R1,R2);
9. N = Intersect(L1,L2);
10. C = Compass(N,X);
Using library methods:
1. L1 = PBisector(X,Y);
2. L2 = PBisector(X,Z);
3. N = Intersect(L1,L2);
4. C = Compass(N,X);

Idea 3 (from AI): Goal Directed Search
Synthesis algorithm is inefficient because the search space is too wide and hence still huge.
Prune forward search by using A* style heuristics.
Replace
  S := S ∪ { M(O1,O2) | Oi ∈ S, M ∈ LibMethods }
with
  S := S ∪ { M(O1,O2) | Oi ∈ S, M ∈ LibMethods, IsGood(M(O1,O2)) }
Example: If a method constructs a line L that passes through a desired output point, then L is “good” (i.e., worth constructing).
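The forward search, with and without the IsGood filter, can be sketched on one concrete example. Everything specific here is an assumption for illustration: the library contains only `ruler`, `pbisector`, `compass` and line-line `intersect`; coordinates are exact `Fraction` values so that set membership is reliable; the concrete triangle and its known circumcenter N are hard-coded instead of being sampled and solved numerically; and `is_good` filters only lines, keeping a line when it passes through N.

```python
from fractions import Fraction
from itertools import combinations, permutations

def point(x, y):
    return ("point", Fraction(x), Fraction(y))

def make_line(a, b, c):
    # Line a*x + b*y = c, scaled so the first nonzero coefficient is 1.
    k = a if a != 0 else b
    return ("line", a / k, b / k, c / k)

def ruler(p, q):
    a, b = q[2] - p[2], p[1] - q[1]          # normal to the direction q - p
    return make_line(a, b, a * p[1] + b * p[2])

def pbisector(p, q):
    a, b = q[1] - p[1], q[2] - p[2]          # normal is the direction q - p
    mx, my = (p[1] + q[1]) / 2, (p[2] + q[2]) / 2
    return make_line(a, b, a * mx + b * my)

def compass(c, p):
    # Circle stored as center plus squared radius to stay in the rationals.
    return ("circle", c[1], c[2], (p[1] - c[1]) ** 2 + (p[2] - c[2]) ** 2)

def intersect(l1, l2):
    _, a1, b1, c1 = l1
    _, a2, b2, c2 = l2
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None                          # parallel lines
    return ("point", (c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)

def is_good(obj, goal):
    if obj[0] != "line":
        return True                          # assumption: only lines are pruned
    _, a, b, c = obj
    return a * goal[1] + b * goal[2] == c    # line passes through the goal point

def forward_search(inputs, target, goal=None, max_rounds=3):
    """Return the number of objects built when target is reached, else None."""
    S = set(inputs)
    for _ in range(max_rounds):
        new = set()
        points = [o for o in S if o[0] == "point"]
        lines = [o for o in S if o[0] == "line"]
        for p, q in permutations(points, 2):
            new.update((ruler(p, q), pbisector(p, q), compass(p, q)))
        for l1, l2 in combinations(lines, 2):
            n = intersect(l1, l2)
            if n is not None:
                new.add(n)
        if goal is not None:
            new = {o for o in new if is_good(o, goal)}
        S |= new
        if target in S:
            return len(S)
    return None

X, Y, Z = point(0, 0), point(4, 0), point(1, 3)
N = point(2, 1)                  # known circumcenter of this triangle
C = compass(N, X)                # target circle O'
full = forward_search([X, Y, Z], C)
pruned = forward_search([X, Y, Z], C, goal=N)
```

Both runs reach the target circle in the third round, but the pruned run never keeps the triangle's side lines (none of them passes through N), so it builds fewer intermediate objects. A real implementation would also record which method produced each object so that the construction steps can be printed.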

Effectiveness of Goal-directed search
Given a triangle XYZ, construct circle C such that C passes through X, Y, and Z.
L1 and L2 are immediately constructed since they pass through output point N. On the other hand, other lines like angular bisectors are not eagerly constructed.

Experimental Results
25 benchmark problems, such as: Construct a square whose extended sides pass through 4 given points.
– 18 problems less than 1 second.
– 4 problems between 1-3 seconds.
– 3 problems seconds.
Idea 2 (high-level abstractions) reduces programs of size 3-45 to
Idea 3 (goal-directedness) improves performance by factor of times on most problems.

Search space Exploration: With/without goal-directedness
(figure not preserved in this transcript)


Optional Advance Preparation
Lecture 2
– Section 4 in WAMBSE 2012 keynote paper “Synthesis from Examples”, Gulwani.
Lab
– Section 4 in WAMBSE 2012 keynote paper.
– NCERT Online Book Website.
Lecture 3
– Sections 1-3 in WAMBSE 2012 keynote paper

Intelligent Tutoring Systems
Motivation
– Online learning sites: Khan Academy, edX, Udacity, Coursera
  – Increasing class sizes with even less personal attention
– New technologies: Tablets/Smartphones, NUI, Cloud
Various Aspects
– Solution Generation
– Problem Generation
– Automated Grading
– Content Entry
Various Domains
– K-12: Mathematics, Physics, Chemistry
– Undergraduate: Introductory Programming, Automata Theory
– Language Learning