Synthesis via Numerical Optimization

Synthesis via Numerical Optimization
Armando Solar-Lezama and Swarat Chaudhuri

Numerical Optimization

Goal: find a (hopefully global) minimum of a function.
- Sample the function at a small set of values.
- Based on the results, sample at a new set of locations.
Examples: gradient descent, Newton iteration, ...
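
As a concrete, deliberately minimal illustration of such an iteration, here is a gradient-descent sketch in Python; the quadratic test function, step size, and finite-difference gradient are my own choices, not from the talk.

def gradient_descent(f, x0, step=0.1, iters=100, h=1e-6):
    # Repeatedly step against a finite-difference estimate of the gradient.
    x = x0
    for _ in range(iters):
        grad = (f(x + h) - f(x - h)) / (2 * h)
        x -= step * grad
    return x

# The minimum of (x - 3)^2 is at x = 3.
print(gradient_descent(lambda x: (x - 3) ** 2, x0=0.0))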

Numerical Optimization for Synthesis

Many synthesis problems are best framed as optimization problems. This is especially true for problems involving hybrid systems.

The Parameter Synthesis Problem

tOff := ??;  tOn := ??;  h := ??;
forever {
  temp := readTemp();
  if (isOn() && temp > tOff)
    switchHeaterOff();
  elseif (!isOn() && temp < tOn)
    switchHeaterOn(h);
}

(The slide also shows the cooling and warming dynamics of the house.)

To motivate the problem, consider the following simple exercise. You want to develop a thermostat to control the temperature of a house. The house is a physical system that behaves according to a set of equations that determine how the temperature evolves when the heater is on or off. The thermostat, in turn, is a simple piece of code that keeps two thresholds: one to determine when to turn the heater on, and one to determine when to turn it off. The code is very simple, but determining the correct constants is far from trivial; for example, if tOn is too close to the target temperature, then by the time the next reading comes in, you will have overshot the target by too much. So the challenge is to find the most effective parameters to keep the house comfortable.

The Parameter Synthesis Problem

(Plot: temperature vs. time, with the 75-degree target marked.)

A simple approach to this problem is to write a program that takes tOn, tOff, and h as parameters and simulates the behavior of the house and the thermostat. The output of this program is a series of temperature readings. Since our goal is to keep the house at a pleasant 75 degrees, we can write an error function that gives us a measure of how close our temperature readings are to the desired temperature. This error function is itself a program, and our goal is to find values that minimize its output. At this point, the problem is no longer a parameter synthesis problem; it is a problem of how to find values that minimize the output of a complicated program. Can we find values of (tOff, tOn, h) to minimize Err?
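
To make this concrete, here is a small Python sketch of such an error function. It is a toy stand-in: the exponential cooling/warming dynamics and all the constants are assumptions for illustration, since the talk does not give the actual house model.

def thermostat_err(tOff, tOn, h, target=75.0, steps=200, dt=1.0):
    # Toy simulation of the house plus the thermostat loop above.
    temp, on, err = 60.0, False, 0.0
    for _ in range(steps):
        # Thermostat logic from the slide.
        if on and temp > tOff:
            on = False
        elif not on and temp < tOn:
            on = True
        # Assumed dynamics: the temperature drifts exponentially toward
        # an ambient 50 degrees when off, and toward 50 + h when on.
        ambient = 50.0 + (h if on else 0.0)
        temp += (ambient - temp) * 0.1 * dt
        err += (temp - target) ** 2
    return err

# Err as a function of the unknown constants; the synthesis goal is to
# minimize it.
print(thermostat_err(tOff=76.0, tOn=74.0, h=40.0))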

The challenge of optimizing programs

Plateaus and discontinuities thwart search.

Branches introduce discontinuities

For the thermostat example: function minimization has long been a problem of interest in numerical analysis, so one intriguing idea is to use a standard numerical minimization technique to minimize the error function. For example, could we use a simple gradient descent algorithm to find a minimum of our function? The problem is that our error function is not a particularly well-behaved function; specifically, it is full of plateaus and discontinuities as a result of the discrete control flow inside the program. This is a problem for gradient descent because, for all practical purposes, there is no gradient. In fact, it is a problem for many of the most effective minimization algorithms, because they all rely on some kind of continuity, and even differentiability, assumption.
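
The effect is easy to reproduce with a toy objective of my own (not the thermostat's actual Err): a single branch creates a plateau, so the gradient is zero almost everywhere and undefined at the jump.

def err(t):
    # One branch yields a flat plateau on the left and a jump at t = 1.0.
    return 5.0 if t < 1.0 else (t - 2.0) ** 2

for t in [0.9, 0.999, 1.0, 1.001]:
    print(t, err(t))   # the value jumps from 5.0 down to 1.0 at the branch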

Minimization by numerical search

Discontinuities affect even the best methods. So this approach seems like a dead end, but it raises the question: is it possible to approximate the program with a continuous function, one that works well with these numerical methods? Because if we can do that, we could use these optimization methods for a whole range of program analysis problems: not just parameter synthesis, but potentially even bug finding. This is really the big idea of this paper.

Key Idea: Smooth the Function

Approximate the problem by a smooth function.

Example: Gaussian Smoothing

Execute on a distribution instead of a value.
(Diagram: input x goes into the Program, which produces an output value.)

Example: Gaussian Smoothing

Execute on a distribution instead of a value; the expected value smooths out discontinuities.
(Diagram: a distribution centered at x goes into the Program, which produces an output distribution; its expected value is the smoothed result.)
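
Conceptually, the smoothed function is the expectation of the program's output under Gaussian input noise: F_smooth(x) = E[F(x + eps)] with eps ~ N(0, sigma^2). The Monte-Carlo sketch below is only to convey that definition; as a later slide stresses, sampling is not how Euler actually computes it.

import random

def smooth(f, x, sigma, n=100_000):
    # Monte-Carlo estimate of E[f(x + eps)], eps ~ N(0, sigma^2).
    return sum(f(x + random.gauss(0.0, sigma)) for _ in range(n)) / n

step = lambda x: 0.0 if x < 0.0 else 1.0   # discontinuous at x = 0
print(smooth(step, 0.0, sigma=1.0))        # ~0.5: the jump has been smoothed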

Parameterized Smoothing

The smoothing is parameterized by a value β, which corresponds to the standard deviation when smoothing with Gaussian noise.

Parameterized Smoothing

F_smooth(p, β) is parameterized by a value β, which corresponds to the standard deviation when smoothing with Gaussian noise. Note that the minima of the smoothed function are different from the minima of the original.

Smoothed Numerical Search

p_0 = random starting point
β_0 = initial beta
while β_i > ε:
    p_{i+1} = optimize F_smooth(p, β_i), starting from p_i
    β_{i+1} = β_i * q        where 0 ≤ q < 1
return p_i
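
The same loop in Python, as hypothetical glue code: smooth_f stands for any parameterized smoothing of the objective, and SciPy's Nelder-Mead is used here only because the pipeline slide below names a Nelder-Mead simplex backend (in Euler it lives in GSL).

from scipy.optimize import minimize

def sns(smooth_f, p0, beta0=1.0, q=0.5, eps=1e-3):
    # Optimize a sequence of less and less smoothed objectives,
    # warm-starting each solve from the previous answer.
    p, beta = p0, beta0
    while beta > eps:
        p = minimize(lambda pt: smooth_f(pt, beta), p,
                     method='Nelder-Mead').x
        beta *= q
    return p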

Smoothed Numerical Search

This is a very general procedure. Steps:
- Frame your problem as an optimization problem.
- Define a parameterized smoothing procedure.
- Run the algorithm.
The smoothing does not have to be perfect, but it should approximate the desired function in the limit as β → 0.

Euler: SNS for programs

SNS in Euler

Use symbolic Gaussian smoothing; β is simply the standard deviation. Quantifier alternation is handled through fixed test harnesses: not ideal, but it works well in practice.

Smoothed Numerical Search in Euler

(Pipeline diagram: a Sketch program (.sk file) containing program() { ... ?? ... } goes through the Euler compiler, producing an executable (.out file) with main() and smoothParallel(β, x); the optimization backend (GSL) drives a Nelder-Mead simplex search over it.)

Smooth Interpretation

How do we execute a program on a distribution of inputs? Not by sampling: we would miss essential features of the search landscape. Not numerically either, as that would require us to discretize, which would also miss important features. Instead, use symbolic execution: propagate symbolic representations of distributions.
(Diagram: input x goes into the Program, which produces an approximation of the expected output.)

Smooth Interpretation: Branch

(Diagram: at the branch if (x > 0), the input distribution is split between the true and false sides.)

Smooth Interpretation: Join

(Diagram: after the branch on (x > 0), the distributions from the true and false sides are merged back together.)

Smooth Interpretation: Assignment

(Diagram: assignments such as x = x - y, or x = -y, update the propagated distribution symbolically.)
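
To give a flavor of what executing a branch on a distribution means, here is a toy sketch of my own, assuming the state is a single Gaussian N(mu, sigma^2) and the guard is x > 0. This is a simplification, not Euler's actual representation: the true side gets weight P(x > 0), each side keeps the truncated-Gaussian mean, and the join collapses both sides into one weighted average, which is exactly the merging step the next slides flag as risky.

import math

def phi(a):   # standard normal pdf
    return math.exp(-a * a / 2) / math.sqrt(2 * math.pi)

def Phi(a):   # standard normal cdf
    return 0.5 * (1 + math.erf(a / math.sqrt(2)))

def branch_on_positive(mu, sigma):
    # Split the state N(mu, sigma^2) on the guard (x > 0).
    a = -mu / sigma                       # standardized threshold
    w_true = 1 - Phi(a)                   # probability of the true side
    mean_true = mu + sigma * phi(a) / max(w_true, 1e-12)
    mean_false = mu - sigma * phi(a) / max(Phi(a), 1e-12)
    return w_true, mean_true, mean_false

w, mt, mf = branch_on_positive(mu=0.5, sigma=1.0)
# Join: collapse the two sides back into a single expected value.
print(w * mt + (1 - w) * mf)   # recovers the overall mean, 0.5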

Approximations

Trick question: which of the previous approximations is problematic?

Approximations

Trick question: which of the previous approximations is problematic? Merging can introduce discontinuities!

Excessive merging can cause problems

Examples

Application: Parameter synthesis

t1 := ??;  t2 := ??;  A1 := ??;  A2 := ??;
repeat (every ΔT) {
  if (stage = BACK && time > t1) stage = INTURN;
  if (stage = INTURN) {
    if (time > t2) then stage = OUTTURN;
    angle = angle - A1;
  }
  if (stage = OUTTURN) angle = angle + A2;
  moveCar();
}

Error = distance between the ideal and the actual final position. Goal: minimize Error. We were motivated by a class of optimal control problems.

Slice of the Error function with respect to t2
(A sequence of plots of this slice appears across several slides.)

Numerical Search with Smoothing

Errors from different starting points

(Chart comparing the errors reached from different starting points; the annotations read "Both are quite good", "Both are quite bad", and "Result of smoothing is much better".)

What the difference means

(Figure: what an error of 40 looks like versus what an error of 10 looks like.)

Ongoing work

Smooth Verified Synthesis

Target: quantitative synthesis with hard constraints, for example:
- The temperature can never drop below some T_low.
- The parking car can never crash into the surrounding cars.
Also for probabilistic systems: the error stays within ε with probability p under some probabilistic model of the environment.

Straw-man Approach

- Use Euler to create a candidate program.
- Use abstract interpretation to verify the program.
- If it doesn't work, repeat.
Problems: abstract interpretation is brittle, and Euler works on concrete test harnesses (no quantifier alternation).

Smooth Verified Synthesis

Use smooth abstract interpretation: F(p, β) is a smooth approximation of the abstract interpretation, continuous with respect to p and β, with lim as β → 0 of F(p, β) being the true abstract interpretation. This is all we need to run Smoothed Numerical Search.

Thank you!