Predicting a Correct Program in PBE. Rishabh Singh, Microsoft Research; Sumit Gulwani, Microsoft Research
Programming By Examples: Intuitive, Natural, Accessible. Ambiguity!
Excel Forums. Input: 300_w1_aniSh_c1_b; output: w1; formula: =MID("300_w1_aniSh_c1_b",5,2)
Excel Forums. Input: 300_w30_aniSh_c1_b; output: w30; formula: =MID($B:$B,FIND("_",$B:$B)+1,FIND("_",REPLACE($B:$B,1,FIND("_",$B:$B),""))-1)
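As a rough illustration of why the first forum formula does not generalize, the following Python sketch mirrors both Excel formulas. The function names `fixed_mid` and `between_first_two_underscores` are hypothetical, introduced only for this example; note that Python indices are 0-based while Excel's MID and FIND are 1-based.

```python
def fixed_mid(s: str) -> str:
    """Mirror of =MID(s, 5, 2): two characters starting at the 5th character."""
    return s[4:6]


def between_first_two_underscores(s: str) -> str:
    """Mirror of the FIND/REPLACE formula: text between the 1st and 2nd underscore."""
    first = s.index("_")               # position of the first underscore
    second = s.index("_", first + 1)   # position of the next underscore
    return s[first + 1:second]


for row in ["300_w1_aniSh_c1_b", "300_w30_aniSh_c1_b"]:
    print(row, fixed_mid(row), between_first_two_underscores(row))
# 300_w1_aniSh_c1_b w1 w1
# 300_w30_aniSh_c1_b w3 w30
```

Both programs agree on the first row, so a single example cannot tell them apart; only the second row exposes the fixed-position version as wrong.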
FlashFill [Gulwani POPL 2011] [Gulwani, Harris, Singh CACM 2012]: DSL, VSA, Program, Heuristics, Benchmarks
DSL, VSA, Program, Ranking, Benchmarks
Handling Ambiguity. Input -> Output: Rick Rashid -> Mr. Rick; Satya Nadella -> ?
Prefer non-constants. Input -> Output: Rick Rashid -> Mr. Rick; Satya Nadella -> Ms. Satya. Prefer smaller substrings as constants.
Prefer smaller constants. Input -> Output: Satya Nadella -> S. Nadella; Bill Gates -> ? Candidates: 2nd word, last word, 2nd capital followed by 2nd lowercase string, …
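The ambiguity on this slide can be sketched in a few lines of Python. The four candidate programs below are hypothetical stand-ins written as lambdas, not programs from the FlashFill DSL; all of them reproduce the single example, yet they disagree on the next input.

```python
example_in, example_out = "Satya Nadella", "S. Nadella"

# Hypothetical candidate programs, named by what they do.
candidates = {
    "initial + '. ' + 2nd word": lambda s: s[0] + ". " + s.split()[1],
    "initial + '. ' + last word": lambda s: s[0] + ". " + s.split()[-1],
    "initial + constant '. Nadella'": lambda s: s[0] + ". Nadella",
    "constant 'S. Nadella'": lambda s: "S. Nadella",
}

# Keep only the programs that reproduce the given example.
consistent = {name: p for name, p in candidates.items()
              if p(example_in) == example_out}

# Apply every consistent program to the unseen input.
outputs = {name: p("Bill Gates") for name, p in consistent.items()}
for name, out in outputs.items():
    print(f"{name:32} -> {out}")
```

All four candidates survive the consistency check, and the programs built from larger constants produce "B. Nadella" and "S. Nadella" for Bill Gates, which is the motivation for preferring smaller constants.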
Machine Learning for Ranking: “With great power comes great responsibility.”
Three Challenges: Labelled Training Data, Machine Learning Algorithm, Efficient Ranking Algorithm
Training Data Generation. Input -> Output: Rick Rashid -> Mr. Rashid; Satya Nadella -> Mr. Nadella; Peter Lee -> Mr. Lee
Structuring Hypothesis Space with Sharing in Version-space. Associative Expressions: $f(e_1, f(e_2, f(e_3, e_4)))$, DAG-based sharing. Fixed-arity Expressions: $f(e_1, e_2, e_3, e_4)$, Set-based sharing.
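One way to picture DAG-based sharing for an associative operator such as concatenation: nodes are positions in the output string, and each edge (i, j) carries the set of atomic expressions that can produce output[i:j]. Every path from node 0 to node n is then one concatenation program. The sketch below assumes only two kinds of atoms (constant strings and substrings at fixed input positions), which is far simpler than the actual DSL; `build_dag` and `count_programs` are hypothetical names.

```python
def build_dag(inp: str, out: str) -> dict:
    """Map each edge (i, j) to the atoms that can produce out[i:j]."""
    n = len(out)
    edges = {}
    for i in range(n):
        for j in range(i + 1, n + 1):
            piece = out[i:j]
            atoms = [("const", piece)]           # a constant always works
            start = inp.find(piece)
            while start != -1:                   # every occurrence in the input
                atoms.append(("substr", start, start + len(piece)))
                start = inp.find(piece, start + 1)
            edges[(i, j)] = atoms
    return edges


def count_programs(edges: dict, n: int) -> int:
    """Count concatenation programs (paths 0 -> n) without enumerating them."""
    ways = [0] * (n + 1)
    ways[0] = 1
    for j in range(1, n + 1):
        ways[j] = sum(ways[i] * len(edges[(i, j)]) for i in range(j))
    return ways[n]


inp, out = "Rick Rashid", "Mr. Rick"
edges = build_dag(inp, out)
print(len(edges), "edges represent", count_programs(edges, len(out)), "programs")
```

The number of edges grows quadratically with the output length, while the number of represented programs grows exponentially, which is why the sharing matters for both learning and ranking.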
Ranking Function $f(p)$. Assume Linear Function: $f(p) = w_1 f_1 + w_2 f_2 + \dots + w_k f_k$
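A minimal sketch of scoring with a linear ranking function follows. The two features (number of constant characters, number of substring atoms) and the weight values are assumptions made up for this illustration; the slides do not give concrete features or weights at this point.

```python
def score(weights, features):
    """Linear ranking score: w_1*f_1 + w_2*f_2 + ... + w_k*f_k."""
    if len(weights) != len(features):
        raise ValueError("weights and features must have the same length")
    return sum(w * f for w, f in zip(weights, features))


# Hypothetical feature vectors: [constant characters, substring atoms].
programs = {
    "constant 'Mr. Rick'": [8, 0],
    "constant 'Mr. ' + first word": [4, 1],
}
weights = [-1.0, 0.5]  # hypothetical: penalize constants, reward substrings

ranked = sorted(programs, key=lambda name: score(weights, programs[name]),
                reverse=True)
print(ranked[0])  # constant 'Mr. ' + first word
```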
Learning To Rank. Logistic Regression, Listwise Approach: didn’t work well. Too strong a constraint: all relevant pages ranked over irrelevant ones.
Training Phase. Input -> Output: Rick Rashid -> Mr. Rick; Satya Nadella -> Mr. Satya; Peter Lee -> Mr. Lee. Lower 1st uppercase letter; Constant “r”; Lower 2nd uppercase letter; … Goal: find a ranking function f(p) over program features that ranks positive programs higher than negative programs.
Learn DAGs. Rick Rashid -> Mr. Rashid; Satya Nadella -> Mr. Satya
Intersect DAGs. Rick Rashid -> Mr. Rick; Satya Nadella -> Mr. Satya
Assign Positive Labels. Rick Rashid -> Mr. Rick; Satya Nadella -> Mr. Satya
Assign Negative Labels. Rick Rashid -> Mr. Rick; Satya Nadella -> Mr. Satya
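The labelling steps above can be sketched with explicit sets instead of DAGs. This sketch assumes that positive programs are those consistent with every example of a task, and negative programs are those consistent with the first example only; that reading is an interpretation of the slides. It reuses the three name examples from the Training Data Generation slide, and the candidate programs are hypothetical lambdas.

```python
examples = [
    ("Rick Rashid", "Mr. Rashid"),
    ("Satya Nadella", "Mr. Nadella"),
    ("Peter Lee", "Mr. Lee"),
]

# Hypothetical candidate programs.
candidates = {
    "'Mr. ' + last word": lambda s: "Mr. " + s.split()[-1],
    "'Mr. ' + 2nd word": lambda s: "Mr. " + s.split()[1],
    "'Mr. ' + text from 6th character": lambda s: "Mr. " + s[5:],
    "constant 'Mr. Rashid'": lambda s: "Mr. Rashid",
    "'Mr. ' + first word": lambda s: "Mr. " + s.split()[0],
}


def consistent(program, exs):
    """True if the program reproduces every (input, output) pair."""
    return all(program(i) == o for i, o in exs)


first_only = examples[:1]
positives = [n for n, p in candidates.items() if consistent(p, examples)]
negatives = [n for n, p in candidates.items()
             if consistent(p, first_only) and not consistent(p, examples)]

print("positive:", positives)
print("negative:", negatives)
```

The last candidate fails even the first example, so it receives no label; in a version-space setting it would never have been learned in the first place.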
Rick Rashid -> Mr. Rick; Satya Nadella -> Mr. Satya. Learn ranking function f(p) that ranks positive programs higher than negative programs.
Training Phase. Positive Programs, Negative Programs: rank any positive program over all negative programs.
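The objective "rank any positive program over all negative programs" can be illustrated with a small sketch. The margin-perceptron update below is a stand-in chosen for brevity and is not claimed to be the loss function or optimizer used in the talk; the feature vectors (constant characters, substring atoms) are made up for the example.

```python
def dot(w, x):
    return sum(a * b for a, b in zip(w, x))


def rank_loss(w, positives, negatives):
    """Number of negatives scoring at least as high as the best positive."""
    best = max(dot(w, p) for p in positives)
    return sum(1 for n in negatives if dot(w, n) >= best)


def train(positives, negatives, dims, steps=100, lr=0.1, margin=1.0):
    """Push the best-scoring positive above every negative by a margin."""
    w = [0.0] * dims
    for _ in range(steps):
        best_pos = max(positives, key=lambda p: dot(w, p))
        for n in negatives:
            if dot(w, best_pos) - dot(w, n) < margin:
                w = [wi + lr * (pi - ni) for wi, pi, ni in zip(w, best_pos, n)]
    return w


# Hypothetical feature vectors: [constant characters, substring atoms].
positives = [[4, 1], [3, 2]]
negatives = [[8, 0], [5, 1]]

before = rank_loss([0.0, 0.0], positives, negatives)
w = train(positives, negatives, dims=2)
after = rank_loss(w, positives, negatives)
print(before, after)  # 2 0
```

Only one positive program needs to outrank the negatives, which is a weaker requirement than ordering every positive above every negative.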
Hierarchical Ranking. Expression levels: Atomic Expression, Substring Expression, Concat Expression. Features: frequency of tokens, context, neighborhood, …; length of substring, input, output, constant, …; number of arguments, sum, max, min, prod.
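A hedged sketch of the hierarchical idea: scores computed for child expressions are aggregated into a fixed-length feature vector for the parent. It assumes that the last feature group on the slide (number of arguments, sum, max, min, prod) belongs to Concat expressions; `concat_features` is a hypothetical name.

```python
import math


def concat_features(child_scores):
    """Aggregate child scores into fixed-length features for a Concat node."""
    return [
        len(child_scores),        # number of arguments
        sum(child_scores),
        max(child_scores),
        min(child_scores),
        math.prod(child_scores),
    ]


print(concat_features([0.5, 2.0, 1.0]))  # [3, 3.5, 2.0, 0.5, 1.0]
```

Because the aggregate has the same length regardless of how many arguments the concatenation takes, one weight vector can score Concat expressions of any arity.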
Evaluation: 175 benchmarks, train-test partition. Baseline (Occam’s razor): smallest and simplest programs.
Ranking Evaluation: LearnRank learns from 1 example for 79% of benchmarks.
Efficiency of Ranking
Ranking for PBE: Machine Learning + Synthesis. VSA Sharing Formalization; Efficient Features & Algorithms; General Loss Function for PBE. Thanks!