Component-based Synthesis
Sumit Gulwani (MSR Redmond), Susmit Jha and Sanjit Seshia (UC-Berkeley)
Joint work with: Ashish Tiwari (SRI), Ramarathnam Venkatesan (MSR Bangalore/Redmond)
Component based Synthesis
Problem Definition
Given:
– A library of components where each component comes with its functional specification
– Functional Specification of desired behavior
Obtain: Appropriate composition of components to obtain desired behavior.
Applications: Bit-vector Algorithm Synthesis, Deobfuscation
Application 1: Bit-vector Algorithms
Straight-line programs that use
– Arithmetic Operators: +, -, *, /
– Logical Operators: Bitwise and/or/not, Shift left/right
Challenge: Combining arithmetic and logical operators leads to unintuitive algorithms
Application: Provides the most efficient way to accomplish a given task on a given architecture
Examples of Bitvector Algorithms
Turn-off rightmost 1-bit: Y & (Y-1)
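A minimal Python sketch of the expression on this slide, assuming non-negative integers; the function name is made up for illustration.

```python
def turn_off_rightmost_one(y: int) -> int:
    """Clear the lowest set bit of a non-negative integer y."""
    # y - 1 flips the lowest 1-bit to 0 and every 0 below it to 1,
    # so AND-ing with y keeps only the bits above that position.
    return y & (y - 1)


print(format(turn_off_rightmost_one(0b01011000), "08b"))  # 01010000
```

An input of 0 has no 1-bit to clear and comes back unchanged.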
Examples of Bitvector Algorithms
Turn-off rightmost contiguous sequence of 1-bits: Y & (1 + (Y | (Y-1)))
Ceil of average of two integers without overflowing: (X|Y) - ((X ⊕ Y) >> 1)
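Both expressions can be tried directly in Python. This is a sketch assuming non-negative inputs; the function names are illustrative only. Python integers do not overflow, so the second function only shows that the expression agrees with the rounded-up average, not the overflow behavior on a fixed-width machine.

```python
def turn_off_rightmost_ones_run(y: int) -> int:
    """Clear the rightmost contiguous run of 1-bits in y."""
    # y | (y - 1) fills the zeros below the run with ones; adding 1 then
    # carries through the run, and the final AND with y drops everything
    # that was not set in y to begin with.
    return y & (1 + (y | (y - 1)))


def ceil_avg(x: int, y: int) -> int:
    """Ceiling of (x + y) / 2 computed without forming x + y."""
    # Uses the identity x + y == 2 * (x | y) - (x ^ y).
    return (x | y) - ((x ^ y) >> 1)


print(format(turn_off_rightmost_ones_run(0b01011100), "08b"))  # 01000000
print(ceil_avg(3, 4))                                          # 4
```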
Examples of Bitvector Algorithms
P25: Higher order half of product of x and y
o1 := and(x,0xFFFF); o2 := shr(x,16); o3 := and(y,0xFFFF); o4 := shr(y,16);
o5 := mul(o1,o3); o6 := mul(o2,o3); o7 := mul(o1,o4); o8 := mul(o2,o4);
o9 := shr(o5,16); o10 := add(o6,o9); o11 := and(o10,0xFFFF); o12 := shr(o10,16);
o13 := add(o7,o11); o14 := shr(o13,16); o15 := add(o14,o12); res := add(o15,o8);
P24: Round up to next highest power of 2
o1 := sub(x,1); o2 := shr(o1,1); o3 := or(o1,o2); o4 := shr(o3,2);
o5 := or(o3,o4); o6 := shr(o5,4); o7 := or(o5,o6); o8 := shr(o7,8);
o9 := or(o7,o8); o10 := shr(o9,16); o11 := or(o9,o10); res := add(o11,1);
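The two straight-line programs can be transliterated into Python and compared with obvious reference computations. This is a sketch under the assumption of 32-bit unsigned words; the function and variable names are illustrative. For P24 the final step adds 1 to the fully smeared value (the result of the last or).

```python
MASK = 0xFFFFFFFF  # assumption: 32-bit unsigned words


def p24_round_up_pow2(x: int) -> int:
    """Round x up to the next power of 2, for x in 1..2**31."""
    o = (x - 1) & MASK
    # Smear the highest set bit into every lower position.
    for shift in (1, 2, 4, 8, 16):
        o |= o >> shift
    return (o + 1) & MASK


def p25_mulhi(x: int, y: int) -> int:
    """Upper 32 bits of the 64-bit product of two 32-bit unsigned values."""
    x_lo, x_hi = x & 0xFFFF, x >> 16
    y_lo, y_hi = y & 0xFFFF, y >> 16
    lo_lo = x_lo * y_lo                      # o5
    mid1 = x_hi * y_lo + (lo_lo >> 16)       # o10
    mid2 = x_lo * y_hi + (mid1 & 0xFFFF)     # o13
    return (mid2 >> 16) + (mid1 >> 16) + x_hi * y_hi


print(p24_round_up_pow2(37))                     # 64
print(hex(p25_mulhi(0xFFFFFFFF, 0xFFFFFFFF)))    # 0xfffffffe
```

None of the intermediate sums in `p25_mulhi` exceeds 32 bits, which is why the 16-bit split is used.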
Application 2: Deobfuscation
Transform given code into simpler representation (using components from the given code).
Important for identifying malware/viruses
Deobfuscation Example: Multiply by 45
Int multiply45Obs(int y)
  a=1; b=0; z=1; c=0;
  while(1)
    if (a==0)
      if (b==0)
        y=z+y; a=¬a; b=¬b; c=¬c; if ¬c break;
      else
        z=z+y; a=¬a; b=¬b; c=¬c; if ¬c break;
    else
      if (b==0)
        z=y<<2; a=¬a;
      else
        z=y<<3; a=¬a; b=¬b;
  return y;
Int multiply45(int y)
  z=y<<2; y=z+y; z=y<<3; y=z+y; return y;
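One way to convince yourself that the two routines agree is to run both. The sketch below is a Python transliteration that reads the flag updates as logical negation and takes the nesting of the if/else branches from the order of the statements; both readings are assumptions about the slide, and Python integers are unbounded, so overflow is not modeled.

```python
def multiply45(y: int) -> int:
    z = y << 2      # 4y
    y = z + y       # 5y
    z = y << 3      # 40y
    y = z + y       # 45y
    return y


def multiply45_obs(y: int) -> int:
    a, b, z, c = True, False, 1, False
    while True:
        if not a:
            if not b:
                y = z + y
                a, b, c = not a, not b, not c
                if not c:
                    break
            else:
                # Never reached on any input: a == 0 always coincides with b == 0.
                z = z + y
                a, b, c = not a, not b, not c
                if not c:
                    break
        else:
            if not b:
                z = y << 2
                a = not a
            else:
                z = y << 3
                a, b = not a, not b
    return y


print(multiply45_obs(7), multiply45(7))  # 315 315
```

The loop body runs four times: shift by 2, add, shift by 3, add, which is exactly the straight-line version.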
Deobfuscation Example: Interchange src/dest
InterchangeObs(Ipaddr *s, *d) *s = *s ⊕ *d; if (*s == *s ⊕ *d) *s = *s ⊕ *d; if (*s == *s ⊕ *d) *d = *s ⊕ *d; if (*d == *s ⊕ *d) *s = *d ⊕ *s; else *s = *s ⊕ *d; *d = *s ⊕ *d; return; else *s = *s ⊕ *d; *d = *s ⊕ *d; *s = *s ⊕ *d;
Interchange(Ipaddr *s, *d) *d = *s ⊕ *d; *s = *s ⊕ *d; *d = *s ⊕ *d;
Dimensions in Program Synthesis
Functional Specification
– Pre/Post-conditions, Input-output examples, Inefficient/Related programs
– Interaction in face of over/under specification
Search Space
– Imperative/Functional Programs: Operators, Control-flow
– Restricted Models of Computation
Search Technique
– Constraint Generation: Invariant-based, Path-based, Input-based; Precise/Abstract/Approximate Operator Encoding
– Constraint Solving
Dimensions in Program Synthesis
Functional Specification
– Pre/Post-conditions, Input-output examples, Inefficient/Related programs
– Interaction in face of over/under specification
Search Space
– Imperative/Functional Programs: Operators (Arithmetic/Logical), Control-flow (Straight-Line)
– Restricted Models of Computation
Search Technique
– Constraint Generation: Invariant-based, Path-based, Input-based; Precise/Abstract/Approximate Operator Encoding
– Constraint Solving
Functional Specification
Choice 1: Logical relation between inputs and outputs
Choice 2: Input-Output Examples
Functional Specification: Logical Relations
Problem: Turn off rightmost 1-bit
Functional Spec of components Subtract, Bitwise-And
Subtract(I1,I2,J) := J = (I1-I2)
Bitwise-And(I1,I2,J) := J = (I1 & I2)
Functional Specification of desired behavior
$\bigwedge_{p=1}^{b} \Big[ \Big( I[p]=1 \wedge \bigwedge_{j=p+1}^{b} (I[j]=0) \Big) \Rightarrow \Big( J[p]=0 \wedge \bigwedge_{j \neq p} (J[j]=I[j]) \Big) \Big]$
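The logical specification on this slide can be read as: for every position p, if bit p of I is 1 and all bits to its right are 0, then bit p of J is 0 and every other bit of J equals the corresponding bit of I. The sketch below encodes that reading as a Python predicate and checks by enumeration that composing the Subtract and Bitwise-And components satisfies it. Treating index 1 as the leftmost bit and b as the rightmost, the 6-bit width, and all function names are assumptions made for illustration; enumeration is a stand-in here, not the constraint-based technique of the talk.

```python
def bit(v: int, p: int, b: int) -> int:
    """Bit p of a b-bit value; p = 1 is the leftmost bit, p = b the rightmost."""
    return (v >> (b - p)) & 1


def spec(i: int, j: int, b: int) -> bool:
    """Desired behavior: J is I with its rightmost 1-bit turned off."""
    for p in range(1, b + 1):
        rightmost_one = bit(i, p, b) == 1 and all(
            bit(i, q, b) == 0 for q in range(p + 1, b + 1)
        )
        if rightmost_one:
            ok = bit(j, p, b) == 0 and all(
                bit(j, q, b) == bit(i, q, b) for q in range(1, b + 1) if q != p
            )
            if not ok:
                return False
    return True


def subtract(i1: int, i2: int, b: int) -> int:
    return (i1 - i2) % (1 << b)      # J = I1 - I2 on b-bit words


def bitwise_and(i1: int, i2: int) -> int:
    return i1 & i2                   # J = I1 & I2


def candidate(y: int, b: int) -> int:
    """Composition of the two components: Y & (Y - 1)."""
    return bitwise_and(y, subtract(y, 1, b))


B = 6  # assumption: small width so every input can be enumerated
print(all(spec(y, candidate(y, B), B) for y in range(1 << B)))  # True
```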
Experiments: Comparison with Exhaustive Search
[Table: columns Program (Name, lines), Brahma (iters, time), AHA time; rows P1 to P25. Cell values are not recoverable from the extracted text; X entries appear in the AHA time column for several programs from P13 onward.]
Functional Specification
Problem: Turn off rightmost contiguous string of 1-bits
Logical Relations
– A bit complicated
Input-Output Relations
– Key challenge is to resolve ambiguity
– Our solution: Interaction with user
Dialog: Interactive Synthesis
Problem: Turn-off rightmost contiguous string of 1’s
User: I want a design that maps … -> …
Oracle: I can think of two designs
  Design 1: (x+1) & (x-1)
  Design 2: (x+1) & x
  which differ on … (Distinguishing Input). What should … be mapped to?
User: … -> …
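To make the notion of a distinguishing input concrete, the sketch below enumerates the inputs on which the two designs from this dialog disagree. Brute-force enumeration over 5-bit words is an assumption made to keep the example small, and the example input is hypothetical rather than taken from the slide.

```python
BITS = 5                      # assumption: a small word size so enumeration is cheap
MASK = (1 << BITS) - 1


def design1(x: int) -> int:
    return (x + 1) & (x - 1) & MASK


def design2(x: int) -> int:
    return (x + 1) & x & MASK


def distinguishing_inputs(f, g, bits: int) -> list:
    """All inputs of the given width on which f and g disagree."""
    return [x for x in range(1 << bits) if f(x) != g(x)]


example = 0b01011             # hypothetical user example, not from the slides
print(design1(example) == design2(example))   # True: both give 01000
diff = distinguishing_inputs(design1, design2, BITS)
print([format(x, "05b") for x in diff[:4]])
```

Any element of `diff` is a question the oracle can put to the user: the answer rules out at least one of the two designs.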
Dialog: Interactive Synthesis
Problem: Turn-off rightmost contiguous string of 1’s
User: … -> …
Oracle: …? User: … -> …
Oracle: …? User: … -> …
Oracle: …? User: … -> …
Oracle: …? User: … -> …
Oracle: …? User: … -> …
Oracle: Your design is x & (1 + ((x-1)|x))
Synthesizing Inputs for Dialog with User
Distinguishing Input construction is a bit expensive. We tried two optimizations
– Interleave with random inputs. Overall end-to-end performance was even worse.
– Interleave with biased random inputs. Performs best.
Biased Random Input Selection
Theorem: If a circuit uses only add/subtract/and/or/not operators, then the i-th bit of an output depends only on the i-th bit of the inputs and the bits to its right.
Biased Random Strategy:
– Choose a random input whose rightmost bits are different from the ones that have already been queried for.
– For example, if 3 inputs of the following form have been queried: r1 0 0, r2 0 1, r3 1 0, then choose the 4th input to be of the form r4 1 1
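The biased random strategy can be sketched as follows. The parameter `k` (how many rightmost bits to diversify), the fallback to a plain random input once every suffix has been seen, and the function name are all assumptions for illustration; the slide only states that the rightmost bits should differ from those already queried.

```python
import random


def biased_random_input(queried, bits: int, k: int, rng: random.Random) -> int:
    """Pick a `bits`-wide input whose k rightmost bits differ from every queried input."""
    low_mask = (1 << k) - 1
    seen = {q & low_mask for q in queried}
    unseen = [s for s in range(1 << k) if s not in seen]
    if not unseen:
        # Assumed fallback: all k-bit suffixes are covered, so pick uniformly.
        return rng.getrandbits(bits)
    low = rng.choice(unseen)
    high = rng.getrandbits(bits - k) if bits > k else 0
    return (high << k) | low


rng = random.Random(0)
queried = [0b10100, 0b01101, 0b11010]   # suffixes 00, 01, 10 already used
x = biased_random_input(queried, bits=5, k=2, rng=rng)
print(format(x & 0b11, "02b"))          # 11
```

By the theorem on this slide, the low output bits of such circuits depend only on the low input bits, so covering fresh suffixes exercises behavior that earlier queries could not have revealed.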
Experiments: Random vs Biased-Random
[Table: columns Prog., Random (Time, Iters), Biased-Random (Time, Iters); rows P1 to P25. Cell values are not recoverable from the extracted text; X entries appear under Random for P22 and P25.]
Conclusion: Component based Synthesis
Problem Definition
Given:
– A library of components where each component comes with its functional specification
– Functional Specification of desired behavior
Obtain: Appropriate composition of components to obtain desired behavior.
Inspiration
– Standard process of knowledge discovery
– Modular development
– Can it help with modular synthesis?