Download presentation
Presentation is loading. Please wait.
Published byDamian Turner Modified over 9 years ago
1
Superoptimization Venkatesh Karthik Srinivasan Guest Lecture in CS 701, Nov. 10, 2015
2
IA-32 Primer 2 Registers 1000 ESP Stack push eax 20 EAX 20 mov ebx, eax 20 EBX mov [esp], 60 60 lea esp, [esp+4] 9961000 add eax, ebx 40 Registers EAX, EBX, ECX, EDX, ESP, EBP, EIP, ESI, EDI Registers EAX, EBX, ECX, EDX, ESP, EBP, EIP, ESI, EDI Flags CF, OF, ZF, SF, PF, … Flags CF, OF, ZF, SF, PF, …
3
Superoptimization Input: An instruction sequence I – Instructions belong to an ISA – No loops Output: An instruction-sequence I’ – Instructions belong to same ISA – I’ is equivalent to I On all inputs, I and I’ produce the same results – I’ is the optimal implementation of I + timeout t an improved shorter faster lesser energy consumption
4
Superoptimization sub eax, ecx neg eax sub eax, 1 sub eax, ecx neg eax sub eax, 1 IA-32 Superoptimizer IA-32 Superoptimizer not eax add eax, ecx not eax add eax, ecx EAX ← ECX – EAX – 1
5
Superoptimization add ebp. 4 mov [ebp], eax add ebp, 4 mov [ebp], ebx add ebp. 4 mov [ebp], eax add ebp, 4 mov [ebp], ebx IA-32 Superoptimizer IA-32 Superoptimizer mov [ebp+4], eax mov [ebp+8], ebx add ebp, 8 mov [ebp+4], eax mov [ebp+8], ebx add ebp, 8
6
Why do we need a superoptimizer?
7
© Alvin Cheung, CSE 501
8
Peephole Optimization Done at low-level IR – Closer to assembly code Sliding window – “peephole” Peephole rules – Rewrite rules of form “LHS → RHS ” If current peephole matches LHS of some rule, replace with RHS
9
Peephole Rules Rule categoryLHSRHS Eliminating redundant loads and stores mov reg, [addr] // Load contents of [addr] in reg mov [addr], reg // Store contents of reg in [addr] (Both instructions must be in the same basic block) mov reg, [addr] Control-flow optimizations goto L1 … L1: goto L2 goto L2 … L1: goto L2 Algebraic rewrites add oprnd, 0(none) imul oprnd, 2shl oprnd, 1 Machine idioms add oprnd, 1inc oprnd
10
Problem with Peephole Rules Manually written – Cumbersome – Error-prone
11
Peephole Superoptimizer Superoptimizer. mov eax, [esp] mov [esp], eax. mov eax, [esp] add eax, 1 mov [esp], eax.... mov eax, [esp] mov [esp], eax... mov eax, [esp] add eax, 1 mov [esp], eax.... mov eax, [esp]. mov eax, [esp] add eax, 1 mov [esp], eax.... mov eax, [esp]... add eax, 1 mov [esp], eax...
12
Peephole Superoptimizer Superoptimizer. mov eax, [esp] mov [esp], eax. mov eax, [esp] add eax, 1 mov [esp], eax. mov eax, [esp] mov [esp], eax. mov eax, [esp] add eax, 1 mov [esp], eax. mov eax, [esp]. add [esp], 1 mov eax, [esp].... mov eax, [esp].... add [esp], 1 mov eax, [esp]...
13
Peephole Superoptimizer Superoptimization is a sloooooooooowwwww process
14
Optimization Database Training programs Instruction sequences Superoptimizer Optimization database
15
Peephole Superoptimizer. mov eax, [esp] mov [esp], eax. mov eax, [esp] add eax, 1 mov [esp], eax.... mov eax, [esp] mov [esp], eax... mov eax, [esp] add eax, 1 mov [esp], eax.... mov eax, [esp]. mov eax, [esp] add eax, 1 mov [esp], eax.... mov eax, [esp]... add eax, 1 mov [esp], eax... Optimization database
16
Peephole Superoptimizer. mov eax, [esp] mov [esp], eax. mov eax, [esp] add eax, 1 mov [esp], eax. mov eax, [esp] mov [esp], eax. mov eax, [esp] add eax, 1 mov [esp], eax. mov eax, [esp]. add [esp], 1 mov eax, [esp].... mov eax, [esp].... add [esp], 1 mov eax, [esp]... Optimization database
17
Outline Problem statement Motivation – Peephole optimization – Speeding up critical inner loops Equivalence of two instruction-sequences A naïve superoptimizer Massalin/Bansal-Aiken superoptimizer – Canonicalization – Test cases – Counterexamples as tests – Pruning away sub-optimal candidates Stochastic superoptimization
18
Checking equivalence of two instruction-sequences Key primitive in superoptimizers and related tools “Do two loop-free instruction sequences I and I’ produce the same outputs for all possible inputs?”
19
Checking equivalence of two instruction-sequences 1.Encode I as a logical formula ϕ 2.Encode I’ as a logical formula ψ 3.Use an SMT solver to check if ϕ ⟺ ψ is logically valid How to encode an instruction sequence as a logical formula? How to use an SMT solver to check if two formulas are equivalent?
20
How to encode an instruction sequence I as a logical formula ϕ? 1.Symbolically evaluate I on Id state to obtain post-state σ 2.Convert σ into a logical formula
21
Concrete Evaluation IA-32 state IA-32 interpreter – Concrete operational semantics of IA-32 instructions
22
IA-32 State ⟨RegMap, FlagMap, MemMap⟩ RegMap : Register ↦ 32-bit value FlagMap : Flag ↦ Boolean value MemMap : 32-bit address ↦ 8-bit value ⟨ [EBX ↦ 8, ESP ↦ 1000 ], [ ], [1000 ↦ 2]⟩ ⟨ [EAX ↦ 0, EBX ↦ 8, ECX ↦ 0, …, ESP ↦ 1000, EBP ↦ 0, …, EDI ↦ 0 ], [SF ↦ false, CF ↦ false, … OF ↦ false ], [0 ↦ 0, 4 ↦ 0, 8 ↦ 0, …, 1000 ↦ 2, 1004 ↦ 0, …, 4294967292 ↦ 0]⟩ ⟨ [EAX ↦ 0, EBX ↦ 8, ECX ↦ 0, …, ESP ↦ 1000, EBP ↦ 0, …, EDI ↦ 0 ], [SF ↦ false, CF ↦ false, … OF ↦ false ], [0 ↦ 0, 4 ↦ 0, 8 ↦ 0, …, 1000 ↦ 2, 1004 ↦ 0, …, 4294967292 ↦ 0]⟩ 32-bit address ↦ 32-bit value
23
Concrete Evaluation mov eax, [esp] add eax, 4 sub ebx, 4 mov [esp], ebx ⟨ [EBX ↦ 8, ESP ↦ 1000 ], [ ], [1000 ↦ 2]⟩ ⟨ [EBX ↦ 8, ESP ↦ 1000 ], [ ], [1000 ↦ 2]⟩
24
Concrete Evaluation mov eax, [esp] add eax, 4 sub ebx, 4 mov [esp], ebx ⟨ [EBX ↦ 8, ESP ↦ 1000 ], [ ], [1000 ↦ 2]⟩ ⟨ [EBX ↦ 8, ESP ↦ 1000 ], [ ], [1000 ↦ 2]⟩
25
Concrete Evaluation mov eax, [esp] add eax, 4 sub ebx, 4 mov [esp], ebx ⟨ [EAX ↦ 2, EBX ↦ 8, ESP ↦ 1000 ], [ ], [1000 ↦ 2]⟩ ⟨ [EAX ↦ 2, EBX ↦ 8, ESP ↦ 1000 ], [ ], [1000 ↦ 2]⟩
26
Concrete Evaluation mov eax, [esp] add eax, 4 sub ebx, 4 mov [esp], ebx ⟨ [EAX ↦ 6, EBX ↦ 8, ESP ↦ 1000 ], [PF ↦ true ], [1000 ↦ 2]⟩ ⟨ [EAX ↦ 6, EBX ↦ 8, ESP ↦ 1000 ], [PF ↦ true ], [1000 ↦ 2]⟩
27
Concrete Evaluation mov eax, [esp] add eax, 4 sub ebx, 4 mov [esp], ebx ⟨ [EAX ↦ 6, EBX ↦ 4, ESP ↦ 1000 ], [PF ↦ true ], [1000 ↦ 2]⟩ ⟨ [EAX ↦ 6, EBX ↦ 4, ESP ↦ 1000 ], [PF ↦ true ], [1000 ↦ 2]⟩
28
Concrete Evaluation mov eax, [esp] add eax, 4 sub ebx, 4 mov [esp], ebx ⟨ [EAX ↦ 6, EBX ↦ 4, ESP ↦ 1000 ], [PF ↦ true ], [1000 ↦ 4]⟩ ⟨ [EAX ↦ 6, EBX ↦ 4, ESP ↦ 1000 ], [PF ↦ true ], [1000 ↦ 4]⟩
29
Symbolic Evaluation Symbolic IA-32 state Symbolic IA-32 interpreter – Symbolic transformers for instructions – Obtained from concrete operational semantics of IA-32 instructions
30
Symbolic IA-32 State ⟨RegMap, FlagMap, MemMap⟩ RegMap : Symbolic register-constant ↦ term FlagMap : Symbolic flag-constant ↦ formula MemMap : Function symbol ↦ function-update expression ID state ⟨ [EAX’ ↦ EAX, EBX’ ↦ EBX, …, ESP’ ↦ ESP, EBP’ ↦ EBP, …, EDI’ ↦ EDI ], [SF’ ↦ SF, CF’ ↦ CF, … ZF’ ↦ ZF ], [Mem’ ↦ Mem]⟩ ID state ⟨ [EAX’ ↦ EAX, EBX’ ↦ EBX, …, ESP’ ↦ ESP, EBP’ ↦ EBP, …, EDI’ ↦ EDI ], [SF’ ↦ SF, CF’ ↦ CF, … ZF’ ↦ ZF ], [Mem’ ↦ Mem]⟩ QFBV+Arrays
31
Symbolic Evaluation mov eax, [esp] add eax, 4 sub ebx, 4 mov [esp], ebx ⟨ [EAX’ ↦ EAX, EBX’ ↦ EBX, …, ESP’ ↦ ESP, … ], [SF’ ↦ SF, …, ZF’ ↦ ZF ], [Mem’ ↦ Mem]⟩ ⟨ [EAX’ ↦ EAX, EBX’ ↦ EBX, …, ESP’ ↦ ESP, … ], [SF’ ↦ SF, …, ZF’ ↦ ZF ], [Mem’ ↦ Mem]⟩
32
Symbolic Evaluation mov eax, [esp] add eax, 4 sub ebx, 4 mov [esp], ebx ⟨ [EAX’ ↦ EAX, EBX’ ↦ EBX, …, ESP’ ↦ ESP, … ], [SF’ ↦ SF, …, ZF’ ↦ ZF ], [Mem’ ↦ Mem]⟩ ⟨ [EAX’ ↦ EAX, EBX’ ↦ EBX, …, ESP’ ↦ ESP, … ], [SF’ ↦ SF, …, ZF’ ↦ ZF ], [Mem’ ↦ Mem]⟩
33
Symbolic Evaluation mov eax, [esp] add eax, 4 sub ebx, 4 mov [esp], ebx ⟨ [EAX’ ↦ Mem(ESP), EBX’ ↦ EBX, …, ESP’ ↦ ESP, … ], [SF’ ↦ SF, …, ZF’ ↦ ZF ], [Mem’ ↦ Mem]⟩ ⟨ [EAX’ ↦ Mem(ESP), EBX’ ↦ EBX, …, ESP’ ↦ ESP, … ], [SF’ ↦ SF, …, ZF’ ↦ ZF ], [Mem’ ↦ Mem]⟩
34
Symbolic Evaluation mov eax, [esp] add eax, 4 sub ebx, 4 mov [esp], ebx ⟨ [EAX’ ↦ Mem(ESP) + 4, EBX’ ↦ EBX, …, ESP’ ↦ ESP, … ], [SF’ ↦ ((Mem(ESP) + 4) < 0), …, ZF’ ↦ ((Mem(ESP) + 4) = 0) ], [Mem’ ↦ Mem]⟩ ⟨ [EAX’ ↦ Mem(ESP) + 4, EBX’ ↦ EBX, …, ESP’ ↦ ESP, … ], [SF’ ↦ ((Mem(ESP) + 4) < 0), …, ZF’ ↦ ((Mem(ESP) + 4) = 0) ], [Mem’ ↦ Mem]⟩
35
Symbolic Evaluation mov eax, [esp] add eax, 4 sub ebx, 4 mov [esp], ebx ⟨ [EAX’ ↦ Mem(ESP) + 4, EBX’ ↦ EBX – 4, …, ESP’ ↦ ESP, … ], [SF’ ↦ ((EBX – 4) < 0), …, ZF’ ↦ ((EBX – 4) = 0) ], [Mem’ ↦ Mem]⟩ ⟨ [EAX’ ↦ Mem(ESP) + 4, EBX’ ↦ EBX – 4, …, ESP’ ↦ ESP, … ], [SF’ ↦ ((EBX – 4) < 0), …, ZF’ ↦ ((EBX – 4) = 0) ], [Mem’ ↦ Mem]⟩
36
Symbolic Evaluation mov eax, [esp] add eax, 4 sub ebx, 4 mov [esp], ebx ⟨ [EAX’ ↦ Mem(ESP) + 4, EBX’ ↦ EBX – 4, …, ESP’ ↦ ESP, … ], [SF’ ↦ ((EBX – 4) < 0), …, ZF’ ↦ ((EBX – 4) = 0) ], [Mem’ ↦ Mem[ESP ↦ EBX – 4]]⟩ ⟨ [EAX’ ↦ Mem(ESP) + 4, EBX’ ↦ EBX – 4, …, ESP’ ↦ ESP, … ], [SF’ ↦ ((EBX – 4) < 0), …, ZF’ ↦ ((EBX – 4) = 0) ], [Mem’ ↦ Mem[ESP ↦ EBX – 4]]⟩
37
Convert a symbolic state into a logical formula ⟨ [EAX’ ↦ Mem(ESP) + 4, EBX’ ↦ EBX – 4, …, ESP’ ↦ ESP, … ], [SF’ ↦ ((EBX – 4) < 0), …, ZF’ ↦ ((EBX – 4) = 0) ], [Mem’ ↦ Mem[ESP ↦ EBX – 4]]⟩ ⟨ [EAX’ ↦ Mem(ESP) + 4, EBX’ ↦ EBX – 4, …, ESP’ ↦ ESP, … ], [SF’ ↦ ((EBX – 4) < 0), …, ZF’ ↦ ((EBX – 4) = 0) ], [Mem’ ↦ Mem[ESP ↦ EBX – 4]]⟩ EAX’ = Mem(ESP) + 4 ∧ EBX’ = EBX – 4 ∧ … ∧ ESP’ = ESP ∧ … ∧ SF’ = ((EBX – 4) < 0) ∧ … ∧ ZF’ = ((EBX – 4) = 0) ∧ Mem’ = Mem[ESP ↦ EBX – 4] EAX’ = Mem(ESP) + 4 ∧ EBX’ = EBX – 4 ∧ … ∧ ESP’ = ESP ∧ … ∧ SF’ = ((EBX – 4) < 0) ∧ … ∧ ZF’ = ((EBX – 4) = 0) ∧ Mem’ = Mem[ESP ↦ EBX – 4]
38
How to encode an instruction sequence I as a logical formula ϕ? 1.Symbolically evaluate I on Id state to obtain post-state σ 2.Convert σ into a logical formula push eax EAX’ = EAX ∧ EBX’ = EBX ∧ … ∧ ESP’ = ESP – 4 ∧ … ∧ SF’ = SF ∧ … ∧ ZF’ = ZF ∧ Mem’ = Mem[ESP – 4 ↦ EAX] EAX’ = EAX ∧ EBX’ = EBX ∧ … ∧ ESP’ = ESP – 4 ∧ … ∧ SF’ = SF ∧ … ∧ ZF’ = ZF ∧ Mem’ = Mem[ESP – 4 ↦ EAX] ESP’ = ESP – 4 ∧ Mem’ = Mem[ESP – 4 ↦ EAX]
39
Exercise Convert the following instruction sequence into a QFBV formula lea esp, [esp - 4] mov [esp], 1 push ebp mov ebp, esp
40
Checking Equivalence of Two Instruction-Sequences 1.Encode I as a logical formula ϕ 2.Encode I’ as a logical formula ψ 3.Use an SMT solver to check if ϕ ⟺ ψ is logically valid How to encode an instruction sequence as a logical formula? How to use an SMT solver to check if two formulas are equivalent?
41
How to use SMT solver to check equivalence? Conventionally used for “satisfiability” – Symbolic execution for test-case generation EAX > 0 ∧ EBX = EAX + 4 ∧ EBX > 0 SMT solver SAT [EAX ↦ 1, EBX ↦ 5 ] SAT [EAX ↦ 1, EBX ↦ 5 ] EAX > 0 ∧ EBX = EAX + 4 ∧ EBX ≤ 0 SMT solver UNSAT
42
How to use SMT solver to check equivalence? SMT solver UNSAT (Two formulas are equivalent) UNSAT (Two formulas are equivalent) SMT solver SAT [EAX ↦ 1, EBX ↦ 1 ] (Counterexample) SAT [EAX ↦ 1, EBX ↦ 1 ] (Counterexample)
43
Checking equivalence of two instruction sequences sub eax, ecx neg eax sub eax, 1 sub eax, ecx neg eax sub eax, 1 EAX’ = -1 * (EAX + (-1 * ECX)) - 1 ∧ SF’ = (-1 * (EAX + (-1 * ECX)) - 1) < 0 EAX’ = -1 * (EAX + (-1 * ECX)) - 1 ∧ SF’ = (-1 * (EAX + (-1 * ECX)) - 1) < 0 not eax add eax, ecx not eax add eax, ecx EAX’ = -1 * EAX + ECX - 1 ∧ SF’ = (-1 * EAX + ECX - 1) < 0 EAX’ = -1 * EAX + ECX - 1 ∧ SF’ = (-1 * EAX + ECX - 1) < 0 SMT solver UNSAT
44
Checking equivalence of two instruction sequences mov eax, [esp] add eax, ebx mov eax, [esp] add eax, ebx EAX’ = Mem(ESP) + EBX ∧ SF’ = (Mem(ESP) + EBX) < 0 EAX’ = Mem(ESP) + EBX ∧ SF’ = (Mem(ESP) + EBX) < 0 mov eax, [esp] sub eax, ebx mov eax, [esp] sub eax, ebx SMT solver SAT [EBX ↦ 1, ESP ↦ 1000, Mem[1000 ↦ 1 ]] SAT [EBX ↦ 1, ESP ↦ 1000, Mem[1000 ↦ 1 ]] EAX’ = Mem(ESP) + EBX ∧ SF’ = (Mem(ESP) - EBX) < 0 EAX’ = Mem(ESP) + EBX ∧ SF’ = (Mem(ESP) - EBX) < 0
45
Outline Problem statement Motivation – Peephole optimization Equivalence of two instruction-sequences A naïve superoptimizer Massalin/Bansal-Aiken superoptimizer – Canonicalization – Test cases – Counterexamples as tests – Pruning away sub-optimal candidates Stochastic superoptimization
46
Thousands of unique IA-32 instruction schemas Exponential cost of enumeration += Instruction- Sequence Enumerator Equivalence Check Equivalence Check I A Naïve Superoptimizer Optimality check No Timeout expired I’
47
Outline Problem statement Motivation – Peephole optimization Equivalence of two instruction-sequences A naïve superoptimizer Massalin/Bansal-Aiken superoptimizer – Canonicalization – Test cases – Counterexamples as tests – Pruning away sub-optimal candidates Stochastic superoptimization
48
Canonicalization mov ebp, esp mov esp, ebp mov ebp, esp mov esp, ebp IA-32 Superoptimizer IA-32 Superoptimizer mov ebp, esp mov ebp, eax mov eax, ebp mov ebp, eax mov eax, ebp IA-32 Superoptimizer IA-32 Superoptimizer mov ebp, eax mov esi, esp mov esp, esi mov esi, esp mov esp, esi IA-32 Superoptimizer IA-32 Superoptimizer mov esi, esp
49
Canonicalization mov ebp, esp mov esp, ebp mov ebp, esp mov esp, ebp IA-32 Superoptimizer IA-32 Superoptimizer mov ebp, esp mov ebp, eax mov eax, ebp mov ebp, eax mov eax, ebp IA-32 Superoptimizer IA-32 Superoptimizer mov ebp, eax mov esi, esp mov esp, esi mov esi, esp mov esp, esi IA-32 Superoptimizer IA-32 Superoptimizer mov esi, esp mov reg1, reg2 mov reg2, reg1 mov reg1, reg2 mov reg2, reg1 IA-32 Superoptimizer IA-32 Superoptimizer mov reg1, reg2
50
Canonicalization add ebp, 20 add ebp, 30 add ebp, 20 add ebp, 30 IA-32 Superoptimizer IA-32 Superoptimizer add ebp, 50 add reg1, c1 add reg1, c2 add reg1, c1 add reg1, c2 IA-32 Superoptimizer IA-32 Superoptimizer add reg1, c1+c2
51
Canonicalization add ebp, 20 add ebp, 30 add ebp, 20 add ebp, 30 Canonicalizer add ebp, 50 add reg1, c1 add reg1, c2 add reg1, c1 add reg1, c2 Uncanonicalizer add reg1, c1+c2 IA-32 Superoptimizer IA-32 Superoptimizer reg1 ↦ ebp c1 ↦ 20 c2 ↦ 30 reg1 ↦ ebp c1 ↦ 20 c2 ↦ 30
52
Instruction- Sequence Enumerator Equivalence Check Equivalence Check I A Naïve Superoptimizer Optimality check No Timeout expired I’
53
Instruction- Sequence Enumerator Equivalence Check Equivalence Check I A Naïve Superoptimizer + Canonicalization Optimality check No Timeout expired I’ Canonicalizer I’’ Uncanonicalizer
54
Outline Problem statement Motivation – Peephole optimization Equivalence of two instruction-sequences A naïve superoptimizer Massalin/Bansal-Aiken superoptimizer – Canonicalization – Test cases – Counterexamples as tests – Pruning away sub-optimal candidates Stochastic superoptimization
55
Instruction- Sequence Enumerator Equivalence Check Equivalence Check I A Naïve Superoptimizer + Canonicalization Optimality check No Timeout expired I’ Canonicalizer I’’ Uncanonicalizer
56
Test Cases mov reg1, [reg2] add reg1, reg3 mov reg1, [reg2] add reg1, reg3 mov reg1, [reg2] sub reg1, reg3 mov reg1, [reg2] sub reg1, reg3 [REG3 ↦ 0, REG2 ↦ 1000, Mem[1000 ↦ 1 ]] [REG3 ↦ 0, REG2 ↦ 1000, Mem[1000 ↦ 1 ]] [REG1 ↦ 1, REG3 ↦ 0, REG2 ↦ 1000, Mem[1000 ↦ 1 ]] [REG1 ↦ 1, REG3 ↦ 0, REG2 ↦ 1000, Mem[1000 ↦ 1 ]] [REG1 ↦ 1, REG3 ↦ 0, REG2 ↦ 1000, Mem[1000 ↦ 1 ]] [REG1 ↦ 1, REG3 ↦ 0, REG2 ↦ 1000, Mem[1000 ↦ 1 ]] [REG3 ↦ 1, REG2 ↦ 1000, Mem[1000 ↦ 1 ]] [REG3 ↦ 1, REG2 ↦ 1000, Mem[1000 ↦ 1 ]] [REG1 ↦ 2, REG3 ↦ 1, REG2 ↦ 1000, Mem[1000 ↦ 1 ]] [REG1 ↦ 2, REG3 ↦ 1, REG2 ↦ 1000, Mem[1000 ↦ 1 ]] [REG1 ↦ 0, REG3 ↦ 1, REG2 ↦ 1000, Mem[1000 ↦ 1 ]] [REG1 ↦ 0, REG3 ↦ 1, REG2 ↦ 1000, Mem[1000 ↦ 1 ]]
57
Instruction- Sequence Enumerator Equivalence check Equivalence check I Test Cases Optimality check Optimality check No Timeout expired I’ Canonicalizer Test cases Test cases Fail I’’ Uncanonicalizer
58
Checking equivalence of two instruction sequences mov reg1, [reg2] add reg1, reg3 mov reg1, [reg2] add reg1, reg3 REG1’ = Mem(REG2) + REG3 ∧ SF’ = (Mem(REG2) + REG3) < 0 REG1’ = Mem(REG2) + REG3 ∧ SF’ = (Mem(REG2) + REG3) < 0 mov reg1, [reg2] sub reg1, reg3 mov reg1, [reg2] sub reg1, reg3 SMT solver SAT [REG3 ↦ 1, REG2 ↦ 1000, Mem[1000 ↦ 1 ]] SAT [REG3 ↦ 1, REG2 ↦ 1000, Mem[1000 ↦ 1 ]] REG1’ = Mem(REG2) + REG3 ∧ SF’ = (Mem(REG2) – REG3) < 0 REG1’ = Mem(REG2) + REG3 ∧ SF’ = (Mem(REG2) – REG3) < 0 Counterexample
59
Instruction- Sequence Enumerator Equivalence check Equivalence check I Test Cases Optimality check Optimality check No Timeout expired I’ Canonicalizer Test cases Test cases Fail Counter-example Counterexample- guided inductive synthesis (CEGIS) I’’ Uncanonicalizer
60
Outline Problem statement Motivation – Peephole optimization Equivalence of two instruction-sequences A naïve superoptimizer Massalin/Bansal-Aiken superoptimizer – Canonicalization – Test cases – Counterexamples as tests – Pruning away sub-optimal candidates Stochastic superoptimization
61
Pruning sub-optimal candidates. add ebp, c1+c2. One-instruction sequencesTwo-instruction sequencesThree-instruction sequences. add ebp, c1 add ebp, c2...........................
62
Instruction- Sequence Enumerator + Pruning Equivalence check Equivalence check I Superoptimizer Optimality check Optimality check No Timeout expired I’ Canonicalizer Test cases Test cases Fail Counter-example
63
Results – Search-space pruning LengthOriginal search space After canonicalization After pruningReduction factor 154539976448.5 229 million2.49 million1.2 million24.7 3162.1 billion8.6 billion3.11 billion52.1 Original search space After canonicalization After pruning After test-cases
64
But still …. Superoptimization is still a sloooooooooowwwww process After canonicalization, pruning, and testing, checking all candidates of length up to 3 instructions takes several hours
65
Outline Problem statement Motivation – Peephole optimization Equivalence of two instruction-sequences A naïve superoptimizer Massalin/Bansal-Aiken superoptimizer – Canonicalization – Test cases – Counterexamples as tests – Pruning away sub-optimal candidates Stochastic superoptimization
66
© Eric Schkufza, ASPLOS 2013
69
Montgomery multiplication kernel in OpenSSL library (116 instructions): STOKE’s result is 16 instructions shorter and 1.6 times faster than gcc –O3’s result
70
Outline Problem statement Motivation – Peephole optimization Equivalence of two instruction-sequences A naïve superoptimizer Massalin/Bansal-Aiken superoptimizer – Canonicalization – Test cases – Counterexamples as tests – Pruning away sub-optimal candidates Stochastic superoptimization
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.