Download presentation
Presentation is loading. Please wait.
1
Validating High-Level Synthesis Sudipta Kundu, Sorin Lerner, Rajesh Gupta Department of Computer Science and Engineering, University of California, San Diego
2
Validating High-Level Synthesis 2 Power Perfor mance Area High-Level Synthesis Algorithmic Design Register Transfer Level Design (RTL) Functionally Equivalent Scheduled Design Functionally Equivalent Functionally Equivalent Controller S0 S1 S2S3 S4 f !f Data path …. x = a * b; c = a < b; if (c) then a = b – x; else a = b + x; a = a + x; b = b * x; …. C/C++, SystemC Verilog, VHDL High-Level Synthesis (HLS)
3
University of California, San Diego Validating High-Level Synthesis 3 Equivalence Checker For each translation Does not guarantee => HLS tool is bug free Checker Once and for all HLS tool always produce correct results The Problem Transformations Input Program (Specification) Transformed Program (Implementation) Translation Validation (TV) Optimizing Compiler [Pnueli et al. 98] [Necula 00] [Zuck et al. 05] Does guarantee => any errors in translation will be caught when tool runs Is the Specification “functionally equivalent” to the Implementation?
4
University of California, San Diego Validating High-Level Synthesis 4 Contributions Developed TV techniques for a new setting: HLS Algorithm uses a bisimulation relation approach. Implemented TV for a realistic HLS tool: SPARK Widely used: 4,000 downloads, over 100 active users. Large Software: around 125, 000 LoC. Ran it on 12 benchmarks Modular: works on one procedure at a time. Practical: took on average 6 secs to run per procedure. Easy to Implement: only a fraction of development cost of SPARK. Useful: found 2 previously unknown bugs.
5
University of California, San Diego Validating High-Level Synthesis 5 Outline 1. Motivation and Problem definition 2. Our Approach using an Example Definition of Equivalence Translation Validation Algorithm Generate Constraints Solve Constraints 3. Experiments and Results SPARK: Parallelizing HLS Framework 4. Conclusion
6
University of California, San Diego Validating High-Level Synthesis 6 Original Program: An Example of HLS Specification: i 2 : k = p i 1 : sum = 0 i 3 : (k < 10) i 6 : ¬ (k < 10) i 4 : k = k + 1 i 5 : sum = sum + k a2a2 a3a3 a4a4 a5a5 a1a1 a6a6 a0a0 i 7 : return sum int SumTo10 (int p) int sum = 0, k = p; while (k < 10) sum += ++k; return sum; int SumTo10 (int p) int sum = 0, k = p; while (k < 10) sum += ++k; return sum; ++< Resource Allocation: i 5 : sum = sum + k i 4 : k = k + 1 i 2 : k = p i 1 : sum = 0 a1a1 a0a0 i 3 : (k < 10) i 6 : ¬ (k < 10) a2a2 a3a3 a4a4 a5a5 a6a6 i 7 : return sum 10 sum = ∑ k p+1
7
University of California, San Diego Validating High-Level Synthesis 7 i 2 : k = p i 1 : sum = 0 a1a1 a0a0 Loop Pipelining: t = k + 1 i 5 : sum = sum + k An Example of HLS Specification: i 2 : k = p i 1 : sum = 0 i 3 : (k < 10) i 6 : ¬ (k < 10) i 4 : k = k + 1 i 5 : sum = sum + k a2a2 a3a3 a4a4 a5a5 a1a1 a6a6 a0a0 i 7 : return sum int SumTo10 (int p) int sum = 0, k = p; while (k < 10) sum += ++k; return sum; ++< Resource Allocation: i 4 : k = k + 1 i 3 : (k < 10) i 6 : ¬ (k < 10) a2a2 a3a3 a4a4 a5a5 a6a6 i 7 : return sum i 2 : k = p i 1 : sum = 0 a1a1 a0a0 t = k + 1 i 4 : k = t
8
University of California, San Diego Validating High-Level Synthesis 8 Copy Propagation: t = k + 1 i 5 : sum = sum + k i 5 : sum = sum + t t = p + 1 An Example of HLS Specification: i 2 : k = p i 1 : sum = 0 i 3 : (k < 10) i 6 : ¬ (k < 10) i 4 : k = k + 1 i 5 : sum = sum + k a2a2 a3a3 a4a4 a5a5 a1a1 a6a6 a0a0 i 7 : return sum int SumTo10 (int p) int sum = 0, k = p; while (k < 10) sum += ++k; return sum; ++< Resource Allocation: i 3 : (k < 10) i 6 : ¬ (k < 10) a2a2 a3a3 a4a4 a5a5 a6a6 i 7 : return sum i 2 : k = p i 1 : sum = 0 a1a1 a0a0 t = k + 1 i 4 : k = t t = t + 1
9
University of California, San Diego Validating High-Level Synthesis 9 a0a0 i 1 : sum = 0 i 2 : k = p i 41 : t = p + 1 Scheduling: An Example of HLS Specification: i 2 : k = p i 1 : sum = 0 i 3 : (k < 10) i 6 : ¬ (k < 10) i 4 : k = k + 1 i 5 : sum = sum + k a2a2 a3a3 a4a4 a5a5 a1a1 a6a6 a0a0 i 7 : return sum int SumTo10 (int p) int sum = 0, k = p; while (k < 10) sum += ++k; return sum; ++< Resource Allocation: t = p + 1 i 2 : k = p i 1 : sum = 0 a1a1 a0a0 a2a2 i 3 : (k < 10) i 6 : ¬ (k < 10) a5a5 a6a6 i 7 : return sum a3a3 i 4 : k = t i 5 : sum = sum + t a4a4 t = t + 1 i 4 : k = t i 5 : sum = sum + t i 42 : t = t + 1 Read After Write dependency
10
University of California, San Diego Validating High-Level Synthesis 10 An Example of HLS Specification: i 2 : k = p i 1 : sum = 0 i 3 : (k < 10) i 6 : ¬ (k < 10) i 4 : k = k + 1 i 5 : sum = sum + k a2a2 a3a3 a4a4 a5a5 a1a1 a6a6 a0a0 i 7 : return sum int SumTo10 (int p) int sum = 0, k = p; while (k < 10) sum += ++k; return sum; b0b0 j 1 : sum = 0 j 2 : k = p j 41 : t = p + 1 b1b1 j 3 : (k < 10) j 6 : ¬ (k < 10) b3b3 b4b4 j 7 : return sum b2b2 j 4 : k = t j 5 : sum = sum + t j 42 : t = t + 1 Implementation: ≡ Re-labeled
11
University of California, San Diego Validating High-Level Synthesis 11 Definition of Equivalence Specification ≡ Implementation => They have the same set of execution sequences of visible instructions. Visible instructions are: Function call and return statements. Two function calls are equivalent if the state of globals and the arguments are the same. Two returns are equivalent if the state of the globals and the returned values are the same.
12
University of California, San Diego Validating High-Level Synthesis 12 Technique for Proving Equivalence Bisimulation Relation: relates a given program state in the implementation with the corresponding state in the specification and vice versa. It satisfies the following properties: 1.The start states are related. 2. s2s2 i s’ 2 i s1s1 s’ 1 SpecificationImplementation Theorem: If there exists a bisimulation relation between the specification and the implementation, then they are equivalent. Visible Instruction
13
University of California, San Diego Validating High-Level Synthesis 13 Our Approach Split program state space in two parts: control flow state, which is finite. => explored by traversing the CFGs. dataflow state, which may be infinite. => explored using Automated Theorem Prover (ATP). Bisimulation relation: set of entries of the form (p 1, p 2, Ф). p 1 – program point in Specification. p 2 – program point in Implementation. Ф – formula that relates the data.
14
University of California, San Diego Validating High-Level Synthesis 14 Bisimulation Relation p s = p i k s = k i Λ sum s = sum i Λ (k s + 1) = t i sum s = sum i i 2 : k = p i 1 : sum = 0 i 3 : (k < 10) i 6 : ¬ (k < 10) i 4 : k = k + 1 i 5 : sum = sum + k a2a2 a3a3 a4a4 a5a5 a1a1 a6a6 a0a0 i 7 : return sum b1b1 b2b2 b3b3 b4b4 b0b0 j 1 : sum = 0 j 2 : k = p j 41 : t = p + 1 j 4 : k = t j 5 : sum = sum + t j 42 : t = t + 1 j 7 : return sum j 6 : ¬ (k < 10) j 3 : (k < 10) Specification Implementation Bisimulation relation: set of entries of the form (p 1, p 2, Ф). p 1 – program point in Specification. p 2 – program point in Implementation. Ф – formula that relates the data.
15
University of California, San Diego Validating High-Level Synthesis 15 Translation Validation Algorithm Two step approach. Generate Constraints: traverses the CFGs simultaneously and generates the constraints required for the visible instructions to be matched. Solve Constraints: solves the constraints using a fixpoint algorithm. For loops: iterate to a fixed point. May not terminate in general. But in practice can find the bisimulation relation.
16
University of California, San Diego Validating High-Level Synthesis 16 Generate Constraints i 2 : k = p i 1 : sum = 0 i 3 : (k < 10) i 6 : ¬ (k < 10) i 4 : k = k + 1 i 5 : sum = sum + k a2a2 a3a3 a4a4 a5a5 a1a1 a6a6 a0a0 i 7 : return sum b1b1 b2b2 b3b3 b4b4 b0b0 j 1 : sum = 0 j 2 : k = p j 41 : t = p + 1 j 4 : k = t j 5 : sum = sum + t j 42 : t = t + 1 j 7 : return sum j 6 : ¬ (k < 10) j 3 : (k < 10) Constraint Variable Constraints: Specification Implementation Ψ (a 2, b 1 ) b1b1 a2a2 Ψ (a 0, b 0 ) b0b0 a0a0 i1i1 i2i2 j 1 j 2 j 41 Ψ (a 2, b 1 ) b1b1 b1b1 a2a2 a2a2 i3i3 i4i4 i5i5 j 4 j 5 j 42 j3j3 Ψ (a 5, b 3 ) j6j6 b1b1 b3b3 a2a2 a5a5 i6i6 Ψ (a 2, b 1 ) Ψ (a 5, b 3 ) ⇒ sum s = sum i Ψ (a 2, b 1 ) ⇒ k s = k i Ψ (a 0, b 0 ) Ψ (a 2, b 1 ) Ψ (a 5, b 3 ) Branch Correlation - Strongest Post Condition - Structure of the code
17
University of California, San Diego Validating High-Level Synthesis 17 Ψ (a 2, b 1 ) Ψ (a 0, b 0 ) b0b0 b1b1 a0a0 a2a2 i1i1 i2i2 j 1 j 2 j 41 Ψ (a 2, b 1 ) b1b1 b1b1 a2a2 a2a2 i3i3 i4i4 i5i5 j 4 j 5 j 42 j3j3 Ψ (a 5, b 3 ) j6j6 b1b1 b3b3 a2a2 a5a5 i6i6 Ψ (a 2, b 1 ) Solve Constraints Constraints: i 2 : k = p i 1 : sum = 0 i 3 : (k < 10) i 6 : ¬ (k < 10) i 4 : k = k + 1 i 5 : sum = sum + k a2a2 a3a3 a4a4 a5a5 a1a1 a6a6 a0a0 i 7 : return sum b1b1 b2b2 b3b3 b4b4 b0b0 j 1 : sum = 0 j 2 : k = p j 41 : t = p + 1 j 4 : k = t j 5 : sum = sum + t j 42 : t = t + 1 j 7 : return sum j 6 : ¬ (k < 10) j 3 : (k < 10) Specification Implementation Ψ (a 0, b 0 ) : p s = p i Ψ (a 2, b 1 ) : True Ψ (a 2, b 1 ) : k s = k i Ψ (a 5, b 3 ) : True Ψ (a 5, b 3 ) : sum s = sum i XX X Ψ (a 2, b 1 ) : k s = k i Λ (k s + 1) = t i Ψ (a 2, b 1 ) : k s = k i Λ (k s + 1) = t i Λ sum s = sum i Ψ (a 5, b 3 ) ⇒ sum s = sum i Ψ (a 2, b 1 ) ⇒ k s = k i X X Ψ (a 0, b 0 ) Ψ (a 2, b 1 ) Ψ (a 5, b 3 ) b1b1 b1b1 a2a2 a2a2 Ψ (a 2, b 1 ) : k s = k i j 3 : (k i < 10) j 4 : k i = t i j 5 : sum i = sum i + t i j 42 : t i = t i + 1 i 3 : (k s < 10) i 4 : k s = k s + 1 i 5 : sum s = sum s + k s WP( Ψ (a 2, b 1 ) ) : (k i + 1) ⇒ (k s + 1) ⇒ (k s + 1) = t i ATP[ Ψ (a 2, b 1 ) ⇒ WP( Ψ (a 2, b 1 ) ) ] Ψ (a 2, b 1 ) : k s = k i ∧ (k s + 1) = t i : k s = k i ∧ WP( Ψ (a 2, b 1 ) )
18
University of California, San Diego Validating High-Level Synthesis 18 SPARK: Parallelizing HLS Framework C Program High Level Synthesis RTL Binding Intermediate Representation (IR) Transformations Code Motion, CSE, IVA, Copy Propagation, Dead Code Elimination, Percolation, Trailblazing, Chaining Across conditions, dynamic CSE. Heuristics HTG Scheduling Walker, Candidate Op Walker, Get Available Ops, Loop Pipelining Scheduled IR SPARK Pre-Synthesis Optimization AllocationScheduling No Pointer No Recursion No goto Validation Engine
19
University of California, San Diego Validating High-Level Synthesis 19 Results Benchmarks No. of simulation relation entries No. of calls to theorem prover Time (sec:msec) 1. incrementer6900:5 2. integer-sum62000:8 3. array-sum62400:8 4. diffeq74101:6 5. waka117902:6 6. pipelining127502:3 7. rotor147102:5 8. parker2628105:2 9. s2r2757026:7 10. findmin82978714:8
20
University of California, San Diego Validating High-Level Synthesis 20 Bugs Found in SPARK Array Copy Propagation Code motion Code fragment Before scheduling After scheduling (Buggy) After scheduling (Correct) a[0] := b[1]; c := a[0]; a[0] := b[1]; c := b[0]; a[0] := b[1]; c := b[1]; Code fragment Before scheduling After scheduling (Buggy) After scheduling (Correct) ret[1] := blk[0] <<3; ret[0] := ret[1]; ret[1] := blk[0] <<3; ret[0] := ret[1]; Did not copy array index Read After Write dependency
21
University of California, San Diego Validating High-Level Synthesis 21 Related Work Translation Validation Sequential Programs [Pnueli et al. 98] [Necula 00] [Zuck et al. 05] CSP Programs [Kundu et al. 07] HLS Verification Scheduling Step Correctness preserving transformation [Eveking 99] Symbolic Simulation [Ashar 99] Formal assertions [Narasimhan 01] Relational approaches for Equivalence of FSMDs [Kim 04, Karfa 06]
22
University of California, San Diego Validating High-Level Synthesis 22 Conclusion and Future Directions Presented an automated algorithm for translation validation of the HLS process. Implemented it for a HLS tool called SPARK. Modular: works on one procedure at a time. Practical: took on average 6 secs to run per procedure. Easy to Implement: only a fraction of development cost of SPARK. Useful: found 2 previously unknown bugs. In future, Remaining phases of SPARK: parsing, binding and code generation.
23
University of California, San Diego Validating High-Level Synthesis 23 Thank You
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.