Optimizations for Faster Execution of Esterel Programs

1 Optimizations for Faster Execution of Esterel Programs
Dumitru Potop Robert de Simone IRISA Rennes INRIA Sophia Antipolis My name is … and I am going to talk about …

2 Outline
The Esterel language: the problem of efficient code generation. The GRC intermediate representation: intuitive description, software code generation, code analysis and optimizations. Practical results. Conclusion. Future work. My presentation is structured in three large parts. The first is a brief presentation of the Esterel language, insisting on the problem of efficient code generation. The second presents our approach to the compilation of Esterel, which is based on a new intermediate representation called GRC. I shall present the intermediate format, and then insist on the aspects concerning optimized software code generation. The last part presents the practical results and some directions for the future.

3 The Esterel language
[Diagram: an Esterel specification, with formal semantics (FSMs, circuits), feeds formal verification, SW translation (simulation, SW implementation) and HW (RTL) generation (HW prototype on FPGA, synthesis input).] The synchronous language Esterel has well-defined formal semantics and supports a wide range of formal analysis and verification techniques. These techniques are mainly based on the interpretation of Esterel into mathematical models such as FSMs or digital synchronous circuits. Esterel can be translated automatically, preserving the semantics, into either software code, for simulation or implementation, or hardware code (at RTL level).

4 The Esterel language
Reactive, clock-driven execution. module ABRO: input A,B,R; output O; loop [ await A || await B ]; emit O; halt every R end module. Time is divided into cycles; the language is reactive (await and halt cut the instant). Notion of hardware in software. Esterel follows a clock-driven, cycle-based execution model, like that of digital synchronous circuits. At each execution instant the program reads its inputs, computes the reaction, outputs the results and changes its state. In the initial state, control has not yet entered the program. In the first instant it enters the loop, then the sequence, then the parallel, and is then distributed to the await statements, which are activated and retain it until the next clock tick.

5 The Esterel language
Reactive, clock-driven execution. module ABRO: input A,B,R; output O; loop [ await A || await B ]; emit O; halt every R end module
execution instant | inputs | outputs
0 | any | -
If the signals A and B are present in the second execution instant, then the await statements terminate, the signal O is emitted, and control is retained by the halt statement.

6 The Esterel language
Reactive, clock-driven execution. module ABRO: input A,B,R; output O; loop [ await A || await B ]; emit O; halt every R end module
execution instant | inputs | outputs
0 | any | -
1 | AB | O
Halt remains activated until preempted by the loop-every statement. At every occurrence of R, the body of the loop is strongly preempted and instantly restarted.

7 The Esterel language
Reactive, clock-driven execution. module ABRO: input A,B,R; output O; loop [ await A || await B ]; emit O; halt every R end module
execution instant | inputs | outputs
0 | any | -
1 | AB | O
2 | R |
Now, we could continue our execution to discover all the reachable states and transitions of the program.

8 The Esterel language
Synchrony. Causality cycles: present S else emit S end. Constructive semantics (SOS rules): ·signal S,T in emit S; present T then present S else emit T end else emit O end; end Most problems in the compilation of Esterel come from its synchronous signal semantics, which easily leads to causality cycles. To handle the issue properly, Esterel has constructive semantics that distinguish the bad cycles (nondeterministic or with no behavior) from causal cycles that can be assigned a proper behavior. More exactly, a signal can be tested only after it has been emitted (so that it is present) or after all its emissions have been invalidated for the current instant (in which case its status is set to absent). Our small example emphasizes the problem. When the execution of the fragment begins, control enters the signal statement, setting the status of S and T to unknown. Then S is emitted and control blocks on the test of T, because T has neither the status present nor the status absent. We have a causality cycle where the emit T statement depends on the outcome of the presence test and vice versa. Nevertheless, the program is correct under the constructive semantics. S being present, we can invalidate the else branch of the test on S, so emit T is invalidated, so T is absent, so the else branch of the test on T is taken and O is emitted. Thus, the constructive semantics of Esterel are based on a combination of control flow and code pruning based on forward information propagation. Note that we needed to evaluate parts of the present S statement while not actually executing them.
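The pruning argument in the notes above can be sketched in C as a small fixpoint over three-valued signal statuses. This is only an illustration hand-written for this one fragment, not the compiler's algorithm; the names Status, Signals and analyze_fragment are hypothetical.

```c
#include <assert.h>

/* Three-valued signal status used by the constructive semantics. */
typedef enum { UNKNOWN, PRESENT, ABSENT } Status;

typedef struct { Status S, T, O; } Signals;

/* Hand-written propagation for the fragment
 *   signal S,T in
 *     emit S;
 *     present T then present S else emit T end else emit O end
 *   end
 * Two kinds of steps are repeated until nothing changes:
 *   - "must" steps: an emit that is certainly executed sets its signal present;
 *   - "cannot" steps: a signal whose emitters can no longer run is set absent. */
Signals analyze_fragment(void) {
    Signals g = { UNKNOWN, UNKNOWN, UNKNOWN };
    int changed = 1;
    while (changed) {
        changed = 0;
        /* emit S is executed unconditionally. */
        if (g.S == UNKNOWN) { g.S = PRESENT; changed = 1; }
        /* emit T sits in the else branch of "present S", inside the then
         * branch of "present T": it can run only if T may still be present
         * and S may still be absent. */
        int can_emit_T = (g.T != ABSENT) && (g.S != PRESENT);
        if (g.T == UNKNOWN && !can_emit_T) { g.T = ABSENT; changed = 1; }
        /* emit O is in the else branch of "present T". */
        if (g.O == UNKNOWN && g.T == ABSENT) { g.O = PRESENT; changed = 1; }
    }
    assert(g.S != UNKNOWN); /* S is always resolved by the first step */
    return g;
}
```

Calling analyze_fragment() resolves S to present, T to absent and O to present, which is the outcome derived in the notes. A general analysis would walk the program structure instead of hard-coding the conditions.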

9 The Esterel language Synchrony
Causality cycles present S else emit S end Constructive semantics (SOS rules) signal S,T in ·emit S; present T then present S else emit T end else emit O end; end Synchronous model, like that of the sequential circuits

10 The Esterel language Synchrony
Causality cycles present S else emit S end Constructive semantics (SOS rules) signal S,T in emit S; ·present T then present S else emit T end else emit O end; end Synchronous model, like that of the sequential circuits

11 The Esterel language Synchrony
Causality cycles present S else emit S end Constructive semantics (SOS rules) causality cycle signal S,T in emit S; ·present T then present S else emit T end else emit O end; end Synchronous model, like that of the sequential circuits

12 The Esterel language Synchrony
Causality cycles present S else emit S end Constructive semantics (SOS rules) causality cycle signal S,T in emit S; ·present T then present S else emit T end else emit O end; end break the cycle Causality cycles are bad. They result in cycles at the level of the graph-based intermediate formats, such as circuits, leading to very slow simulation and expensive analysis (in particular expensive correctness checks). These problems are also present when we generate simulation code. Moreover, we have here the problem of operator granularity, as sequential operators generally represent operations that would otherwise consist of several circuit gates.

13 First compilation scheme (1980’s)
FSM-based translation. Exhaustive semantic expansion. Explosion in size, expensive analysis. Fast code (execute only active code). Example: loop [ await A || await B ]; emit O; halt every R [Figure: the resulting automaton; labels 1 2 3 4.] To solve all causality issues at compile time, the first compilation scheme compiled away all concurrency by transforming the initial program into a Mealy machine. This was done through exhaustive symbolic simulation of the semantic rules, and sequential code was then generated by encoding the resulting FSM in C. The code was very fast, but the FSM explodes for most meaningful examples.
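As an illustration of what "encoding the resulting FSM in C" can look like, here is a minimal sketch of a Mealy machine for the ABRO example. The state names, and the choice to count the boot state separately, are assumptions made for this sketch, not the output of the actual compiler.

```c
#include <assert.h>

/* Hypothetical state set: boot, waiting for both signals, waiting for one, halted. */
typedef enum { ST_BOOT, ST_WAIT_AB, ST_WAIT_A, ST_WAIT_B, ST_HALT } AbroState;

/* One reaction of the automaton. Returns 1 if O is emitted in this instant. */
int abro_fsm_step(AbroState *st, int A, int B, int R) {
    assert(st != 0);
    int O = 0;
    if (*st == ST_BOOT || R) {      /* first instant, or strong preemption by R */
        *st = ST_WAIT_AB;
        return 0;
    }
    switch (*st) {
    case ST_WAIT_AB:
        if (A && B) { O = 1; *st = ST_HALT; }
        else if (A) *st = ST_WAIT_B;
        else if (B) *st = ST_WAIT_A;
        break;
    case ST_WAIT_A: if (A) { O = 1; *st = ST_HALT; } break;
    case ST_WAIT_B: if (B) { O = 1; *st = ST_HALT; } break;
    default: break;                 /* ST_HALT: stay until R occurs */
    }
    return O;
}
```

Only the code of the current state is executed, which is why this style is fast; the cost is that every combination of active await statements needs its own state, so the number of cases grows quickly with the amount of parallelism.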

14 Second compilation scheme (1990’s)
Circuit-based translation. Encode primitives into gates. Small circuit size (quasi-linear). Slow software code (acyclic circuit evaluation). Example: loop [ await A || await B ]; emit O; halt every R [Figure: the generated circuit, with signals A, B, R, O and start.] When specifications became too big to be expanded, a second solution was found: the translation of the Esterel source into a digital synchronous circuit. The generated C code, which is essentially a circuit simulator that executes all gates, is small. Causality defects (circuit cycles) are handled explicitly by replacing cyclic circuits with acyclic counterparts. However, the re-synthesis algorithms are of high complexity, being based on state space exploration. Thus, while the overall technique is theoretically complete, it is limited in practice to programs generating acyclic circuits. That said, the main drawback of the approach is not completeness but the speed of the generated code (a flat circuit simulator evaluating all gates at each instant).
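To make the "flat circuit simulator" concrete, the following C sketch evaluates a small set of boolean equations for ABRO at every instant, whatever the current state. The register names and the equations are assumptions written for this illustration; the circuit produced by the real translation has more gates and a different layout.

```c
#include <assert.h>

/* Registers of the sketch circuit: a boot bit plus one bit per await. */
typedef struct { int boot, wait_a, wait_b; } AbroRegs;

AbroRegs abro_circuit_init(void) {
    AbroRegs r = { 1, 0, 0 };
    return r;
}

/* Evaluates every gate once, in a fixed order. Returns the value of O. */
int abro_circuit_step(AbroRegs *r, int A, int B, int R) {
    assert(r != 0);
    int go     = r->boot || R;                    /* (re)start the loop body      */
    int keep_a = r->wait_a && !A && !go;          /* await A stays active         */
    int keep_b = r->wait_b && !B && !go;          /* await B stays active         */
    int was_on = (r->wait_a || r->wait_b) && !go; /* parallel active, not killed  */
    int O      = was_on && !keep_a && !keep_b;    /* last active branch finished  */
    r->wait_a = go || keep_a;                     /* next-state functions         */
    r->wait_b = go || keep_b;
    r->boot   = 0;
    return O;
}
```

The code size is linear in the number of statements, but all five equations are computed even when the program sits in halt and only R matters, which is the speed drawback mentioned in the notes.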

15 Third compilation scheme (2000’s)
“Simulation” code. Follows the naïve semantics (control-flow). Accepts fewer programs (acyclic circuits). Code: small and very fast (statically scheduled). Edwards, Closse/Weil. Source: loop [ await A || await B ]; emit O; halt every R. Generated code: if(START){A_active=1;B_active=1;START=0;} else { if(R){A_active=1;B_active=1;} else if(A_active|B_active) { if(A_active) if(A) A_active=0; if(B_active) if(B) B_active=0; if(!(A_active|B_active)) O=1; } } The need for speed led to the recent development of compilation schemes that rediscover the nice structure of Esterel programs in order to generate well-structured sequential code. These new approaches interpret the Esterel source as a control-flow specification and generate very good code by statically scheduling its reactive operations. The approach is limited to the acyclic case, where "acyclic" remains to be defined and related to the circuit-level notion. Syntactic criterion. At execution time.

16 Third compilation scheme (2000’s)
My goal: a formal intermediate model. Preserve high-level information from Esterel (static analysis, optimization techniques). Relation with constructive semantics (soundness of analysis and optimization). Relation with the circuit translation (soundness of execution, unique notion of acyclicity). The need for speed led to the recent development of compilation schemes that rediscover the nice structure of Esterel programs in order to generate well-structured sequential code. These new approaches interpret the Esterel source as a control-flow specification and generate very good code by statically scheduling its reactive operations. The approach is limited to the acyclic case, but "acyclic" is hard to define in a formal way, as we shall see. As efficient execution of large specifications becomes more and more necessary, new techniques have lately been proposed, consisting essentially of direct software simulation of the reactive features of the language. Intuitively, a test on signal A is directly translated into the if(A) statement, “emit O” is translated into “O=1”, and so on. The approach is currently based on a control-flow interpretation of the Esterel source, and results in code that is very small (quasi-linear in the size of the Esterel source), very fast (as only active parts of the program are evaluated), and easily traceable. The drawback, because there is a drawback, is that the control-flow interpretation covers only a part of the correct Esterel programs.

17 The GRC intermediate format
Hierarchical state (structure). Control/data flow graph (behaviour). Program: boot: loop [ await A;emit B || await B ]; emit O; halt every R [Figure: the hierarchical state tree (nodes numbered 1 to 6: boot, loop-every, sequence, parallel with await A and await B, halt) next to the flowgraph, whose nodes include the enter/exit operations on state nodes, the tests on R, A and B, emit B, emit O, and the pause, term and inactive inputs of the parallel synchronizer; arrows mark activation and new state.] We now move to the second part of my presentation, which starts with an intuitive description of the GRC format. Empty boxes = program operations. Blue = state representation, encoding, and decoding. State nodes are entered in order and exited in reverse order. State representation = skeleton; flowgraph = muscles and nerves; the two influence one another. Flowgraph = unrolling of the Esterel source. Skeleton = static information repository. Boot = generated, technical detail.

18 The hierarchical state structure
Program: boot: loop [ await A;emit B || await B ]; emit O; halt every R [Figure: the hierarchical state tree of the program, nodes numbered 1 to 6.] Parallel/exclusive abstraction of the syntax tree. Nodes represent the activity condition of various subprogram fragments.

19 Flowgraph execution
Activated from the hierarchical state. Program: boot: loop [ await A;emit B || await B ]; emit O; halt every R [Figure: hierarchical state tree and flowgraph of the program.]

20 Flowgraph execution
Activated from the hierarchical state. Inputs: R absent, A present. Program position: •loop [ await A;emit B || await B ]; emit O; halt every R [Figure: hierarchical state tree and flowgraph of the program.] Same program, executed at each instant; state decoding first; dynamic -> static.

21 Flowgraph execution
Activated from the hierarchical state. Inputs: R absent, A present. Program position: •loop [ await A;emit B || await B ]; emit O; halt every R [Figure: hierarchical state tree and flowgraph of the program.]

22 Flowgraph execution
Activated from the hierarchical state. Inputs: R absent, A present. Program position: loop [ await A;emit B •|| await B ]; emit O; halt every R [Figure: hierarchical state tree and flowgraph of the program.]

23 Flowgraph execution
Activated from the hierarchical state. Inputs: R absent, A present. Program position: loop [ •await A;emit B || await B ]; emit O; halt every R [Figure: hierarchical state tree and flowgraph of the program.]

24 Flowgraph execution
Activated from the hierarchical state. Inputs: R absent, A present. Program position: loop [ await A;•emit B || await B ]; emit O; halt every R [Figure: hierarchical state tree and flowgraph of the program.]

25 Flowgraph execution
Activated from the hierarchical state. Inputs: R absent, A present. Program position: loop [ await A;emit B• || await B ]; emit O; halt every R [Figure: hierarchical state tree and flowgraph of the program.]

26 Flowgraph execution
Activated from the hierarchical state. Inputs: R absent, A present. Program position: loop [ await A;emit B || await B ]; •emit O; halt every R [Figure: hierarchical state tree and flowgraph of the program.]

27 Flowgraph execution
Activated from the hierarchical state. Inputs: R absent, A present. Program position: loop [ await A;emit B || await B ]; emit O; •halt every R [Figure: hierarchical state tree and flowgraph of the program.]

28 Flowgraph execution
Activated from the hierarchical state. Inputs: R absent, A present. Program: loop [ await A;emit B || await B ]; emit O; halt every R [Figure: hierarchical state tree and flowgraph of the program.]

29 Simulation code generation
State encoding. Static scheduling. Sequential code. Respects causality (signal emission statements before test statements). Confluence theorem, cyclic.
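One standard way to obtain a static schedule that puts every emission before the corresponding tests is a topological sort of the operations. The sketch below uses a plain dependency matrix; the function name schedule, the bound MAX_OPS and the matrix representation are assumptions for illustration, and the actual scheduler may use a different strategy.

```c
#include <assert.h>

#define MAX_OPS 16

/* dep[i][j] != 0 means operation i must run before operation j
 * (for instance "emit B" before "test B"). Writes a valid order into
 * order[] and returns 1, or returns 0 if the dependencies are cyclic. */
int schedule(int n, int dep[MAX_OPS][MAX_OPS], int order[MAX_OPS]) {
    assert(n >= 0 && n <= MAX_OPS);
    int indeg[MAX_OPS] = {0}, done[MAX_OPS] = {0}, count = 0;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            if (dep[i][j]) indeg[j]++;
    while (count < n) {
        int picked = -1;
        for (int j = 0; j < n; j++)      /* lowest-numbered ready operation */
            if (!done[j] && indeg[j] == 0) { picked = j; break; }
        if (picked < 0) return 0;        /* cycle: no static schedule exists */
        done[picked] = 1;
        order[count++] = picked;
        for (int j = 0; j < n; j++)      /* release the successors */
            if (dep[picked][j]) indeg[j]--;
    }
    return 1;
}
```

With three operations numbered 0 = test B, 1 = test A, 2 = emit B, and the dependencies test A before emit B and emit B before test B, the function produces the order 1, 2, 0. A return value of 0 corresponds to a cyclic dependency graph, for which no sequential order of this kind exists.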

30 Simulation code generation
State encoding. [Figure: the state tree annotated with state bits, and the flowgraph in which entering and exiting state nodes is replaced by assignments such as S[1]=1, S[2]=0, S[3]=1, S[4]=1 and by tests on S[1] to S[4].] Encoding for sequential code generation: bitwise, compact, hierarchic. The decoding is incremental. Parallel => the encodings of the branches are concatenated. Sequence => logarithmic encoding of the choice, then multiplexing of the branch encodings.
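The two encoding rules (concatenation for parallel branches, a logarithmic selector plus multiplexing for exclusive ones) can be sketched as a recursive bit count over the state tree. The Node type and the assumption that every leaf costs exactly one activity bit are made up for this illustration; the layout chosen by the real compiler may differ.

```c
#include <assert.h>

typedef enum { LEAF, PARALLEL, EXCLUSIVE } Kind;

typedef struct Node {
    Kind kind;
    int nchildren;
    const struct Node *children[4];
} Node;

/* Smallest k such that 2^k >= n. */
static int ceil_log2(int n) {
    int k = 0;
    while ((1 << k) < n) k++;
    return k;
}

/* Number of state bits needed for the subtree rooted at t. */
int state_bits(const Node *t) {
    assert(t != 0);
    if (t->kind == LEAF) return 1;          /* assumed: one activity bit per leaf */
    int sum = 0, max = 0;
    for (int i = 0; i < t->nchildren; i++) {
        int b = state_bits(t->children[i]);
        sum += b;
        if (b > max) max = b;
    }
    if (t->kind == PARALLEL) return sum;    /* branch encodings are concatenated  */
    return ceil_log2(t->nchildren) + max;   /* selector + multiplexed branches    */
}
```

For a tree made of an exclusive choice between boot and a sequence, where the sequence is an exclusive choice between the parallel of two awaits and halt, this count gives four bits, the same number as the S[1] to S[4] bits shown on the slide.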

31 Simulation code generation
Static scheduling. Generated code: bool aux=0; if(S[1]){ if(R){aux=1;} else { if(!S[2]){ if(S[3])if(A){S[3]=0;B=1;} if(S[4])if(B)S[4]=0; if(S[3]==0&&S[4]==0){ O=1;S[2]=1; } }} } else {aux=1;} if(aux){S[1..4]=1011;} Here S[1..4]=1011 is shorthand for setting the four state bits S[1]=1, S[2]=0, S[3]=1, S[4]=1. [Figure: the flowgraph with state-bit tests and assignments, as on the previous slide.]
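The generated code above can be turned into a compilable reaction function along the following lines. This is a sketch under assumptions: the GrcState wrapper and the function name grc_react are hypothetical, B is treated as a signal that is only emitted inside the program (so it is recomputed at each instant), and the comparison in the termination test is written with == rather than the = of the slide.

```c
#include <assert.h>

/* State bits, numbered as on the slide (index 0 unused). All zero before boot. */
typedef struct { int S[5]; } GrcState;

/* One reaction. A and R are inputs. Returns 1 if O is emitted. */
int grc_react(GrcState *st, int A, int R) {
    assert(st != 0);
    int *S = st->S;
    int B = 0, O = 0, aux = 0;
    if (S[1]) {
        if (R) {
            aux = 1;                               /* preemption: restart the body */
        } else if (!S[2]) {                        /* parallel active, halt not yet */
            if (S[3] && A) { S[3] = 0; B = 1; }    /* await A; emit B               */
            if (S[4] && B) { S[4] = 0; }           /* await B, scheduled after emit */
            if (S[3] == 0 && S[4] == 0) { O = 1; S[2] = 1; }
        }
    } else {
        aux = 1;                                   /* first instant                 */
    }
    if (aux) { S[1] = 1; S[2] = 0; S[3] = 1; S[4] = 1; }   /* S[1..4]=1011 */
    return O;
}
```

Starting from an all-zero state, the first call only boots the program; a second call with A present and R absent emits B internally, lets both branches terminate in the same instant, and returns 1 for O. The order of the two branch tests is what the static schedule fixes: the test of B runs after the possible emission of B.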

32 Optimizations Based on static analysis (semantic-preserving, fast, efficient) Redundant state bit elimination False signal/data dependency elimination Node grouping Dead code removal

33 Optimizations
Utility: simplify the state access/update protocol; simplify the state encoding. Static analysis, example: boot: trap T in sustain A || await B;await C;exit T end [Figure: state tree with nodes labelled "nt: sustain A || await B" and "nt: await C"; callout: same status at all instants.] We illustrate some of our simplification techniques with small examples. In the first, we can see that none of the parallel branches can terminate on its own. Explain why.

34 GRC code optimizations
Dependency removal. Program: boot: pause; present S then emit T end; emit S; [Figure: state tree with nodes numbered 1 to 4, and the flowgraph containing the test on S, emit T, emit S and the enter/exit operations on the state nodes.]

37 Results
Optimizing compiler. Examples: Turbo channel bus; Berry’s wristwatch; Video generator; Shock absorber; Operating system model; Avionics fuel controller; Avionics cockpit; Man-machine interface. Test configuration: PIII/1GHz/128M/Linux, gcc -O, 1 Mcycle, random or given.

38 Conclusion Intermediate model for Esterel programs
Static analysis, optimizations at GRC level GRC-acyclic = circuit-acyclic Code generation scheme Good practical results

39 Future work
Digital circuit synthesis. State encoding + translation into gates. Good partial results. Distributed implementation. Connection between Esterel and Signal at source or GRC/HCDG level. Extend GRC to a format capable of better handling dataflow aspects. Control registers. Control clock hierarchy.

