Correct and efficient implementations of synchronous models on asynchronous execution platforms
Stavros Tripakis, UC Berkeley and Verimag
EC^2 Workshop, Grenoble, June 2009
Some observations
Concurrency =/=> interleaving
  – cf. synchronous systems (e.g., circuits)
Concurrency =/=> non-determinism
  – Synchronous circuits are deterministic
Concurrency =/=> shared memory
  – cf. data flow models
Asynchronous concurrency (interleaving) =/=> non-determinism
  – cf. Kahn Process Networks
Threads have conquered the world, but …
What are the problems we (as a community) are trying to solve?
Cope with concurrency… but what does it mean?
What are the right execution platforms?
  – Which multicore architecture, memory model, …
What are the right programming models? For which types of applications?
How to map the latter to the former?
  – Correctly and efficiently!
How to verify stuff?
[Slide annotations: "± given, synchronous"; "given, asynchronous"; "focus"]
Synchronous vs. asynchronous concurrency
Synchronous concurrency
  – Execution platforms: synchronous hardware
  – Programming models: Simulink, SCADE, synchronous languages (Esterel, Lustre, …), …
Asynchronous concurrency
  – Execution platforms: many, including distributed platforms
  – Programming models: thread-based (often communicating through shared memory)
Concurrency =/=> non-determinism
Most synchronous models are deterministic: synchronous hardware, Simulink, SCADE, most synchronous languages, …
[Figure: engine control model in Simulink. Copyright The MathWorks]
Concurrency =/=> non-determinism
Some asynchronous models are also deterministic, e.g.:
  – Kahn Process Networks: the sequence of values (stream) produced at each FIFO is the same, independent of process interleaving
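The determinism property of Kahn Process Networks can be observed in a small experiment. The following is a minimal sketch, not taken from the talk: the function names `producer`, `adder`, and `run_network`, the `None` end-of-stream marker, and the random sleeps used to perturb the interleaving are all assumptions made for illustration. Each process only performs blocking reads on its input FIFOs, so the output stream should be the same on every run.

```python
import queue
import random
import threading
import time


def producer(out_q, n):
    # Emit 0..n-1, sleeping a random amount to perturb the interleaving.
    for i in range(n):
        time.sleep(random.uniform(0, 0.002))
        out_q.put(i)
    out_q.put(None)  # end-of-stream marker (an assumption of this sketch)


def adder(in_a, in_b, out_q):
    # Kahn discipline: blocking reads on the inputs, in a fixed order,
    # with no way to test whether a FIFO is empty.
    while True:
        a = in_a.get()
        b = in_b.get()
        if a is None or b is None:
            out_q.put(None)
            return
        time.sleep(random.uniform(0, 0.002))
        out_q.put(a + b)


def run_network(n=20):
    # Two producers feed one adder through unbounded FIFOs.
    qa, qb, qout = queue.Queue(), queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=producer, args=(qa, n)),
        threading.Thread(target=producer, args=(qb, n)),
        threading.Thread(target=adder, args=(qa, qb, qout)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Drain the output FIFO into a list: this is the observed stream.
    stream = []
    while True:
        v = qout.get()
        if v is None:
            return stream
        stream.append(v)
```

Calling `run_network()` repeatedly yields the same list each time, although the thread schedule differs between runs. If `adder` were allowed to poll whichever input happens to be non-empty, this guarantee would no longer hold.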
Our choice of programming model: synchronous
Set of parallel processes, notion of global synchronous cycle
  – Simulink, SCADE, VHDL, Verilog, Lustre, Esterel, …
Main advantages:
  – Determinism, no process interleaving: easier to understand, easier to verify (less state explosion)
Main objections:
  – “Synchrony is impossible/hard/too expensive to implement”
  – “This is especially true for distributed systems”; “You need clock synchronization”
  – Practice seems to agree with this: most available implementations of synchronous systems are either synchronous hardware or centralized “read; compute; write;” control loops.
  – …but it is not quite true.
Semantics-preserving implementation of synchronous models
[Figure: the application design (Simulink, …) is mapped by implementation onto one of several execution platforms]
Execution platforms:
  – single-processor, single-task
  – single-processor, multi-task
  – distributed, synchronous (TTA)
  – distributed, asynchronous (KPN, LTTA, ...)
  – …
From synchronous models to asynchronous distributed implementations
Joint work with:
  – Claudio Pinello, Cadence
  – Alberto Sangiovanni-Vincentelli, UC Berkeley
  – Albert Benveniste, IRISA (France)
  – Paul Caspi, VERIMAG (France)
  – Marco di Natale, SSSA (Italy)
[IEEE Trans. Computers, Oct’08]
Implementation on asynchronous distributed platforms
Asynchronous distributed platforms:
  – Many computers, each with a local clock; no clock synchronization
  – Computers communicate using some network/protocol; we don’t care which network, as long as finite FIFO queues (TCP) can be implemented on top
[Figure: synchronous model mapped onto an asynchronous platform with some communication network]
Implementation on asynchronous distributed platforms
[Figure: synchronous model mapped, through an intermediate layer, onto an asynchronous platform with some communication network]
Intermediate layer: asynchronous processes communicating with finite FIFO queues
Implementation on asynchronous distributed platforms
[Figure: synchronous model mapped onto the intermediate layer]
Intermediate layer: asynchronous processes communicating with finite FIFO queues
This is like Kahn Process Networks with a blocking write() when the FIFO is full.
FIFOs must be large enough to avoid deadlocks.
=> semantical (stream) preservation
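A toy instance of this intermediate layer can be sketched with threads and bounded queues. This is a hedged illustration under stated assumptions, not the construction from the cited paper: the two-equation model (x_k = y_{k-1} + 1, y_k = 2 * x_k, with y_{-1} = 0), the names `sync_reference` and `async_implementation`, and the `capacity` parameter are all hypothetical. Python's `queue.Queue(maxsize=...)` blocks on `put` when the queue is full, which plays the role of the blocking write().

```python
import queue
import threading


def sync_reference(n):
    # Synchronous semantics: both equations are evaluated once per cycle.
    ys, y_prev = [], 0
    for _ in range(n):
        x = y_prev + 1
        y = 2 * x
        ys.append(y)
        y_prev = y
    return ys


def async_implementation(n, capacity=1):
    q_x = queue.Queue(maxsize=capacity)  # P1 -> P2
    q_y = queue.Queue(maxsize=capacity)  # P2 -> P1, carries the unit delay
    q_y.put(0)                           # initial value y_{-1} = 0
    ys = []

    def p1():
        for _ in range(n):
            y_prev = q_y.get()           # blocking read
            q_x.put(y_prev + 1)          # blocks while q_x is full

    def p2():
        for _ in range(n):
            x = q_x.get()                # blocking read
            y = 2 * x
            ys.append(y)                 # only P2 writes to this list
            q_y.put(y)                   # blocks while q_y is full

    threads = [threading.Thread(target=p1), threading.Thread(target=p2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return ys
```

In this sketch the two processes run without any shared clock, yet `async_implementation(n)` returns the same stream as `sync_reference(n)`. The initial item placed in `q_y` matters: without it, both threads would block on their first read and the network would deadlock.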
Semantical preservation: proof
Use old theories [1970s]:
Marked graphs
  – Subclass of Petri Nets
  – Used to show FFP liveness (no deadlock)
Kahn Process Networks
  – Used Kahn’s fundamental result: determinism
  – Streams do not depend on process interleaving
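The liveness argument can be illustrated with the classical result on marked graphs: a marked graph is live exactly when every directed cycle carries at least one token. Equivalently, the subgraph made of token-free edges must be acyclic. The sketch below checks that condition; the function name `is_live`, the edge encoding, and the way a FIFO is modeled (one edge for stored items, one reverse edge for free slots) are assumptions of this example and not necessarily the encoding used in the cited proof.

```python
def is_live(nodes, edges):
    """Return True if every directed cycle holds at least one token.

    nodes: iterable of transition names (here, processes).
    edges: list of (src, dst, tokens) triples (here, FIFO items or free slots).
    """
    # Keep only the edges that carry no token.
    zero = {n: [] for n in nodes}
    for src, dst, tokens in edges:
        if tokens == 0:
            zero[src].append(dst)

    WHITE, GREY, BLACK = 0, 1, 2
    color = {n: WHITE for n in zero}

    def has_cycle(u):
        # Depth-first search: reaching a GREY node closes a token-free cycle.
        color[u] = GREY
        for v in zero[u]:
            if color[v] == GREY:
                return True
            if color[v] == WHITE and has_cycle(v):
                return True
        color[u] = BLACK
        return False

    return not any(color[n] == WHITE and has_cycle(n) for n in zero)


def fifo(src, dst, items, capacity):
    # A bounded FIFO as two edges: stored items forward, free slots backward.
    return [(src, dst, items), (dst, src, capacity - items)]


# Two processes in a feedback loop, each FIFO of capacity 1.
with_initial_item = fifo("P1", "P2", 0, 1) + fifo("P2", "P1", 1, 1)
without_initial_item = fifo("P1", "P2", 0, 1) + fifo("P2", "P1", 0, 1)
```

Here `is_live(["P1", "P2"], with_initial_item)` holds, while the variant without an initial item on the feedback FIFO contains a token-free cycle and is reported as not live.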
Performance analysis: worst-case logical-time throughput (WCLTT) and latency
[Figure: two networks of processes P1 and P2, one with WCLTT = 1/2 and one with WCLTT = 1]
From synchronous models to asynchronous multitask implementations
Joint work with Paul Caspi, Norman Scaife, Christos Sofronis, VERIMAG
[ACM Trans. Embed. Comp. Sys., Feb’08]
Implementation on centralized, multitasking platforms
[Figure: synchronous model ("Sync") mapped onto a single processor with priority scheduling (fixed priority or EDF); a scheduler runs tasks T1, T2, T3]
Why multitasking and not a single “read-compute-write” loop?
For multi-rate models:
  – A multitask implementation can be schedulable where a single-task implementation is not
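A small worked example shows how a multi-rate model can be schedulable as several tasks but not as one loop. The numbers and function names below are hypothetical. The sketch assumes a single loop that runs at the base period (the gcd of all periods) and must finish every block due in a cycle before the next cycle starts, and it uses the standard utilization bound for preemptive EDF with deadlines equal to periods (total utilization at most 1).

```python
from functools import reduce
from math import gcd


def single_loop_schedulable(tasks):
    # tasks: list of (period, wcet) pairs in the same integer time unit.
    base = reduce(gcd, (period for period, _ in tasks))
    # At time 0 every block is due, so one cycle must fit all of them.
    worst_cycle = sum(wcet for _, wcet in tasks)
    return worst_cycle <= base


def edf_schedulable(tasks):
    # Preemptive EDF, deadlines equal to periods: utilization must be <= 1.
    return sum(wcet / period for period, wcet in tasks) <= 1.0


# Hypothetical two-rate model: a fast block and a slow, long-running block.
tasks = [(10, 2), (100, 20)]
```

With these values the single loop needs 22 time units in a cycle of length 10, so `single_loop_schedulable(tasks)` is false, while the total utilization is 0.4 and `edf_schedulable(tasks)` is true. Preemption lets the slow block spread its 20 units of work over several fast periods.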
Implementation on centralized, multitasking platforms
[Figure: synchronous model ("Sync") mapped onto a single processor with priority scheduling (fixed priority or EDF); a scheduler runs tasks T1, T2, T3]
Goal: semantical preservation
Implementation on centralized, multitasking platforms
[Figure: synchronous model ("Sync") mapped onto a single processor with priority scheduling (fixed priority or EDF); a scheduler runs tasks T1, T2, T3]
  - non-blocking (wait-free)
  - memory-optimal
  - semantics-preserving
Conclusions
Concurrency =/=> non-determinism
Synchronous models are deterministic
  – Easier to understand and verify
Synchronous models can be implemented on a variety of asynchronous execution platforms, using non-trivial techniques:
  – Implementations are correct-by-construction
  – They are memory-optimal
  – Performance (throughput, latency, …) can be analyzed and optimized
Open questions
For which applications is the synchronous programming model suitable?
  – Traditionally for control: avionics, automotive, …
  – Some recent work tries to apply it to multimedia/signal processing
To what extent do these methods apply to multicores?
Are dataflow computers going to come back?