IMPLEMENTING SYNCHRONOUS MODELS ON LOOSELY TIME TRIGGERED ARCHITECTURES. Discussed by Alberto Puggelli.

Presentation transcript:

OUTLINE
- From synchronous to LTTA
- Analysis of system throughput

SYNCHRONOUS IS GOOD...
- Predictability
- Theoretical backing
- Easy verification
- Automated synthesis
- ...

...BUT DIFFICULT TO IMPLEMENT!
- Need for clock synchronization in embedded systems
- Long wires
- Cheap hardware
- Timing constraints

SOLUTION
- Complete all verification steps in the synchronous domain
- De-synchronize the system while preserving the semantics
- Stream equivalence preservation

STEPS TO DESYNCHRONIZE
- Design the system with synchronous semantics
- Choose a suitable architecture
- Platform-Based Design: select a set of intermediate layers to map the system description onto the architecture while preserving the semantics

ARCHITECTURE: LTTA
- Loosely Time Triggered Architecture
- Each node has an independent clock and can write to and read from the medium asynchronously
- The medium is a shared memory that is refreshed synchronously with the medium clock
- Neither reads nor writes are blocking
- Each node has to check for the presence of data in the medium (when reading) and for the availability of memory (when writing)

INTERMEDIATE LAYER: FFP
- Kahn Process Network with bounded FIFOs (to represent a real system)
- Marked Directed Graphs (MDG) to allow semantic preservation and to analyze system performance

SYNCHRONOUS MODEL
- Set of Mealy machines and connections among them → directed graph (nodes and edges)
- Every loop is broken by a unit-delay element (UD)
- Partial order: M_i < M_j if there is a link from M_i to M_j without a UD (taking the reflexive and transitive closure)
- Minimal element: M_i is minimal if there is no other machine M_j such that M_j < M_i
- Each link carries an infinite stream of values in V (UDs also have output streams)
- Each machine produces an output and a next state as a function of its inputs and its current state
- For a UD with input x and output y: y(0) = initial value, y(k+1) = x(k)
- At each step, the machines are fired in any total order that respects the partial order
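The two rules on this slide (fire machines in an order compatible with the UD-free links, and let a UD output the value it received one step earlier) can be sketched in Python. This is only an illustration: the names `firing_order`, `UnitDelay` and `running_sum` are hypothetical, and the example machine (an adder whose feedback loop goes through a UD) is an assumption, not taken from the slides. It requires Python 3.9+ for `graphlib`.

```python
from graphlib import TopologicalSorter


def firing_order(links_without_ud):
    """Return one total order compatible with the partial order.

    links_without_ud: iterable of (source, destination) machine names for
    links that do NOT go through a unit delay. A loop without a UD raises
    graphlib.CycleError, mirroring the rule that every loop needs a UD.
    """
    sorter = TopologicalSorter()
    for src, dst in links_without_ud:
        sorter.add(dst, src)  # dst depends on src
    return list(sorter.static_order())


class UnitDelay:
    """y(0) = initial value, y(k+1) = x(k)."""

    def __init__(self, initial_value):
        self._stored = initial_value

    def output(self):
        return self._stored  # y(k), available before any machine fires

    def update(self, x):
        self._stored = x  # becomes y(k+1)


def running_sum(xs):
    """Adder with a feedback loop broken by a UD: s(k) = x(k) + y(k)."""
    ud = UnitDelay(0)
    sums = []
    for x in xs:
        s = x + ud.output()
        ud.update(s)
        sums.append(s)
    return sums


order = firing_order([("M1", "M2"), ("M2", "M3")])
sums = running_sum([1, 2, 3])
```

For the chain above the order is M1, M2, M3, and the running sum of 1, 2, 3 is 1, 3, 6.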

EXAMPLE

ARCHITECTURE: LTTA
- Each node runs a single process triggered by a local clock.
- Nodes are connected by Communication by Sampling (CbS) links.
- API: a set of functions for accessing CbS links. Each function may be called only under certain conditions (assumptions) and in return guarantees certain functionality.
- The execution time of each block is less than the time between two triggers.

ARCHITECTURE: CbS
- Only source nodes can write (function write())
- Only destination nodes can read (function read())
- Execution time is unknown
- Atomicity is guaranteed (a function call ends before the next one starts)
- No guarantee on the freshness of data (because of the unknown execution time)
- Function isNew(): returns true if there is fresh data
- write() attaches an index (sn) to the data; the reader keeps the index (lsn) of the last data it read: if lsn = sn, the data is old
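The sn/lsn freshness check can be sketched as a small Python class. This is a minimal single-threaded illustration, assuming that a write simply overwrites the previous sample (sampling rather than queuing); the class name `CbSLink` and the snake_case method names are hypothetical, and atomicity is taken for granted rather than implemented.

```python
class CbSLink:
    """One writer, one reader, a single shared slot."""

    def __init__(self):
        self._value = None
        self._sn = 0   # index attached by the writer to the latest data
        self._lsn = 0  # index of the data the reader read last

    def write(self, value):
        # Overwrites the previous sample (assumption: no buffering).
        self._value = value
        self._sn += 1

    def is_new(self):
        # If lsn == sn, the reader has already seen this data.
        return self._lsn != self._sn

    def read(self):
        self._lsn = self._sn
        return self._value


link = CbSLink()
fresh_before_write = link.is_new()
link.write("a")
link.write("b")               # "a" is lost: the reader never sampled it
fresh_after_write = link.is_new()
sample = link.read()
fresh_after_read = link.is_new()
```

The lost value "a" shows why a protocol layer is needed on top of CbS before stream semantics can be preserved.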

EXAMPLE

INTERMEDIATE LAYER: FFP
- Architecturally similar to LTTA, with semantics close to synchronous
- Set of sequential processes communicating through finite FIFOs
- Processes do NOT block: each process has to check whether it can execute (queue not empty before reading; queue not full before writing)
- API: isEmpty, isFull, put, get (similar to the CbS API)

MAPPING SYNC TO FFP
- Each machine is mapped to a process (UDs are not)
- There is a queue for each link
- If the link has a UD, the queue has size 2
- If the link has no UD, the queue has size 1
- At each trigger:
    IF all input queues are non-empty AND all output queues are non-full
        Compute outputs and new state
        Write outputs to output queues
    ELSE
        Skip step
    END IF
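The trigger rule above can be sketched in Python as follows. The class and function names (`Fifo`, `Process`, `queue_for_link`) are hypothetical, and pre-loading the queue of a UD link with the UD's initial value is an assumption, chosen to agree with the token placed in the forward place in the MDG conversion.

```python
from collections import deque


class Fifo:
    """Bounded FIFO exposing the four FFP calls."""

    def __init__(self, capacity, initial=()):
        self.capacity = capacity
        self.items = deque(initial)

    def is_empty(self):
        return len(self.items) == 0

    def is_full(self):
        return len(self.items) >= self.capacity

    def put(self, value):
        self.items.append(value)

    def get(self):
        return self.items.popleft()


def queue_for_link(has_unit_delay, initial_value=None):
    """Size 2 (pre-loaded, by assumption) for a UD link, size 1 otherwise."""
    if has_unit_delay:
        return Fifo(2, [initial_value])
    return Fifo(1)


class Process:
    """One synchronous machine mapped to an FFP process."""

    def __init__(self, step_fn, state, inputs, outputs):
        self.step_fn = step_fn  # (state, input values) -> (output values, next state)
        self.state = state
        self.inputs = inputs
        self.outputs = outputs

    def trigger(self):
        """Return True if the process fired, False if it skipped the step."""
        if any(q.is_empty() for q in self.inputs) or any(q.is_full() for q in self.outputs):
            return False
        values = [q.get() for q in self.inputs]
        outs, self.state = self.step_fn(self.state, values)
        for q, v in zip(self.outputs, outs):
            q.put(v)
        return True


# Usage: a stateless doubling machine between two size-1 queues.
q_in, q_out = queue_for_link(False), queue_for_link(False)
doubler = Process(lambda state, xs: ([2 * xs[0]], state), None, [q_in], [q_out])
fired_on_empty = doubler.trigger()   # input queue is empty, so the step is skipped
q_in.put(3)
fired_with_data = doubler.trigger()  # fires and writes 6 to q_out
```

Skipping instead of blocking is what lets the same process body run unchanged on a platform with neither blocking reads nor blocking writes.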

MAPPING SYNC TO FFP (2)
- Conversion into a Marked Directed Graph (MDG)
- Every process becomes a transition
- Every queue is converted into a forward place (modeling a non-empty queue) and a backward place (modeling a non-full queue)
- For a queue of size k with a UD: 1 token in the forward place and k-1 tokens in the backward place
- For a queue of size k without a UD: 0 tokens in the forward place and k tokens in the backward place
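The token rule can be written as a short Python helper. The function names are hypothetical, and listing the places queue by queue as (forward, backward) is an assumed ordering, chosen because it matches the initial markings (0,1,0,1) and (0,2,0,2) used in the throughput examples.

```python
def initial_tokens(queue_size, has_unit_delay):
    """Return (forward, backward) token counts for one queue."""
    forward = 1 if has_unit_delay else 0
    backward = queue_size - forward
    return forward, backward


def initial_marking(queues):
    """queues: list of (size, has_unit_delay); returns a flat marking tuple."""
    marking = []
    for size, has_ud in queues:
        marking.extend(initial_tokens(size, has_ud))
    return tuple(marking)


m0_size_1 = initial_marking([(1, False), (1, False)])
m0_size_2 = initial_marking([(2, False), (2, False)])
m0_with_ud = initial_marking([(2, True)])
```

In every case the forward and backward tokens of a queue add up to its size k, so the pair of places models a queue of capacity k.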

EXAMPLE

AFTER FIRING T1

EXAMPLE AFTER FIRING T3

EXAMPLE AFTER FIRING T2

MAPPING SYNC TO FFP
- Theorem: the semantics are preserved with queues of size at most 2: size 2 if there is a UD, size 1 if there is not.
- Step 1: the FFP has no deadlocks (true by construction, since at least one token is placed in each directed circuit).
- Step 2: any execution of the MDG is one possible execution of the corresponding FFP.
- Note: the isFull check is not necessary because, by construction, if the inputs are not empty, the outputs cannot be full.

MAPPING FFP TO LTTA
- FFP can be mapped 1:1 onto LTTA
- The FFP API can be implemented on top of the LTTA API
- The semantics are preserved by skipping processes that cannot be fired (because of empty inputs or full outputs)

THROUGHPUT ANALYSIS
- Need for an estimate of the system throughput (λ): each process either runs or skips at every trigger.
- Upper bound: the clock rate (if globally synchronous)
- Is there a lower bound (worst case)?

THROUGHPUT ANALYSIS
- In real time (RT): Theorem: if the size of a queue is increased, the resulting throughput either increases or remains the same.
- Need for a symbolic definition of throughput that is independent of the implementation → logical-time throughput
- In logical time (LT): the worst-case throughput is denoted λ_min

THROUGHPUT ANALYSIS
- To find the minimum, we define a "slow triggering policy": at each time step, the clock of each process ticks exactly once, and the clocks of disabled processes tick before the clocks of enabled processes
- Theorem: the throughput of a system that adopts the slow triggering policy is the lowest possible
- Disabled processes cannot run until the following time step → the throughput is minimized

THROUGHPUT ANALYSIS
- To evaluate λ_min, we first analyze the associated MDG.
- Create a reachability graph (RG) that implements the slow triggering policy: at each step, all enabled transitions are fired (transitions that are not enabled cannot be fired until the following step)
- Determine the lasso starting from M0
- Lasso: a loop in the RG that the system traverses infinitely many times (remember: the system is deadlock free)
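The procedure above can be sketched as a small Python simulation: starting from M0, fire every transition that is enabled at the start of the step, and stop when a marking repeats. The function name `find_lasso` and the data layout are hypothetical. The usage assumes a chain of three processes T0 → T1 → T2 connected by two queues, with places ordered (forward 1, backward 1, forward 2, backward 2); that topology is an assumption made for illustration.

```python
def find_lasso(n_transitions, places, initial_marking):
    """Simulate the slow triggering policy on a marked directed graph.

    places: list of (source_transition, destination_transition), one per place.
    Returns (loop_length, firings_per_transition_within_the_loop).
    """
    marking = tuple(initial_marking)
    counts = [0] * n_transitions
    seen = {}  # marking -> (step at which it was reached, firing counts so far)
    step = 0
    while marking not in seen:
        seen[marking] = (step, tuple(counts))
        # A transition is enabled if all of its input places hold a token.
        enabled = [
            t for t in range(n_transitions)
            if all(marking[p] > 0 for p, (_, dst) in enumerate(places) if dst == t)
        ]
        new_marking = list(marking)
        for t in enabled:
            counts[t] += 1
            for p, (src, dst) in enumerate(places):
                if dst == t:
                    new_marking[p] -= 1  # consume from input place
                if src == t:
                    new_marking[p] += 1  # produce into output place
        marking = tuple(new_marking)
        step += 1
    first_step, first_counts = seen[marking]
    loop_length = step - first_step
    fired = [c - c0 for c, c0 in zip(counts, first_counts)]
    return loop_length, fired


# Assumed chain T0 -> T1 -> T2: forward places follow the data, backward places oppose it.
chain = [(0, 1), (1, 0), (1, 2), (2, 1)]
length_1, fired_1 = find_lasso(3, chain, (0, 1, 0, 1))  # queues of size 1
length_2, fired_2 = find_lasso(3, chain, (0, 2, 0, 2))  # queues of size 2
throughput_1 = fired_1[0] / length_1
throughput_2 = fired_2[0] / length_2
```

Under these assumptions the size-1 chain settles into a loop of length 2 in which each transition fires once (throughput 0.5), and the size-2 chain settles into a loop of length 1 (throughput 1). Firing all enabled transitions in one step is safe in an MDG because every place has a single consumer, so enabled transitions never compete for tokens.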

THROUGHPUT ANALYSIS
- If L is the length of the lasso, for the process P:
- The worst-case throughput is the same for all processes (a lasso is periodic, so all transitions have to be fired the same number of times to return to the starting marking)
- If Δ is the period of the slowest clock:
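The two formulas on this slide are not preserved in the transcript. A plausible reconstruction, offered only as a sketch, follows from the worked examples (a loop of length 2 in which each transition fires once gives throughput 0.5). The symbol n_P (number of times P fires during one traversal of the loop) is an assumed name, and the real-time bound assumes that each logical step lasts at most one period Δ of the slowest clock.

```latex
% Worst-case logical-time throughput of process P (n_P is an assumed symbol)
\lambda_{\min}(P) = \frac{n_P}{L}

% Lower bound on real-time throughput, assuming one logical step takes at most \Delta
\lambda^{RT}(P) \;\ge\; \frac{\lambda_{\min}(P)}{\Delta}
```

With L = 2 and n_P = 1 this gives 0.5 firings per logical step, and with L = 1 and n_P = 1 it gives 1.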

EXAMPLE
- Initial marking: M0 = (0,1,0,1)
- #transitions = 3
- #places = 2(3-1) = 4
- Two adjacent processes cannot be enabled at the same time step!
- → The lasso is (0,1,0,1) → (1,0,0,1) → (0,1,1,0) → (1,0,0,1)
- The lasso has length 2 and each transition is fired once → throughput = 0.5

EXAMPLE

EXAMPLE 2
- Initial marking: M0 = (0,2,0,2)
- #transitions = 3
- #places = 2(3-1) = 4
- The lasso is (0,2,0,2) → (1,1,1,1) → (1,1,1,1)
- The lasso has length 1 and each transition is fired once → throughput = 1

EXAMPLE 2