Introduction to Silicon Programming in the Tangram/Haste language

Introduction to Silicon Programming in the Tangram/Haste language Material adapted from lectures by: Prof.dr.ir Kees van Berkel [Dr. Johan Lukkien] [Dr.ir. Ad Peeters] at the Technical University of Eindhoven, the Netherlands

Handshake protocol
Handshake between an active and a passive partner.
Communication is by means of alternating request (from active to passive) and acknowledge (from passive to active) signals.
Active: send request, then wait for acknowledge.
Passive: wait for request, then send acknowledge.
Handshake circuits are a special class of asynchronous circuits. Where ‘asynchronous’ stresses a property that the circuit does not have (it is not synchronous), ‘handshake technology’ emphasizes the property that the circuits do have, namely the strict reliance on handshake protocols as a timing discipline. A handshake protocol is a game between an active partner (indicated with the red bullet) and a passive partner (indicated with the green bullet). The active partner initiates a handshake by sending a request to the passive partner. The passive partner then “does what it is supposed to do” and, when completed, sends an acknowledgement back to the active partner.

Handshake component: sequencer
[Animation labels: Master, Sequencer, Task 1, Task 2]
This animation shows a so-called “sequencer” component in operation. When activated from the top, a sequencer first performs a handshake on its left-hand side, then one on its right-hand side, and finally signals completion of its operation by completing the handshake with the top (activating) module. The animation shows how handshake communication can be used to implement control functions, in this case sequencing. Similar components exist for parallel or conditional activation.

Four-phase handshake protocol
The circuit-level implementation has separate wires for request and acknowledge.
The four-phase handshake protocol implements return-to-zero of these wires.
Active side: Req := 1 ; Wait (Ack) ; Req := 0 ; Wait (¬Ack)
Passive side: Wait (Req) ; Ack := 1 ; Wait (¬Req) ; Ack := 0
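
The two sides of the four-phase protocol can be sketched as a pair of cooperating Python generators sharing the Req and Ack wires. This is only an illustrative model: the round-robin scheduler, the dictionary of wires, and the event names such as "req+" are assumptions made for the sketch and are not part of the lecture material.

```python
def active(wires, trace):
    """Active side: Req := 1 ; Wait (Ack) ; Req := 0 ; Wait (not Ack)."""
    wires["req"] = 1
    trace.append("req+")
    while not wires["ack"]:
        yield  # Wait (Ack)
    wires["req"] = 0
    trace.append("req-")
    while wires["ack"]:
        yield  # Wait (not Ack)


def passive(wires, trace):
    """Passive side: Wait (Req) ; Ack := 1 ; Wait (not Req) ; Ack := 0."""
    while not wires["req"]:
        yield  # Wait (Req)
    wires["ack"] = 1
    trace.append("ack+")
    while wires["req"]:
        yield  # Wait (not Req)
    wires["ack"] = 0
    trace.append("ack-")


def run_handshake():
    """Run both sides round-robin until each has finished one handshake."""
    wires = {"req": 0, "ack": 0}
    trace = []
    procs = [passive(wires, trace), active(wires, trace)]
    while procs:
        for proc in list(procs):
            try:
                next(proc)
            except StopIteration:
                procs.remove(proc)
    return trace


print(run_handshake())  # ['req+', 'ack+', 'req-', 'ack-']
```

Neither side refers to a clock: each one proceeds only when the wire it is waiting on has changed, which is the timing discipline the protocol relies on.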

Handshake signaling
[Timing diagram: active side and passive side, request wire ar and acknowledge wire ak, plotted against time]
Event sequence: ar↑ ak↑ ar↓ ak↓

Handshake behaviors
Let xi be boolean variables, and Si commands:
skip always terminates without effect
x↑ is a shorthand for x := true and x↓ for x := false
S1 ; S2 denotes sequential execution of S1 and S2
S1 || S2 denotes parallel execution of S1 and S2
Program notation inspired by [Martin].

Handshake behaviors
Let Gi be boolean expressions.
Selection: [G1 → S1 [] … [] GN → SN] executes an arbitrary Si for which guard Gi holds. When no guard holds, execution is suspended until one does.
Repetition: *[G1 → S1 [] … [] GN → SN] repeatedly executes an Si for which Gi holds, until all guards are false.

Useful shorthands
‘Wait until’: [G] = [G → skip]
Note: [G] ; S = [G → S]
Unbounded repetition: *[S] = *[true → S]
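
As a rough illustration of selection and repetition, the sketch below interprets guarded commands in Python. The helper names `select` and `repeat`, the representation of alternatives as (guard, command) pairs, and the GCD example are all assumptions chosen for the sketch. A sequential interpreter cannot suspend and wait for the environment, so `select` reports a failed attempt instead of blocking.

```python
import random


def select(alternatives, rng=random):
    """One attempt at [G1 -> S1 [] ... [] GN -> SN].

    alternatives is a list of (guard, command) pairs of zero-argument callables.
    Returns True if a command was executed and False if no guard held
    (where a real selection would suspend until some guard becomes true).
    """
    enabled = [command for guard, command in alternatives if guard()]
    if not enabled:
        return False
    rng.choice(enabled)()  # an arbitrary alternative whose guard holds
    return True


def repeat(alternatives, rng=random):
    """*[G1 -> S1 [] ... [] GN -> SN]: repeat until all guards are false."""
    while select(alternatives, rng):
        pass


def gcd(x, y):
    """Greatest common divisor of two positive integers as a repetition."""
    s = {"x": x, "y": y}
    repeat([
        (lambda: s["x"] > s["y"], lambda: s.update(x=s["x"] - s["y"])),
        (lambda: s["y"] > s["x"], lambda: s.update(y=s["y"] - s["x"])),
    ])
    return s["x"]


print(gcd(12, 18))  # 6
```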

Useful shorthands
Four-phase handshakes:
a (passive) = [ar] ; ak↑ ; [¬ar] ; ak↓
a (active) = ar↑ ; [ak] ; ar↓ ; [¬ak]
Two-phase handshakes:
a↑ (passive) = [ar] ; ak↑
a↓ (passive) = [¬ar] ; ak↓
a↑ (active) = ar↑ ; [ak]
a↓ (active) = ar↓ ; [¬ak]

Reorder properties
In the absence of timing assumptions:
One cannot observe the order of output transitions: x1↑ ; x2↑ = x2↑ ; x1↑ = x1↑ || x2↑
One cannot fix the order of input transitions: [x1] ; [x2] = [x2] ; [x1] = [x1] || [x2] = [x1 ∧ x2]

Enclosure and properties
a↑ : S = [ar] ; S ; ak↑
a↓ : S = [¬ar] ; S ; ak↓
Reorder property: a↑ : b↑ = [ar] ; ([br] ; bk↑) ; ak↑ = [br] ; ([ar] ; ak↑) ; bk↑ = b↑ : a↑

Decomposition rule
Let program P = … S … and let a be a “fresh” channel.
Program P can be decomposed into two parallel processes:
P’ = … a↑ ; a↓ … and *[a↑ : S ; a↓]

Some handshake components
[Figure: component symbols with ports a, b (repeater), a, b, c (mixer, |) and a, b, c (sequencer, ;)]
Repeater: [a↑ : *[b↑ ; b↓]]
Mixer: *[ [ a↑ : c↑ ; [a↓ : c↓] [] b↑ : c↑ ; [b↓ : c↓] ] ]
Sequencer: *[[a↑ : (b↑ ; b↓ ; c↑)] ; [a↓ : c↓]]

Handshake circuit: duplicator
For each handshake on a0 the duplicator produces two handshakes on a1:
*[[a0↑ : (a1↑ ; a1↓ ; a1↑)] ; [a0↓ : a1↓]]
Cf. handshake behavior of the sequencer.
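
At a coarser level, a complete handshake on a channel can be modelled as a function call: the active side calls, and the passive side returns when it acknowledges. The sketch below uses that abstraction for the sequencer and the duplicator. It deliberately ignores the individual request and acknowledge phases (including the way the second output handshake overlaps the return-to-zero of the input), and the function names are assumptions made for illustration.

```python
def sequencer(b, c):
    """Return passive port a: one handshake on a performs b, then c."""
    def a():
        b()  # complete handshake on b
        c()  # then complete handshake on c
    return a


def duplicator(a1):
    """Return passive port a0: one handshake on a0 gives two on a1."""
    def a0():
        a1()
        a1()
    return a0


def count_output_handshakes(stages):
    """Activate a chain of duplicators once and count output handshakes."""
    count = 0

    def output():
        nonlocal count
        count += 1

    port = output
    for _ in range(stages):
        port = duplicator(port)
    port()  # a single handshake on the input of the chain
    return count


log = []
sequencer(lambda: log.append("task 1"), lambda: log.append("task 2"))()
print(log)                         # ['task 1', 'task 2']
print(count_output_handshakes(3))  # 8
```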

Production rules
Production rules are guarded commands that specify (CMOS) gates: F → z↑, G → z↓
Interpretation: do F then z := true or G then z := false od
Guards must be mutually exclusive (environment).
A gate is combinational if F ∨ G is a tautology, and it is sequential (state-holding) otherwise.
Guards must be stable: once a guard is true it must remain true until completion of the transition.
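
The distinction between combinational and sequential gates can be checked mechanically for small gates by enumerating all input combinations. The sketch below does this for a pull-up guard F and a pull-down guard G given as Python functions. The function name `classify_gate` is an assumption, and the check for mutual exclusion is stricter than required, since it tests every input combination rather than only those the environment can produce.

```python
from itertools import product


def classify_gate(pull_up, pull_down, n_inputs):
    """Classify a gate given by production rules F -> z up, G -> z down.

    pull_up (F) and pull_down (G) are boolean functions of n_inputs inputs.
    """
    combinational = True
    for inputs in product([False, True], repeat=n_inputs):
        f, g = pull_up(*inputs), pull_down(*inputs)
        if f and g:
            raise ValueError(f"guards not mutually exclusive for {inputs}")
        if not (f or g):
            # Neither guard holds: the output must keep its previous value.
            combinational = False
    return "combinational" if combinational else "sequential"


# AND gate: F = a and b, G = not (a and b); F or G is a tautology.
print(classify_gate(lambda a, b: a and b,
                    lambda a, b: not (a and b), 2))      # combinational

# C-element: F = a and b, G = (not a) and (not b); state-holding.
print(classify_gate(lambda a, b: a and b,
                    lambda a, b: not a and not b, 2))    # sequential
```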

Behavior of a gate network
A gate network is the union of all pairs of production rules (gates).
The concurrent execution of this set of PRs amounts to [Martin]:
*[ select a PR ; fire that PR ]
If the guard of the PR equals false, firing = skip (firing a PR is an atomic action).
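
The select-and-fire loop can be sketched as a small simulator. As a shortcut, this version only selects production rules whose firing would actually change a wire, so that the loop stops once the network is stable; selecting a rule with a false guard would simply be a skip. The rule format (guard, target, value), the wire names, and the example network (a C-element driving an inverter) are assumptions made for the sketch.

```python
import random


def run_network(rules, state, rng=None, max_steps=1000):
    """Repeat 'select a PR ; fire that PR' until no PR can change the state.

    rules is a list of (guard, target, value) triples, where guard is a
    function of the state dictionary. Returns the list of transitions fired.
    """
    rng = rng or random.Random(0)
    fired = []
    for _ in range(max_steps):
        effective = [(target, value) for guard, target, value in rules
                     if guard(state) and state[target] != value]
        if not effective:
            break
        target, value = rng.choice(effective)
        state[target] = value  # firing is an atomic action
        fired.append(target + ("+" if value else "-"))
    return fired


RULES = [
    (lambda s: s["a"] and s["b"], "c", True),           # C-element pull-up
    (lambda s: not s["a"] and not s["b"], "c", False),  # C-element pull-down
    (lambda s: not s["c"], "d", True),                  # inverter
    (lambda s: s["c"], "d", False),
]

state = {"a": True, "b": True, "c": False, "d": True}
print(run_network(RULES, state))  # ['c+', 'd-']
state["a"] = False                # only one input falls
print(run_network(RULES, state))  # []: the C-element holds its output
```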

Initializable
A handshake component realization is initializable if both of the following hold:
when all inputs are false, the gate network autonomously proceeds to an initial state;
when all passive inputs are false, the component autonomously proceeds to a state with all active outputs false.

Handshake components: realization
From handshake notation to gate network in 8 steps:
1. Specify the component in handshake notation.
2. Expand to individual boolean variables (wires).
3. Introduce auxiliary state variables (if required).
4. Derive a set of production rules that implements this refined specification.
5. Make the production rules more symmetric (cheaper).
6. (Verify isochronic forks.)
7. Verify initialization constraints.
8. Analyze time, area, and energy.

For those who are interested in the details
Alain J. Martin, Synthesis of Asynchronous VLSI Circuits, Caltech CS-TR-93-28 (PostScript link via async.bib, html version).
Alain J. Martin, Programming in VLSI: From communicating processes to delay-insensitive circuits, pages 1–64 in C.A.R. Hoare, ed., Developments in Concurrency and Communication.

Handshake component realizations
Connector: trivial
Repeater: alternative ‘symmetrizations’
Mixer: isochronic forks
Sequencer: introduction of auxiliary variable
Duplicator: up to you?
Selector: up to you!

Connector realization
Behavior: *[a↑ : b↑ ; a↓ : b↓]
Expansion: *[ [ar] ; br↑ ; [bk] ; ak↑ ; [¬ar] ; br↓ ; [¬bk] ; ak↓ ]
Production rules:
bk → ak↑
ar → br↑
¬bk → ak↓
¬ar → br↓
A pair of wires (!): no area, no delay, no energy.

[ [ar] ; [ br ; [bk] ; br ; [bk] ] ; ak ] Repeater realization  a b Behavior: [a : [b ; b] ] Expansion: [ [ar] ; [ br ; [bk] ; br ; [bk] ] ; ak ] Production rules: false  ak ar  bk  br true  ak bk  br However, not initializable!

Repeater realizations

Repeater: area, delay, energy
Area: 2 gate equivalents
Delay per cycle: 2 gate delays
Energy per cycle: 2 transitions

Mixer realization
Behavior: *[ [ a↑ : c↑ ; a↓ : c↓ [] b↑ : c↑ ; b↓ : c↓ ] ]
Restriction: ¬(ar ∧ br) must hold at all times!
Expansion: *[ [ [ar] ; cr↑ ; [ck] ; ak↑ ; [¬ar] ; cr↓ ; [¬ck] ; ak↓ [] [br] ; cr↑ ; [ck] ; bk↑ ; [¬br] ; cr↓ ; [¬ck] ; bk↓ ] ]

Mixer realization
Production rules:
ar ∧ ck → ak↑
br ∧ ck → bk↑
¬ck → ak↓
¬ck → bk↓
ar ∨ br → cr↑
¬ar ∧ ¬br → cr↓
More symmetric production rules:
ar ∧ ck → ak↑, ¬ar ∨ ¬ck → ak↓ (premature ak↓)
ar ∧ ck → ak↑, ¬ar ∧ ¬ck → ak↓ (more expensive)

Mixer realizations
Mixer: area, delay, energy
Area: 6 gate equivalents
Delay per cycle: 8 gate delays
Energy per cycle: 8 transitions

Duplicator chains
Assume aM toggles at frequency f. Hence a0 toggles at frequency f / 2^M.
Let Edup be the duplication energy per cycle.
The power of the duplicator chain equals P = f Edup (1/2 + 1/4 + 1/8 + ...) < f Edup
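
The bound follows from the geometric series: the duplicator closest to the output completes f/2 cycles per second, the next one f/4, and so on. A short numerical sketch, assuming Edup is the energy one duplicator spends per cycle and using a hypothetical helper `chain_power`:

```python
def chain_power(f, e_dup, m):
    """Power of a chain of m duplicators whose output toggles at frequency f.

    The duplicator k stages back from the output runs at f / 2**k cycles
    per second and spends e_dup per cycle.
    """
    return sum(f * e_dup / 2**k for k in range(1, m + 1))


for m in (1, 2, 3, 10):
    print(m, chain_power(1.0, 1.0, m))
# 1 0.5
# 2 0.75
# 3 0.875
# 10 0.9990234375
```

However long the chain, its power stays below that of a single duplicator running at the output frequency f.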

Join realization
Behavior: *[ [ a↑ : b↑ : c↑ ; a↓ : b↓ : c↓ ] ]
Expansion: *[ [ar] ; [br] ; cr↑ ; [ck] ; bk↑ ; ak↑ ; [¬ar] ; [¬br] ; cr↓ ; [¬ck] ; bk↓ ; ak↓ ]
= *[ [ar ∧ br] ; cr↑ ; [ck] ; bk↑ , ak↑ ; [¬ar ∧ ¬br] ; cr↓ ; [¬ck] ; bk↓ , ak↓ ]

Join realization
[Figure: join component with ports a, b, c, realized by a fork and a C-element]
*[ [ar ∧ br] ; cr↑ ; [ck] ; bk↑ , ak↑ ; [¬ar ∧ ¬br] ; cr↓ ; [¬ck] ; bk↓ , ak↓ ]
Production rules:
ck → ak↑
ck → bk↑
¬ck → ak↓
¬ck → bk↓
ar ∧ br → cr↑
¬ar ∧ ¬br → cr↓

Join realization
[Circuit diagram: C-element with wires ar, br, cr; wires ck, ak, bk]
Production rules:
ck → ak↑
ck → bk↑
¬ck → ak↓
¬ck → bk↓
ar ∧ br → cr↑
¬ar ∧ ¬br → cr↓
Join: area, delay, energy
Area: 2 gate equivalents
Delay per cycle: 4 gate delays
Energy per cycle: 4 transitions

Sequencer realization
Specification: *[(a↑ : (b↑ ; b↓ ; c↑)) ; (a↓ : c↓)]
Expansion: *[ [ar] ; br↑ ; [bk] ; br↓ ; [¬bk] ; cr↑ ; [ck] ; ak↑ ; [¬ar] ; cr↓ ; [¬ck] ; ak↓ ]
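
The expansion step is mechanical, which the sketch below tries to show by building the sequencer's wire-level event sequence from the half-handshake and enclosure shorthands. It writes ↑ and ↓ for setting a wire true and false, ¬ for negation, and [x] for waiting on x. The helper names and the reading of the shorthands (active a↑ = ar↑ ; [ak], and a↑ : S = [ar] ; S ; ak↑, with the mirrored forms for ↓) are assumptions made for the sketch.

```python
def active_up(ch):
    """Active up phase on channel ch: raise the request, wait for acknowledge."""
    return [f"{ch}r↑", f"[{ch}k]"]


def active_down(ch):
    """Active down phase: lower the request, wait for acknowledge to fall."""
    return [f"{ch}r↓", f"[¬{ch}k]"]


def enclose_up(ch, body):
    """Passive up phase enclosing body: wait for request, do body, acknowledge."""
    return [f"[{ch}r]"] + body + [f"{ch}k↑"]


def enclose_down(ch, body):
    """Passive down phase enclosing body."""
    return [f"[¬{ch}r]"] + body + [f"{ch}k↓"]


def sequencer_cycle():
    """One cycle of the sequencer: b completes, then c spans the rest of a."""
    first = enclose_up("a", active_up("b") + active_down("b") + active_up("c"))
    second = enclose_down("a", active_down("c"))
    return first + second


print(" ; ".join(sequencer_cycle()))
# [ar] ; br↑ ; [bk] ; br↓ ; [¬bk] ; cr↑ ; [ck] ; ak↑ ; [¬ar] ; cr↓ ; [¬ck] ; ak↓
```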

Sequencer realization x ar bk br cr ck ak Sequencer: area, delay, energy Area: 5 gate equivalents Delay per cycle: 12 gate delays (8 with optimized C-element) Energy per cycle: 12 transitions (10 with optimized C-element)

Parallel realization
Behavior: *[ a↑ : ((b↑ ; b↓) || (c↑ ; c↓)) ; a↓ ]
Cf. Join component.
Expansion: *[ [ar] ; ( (br↑ ; [bk] ; br↓ ; [¬bk]) || (cr↑ ; [ck] ; cr↓ ; [¬ck]) ) ; ak↑ ; [¬ar] ; ak↓ ]

Selector (specification)
[Figure: selector symbol [|] with ports b, c, d, e]
*[ (a↑ : [ b↑ : x↑ ; b↓ ; d↑ [] c↑ : x↓ ; c↓ ; e↑ ]) ; (a↓ : [ x → d↓ [] ¬x → e↓ ]) ]

Assignment: duplicator realization
Behavior: *[[a↑ : (b↑ ; b↓ ; b↑)] ; [a↓ : b↓]]
Required: realization with 2 sequential gates (sequencer + mixer requires 3 sequential gates).
Follow all 8 realization steps!
Add a comparison with the sequencer + mixer realization.