Direct synthesis of large-scale asynchronous controllers using a Petri-net-based approach. Ivan Blunno, Politecnico di Torino; Alex Bystrov, Univ. Newcastle.

Direct synthesis of large-scale asynchronous controllers using a Petri-net-based approach. Ivan Blunno (Politecnico di Torino), Alex Bystrov (Univ. Newcastle upon Tyne), Josep Carmona (Univ. Politècnica de Catalunya), Jordi Cortadella (Univ. Politècnica de Catalunya), Luciano Lavagno (Università di Udine), Alex Yakovlev (Univ. Newcastle upon Tyne)

Outline: Motivation (current section); Design flow; Verilog HDL specification; Petri nets and trace expressions; Synthesis process; Conclusion

Motivation: Language-based design is a key enabler of the success of synchronous logic. Use an HDL as a single language for specification, logic simulation and debugging, synthesis, and post-layout simulation. The HDL must support multiple levels of abstraction.

Motivation: An HDL generates large asynchronous controllers, so direct synthesis is needed. Goals: guarantee an implementation; explore the design space automatically; benefit from existing structural methods for logic synthesis; benefit (at the design stage) from existing performance estimation approaches.

Design flow (figure labels): HDL specification; control/data splitting; STG (control); synthesizable HDL (data); synthesis (petrify); synthesis (Synopsys); HDL implementation; logic implementation; timing analysis (Synopsys); logic delays; delay insertion.

Design flow. What is available? Simulators (no synchronous assumption…); logic synthesis (from BFSM, STG, …); layout (almost like synchronous…). What is missing? A translator from HDL to the synthesis specification model; a translator from the synthesis implementation model to HDL.

Other approaches: special-purpose languages. Pros: syntax and semantics can be tailored to asynchronous Models of Computation (STG, BFSM, process algebras). Cons: not familiar to designers, no standard tool support. Examples: Tangram, Communicating Hardware Processes, Balsa.

Our approach: general-purpose language. Pros: several tools available, broad user base. Cons: syntax and semantics are oriented to gates (not STGs, BFSMs or process algebras); a subset must be defined for synthesis (the full language is only good for simulation). Choice: VHDL or Verilog [Blunno & Lavagno, ASYNC’00].

Outline: Motivation; Design flow; Verilog HDL specification (current section); Petri nets and trace expressions; Synthesis; Conclusion

Asynchronous Verilog subset. Module and signal declaration:
module example(a, b, c, d);
  input a, b[7..0];
  output c, d;
  reg e, f, g[11..0];
Currently only a single module is supported. An always loop surrounds the live behavior; an initial block defines the initialization sequence.

Asynchronous Verilog subset. Transitions: on input signals, a wait statement, e.g. wait(a); ... wait(!b); on output signals, an assignment statement, e.g. c = a + b; Each statement generates a trace expression and a datapath fragment.

Asynchronous Verilog subset. Causality relations are expressed with Verilog statements: begin-end for sequencing, fork-join for concurrency, if-then-else for input choice. Only a structured mix of sequencing, concurrency and choice can be specified.

Example: simple filter
always begin
  wait(start);
  R = SMP * 3;
  RES = SMP * 4;
  if (b7 == 1) RES = 0;
  else begin
    if (b6 == 1) RES = 1;
  end;
  done = 1;
  wait(!start);
  done = 0;
end
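As a rough illustration of how the three structuring statements could map onto trace-expression operators, here is a minimal Python sketch. It is not the authors' translator: the tuple-based statement encoding, the function name to_te, and the use of "+" for choice are all assumptions made for this example, and each wait or assignment is simply treated as one opaque event.

```python
def to_te(stmt):
    """Translate a tiny, hypothetical statement tree into a trace-expression string.

    Statement encoding (an assumption of this sketch):
      ("wait", label) / ("assign", label)  -> a single event
      ("begin", [stmts])                   -> sequencing, written ";"
      ("fork", [stmts])                    -> concurrency, written "||"
      ("if", then_stmt, else_stmt)         -> choice, written "+" here
    """
    kind = stmt[0]
    if kind in ("wait", "assign"):
        return stmt[1]
    if kind == "if":
        return "(" + to_te(stmt[1]) + " + " + to_te(stmt[2]) + ")"
    op = {"begin": " ; ", "fork": " || "}[kind]
    return "(" + op.join(to_te(s) for s in stmt[1]) + ")"


# wait(a); then c and d assigned concurrently; then wait(!a)
body = ("begin", [("wait", "a"),
                  ("fork", [("assign", "c"), ("assign", "d")]),
                  ("wait", "!a")])
print(to_te(body))
```

Because the input is a tree of nested statements, the output is always a well-structured mix of the three operators, which matches the restriction stated on the slide.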

Control-data partitioning: splitting of asynchronous control and synchronous data path; automated insertion of bundling delays. (Figure labels: CONTROL UNIT, DATA PATH, delay, request, acknowledge.)

Outline: Motivation; Design flow; Verilog HDL specification; Petri nets and trace expressions (current section); Synthesis; Conclusion

Controller design flow (figure labels): HDL; syntax-directed translation; Petri net; transformations; reductions; PNTE; synthesis; circuit.

Design flow (figure labels): PNTE; transformations; cost estimation; performance estimation; area estimation; critical cycles; Boolean equations; structural synthesis.

PNTE: a free-choice Petri net whose transitions are trace expressions. Trace expressions represent well-structured event relations: causality, concurrency, choice.

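Trace expressions (TE): a TE is an event e, a sequence TE ; TE, a parallel composition TE || TE, or a choice between two TEs (the choice operator symbol did not survive extraction). Trace expressions are a subset of CCS agent expressions [Milner 80].

Trace expressions: example. ( a || ( b ; c ) ) || ( d e ), where the operator between d and e was lost in extraction. (Figure: the corresponding expression tree over events a, b, c, d, e.)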
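The grammar above can be pictured as a small binary tree type. The following Python sketch is only one possible encoding: events are plain strings, operators are 3-tuples, and the helper names seq, par, choice, show and events are invented for this example. The "+" used for choice follows CCS notation and is an assumption, since the slide's own symbol is not recoverable.

```python
def seq(left, right):
    """Sequential composition, TE ; TE."""
    return (";", left, right)


def par(left, right):
    """Parallel composition, TE || TE."""
    return ("||", left, right)


def choice(left, right):
    """Choice between two TEs (written "+" here by assumption)."""
    return ("+", left, right)


def show(te):
    """Render a trace expression with explicit parentheses."""
    if isinstance(te, str):
        return te
    kind, left, right = te
    return f"({show(left)} {kind} {show(right)})"


def events(te):
    """List the events (leaves) of a trace expression, left to right."""
    if isinstance(te, str):
        return [te]
    _, left, right = te
    return events(left) + events(right)


# First part of the example slide: a || (b ; c)
example = par("a", seq("b", "c"))
print(show(example), events(example))
```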

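From PN to PNTE: reductions simplify the net structure. Computing concurrency relations takes O(n^2) in trace expressions and O(n^3) in free-choice systems [Kovalyov & Esparza].

Reductions (figure): two transitions TE1 and TE2 in sequence are merged into a single transition labelled TE1 ; TE2.

Reductions (figure): two concurrent transitions TE1 and TE2 are merged into a single transition labelled TE1 || TE2.

Example (figure): a Petri net over events a, b, c, d, e, f, g, h is reduced to a PNTE whose transition labels include a ; ( b || f ) and g ; h. The remaining labels are garbled in extraction.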
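One way to see why the concurrency relation is cheap to obtain from a trace expression: two events are concurrent exactly when the node where their paths meet is a parallel operator, so a single bottom-up pass that pairs up the leaves under each || node suffices, and the work is bounded by the number of event pairs. The Python sketch below assumes that reading, assumes unique event names, and uses a hypothetical tuple encoding (operator, left, right); it is an illustration, not the algorithm from the cited work.

```python
def concurrent_pairs(te):
    """Return (events, pairs) for a trace expression.

    `te` is either an event name (str) or a tuple (op, left, right) with
    op in {";", "||", "+"}. `pairs` holds frozensets {x, y} of events whose
    lowest common ancestor in the tree is a "||" node.
    """
    if isinstance(te, str):
        return [te], set()
    kind, left, right = te
    left_events, left_pairs = concurrent_pairs(left)
    right_events, right_pairs = concurrent_pairs(right)
    pairs = left_pairs | right_pairs
    if kind == "||":
        # Every event on the left is concurrent with every event on the right.
        pairs |= {frozenset((x, y)) for x in left_events for y in right_events}
    return left_events + right_events, pairs


# Hypothetical expression: a ; ((b || f) ; (c ; d))
te = (";", "a", (";", ("||", "b", "f"), (";", "c", "d")))
all_events, pairs = concurrent_pairs(te)
print(sorted(tuple(sorted(p)) for p in pairs))
```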

Outline: Motivation; Design flow; Verilog HDL specification; Petri nets and trace expressions; Synthesis (current section); Conclusion

Exploration of the design space. Kit of transformations at the Petri net level: concurrency reduction, increase of concurrency, event hiding. Fast cost estimation: area (Boolean equations), performance (critical cycles).

Transformations at the net level: concurrency reduction. Before the transformation, f and b are concurrent; after it, f and b are ordered. (Figures: Petri net over transitions a, f, b, c, d, before and after.)

Transformations at the net level: concurrency reduction in TE. In a trace expression, b and f are concurrent because they have a common parallel ancestor (a || node) in the expression tree. Concurrency reduction replaces that parallelizer with a sequencer (;). (Figures: Petri net and expression tree over a, b, c, d, f, before and after.)
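The tree rewrite described here, swapping a parallelizer for a sequencer, can be sketched in a few lines of Python. The tuple encoding (operator, left, right), the function names, and the choice of which event goes first are assumptions of this example; the slides do not say which order the authors' tool picks.

```python
def leaves(te):
    """Events of a trace expression, left to right."""
    return [te] if isinstance(te, str) else leaves(te[1]) + leaves(te[2])


def sequentialize(te, first, second):
    """Order `first` before `second` by turning their common "||" node into ";".

    If the two events do not meet at a "||" node, the tree is returned unchanged.
    """
    if isinstance(te, str):
        return te
    kind, left, right = te
    if kind == "||":
        in_left, in_right = leaves(left), leaves(right)
        if first in in_left and second in in_right:
            return (";", left, right)
        if first in in_right and second in in_left:
            return (";", right, left)
    return (kind,
            sequentialize(left, first, second),
            sequentialize(right, first, second))


# Hypothetical expression: a ; ((b || f) ; (c ; d))
te = (";", "a", (";", ("||", "b", "f"), (";", "c", "d")))
reduced = sequentialize(te, "f", "b")
print(reduced)
```

Note that whole subtrees are ordered, not just the two named events: everything under the side containing `first` now precedes everything under the other side.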

Transformations at the net level: increase of concurrency. Before the transformation, c is ordered with f and b; after it, c, f and b are concurrent. (Figures: Petri net over transitions a, f, b, c, d, before and after.)

Transformations at the net level: increase of concurrency in TE. The transformation is carried out by reorganizing the subtree. (Figure sequence: expression tree over a, b, c, d, f, in which c is moved under the parallel node alongside b and f.)

Transformations at the net level: event hiding. Hiding of b; after the transformation, b is hidden. (Figures: Petri net over transitions a, f, b, c, d, before and after.)

Transformations at the net level: event hiding in TE. Delete the corresponding leaf, then simplify the tree structure. (Figure sequence: expression tree over a, b, c, d, f; leaf b is deleted, and the parallel node left with the single child f is replaced by f.)
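The two steps named on the slide, deleting the leaf and then simplifying the tree, fit naturally into one recursive pass. The Python sketch below uses an assumed tuple encoding (operator, left, right) with events as strings; the function name hide is invented for this example.

```python
def hide(te, event):
    """Remove leaf `event` from a trace expression and simplify.

    Returns the simplified expression, or None if nothing is left.
    An operator that loses one operand collapses to its remaining operand.
    """
    if isinstance(te, str):
        return None if te == event else te
    kind, left, right = te
    left, right = hide(left, event), hide(right, event)
    if left is None:
        return right
    if right is None:
        return left
    return (kind, left, right)


# Hypothetical expression: a ; ((b || f) ; (c ; d))
te = (";", "a", (";", ("||", "b", "f"), (";", "c", "d")))
print(hide(te, "b"))
```

Hiding b leaves the || node with a single child, so the node disappears and f takes its place, which mirrors the last step of the figure sequence.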

Synthesis of control logic for large-scale controllers: direct translation from Petri net (or STG-h/s-refined) specifications; logic synthesis from fully refined STGs with pseudo-one-hot encoding, structural techniques and STG-level optimisations.

Why direct translation? Logic synthesis has problems with state space explosion and with repetitive and regular structures (log-based encoding approach). Direct translation has linear complexity but can be area inefficient (inherent one-hot encoding). What about performance?

Shifter example (x:=y; y:=a)* [Bystrov et al., 6th UK Async Forum, ’99]
Control logic option | Speed (ns)
Refined STG directly synthesized by Petrify | 5.4
Circuit decomposition with two D-elements | 4.2
Circuit decomposition and Petrify re-synthesis | 3.3
Re-synthesis with relative timing | 1.7

Direct translation of Petri nets. Previous work dates back to the 70s. Synthesis into event-based (2-phase) circuits (similar to micropipeline control): S. Patil, F. Furtek (MIT). Synthesis into level-based (4-phase) circuits (similar to synthesis from one-hot encoded FSMs): R. David (’69, translation of FSM graphs to CUSA cells); L. Hollaar (’82, translation from parallel flowcharts); V. Varshavsky et al. (’90, ’96, translation from PN into an interconnection of David Cells).

David’s original approach (figure: fragment of a flow graph with states a, b, c, d, and the CUSA cell storing state b, with signals x1, x2, x'1, x'2, ya, yb, yc).

Hollaar’s approach (figure sequence: fragment of a flow-chart with nodes K, L, M, N, and the one-hot circuit cell with signals A and B, shown in successive states).

Varshavsky’s approach (figure sequence: places p1 and p2, the corresponding circuit cells, and the signal values as the token moves from p1 to p2; labels: Operation, Controlled, To Operation).

Translation in brief: this method has been used to design the control of a token ring adaptor [Yakovlev et al., Async. Design Methods, 1995]. The control comprised about 80 David Cells with 50 controlled handshakes.

Direct translation examples. In this work we tried direct translation: from an STG-refined specification (VME bus controller), with results worse than logic synthesis; from a largish abstract specification with a high degree of repetition (mod-6 counter), with a considerable gain over logic synthesis; from a small concurrent specification with dense coding space (“butterfly” circuit), with results similar to or better than logic synthesis.

Example 1: VME bus controller. Result of direct translation, DC unoptimised (figure).

VME bus controller after DC optimisation, in the style of Varshavsky et al., WODES’96 (figure).

David Cell library (figure).

“Data path” control logic: example of an interface with handshake control (DTACK, DSR/DSW) (figure).

Ex 2: “Flat” mod-6 counter. TE-like specification: ((p?;q!)^5 ; p?; c!)*. (Figure: 5-safe Petri net with transitions p?, q!, c! and two arcs labelled 5.)

“Flat” mod-6 counter: Petri net refined by hand and optimised by Petrify (figure).

“Flat” mod-6 counter: result of direct translation, optimised by hand (figure).

David Cells and timed circuits: (a) speed-independent, (b) with relative timing (figure).

“Flat” mod-6 counter: (a) speed-independent, (b) with relative timing (figure).

“Butterfly” circuit. Initial specification: STG over a+, a-, b+, b- and a dummy transition. STG after CSC resolution: a+, a-, b+, b-, x+, x-, y+, y-, z+, z- (figures).

“Butterfly” circuit: speed-independent logic synthesis solution (figure).

“Butterfly” circuit: speed-independent DC-circuit (figure).

“Butterfly” circuit: DC-circuit with aggressive relative timing (figure).
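To make the TE-like specification concrete: five times in a row an input p? is answered by q!, then a sixth p? is answered by c!, and the whole pattern repeats. The short Python sketch below models only this input/output behaviour; the generator name is invented here, and nothing about the circuit implementation is implied.

```python
from itertools import islice


def mod6_outputs():
    """Yield the output event that answers each successive p? input, forever."""
    while True:                # the outer ( ... )* repetition
        for _ in range(5):     # (p? ; q!) repeated five times
            yield "q!"
        yield "c!"             # the sixth p? is answered by c!


first_twelve = list(islice(mod6_outputs(), 12))
print(first_twelve)
```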

Comparison with logic synthesis
Example | Logic synthesis | DC-translation
VME-bus (overall operation cycle) | 6ns | 11ns
Mod-6 count (p->q/c, worst case cycle) | >5ns | 1.6ns
Butterfly (with RT, operation cycle) | 2ns | 1.8ns

DC control with Relative Timing (figure: a David cell, DC, between operations op1 and op2)
David Cell type | Token shift time
Speed-independent | 1.2ns
Mild RT (fast bkwd reset) | 0.8ns
Aggressive RT (fast fwd set) | 0.4ns

Synthesis: encoding based on a David-cell approach; transformations to improve area and performance; structural methods to derive a circuit [Pastor et al., Transactions on CAD, Nov’98].

Synthesis (figure: STG with transitions x+, z+, z-, y-, x-, y+ and places p1 to p7). Next-state function of signal y? y = x + z

Synthesis example: VME bus. (Figure labels: VME Bus Controller, Device, Data Transceiver, Bus; signals LDS, LDTACK, D, DSr, DSw, DTACK. Read cycle waveforms: DSr, LDS, LDTACK, D, DTACK.)

Synthesis example: VME bus. Read cycle specification: STG over LDTACK+, D+, DTACK+, DSr-, D-, DTACK-, LDS-, LDTACK-, DSr+, LDS+. After Petrify (optimizing performance), the STG additionally contains csc0+ and csc0-. (Figures.)

Synthesis example: VME bus. (Figure sequence: the read-cycle STG encoded with place signals p1 to p11, with transitions p1+ to p11+ and p1- to p11-, then simplified to an STG over ldtack, lds, d, dtack, dsr in which only p9+ and p9- remain.)

Cost estimation heuristics. AREA: Σ { # literals in each excitation region }. PERFORMANCE: length of the critical cycle in the net. Exploration of the design space is guided by cost estimations.

Performance estimation: critical cycles, via marked-graph decomposition. (Figure: a net over transitions a to k and one of its marked-graph components.)
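As a toy illustration of the performance heuristic, the Python sketch below enumerates the simple cycles of a small transition graph and picks the one with the largest total delay. Several things here are assumptions rather than the authors' method: "length" is taken to be the sum of per-transition delays, the delays and the graph are made up, token counts are ignored, and brute-force enumeration is used only because the example is tiny.

```python
def simple_cycles(graph):
    """Enumerate simple cycles of a small directed graph {node: [successors]}.

    Each cycle is reported once, starting from its smallest node.
    """
    cycles = []

    def dfs(start, node, path):
        for nxt in graph[node]:
            if nxt == start:
                cycles.append(path[:])
            elif nxt > start and nxt not in path:
                dfs(start, nxt, path + [nxt])

    for start in sorted(graph):
        dfs(start, start, [start])
    return cycles


def critical_cycle(graph, delay):
    """Return the cycle with the largest summed delay (assumed cost measure)."""
    return max(simple_cycles(graph), key=lambda c: sum(delay[t] for t in c))


# Hypothetical net: a forks to b and c, which join at d, which returns to a.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["a"]}
delay = {"a": 1, "b": 2, "c": 5, "d": 1}   # made-up delays
worst = critical_cycle(graph, delay)
print(worst, sum(delay[t] for t in worst))
```

A design-space exploration loop would re-evaluate such an estimate after each transformation and keep the transformation only if the estimated cost improves.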

Conclusions: fully automated design flow
–From HDLs (control / data splitting)
–Existing tools for data-path synthesis
–Direct synthesis guarantees implementation (HDL → Petri net, Petri-net-based encoding)
–Synthesis of large controllers by efficient spec models (free-choice Petri nets + trace expressions)
–Exploration of the design space (optimization) by property-preserving transformations
–Logic synthesis by structural methods