Published by Andrew Newman. Modified over 8 years ago.
1 Advanced Digital Design. Asynchronous Design: FSL, by A. Steininger and M. Delvai, Vienna University of Technology
2 Outline: Introduction; Principles; Basic gates; Design flow and tools; Circuit design with FSL (Pipeline, Data Paths); Conclusion; Research plans
3 Fundamental Design Problem: ensure a lossless data flow from the source (SRC) through f(x) to the sink (SNK). Capture condition: only valid and consistent data may be consumed. Issue condition: new data can be issued only when the previous data has already been consumed.
4 Synchronous Approach: global time reference => indirect conclusion (data validity is inferred from elapsed time rather than signalled). [Figure: SRC, f(x), SNK; labels A, T, Clk]
5 Asynchronous Circuits: local handshake protocol (Request, Acknowledge) between SRC and SNK around f(x)
6 Four State Logic: SRC, f(x), SNK with delay ∆t => delay insensitive => additional information required => SNK must be able to recognize when data is valid and consistent. [Figure annotations: 1x „1“ | 2x „0“; 3x „1“ | 2x „0“; 2x „1“ | 1x „0“]
7 FSL encoding: use 2 codes per logic value (L and l for low, H and h for high) => two-rail coding is needed, with rails X.a and X.b for each signal X. A direct change L => H, i.e. (0,0) => (1,1), would pass through (1,0) or (0,1).
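The encoding can be illustrated with a small Python sketch. It assumes the (a,b) code table shown on the later φ-Converter and φ-Detector slides (H = (1,1), h = (1,0), L = (0,0), l = (0,1)); the helper names `value`, `phase` and `rails_changed` are hypothetical and not part of the authors' tooling.

```python
# Assumed code table (rail a, rail b), taken from the phi-Converter/phi-Detector slides.
CODES = {"H": (1, 1), "h": (1, 0), "L": (0, 0), "l": (0, 1)}

def value(code):
    """Logic value carried by a code: in this table it equals rail a."""
    return code[0]

def phase(code):
    """Phase of a code: XOR of the two rails (0 for H/L, 1 for h/l)."""
    a, b = code
    return a ^ b

def rails_changed(old, new):
    """Number of rails that differ between two codes."""
    return sum(p != q for p, q in zip(old, new))

# Every change between the two phases flips exactly one rail,
# whereas a direct L -> H change would flip both rails.
for src in ("H", "L"):
    for dst in ("h", "l"):
        print(src, "->", dst, "rails changed:", rails_changed(CODES[src], CODES[dst]))
print("L -> H rails changed:", rails_changed(CODES["L"], CODES["H"]))
```

Under this assumed table, alternating the phase between consecutive data words means that no intermediate code can be mistaken for valid data.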
8 Completion Detection: a completion detector (CMPD) between SRC and SNK indicates consistent data. [Figure labels: HlLhHhLl]
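A completion detector can be sketched as a check that all signals of a vector are in the same phase. This is a minimal behavioural model, assuming the phase of an (a,b) code is the XOR of its rails as on the φ-Detector slide; `vector_phase` is a hypothetical name.

```python
def vector_phase(vector):
    """Return the common phase of a vector of (a, b) codes, or None if inconsistent."""
    phases = {a ^ b for a, b in vector}
    return phases.pop() if len(phases) == 1 else None

# H, L, H are all in phase 0; h, l are both in phase 1; H with l is mixed.
print(vector_phase([(1, 1), (0, 0), (1, 1)]))
print(vector_phase([(1, 0), (0, 1)]))
print(vector_phase([(1, 1), (0, 1)]))
```

A sink would capture the vector only when the result is not None and differs from the phase of the previously captured vector.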
9 Phase Transition: with inputs alternating between φ0 and φ1 (φ0, φ1, φ0, φ1) we have to ensure that inconsistent input vectors are not processed and that f(x) is a monotonic function
10 Combinational Functions. Variant 1: hazard-free implementation of f(x) with a consistency detector (CD); processing of inconsistent inputs is unavoidable due to internal skew. Variant 2: each basic gate processes only consistent inputs, and the function of each basic gate is monotonic; local intelligence => hardware overhead
11 (Combinational) Basic Gates: And; Or; Inv; (MUX); (XOR); …
Truth table FSL-AND (inputs E1, E2; output Y; * = keep old value):
  E1 \ E2 |  H   L   h   l
  H       |  H   L   *   *
  L       |  L   L   *   *
  h       |  *   *   h   l
  l       |  *   *   l   l
Upper-left block: inputs consistent in φ0. Lower-right block: inputs consistent in φ1.
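The truth table can be modelled behaviourally as follows. This is a sketch, not the authors' gate: it assumes the (a,b) code table from the φ-Converter slide and represents "keep old value" by a stored output; the class name `FslAnd` and its helpers are hypothetical.

```python
H, h, L, l = (1, 1), (1, 0), (0, 0), (0, 1)  # assumed (rail a, rail b) codes

def phase(code):
    return code[0] ^ code[1]

def make(value, phi):
    """Build the code for a logic value in the requested phase."""
    return (value, value ^ phi)

class FslAnd:
    """AND gate that only updates when both inputs are in the same phase."""

    def __init__(self, initial=L):
        self.y = initial

    def update(self, e1, e2):
        if phase(e1) == phase(e2):            # consistent inputs: compute
            self.y = make(e1[0] & e2[0], phase(e1))
        return self.y                         # otherwise: keep old value

gate = FslAnd()
for e1, e2 in [(H, L), (h, L), (h, h), (H, h), (h, l)]:
    print(e1, e2, "->", gate.update(e1, e2))
```

The stored output is what makes even a combinational FSL gate need a memory element.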
12 FSL Gate Implementation: separation of control and data path => not delay insensitive. [Figure: inputs E1 (E1.a, E1.b) and E2 (E2.a, E2.b); output Y (Y.a, Y.b); blocks f_a(x), f_b(x), Mem, C, D; signal enable]
13 Further FSL Gates: φ-Inverter; φ-Detector; φ-Converter; Register/Latch; Memories
14 φ-Inverter => simple inversion of rail b. [Figure: rail a, rail b; HIGH: H (1,0), h (1,1); LOW: L (0,1), l (0,0); 1 (a,b), 0 (a,b)]
15 φ-Converter. Std → FSL: rail a = Std signal, rail b = Std signal XOR requested φ. FSL → Std: the standard signal is taken from the FSL rails. Encoding (a,b): HIGH (1) = H (1,1), h (1,0); LOW (0) = L (0,0), l (0,1)
16 φ-Detector: XOR of rail a and rail b. XOR = ‘0‘ for H (1,1) and L (0,0); XOR = ‘1‘ for h (1,0) and l (0,1)
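The three helper gates can be written as one-line functions. This sketch assumes the encoding H = (1,1), h = (1,0), L = (0,0), l = (0,1) from the converter and detector slides, under which rail a carries the logic value; the function names are hypothetical.

```python
def phi_detect(code):
    """phi-Detector: XOR of the two rails gives the phase."""
    a, b = code
    return a ^ b

def phi_invert(code):
    """phi-Inverter: inverting rail b flips the phase and keeps the value."""
    a, b = code
    return (a, b ^ 1)

def std_to_fsl(bit, requested_phi):
    """phi-Converter, Std -> FSL: rail a = bit, rail b = bit XOR requested phase."""
    return (bit, bit ^ requested_phi)

def fsl_to_std(code):
    """phi-Converter, FSL -> Std: rail a carries the value in the assumed encoding."""
    return code[0]

for bit in (0, 1):
    for phi in (0, 1):
        code = std_to_fsl(bit, phi)
        print(bit, phi, code, phi_detect(code), phi_invert(code), fsl_to_std(code))
```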
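17 FSL Register: the control block (Ctrl) lets the latches pass when the input data is consistent and valid and the output data has already been consumed; once the input data has been consumed, it freezes the latches again. [Figure: Ctrl, Latch, data in, data out; signals φ in, c-done, pass]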
18 Memory: a standard RAM (STD_Logic) is embedded in FSL_Logic via converters (CONV) and a φ-detector. Alternative: store entire data in φ0 and φ1 => huge overhead (4x)
19 Design Flow and Tools. Requirements: standard tools (Synopsys/Quartus); modelling on RTL level; support for simulation and synthesis; target platform FPGA
20 Adaptation: VHDL. Definition of an FSL_logic type; redefinition of the std_1164 package; additional functions: φ_det, φ_inv, conversion functions, stable => modelling FSL circuits on RTL level
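21 Example: Program Counter
  -- signals passed to stable (additional FSL function, see slide 20)
  stable_signals <= AddrInc_conv & JmpExe & JmpAddr;

  pc_next_SM: process
  begin
    stable(stable_signals);
    if JmpExe = JMP_EXE_I1 or JmpExe = JMP_EXE_I0 then
      AddrNxt <= JmpAddr;       -- jump: take the jump address
    else
      AddrNxt <= AddrInc_conv;  -- otherwise: take the incremented address
    end if;
  end process pc_next_SM;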
22 Adaptation: Synthesis (1). FSL target library: FSL AND, FSL OR, FSL INV, FSL Register, φ-Detector, … Synthesis with the FSL target library yields a netlist. Package FSL_Rail: definition of FSL_rail_logic as a record (a,b), with FSL AND, FSL OR, FSL INV, FSL Register, φ-Detector, … Netlist: replace FSL with FSL_rail. Synthesis with the FPGA target library. Example: L & H becomes (0,0) & (1,1)
23 Adaptation: Synthesis (2). [Figure: conventional design flow versus FSL design flow]
24 Adaptation: Simulation. Same testbench for FSL_logic and FSL_rail_logic circuits => verification of FSL circuits. [Figure: testbench with FSL stimuli and FSL response; FSL logic; FSL rail logic with conversion]
25 Outline: √ Introduction; √ Principles; √ Basic gates; √ Design flow and tools; Circuit design (Pipeline, Data Paths, Modeling complex circuits, Open points); Conclusion
26 (Linear) Pipeline: latches K1, K2, K3, K4 with f(x) between stages. Full initialized: phases 0 1 0 1 0. Empty initialized: phases 0 0 0 0 0
27 Bubble Concept (1): progress is possible when a circuit contains at least one bubble (bubble: identical values in adjacent stages). [Figure: stages K1 to K4 with phases 0 1 0 1 0; stages K1 to K3 with phases 1 1 0 1 0]
28 Bubble Concept (2): initialization => ensure that the circuit contains at least one bubble; more than one bubble => higher processing speed. [Figure: stages K1 to K7 with phases 1 0 1 0 0 1 1 0]
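The bubble concept can be made concrete with a small phase-level model of a linear pipeline. The firing rule is an assumption derived from the FSL register slide: a stage captures when its predecessor holds a different phase (new data) and its successor holds the same phase as itself (old data already consumed). Source and sink are not modelled, so only inner stages fire; `enabled_stages` and `fire` are hypothetical names.

```python
def enabled_stages(phases):
    """Indices of inner stages that may capture their predecessor's data."""
    ready = []
    for k in range(1, len(phases) - 1):
        new_data = phases[k - 1] != phases[k]   # predecessor offers new data
        consumed = phases[k + 1] == phases[k]   # successor already holds our data
        if new_data and consumed:
            ready.append(k)
    return ready

def fire(phases, k):
    """Return the phase list after stage k captured its predecessor's data."""
    nxt = list(phases)
    nxt[k] = phases[k - 1]
    return nxt

full = [0, 1, 0, 1, 0]            # alternating phases: no bubble
one_bubble = [0, 1, 1, 0, 1]      # stages 1 and 2 hold identical phases
two_bubbles = [0, 1, 1, 0, 0, 1]

print(enabled_stages(full))
print(enabled_stages(one_bubble), fire(one_bubble, 1))
print(enabled_stages(two_bubbles))
```

In this model the bubble moves backwards as data moves forwards, and two bubbles allow two stages to fire concurrently, which matches the statement that more bubbles give higher processing speed.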
29 Non-linear Pipeline: Forward Path (1). [Figure: SRC, stages K1, K2, K3, SNK, Request signal, φ-inv, with phase annotations]
30 Non-linear Pipeline: Forward Path (2). [Figure: stages K1, K2, K3, operation K3, φ-inv; phase annotations for steps 0 to 4]
31 Non-linear Pipeline: Forward Path (2), continued. [Figure: stages K1, K2, K3, operation K3, φ-inv; phase annotations for step 5]
32 Non-linear Pipeline: Feedback Path (1). [Figure: stages K1, K2, K3, φ-inv, with phase annotations]
33 Non-linear Pipeline: Feedback Path (2). [Figure: stages K1, K2, K3, operation K3, φ-inv; phase annotations for steps 0 to 3]
34 First Conclusion: Non-linear Pipeline. Phase inverter placement: Forward path: eliminate inconsistent inputs in the initial state. Feedback path: generate inconsistent inputs in the initial state. => true when the pipeline is initialized full. [Figure: forward path and feedback path, each with stages K1, K2, K3 and φ-inv]
35 Empty Non-linear Pipeline: Forward Path (1): empty initialized => no phase inverter required. [Figure: stages K1, K2, K3, φ-inv, with phase annotations]
36 Empty Non-linear Pipeline: Forward Path (2) => different phase inverter placement for full and empty initialized circuits. [Figure: stages K1, K2, K3, operation K3; phase annotations for steps 1 to 4]
37 “Invalid” Feedback Path: φ-inv always required; no inversion of the request signal. Definition: a valid feedback path must contain at least two register nodes. [Figure: stage K1 with φ-inv]
38 Phase Inverter Placement: phase inverters are required to avoid deadlocks. Placement of phase inverters depends on: topology of the circuit; type and number of components inside valid feedbacks; dynamic behavior; initialization. Handshake signals have to be considered. Processing speed depends on initialization. More configurations are possible
39 Identification of Phase Inverters: Full Initialised Circuits. Systematic approach:
  1. Identify combinational logic and registers/memories
  2. Generate a graph representation of the circuit based on registers/memories
  3. Apply a coloring algorithm
  4. Identify feedback paths (=> loops)
  5. Eliminate invalid feedbacks
  6. Add a phase inverter to each remaining loop (phase inverters can be shared)
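Steps 4 to 6 can be sketched on a small register graph. This is an illustration only: the graph is a made-up three-register example, the coloring of step 3 is not modelled, the sharing of inverters between loops is not attempted, and "invalid" is taken from the definition on the previous slide (fewer than two register nodes). The function names are hypothetical.

```python
def simple_cycles(graph):
    """Enumerate simple cycles of a small directed graph {node: [successors]}."""
    nodes = sorted(graph)
    index = {n: i for i, n in enumerate(nodes)}
    cycles = []

    def dfs(start, node, path):
        for nxt in graph[node]:
            if nxt == start:
                cycles.append(list(path))         # closed a loop back to start
            elif index[nxt] > index[start] and nxt not in path:
                dfs(start, nxt, path + [nxt])     # only visit "later" nodes: no duplicates

    for s in nodes:
        dfs(s, s, [s])
    return cycles

def loops_needing_inverter(graph):
    """Step 5 and 6: drop loops with fewer than two registers, keep the rest."""
    return [c for c in simple_cycles(graph) if len(c) >= 2]

# Hypothetical register graph: K1 -> K2 -> K3 -> K1, plus a self-loop on K1.
registers = {"K1": ["K1", "K2"], "K2": ["K3"], "K3": ["K1"]}
print(simple_cycles(registers))
print(loops_needing_inverter(registers))
```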
40 Data Paths. [Figure: registers (Reg) and f(x) blocks; deselection node (DEMUX, FORK); selection node (MUX, MERGE)]
41 Example: Merging Data Paths. Assumptions: Acknowledge is activated when the selected data is available; the two data paths have different delays (∆1, ∆3). Step 1: In1 selected. Step 2: In1 selected. Step 3: In2 selected. [Figure: Reg, f(x), MUX, Ack; inputs In1, In2; data words DW1 (0), DW2 (1), DW3 (0), DW4 (1)]
42 Example: Merging Data Paths (2). Depending on the circuit functionality: a) both inputs have to be consumed in each processing step: ensure that the difference in processing speed in all data paths is small enough; wait until all inputs are available (even the unused ones). b) all inputs have to be processed and consumed: insert synchronizer circuits to adjust the phase encoding of the input signals
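The "wait until all inputs are available" option can be sketched as a non-eager MUX that only switches when the select signal and both data inputs are in the same phase. This is a behavioural assumption, not the authors' circuit; it reuses the (a,b) encoding from the φ-Converter slide, and `NonEagerMux` is a hypothetical name.

```python
def phase(code):
    """Phase of an (a, b) code: XOR of the rails."""
    return code[0] ^ code[1]

class NonEagerMux:
    """Two-input MUX that waits for select and both inputs to share one phase."""

    def __init__(self, initial=(0, 0)):
        self.y = initial

    def update(self, sel, in1, in2):
        if phase(sel) == phase(in1) == phase(in2):
            self.y = in2 if sel[0] else in1   # rail a of sel carries the select value
        return self.y                         # otherwise: keep old output

mux = NonEagerMux()
print(mux.update(sel=(0, 0), in1=(1, 1), in2=(0, 0)))  # all in phase 0: In1 selected
print(mux.update(sel=(1, 0), in1=(0, 1), in2=(1, 1)))  # In2 still in phase 0: hold
print(mux.update(sel=(1, 0), in1=(0, 1), in2=(1, 0)))  # all in phase 1: In2 selected
```

Because the unused input is also awaited, both paths stay in step at the cost of running at the speed of the slower one.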
43 DEMUX Example. Avoid loss of synchronization: dummy data approach; synchronizer circuits. Performance considerations: wait on ack. of data paths; wait only on required ack. [Figure: Reg, f(x), DEMUX; data words DW1 (0), DW2 (1), DW3 (0), DW4 (1)]
44 ASPEAR: Asynchronous SPEAR
45 Our Current Status: semi-automated design flow (based on Synopsys)
46 Our Current Status: semi-automated design flow (based on Synopsys); theoretical investigations
47 Our Current Status: semi-automated design flow (based on Synopsys); theoretical investigations; working 16-bit processor (on FPGA platform, APEX 20KC)
48 Our Current Status: semi-automated design flow (based on Synopsys); working 16-bit processor (on FPGA platform); investigation of DI; experimental robustness assessment (fault injection: synchronous versus asynchronous design)
49 Conclusion. FSL (Four State Logic): delay insensitive logic; two representations of Low/High => dual rail encoding; even combinational gates require memory elements. Circuit design with FSL: non-linear pipeline: additional phase inverters required, placement depends on circuit topology, initialization, … Data paths: splitting: synchronizer circuits or dummy data required; merging: non-eager MUX or timing assumptions required