Advanced Digital Design. Asynchronous Design: FSL. By A. Steininger and M. Delvai, Vienna University of Technology.


2 Outline
- Introduction
- Principles
- Basic gates
- Design flow and tools
- Circuit design with FSL
- Pipeline
- Data Paths
- Conclusion
- Research plans

3 Fundamental Design Problem
Ensure a lossless data flow.
[Diagram: SRC, f(x), SNK, data word A.]
- Issue condition: new data can be issued only when the previous data has already been consumed.
- Capture condition: only valid and consistent data may be consumed.

4 Synchronous Approach
[Diagram: SRC, f(x), SNK, data word A, clock Clk with period T.]
Global time reference => indirect conclusion

5 Asynchronous Circuits
[Diagram: SRC, f(x), SNK, connected by Request and Acknowledge signals.]
Local handshake protocol

6 Four State Logic
[Diagram: SRC, f(x) with delay ∆t, SNK; data annotations: 1x "1" | 2x "0", 3x "1" | 2x "0", 2x "1" | 1x "0".]
=> delay insensitive
=> additional information required
=> SNK must be able to recognize when data is valid and consistent

7 FSL encoding
- Use 2 codes per logic value
- => need two-rail coding: signal X consists of rails X.a and X.b
- Four states: L, H, h, l
[Diagram: transition L => H, i.e. (0,0) => (1,1); the intermediate code is (1,0) or (0,1).]
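
To make the coding concrete, the following Python sketch models the four states as (a, b) rail pairs. The code table is an assumption taken from the later φ-converter and φ-detector slides (H = (1,1), h = (1,0), L = (0,0), l = (0,1)); the helper names are illustrative and do not come from the presentation.

```python
# Assumed code table (rail a, rail b), as listed on the phi-converter slide.
FSL_CODES = {"H": (1, 1), "h": (1, 0), "L": (0, 0), "l": (0, 1)}


def logic_value(state):
    """Logic level carried by a state: rail a (1 = HIGH, 0 = LOW)."""
    return FSL_CODES[state][0]


def phase(state):
    """Phase under the assumed table: 0 for H/L, 1 for h/l (a XOR b)."""
    a, b = FSL_CODES[state]
    return a ^ b


def rails_changed(old, new):
    """Number of rails that toggle when a signal goes from old to new."""
    return sum(x != y for x, y in zip(FSL_CODES[old], FSL_CODES[new]))
```

Under this table, a same-phase change such as L => H toggles both rails and can pass through (1,0) or (0,1), which are themselves valid codes. A change between the two phases toggles exactly one rail, so no ambiguous intermediate code appears.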

8 Completion Detection
[Diagram: SRC and SNK with a completion detection block (CMPD) that signals consistent data; states shown: H l L h H h L l.]
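
A completion detector can be sketched in Python as a check that all signals of a data word carry the same phase. This is a behavioural model only, assuming H and L belong to phase 0 and h and l to phase 1; the function names are hypothetical.

```python
# Assumed phase of each state: H/L are phase 0, h/l are phase 1.
PHASE = {"H": 0, "L": 0, "h": 1, "l": 1}


def word_phase(word):
    """Return the common phase of all signals in word, or None if mixed."""
    phases = {PHASE[state] for state in word}
    return phases.pop() if len(phases) == 1 else None


def is_complete(word, previous_phase):
    """True once every signal has left previous_phase and all agree."""
    current = word_phase(word)
    return current is not None and current != previous_phase
```

For example, while a word moves from phase 0 to phase 1, an intermediate value such as "hLh" is reported as not complete, and only "hlh" is accepted by the sink.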

9 Phase Transition
[Diagram: f(x) with inputs and outputs alternating between φ0 and φ1.]
We have to ensure that:
- inconsistent input vectors are not processed
- f(x) is a monotonic function

10 Combinational Functions
Variant 1:
- Hazard-free implementation of f(x)
- Consistency detector (CD)
- Processing of inconsistent inputs is unavoidable due to internal skew
Variant 2:
- Each basic gate processes only consistent inputs
- The function of each basic gate is monotonic
- Local intelligence => hardware overhead

11 (Combinational) Basic Gates
- And
- Or
- Inv
- (MUX)
- (XOR)
- …
Truth table FSL-AND (inputs E1, E2, output Y):

    E1 \ E2 | H  L  h  l
    H       | H  L  *  *
    L       | L  L  *  *
    h       | *  *  h  l
    l       | *  *  l  l

* = keep old value
Upper left block: inputs consistent in φ0. Lower right block: inputs consistent in φ1.
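
The FSL-AND truth table can be expressed as a small behavioural model in Python. This is a sketch under the assumption that the code table from the φ-converter slide applies (H = (1,1), h = (1,0), L = (0,0), l = (0,1)); `fsl_and` and its helpers are illustrative names, not the authors' library cells.

```python
# Assumed code table (rail a, rail b); phase = a XOR b under this table.
FSL_CODES = {"H": (1, 1), "h": (1, 0), "L": (0, 0), "l": (0, 1)}
STATE_OF = {rails: name for name, rails in FSL_CODES.items()}


def phase(state):
    a, b = FSL_CODES[state]
    return a ^ b


def fsl_and(e1, e2, old_y):
    """FSL AND: compute only when both inputs are in the same phase."""
    if phase(e1) != phase(e2):
        return old_y  # '*' entries of the truth table: keep old value
    value = FSL_CODES[e1][0] & FSL_CODES[e2][0]  # AND of the logic levels
    return STATE_OF[(value, value ^ phase(e1))]  # re-encode in input phase
```

The `old_y` argument plays the role of the memory element: whenever the inputs disagree in phase, the gate simply returns what it already held.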

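12 FSL Gate Implementation
[Diagram: inputs E1 (rails E1.a, E1.b) and E2 (rails E2.a, E2.b); functions f_a(x) and f_b(x); block CD driving the enable of a memory element (Mem); output Y (rails Y.a, Y.b).]
Separation of control and data path => not delay insensitive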

13 Further FSL Gates
- φ-Inverter
- φ-Detector
- φ-Converter
- Register/Latch
- Memories

14 φ-Inverter
=> simple inversion of rail b
[Diagram: rail a and rail b. Code table (a,b) with columns labeled φ1 and φ0: HIGH: H (1,0), h (1,1); LOW: L (0,1), l (0,0).]

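15 φ-Converter
Std → FSL: [Diagram: the Std signal and the requested φ are combined by an XOR to drive rail a and rail b.]
FSL → Std: [Diagram: rail a and rail b of the FSL signal are reduced to one Std signal.]
Std logic: 1 = HIGH, 0 = LOW
FSL logic, code table (a,b) with columns labeled φ1 and φ0: HIGH: H (1,1), h (1,0); LOW: L (0,0), l (0,1)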

16 φ-Detector
[Diagram: XOR of the two rails yields '0' in one phase and '1' in the other.]
Code table (a,b) with columns labeled φ1 and φ0: HIGH: H (1,1), h (1,0); LOW: L (0,0), l (0,1)
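
The three φ gates above can be modelled as one-line functions on rail pairs. The sketch below is in Python and assumes the code table H = (1,1), h = (1,0), L = (0,0), l = (0,1), with rail a carrying the logic level; the function names are illustrative.

```python
def phi_detect(a, b):
    """phi-detector: XOR of the two rails gives the phase (0 or 1)."""
    return a ^ b


def phi_invert(a, b):
    """phi-inverter: invert rail b only, which flips the phase."""
    return (a, 1 - b)


def std_to_fsl(bit, requested_phase):
    """phi-converter, Std -> FSL: rail b is the bit XOR the requested phase."""
    return (bit, bit ^ requested_phase)


def fsl_to_std(a, b):
    """phi-converter, FSL -> Std: under the assumed table rail a is the level."""
    return a
```

Because only rail b changes in `phi_invert`, the logic level on rail a is preserved while the detected phase toggles.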

17 FSL Register
[Diagram: latch between data in and data out; control block (Ctrl) with signals φ in, c-done and pass.]
- Input data consistent and valid, output data already consumed => pass the input data
- Input data consumed => freeze the latches again
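
The register control can be sketched as follows in Python. This is a hedged behavioural model, not the authors' circuit: it assumes that "valid" means the input phase differs from the stored phase, and that "already consumed" means the consumer reports the same phase as the stored word. `FslRegister` and `common_phase` are hypothetical names.

```python
# Assumed phase of each state: H/L are phase 0, h/l are phase 1.
PHASE = {"H": 0, "L": 0, "h": 1, "l": 1}


def common_phase(word):
    """Common phase of all signals in word, or None if inconsistent."""
    phases = {PHASE[state] for state in word}
    return phases.pop() if len(phases) == 1 else None


class FslRegister:
    def __init__(self, word):
        self.word = word  # stored word, all signals in one phase

    def update(self, data_in, consumer_phase):
        """Pass data_in through the latches if allowed; return True if so."""
        stored = common_phase(self.word)
        incoming = common_phase(data_in)
        valid = incoming is not None and incoming != stored
        consumed = consumer_phase == stored
        if valid and consumed:
            self.word = data_in  # latches transparent for this update
            return True
        return False  # latches stay frozen
```

A register initialized with "LL" accepts "hl" once its consumer also shows phase 0, and then refuses further input until the consumer has moved on to phase 1.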

18 Memory
[Diagram: standard RAM with a φ-detector (φ-det) and converters (CONV) between FSL_Logic and STD_Logic on its ports.]
Alternative: store entire data in φ0 and φ1 => huge overhead (4x)
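
One way to read the memory slide is that the RAM itself stores plain bits, while converters strip and restore the phase at its ports. The Python sketch below follows that reading. It assumes rail a carries the logic level, phase = a XOR b, and that read data is returned in the phase of the address word; the last point in particular is an assumption, and all names are illustrative.

```python
def phi_detect(word):
    """Common phase of a word of (a, b) rail pairs; error if inconsistent."""
    phases = {a ^ b for a, b in word}
    if len(phases) != 1:
        raise ValueError("inconsistent word")
    return phases.pop()


def fsl_to_std(word):
    """Converter FSL -> Std: keep rail a of every signal."""
    return [a for a, _ in word]


def std_to_fsl(bits, phase):
    """Converter Std -> FSL: encode every bit in the requested phase."""
    return [(bit, bit ^ phase) for bit in bits]


class FslMemory:
    def __init__(self):
        self.ram = {}  # stands in for the standard RAM (plain bits only)

    def write(self, address, data):
        self.ram[tuple(fsl_to_std(address))] = fsl_to_std(data)

    def read(self, address):
        phase = phi_detect(address)  # assumed: output follows address phase
        return std_to_fsl(self.ram[tuple(fsl_to_std(address))], phase)
```

Storing only the standard-logic bits keeps the RAM at its normal size, in contrast to the alternative on the slide of storing every word in both phases.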

19 Design Flow and Tools
Requirements:
- Standard tools (Synopsys/Quartus)
- Modelling on RTL level
- Support for simulation and synthesis
- Target platform FPGA

20 Adaptation: VHDL
- Definition of an FSL_logic type
- Redefinition of the std_1164 package
- Additional functions
  - φ_det, φ_inv, conversion functions
  - stable
=> Modelling FSL circuits on RTL level

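21 Example: Program Counter

    -- Signals that are passed to stable() before the next address is selected
    stable_signals <= AddrInc_conv & JmpExe & JmpAddr;

    pc_next_SM: process
    begin
      -- stable: additional FSL function listed on slide 20
      stable(stable_signals);
      if JmpExe = JMP_EXE_I1 or JmpExe = JMP_EXE_I0 then
        AddrNxt <= JmpAddr;       -- use the jump address
      else
        AddrNxt <= AddrInc_conv;  -- use the incremented address
      end if;
    end process pc_next_SM;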

22 Adaptation: Synthesis (1)
- FSL target library
  - FSL AND, FSL OR, FSL INV, FSL Register, φ-Detector …
- Synthesis with FSL target library => netlist
- Package FSL_Rail
  - Definition FSL_rail_logic: record (a,b)
  - FSL AND, FSL OR, FSL INV, FSL Register, φ-Detector …
- Netlist: replace FSL with FSL_rail
- Synthesis with FPGA target library
[Example: L & H becomes (0,0) & (1,1).]

23 Adaptation: Synthesis (2)
[Diagram: conventional design flow compared with the FSL design flow.]

24 Adaptation: Simulation
- Same testbench for FSL_logic and FSL_rail_logic circuits
=> Verification of FSL circuits
[Diagram: testbench providing FSL stimuli and checking the FSL response; FSL logic circuit; FSL rail logic circuit with conversion.]

25 Outline
- √ Introduction
- √ Principles
- √ Basic gates
- √ Design flow and tools
- Circuit design
- Pipeline
- Data Paths
- Modeling complex circuits
- Open points
- Conclusion

26 (Linear) Pipeline
[Diagram: chain of four latches with f(x) between stages, controlled by K1 to K4.]
Full initialized: φ0 φ1 φ0 φ1 φ0
Empty initialized: φ0 φ0 φ0 φ0 φ0

27 Bubble Concept (1)
Progress is possible when a circuit contains at least one bubble.
[Diagram: stages K1 to K4 annotated with phases φ0 and φ1; a bubble is marked where identical values appear.]

28 Bubble Concept (2)
- Initialization => ensure that the circuit contains at least one bubble
- More than one bubble => higher processing speed
[Diagram: stages K1 to K7 annotated with phases φ0 and φ1.]
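
The bubble rule can be tried out with a small token model in Python. This sketch makes several simplifying assumptions that are not on the slides: each register is reduced to its phase (0 or 1), the registers form a closed ring so that no source or sink has to be modelled, and one register fires per step. A register captures when its predecessor holds a different phase (new data) and its successor holds the same phase (own data already copied, which is the bubble).

```python
def enabled(phases, i):
    """Register i may capture: predecessor differs, successor is identical."""
    n = len(phases)
    new_data = phases[i - 1] != phases[i]
    consumed = phases[(i + 1) % n] == phases[i]
    return new_data and consumed


def step(phases):
    """Fire the first enabled register; return None if nothing can move."""
    for i in range(len(phases)):
        if enabled(phases, i):
            nxt = list(phases)
            nxt[i] = phases[i - 1]  # capture the predecessor's data
            return nxt
    return None


def count_bubbles(phases):
    """Number of neighbouring register pairs holding identical phases."""
    n = len(phases)
    return sum(phases[i] == phases[(i + 1) % n] for i in range(n))
```

In this model the ring `[0, 1, 0, 1]` has no bubble and `step` returns `None`, while `[0, 1, 0, 1, 0]` has one bubble between the last and first register and keeps moving, with the bubble travelling backwards as data moves forwards.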

29 Non-linear Pipeline: Forward Path (1)
[Diagram: SRC, stages K1, K2, K3, SNK; a forward path with a φ-inverter (φ-inv); Request signal; stages annotated with phases φ0 and φ1.]

30 Non-linear Pipeline: Forward Path (2)
[Diagram: stages K1, K2, K3 with φ-inv; operation of K3 shown in steps 1 to 4, with phases alternating between φ0 and φ1.]

31 Non-linear Pipeline: Forward Path (2)
[Diagram: stages K1, K2, K3 with φ-inv; operation of K3, step 5.]

32 Non-linear Pipeline: Feedback Path (1)
[Diagram: stages K1, K2, K3 with a feedback path containing a φ-inverter (φ-inv); stages annotated with phases φ0 and φ1.]

33 Non-linear Pipeline: Feedback Path (2)
[Diagram: stages K1, K2, K3 with φ-inv; operation of K3 shown in steps 1 to 3, with phases alternating between φ1 and φ0.]

34 First Conclusion: Non-linear Pipeline
[Diagram: forward path and feedback path, each with stages K1, K2, K3; φ-inv shown in one of them.]
Phase inverter placement:
- Forward path: eliminate inconsistent inputs in the initial state
- Feedback path: generate inconsistent inputs in the initial state
=> true when the pipeline is initialized full

35 Empty Non-linear Pipeline: Forward Path (1)
[Diagram: stages K1, K2, K3 with φ-inv; stages annotated with phases φ0 and φ1.]
Empty initialized => no phase inverter required

36 36 00 00 K1 K2K3 00 00 Operation K3 11 1 φ1φ1 2 11 3 φ1φ1 4 00  φ1 00 Empty Non-linear Pipeline: Forward Path (2) => different phase inverter placement for full and empty initialized circuit

37 "Invalid" Feedback Path
[Diagram: single stage K1 with a feedback path.]
- φ-Inv always required
- No inversion of the request signal
Definition: A valid feedback path must contain at least two register nodes.

38 Phase Inverter Placement
- Phase inverters are required to avoid deadlocks
- Placement of phase inverters depends on:
  - Topology of the circuit
  - Type and number of components inside valid feedbacks
  - Dynamic behavior
  - Initialization
- Handshake signals have to be considered
- Processing speed depends on initialization
- More configurations are possible

39 Identification of Phase Inverters: Fully Initialized Circuits
Systematic approach:
1. Identify combinational logic and registers/memories
2. Generate a graph representation of the circuit based on registers/memories
3. Apply a coloring algorithm
4. Identify feedback paths (=> loops)
5. Eliminate invalid feedbacks
6. Add a phase inverter to each remaining loop (phase inverters can be shared)
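
Steps 2, 4, 5 and 6 of this procedure can be sketched in Python on a register graph. The sketch is illustrative only: the slides do not specify the coloring algorithm of step 3, so it is skipped here, and the rule for an invalid feedback (fewer than two register nodes) is taken from the definition on the "Invalid" Feedback Path slide. The graph format and function names are assumptions.

```python
def simple_cycles(graph):
    """Enumerate simple cycles of a directed graph {node: [successors]}."""
    cycles = []

    def dfs(start, node, path):
        for nxt in graph.get(node, ()):
            if nxt == start:
                cycles.append(list(path))  # closed a loop back to start
            elif nxt > start and nxt not in path:
                # Only visit nodes ordered after start, so each cycle is
                # reported once, rooted at its smallest node.
                dfs(start, nxt, path + [nxt])

    for start in sorted(graph):
        dfs(start, start, [start])
    return cycles


def loops_needing_inverter(graph):
    """Keep only valid feedback loops (at least two register nodes)."""
    return [cycle for cycle in simple_cycles(graph) if len(cycle) >= 2]


# Hypothetical register graph: K1 -> K2 -> K3 -> K1, plus a self-loop on K2.
example = {"K1": ["K2"], "K2": ["K3", "K2"], "K3": ["K1"]}
```

For the hypothetical `example` graph, the self-loop on K2 is discarded as an invalid feedback and one phase inverter would be placed in the loop K1, K2, K3. Sharing one inverter between overlapping loops, as the last step allows, is not modelled.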

40 Data Paths
[Diagram: registers (Reg) and function blocks f(x) connected through four node types: FORK, DEMUX (deselecting node), MUX (selecting node) and MERGE.]

41 Example: Merging Data Paths
Assumptions:
- Acknowledge is activated when the selected data is available
- Different delays for the two data paths (∆1, ∆3)
[Diagram: two paths of Reg and f(x) feed inputs In1 and In2 of a MUX, followed by f(x); Ack is returned to both paths. Data words: DW1 (φ0), DW2 (φ1), DW3 (φ0), DW4 (φ1). Step 1: In1 selected. Step 2: In1 selected. Step 3: In2 selected.]

42 Example: Merging Data Paths (2)
Depending on the circuit functionality:
a) both inputs have to be consumed in each processing step
- Ensure that the difference in processing speed between all data paths is small enough
- Wait until all inputs are available (even the unused ones)
b) all inputs have to be processed and consumed
- Insert synchronizer circuits to adjust the phase encoding of the input signals

43 DEMUX Example
[Diagram: f(x) and Reg feeding a DEMUX with two output paths of f(x) and Reg; data words DW1 (φ0), DW2 (φ1), DW3 (φ0), DW4 (φ1).]
Avoid loss of synchronization:
- Dummy data approach
- Synchronizer circuits
Performance considerations:
- Wait on ack. of data paths
- Wait only on required ack.

44 ASPEAR: Asynchronous SPEAR

45 Our Current Status
- semi-automated design flow (based on Synopsys)

46 Our Current Status
- semi-automated design flow (based on Synopsys)
- theoretical investigations

47 Our Current Status
- semi-automated design flow (based on Synopsys)
- theoretical investigations
- working 16-bit processor (on FPGA platform)
[Figure label: APEX 20KC]

48 Our Current Status
- semi-automated design flow (based on Synopsys)
- working 16-bit processor (on FPGA platform)
- investigation of DI
- experimental robustness assessment (fault injection: synchronous versus asynchronous design)

49 Conclusion
- FSL: Four State Logic
  - Delay insensitive logic
  - Two representations of Low/High => dual-rail encoding
  - Even combinational gates require a memory element
- Circuit design with FSL
  - Non-linear pipeline
    - Additional phase inverters required
    - Placement depends on circuit topology, initialization, …
  - Data paths
    - Splitting: synchronizer circuits or dummy data required
    - Merging: non-eager MUX or timing assumptions required

