
1 ROUTING ARCHITECTURE AND ALGORITHMS FOR A SUPERCONDUCTIVITY CIRCUITS-BASED COMPUTING HARDWARE
Farhad Mehdipour, Hiroaki Honda, Hiroshi Kataoka, Koji Inoue, Kazuaki Murakami
Kyushu University, Japan
CCECE 2011

2 CREST-JST (2006~): Low-power, high-performance, reconfigurable processor using single-flux quantum (SFQ) circuits (SFQ-LSRDP)
  Kyushu Univ. (K. Murakami, K. Inoue, H. Honda, F. Mehdipour, H. Kataoka): Architecture, Compiler and Applications
  Superconducting Research Lab. (SRL) (S. Nagasawa et al.): SFQ process
  Yokohama National Univ. (N. Yoshikawa et al.): SFQ-FPU chip, cell library
  Nagoya Univ. (A. Fujimaki et al.): SFQ-RDP chip, cell library, and wiring
  Nagoya Univ. (N. Takagi (Leader) et al.): CAD for logic design and arithmetic circuits
Our mission: Architecture, compiler and application development

3 Outline of Large-Scale Reconfigurable Data-Path (LSRDP) Processor
SFQ features:
  High-speed switching and signal transmission
  Low power consumption
  Compact implementation (smaller area)
  Suitable for pipeline processing

4 How it works
Program flow:
    inst; …
    conf_LSRDP ( );
    Loop:
        rearrange_input_data ( );
        set_IO_info ( );
        run_LSRDP ( );
        inst; …
        sync_lsrdp ( );
        rearrange_output_data ( );
    End_Loop
    inst; …
Steps as labeled in the figure (GPP, memory controller, buffers, LSRDP, memory):
  conf_LSRDP ( ): configuration bit-stream
  rearrange_input_data ( ): GPP
  set_IO_info ( ): memory controller
  run_LSRDP ( )
  sync_lsrdp ( ): GPP waiting for the LSRDP; LSRDP terminating the operation
  rearrange_output_data ( ): GPP
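The call sequence on this slide can be sketched as a small simulation. This is only an illustration under assumptions: the `MockLSRDP` class, the argument lists, and the `run_kernel` helper are hypothetical and are not part of the LSRDP library shown in the slides; only the function names and their order come from the slide.

```python
class MockLSRDP:
    """Hypothetical stand-in for the LSRDP; it only records the calls made."""

    def __init__(self):
        self.log = []
        self.configured = False

    def conf_LSRDP(self, bitstream):
        # Assumed to load the configuration bit-stream once.
        self.configured = True
        self.log.append("conf_LSRDP")

    def set_IO_info(self, n_items):
        # Assumed to tell the memory controller how much data to move.
        self.log.append("set_IO_info")

    def run_LSRDP(self):
        if not self.configured:
            raise RuntimeError("LSRDP must be configured before it is run")
        self.log.append("run_LSRDP")

    def sync_lsrdp(self):
        # On hardware the GPP would wait here until the LSRDP terminates.
        self.log.append("sync_lsrdp")


def run_kernel(dev, blocks, bitstream=b""):
    """Replay the loop from the slide for each block of input data."""
    dev.conf_LSRDP(bitstream)              # configure once, before the loop
    for block in blocks:
        data = list(block)                 # placeholder for rearrange_input_data()
        dev.log.append("rearrange_input_data")
        dev.set_IO_info(len(data))
        dev.run_LSRDP()
        # The GPP may execute other instructions here ("inst; ...").
        dev.sync_lsrdp()
        dev.log.append("rearrange_output_data")
    return dev.log
```

Calling `run_kernel(MockLSRDP(), [[3, 1], [2]])` returns a log with one configuration step followed by the five loop steps for each block.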

5 Architecture Exploration
PE structures (FU and TU blocks):
  Basic PE arch.: 3 inputs / 2 outputs
  PE arch. I: 4 inputs / 3 outputs
  PE arch. II: 3 inputs / 3 outputs
LSRDP layouts (dimension formulas in the order shown on the slide):
  Number of rows = 1.5×M, number of columns = 4×MCL
  Number of rows = 2×M, number of columns = 6×MCL+2
  Number of rows = 1.5×M, number of columns = 4×MCL+1
ORN structures: shown for MCL = 1 and MCL = 2

6 LSRDP Tool Chain
Common front end: Application C code → modifying application code, inserting LSRDP instructions in the code → modified application code
Flow 1 (assembly code generation for the GPP): modified application code + LSRDP library file (function definitions & declarations) → ISAcc or COINS compiler → .asm code for MIPS-based GPP
Flow 2 (configuration bit-stream generation for the LSRDP): modified application code → DFG extraction → data flow graphs → Placing and Routing Tool (with LSRDP architecture description) → configuration file + various text & schematic reports
Both flows feed the simulator for performance evaluation.

7 Mapping DFGs onto LSRDP
Inputs: DFG, LSRDP architecture description
Steps:
  Placing input nodes
  Placing operational & output nodes
  Routing nets
  Routing IO nets
Output: final map
Figure label: longest connections

8 Global routing algorithms
Routing DFG connections between source and destination PEs
  Exhaustive search-based: very time consuming
  Branch and bound algorithm: very fast
Figure legend: src, dest, vacant, fully-occupied
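To illustrate why branch and bound can be much faster than exhaustive search, here is a minimal sketch of a bounded depth-first search for one connection. It is not the authors' algorithm: the grid model, the `vacant` matrix, the cost (total horizontal shift), and the `max_shift` parameter are all assumptions made for this example.

```python
def route_branch_and_bound(vacant, src, dst, max_shift=1):
    """Return (cost, path) with minimal total horizontal shift, or None.

    vacant[r][c] is True when the cell at row r, column c can still carry a
    connection (assumed model). src and dst are (row, col) pairs, dst below src.
    """
    (r0, c0), (r1, c1) = src, dst
    width = len(vacant[0])
    best = {"cost": float("inf"), "path": None}

    def search(r, c, cost, path):
        rows_left = r1 - r
        gap = abs(c1 - c)
        # Bound 1: the destination cannot be reached with the hops that remain.
        if gap > rows_left * max_shift:
            return
        # Bound 2: even the most direct completion cannot beat the incumbent.
        if cost + gap >= best["cost"]:
            return
        if rows_left == 0:
            best["cost"], best["path"] = cost, path
            return
        # Branch: try small shifts first so a good incumbent is found early.
        for shift in sorted(range(-max_shift, max_shift + 1), key=abs):
            nc = c + shift
            if not 0 <= nc < width:
                continue
            is_dst = (r + 1, nc) == (r1, c1)
            if not is_dst and not vacant[r + 1][nc]:
                continue  # fully-occupied cells cannot be used
            search(r + 1, nc, cost + abs(shift), path + [(r + 1, nc)])

    search(r0, c0, 0, [(r0, c0)])
    return None if best["path"] is None else (best["cost"], best["path"])
```

An exhaustive search would enumerate every sequence of shifts; the two bounds discard partial routes as soon as they are provably infeasible or no better than the best route found so far.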

9 Micro-Routing Problem Definition
Inputs
  - LSRDP basic specifications: layout, width (W), MCL, PE arch., etc.
  - List of connections between consecutive rows
  - ORN structure, including:
      the number of CBs and T2s in each row
      the number of CB rows
      the topology of connections among CBs
Output
  - Detailed routes via cross-bar switches:
      the list of CBs used for routing each connection
      the configuration of CBs
Figure: ORN between the i-th row and the (i+1)-th row
A micro-routing algorithm has been implemented for the LSRDP with underlying layout II and PE arch. III
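The inputs and outputs listed on this slide could be held in data structures along the following lines. This is a sketch only: every class name, field name, and the `unrouted` helper are hypothetical and chosen for illustration, and the slides do not show how the actual tool represents this data.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Connection:
    src_pe: int  # PE index in the i-th row
    dst_pe: int  # PE index in the (i+1)-th row


@dataclass
class ORNSpec:
    cb_rows: int       # number of CB rows
    cbs_per_row: int   # number of CBs in each row
    t2s_per_row: int   # number of T2s in each row
    # Topology of connections among CBs: CB id -> ids of downstream CBs.
    links: dict = field(default_factory=dict)


@dataclass
class MicroRoutingResult:
    # Connection -> list of CB ids used for routing that connection.
    routes: dict = field(default_factory=dict)
    # CB id -> configuration value chosen for that CB.
    cb_config: dict = field(default_factory=dict)


def unrouted(connections, result):
    """Return the connections that have no detailed route yet."""
    return [c for c in connections if c not in result.routes]
```

A caller would fill `routes` and `cb_config` as each connection is routed, and use `unrouted` to check whether the ORN between two rows has been fully routed.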

10 ORN Micro-routing
CB: 2-input/2-output
1/2CB: 1-input/2-output
CB labels shown in the figure: 00 01 10 11
Micro-nets example:
  PE1 → PE5
  PE2 → PE5, PE6, PE7
  PE3 → PE6, PE8
  PE4 → PE7, PE8
[Figure: PE1 to PE4 connected to PE5 to PE8 through ½ CB and CB switches]
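The micro-nets in this example can be written down and inspected with a few lines of code. The net list below is copied from the slide; the helper functions and the `CrossBar2x2` model (one select bit per output) are assumptions made for illustration and are not taken from the slide.

```python
from collections import defaultdict

# Micro-nets from the slide: source PE -> destination PEs.
MICRO_NETS = {1: [5], 2: [5, 6, 7], 3: [6, 8], 4: [7, 8]}


def sources_per_destination(nets):
    """Invert the net list: destination PE -> list of source PEs."""
    incoming = defaultdict(list)
    for src, dsts in nets.items():
        for dst in dsts:
            incoming[dst].append(src)
    return dict(incoming)


def count_point_to_point(nets):
    """Number of source-to-destination pairs the ORN must deliver."""
    return sum(len(dsts) for dsts in nets.values())


class CrossBar2x2:
    """Assumed 2-input/2-output CB: each output selects input 0 or input 1."""

    def __init__(self, sel0=0, sel1=1):
        self.sel = (sel0, sel1)

    def forward(self, in0, in1):
        inputs = (in0, in1)
        return inputs[self.sel[0]], inputs[self.sel[1]]
```

Inverting the nets shows that each destination PE in the example receives exactly two signals, for instance PE5 from PE1 and PE2. Under the assumed model, `CrossBar2x2(1, 0)` swaps its inputs and `CrossBar2x2(0, 0)` copies input 0 to both outputs, which is how a net with several destinations such as PE2 could be fanned out.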

11 ORN Micro-Routing Example: Heat-8x2, ORN between the 3rd and 4th rows
[Figure: step-by-step micro-routing of the connections from the PEs in the 3rd row to the PEs in the 4th row through the CBs of the ORN]

12 Specifications of Attempted DFGs
DFG              total # of nodes   # of inputs   # of outputs   # of ops
Heat-8x1         34                 6             4              16
Heat-8x2         60                 8             4              32
Heat-16x2        172                16            12             96
Poisson-3x3      62                 18            1              33
Vibration-4x2    48                 8             4              24
Vibration-8x2    136                16            12             72
Vibration-8x4    168                16            8              96
ERI-1            76                 16            9              51
ERI-2            67                 19            1              47

13 Example of a DFG Mapping: Vibration-8x2

14 Results of routing nets using the proposed algorithms
DFG              avg. hor. C.L.   avg./max. ver. C.L.   # of global/micro nets to route   Time to map (sec)
Heat-8x1         0.35             0.75/3                36/64                             0.015
Heat-8x2         0.44             1.32/5                68/114                            1.75
Heat-16x2        0.47             1.64/7                204/343                           1.05
Poisson-3x3      0.68             2.4/16                67/120                            2074.5
Vibration-4x2    0.46             1.58/9                50/88                             0.34
Vibration-8x2    0.42             2.15/10               154/332                           2.20
Vibration-8x4    2.48             3.72/16               348/610                           6721.3
ERI-1            0.75             2.21/9                111/374                           53.61
ERI-2            0.78             2.99/9                95/332                            0.327

15 Thank you for your attention!

