Metrics for Reconfigurable Architectures Characterization: Remanence and Scalability. Pascal BENOIT, G. Sassatelli, L. Torres, D. Demigny, M. Robert, G. Cambon
Outline: Context; Remanence; Operative Density; Case Study: the Systolic Ring; Conclusion and perspectives
Context: SoC and customizable platform-based design. Specifications: processing power, area, power consumption, etc. Candidate components: ASIC 1, ASIC 2, DSP, Reconfigurable Hardware (Coarse Grain), Reconfigurable Hardware (Fine Grain). We need metrics to compare them!
Context: Architecture characterization. Criteria: processing power, power consumption, flexibility, parallelism potential, dynamism, silicon area, scalability, … Metrics: Dehon criterion, Remanence, Operative density. Generalisation to architectural models: characterisation and metrics depend on architectural parameters. « Comparing architectures with a minimum of criteria »
Remanence: Definition. N_PE: # of processing elements (PE); N_c: # of PEs configurable per cycle; F_e: operating frequency; F_c: configuration frequency. $R = \frac{N_{PE}}{N_c} \cdot \frac{F_e}{F_c}$. R characterizes the dynamism: the # of cycles needed to (re)configure the whole architecture, and hence the amount of data to compute between 2 configurations.
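The definition above can be sketched in a few lines of Python. This is a minimal illustration, assuming remanence is read as the number of operating cycles needed to (re)configure every PE, i.e. R = (N_PE / N_c) · (F_e / F_c). The function name and the two sample fabrics are hypothetical and are not measurements from the slides.

```python
def remanence(n_pe: int, n_c: int, f_e: float, f_c: float) -> float:
    """Operating cycles needed to (re)configure all n_pe processing elements.

    Assumed reading of the slide: R = (N_PE / N_c) * (F_e / F_c).
    """
    if min(n_pe, n_c) <= 0 or min(f_e, f_c) <= 0:
        raise ValueError("all parameters must be positive")
    return (n_pe / n_c) * (f_e / f_c)


# Hypothetical fabrics (illustrative numbers only):
# every PE is reconfigured in each cycle, as in a DSP-like machine
print(remanence(n_pe=8, n_c=8, f_e=200.0, f_c=200.0))   # 1.0
# one PE per configuration cycle, configuration clock at half speed
print(remanence(n_pe=64, n_c=1, f_e=100.0, f_c=50.0))   # 128.0
```

A small R means the fabric can change function almost every cycle; a large R means each configuration has to be amortised over many cycles of computation.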
Remanence: Comparisons. Only 1 cycle to (re)configure the DSP; a few cycles to (re)configure a coarse grain RA ( 8); many cycles to (re)configure a fine grain RA.
Table columns: Name, Type, N_PE, N_c, F (MHz), R (numeric entries not preserved in this transcript)
ARDOISE | Fine Grain RA
Systolic Ring | Coarse Grain RA
DART | Coarse Grain RA
MorphoSys | Coarse Grain RA
TMS320C62 | DSP VLIW
Operative Density: Definition. N_PE: # of PEs; A: core area (relative unit: λ²). $OD = N_{PE} / A$. The area can be expressed as a function of N_PE (architectural model). For a fixed N_PE, OD characterizes the # of operators per relative area unit; for a variable N_PE, OD becomes a function of N_PE, with $A(N_{PE}) = N_{PE} \cdot A_{PE} + A_{interconnect}(N_{PE}) + A_{memory}(N_{PE}) + A_{sequencer}(N_{PE})$. If OD(N_PE) is constant, i.e. $A(N_{PE}) = k \cdot N_{PE}$, the architectural model is scalable.
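To make the scalability criterion concrete, here is a small Python sketch. It assumes OD = N_PE / A(N_PE) with the four-term area model of the slide. All area coefficients, and the names `linear` and `quadratic`, are hypothetical and chosen only to contrast an interconnect that grows linearly with one that grows quadratically.

```python
def area(n_pe, a_pe, a_interconnect, a_memory, a_sequencer):
    """A(N_PE) = N_PE*A_PE + A_interconnect(N_PE) + A_memory(N_PE) + A_sequencer(N_PE)."""
    return n_pe * a_pe + a_interconnect(n_pe) + a_memory(n_pe) + a_sequencer(n_pe)


def operative_density(n_pe, total_area):
    """Number of PEs per relative area unit."""
    return n_pe / total_area


# Hypothetical area terms (arbitrary relative units, not taken from the slides).
common = dict(a_pe=10.0, a_memory=lambda n: 3.0 * n, a_sequencer=lambda n: 1.0 * n)
linear = dict(common, a_interconnect=lambda n: 2.0 * n)         # A grows as k*N_PE
quadratic = dict(common, a_interconnect=lambda n: 0.5 * n * n)  # crossbar-like growth

for n in (8, 16, 32):
    od_lin = operative_density(n, area(n, **linear))
    od_quad = operative_density(n, area(n, **quadratic))
    print(f"N_PE={n:3d}  OD(linear)={od_lin:.4f}  OD(quadratic)={od_quad:.4f}")
```

With the linear model, OD stays at 0.0625 for every N_PE, which is the scalable case. With the quadratic interconnect, OD falls as N_PE grows.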
Operative Density: Comparisons. DSP: sequencer area. ARDOISE: fine granularity. Coarse-granularity reconfigurable architectures: scalability of interconnect resources? Generalization to architectural models.
Table columns: Name, Type, N_PE, Area (Mλ²), OD(N_PE) (numeric entries not preserved in this transcript)
ARDOISE | Fine Grain RA
Systolic Ring (S=1, C=6, N=2) | Coarse Grain RA
Systolic Ring (S=1, C=16, N=4) | Coarse Grain RA
DART | Coarse Grain RA
MorphoSys | Coarse Grain RA
TMS320C62 | DSP VLIW
Architectural Model Characterization. A Case Study: The Systolic Ring
Architectural model Characterization: The Systolic Ring architectural model.
Based on a coarse-grained configurable PE.
Circular datapaths (figure labels: Dnode, Switch, layer 1 to layer 8).
3 parameters: C: # of layers; N: # of Dnodes per layer; S: # of Rings.
Examples shown: C = 4, N = 2 (4 layers of 2 Dnodes); C = 8, N = 2 (8 layers of 2 Dnodes); C = 8, N = 2, S = 1 (1 Systolic Ring); C = 4, N = 2, S = 4 (4 Systolic Rings).
Control Units: Local Dnode unit (Dnode Sequencer); Local Ring unit (Local Ring Sequencer); Global unit (Global Sequencer).
Architectural model Characterization: Remanence. Only one Systolic Ring (S = 1): N_PE = # of Dnodes = N·C·S = N·C. Remanence formalisation with k = C/N.
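The three parameters and the two derived quantities used on this slide (N_PE = N·C·S and k = C/N) can be captured in a small Python sketch. The class and attribute names are hypothetical. Only the relations themselves and the sample (C, N, S) values come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystolicRing:
    """Parameter set of a Systolic Ring instance (names are illustrative)."""
    c: int      # C: number of layers per ring
    n: int      # N: number of Dnodes per layer
    s: int = 1  # S: number of rings

    @property
    def n_pe(self) -> int:
        """Total number of Dnodes: N_PE = N * C * S."""
        return self.n * self.c * self.s

    @property
    def k(self) -> float:
        """Shape ratio k = C / N."""
        return self.c / self.n


# Configurations that appear on the slides:
for ring in (SystolicRing(c=4, n=2), SystolicRing(c=8, n=2), SystolicRing(c=4, n=2, s=4)):
    print(ring, "N_PE =", ring.n_pe, "k =", ring.k)
```

The three instances give N_PE = 8, 16 and 32 with k = 2.0, 4.0 and 2.0 respectively.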
Architectural model Characterization: A(N_PE) formalisation for OD(N_PE). Systolic Ring layout (C = 4, N = 2, S = 1) in 0.18 µm CMOS technology: A(8) = 3.3 mm² = 407 Mλ². Area formalisation: A(N_PE) = f(N, C, S), with N_PE = N·C·S; it depends on the C/N ratio and on S, and is calibrated on these results.
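The two area figures on this slide can be cross-checked with a short Python sketch. It assumes the usual convention that λ is half the minimum feature size (0.09 µm for a 0.18 µm process). The slide does not state this convention, but it reproduces the 407 Mλ² value. The function name is hypothetical.

```python
def mm2_to_mega_lambda2(area_mm2: float, feature_um: float) -> float:
    """Convert a silicon area in mm^2 to millions of lambda^2.

    Assumption: lambda = feature size / 2 (not stated on the slide).
    """
    lambda_um = feature_um / 2.0
    area_um2 = area_mm2 * 1e6          # 1 mm^2 = 1e6 um^2
    return area_um2 / lambda_um ** 2 / 1e6


a8 = mm2_to_mega_lambda2(3.3, 0.18)
print(round(a8))      # 407, matching A(8) = 407 M lambda^2
print(8 / a8)         # OD(8) in PEs per M lambda^2, under the same assumption
```

Expressing area in λ² removes the dependence on the process node, which is what allows operative densities of architectures built in different technologies to be compared.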
Architectural model Characterization: OD(N_PE) for 1 Systolic Ring (S = 1), k = C/N ∈ [0.25; 4]: decreasing OD(N_PE). OD(N_PE) for several Systolic Rings, k = C/N = 4: multi-ring instantiations increase scalability.
Architectural model Characterization Customisation and design technique between 60 and 80 processing elements
Architectural model Characterization: Customisation and design technique. Design Space: one region offers the best OD and remanence but the worst interconnect resources and processing power; another offers the worst OD and remanence but the best interconnect resources and processing power.
Architectural model Characterization: R and OD can be integrated in CAD tools to observe the effects of architectural parameters and to choose the best trade-offs in the design space.
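As a rough illustration of what such a tool might do at its simplest, the Python sketch below enumerates (C, N, S) configurations of the Systolic Ring and keeps those whose PE count falls in a target window. The function name and the parameter ranges are hypothetical. A real tool would also evaluate R and OD for each candidate using a calibrated area model, which is not reproduced here.

```python
from itertools import product


def enumerate_rings(c_values, n_values, s_values, n_pe_min, n_pe_max):
    """Yield (C, N, S, N_PE, k) for every configuration with N_PE in range.

    N_PE = N * C * S and k = C / N, as defined for the Systolic Ring.
    """
    for c, n, s in product(c_values, n_values, s_values):
        n_pe = n * c * s
        if n_pe_min <= n_pe <= n_pe_max:
            yield c, n, s, n_pe, c / n


# Hypothetical search ranges; the 60..80 window echoes the customisation slide.
candidates = list(enumerate_rings(range(1, 17), range(1, 5), range(1, 5), 60, 80))
for c, n, s, n_pe, k in sorted(candidates, key=lambda t: (t[3], t[4])):
    print(f"C={c:2d} N={n} S={s}  N_PE={n_pe}  k={k:.2f}")
```

Each surviving candidate could then be scored on R and OD, so that the designer compares points in the design space rather than individual layouts.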
Conclusion and perspectives: Specifications (processing power, area, power consumption, etc.) are matched against candidate IPs (IP 1, IP 2, IP 3, …, IP n), each characterised by its metrics (R_1, OD_1), (R_2, OD_2), (R_3, OD_3), …, (R_n, OD_n). Architectural model comparisons. Architectural model customisation.
Thank You