Express Cube Topologies for On-chip Interconnects
Boris Grot, J. Hestness, S. W. Keckler, O. Mutlu
† The University of Texas at Austin
† Carnegie Mellon University
‡ Part of this work was performed at Microsoft Research
Feb 17, 2009, HPCA '09
The Era of Many-core
Intel Larrabee: 16+ cores, bidirectional ring interconnect
UT TRIPS: 2x16 exec tiles, 16 NUCA tiles, multiple networks
Intel Polaris: 80 tiles, 8x10 mesh
Tilera Tile: 64 cores, 5 mesh networks
Networks on a Chip (NOCs)
On-chip advantages: no pin constraints; rich wiring resources
On-chip limitations: 2D substrates limit implementable topologies; logic area constrains use of wiring resources; energy/power budget caps
Focus: topologies for tomorrow's many-core CMPs
Outline
Introduction
Existing topologies
Multidrop Express Channels (MECS)
Evaluation
Generalized Express Cubes
Summary
2-D Mesh
Pros: low design & layout complexity; simple, fast routers
Cons: large diameter; energy & latency impact
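The "large diameter" drawback can be made concrete with a small brute-force sketch. It assumes a k x k mesh with one terminal per router and minimal routing, so the hop count between two routers is their Manhattan distance. The function names are illustrative and not taken from the talk.

```python
from itertools import product


def mesh_hops(src, dst):
    """Hop count between two routers under minimal routing (Manhattan distance)."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


def mesh_stats(k):
    """Return (diameter, average hop count over distinct pairs) for a k x k mesh."""
    nodes = list(product(range(k), repeat=2))
    dists = [mesh_hops(s, d) for s in nodes for d in nodes if s != d]
    return max(dists), sum(dists) / len(dists)


if __name__ == "__main__":
    for k in (8, 16):  # 64 and 256 routers
        diameter, avg = mesh_stats(k)
        print(f"{k}x{k} mesh: diameter={diameter}, average hops={avg:.2f}")
```

The diameter grows as 2(k - 1), so both worst-case latency and per-packet router energy grow with the side length of the mesh.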
Concentration (Balfour & Dally, ICS '06)
Pros: multiple terminals attached to a router node; fast nearest-neighbor communication via the crossbar; hop count reduction proportional to concentration degree
Cons: benefits limited by crossbar complexity
Concentration Side-effects
Fewer channels
Greater channel width
Replication: CMesh-X2
Benefits: restores bisection channel count; restores channel width; reduced crossbar complexity
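The interplay between concentration, replication, and channel width can be sketched with simple arithmetic. The sketch assumes a square mesh, a fixed bisection wire budget split evenly across all channels that cross the bisection, and a hypothetical budget of 4096 wires chosen only for illustration; none of these numbers come from the slides.

```python
import math


def mesh_bisection(terminals, concentration=1, replication=1, bisection_wires=4096):
    """Return (k, bisection channel count, channel width in wires).

    Assumes a k x k mesh of routers, one channel per direction per row
    crossing the bisection, and an even split of the (hypothetical)
    bisection wire budget across those channels.
    """
    routers = terminals // concentration
    k = math.isqrt(routers)
    if k * k != routers:
        raise ValueError("router count must be a perfect square")
    channels = 2 * k * replication
    return k, channels, bisection_wires // channels


if __name__ == "__main__":
    configs = [("Mesh", 1, 1), ("CMesh", 4, 1), ("CMesh-X2", 4, 2)]
    for name, conc, repl in configs:
        k, channels, width = mesh_bisection(64, conc, repl)
        print(f"{name}: {k}x{k} routers, {channels} bisection channels, {width} wires each")
```

Under these assumptions, 4-way concentration halves the bisection channel count and doubles the width of each channel, while running two networks in parallel brings both back to the values of the unconcentrated mesh.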
Flattened Butterfly (Kim et al., Micro '07)
Objectives: improve connectivity; exploit the wire budget
Flattened Butterfly (Kim et al., Micro '07)
Pros: excellent connectivity; low diameter (2 hops)
Cons: high channel count (k^2/2 per row/column); low channel utilization; increased control (arbitration) complexity
Multidrop Express Channels (MECS)
Objectives: connectivity; more scalable channel count; better channel utilization
Multidrop Express Channels (MECS)
Pros: one-to-many topology; low diameter (2 hops); k channels per row/column; asymmetric
Cons: asymmetric; increased control (arbitration) complexity
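As a rough check on the channel counts quoted for the flattened butterfly (k^2/2) and MECS (k), the sketch below enumerates the channels in a single row of k routers and counts those that cross the row's midpoint. It assumes the quoted figures refer to unidirectional channels crossing the bisection of one row, which the slides do not state explicitly. The channel representation (source router, tuple of routers the channel can deliver to) is illustrative.

```python
def fbfly_row_channels(k):
    """Point-to-point: one unidirectional channel per ordered router pair."""
    return [(src, (dst,)) for src in range(k) for dst in range(k) if src != dst]


def mecs_row_channels(k):
    """Multidrop: each router drives at most one eastbound and one westbound
    channel, with a drop-off point at every router the channel passes."""
    channels = []
    for src in range(k):
        east = tuple(range(src + 1, k))
        west = tuple(range(src - 1, -1, -1))
        if east:
            channels.append((src, east))
        if west:
            channels.append((src, west))
    return channels


def bisection_crossings(channels, k):
    """Count channels that reach at least one router on the other half of the row."""
    mid = k // 2
    return sum(
        1
        for src, sinks in channels
        if any((src < mid) != (sink < mid) for sink in sinks)
    )


if __name__ == "__main__":
    for k in (4, 8):
        fbfly = bisection_crossings(fbfly_row_channels(k), k)
        mecs = bisection_crossings(mecs_row_channels(k), k)
        print(f"k={k}: FBfly bisection channels={fbfly}, MECS bisection channels={mecs}")
```

In both topologies every router in a row is reachable from every other in one hop, which is why a packet needs at most one hop per dimension. The difference is that MECS shares each wire run among several destinations instead of dedicating a channel to each pair.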
Analytical Comparison
Topologies (columns): CMesh, FBfly, MECS
Metrics (rows): network size; radix (concentrated); diameter; channel count; channel width; router inputs; router outputs
Experimental Methodology
Topologies: Mesh, CMesh, CMesh-X2, FBFly, MECS, MECS-X2
Network sizes: 64 & 256 terminals
Routing: DOR, adaptive
Messages: 64 & 576 bits
Synthetic traffic: uniform random, bit complement, transpose, self-similar
PARSEC benchmarks: Blackscholes, Bodytrack, Canneal, Ferret, Fluidanimate, Freqmine, Vips, x264
Full-system config: M5 simulator, Alpha ISA, 64 OOO cores
Energy evaluation: Orion + CACTI 6
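Three of the synthetic traffic patterns listed above have simple closed-form destination functions. The sketch below uses the definitions commonly found in the interconnection-network literature; the slides do not spell them out, so treat the exact forms (and the choice to allow self-addressed packets under uniform random) as assumptions. Node addresses are integers of `n_bits` bits, for example 6 bits for 64 terminals.

```python
import random


def bit_complement(src, n_bits):
    """Destination is the bitwise complement of the source address."""
    return ~src & ((1 << n_bits) - 1)


def transpose(src, n_bits):
    """Destination swaps the upper and lower halves of the address bits,
    which maps grid position (x, y) to (y, x). Requires an even n_bits."""
    if n_bits % 2:
        raise ValueError("transpose needs an even number of address bits")
    half = n_bits // 2
    low = src & ((1 << half) - 1)
    high = src >> half
    return (low << half) | high


def uniform_random(src, n_bits, rng):
    """Destination is drawn uniformly from all nodes (self allowed here)."""
    return rng.randrange(1 << n_bits)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    n_bits = 6  # 64 terminals
    for src in (0, 5, 42):
        print(
            src,
            bit_complement(src, n_bits),
            transpose(src, n_bits),
            uniform_random(src, n_bits, rng),
        )
```

Bit complement and transpose are adversarial for dimension-order routing because they concentrate traffic on a small set of channels, whereas uniform random spreads load evenly.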
64 nodes: Uniform Random
256 nodes: Uniform Random
Energy (100K pkts, Uniform Random)
64 Nodes: PARSEC
Generalized Express Cubes
Low-dimensional k-ary n-cube, n = {1,2}: good fit for planar silicon
Express channels: improve connectivity; MECS for better wire utilization
Multiple networks: improve throughput; reduce crossbar area & energy overhead
Hierarchical scaling
Partitioning: a GEC Example
Topologies shown: MECS, MECS-X2, Flattened Butterfly, Partitioned MECS
Summary
MECS: a novel one-to-many topology; good fit for planar substrates; excellent connectivity; effective wire utilization
Generalized Express Cubes: framework & taxonomy for NOC topologies; extension of the k-ary n-cube model; useful for understanding and exploring on-chip interconnect options
Future: expand & formalize