Three-Dimensional Layout of On-Chip Tree-Based Networks Hiroki Matsutani (Keio Univ, Japan) Michihiro Koibuchi (NII, Japan) D. Frank Hsu (Fordham Univ, USA) Hideharu Amano (Keio Univ, Japan)

Outline
Introduction
–Network-on-Chip (NoC)
–2-D vs. 3-D
Fat Tree
–2-D layout
–3-D layout
Fat H-Tree [Matsutani, IPDPS’07]
–2-D layout
–3-D layout
Evaluations
–Area, Wire length, Energy

Network-on-Chip (NoC)
Tile architectures
–MIT RAW [Taylor, Micro’02]
–Texas U. TRIPS [Burger, Computer’04]
–Intel 80-tile NoC [Vangal, ISSCC’07]
Various topologies
–Mesh, Torus
–Fat Trees
–Fat H-Tree (FHT) [Matsutani, IPDPS’07]
Figure: 16-core tile architecture; each tile holds a core and a router, and the tiles form a packet-switched network on a chip
We proposed FHT as an alternative to Fat Trees.

2D Topologies: Mesh & Torus
2-D Mesh
–RAW [Taylor, IEEE Micro’02]
2-D Torus
–2x bandwidth of mesh
Figure legend: Router, Core

2D Topologies: Fat Tree
Fat Tree (p, q, c)
–p: # of upward links
–q: # of downward links
–c: # of core ports
Figure: Fat Tree (2,4,2) and Fat Tree (2,4,1), with Rank-1 and Rank-2 routers marked (legend: Router, Core)
In this talk, we focus on the 3-D layout scheme of tree-based topologies.

2D NoC vs. 3D NoC
2D NoCs
–Long wires (esp. trees)
–Wire delay
–Packets consume power at links according to their wire length
3D NoCs
–Several small wafers or dies are stacked
Vertical link
–Micro bump [Ezaki, ISSCC’04]
–Through-wafer via [Burns, ISSCC’01]
–Very short (10-50 µm)
Long horizontal wires in 2D NoCs can be replaced by very short vertical links in 3D NoCs.
The next slides show the 3D layout schemes of Fat Tree and FHT.

Fat Tree: 2-D layout
Fat Tree (p, q, c)
–p: # of upward links
–q: # of downward links
–c: # of core ports
Figure: Fat Tree (2,4,2) and Fat Tree (2,4,1) (legend: Router, Core)
We first show the 3D layout scheme of Fat Trees.

Fat Tree: 3-D layout (4-split)
–Dividing into 4 layers (Layer-0, Layer-1, Layer-2, Layer-3)
–Top-rank routers are distributed to each layer
Figure: original 2-D layout, transformed from 2-D coordinates to 3-D coordinates

Fat Tree: 3-D layout (4-split)
–Top-rank links are replaced with vertical interconnects (10-50 µm)
Figure: original 2-D layout, transformed from 2-D coordinates to 3-D coordinates, giving the 3-D layout (4-stacked); Layer-0 shown
This 3-D layout is evaluated in terms of area, wire length, & energy.

Outline
Introduction
–Network-on-Chip (NoC)
–2-D vs. 3-D
Fat Tree
–2-D layout
–3-D layout
Fat H-Tree [Matsutani, IPDPS’07]
–2-D layout
–3-D layout
Evaluations
–Area, Wire length, Power

Fat H-Tree: Structure [Matsutani, IPDPS’07]
Fat H-Tree: combining two H-Trees (red & black)
–Red Tree (H-Tree)
–Black Tree (H-Tree)
The black tree is shifted toward the lower right of the red tree.
Because of this shift, the connection pattern of the trees differs from that of the original Fat Trees.
Figure legend: Router, Core

Fat H-Tree: Structure [Matsutani, IPDPS’07]
Fat H-Tree: combining two H-Trees (red & black)
–Red Tree (H-Tree)
–Black Tree (H-Tree)
The Fat H-Tree is formed on the red & black trees.
Figure legend: Router, Core

Fat H-Tree: Structure [Matsutani, IPDPS’07]
Fat H-Tree: combining two H-Trees (red & black)
–Red Tree (H-Tree)
–Black Tree (H-Tree)
–Each core is connected to both the red & black trees
–A ring is formed with cores & rank-1 routers
–Torus-level performance by combining only two H-Trees
Rank-2 and higher routers are omitted in this figure.
Figure legend: Router, Core

Fat H-Tree: 2-D layout on VLSI [Matsutani, IPDPS’07]
Fat H-Tree
–Torus structure → folded in the same way as the folded layout of a 2-D torus
Figure: torus structure (long feedback links across the chip) and the topologically equivalent 2-D layout of the Fat H-Tree (legend: Router, Core)
The next slides propose the 3D layout scheme of the Fat H-Tree.

Fat H-Tree: 3-D layout (overview)
Fat H-Tree
–(Problem) Fat H-Tree has a torus structure
–Folding so as to keep the torus structure consisting of red & black trees
(step 1) fold it horizontally
(step 2) fold it vertically
Until the # of folded pieces meets the # of layers the 3-D IC has
E.g., four layers → fold twice

Fat H-Tree: 3-D layout (overview)
Fat H-Tree
–(Problem) Fat H-Tree has a torus structure
–Folding so as to keep the torus structure consisting of red & black trees
(step 1) fold it horizontally
(step 2) fold it vertically
Until the # of folded pieces meets the # of layers the 3-D IC has
E.g., four layers → fold twice
Here we show the 3D layouts of the red & black trees separately.
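
The two folding steps above can be sketched as a coordinate mapping. This is only an illustration under stated assumptions, not the transformation used in the slides: the function name `fold_twice`, the 8x8 grid, and the layer numbering (paper-fold order) are all hypothetical. The sketch checks that every link of a 2-D torus, including the wraparound links, ends up either between neighboring tiles or between tiles at the same planar position on different layers, so it can be routed vertically.

```python
def fold_twice(x, y, width, height):
    """Map a 2-D tile (x, y) to (x', y', layer) on a 4-layer stack (assumed numbering)."""
    layer = 0
    if x >= width // 2:       # step 1: fold along the x axis
        x = width - 1 - x
        layer = 1
    if y >= height // 2:      # step 2: fold along the y axis
        y = height - 1 - y
        layer = 3 - layer     # the folded-over half is stacked in reverse order
    return x, y, layer


def planar_distance(a, b):
    """Horizontal distance between two folded tiles, ignoring the layer."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


W = H = 8  # hypothetical 64-tile grid
worst = 0
for y in range(H):
    for x in range(W):
        here = fold_twice(x, y, W, H)
        right = fold_twice((x + 1) % W, y, W, H)  # torus neighbor, with wraparound
        down = fold_twice(x, (y + 1) % H, W, H)
        worst = max(worst, planar_distance(here, right), planar_distance(here, down))

print(worst)  # 1: no torus link spans more than one unit horizontally after folding
```

With this numbering, some vertical links cross more than one layer (for example Layer-0 to Layer-3 along the second fold), which is still short compared with a cross-chip wire.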

Fat H-Tree: 3-D (Red tree; 4-split)
Figure: original 2-D layout, transformed from 2-D coordinates to 3-D coordinates, giving the 3-D layout (4-stacked); Layer-0, Layer-1, Layer-2, Layer-3 shown

Fat H-Tree: 3-D (Red tree; 4-split)
–Top-rank links are replaced with vertical interconnects (10-50 µm)
Figure: original 2-D layout, transformed from 2-D coordinates to 3-D coordinates, giving the 3-D layout (4-stacked); Layer-0 shown

Fat H-Tree: 3-D (Black tree; 4-split)
–They can be connected via only a vertical link
Figure: original 2-D layout, transformed from 2-D coordinates to 3-D coordinates, giving the 3-D layout (4-stacked); Layer-0, Layer-1, Layer-2, Layer-3 shown

Fat H-Tree: 3-D (Black tree; 4-split)
–The peripheral cores are connected to different layers
Figure: original 2-D layout, transformed from 2-D coordinates to 3-D coordinates, giving the 3-D layout (4-stacked)

Fat H-Tree: 3-D (Black tree; 4-split)
–Top-rank links are replaced with vertical interconnects (10-50 µm)
–The peripheral cores are connected to different layers
Figure: original 2-D layout, transformed from 2-D coordinates to 3-D coordinates, giving the 3-D layout (4-stacked); Layer-0 shown

Fat H-Tree: 3-D layout (4-split)
Figure: Red tree (3-D), Black tree (3-D), and Fat H-Tree (3-D); Layer-0 shown
The 3-D layout of the Fat H-Tree can be formed by superimposing the 3-D layouts of the red & black trees.

Evaluations: 2-D vs. 3-D
2-D layout
–64-core
3-D layout
–16-core x 4-layer
–Vertical interconnects
Figure labels: L mm (2-D layout), L/2 mm (3-D layout)

Network logic area: # of routers

           N=16   N=64   N=256
FT1           6     28     120
FT2          12     56     240
FHT          10     42     170
3D mesh      16     64     256
3D torus     16     64     256

FT1: Fat Tree (2,4,1)  FT2: Fat Tree (2,4,2)  FHT: Fat H-Tree
Trees have fewer routers, and fewer ports per router, than mesh/torus:
–3-D mesh/torus: node degree 7
–Fat Tree (2,4,2): node degree 6
–Fat H-Tree: node degree 5
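
The tree rows of the router-count table can be reproduced with a short sketch. It assumes, and the slides do not state this explicitly, that a Fat Tree (p, q, c) has N*c/q rank-1 routers, that each higher rank has p/q times as many routers as the one below, and that there are log_q(N) ranks. It also treats the Fat H-Tree as two H-Trees, each counted as a Fat Tree (1,4,1). The function names are hypothetical.

```python
def fat_tree_routers(n_cores, p, q, c):
    """Count routers in a Fat Tree (p, q, c) under the assumptions stated above."""
    # Number of ranks: how many times n_cores can be divided by q.
    ranks = 0
    remaining = n_cores
    while remaining > 1:
        remaining //= q
        ranks += 1

    routers_in_rank = n_cores * c // q   # rank-1 routers
    total = 0
    for _ in range(ranks):
        total += routers_in_rank
        routers_in_rank = routers_in_rank * p // q  # next rank up
    return total


def fat_h_tree_routers(n_cores):
    """Two H-Trees (red & black), each counted as a Fat Tree (1,4,1)."""
    return 2 * fat_tree_routers(n_cores, 1, 4, 1)


for n in (16, 64, 256):
    print(n,
          fat_tree_routers(n, 2, 4, 1),   # FT1
          fat_tree_routers(n, 2, 4, 2),   # FT2
          fat_h_tree_routers(n))          # FHT
# 16 6 12 10
# 64 28 56 42
# 256 120 240 170
```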

Network logic area: 2-D vs. 3-D
Wormhole router [Matsutani, ASPDAC’08]
–1-flit = 64-bit
–3-stage pipeline
–Synthesized with a 90nm CMOS
Network interface
–FIFO buffer
–Packet forwarding (Fat H-Tree only)
Inter-wafer via [Davis, DToC’05]
–1-10 µm square
–100 µm² per layer per 1-bit signal
Network logic area
–Routers, NIs
–Inter-wafer vias
Inter-wafer via area is calculated according to the # of vertical links.
Figure: typical wormhole router (Arbiter, 5x5 XBAR, FIFO)

Network logic area: Overhead of 3D
Synthesis result of 64-core (16-core x 4)
Figure labels: 2D torus, 3D torus, inter-wafer via area (+7.8%)
3D layout of trees → area overhead is modest (at most 7.8%)
FT1: Fat Tree (2,4,1)  FT2: Fat Tree (2,4,2)  FHT: Fat H-Tree

Total wire length of all links
Total unit-length of links
–Core-to-router links
–Router-to-router links
How many unit-links are required?
1-unit = distance between neighboring cores

Total wire length of all links (in units; 1-unit = distance between neighboring cores)

           N=16   N=64   N=256
2D FT1       32    192   1,024
2D FT2       64    384   2,048
2D FHT       72    392   1,800
2D mesh      24    112     480
2D torus     48    224     960

FT1: Fat Tree (2,4,1)  FT2: Fat Tree (2,4,2)  FHT: Fat H-Tree
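
The mesh and torus rows can be checked with a small sketch. It rests on assumptions that are not spelled out in the slides: the cores form a k x k grid, only router-to-router links are counted for these two topologies, and the torus uses the folded layout, in which each k-node ring has two 1-unit links at its ends and k-2 links of 2 units. The function names are hypothetical; the results happen to match the table.

```python
def mesh_units(k):
    """k x k mesh: 2*k rows and columns, each with k-1 links of 1 unit."""
    return 2 * k * (k - 1)


def folded_torus_units(k):
    """k x k folded torus: 2*k rings, each with two 1-unit and (k-2) 2-unit links."""
    ring = 2 * 1 + (k - 2) * 2
    return 2 * k * ring


for k in (4, 8, 16):  # N = 16, 64, 256 cores
    print(k * k, mesh_units(k), folded_torus_units(k))
# 16 24 48
# 64 112 224
# 256 480 960
```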

Total wire length of all links (in units)

           N=16   N=64   N=256
2D FT1       32    192   1,024
2D FT2       64    384   2,048
2D FHT       72    392   1,800
2D mesh      24    112     480
2D torus     48    224     960

4-stacked:

           N=16   N=64   N=256
3D FT1       16    128     768
3D FT2       32    256   1,536
3D FHT       40    200     904
3D mesh      16     96     448
3D torus     32    192     896

FT1: Fat Tree (2,4,1)  FT2: Fat Tree (2,4,2)  FHT: Fat H-Tree
Wire length of trees is reduced by 25%-50% (close to torus)
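
The 25%-50% figure follows directly from the tree rows of the two tables. The sketch below only recomputes it from the tabulated values; the dictionary and function names are chosen here for illustration.

```python
# Total wire length in units for N = 16, 64, 256, copied from the tables above.
wire_2d = {"FT1": (32, 192, 1024), "FT2": (64, 384, 2048), "FHT": (72, 392, 1800)}
wire_3d = {"FT1": (16, 128, 768), "FT2": (32, 256, 1536), "FHT": (40, 200, 904)}


def reduction_percent(before, after):
    """Relative reduction of wire length when going from 2-D to 3-D."""
    return 100.0 * (before - after) / before


reductions = {
    name: [round(reduction_percent(b, a), 1)
           for b, a in zip(wire_2d[name], wire_3d[name])]
    for name in wire_2d
}
print(reductions)
# {'FT1': [50.0, 33.3, 25.0], 'FT2': [50.0, 33.3, 25.0], 'FHT': [44.4, 49.0, 49.8]}
```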

Energy: NoC’s energy model
Ave. flit energy
–Send 1-flit to dest.
–How much energy [J]?
Parameters
–8mm square chip
–64-core (16-core x 4)
–90nm CMOS
Switching energy
–1-bit switching @ Router
–Gate-level sim
–0.183 [pJ / hop]
Link energy
–1-bit transfer @ Link
–0.150 [pJ / mm]
Via energy [Davis, DToC’05]
–4.34 [fF / via]
Figure label: 8mm
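
A minimal sketch of how these parameters could combine into a per-flit energy. It assumes a simple additive model (router switching energy per hop plus link energy per millimetre, both per bit, scaled by the 64-bit flit width used for the router); the slides do not give the exact formula. The via term is left out because it is given as a capacitance and no supply voltage is stated. The hop count and wire distance in the usage line are hypothetical inputs, not results from the talk.

```python
FLIT_BITS = 64                 # 1-flit = 64-bit
E_SW_PJ_PER_BIT_HOP = 0.183    # switching energy, 1-bit @ router
E_LINK_PJ_PER_BIT_MM = 0.150   # link energy, 1-bit @ link


def flit_energy_pj(hops, wire_mm):
    """Energy in pJ to move one flit over `hops` routers and `wire_mm` of horizontal wire.

    Assumed additive model; via energy is not included.
    """
    per_bit = hops * E_SW_PJ_PER_BIT_HOP + wire_mm * E_LINK_PJ_PER_BIT_MM
    return FLIT_BITS * per_bit


# Hypothetical path: 3 router hops and 4 mm of wire.
print(round(flit_energy_pj(3, 4.0), 3))  # 73.536
```

Under this model, shortening the horizontal distance a packet travels lowers the link term while the hop term stays the same.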

Energy: Reduction by going 3D
Figure: 2-D layout
–Frequent use of longest links
–Short hop count → less energy
FT1: Fat Tree (2,4,1)  FT2: Fat Tree (2,4,2)  FHT: Fat H-Tree

Energy: Reduction by going 3D
Figure: 2-D layout vs. 3-D layout
–Moving distance of packets is reduced
The 3D layout of trees reduces the energy by 30.8%-42.9%
FT1: Fat Tree (2,4,1)  FT2: Fat Tree (2,4,2)  FHT: Fat H-Tree

Summary: 3-D layout of trees
Drawbacks of on-chip tree-based topologies
–Long links around the root of tree
–Wire delay problem
–Repeater insertion → additional energy consumption
3-D layout schemes of Fat Trees & Fat H-Tree
–Wire length is reduced by 25%-50%
–Area overhead is at most 7.8%
–Flit transmission energy is reduced by 30.8%-42.9%
In addition, energy-hungry repeater buffers can be removed
Need to consider negative impacts of 3-D (cost, heat, yield…)

Thank you for your attention

Backup slides

Energy: Reduction by going 3D
Figure: 2-D layout (w/o repeaters) vs. 2-D layout (with repeaters) (*)
–Energy is increased
(*) Repeater insertion model: N. Weste et al., “CMOS VLSI Design (3rd ed)”, 2005.
FT1: Fat Tree (2,4,1)  FT2: Fat Tree (2,4,2)  FHT: Fat H-Tree
