Download presentation
Presentation is loading. Please wait.
1
ESE534: Computer Organization
Day 20: November 9, 2016 Interconnect 6: Direct Drive, MoT ESE Fall DeHon
2
Previously Tree-based network Mesh Interconnect Segmentation in Mesh
O(N) switches with linear population Mesh Interconnect O(Np+0.5) switches with linear population Segmentation in Mesh ESE Fall DeHon
3
Today Directional Drive Mesh-of-Trees (MoT) Multi-level metal
Heterogeneous Blocks ESE Fall DeHon
4
Bidirectional Any wire can send data any direction. Horizontal
Can route right or left Vertical Can route up or down ESE Fall DeHon
5
Buffered Bidirectional Wires
C Block S Block Logic ESE Fall DeHon
6
Bidirectional Any wire can send data any direction. Horizontal
Can route right or left Vertical Can route up or down Pros? Cons? ESE Fall DeHon
7
Tristate vs. Buffer Which is faster? ESE Fall DeHon
8
Buffered Switch Composition
Making a buffer tristateable makes it larger and diminishes its drive strength. ESE Fall DeHon
9
Slides and Study from Guy Lemieux (Paper FPT2004 – Tutorial FPT 2009)
Directional Drive Slides and Study from Guy Lemieux (Paper FPT2004 – Tutorial FPT 2009) ESE Fall DeHon
10
Bidirectional Wires Problem Half of tristate buffers left unused
Buffers + input muxes dominate interconnect area ESE Fall DeHon
11
Bidirectional vs Directional
ESE Fall DeHon
12
Bidirectional vs Directional
ESE Fall DeHon
13
Bidirectional Switch Block
ESE Fall DeHon
14
Directional Switch Block
ESE Fall DeHon
15
Bidirectional vs Directional
Switch Block Directional half as many Switch Elements Switch Element Same quantity and type of circuit elements, twice the wiring ESE Fall DeHon
16
Multi-driver Wiring Logic outputs use tristate buffers (C Block)
S Block S Block Directional & multi-driver wiring CLB ESE Fall DeHon
17
Single-driver Wiring Logic outputs use muxes (S Block)
New connectivity constraint Directional & single-driver wiring CLB ESE Fall DeHon
18
Long Wires in Directional
Long wires, span L tiles Example L = 3 CLB CLB CLB CLB CLB CLB CLB CLB CLB CLB 1 2 3 ESE Fall DeHon
19
Directional, Single-driver Benefits
Average improvements 0% channel width (most surprising?) 9% delay 14% tile length of physical layout 25% transistor count 32% area-delay product 37% wiring capacitance Any reason to use bidirectional? ESE Fall DeHon
20
Preclass Complete Table ESE Fall DeHon
21
MoT ESE Fall DeHon
22
Recall: Mesh Switches Switches per switchbox: Switches into network:
6w/Lseg Switches into network: (K+1) w Switches per PE: 6w/Lseg + Fc(K+1) w w = cNp-0.5 Total Np-0.5 Total Switches: N*(Sw/PE) Np+0.5 > N ESE Fall DeHon
23
Recall: Mesh Switches Switches per PE: Not change for
6w/Lseg + Fc(K+1) w w = cNp-0.5 Total Np-0.5 Not change for Any constant Fc Any constant Lseg ESE Fall DeHon
24
Mesh of Trees Hierarchical Mesh Build Tree in each column
[Leighton/FOCS 1981] ESE Fall DeHon
25
Mesh of Trees Hierarchical Mesh Build Tree in each column
…and each row [Leighton/FOCS 1981] ESE Fall DeHon
26
MoT Parameterization Support C with additional trees C=1 C=2
(like BFT) C=1 C=2 ESE Fall DeHon
27
MoT Parameterization: P
ESE Fall DeHon
28
Mesh of Trees Logic Blocks Connect to the C trees C<W
Only connect at leaves of tree Connect to the C trees Per side 4C total C<W ESE Fall DeHon
29
Switches How many switches per tree? How many trees?
How many total switches? ESE Fall DeHon
30
Switches Total Tree switches Sw/Tree: 2 C N× (switches/tree)
ESE Fall DeHon
31
Switches Only connect to leaves of tree Leaf switches: C(K+1)
Total switches Leaf + Tree O(N) Compare Mesh O(Np+0.5) ESE Fall DeHon
32
Wires Design: O(Np) in top level Total wire width of channels: O(Np)
Another geometric sum No detail route guarantee (at present) ESE Fall DeHon
33
Empirical Results Benchmark: Toronto 20 Compare to Lseg=1, Lseg=4
CLMA ~ 8K LUTs Mesh(Lseg=4): w=14 122 switches/LB MoT(p=0.67,arity=2): C=4 89 switches/LB Benchmark wide: 10% less CLMA largest Asymptotic advantage [Rubin,DeHon/FPGA2003] ESE Fall DeHon
34
MoT Parameters C, P Arity Staggering ESE Fall DeHon
35
Staggering With multiple Trees Offset relative to each other
Avoids worst-case discrete breaks ESE Fall DeHon
36
Staggering With multiple Trees (not offset)
ESE Fall DeHon
37
Staggering With multiple Trees Offset relative to each other
Avoids worst-case discrete breaks ESE Fall DeHon
38
Staggering With multiple Trees Offset relative to each other
Avoids worst-case discrete breaks ESE Fall DeHon
39
Flattening Can use arity other than two ESE Fall DeHon
40
[Rubin&DeHon/TRVLSI2004]
Overall 26% fewer than mesh [Rubin&DeHon/TRVLSI2004] ESE Fall DeHon
41
Day 5 ESE Fall DeHon
42
Wire Layers = More Wiring
Day 5 ESE Fall DeHon
43
Mesh Segmentation Wires of uniform length Lseg
Allow wires to bypass switchboxes ESE Fall DeHon
44
Mesh C-box connections
What’s the relationship between metal layers and tile size (just C-Box)? ESE Fall DeHon
45
Single Segment Length Rebuffer
What’s the relationship between metal layers, segment length, and tile size (just for rebuffer)? Y-Y switches Like Preclass B Lseg=4 Side view Layout ESE Fall DeHon
46
Use of Metal Layers in Mesh
Number of metal layers we can use is ultimately limited Uniform wires saturate vias from active layer at bottom ESE Fall DeHon
47
MoT Layout Main issue is layout 1D trees in multilayer metal
ESE Fall DeHon
48
Row/Column Layout Geometric Progression does not saturate via space!
ESE Fall DeHon
49
Row/Column Layout ESE Fall DeHon
50
Composite Logic Block Tile
ESE Fall DeHon
51
P=0.75 Row/Column Layout ESE Fall DeHon
52
P=0.75 Row/Column Layout ESE Fall DeHon
53
MoT Layout Easily laid out in Multiple metal layers
Minimal O(Np-0.5) layers Contain constant switching area per LB Even with p>0.5 ESE Fall DeHon
54
Compare Mesh Ability to use metal layers limited MoT
Switch area per LB grows O(Np-0.5) Ability to use metal layers limited Saturate vias MoT Switch area per LB constant O(1) Can use growing number of metal layers While keeping switch area constant ESE Fall DeHon
55
Heterogeneous Blocks ESE Fall DeHon
56
Heterogeneous How integrate heterogeneous blocks?
Memory blocks alongside 4-LUTs Processors Custom Units ESE Fall DeHon
57
Crossbar With Heterogeneous Blocks
ESE Fall DeHon
58
Replace Clusters Cluster LUTs Replace LUT clusters with other units
Requires same/similar I/O Network support If replace in row (column) Change height (width) to adjust for different size ESE Fall DeHon
59
Stratix V Floorplan TODO: commercial example
Source: ESE Fall DeHon
60
Replace Subtrees ESE Fall DeHon
61
Heterogeneous Example
Unified Network Fine-grained spatial logic Memory Blocks VLIW temporal processors ESE Fall DeHon
62
Heterogeneity Raises Architectural Questions
Resource balance Logic vs. memory [day 21, 22, 24] Fine-grained vs. coarse-grained Locality of resources Logic to memory [day 24] ESE Fall DeHon
63
Big Ideas [MSB Ideas] Benefits to Single Driver Hierarchy structure
allows to save switches O(N) vs. W(Np+0.5) reduces worst-case switches in path Can exploit multi-level metal ESE Fall DeHon
64
Admin HW8 due today HW9 next Wednesday Reading for Monday on Web
Must run tools; will take time (plan for it) Reading for Monday on Web ESE Fall DeHon
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.