Presentation is loading. Please wait.

Presentation is loading. Please wait.

ESE534: Computer Organization

Similar presentations


Presentation on theme: "ESE534: Computer Organization"— Presentation transcript:

1 ESE534: Computer Organization
Day 20: November 9, 2016 Interconnect 6: Direct Drive, MoT ESE Fall DeHon

2 Previously Tree-based network Mesh Interconnect Segmentation in Mesh
O(N) switches with linear population Mesh Interconnect O(Np+0.5) switches with linear population Segmentation in Mesh ESE Fall DeHon

3 Today Directional Drive Mesh-of-Trees (MoT) Multi-level metal
Heterogeneous Blocks ESE Fall DeHon

4 Bidirectional Any wire can send data any direction. Horizontal
Can route right or left Vertical Can route up or down ESE Fall DeHon

5 Buffered Bidirectional Wires
C Block S Block Logic ESE Fall DeHon

6 Bidirectional Any wire can send data any direction. Horizontal
Can route right or left Vertical Can route up or down Pros? Cons? ESE Fall DeHon

7 Tristate vs. Buffer Which is faster? ESE Fall DeHon

8 Buffered Switch Composition
Making a buffer tristateable makes it larger and diminishes its drive strength. ESE Fall DeHon

9 Slides and Study from Guy Lemieux (Paper FPT2004 – Tutorial FPT 2009)
Directional Drive Slides and Study from Guy Lemieux (Paper FPT2004 – Tutorial FPT 2009) ESE Fall DeHon

10 Bidirectional Wires Problem Half of tristate buffers left unused
Buffers + input muxes dominate interconnect area ESE Fall DeHon

11 Bidirectional vs Directional
ESE Fall DeHon

12 Bidirectional vs Directional
ESE Fall DeHon

13 Bidirectional Switch Block
ESE Fall DeHon

14 Directional Switch Block
ESE Fall DeHon

15 Bidirectional vs Directional
Switch Block Directional half as many Switch Elements Switch Element Same quantity and type of circuit elements, twice the wiring ESE Fall DeHon

16 Multi-driver Wiring Logic outputs use tristate buffers (C Block)
S Block S Block Directional & multi-driver wiring CLB ESE Fall DeHon

17 Single-driver Wiring Logic outputs use muxes (S Block)
New connectivity constraint Directional & single-driver wiring CLB ESE Fall DeHon

18 Long Wires in Directional
Long wires, span L tiles Example L = 3 CLB CLB CLB CLB CLB CLB CLB CLB CLB CLB 1 2 3 ESE Fall DeHon

19 Directional, Single-driver Benefits
Average improvements 0% channel width (most surprising?) 9% delay 14% tile length of physical layout 25% transistor count 32% area-delay product 37% wiring capacitance Any reason to use bidirectional? ESE Fall DeHon

20 Preclass Complete Table ESE Fall DeHon

21 MoT ESE Fall DeHon

22 Recall: Mesh Switches Switches per switchbox: Switches into network:
6w/Lseg Switches into network: (K+1) w Switches per PE: 6w/Lseg + Fc(K+1) w w = cNp-0.5 Total  Np-0.5 Total Switches: N*(Sw/PE)  Np+0.5 > N ESE Fall DeHon

23 Recall: Mesh Switches Switches per PE: Not change for
6w/Lseg + Fc(K+1) w w = cNp-0.5 Total  Np-0.5 Not change for Any constant Fc Any constant Lseg ESE Fall DeHon

24 Mesh of Trees Hierarchical Mesh Build Tree in each column
[Leighton/FOCS 1981] ESE Fall DeHon

25 Mesh of Trees Hierarchical Mesh Build Tree in each column
…and each row [Leighton/FOCS 1981] ESE Fall DeHon

26 MoT Parameterization Support C with additional trees C=1 C=2
(like BFT) C=1 C=2 ESE Fall DeHon

27 MoT Parameterization: P
ESE Fall DeHon

28 Mesh of Trees Logic Blocks Connect to the C trees C<W
Only connect at leaves of tree Connect to the C trees Per side 4C total C<W ESE Fall DeHon

29 Switches How many switches per tree? How many trees?
How many total switches? ESE Fall DeHon

30 Switches Total Tree switches Sw/Tree: 2 C N× (switches/tree)
ESE Fall DeHon

31 Switches Only connect to leaves of tree Leaf switches: C(K+1)
Total switches Leaf + Tree O(N) Compare Mesh O(Np+0.5) ESE Fall DeHon

32 Wires Design: O(Np) in top level Total wire width of channels: O(Np)
Another geometric sum No detail route guarantee (at present) ESE Fall DeHon

33 Empirical Results Benchmark: Toronto 20 Compare to Lseg=1, Lseg=4
CLMA ~ 8K LUTs Mesh(Lseg=4): w=14  122 switches/LB MoT(p=0.67,arity=2): C=4  89 switches/LB Benchmark wide: 10% less CLMA largest Asymptotic advantage [Rubin,DeHon/FPGA2003] ESE Fall DeHon

34 MoT Parameters C, P Arity Staggering ESE Fall DeHon

35 Staggering With multiple Trees Offset relative to each other
Avoids worst-case discrete breaks ESE Fall DeHon

36 Staggering With multiple Trees (not offset)
ESE Fall DeHon

37 Staggering With multiple Trees Offset relative to each other
Avoids worst-case discrete breaks ESE Fall DeHon

38 Staggering With multiple Trees Offset relative to each other
Avoids worst-case discrete breaks ESE Fall DeHon

39 Flattening Can use arity other than two ESE Fall DeHon

40 [Rubin&DeHon/TRVLSI2004]
Overall 26% fewer than mesh [Rubin&DeHon/TRVLSI2004] ESE Fall DeHon

41 Day 5 ESE Fall DeHon

42 Wire Layers = More Wiring
Day 5 ESE Fall DeHon

43 Mesh Segmentation Wires of uniform length Lseg
Allow wires to bypass switchboxes ESE Fall DeHon

44 Mesh C-box connections
What’s the relationship between metal layers and tile size (just C-Box)? ESE Fall DeHon

45 Single Segment Length Rebuffer
What’s the relationship between metal layers, segment length, and tile size (just for rebuffer)? Y-Y switches Like Preclass B Lseg=4 Side view Layout ESE Fall DeHon

46 Use of Metal Layers in Mesh
Number of metal layers we can use is ultimately limited Uniform wires saturate vias from active layer at bottom ESE Fall DeHon

47 MoT Layout Main issue is layout 1D trees in multilayer metal
ESE Fall DeHon

48 Row/Column Layout Geometric Progression  does not saturate via space!
ESE Fall DeHon

49 Row/Column Layout ESE Fall DeHon

50 Composite Logic Block Tile
ESE Fall DeHon

51 P=0.75 Row/Column Layout ESE Fall DeHon

52 P=0.75 Row/Column Layout ESE Fall DeHon

53 MoT Layout Easily laid out in Multiple metal layers
Minimal O(Np-0.5) layers Contain constant switching area per LB Even with p>0.5 ESE Fall DeHon

54 Compare Mesh Ability to use metal layers limited MoT
Switch area per LB grows O(Np-0.5) Ability to use metal layers limited Saturate vias MoT Switch area per LB constant O(1) Can use growing number of metal layers While keeping switch area constant ESE Fall DeHon

55 Heterogeneous Blocks ESE Fall DeHon

56 Heterogeneous How integrate heterogeneous blocks?
Memory blocks alongside 4-LUTs Processors Custom Units ESE Fall DeHon

57 Crossbar With Heterogeneous Blocks
ESE Fall DeHon

58 Replace Clusters Cluster LUTs Replace LUT clusters with other units
Requires same/similar I/O Network support If replace in row (column) Change height (width) to adjust for different size ESE Fall DeHon

59 Stratix V Floorplan TODO: commercial example
Source: ESE Fall DeHon

60 Replace Subtrees ESE Fall DeHon

61 Heterogeneous Example
Unified Network Fine-grained spatial logic Memory Blocks VLIW temporal processors ESE Fall DeHon

62 Heterogeneity Raises Architectural Questions
Resource balance Logic vs. memory [day 21, 22, 24] Fine-grained vs. coarse-grained Locality of resources Logic to memory [day 24] ESE Fall DeHon

63 Big Ideas [MSB Ideas] Benefits to Single Driver Hierarchy structure
allows to save switches O(N) vs. W(Np+0.5) reduces worst-case switches in path Can exploit multi-level metal ESE Fall DeHon

64 Admin HW8 due today HW9 next Wednesday Reading for Monday on Web
Must run tools; will take time (plan for it) Reading for Monday on Web ESE Fall DeHon


Download ppt "ESE534: Computer Organization"

Similar presentations


Ads by Google