Caltech CS184 Winter2003 -- DeHon 1 CS184a: Computer Architecture (Structure and Organization) Day 14: February 10, 2003 Interconnect 4: Switching.

Caltech CS184 Winter DeHon 1 CS184a: Computer Architecture (Structure and Organization) Day 14: February 10, 2003 Interconnect 4: Switching

Caltech CS184 Winter DeHon 2 Previously: used Rent’s Rule characterization to understand wire growth: IO = c N^p. Top bisections will be Ω(N^p). 2D wiring area: Ω(N^p) × Ω(N^p) = Ω(N^{2p}).

Caltech CS184 Winter DeHon 3 We Know: how we avoid O(N^2) wire growth for “typical” designs; how to characterize locality; how we might exploit that locality to reduce wire growth; the wire growth implied by a characterized design.

Caltech CS184 Winter DeHon 4 Today: Switching (implications, options); multilayer metallization.

Caltech CS184 Winter DeHon 5 Switching: How can we use the locality captured by Rent’s Rule to reduce switching requirements? (How much?)

Caltech CS184 Winter DeHon 6 Observation: the locality that saved us wiring also saves us switching. IO = c N^p

Caltech CS184 Winter DeHon 7 Consider: Crossbar case to exploit wiring: split into two halves, connect with limited wires; N/2 × N/2 crossbar in each half; N/2 × (N/2)^p connect to bisection wires; 2(N^2/4) + 2(N/2)^{p+1} < N^2
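
The switch count on this slide can be checked numerically. The following is a minimal sketch, assuming a Rent constant c = 1 and real-valued (unrounded) channel counts; the function names are illustrative and not from the lecture.

```python
def full_crossbar_switches(n: int) -> int:
    """Switches in a flat N x N crossbar."""
    return n * n


def split_crossbar_switches(n: int, p: float) -> float:
    """Switches after one split into two halves joined by (N/2)^p bisection wires.

    Assumes Rent constant c = 1 and does not round channel counts.
    """
    half = n / 2
    local = 2 * half * half            # two N/2 x N/2 crossbars: 2(N^2/4)
    to_bisection = 2 * half * half**p  # each half: N/2 x (N/2)^p, i.e. 2(N/2)^(p+1)
    return local + to_bisection


if __name__ == "__main__":
    n, p = 1024, 2 / 3
    print(full_crossbar_switches(n), split_crossbar_switches(n, p))
```

For N = 1024 and p = 2/3, a single split already cuts the count to a little over half of N^2; the saving compounds when the split is applied recursively.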

Caltech CS184 Winter DeHon 8 Recurse Repeat at each level –form tree

Caltech CS184 Winter DeHon 9 Result: If use crossbar at each tree node: O(N^{2p}) wiring area (p > 0.5, direct from bisection); O(N^{2p}) switches. Top switch box is O(N^{2p}); switches at one level down is 2 × (1/2^p)^2 × previous level; (2/2^{2p}) = 2^{(1-2p)}; coefficient < 1 for p > 0.5

Caltech CS184 Winter DeHon 10 Result: If use crossbar at each tree node: O(N^{2p}) switches. Top switch box is O(N^{2p}); switches at one level down is 2^{(1-2p)} × previous level. Total switches: N^{2p} × (1 + 2^{(1-2p)} + 2^{2(1-2p)} + 2^{3(1-2p)} + …); get geometric series; sums to O(1): N^{2p} × (1/(1 − 2^{(1-2p)})) = 2^{(2p-1)}/(2^{(2p-1)} − 1) × N^{2p}
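
As a sanity check on the geometric series, the sketch below compares a truncated sum of the per-level ratios with the closed form from the slide. It only evaluates the multiplier on N^{2p}; the function names are illustrative assumptions.

```python
def level_ratio(p: float) -> float:
    """Switch ratio between one tree level and the level above: 2^(1-2p)."""
    return 2.0 ** (1 - 2 * p)


def truncated_series(p: float, levels: int) -> float:
    """1 + r + r^2 + ... over the given number of tree levels."""
    r = level_ratio(p)
    return sum(r**i for i in range(levels))


def closed_form(p: float) -> float:
    """Limit of the series, 2^(2p-1) / (2^(2p-1) - 1); requires p > 0.5."""
    if p <= 0.5:
        raise ValueError("series only converges for p > 0.5")
    top = 2.0 ** (2 * p - 1)
    return top / (top - 1)


if __name__ == "__main__":
    p = 2 / 3
    print(truncated_series(p, 10), truncated_series(p, 200), closed_form(p))
```

For p = 2/3 the multiplier is a little under 5, so the whole tree costs only a small constant times the top switch box.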

Caltech CS184 Winter DeHon 11 Good News: asymptotically optimal. Even without switches, area is O(N^{2p}), so adding O(N^{2p}) switches does not change the asymptotic area.

Caltech CS184 Winter DeHon 12 Bad News: Switch area >> wire crossing area. Consider a 6λ wire pitch → crossing area 36λ^2; a typical (passive) switch is far larger. Passive only: 70× area difference; worse once we rebuffer or latch signals. Switches are limited to the substrate, whereas we can use additional metal layers for wiring area.

Caltech CS184 Winter DeHon 13 Additional Structure This motivates us to look beyond crossbars –can depopulate crossbars on up-down connection without loss of functionality?

Caltech CS184 Winter DeHon 14 Can we do better? Crossbar too powerful? –Does the specific down channel matter? What do we want to do? –Connect to any channel on lower level –Choose a subset of wires from upper level order not important

Caltech CS184 Winter DeHon 15 N choose K: Exploit freedom to depopulate switchbox. Can do with K × (N−K+1) switches vs. K × N; save ~K^2
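
One way to see that K × (N−K+1) switches can suffice is a banded arrangement in which output j has switches only to inputs j through j + (N−K). The slide does not say which construction it has in mind, so the layout below is an assumption chosen for illustration, and the function names are hypothetical. The sketch brute-forces every K-subset for small N to confirm each can be routed.

```python
from itertools import combinations


def banded_switch_count(n: int, k: int) -> int:
    """Each of K outputs has N-K+1 switches."""
    return k * (n - k + 1)


def route_subset(chosen, n: int, k: int):
    """Route K chosen inputs to outputs 0..K-1 in sorted order.

    Assumed layout: output j connects to inputs j .. j + (n - k).
    Returns {output: input}, or None if some switch is missing.
    """
    reach = n - k
    mapping = {}
    for j, inp in enumerate(sorted(chosen)):
        if not (j <= inp <= j + reach):
            return None
        mapping[j] = inp
    return mapping


def all_subsets_routable(n: int, k: int) -> bool:
    """Check every K-subset of the N inputs exhaustively."""
    return all(
        route_subset(subset, n, k) is not None
        for subset in combinations(range(n), k)
    )


if __name__ == "__main__":
    print(banded_switch_count(8, 3), 8 * 3, all_subsets_routable(8, 3))
```

The sorted assignment works because the j-th smallest chosen input is at least j and at most N−K+j. The exact saving is K^2 − K switches, which matches the slide's ~K^2.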

Caltech CS184 Winter DeHon 16 N-choose-M: Up-down connections only require concentration: choose M things out of N, i.e. order of subset irrelevant. Consequence: can save a constant factor ~2^p/(2^p − 1): (N/2)^p × N^p vs. (N^p − (N/2)^p + 1)(N/2)^p; p = 2/3 → 2^p/(2^p − 1) ≈ 2.7. Similarly, left-right: order not important → reduces switches
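
The constant factor can be checked against the two switch counts directly. This is a small sketch assuming Rent constant c = 1 and unrounded channel counts; the function names are illustrative.

```python
def saving_factor(p: float) -> float:
    """Asymptotic saving from concentration: 2^p / (2^p - 1)."""
    return 2.0**p / (2.0**p - 1)


def exact_ratio(n: float, p: float) -> float:
    """Full up-down crossbar count divided by the concentrator count."""
    upper = n**p          # channels at the upper level
    lower = (n / 2) ** p  # channels at the lower level
    full = lower * upper
    concentrator = (upper - lower + 1) * lower
    return full / concentrator


if __name__ == "__main__":
    p = 2 / 3
    print(saving_factor(p), exact_ratio(2.0**30, p))
```

For large N the "+1" term becomes negligible and the exact ratio approaches the asymptotic factor.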

Caltech CS184 Winter DeHon 17 Multistage Switching: Can route any permutation with fewer switches than a crossbar if we allow switching in stages: trade an increase in switches in path for a decrease in total switches

Caltech CS184 Winter DeHon 18 Decomposition

Caltech CS184 Winter DeHon 19 Decomposition Switches: N/2 × 2 × 4 + (N/2)^2 < N^2

Caltech CS184 Winter DeHon 20 Recurse If it works once, try it again…

Caltech CS184 Winter DeHon 21 Result: Beneš Network: 2 log_2(N) − 1 stages (switches in path), each of N/2 2×2 switchpoints (4 switches each); 4N × log_2(N) total switches; compute route in O(N log(N)) time
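
The stage and switch counts follow directly from the structure described on the slide. The sketch below tallies them for a power-of-two N and compares with a crossbar; the function names are illustrative.

```python
def benes_stages(n: int) -> int:
    """Number of stages, 2*log2(N) - 1, for N a power of two (N >= 2)."""
    k = n.bit_length() - 1
    if n < 2 or n != 1 << k:
        raise ValueError("n must be a power of two >= 2")
    return 2 * k - 1


def benes_switches(n: int) -> int:
    """Stages x (N/2 switchpoints per stage) x (4 switches per 2x2 switchpoint)."""
    return benes_stages(n) * (n // 2) * 4


def crossbar_switches(n: int) -> int:
    """Switches in a flat N x N crossbar."""
    return n * n


if __name__ == "__main__":
    for n in (8, 64, 1024):
        print(n, benes_stages(n), benes_switches(n), crossbar_switches(n))
```

The exact total is 2N(2 log_2(N) − 1), slightly below the slide's 4N × log_2(N). With these constants, a crossbar is still smaller at N = 8 (64 vs. 80 switches); the multistage network wins as N grows.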

Caltech CS184 Winter DeHon 22 Beneš Network Wiring: Bisection: N. Wiring → O(N^2) area (fixed wire layers)

Caltech CS184 Winter DeHon 23 Beneš Switching: Beneš reduced switches from N^2 to N log(N) using a multistage network. Replace crossbars in tree with Beneš switching networks

Caltech CS184 Winter DeHon 24 Beneš Switching: Implication of Beneš switching: still require O(W^2) wiring per tree node, or a total of O(N^{2p}) wiring; now O(W log(W)) switches per tree node, which converges to O(N) total switches! O(log^2(N)) switches in path across network. Strictly speaking, delay is dominated by wire delay ~O(N^p), but constants make this of little practical interest except for very large networks.
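
The claim that O(W log(W)) switches per tree node sums to O(N) can be illustrated numerically. The sketch below is a simplified model, not the lecture's derivation: it assumes a binary tree, Rent constant c = 1, W = n^p channels at a node whose subtree holds n endpoints, and it drops the constant factor in the Beneš switch count.

```python
import math


def benes_tree_switches(levels: int, p: float) -> float:
    """Sum W*log2(W) over all nodes of a binary tree with 2**levels endpoints."""
    n_total = 2**levels
    total = 0.0
    for j in range(1, levels + 1):
        subtree = 2**j              # endpoints under a node at height j
        w = subtree**p              # channels at that node (assumed c = 1)
        nodes = n_total // subtree  # number of nodes at height j
        total += nodes * w * math.log2(w)
    return total


def switches_per_endpoint(levels: int, p: float) -> float:
    """Total switches divided by N; stays bounded if the total is O(N)."""
    return benes_tree_switches(levels, p) / 2**levels


def limiting_constant(p: float) -> float:
    """Limit of switches_per_endpoint as levels grows: p * x / (1 - x)^2, x = 2^(p-1)."""
    x = 2.0 ** (p - 1)
    return p * x / (1 - x) ** 2


if __name__ == "__main__":
    p = 2 / 3
    for levels in (10, 20, 40):
        print(levels, switches_per_endpoint(levels, p))
    print("limit", limiting_constant(p))
```

Under these assumptions the per-endpoint count rises toward a fixed constant as N grows, which is the O(N) behavior the slide describes for p < 1.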

Caltech CS184 Winter DeHon 25 Better yet… Believe do not need Beneš on the up paths Single switch on up path Beneš for crossover Switches in path: log(N) up + log(N) down + 2log(N) crossover = 4 log(N) = O(log (N))

Caltech CS184 Winter DeHon 26 Linear Switch Population Can further reduce switches –connect each lower channel to O(1) channels in each tree node –end up with O(W) switches per tree node

Caltech CS184 Winter DeHon 27 Linear Switch (p=0.5)

Caltech CS184 Winter DeHon 28 Linear Consequences: Good News: Linear switches: O(log(N)) switches in path; O(N^{2p}) wire area; O(N) switches; more practical than Beneš case

Caltech CS184 Winter DeHon 29 Linear Consequences: Bad News: Lacks a guarantee that we can use all wires: as shown, mapping ratio is at least > 1; there are likely cases where even a constant does not suffice; expect no worse than logarithmic; open to establish a tight lower bound for any linear arrangement. Finding routes is harder: no longer linear time, deterministic; open as to exactly how hard

Caltech CS184 Winter DeHon 30 Mapping Ratio: Mapping ratio says: if I have W channels, I may only be able to use W/MR wires for a particular design’s connection pattern. To accommodate any design: for all channels, physical wires ≥ MR × logical
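
The mapping-ratio relation can be written as two small helpers. This is only a sketch of the arithmetic; the function names are illustrative, and `Fraction` is used so that ratios such as 3/2 are exact.

```python
import math
from fractions import Fraction


def physical_channels_needed(logical: int, mr: Fraction) -> int:
    """Smallest integer channel width satisfying physical >= MR * logical."""
    return math.ceil(logical * mr)


def usable_wires(physical: int, mr: Fraction) -> int:
    """Wires a design can count on using out of W physical channels: W / MR."""
    return math.floor(physical / mr)


if __name__ == "__main__":
    mr = Fraction(3, 2)
    print(physical_channels_needed(10, mr), usable_wires(12, mr))
```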

Caltech CS184 Winter DeHon 31 Mapping Ratio Example: –Shows MR=3/2 –For Linear Population, 1:1 switchbox

Caltech CS184 Winter DeHon 32 Area Comparison Both: p=0.67 N=1024 M-choose-N perfect map Linear MR=2

Caltech CS184 Winter DeHon 33 Area Comparison: M-choose-N perfect map vs. Linear MR=2. Since switch >> wire, we may be able to tolerate MR > 1: it reduces switches for a net area savings. Open: prove any constant mapping ratio. Empirical: never seen greater than 1.5

Caltech CS184 Winter DeHon 34 Multi-layer metal? Preceding assumed –fixed wire layers In practice, –increasing wire layers with shrinking tech. –Increasing wire layers with chip capacity wire layer growth ~ O(log(N))

Caltech CS184 Winter DeHon 35 Multi-Layer: Natural response to Ω(N^{2p}): wire layers. Given N^p wires in bisection, rather than accept N^p width, use N^{(p-0.5)} layers and accommodate in N^{0.5} width; now wiring takes Θ(N) 2D area with N^{(p-0.5)} wire layers. For p=0.5, log(N) layers to accommodate wiring
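
The trade between width and layers can be tabulated with a few lines. The sketch below assumes Rent constant c = 1, p ≥ 0.5, and considers only the top bisection; the function name is illustrative.

```python
def multilayer_wiring(n: float, p: float):
    """Return (layers, width, area) when N^p bisection wires are stacked.

    Assumes c = 1 and p >= 0.5. Layers = N^(p-0.5), so each layer carries
    N^0.5 wires and the 2D footprint is N^0.5 x N^0.5 = N.
    """
    if p < 0.5:
        raise ValueError("model assumes p >= 0.5")
    bisection = n**p
    layers = n ** (p - 0.5)
    width = bisection / layers
    return layers, width, width * width


if __name__ == "__main__":
    print(multilayer_wiring(4096, 2 / 3))
```

For N = 4096 and p = 2/3 this gives about 4 layers and a width of about 64 wire pitches, for a footprint of about N. This bisection-only model does not capture the log(N) layers the slide gives for p = 0.5.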

Caltech CS184 Winter DeHon 36 Linear + Multilayer: Multilayer says we can do wiring in Θ(N) 2D area. Switches require 2D area: more than O(N) switches would make switches dominate; Linear and Beneš have O(N) switches. There is a possibility we can achieve O(N) area with multilayer metal and linear population

Caltech CS184 Winter DeHon 37 Fold and Squash Layout Caveat: this formulation for p=0.5 … Maybe p>0.5 on Day16

Caltech CS184 Winter DeHon 38 Fold and Squash Layout

Caltech CS184 Winter DeHon 39 Folding

Caltech CS184 Winter DeHon 40 Folding

Caltech CS184 Winter DeHon 41 Folding

Caltech CS184 Winter DeHon 42 Folding

Caltech CS184 Winter DeHon 43 Folding

Caltech CS184 Winter DeHon 44 Folding

Caltech CS184 Winter DeHon 45 Folding Invariants Lower folds leave both diagonals free Current level consumes one, leaving other free

Caltech CS184 Winter DeHon 46 Compact Folded Layout Can contain switches to constant area Wires still grow faster than linear Can use extra wire layers to accommodate wire growth (whereas switches not helped by additional wire layers)

Caltech CS184 Winter DeHon 47 Fold and Squash Result: Can lay out a BFT in O(N) 2D area with O(log(N)) wiring layers

Caltech CS184 Winter DeHon 48 Big Ideas [MSB Ideas]: In addition to wires, must have switches, which have significant area and delay. Rent’s Rule locality reduces both wiring and switching requirements. Naïve switches match wires at O(N^{2p}): switch area >> wire area; this prevents benefit from multiple layers of metal

Caltech CS184 Winter DeHon 49 Big Ideas [MSB Ideas] Can achieve O(N) switches –plausibly O(N) area with sufficient metal layers Switchbox depopulation –save considerably on area (delay) –will waste wires –May still come out ahead (evidence to date) –So far: routing no longer guaranteed routing becomes NP-complete?