1
TLC: Transmission Line Caches
Brad Beckmann, David Wood
Multifacet Project, http://www.cs.wisc.edu/multifacet/
University of Wisconsin-Madison
12/3/03
2
Beckmann & Wood, MICRO ’03 - TLC: Transmission Line Caches
Overview
Problem: Global interconnect
Opportunity: On-chip transmission lines
  – What are they?
  – Why now?
Application: Large on-chip caches
Solution: TLC: Transmission Line Caches
  + Consistent high performance
  + Simple logical design
  + Less substrate area
  – Circuit verification
  – Wafer manufacturing cost
3
Outline
Problem: Global interconnect
Opportunity: On-chip transmission lines
Application: Large on-chip caches
Solution: TLC: Transmission Line Caches
Evaluation
Conclusions
4
Global Interconnect Problem
Global interconnect latency → Bottleneck
  – RC delay dominant
  – Held constant using repeaters
  – Doesn’t scale with transistors
Large structures particularly hurt
  – Partitioning mitigates intra-partition delay
  – Performance dominated by inter-partition delay
5
Conventional Solution
↑ wire size → ↓ RC delay
  + 3x size → 3x reduced delay
  + ↑ wire segment length
  – 3x channel area
  – Doesn’t scale
      Intrinsic repeater delay
      Inductive effects
A Better Solution?
6
Outline
Problem: Global interconnect
Opportunity: On-chip transmission lines
Application: Large on-chip caches
Solution: TLC - Transmission Line Caches
Evaluation
Conclusions
7
RC vs. TL Communication
[Figure: two panels, Conventional Global RC Wire and On-chip Transmission Line; each plots voltage against distance between driver and receiver, with the threshold Vt marked]
8
RC Wire vs. TL Design
[Figure: driver-to-receiver wiring for the two designs]
Conventional Global RC Wire: RC delay dominated, ~0.375 mm
On-chip Transmission Line: LC delay dominated, ~10 mm
9
On-chip Transmission Lines
Why now? → 2010 technology
  – Relative RC delay ↑
  – Improve latency by 10x or more
What are their limitations?
  – Require thick wires and dielectric spacing
  – Increase wafer cost
Presents a different Latency/Bandwidth Tradeoff
10
Latency Comparison [figure]
11
Bandwidth Comparison
[Figure labels: 2 transmission line signals, 50 conventional signals]
Key observation
Transmission lines – route over large structures
Conventional wires – substrate area & vias for repeaters
12
Outline
Problem: Global interconnect
Opportunity: On-chip transmission lines
Application: Large on-chip caches
Solution: TLC: Transmission Line Caches
Evaluation
Conclusions
13
Texas Non-uniform Cache Architectures (NUCA)
SNUCA – statically partitions addresses across the banks
[Figure labels: Bank, Switch, Cache Controller, Request 0x….3, Request 0x….C]
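The static partitioning that SNUCA uses can be illustrated with a small sketch. The slide only says that addresses are statically partitioned across the banks; the 64-byte block size, the use of low-order block-address bits as the bank index, and the function name `snuca_bank` are assumptions made for illustration. The bank count of 32 comes from the Cache Characteristics slide.

```python
BLOCK_BYTES = 64   # assumed block size; not given on the slide
NUM_BANKS = 32     # SNUCA bank count from the Cache Characteristics slide


def snuca_bank(addr: int) -> int:
    """Return the bank that statically owns the block containing addr.

    Hypothetical mapping: the block number modulo the number of banks.
    """
    block_number = addr // BLOCK_BYTES
    return block_number % NUM_BANKS


# Addresses inside one block share a bank; consecutive blocks use consecutive banks.
for addr in (0x0000, 0x003F, 0x0040, 0x07C0, 0x0800):
    print(hex(addr), "-> bank", snuca_bank(addr))
```

Because the mapping depends only on the address, the controller knows which bank to query without any search, which is the property that DNUCA gives up when blocks migrate.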
14
Texas DNUCA Solution
Frequently requested blocks migrate towards the cache controller
[Figure labels: A, B]
Issues with DNUCA
  – Locating cache blocks
  – Power consumed accessing distant banks
  – 15% of total area devoted to routing channels
15
Outline
Problem: Global interconnect
Opportunity: On-chip transmission lines
Application: Large on-chip caches
Solution: TLC - Transmission Line Caches
Evaluation
Conclusions
16
TLC - Transmission Line Cache
High bandwidth, low latency interface between the controller and banks
[Figure labels: 512 KB Bank, TLC Cache Controller, TL Drivers & Receivers, TL link 2x8 bytes]
17
TLC Cache Controller
[Figure labels: Central Cache Controller Logic, Repeaters, Multi-cycle delay, Latches, Transmission Line Transceivers, Transmission Lines]
18
Outline
Problem: Global interconnect
Opportunity: On-chip transmission lines
Application: Large on-chip caches
Solution: TLC - Transmission Line Caches
Evaluation
Conclusions
19
Methodology
Assumptions
  – ITRS projection for 2010
      45 nm technology
      Low-k (2.1) intermetal dielectric
  – 10 GHz operational frequency
Physical Evaluation
  – Linpar RLC extractor
  – Hspice W element transmission line
Performance Evaluation
  – Full system simulation
  – Simics extended with an Out-of-Order processor and memory system timing models
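The stated assumptions allow a rough lower bound on transmission-line delay. The sketch below assumes an ideal lossless line in a nonmagnetic dielectric, where the signal travels at c divided by the square root of the dielectric constant; it ignores driver and receiver delay and line loss, so it is not a substitute for the Hspice results. The 10 mm length is taken from the RC Wire vs. TL Design slide, and the function name is hypothetical.

```python
import math

C_VACUUM_M_PER_S = 299_792_458.0  # speed of light in vacuum
K_DIELECTRIC = 2.1                # low-k intermetal dielectric (Methodology slide)
CLOCK_HZ = 10e9                   # 10 GHz operational frequency (Methodology slide)
LINE_LENGTH_M = 10e-3             # ~10 mm line (RC Wire vs. TL Design slide)


def time_of_flight_s(length_m: float, k: float) -> float:
    """Ideal lossless-line delay: length divided by c / sqrt(k)."""
    velocity = C_VACUUM_M_PER_S / math.sqrt(k)
    return length_m / velocity


tof = time_of_flight_s(LINE_LENGTH_M, K_DIELECTRIC)
cycle = 1.0 / CLOCK_HZ
print(f"ideal time of flight: {tof * 1e12:.1f} ps")
print(f"fraction of one clock cycle: {tof / cycle:.2f}")
```

Under these idealized assumptions a 10 mm line is crossed in under one 10 GHz cycle, which is consistent with the deck's claim that transmission lines offer much lower global latency than repeated RC wires.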
20
Cache Characteristics

Cache Design  Total Size  Banks  Bank Size  Bank Access Time  Uncontended Latency
SNUCA         16 MB       32     512 KB     8 cycles          9 – 32 cycles
DNUCA         16 MB       256    64 KB      3 cycles          3 – 47 cycles
TLC           16 MB       32     512 KB     8 cycles          10 – 16 cycles

Exclusive write-back caches
4 wide, 30 stage pipeline, OoO processor
300 cycle memory latency
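The Cache Characteristics figures can be cross-checked with a few lines of arithmetic. This sketch only restates the slide's numbers; the dictionary layout and helper names are illustrative. It confirms that each design totals 16 MB and computes how widely the uncontended latency varies across banks.

```python
# Values copied from the Cache Characteristics slide.
designs = {
    "SNUCA": {"banks": 32, "bank_kb": 512, "latency_cycles": (9, 32)},
    "DNUCA": {"banks": 256, "bank_kb": 64, "latency_cycles": (3, 47)},
    "TLC": {"banks": 32, "bank_kb": 512, "latency_cycles": (10, 16)},
}


def total_mb(design: dict) -> float:
    """Total capacity in MB: number of banks times bank size."""
    return design["banks"] * design["bank_kb"] / 1024


def latency_spread(design: dict) -> int:
    """Difference between the slowest and fastest uncontended access."""
    fastest, slowest = design["latency_cycles"]
    return slowest - fastest


for name, design in designs.items():
    print(f"{name}: {total_mb(design):.0f} MB, "
          f"latency spread {latency_spread(design)} cycles")
```

The narrow latency range of TLC relative to the two NUCA designs is one way to read the deck's "consistent high performance" claim.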
21
Performance [figure; panels: SpecINT, SpecFP, Commercial]
22
Substrate Area

Cache Design  Storage Area  Channel Area  Controller Area  Total Area
D-NUCA        92 mm²        17 mm²        1.1 mm²          110 mm²
TLC           77 mm²        3.1 mm²       10 mm²           91 mm²*

* 18% reduction

On-chip transmission lines allow direct routing from the driver to receiver without repeaters
  – Facilitates compact layout
  – Devotes less substrate area to the routing channels
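The area figures can be checked against the claims made elsewhere in the deck. This sketch uses only the channel and total areas as read from the flattened slide text, so the values are a best-effort reading; the helper names are illustrative. With these rounded entries the total reduction comes out near 17%, slightly below the slide's 18%, which presumably reflects rounding in the published table.

```python
# Channel and total substrate area in mm^2, as read from the Substrate Area slide.
areas_mm2 = {
    "D-NUCA": {"channel": 17.0, "total": 110.0},
    "TLC": {"channel": 3.1, "total": 91.0},
}


def channel_share(name: str) -> float:
    """Fraction of a design's total area devoted to routing channels."""
    entry = areas_mm2[name]
    return entry["channel"] / entry["total"]


def total_reduction(baseline: str, candidate: str) -> float:
    """Fractional reduction in total area of candidate relative to baseline."""
    base = areas_mm2[baseline]["total"]
    return (base - areas_mm2[candidate]["total"]) / base


for name in areas_mm2:
    print(f"{name}: channels use {channel_share(name):.1%} of total area")
print(f"TLC total area reduction: {total_reduction('D-NUCA', 'TLC'):.1%}")
```

The D-NUCA channel share of roughly 15% agrees with the "15% of total area devoted to routing channels" point on the DNUCA slide.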
23
Link Utilization [figure]
24
Optimized TLC Designs
Utilize fewer transmission lines
  – Base design: requires 2k transmission lines
  – Opt designs: require 1k, 500, & 350
Reduce manufacturing cost
Increase logic complexity
25
Link Utilization (TLC Family) [figure]
26
Performance (TLC Family) [figure]
27
Conclusions 1
Transmission lines offer a different latency/bandwidth tradeoff
Advantages
  – Lower latency for global links
  – Direct routing over large structures
Limitations
  – Large, sparsely populated metal layers
  – Greater circuit verification effort
28
Conclusions 2
Possible application: TLC
Advantages
  – Consistent high performance
  – Simpler logical design
  – 18% less substrate area
  – Less power in the communication network
Disadvantages
  – Circuit verification
  – Wafer cost
29
Other Applications?
30
Optimized TLC Designs
[Figure labels: 1 MB Bank, TLCopt Cache Controller, TL link 2x126 bits, TL link 2x64 bits, TL link 2x44 bits]
TLCopt 1000
  – Blocks are partitioned across 2 banks
  – Each transmission line link is 126 bits wide
  – 1008 total data TLs
TLCopt 500
  – Blocks are partitioned across 4 banks
  – Each transmission line link is 64 bits wide
  – 512 total data TLs
TLCopt 350
  – Blocks are partitioned across 8 banks
  – Each transmission line link is 44 bits wide
  – 352 total data TLs
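The link widths and transmission-line totals on this slide are related by simple arithmetic. The sketch below divides each total by the stated link width; the names are illustrative, and reading the quotient as a count of link-width groups is an inference from the numbers, not something the slide states.

```python
# Values copied from the Optimized TLC Designs slide.
tlcopt = {
    "TLCopt 1000": {"banks_per_block": 2, "link_bits": 126, "total_data_tls": 1008},
    "TLCopt 500": {"banks_per_block": 4, "link_bits": 64, "total_data_tls": 512},
    "TLCopt 350": {"banks_per_block": 8, "link_bits": 44, "total_data_tls": 352},
}


def link_width_groups(design: dict) -> int:
    """Total data transmission lines divided by the width of one link."""
    groups, remainder = divmod(design["total_data_tls"], design["link_bits"])
    if remainder:
        raise ValueError("total is not a whole multiple of the link width")
    return groups


for name, design in tlcopt.items():
    print(f"{name}: {design['total_data_tls']} TLs = "
          f"{link_width_groups(design)} x {design['link_bits']} bits")
```

All three designs divide evenly with the same quotient, so the reduction in total lines comes from narrowing each link while spreading every block across more banks.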
31
Equake Performance [figure]
32
Additional Transceiver Delay [figure]