Published by Basil Daniel. Modified over 9 years ago.
1
RegionTracker: Using Dual-Grain Tracking for Energy Efficient Cache Lookup
Jason Zebchuk and Andreas Moshovos
{zebchuk,moshovos}@eecg.toronto.edu
University of Toronto, Department of Electrical and Computer Engineering
Workshop on Complexity-Effective Design, June 2006
2
June 18, 2006, Zebchuk ©, RegionTracker: Using Dual-Grain Tracking for Energy Efficient Cache Lookup
Need for Energy Efficient L2 Lookups
[Figure: I$, D$, CPU, L2 TAGS, L2 DATA]
- Locate blocks in higher-level caches more efficiently
- Conventional tags are getting larger
  - Technology, microarchitectural and application trends
  - Larger caches use more energy
- Demonstrate lookup energy reductions of up to 82%
  - Up to 38% on average across SPEC
3
Dual-Grain Tracking
[Figure: memory as a collection of blocks; memory as a collection of REGIONS]
- Region: 2^n-sized, aligned memory area
- Similar concept already used by various structures
  - TLB, Page Table
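As a rough illustration of the two granularities, the sketch below splits an address into a region tag, a block index within the region, and a byte offset. The sizes are assumptions borrowed from later slides (8 kB regions holding 64 blocks each, which implies 128-byte blocks); the function name `split_address` is hypothetical.

```python
# Assumed sizes: 8 kB regions (2^13 bytes) with 64 blocks each,
# so each block is 128 bytes (2^7 bytes).
REGION_BITS = 13
BLOCK_BITS = 7
BLOCKS_PER_REGION_BITS = REGION_BITS - BLOCK_BITS  # 6 bits -> 64 blocks


def split_address(addr):
    """Return (region_tag, block_index, offset) for a byte address."""
    region_tag = addr >> REGION_BITS
    block_index = (addr >> BLOCK_BITS) & ((1 << BLOCKS_PER_REGION_BITS) - 1)
    offset = addr & ((1 << BLOCK_BITS) - 1)
    return region_tag, block_index, offset


example = split_address(0x12345)
```

Because regions are power-of-two sized and aligned, the split needs only shifts and masks, much like the page-number/offset split used by a TLB.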
4
Program Behavior / Motivation
- Few active Regions
- "Bursty" access
- Mostly gone before accessed again
- RegionTracker:
  - Identify First Misses
  - Track block location for Few Regions
[Figure: panels "In principle" and "In practice"; annotation "And before is touched again"]
→ How can this reduce energy?
5
RegionTracker: Low Power Lookups
[Figure: I$, D$, CPU, L2 TAGS, L2 DATA (shown twice)]
- Frequent case:
  - Few Active Regions
  - Macroscopically Transient
- RegionTracker:
  - Dynamically Identify Newly Touched Regions
  - Track block location using a compact structure
6
RegionTracker Organization
[Figure: I$, D$, CPU, L2 TAGS, L2 DATA]
- CRH for First Miss Detection: 5% of tags
- CBV for Tracking blocks within 128 regions: 17.5%
- 128 x 8 kB regions = 1 MB tracked (at most 25% of a 4 MB L2)
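The coverage figure on this slide can be checked with a few lines of arithmetic. This is only a worked calculation using the numbers stated above (128 regions, 8 kB each, 4 MB L2); the variable names are illustrative.

```python
KB = 1024
MB = 1024 * KB

tracked_regions = 128
region_size = 8 * KB
l2_size = 4 * MB

# Total memory whose block locations the CBV can describe at once.
tracked_bytes = tracked_regions * region_size

# Upper bound on the fraction of the L2 covered, reached only if
# every block of every tracked region is actually cached.
max_fraction_of_l2 = tracked_bytes / l2_size
```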
7
Which Regions are Cached?
[Figure: Region Tag, offset, counter]
- If we had as many counters as regions:
  - Block Allocation: counter[region]++
  - Block Eviction: counter[region]--
  - Region cached only if counter[region] non-zero
- Not Practical:
  - E.g., 8 kB Regions and 4 GB Memory → 512K counters
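A minimal sketch of the idealized scheme above, with one counter per region. It uses a Python `Counter` so that only touched regions take space; the function names are hypothetical, and the last line reproduces the 512K figure for 8 kB regions in a 4 GB address space.

```python
from collections import Counter

REGION_SIZE = 8 * 1024           # 8 kB regions
MEMORY_SIZE = 4 * 1024 ** 3      # 4 GB of memory

counters = Counter()             # region number -> cached block count


def allocate(addr):
    """A block at addr was brought into the cache."""
    counters[addr // REGION_SIZE] += 1


def evict(addr):
    """A block at addr was evicted from the cache."""
    counters[addr // REGION_SIZE] -= 1


def region_cached(addr):
    """True only if at least one block of addr's region is cached."""
    return counters[addr // REGION_SIZE] != 0


# One counter per region is what makes the exact scheme impractical.
num_counters = MEMORY_SIZE // REGION_SIZE
```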
8
Which Regions are Cached? Cached Region Hash (CRH)
[Figure: Region Tag, offset, hash(), counter]
- Imprecise:
  - Records a superset of currently cached Regions
  - False positives: lost opportunity, correctness preserved
  - Small: e.g., 512-4k entries for 2 MB or 4 MB cache
- First Miss:
  - Full location information for ALL BLOCKS
  - No need for temporal locality
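The hashed variant can be sketched as a small array of counters indexed by a hash of the region tag. This is an illustration of the idea rather than the authors' design: the class and method names are hypothetical, and the hash (region tag modulo table size) is an assumption, since the slide only shows `hash()`.

```python
class CachedRegionHash:
    """Small array of counters shared by all regions that hash together."""

    def __init__(self, num_entries=512, region_bits=13):
        self.counters = [0] * num_entries
        self.region_bits = region_bits   # 13 bits -> 8 kB regions

    def _index(self, addr):
        # Assumed hash: region tag modulo the number of entries.
        return (addr >> self.region_bits) % len(self.counters)

    def on_block_allocated(self, addr):
        self.counters[self._index(addr)] += 1

    def on_block_evicted(self, addr):
        self.counters[self._index(addr)] -= 1

    def may_be_cached(self, addr):
        # Zero means no block of any region mapping here is cached,
        # so an access to this region is certainly a first miss.
        return self.counters[self._index(addr)] != 0


crh = CachedRegionHash(num_entries=512)
a = 0x2000                    # an address in region 1
b = a + 512 * 8192            # region 513, same counter as region 1
crh.on_block_allocated(a)

first_miss = not crh.may_be_cached(0x4000)   # region 2: counter is zero
false_positive = crh.may_be_cached(b)        # region 513 was never cached
```

A zero counter is always trustworthy, which is why correctness is preserved; a non-zero counter may be caused by a colliding region, which only costs a missed opportunity to skip the tag lookup.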
9
CBV: Tracking Blocks within Regions
[Figure: address fields Region Tag, block, offset; CBV entry with Region Tag and Block info for Block #0 to Block #63; widths labeled 4 and 256; "Which data way is the block cached at?"]
- Parallel lookup of Region Tag and Block Info
- Experiments with 64- and 128-entry, 8-way set-associative CBV
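One CBV entry can be pictured as a region tag plus a small per-block field saying which data way holds each of the region's 64 blocks. The sketch below assumes the figure labels 4 and 256 mean 4 bits of block info per block and 256 bits per entry; the class name, method names, and the use of `None` for "not cached" are hypothetical, and the actual bit encoding is not given on the slide.

```python
BLOCKS_PER_REGION = 64
BITS_PER_BLOCK = 4   # assumed meaning of the "4" label in the figure


class CBVEntry:
    """Block locations for one tracked region."""

    def __init__(self, region_tag):
        self.region_tag = region_tag
        # way[i] is the L2 data way holding block i, or None if not cached.
        self.way = [None] * BLOCKS_PER_REGION

    def record(self, block_index, way):
        self.way[block_index] = way

    def invalidate(self, block_index):
        self.way[block_index] = None

    def lookup(self, region_tag, block_index):
        """Return the data way, or None on a tag mismatch or uncached block."""
        if region_tag != self.region_tag:
            return None
        return self.way[block_index]


# 64 blocks x 4 bits matches the "256" label in the figure.
entry_bits = BLOCKS_PER_REGION * BITS_PER_BLOCK

entry = CBVEntry(region_tag=9)
entry.record(block_index=6, way=3)
hit_way = entry.lookup(region_tag=9, block_index=6)
```

In hardware the region tag compare and the block info read would proceed in parallel, as the slide notes; the sequential `if` here is only a software stand-in.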
10
Conventional Solution
[Figure: I$, D$, CPU, L2 TAGS, L2 DATA]
- Tag Hierarchy
  - Requires Locality
    - Temporal
    - Spatial as long as L2 block size > L1 block size
  - Latency limited
  - Not very energy efficient
  - RegionTracker is Better
11
Tag Hierarchy
[Figure: address fields Set Tag, Block Tag, Offset; entry with Set Tag and Tag #0 to Tag #7; widths labeled 23 and 184; eight comparators]
- Each access reads/writes 23 bytes
- Sequential Comparison of Set Tag AND Block Tag
12
Complexity Tradeoffs
- Tag Hierarchy
  - Read/Write 184 bits
    - Complex Wiring to transfer 184 bits
  - Updated on every Tag Hierarchy miss
- RegionTracker
  - Read/Write 4 bits
    - Only 4 bits transferred from tag array
  - Updated on L2 misses only
  - Flexible implementation (vertical/horizontal partitioning)
  - No modification to conventional cache policies/structures
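The per-access transfer widths quoted for the two designs can be related with a short calculation. It uses only the figures from the slides (23 bytes per tag hierarchy access, 4 bits per RegionTracker access); the variable names are illustrative, and the ratio compares bits moved, not measured energy.

```python
BITS_PER_BYTE = 8

tag_hierarchy_bytes = 23     # bytes read/written per tag hierarchy access
region_tracker_bits = 4      # bits read/written per RegionTracker access

tag_hierarchy_bits = tag_hierarchy_bytes * BITS_PER_BYTE

# How many times more bits the tag hierarchy moves per access.
transfer_ratio = tag_hierarchy_bits / region_tracker_bits
```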
13
Methodology
- Processor
  - Deeply pipelined
  - 128-entry window
  - 8-way superscalar
  - 32 kB L1 instruction and data caches
- SPEC CPU 2000 / Reference Inputs
- Samples of 10 billion committed instructions, taken after the first 100 billion
- Used CACTI to estimate energy requirements
14
Energy Savings w/ 4 MB L2
[Figure: energy savings chart; legend "CRH/CBV:"; direction marker "Better"]
- Average reduction of 38%
- Up to 82% reduction (gzip)
- Robust performance, significant power savings for most programs
15
Tag Hierarchy Savings
[Figure: tag hierarchy savings chart; legend "Sets:"]
- Only 2 configurations actually save power!
- Similar fraction of requests served by RegionTracker
- RegionTracker much better!
16
RegionTracker Summary
- Coarse-Grain tracking to capture first misses
- Dual-Grain tracking to track blocks
- Service many L2 Requests
- Reduce L2 Lookup Energy
- Does not require temporal locality
- Can exploit spatial locality much better than a tag hierarchy
- Significantly reduces L2 Lookup Power with minimal additional complexity
17
RegionTracker: Using Dual-Grain Tracking for Energy Efficient Cache Lookup
Jason Zebchuk and Andreas Moshovos
{zebchuk, moshovos}@eecg.toronto.edu
University of Toronto, Department of Electrical and Computer Engineering