Published by Harriet Rogers. Modified over 8 years ago.
1
Estimating Influence of Data Layout Optimizations on SDRAM Energy Consumption
† H.S. Kim, † V. Narayanan, † M. Kandemir, ‡ E. Brockmeyer, ‡ F. Catthoor, † M.J. Irwin
† Dept. of Computer Science and Engineering, The Pennsylvania State University
‡ IMEC, Belgium
Aug. 2003
2
Estimating influence of data layout optimizations on SDRAM energy
- Applications demand much larger memory bandwidth (e.g., video applications)
- There has been much work on reducing off-chip memory access frequency by improving locality in local (intermediate) memory
- Locality in the SDRAM itself also makes a significant difference in energy (a page open operation is 6 times more expensive than a data read operation)
- Estimating the number of page open operations (page breaks) can serve as an energy estimate for various optimizations
- Data layout optimization: conventional layout vs. blocked layout
3
Preliminaries (SDRAMs)
- Banked architecture
[Figure: SDRAM block diagram. Four bank blocks, each with a memory array and a row decoder; sense amps, column decoder, and data buffer; control logic with a mode register; control command and address inputs.]
4
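Preliminaries (SDRAM operations)
- One operation: precharge bank 0 (tRP), activate bank 0 (tRCD), read command, CAS latency, then data D0 D1 D2 D3 on DQ
- Two consecutive operations to two different rows (pages x and y) of one bank: cycles are lost between the two data bursts
[Figure: timing diagram with command and DQ traces, marking tRP, tRCD, CAS latency, tRRD, and the lost cycles.]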
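To make the lost cycles concrete, here is a minimal sketch of a single-bank, open-page cycle count. The function name and the timing values (3 cycles each for tRP, tRCD, and CAS latency, burst length 4) are illustrative assumptions, not figures from the slides or from any datasheet. The model also ignores command pipelining and tRRD, so it only shows the relative penalty of changing rows.

```python
def access_cycles(rows, burst_len=4, t_rp=3, t_rcd=3, cas=3):
    """Count cycles for a sequence of burst reads to ONE bank.

    rows: the row (page) index targeted by each burst read, in issue order.
    Timing defaults are hypothetical placeholders, in clock cycles.
    """
    open_row = None
    cycles = 0
    for row in rows:
        if row != open_row:
            if open_row is not None:
                cycles += t_rp       # precharge the row that is currently open
            cycles += t_rcd          # activate the newly requested row
            open_row = row
        cycles += cas + burst_len    # CAS latency, then the data beats
    return cycles


same_row = access_cycles([0, 0])     # second read hits the open row
row_change = access_cycles([0, 1])   # second read forces precharge + activate
penalty = row_change - same_row      # equals t_rp + t_rcd in this model
```

Under these assumptions the row change costs exactly tRP + tRCD extra cycles, which is the gap the timing diagram labels as lost cycles.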
5
SDRAM energy consumption
- D: number of words; B: burst size; P_miss: miss rate
- e_act = x * e_d and e_stat_act = y * e_d, where e_act is the energy per activation, e_d the energy per data transfer of one word, and e_stat_act the static energy per activation
- Example: Micron's 8MB SDRAM, e_act = 13 nJ, e_stat_act = 7 nJ, e_d = 3.6 nJ, x + y ≈ 6
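The symbols above can be combined into a rough energy estimate. The closed form used below is an assumption built only from those symbols: every word transfer costs e_d, and a fraction P_miss of the D/B bursts additionally pays e_act + e_stat_act. The function name and the formula are hypothetical and should not be read as the authors' equation; only the nJ values come from the example on the slide.

```python
def sdram_energy_nj(d_words, burst, p_miss, e_d=3.6, e_act=13.0, e_stat_act=7.0):
    """Estimate SDRAM energy in nJ under an ASSUMED model.

    d_words: number of words transferred (D)
    burst:   burst size in words (B)
    p_miss:  fraction of bursts that open a new page (P_miss)
    Energy defaults are the example values quoted on the slide, in nJ.
    """
    transfer_energy = d_words * e_d                # every word pays e_d
    bursts = d_words / burst                       # number of burst accesses
    activation_energy = bursts * p_miss * (e_act + e_stat_act)
    return transfer_energy + activation_energy


# Ratio of one page open to one word transfer, using the slide's numbers.
x_plus_y = (13.0 + 7.0) / 3.6
```

With the quoted values the ratio comes to roughly 5.6, which is in line with the x + y ≈ 6 stated on the slide.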
6
Page break estimation of data layouts
- Page break estimation can be used to estimate the energy and performance of various optimization techniques
- Estimation should take little time
- In a blocked layout, different tile/block sizes and shapes result in different numbers of page breaks
[Figure: a block within an array, with tile size = page size, showing intra page breaks and inter page breaks.]
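A brute-force count illustrates why the layout matters. The sketch below is a simplification rather than the estimation model from the talk: it assumes a single bank, counts a page break whenever an access lands on a page other than the open one (the first access included), and uses 256 words per page, derived from the 1KB pages and 32-bit bus listed in the experimental setup. All function names are hypothetical.

```python
PAGE_WORDS = 256  # 1 KB page / 4-byte words


def row_major_offset(i, j, n_cols):
    """Word offset of element (i, j) in a conventional row-major layout."""
    return i * n_cols + j


def blocked_offset(i, j, n_cols, tile_h, tile_w):
    """Word offset of (i, j) when tiles are stored contiguously.

    Assumes n_cols is a multiple of tile_w; tiles are ordered row-major,
    and elements inside a tile are ordered row-major as well.
    """
    tiles_per_row = n_cols // tile_w
    tile_index = (i // tile_h) * tiles_per_row + (j // tile_w)
    return tile_index * tile_h * tile_w + (i % tile_h) * tile_w + (j % tile_w)


def count_page_breaks(offsets, page_words=PAGE_WORDS):
    """Count page opens for a stream of word offsets (single-bank model)."""
    breaks = 0
    open_page = None
    for off in offsets:
        page = off // page_words
        if page != open_page:
            breaks += 1
            open_page = page
    return breaks


def fetch_tile(i0, j0, h, w, offset_fn):
    """Yield word offsets of an h x w tile read row by row from (i0, j0)."""
    for i in range(i0, i0 + h):
        for j in range(j0, j0 + w):
            yield offset_fn(i, j)


n_cols = 176  # array width taken from the 176x144 case in the results
conventional = count_page_breaks(
    fetch_tile(0, 0, 16, 16, lambda i, j: row_major_offset(i, j, n_cols)))
blocked = count_page_breaks(
    fetch_tile(0, 0, 16, 16, lambda i, j: blocked_offset(i, j, n_cols, 16, 16)))
```

For this 16x16 fetch, the row-major layout opens 11 pages while the blocked layout, whose tile fills exactly one page, opens 1. Enumeration like this scales with the number of accesses, which is the cost a closed-form estimate avoids.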
7
Estimation Modeling
- Polyhedral modeling of page breaks, implemented using Presburger formulas
  - Valid iteration points
  - Lexicographical ordering
  - Data layouts in memory
  - Mapping memory locations to memory banks
  - Page break estimation model for blocked layout
- Implementation
  - Omega Calculator to simplify the models (existential operators are allowed, which is not possible in Polylib)
  - Polylib to count the numbers
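The step that maps memory locations to banks can be pictured with a small address decoder. The interleaving scheme below (consecutive pages rotate across banks) is an assumption chosen for illustration; the slides do not say which mapping the model uses. The sizes follow the device described in the experiments (8MB, 4 banks, 1KB pages), and the function name is hypothetical.

```python
def decode_address(byte_addr, page_bytes=1024, n_banks=4,
                   mem_bytes=8 * 1024 * 1024):
    """Split a byte address into (bank, row, column).

    ASSUMED mapping: consecutive pages are interleaved across banks,
    so page p lives in bank p % n_banks at row p // n_banks.
    """
    if not 0 <= byte_addr < mem_bytes:
        raise ValueError("address outside the modeled memory")
    page = byte_addr // page_bytes
    column = byte_addr % page_bytes   # byte position inside the page
    bank = page % n_banks
    row = page // n_banks
    return bank, row, column


first = decode_address(0)
next_page = decode_address(1024)      # one page later: next bank, same row
wrapped = decode_address(4096)        # four pages later: bank 0, next row
```

With this mapping two addresses conflict only when they share a bank but differ in row, which is the condition a page break model has to capture.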
8
Intra/Inter page break models for blocked data layout
- Intra page breaks
- Inter page breaks
9
Experiments
- E_ACT = (IDD0 - IDD3) * Trc * Vdd * Tcycle * #page_breaks
- E_STAT = IDD3 * Vdd * Tcycle * total_cycles
- Benchmarks
  - qsdpcm (quadtree-structured motion estimation)
  - phods (parallel hierarchical motion estimation)
  - an edge_detect code from the UTDSP benchmark suite
- Various fetch tile/block shapes (set_1, set_2, set_3)
- Architectural assumptions
  - A block of data is fetched from SDRAM into local data memory via Direct Memory Access (i.e., software-controlled intermediate memory)
  - SDRAM: Micron's 8MB, 4 banks, 32-bit bus, 1KB pages
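The two energy expressions translate directly into code. In the sketch below the function names are hypothetical, Trc is taken to be expressed in clock cycles (an assumption that makes the units work out with Tcycle in seconds), and the numeric inputs are placeholders rather than datasheet values.

```python
def activation_energy(idd0, idd3, t_rc_cycles, vdd, t_cycle, page_breaks):
    """E_ACT in joules: extra current drawn during each activate/precharge
    window (IDD0 - IDD3), integrated over Trc and multiplied by the number
    of page breaks. Currents in A, vdd in V, t_cycle in s."""
    return (idd0 - idd3) * t_rc_cycles * vdd * t_cycle * page_breaks


def static_energy(idd3, vdd, t_cycle, total_cycles):
    """E_STAT in joules: background current IDD3 over the whole run."""
    return idd3 * vdd * t_cycle * total_cycles


# Placeholder operating point (NOT taken from a datasheet).
e_act = activation_energy(idd0=0.10, idd3=0.04, t_rc_cycles=10,
                          vdd=3.3, t_cycle=1e-8, page_breaks=1000)
e_stat = static_energy(idd3=0.04, vdd=3.3, t_cycle=1e-8, total_cycles=100000)
```

Because E_ACT is linear in the page-break count, any layout change that halves the page breaks halves the activation energy in this model, while E_STAT depends only on run time.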
10
Experiments
- An SDRAM power (and cycle) simulator serves as the reference against which the estimates are compared
[Figure: tool flow with C code, ATOMIUM (memory instrumentation tool), memory reference log (address, size, time), SDRAM cycle simulator, number of page activations, Micron's SDRAM Power Calculator, and total activation energy.]
11
Results (qsdpcm, simulation)
- Conventional layout shows varying energy numbers depending on the array size (800x640 vs. 176x144)
- Blocked layout shows no variation with array size
12
Results (row-major vs. blocked, phods)
- Estimated numbers match the corresponding simulated numbers reasonably well for both row-major and blocked layouts
13
Results (blocked layout, estimation vs. simulation)
- Arrays with manifest indexes can be estimated without error (edge_detect)
- Arrays with dynamic elements (e.g., motion vectors) can be estimated reasonably well (phods, qsdpcm)
- Energy numbers vary depending on block/tile shapes (set_1 to set_3)
[Figure: result panels for qsdpcm, phods, and edge_detect.]
14
Conclusions and Future Work
- The estimation framework tracks page breaks well
- Blocked layout reduces the number of page breaks significantly
- Tile/block shapes should be chosen carefully
- Ongoing work
  - Refinement of estimation formulas for conventional/blocked layouts of higher-dimensional arrays
  - Automation
    - Automatic incorporation of the Omega library and Polylib
    - Automatic code transformation into a main-memory-efficient data layout for each array
  - Exploration techniques to find the optimal data layout