© 2004 Wayne Wolf

Memory system optimizations
- Strictly software: effectively using the cache and partitioned memory.
- Hardware + software: scratch-pad memories; custom memory hierarchies.
Taxonomy of memory optimizations (Wolf/Kandemir)
- Data vs. code.
- Array/buffer vs. non-array.
- Cache/scratch pad vs. main memory.
- Code size vs. data size.
- Program vs. process.
- Languages.
Software performance analysis
- Worst-case execution time (WCET) analysis (Li/Malik):
  - Find the longest path through the CDFG (see the sketch after this list).
  - Can use annotations of branch probabilities.
  - Can be mapped onto cache lines.
  - Difficult in practice: the optimized code must be analyzed.
- Trace-driven analysis:
  - Well understood.
  - Requires code and input vectors.
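To make the longest-path step concrete, here is a minimal C sketch of WCET estimation over a CDFG modeled as a DAG of basic blocks. The block costs, the edge matrix, and the assumption that the blocks are already numbered in topological order are illustrative assumptions, not details of the Li/Malik method.

```c
/* Minimal sketch of WCET-style longest-path analysis over a CDFG,
 * assuming the graph is a DAG with per-basic-block worst-case cycle
 * counts; node numbering and costs below are illustrative only. */
#include <stdio.h>

#define N 5                      /* number of basic blocks */

int cost[N] = {10, 4, 7, 3, 2};  /* worst-case cycles per block (assumed) */
int edge[N][N] = {               /* edge[u][v] = 1 if u -> v in the CDFG   */
    {0,1,1,0,0},
    {0,0,0,1,0},
    {0,0,0,1,0},
    {0,0,0,0,1},
    {0,0,0,0,0},
};

int main(void) {
    /* Blocks are assumed to be listed in topological order (0..N-1),
     * so a single forward pass computes the longest path to each block. */
    long wcet[N];
    for (int v = 0; v < N; v++) {
        wcet[v] = cost[v];
        for (int u = 0; u < v; u++)
            if (edge[u][v] && wcet[u] + cost[v] > wcet[v])
                wcet[v] = wcet[u] + cost[v];
    }
    printf("estimated WCET: %ld cycles\n", wcet[N-1]);  /* longest path to the exit block */
    return 0;
}
```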
Software energy/power analysis
- Analytical models of the cache (Su/Despain, Kamble/Ghose, etc.): decoding, memory core, I/O path, etc. (a first-order sketch follows this list).
- System-level models (Li/Henkel).
- Power simulators (Vijaykrishnan et al., Brooks et al.).
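As a rough illustration of the analytical style cited above, the C sketch below computes a first-order cache energy estimate from access and miss counts weighted by per-event costs for decoding, the memory core, and the off-chip I/O path. Every count and coefficient is a placeholder assumption, not a value taken from those papers.

```c
/* First-order analytical cache energy estimate: accesses and misses
 * weighted by per-event energies for decode, memory core, and I/O path.
 * All numbers here are illustrative placeholders. */
#include <stdio.h>

int main(void) {
    double n_access = 1.0e6, n_miss = 5.0e4;       /* assumed workload counts */
    double e_decode = 0.05e-9, e_core = 0.30e-9;   /* joules per access (assumed) */
    double e_io_miss = 2.0e-9;                     /* joules per off-chip transfer (assumed) */

    double e_cache = n_access * (e_decode + e_core);
    double e_io    = n_miss * e_io_miss;
    printf("cache energy: %.3e J, miss traffic energy: %.3e J\n", e_cache, e_io);
    return 0;
}
```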
Power-optimizing transformations
- Kandemir et al.:
  - Most energy is consumed by the memory system, not the CPU core.
  - Performance-oriented optimizations reduce memory system energy but increase datapath energy consumption (illustrated below).
  - Larger caches increase cache energy consumption but reduce overall memory system energy.
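As a hedged illustration of one such performance-oriented transformation, the C sketch below contrasts a column-major traversal with its loop-interchanged, row-major form: the interchange improves locality and so reduces cache misses and memory system energy at essentially the same datapath cost. The kernel and array size are illustrative, not an example from Kandemir et al.

```c
#include <stdio.h>

#define N 512
static double a[N][N];

/* Original loop nest: column-major traversal of a row-major array,
 * so successive accesses stride by N doubles and miss frequently. */
static double column_major_sum(void) {
    double sum = 0.0;
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            sum += a[i][j];
    return sum;
}

/* After loop interchange: unit-stride accesses, far fewer cache misses
 * and hence less memory system energy for the same arithmetic work. */
static double row_major_sum(void) {
    double sum = 0.0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            sum += a[i][j];
    return sum;
}

int main(void) {
    a[N/2][N/2] = 1.0;   /* trivial check that both versions agree */
    printf("%f %f\n", column_major_sum(), row_major_sum());
    return 0;
}
```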
Caching for real-time systems
- Kirk and Strosnider: SMART (Strategic Memory Allocation for Real-Time) cache design:
  - The cache is divided into segments.
  - Critical processes get their own cache segments.
  - A hardware flag selects the private cache segment or the pooled cache.
  - A heuristic algorithm groups tasks into cache segments.
- Wolfe: software cache partitioning:
  - Map routines at link time to addresses that remove conflicts for critical routines (sketched below).
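To illustrate the link-time placement idea, here is a small C sketch that checks whether two routines' address ranges collide in a direct-mapped instruction cache. The cache geometry, routine names, sizes, and addresses are assumptions for illustration, not parameters from Wolfe's scheme.

```c
/* Sketch of the idea behind software cache partitioning: choose link
 * addresses for two critical routines so their footprints fall in
 * disjoint sets of a direct-mapped instruction cache. */
#include <stdio.h>

#define LINE_SIZE 32u    /* bytes per cache line (assumed) */
#define NUM_SETS  256u   /* direct-mapped, 8 KB cache (assumed) */

static unsigned set_of(unsigned addr) { return (addr / LINE_SIZE) % NUM_SETS; }

/* Returns 1 if the two address ranges map to any common cache set. */
static int conflicts(unsigned base1, unsigned size1,
                     unsigned base2, unsigned size2) {
    for (unsigned a = base1; a < base1 + size1; a += LINE_SIZE)
        for (unsigned b = base2; b < base2 + size2; b += LINE_SIZE)
            if (set_of(a) == set_of(b))
                return 1;
    return 0;
}

int main(void) {
    /* Hypothetical critical routines: isr_a linked at 0x0000, isr_b placed
     * by the linker either at a conflicting or a conflict-free address. */
    printf("isr_b at 0x2000: %s\n",
           conflicts(0x0000, 0x400, 0x2000, 0x400) ? "conflicts" : "no conflict");
    printf("isr_b at 0x0400: %s\n",
           conflicts(0x0000, 0x400, 0x0400, 0x400) ? "conflicts" : "no conflict");
    return 0;
}
```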
Scratch pad memories
- Explicitly managed local memory.
- Panda et al. used a static management scheme:
  - Data structures are assigned to off-chip memory or the scratch pad at compile time.
  - Scalars go in the scratch pad, arrays in main memory (see the sketch below).
- May want to manage the scratch pad at run time.
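A minimal sketch of how such a static assignment might look in source, assuming a GCC-style toolchain in which a dedicated linker section (called ".scratchpad" here, an assumed name fixed in the linker script) maps to the on-chip scratch pad. This is only an illustration of the placement idea, not Panda et al.'s actual tool flow.

```c
#include <stdint.h>
#include <stdio.h>

/* Frequently used scalars pinned to the scratch pad via an assumed
 * linker section; the real address mapping lives in the linker script. */
__attribute__((section(".scratchpad"))) static int32_t coeff_gain = 3;
__attribute__((section(".scratchpad"))) static int32_t loop_count = 16384;

/* Large array left in ordinary off-chip main memory. */
static int32_t samples[16384];

static int32_t scale_samples(void) {
    int32_t acc = 0;
    for (int32_t i = 0; i < loop_count; i++)
        acc += samples[i] * coeff_gain;   /* scalar operands hit the scratch pad */
    return acc;
}

int main(void) {
    samples[0] = 7;
    printf("result: %d\n", (int)scale_samples());
    return 0;
}
```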
Reconfigurable caches
- Use the compiler to determine the best cache configuration for various program regions (sketched below).
- Must be able to reconfigure the cache quickly.
- Must be able to identify where program behavior changes.
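The C sketch below shows the shape of compiler-inserted reconfiguration calls at region boundaries. cache_set_config() and the configuration fields are hypothetical placeholders for whatever hardware interface (special register, coprocessor operation) a particular reconfigurable cache exposes; the region-to-configuration pairing is assumed to be chosen offline by the compiler.

```c
/* Hypothetical per-region cache reconfiguration hooks. */
typedef struct { unsigned ways; unsigned size_kb; } cache_cfg_t;

static void cache_set_config(cache_cfg_t cfg) {
    (void)cfg;  /* placeholder: a real port would write a configuration register here */
}

void process_frame(void) {
    /* Region 1: streaming filter, little reuse -> small, low-power configuration. */
    cache_set_config((cache_cfg_t){ .ways = 1, .size_kb = 4 });
    /* ... filter loop ... */

    /* Region 2: table-driven transform, heavy reuse -> larger configuration. */
    cache_set_config((cache_cfg_t){ .ways = 4, .size_kb = 32 });
    /* ... transform loop ... */
}
```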
Software methods for cache placement
- McFarling analyzed inter-function dependencies.
- Tomiyama and Yasuura used ILP.
- Li and Wolf used a process-level model.
- Kirovski et al. use profiling information plus a graph model.
- Dwyer and Fernando use bit vectors to construct bounds for instruction caches.
- Parameswaran and Henkel use heuristics.
Addressing optimizations
- Addressing can be expensive: 55% of DSP56000 instructions performed addressing operations in MediaBench.
- Use specialized addressing registers, pre/post-increment/decrement, etc.
- Place variables in the proper order in memory so that simpler operations can be used to calculate the next address from the previous one (see the sketch below).
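A small C sketch of the idea, assuming a FIR-style kernel with illustrative sizes and values: laying the operands out contiguously in access order lets a post-incremented pointer, which maps directly onto a DSP address-generation unit, compute each address as the previous one plus a constant instead of using explicit index arithmetic.

```c
#include <stdint.h>
#include <stdio.h>

#define TAPS 8

/* Coefficients and delay line laid out contiguously, in access order,
 * so each operand address is the previous one plus sizeof(int16_t). */
static int16_t coeff[TAPS] = {1, 2, 3, 4, 4, 3, 2, 1};
static int16_t delay[TAPS] = {10, 10, 10, 10, 10, 10, 10, 10};

static int32_t fir_step(void) {
    int32_t acc = 0;
    const int16_t *c = coeff;             /* stand-ins for address registers */
    const int16_t *d = delay;
    for (int i = 0; i < TAPS; i++)
        acc += (int32_t)(*c++) * (*d++);  /* post-increment addressing */
    return acc;
}

int main(void) {
    printf("fir output: %d\n", (int)fir_step());
    return 0;
}
```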