3rd Nov CSV881: Low Power Design
Power Estimation and Modeling
M. Balakrishnan

Outline
- Estimation problem
- Estimation at different levels
  - System level
  - Algorithmic level
  - Processor level
  - RT level
  - Gate level
  - Circuit level

Power Estimation Problem
- The objective in power estimation is the same as in other estimation problems: minimize the estimation time needed to reach a given accuracy, or maximize the accuracy achieved for a given effort
- High fidelity is another objective: the estimates should rank design alternatives in the same order as their actual power consumption

Abstraction Levels
- The higher the level of abstraction, the less time estimation is likely to take, but the lower the accuracy it is likely to produce
- Suitable models are needed to speed up estimation at higher levels
- The primitive operations or structures keep changing as we move up the abstraction levels

System Level
- Energy is estimated in terms of very coarse-grained events, e.g. specific tasks initiated by specific triggers or interrupts
- Estimates for components such as memory and buses are handled separately
- Supports system-level power management decisions

System Level Approaches
- Early approaches [3] were based on Monte Carlo simulation: random input vectors were generated and the power for each was obtained by simulation
- Approaches varied in efficiency and accuracy; some allowed the confidence level to be controlled
- The quality of the results depends on the statistical properties of the input vectors and their impact on power
- Various power modes of operation are difficult to handle
- A more recent approach for IP power estimation [4] works with hierarchical models

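The Monte Carlo idea above can be sketched in Python. This is a minimal illustration and not the procedure of [3]: `simulate_power` is a hypothetical stand-in for a real power simulator, and the stopping rule (a normal-approximation confidence interval on the sample mean) is an assumed, commonly used choice.

```python
import math
import random
import statistics

def simulate_power(vector):
    """Hypothetical stand-in for a power simulator: power grows with the number of 1 bits."""
    return 1.0 + 0.1 * sum(vector)

def monte_carlo_power(n_bits=16, rel_error=0.01, z=1.96,
                      min_samples=30, max_samples=100_000, seed=0):
    """Draw random input vectors until the confidence-interval half-width
    falls below rel_error times the running mean (or max_samples is hit)."""
    rng = random.Random(seed)
    samples = []
    while len(samples) < max_samples:
        vector = [rng.randint(0, 1) for _ in range(n_bits)]
        samples.append(simulate_power(vector))
        if len(samples) >= min_samples:
            mean = statistics.fmean(samples)
            half_width = z * statistics.stdev(samples) / math.sqrt(len(samples))
            if half_width <= rel_error * mean:
                break
    return statistics.fmean(samples), len(samples)

mean_power, n_used = monte_carlo_power()
print(f"estimated mean power {mean_power:.3f} after {n_used} vectors")
```

Tightening `rel_error` or raising `z` trades more simulated vectors for a narrower confidence interval, which is the efficiency versus accuracy tradeoff mentioned on the slide.
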
Park et al. [4]
- Estimation at different hierarchical levels
  - Direct tradeoff between accuracy and estimation time
- Creation of power models at different levels
- TLM (transaction-level modeling)

Park et al. [4] (contd.)
[Figure: hierarchy diagram for the H.264 Prediction IP. Node labels: Active, IDLE; Inter, Intra; Luma, Chroma; Luma (16x16), Luma (4x4), Chroma; Loc 0 ... Loc m; Mod 0 ... Mod n]

Algorithmic/Behavioral Level
- Models of RT-level components are expressed in terms of input characteristics that can be extracted from the behavior
  - e.g. adder energy per operation as a function of word length and Hamming distance of the inputs
  - e.g. memory energy per read or write access
- Energy is estimated as a weighted sum of such basic operations
- Behavioral transformations should be supported in terms of the energy change they cause
- Predicting the interconnect power consumed by data transfers is an issue

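As an illustration of a component model driven by input characteristics, the sketch below estimates the energy of one adder operation from the Hamming distance between consecutive input words. The linear form and the coefficients `e_fixed` and `e_per_toggle` are assumptions made for the example (arbitrary units); a real model would be characterized from lower-level simulation.

```python
def hamming_distance(a, b, word_length):
    """Number of bit positions in which two word_length-bit values differ."""
    mask = (1 << word_length) - 1
    return bin((a ^ b) & mask).count("1")

def adder_energy(prev_inputs, curr_inputs, word_length,
                 e_fixed=0.5, e_per_toggle=0.2):
    """Assumed linear model: a fixed cost per operation plus a cost
    for every input bit that toggles between consecutive operations."""
    toggles = sum(hamming_distance(p, c, word_length)
                  for p, c in zip(prev_inputs, curr_inputs))
    return e_fixed + e_per_toggle * toggles

# Operands change from (0b0000, 0b0000) to (0b1111, 0b0001): 4 + 1 = 5 toggled bits.
example_energy = adder_energy((0b0000, 0b0000), (0b1111, 0b0001), word_length=4)
print(f"adder energy for this operation: {example_energy:.2f}")
```
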
Algorithmic Energy Components
Total energy consumed
  = energy consumed in computation
  + energy consumed in storage access
  + energy consumed in data transfer
  + energy consumed in control (a function of allocation as well as binding)

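The four-term sum above can be evaluated as a weighted sum of event counts. In the sketch below, the event names and the per-event energies in `ENERGY_PER_EVENT` are hypothetical placeholders in arbitrary units, not values from the lecture or from any library.

```python
# Hypothetical per-event energies, grouped by the four components of the sum.
ENERGY_PER_EVENT = {
    "computation": {"add": 1.0, "mul": 4.0},
    "storage": {"mem_read": 2.0, "mem_write": 2.5},
    "transfer": {"bus_transfer": 0.8},
    "control": {"state_transition": 0.3},
}

def total_energy(event_counts):
    """Return (total, per-component breakdown) for a dict of event counts."""
    breakdown = {
        component: sum(cost * event_counts.get(event, 0)
                       for event, cost in costs.items())
        for component, costs in ENERGY_PER_EVENT.items()
    }
    return sum(breakdown.values()), breakdown

# Event counts as they might be extracted from a behavioral description.
counts = {"add": 10, "mul": 2, "mem_read": 5, "mem_write": 1,
          "bus_transfer": 10, "state_transition": 20}
total, breakdown = total_energy(counts)
for component, energy in breakdown.items():
    print(f"{component:12s} {energy:6.1f}")
print(f"{'total':12s} {total:6.1f}")
```

Keeping the breakdown per component makes it easy to see which term a behavioral transformation changes.
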
Adder Module Characteristics
[Figure: adder module energy versus Hamming distance. Axis labels: Hamming distance, Energy]

Processor Level
- First proposed by Tiwari et al. [5,6] for software power estimation
- The methodology is based on measuring the power consumption of each instruction
- Overall energy consumption is computed as a weighted sum of the number of instructions of each type, where the weighting factor is the measured consumption of an instruction of that type
- Being based on measurements, this approach is valid only for a processor that has already been fabricated
- Sama et al. [7] modified it to create an instruction-level power model using a gate-level simulator

Vivek Tiwari's Model [5]
Energy cost of an instruction
  = base cost (measured as the current drawn by a repeated sequence of identical instructions)
  + circuit state overhead cost (measured as the current drawn by pairs of instructions)
  + resource constraint cost (to account for stall cycles due to resource contention)
  + cache energy cost (to account for cache misses)
- Tiwari observed that, at least for CISC processors, variations in operands and data values affect less than 3% of the total energy consumption

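A minimal sketch of how such an instruction-level model could be applied to an instruction trace is given below. All numbers are hypothetical (arbitrary energy units), the three-instruction set is invented for the example, and the tables would in practice come from measurement as in [5] or from simulation as in [7].

```python
# Hypothetical model tables, in arbitrary energy units.
BASE_COST = {"mov": 1.0, "add": 1.1, "mul": 3.0}
PAIR_OVERHEAD = {
    frozenset(("mov", "add")): 0.1,
    frozenset(("add", "mul")): 0.3,
    frozenset(("mov", "mul")): 0.25,
}
STALL_COST = 0.5   # per stall cycle
MISS_COST = 5.0    # per cache miss

def program_energy(trace, stall_cycles=0, cache_misses=0):
    """Sum base costs, inter-instruction overheads, stall and cache-miss penalties."""
    energy = sum(BASE_COST[instr] for instr in trace)
    # Circuit state overhead is charged when consecutive instructions differ.
    for prev, curr in zip(trace, trace[1:]):
        if prev != curr:
            energy += PAIR_OVERHEAD[frozenset((prev, curr))]
    energy += STALL_COST * stall_cycles + MISS_COST * cache_misses
    return energy

trace = ["mov", "add", "add", "mul"]
estimate = program_energy(trace, stall_cycles=2, cache_misses=1)
print(f"estimated program energy: {estimate:.2f}")
```

Using an unordered pair as the overhead key assumes the overhead is symmetric in the two instructions; an ordered-pair table would be needed if it is not.
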
Lee et al. [6]
- Approach similar to the one proposed in [5]
- The processor used is a Fujitsu DSP instead of the Intel 486DX
- The base cost of the instructions varies significantly, unlike on the CISC processor
- Instructions are classified into 6 classes to reduce the number of measurements (individual as well as pairwise)
- Suggested power minimization strategies:
  - “Intelligent” register bank assignment
  - Instruction packing to reduce cycles
  - Instruction scheduling to reduce circuit state switching energy
  - Operand swapping to reduce computation in Booth's algorithm

Sama et al. [7]
- Instruction set model similar to the models proposed by Tiwari [5]
- Energy numbers are obtained from a power simulator rather than from actual measurement, so models can be built at design time and can be part of micro-architecture and/or instruction set architecture exploration
- Considerable speedup over gate-level or circuit-level simulation of the processor model

Issues in Instruction Set Power Models
- Instructions are not executed one at a time
  - All current processors are deeply pipelined; since many instructions are active concurrently in the pipeline, their interactions should also be accounted for
  - Tiwari [5] also measured interactions between consecutive instructions from different classes
- The effect of varying data (as well as addresses) is ignored in the model
  - It can, however, be accounted for by an additive factor

RT Level Estimation
- Models of RTL components such as adders, comparators, decoders and multiplexers
- Models are based on effective capacitance
- Switching activity is estimated from the RTL code

Gate Level Estimation
- Effective capacitance models for the gates are part of the library
- Switching activity is estimated for a given application (specified as a set of Boolean equations)
- Switching activity can be based on probabilistic input vector characteristics or on actual input vector characteristics

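The combination of effective capacitance and switching activity can be sketched as follows, using actual input vectors. The expression used, 0.5 x C x Vdd^2 x f x (toggles per cycle), is the standard dynamic power formula and is an assumption here, since the slides do not state one; the capacitances and waveforms are hypothetical.

```python
def switching_activity(waveform):
    """Average number of toggles per cycle for one net, given its per-cycle logic values."""
    toggles = sum(1 for prev, curr in zip(waveform, waveform[1:]) if prev != curr)
    return toggles / (len(waveform) - 1)

def dynamic_power(nets, vdd, freq):
    """Sum 0.5 * C * Vdd^2 * f * activity over all nets.

    nets: iterable of (effective_capacitance_in_farads, waveform) pairs.
    """
    return sum(0.5 * cap * vdd ** 2 * freq * switching_activity(wave)
               for cap, wave in nets)

# Two hypothetical nets observed over five cycles.
nets = [
    (10e-15, [0, 1, 0, 1, 0]),   # toggles every cycle: activity 1.0
    (20e-15, [0, 0, 0, 1, 1]),   # toggles once in four cycles: activity 0.25
]
power_watts = dynamic_power(nets, vdd=1.0, freq=1e9)
print(f"estimated dynamic power: {power_watts * 1e6:.2f} uW")
```

A probabilistic estimator would replace `switching_activity` with transition probabilities propagated through the Boolean equations instead of counting toggles in simulated waveforms.
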
Circuit Level
- Comparison is always with SPICE
  - How much faster, and how close in terms of prediction?
- Compact set of vectors
  - How representative are these vectors of actual use?
- PowerMill [3], a popular tool, achieves a speedup of 2 to 3 orders of magnitude while staying within 10% of SPICE. It is based on event-driven timing simulation and uses simplified table-driven device models

References
[1] M. Pedram, “Power Minimization in IC Design”, ACM TODAES, Vol. 1, No. 1, Jan. 1996, pp.
[2] Macii et al., “High-level Power Modeling, Estimation and Optimization”, DAC 1997.
[3] Burch et al., “A Monte-Carlo Approach for Power Estimation”, IEEE TVLSI, Vol. 1, No. 1, Mar., pp.
[4] Park et al., “System Level Power Estimation Methodology with H.264 Decoder Prediction IP Case Study”, pp.

References (contd.)
[5] Tiwari et al., “Power Analysis of Embedded Software: A First Step towards Software Power Minimization”, IEEE TVLSI, Vol. 2, No. 4, Dec. 1994, pp.
[6] Lee et al., “Power Analysis and Minimization Techniques for Embedded DSP Software”, IEEE TVLSI, Vol. 5, No. 1, Mar. 1997, pp.
[7] Sama et al., “Speeding up Power Estimation of Embedded Software”, ISLPED 2000, pp.