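L16: Logic Level Design (2). Prof. Jun-Dong Cho, Sungkyunkwan University
Low Power Logic Gate Resynthesis on Mapped Circuit. Hyun-Sang Kim, Jun-Dong Cho. School of Electrical and Computer Engineering, Sungkyunkwan University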
Transition Probability
Transition probability: the probability of a transition at the output of a gate, given a change at the inputs
Use signal probabilities
Example: F = X'Y + XY'
  – Signal probability of F: P_f = P_x(1 - P_y) + (1 - P_x)P_y
  – Transition probability of F = 2 P_f (1 - P_f)
  – Assumes the inputs are independent
Use BDDs to compute these
Reference: Najm'91
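The two formulas on this slide can be checked numerically. The sketch below is illustrative only: the function names `xor_signal_prob` and `transition_prob` are hypothetical, and it relies on the slide's own assumption that the inputs are independent.

```python
def xor_signal_prob(p_x: float, p_y: float) -> float:
    """Signal probability of F = X'Y + XY' for independent inputs X and Y."""
    return p_x * (1.0 - p_y) + (1.0 - p_x) * p_y


def transition_prob(p_f: float) -> float:
    """Transition probability 2*P_f*(1 - P_f): F is 0 then 1, or 1 then 0."""
    return 2.0 * p_f * (1.0 - p_f)


if __name__ == "__main__":
    p_f = xor_signal_prob(0.3, 0.4)   # 0.3*0.6 + 0.7*0.4
    print(p_f, transition_prob(p_f))
```

The transition probability peaks at P_f = 0.5 and falls toward zero as P_f approaches 0 or 1, which is why skewing signal probabilities away from 0.5 lowers switching activity.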
Technology Mapping
Implementing a Boolean network in terms of gates from a given library
Popular technique: tree-based mapping
  – Library gates and circuits are decomposed into canonical patterns
  – Pattern matching and dynamic programming find the best cover
  – NP-complete for general DAG circuits
Ref: Keutzer'87, Rudell'89
Idea: high transition probability points are hidden within gates
Low Power Cell Mapping
Example: a high switching activity node mapped internally in a complex gate
Signal Probability vs. Power
Spatial Correlation
Logic Synthesis for Low Power
Precomputation logic
  – selectively precompute the output logic values to reduce switching activity
  – uses a predictor function
Retiming
  – repositions the flip-flops in a pipelined circuit
  – candidates for adding a flip-flop:
      circuit nodes with high hazard activity
      circuit nodes with high load capacitance
[Figure: precomputation circuit (block A, registers R1, R2, R3, predictors g1 and g2, NOR gate) and retiming example (gate g, load C_L, register R, output y)]
Logic Synthesis for Low Power
State assignment
  – minimizes the switching activity on state transition arcs with high transition probability
  – can also consider the complexity of the combinational logic
  – experimental result: 10% to 17% power reduction
Path balancing
  – reduces hazards/glitches
  – key issue in delay insertion: use the minimum number of delay elements to achieve the maximum reduction
Multi-level network optimization
  – uses network don't-care terms
  – cost function: minimize the sum of the number of product terms and the weighted switching activity
  – considers how changes in the global function of an internal node affect the switching activity in its transitive fanout
  – experimental result: about 10% power reduction
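One common way to score a state assignment, shown here as a sketch rather than as the method used in the cited experiments, is to weight the Hamming distance between the codes of two states by the probability of the transition between them. The state names, transition probabilities, and the function `switching_cost` are all hypothetical.

```python
def hamming(a: int, b: int) -> int:
    """Number of state-register bits that toggle between codes a and b."""
    return bin(a ^ b).count("1")


def switching_cost(encoding: dict, transitions: dict) -> float:
    """Sum over arcs of (transition probability) * (bits toggled)."""
    return sum(p * hamming(encoding[s], encoding[t])
               for (s, t), p in transitions.items())


# Hypothetical 4-state ring FSM with made-up arc probabilities.
TRANSITIONS = {("A", "B"): 0.4, ("B", "C"): 0.3, ("C", "D"): 0.2, ("D", "A"): 0.1}

BINARY = {"A": 0b00, "B": 0b01, "C": 0b10, "D": 0b11}
GRAY = {"A": 0b00, "B": 0b01, "C": 0b11, "D": 0b10}

if __name__ == "__main__":
    print("binary:", switching_cost(BINARY, TRANSITIONS))
    print("gray:  ", switching_cost(GRAY, TRANSITIONS))
```

For this ring, the Gray encoding toggles one bit on every arc, so its cost is lower than the binary encoding, where the B to C and D to A arcs toggle two bits each. The cost ignores the combinational logic complexity that the slide says can also be considered.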
Logic Synthesis for Low Power
Technology decomposition
  – minimizes the sum of the switching activities at the internal nodes
  – one method: inject high switching activity inputs into the tree as late as possible
Technology mapping
  – general principle: hide nodes with high switching activity inside the gates
[Figure: two decompositions of a four-input gate with inputs a, b, c, d; P(a) = 0.3, P(b) = 0.4, P(c) = 0.7, P(d) = 0.5]
  – chain decomposition: E(sw) = p(ab) + p(abc) + p(abcd)
  – balanced decomposition: E(sw) = p(ab) + p(cd) + p(abcd)
[Figure: mapped gates with internal nodes labeled H, H, L; H: high transition node, L: low transition node]
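The two E(sw) expressions on this slide can be evaluated for the given input probabilities. This is a sketch under stated assumptions: the gates are taken to be ANDs, the inputs are treated as independent, and p(.) is read as the signal probability of the product, as the slide's formulas suggest. The helper names are hypothetical.

```python
from math import prod

# Input signal probabilities from the slide.
P = {"a": 0.3, "b": 0.4, "c": 0.7, "d": 0.5}


def p_and(*names: str) -> float:
    """Signal probability of the AND of independent inputs."""
    return prod(P[n] for n in names)


def chain_cost() -> float:
    """E(sw) for ((a.b).c).d : p(ab) + p(abc) + p(abcd)."""
    return p_and("a", "b") + p_and("a", "b", "c") + p_and("a", "b", "c", "d")


def balanced_cost() -> float:
    """E(sw) for (a.b).(c.d) : p(ab) + p(cd) + p(abcd)."""
    return p_and("a", "b") + p_and("c", "d") + p_and("a", "b", "c", "d")


if __name__ == "__main__":
    print("chain:   ", chain_cost())
    print("balanced:", balanced_cost())
```

Under these assumptions the chain evaluates to about 0.246 and the balanced tree to about 0.512: feeding the high-probability inputs c and d in late keeps every internal node at a low probability.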
Low Power Logic Synthesis
Technology Mapping
Tree Decomposition
Huffman Algorithm
Depth-Constrained Decomposition Algorithm
problem: minimize \sum_{i=1}^{m} p_t(x_i)
input: input signal probabilities (p_1, p_2, ..., p_n), height (h), number of leaf nodes (n), fanin limit per gate (k)
output: k-ary tree topology
Begin
  sort (signal probabilities p_1, p_2, ..., p_n);
  while (n != 0)
    if (h > log_k n)
      assign k nodes to level L (= h+1);
      h = h-1, n = n-(k-1);   /* upward */
    else if (h < log_k n)
      assign k nodes to the previous level L (= h+2);
      h = h, n = n-(k-1);     /* downward */
    else   /* h = log_k n */
      assign the remaining nodes to level L (= h+1);   /* complete: all remaining nodes go to level L and form a complete k-ary tree */
  for (bottom level L; L > 1; L--)
    min_edge_weight_matching (nodes in level L);
End
Example
After Decomposition
After Tech. Mapping
Precomputation
Power saving
  – Reduces power dissipation of the combinational logic
  – Reduces internal power of the precomputed registers
Opportunity
  – Can be significant, depending on the percentage of time precomputation is successful
Cost
  – Increases area
  – Impacts circuit timing
  – Increases design complexity (number of bits to precompute)
  – Testability: may generate redundant logic
Precomputation
Before: the entire function is computed.
After: a smaller function is defined, and the enable is precomputed.
Before Precomputation - Diagram
After Precomputation - Diagram
Before Precomputation - Report
After Precomputation - Report
Precomputation Example - Before Code
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity before_precomputation is
  port (a, b  : in  std_logic_vector(7 downto 0);
        CLK   : in  std_logic;
        D_out : out std_logic);
end before_precomputation;

architecture Behav of before_precomputation is
  signal a_in, b_in : std_logic_vector(7 downto 0);
  signal comp       : std_logic;
begin
  -- Input registers: all 8 bits of both operands are loaded every cycle.
  in_reg : process (CLK)
  begin
    if rising_edge(CLK) then
      a_in <= a;
      b_in <= b;
    end if;
  end process;

  -- Combinational comparator on the registered operands (unsigned compare).
  comp <= '1' when unsigned(a_in) > unsigned(b_in) else '0';

  -- Output register.
  out_reg : process (CLK)
  begin
    if rising_edge(CLK) then
      D_out <= comp;
    end if;
  end process;
end Behav;
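Precomputation Example - After Code
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity after_precomputation is
  port (a, b  : in  std_logic_vector(7 downto 0);
        CLK   : in  std_logic;
        D_out : out std_logic);
end after_precomputation;

architecture Behav of after_precomputation is
  signal a_in, b_in   : std_logic_vector(7 downto 0);
  signal pcom, pcom_D : std_logic;
  signal CLK_en, comp : std_logic;
begin
  -- The MSBs are registered on every clock edge.
  msb_reg : process (CLK)
  begin
    if rising_edge(CLK) then
      a_in(7) <= a(7);
      b_in(7) <= b(7);
    end if;
  end process;

  -- Predictor: the lower bits affect a > b only when the MSBs are equal.
  pcom <= a(7) xnor b(7);

  -- Latch the enable while CLK is low so the gated clock cannot glitch.
  en_latch : process (CLK, pcom)
  begin
    if CLK = '0' then
      pcom_D <= pcom;
    end if;
  end process;

  CLK_en <= pcom_D and CLK;

  -- The lower 7 bits are loaded only when the gated clock rises.
  lsb_reg : process (CLK_en)
  begin
    if rising_edge(CLK_en) then
      a_in(6 downto 0) <= a(6 downto 0);
      b_in(6 downto 0) <= b(6 downto 0);
    end if;
  end process;

  -- Combinational comparator on the registered operands (unsigned compare).
  comp <= '1' when unsigned(a_in) > unsigned(b_in) else '0';

  -- Output register.
  out_reg : process (CLK)
  begin
    if rising_edge(CLK) then
      D_out <= comp;
    end if;
  end process;
end Behav;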
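The idea behind the comparator example can be sanity-checked by brute force. The sketch below is not part of the original slides: it assumes unsigned 8-bit operands that are uniformly distributed and independent, and the function names are hypothetical. It checks that the result of a > b is fully determined by the MSBs whenever they differ, and estimates how often the lower seven register bits could then stay unloaded.

```python
def msb(x: int) -> int:
    """Bit 7 of an 8-bit unsigned value."""
    return (x >> 7) & 1


def predictor_is_safe() -> bool:
    """When the MSBs differ, a > b must equal (a7 = 1), whatever the lower bits are."""
    for a in range(256):
        for b in range(256):
            if msb(a) != msb(b) and (a > b) != (msb(a) == 1):
                return False
    return True


def disable_fraction() -> float:
    """Fraction of input pairs for which the lower-bit registers need not load."""
    differ = sum(1 for a in range(256) for b in range(256) if msb(a) != msb(b))
    return differ / (256 * 256)


if __name__ == "__main__":
    print(predictor_is_safe(), disable_fraction())
```

With uniform inputs the MSBs differ for half of all operand pairs, so the lower-bit registers can be held for about half of the cycles. Real input statistics may be far from uniform, which is why the achievable saving depends on how often precomputation succeeds.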