Power Minimization Using Voltage Reduction and Parallel Processing
By Sudheer Vemula
ELEC6970-001, Nov. 29, 2005

Outline:-
- Goal of the project
- Introduction to parallel processing
- Delay of the critical path in the given circuit, a 32x32 array multiplier
- Methods to introduce parallelism in the given circuit
- Reduction in critical-path delay due to the introduced parallelism
- Calculations estimating area and delay
- Conclusion

Goal of the Project
- To reduce the power consumption of the circuit by reducing the power supply voltage. Consequence: the delay of the critical path increases.
- To compensate for the increase in delay by introducing parallelism.
- To calculate the reduction in power.

Parallel Processing
- Definition: parallel processing is the concurrent execution of several programs, or of several blocks of one program [1].
- Types of parallelism: data parallelism and control parallelism.
- Data parallelism is the parallel execution of a single expression on data distributed over multiple processors [2].
- Control parallelism is parallelism achieved by the simultaneous execution of multiple threads [3].

Voltage Scaling and Delay:-
- Since a transistor is a voltage-controlled current device, its resistance depends on the voltage and current.
- Delay = 0.5 (0.5 R_p C + 0.5 R_n C)
- [...] = 2 for low V_dd

Critical Path:-
- Delay of the critical path for a multiplier of order n x m = (2m + n - 2), in units of one full adder (FA) delay
- Delay of the critical path for a multiplier of order 32 x 32 = 2x32 + 32 - 2 = 94
- Approximate area of the 32 x 32 multiplier = 1024 FAs + 128 FAs (due to AND gates) = 1152 FAs
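The delay and area figures above can be reproduced with a few lines of Python. This is only a sketch: the function names are hypothetical, and the ratio of eight AND gates per FA-equivalent is an assumption inferred from the slide's 1024 + 128 figure, not something the slide states.

```python
def critical_path_delay(n: int, m: int) -> int:
    """Critical-path delay of an n x m array multiplier, in FA delay units."""
    return 2 * m + n - 2


def approx_area_fa(n: int, m: int, and_gates_per_fa: int = 8) -> float:
    """Approximate area in FA-equivalents: n*m full adders plus n*m AND gates.

    Assumption: 8 AND gates count as one FA (1024 AND gates -> 128 FAs).
    """
    cells = n * m
    return cells + cells / and_gates_per_fa


print(critical_path_delay(32, 32))  # 94
print(approx_area_fa(32, 32))       # 1152.0
```

The same formula gives 62 for a 32x16 multiplier and 78 for a 16x32 multiplier, the two base delays used for the partitioned circuits.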

Horizontal Partition:-
- Ex.: A = 98 and B = 76. AB = (90 x 76) + (8 x 76) = (9 x 76) x 10 + (8 x 76)
- Critical path delay for a multiplier of order 32x16 = (2x16 + 32 - 2) + delay of the 32-bit full adder (FA) + delay of the 16-bit half adder (HA) = 62 + delay of the 32-bit FA + delay of the 16-bit HA

Vertical Partition
- Ex.: A = 98 and B = 76. AB = (98 x 70) + (98 x 6) = (98 x 7) x 10 + (98 x 6)
- Critical path delay for a multiplier of order 16x32 = (2x32 + 16 - 2) + delay of the 32-bit FA + delay of the 16-bit HA = 78 + delay of the 32-bit FA + delay of the 16-bit HA
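The decimal examples for the two partitions carry over to binary by replacing the factor of 10 with a 16-bit shift. The sketch below checks both identities for 32-bit operands; the function names and the use of Python integers in place of hardware multipliers are illustrative assumptions.

```python
HALF = 16                # width of each operand half, in bits
MASK = (1 << HALF) - 1   # selects the low 16 bits


def multiply_split_a(a: int, b: int) -> int:
    """Horizontal partition: split A, so A*B = (A_hi * B) * 2**16 + A_lo * B."""
    a_hi, a_lo = a >> HALF, a & MASK
    return ((a_hi * b) << HALF) + a_lo * b


def multiply_split_b(a: int, b: int) -> int:
    """Vertical partition: split B, so A*B = (A * B_hi) * 2**16 + A * B_lo."""
    b_hi, b_lo = b >> HALF, b & MASK
    return ((a * b_hi) << HALF) + a * b_lo


a, b = 0xDEADBEEF, 0x12345678
print(multiply_split_a(a, b) == a * b)  # True
print(multiply_split_b(a, b) == a * b)  # True
```

In each case the two half-size products are independent of each other, which is what allows them to be computed in parallel.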

Delay of the 32 bit FA:-
- The computation of the products and of the sum is done simultaneously.
- The FA introduces a delay of only 1 unit.
- The remaining delay is due to the HA.
- The delay due to the 16-bit HA is approximately equal to 8 FA units.

Let A = 1010 and B = 1011, with B split into 10 and 11:

      1010        1010
    x   10      x   11
     10100       11110

Product1:         1 1 1 1 0
Product2:     1 0 1 0 0
Sum:        0 1 1 0 1 1 1 0
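One way to read the example above: the low bits of Product1 pass straight to the output, the overlapping bits go through the FA stage, and the top bits of Product2 only need the carry added, which is the HA stage. The sketch below models that structure with integers; `recombine` and its parameters are hypothetical names, and the bit-level split is an interpretation of the slide rather than the author's circuit.

```python
def recombine(p_lo: int, p_hi: int, n: int, k: int) -> int:
    """Merge two (n+k)-bit partial products, where p_hi carries weight 2**k.

    n is the width of A; k is the width of each half of B.
    """
    n_mask = (1 << n) - 1
    low = p_lo & ((1 << k) - 1)            # k low bits: no addition needed
    mid_sum = (p_lo >> k) + (p_hi & n_mask)  # n-bit full-adder stage
    mid, carry = mid_sum & n_mask, mid_sum >> n
    top = (p_hi >> n) + carry              # k-bit half-adder (increment) stage
    return (top << (n + k)) | (mid << k) | low


a, b = 0b1010, 0b1011
product1 = a * (b & 0b11)   # 1010 x 11 = 11110
product2 = a * (b >> 2)     # 1010 x 10 = 10100
print(format(recombine(product1, product2, n=4, k=2), "08b"))  # 01101110
```

With n = 32 and k = 16 the same split gives a 32-bit FA stage and a 16-bit HA stage, matching the adder sizes named for the partitioned multipliers.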

Eliminating the Delay due to Half Adder:-
- A 16-bit multiplexer is introduced to eliminate the delay due to the 16-bit half adder.
- The only additional delay is that of the multiplexer.
- Delay of this circuit = 78 + 1 + 0.5 (approximate delay due to the mux) = 79.5
- Additional number of gates = 32 FAs + 16 HAs + multiplexers ~ 32 + 8 + 5 = 45 FAs
- The same procedure can be implemented in the circuit with horizontal partitioning.
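A plausible reading of the multiplexer trick is a carry-select arrangement: the HA chain computes "top bits + 1" ahead of time, in parallel with the FA stage, and the carry out of the FA stage merely selects between the two candidates. The sketch below illustrates that idea under this assumption; the function names are hypothetical and the slide does not spell out the circuit.

```python
def top_bits_ripple(top: int, carry: int, k: int) -> int:
    """Reference behaviour: add the carry after it arrives (HA chain on the critical path)."""
    return (top + carry) & ((1 << k) - 1)


def top_bits_select(top: int, carry: int, k: int) -> int:
    """Carry-select behaviour: both candidates exist before the carry arrives."""
    mask = (1 << k) - 1
    candidate_if_0 = top               # used when carry == 0
    candidate_if_1 = (top + 1) & mask  # precomputed by the HA chain
    return candidate_if_1 if carry else candidate_if_0  # the multiplexer


# Delay terms quoted on the slide: 16x32 multiplier, one FA unit, mux.
delay_with_mux = 78 + 1 + 0.5
print(delay_with_mux)  # 79.5
```

Both functions return the same value for every input; only the point at which the carry is needed differs, which is why the HA delay drops out of the critical path.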

Ex.: A = 98 and B = 76. AB = (90 x 76) + (8 x 76) = (9 x 76) x 10 + (8 x 76) = (9 x 7) x 100 + (9 x 6) x 10 + (8 x 7) x 10 + (8 x 6)
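Splitting both operands gives four half-size products. The sketch below writes the decomposition for an arbitrary base so that the decimal example (base 10) and a 32x32 multiplier built from 16x16 blocks (base 2**16) use the same code; the function name is hypothetical.

```python
def multiply_four_way(a: int, b: int, base: int) -> int:
    """A*B from four half-size products, with both operands split at `base`."""
    a_hi, a_lo = divmod(a, base)
    b_hi, b_lo = divmod(b, base)
    return (
        a_hi * b_hi * base * base   # (9 x 7) x 100 in the decimal example
        + a_hi * b_lo * base        # (9 x 6) x 10
        + a_lo * b_hi * base        # (8 x 7) x 10
        + a_lo * b_lo               # (8 x 6)
    )


print(multiply_four_way(98, 76, 10))  # 7448
print(multiply_four_way(0xDEADBEEF, 0x12345678, 1 << 16) == 0xDEADBEEF * 0x12345678)  # True
```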

Delay and Area Calculations:-
- Delay of the circuit = (2x16 + 16 - 2) + 1.5 + (delay due to the 32-bit FA) + 1.5
- The delay due to the 32-bit FA is 16 units, because the 16 LSBs of the FA are computed simultaneously with the previous stage, whereas the 16 MSBs are computed without any overlap.
- Therefore, delay = 49 + 16 = 65
- Area overhead = 2 x 16-bit FAs + 32-bit FA + 3 x 16-bit HAs + 3 x 16-bit multiplexers ~ 64 + 24 + 3 x 8 = 112 FAs
- Percentage reduction in delay = (94 - 65) x 100 / 94 = 30.9%
- Percentage increase in area = (112/1152) x 100 = 9.7%
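The arithmetic on this slide can be checked term by term. The variable names below are hypothetical labels for the slide's terms, and the per-block costs (8 FA-equivalents per 16-bit HA and per 16-bit multiplexer) are the values the slide itself uses.

```python
BASE_DELAY = 94    # 32x32 array multiplier, FA delay units
BASE_AREA = 1152   # 32x32 array multiplier, FA-equivalents

sub_multiplier = 2 * 16 + 16 - 2   # 16x16 multiplier: 46
stage_terms = 1.5 + 1.5            # the two 1.5-unit terms from the slide
final_adder = 16                   # non-overlapped half of the 32-bit FA
delay = sub_multiplier + stage_terms + final_adder

area_overhead = 2 * 16 + 32 + 3 * 8 + 3 * 8   # FAs + 32-bit FA + HAs + muxes

print(delay)                                             # 65.0
print(area_overhead)                                     # 112
print(round((BASE_DELAY - delay) * 100 / BASE_DELAY, 2))  # 30.85
print(round(area_overhead * 100 / BASE_AREA, 2))          # 9.72
```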

Circuit with improved Delay:-

Delay and Area Calculations:-
- Delay of the circuit = (2x16 + 16 - 2) + 1.5 + (delay due to the 16-bit CLA) + 1.5
- Therefore, delay = 49 + (16/3.6) ~ 53.5 [4]
- Area overhead = 2 x 16-bit FAs + 16-bit FA + 16-bit carry look-ahead adder (CLA) + 3 x 16-bit HAs + 3 x 16-bit multiplexers ~ 32 + 16 + 16 x (10/7.2) + 24 + 24 [4] = 48 + 22 + 48 = 118 FAs
- Percentage reduction in delay = (94 - 53.5) x 100 / 94 = 43.08%
- Percentage increase in area = (118/1152) x 100 = 10.24%
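The CLA terms use two scaling factors taken from [4]: a 16-bit CLA is treated as 3.6 times faster than a 16-bit ripple adder and as costing 10/7.2 times its area. The sketch below applies those factors; the helper names are hypothetical, and the code prints unrounded values so they can be compared with the rounded figures on the slide.

```python
CLA_SPEEDUP = 3.6           # ripple delay / CLA delay, factor quoted from [4]
CLA_AREA_FACTOR = 10 / 7.2  # CLA area / ripple area, factor quoted from [4]


def cla_delay(bits: int) -> float:
    """Delay of a `bits`-wide CLA in FA delay units, scaled from a ripple adder."""
    return bits / CLA_SPEEDUP


def cla_area(bits: int) -> float:
    """Area of a `bits`-wide CLA in FA-equivalents, scaled from a ripple adder."""
    return bits * CLA_AREA_FACTOR


delay = 49 + cla_delay(16)
area_overhead = 32 + 16 + cla_area(16) + 24 + 24

print(delay)
print(area_overhead)
print((94 - delay) * 100 / 94)
print(area_overhead * 100 / 1152)
```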

Delay and Area Calculations:-
- Delay of the circuit = (2x16 + 16 - 2) + 1.5 + (delay due to the 16-bit CLA) + 1.5 + 1 (added delay due to one FA)
- Therefore, delay = 49 + (16/3.6) + 1 ~ 54.5 [4]
- Area overhead = 2 x 16-bit FAs + 16-bit FA + 16-bit carry look-ahead adder (CLA) + 16-bit HA + 1-bit FA + 15-bit HA + 3 x 16-bit multiplexers ~ 32 + 16 + 16 x (10/7.2) + 8 + 1 + 8.5 + 24 [4] = 48 + 22 + 41.5 = 111.5 FAs
- Percentage reduction in delay = (94 - 54.5) x 100 / 94 = 42.02%
- Percentage increase in area = (111.5/1152) x 100 = 9.7%

32x32 Multiplier with 4x4 Multipliers:-
- New delay of the circuit = (2x4 + 4 - 2) + 1.5 + 1.5 + 10 (CLAs) + 3 + 4.5 (both from the previous circuit's values) = 30.5
- New area overhead = 8 x 4-bit FAs + 8 x 4-bit HAs + 4 x 4-bit CLA + 4 x 4-bit FA + overhead of the previous circuit = 32 + 16 + 16 x (10/7.2) + 16 + 111.5 ~ 198 FAs
- Percentage reduction in delay = (94 - 30.5) x 100 / 94 ~ 68%
- Percentage increase in area = (198/1152) x 100 ~ 17%
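Because this slide chains many terms, it helps to sum them mechanically. The sketch below lists the delay and area terms exactly as they appear on the slide and adds them up; the list names are hypothetical, and the grouping of terms is only a bookkeeping aid, not a description of the circuit.

```python
# Delay terms as listed on the slide, in FA delay units.
delay_terms = [
    2 * 4 + 4 - 2,  # 4x4 array multiplier
    1.5,
    1.5,
    10,             # CLAs
    3,              # from the previous circuit
    4.5,            # from the previous circuit
]

# Area overhead terms as listed on the slide, in FA-equivalents.
area_terms = [
    8 * 4,              # 8 x 4-bit FAs
    16,                 # 8 x 4-bit HAs
    16 * (10 / 7.2),    # 4 x 4-bit CLAs, area factor from [4]
    4 * 4,              # 4 x 4-bit FAs
    111.5,              # overhead of the previous circuit
]

new_delay = sum(delay_terms)
new_area_overhead = sum(area_terms)

print(new_delay)                          # 30.5
print(round(new_area_overhead))           # 198
print(round((94 - new_delay) * 100 / 94))  # 68
print(round(new_area_overhead * 100 / 1152))  # 17
```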

Conclusion:-
- The percentage reduction in delay is much higher than the percentage increase in area. It is therefore very likely that the power consumed after voltage scaling is much lower than the original value.
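The slides stop short of the power estimate, so the following is only a rough sketch of how the delay slack might be traded for a lower supply voltage. Everything in it is an assumption not taken from the slides: the alpha-power delay model delay ∝ V_dd / (V_dd - V_t)^alpha with alpha = 2, the example values V_dd = 2.5 V and V_t = 0.5 V, and the approximation that switched capacitance grows in proportion to area so that power at fixed throughput scales as (1 + area increase) x V_dd^2. The function names are hypothetical.

```python
def delay_factor(vdd: float, vt: float, alpha: float = 2.0) -> float:
    """Relative gate delay under the assumed model Vdd / (Vdd - Vt)**alpha."""
    return vdd / (vdd - vt) ** alpha


def scaled_vdd(vdd0: float, vt: float, speedup: float, alpha: float = 2.0) -> float:
    """Lowest Vdd whose slowdown just cancels a circuit that is `speedup` times faster.

    Uses bisection; delay_factor decreases monotonically for Vdd > Vt.
    """
    target = delay_factor(vdd0, vt, alpha) * speedup
    lo, hi = vt + 1e-9, vdd0
    for _ in range(200):
        mid = (lo + hi) / 2
        if delay_factor(mid, vt, alpha) > target:
            lo = mid   # too slow at this voltage, move up
        else:
            hi = mid   # fast enough, try a lower voltage
    return hi


def power_ratio(vdd0: float, vdd: float, area_increase: float) -> float:
    """New power / old power, assuming capacitance proportional to area."""
    return (1 + area_increase) * (vdd / vdd0) ** 2


# About 68% less delay and 17% more area, as for the 4x4-based design.
speedup = 1 / (1 - 0.68)
vdd_new = scaled_vdd(vdd0=2.5, vt=0.5, speedup=speedup)
print(vdd_new, power_ratio(2.5, vdd_new, area_increase=0.17))
```

Under these assumed values the supply can drop to roughly half of its original level and the estimated power falls well below the original, which is the direction the conclusion predicts; the exact figure depends entirely on the assumed model and voltages.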

References
[1] dspvillage.ti.com/docs/catalog/dspplatform/details.jhtml
[2] www.llnl.gov/CASC/Overture/henshaw/documentation/App/manual/node160.html
[3] books.nap.edu/html/up_to_spedd/appD.html
[4] J. M. Rabaey and M. Pedram, Low Power Design Methodologies, Kluwer Academic Publishers, Boston, MA, 1996.
