Nov. 29, 2005, ELEC 6970-001: Power Minimization Using Voltage Reduction and Parallel Processing. By Sudheer Vemula.



Outline:
• Goal of the project
• Introduction to parallel processing
• Delay of the critical path in the given circuit, a 32x32 array multiplier
• Methods to introduce parallelism in the given circuit
• Reduction in critical-path delay due to the introduced parallelism
• Calculations estimating area and delay
• Conclusion

Goal of the Project:
• To reduce the power consumption of the circuit by reducing the supply voltage. Consequence: the delay of the critical path increases.
• To compensate for the increase in delay by introducing parallelism.
• To calculate the reduction in power.

Parallel Processing:
• Definition: Concurrent execution of several programs, or of several blocks of a program, is known as parallel processing [1].
• Types of parallelism: data parallelism and control parallelism.
• Data parallelism is the parallel execution of a single expression on data distributed over multiple processors [2].
• Control parallelism is parallelism achieved by the simultaneous execution of multiple threads [3].

Voltage Scaling and Delay:
• Since a transistor is a voltage-controlled current device, its resistance depends on the voltage and current.
• = 0.5(0.5 Rp C Rn C) = 2 for low Vdd

Critical Path:
• Critical-path delay of a multiplier of order n x m = 2m + n - 2
• Critical-path delay of a multiplier of order 32 x 32 = 2 x 32 + 32 - 2 = 94
• Approximate area of a 32 x 32 multiplier = 1024 FAs + 128 FAs (due to AND gates) = 1152 FAs
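The critical-path and area figures on this slide can be checked with a short sketch. The function names are illustrative, and the AND-gate cost of one eighth of an FA is an assumption inferred from the slide's "128 FAs" for 1024 AND gates; the slides do not state it explicitly.

```python
def critical_path_delay(n: int, m: int) -> int:
    """Critical-path delay of an n x m array multiplier,
    using the slide's formula 2m + n - 2."""
    return 2 * m + n - 2


def array_multiplier_area(n: int, m: int, and_gate_fa_ratio: float = 1 / 8) -> float:
    """Approximate area in full-adder (FA) equivalents: one FA per
    partial-product cell plus one AND gate per cell, each AND gate
    counted as `and_gate_fa_ratio` of an FA (assumed ratio)."""
    cells = n * m
    return cells + cells * and_gate_fa_ratio


print(critical_path_delay(32, 32))    # 94
print(array_multiplier_area(32, 32))  # 1152.0
```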

Horizontal Partition:
Ex.: A = 98 and B = 76
AB = (90 x 76) + (8 x 76) = (9 x 76) x 10 + (8 x 76)
Critical-path delay for a multiplier of order 32 x 16
= (2 x 16 + 32 - 2) + delay of the 32-bit full adder (FA) + delay of the 16-bit half adder (HA)
= 62 + delay of the 32-bit FA + delay of the 16-bit HA

Vertical Partition:
Ex.: A = 98 and B = 76
AB = (98 x 70) + (98 x 6) = (98 x 7) x 10 + (98 x 6)
Critical-path delay for a multiplier of order 16 x 32
= (2 x 32 + 16 - 2) + delay of the 32-bit FA + delay of the 16-bit HA
= 78 + delay of the 32-bit FA + delay of the 16-bit HA
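The decimal examples on the two partition slides carry over directly to binary operands. The following is a minimal sketch, assuming 32-bit unsigned operands split into 16-bit halves; the function names are hypothetical, and which operand each partition splits is inferred from the A = 98, B = 76 examples.

```python
MASK16 = 0xFFFF  # selects the low 16 bits of an operand


def multiply_horizontal(a: int, b: int) -> int:
    """Split A into 16-bit halves: A*B = (A_hi*B) * 2^16 + (A_lo*B)."""
    a_hi, a_lo = a >> 16, a & MASK16
    # The two half-size products are independent and could run in parallel.
    return ((a_hi * b) << 16) + (a_lo * b)


def multiply_vertical(a: int, b: int) -> int:
    """Split B into 16-bit halves: A*B = (A*B_hi) * 2^16 + (A*B_lo)."""
    b_hi, b_lo = b >> 16, b & MASK16
    return ((a * b_hi) << 16) + (a * b_lo)


a, b = 0x89ABCDEF, 0x12345678
print(multiply_horizontal(a, b) == a * b)  # True
print(multiply_vertical(a, b) == a * b)    # True
```

In both cases the shift by 16 bits plays the role of the "x 10" in the decimal examples.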

Delay of the 32-bit FA:
• The products and the sum are computed simultaneously.
• The FA introduces a delay of only 1 unit.
• The remaining delay is due to the HA.
• The delay of the 16-bit HA is approximately equal to 8 FA units.
[Worked example with A = 1010: Product1, Product2, Sum]

Eliminating the Delay due to the Half Adder:
• A 16-bit multiplexer is introduced to eliminate the delay due to the 16-bit half adder.
• The only additional delay is that of the multiplexer.
• Delay of this circuit = (~delay due to mux)
• Additional number of gates = 32 FAs + 16 HAs + multiplexers ≈ 45 FAs
• The same procedure can be applied to the circuit with horizontal partitioning.
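One way to read the multiplexer idea is as a carry-select arrangement: the half-adder chain only has to add a single carry to the upper 16 bits, so both possible results (upper bits unchanged, upper bits plus one) can be prepared in advance and the carry then acts as the multiplexer's select line. This is an interpretation, not a detail confirmed by the slides, and the function below is a hypothetical behavioral model rather than the author's circuit.

```python
MASK16 = 0xFFFF


def add_with_select(upper: int, lower_a: int, lower_b: int) -> int:
    """Behavioral model of (upper:lower_a) + lower_b on 32 bits.

    Both candidates for the upper half are formed without waiting for
    the carry out of the lower half; the carry only drives the final
    2-to-1 selection (the multiplexer)."""
    upper_plain = upper & MASK16           # candidate if carry is 0
    upper_plus_one = (upper + 1) & MASK16  # candidate if carry is 1

    low_sum = (lower_a & MASK16) + (lower_b & MASK16)
    carry = low_sum >> 16

    selected = upper_plus_one if carry else upper_plain  # the mux
    return (selected << 16) | (low_sum & MASK16)


print(hex(add_with_select(0x1234, 0xFFFF, 0x0001)))  # 0x12350000
```

With this arrangement the carry sees only the multiplexer delay rather than the full ripple through 16 half adders, which matches the slide's claim that the only additional delay is that of the multiplexer.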

Ex.: A = 98 and B = 76
AB = (90 x 76) + (8 x 76)
= (9 x 76) x 10 + (8 x 76)
= (9 x 7) x 100 + (9 x 6) x 10 + (8 x 7) x 10 + (8 x 6)
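The four-way split in this example can be written for any base. The sketch below uses a hypothetical helper `multiply_four_way`; with base 10 it reproduces the slide's digits, and with base 2^16 it corresponds to building a 32x32 product from four 16x16 products.

```python
def multiply_four_way(a: int, b: int, base: int) -> int:
    """Split both operands into a high and a low part in the given base
    and combine the four partial products."""
    a_hi, a_lo = divmod(a, base)
    b_hi, b_lo = divmod(b, base)
    partial_products = [
        a_hi * b_hi * base * base,  # (9 x 7) x 100 in the slide's example
        a_hi * b_lo * base,         # (9 x 6) x 10
        a_lo * b_hi * base,         # (8 x 7) x 10
        a_lo * b_lo,                # (8 x 6)
    ]
    return sum(partial_products)


print(multiply_four_way(98, 76, 10))  # 7448
print(multiply_four_way(0x89ABCDEF, 0x12345678, 1 << 16) == 0x89ABCDEF * 0x12345678)  # True
```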

Delay and Area Calculations:
• Delay of the circuit = (2 x 16 + 16 - 2) + 1.5 + (delay due to 32-bit FA) + 1.5
• The delay due to the 32-bit FA is 16 units, because its 16 LSBs are computed simultaneously with the previous stage, whereas its 16 MSBs are computed without any overlap.
• Therefore, delay = 46 + 1.5 + 16 + 1.5 = 65
• Area overhead = 2 x 16-bit FAs + 32-bit FA + 3 x 16-bit HAs + 3 x 16-bit multiplexers ≈ 64 + 24 + 3 x 8 = 112 FAs
• Percentage reduction in delay = (94 - 65) x 100 / 94 ≈ 30.85%
• Percentage increase in area = (112 / 1152) x 100 ≈ 9.7%
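The percentages on this slide follow from the totals 65 and 112. The sketch below reproduces them under stated assumptions: the delay is taken to decompose as 46 + 1.5 + 16 + 1.5, a half adder is counted as half an FA, and a 16-bit multiplexer as 8 FAs. These component costs are inferred from the slide's totals and are not stated by the author.

```python
BASE_DELAY = 94   # 32x32 array multiplier, from the critical-path slide
BASE_AREA = 1152  # FA equivalents, from the critical-path slide

HA_COST = 0.5     # assumed: one half adder ~ half a full adder
MUX16_COST = 8    # assumed: one 16-bit multiplexer ~ 8 full adders


def percent_reduction(base: float, new: float) -> float:
    return (base - new) * 100 / base


def percent_increase(base: float, extra: float) -> float:
    return extra * 100 / base


# 16x16 sub-multiplier path, two mux stages (1.5 each, assumed), 32-bit FA (16 units)
delay = (2 * 16 + 16 - 2) + 1.5 + 16 + 1.5
# 2 x 16-bit FAs + 32-bit FA + 3 x 16-bit HAs + 3 x 16-bit multiplexers
area_overhead = 2 * 16 + 32 + 3 * 16 * HA_COST + 3 * MUX16_COST

print(delay, area_overhead)                                # 65.0 112.0
print(round(percent_reduction(BASE_DELAY, delay), 2))      # 30.85
print(round(percent_increase(BASE_AREA, area_overhead), 2))  # 9.72
```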

Circuit with Improved Delay:

Delay and Area Calculations:
• Delay of the circuit = (2 x 16 + 16 - 2) + 1.5 + (delay due to 16-bit CLA) + 1.5
• Therefore, delay = 49 + (16/3.6) ≈ 53.5 [4]
• Area overhead = 2 x 16-bit FAs + 16-bit FA + 16-bit carry look-ahead adder (CLA) + 3 x 16-bit HAs + 3 x 16-bit multiplexers ≈ 96 + 16 x (10/7.2) [4] = 96 + 22.2 ≈ 118 FAs
• Percentage reduction in delay = (94 - 53.5) x 100 / 94 = 43.08%
• Percentage increase in area = (118 / 1152) x 100 = 10.24%


Delay and Area Calculations:
• Delay of the circuit = (2 x 16 + 16 - 2) + 1.5 + (delay due to 16-bit CLA) + 1.5 + (added delay due to one FA)
• Therefore, delay = 49 + (16/3.6) + 1 ≈ 54.5 [4]
• Area overhead = 2 x 16-bit FAs + 16-bit FA + 16-bit carry look-ahead adder (CLA) + 16-bit HA + 1-bit FA + 15-bit HA + 3 x 16-bit multiplexers, with the CLA counted as 16 x (10/7.2) FAs [4], ≈ 111.5 FAs
• Percentage reduction in delay = (94 - 54.5) x 100 / 94 = 42.02%
• Percentage increase in area = (111.5 / 1152) x 100 ≈ 9.7%

32x32 Multiplier with 4x4 Multipliers:
• New delay of the circuit = (2 x 4 + 4 - 2) + ... (CLAs) + ... (both from the previous circuit's values) = 29.5
• New area overhead = 8 x 4-bit FAs + 8 x 4-bit HAs + 4 x 4-bit CLAs + 4 x 4-bit FAs + overhead of the previous circuit = 64 + 111.5 + 16 x (10/7.2) ≈ 198 FAs
• Percentage reduction in delay = (94 - 29.5) x 100 / 94 ≈ 68.6%
• Percentage increase in area = (198 / 1152) x 100 ≈ 17%

Conclusion:
• The percentage reduction in delay is much higher than the percentage increase in area, so it is very likely that the power consumed after voltage scaling will be much lower than the original value.
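The conclusion can be made concrete with a rough estimate. The sketch below is not from the slides: it assumes dynamic power proportional to C x Vdd^2 at fixed throughput, capacitance proportional to area, and gate delay proportional to Vdd / (Vdd - Vt)^alpha (the alpha-power model). The values Vdd = 2.5 V, Vt = 0.5 V and alpha = 2 are placeholders chosen only for illustration; the delay and area figures are those of the 4x4-multiplier design.

```python
def relative_delay(vdd: float, vt: float, alpha: float) -> float:
    """Gate delay up to a constant factor (assumed alpha-power model)."""
    return vdd / (vdd - vt) ** alpha


def scaled_voltage(v0: float, vt: float, alpha: float, slack: float,
                   tol: float = 1e-9) -> float:
    """Lowest Vdd whose delay is at most `slack` times the delay at v0.
    Found by bisection; the delay decreases monotonically with Vdd
    for alpha >= 1."""
    target = slack * relative_delay(v0, vt, alpha)
    lo, hi = vt + 1e-6, v0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if relative_delay(mid, vt, alpha) > target:
            lo = mid  # too slow at this voltage, raise it
        else:
            hi = mid  # fast enough, try a lower voltage
    return hi


def power_ratio(v0: float, vt: float, alpha: float,
                delay_old: float, delay_new: float,
                area_increase: float) -> float:
    """Power after scaling divided by original power, assuming
    P ~ C * Vdd^2 with C proportional to area and unchanged throughput."""
    v = scaled_voltage(v0, vt, alpha, delay_old / delay_new)
    return (1 + area_increase) * (v / v0) ** 2


ratio = power_ratio(v0=2.5, vt=0.5, alpha=2.0,
                    delay_old=94, delay_new=29.5,
                    area_increase=198 / 1152)
print(f"power after scaling / original power = {ratio:.2f}")
```

Under these assumptions the delay saved by parallelism is spent entirely on lowering the supply voltage, and the quadratic dependence of power on Vdd outweighs the added area.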

References:
[1] dspvillage.ti.com/docs/catalog/dspplatform/details.jhtml
[2] cumentation/App/manual/node160.html
[3] books.nap.edu/html/up_to_spedd/appD.html
[4] J. M. Rabaey and M. Pedram, Low Power Design Methodologies, Kluwer Academic Publishers, Boston, MA, 1996.