ALU Organization Michael Vong Louis Young Rongli Zhu Dan.



Overall ALU Organization The output lines (Y3 … Y0) run all the way through

ALU Organization: One Function Per Column Control signals will enable all transmission gates in a column

ALU Organization: One Bit Per Row Only one transmission gate in a row will be turned on. Only one function will drive Y.

Adder Logic Design

BK Cell States Our adder uses BK cells. For each column of addition, there are three possible states:
- (0 + 1) or (1 + 0) is carry propagate = P
- (1 + 1) is carry generate = G
- (0 + 0) is carry kill = K

BK Cell Truth Table
More Significant Input   Less Significant Input   Output
K                        K                        K
K                        P                        K
K                        G                        K
P                        K                        K
P                        P                        P
P                        G                        G
G                        K                        G
G                        P                        G
G                        G                        G
Each BK cell looks at the carry status of two networks and generates a single carry status.
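The truth table can be summarized by one rule: a more significant P passes the less significant status through, while K or G decides the result outright. The Python sketch below is an illustration only, not part of the original deck; the name `bk_combine` is hypothetical.

```python
# Carry states from the slides: K (kill), P (propagate), G (generate).
def bk_combine(more, less):
    """Combine a more significant carry status with a less significant one."""
    # P is transparent, so the less significant status decides.
    # K and G override whatever arrives from the less significant side.
    return less if more == "P" else more

# Rebuild the nine-row truth table in the same order as the slide.
table = [(m, l, bk_combine(m, l)) for m in "KPG" for l in "KPG"]
for more, less, out in table:
    print(more, less, "->", out)
```

The rows print in the slide's order (KK, KP, KG, PK, ...), so the output can be compared against the table line by line.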

BK Cell Boolean Equation
Y1 = BD + AD + AB
Y0 = BC + AC + AB
Note: The encoding used is G = 11, K = 00, and P = 10 or 01. (A, B) encodes the more significant input and (C, D) the less significant input. Y1 and Y0 are the same Boolean function, so just do the layout for Y1 and use two copies of it to get a BK cell. This is the same function as the ripple adder's carry out.
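As a cross-check, the two equations can be evaluated for all 16 input combinations and decoded back to K, P, or G. This Python sketch assumes, as the equations and adder.v suggest, that (A, B) carries the more significant status and (C, D) the less significant one; the helper names `decode` and `expected` are hypothetical.

```python
from itertools import product

def bk(a, b, c, d):
    """Boolean form of the BK cell: returns (Y1, Y0)."""
    y1 = (b & d) | (a & d) | (a & b)
    y0 = (b & c) | (a & c) | (a & b)
    return y1, y0

def decode(x, y):
    """Map a two-wire code to its state: 11 = G, 00 = K, 10 or 01 = P."""
    if x == y:
        return "G" if x else "K"
    return "P"

def expected(more, less):
    """Truth-table rule: P passes the less significant status through."""
    return less if more == "P" else more

# Collect every input pattern where the equations disagree with the table.
mismatches = []
for a, b, c, d in product((0, 1), repeat=4):
    y1, y0 = bk(a, b, c, d)
    if decode(y1, y0) != expected(decode(a, b), decode(c, d)):
        mismatches.append((a, b, c, d))

print(mismatches)  # prints []
```

One detail this makes visible: when the more significant input is P, Y1 follows D and Y0 follows C, so a 10 may come out as 01. Both codes mean P, so the decoded status is unchanged.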

Using BK Cells to make an Adder There is only one rule to using BK cells: To compute the carry C_i, you must have enough BK cells to reach all preceding bits, from bit (i-1) to bit 0. You can have just enough BK cells to compute the final carry, or you can have lots of BK cells to compute all carries.

BK Cell Example (part 1 of 2) If you just want the carry out of an 8 bit addition operation, then you will need 7 BK cells.

BK Cell Example (part 2 of 2) Note that the first input into the first BK cell on the right (the C and D of the red box) must be either G (11) or K (00). If the numbers we are adding are called A and B, this input is C = D = A_0 B_0 + A_0 C_in + B_0 C_in. The final output, the Y1 and Y0 of the yellow box, is also either G (11) or K (00).
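The 7-cell count can be reproduced with a small model. The Python sketch below assumes the cells are arranged as a balanced tree over 8 inputs (the figure itself is not in the transcript, so the arrangement is an assumption) and that bit 0 is folded with C_in before entering the tree; `carry_out_8bit` is a hypothetical name.

```python
def bk(a, b, c, d):
    """BK cell: (A, B) is the more significant status, (C, D) the less."""
    return (b & d) | (a & d) | (a & b), (b & c) | (a & c) | (a & b)

def carry_out_8bit(a, b, cin):
    """Return (carry out, number of BK cells used) for an 8-bit addition."""
    A = [(a >> i) & 1 for i in range(8)]
    B = [(b >> i) & 1 for i in range(8)]
    # Entry input: carry out of bit 0, which is always G (11) or K (00).
    c1 = (A[0] & B[0]) | (A[0] & cin) | (B[0] & cin)
    level = [(c1, c1)] + [(A[i], B[i]) for i in range(1, 8)]
    cells = 0
    # Pair up neighbours until a single status remains: 4 + 2 + 1 cells.
    while len(level) > 1:
        merged = []
        for i in range(0, len(level), 2):
            less, more = level[i], level[i + 1]
            merged.append(bk(more[0], more[1], less[0], less[1]))
            cells += 1
        level = merged
    y1, y0 = level[0]
    return y1, cells

print(carry_out_8bit(0xFF, 0x01, 0))  # prints (1, 7)
```

A serial chain of the same cells would also need 7 of them; the tree only shortens the longest path.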

Our Adder’s BK Cells This adder is around the same speed as a ripple adder. The entry into the red cell has the same delay as a BK cell. Red cell’s C = D = A_0 B_0 + A_0 C_in + B_0 C_in. So from the inputs to C_3 there are really 3 BK stages. Each stage has the same delay as the carry out of a ripple adder stage.

Other BK Cell Examples Our adder does not benefit from the BK cells because it’s only 4 bits wide. Larger adders do benefit. Screen shots are taken from: Sklansky's adder: Problem: high fan-out for the lowest C_8 BK cell.

Another BK Cell Example Kogge & Stone adders: This one has more BK cells but less fan-out.
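The cell-count versus fan-out trade-off can be made concrete by generating both networks and counting. The Python sketch below uses the textbook Sklansky and Kogge-Stone wiring patterns for 8 bits, which may differ in width from the screenshots in the deck. Fan-out here counts only BK-cell inputs within one level, which is a simplifying assumption; all function names are hypothetical.

```python
def kogge_stone(n):
    """Levels of (i, j) cells: position i absorbs the status at position j."""
    levels, stride = [], 1
    while stride < n:
        levels.append([(i, i - stride) for i in range(stride, n)])
        stride *= 2
    return levels

def sklansky(n):
    levels, k = [], 0
    while (1 << k) < n:
        # Upper half of each 2^(k+1) block reads the top of the lower half.
        levels.append([(i, ((i >> k) << k) - 1)
                       for i in range(n) if (i >> k) & 1])
        k += 1
    return levels

def stats(levels):
    """Return (total cells, worst fan-out of any signal within one level)."""
    cells = sum(len(level) for level in levels)
    worst = 0
    for level in levels:
        loads = {}
        for i, j in level:
            for src in (i, j):
                loads[src] = loads.get(src, 0) + 1
        worst = max(worst, max(loads.values()))
    return cells, worst

def combine(more, less):
    return less if more == "P" else more

def carries(levels, a, b, cin, n):
    """Carry out of every bit position, computed through the network."""
    status = []
    for i in range(n):
        x, y = (a >> i) & 1, (b >> i) & 1
        status.append("G" if x & y else "P" if x | y else "K")
    if status[0] == "P":  # fold the carry in into bit 0
        status[0] = "G" if cin else "K"
    for level in levels:
        prev = list(status)
        for i, j in level:
            status[i] = combine(prev[i], prev[j])
    return [int(s == "G") for s in status]

print("Kogge-Stone:", stats(kogge_stone(8)))  # prints Kogge-Stone: (17, 2)
print("Sklansky:   ", stats(sklansky(8)))     # prints Sklansky:    (12, 4)
```

For 8 bits this gives 17 cells with a fan-out of 2 for Kogge-Stone against 12 cells with a fan-out of 4 for Sklansky, matching the qualitative comparison on the slides.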

Our Adder --- After the BK Tree After the carries are generated, add them to the xor sums. If we add A and B, and let the answer be SUM (C_0 is the carry in):
SUM_0 = (A_0 xor B_0) xor C_0
SUM_1 = (A_1 xor B_1) xor C_1
SUM_2 = (A_2 xor B_2) xor C_2
SUM_3 = (A_3 xor B_3) xor C_3
This operation of two xor gates is called the “summer” in our adder.

Summer Schematic The idea is to have A and B preset a path so that when C is correctly set, it will show up at Y really fast. It didn’t work out that well. Y = (A XOR B) XOR C

Adder Logic Summary A tree of BK cells is used to compute all of the carries. The final sum for the i-th bit is A_i xor B_i xor C_i, where A and B are the numbers that we are adding, and C is the carry computed by the BK cells.

Confirming the Logic with Verilog
// BK cell: (A, B) is the more significant status, (C, D) the less significant one
module bk(Y1, Y0, A, B, C, D);
  input A, B, C, D;
  output Y1, Y0;
  assign Y1 = (B&D) | (A&D) | (A&B);
  assign Y0 = (B&C) | (A&C) | (A&B);
endmodule
// Summer: Y = (A xor B) xor C, written as a sum of products
module summer(Y, A, B, C);
  input A, B, C;
  output Y;
  assign Y = (~A & ~B & C) | (~A & B & ~C) | (A & ~B & ~C) | (A & B & C);
endmodule
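The summer module is written as a sum of products rather than with xor operators. A quick Python check, given here only as an illustration with hypothetical function names, confirms that the two forms agree on all eight input patterns.

```python
from itertools import product

def summer_sop(a, b, c):
    """Sum-of-products form used in the summer module."""
    na, nb, nc = 1 - a, 1 - b, 1 - c
    return (na & nb & c) | (na & b & nc) | (a & nb & nc) | (a & b & c)

def summer_xor(a, b, c):
    """Two-xor form from the slides: Y = (A xor B) xor C."""
    return (a ^ b) ^ c

agree = all(summer_sop(*bits) == summer_xor(*bits)
            for bits in product((0, 1), repeat=3))
print(agree)  # prints True
```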

adder.v (page 1)
module adder(SUM, COUT, A, B, CIN);
  input [3:0] A, B;
  input CIN;
  output [3:0] SUM;
  output COUT;
  wire c1, c2, c3, c4;
  // Carry out of bit 0 (the entry into the BK tree)
  assign c1 = (A[0] & B[0]) | (A[0] & CIN) | (B[0] & CIN);
  wire bk1_0, bk1_1, bk2_0, bk2_1;
  wire bk3_0, bk3_1, bk4_0, bk4_1;
  bk bk1(bk1_1, bk1_0, A[1], B[1], c1, c1);
  bk bk2(bk2_1, bk2_0, A[2], B[2], bk1_1, bk1_0);
  bk bk3(bk3_1, bk3_0, A[3], B[3], A[2], B[2]);
  bk bk4(bk4_1, bk4_0, bk3_1, bk3_0, bk1_1, bk1_0);

adder.v (page 2)
  assign c4 = bk4_1;
  assign c3 = bk2_1;
  assign c2 = bk1_1;
  assign COUT = c4;
  summer s0(SUM[0], A[0], B[0], CIN);
  summer s1(SUM[1], A[1], B[1], c1);
  summer s2(SUM[2], A[2], B[2], c2);
  summer s3(SUM[3], A[3], B[3], c3);
endmodule

test.v
module testbench;
  wire [3:0] SUM;
  wire COUT;
  reg [3:0] A, B;
  reg CIN;
  adder adder1(SUM, COUT, A, B, CIN);
  reg [4:0] i, j, k;
  initial begin
    CIN = 1'b1; // carry in is held at 1 for the whole test
    for(i = 0; i < 16; i = i + 1) begin
      for(j = 0; j < 16; j = j + 1) begin
        A[3:0] = i[3:0];
        B[3:0] = j[3:0];
        #20;
        k = i + j + 1; // expected result, including the carry in

test.v (page 2)
        if(SUM[3:0] != k[3:0]) begin
          $display("At time %t, A = %d, B = %d, CIN = %d, SUM = %d, COUT = %d \n", $time, A, B, CIN, SUM, COUT);
        end else begin
          $display("A = %d, B = %d, CIN = %d tested \n", A, B, CIN);
        end
      end // for j
    end // for i
    $display("end of test \n");
  end
endmodule
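The testbench holds CIN at 1 and compares only SUM. As a complementary check, the Python sketch below mirrors the wiring of adder.v and tests all 512 input combinations, including CIN = 0 and COUT. It is a behavioral model written for illustration, not the authors' code, and the function names are hypothetical.

```python
def bk(a, b, c, d):
    """BK cell from the bk module: returns (Y1, Y0)."""
    return (b & d) | (a & d) | (a & b), (b & c) | (a & c) | (a & b)

def summer(a, b, c):
    return a ^ b ^ c

def adder(a, b, cin):
    """Mirror of adder.v: returns (SUM, COUT) for 4-bit a and b."""
    A = [(a >> i) & 1 for i in range(4)]
    B = [(b >> i) & 1 for i in range(4)]
    c1 = (A[0] & B[0]) | (A[0] & cin) | (B[0] & cin)
    bk1 = bk(A[1], B[1], c1, c1)
    bk2 = bk(A[2], B[2], bk1[0], bk1[1])
    bk3 = bk(A[3], B[3], A[2], B[2])
    bk4 = bk(bk3[0], bk3[1], bk1[0], bk1[1])
    c2, c3, c4 = bk1[0], bk2[0], bk4[0]
    bits = [summer(A[0], B[0], cin), summer(A[1], B[1], c1),
            summer(A[2], B[2], c2), summer(A[3], B[3], c3)]
    total = sum(bit << i for i, bit in enumerate(bits))
    return total, c4

# Exhaustive comparison against integer addition.
failures = [(a, b, cin)
            for a in range(16) for b in range(16) for cin in (0, 1)
            if adder(a, b, cin) != ((a + b + cin) & 0xF, (a + b + cin) >> 4)]
print(failures)                    # prints []
print(adder(0b1111, 0b0001, 0))    # prints (0, 1)
```

The second call uses the input pattern A = 1111, B = 0001, CIN = 0, where the carry propagates through every bit.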

Simulation with ModelSim

Adder Circuitry

Layout Guidelines
Transistor sizes (most of the time): PMOS: L = 0.6 um, W = 5.4 um; NMOS: L = 0.6 um, W = 3 um
Cell height: total height 27 um; VDD and GND path width 1.5 um

Cell Hierarchy
zproj_adder4b
  – zproj_bk
      zproj_bk_y1
  – zproj_summer
      Zproj_mux2b

The bk_y1 Cell AOI Schematic Recall that the BK cell has a Y1 and Y0

The bk_y1 layout Note that metal 2 can route vertically through almost all of the cell.

The Complete BK Cell Schematic

The bk Cell Layout View 1

The bk Cell Layout View 2 A and B are the same all the way across, while C and D swap rows. Y1 is in the middle, while Y0 is at the far right.

Multiplexer Schematic The multiplexer is the basic cell for the summer

Multiplexer Schematic

Multiplexer Layout Note that vertical routing of metal 2 is possible in less than half of the cell.

Multiplexer Test Setup Note how ideal sources are fed directly into the mux

Multiplexer Power Usage

Multiplexer Test Result The load capacitance is 30 fF. [Results table with columns S, ONE, ZERO, Y, Time (ns); cell values garbled in extraction]

Multiplexer Test Results (Page 2) Power = W/cm^2. The first group of results, highlighted in red cells, turned out to be inaccurate. The ONE and ZERO lines are not gate terminals. When the pathway is set (S held steady), the rate at which the output changes actually follows how fast the input changes. The output changes rapidly in the test because the input is an ideal voltage source with a rise time of 200 ps. A more realistic switching time can be obtained by passing the ideal input through two inverters before sending it to the “ONE” or “ZERO” line of the multiplexer.

Summer Schematic

Summer Layout View 1 The (NOT C) is labeled as C’

Summer Layout View 2 The right most Y is the sum

Putting the Support Cells Together to Form a 4 Bit Adder

Adder Schematic Page 1 This is the BK tree part of the adder

Adder Schematic Page 2 Output from the BK tree, and the original A and B bits are passed into the summer cells.

Adder Layout Note the long distance metal2 vertical routing

Adder Layout View 2

Adder Layout Input View

Adder Layout Output View

Adder Layout Area

Adder Test Setup I used VDC for 1 and VPULSE for Each output pin is loaded with 30 fF capacitors.

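Adder Testing Results (-- marks values missing from the transcript)
A    B    CIN   COUT > 2.5 V   SUM3 < 2.5 V   SUM3 < 1 V
--   --   --    -- ns          3.264 ns       4.192 ns
--   --   --    -- ns          3.111 ns       4.039 ns
--   --   --    -- ns          2.33 ns        3.150 ns
--   --   --    -- ns          1.906 ns       2.726 ns
--   --   --    -- ns          2.22 ns        3.069 ns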

Adder Worst Case A = 1111 B = 0001 CIN = 0 Note the sagging SUM3 output.

The SUM3 output is sagging because it is 4 transistors away from GND

Speeding up SUM3’s Rate of Change with a Multiplexer The capacitor now charges and discharges faster because it is closer to VDD and GND. However, the multiplexer adds an extra delay. Effect of using an extra multiplexer at the output:
- Y_fast will arrive at 2.5V 0.35 ns later.
- Y_fast will arrive at 1V ns earlier.

Timing without a Multiplexer Buffer SUM3 changes slowly if its output is used to charge 30 fF of capacitance directly. Note the time scale for this test goes up to 16 ns.

Timing with a Multiplexer Buffer Note the time scale for this test goes up to just 12 ns. Passing SUM3’s output to a multiplexer buffer delays the wave but increases the rate of change.

ALU Schematic

DRC of ALU

Extracted View

LVS