Farhan Mohamed Ali (W2-1) Jigar Vora (W2-2) Sonali Kapoor (W2-3) Avni Jhunjhunwala (W2-4) Shiven Seth (W2-5) Presentation 1 MAD MAC 525 1 st February,

Presentation transcript:

Farhan Mohamed Ali (W2-1)
Jigar Vora (W2-2)
Sonali Kapoor (W2-3)
Avni Jhunjhunwala (W2-4)
Shiven Seth (W2-5)

Presentation 1: MAD MAC 525
1st February, 2006
Architecture Proposal
W2

Project Objective: Design a crucial part of a GPU called the Multiply Accumulate Unit (MAC) which will revolutionize graphics.

MAD MAC 525 Status:
- Project chosen
- Specifications defined
- Architecture Design
- Behavioral Verilog
- Testbenches
To be done:
- Verilog: Gate Level Design
- Schematic
- Floor plan
- Layout
- Extraction, LVS, post-layout simulation

Overview - MAD MAC 525
Multiply Accumulate unit (MAC)
- Executes the function AB+C on 16-bit floating point inputs
- Multiplies and adds in parallel to greatly speed up operation
- Rounding is performed only once, giving greater accuracy than separate multiply and add operations
MAD MAC accelerates FP16 blending to enable true HDR graphics
- Bright things can be really bright
- Dark things can be really dark
- And the details can be seen in both
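The single-rounding claim can be illustrated in software. The following is a minimal sketch, not the MAD MAC design itself: it uses Python's `struct` half-precision format (`'e'`) as a stand-in for fp16 rounding, and the helper names `to_fp16`, `mac_separate` and `mac_fused` are made up for this example. The operands are chosen so that the double-precision intermediate is exact.

```python
import struct

def to_fp16(x):
    """Round a Python float to the nearest fp16 value (round-half-even)."""
    return struct.unpack('<e', struct.pack('<e', x))[0]

def mac_separate(a, b, c):
    """Multiply, round to fp16, then add and round again (two roundings)."""
    return to_fp16(to_fp16(a * b) + c)

def mac_fused(a, b, c):
    """Multiply and add at full precision, then round once."""
    return to_fp16(a * b + c)

# Hypothetical operands, all exactly representable in fp16.
a = 1 + 2.0 ** -10
b = 1 + 2.0 ** -9
c = -(1 + 2.0 ** -9 + 2.0 ** -10)

separate = mac_separate(a, b, c)   # the low product bit is rounded away first
fused = mac_fused(a, b, c)         # the low product bit survives to the result
print(separate, fused)
```

With these operands the separately rounded result is 0.0, while the fused result keeps the 2^-19 term that the intermediate rounding discarded.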

Quick Overview of FP
A = … x 2^2
B = … x 2^5
C = … x 2^8
Step 1: A*B
- Multiply the significands: … * … = …
- Exponent of result is expA + expB = 7
- A*B = … x 2^7
Step 2: Align C
- To add two FP numbers, their exponents must be the same
- Shift by expA + expB - expC = 7 - 8 = -1
- Shift the significand of C left by 1: … -> …

Quick Overview of FP (contd.)
Step 3: Depending on the signs of A*B and C, add or subtract the two
- Suppose A, B, and C are all positive
- A*B + C = … + … = …
Step 4: Normalize the Result
- Currently the significand is … and the exponent is expA + expB = 7
- Normalized to … x 2^9
Step 5: Round the Result
- The significand needs to fit in 10 bits
- Based on bits 11 through 13, the significand is rounded to fit in 10 bits
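The five steps above can be sketched with integer arithmetic. This is an illustrative model only, assuming positive operands, unbiased exponents, and round-to-nearest-even (the slides do not name a rounding mode). The function name `mac_positive` and the significands 1.5, 1.25 and 1.75 are made-up values; only the exponents 2, 5 and 8 come from the slides. Significands are 11-bit integers with the hidden one included, so 1.5 is 1536.

```python
def mac_positive(sig_a, exp_a, sig_b, exp_b, sig_c, exp_c, frac_bits=10):
    """Compute A*B + C for positive operands; returns (significand, exponent)."""
    # Step 1: multiply the significands and add the exponents.
    prod = sig_a * sig_b              # carries 2*frac_bits fraction bits
    exp_p = exp_a + exp_b

    # Step 2: align C to the product (give C the same fraction width first).
    c = sig_c << frac_bits
    shift = exp_p - exp_c
    if shift < 0:
        c <<= -shift                  # C is bigger: shift C left
        exp_unit = exp_p
    else:
        # Hardware shifts C right inside a wide datapath; shifting the
        # product left instead is equivalent here and loses no bits.
        prod <<= shift
        exp_unit = exp_c

    # Step 3: all operands are positive, so add.
    total = prod + c

    # Step 4: normalize so the leading one sits at bit position frac_bits.
    msb = total.bit_length() - 1
    exp_r = exp_unit + msb - 2 * frac_bits
    drop = msb - frac_bits

    # Step 5: round to nearest even using the exact discarded remainder.
    if drop <= 0:
        return total << -drop, exp_r
    kept = total >> drop
    rem = total & ((1 << drop) - 1)
    half = 1 << (drop - 1)
    if rem > half or (rem == half and kept & 1):
        kept += 1
        if kept >> (frac_bits + 1):   # rounding overflowed into a new bit
            kept >>= 1
            exp_r += 1
    return kept, exp_r

# 1.5 x 2^2 times 1.25 x 2^5 plus 1.75 x 2^8 (6 * 40 + 448 = 688)
sig, exp = mac_positive(1536, 2, 1280, 5, 1792, 8)
print(sig / 1024, exp)
```

For these inputs the alignment shift is 7 - 8 = -1 and the normalized result has exponent 9, matching the exponents on the slides.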

Block Diagram (figure). Blocks shown: RegArray A, RegArray B, RegArray C, Multiplier, Exp Calc, Align, Adder/Subtractor, Control Logic & Sign Determination, Normalize, Round, Reg Y, Leading 0 Anticipator, Input, Output (16 bits).

Design Decisions (Week 2):
Implementing a 16 bit (fp16) format
- 1 bit sign, 10 bit significand and 5 bit exponent
- Compatible with OpenEXR format used in latest games
Enable Ultra-Threading
- Implements high speed register arrays and fast thread switching logic to instantaneously switch to another available thread if the executing thread runs out of data
- Implementation: High speed register-arrays for each input
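The 1/5/10 bit layout above can be unpacked as follows. This is a sketch that assumes the standard half-precision conventions (exponent bias of 15, hidden leading one, subnormals when the exponent field is zero); the function name `decode_fp16` is made up for this example.

```python
import struct

def decode_fp16(bits):
    """Split a 16-bit pattern into (sign, exponent field, fraction, value)."""
    sign = (bits >> 15) & 0x1        # 1 bit
    exp = (bits >> 10) & 0x1F        # 5 bits, biased by 15
    frac = bits & 0x3FF              # 10 bits, hidden one not stored
    if exp == 0:                     # zero or subnormal: no hidden one
        value = frac * 2.0 ** -24
    elif exp == 0x1F:                # infinity or NaN
        value = float('inf') if frac == 0 else float('nan')
    else:                            # normal number: 1.frac x 2^(exp - 15)
        value = (1024 + frac) * 2.0 ** (exp - 25)
    return sign, exp, frac, (-value if sign else value)

print(decode_fp16(0x3C00))   # the bit pattern for 1.0
print(decode_fp16(0xC000))   # the bit pattern for -2.0
```

The decoded values can be cross-checked against Python's own half-precision support via `struct.unpack('<e', ...)`.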

Design Decisions (contd.):
Multiplier Implementation
- 11 x 11 Carry-Save Multiplier
- Reasons:
  - Fast because it avoids having ripple carry in every stage
  - Enables Compact Layout
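The carry-save idea can be modelled bit-parallel in software. The sketch below is an assumption about the general technique, not the team's gate-level design: each partial product is folded into a (sum, carry) pair with a row of full adders, and carries are only propagated once, in the final addition. The name `carry_save_multiply` is made up for this example.

```python
def carry_save_multiply(a, b, width=11):
    """Multiply two unsigned width-bit integers using carry-save accumulation."""
    s, c = 0, 0                                   # redundant sum and carry words
    for i in range(width):
        pp = (a << i) if (b >> i) & 1 else 0      # i-th partial product
        # One row of full adders: no carry ripples within the row.
        s, c = s ^ c ^ pp, ((s & c) | (s & pp) | (c & pp)) << 1
    return s + c                                  # single carry-propagate add

# 11-bit significands with the hidden one: 1.5 * 1.25
print(carry_save_multiply(1536, 1280))
```

Each row preserves the invariant that the sum word plus the carry word equals the running total, so the final addition yields the full product.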

Design Decisions (contd.):
2's Complement Adder/Subtractor
- Variable Length Carry-Select Adder
  - Reason: Reduces delay through Muxes
- Use the signs of the inputs to determine addition or subtraction
- Output: 35 bits from Align + 1 Carry Out = 36 bits
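A variable-length carry-select adder can be sketched behaviorally. In the example below each block computes its sum for both possible carry-ins and the incoming carry selects one, as a mux would. The block widths (5, 6, 7, 8, 9), which total 35 bits, are hypothetical and not taken from the slides, as is the function name `carry_select_add`.

```python
def carry_select_add(a, b, block_widths=(5, 6, 7, 8, 9), carry_in=0):
    """Add two 35-bit unsigned values; returns a 36-bit result with carry out."""
    result, pos, carry = 0, 0, carry_in
    for w in block_widths:
        mask = (1 << w) - 1
        xa = (a >> pos) & mask
        xb = (b >> pos) & mask
        sum0 = xa + xb            # block result assuming carry-in = 0
        sum1 = xa + xb + 1        # block result assuming carry-in = 1
        chosen = sum1 if carry else sum0   # mux driven by the previous block
        result |= (chosen & mask) << pos
        carry = chosen >> w       # carry out of this block
        pos += w
    return result | (carry << pos)

print(hex(carry_select_add((1 << 35) - 1, 1)))
```

Growing the block widths toward the most significant end is one common way to balance each block's adder delay against the arrival time of the select signal.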

Design Decisions (contd.):
Leading Zero Counter
- Carry-Save Adder to count the leading zeroes of C
  - Reason: To pre-compute how far the result of A*B + C must be shifted to normalize it
- This will speed up our design because the Leading Zero Counter will not be in the critical path (which is through our multiplier)
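What the normalizer needs from this block is a leading-zero count. The sketch below simply counts zeros in a finished value, so it models the output of the block rather than the anticipation logic described above; the 36-bit default width follows the adder output, and the name `count_leading_zeros` is made up for this example.

```python
def count_leading_zeros(value, width=36):
    """Number of zero bits above the most significant one in a width-bit word."""
    count = 0
    for bit in range(width - 1, -1, -1):   # scan from the MSB downward
        if (value >> bit) & 1:
            break
        count += 1
    return count

# The count is the left-shift amount that brings the leading one to the top.
print(count_leading_zeros(0b1011, width=8))
```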

Design Decisions (contd.):
Align Exponent
- Always align the exponent of C to expA + expB
- Shift the significand of C by (expA + expB - expC)
  - If negative, shift left because C is bigger than A*B
  - If positive, shift right because C is smaller than A*B
- Implementation: n-Pass Shifter
Normalize
- Format the result of A*B + C to IEEE Format (i.e. change the significand from … to …)
- Align the exponent of the result as necessary
- n-Pass Shifter to shift the result of the adder by the amount given by the Leading Zero Counter
Round
- The result needs to fit into 16 bits
- To preserve precision, we round the result based on the last 3 bits
- Implementation: Incrementer and Shifter
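Rounding on the last 3 bits with an incrementer and a shifter might look like the sketch below. It assumes the three bits are guard, round and sticky bits and that the mode is round-to-nearest-even; neither assumption is stated on the slides, and the name `round_grs` is made up for this example.

```python
def round_grs(sig14, exp):
    """Round an 11-bit significand followed by guard, round and sticky bits.

    sig14 holds 14 bits: the hidden one, 10 fraction bits, then G, R, S.
    Returns the rounded 11-bit significand and the adjusted exponent.
    """
    kept = sig14 >> 3
    guard = (sig14 >> 2) & 1
    round_bit = (sig14 >> 1) & 1
    sticky = sig14 & 1
    # Round up when above the halfway point, or exactly halfway and odd.
    if guard and (round_bit or sticky or (kept & 1)):
        kept += 1                      # the incrementer
    if kept >> 11:                     # carry out of the significand
        kept >>= 1                     # the shifter renormalizes
        exp += 1
    return kept, exp

print(round_grs(0b11111111111_110, 3))   # all ones rounds up and renormalizes
```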

Behavioral Verilog

Behavioral Verilog (contd.)

Behavioral Verilog (Output)

Updated Estimated Transistor Count
Block                                    Transistors
Registers (input, output, pipelining)    2500
Threading Logic                          3000
Carry-Save Multiplier                    5000
Carry-Select Adder                       2000
Alignment Shifter                        1500
Leading 0 Anticipator                    700
Normalize                                2000
Rounding                                 1500
Special Cases and Control Logic          2000
Total                                    20200

Problems and Questions?
Difficulty finding a high-level simulator to exhaustively test our behavioral Verilog, because both Matlab and C use the IEEE 32-bit format. Currently we are thoroughly testing our behavioral Verilog with test cases worked out by hand.
Suggested Solutions:
- Make a scalable 32-bit version of our behavioral Verilog and test it against C
- Find code written for software simulation by the VAX and PDP microprocessors
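One further option, offered only as a sketch and not something proposed on the slides, is to generate fp16 test vectors in software. Python's `struct` module can pack and unpack 16-bit floats, so a small script can produce (A, B, C, expected) bit patterns for a testbench. The helper names below are made up for this example. Vectors are kept only when the double-precision sum is exact, so the reference result is rounded once; special values and overflowing results are skipped.

```python
import random
import struct
from fractions import Fraction

def bits_to_float(bits):
    return struct.unpack('<e', struct.pack('<H', bits))[0]

def float_to_bits(x):
    return struct.unpack('<H', struct.pack('<e', x))[0]

def random_finite_fp16(rng):
    """Draw a random bit pattern that is not infinity or NaN."""
    while True:
        bits = rng.randrange(0x10000)
        if (bits >> 10) & 0x1F != 0x1F:
            return bits

def make_vectors(count, seed=0):
    """Return a list of (a, b, c, expected) 16-bit patterns for A*B + C."""
    rng = random.Random(seed)
    vectors = []
    while len(vectors) < count:
        a, b, c = (random_finite_fp16(rng) for _ in range(3))
        fa, fb, fc = bits_to_float(a), bits_to_float(b), bits_to_float(c)
        total = fa * fb + fc
        # Keep the vector only if the double result is the exact sum.
        if Fraction(fa) * Fraction(fb) + Fraction(fc) != Fraction(total):
            continue
        try:
            expected = float_to_bits(total)   # the single fp16 rounding
        except OverflowError:                 # result too large for fp16
            continue
        vectors.append((a, b, c, expected))
    return vectors

for a, b, c, y in make_vectors(5):
    print(f"{a:04x} {b:04x} {c:04x} {y:04x}")
```

The printed hex words could be read by a Verilog testbench, for example with `$readmemh`, and compared against the output of the behavioral model.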

Questions?