ESE534 Computer Organization

Slides:



Advertisements
Similar presentations
Penn ESE534 Spring DeHon 1 ESE534: Computer Organization Day 14: March 19, 2014 Compute 2: Cascades, ALUs, PLAs.
Advertisements

L10 – Transistors Logic Math 1 Comp 411 – Spring /22/07 Arithmetic Circuits Didn’t I learn how to do addition in the second grade?
Penn ESE Spring DeHon 1 ESE (ESE534): Computer Organization Day 21: April 2, 2007 Time Multiplexing.
CS 300 – Lecture 3 Intro to Computer Architecture / Assembly Language Sequential Circuits.
Caltech CS184a Fall DeHon1 CS184a: Computer Architecture (Structures and Organization) Day10: October 25, 2000 Computing Elements 2: Cascades, ALUs,
Caltech CS184a Fall DeHon1 CS184a: Computer Architecture (Structures and Organization) Day8: October 18, 2000 Computing Elements 1: LUTs.
Penn ESE Spring DeHon 1 ESE (ESE534): Computer Organization Day 13: February 26, 2007 Interconnect 1: Requirements.
Caltech CS184a Fall DeHon1 CS184a: Computer Architecture (Structures and Organization) Day4: October 4, 2000 Memories, ALUs, and Virtualization.
Penn ESE Spring DeHon 1 ESE (ESE534): Computer Organization Day 11: February 14, 2007 Compute 1: LUTs.
Penn ESE Spring DeHon 1 FUTURE Timing seemed good However, only student to give feedback marked confusing (2 of 5 on clarity) and too fast.
Penn ESE Spring DeHon 1 ESE (ESE534): Computer Organization Day 5: January 24, 2007 ALUs, Virtualization…
Caltech CS184 Winter DeHon 1 CS184a: Computer Architecture (Structure and Organization) Day 13: February 4, 2005 Interconnect 1: Requirements.
Digital Computer Concept and Practice Copyright ©2012 by Jaejin Lee Logic Circuits I.
Unit 9 Multiplexers, Decoders, and Programmable Logic Devices
Penn ESE534 Spring DeHon 1 ESE534: Computer Organization Day 9: February 24, 2014 Operator Sharing, Virtualization, Programmable Architectures.
Digital Computer Concept and Practice Copyright ©2012 by Jaejin Lee Logic Circuits I.
Penn ESE534 Spring DeHon 1 ESE534: Computer Organization Day 7: February 6, 2012 Memories.
CDA 3101 Fall 2013 Introduction to Computer Organization The Arithmetic Logic Unit (ALU) and MIPS ALU Support 20 September 2013.
Penn ESE534 Spring DeHon 1 ESE534: Computer Organization Day 5: February 1, 2010 Memories.
Penn ESE534 Spring DeHon 1 ESE534 Computer Organization Day 9: February 13, 2012 Interconnect Introduction.
Caltech CS184 Winter DeHon CS184a: Computer Architecture (Structure and Organization) Day 4: January 15, 2003 Memories, ALUs, Virtualization.
Lecture 17: Dynamic Reconfiguration I November 10, 2004 ECE 697F Reconfigurable Computing Lecture 17 Dynamic Reconfiguration I Acknowledgement: Andre DeHon.
Caltech CS184 Winter DeHon 1 CS184a: Computer Architecture (Structure and Organization) Day 11: January 31, 2005 Compute 1: LUTs.
CBSSS 2002: DeHon Universal Programming Organization André DeHon Tuesday, June 18, 2002.
Penn ESE534 Spring DeHon 1 ESE534: Computer Organization Day 8: February 19, 2014 Memories.
B0110 ALU ENGR xD52 Eric VanWyk Fall Today Back to Gates! Review Timing with Adders Compare Growth Characteristics Construct Adder/Subtractor Construct.
Introduction to the FPGA and Labs
ESE532: System-on-a-Chip Architecture
ESE534: Computer Organization
Combinational Circuits
ESE532: System-on-a-Chip Architecture
ESE532: System-on-a-Chip Architecture
ESE534: Computer Organization
ESE534: Computer Organization
CS184a: Computer Architecture (Structure and Organization)
ESE532: System-on-a-Chip Architecture
ESE532: System-on-a-Chip Architecture
CS184a: Computer Architecture (Structure and Organization)
Instructor: Dr. Phillip Jones
Digital Logic Last Time … This Time … Control Path, Arithmetic Ops a
We will be studying the architecture of XC3000.
5. Combinational circuits
ESE534: Computer Organization
ESE534: Computer Organization
ESE534: Computer Organization
ESE534: Computer Organization
CSE 370 – Winter 2002 – Comb. Logic building blocks - 1
ESE534: Computer Organization
Day 17: October 8, 2014 Performance: Gates
Week 7: Gates and Circuits: PART II
ESE534: Computer Organization
Part I Background and Motivation
Day 3: September 4, 2013 Gates from Transistors
Day 14: October 8, 2010 Performance
Arithmetic and Decisions
CSE 370 – Winter 2002 – Comb. Logic building blocks - 1
Day 18: October 20, 2010 Ratioed Logic Pass Transistor Logic
Digital Logic Circuits
Day 17: October 9, 2013 Performance: Gates
Day 15: October 13, 2010 Performance: Gates
TA David “The Punner” Eitan Poll
ESE534: Computer Organization
ESE534: Computer Organization
Combinational Circuits
ESE534: Computer Organization
Day 16: October 12, 2012 Performance: Gates
FIGURE 5-1 MOS Transistor, Symbols, and Switch Models
Day 16: October 17, 2011 Performance: Gates
ESE 150 – Digital Audio Basics
Instruction execution and ALU
Presentation transcript:

ESE534 Computer Organization Day 7: February 17, 2014 Interconnect Introduction Penn ESE534 Spring2014 -- DeHon

Previously Universal building blocks Computations with gates Penn ESE534 Spring2014 -- DeHon

Today Universal Spatially Programmable Crossbar Programmable compute blocks Start thinking about optimizing interconnect Penn ESE534 Spring2014 -- DeHon

Spatial Programmable Penn ESE534 Spring2014 -- DeHon

Spatially Programmable Program up “any” function Not sequentialize in time E.g. Want to build any FSM or any arithmetic operation Penn ESE534 Spring2014 -- DeHon

Needs? Need a collection of gates. What else will we need? Penn ESE534 Spring2014 -- DeHon

Needs Need some registers Need way to programmably wire gates together Penn ESE534 Spring2014 -- DeHon

Multiplexer Interconnect Use a multiplexer for programmable interconnect Can select any source to be an input for a gate How big is an N-input multiplexer? Penn ESE534 Spring2014 -- DeHon

Sources? What are potential sources? Inputs to circuit -- I Outputs of gates -- G Outputs of registers – R N=I+G+R Penn ESE534 Spring2014 -- DeHon

Sinks Which things need programmable inputs? (and how many?) Circuit outputs -- O Gate – needs one per input -- kG Assuming k-input gates Registers – R M = O+R+kG Penn ESE534 Spring2014 -- DeHon

N-input, M-output Multiplexing Area? Bits to control the multiplexers? Data input switching Capacitance Switched? Delay? Penn ESE534 Spring2014 -- DeHon

Mux Programmable Interconnect Area M×N = (I+G+R) × (O+kG+R) = kG2 + … Scales faster than gates! Penn ESE534 Spring2014 -- DeHon

Interconnect Costs We can do better than this Even when we do better Touch on a little later in lecture Dig into details later in term Even when we do better Interconnect can be dominate Area, delay, energy Particularly for Spatial Architectures Penn ESE534 Spring2014 -- DeHon

Dominant Time LUT is the gate. Definition coming soon… Penn ESE534 Spring2014 -- DeHon

Dominant Power [Energy] XC4003A data from Eric Kusse (UCB MS 1997) [Virtex II, Shang et al., FPGA 2002] [Tuan et al./FPGA 2006] Penn ESE534 Spring2014 -- DeHon

Crossbar Penn ESE534 Spring2014 -- DeHon

Crossbar Allows us to connect any of a set of inputs to any of the outputs. This is functionality provided with our muxes Penn ESE534 Spring2014 -- DeHon

Crossbar Structure Can be more efficient Penn ESE534 Spring2014 -- DeHon

Crossbar Costs Area still goes as M×N Delay proportional to M + N More realistic even for mux implementation Energy still goes as M×N Penn ESE534 Spring2014 -- DeHon

Crossbar Notation Penn ESE534 Spring2014 -- DeHon

Gates with Crossbar Interconnect Penn ESE534 Spring2014 -- DeHon

Programmable, Universal Program to any function of N gates Limitation we only use nand2 gates? What can we say about functions of arbitrary 2-input gates? Penn ESE534 Spring2014 -- DeHon

Universal Computation with Fixed Compute Operator Being minimalists, this shows: do not need compute to be programmable Just use fixed nor2 or nand2 Penn ESE534 Spring2014 -- DeHon

Programmable Functions Penn ESE534 Spring2014 -- DeHon

Mux can be a programmable gate bool mux4(bool a, b, c, d, s0, s1) { return(mux2( mux2(a,b,s0), mux2(c,d,s0), s1)); } Penn ESE534 Spring2014 -- DeHon

Program AND2? How do we program to behave as and2? Penn ESE534 Spring2014 -- DeHon

Mux as Logic bool and2(bool x, y) {return (mux4(false,false,false,true,x,y));} bool or2(bool x, y) {return (mux4(false,true,true,true,x,y));} Just by routing “data” into this mux4, Can select any two input function Penn ESE534 Spring2014 -- DeHon

LUT – LookUp Table When use a mux as programmable gate Call it a LookUp Table (LUT) Implementing the Truth Table for small # of inputs # of inputs =k (need mux-2k) Just lookup the output result in the table Penn ESE534 Spring2014 -- DeHon

Programmable Compute Can use programmable gate in place of nor gate Penn ESE534 Spring2014 -- DeHon

Programmable Compute How do we “program” to perform a particular function? Penn ESE534 Spring2014 -- DeHon

Instructions Set of bits that tell the programmable units how to behave Penn ESE534 Spring2014 -- DeHon

Is an Adder Universal? Assuming interconnect: A: 001a What’s c? (big assumption as we have just seen) Consider: What’s c? A: 001a B: 000b S: 00cd Penn ESE534 Spring2014 -- DeHon

Practically To reduce (some) interconnect, and to reduce number of operations, do tend to build a bit more general “universal” computing function Penn ESE534 Spring2014 -- DeHon

Arithmetic Logic Unit (ALU) Observe: with small tweaks can get many functions with basic adder components Penn ESE534 Spring2014 -- DeHon

Arithmetic and Logic Unit Penn ESE534 Spring2014 -- DeHon

ALU Functions A+B w/ Carry B-A A xor B (squash carry) A*B (squash carry) /A Penn ESE534 Spring2014 -- DeHon

Optimization and Design Space Interconnect Optimization and Design Space Penn ESE534 Spring2014 -- DeHon

Locality Maybe we don’t need to connect everything to everything? Cluster groups of C things at leaves CG gates, CR registers Limit cluster I/O – CI, CO Crossbar within cluster Crossbar among clusters Penn ESE534 Spring2014 -- DeHon

Compare Switches Penn ESE534 Spring2014 -- DeHon

8×16=128 8×4+18×4=104 Penn ESE534 Spring2014 -- DeHon

Comparing Full Crossbar needs: kG2 switches How many switches needed for: CG gates per cluster CI inputs to cluster CO outputs from cluster (ignore registers and circuit input/output) Penn ESE534 Spring2014 -- DeHon

Costs Cluster Input Crossbar: Cluster Output Crossbar: Inputs: CI+CG Outputs: kCG Cluster Output Crossbar: Input: CG Output: CO Master Crossbar: Inputs: (G/CG) × CO Outputs: (G/CG) × CI (G/CG)×CO×(G/CG)×CI +(G/CG) ×(k×CG2+k×CG×CI+CG×CO) Penn ESE534 Spring2014 -- DeHon

Costs Cluster: G2×(CI×CO/CG2)+k×G×(CG+CI+CO) Full Crossbar: kG2 Compare at: CG=8, CI=CO=2, G=256, k=2 Cluster case? Full crossbar case? 216×(2×2/82)+28×(8+2+2)×2 = 212 + 12×29= 212 + 3×211= 5×213 217 Penn ESE534 Spring2014 -- DeHon

More? Can we do it again? Discuss Generalization? Issues? Penn ESE534 Spring2014 -- DeHon

Optimizing Interconnect What other interconnect optimizations/structures might we use? Penn ESE534 Spring2014 -- DeHon

Interconnect Design Space Large interconnect design space We will be exploring systematically Day16—20+25 Penn ESE534 Spring2014 -- DeHon

Big Ideas Interconnect can be programmable Interconnect area/delay/energy can dominate compute area Exploiting structure can reduce area Locality Penn ESE534 Spring2014 -- DeHon

Admin HW2 HW3 Wednesday Drop date is Friday HW4 out On-time submissions graded (late assignments will be graded later) HW3 Wednesday Drop date is Friday HW4 out Reading for Wed. on Web Penn ESE534 Spring2014 -- DeHon