1
Prof. John Nestor
ECE Department, Lafayette College, Easton, Pennsylvania 18042
nestorj@lafayette.edu
ECE 425 - VLSI Circuit Design
Lecture 23 - Subsystem Design
Spring 2007
2
ECE 425 Spring 2007, Lecture 23 - Subsystem Design
Announcements
- Reading: Wolf, 6.1-6.9
- These notes are drawn in part from handouts by J. Rabaey, Digital Integrated Circuits, © Prentice-Hall 1995.
3
Where We Are
- Last time:
  - More about Project: Pad Frame & Comparator
  - Streaming Video: Two Talks on Chip-Level Design
- Today: Custom Subsystem Design
  - General approach
  - Shifters
  - Adders
4
Subsystem Design
- General Techniques
  - Goals
  - Pipelining
  - Datapath Design
- Common Subsystems
  - Shifters
  - Adders
  - ALUs
  - Multipliers
  - Memories
  - Structure Logic
5
Subsystem Design (Ch. 6)
- Goals of custom subsystem design
  - Maximize performance
  - Minimize area
  - Fit together with other subsystems
- Key idea: optimize across levels of abstraction
  - Layout
  - Circuit
  - Logic
  - Register-transfer and higher
6
Optimizing for Performance and/or Area
- Layout level
  - Microscopic changes: move wires, change wire sizing, add vias, reduce source/drain capacitance, etc.
  - Macroscopic changes: cell placement, design of hierarchy
- Circuit level
  - Transistor sizing
  - Advanced circuits (e.g., dynamic logic)
7
Optimizing for Performance and/or Area (cont'd)
- Logic level
  - Use specialized designs (e.g., shifters, ALUs, etc.)
  - Flatten to reduce delay
  - Restructure
- Register-transfer level (and above)
  - Place latches/flip-flops to maximize performance (retiming)
  - Encode FSMs to minimize area/delay
  - Perform computations in parallel with extra hardware if cost permits
  - Pipeline logic to increase performance
8
Pipelining
- Key idea: partition a combinational function with latches / flip-flops
- Each partition is called a stage
- Time between each result: one clock period
- Latency: number of clock cycles before a result appears (== number of stages)
9
Example - Before Pipelining
- Comb. logic delay: $t_p = 80\,\text{ns}$ *
- Latch/flip-flop setup: $t_{su} = 5\,\text{ns}$
- Clock period: $t_{clk} = 85\,\text{ns}$
* Archaic TTL timing values; divide by approx. 100 for VLSI!
10
Example - After Pipelining
- Comb. logic delay: $t_{p1} = t_{p2} = 40\,\text{ns}$
- Latch setup: $t_{su} = 5\,\text{ns}$
- Clock period: $t_{clk} = 45\,\text{ns}$
- Latency: 2 cycles
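Working the two examples through side by side makes the trade-off concrete. This is only a rough check using the slides' numbers; as on the slides, register clock-to-output delay is ignored, and the unpipelined circuit is treated as a single stage.

```latex
\begin{align*}
\text{Before:}\quad t_{clk} &= t_p + t_{su} = 80 + 5 = 85\,\text{ns}, &
  \text{rate} &= \tfrac{1}{85\,\text{ns}} \approx 11.8 \times 10^{6}\ \text{results/s} \\
\text{After:}\quad t_{clk} &= t_{p1} + t_{su} = 40 + 5 = 45\,\text{ns}, &
  \text{rate} &= \tfrac{1}{45\,\text{ns}} \approx 22.2 \times 10^{6}\ \text{results/s} \\
\text{Latency:}\quad 1 \times 85 &= 85\,\text{ns}\ \text{(before)}, &
  2 \times 45 &= 90\,\text{ns}\ \text{(after)}
\end{align*}
```

Throughput improves by about 85/45 ≈ 1.9x rather than 2x because the 5 ns setup time is paid in every stage, and latency grows slightly for the same reason.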
11
Pipelining Comments
- Impact on performance
  - Increases operations per unit time
  - Increases latency
  - Added overhead due to register setup times
- Design concerns / limits
  - Balance stage delays for best performance
  - Structure of logic may limit number of stages
12
Effect of Adding Pipeline Stages
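The figure for this slide is not reproduced in the transcript. As a rough illustration of the effect, the sketch below sweeps the stage count under an idealized model: the 80 ns of logic is assumed to split into perfectly balanced stages, and each stage pays the 5 ns setup time. The function name `pipeline_timing` and the balanced-stage assumption are mine, not the lecture's.

```python
def pipeline_timing(t_logic_ns, t_setup_ns, stages):
    """Return (clock period ns, latency ns, results per microsecond) for an
    idealized pipeline whose logic splits into equal-delay stages."""
    t_clk = t_logic_ns / stages + t_setup_ns   # one stage of logic + register setup
    latency = stages * t_clk                   # time for one result to emerge
    throughput = 1000.0 / t_clk                # results per microsecond
    return t_clk, latency, throughput


if __name__ == "__main__":
    for n in (1, 2, 4, 8, 16):
        t_clk, latency, rate = pipeline_timing(80.0, 5.0, n)
        print(f"{n:2d} stages: t_clk={t_clk:6.2f} ns  "
              f"latency={latency:6.2f} ns  rate={rate:6.2f}/us")
```

Under this model the clock period approaches the setup time as stages are added, so each extra stage buys less throughput while latency keeps growing. Real designs fall short of this because stage delays are rarely perfectly balanced.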
13
Custom Datapath Design
- Goal: create a tight design of several elements
  - Arithmetic / logic functions, shifters
  - Storage: registers, register files
  - Interconnect: wires, buses
14
Datapath Physical Design
- Bit-sliced layout of each component
- Connection by abutment
- "Pitch-matched" connections
- Designed using a "wiring plan"
[Figure: wiring plan]
15
Bus Design in Datapaths
- Key idea: replace multiplexers with distributed drivers for long connections
- Pseudo-nMOS NOR: Fig 6-8, p. 318
  - High power
  - Simple design
- Precharged: Fig 6-9, p. 319
  - Lower power
  - More complex design
- In either case, careful circuit design and interconnect modeling are essential
[Figures: pseudo-nMOS bus; precharged bus]
16
Ancient Example: the Motorola 68K
17
Subsystem Design
- General Techniques
  - Goals
  - Pipelining
  - Datapath Design
- Common Subsystems
  - Shifters
  - Adders
  - ALUs
  - Multipliers
  - Memories
  - Structure Logic
18
Shifter Design
- Why shift?
  - Arithmetic operations
  - Floating-point
  - Bit field extraction
- Shift register: one shift per clock cycle
- Hardware shifters: implemented as comb. logic
  - Single-bit shifters
  - Barrel shifters
  - Logarithmic shifters
19
Single-Bit Shifter
- Essentially a MUX made from pass transistors
Source: J. Rabaey, Digital Integrated Circuits © 1995 Prentice-Hall.
20
Barrel Shifter
- Pass transistors connect each input bit to the chosen output
- Regular layout
- Each signal flows through only one transmission gate
- Area dominated by pitch of metal wires
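A behavioural sketch may help show why each signal crosses only one pass device: the shift amount arrives as a one-hot set of control lines, and each line turns on one diagonal of the array. The function name, the right-shift direction, and the zero fill for vacated positions are assumptions made for illustration; the circuit in the slide's figure may fill vacated bits differently.

```python
def barrel_shift_right(bits, shift):
    """Model an n x n pass-transistor array with a one-hot shift control.

    bits[0] is the least significant bit. Exactly one control line
    (index `shift`) is asserted, so every output is driven through a
    single pass device. Zero fill is an assumption of this sketch.
    """
    n = len(bits)
    if not 0 <= shift < n:
        raise ValueError("shift must be in range(len(bits))")
    control = [k == shift for k in range(n)]   # one-hot shift lines
    out = []
    for i in range(n):                         # output bit i
        value = 0                              # undriven outputs read as 0 here
        for k in range(n):                     # pass device at (output i, line k)
            src = i + k                        # input bit this device connects
            if control[k] and src < n:
                value = bits[src]              # the single conducting device
        out.append(value)
    return out
```

For example, `barrel_shift_right([1, 0, 1, 1], 1)` treats the input as 13 (LSB first) and returns the bits of 6.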
21
4 x 4 Barrel Shifter - Layout
- $\text{Width}_{\text{barrel}} \approx 2\, p_m M$
Source: J. Rabaey, Digital Integrated Circuits © 1995 Prentice-Hall
22
Logarithmic Shifter
- Combine shifts of powers of two
Source: J. Rabaey, Digital Integrated Circuits © 1995 Prentice-Hall
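The idea of combining power-of-two shifts can be sketched behaviourally: stage k either shifts by 2^k or passes its input through, controlled by bit k of the binary shift amount. The function name, the right-shift direction, and the default 8-bit width are illustrative assumptions, and the shift amount is assumed to be less than the width.

```python
def log_shift_right(value, shift, width=8):
    """Shift right by cascading stages of 1, 2, 4, ... bit positions.

    Stage k is enabled by bit k of the binary shift amount, so a
    width-bit shifter needs about log2(width) stages.
    """
    mask = (1 << width) - 1
    value &= mask
    stage = 0
    while (1 << stage) < width:
        if (shift >> stage) & 1:        # control bit for this stage
            value >>= (1 << stage)      # shift by 2**stage positions
        stage += 1                      # otherwise the stage passes data through
    return value
```

Compared with a barrel shifter, a signal here passes through one device per stage rather than one in total, but the control input is the ordinary binary shift amount instead of a one-hot code.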
23
Logarithmic Shifter - Layout
Source: J. Rabaey, Digital Integrated Circuits © 1995 Prentice-Hall
24
Adder Design
Review: Full Adder
- Sum: $s_i = a_i \oplus b_i \oplus c_i$
- Carry: $c_{i+1} = a_i \cdot b_i + a_i \cdot c_i + b_i \cdot c_i$
[Figure: full-adder block with inputs A_i, B_i, C_i and outputs S_i, C_{i+1}]
Truth table:
A_i  B_i  C_i  |  S_i  C_{i+1}
 0    0    0   |   0     0
 0    0    1   |   1     0
 0    1    0   |   1     0
 0    1    1   |   0     1
 1    0    0   |   1     0
 1    0    1   |   0     1
 1    1    0   |   0     1
 1    1    1   |   1     1
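The sum and carry equations can be checked against ordinary arithmetic with a few lines of code. This is a minimal sketch; the function name `full_adder` is chosen here for illustration.

```python
from itertools import product


def full_adder(a, b, c):
    """One-bit full adder: returns (sum, carry_out) for input bits a, b, c."""
    s = a ^ b ^ c                          # sum: XOR of the three inputs
    c_out = (a & b) | (a & c) | (b & c)    # carry: majority of the three inputs
    return s, c_out


# Every input combination must satisfy a + b + c = s + 2 * c_out.
for a, b, c in product((0, 1), repeat=3):
    s, c_out = full_adder(a, b, c)
    assert a + b + c == s + 2 * c_out
```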
25
Adder Design (cont'd)
- Ripple: constructed from n full adders
- Compact, but delay proportional to n
- May be tolerable when n = 8, BUT what about n = 32?
- Potential worst cases:
  - A_0 or B_0 to S_31
  - A_0 or B_0 to C_32
[Figure: 4-bit ripple-carry adder; stage i takes A_i, B_i, C_i and produces S_i, C_{i+1}, with C_0 = 0]
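A behavioural sketch of the ripple structure shows where the linear delay comes from: each stage cannot finish until the carry from the stage below it arrives. The helper names below are illustrative, and bit lists are least significant bit first.

```python
def full_adder(a, b, c):
    """One-bit full adder: returns (sum, carry_out)."""
    return a ^ b ^ c, (a & b) | (a & c) | (b & c)


def to_bits(x, n):
    """Integer to a list of n bits, least significant bit first."""
    return [(x >> i) & 1 for i in range(n)]


def from_bits(bits):
    """List of bits (LSB first) back to an integer."""
    return sum(b << i for i, b in enumerate(bits))


def ripple_add(a_bits, b_bits, c0=0):
    """Chain one full adder per bit; the carry ripples from bit 0 upward."""
    carry = c0
    sums = []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)   # stage i waits on carry from stage i-1
        sums.append(s)
    return sums, carry


if __name__ == "__main__":
    # A worst-case pattern: a carry created at bit 0 travels through all 32 stages.
    sums, c32 = ripple_add(to_bits(0xFFFFFFFF, 32), to_bits(1, 32))
    print(from_bits(sums), c32)
```

Adding 0xFFFFFFFF and 1 is one input pattern that exercises the A_0/B_0 to C_32 path, since every stage above bit 0 merely passes the carry along.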
26
Full Adder - Static CMOS
Source: J. Rabaey, Digital Integrated Circuits © 1995 Prentice-Hall
27
Inversion Property
Source: J. Rabaey, Digital Integrated Circuits © 1995 Prentice-Hall
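The slide's figure is not reproduced in this transcript. The property as it is usually stated for a full adder is that inverting all three inputs inverts both outputs, which is what lets alternate stages of a ripple chain work on inverted signals. The sketch below checks that statement exhaustively; the function names are illustrative.

```python
from itertools import product


def full_adder(a, b, c):
    """One-bit full adder: returns (sum, carry_out)."""
    return a ^ b ^ c, (a & b) | (a & c) | (b & c)


def inversion_property_holds():
    """True if inverting every input of a full adder inverts both outputs."""
    for a, b, c in product((0, 1), repeat=3):
        s, c_out = full_adder(a, b, c)
        s_inv, c_out_inv = full_adder(1 - a, 1 - b, 1 - c)
        if (s_inv, c_out_inv) != (1 - s, 1 - c_out):
            return False
    return True
```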
28
Inversion Adder
Source: J. Rabaey, Digital Integrated Circuits © 1995 Prentice-Hall
29
Mirror Adder: A Better Structure
Source: J. Rabaey, Digital Integrated Circuits © 1995 Prentice-Hall
30
Mirror Adder Notes
- The NMOS and PMOS chains are completely symmetrical. This guarantees identical rising and falling transitions if the NMOS and PMOS devices are properly sized. A maximum of two series transistors can be observed in the carry-generation circuitry.
- When laying out the cell, the most critical issue is the minimization of the capacitance at node $C_o$. The reduction of the diffusion capacitances is particularly important.
- The capacitance at node $C_o$ is composed of four diffusion capacitances, two internal gate capacitances, and six gate capacitances in the connecting adder cell.
- The transistors connected to $C_i$ are placed closest to the output.
- Only the transistors in the carry stage have to be optimized for optimal speed. All transistors in the sum stage can be minimal size.
Source: J. Rabaey, Digital Integrated Circuits © 1995 Prentice-Hall
31
Dynamic Adder - np-CMOS
- 17 transistors
Source: J. Rabaey, Digital Integrated Circuits © 1995 Prentice-Hall
32
Layout - Dynamic np-CMOS Adder
Source: J. Rabaey, Digital Integrated Circuits © 1995 Prentice-Hall
33
Speeding up Carry - Carry Lookahead
- Key idea: trade off delay against amount of logic used
  - Benefit: faster addition
  - Cost: much more logic
- Define two signals for each adder stage:
  - Generate: $g_i = a_i \cdot b_i$
  - Propagate: $p_i = a_i + b_i$
- Why use these names?
  - Adder stage i will always generate a carry if $a_i$ and $b_i$ are both true
  - Adder stage i will propagate a carry input if either or both of $a_i$, $b_i$ are true
[Figure: adder stage examples of carry generate and carry propagate]
34
Carry Lookahead (cont’d)
- Now rewrite the carry output as a function of $a_i$, $b_i$, $p_i$, $g_i$
  - Original eqn: $c_{i+1} = a_i \cdot b_i + a_i \cdot c_i + b_i \cdot c_i$
  - New eqn: $c_{i+1} = g_i + p_i \cdot c_i$
- "Flatten" the carry function in terms of $g_i$, $p_i$:
  - $c_1 = g_0 + p_0 \cdot c_0$
  - $c_2 = g_1 + p_1 \cdot c_1 = g_1 + p_1 \cdot (g_0 + p_0 \cdot c_0) = g_1 + p_1 \cdot g_0 + p_1 \cdot p_0 \cdot c_0$
  - $c_3 = g_2 + p_2 \cdot g_1 + p_2 \cdot p_1 \cdot g_0 + p_2 \cdot p_1 \cdot p_0 \cdot c_0$
  - $c_4 = g_3 + p_3 \cdot g_2 + p_3 \cdot p_2 \cdot g_1 + p_3 \cdot p_2 \cdot p_1 \cdot g_0 + p_3 \cdot p_2 \cdot p_1 \cdot p_0 \cdot c_0$
- Add carry lookahead logic that computes $c_1$-$c_4$ in terms of $p_0$-$p_3$ and $g_0$-$g_3$
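Flattened carry equations are easy to get wrong by one index, so it can be worth checking them against carries computed the slow way. The sketch below writes out the four flattened carries for a 4-bit block, with generate as AND and propagate as OR, and compares them with rippled carries over every input combination. Function names are illustrative; bit lists are least significant bit first.

```python
from itertools import product


def lookahead_carries(a, b, c0):
    """Flattened carries [c1, c2, c3, c4] for 4-bit inputs a, b (LSB first)."""
    g = [x & y for x, y in zip(a, b)]   # generate:  g_i = a_i AND b_i
    p = [x | y for x, y in zip(a, b)]   # propagate: p_i = a_i OR b_i
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = (g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0])
          | (p[2] & p[1] & p[0] & c0))
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0])
          | (p[3] & p[2] & p[1] & p[0] & c0))
    return [c1, c2, c3, c4]


def ripple_carries(a, b, c0):
    """Reference carries from the original full-adder carry equation."""
    carries, c = [], c0
    for x, y in zip(a, b):
        c = (x & y) | (x & c) | (y & c)
        carries.append(c)
    return carries


def lookahead_matches_ripple():
    """Compare both methods over all 2**9 combinations of a, b and c0."""
    for bits in product((0, 1), repeat=9):
        a, b, c0 = list(bits[0:4]), list(bits[4:8]), bits[8]
        if lookahead_carries(a, b, c0) != ripple_carries(a, b, c0):
            return False
    return True
```

Each flattened carry depends only on the inputs and c0, so all four can be evaluated in parallel. The cost is visible in the code: the number and width of the product terms grow with each bit position.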
35
Logarithmic Lookahead: Brent-Kung Adder
Source: J. Rabaey, Digital Integrated Circuits © 1995 Prentice-Hall
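The slide shows only a figure, which is not reproduced here. As a hedged sketch of the general idea, the code below computes block (generate, propagate) pairs with a two-phase tree, the prefix structure commonly associated with the Brent-Kung adder: an up-sweep builds power-of-two blocks and a down-sweep fills in the remaining prefixes. The function names, the OR-based propagate, and the restriction to power-of-two widths are assumptions of this sketch, not details taken from the slide.

```python
def combine(hi, lo):
    """Merge (g, p) of a higher block with the adjacent lower block."""
    g_hi, p_hi = hi
    g_lo, p_lo = lo
    return (g_hi | (p_hi & g_lo), p_hi & p_lo)


def brent_kung_add(a, b, width=8, c0=0):
    """Add two width-bit integers; width is assumed to be a power of two.

    Returns (sum modulo 2**width, carry out).
    """
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]
    x = [(ai & bi, ai | bi) for ai, bi in zip(a_bits, b_bits)]  # per-bit (g, p)

    # Up-sweep: build (G, P) for aligned blocks of 2, 4, 8, ... bits.
    d = 1
    while d < width:
        for i in range(2 * d - 1, width, 2 * d):
            x[i] = combine(x[i], x[i - d])
        d *= 2

    # Down-sweep: complete the prefixes the up-sweep did not reach.
    d = width // 4
    while d >= 1:
        for i in range(3 * d - 1, width, 2 * d):
            x[i] = combine(x[i], x[i - d])
        d //= 2

    # x[i] now describes bits i..0, so c_{i+1} = G + P * c0.
    carries = [c0] + [g | (p & c0) for g, p in x]
    total = 0
    for i in range(width):
        total |= (a_bits[i] ^ b_bits[i] ^ carries[i]) << i
    return total, carries[width]
```

The depth of both sweeps grows with log2(width) rather than width, which is where the "logarithmic" in the slide title comes from. For example, `brent_kung_add(200, 100)` returns `(44, 1)`, since 300 = 256 + 44.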
36
Coming Up:
- More about adders
- ALUs
- Memories
- Structure Logic