ESE370: Circuit-Level Modeling, Design, and Optimization for Digital Systems
  Day 29: November 18, 2011
  Dynamic Logic
  Penn ESE370 Fall2011 -- DeHon
Today
  Memory
    Energy wrapup
  Dynamic Logic
    Strategy
    Form
    Compare CMOS
Memory
  What fraction of memory cells is involved in a read/write?
  What are most cells doing on a cycle?
  Reads are slow
    Cycles are long, so there is lots of time to leak
ITRS 2009 45nm
  C0 = 0.045µm × Cg,total
  Parameter    High Performance    Low Power
  Isd,leak     100nA/µm            50pA/µm
  Isd,sat      1200µA/µm           560µA/µm
  Cg,total     1fF/µm              0.91fF/µm
  Vth          285mV               585mV
High Power Process
  V=1V, d=1000, γ=0.5
  Waccess=Wbuf=2
  Full swing for simplicity
  Csc = 0 (just for simplicity, typically <Cload)
  BL: Cload = 1000C0 ≈ 45fF = 45×10^-15 F
  WN = 2, Ileak = 9×10^-9 A
  P = (45×10^-15) freq + 1000×9×10^-9 W
Relative Power
  P = (45×10^-15) freq + 1000×9×10^-9 W
  Crossover at 200MHz: leakage dominates for freq < 200MHz
  How does partial swing on the bit line change this?
    Reduces dynamic energy
    Increases percentage in leakage energy
    Raises crossover frequency (leakage dominates over a wider range)
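The crossover on this slide can be checked numerically. The following is a minimal sketch, assuming P = Cload·V²·freq + d·Ileak·V with V = 1V, and assuming d counts the leaking cells on the bit line (the slide does not spell this out). The function and constant names are illustrative, not from the course.

```python
# Sketch of the bit-line power model from the slides (assumed reading of the symbols).
V = 1.0             # supply voltage in volts
C0 = 0.045e-15      # unit gate capacitance in farads: 0.045 um x 1 fF/um
C_LOAD = 1000 * C0  # bit-line load, about 45 fF
D = 1000            # assumption: number of leaking cells on the bit line
I_LEAK = 9e-9       # leakage current per cell in amps


def dynamic_power(freq_hz):
    """Switching power in watts for a full-swing bit line at freq_hz."""
    return C_LOAD * V**2 * freq_hz


def leakage_power():
    """Static power in watts from D leaking cells."""
    return D * I_LEAK * V


def crossover_frequency():
    """Frequency in Hz at which dynamic power equals leakage power."""
    return leakage_power() / (C_LOAD * V**2)


if __name__ == "__main__":
    f = crossover_frequency()
    print(f"leakage power  = {leakage_power() * 1e6:.1f} uW")
    print(f"crossover freq = {f / 1e6:.0f} MHz")
```

Below the crossover frequency the leakage term is the larger of the two; above it, the dynamic term is.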
Consequence
  Leakage energy can dominate in large memories
  Care about low operating (or stand-by) power
  Use process or transistors with high Vth
    Reduce leakage at expense of speed
Memory in Processors
  Most of the area on modern processors is memory
  Often accounts for 80-90% of transistors
  Example: Intel 6-core processor
    1.9 billion transistors
    25MB of L3 + L2 cache
    25×10^6 bytes × 8 bits/byte × 6 tr/bit = 1.2 billion
    Plus L1 memories, RF, branch predict, reorder ...
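The transistor count in the example is simple arithmetic. Here is a small sketch of it, taking 1MB as 10^6 bytes as the slide does and assuming a 6-transistor cell per bit. The variable names are illustrative.

```python
# Cache transistor estimate from the slide's Intel 6-core example.
CACHE_BYTES = 25 * 10**6   # 25MB of L3 + L2 cache, with 1MB taken as 10^6 bytes
BITS_PER_BYTE = 8
TRANSISTORS_PER_BIT = 6    # assumption: 6-transistor cell per bit, per the slide
TOTAL_TRANSISTORS = 1.9e9  # whole-chip count from the slide

cache_transistors = CACHE_BYTES * BITS_PER_BYTE * TRANSISTORS_PER_BIT
cache_fraction = cache_transistors / TOTAL_TRANSISTORS

if __name__ == "__main__":
    print(f"cache transistors = {cache_transistors / 1e9:.1f} billion")
    print(f"fraction of chip  = {cache_fraction:.0%}")
```

This counts only the L3 + L2 cache cells. The L1 memories, register file, branch predictor, and reorder structures listed on the slide would add to the memory share.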
Dynamic Logic
Motivation
  Like to avoid driving both pullup and pulldown networks
    Reduce capacitive load
    Power, delay
  Ratioed had problems with
    Large device for ratioing
    Slow pullup
    Static power
Idea
  Use clock to disable pullup during evaluation
Discuss
  Use clock to disable pullup during evaluation
  What happens when
    /Pre=0, A=B=0?
    /Pre=1, A=B=0?
    /Pre=1, A=1, B=0?
  Sizing implication?
  Concerns? Requirements?
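One way to reason through the three cases is a small behavioral model. This is a sketch only: the slide's circuit is in a figure that is not reproduced here, so the model assumes the pulldown network is A and B in parallel (NOR-style). With a series pulldown the third case would behave differently. The function name and the use of None for contention are illustrative choices.

```python
from typing import Optional


def dynamic_gate(pre_n: int, a: int, b: int, stored: int) -> Optional[int]:
    """Behavioral model of a precharged gate's output node.

    pre_n  -- /Pre: 0 turns the clocked pullup on (precharge), 1 turns it off (evaluate)
    a, b   -- data inputs to the pulldown network
    stored -- value currently held on the output capacitance
    Returns the new node value, or None when pullup and pulldown fight.
    """
    pulldown_on = bool(a or b)  # assumption: A and B in parallel
    if pre_n == 0:
        # Precharge: pullup is on; a conducting pulldown would cause contention.
        return None if pulldown_on else 1
    if pulldown_on:
        return 0                # evaluate: node discharges through the pulldown
    return stored               # evaluate: node floats and holds its charge


if __name__ == "__main__":
    node = dynamic_gate(0, 0, 0, stored=0)     # /Pre=0, A=B=0
    print("precharge:", node)
    node = dynamic_gate(1, 0, 0, stored=node)  # /Pre=1, A=B=0
    print("evaluate, inputs low:", node)
    node = dynamic_gate(1, 1, 0, stored=node)  # /Pre=1, A=1, B=0
    print("evaluate, A high:", node)
```

The model ignores leakage and noise on the floating node, which is one of the concerns raised later in the lecture.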
Advantages
  Large device
    Driven by clock, not data/logic
    Can pull up quickly without putting load on logic
  Single network
    Pulldown
    Don't have to size for ratio with pullup
  Swings rail-to-rail
Domino Logic
Domino
  Everything charged high
  After inverter, all inputs low
  Why do we want this?
    Disabled, waiting for an enabling transition
Requirements
  Single transition
    Once it fires, it is done, like a domino falling
  All inputs at 0 during precharge
    Precharge to 1 so inversion makes 0
  Non-inverting gates
  Image: http://en.wikipedia.org/wiki/File:Domino_effect.jpg
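These requirements can be illustrated with a toy two-stage chain. The sketch below assumes each domino stage is a precharged OR4 node followed by an inverter, and that the second stage takes the first stage's output as one input with its other inputs tied to 0. The helper names and the chain wiring are hypothetical, chosen only for illustration.

```python
def or_pulldown(inputs):
    """Pulldown for an OR gate: conducts if any input is high."""
    return any(inputs)


def domino_stage(pre_n, inputs, node):
    """One domino stage: a precharged dynamic node followed by an inverter.

    Returns (dynamic node value, inverter output).
    """
    if pre_n == 0:
        node = 1                 # precharge the dynamic node high
    elif or_pulldown(inputs):
        node = 0                 # evaluate: the stage fires and stays fired
    return node, 1 - node


def run_chain(primary_inputs):
    """Precharge, then evaluate, a two-stage OR4 domino chain."""
    # Precharge phase: every dynamic node goes to 1, so every stage output is 0.
    n1, out1 = domino_stage(0, [0, 0, 0, 0], 1)
    n2, out2 = domino_stage(0, [out1, 0, 0, 0], 1)
    precharge_outputs = (out1, out2)
    # Evaluate phase: outputs can only rise, one stage triggering the next.
    n1, out1 = domino_stage(1, primary_inputs, n1)
    n2, out2 = domino_stage(1, [out1, 0, 0, 0], n2)
    return precharge_outputs, (out1, out2)


if __name__ == "__main__":
    print(run_chain([0, 1, 0, 0]))
    print(run_chain([0, 0, 0, 0]))
```

Because every stage output is 0 after precharge, each stage's inputs are 0 during precharge, and the only transition any output makes during evaluation is 0 to 1.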
Issues
  Noise sensitive
  Power?
    Activity?
Domino or4
Domino Logic Performance
  Compare to CMOS cases?
  R0/2 input
    nor4
    or4
    nand4
Dynamic OR4
  Precharge time?
  Driving input
    With R0/2
  Driving inverter and self cap?
  Output self delay?
Class ended here
CMOS NOR4
  Driving input
  Driving self cap?
    With R0/2
CMOS NAND4
  Driving input
  Driving self cap?
    With R0/2
Delay Roundup
  Columns: Circuit, Precharge, Input, Driving Inv or Self, Output Delay
  Or4 domino: 2γ+1/3, 3/2, 15γ+2
  Nor4 cmos: n/a, 5, 20γ
  Nand4 cmos:
  For the same output drive strength (R0/2), domino has comparable or lower self load and lower input loading.
Discuss (time permitting)
  Avoid inversion?
  Converting from CMOS?
  Post-charge
Observe
  Better (lower) ratio of input capacitance to drive strength
  Particularly good for
    Driving large loads
    Large fanin gates
  Harder to design with
    Timing and polarity restrictions
    Avoiding noise
      Especially with today's high-variation technology
  Can consume more energy/op
Admin
  Project 2: due Wednesday
    Hope you are well along
    Optimization push over weekend
  Monday: in Detkin Lab
    See posted lab description
    Teams assigned
Idea
  Dynamic/clocked logic
    Only build/drive one network
    Fast transition propagation
    Spend delay (capacitance) on pullup, off the critical path of logic
  More complicated, can use more power
    Reserve for when most needed