Published by Peter Richard. Modified over 9 years ago.
1
Seok-jae, Lee VLSI Signal Processing Lab. Korea University
A 180-mV Subthreshold FFT Processor Using a Minimum Energy Design Methodology, by Alice Wang & Anantha Chandrakasan
2
Why an FFT processor? FFT processors are used in wireless sensor networks.
The FFT has been used in target tracking, localization and radar by analyzing phase differences from multiple sensors. The FFT processor requires a low-power design; chip speed is not critical. The FFT processor consists of multipliers, control logic and SRAM memory. With low-power design methods such as variable bit precision and variable FFT length, more power saving can be achieved. In particular, the multipliers, control logic and SRAM are implemented with 'SUBTHRESHOLD' circuits, which dissipate extremely low energy.
3
Radix-2 Butterfly FFT architecture
Subthreshold circuits are used!!!
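As a reminder of what a radix-2 butterfly computes, here is a minimal software sketch. The function names and the recursive decimation-in-time structure are assumptions made for illustration; the slides do not describe the processor's datapath at this level.

```python
import cmath

def butterfly(a, b, w):
    """One radix-2 butterfly: returns (a + w*b, a - w*b)."""
    t = w * b
    return a + t, a - t

def fft_radix2(x):
    """Recursive decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2])   # FFT of even-indexed samples
    odd = fft_radix2(x[1::2])    # FFT of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(-2j * cmath.pi * k / n)  # twiddle factor
        out[k], out[k + n // 2] = butterfly(even[k], odd[k], w)
    return out
```

An N-point FFT built this way needs log2(N) stages of N/2 butterflies, and each butterfly uses one complex multiplication.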
4
8-b and 16-b Scalable Baugh-Wooley Multiplier
With 8-b precision, only the MSB parts of the two inputs are processed. The LSB inputs are gated to minimize switching in the LSB adders.
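The arithmetic behind a Baugh-Wooley multiplier can be checked with a small bit-level model. This is only a functional sketch: `baugh_wooley` and `scalable_multiply` are hypothetical names, and the 8-b mode is modelled simply as multiplying the upper bytes of the 16-b inputs, which is an assumption about how the gating behaves.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def baugh_wooley(a, b, n):
    """Multiply two n-bit two's-complement integers from Baugh-Wooley partial products."""
    abit = [(a >> i) & 1 for i in range(n)]
    bbit = [(b >> i) & 1 for i in range(n)]
    total = 0
    # Ordinary AND partial products of the magnitude bits.
    for i in range(n - 1):
        for j in range(n - 1):
            total += (abit[i] & bbit[j]) << (i + j)
    # Product of the two sign bits.
    total += (abit[n - 1] & bbit[n - 1]) << (2 * n - 2)
    # Sign-bit cross terms enter inverted, so no subtraction is needed.
    for i in range(n - 1):
        total += (1 - (abit[n - 1] & bbit[i])) << (n - 1 + i)
        total += (1 - (abit[i] & bbit[n - 1])) << (n - 1 + i)
    # Constant correction bits at positions n and 2n-1.
    total += (1 << n) + (1 << (2 * n - 1))
    return to_signed(total, 2 * n)

def scalable_multiply(a, b, precision):
    """16-b mode uses the full inputs; 8-b mode uses only their upper bytes (assumed)."""
    if precision == 8:
        return baugh_wooley(a >> 8, b >> 8, 8)
    return baugh_wooley(a, b, 16)
```

For example, `scalable_multiply(0x1234, -0x0300, 8)` multiplies the upper bytes 0x12 and -3, while the 16-b mode returns the full product.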
5
Minimum Energy Point Analysis(1)
As the supply voltage is lowered from a large value, the switching (dynamic) energy and the overall energy are reduced. (VDD > Vth)
6
Minimum Energy Point Analysis(2)
Computation delay!!! In the subthreshold region, the propagation delay increases exponentially, resulting in an increase in leakage energy. (VDD < Vth)
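The trade-off between falling switching energy and rising leakage energy can be sketched with a first-order model. Everything numeric here is an assumption chosen for illustration (the slope factor, the leakage weight, the sweep range), and the exponential delay model is strictly valid only in the subthreshold region.

```python
import math

N_VT = 1.5 * 0.026      # assumed slope factor times thermal voltage (V)
LEAK_FACTOR = 2300.0    # assumed weight of leakage relative to switching energy

def switching_energy(vdd):
    """Dynamic energy per operation, proportional to C * VDD^2 (normalized)."""
    return vdd ** 2

def leakage_energy(vdd):
    """Leakage power (proportional to VDD) integrated over a cycle time
    that grows as VDD * exp(-VDD / nVT) when the supply is lowered."""
    return LEAK_FACTOR * vdd ** 2 * math.exp(-vdd / N_VT)

def total_energy(vdd):
    return switching_energy(vdd) + leakage_energy(vdd)

def minimum_energy_point(v_lo=0.10, v_hi=0.90, step=0.01):
    """Sweep VDD and return the supply voltage with the lowest total energy."""
    n = int(round((v_hi - v_lo) / step))
    sweep = [v_lo + i * step for i in range(n + 1)]
    return min(sweep, key=total_energy)
```

With these assumed constants the minimum falls inside the sweep rather than at either end, which is the qualitative behaviour the two slides describe.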
7
Minimum Energy Point Analysis(3)
Optimal operating point: (VDD, VTH) = (380mV, 480mV). Case 1: Processing speed is not important. The optimal operating point occurs at the minimum energy point, and the circuit operates at the corresponding frequency.
8
Minimum Energy Point Analysis(4)
Optimal operating point contour. Case 2: Processing speed is critical. The given frequency constrains VDD and VTH. Maximum power saving is achieved where the performance contour for that frequency is tangent to an energy contour.
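The constrained case can be sketched as a grid search over (VDD, VTH) that keeps only the points meeting a frequency target. The current model, the constants and the arbitrary frequency units are all assumptions; the sketch shows the procedure, not the contours on the slide.

```python
import math

N_VT = 1.5 * 0.026     # assumed slope factor times thermal voltage (V)
LEAK_WEIGHT = 2300.0   # assumed weight of leakage relative to switching energy

def drive(vgs, vth):
    """Smooth current model: exponential below VTH, square-law above (arbitrary units)."""
    return math.log1p(math.exp((vgs - vth) / (2 * N_VT))) ** 2

def frequency(vdd, vth):
    """Clock frequency in arbitrary units: gate delay is taken as VDD / drive."""
    return drive(vdd, vth) / vdd

def energy(vdd, vth):
    """Switching energy plus leakage energy accumulated over one cycle."""
    leak_ratio = drive(0.0, vth) / drive(vdd, vth)
    return vdd ** 2 * (1.0 + LEAK_WEIGHT * leak_ratio)

def best_operating_point(f_min=0.0):
    """Lowest-energy (VDD, VTH) grid point whose frequency is at least f_min."""
    grid = [(vdd / 1000, vth / 1000)
            for vdd in range(100, 901, 10)
            for vth in range(200, 601, 10)]
    feasible = [p for p in grid if frequency(*p) >= f_min]
    return min(feasible, key=lambda p: energy(*p))
```

Calling `best_operating_point()` gives the unconstrained minimum (Case 1), while `best_operating_point(1.0)` moves the operating point to satisfy the speed constraint at a higher energy (Case 2).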
9
Minimum Energy Point for fixed VTH
The VTH value is fixed at 450mV for implementing the FFT processor. The VDD value is 400mV for minimizing energy consumption. The low-power FFT processor operates in the SUBTHRESHOLD region!!!
10
Subthreshold Inverter
Case 1: Input is logical '0'. In the subthreshold region the leakage current is significant, so a minimum PMOS width WP(min) exists: the PMOS drive current (ION) must pull the output node up against the NMOS leakage (IOFF). Worst case: fast NMOS & slow PMOS (FS). Case 2: Input is logical '1'. A minimum-sized NMOS pulls the output node down to '0', but a large PMOS leads to a large leakage current (IOFF) compared to the drive current (ION) of the NMOS. So a maximum PMOS width WP(max) exists, beyond which the output node can no longer be pulled down. Worst case: slow NMOS & fast PMOS (SF).
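The two sizing bounds can be sketched as a pair of drive-versus-leakage inequalities evaluated at opposite process corners. All constants below (threshold, corner shift, required ratio, NMOS/PMOS strength) are assumed values for illustration, not figures from the slides.

```python
import math

N_VT = 1.5 * 0.026      # assumed slope factor times thermal voltage (V)
VTH_NOMINAL = 0.45      # assumed nominal threshold magnitude (V)
CORNER_SHIFT = 0.03     # assumed threshold shift at fast/slow corners (V)
NP_STRENGTH = 2.5       # assumed NMOS/PMOS current ratio per unit width
REQUIRED_RATIO = 10.0   # assumed drive-to-leakage ratio for a valid output level
WN = 1.0                # NMOS width, normalized

def current(vgs, vth):
    """Subthreshold current per unit width (arbitrary units)."""
    return math.exp((vgs - vth) / N_VT)

def wp_min(vdd):
    """FS corner: slow PMOS must out-drive the leakage of a fast NMOS."""
    nmos_leak = NP_STRENGTH * WN * current(0.0, VTH_NOMINAL - CORNER_SHIFT)
    pmos_drive_per_width = current(vdd, VTH_NOMINAL + CORNER_SHIFT)
    return REQUIRED_RATIO * nmos_leak / pmos_drive_per_width

def wp_max(vdd):
    """SF corner: slow NMOS must out-drive the leakage of a fast PMOS."""
    nmos_drive = NP_STRENGTH * WN * current(vdd, VTH_NOMINAL + CORNER_SHIFT)
    pmos_leak_per_width = current(0.0, VTH_NOMINAL - CORNER_SHIFT)
    return nmos_drive / (REQUIRED_RATIO * pmos_leak_per_width)

def min_operating_vdd():
    """Lowest supply (1 mV steps) at which a valid PMOS width exists."""
    for mv in range(50, 401):
        vdd = mv / 1000
        if wp_min(vdd) <= wp_max(vdd):
            return vdd
    return None
```

As VDD drops, WP(min) grows and WP(max) shrinks, so the sizing window closes at some supply voltage. That crossing is the lowest voltage at which the inverter still works at both corners.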
11
Operating Point for a Subthreshold Inverter
VDD = 195mV, WP = 5.4um (0.18um technology)
12
Subthreshold Standard Cell – XOR Case (1)
Conventional XOR gate scheme in the subthreshold region. In the A=1, B=0 case, the leakage current is large and ION/IOFF is small, so the output node cannot be fully pulled up.
13
Subthreshold Standard Cell – XOR Case (2)
A transmission-gate XOR in the subthreshold region: the devices are balanced. Because there are two devices pulling the output node high and two devices pulling it low, ION/IOFF is not degraded!!!
14
Subthreshold Memory Design
The FFT processor contains eight 128W X 16b RAM blocks and four 256W X 16b blocks. => The functionality of the conventional 6T SRAM in subthreshold is analyzed: bitline capacitance, bitline leakage, speed, PVT variation, etc. => A hierarchical read-bitline is used in the design of the data memory and achieves an acceptable ION/IOFF in subthreshold.
15
Subthreshold Write Access (1)
NPD has to be large enough to… the voltage at LO does not rise above ΔVLO due to the leakage of PPU and BL. Worst case: slow NMOS and fast PMOS (SF)
16
Subthreshold Write Access (2)
Write 'Low' case: determines the NPS size needed to pull HI down to ΔVLO. Worst case: SF. Write 'High' case: determines the maximum NPD and NPS sizes. The leakage currents of NPD and NPS form a voltage divider against the drive current of PPU, which is used to pull LO up to ΔVHI.
17
Sizing analysis on NPD
If VDD decreases, the cell size increases dramatically!!! This is the optimal point, but this value can't satisfy both the READ and WRITE conditions!!!
18
A Latch Based Write Scheme and its analysis
A C2MOS tristate inverter is a more robust design for subthreshold operation. The tristate latch memory cells show functionality down to 215mV.
19
Subthreshold Read Access (1)
The conventional 128W single-ended scheme case. During the precharge phase, Wpre is on and the read bitline (RBL) is charged to VDD. But since the charge stored on the bitline leaks away through all of the pull-down devices, Wpre is sized to offset the maximum leakage current through the pull-down devices.
20
Subthreshold Read Access (2)
In the worst case, M0 = 0 and M1~M127 = 1, the bitline leakage is maximized. In this case, when RBL should evaluate to '0', ION << IOFF, so RBL fails to evaluate to '0'.
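A first-order estimate shows why a long bitline is a problem at low VDD: one accessed device competes with the summed leakage of all the others. The model below is a simplified assumption (identical devices, ideal subthreshold slope, no process corners or data-dependent effects), so real margins are worse than it suggests.

```python
import math

N_VT = 1.5 * 0.026  # assumed slope factor times thermal voltage (V)

def on_off_ratio(vdd, cells_on_bitline):
    """Drive of one accessed device over the total leakage of the unaccessed ones."""
    single_device_ratio = math.exp(vdd / N_VT)   # ION / IOFF of one device
    return single_device_ratio / (cells_on_bitline - 1)

for vdd in (0.2, 0.4, 0.9):
    print(f"VDD = {vdd:.1f} V: ratio for 128 cells = {on_off_ratio(vdd, 128):.3g}")
```

The ratio falls exponentially as VDD is lowered and is further divided by the number of leaking cells, so fewer cells sharing a bitline helps directly.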
21
Subthreshold Read Access (3)
The tristate-based scheme case. In the worst case, M0 = 0 and M1~M127 = 1, the tristate-based read access also suffers from bitline leakage effects. When RBL should evaluate to '0', ION << IOFF, so RBL fails to evaluate to '0'.
22
Subthreshold Read Access (4)
Proposed hierarchical-read-bitline scheme case. The proposed SRAM scheme has some area and timing overhead but achieves extremely low energy dissipation. Latency!!! MUX with balanced circuit. Need a decoder!!!
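One way to picture a hierarchical read path is a tree of 2:1 multiplexers in which each level halves the number of candidate words. The sketch below is a functional model only. Using the address bits directly as the select signals of each level is an assumption, and `mux2` and `hierarchical_read` are hypothetical names.

```python
def mux2(sel, in0, in1):
    """2:1 multiplexer."""
    return in1 if sel else in0

def hierarchical_read(words, address):
    """Select words[address] through log2(len(words)) levels of 2:1 muxes.

    len(words) must be a power of two.
    """
    level = list(words)
    bit = 0
    while len(level) > 1:
        sel = (address >> bit) & 1   # one address bit drives one mux level
        level = [mux2(sel, level[i], level[i + 1])
                 for i in range(0, len(level), 2)]
        bit += 1
    return level[0]
```

For a 128-word block the value passes through seven mux levels, which illustrates the latency overhead. In return, each node is driven through a 2:1 mux instead of sharing a bitline with 127 other cells.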
23
Results – Energy Dissipation as a function of VDD
The optimal operating point for minimal energy dissipation is at VDD = 350mV. In the simulation result, it is at VDD = 400mV.
24
Results – Energy of 8-b and 16-b Processing
25
Summary
Specifications: Values
Technology: 0.18um CMOS with six metal layers
Area: 2.6 X 2.1 mm2
FFT length: 128, 256, 512, 1024
Bit precision: 8-bit and 16-bit precision
Voltage supply: 180~900mV
Clock frequency: 164Hz ~ 6MHz
Power consumption: 90nW (VDD = 180mV), 600nW (VDD = 350mV, frequency = 10kHz)
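The power and frequency figures can be turned into energy per clock cycle with a one-line calculation. Only the 600nW / 10kHz pair is used here, because the summary states both values for the same operating point. The helper name is hypothetical.

```python
def energy_per_cycle(power_watts, frequency_hz):
    """Average energy drawn per clock cycle, in joules."""
    return power_watts / frequency_hz

e = energy_per_cycle(600e-9, 10e3)   # 600 nW at 10 kHz (VDD = 350 mV)
print(f"{e * 1e12:.1f} pJ per clock cycle")   # prints "60.0 pJ per clock cycle"
```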