1. NATURE: Non-Volatile Nanotube RAM based Field-Programmable Gate Arrays Wei Zhang†, Niraj K. Jha† and Li Shang ‡ †Dept. of Electrical Engineering Princeton.


2 NATURE: Non-Volatile Nanotube RAM based Field-Programmable Gate Arrays. Wei Zhang†, Niraj K. Jha† and Li Shang‡. †Dept. of Electrical Engineering, Princeton University; ‡Dept. of Electrical and Computer Engineering, Queen’s University.

3 A Hybrid CMOS/NAnoTUbe REconfigurable Architecture. Outline: Motivation; Background on CNT and NRAM; Architecture of NATURE; Logic Folding; Experimental Results; Conclusions.

4 Motivation. Moore’s Law: what’s next? Carbon nanotubes (CNTs), nanowires, single-electron devices, ... Challenges in nano-circuits/architectures: lack of a mature fabrication process; defects and run-time failures. Reconfigurable architectures, such as an FPGA, are favored: regular structures ease fabrication; fault tolerance is obtained through reconfiguration.

5 Motivation (Contd.). Problems of existing reconfigurable architectures: high reconfiguration time overhead; low area efficiency. Some recent works on programmable nanofabrics: molecular logic array (Goldstein et al. [ICCAD 2002]); nanowire PLA (DeHon et al. [FPGA 2004]); CMOS/nanowire hybrid architecture CMOL (Strukov et al. [Nanotechnology 2005]). Fabrication problem not yet solved.

6 NATURE. Key features: CMOS fabrication compatible; NRAM-based; run-time reconfiguration; temporal logic folding; design flexibility; logic density. Advantages of NATURE: the hybrid design leverages beneficial aspects of both CMOS and CNT technologies; NRAMs are distributed in NATURE to store multi-context reconfiguration bits; fine-grain reconfiguration (even cycle-by-cycle) enables temporal logic folding; flexibility to perform area-performance trade-offs; one-to-two orders of magnitude increase in logic density.

7 Background. Carbon nanotube (CNT): metallic or semiconducting; single-wall or multi-wall; diameter: 1-100 nm; length: up to millimeters; ballistic transport; excellent thermal conductivity; very high current density; high chemical stability; robust to environment. Source: Euronanotrade.

8 Background (Contd.). Non-volatile nanotube random-access memory (NRAM): whether a nanotube is mechanically bent or not determines the bistable on/off states; fully CMOS-compatible manufacturing process; prototype chip: 10 Gbit NRAM; will be ready for the market in the near future. Source: Nantero.

9 NRAMs. Properties of NRAMs: non-volatile; similar speed to SRAM; similar density to DRAM; chemically and mechanically stable. NATURE is not tied to NRAMs; alternatives: phase-change RAM, magnetoresistive RAM, ferroelectric RAM.

10 Architecture of NATURE. Island-style logic blocks (LBs) are connected by various levels of interconnect. An LB contains a super macroblock (SMB) and a local switch matrix.

11 Architecture of a Super Macroblock (SMB). An SMB consists of n1 macroblocks (MBs); here n1 = 4.

12 Architecture of a Macroblock (MB). An MB consists of n2 logic elements (LEs); here n2 = 4.

13 Logic Element and Interconnect. An LE implements a computation and contains: an m-input look-up table (LUT); a flip-flop; a pass transistor. Interconnect: mixed wire segment scheme with a 25%, 50% and 25% distribution for length-1, length-4 and long wires; direct links from one LB to its 4 neighbors.
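
A minimal Python sketch of the logic element listed on this slide, assuming the LUT is stored as a truth table indexed by its inputs read most-significant bit first. The class name `LogicElement` and its methods are hypothetical, and the pass transistor is not modeled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LogicElement:
    """Toy model of an LE: an m-input LUT whose output can be latched in a flip-flop."""
    m: int                  # number of LUT inputs
    truth_table: List[int]  # 2**m output bits, one per input combination
    ff_state: int = 0       # value currently held by the flip-flop

    def __post_init__(self):
        if len(self.truth_table) != 2 ** self.m:
            raise ValueError("truth table must have 2**m entries")

    def lut(self, inputs):
        """Look up the output bit for a list of m input bits."""
        if len(inputs) != self.m:
            raise ValueError("expected exactly m inputs")
        index = 0
        for bit in inputs:
            index = (index << 1) | (bit & 1)
        return self.truth_table[index]

    def clock(self, inputs):
        """Evaluate the LUT and store the result in the flip-flop."""
        self.ff_state = self.lut(inputs)
        return self.ff_state


# A 2-input LUT configured as XOR.
xor_le = LogicElement(m=2, truth_table=[0, 1, 1, 0])
```

Changing `truth_table` is what reconfiguring the LE amounts to in this model: the same hardware then computes a different Boolean function.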

14 Support for Reconfiguration. Reconfiguration time is short: 160 ps. Area overhead of NRAMs: k is the number of reconfiguration sets per NRAM; assume k = 16. Area overhead: 20.5% per LB, assuming 100 nm technology for CMOS logic and nanotube length. Logic density = k (configuration copies) × area per configuration = 16 × (1 − 0.205) ≈ 12.72. An appropriate value for k is obtained through design space exploration.
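
The logic density estimate on this slide can be checked with a few lines of Python. This is only a sketch of the slide's own formula; the function name `relative_logic_density` is hypothetical.

```python
def relative_logic_density(k, area_overhead):
    """Logic density relative to a single-configuration LB.

    k             -- number of reconfiguration sets stored per NRAM
    area_overhead -- fraction of LB area taken by the NRAMs (0.205 on the slide)
    """
    if k < 1 or not 0.0 <= area_overhead < 1.0:
        raise ValueError("need k >= 1 and 0 <= area_overhead < 1")
    return k * (1.0 - area_overhead)


density = relative_logic_density(16, 0.205)
print(f"{density:.2f}")  # prints 12.72
```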

15 Temporal Logic Folding. Basic idea: one can use NRAM-enabled run-time reconfiguration to realize different Boolean functions in the same logic element (LE) every few cycles.
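
To illustrate the basic idea, the sketch below reuses one 2-input LE for two different Boolean functions in successive cycles. It is a toy model, not the NATURE hardware: the class `FoldedLE`, the stored configurations, and the example function (a AND b) XOR c are all assumptions chosen for illustration.

```python
AND_TABLE = [0, 0, 0, 1]  # truth table of a 2-input AND
XOR_TABLE = [0, 1, 1, 0]  # truth table of a 2-input XOR


class FoldedLE:
    """One 2-input LUT with several configurations held in local storage."""

    def __init__(self, configs):
        self.configs = configs  # stands in for the NRAM configuration sets
        self.active = 0         # index of the configuration currently loaded

    def reconfigure(self):
        """Load the next stored configuration."""
        self.active = (self.active + 1) % len(self.configs)

    def evaluate(self, a, b):
        return self.configs[self.active][(a << 1) | b]


def folded_and_xor(a, b, c):
    """Compute (a AND b) XOR c on a single LE in two cycles."""
    le = FoldedLE([AND_TABLE, XOR_TABLE])
    stored = le.evaluate(a, b)     # cycle 1: AND, result kept in the flip-flop
    le.reconfigure()               # switch the LE to its XOR configuration
    return le.evaluate(stored, c)  # cycle 2: XOR with the stored value
```

Without folding, the same function would occupy two LEs for the whole computation; here one LE is time-shared at the cost of an extra cycle.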

16 Example. Without logic folding: number of LEs = 6; delay = 4 LE delays + interconnect delay. With logic folding: number of LEs = 2; delay = 4 × clock period, where clock period = LE delay + reconfiguration delay + interconnect delay.
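
The two delay expressions in this example can be written out as follows. The 160 ps reconfiguration time comes from the earlier slide; the LE and interconnect delays are hypothetical placeholder values, and the function names are not from the source.

```python
def delay_without_folding(depth, le_delay_ps, interconnect_delay_ps):
    """Delay = depth LE delays + interconnect delay (no folding)."""
    return depth * le_delay_ps + interconnect_delay_ps


def delay_with_folding(depth, le_delay_ps, reconfig_delay_ps, interconnect_delay_ps):
    """Delay = depth * clock period, with one LUT computation per cycle."""
    clock_period_ps = le_delay_ps + reconfig_delay_ps + interconnect_delay_ps
    return depth * clock_period_ps


DEPTH = 4            # LUT levels in the slide's example
RECONFIG_PS = 160    # reconfiguration time quoted in the deck
LE_DELAY_PS = 500    # hypothetical LE delay
GLOBAL_IC_PS = 1000  # hypothetical interconnect delay without folding
LOCAL_IC_PS = 200    # hypothetical per-cycle interconnect delay with folding

unfolded_ps = delay_without_folding(DEPTH, LE_DELAY_PS, GLOBAL_IC_PS)
folded_ps = delay_with_folding(DEPTH, LE_DELAY_PS, RECONFIG_PS, LOCAL_IC_PS)
```

With these placeholder numbers the folded mapping is somewhat slower but uses 2 LEs instead of 6, which is the area-performance trade-off the slide illustrates.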

17 Folding Levels. Logic folding can be performed at different levels of granularity, providing flexibility to perform area-performance trade-offs. Level-p folding implies reconfiguration of the LE after the execution of p LUT computations. Figure: (a) level-1 folding; (b) level-2 folding.
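
A rough sketch of how the folding level affects LE count and cycle count, assuming each LUT is assigned to a folding stage by its logic depth and that an LE can be reused in every stage. The function `fold` and the per-level LUT distribution are hypothetical, and the model ignores the flip-flops needed to carry intermediate values between stages.

```python
import math
from collections import Counter


def fold(lut_levels, p):
    """Estimate (LEs needed, folding cycles) for level-p folding.

    lut_levels -- 1-based logic depth of each LUT in the circuit
    p          -- number of LUT levels executed before each reconfiguration
    """
    if p < 1 or not lut_levels:
        raise ValueError("need p >= 1 and at least one LUT")
    # LUTs at depths 1..p go to stage 0, depths p+1..2p to stage 1, and so on.
    luts_per_stage = Counter((level - 1) // p for level in lut_levels)
    les_needed = max(luts_per_stage.values())
    num_cycles = math.ceil(max(lut_levels) / p)
    return les_needed, num_cycles


# Hypothetical 6-LUT, depth-4 circuit: two LUTs at levels 1 and 2, one at levels 3 and 4.
levels = [1, 1, 2, 2, 3, 4]
for p in (1, 2, 4):
    print(p, fold(levels, p))
```

In this model, level-1 folding needs the fewest LEs and the most cycles, while setting p to the full circuit depth corresponds to no folding.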

18 Choosing the Folding Level. Advantages of logic folding: significant flexibility for performing area-performance trade-offs; ability to map much larger circuits using the same number of LEs; significant improvement in the area/circuit delay product; reduction in the need for global routing. As the folding level increases: the clock period increases; routing delay increases; the number of clock cycles decreases; reconfiguration time decreases; total delay typically decreases; the number of LEs increases; area increases.

19 Experimental Setup. Instance of architecture: 4 MBs in an SMB, 4 LEs in an MB, and each LE contains a 4-input LUT. The number of reconfiguration copies k is varied in order to compare implementations corresponding to selected folding levels: level-1, level-2, level-4 and no logic folding. Results are based on 100 nm CMOS technology parameters.

20 Experimental Results. Average area-time product advantage = 2X; maximum area-time product advantage = 3X.

21 Experimental Results (Contd.). Benchmarks: 16-RCA: 16-bit ripple-carry adder; 16-CLA: 16-bit carry-lookahead adder; 16-CSA: 16-bit carry-select adder; 8-MUL: 8-bit multiplier. Average area-time product advantage = 13X; maximum area-time product advantage = 35X.

22 Experimental Results (Contd.). Flexibility in performing area-performance trade-offs. For the area-time (AT) product, the larger the circuit depth, the greater the advantage of level-1 folding relative to no folding. For the 64-bit ripple-carry adder, this advantage is about 35X. LE utilization and logic density are very high, with a reduced need for a deep interconnect hierarchy.
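
The area-time (AT) advantage quoted in these results is a ratio of two products. The sketch below shows the calculation only; the LE counts and delays are hypothetical inputs, not measurements from the deck, and the function names are not from the source.

```python
def area_time_product(num_les, delay_ps):
    """AT product, using the LE count as a proxy for area."""
    return num_les * delay_ps


def at_advantage(no_folding, folded):
    """Ratio of the unfolded AT product to the folded one (>1 favors folding).

    Each argument is a (num_les, delay_ps) pair.
    """
    return area_time_product(*no_folding) / area_time_product(*folded)


# Hypothetical mapping of one circuit: 6 LEs at 3000 ps vs. 2 LEs at 3440 ps.
advantage = at_advantage(no_folding=(6, 3000), folded=(2, 3440))
```

Deeper circuits let folding reuse the same LEs over more cycles, which is why the advantage grows with circuit depth in the reported results.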

23 Conclusions. NATURE: a novel high-performance run-time reconfigurable architecture. Introduction of NRAMs into the architecture enables cycle-by-cycle reconfiguration and logic folding. The choice of different folding levels allows the flexibility of performing area-performance trade-offs. Logic density and area-time product are improved significantly. NATURE can be very useful for cost-conscious embedded systems and future FPGA improvement.

