Project Overview: Nanoscale Application Specific ICs (NASIC) and Wire-Streaming Processors (WiSP)

Presentation transcript:

Project Overview: Nanoscale Application Specific ICs (NASIC) and Wire-Streaming Processors (WiSP)
Csaba Andras Moritz, Associate Professor
University of Massachusetts, Amherst
August 14, 2005

Copyright - Csaba Andras Moritz, ECE, UMass Amherst
NASIC and WiSP Projects
Objective:
- Explore novel circuits, architectures, suitable software approaches, and tools (CAD and simulation), as well as built-in fault-tolerance approaches, for designs on 2-D silicon nanowire (SiNW) and carbon nanotube (CNT) fabrics
- Optimizations at various system layers to preserve fabric density without added requirements for manufacturing at the nanoscale
- Explore suitable architectures
- Compare with aggressive deep-submicron CMOS designs
- Explore new fabric models that overcome the limitations discovered
Collaborators:
- MassNanotech on nanodevices and nanoscale fabrication
- University of Brest in France on CAD tools for NASICs
- 6 other UMass ECE faculty on system work

Motivation
Little work has been done on the benefits of SiNW- and CNT-based devices as building blocks for nanoscale systems. We are trying to answer questions like:
- What are the challenges when building nanoscale systems (e.g., circuits and architectures)?
- Can the density advantages of nanodevices be preserved at the system level? What would be the capabilities of such systems compared to CMOS systems at the end of the CMOS roadmap (30 nm and below)?
- Can the insights gained influence device and manufacturing research?

Outline
Circuit-level work:
- Static NASICs
- Dynamic NASICs
- Nano-latches
- Pipelining
- Multi-tile design and pipelining
Architectures:
- Wire streaming (WiSP)
- Comparison with equivalent CMOS designs at 30 nm and below
- Multi-tile designs
Tools:
- SimpleFit: performance comparison based on analytical models
- Madeo: CAD prototyping tools for NASIC designs
Papers published

Nanodevices
[Figure labels: Carbon Nanotubes (CNT); Silicon Nanowires (SiNW); Nanoarray; Transistors or Diodes]
Lauhon et al., Nature 420, 57

NASIC Circuit Work

A Nanotile in Static NASICs
[Figure labels: microwires; pull-up network; pull-down network; OR plane; AND plane]
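
The AND-plane/OR-plane organization of a nanotile is the familiar two-level (PLA-style) form. As a rough illustration only, and not the circuit drawn on the slide, the Python sketch below evaluates a 1-bit full adder as product terms followed by sums. The function names and the dictionary encoding of product terms are assumptions made for this example.

```python
from itertools import product

# Hypothetical encoding: each product term maps an input name to the
# value (0 or 1) that input must have for the term to be true.
def and_plane(inputs, terms):
    """Return one bit per product term (logical AND of its literals)."""
    return [int(all(inputs[name] == want for name, want in term.items()))
            for term in terms]

def or_plane(term_bits, outputs):
    """Each output ORs together the product terms listed for it."""
    return {name: int(any(term_bits[i] for i in idxs))
            for name, idxs in outputs.items()}

# 1-bit full adder written as sums of products over a, b, cin.
TERMS = [
    {"a": 1, "b": 0, "cin": 0},  # term 0
    {"a": 0, "b": 1, "cin": 0},  # term 1
    {"a": 0, "b": 0, "cin": 1},  # term 2
    {"a": 1, "b": 1, "cin": 1},  # term 3
    {"a": 1, "b": 1},            # term 4
    {"a": 1, "cin": 1},          # term 5
    {"b": 1, "cin": 1},          # term 6
]
OUTPUTS = {"sum": [0, 1, 2, 3], "cout": [4, 5, 6]}

def full_adder(a, b, cin):
    """Evaluate the adder by passing inputs through both planes."""
    bits = and_plane({"a": a, "b": b, "cin": cin}, TERMS)
    return or_plane(bits, OUTPUTS)

def check_full_adder():
    """Exhaustively compare the two-level form against integer addition."""
    for a, b, cin in product((0, 1), repeat=3):
        out = full_adder(a, b, cin)
        assert out["sum"] + 2 * out["cout"] == a + b + cin
    return True
```

Calling `check_full_adder()` walks all eight input combinations. In this sketch each entry of `TERMS` stands in for one product-term wire and each entry of `OUTPUTS` for one output wire; how the slide's tile actually assigns wires is not specified here.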

Sequential Circuits on Nanogrids
The flip-flop has poor area efficiency.
- The feedback loop requires turning a corner on the 2-D fabric
It requires 2 doping types in each dimension.
When cascading combinational circuits and flip-flops, the diagonal effect further reduces area efficiency.
[Figure: An adder and a flip-flop (pull-up/pull-down networks are not shown)]

Dynamic Circuits
Dynamic circuits: precharge-evaluate-hold phases
Hold phase for cascading dynamic circuits

Dynamic NASIC Tile and Pipeline
The nano-latch provides implicit latching on the SiNW.
- Dynamic circuit style with precharge-evaluate-hold control (see papers)
- Solution for temporary data storage
- Used to build pipelined structures for high-density stream processing
[Figure: Pipelined structure]

Nano-Latch
Cascaded dynamic circuits -> nano-latch
The nano-latch is implicit -> better area efficiency
Nano-latch: idea for temporary storage
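
To make the role of the hold phase concrete, here is a small behavioral sketch in Python. It assumes a staggered three-phase schedule in which stage s is one phase behind stage s-1, so a stage evaluates exactly while its predecessor holds. That schedule, the function names, and the use of `None` for a precharged (empty) wire are illustrative assumptions, not details taken from the slides or papers.

```python
PHASES = ("precharge", "evaluate", "hold")

def phase(stage, t):
    """Phase of a stage at time step t under the assumed staggered schedule."""
    return PHASES[(t - stage) % 3]

def run_pipeline(stage_fns, inputs, steps):
    """Simulate cascaded dynamic stages; return values produced by the last stage."""
    held = [None] * len(stage_fns)   # value on each stage's output wire
    feed = iter(inputs)
    results = []
    last = len(stage_fns) - 1
    for t in range(steps):
        prev = list(held)            # outputs as they stood after the previous step
        for s, fn in enumerate(stage_fns):
            p = phase(s, t)
            if p == "precharge":
                held[s] = None       # output wire is reset, old value is lost
            elif p == "evaluate":
                # Stage 0 reads a fresh input; later stages read the value
                # that the previous stage is currently holding.
                src = next(feed, None) if s == 0 else prev[s - 1]
                held[s] = None if src is None else fn(src)
                if s == last and held[s] is not None:
                    results.append(held[s])
            # "hold": the wire simply keeps its evaluated value
    return results
```

For example, `run_pipeline([lambda x: x + 1, lambda x: x * 2], [1, 2, 3], 9)` pushes three inputs through two stages. No explicit latch object appears between the stages: the value survives only because the producing stage is in its hold phase while the consumer evaluates, which is the sense in which the latch is implicit in this model.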

Architecture

Interconnection
Neighboring NASIC tiles are connected by nanowires (NWs), while global interconnections are provided by microwires (MWs).
Note that only minor, limited modifications to tiles are needed for efficient interconnect.

Pipeline Structure - An Example
Pipelined 2-bit adder
A 2-bit addition is processed in 2 stages
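
A two-stage pipelined 2-bit adder can be sketched as follows. The split used here, low bit in stage 1 and high bit plus carry in stage 2, is an assumption for illustration; the slide does not say how the work is divided. The names `stage1`, `stage2`, and `pipelined_add` are likewise hypothetical.

```python
def stage1(a, b):
    """Stage 1: add the low bits and forward the high bits untouched."""
    a0, b0 = a & 1, b & 1
    return {
        "s0": a0 ^ b0,        # low sum bit
        "c0": a0 & b0,        # carry into the high bit
        "a1": (a >> 1) & 1,
        "b1": (b >> 1) & 1,
    }

def stage2(latched):
    """Stage 2: add the high bits plus carry and assemble the 3-bit result."""
    a1, b1, c0 = latched["a1"], latched["b1"], latched["c0"]
    s1 = a1 ^ b1 ^ c0
    c1 = (a1 & b1) | (a1 & c0) | (b1 & c0)
    return latched["s0"] | (s1 << 1) | (c1 << 2)

def pipelined_add(pairs):
    """Stream operand pairs through both stages, one pair entering per cycle."""
    latch = None                          # value held between the two stages
    results = []
    for pair in list(pairs) + [None]:     # one extra cycle drains the pipeline
        if latch is not None:
            results.append(stage2(latch))
        latch = stage1(*pair) if pair is not None else None
    return results
```

With this arrangement `pipelined_add([(1, 2), (3, 3)])` accepts a new operand pair every cycle, and each result appears one cycle after its operands entered, so both stages are busy once the stream is running.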

Overview of WiSP
WiSP: Wire-Streaming Processor
Why WiSP?
- Explore a nanoscale design at the circuit and architectural levels
- It exercises most principles in NASICs
  - NASIC circuits and principles for cascading and tiling
  - Pipelining, dynamic circuits, and nano-latches
- An architecture style that fits 2-D fabric constraints
  - Requires minimal feedback and control
  - Can use temporary storage on the wire, etc.
  - Features in the ISA help improve density

Architecture of WiSP-0
WiSP-0 is the initial version of WiSP.
- Supports a simple ISA: nop, movi, mov, add, mul
- Hazards are exposed to the compiler
- Implements a 5-stage pipeline on 5 NASIC nanotiles
[Figures: Floorplan of WiSP-0; Schematic of WiSP-0]
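
What "hazards exposed to the compiler" can mean in practice is sketched below in Python. Only the five mnemonics come from the slide. The operand formats (`movi rd, imm`; `mov rd, rs`; `add`/`mul rd, rs1, rs2`), the four registers, and the two-slot result delay are all assumptions chosen for illustration, not WiSP-0's actual encoding or timing.

```python
NOP = ("nop",)

def reads(ins):
    """Registers an instruction reads (hypothetical operand formats)."""
    if ins[0] == "mov":
        return {ins[2]}
    if ins[0] in ("add", "mul"):
        return {ins[2], ins[3]}
    return set()

def run(program, num_regs=4, delay=2):
    """Execute with no interlocks: a result written in slot i is visible
    only to instructions in slots later than i + delay (assumed delay)."""
    regs = [0] * num_regs
    pending = []                       # (ready_slot, rd, value), program order
    for slot, ins in enumerate(program):
        while pending and pending[0][0] <= slot:
            _, rd, val = pending.pop(0)
            regs[rd] = val
        op = ins[0]
        if op == "nop":
            continue
        if op == "movi":
            val = ins[2]
        elif op == "mov":
            val = regs[ins[2]]
        elif op == "add":
            val = regs[ins[2]] + regs[ins[3]]
        elif op == "mul":
            val = regs[ins[2]] * regs[ins[3]]
        else:
            raise ValueError(f"unknown opcode: {op}")
        pending.append((slot + delay + 1, ins[1], val))
    for _, rd, val in pending:         # drain writes still in flight
        regs[rd] = val
    return regs

def pad_with_nops(program, delay=2):
    """Compiler-side fix: insert nops so every reader is far enough
    behind the latest writer of each register it reads."""
    out, last_write = [], {}
    for ins in program:
        for r in reads(ins):
            while r in last_write and len(out) - last_write[r] <= delay:
                out.append(NOP)
        if ins[0] != "nop":
            last_write[ins[1]] = len(out)
        out.append(ins)
    return out
```

Running `[("movi", 0, 2), ("movi", 1, 3), ("add", 2, 0, 1), ("mul", 3, 2, 2)]` directly through `run` reads stale registers, while running the output of `pad_with_nops` on the same program gives the intended values. The hardware model contains no stall or forwarding logic at all, which is the trade this sketch is meant to show: correctness is moved into the instruction schedule.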

Program Counter
(pull-up/pull-down networks and microwires are not shown)
- Bottom part is a 4-bit incrementer circuit
- Top part is a 4-bit nano-latch
- High density is achieved despite the feedback path and latching
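
Behaviorally, a 4-bit incrementer feeding a 4-bit latch is a counter that wraps after 15. The Python sketch below models only that behavior; the ripple-carry formulation, the little-endian bit tuples, and the class and function names are assumptions for illustration and say nothing about how the nanotile lays out the logic.

```python
def increment4(bits):
    """Add 1 to a 4-bit value given as a little-endian tuple of bits."""
    carry = 1
    out = []
    for b in bits:
        out.append(b ^ carry)   # sum bit of a half adder
        carry = b & carry       # carry ripples to the next bit
    return tuple(out)           # final carry is dropped, so 15 wraps to 0

def to_int(bits):
    """Convert a little-endian bit tuple to an integer."""
    return sum(b << i for i, b in enumerate(bits))

class ProgramCounter:
    """4-bit latch whose next value comes from the incrementer (feedback path)."""

    def __init__(self):
        self.latch = (0, 0, 0, 0)

    def tick(self):
        """Return the current PC, then latch the incremented value."""
        current = self.latch
        self.latch = increment4(current)
        return current
```

Calling `tick()` repeatedly on a fresh `ProgramCounter` yields the addresses 0 through 15 and then starts again from 0, so a 4-bit counter of this kind can address at most 16 instructions.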

Instruction ROM

Register File

ALU

Comparison with 30-nm CMOS
[Figure: Area breakdown]
                                WiSP-0    Projected (1,000x1,000 tiles)
Density ratio vs. 30-nm CMOS    12.5X     115X

Fault-Tolerance Work
- Exploring built-in fault tolerance
- Defect maps not required
- System-level redundancy
- Simulation tools to evaluate yield

Dynamic Style for Fault-Tolerant Designs: FTD NASICs
To use our built-in redundancy mechanism on dynamic nanotiles, we need to change the logic style: use predischarge-evaluate-hold phases on horizontal nanowires, but precharge-evaluate-hold phases on vertical nanowires.

Area Impact of Various FT Approaches Studied for WiSP
- 2-level redundancy based on FTD NASIC logic
- S0 and S1 are other built-in circuit-level techniques

Simulation and CAD Tools

Initial System-Level Evaluation
We use SimpleFit (C. A. Moritz et al., 2001, IEEE Trans. Parallel and Distributed Systems), extended for our circuit and nanodevice assumptions, to evaluate the impact of density in a tiled NASIC architecture vs. 30-nm CMOS, assuming the same speed.
Assumptions:
                    CMOS      NASIC
Technology          30 nm     4-nm-pitch NWs, 90-nm-pitch MWs
Chip area           1.8 cm^2
# of transistors    10^9      variable
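
For a sense of scale, the listed assumptions allow a back-of-the-envelope density comparison. The Python sketch below is not SimpleFit and does not reproduce its model. It assumes that the 1.8 cm^2 area and 10^9 transistor count describe the CMOS reference chip, and it counts one potential device per nanowire crosspoint at 4-nm pitch, which ignores microwires, pull-up/pull-down networks, routing, and redundancy. The result is therefore only a geometric upper bound.

```python
NM_PER_CM = 1e7  # nanometres in one centimetre

def cmos_density_per_cm2(transistors=1e9, area_cm2=1.8):
    """Transistors per cm^2 for the CMOS reference chip (assumed pairing)."""
    return transistors / area_cm2

def crosspoint_density_per_cm2(pitch_nm=4.0):
    """Nanowire crosspoints per cm^2 on an ideal grid with the given pitch."""
    pitch_cm = pitch_nm / NM_PER_CM
    return 1.0 / (pitch_cm ** 2)

def raw_density_ratio():
    """Upper-bound ratio of crosspoint density to CMOS transistor density."""
    return crosspoint_density_per_cm2() / cmos_density_per_cm2()
```

Under these assumptions the CMOS chip has roughly 5.6e8 transistors per cm^2, an ideal 4-nm-pitch grid has 6.25e12 crosspoints per cm^2, and the raw ratio is about 11,250. Any system-level ratio will be far smaller once the overheads listed above are counted, which is why an architecture-level evaluation is needed at all.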

CAD Tools for NASICs
Developing NASIC Madeo
- Physical modeling
- Logic synthesis: combinational and sequential blocks
- Architecture composition: binding blocks together to build operators, networks, or processing elements
- Library management
- Smalltalk-based description
Together with our collaborators, we have created an initial version for NASICs.

Scaling of WiSP ALU Designs Using NASIC Madeo

Manual vs. NASIC Madeo RF Designs
[Figure panels: Manual; Madeo]

Papers
- "Wire-Streaming Processors on 2-D Nanowire Fabrics", extended version of NSTI (Nano Science and Technology Institute) and NSC-3 papers, UMass Technical Report.
- "Wire-Streaming Processors on 2-D Nanowire Fabrics", NSTI (Nano Science and Technology Institute) Nanotech 2005, California, May.
- "Latching on the Wire and Pipelining in Nanoscale Designs", Non-Silicon Computation Workshop, in conjunction with ISCA-31.
- "Opportunities and Challenges in Application-Tuned Circuits and Architectures Based on Nanodevices", Computing Frontiers'04, ACM SIGMicro.
- "NASIC: Nanoscale Application-Specific ICs and Architectures", Boston Area Computer Architecture Workshop'04.