ERD Architecture Benchmarking: The NRI MIND Activity Ralph K. Cavin, III, Kerry Bernstein & Jeff Welser July 12, 2009 San Francisco, CA.


1 ERD Architecture Benchmarking: The NRI MIND Activity Ralph K. Cavin, III, Kerry Bernstein & Jeff Welser July 12, 2009 San Francisco, CA

2 Goals of the NRI/MIND Benchmarking Project
- Develop circuit/subsystem-level examples of the applications of novel devices
- Evaluate the circuits/subsystems in the energy-time-space context versus CMOS implementations
- Determine the most promising applications for emerging devices, with an emphasis on integration with CMOS

3 Architectural Innovations haven’t been the major driver for system performance
Analysis of high-performance architectures and the technologies they were built in, examining device vs. architecture contributions to throughput:
- Predominant influence on SPEC2000 is from device technology
- Modest contributions from architecture

4 Four Architectural Projections
1) CMOS is not going away anytime soon. Charge (the state variable) and the MOSFET (the fundamental switch) will remain the preferred HPC solution until new switches appear as the long-term replacement solution in 10-20 years.
2) Hardware accelerators execute selected functions faster than software performing them on the CPU. Accelerators are responsible for substantial improvements in throughput.
3) Alternative switches often exhibit emergent, idiosyncratic behavior. We should exploit them. Certain physical behaviors may emulate selected HPC instruction sequences. Some operations may be superior to digital solutions.
4) New switches may improve high-utilization accelerators. The shorter-term supplemental solution (5-15 years) improves or replaces accelerators “built in CMOS and designed for CMOS”, either on-chip, on-3D-stack, or on-planar.

5 Matching Logic Functions & New Switch Behaviors
New Switch Ideas: Single Spin, Spin Domain, Tunnel-FETs, NEMS, MQCA, Molecular, Bio-inspired, CMOL, Excitonics, ?
Popular Accelerators: Encrypt / Decrypt; Compress / Decompress; Regular Expression Scan; Discrete Cosine Transform; Bit Serial Operations; H.264 Std Filtering; DSP, A/D, D/A; Viterbi Algorithms; Image, Graphics
Example: Cryptography Hardware Acceleration
- Operations required: Rotate, Byte Alignment, EXORs, Multiply, Table Lookup
- Circuits used in accelerator: Transmission Gates (“T-Gates”)
- New Switch Opportunity (example): A number of new switches (e.g., T-FETs) don’t have thermionic barriers, so they won’t suffer from the CMOS pass-gate V_T drop, Body Effect, or Source-Follower delay.
- Potential Opportunity: Replace 4 T-Gate MOSFETs with 1 low-power switch.
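
To make the operation list in the cryptography example concrete, here is a minimal Python sketch of three of the named primitives (rotate, EXOR, and table lookup) combined into a toy round function. Everything in it is an assumption made for illustration: the names `rotl32`, `TOY_SBOX`, and `toy_round` are hypothetical, the table is not taken from any real cipher, and multiply and byte alignment are left out for brevity.

```python
MASK32 = 0xFFFFFFFF


def rotl32(x, r):
    """Rotate a 32-bit word left by r bits."""
    r %= 32
    return ((x << r) | (x >> (32 - r))) & MASK32


# Hypothetical 4-bit substitution table, NOT from any real cipher.
# Because 7 is odd, i -> (7*i + 3) mod 16 is a permutation of 0..15.
TOY_SBOX = [(7 * i + 3) % 16 for i in range(16)]


def toy_round(word, key):
    """One toy round: EXOR with a key, rotate by one byte, then table lookup."""
    mixed = (word ^ key) & MASK32          # EXOR
    rotated = rotl32(mixed, 8)             # rotate
    out = 0
    for nibble in range(8):                # table lookup, 4 bits at a time
        value = (rotated >> (4 * nibble)) & 0xF
        out |= TOY_SBOX[value] << (4 * nibble)
    return out


if __name__ == "__main__":
    print(hex(rotl32(0x12345678, 8)))
    print(hex(toy_round(0x12345678, 0xDEADBEEF)))
```

In a hardware accelerator each of these steps maps to wiring, XOR gates, and multiplexers or small ROMs, which is where transmission gates typically appear.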

6 Examples of Benchmarking Work in Progress
- Magnetic Tunnel Junction one-bit adder
- Magnetic Logic for one-bit adder
- Magnetic Ring Logic Devices
Many other devices are being evaluated in a variety of circuit configurations.

7 Background - MTJ
Researchers have been investigating post-CMOS devices for many years. In the short term, people are looking for switches that supplement CMOS, are CMOS-compatible, and support ultra-low-power operation. The MTJ (Magnetic Tunnel Junction) is one of the strongest candidates, and it is available in practice rather than only in theory.
– Excellent for memory and storage. STT-RAM using MTJs is a strong candidate for universal memory.
– For logic design, good or not? Any memory device can also be used to build logic circuits, in theory at least, and the MTJ is no exception.
The discovery of spin torque transfer (STT) makes the MTJ scalable and completely CMOS-compatible.

8 MTJ-based DyCML 1-Bit Full Adder
The MTJ is used as both a memory cell and a functional input. The MTJ is switched by STT using the control signals WL and BL. The circuit is actually a combined CMOS-MTJ version of DyCML, so it is more reasonable to compare it with CMOS-based DyCML to see the MTJ’s impact.

9 Results
[Figure: ED curve for a 65 nm process, comparing DyCML-MTJ, SCMOS, and DyCML-CMOS]

10 Nanomagnet Logic (NML)
PIs: Gary Bernstein (1), X. Sharon Hu (2), Michael Niemier (2), Wolfgang Porod (1)
Student Researchers: M. Tanvir Alam (1), Michael Crocker (2), Aaron Dingler (2), Steve Kurtz (2), Shawn Liu (2), M. Jafar Siddiq (1), Edit Varga (1)
Affiliations: (1) Department of Electrical Engineering, (2) Department of Computer Science and Engineering

11 Comparison to CMOS
Hard to compare a magnet to a transistor
– Need to make the technology comparison at the functional unit level; consider initial projections here
Natural comparison = low-power CMOS systems, sub-threshold, etc.
[Figure: adder with inputs A, B, C, outputs Sum and C_out, and elements labeled M1, M2, M3]
Base performance projections on the adder design.
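
The slide does not spell out how the adder is wired. As a hedged illustration, the sketch below assumes that M1, M2, and M3 are three-input majority gates, a common building block in nanomagnet logic, and uses one well-known three-majority-gate full adder construction. The function names and the assignment of M1 to M3 to particular gates are assumptions, not taken from the presentation.

```python
from itertools import product


def majority(a, b, c):
    """Three-input majority vote on 0/1 inputs."""
    return int(a + b + c >= 2)


def full_adder(a, b, c_in):
    """One-bit full adder from three majority gates plus two inversions.

    The M1/M2/M3 labels are assumed for illustration only.
    """
    c_out = majority(a, b, c_in)                # M1: carry out
    partial = majority(a, b, 1 - c_in)          # M2: uses inverted carry in
    total = majority(1 - c_out, c_in, partial)  # M3: sum, uses inverted carry out
    return total, c_out


def adder_is_correct():
    """Exhaustively compare against integer addition for all 8 input cases."""
    for a, b, c in product((0, 1), repeat=3):
        total, c_out = full_adder(a, b, c)
        if a + b + c != total + 2 * c_out:
            return False
    return True


if __name__ == "__main__":
    for a, b, c in product((0, 1), repeat=3):
        print(a, b, c, "->", full_adder(a, b, c))
```

Comparing at this functional-unit level, rather than magnet against transistor, is what lets energy and delay be set side by side with a CMOS adder.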

12 Trends
[Figure axes: V & μ_r; EDP (pJ ns)]
Because of sensitivity to sub-threshold slope, threshold voltage … energy, delay can vary significantly from technology to technology. These are best data points for CMOS (0.3V - 1V).

            Energy (pJ)   Delay (ns)
CMOS        0.020         261
NML NP      0.029         198
NML P       0.029         18
With μ_r = 1, can still see ~15X performance gain due to higher throughput.

CMOS        0.19          20
If higher supply voltage to match delay, ~7X energy savings.

NML NP      0.0012        198
NML P       0.0012        18
With μ_r = 5, ~17X (NP) and ~158X (P) energy savings with better performance.
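
The quoted gains can be checked with a few lines of arithmetic. The sketch below assumes the flattened table reads as CMOS 0.020 pJ / 261 ns and 0.19 pJ / 20 ns, NML at μ_r = 1 as 0.029 pJ with 198 ns (NP) and 18 ns (P), and NML at μ_r = 5 as 0.0012 pJ with the same delays. That reading is an inference, chosen because it reproduces the ~15X, ~7X, ~17X, and ~158X figures on the slide. The row keys and function names are hypothetical.

```python
# (energy in pJ, delay in ns); keys are illustrative labels, not from the slide
POINTS = {
    "cmos_low_v": (0.020, 261.0),
    "cmos_matched_delay": (0.19, 20.0),
    "nml_np_mu1": (0.029, 198.0),
    "nml_p_mu1": (0.029, 18.0),
    "nml_np_mu5": (0.0012, 198.0),
    "nml_p_mu5": (0.0012, 18.0),
}


def edp(name):
    """Energy-delay product in pJ*ns for a named data point."""
    energy, delay = POINTS[name]
    return energy * delay


def energy_ratio(reference, candidate):
    """How many times less energy the candidate uses than the reference."""
    return POINTS[reference][0] / POINTS[candidate][0]


def delay_ratio(reference, candidate):
    """How many times shorter the candidate's delay is than the reference's."""
    return POINTS[reference][1] / POINTS[candidate][1]


if __name__ == "__main__":
    print("speedup, NML P vs low-V CMOS:", delay_ratio("cmos_low_v", "nml_p_mu1"))
    print("energy, mu_r=1 P vs matched CMOS:", energy_ratio("cmos_matched_delay", "nml_p_mu1"))
    print("energy, mu_r=5 NP vs low-V CMOS:", energy_ratio("cmos_low_v", "nml_np_mu5"))
    print("energy, mu_r=5 P vs matched CMOS:", energy_ratio("cmos_matched_delay", "nml_p_mu5"))
    for name in POINTS:
        print(name, "EDP (pJ*ns):", edp(name))
```

Under this reading, the NP savings are measured against the low-voltage CMOS point and the P savings against the CMOS point whose delay roughly matches the pipelined design.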

13 Magnetic Ring Logic Devices – Benchmarks/Metrics
Caroline Ross - MIT
These devices work by the movement of domain walls around thin-film rings with the general structure hard layer/spacer/soft layer, e.g. Co/Cu/NiFe or Co/MgO/NiFe. Rings can have several remanent states with different resistances, which is useful for multibit memory. However, digital logic uses two levels, so in these examples some of the complexity available in ring devices is wasted.
NAND/NOR configurations are being analyzed.

14 Prototype Magnetic Ring Device Performance

                    Existing prototype        Projection
Device area         1 µm^2                    Improve x100?
Switching speed     5 ns                      Proportional to 1/device length (improve x10?) and domain wall velocity (improve x10?)
Switching energy    5 × 10^-14 J (10^7 kT)    Proportional to switching speed (improve x100??), to device cross-section area (improve x10-20?), and to critical current for wall motion (improve x10-100?)
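
The kT figure quoted for the switching energy can be reproduced with a short calculation. The sketch below assumes room temperature (300 K), which the slide does not state; the function name `energy_in_kt` is hypothetical.

```python
BOLTZMANN_J_PER_K = 1.380649e-23  # exact value in the SI


def energy_in_kt(energy_joules, temperature_kelvin=300.0):
    """Express an energy as a multiple of kT (300 K is an assumed default)."""
    return energy_joules / (BOLTZMANN_J_PER_K * temperature_kelvin)


if __name__ == "__main__":
    prototype_switching_energy = 5e-14  # J, from the prototype column
    print(f"{energy_in_kt(prototype_switching_energy):.2e} kT")
```

At 300 K this comes out on the order of 10^7 kT, in line with the value in the table.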

15 Summary
- The Nanoelectronics Research Initiative benchmarking project should be nearing completion by mid-August 2009.
- The ERA section plans to provide a summary of findings for 2009.

