IRAM Vision. Microprocessor & DRAM on a single chip: on-chip memory latency 5-10X, bandwidth 50-100X; improve energy efficiency 2X-4X (no off-chip bus)

Presentation transcript:

1 IRAM Vision
Microprocessor & DRAM on a single chip:
–on-chip memory latency 5-10X, bandwidth 50-100X
–improve energy efficiency 2X-4X (no off-chip bus)
–serial I/O 5-10X v. buses
–smaller board area/volume
–adjustable memory size/width
[Figure: block diagram; labels: Proc, $, L2$, Bus, I/O, DRAM, logic fab, DRAM fab]
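The latency and bandwidth factors on this slide affect small and large memory transfers differently. The sketch below is only a first-order model (transfer time = latency + size / bandwidth). The off-chip baseline of 100 ns and 1 GB/s is a hypothetical placeholder, not a figure from the slides; the 5X and 50X factors are the lower ends of the ranges quoted for IRAM.

```python
def transfer_time(size_bytes: float, latency_s: float, bandwidth_bps: float) -> float:
    """First-order model: fixed latency plus streaming time."""
    return latency_s + size_bytes / bandwidth_bps


def iram_speedup(size_bytes: float,
                 base_latency_s: float = 100e-9,   # hypothetical off-chip latency
                 base_bandwidth_bps: float = 1e9,  # hypothetical off-chip bandwidth (bytes/s)
                 latency_factor: float = 5.0,      # slide: latency 5-10X
                 bandwidth_factor: float = 50.0    # slide: bandwidth 50-100X
                 ) -> float:
    """Ratio of off-chip to on-chip transfer time under the model above."""
    off_chip = transfer_time(size_bytes, base_latency_s, base_bandwidth_bps)
    on_chip = transfer_time(size_bytes,
                            base_latency_s / latency_factor,
                            base_bandwidth_bps * bandwidth_factor)
    return off_chip / on_chip


small = iram_speedup(64)         # latency-dominated: gain stays near the latency factor
large = iram_speedup(1_000_000)  # bandwidth-dominated: gain approaches the bandwidth factor
```

Under these assumptions, short accesses gain roughly the latency factor, while long streaming accesses approach the bandwidth factor, which is one reason a vector unit is a natural match for on-chip DRAM.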

2 VIRAM-1 Specs/Goals
Technology: micron, 5-6 metal layers, fast xtor
Memory: 16-32 MB
Die size: ≈ mm²
Vector pipes/lanes: 4 64-bit (or 8 32-bit or bit)

Target               Low Power                    High Performance
Serial I/O           4 1 Gbit/s                   8 2 Gbit/s
Power (university)   ≈ volt logic                 ≈ volt logic
Clock (university)   200 scalar/200 vector MHz    300 scalar/300 vector MHz
Perf (university)    1.6 GFLOPS GOPS              GFLOPS GOPS 16
Power (industry)     ≈ volt logic                 ≈ volt logic
Clock (industry)     400 scalar/400 vector MHz    600 scalar/600 vector MHz
Perf (industry)      3.2 GFLOPS GOPS 16           4 GFLOPS GOPS 16
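The peak figures on this slide follow from lanes × operations per lane per cycle × clock rate. The sketch below reproduces the 1.6 GFLOPS and 3.2 GFLOPS entries. The factor of 2 floating-point operations per lane per cycle is an assumption (for example, a multiply and an add issued together); the slide does not state it. The 256-bit total datapath is inferred from "4 64-bit (or 8 32-bit ...)".

```python
def lanes_for_width(datapath_bits: int, element_bits: int) -> int:
    """Number of lanes when a fixed-width datapath is split into narrower elements."""
    return datapath_bits // element_bits


def peak_gflops(lanes: int, clock_mhz: float, flops_per_lane_per_cycle: int = 2) -> float:
    """Peak GFLOPS = lanes * flops per lane per cycle * clock in GHz.

    flops_per_lane_per_cycle=2 is an assumption, not a number from the slides.
    """
    return lanes * flops_per_lane_per_cycle * clock_mhz / 1000.0


DATAPATH_BITS = 4 * 64  # 4 lanes of 64 bits, per the slide

university_low_power = peak_gflops(lanes_for_width(DATAPATH_BITS, 64), 200)  # 200 MHz vector clock
industry_low_power = peak_gflops(lanes_for_width(DATAPATH_BITS, 64), 400)    # 400 MHz vector clock
```

The same datapath split into narrower elements gives more lanes (8 at 32 bits), which is why integer throughput on short data types is quoted separately from 64-bit floating point.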

3 IRAM Update
2 test chips: serial lines (MOSIS) + Embedded DRAM/Crossbar (LG Semicon)
Simulator/Architecture Manual Completed
Initial Vector Compiler (“VIC”) Completed
Partner for scalar processor (Sandcraft/MIPS)
LG delays, prospects => stick to plan to re-evaluate options for IRAM prototype
–Foundry: TSMC, UMC
–DRAM companies: IBM, Micron, NEC, Toshiba
Applications: FFT, segmentation,...

4 IRAM App: ISTORE (“Intelligent Storage”)
1 IRAM/DRAM + crossbar switch + fast serial link v. conventional SMP
Move function to data v. data to CPU
[Figure: conventional CPU (Proc, $, L2$, Bus, I/O) v. IRAM nodes connected by a crossbar]

5 Another Vision of ISTORE
1 IRAM/disk + xbar + fast serial link v. conventional SMP, cluster
Network latency = f(SW overhead), not link distance
Move function to data v. data to CPU (scan, sort, join,...)
Cost/performance, more scalable
[Figure: IRAM nodes linked by crossbars, alongside CPU/Memory]
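The "move function to data" idea can be illustrated with a selective scan. The sketch below is a toy model, not ISTORE software: `Node`, `RECORD_BYTES`, and both scan functions are hypothetical names, and each node is just a Python list standing in for one IRAM/disk pair. It counts how many bytes cross the interconnect when the CPU filters everything centrally versus when each node filters its own records.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

RECORD_BYTES = 100  # hypothetical fixed record size


@dataclass
class Node:
    """Stand-in for one IRAM/disk pair holding a partition of the keys."""
    records: List[int]


def scan_at_cpu(nodes: List[Node], pred: Callable[[int], bool]) -> Tuple[List[int], int]:
    """Data to CPU: every record is shipped, then filtered centrally."""
    shipped = [r for n in nodes for r in n.records]
    matches = [r for r in shipped if pred(r)]
    return matches, len(shipped) * RECORD_BYTES


def scan_at_data(nodes: List[Node], pred: Callable[[int], bool]) -> Tuple[List[int], int]:
    """Function to data: each node filters locally and ships only matches."""
    matches: List[int] = []
    for n in nodes:
        matches.extend(r for r in n.records if pred(r))
    return matches, len(matches) * RECORD_BYTES


# 8 nodes with 1000 keys each; the predicate selects 1% of the keys.
nodes = [Node(list(range(i * 1000, (i + 1) * 1000))) for i in range(8)]
selective = lambda k: k % 100 == 0

central_result, central_bytes = scan_at_cpu(nodes, selective)
local_result, local_bytes = scan_at_data(nodes, selective)
```

Both approaches return the same matches; in this toy setup the local scan moves 1% of the bytes, and the saving scales with the selectivity of the predicate.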

6 ISTORE Update
Build prototypes to gain experience, develop software before IRAM chips arrive
–Replace with IRAM chips once available
ISTORE-0: 2 Sandcraft Development boards + Fast Ethernet + Real-time OS (VxWorks/QNX)
ISTORE-1: “Intelligent SIMM” module based on Mitsubishi M32RXD (DRAM interface+CPU)

7 IRAM/ISTORE Schedule
[Figure: schedule chart; tracks: IRAM, ISTORE/OS, Compiler]

IRAM/ISTORE Presentations
–MicroDesign Resources Dinner Meeting, 1/8/98
–Embedded Memory Workshop, Japan, 3/15/98
–Stanford Computer Science Colloquium, 5/6/98
–University of Virginia Distinguished Lecture, 5/19/98
–SIGMOD98 Keynote Address, 6/3/98
Articles
–“New Processor Paradigm: V-IRAM”, Microprocessor Report, 3/9/98,
–“A perfect match.” New Scientist, 4/18/98,
–“Professor's Idea for Speedy Chip Could Be More Than Academic,” Wall Street Journal, 8/28/98, B1, B4.