1 IRAM and ISTORE David Patterson, Katherine Yelick, John Kubiatowicz U.C. Berkeley, EECS

Presentation transcript:

1 IRAM and ISTORE David Patterson, Katherine Yelick, John Kubiatowicz U.C. Berkeley, EECS

2 IRAM Overview
Low power, high performance processor for multimedia
Mixed logic and DRAM
–On-chip bandwidth, low power
Vector processing
–Parallelism replaces high clock rate
»200 MHz, 3.2 Gflops, 2 Watts
–Simple issue and control logic (power, area)
»Push complexity into compiler; well-understood model
Compare to SIMD media extensions (MMX, VIS,…)
–Multimedia has fine-grained data parallelism
–Scalable, multi-generation instruction set
Working with application experts (video and speech)
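The "parallelism replaces high clock rate" point can be checked with simple arithmetic on the three figures quoted on the slide (200 MHz, 3.2 Gflops, 2 Watts). The sketch below is only an illustration: the function and constant names are hypothetical, and it says nothing about how the hardware actually delivers the operations.

```python
# Back-of-the-envelope check of the slide's figures.
# All names here are hypothetical; only the three numbers come from the slide.

CLOCK_HZ = 200e6      # 200 MHz clock
PEAK_FLOPS = 3.2e9    # 3.2 Gflops peak
POWER_WATTS = 2.0     # 2 Watts


def flops_per_cycle(peak_flops: float, clock_hz: float) -> float:
    """Floating-point operations that must complete each clock cycle."""
    return peak_flops / clock_hz


def flops_per_watt(peak_flops: float, watts: float) -> float:
    """Peak floating-point operations per second per watt."""
    return peak_flops / watts


per_cycle = flops_per_cycle(PEAK_FLOPS, CLOCK_HZ)
per_watt = flops_per_watt(PEAK_FLOPS, POWER_WATTS)

print(f"operations per cycle: {per_cycle:.0f}")
print(f"Gflops per watt:      {per_watt / 1e9:.1f}")
```

At a 200 MHz clock, the quoted peak works out to 16 floating-point operations per cycle, which is why the slide attributes the performance to parallel vector execution rather than to clock speed.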

3 ISTORE Overview
Two killer applications for future
–Storage and retrieval
Design points
–2000: 80 nodes in 3 racks
–2001: 1000 nodes with IBM (?)
–2005: 10K nodes in 1 rack (?)
»Add IRAM to 1” disk
Key problems are availability, maintainability, and evolutionary growth (AME) of thousand-node servers
Approach
–Hardware built for availability: monitor, diagnostics
–New class of benchmarks for AME
–Reliable systems from unreliable hw/sw components
–Introspection: the system watches itself
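The introspection idea, "the system watches itself", can be sketched as a monitor that scans per-node status reports and flags nodes that look unhealthy. This is a minimal sketch under stated assumptions, not ISTORE's actual design: the `NodeStatus` record, the `find_suspect_nodes` function, and both thresholds are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NodeStatus:
    """Hypothetical status report from one storage node."""
    node_id: int
    last_heartbeat: float  # time of the most recent heartbeat, in seconds
    disk_errors: int       # disk errors observed since the last check


def find_suspect_nodes(
    statuses: List[NodeStatus],
    now: float,
    heartbeat_timeout: float = 5.0,  # assumed threshold, not from the slides
    max_disk_errors: int = 3,        # assumed threshold, not from the slides
) -> List[int]:
    """Return IDs of nodes that missed a heartbeat or report too many errors."""
    suspects = []
    for status in statuses:
        silent = (now - status.last_heartbeat) > heartbeat_timeout
        failing = status.disk_errors > max_disk_errors
        if silent or failing:
            suspects.append(status.node_id)
    return suspects


statuses = [
    NodeStatus(node_id=0, last_heartbeat=99.0, disk_errors=0),  # healthy
    NodeStatus(node_id=1, last_heartbeat=90.0, disk_errors=0),  # silent too long
    NodeStatus(node_id=2, last_heartbeat=99.5, disk_errors=7),  # too many errors
]
suspects = find_suspect_nodes(statuses, now=100.0)
print(suspects)
```

With thousands of nodes, a check of this kind would have to run continuously and without an operator, which is one way to read the slide's goal of building reliable systems from unreliable components.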