RAMPing Down Chuck Thacker Microsoft Research August 2010.

Slides:



Advertisements
Similar presentations
January 2008 RAMP Retreat BEE3 Update Chuck Thacker John Davis Microsoft Research Chen Chang BWRC/BEECube 16 January 2008.
Advertisements

RAMP Gold : An FPGA-based Architecture Simulator for Multiprocessors Zhangxi Tan, Andrew Waterman, David Patterson, Krste Asanovic Parallel Computing Lab,
© Krste Asanovic, 2014CS252, Spring 2014, Lecture 5 CS252 Graduate Computer Architecture Spring 2014 Lecture 5: Out-of-Order Processing Krste Asanovic.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Cache III Steve Ko Computer Sciences and Engineering University at Buffalo.
4/16/2013 CS152, Spring 2013 CS 152 Computer Architecture and Engineering Lecture 19: Directory-Based Cache Protocols Krste Asanovic Electrical Engineering.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Snoopy Caches I Steve Ko Computer Sciences and Engineering University at Buffalo.
CS61C L40 Parallelism in Processor Design (1) Garcia, Spring 2008 © UCB License Except as otherwise noted, the content of this presentation is licensed.
The Evolution of RISC A Three Party Rivalry By Jenny Mitchell CS147 Fall 2003 Dr. Lee.
RAMP in Retrospect David Patterson August 25, 2010.
RAMP Sharing (Playing well with others) John Davis, Christos Kozyrakis, Chuck Thacker, Takefumi Miyoshi, Shinya Takamaeda, Phillip Jones, Tayo Oguntebi,
Parallel Applications Parallel Hardware Parallel Software 1 The Parallel Computing Laboratory Krste Asanovic, Ras Bodik, Jim Demmel, Tony Keaveny, Kurt.
RAMP Retreat August 2008 Christos Kozyrakis Pervasive Parallelism Laboratory Stanford University
1 Jan 07 RAMP PI Report: Plans until next Retreat & Beyond Krste Asanovíc (MIT), Derek Chiou (Texas), James Hoe(CMU), Christos Kozyrakis (Stanford), Shih-Lien.
1 RAMP White RAMP Retreat, BWRC, Berkeley, CA 20 January 2006 RAMP collaborators: Arvind (MIT), Krste Asanovíc (MIT), Derek Chiou (Texas), James Hoe (CMU),
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Snoopy Caches II Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Directory-Based Caches II Steve Ko Computer Sciences and Engineering University at Buffalo.
EECS Electrical Engineering and Computer Sciences B ERKELEY P AR L AB P A R A L L E L C O M P U T I N G L A B O R A T O R Y EECS Electrical Engineering.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Directory-Based Caches I Steve Ko Computer Sciences and Engineering University at Buffalo.
1 RAMP Implementation J. Wawrzynek. 2 RDL supports multiple platforms:  XUP, pure software, BEE2 BEE2 will be the standard RAMP platform for the next.
June 2007 RAMP Tutorial BEE3 Update Chuck Thacker John Davis Microsoft Research 10 June, 2007.
Seven Minute Madness: Special-Purpose Parallel Architectures Dr. Jason D. Bakos.
1 Research Accelerator for MultiProcessing Dave Patterson, UC Berkeley January RAMP collaborators: Arvind (MIT), Krste Asanovíc (MIT), Derek Chiou.
RAMP Summer Retreat 2008 Breakout Reports RAMP Summer Retreat 2008 Attendees (Compiled by Greg Gibeling)
Future of Computer Architecture
Ramp January 2007 retreat Xilinx XUP and RAMP donations Paul Hartke Kees Vissers Patrick Lysaght.
Ramp august 2008 retreat Xilinx RAMP donations Kees Vissers Paul Hartke Xilinx Research.
Configurable System-on-Chip: Xilinx EDK
1 Breakout thoughts (compiled with N. Carter): Where will RAMP be in 3-5 Years (What is RAMP, where is it going?) Is it still RAMP if it is mapping onto.
Research Accelerator for Multiple Processors
CS 152 Computer Architecture and Engineering Lecture 21: Directory-Based Cache Protocols Scott Beamer (substituting for Krste Asanovic) Electrical Engineering.
1 RAMP Models and Platforms Krste Asanovic UC Berkeley RAMP Retreat, Berkeley, CA January 15, 2009.
1 A Community Vision for a Shared Experimental Parallel HW/SW Platform Dave Patterson, Pardee Professor of Comp. Science, UC Berkeley President, Association.
1 Introduction to Research Accelerator for Multiple Processors David Patterson (Berkeley, CO-PI), Arvind (MIT), Krste Asanovíc (Berkeley/MIT), Derek Chiou.
CS61C L43 Hardware Parallel Computing (1) Garcia, Spring 2007 © UCB Leakage solved!  Insulators to separate wires on chips have always had problems with.
ASPLOS ’08 Ramp Tutorial BEE3 Update Chuck Thacker John Davis Microsoft Research 2 March 2008.
1 RAMP Tutorial Introduction/Overview Krste Asanovic UC Berkeley RAMP Tutorial, ASPLOS, Seattle, WA March 2, 2008.
1 Fast Communication for Multi – Core SOPC Technion – Israel Institute of Technology Department of Electrical Engineering High Speed Digital Systems Lab.
The Berkeley View: A New Framework & a New Platform for Parallel Research David Patterson and a cast of thousands Pardee Professor of Computer Science,
January 2007 RAMP Retreat BEE3 Update Chuck Thacker Technical Fellow Microsoft Research 11 January, 2007.
1 “Embedded” RAMP Workshop 8/23/06 bee2.eecs.berkeley.edu/wiki/embedded.
April 18, 2011CS152, Spring 2011 CS 152 Computer Architecture and Engineering Lecture 19: Directory-Based Cache Protocols Krste Asanovic Electrical Engineering.
Only Sky is a limit, So lets the real job start S. Pogrebenko, , &
General FPGA Architecture Field Programmable Gate Array.
7/20/05FDIS The Design and Application of Berkeley Emulation Engines John Wawrzynek Bob Brodersen Chen Chang University of California, Berkeley Berkeley.
1 Layers of Computer Science, ISA and uArch Alexander Titov 20 September 2014.
1 Berkeley RAD Lab Technical Overview Armando Fox, Randy Katz, Michael Jordan, Dave Patterson, Scott Shenker, Ion Stoica March 2006.
1 How to Hurt Scientific Productivity David A. Patterson Pardee Professor of Computer Science, U.C. Berkeley President, Association for Computing Machinery.
1 3-General Purpose Processors: Altera Nios II 2 Altera Nios II processor A 32-bit soft core processor from Altera Comes in three cores: Fast, Standard,
1 Hardware Security Mechanisms Krste Asanovic U.C. Berkeley August 20, 2009.
CS/ECE 3330 Computer Architecture Kim Hazelwood Fall 2009.
Advanced Computer Architecture, CSE 520 Generating FPGA-Accelerated DFT Libraries Chi-Li Yu Nov. 13, 2007.
1 RAMP Infrastructure Status Daniel Burke 19 Aug 08.
10/27: Lecture Topics Survey results Current Architectural Trends Operating Systems Intro –What is an OS? –Issues in operating systems.
1 Instruction Set Architecture (ISA) Alexander Titov 10/20/2012.
Lecture 10: Logic Emulation October 8, 2013 ECE 636 Reconfigurable Computing Lecture 13 Logic Emulation.
by Computer System Design Lecture 1 Wannarat Suntiamorntut
Field Programmable Gate Arrays FPGA’s Moving from Fixed Mode Architectures to Mode Configurable Architectures for HDTV and Digital Cinema applications.
Derek Chiou Prototyping to Emulation 1 From Prototyping to Emulation: The StarT (*T) Era ( ) Derek Chiou (Dataflow-StarT-Synthesis Era occupant)
KHz-SLR PC Board Eastbourne, October 2005 for kHz SLR Complete PC Board.
1 Retreat (Advance) John Wawrzynek UC Berkeley January 15, 2009.
Raw Status Update Chips & Fabrics James Psota M.I.T. Computer Architecture Workshop 9/19/03.
Breakout Report: RAMP Grand Challenges Eric Chung, Joel Emer, Paul Hartke, James C. Hoe, James Holt, Asif Khan, Christos Kozyrakis Martha Mercaldi Kim,
3/12/07CS Visit Days1 A Sea Change in Processor Design Uniprocessor SpecInt Performance: From Hennessy and Patterson, Computer Architecture: A Quantitative.
By Wannarat Computer System Design Lecture 1 Wannarat Suntiamorntut.
Conclusions on CS3014 David Gregg Department of Computer Science
1 Future of Computer Architecture David A. Patterson Pardee Professor of Computer Science, U.C. Berkeley President, Association for Computing Machinery.
Derek Chiou The University of Texas at Austin
David Patterson Electrical Engineering and Computer Sciences
Krste Asanovic Electrical Engineering and Computer Sciences
The performance requirements for DSP applications continue to grow and the traditional solutions do not adequately address this new challenge Paradigm.
Presentation transcript:

RAMPing Down Chuck Thacker Microsoft Research August 2010

Overview Original goals Participants Projects What worked What didn’t What next

3 Build Academic MPP from FPGAs (Slide from D. Patterson, 2006) As  20 CPUs will fit in Field Programmable Gate Array (FPGA), 1000-CPU system from  50 FPGAs? – 8 32-bit simple “soft core” RISC at 100MHz in 2004 (Virtex-II) – FPGA generations every 1.5 yrs;  2X CPUs,  1.2X clock rate HW research community does logic design (“gate shareware”) to create out-of-the-box, MPP – E.g., 1000 processor, standard ISA binary-compatible, 64-bit, cache-coherent  150 MHz/CPU in 2007 – RAMPants: Arvind (MIT), Krste Asanovíc (MIT), Derek Chiou (Texas), James Hoe (CMU), Christos Kozyrakis (Stanford), Shih-Lien Lu (Intel), Mark Oskin (Washington), David Patterson (Berkeley, Co-PI), Jan Rabaey (Berkeley), and John Wawrzynek (Berkeley, PI) “Research Accelerator for Multiple Processors”

4 Why RAMP [is] Good for Research MPP? SMPClusterSimulate RAMP Scalability (1k CPUs)CAAA Cost (1k CPUs)F ($40M)C ($2-3M)A+ ($0M)A ($ M) Cost of ownershipADAA Power/Space (kilowatts, racks) D (120 kw, 12 racks) A+ (.1 kw, 0.1 racks) A (1.5 kw, 0.3 racks) CommunityDAAA ObservabilityDCA+ ReproducibilityBDA+ ReconfigurabilityDCA+ CredibilityA+ FB+/A- Perform. (clock)A (2 GHz)A (3 GHz)F (0 GHz)C (.1 GHz) GPACB-BA-

Participants (PIs) UCB (D. Patterson, K. Asanovic, J. Wawrzynek) MIT (Arvind, J. Emer (MIT/Intel)) UT (D. Chiou) CMU (J. Hoe) UW (M. Oskin) Stanford (C. Kozyrakis)

Projects Berkeley: RAMP Gold MIT: HAsim UT: Protoflex CMU: FAST UW: --- Stanford: TCC (on BEE2)

What worked All the simulation-related projects. – The architecture community seems to really like simulators. BEE3/ BEECube – Got a more reliable, lower cost platform – Spun out a company to support/evolve it. – Some degree of industrial uptake (Chen?) Some of the actual architecture projects. – But not as many as I had hoped. – And we never really got a many-core system

What didn’t BEE3 – Still too expensive for most universities – And a recession didn’t help Gateware sharing – This turned out to be a lot harder than anyone thought. RIDL – Seemed like a good idea. What happened? Design tools – Tools like Bluespec help, but ISE is still lurking at the bottom and eating lots of time. “Waterhole effect”

What next? A new platform? (next slides) Better ways to collaborate and share: – FABRIC? – Need to get the students collaborating, rather than the PIs. ARPA games model from the ‘60s

A new platform for architecture research Need a better entry-level story – More like the XUPV5 or NetFPGA – A $2K board rather than $20K – Enables a larger user base Need a good expansion story – Plug boards together in a backplane or with cables Need FPGAs that are substantially better than V5 – One generation is probably not worth the engineering effort.

BEE5? ItemXC5VLX155TXC7V855P 6-LUT100K500K Flop100K1M BRAM DSP GTP/X1636 IO Single PWB – easy engineering

Expansion X5 direct-connect – 25X BEE3 capacity, half the cost. X16 mesh (or torus) – 80X BEE3 capacity, 2X cost. Larger?

Conclusions FPGAs are great for architecture research. – My earlier belief was wrong Consortia, particularly between universities, are hard. – It’s hard enough within a single department. – Hard to create win-win situations. – A bit like herding cats. Need people who can work up and down the stack.