1 Breakout thoughts (compiled with N. Carter): Where will RAMP be in 3-5 Years (What is RAMP, where is it going?) Is it still RAMP if it is mapping onto.

Presentation transcript:

1 Breakout thoughts (compiled with N. Carter): Where will RAMP be in 3-5 years? (What is RAMP, and where is it going?)
- Is it still RAMP if it is mapping onto something other than an FPGA? Can we revert to software simulation?
- Is RAMP cost-effective given current parallel hardware? (The situation was different 5 years ago.)
- It is now possible to build a 1K-core system with blades:
  - e.g. Niagara has 64 threads; could 16 blades simulate 1K cores?
- One can probably get a 50-CPU cluster for the "industry cost" of a BEE3, $50K. Maybe $1K per core.
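The back-of-envelope figures on slide 1 can be checked with a short sketch. It assumes "1K" means 1024 target cores and that each host hardware thread simulates exactly one target core; neither assumption is stated on the slide, and the function names are illustrative only.

```python
import math


def blades_needed(target_cores: int, threads_per_blade: int) -> int:
    """Blades required if each host thread simulates one target core (assumption)."""
    return math.ceil(target_cores / threads_per_blade)


def cost_per_cpu(total_cost_usd: float, cpus: int) -> float:
    """Average cost per CPU for a cluster bought at a given total price."""
    return total_cost_usd / cpus


# Slide figures: 64 threads per Niagara blade, 1K target cores.
blades = blades_needed(1024, 64)

# Slide figures: a 50-CPU cluster for roughly the $50K "industry cost" of a BEE3.
per_cpu = cost_per_cpu(50_000, 50)

print(f"blades: {blades}, cost per CPU: ${per_cpu:,.0f}")
```

Under these assumptions the slide's numbers are consistent: 16 blades cover 1024 threads, and $50K over 50 CPUs is $1K each.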

2 Breakout thoughts: On clusters:
- For throughput-limited work (lots of independent simulation runs), e.g. generating 3000 data points, a cluster with many CPUs, each working separately, is a good fit.
- For a low-latency single run, network latency will dominate and result in low emulation performance.
- In all of these "software" approaches:
  - There is a big issue of mismatch between "host" and "target". For good matches (ISA, network, memory system), emulation can be efficient.
  - An FPGA, because of its fine-grained configurability, largely avoids this issue.
- But are these processor-based emulations still "RAMP" systems?
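A toy model can illustrate the contrast slide 2 draws between throughput-limited batches and a single latency-sensitive run. Everything here is a hypothetical simplification: the batch model assumes fully independent runs of equal length, and the single-run model assumes compute divides perfectly across CPUs while every cross-node message pays the full network latency serially. The parameter values in the usage lines are made up for illustration, except for the 3000 data points taken from the slide.

```python
import math


def batch_wall_clock(num_runs: int, num_cpus: int, seconds_per_run: float) -> float:
    """Wall-clock time for independent runs, one run per CPU at a time."""
    return math.ceil(num_runs / num_cpus) * seconds_per_run


def single_run_wall_clock(compute_seconds: float, num_cpus: int,
                          messages: int, network_latency_seconds: float) -> float:
    """Toy estimate for one run split across CPUs (assumed model, not measured).

    Compute is assumed to divide evenly; each message is assumed to add the
    full network latency to the critical path.
    """
    return compute_seconds / num_cpus + messages * network_latency_seconds


# 3000 data points (from the slide) on a hypothetical 50-CPU cluster, 60 s per run.
print(batch_wall_clock(3000, 50, 60.0))

# One hypothetical 3600 s run split 50 ways with a million 100-microsecond messages:
# the latency term exceeds the compute term.
print(single_run_wall_clock(3600.0, 50, 1_000_000, 100e-6))
```

In the batch case adding CPUs shortens the total time almost linearly; in the single-run case the latency term does not shrink with more CPUs, which is the sense in which network latency dominates.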

3 Breakout thoughts: Perhaps what makes it "RAMP" is the design representation, not the host.
- There may be something to using RAMP methodology and models for software (and other) implementations:
  - It is too easy to cheat in software-only models; starting from a "hardware" model (with explicit interconnect, etc.) will help.
  - The emulation can be rehosted to FPGAs later.
- This has the extra benefits of:
  - avoiding the bother or cost of FPGA hardware initially, and
  - helping verification by having two implementations.
- SimpleScalar has been successful partly because it runs on anything.
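As a minimal sketch of what a "hardware" style model with explicit interconnect might look like in software, the example below has two units that interact only through a channel with a fixed latency in target cycles. The class names (`Channel`, `Producer`, `Consumer`) and the `simulate` driver are hypothetical and are not taken from RAMP or RDL; they only illustrate the idea that units cannot share state behind the interconnect's back.

```python
from collections import deque


class Channel:
    """Explicit point-to-point link with a fixed latency in target cycles."""

    def __init__(self, latency: int):
        self.latency = latency
        self._in_flight = deque()  # entries are (arrival_cycle, message)

    def send(self, cycle: int, message):
        self._in_flight.append((cycle + self.latency, message))

    def receive(self, cycle: int):
        """Return the oldest message due by this cycle, or None if none is due."""
        if self._in_flight and self._in_flight[0][0] <= cycle:
            return self._in_flight.popleft()[1]
        return None


class Producer:
    """Sends its current cycle number once per cycle until `count` messages are sent."""

    def __init__(self, out: Channel, count: int):
        self.out, self.remaining = out, count

    def tick(self, cycle: int):
        if self.remaining > 0:
            self.out.send(cycle, cycle)
            self.remaining -= 1


class Consumer:
    """Records (receive_cycle, message) for every message that arrives."""

    def __init__(self, inp: Channel):
        self.inp = inp
        self.log = []

    def tick(self, cycle: int):
        message = self.inp.receive(cycle)
        if message is not None:
            self.log.append((cycle, message))


def simulate(cycles: int, latency: int = 2, count: int = 3):
    link = Channel(latency)
    producer, consumer = Producer(link, count), Consumer(link)
    for cycle in range(cycles):
        consumer.tick(cycle)
        producer.tick(cycle)
    return consumer.log


print(simulate(6))
```

Because all communication goes through `Channel`, the same unit descriptions could in principle be mapped to a different host, and the software run gives a second implementation to compare against during verification.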

4 Breakout thoughts: To pull this off, we need a higher level of abstraction for specifying units:
  1. Library approach (users want parameterized high-level blocks)
  2. Modeling language (builders of blocks want a common language for both HW/SW implementations)
- Can C-gates help?
- In 3-5 years, RAMP will need to transition into "tools":
  - Right now the project is still investigating the space of a "bag of tricks".
  - At some point it must transition to being much easier to use (a tool, not a PhD thesis).
- Liberty as an example for RDL! (But it has a steep learning curve!)