Combining Simulators and FPGAs “An Out-of-Body Experience”

Presentation transcript:

Combining Simulators and FPGAs: “An Out-of-Body Experience”
Eric S. Chung, Brian Gold, James C. Hoe, Babak Falsafi
{echung, bgold, jhoe, babak}@ece.cmu.edu
SIMFLEX/PROTOFLEX

The RAMP full-system challenge
- RAMP vision for studying systems w/ FPGAs
  - functional & cycle-accurate simulation
  - scalability, speed, & flexibility on FPGAs
  - full-system (run unmodified binaries & OS)
- [Figure: full-system target; labels: I/O MMU controller, DMA controller, IRQ controller, CPU, CPU, Terminal, PCI Bus, Memory, Graphics card, Ethernet controller, SCSI controller, Disk, Disk]
- ‘Full-sys’ RAMP will incur large effort, yet not all behaviors are frequently used (e.g., I/O)
June 22, 2006 Eric S. Chung / RAMP 2006 Summer Retreat

Combining simulators & FPGAs
- Simulators already provide full-system -> why not simulate infrequent behaviors (e.g., I/O devices)?
- [Figure: FPGA host and simulator host; labels on each side: CPUs, Memory, SCSI, Ethernet, disk]
- Advantages
  - avoid impl. infreq. behaviors -> lowers full-sys FPGA development
  - low impact on scalability & perf. on FPGA

Outline
- Motivation
- Migration
- Implementation status
- Conclusion

Migration
- [Figure: target design whose “target objects” (ex: func or timing cpu) are mapped onto the FPGA and simulator hosts; mappings labeled 1, 2, 3]
- 3 ways to map target object to host
  1. FPGA-only
  2. Simulation-only
  3. Migratable
- Migratable objects switch modes between FPGA & simulator hosts
  - target behavior need not be 100% in FPGA mode
  - e.g., impl. 80% target behavior in FPGA, 100% in simulator
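The three mappings on this slide can be pictured as a small dispatch rule. The sketch below is illustrative only and is not the ProtoFlex implementation: the names `Mapping`, `TargetObject`, and `host_for`, and the particular set of FPGA-implemented behaviors, are assumptions made for the example. It assumes the simulator implements every behavior, while the FPGA implements only a subset.

```python
from enum import Enum


class Mapping(Enum):
    """The three target-to-host mappings named on the slide."""
    FPGA_ONLY = 1
    SIM_ONLY = 2
    MIGRATABLE = 3


class TargetObject:
    """A target object (e.g., a CPU model) plus its host mapping.

    fpga_behaviors is a hypothetical subset of behaviors implemented in
    the FPGA; the simulator is assumed to implement all behaviors.
    """

    def __init__(self, name, mapping, fpga_behaviors=()):
        self.name = name
        self.mapping = mapping
        self.fpga_behaviors = frozenset(fpga_behaviors)

    def host_for(self, behavior):
        """Return which host ("fpga" or "simulator") runs this behavior."""
        if self.mapping is Mapping.SIM_ONLY:
            return "simulator"
        if self.mapping is Mapping.FPGA_ONLY:
            # An FPGA-only object has no fallback host.
            if behavior not in self.fpga_behaviors:
                raise NotImplementedError(f"{self.name}: {behavior}")
            return "fpga"
        # Migratable: stay on the FPGA when possible, else fall back.
        return "fpga" if behavior in self.fpga_behaviors else "simulator"


cpu = TargetObject("cpu0", Mapping.MIGRATABLE, {"load", "add", "sub"})
print(cpu.host_for("add"))  # fpga
print(cpu.host_for("io"))   # simulator
```

Only the migratable mapping tolerates a partial FPGA implementation, which is what allows the "80% in FPGA, 100% in simulator" split.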

Migration example
- Target-to-host mappings:
  - CPU = migratable
  - Memory = FPGA-only
  - Devices = SW-only
- Example CPU instruction stream (in time order): load, add, multiply, I/O, add, sub, ..
- [Figure: the CPU runs on the FPGA next to Memory; at the I/O instruction a CPU state transfer moves it to the simulator, where the SCSI cmd is issued to the simulated SCSI device and disk]
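The instruction stream on this slide can be traced in a few lines. This is a minimal sketch under stated assumptions: the function `run`, the set `FPGA_OPS`, the toy `state` dictionary, and the policy of returning to the FPGA as soon as the next instruction is FPGA-supported are all hypothetical and are not taken from the slides.

```python
import copy

# Assumed subset of instructions implemented on the FPGA (hypothetical).
FPGA_OPS = {"load", "add", "multiply", "sub"}


def run(stream, fpga_ops=FPGA_OPS):
    """Run a stream on a migratable CPU; return (trace, transfers, state)."""
    state = {"pc": 0, "regs": [0] * 4}  # stand-in for architectural state
    host = "fpga"                        # the CPU starts on the FPGA
    trace = []
    transfers = 0
    for op in stream:
        wanted = "fpga" if op in fpga_ops else "simulator"
        if wanted != host:
            # Models the CPU state transfer between hosts.
            state = copy.deepcopy(state)
            host = wanted
            transfers += 1
        state["pc"] += 1
        trace.append((op, host))
    return trace, transfers, state


trace, transfers, state = run(["load", "add", "multiply", "io", "add", "sub"])
for op, host in trace:
    print(f"{op:9s} {host}")
print("state transfers:", transfers)  # 2
```

In this trace only the I/O instruction runs in the simulator, so the stream costs two state transfers: one to the simulator and one back.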

Advantages
- Lowers development effort
  - avoid bring-up of infrequent behaviors
  - migrate & validate ref. models from simulator
  - tailor impl. to workload (avoid rarely used instrs, good for CISC x86)
- Fast & scalable
  - perf-critical objects on FPGA (e.g., CPU, memory)
  - scalable for MPs -> add migratable CPUs
- [Figure: FPGA host and simulator host; labels: CPUs, Memory, SCSI, disk]

Subtleties
- Objects separated in simulator/FPGA interact
  - examples: interrupts, DMA
  - handle by forwarding messages between FPGA/simulator
  - FPGA-only & SW-only mapped objects easy to locate
  - migrated objects require tracking
- [Figure: forwarded DMA between the SCSI device on the simulator and Memory on the FPGA; labels: FPGA, Simulator, CPUs, DMA, Memory, SCSI, disk]

Subtleties (interrupt example)
- Objects separated in simulator/FPGA interact (examples: interrupts, DMA)
- Two options for an interrupt raised on the other host
  - Option 1: Forwarded interrupt
  - Option 2: Forced migration
- [Figure: interrupt between the simulator and the FPGA; labels: FPGA, Simulator, CPUs, Interrupt, Memory, SCSI, disk]
- Cross-host interactions rare -> low impact on FPGA perf.
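The forwarding and tracking described in the Subtleties slides can be sketched as a location table plus a message router. Everything here is an assumption made for illustration: the `Router` class, its methods, the message format, and in particular the reading of "forced migration" as moving the CPU to the device's host before delivering the interrupt locally.

```python
class Router:
    """Tracks which host each object lives on and forwards messages."""

    def __init__(self):
        self.location = {}  # object name -> "fpga" or "simulator"
        self.forwarded = 0  # count of cross-host messages

    def place(self, obj, host):
        self.location[obj] = host

    def migrate(self, obj, host):
        # Migrated objects require tracking: update the location table.
        self.location[obj] = host

    def send(self, src, dst, msg):
        """Deliver msg; it is forwarded if src and dst are on different hosts."""
        crossed = self.location[src] != self.location[dst]
        if crossed:
            self.forwarded += 1
        return {"to": dst, "host": self.location[dst],
                "msg": msg, "forwarded": crossed}


def deliver_interrupt(router, device, cpu, policy):
    """Option 1 ("forward") or Option 2 ("force_migrate") from the slide."""
    if policy == "force_migrate":
        # Assumed meaning: pull the CPU onto the device's host first.
        router.migrate(cpu, router.location[device])
    elif policy != "forward":
        raise ValueError(f"unknown policy: {policy}")
    return router.send(device, cpu, "irq")


r = Router()
r.place("cpu0", "fpga")
r.place("memory", "fpga")
r.place("scsi", "simulator")
print(r.send("scsi", "memory", "dma_write"))             # forwarded DMA
print(deliver_interrupt(r, "scsi", "cpu0", "forward"))   # forwarded interrupt
```

FPGA-only and SW-only objects never change their entry in the table, which is why only migratable objects need tracking.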

Subtleties cont.
- Migration cost
  - migrating object requires state copy, e.g., migratable CPU has registers & TLBs
  - FPGA-to-simulator latency & sim. time limits # migrations/instr
- FPGA & simulator asynchrony
  - simulated time “ticks” at different rates in FPGA & simulator
  - must synchronize for deterministic replay & accurate device timing
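A back-of-the-envelope model shows why migration latency limits the number of migrations per instruction. The model and both function names are assumptions, not something the slides state: each migration is charged a fixed latency, converted to FPGA cycles and amortized over the instruction stream. The 85 MHz clock and CPI of 7 used below come from the BlueSPARC slide; the 1000 µs migration latency is a made-up placeholder.

```python
def effective_mips(clock_mhz, cpi, migrations_per_instr, migration_latency_us):
    """Throughput when each migration stalls the CPU for a fixed latency."""
    cycles_per_migration = migration_latency_us * clock_mhz  # us * cycles/us
    effective_cpi = cpi + migrations_per_instr * cycles_per_migration
    return clock_mhz / effective_cpi


def max_migration_rate(clock_mhz, cpi, migration_latency_us, max_slowdown):
    """Largest migrations/instr keeping (effective_cpi / cpi - 1) <= max_slowdown."""
    cycles_per_migration = migration_latency_us * clock_mhz
    return max_slowdown * cpi / cycles_per_migration


# 85 MHz and CPI 7 are from the BlueSPARC slide; 1000 us is hypothetical.
rate = max_migration_rate(85, 7, 1000, 0.10)
print(f"migrations/instr for <=10% slowdown: {rate:.2e}")
print(f"effective MIPS at that rate: {effective_mips(85, 7, rate, 1000):.2f}")
```

Under this model the tolerable migration rate scales inversely with the migration latency, so halving the FPGA-to-simulator latency doubles the rate that fits in the same slowdown budget.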

Outline
- Motivation
- Migration
- Implementation in progress
- Conclusion

Implementation status
- Target system
  - Sun Fire[tm] 3800 Server (up to 24-way)
  - UltraSPARC III ISA
  - Solaris 8
- Proof-of-concept software-to-software migration
  - run 2 instances of Virtutech Simics
  - migration designed & tested in 2 weeks
  - can migrate on arbitrary behavior (e.g., ADD instruction)

BlueSPARC core (in progress)
- In-order SPARC V9 core
  - supports 144 out of 170 integer instr behaviors
  - supports partial MMU w/ I- & D-TLBs
  - goal: 99.999% of instrs & behaviors in target workloads
    - SPEC (mostly user-level), OLTP/DB2 (high TLB misses, 40% time in priv-mode)
  - CPI ranges 5 to 7 cycles
  - synth: 15k LUTs on Virtex-II Pro 30, 85 MHz, 12 MIPS (worst-case)
  - developed in Bluespec HDL, 6000L in 6 weeks
- Core validation
  - run RTL in lockstep w/ Simics’s UltraSPARC simulation model
  - workload validation w/ SPEC, OLTP/DB2, OpenSPARC verif. suite
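The throughput and coverage figures on this slide are consistent with simple arithmetic. The sketch assumes the usual relation MIPS = clock (MHz) / CPI, which the slide does not spell out; the helper name `mips` is introduced only for this example.

```python
def mips(clock_mhz, cpi):
    """Millions of instructions per second for a given clock and CPI."""
    return clock_mhz / cpi


worst_case = mips(85, 7)  # CPI 7 at 85 MHz
best_case = mips(85, 5)   # CPI 5 at 85 MHz
coverage = 144 / 170      # integer instruction behaviors implemented

print(f"worst-case: {worst_case:.1f} MIPS")  # 12.1, matching the slide's 12 MIPS
print(f"best-case:  {best_case:.1f} MIPS")   # 17.0
print(f"static coverage: {coverage:.1%}")    # 84.7%
```

The 84.7% is a static count of implemented behaviors, while the 99.999% goal is stated over target workloads; the two can differ widely when the unimplemented behaviors are rarely executed.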

Migration on FPGA (in progress)
- [Figure: Xilinx XUP Virtex-II Pro 30 and Virtutech Simics; labels: Simics UltraSPARC, BlueSPARC, PowerPC, Migration & message interface, DDR memory, Simulated target devices, ethernet]
- PowerPC functions
  - core & memory initialization from Simics checkpoints
  - facilitates migration for BlueSPARC
  - connects simulated devices to memory (e.g., SCSI DMA)

Conclusion
- Contributions
  - virtualizes infrequent behaviors using simulation
  - simplifies full-system FPGA emulator, still fast/scalable
  - incremental validation from reference system
- Future work
  - support migration in RDL?
  - adding cores + scaling across multiple FPGAs
- We are ready for BEE2
Thanks! Questions? echung@ece.cmu.edu
PROTOFLEX/SIMFLEX (http://www.ece.cmu.edu/~simflex)