
1 Jared Casper, Ronny Krashinsky, Christopher Batten, Krste Asanović
MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA, USA
A Parameterizable FPGA Prototype of a Vector-Thread Processor

2 SCALE Vector-Thread Processor
Key Features
– 4 lanes, 4 clusters
– Cluster for indexed accesses
– 4 segment address generators
– 4 VLDQs
– VRU includes throttle logic, refill address generator

3 SCALE Cache
[Diagram labels: Cache Arbiter and Crossbar; Memory Port Arbiter and Crossbar; four banks, each with SegBuf, Tags, Data, and MSHR blocks]
Key Features
– Two-cycle hit latency
– Four 8 KB banks
– 32-way associative
– 32 B cache lines
– 16 B/cycle per bank
– Four 16 B segment buffers per bank
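The cache figures above imply a few derived quantities that the slide does not state. The following is a minimal sketch of that arithmetic, assuming the 32-way associativity applies within each bank (the slide does not say so explicitly); the constant and function names are illustrative, not taken from the SCALE sources.

```python
# Parameters copied from the "SCALE Cache" slide.
NUM_BANKS = 4
BANK_BYTES = 8 * 1024          # Four 8 KB banks
WAYS = 32                      # 32-way associative (assumed to be per bank)
LINE_BYTES = 32                # 32 B cache lines
BANK_BYTES_PER_CYCLE = 16      # 16 B/cycle per bank
SEGBUFS_PER_BANK = 4           # Four 16 B segment buffers per bank
SEGBUF_BYTES = 16


def cache_geometry():
    """Derive capacity, set count, and peak bandwidth from the slide values."""
    lines_per_bank = BANK_BYTES // LINE_BYTES
    return {
        "total_bytes": NUM_BANKS * BANK_BYTES,
        "lines_per_bank": lines_per_bank,
        "sets_per_bank": lines_per_bank // WAYS,
        "peak_bytes_per_cycle": NUM_BANKS * BANK_BYTES_PER_CYCLE,
        "cycles_to_move_one_line": LINE_BYTES // BANK_BYTES_PER_CYCLE,
        "segbuf_bytes_total": NUM_BANKS * SEGBUFS_PER_BANK * SEGBUF_BYTES,
    }


if __name__ == "__main__":
    for name, value in cache_geometry().items():
        print(f"{name}: {value}")
```

Under that assumption each bank holds 256 lines in 8 sets, the four banks together give 32 KB, and the aggregate peak of 64 B/cycle matches the "4x128b per cycle" figure quoted for the prototype chip.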

4 SCALE Prototype Chip
Prototype SCALE processor in development
– Control processor: MIPS, 1 instr/cycle
– VTU: 4 lanes, 4 clusters/lane, 32 registers/cluster, 128 VPs max
– Primary I/D cache: 32 KB, 4x128b per cycle, non-blocking
– DRAM: 64b, 200 MHz DDR2 (64b at 400 Mb/s: 3.2 GB/s)
– Estimated 10 mm² in 0.18 μm, 400 MHz (25 FO4)
Cycle-level execution-driven C++ microarchitectural simulator
– Detailed VTU and memory system model
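The bandwidth figures on this slide can be checked with simple arithmetic. This is a sketch only: the function names are illustrative, and the cache calculation assumes the cache runs at the estimated 400 MHz core clock, which the slide does not state.

```python
def dram_peak_bytes_per_s(bus_bits, clock_hz, transfers_per_clock=2):
    """Peak DRAM bandwidth; DDR transfers data on both clock edges."""
    return bus_bits * clock_hz * transfers_per_clock // 8


def cache_peak_bytes_per_s(ports, port_bits, clock_hz):
    """Peak cache bandwidth for `ports` accesses of `port_bits` each per cycle."""
    return ports * port_bits * clock_hz // 8


# 64-bit DDR2 interface at 200 MHz, i.e. 400 Mb/s per pin.
DRAM_BW = dram_peak_bytes_per_s(bus_bits=64, clock_hz=200_000_000)

# 4 x 128 bits per cycle, assuming the estimated 400 MHz clock.
CACHE_BW = cache_peak_bytes_per_s(ports=4, port_bits=128, clock_hz=400_000_000)

if __name__ == "__main__":
    print(f"DRAM peak:  {DRAM_BW / 1e9:.1f} GB/s")
    print(f"Cache peak: {CACHE_BW / 1e9:.1f} GB/s")
    print(f"Cache-to-DRAM peak ratio: {CACHE_BW // DRAM_BW}x")
```

The DRAM result reproduces the slide's 3.2 GB/s. The cache figure of 25.6 GB/s is a peak under the stated clock assumption, not a number from the presentation.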

5 SCALE Prototype Board
Single Xilinx Virtex-II FPGA
– Configured via direct JTAG connection or SystemACE
Multiple Memory Chips
– Six Micron DDR2 SDRAMs
– Two Micron Mobile SDRAMs
– One Micron RLDRAM
– One Samsung SRAM
Two Logic Analyzer connections
Multiple separate power islands
Attached to custom test baseboard
– Sixteen independently measurable power supplies
– Byte-serial connection to a Linux PC

6 Module Placement
Reduce the risk of the final custom chip implementation
– Allow early rapid prototyping of many of the system interactions
Provide a parameterizable prototype for architectural experiments

7 Testing Setup

8

9 Status
Completed Work
– Single-issue seven-stage pipeline MIPS processor core
    Mapped to the board and passes our MIPS verification test suite
    Will form the SCALE control processor
– DDR2 memory controllers
    Tested in isolation using simple memory traffic generators
Work in progress
– Cache subsystem
– Vector-thread unit

10 Advantages of Using an FPGA
Rapid full system simulation of a large variety of designs
– Allows extensive characterization of the design space
Parameterization allows exploration of various tradeoffs
– Cache parameters and replacement policies
– Prefetch strategies
– DRAM access scheduling policies and power-down modes
– DRAM types (e.g., DDR2 vs. Mobile DRAM)
Fast emulation system for SCALE software development
Allows thorough debugging before going to silicon
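One way to picture the parameterized exploration described above is as a sweep over a small design space. The sketch below is purely illustrative: the `DesignPoint` fields, the option names (`"fifo"`, `"next_line"`, and so on), and the value ranges are hypothetical and do not come from the SCALE prototype; only the 4-bank, 8 KB-per-bank baseline is taken from the slides.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class DesignPoint:
    """One hypothetical configuration to synthesize and run on the FPGA."""
    cache_banks: int
    bank_kb: int
    replacement: str
    prefetch: str
    dram_type: str

    @property
    def cache_kb(self):
        return self.cache_banks * self.bank_kb


# Hypothetical option lists; the real prototype's parameters may differ.
SPACE = {
    "cache_banks": [2, 4],
    "bank_kb": [4, 8],
    "replacement": ["fifo", "lru"],
    "prefetch": ["none", "next_line"],
    "dram_type": ["ddr2", "mobile_sdram"],
}


def enumerate_design_points(space):
    """Return every combination of the listed parameter values."""
    names = list(space)
    return [
        DesignPoint(**dict(zip(names, values)))
        for values in product(*(space[name] for name in names))
    ]


points = enumerate_design_points(SPACE)

if __name__ == "__main__":
    print(f"{len(points)} design points")
    print(points[0])
```

Each design point would correspond to one FPGA build, so even this small space of five two-valued parameters already yields 32 configurations to characterize.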

