Retrospective on the VIRAM-1 Design Decisions
Christoforos E. Kozyrakis
kozyraki@cs.berkeley.edu
IRAM Retreat, January 9, 2001
VIRAM-1 Design Retrospective, C.E. Kozyrakis, 1/2001

What We Probably Got Right
  Low power design approach
  Use of a commercial MIPS core
  Permutation instructions
  Fixed-point arithmetic model
  Single load-store unit
  Dropping of the network interface
  Testing infrastructure
Low Power Design Approach
  Two design alternatives for VIRAM-1
    – 200 MHz, 2 W, 4 vector lanes
    – 500 MHz, 10 W (?), 4-8 vector lanes (?)
  Low power was the right choice because
    – Low power is important for embedded and multimedia applications
    – It is easier to design a low power processor than a high frequency one
    – High power consumption would severely interfere with DRAM operation
Use of Commercial MIPS Core
  Scalar core alternatives
    – Custom design optimized for a vector unit
    – Commercial core with generic coprocessor interface
  The MIPS m5Kc core was a great choice because
    – It is a flexible, synthesizable design with a lot of documentation and support
    – It comes with an RTL simulation environment which we reused for VIRAM-1
    – It allowed us to work on a demo system based on a MIPS daughter-card and demo board
Other Issues We Got Right
  Simple instructions for intra-register permutations
    – Allow the vectorization of reductions and FFT
    – Simple implementation compared to a general permutation
  Single load-store unit
    – Not sufficient memory bandwidth for two units
    – Address calculation and translation resources are expensive
    – Not obviously useful for most media applications
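To make the point about reductions concrete, here is a minimal sketch of how a simple intra-register permutation lets a sum reduction run as a short sequence of vector operations. A vector register is modelled as a Python list. The helper `move_upper_half_down` is a hypothetical stand-in for such a permutation; its name and exact semantics are assumptions for illustration, not the VIRAM-1 instruction definitions.

```python
def move_upper_half_down(vreg, length):
    """Hypothetical permutation: copy the upper half of the first `length`
    elements into the lowest positions of a new register (rest zero-filled)."""
    half = length // 2
    return vreg[half:length] + [0] * (len(vreg) - half)


def vector_add(a, b, length):
    """Element-wise add over the first `length` elements; the remaining
    elements of `a` pass through unchanged (models a shortened vector length)."""
    return [x + y for x, y in zip(a[:length], b[:length])] + a[length:]


def reduce_sum(vreg):
    """Sum a power-of-two-length register in log2(n) permute-and-add steps."""
    n = len(vreg)
    assert n > 0 and n & (n - 1) == 0, "sketch assumes a power-of-two length"
    acc = list(vreg)
    length = n
    while length > 1:
        upper = move_upper_half_down(acc, length)  # one permutation
        length //= 2
        acc = vector_add(acc, upper, length)       # one vector add
    return acc[0]
```

For an 8-element register this takes three permute-and-add steps rather than seven dependent scalar additions; the same halving pattern resembles the data movement between FFT butterfly stages.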
Other Issues We Got Right
  Dropping of the network interface
    – Not necessary for embedded/multimedia systems
    – Would introduce significant design complexity
  Testing infrastructure
    – Highly automated and easy to use for developing tests and verifying the complete VIRAM-1 design
What We Probably Got Wrong
  Insufficient benchmarking at early project stages
  Support for 64-bit data-types
  Lack of sub-banks in DRAM macros
  Dropping the decoupled pipeline
  Use of a crossbar for memory transfers
  Too much support for arithmetic exceptions
  Too much support for conditional execution
Insufficient Benchmarking
  Limited benchmarking was performed early enough to affect major design decisions
    – Previous experience and intuition used in several cases
  Reasons for limited benchmarking
    – Lack of compiler
    – Lack of flexible performance model
    – Lack of manpower and time
  Some of the following issues could probably have been avoided if we had done more benchmarking
Support for 64-bit Data Types
  VIRAM-1 supports 64-bit integer operations
  Excluding encryption, few multimedia applications require 64-bit operations
  Benefits from not supporting 64-bit operations
    – Large area savings from datapaths and pipeline registers
    – Large wiring savings from reduced width of data busses
    – Fewer modes to support and verify
Lack of DRAM Sub-banks
  The DRAM macro used has a single bank (no sub-banks)
    – No overlapping of accesses to different rows is allowed
  Significant performance bottleneck for applications with strided or random accesses
    – 4 addresses per cycle for 8 banks, each with a 5-cycle random access time
    – Bank conflicts reduce random bandwidth even further
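The bandwidth figures above can be worked through with a small model. The sketch below assumes one reading of the slide: up to 4 addresses are generated per cycle, there are 8 single-bank DRAM macros, and each is busy for 5 cycles per random access. The function names and the in-order, head-of-line-blocking issue model are simplifications chosen for illustration, not a description of the actual VIRAM-1 memory system.

```python
import random


def peak_random_rate(num_banks, busy_cycles):
    """Upper bound on accesses started per cycle when every bank is busy
    for `busy_cycles` cycles per access and there are no conflicts."""
    return num_banks / busy_cycles


def simulate(num_banks=8, busy_cycles=5, addrs_per_cycle=4,
             cycles=20000, seed=0):
    """Average accesses started per cycle for uniformly random bank targets.

    Assumed model: accesses issue in order, so an access that targets a
    busy bank also stalls every access behind it (head-of-line blocking).
    """
    rng = random.Random(seed)
    free_at = [0] * num_banks          # cycle at which each bank is free again
    pending = rng.randrange(num_banks)  # bank targeted by the next access
    started = 0
    for cycle in range(cycles):
        for _ in range(addrs_per_cycle):
            if free_at[pending] > cycle:
                break                   # bank conflict: wait for a later cycle
            free_at[pending] = cycle + busy_cycles
            started += 1
            pending = rng.randrange(num_banks)
    return started / cycles
```

Under these assumptions `peak_random_rate(8, 5)` gives 1.6 accesses per cycle, well below the 4 addresses generated per cycle, and `simulate()` reports a lower figure still once random bank conflicts are included. Sub-banks would raise the number of independently busy units and hence the bound.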
Other Issues We Got Wrong
  Dropping the decoupled pipeline
    – The “delayed pipeline” was preferred to a decoupled one due to complexity and power advantages, despite the performance issues
    – Due to the length of the pipeline and the lack of sub-banks, it is not obvious that this was a wise decision
  Use of a crossbar for memory transfers
    – The memory crossbar is the weakest design component in terms of scalability and flexibility
    – Alternative approaches (e.g. ring) were probably worth a closer examination before being rejected
Other Issues We Got Wrong
  Too much support for arithmetic exceptions
    – VIRAM-1 includes extensive support for software speculation, user-level handlers, and precise (slower) execution for arithmetic exceptions
    – Many of these features will never be used by the compiler, multimedia applications, or system software
  Too much support for conditional execution
    – VIRAM-1 implements all possible alternatives for vector conditional execution (masked instructions, masked merger, scatter-gather, compress-expand)
    – Some of them are quite complex to implement and not obviously needed for multimedia codes
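To illustrate why these alternatives overlap, the sketch below contrasts two of them, masked execution and compress-expand, on the same conditional update. Vector registers are modelled as Python lists, and all function names are hypothetical; this shows only the general techniques, not VIRAM-1 instruction semantics.

```python
def masked_op(values, mask, op):
    """Masked execution: every element position is visited, but the result
    of `op` is written only where the mask bit is set."""
    return [op(v) if m else v for v, m in zip(values, mask)]


def compress(values, mask):
    """Pack the elements whose mask bit is set into a dense, shorter vector."""
    return [v for v, m in zip(values, mask) if m]


def expand(packed, mask, original):
    """Scatter packed results back to the positions where the mask is set,
    keeping the original value everywhere else."""
    it = iter(packed)
    return [next(it) if m else o for m, o in zip(mask, original)]


def compress_expand_op(values, mask, op):
    """Compress-expand: operate only on the packed (active) elements."""
    packed = [op(v) for v in compress(values, mask)]
    return expand(packed, mask, values)


values = [1, -2, 3, -4]
mask = [v < 0 for v in values]   # condition: element is negative
assert masked_op(values, mask, abs) == compress_expand_op(values, mask, abs)
```

Both paths produce the same result. Compress-expand does less arithmetic when few mask bits are set, at the cost of extra data movement, which is one reason a design might reasonably support only a subset of the alternatives.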
What May Be Too Early To Call
  Full-custom design of integer datapaths
    – Optimal area and power consumption but requires significant design time
    – Maybe we could use an ASIC approach based on tiling specialized macro-cells or library components
  Use of two multipliers per vector lane
    – Most applications don’t have such a high ratio of multiply or multiply-add operations
    – Consumes a significant amount of area