1 CS 161 Introduction to Programming and Problem Solving
Chapter 4: Computer Taxonomy
Herbert G. Mayer, PSU
Status 10/11/2014
2 Syllabus
- Introduction
- Common Architecture Attributes
- General Limitations
- Data-Stream Instruction-Stream
- Generic Architecture Model
- Instruction Set Architecture (ISA)
- Iron Law of Performance
- Hello World in C++
- References
3 Introduction: Uniprocessors
- Single Accumulator Architectures, earliest in the 1940s; e.g. Atanasoff, Zuse, von Neumann
- General-Purpose Register Architectures (GPR)
- 2-Address Architecture, i.e. GPR with one operand implied, e.g. IBM 360
- 3-Address Architecture, i.e. GPR with all operands of an arithmetic operation explicit, e.g. VAX 11/70
- Stack Machines, e.g. B5000, B6000, HP3000
- Pipelined Architecture, e.g. CDC 5000, Cyber 6000
- Vector Architecture, e.g. Amdahl 470/6, competing with IBM’s 360 in the 1970s; blurs the line to multiprocessors
4 Introduction: Multiprocessors
- Shared Memory Architecture; e.g. Illiac IV, BSP
- Distributed Memory Architecture
- Systolic Architecture; see Intel® iWarp and CMU’s Warp architecture
- Data Flow Machine; see Jack Dennis’ work at MIT
5 Introduction: Hybrid Architectures
- Superscalar Architecture; see Intel 80860, AKA i860
- VLIW Architecture; see the Multiflow computer, or systolic array architectures like Warp at CMU Pittsburgh or iWarp at Intel in the 1990s
- Pipelined Architecture; debatable whether it is a hybrid architecture
- EPIC Architecture; see the HP and Intel® Itanium® architecture
6 Common Architecture Attributes
- Main memory (main store), external to the processor
- Program instructions are stored in main memory
- Data are also stored in main memory; typical for the von Neumann architecture
- Data are distributed over static memory, stack, heap, reserved OS space, free space, and IO space
- Instruction pointer (AKA instruction counter, program counter, pc), plus other special registers
- Von Neumann memory bottleneck: everything travels on the same, single bus
7 Common Architecture Attributes
- Accumulator (register, 1 or many) holds the result of an arithmetic-logical operation
- Memory controller handles memory access requests from the processor; moves bits to/from memory; is part of the “chipset”
- Current trend is to move some of the memory controller or IO controller onto the CPU chip; caveat: that does not mean the chipset IS part of the CPU!
- Logical processor unit includes: FP unit, integer unit, control unit, register file, load-store unit, pathways
- Physical processor unit includes: heat sensors, frequency control, voltage regulator, and more
8 General Limitations
- Compute-Bound: type of application in which the vast majority of execution time is spent executing instructions; data flow is register-to-register; time to access memory is a small % of the overall time
- Memory-Bound: application in which the majority of execution time is spent loading and storing data in memory; time executing instructions is a small % vs. time to access memory
- IO-Bound: application in which the majority of execution time is spent accessing secondary storage; time executing instructions, even the time accessing memory, is a small % vs. time to access secondary storage
- Backup-Bound (semi-serious only): like IO-Bound, but the backup storage medium can be even slower than typical secondary storage devices
9 Data-Stream Instruction-Stream
Classification developed by Michael J. Flynn:
1. Single-Instruction, Single-Data Stream (SISD) Architecture: PDP
2. Single-Instruction, Multiple-Data Stream (SIMD) Architecture: Array Processors, Solomon, Illiac IV, BSP, TMC
3. Multiple-Instruction, Single-Data Stream (MISD) Architecture: Pipelined architecture
4. Multiple-Instruction, Multiple-Data Stream (MIMD) Architecture: true multiprocessor
10 Generic Architecture Model
11 Instruction Set Architecture (ISA)
- ISA is the boundary between software and hardware
- Specifies the logical machine visible to the programmer and compiler
- Is the functional specification for the processor designer
- That boundary is sometimes a very low-level piece of system SW that handles exceptions, interrupts, and HW-specific services that could fall into the domain of the OS
12 Instruction Set Architecture (ISA)
What is specified, what is typical for an ISA:
- Operations: what to perform and in which order
- Active, temporary operand storage for the CPU; can be accumulator, stack, register, or memory
  - Note that a stack can be word-sized, even bit-sized (e.g. the extreme design of a successor to NCR’s Century architecture of the 1970s)
- Number of operands per instruction; some implied, others listed explicitly
- Operand location: where and how to locate/specify the operands: register, literal, data in memory
- Type and size of operands: bit, byte, word, double-word, ...
- Instruction encoding in binary
- Data types: int, float, double, decimal, char, bit
13 Instruction Set Architecture (ISA)
14 Iron Law of Performance
- Clock rate doesn’t count! Bus width doesn’t count. The number of registers and of operations executed in parallel doesn’t count!
- What counts is how long it takes for my computational task to complete. That time is of the essence in computing!
- If a MIPS-based system running at 1 GHz completes a program X in 2.2 minutes, while an Intel Pentium® 4-based system running at 3 GHz completes that same program X in 5.5 minutes, programmers are happier with the MIPS solution! Who then cares about the clock rate?
- If a solution on an Intel CPU can be expressed in an object program of size Y bytes, but on some other IBM architecture needs 1.86 * Y bytes, the Intel solution is generally more attractive
- Meaning of this:
  - Wall-clock time (Time) counts, i.e. the time I have to wait for completion
  - Program size is the overall complexity of the computational task
15 Iron Law of Performance
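The figure for this slide is not reproduced in this transcript. As an assumption about what it showed, the Iron Law is commonly written as a product of three factors, which explains why clock rate (the last factor) alone does not determine run time:

```latex
\[
\frac{\text{Time}}{\text{Program}}
  = \frac{\text{Instructions}}{\text{Program}}
  \times \frac{\text{Cycles}}{\text{Instruction}}
  \times \frac{\text{Time}}{\text{Cycle}}
\]
```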
16 Hello World in C++
// will show later in C with printf()
#include <iostream>
using namespace std;

int main( void )
{ // main
    cout << "Hello World!" << endl;
    // in C: printf( "Hello World!\n" );
    return 0;
} // end main
17 References