Published by Karina Lattimore. Modified over 9 years ago.
2
Recap
Measuring and reporting performance
Quantitative principles
Performance vs Cost/Performance
3
Fallacies and Pitfalls
Fallacy: Peak performance tracks observed performance
  – Gap is often huge
  – E.g. Hitachi supercomputer 2 times faster than Cray (peak), but Cray is 2 times faster (real life!)
  – DEC Alpha: reported peak performance (assuming perfect pipeline and superscalar execution!)
  – Often used in supercomputer industry
  – Still a bad idea!
4
Fallacies and Pitfalls
Fallacy: Best design optimises the primary objective without considering implementation
  – Complex designs impact time to market, affecting competitiveness
  – E.g. Intel Itanium: two-year delay!
5
Fallacies and Pitfalls
Pitfall: Ignoring software costs
  – Hardware costs used to dominate, but software is now a significant cost factor (e.g. 50% for a midrange server)
  – Impacts on cost-performance
6
Fallacies and Pitfalls
Pitfall: Falling prey to Amdahl’s Law
  – Easy to get side-tracked into optimising some area that will have little overall impact
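To make the pitfall concrete: Amdahl's Law gives the overall speedup as 1 / ((1 - f) + f / s), where f is the fraction of execution time that is enhanced and s is the speedup of that fraction. The following is a minimal sketch; the function name and the sample fractions are illustrative assumptions, not figures from the slides.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when `fraction_enhanced` of the original execution
    time is accelerated by a factor of `speedup_enhanced` (Amdahl's Law)."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if speedup_enhanced <= 0.0:
        raise ValueError("speedup_enhanced must be positive")
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


# Hypothetical numbers: a 10x improvement to code that accounts for only
# 10% of run time barely moves the overall result.
small_target = amdahl_speedup(0.10, 10.0)
# The same 10x improvement applied to 80% of run time pays off far more.
large_target = amdahl_speedup(0.80, 10.0)
print(f"{small_target:.2f}x vs {large_target:.2f}x")  # 1.10x vs 3.57x
```

Even an infinite speedup of the 10% portion could not push the overall gain past 1 / 0.9, or about 1.11x, which is why measuring where time is spent should come before optimising.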
7
Fallacies and Pitfalls
Fallacy: Synthetic benchmarks predict real performance (since 1st Edition!)
  – Benchmarks are very susceptible to compiler and hardware optimisations
  – Examples:
      Compilers can discard 25% of Dhrystone!
      Whetstone doesn’t allow for some common, real optimisations!
      Compilers do benchmark-specific optimisations!
8
Fallacies and Pitfalls
Fallacy: MIPS is useful for performance comparison (also 1st Edition!)
  – Still popular (embedded processors)
  – MIPS depends on instruction set (useless for comparing different architectures)
  – MIPS varies between programs
  – MIPS can vary inversely to performance!
      E.g. FP in hardware/software
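The inverse relationship can be sketched numerically. MIPS is the instruction count divided by (execution time × 10^6). In the sketch below, the instruction counts and run times are made-up illustrative values, not measurements: the same program runs once with floating-point hardware and once with floating point emulated by many simple integer instructions.

```python
def mips(instruction_count: float, exec_time_seconds: float) -> float:
    """Millions of instructions per second for one program run."""
    return instruction_count / (exec_time_seconds * 1e6)


# Hypothetical run with FP hardware: few instructions, short run time.
hw_instructions, hw_seconds = 50e6, 1.0
# Hypothetical run with software FP: many simple instructions, long run time.
sw_instructions, sw_seconds = 300e6, 4.0

hw_mips = mips(hw_instructions, hw_seconds)
sw_mips = mips(sw_instructions, sw_seconds)

print(f"FP hardware: {hw_mips:.0f} MIPS in {hw_seconds:.1f} s")  # 50 MIPS in 1.0 s
print(f"FP software: {sw_mips:.0f} MIPS in {sw_seconds:.1f} s")  # 75 MIPS in 4.0 s
```

Under these assumed numbers the software-FP run reports the higher MIPS rating while taking four times as long, which is the sense in which MIPS can vary inversely to performance.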
9
Chapter 1: Concluding Comments
Several concepts that will be explored in more detail
Chapter 2: Instruction set architecture
Chapters 3 & 4: Pipelining
  – Appendix A: Basics
  – Chapter 3: Hardware techniques
  – Chapter 4: Compiler techniques
Chapter 5: Memory hierarchies
10
Historical Perspectives
Early computer history
History of performance measurement
  – Details of Whetstone, MIPS, SPEC, etc.
11
Chapter Two
Instruction Set Principles and Examples
EDSAC Instruction Set
12
Contents
Classification of architectures
Features that are relatively independent of instruction sets
“Different” Processors
  – DSP and media processors
Impact of compilers
13
Introduction
Use real programs for measurement
  – Results depend on programs and compilers used, but should be representative
  – Designers would consider much larger sets of programs
Measurements are usually dynamic
14
2.2. Classification of Instruction Set Architectures
Major criterion: CPU operand storage
Four main styles of architecture:
  – Stack (operands are implicit)
  – Accumulator (accumulator implicit, other operand explicit)
  – General-purpose register machines (operands explicit)
      Register-Memory
      Register-Register (or load-store)
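The difference between implicit and explicit operands can be sketched by evaluating C = A + B on three toy machines. The interpreters, mnemonics, and register names below are hypothetical illustrations rather than any real instruction set; the register-memory style is left out for brevity (it would let the add name a memory operand directly).

```python
def run_stack(program, memory):
    """Stack machine: ALU operands are implicit (the top two stack entries)."""
    stack = []
    for op, *args in program:
        if op == "push":
            stack.append(memory[args[0]])
        elif op == "add":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "pop":
            memory[args[0]] = stack.pop()
        else:
            raise ValueError(f"unknown stack op: {op}")
    return memory


def run_accumulator(program, memory):
    """Accumulator machine: one operand is implicitly the accumulator."""
    acc = 0
    for op, addr in program:
        if op == "load":
            acc = memory[addr]
        elif op == "add":
            acc += memory[addr]
        elif op == "store":
            memory[addr] = acc
        else:
            raise ValueError(f"unknown accumulator op: {op}")
    return memory


def run_load_store(program, memory):
    """Load-store machine: every operand is named; only load/store touch memory."""
    regs = {}
    for op, *args in program:
        if op == "load":
            dest, addr = args
            regs[dest] = memory[addr]
        elif op == "add":
            dest, src1, src2 = args
            regs[dest] = regs[src1] + regs[src2]
        elif op == "store":
            src, addr = args
            memory[addr] = regs[src]
        else:
            raise ValueError(f"unknown load-store op: {op}")
    return memory


stack_prog = [("push", "A"), ("push", "B"), ("add",), ("pop", "C")]
acc_prog = [("load", "A"), ("add", "B"), ("store", "C")]
ls_prog = [("load", "R1", "A"), ("load", "R2", "B"),
           ("add", "R3", "R1", "R2"), ("store", "R3", "C")]

for run, prog in ((run_stack, stack_prog), (run_accumulator, acc_prog),
                  (run_load_store, ls_prog)):
    result = run(prog, {"A": 2, "B": 3, "C": 0})
    print(run.__name__, len(prog), "instructions, C =", result["C"])
```

All three sequences store 5 in C. What differs is how many operands each instruction has to name: none for the stack add, one for the accumulator add, and three for the load-store add.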
15
Popularity
Early machines:
  – Stack and accumulator
Since 1980’s:
  – General register, load-store machines
16
Advantages of Registers
Fast!
Easier for compilers to use and optimise
Can hold variables
  – Reduces memory traffic
  – Increases performance
  – Decreases program size
  – Dedicated registers frustrate these goals
17
Classifying GPR Machines
Number of operands
  – Two or three
Number of operands that may be in memory
  – 0…3
18
Classifying GPR Machines
Type               Number of Operands   Memory Operands   Examples
Register-Register  3                    0                 SPARC, MIPS, etc.
Register-Memory    2                    1                 Intel 80x86, Motorola 68000
Memory-Memory      3                    3                 VAX
19
2.3. Memory Addressing
How is data accessed?
How is a memory address interpreted?
  – Big-/Little-Endian ordering
      Generally unnoticed, except when exchanging data
Alignment
  – Some machines insist on alignment (e.g. SPARC)
  – Other machines require multiple memory accesses for unaligned data
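Byte ordering and alignment can both be illustrated in a few lines. The sketch below uses Python's built-in `int.to_bytes` and `int.from_bytes`; the sample value, the addresses, and the `is_aligned` helper are illustrative assumptions.

```python
value = 0x12345678

# The same 32-bit value laid out in memory under each byte ordering.
big = value.to_bytes(4, byteorder="big")        # most significant byte first
little = value.to_bytes(4, byteorder="little")  # least significant byte first
print(big.hex(), little.hex())  # 12345678 78563412

# Exchanging data between machines: bytes written little-endian but read
# as big-endian silently produce a different number.
misread = int.from_bytes(little, byteorder="big")
print(hex(misread))  # 0x78563412


def is_aligned(address: int, size: int) -> bool:
    """True if an access of `size` bytes at `address` is naturally aligned."""
    return address % size == 0


print(is_aligned(0x1000, 4), is_aligned(0x1002, 4))  # True False
```

On a machine that insists on alignment, the 4-byte access at 0x1002 would be rejected; on others it may be split into multiple memory accesses.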
20
Data Addressing Modes
Ten common addressing modes
  – Register
  – Immediate
  – Displacement
  – Register indirect
  – Indexed
  – Direct (or absolute)
  – Memory indirect
  – Autoincrement/Autodecrement
  – Scaled
SPARC
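One way to see how the memory-referencing modes differ is to compute the effective address each one produces. The following is a rough sketch: the function name, parameter names, register names, and sample values are all hypothetical, and `size` stands for the operand size in bytes.

```python
def effective_address(mode, regs, mem, *, base=None, index=None, disp=0, size=1):
    """Return the memory address an operand refers to under `mode`.

    `regs` maps register names to values and `mem` maps addresses to stored
    words. Register and immediate modes are omitted because they do not
    reference memory.
    """
    if mode == "displacement":
        return regs[base] + disp
    if mode == "register_indirect":
        return regs[base]
    if mode == "indexed":
        return regs[base] + regs[index]
    if mode == "direct":
        return disp
    if mode == "memory_indirect":
        return mem[regs[base]]       # the register points at a stored address
    if mode == "autoincrement":
        address = regs[base]
        regs[base] += size           # register steps past the operand after use
        return address
    if mode == "autodecrement":
        regs[base] -= size           # register steps back before use
        return regs[base]
    if mode == "scaled":
        return disp + regs[base] + regs[index] * size
    raise ValueError(f"unknown addressing mode: {mode}")


regs = {"R1": 1000, "R2": 8, "R3": 2000}
mem = {2000: 4096}

print(effective_address("displacement", regs, mem, base="R1", disp=100))   # 1100
print(effective_address("indexed", regs, mem, base="R1", index="R2"))      # 1008
print(effective_address("memory_indirect", regs, mem, base="R3"))          # 4096
print(effective_address("scaled", regs, mem, base="R1", index="R2",
                        disp=100, size=4))                                 # 1132
print(effective_address("autoincrement", regs, mem, base="R2", size=4))    # 8
print(regs["R2"])                                                          # 12
```

Only autoincrement and autodecrement change register state as a side effect; the other modes are pure address calculations.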
21
Usage of Addressing Modes
Measurements based on VAX
  – TeX, Spice, gcc
Immediate and Displacement addressing dominate
22
Fig 2.7
23
Displacement Addressing Mode
Wide variation in displacement (offset) values (Alpha, using SPEC2000)
24
Immediate Addressing Mode
Mainly used for comparisons and ALU ops
Overall (Alpha):
  – 21% of instructions (integer)
  – 16% of instructions (fp)
Range of values (Alpha):
  – Mainly small (< 12 bits)
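Whether an immediate counts as "small" depends on how many bits its encoding needs. As a minimal sketch, assuming immediates are stored as two's-complement signed values (an assumption, since the slide does not state the encoding), the width a given constant needs can be computed as follows; both helper names are hypothetical.

```python
def signed_bits_needed(value: int) -> int:
    """Smallest two's-complement width (in bits) that can represent `value`."""
    if value >= 0:
        return value.bit_length() + 1   # magnitude bits plus a sign bit
    return (~value).bit_length() + 1    # e.g. -128 needs exactly 8 bits


def fits_immediate(value: int, field_bits: int) -> bool:
    """True if `value` fits in a signed immediate field of `field_bits` bits."""
    return signed_bits_needed(value) <= field_bits


for constant in (0, 1, -1, 127, -128, 2047, 2048):
    print(constant, signed_bits_needed(constant), fits_immediate(constant, 12))
```

With a 12-bit field, every value from -2048 to 2047 fits, so 2048 is the first constant in the list that would need a wider encoding or a separate load.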
25
Figure 2.10