Lecture 1: Introduction
Course Outline
The aim of this course: an introduction to the methods and techniques of performance analysis of computer systems.
Solve computer performance analysis problems related to:
  measuring the performance of computer systems
  comparing computer systems
  predicting future performance under different configurations
  designing new applications that meet performance requirements
  planning capacity
Hands-on experiments on modern hardware/software systems
Course Outline
1. Introduction
2. Hardware and software aspects of computer systems
3. Performance metrics
4. Performance measurement tools and techniques
5. Benchmarking
6. Statistical analysis of performance experiments
7. Design of experiments
8. Processor performance: ALU, pipelining, optimizing program performance
9. Memory hierarchy: cache performance, optimizing program performance
10. Performance of multiprocessor systems
11. Simulation
12. Queueing theory
Course Outline
Textbook:
  D. Lilja, “Measuring Computer Performance: A Practitioner's Guide”, Cambridge University Press
Reference Books:
  R. Jain, “The Art of Computer Systems Performance Analysis”, John Wiley
  P.J. Fortier, H.E. Michel, “Computer Systems Performance Evaluation and Prediction”, Digital Press
  K.R. Wadleigh, I.L. Crawford, “Software Optimization for High Performance Computing”, Prentice Hall
  R.E. Bryant, D.R. O’Hallaron, “Computer Systems: A Programmer’s Perspective”, Pearson
  J.L. Hennessy, D.A. Patterson, “Computer Architecture”, Morgan Kaufmann
  K.R. Wadleigh, I.L. Crawford, “High Performance Computing”, Prentice Hall
Course Outline
Grading:
  Assignments 30%
  Midterm 30%
  Final Exam 40%
Performance Evaluation of Computer Systems
Computer systems consist of: Processor, Memory, Input/Output, Operating system, Network
[Figure: block diagram with Memory (instructions, data), Processor, Input unit and Output unit; several processors (P) connected by a Network]
Performance Evaluation of Computer Systems Performance depends on: Technology
In recent years, microprocessors have become smaller and denser.
[Table: ENIAC compared with a laptop by number of devices, weight (kg), size (m^3), power (watts), cost ($), memory (bytes), performance (flop/s)]
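Moore’s Law Gordon Moore predicted in 1965 that the number of transistors on a semiconductor chip would double roughly every year; in 1975 he revised the period to about two years. The commonly quoted figure of roughly 18 months is a later popularization.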
Moore’s Law Number of transistors and performance double every 1.5 years.
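The doubling rule on this slide can be turned into a small calculation. The sketch below is illustrative only: the function name `growth_factor` and the chosen time spans are assumptions, not part of the course material.

```python
def growth_factor(years: float, doubling_period: float = 1.5) -> float:
    """Return the multiplicative growth after `years` for a quantity
    that doubles every `doubling_period` years (1.5 years, as on the slide)."""
    return 2.0 ** (years / doubling_period)


# Hypothetical time spans, chosen only to show how fast the growth compounds.
for years in (1.5, 3, 15, 30):
    print(f"after {years:>4} years: x{growth_factor(years):,.0f}")
```

With a 1.5-year doubling period, 15 years corresponds to ten doublings, a factor of 2^10 = 1024.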
Top500 List at June 2013
Rank  Computer    Country  Vendor   Processor + accelerator + interconnect
1     Tianhe-2    China    NUDT     Xeon 2.2 GHz + Xeon Phi coprocessor + custom
2     Titan       USA      Cray     Opteron 2.2 GHz + Nvidia GPU + Cray Gemini
3     Sequoia     USA      IBM      BlueGene 1.6 GHz + custom
4     K computer  Japan    Fujitsu  Sparc64 2.0 GHz + Tofu
5     Mira        USA      IBM      BlueGene 1.6 GHz + custom
Further columns: # cores, R max (Pflop/s), R peak (Pflop/s)
Performance Units
Speed
  1 Mflop/s   1 Megaflop/s   10^6 flop/second
  1 Gflop/s   1 Gigaflop/s   10^9 flop/second
  1 Tflop/s   1 Teraflop/s   10^12 flop/second
  1 Pflop/s   1 Petaflop/s   10^15 flop/second
  1 Eflop/s   1 Exaflop/s    10^18 flop/second
Storage
  1 MB   1 Megabyte   10^6 bytes
  1 GB   1 Gigabyte   10^9 bytes
  1 TB   1 Terabyte   10^12 bytes
  1 PB   1 Petabyte   10^15 bytes
Moore’s Law
Limits of Moore’s Law:
  Moore’s Law is exponential, and exponentials cannot last forever.
  Heat is a problem in today’s CPUs.
  The size of atoms is the fundamental barrier.
Moore’s Law Reinterpreted
The number of cores per chip doubles every 2 years, while clock speed decreases.
Multicore architectures
Performance Evaluation of Computer Systems Performance depends on: Technology, Instruction Set Architecture
Instruction Set Architecture (ISA)
Instruction set design:
  RISC / CISC
  Code density
  Number of operands:
    Stack machines (0-operand)
    Accumulator machines (1-operand)
    Register machines (2-operand, 3-operand)
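To make the operand-count distinction concrete, the sketch below evaluates a = b + c on a toy 0-operand stack machine. The instruction names (`PUSH`, `ADD`, `POP`) and the `run_stack_machine` helper are hypothetical and chosen only for illustration; they do not correspond to any real ISA.

```python
def run_stack_machine(program, memory):
    """Execute a toy 0-operand (stack) program.

    The arithmetic instruction names no operands: it implicitly uses the
    top of the stack. Only PUSH and POP name a memory location.
    """
    stack = []
    for instr in program:
        op = instr[0]
        if op == "PUSH":      # push memory[name] onto the stack
            stack.append(memory[instr[1]])
        elif op == "ADD":     # pop two values, push their sum
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "POP":     # pop the top of the stack into memory[name]
            memory[instr[1]] = stack.pop()
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


# a = b + c takes four instructions on this stack machine.
program = [("PUSH", "b"), ("PUSH", "c"), ("ADD",), ("POP", "a")]
memory = run_stack_machine(program, {"b": 2, "c": 3})
print(memory["a"])
```

A 3-operand machine could express the same computation in a single instruction of the form `ADD a, b, c`, at the cost of a longer instruction encoding, which is one way the number of operands interacts with code density.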
Performance Evaluation of Computer Systems Performance depends on: Technology, Instruction Set Architecture, Organization
Memory Hierarchy
Hierarchy                                                  Speed    Size
Within the processor (CPU registers, on-chip cache)        1 ns     Bytes
L2 cache (SRAM)                                            10 ns    KBytes
Main memory (DRAM)                                         100 ns   MBytes
Secondary storage (disk)                                   10 ms    GBytes
Tertiary storage (tape/disk)                               10 s     TBytes
[Figure: hierarchy from CPU registers through L1 cache, L2 cache, main memory and disk to tape]
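One way to see why the hierarchy matters is the average access time of a cache placed in front of main memory. The sketch below uses two latencies from the table (10 ns for the L2 cache, 100 ns for main memory); the hit rates are made-up values for illustration only, and the function name `average_access_time` is an assumption.

```python
def average_access_time(hit_time_ns, miss_penalty_ns, hit_rate):
    """Average access time = hit time + miss rate * miss penalty."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_time_ns + (1.0 - hit_rate) * miss_penalty_ns


# Hypothetical hit rates; latencies taken from the table above.
for hit_rate in (0.80, 0.95, 0.99):
    t = average_access_time(hit_time_ns=10, miss_penalty_ns=100, hit_rate=hit_rate)
    print(f"hit rate {hit_rate:.0%}: {t:.1f} ns on average")
```

Under these assumptions, every access pays the cache latency and only the misses pay the additional main-memory latency, so a higher hit rate pulls the average toward the speed of the faster level.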
Organization
Manycore chips: single-core and dual-core
[Figure: single-core chip (CPU registers, L1 cache, L2 cache, main memory) beside a dual-core chip (two cores, each with CPU registers and L1 cache, with one L2 cache and main memory)]
Performance Evaluation of Computer Systems Performance depends on: Technology, Instruction Set Architecture, Organization, Software
The primary duty of software developers is to create functionally correct programs. Performance evaluation is the part of software development that makes those programs perform well.
Performance Analysis Cycle
Have an optimization phase, just like the testing and debugging phase.
[Figure: cycle of Measure, Analyze and Modify / Tune between Code Development (functionally complete and correct program) and Usage (complete, correct and well-performing program)]
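The measure and modify steps of the cycle can be sketched with Python's standard `timeit` module. The two functions below are hypothetical stand-ins for an original and a tuned version of the same code; the sketch first checks that both give the same result, since tuning starts from a functionally correct program.

```python
import timeit


def sum_of_squares_loop(n):
    """Original version: explicit loop over 0 .. n-1."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def sum_of_squares_formula(n):
    """Tuned version: closed form for 0^2 + 1^2 + ... + (n-1)^2."""
    return (n - 1) * n * (2 * n - 1) // 6


n = 100_000
# Correctness first: the tuned version must agree with the original.
assert sum_of_squares_loop(n) == sum_of_squares_formula(n)

# Measure: keep the best of several repetitions to reduce timing noise.
for func in (sum_of_squares_loop, sum_of_squares_formula):
    best = min(timeit.repeat(lambda: func(n), number=10, repeat=5))
    print(f"{func.__name__}: {best / 10:.6f} s per call")
```

The measured times depend on the machine, so the sketch prints them rather than asserting any particular speedup.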
Systematic Approach to Performance Evaluation
1. Define the system
2. List services offered by the system
3. Select performance metrics
4. List system and workload parameters
5. Select factors and their values
6. Select evaluation technique
7. Select the workload
8. Design the experiment
9. Analyze the data
10. Present the results
1. Define the system
An example: a client and a server connected by a network
2. List services offered by the system
Service: Remote procedure call
3. Select performance metrics
Metrics:
  Time taken for the service: elapsed time, local CPU time, remote CPU time
  The rate at which the service can be performed: calls per second
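The time and rate metrics above can be collected on the client with Python's standard `time` module. In this sketch `remote_call` is a hypothetical placeholder that merely sleeps to imitate network and server delay; a real experiment would invoke the actual remote procedure, and remote CPU time would have to be measured on the server.

```python
import time


def remote_call():
    """Hypothetical stand-in for a remote procedure call."""
    time.sleep(0.001)  # imitate waiting on the network and the server


def measure(calls):
    """Return (elapsed seconds, local CPU seconds, calls per second)."""
    wall_start = time.perf_counter()   # elapsed (wall-clock) time
    cpu_start = time.process_time()    # CPU time of this process only
    for _ in range(calls):
        remote_call()
    elapsed = time.perf_counter() - wall_start
    local_cpu = time.process_time() - cpu_start
    return elapsed, local_cpu, calls / elapsed


elapsed, local_cpu, rate = measure(calls=100)
print(f"elapsed time:   {elapsed:.4f} s")
print(f"local CPU time: {local_cpu:.4f} s")
print(f"rate:           {rate:.1f} calls/s")
```

Because the client spends most of each call waiting, elapsed time is expected to be larger than local CPU time, which is why the two are listed as separate metrics.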
4. List system and workload parameters
System parameters:
  Speed of the network
  Speed of the local CPU
  Speed of the remote CPU
  Operating system overhead
Workload parameters:
  Time between successive calls
  Number and sizes of the call parameters
5. Select factors and their values
Factors are the parameters to be varied, and their values are called levels. For example:
  Factor: speed of the network; 2 levels: short distance (in the campus), long distance (across the country)
  Factor: sizes of the call parameters; 2 levels: small, large
  Factor: number of consecutive calls; 11 levels: 1, 2, 4, 8, … 1024
6. Select evaluation technique
Three techniques: analytical modeling, simulation, measuring the real system
7. Select the workload
Depending on the evaluation technique, the workload may be expressed in different forms.
  Analytical modeling: probability of various requests
  Simulation: a trace of requests measured on a real system
  Measurement: user programs
8. Design the experiment
In the example: 2 x 2 x 11 = 44 experiments
Phase 1: the number of factors is large but the number of levels is small
Phase 2: reduce the number of factors and increase the number of levels
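The 44 experiments in the example come from taking every combination of the factor levels chosen in step 5. A minimal sketch using the standard `itertools.product`; the dictionary keys and level labels are shorthand chosen here, and the 11 levels are assumed to be the powers of two from 1 to 1024.

```python
from itertools import product

# Factors and levels from the example; key names are illustrative shorthand.
factors = {
    "network": ["short distance", "long distance"],     # 2 levels
    "parameter_size": ["small", "large"],               # 2 levels
    "consecutive_calls": [2 ** k for k in range(11)],   # 1, 2, 4, ..., 1024
}

# Full factorial design: one experiment per combination of levels.
experiments = [dict(zip(factors, levels)) for levels in product(*factors.values())]

print(len(experiments))   # number of combinations: 2 * 2 * 11
print(experiments[0])     # the first combination of levels
```

Each entry of `experiments` is one configuration to run, which makes it easy to see how quickly the count grows when a factor or a level is added.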
9. Analyze the data
Analysis of variance, regression, etc.
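As one example of the regression analysis named above, the sketch below fits a least-squares line to (x, y) pairs, for instance call-parameter size against elapsed time. The data points are invented purely to exercise the code and are not measurements; `fit_line` is a hypothetical helper.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Invented data, not measurements: y is exactly 2 + 0.5 * x.
sizes = [1, 2, 4, 8]
times = [2.5, 3.0, 4.0, 6.0]
intercept, slope = fit_line(sizes, times)
print(f"time = {intercept:.2f} + {slope:.2f} * size")
```

With real measurements the points would scatter around the line, and the fit would be judged with the statistical tools covered later in the course.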
10. Present the results Use graphical form to represent the data rather than statistical results