Lecture 2: Fundamentals of Computer Design Kai Bu


1 Lecture 2: Fundamentals of Computer Design Kai Bu kaibu@zju.edu.cn http://list.zju.edu.cn/kaibu/comparch

2 Chapter 1

3 Transition from single processor to multiple processors; Quantitative approach: uses empirical observations of programs, experimentation, and simulation as its tools

4 Outline Classes of computers Parallelism Instruction Set Architecture Trends Dependability Performance Measurement

5 Outline Classes of computers Parallelism Instruction Set Architecture Trends Dependability Performance Measurement

6 5 Classes of Computers

7 PMD: Personal Mobile Device Wireless devices with multimedia user interfaces cell phones, tablet computers, etc. a few hundred dollars

8 PMD Characteristics Cost effectiveness: less expensive packaging; absence of fan for cooling. Responsiveness & Predictability: real-time performance means a maximum execution time for each app segment; soft real-time means an average time constraint that tolerates an occasionally missed time constraint on an event. Memory efficiency: optimize code size. Energy efficiency: battery power, heat dissipation

9 Desktop Computing Largest market share low-end netbooks: $x00 … high-end workstations: $x000

10 Desktop Characteristics Price-Performance combination of performance and price; compute performance graphics performance The most important to customers, and hence to computer designers

11 Servers Provide large-scale and reliable file and computing services (to desktops) Constitute the backbone of large-scale enterprise computing

12 Servers Characteristics Availability against server failure Scalability in response to increasing demand with scaling up computing capacity, memory, storage, and I/O bandwidth Efficient throughput toward more requests handled in a unit time

13 Why Server Availability

14 Clusters/WSCs Warehouse-Scale Computers collections of desktop computers or servers connected by local area networks to act as a single larger computer Characteristics price-performance, power, availability

15 Embedded Computers hide everywhere

16 Embedded vs Non-embedded Dividing line: the ability to run third-party software. Embedded computers’ primary goal: meet the performance need at a minimum price, rather than achieve higher performance at a higher price

17 Outline Classes of computers Parallelism Instruction Set Architecture Trends Dependability Performance Measurement

18 Application Parallelism DLP: Data-Level Parallelism many data items being operated on at the same time TLP: Task-Level Parallelism tasks of work created to operate independently and largely in parallel

19 Hardware Parallelism Computer hardware exploits two kinds of application parallelism in four major ways: Instruction-Level Parallelism Vector Architectures and GPUs Thread-Level Parallelism Request-Level Parallelism

20 Hardware Parallelism Instruction-Level Parallelism exploits data-level parallelism at modest levels with pipelining and at medium levels with speculative execution

21 Hardware Parallelism Vector Architectures & GPUs (Graphics Processing Units) exploit data-level parallelism by applying a single instruction to a collection of data in parallel

22 Hardware Parallelism Thread-Level Parallelism exploits either DLP or TLP in a tightly coupled hardware model that allows for interaction among parallel threads

23 Hardware Parallelism Request-Level Parallelism exploits parallelism among largely decoupled tasks specified by the programmer or the OS

24 Classes of Parallel Architectures by Michael Flynn according to the parallelism in the instruction and data streams called for by the instructions at the most constrained component of the multiprocessor: SISD, SIMD, MISD, MIMD

25 SISD Single instruction stream, single data stream – uniprocessor Can exploit instruction-level parallelism

26 SIMD Single instruction stream, multiple data streams The same instruction is executed by multiple processors using different data streams. Exploits data-level parallelism Each processor has its own data memory, but there is a single instruction memory and control processor.

27 MISD Multiple instruction streams, single data stream No commercial multiprocessor of this type yet

28 MIMD Multiple instruction streams, multiple data streams Each processor fetches its own instructions and operates on its own data. Exploits task-level parallelism

29 Outline Classes of computers Parallelism Instruction Set Architecture Trends Dependability Performance Measurement

30 Instruction Set Architecture ISA actual programmer-visible instruction set the boundary between software and hardware 7 major dimensions

31 ISA: Class Most are general-purpose register architectures with operands of either registers or memory locations Two popular versions register-memory ISA: e.g., 80x86 many instructions can access memory load-store ISA: e.g., ARM, MIPS only load or store instructions can access memory

32 ISA: Memory Addressing Byte addressing Aligned address object width: s bytes address: A aligned if A mod s = 0

33 Each misaligned object requires two memory accesses
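The alignment rule from the previous slide (an object of width s bytes at address A is aligned if A mod s = 0) can be checked directly. This is a minimal sketch; the function name `is_aligned` and the sample addresses are illustrative and not from the slides.

```python
def is_aligned(address: int, size: int) -> bool:
    """Return True if an object of `size` bytes at byte `address` is aligned.

    Implements the slide's rule: aligned if address mod size == 0.
    """
    return address % size == 0


# Illustrative addresses (assumed, not from the lecture).
print(is_aligned(8, 4))  # 4-byte word at address 8
print(is_aligned(6, 4))  # 4-byte word at address 6
```

A 4-byte word at address 6 fails the check, which is the case where the slide's note about two memory accesses applies.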

34 ISA: Addressing Modes Specify the address of a memory object Register, Immediate, Displacement

35 ISA: Types and Sizes of Operands
Type: size in bits
ASCII character: 8
Unicode character, half word: 16
Integer, word: 32
Double word, long integer: 64
IEEE 754 floating point, single precision: 32
IEEE 754 floating point, double precision: 64
Floating point, extended double precision: 80

36 MIPS64 Operations Data transfer

37 MIPS64 Operations Arithmetic Logical

38 MIPS64 Operations Control

39 MIPS64 Operations Floating point

40 ISA: Control Flow Instructions Types: conditional branches, unconditional jumps, procedure calls, returns Branch address: add an address field to PC (program counter)

41 ISA: Encoding an ISA Fixed length: ARM, MIPS – 32 bits Variable length: 80x86 – 1~18 bytes http://en.wikipedia.org/wiki/MIPS_architecture Start with a 6-bit opcode. R-type: three registers, a shift amount field, and a function field; I-type: two registers, a 16-bit immediate value; J-type: a 26-bit jump target.
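The three 32-bit MIPS formats described above can be illustrated by slicing fields out of an instruction word. This is a sketch that assumes the standard MIPS32 field order (R-type: opcode, rs, rt, rd, shamt, funct; I-type: opcode, rs, rt, immediate; J-type: opcode, target); the slide names the fields but not their order, and the function names are hypothetical.

```python
def decode_r(word: int) -> dict:
    """View a 32-bit word as R-type: 6-bit opcode, three 5-bit registers,
    a 5-bit shift amount, and a 6-bit function field."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,
        "shamt": (word >> 6) & 0x1F,
        "funct": word & 0x3F,
    }


def decode_i(word: int) -> dict:
    """View a 32-bit word as I-type: 6-bit opcode, two registers, 16-bit immediate."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "imm": word & 0xFFFF,
    }


def decode_j(word: int) -> dict:
    """View a 32-bit word as J-type: 6-bit opcode, 26-bit jump target."""
    return {"opcode": (word >> 26) & 0x3F, "target": word & 0x3FFFFFF}


# 0x012A4020 encodes `add $t0, $t1, $t2` in MIPS32 (registers 8, 9, 10).
print(decode_r(0x012A4020))
```

Because every instruction is 32 bits wide and starts with the same 6-bit opcode, the decoder only needs fixed shifts and masks, which is the practical advantage of a fixed-length encoding.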

42 Computer Architecture ISA: actual programmer-visible instruction set; boundary between software and hardware. Organization: high-level aspects of computer design: memory system, memory interconnect, design of internal processor or CPU. Hardware: computer specifics: logic design, packaging technology.

43 Outline Classes of computers Parallelism Instruction Set Architecture Trends Dependability Performance Measurement

44 Five Critical Implementation Technologies Integrated circuit logic technology Semiconductor DRAM Semiconductor flash Magnetic disk technology Network technology

45 Integrated circuit logic technology Moore’s Law: a growth rate in transistor count on a chip of about 40% to 55% per year, i.e., the count doubles every 18 to 24 months
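As a quick consistency check on the two ways the growth rate is stated, an annual growth rate g implies a doubling time of ln 2 / ln(1 + g) years. The sketch below assumes steady compound growth; the function name is illustrative.

```python
import math


def doubling_time_months(annual_growth: float) -> float:
    """Months needed to double under compound growth of `annual_growth` per year."""
    return 12 * math.log(2) / math.log(1 + annual_growth)


for rate in (0.40, 0.55):
    print(f"{rate:.0%} per year -> doubles in about {doubling_time_months(rate):.1f} months")
```

The results land at roughly 19 to 25 months, close to the 18 to 24 months quoted on the slide.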

46 Semiconductor DRAM Capacity per DRAM chip doubles roughly every 2 or 3 years

47 Semiconductor Flash Electrically erasable programmable read-only memory (EEPROM) Capacity per Flash chip doubles roughly every two years In 2011, 15 to 20 times cheaper per bit than DRAM

48 Magnetic Disk Technology Since 2004, density doubles every three years 15 to 20 times cheaper per bit than Flash 300 to 500 times cheaper per bit than DRAM For server and warehouse scale storage

49 Network Technology Switches Transmission systems

50 Performance Trends Bandwidth/Throughput the total amount of work done in a given time; Latency/Response Time the time between the start and the completion of an event;

51 Bandwidth over Latency

52 Trends in Power and Energy Power = Energy per unit time 1 watt = 1 joule per second energy to execute a workload = avg power x execution time Three primary concerns the max power for a processor sustained power consumption energy and energy efficiency

53 Trends in Power and Energy Sustained power consumption Metric: TDP Thermal Design Power determines cooling requirement Heat management 1. reduce clock rate and hence power as the thermal temperature approaches the junction temperature limit; 2. if 1 is not working, power down the chip.

54 Trends in Power and Energy Energy and Energy Efficiency energy to execute a workload = avg power x execution time Example: processor A has 20% higher average power consumption than processor B, but A executes the task in 70% of the time B needs. Which is more energy efficient, A or B?

55 Trends in Power and Energy Example: processor A has 20% higher average power consumption than processor B, but A executes the task in 70% of the time B needs. Which is more energy efficient, A or B? EnergyConsumptionA = 1.2 x 0.7 x EnergyConsumptionB = 0.84 x EnergyConsumptionB, so A is more energy efficient.
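The comparison follows from energy = average power x execution time. A minimal sketch, with processor B normalized to 1.0 for both power and time (the normalization and the function name are assumptions for illustration):

```python
def relative_energy(power_ratio: float, time_ratio: float) -> float:
    """Energy of A relative to B, given A's power and time relative to B.

    Uses energy = average power x execution time.
    """
    return power_ratio * time_ratio


# A draws 20% more power than B but finishes in 70% of B's time.
ratio = relative_energy(power_ratio=1.2, time_ratio=0.7)
print(f"Energy of A relative to B: {ratio:.2f}")
```

A ratio below 1 means A uses less energy for the same task, even though its average power is higher.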

56 Trends in Power and Energy Primary energy consumption within a microprocessor is for switching transistors – dynamic energy logic transition: 0->1->0 or 1->0->1 The energy of a single transition

57 Trends in Power and Energy The power required per transistor For a fixed task, slowing clock rate (frequency) reduces power, but not energy.
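The formulas on these two slides did not survive in the transcript. The standard CMOS relations from the course textbook (Hennessy and Patterson, Chapter 1) are most likely what was shown; they are reproduced here as an assumption. For a single transition (0->1 or 1->0) and for the power per transistor:

```latex
\[
\text{Energy}_{\text{dynamic}} \propto \tfrac{1}{2} \times \text{Capacitive load} \times \text{Voltage}^2
\]
\[
\text{Power}_{\text{dynamic}} \propto \tfrac{1}{2} \times \text{Capacitive load} \times \text{Voltage}^2 \times \text{Frequency switched}
\]
```

Frequency appears only in the power expression, which is why slowing the clock for a fixed task reduces power but not energy.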

58 Trends in Power and Energy Example some microprocessors with adjustable voltage; 15% reduction in voltage -> 15% reduction in frequency; the impact on dynamic energy and dynamic power?

59 Trends in Power and Energy Answer
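The answer slide's content is missing from the transcript. Under the usual assumption that dynamic energy scales with voltage squared and dynamic power with voltage squared times frequency, the ratios can be worked out as follows (a sketch, not necessarily the slide's exact presentation; names are illustrative):

```python
def dynamic_energy_ratio(voltage_scale: float) -> float:
    """New/old dynamic energy, assuming energy is proportional to voltage^2."""
    return voltage_scale ** 2


def dynamic_power_ratio(voltage_scale: float, frequency_scale: float) -> float:
    """New/old dynamic power, assuming power is proportional to voltage^2 x frequency."""
    return voltage_scale ** 2 * frequency_scale


# 15% reduction in voltage and in frequency: both scale by 0.85.
print(f"Energy ratio: {dynamic_energy_ratio(0.85):.2f}")
print(f"Power ratio:  {dynamic_power_ratio(0.85, 0.85):.2f}")
```

Dynamic energy drops to about 72% of the original and dynamic power to about 61%.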

60 Trends in Power and Energy Challenges distributing the power removing the heat preventing hot spots potential research topics

61 Trends in Power and Energy Energy-efficiency improvement techniques 1. do nothing well turn off the clock of inactive modules 2. DVFS: dynamic voltage-frequency scaling scale down clock frequency and voltage during periods of low activity

62 DVFS

63 Trends in Power and Energy Energy-efficiency improvement techniques 3. design for typical case PMDs, laptops – often idle memory and storage with low power modes to save energy 4. overclocking the chip runs at a higher clock rate for a short time until temperature rises

64 Trends in Cost Cost of an Integrated Circuit wafer for test; chopped into dies for packaging

65 Trends in Cost Cost of an Integrated Circuit percentage of manufactured devices that survives the testing procedure

66 Trends in Cost Cost of an Integrated Circuit

67 Trends in Cost Cost of an Integrated Circuit

68 Intel Core i7 Die

69 Trends in Cost Example

70

71 Trends in Cost Example

72 Trends in Cost Cost of an Integrated Circuit N: process-complexity factor for measuring manufacturing difficulty
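The cost formulas on these slides were images and are absent from the transcript. The sketch below uses the textbook model (Hennessy and Patterson, Chapter 1), which is an assumption about what the slides showed: dies per wafer from wafer and die area with an edge-loss term, die yield from defect density and the process-complexity factor N, and die cost as wafer cost divided by the number of good dies. All function names and numeric inputs are illustrative.

```python
import math


def dies_per_wafer(wafer_diameter_cm: float, die_area_cm2: float) -> float:
    """Wafer area / die area, minus dies lost along the circular edge."""
    wafer_area = math.pi * (wafer_diameter_cm / 2) ** 2
    edge_loss = math.pi * wafer_diameter_cm / math.sqrt(2 * die_area_cm2)
    return wafer_area / die_area_cm2 - edge_loss


def die_yield(defects_per_cm2: float, die_area_cm2: float, n: float,
              wafer_yield: float = 1.0) -> float:
    """Fraction of good dies; n is the process-complexity factor N."""
    return wafer_yield / (1 + defects_per_cm2 * die_area_cm2) ** n


def cost_per_die(wafer_cost: float, wafer_diameter_cm: float, die_area_cm2: float,
                 defects_per_cm2: float, n: float) -> float:
    """Wafer cost spread over the good dies only."""
    good_dies = (dies_per_wafer(wafer_diameter_cm, die_area_cm2)
                 * die_yield(defects_per_cm2, die_area_cm2, n))
    return wafer_cost / good_dies


# Assumed inputs: 30 cm wafer, 1.5 cm x 1.5 cm die, 0.031 defects/cm^2, N = 13.5.
print(f"Dies per wafer: {dies_per_wafer(30, 2.25):.0f}")
print(f"Die yield:      {die_yield(0.031, 2.25, 13.5):.2f}")
```

Because yield falls off steeply with die area, cost per die grows much faster than linearly as dies get larger.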

73 Outline Classes of computers Parallelism Instruction Set Architecture Trends Dependability Performance Measurement

74 Dependability SLA: service level agreements System states: up or down Service states: service accomplishment, service interruption Transitions between them: failure, restoration

75 Dependability Two measures of dependability Module reliability Module availability

76 Dependability Two measures of dependability Module reliability continuous service accomplishment from a reference initial instant MTTF: mean time to failure MTTR: mean time to repair MTBF: mean time between failures MTBF = MTTF + MTTR

77 Dependability Two measures of dependability Module reliability FIT: failures in time failures per billion hours MTTF of 1,000,000 hours = 10^9 / 10^6 = 1000 FIT
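The reliability measures above reduce to two one-line conversions. A minimal sketch; the function names and the 24-hour MTTR are illustrative, not from the slides.

```python
def fit_from_mttf(mttf_hours: float) -> float:
    """Failures in time (FIT): failures per billion (10^9) hours of operation."""
    return 1e9 / mttf_hours


def mtbf(mttf_hours: float, mttr_hours: float) -> float:
    """Mean time between failures: MTBF = MTTF + MTTR."""
    return mttf_hours + mttr_hours


print(fit_from_mttf(1_000_000))   # the slide's example: MTTF of 1,000,000 hours
print(mtbf(1_000_000, 24))        # assumed MTTR of 24 hours
```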

78 Dependability Two measures of dependability Module availability
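The formula on this slide is missing from the transcript. The standard definition, expressed in the MTTF and MTTR terms introduced earlier in the lecture, is assumed to be what was shown:

```latex
\[
\text{Module availability} = \frac{\text{MTTF}}{\text{MTTF} + \text{MTTR}}
\]
```

Availability is the fraction of time the service is in the accomplishment state, so it improves either by failing less often (larger MTTF) or by repairing faster (smaller MTTR).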

79 Dependability Example

80 Dependability Answer

81 Outline Classes of computers Parallelism Instruction Set Architecture Trends Dependability Performance Measurement

82 Measuring Performance Execution time the time between the start and the completion of an event Throughput the total amount of work done in a given time

83 Measuring Performance Computer X and Computer Y X is n times faster than Y
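The relation behind "X is n times faster than Y" was an image on the slide. The usual definition, assumed here, takes performance as the reciprocal of execution time:

```latex
\[
n = \frac{\text{Execution time}_Y}{\text{Execution time}_X}
  = \frac{\text{Performance}_X}{\text{Performance}_Y},
\qquad
\text{Performance} = \frac{1}{\text{Execution time}}
\]
```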

84 Quantitative Principles Parallelism Locality temporal locality: recently accessed items are likely to be accessed in the near future; spatial locality: items whose addresses are near one another tend to be referenced close together in time

85 Quantitative Principles Amdahl’s Law

86 Quantitative Principles Amdahl’s Law: two factors 1. Fraction_enhanced: e.g., 20/60 if 20 seconds of a 60-second program can be enhanced 2. Speedup_enhanced: e.g., 5/2 if the enhanced portion takes 2 seconds where it originally took 5 seconds
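The two factors combine in the standard form of Amdahl's Law: overall speedup = 1 / ((1 - Fraction_enhanced) + Fraction_enhanced / Speedup_enhanced). The sketch below plugs in the two example values from the slide together, purely for illustration; the function name is an assumption.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when only `fraction_enhanced` of the original execution
    time benefits from a local speedup of `speedup_enhanced`."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


# Fraction_enhanced = 20/60, Speedup_enhanced = 5/2 (values from the slide's examples).
overall = amdahl_speedup(20 / 60, 5 / 2)
print(f"Overall speedup: {overall:.2f}")
```

Even with an unbounded Speedup_enhanced, the overall speedup cannot exceed 1 / (1 - Fraction_enhanced), which is 1.5 for this fraction.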

87

88 Quantitative Principles Example

89 Quantitative Principles The Processor Performance Equation
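The equation itself is missing from the transcript. In its standard form, CPU time = Instruction count x CPI (cycles per instruction) x Clock cycle time, or equivalently Instruction count x CPI / Clock rate. A minimal sketch with made-up inputs (the function name and all numbers are assumptions):

```python
def cpu_time_seconds(instruction_count: float, cpi: float, clock_rate_hz: float) -> float:
    """CPU time = instruction count x cycles per instruction / clock rate."""
    return instruction_count * cpi / clock_rate_hz


# Hypothetical program: 1e9 instructions, CPI of 2, 1 GHz clock.
print(cpu_time_seconds(1e9, 2.0, 1e9), "seconds")
```

The three terms map to different design levers: instruction count depends on the ISA and compiler, CPI on the organization and ISA, and clock rate on the hardware technology and organization.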

90

91 Quantitative Principles Example

92 Quantitative Principles Example

93 ?

94 Reading Chapter 1.8, 1.10 – 1.13

