Computer Structure 0368-2159, Lecture 1: Introduction


1 1/ 56 Computer Structure 0368-2159, Lecture 1: Introduction

2 2/ 76 What we will talk about today:
• Introduction: Computer Architecture
• Administrative Matters
• Engineering: from conductors and electricity to basic binary operations in a computer
  - Electric voltage
  - Conductors
  - Silicon: a semiconductor
  - Transistor
  - Binary operations in electronic components

3 3/ 56 What is Computer Structure?
• Hardware: transistors
• Logic circuits
• Computer architecture

4 4/ 76 Computer Structure: what is it?

5 5/ 76

6 6/ 76 Motherboard

7 7/ 76

8 8/ 76 The paradigm (Patterson)
Every Computer Scientist should master the "AAA":
• Architecture
• Algorithms
• Applications

9 9/ 76 Computer Architecture
The goal of Computer Architecture:
• To build "cost-effective systems"
  - How do we calculate the cost of a system?
  - How do we evaluate the effectiveness of the system?
• To optimize the system
  - What are the optimization points?
Fact: most computer systems still use the von Neumann principle of operation, even though, internally, they are very different from the computers of that time.

10 10/ 76 Anatomy: 5 components of any Computer (since 1946)
Personal Computer:
• Processor
  - Control ("brain")
  - Datapath ("brawn")
• Memory (where programs, data live when running)
• Devices
  - Input: Keyboard, Mouse
  - Output: Display, Printer
  - Disk (where programs, data live when not running)

11 11/ 76 Computer System Structure
[Diagram labels: CPU, Cache, CPU bus, Bridge, Memory bus, Memory, I/O bus, Keyboard, Mouse, Scanner, USB hub, LAN adapter, LAN, Graphics adapter, Video buffer, SCSI/IDE adapter, SCSI bus, Hard disk]

12 12/ 76 The Instruction Set: a Critical Interface
[Diagram: the instruction set as the interface between software and hardware]

13 13/ 76 What is "Computer Architecture"?
Computer Architecture =
• Instruction Set Architecture +
• Machine Organization + …
• = Engineering + Architecture

14 14/ 76 Computer Structure: What are "Machine Structures"?
• Coordination of many levels (layers) of abstraction
[Diagram, layers from software down to hardware:]
  - Software: Application (ex: browser); Compiler; Assembler; Operating System (Linux, Win, ..)
  - Instruction Set Architecture
  - Hardware: Processor; Memory; I/O system; Datapath & Control; Digital Design; Circuit Design; transistors

15 15/ 76 Levels of Representation
• High Level Language Program:
    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;
• (Compiler) Assembly Language Program:
    lw $15, 0($2)
    lw $16, 4($2)
    sw $16, 0($2)
    sw $15, 4($2)
• (Assembler) Machine Language Program:
    0000 1001 1100 0110 1010 1111 0101 1000
    1010 1111 0101 1000 0000 1001 1100 0110
    1100 0110 1010 1111 0101 1000 0000 1001
    0101 1000 0000 1001 1100 0110 1010 1111
• (Machine Interpretation) Control Signal Specification:
    ALUOP[0:3] <= InstReg[9:11] & MASK
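As a rough illustration of the assembly level on this slide, the sketch below interprets the four `lw`/`sw` instructions on a toy machine model. The `run` function, the dictionary-based memory, the base address, and the sample values are all assumptions made for this example; only the instruction sequence itself comes from the slide.

```python
# Toy interpreter for the lw/sw sequence on the slide (illustrative only).
# Memory is modeled as a dict from byte address to a word value.

def run(program, regs, mem):
    """Execute a list of (op, rt, offset, base) tuples on regs and mem."""
    for op, rt, offset, base in program:
        addr = regs[base] + offset
        if op == "lw":        # load word: register <- memory
            regs[rt] = mem[addr]
        elif op == "sw":      # store word: memory <- register
            mem[addr] = regs[rt]
        else:
            raise ValueError(f"unsupported op: {op}")
    return regs, mem

# The four instructions from the slide; $2 holds the address of v[k].
swap = [
    ("lw", 15, 0, 2),   # lw $15, 0($2)   temp   = v[k]
    ("lw", 16, 4, 2),   # lw $16, 4($2)   $16    = v[k+1]
    ("sw", 16, 0, 2),   # sw $16, 0($2)   v[k]   = $16
    ("sw", 15, 4, 2),   # sw $15, 4($2)   v[k+1] = temp
]

regs = {2: 0x1000, 15: 0, 16: 0}   # assumed base address of v[k]
mem = {0x1000: 7, 0x1004: 9}       # assumed contents: v[k] = 7, v[k+1] = 9
run(swap, regs, mem)
print(mem)                         # the two words have been exchanged
```

The point of the sketch is only that the three C statements and the four load/store instructions describe the same exchange of two adjacent words.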

16 16/ 76 Instruction Set Architecture (subset of Computer Architecture)
"... the attributes of a [computing] system as seen by the programmer, i.e., the conceptual structure and functional behavior, as distinct from the organization of the data flows and controls, the logic design, and the physical implementation." (Amdahl, Blaauw, and Brooks, 1964)
• Organization of Programmable Storage
• Data Types & Data Structures: Encodings & Representations
• Instruction Set
• Instruction Formats
• Modes of Addressing and Accessing Data Items and Instructions
• Exceptional Conditions

17 17/ 76 Computer Architecture's Changing Definition
• 1950s to 1960s Computer Architecture Course: Computer Arithmetic
• 1970s to mid 1980s Computer Architecture Course: Instruction Set Design, especially ISA appropriate for compilers
• 1990s Computer Architecture Course: Design of CPU, memory system, I/O system, Multiprocessors, Networks
• 2000s Computer Architecture Course: Special purpose architectures, Functionally reconfigurable, Special considerations for low power/mobile processing

18 18/ 76 Example ISAs (Instruction Set Architectures)
• Digital Alpha (v1, v3): 1992-97
• HP PA-RISC (v1.1, v2.0): 1986-96
• Sun Sparc (v8, v9): 1987-95
• SGI MIPS (MIPS I, II, III, IV, V): 1986-96
• Intel (8086, 80286, 80386, 80486, Pentium, MMX, ...): 1978-00
• Itanium/IA-64: 2002-

19 19/ 76 MIPS R3000 Instruction Set Architecture (Summary)
• Instruction Categories
  - Load/Store
  - Computational
  - Jump and Branch
  - Floating Point (coprocessor)
  - Memory Management
  - Special
• Registers: R0 - R31, PC, HI, LO
• 3 Instruction Formats, all 32 bits wide:
  - OP | rs | rt | rd | sa | funct
  - OP | rs | rt | immediate
  - OP | jump target
Q: How many are already familiar with the MIPS ISA?
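The three formats can be made concrete with a small field extractor. This is a minimal sketch, not part of the lecture: the helper names `bits`, `decode`, and `encode_i` are invented for illustration, while the field widths (6/5/5/5/5/6, 6/5/5/16, 6/26) and the `lw` opcode value are those of the standard MIPS encoding.

```python
# Field layout of the three 32-bit MIPS instruction formats (bit widths):
#   R-type: op(6) rs(5) rt(5) rd(5) sa(5) funct(6)
#   I-type: op(6) rs(5) rt(5) immediate(16)
#   J-type: op(6) jump target(26)

def bits(word, hi, lo):
    """Return bits hi..lo (inclusive) of a 32-bit word."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)

def decode(word):
    """Split a 32-bit word into the fields of all three formats."""
    return {
        "op": bits(word, 31, 26),
        "rs": bits(word, 25, 21),
        "rt": bits(word, 20, 16),
        "rd": bits(word, 15, 11),
        "sa": bits(word, 10, 6),
        "funct": bits(word, 5, 0),
        "immediate": bits(word, 15, 0),
        "target": bits(word, 25, 0),
    }

def encode_i(op, rs, rt, immediate):
    """Assemble an I-type instruction word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)

LW = 0x23                                        # opcode of lw
word = encode_i(LW, rs=2, rt=15, immediate=0)    # lw $15, 0($2)
fields = decode(word)
print(hex(word), fields["op"], fields["rs"], fields["rt"], fields["immediate"])
```

Because every format is 32 bits wide and the opcode always sits in the top six bits, the hardware can extract all candidate fields in parallel and pick the relevant ones once the opcode is known.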

20 20/ 76 Forces on Computer Architecture
[Diagram: Computer Architecture at the center, shaped by:]
• Technology
• Programming Languages
• Operating Systems
• History
• Applications
• Cleverness

21 21/ 76 Computers in the News: Sony Playstation 2000
• As reported in Microprocessor Report, Vol 13, No. 5:
  - Emotion Engine: 6.2 GFLOPS, 75 million polygons per second
  - Graphics Synthesizer: 2.4 billion pixels per second
  - Claim: Toy Story realism brought to games!

22 22/ 76 Where are We Going?? Computer Structure
• Arithmetic
• Single/multicycle Datapaths
• Pipelining [diagram: four instructions, each passing through the stages IFetch, Dcd, Exec, Mem, WB]
• Memory Systems
• I/O

23 23/ 76 A slide from one of the last lectures of the semester

24 24/ 76 Computer Structure: So What's In It For Me?
• In-depth understanding of the inner workings of modern computers, their evolution, and the trade-offs present at the hardware/software boundary.
  - Insight into fast/slow operations that are easy/hard to implement in hardware
  - Out-of-order execution and branch prediction
• Experience with the design process in the context of a large, complex (hardware) design.
  - Functional Spec --> Control & Datapath --> Physical implementation
  - Modern CAD tools
• Designer's "Conceptual" toolbox

25 25/ 76 Course Administration
• Instructors: Yehuda Afek (afek@tau.ac.il), Nathan Intrator (nin@post.tau.ac.il)
• TA: Ori Shalev (orish@post.tau.ac.il)
• Materials: http://www.math.tau.ac.il/~nin/Courses/CompStruct05/CompStruct05.htm
• Books:
  1. V. C. Hamacher, Z. G. Vranesic, S. G. Zaky, Computer Organization. McGraw-Hill, 1982
  2. H. Taub, Digital Circuits and Microprocessors. McGraw-Hill, 1982
  3. Hennessy and Patterson, Computer Organization and Design: The Hardware/Software Interface. Morgan Kaufmann, 1998

26 26/ 76 Grading
Grade:
• Final exam: 80%
• Exercises: 20% (7 exercises)

27 27/ 76 Architecture & Microarchitecture
• Architecture (ISA, Instruction Set Architecture): the collection of features of a processor (or a system) as they are seen by the "user"
  - User: a binary executable running on the processor, or an assembly-level programmer
• Microarchitecture (µarch, uarch): the collection of features or way of implementation of a processor (or a system) that do not affect the user

28 28/ 76 Architecture & Microarchitecture Elements
• Architecture:
  - Register data width (8/16/32)
  - Instruction set
  - Addressing modes
  - Addressing methods (Segmentation, Paging, etc.)
• Microarchitecture:
  - Physical memory size
  - Cache size and structure
  - Number of execution units, number of execution pipelines
  - Branch prediction
  - TLB
• Timing is considered µArch (though it is user visible!)
• Processors with the same arch may have different µArch

29 29/ 76 Compatibility
• Backward compatibility
  - New hardware can run existing software
  - Example: Pentium 4 can run software originally written for Pentium III, Pentium II, Pentium, 486, 386, 286
• Forward compatibility
  - New software can run on existing hardware
  - Example: new software written with MMX™ must still run on older Pentium processors which do not support MMX™
  - Less important than backward compatibility
• New ideas: architecture independent
  - JIT (just-in-time compiler): Java and .NET
  - Binary translation

30 30/ 76 How to compare between different systems?

31 31/ 76 Benchmarks: Programs for Evaluating Processor Performance
• Toy Benchmarks
  - 10-100 line programs
  - e.g.: sieve, puzzle, quicksort
• Synthetic Benchmarks
  - Attempt to match average frequencies of real workloads
  - e.g., Whetstone, Dhrystone
• Real programs
  - e.g., gcc, spice
• SPEC: System Performance Evaluation Cooperative
  - SPECint (8 integer programs)
  - SPECfp (10 floating point programs)

32 32/ 76 CPI: comparing systems with the same instruction set architecture (ISA)
• The CPU is synchronous: it works according to a clock signal.
  - Clock cycle is measured in nsec ($10^{-9}$ of a second).
  - Clock rate (= 1/clock cycle) is measured in MHz ($10^{6}$ cycles/second).
• CPI: cycles per instruction
  - Average number of cycles per instruction (in a given program):
    $$\mathrm{CPI} = \frac{\#\text{cycles required to execute the program}}{\#\text{instructions executed in the program}}$$
  - IPC (= 1/CPI): instructions per cycle
• Clock rate is mainly affected by technology, CPI by the architecture
• CPI breakdown: how many cycles (on average) the program spends on different causes, e.g., executing, memory, I/O, etc.
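The definitions on this slide translate directly into a few one-line helpers. The sketch below is only an illustration: the function names and the sample numbers (a 5 ns cycle, 3 million cycles, 2 million instructions) are assumptions, not values from the lecture.

```python
def clock_rate_mhz(cycle_ns):
    """Clock rate in MHz for a clock cycle given in nanoseconds.

    rate [Hz] = 1 / (cycle_ns * 1e-9), so rate [MHz] = 1000 / cycle_ns.
    """
    return 1_000.0 / cycle_ns

def cpi(total_cycles, instructions):
    """Average cycles per instruction for one program run."""
    return total_cycles / instructions

def ipc(total_cycles, instructions):
    """Instructions per cycle, the reciprocal of CPI."""
    return instructions / total_cycles

# Assumed sample values for illustration.
print(clock_rate_mhz(5.0))            # a 5 ns cycle corresponds to 200 MHz
print(cpi(3_000_000, 2_000_000))      # 1.5 cycles per instruction
print(ipc(3_000_000, 2_000_000))      # the reciprocal of the CPI above
```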

33 33/ 76 CPI (cont.)
• $CPI_i$: #cycles to execute a given type of instruction
  - e.g.: $CPI_{add} = 1$, $CPI_{mul} = 3$
  - Independent of the program
• Calculating the CPI of a program
  - $IC_i$: #times an instruction of type $i$ was executed in the program
  - $IC$: #instructions executed in the program: $IC = \sum_i IC_i$
  - $F_i$: relative frequency of instructions of type $i$: $F_i = IC_i / IC$
  - $N_{cyc}$: #cycles required to execute the program: $N_{cyc} = \sum_i CPI_i \cdot IC_i$
  - CPI: $CPI = N_{cyc} / IC = \sum_i CPI_i \cdot F_i$
• This calculation does not take into account other delays such as memory, I/O
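A short sketch of this calculation, computing the program CPI both from cycle counts and from relative frequencies. The per-type CPIs for `add` and `mul` are the ones given on the slide; the instruction counts and the function names are assumptions chosen for the example.

```python
def program_cpi(cpi_by_type, count_by_type):
    """CPI = N_cyc / IC, with N_cyc = sum_i CPI_i * IC_i."""
    ic = sum(count_by_type.values())
    n_cyc = sum(cpi_by_type[t] * count_by_type[t] for t in count_by_type)
    return n_cyc / ic

def program_cpi_from_freq(cpi_by_type, count_by_type):
    """Same quantity via relative frequencies: sum_i CPI_i * F_i."""
    ic = sum(count_by_type.values())
    return sum(cpi_by_type[t] * (count_by_type[t] / ic) for t in count_by_type)

cpi_by_type = {"add": 1, "mul": 3}          # per-type CPIs from the slide
count_by_type = {"add": 900, "mul": 100}    # assumed instruction mix

print(program_cpi(cpi_by_type, count_by_type))
print(program_cpi_from_freq(cpi_by_type, count_by_type))
```

With this assumed mix, 10% multiplications are enough to raise the average from 1 to about 1.2 cycles per instruction; both formulations agree up to floating-point rounding.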

34 34/ 76 CPU Time
• CPU Time: the time required by the CPU to execute a given program:
  $$\text{CPU Time} = \text{clock cycle} \times \#cyc = \text{clock cycle} \times CPI \times IC$$
• Our goal: minimize CPU Time
  - Minimize clock cycle: more MHz (process, circuit, µArch)
  - Minimize CPI: µArch (e.g.: more execution units)
  - Minimize IC: architecture (e.g.: MMX™ technology)
• Speedup due to enhancement E
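The CPU-time product can be evaluated numerically to see how each of the three factors affects the result. This is a minimal sketch; the function name and all sample values (instruction count, CPI, clock rate) are assumptions for illustration.

```python
def cpu_time_seconds(instruction_count, cpi, clock_rate_hz):
    """CPU Time = clock cycle * CPI * IC, with clock cycle = 1 / clock rate."""
    clock_cycle = 1.0 / clock_rate_hz
    return clock_cycle * cpi * instruction_count

# Assumed baseline: 1e9 instructions, CPI of 1.2, 500 MHz clock.
base = cpu_time_seconds(1e9, 1.2, 500e6)
faster_clock = cpu_time_seconds(1e9, 1.2, 1000e6)   # halve the clock cycle
lower_cpi = cpu_time_seconds(1e9, 0.6, 500e6)       # halve the CPI
fewer_instr = cpu_time_seconds(0.5e9, 1.2, 500e6)   # halve the IC

print(base, faster_clock, lower_cpi, fewer_instr)
```

Halving any one factor halves the CPU time, which is why the slide lists the three as separate optimization targets.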

35 35/ 76 Amdahl's Law
Suppose that enhancement E accelerates a fraction F of the task by a factor S, and the remainder of the task is unaffected. Then:
$$ExTime_{new} = ExTime_{old} \times \left[ (1 - Fraction_{enhanced}) + \frac{Fraction_{enhanced}}{Speedup_{enhanced}} \right]$$
$$Speedup_{overall} = \frac{ExTime_{old}}{ExTime_{new}} = \frac{1}{(1 - Fraction_{enhanced}) + \dfrac{Fraction_{enhanced}}{Speedup_{enhanced}}}$$

36 36/ 76 Amdahl's Law: Example
• Floating point instructions improved to run 2X, but only 10% of actual instructions are FP:
  $$ExTime_{new} = ExTime_{old} \times (0.9 + 0.1/2) = 0.95 \times ExTime_{old}$$
  $$Speedup_{overall} = \frac{1}{0.95} = 1.053$$
• Corollary: Make The Common Case Fast
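The floating-point example can be checked with a direct implementation of the formula. The function name `amdahl_speedup` is an assumption made for this sketch; the inputs 0.1 and 2 are the fraction and the enhancement factor from the slide.

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup when a fraction of the task is sped up by a factor."""
    new_time = (1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced
    return 1.0 / new_time

# Slide example: 10% of instructions are FP and they run 2X faster.
print(round(amdahl_speedup(0.1, 2), 3))

# Upper bound: even an infinitely fast FP unit leaves the other 90% untouched.
print(amdahl_speedup(0.1, float("inf")))
```

The second call shows the limit behind the corollary: when only 10% of the work is enhanced, the overall speedup can never exceed 1/0.9, no matter how large the enhancement.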

37 37/ 76 Instruction Set Design
[Diagram: the instruction set as the interface between software and hardware]
• The ISA is what the user and the compiler see
• The ISA is what the hardware needs to implement

38 38/ 76 Why is the ISA important?
• Code size
  - Long instructions may take more time to be fetched
  - Larger code requires more memory (important in small devices, e.g., cell phones)
• Number of instructions (IC)
  - Reducing IC reduces execution time (assuming the same CPI and frequency)
• Code "simplicity"
  - Simple HW implementation, which leads to higher frequency and lower power
  - Code optimization can be applied better to "simple code"

39 39/ 76 The impact of the ISA: RISC vs CISC

40 40/ 76 CISC Processors
• CISC: Complex Instruction Set Computer
• The idea: a high-level machine language
• Characteristics
  - Many instruction types, with many addressing modes
  - Some of the instructions are complex: they perform complex tasks and require many cycles
  - ALU operations directly on memory; usually uses a limited number of registers
  - Variable-length instructions: common instructions get short codes → saves code length
• Example: x86

41 41/ 76 CISC Drawbacks
• Compilers do not take advantage of the complex instructions and the complex indexing methods
• Implementing complex instructions and complex addressing modes
  → complicates the processor
  → slows down the simple, common instructions
  → contradicts the corollary of Amdahl's law: Make The Common Case Fast
• Variable-length instructions are a real pain in the neck:
  - It is difficult to decode several instructions in parallel
  - As long as an instruction is not decoded, its length is unknown
    → it is unknown where the instruction ends
    → it is unknown where the next instruction starts
  - An instruction may not fit into the "right behavior" of the memory hierarchy (will be discussed in the next lectures)
• Examples: VAX, x86 (!?!)
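The decoding problem with variable-length instructions can be sketched with a toy encoding. Everything here is hypothetical: the scheme in which the first byte of each instruction holds its total length is invented for illustration and is not the x86 or VAX encoding, and both function names are made up.

```python
def fixed_length_starts(code, width=4):
    """With fixed-length instructions every start offset is known up front."""
    return list(range(0, len(code), width))

def variable_length_starts(code):
    """Toy encoding (an assumption): the first byte of each instruction
    holds its total length in bytes, so the start of instruction n+1 is
    only known after instruction n has been at least partially decoded."""
    starts, pos = [], 0
    while pos < len(code):
        starts.append(pos)
        length = code[pos]
        if length == 0:
            raise ValueError("invalid length byte")
        pos += length
    return starts

fixed = bytes(12)                                          # three 4-byte slots
variable = bytes([1, 3, 0xAA, 0xBB, 2, 0xCC, 4, 1, 2, 3])  # lengths 1, 3, 2, 4

print(fixed_length_starts(fixed))        # offsets computed independently
print(variable_length_starts(variable))  # offsets found by a sequential scan
```

The fixed-length offsets can be computed independently, so several decoders can work at once; the variable-length loop carries a dependency from one iteration to the next, which is the serialization the slide complains about.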

42 42/ 76 RISC Processors
• RISC: Reduced Instruction Set Computer
• The idea: simple instructions enable fast hardware
• Characteristics
  - A small instruction set, with only a few instruction formats
  - Simple instructions: execute simple tasks, require a single cycle (with pipeline)
  - A few indexing methods
  - ALU operations on registers only: memory is accessed using Load and Store instructions only
  - Many orthogonal registers
  - Three-address machine: Add dst, src1, src2
  - Fixed-length instructions
• Examples: MIPS™, Sparc™, Alpha™, PowerPC™

43 43/ 76 RISC Processors (Cont.)
• Simple architecture → simple micro-architecture
  - Simple, small and fast control logic
  - Simpler to design and validate
  - Room for on-die caches: instruction cache + data cache (parallelizes data and instruction access)
  - Shorter time-to-market
• Using a smart compiler
  - Better pipeline usage
  - Better register allocation
• Existing RISC processors are not "pure" RISC
  - e.g., they support division, which takes many cycles

44 44/ 76 RISC and Amdahl's Law (Example)
• Compared to the CISC architecture:
  - 10% of the static code, which executes 90% of the dynamic instructions, has the same CPI
  - 90% of the static code, which is only 10% of the dynamic instructions, has its CPI increased by 60%
  - The number of instructions executed increases by 50%
  - The speed of the processor is doubled (this was true at the time the RISC processors were invented)
• We get ... and then ... [equations not preserved in the transcript]
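Since the slide's own equations did not survive, the following is only one possible reading of the bullet points, worked through with the CPU-time formula (CPU Time = clock cycle × CPI × IC). It assumes that the "60%" refers to the CPI of the rarely executed code and that "speed doubled" means a doubled clock rate; the function name and parameter names are invented for this sketch.

```python
def risc_vs_cisc(hot_fraction=0.9, cold_cpi_factor=1.6,
                 ic_factor=1.5, clock_factor=2.0):
    """Relative CPI, execution time and speedup of RISC versus CISC.

    hot_fraction:    share of dynamic instructions whose CPI is unchanged
    cold_cpi_factor: CPI multiplier for the remaining dynamic instructions
    ic_factor:       multiplier on the number of executed instructions
    clock_factor:    multiplier on the clock rate
    """
    cpi_factor = hot_fraction * 1.0 + (1.0 - hot_fraction) * cold_cpi_factor
    time_factor = ic_factor * cpi_factor / clock_factor
    return cpi_factor, time_factor, 1.0 / time_factor

cpi_factor, time_factor, speedup = risc_vs_cisc()
print(round(cpi_factor, 3), round(time_factor, 3), round(speedup, 3))
```

Under these assumptions the average CPI grows by about 6%, the instruction count by 50%, and the doubled clock still leaves the RISC design roughly 1.26 times faster overall.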

45 45/ 76 So, what is better, RISC or CISC?
• Today CISC architectures (x86) run as fast as RISC (or even faster)
• The main reasons are:
  - The processor translates CISC instructions into RISC instructions (ucode)
  - CISC architectures use a "RISC-like engine"
• We will discuss these kinds of solutions later in this course.

46 46/ 76 History
• 1906: Audion (Triode), Lee De Forest
• 1947: First point-contact transistor (germanium), John Bardeen and Walter Brattain, Bell Laboratories

47 47/ 76 History
• 1958: First integrated circuit (germanium), Jack S. Kilby, Texas Instruments
  - Contained five components of three types: transistors, resistors and capacitors
• 1997: Intel Pentium II
  - Clock: 233 MHz
  - Number of transistors: 7.5 M
  - Gate length: 0.35 µm

48 48/ 76 Integrated Circuits (2003 state-of-the-art)
• Primarily crystalline silicon
• 1 mm - 25 mm on a side
• 2003 feature size ~ 0.13 µm = $0.13 \times 10^{-6}$ m
• 100 - 400M transistors
• (25 - 100M "logic gates")
• 3 - 10 conductive layers
• "CMOS" (complementary metal oxide semiconductor): most common
• Package provides:
  - spreading of chip-level signal paths to board-level
  - heat dissipation
• Ceramic or plastic with gold wires
[Images: Chip in Package; Bare Die]

49 49/ 76

50 50/ 76

51 51/ 76

52 52/ 76 Printed Circuit Boards
• Fiberglass or ceramic
• 1-20 conductive layers
• 1-20 in on a side
• IC packages are soldered down.

53 53/ 76 Technology Trends: Memory Capacity (Single-Chip DRAM)
year    size (Mbit)
1980    0.0625
1983    0.25
1986    1
1989    4
1992    16
1996    64
1998    128
2000    256
2002    512
• Now 1.4X/yr, or 2X every 2 years.
• 8000X since 1980!
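The two summary claims can be checked against the table with a few lines of arithmetic. This is a minimal sketch: the helper name `annual_growth` is an assumption, and the choice of 1998 to 2002 as the "now" window is an interpretation of the slide rather than something it states.

```python
# Single-chip DRAM capacity in Mbit, as listed on the slide.
dram_mbit = {
    1980: 0.0625, 1983: 0.25, 1986: 1, 1989: 4, 1992: 16,
    1996: 64, 1998: 128, 2000: 256, 2002: 512,
}

def annual_growth(table, start, end):
    """Average yearly growth factor between two years of the table."""
    return (table[end] / table[start]) ** (1.0 / (end - start))

total = dram_mbit[2002] / dram_mbit[1980]
recent = annual_growth(dram_mbit, 1998, 2002)   # assumed "now" window

print(total)              # overall growth factor since 1980
print(round(recent, 2))   # yearly factor over the last four years
```

The overall factor comes out at 8192, which the slide rounds to 8000X, and doubling every two years corresponds to a yearly factor of the square root of 2, about 1.4.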

54 54/ 76 Technology Trends: Microprocessor Complexity
• 2X transistors/chip every 1.5 years, called "Moore's Law"
• Data points (transistors per chip):
  - Sparc Ultra: 5.2 million
  - Pentium Pro: 5.5 million
  - PowerPC 620: 6.9 million
  - Alpha 21164: 9.3 million
  - Alpha 21264: 15 million
  - Athlon (K7): 22 million
  - Itanium 2: 410 million

55 55/ 76 Technology Trends: Processor Performance
[Plot: performance measure vs. year; growth of 1.54X/yr; labeled point: Intel P4 2000 MHz (Fall 2001)]

56 56/ 76 Technology Trends Imply Dramatic Change
• Processor
  - Logic capacity: about 30% per year
  - Clock rate: about 20% per year
• Memory
  - DRAM capacity: about 60% per year (4x every 3 years)
  - Memory speed: about 10% per year
  - Cost per bit: improves about 25% per year
• Disk
  - Capacity: about 60% per year
  - Total data use: 100% per 9 months!
• Network Bandwidth
  - Bandwidth increasing more than 100% per year!

57 57/ 76 1980-2003, CPU-DRAM Speed gap
[Plot: performance (1/latency), log scale from 10 to 10000, vs. year, 1980 to 2005; "The power wall" is marked near 2005]
• CPU: 60% per yr (2X in 1.5 yrs)
• DRAM: 9% per yr (2X in 10 yrs)
• Gap grew 50% per year
Q. How do architects address this gap?
A. Put smaller, faster "cache" memories between CPU and DRAM.
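The growth rates on this slide can be related to each other with compound-growth arithmetic. The sketch below uses the 60% and 9% figures from the slide; the function name `doubling_time_years` is an assumption, and the computed values only approximate the slide's rounded numbers.

```python
import math

def doubling_time_years(rate_per_year):
    """Years needed to double at a constant yearly growth rate."""
    return math.log(2) / math.log(1.0 + rate_per_year)

cpu_rate = 0.60    # CPU performance growth per year (from the slide)
dram_rate = 0.09   # DRAM performance growth per year (from the slide)

# The gap is the ratio of the two, so its yearly growth is the ratio of
# the growth factors, not the difference of the rates.
gap_growth = (1.0 + cpu_rate) / (1.0 + dram_rate) - 1.0

print(round(doubling_time_years(cpu_rate), 2))
print(round(doubling_time_years(dram_rate), 2))
print(round(gap_growth, 3))
```

The CPU doubling time comes out close to the quoted 1.5 years and the gap growth close to the quoted 50% per year; the DRAM doubling time computed from 9% per year is somewhat shorter than the slide's rounded "10 yrs".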

58 58/ 76 Dimensions
[Scale: 1 cm, 1 mm, 0.1 mm, 10 µm, 1 µm, 0.1 µm, 10 nm, 1 nm, 1 Å]
• Chip size (1 cm)
• Diameter of human hair (25 µm)
• Deep UV wavelength (0.248 µm)
• 1996 devices (0.35 µm)
• 2001 devices (0.18 µm)
• 2007 devices (0.01 µm)
• X-ray wavelength (0.6 nm)
• Silicon atom radius (1.17 Å)
• 2005: $0.12 \times 10^{-6} = 1.2 \times 10^{-7}$; 2006: $0.04 \times 10^{-6}$

59 59/ 76 Technology and Computer Architecture

60 60/ 76 Can it last forever, or are new challenges coming?
[Plots: power density; power]

61 61/ 76 Technology in the News
• BIG: LaCie is the first to offer a consumer-level 2 Terabyte disk!
  - $1,999
  - Weighs 11 pounds!
  - 5 1/4" form-factor
• SMALL: Pretec is offering a 12GB CompactFlash card
  - Size of a silver dollar
  - Cost? $9,999


