Overheads for Computers as Components © 2000 Morgan Kaufman

Slide 2: Architectures and instruction sets
- Computer architecture taxonomy.
- Assembly language.
Slide 3: von Neumann architecture
- Memory holds both data and instructions.
- The central processing unit (CPU) fetches instructions from memory.
  - The separation of CPU and memory is what distinguishes a programmable computer.
- CPU registers help out: program counter (PC), instruction register (IR), general-purpose registers, etc.
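A minimal C sketch of the fetch-execute cycle described above (an illustration, not code from the book; the opcodes and instruction encoding are made up). The PC addresses a single memory holding both instructions and data, the fetched word lands in the IR, and execution updates the registers:

    /* von Neumann fetch-execute loop: one memory for instructions and data. */
    #include <stdint.h>
    #include <stdio.h>

    enum { OP_HALT = 0, OP_ADD = 1 };          /* hypothetical opcodes */

    int main(void) {
        /* word layout (assumed): opcode<<24 | dest<<16 | src1<<8 | src2 */
        uint32_t mem[8] = { OP_ADD << 24 | 5 << 16 | 1 << 8 | 3,   /* ADD r5,r1,r3 */
                            OP_HALT << 24 };
        uint32_t reg[8] = { 0, 2, 0, 3, 0, 0, 0, 0 };
        uint32_t pc = 0, ir;

        for (;;) {
            ir = mem[pc++];                    /* fetch: IR <- mem[PC], advance PC */
            uint32_t op = ir >> 24;
            if (op == OP_HALT) break;
            if (op == OP_ADD)                  /* execute: r[dest] = r[src1] + r[src2] */
                reg[(ir >> 16) & 0xFF] = reg[(ir >> 8) & 0xFF] + reg[ir & 0xFF];
        }
        printf("r5 = %u\n", reg[5]);           /* prints 5 (2 + 3) */
        return 0;
    }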
Slide 4: CPU + memory
[Figure: CPU and memory connected by address and data lines. The PC holds 200; memory location 200 holds ADD r5,r1,r3, which has been fetched into the IR.]
Slide 5: Harvard architecture
[Figure: the CPU's PC addresses a separate program memory; data memory and program memory each have their own address and data connections to the CPU.]
Slide 6: von Neumann vs. Harvard
- Harvard can't use self-modifying code.
- Harvard allows two simultaneous memory fetches.
- Most DSPs use the Harvard architecture for streaming data:
  - greater memory bandwidth;
  - more predictable bandwidth.
Slide 7: RISC vs. CISC
- Complex instruction set computer (CISC):
  - many addressing modes;
  - many operations.
- Reduced instruction set computer (RISC):
  - load/store architecture;
  - pipelinable instructions.
Slide 8: Instruction set characteristics
- Fixed vs. variable length.
- Addressing modes.
- Number of operands.
- Types of operands.
Slide 9: Programming model
- Programming model: the registers visible to the programmer.
- Some registers are not visible (e.g., the IR).
Slide 10: Multiple implementations
- Successful architectures have several implementations:
  - varying clock speeds;
  - different bus widths;
  - different cache sizes;
  - etc.
Slide 11: Assembly language
- One-to-one with instructions (more or less).
- Basic features:
  - One instruction per line.
  - Labels provide names for addresses (usually in the first column).
  - Instructions often start in later columns.
  - Columns run to the end of the line.
Slide 12: ARM assembly language example

    label1  ADR r4,c
            LDR r0,[r4]     ; a comment
            ADR r4,d
            LDR r1,[r4]
            SUB r0,r0,r1    ; r0 is the destination
Slide 13: Pseudo-ops
- Some assembler directives don't correspond directly to instructions:
  - Define the current address.
  - Reserve storage.
  - Define constants.
Slide 14: Pipelining
- Execute several instructions simultaneously, but at different stages.
- Simple three-stage pipe: fetch, decode, execute.
[Figure: the three pipeline stages (fetch, decode, execute) and memory.]
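A small C sketch of the overlap (an illustration, not from the slides): in an ideal three-stage pipe, instruction i is fetched in cycle i, decoded in cycle i+1, and executed in cycle i+2, so one instruction completes per cycle once the pipe is full.

    /* Print a cycle-by-cycle view of an ideal three-stage pipeline. */
    #include <stdio.h>

    int main(void) {
        const int n_instrs = 4, n_stages = 3;
        const char *stage_name[] = { "fetch", "decode", "execute" };

        for (int cycle = 1; cycle <= n_instrs + n_stages - 1; cycle++) {
            printf("cycle %d:", cycle);
            for (int i = 1; i <= n_instrs; i++) {
                int stage = cycle - i;                 /* 0, 1, 2 while in flight */
                if (stage >= 0 && stage < n_stages)
                    printf("  instr%d:%s", i, stage_name[stage]);
            }
            printf("\n");
        }
        return 0;
    }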
Slide 15: Pipeline complications
- May not always be able to predict the next instruction:
  - Conditional branch.
- Causes a bubble in the pipeline.
[Figure: a JNZ passes through fetch, decode, execute; the two instructions behind it have their fetch/decode/execute stages delayed by the bubble.]
Slide 16: Superscalar
- A RISC pipeline executes one instruction per clock cycle (usually).
- Superscalar machines execute multiple instructions per clock cycle:
  - Faster execution.
  - More variability in execution times.
  - More expensive CPU.
Slide 17: Simple superscalar
- Execute a floating-point and an integer instruction at the same time:
  - They use different registers.
  - Floating-point operations use their own hardware unit.
- Must wait for completion when the floating-point and integer units communicate.
Slide 18: Costs
- Good news: parallelism can be found at run time.
  - Bad news: it causes variations in execution time.
- Requires a lot of hardware:
  - n^2 instruction-unit hardware for n-instruction parallelism.
Slide 19: Finding parallelism
- Independent operations can be performed in parallel:

    ADD r0, r0, r1
    ADD r2, r2, r3
    ADD r6, r4, r0

[Figure: dataflow graph of the three additions; the first two adders take independent operands, and the third takes r4 and the r0 result of the first.]
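A minimal C sketch of how such parallelism is found (an illustration, not from the slides): compare register operands pairwise. For the three ADDs above, the first two share no registers and can run in parallel, while the third reads r0, which the first one writes. Checking every pair is also why issue hardware grows roughly as n^2 for n-wide issue.

    /* Pairwise register-dependence check over a small instruction window. */
    #include <stdbool.h>
    #include <stdio.h>

    typedef struct { int dest, src1, src2; } Instr;     /* register numbers */

    /* does `later` depend on `earlier` (read-after-write or write-after-write)? */
    static bool depends(Instr earlier, Instr later) {
        return later.src1 == earlier.dest || later.src2 == earlier.dest ||
               later.dest == earlier.dest;
    }

    int main(void) {
        Instr seq[] = { { 0, 0, 1 },    /* ADD r0, r0, r1 */
                        { 2, 2, 3 },    /* ADD r2, r2, r3 */
                        { 6, 4, 0 } };  /* ADD r6, r4, r0 */
        int n = 3;

        for (int i = 0; i < n; i++)
            for (int j = i + 1; j < n; j++)
                printf("instr %d and instr %d: %s\n", i + 1, j + 1,
                       depends(seq[i], seq[j]) ? "dependent" : "independent");
        return 0;
    }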
Slide 20: Pipeline hazards
- Two operations that require the same resource cannot be executed in parallel:

    x = a + b;
    a = d + e;
    y = a - f;

[Figure: dataflow graph of the three statements; all three involve a, which constrains the order in which they can execute.]
Slide 21: Scoreboarding
- The scoreboard keeps track of which instructions use which resources:

              Reg file   ALU   FP
    instr1       X        X
    instr2                      X
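A minimal C sketch of the mechanism (an assumption about how a scoreboard could be modeled, not code from the book): one busy bit per resource; an instruction may issue only if every resource it needs is free, and issuing marks those resources busy. The stalled instr3 is hypothetical, added to show a conflict.

    /* Scoreboard as a bitmask of busy resources. */
    #include <stdbool.h>
    #include <stdio.h>

    enum { REG_FILE = 1 << 0, ALU = 1 << 1, FP = 1 << 2 };

    static unsigned busy = 0;                 /* the scoreboard */

    static bool try_issue(const char *name, unsigned needs) {
        if (busy & needs) {                   /* a needed resource is in use */
            printf("%s stalls\n", name);
            return false;
        }
        busy |= needs;                        /* claim the resources */
        printf("%s issues\n", name);
        return true;
    }

    int main(void) {
        try_issue("instr1", REG_FILE | ALU);  /* as in the table: reg file + ALU */
        try_issue("instr2", FP);              /* FP unit is free: issues         */
        try_issue("instr3", ALU);             /* hypothetical: ALU busy, stalls  */
        busy &= ~(REG_FILE | ALU);            /* instr1 completes, frees them    */
        try_issue("instr3", ALU);             /* now issues */
        return 0;
    }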
Slide 22: Order of execution
- In-order:
  - The machine stops issuing instructions when the next instruction can't be dispatched.
- Out-of-order:
  - The machine changes the order of instructions to keep dispatching.
  - Substantially faster but also more complex.
Slide 23: VLIW architectures
- Very long instruction word (VLIW) processing provides significant parallelism.
- Relies on compilers to identify parallelism.
Slide 24: What is VLIW?
- Parallel function units with a shared register file.
[Figure: several function units share one register file and are fed by a common instruction decode and memory unit.]
Slide 25: VLIW cluster
- Organized into clusters to accommodate the available register bandwidth.
[Figure: the machine is built from several clusters.]
Slide 26: VLIW and compilers
- VLIW requires considerably more sophisticated compiler technology than traditional architectures: the compiler must be able to extract parallelism to keep the instructions full.
- Many VLIWs have good compiler support.
Slide 27: Static scheduling
[Figure: the compiler packs a list of expressions into VLIW instructions, filling an unused slot with a nop.]
Slide 28: Trace scheduling
- Rank paths in order of frequency.
- Schedule paths in order of frequency.
[Figure: control-flow graph with a conditional (1), two alternative blocks (2, 3), a loop head (4), and a loop body (5).]
Slide 29: EPIC
- EPIC = explicitly parallel instruction computing.
- Used in the Intel/HP Merced (IA-64) machine.
- Incorporates several features that allow the machine to find and exploit increased parallelism.
Slide 30: IA-64 instruction format
- Instructions are bundled with a tag that indicates which instructions can be executed in parallel:
[Figure: a 128-bit bundle consisting of a tag followed by instruction 1, instruction 2, and instruction 3.]
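A small C sketch of the bundle layout (the 5-bit template and 41-bit slot widths are the published IA-64 encoding, not stated on the slide; the template value below is arbitrary):

    /* IA-64-style bundle: a tag (template) plus three instruction slots. */
    #include <stdint.h>
    #include <stdio.h>

    typedef struct {
        uint8_t  template_field;   /* 5 bits: says which slots may run in parallel */
        uint64_t slot[3];          /* three 41-bit instruction slots */
    } Bundle;                      /* 5 + 3*41 = 128 bits when packed */

    int main(void) {
        Bundle b = { 0x10, { 0, 0, 0 } };   /* arbitrary template value */
        printf("template=0x%02x, 3 slots of 41 bits, %d bits total\n",
               b.template_field, 5 + 3 * 41);
        return 0;
    }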
Slide 31: Memory system
- The CPU fetches data and instructions from a memory hierarchy:
[Figure: CPU, L1 cache, L2 cache, main memory, from fastest and smallest to slowest and largest.]
Slide 32: Memory hierarchy complications
- Program behavior is much more state-dependent.
  - It depends on how earlier execution left the cache.
- Execution time is less predictable.
  - Memory access times can vary by 100x.
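A small C sketch of that state dependence (an illustration, not from the slides; the measured numbers depend entirely on the machine): the same loop is timed twice, and the second pass runs faster because the first pass left the array in the cache.

    /* Time a cold pass and a warm pass over the same array. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define N (1 << 20)   /* ~4 MB of ints: bigger than L1/L2, fits many last-level caches */

    int main(void) {
        int *a = calloc(N, sizeof *a);       /* zeroed; nothing cached yet */
        if (!a) return 1;
        long sum = 0;

        clock_t t0 = clock();
        for (int i = 0; i < N; i++) sum += a[i];   /* cold pass: cache misses (and page faults) */
        clock_t t1 = clock();
        for (int i = 0; i < N; i++) sum += a[i];   /* warm pass: mostly cache hits */
        clock_t t2 = clock();

        printf("cold: %ld ticks, warm: %ld ticks (sum=%ld)\n",
               (long)(t1 - t0), (long)(t2 - t1), sum);
        free(a);
        return 0;
    }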