Published by Merryl Fisher. Modified over 8 years ago.
1
MAC/VU-Advanced Computer Architecture Lecture 6- Instruction Set Principles (3)
CS 704 Advanced Computer Architecture
Lecture 6: Instruction Set Principles (ISA Performance Analysis, Fallacies and Pitfalls)
Prof. Dr. M. Ashraf Chughtai
2
Today's Topics
- Recap of Lecture 5
- DSP Media Operations
- ISA Performance
- Putting it all Together
- Summary
3
Recap: Lecture 5
Instruction encoding
- Essential elements of a computer instruction word:
  - Type of operands
  - Places of sources and destinations
  - Place of next instruction
- Instruction word length:
  - Variable length
  - Fixed length
  - Hybrid (variable and fixed)
- Categories of hybrid length: 4-, 3-, 2-, 1- and 0-address formats
4
Recap: Lecture 5 (Cont'd)
- Comparison of hybrid instruction word formats:
  - The minimum number of memory bytes is required by the 1-address (accumulator) format
  - The maximum is required by the 4-address format
- MIPS instruction word format:
  - MIPS64 is a RISC load/store architecture with 64-bit registers and fixed-length (32-bit) instruction words
  - It supports:
    - 8-, 16-, 32- and 64-bit operands
    - R-type, I-type and J-type instruction formats
    - Arithmetic and logic operations
    - Data transfer operations
    - Control flow operations
5
Media and Signal Processing Operands
- Graphics applications deal with 2D and 3D images
- The 3D data type is called a vertex
- A vertex structure has four components: x-, y-, z- and w-coordinates
- Three vertices specify a graphics primitive such as a triangle; the fourth coordinate (w) helps with color and hidden surfaces
- Vertex values are usually 32-bit floating-point values
- DSPs add fixed point to the data types, with the binary point just to the right of the sign bit
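
The DSP fixed-point data type can be illustrated with a minimal Python sketch. It assumes a 16-bit word (a layout often called Q15, a name the slides do not use), so that with the binary point just right of the sign bit the representable values lie in [-1, 1). The helper names `to_q15` and `from_q15` are hypothetical.

```python
Q15_SCALE = 1 << 15  # assumption: 16-bit word, 15 fractional bits after the sign bit


def to_q15(x: float) -> int:
    """Encode a real value in [-1, 1) as a signed 16-bit fixed-point integer."""
    raw = int(round(x * Q15_SCALE))
    # Clamp to the representable range of a signed 16-bit word.
    return max(-Q15_SCALE, min(Q15_SCALE - 1, raw))


def from_q15(raw: int) -> float:
    """Decode a signed 16-bit fixed-point integer back to a real value."""
    return raw / Q15_SCALE
```

For example, `to_q15(0.5)` yields the integer 16384, and `from_q15(16384)` recovers 0.5; the value 1.0 itself is not representable and is clamped to the largest code.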
6
3D Data Type
- A triangle is visible when it is depicted as filled with pixels
- Pixels are typically 32 bits, usually consisting of four 8-bit channels:
  - R: red
  - G: green
  - B: blue
  - A: transparency of the pixel when it is depicted
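
A short Python sketch of the 32-bit pixel layout follows. The slides do not say which channel occupies which byte, so the ordering used here (R in the most significant byte, A in the least significant) is an assumption, and `pack_rgba` / `unpack_rgba` are hypothetical helper names.

```python
def pack_rgba(r: int, g: int, b: int, a: int) -> int:
    """Pack four 8-bit channels into one 32-bit pixel (assumed order: R, G, B, A)."""
    for channel in (r, g, b, a):
        if not 0 <= channel <= 0xFF:
            raise ValueError("each channel must fit in 8 bits")
    return (r << 24) | (g << 16) | (b << 8) | a


def unpack_rgba(pixel: int) -> tuple:
    """Split a 32-bit pixel back into its (R, G, B, A) channels."""
    return (
        (pixel >> 24) & 0xFF,
        (pixel >> 16) & 0xFF,
        (pixel >> 8) & 0xFF,
        pixel & 0xFF,
    )
```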
7
Media and Signal Processing Operations
- Data for multimedia operations is usually much narrower than the 64-bit data word of modern processors
- Thus, a 64-bit word may be partitioned into four 16-bit data values so that the 64-bit ALU can perform four 16-bit operations (say, four adds) in a single clock cycle
8
Media and Signal Processing Operations
- Extra hardware is added to prevent carries from propagating between the four 16-bit partitions of the 64-bit ALU
- These operations are called Single-Instruction Multiple-Data (SIMD) or vector operations
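
The effect of blocking the carry between partitions can be sketched in Python. This is a software model only, not a description of any particular processor; the function names and the lane numbering (lane 0 in the least significant 16 bits) are assumptions made for illustration.

```python
LANES = 4
LANE_BITS = 16
LANE_MASK = (1 << LANE_BITS) - 1


def pack_lanes(values):
    """Pack four 16-bit values into one 64-bit word (lane 0 is least significant)."""
    word = 0
    for i, v in enumerate(values):
        word |= (v & LANE_MASK) << (i * LANE_BITS)
    return word


def unpack_lanes(word):
    """Split a 64-bit word into its four 16-bit lanes."""
    return [(word >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]


def simd_add16(a: int, b: int) -> int:
    """Add four 16-bit lanes independently; the carry out of each lane is dropped."""
    result = 0
    for i in range(LANES):
        shift = i * LANE_BITS
        lane_sum = ((a >> shift) & LANE_MASK) + ((b >> shift) & LANE_MASK)
        result |= (lane_sum & LANE_MASK) << shift  # mask discards the carry
    return result
```

With lanes `[0xFFFF, 1, 2, 3]` plus `[1, 1, 1, 1]`, the partitioned add gives `[0, 2, 3, 4]`, whereas an ordinary 64-bit add would let the carry from lane 0 leak into lane 1 and give `[0, 3, 3, 4]`.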
9
Multimedia Operations
- Most graphics multimedia applications use 32-bit floating-point operations
- Some computers therefore allow a single instruction to launch two 32-bit operations on operands found side by side in a double-precision register
- The following table summarizes the SIMD instructions found in recent computers
10
Summary of SIMD instructions in recent computers
[Table given in Fig. 2.17, page 110, to be inserted here]
11
Multimedia Operations
- There is very little in common across the five architectures
- All are fixed-width operations, performing multiple narrow operations on either a 64-bit or a 128-bit ALU
- The narrow operations are shown as B (byte), H (half word) and W (word); 8B denotes eight byte-wide operations packed in one double word
12
Digital Signal Processing Issues
- Saturating add/subtract: results that are too large (overflow)
- Result rounding: choosing among the IEEE 754 rounding modes
- Multiply-accumulate: vector and matrix dot-product operations
13
DSP Operations
Saturating Add/Subtract
- A DSP cannot ignore overflow, otherwise it may miss an event; it therefore uses saturating arithmetic
- If the result is too large to be represented, it is set to the largest representable number, depending on the sign of the result
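
The difference between saturating and ordinary wrap-around addition can be sketched in Python. A signed 16-bit operand width is assumed here purely for illustration, and the function names are hypothetical.

```python
INT16_MAX = 2**15 - 1   # 32767
INT16_MIN = -(2**15)    # -32768


def saturating_add16(a: int, b: int) -> int:
    """Add two signed 16-bit values, clamping the result on overflow."""
    total = a + b
    if total > INT16_MAX:
        return INT16_MAX
    if total < INT16_MIN:
        return INT16_MIN
    return total


def wrapping_add16(a: int, b: int) -> int:
    """Add two signed 16-bit values with two's-complement wrap-around."""
    total = (a + b) & 0xFFFF
    # Reinterpret the 16-bit pattern as a signed value.
    return total - 0x10000 if total & 0x8000 else total
```

Adding 30000 and 10000 saturates to 32767, while the wrapping version returns -25536: a large positive signal would suddenly appear negative, which is the kind of error saturating arithmetic avoids.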
14
DSP Operations
Result Rounding
- IEEE 754 defines several rounding modes; DSPs select the appropriate mode when rounding a wider accumulator into a narrower result
Multiply-Accumulate (MAC)
- MAC operations are the key to the dot-product operations of vector and matrix multiplication, which need to accumulate a series of products
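
A minimal Python sketch of a MAC-based dot product with a final rounding step is given below. It assumes 16-bit fixed-point operands with 15 fractional bits, so each product carries 30 fractional bits and is accumulated at full width. The round-half-up rule is only one possible rounding mode, chosen here for illustration, and all names are hypothetical.

```python
Q15_FRACTION_BITS = 15  # assumed operand format: signed 16-bit, 15 fractional bits


def mac(acc: int, x: int, y: int) -> int:
    """One multiply-accumulate step: acc + x * y, kept at full width."""
    return acc + x * y


def dot_product_wide(xs, ys) -> int:
    """Accumulate a series of products in a wide accumulator."""
    if len(xs) != len(ys):
        raise ValueError("vectors must have the same length")
    acc = 0
    for x, y in zip(xs, ys):
        acc = mac(acc, x, y)
    return acc


def round_to_q15(acc: int) -> int:
    """Round the wide accumulator (30 fractional bits) to 15 fractional bits.

    Uses round-half-up; no saturation is applied in this sketch.
    """
    half = 1 << (Q15_FRACTION_BITS - 1)
    return (acc + half) >> Q15_FRACTION_BITS
```

For the vectors (0.5, 0.5) and (0.5, 0.25), encoded as `[16384, 16384]` and `[16384, 8192]`, the rounded result is 12288, which decodes to 0.375.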
15
ISA Performance
Role of the Compiler
- The interaction of compilers and high-level languages significantly affects how programs use an ISA
- Optimizations performed by compilers can be classified as follows:
16
Classification of Performance Optimizations
- High-level optimization: often done on the source, with the output fed to later optimization passes
- Local optimization: done within a straight-line code fragment (basic block)
- Global optimization: extends local optimization across branches
- Register allocation: associates registers with operands
- Processor-dependent optimization: takes advantage of knowledge of the specific architecture
17
Impact of Compiler Technology
- The interaction of compiler and high-level language affects how a program uses an ISA
- Two important questions are:
  1: How are variables allocated?
  2: How many registers are needed to allocate variables appropriately?
- These questions are addressed by looking at the three areas in which high-level languages allocate data
18
Three areas of data allocation
1: Local variable area: stack
- It is used to allocate local variables
- It grows or shrinks on procedure call or return
- Objects on the stack are primarily scalars (single variables rather than arrays) and are addressed relative to the stack pointer
- Register allocation is much more effective for stack-allocated objects
19
Three areas of data allocation (Cont'd)
2: Global data area
- It is used to allocate statically declared objects such as global variables and constants
- These objects are mostly arrays and other aggregate data structures
- Register allocation is relatively less effective for global variables
- Global variables may be aliased: there are multiple ways to refer to their address, which makes it illegal to keep them in registers
20
Three areas of data allocation (Cont'd)
3: Dynamic object allocation: heap
- It is used to allocate objects that do not follow a stack discipline
- Objects in the heap are accessed with pointers and are typically not scalars
- Most heap variables are aliased, so register allocation is almost impossible for the heap
21
ISA Performance (Cont'd)
MIPS Floating-point Operations
- These instructions manipulate the floating-point registers
- They indicate whether the operation is to be performed in single precision or double precision
- MOV.S copies a single-precision register to another of the same type
- MOV.D copies a double-precision register to another of the same type
22
MIPS Floating-point Operations (Cont'd)
- To get greater performance for graphics routines, MIPS64 offers paired-single instructions
- These instructions perform two 32-bit floating-point operations, one on each half of the 64-bit floating-point register
- Examples: ADD.PS, SUB.PS, MUL.PS, DIV.PS
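
The idea behind a paired-single add can be modelled in Python with the standard `struct` module. This is a software illustration of two 32-bit values sharing one 64-bit register image, not MIPS code; the function names and the choice of which half holds which value are assumptions.

```python
import struct


def pack_ps(upper: float, lower: float) -> int:
    """Pack two single-precision values into one 64-bit register image."""
    hi, lo = struct.unpack(">II", struct.pack(">ff", upper, lower))
    return (hi << 32) | lo


def unpack_ps(reg: int) -> tuple:
    """Recover the (upper, lower) single-precision values from a 64-bit image."""
    raw = struct.pack(">II", (reg >> 32) & 0xFFFFFFFF, reg & 0xFFFFFFFF)
    return struct.unpack(">ff", raw)


def add_ps(a: int, b: int) -> int:
    """Model of a paired-single add: each half is added independently."""
    a_hi, a_lo = unpack_ps(a)
    b_hi, b_lo = unpack_ps(b)
    # Repacking rounds each sum back to single precision.
    return pack_ps(a_hi + b_hi, a_lo + b_lo)
```

Adding the pairs (1.5, 2.0) and (0.25, 4.0) produces the pair (1.75, 6.0) from a single modelled operation.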
23
Putting it All Together
- The earliest architectures were limited in their instruction sets by the hardware technology of the time
- In the 1960s, stack architectures became popular, viewed as a good match for high-level languages
- In the 1970s, the main concern of architects was to reduce software cost, which produced high-level-language architectures such as the VAX
24
Putting it All Together (Cont'd)
- In the 1980s, a return to simpler architectures took place, made possible by sophisticated compiler technology
- In the 1990s, new architectural features were introduced; these include:
25
Putting it All Together (Cont'd)
1990s Architectures
1: Address size doubling, from 32-bit to 64-bit
2: Optimization of conditional branches via conditional execution, e.g. conditional move
3: Optimization of cache performance via prefetch, which increased the role of the memory hierarchy in computer performance
4: Multimedia support
5: Faster floating-point instructions
6: Long instruction word
26
Concluding the Instruction Set Principles
- Three pillars of computer architecture: hardware, software and instruction set
- Instruction set: the interface between hardware and software
- Taxonomy of instruction sets: stack, accumulator and general-purpose register
- Types and sizes of operands:
  - Types: integer, FP and character
  - Sizes: half word, word, double word
- Classification of operations: arithmetic, data transfer, control and support
27
Concluding the Instruction Set Principles (Cont'd)
- Operand addressing modes: immediate, register, direct (absolute) and indirect
- Classification of indirect addressing: register, indexed, relative (i.e. with displacement) and memory
- Special addressing modes: auto-increment, auto-decrement and scaled
- Control instruction addressing modes: branch, jump and procedure call/return
28
Concluding the Instruction Set Principles (Cont'd)
Instruction encoding
- Essential elements of computer instructions: type of operands, places of sources and destinations, and place of next instruction
- Instruction word length: variable, fixed and hybrid
- Hybrid length taxonomy: 4-, 3-, 2-, 1- and 0-address formats
- Comparison of hybrid instruction word formats: the minimum number of memory bytes is required by the 1-address (accumulator) format and the maximum by the 4-address format
29
Concluding the Instruction Set Principles (Cont'd)
MIPS instruction word format
- MIPS64 is a RISC load/store architecture with 64-bit registers and fixed-length (32-bit) instruction words
- It supports:
  - 8-, 16-, 32- and 64-bit operands
  - R-type, I-type and J-type instruction formats
  - Arithmetic and logic operations
  - Data transfer operations
  - Control flow operations
30
Concluding the Instruction Set Principles (Cont'd)
Multimedia and Digital Signal Processing operands
- Graphics applications deal with 2D and 3D images
- DSPs add fixed point to the data types, with the binary point just to the right of the sign bit
Multimedia and Digital Signal Processing operations
- All are fixed-width operations, performing multiple narrow operations on either a 64-bit or a 128-bit ALU
- The narrow operations are B (byte), H (half word) and W (word); 8B denotes eight byte-wide operations packed in one double word
Multimedia and Digital Signal Processing issues
- Saturating add/subtract
- Result rounding
- Multiply-accumulate
31
Concluding the Instruction Set Principles (Cont'd)
ISA Performance
- Role of the compiler: the interaction of compilers and high-level languages significantly affects how programs use an ISA
32
Allah Hafiz and Assalam-u-Alaikum
33
Practice Problems
- Quantitative Principles [Lecture 2-3]
- Instruction Set Principles [Lecture 4-5]
34
Practice Problems
Quantitative Principles [Lecture 2-3]
1: Computer hardware is designed using an ISA that has three types of instructions (Type A, B and C). The clock cycles per instruction (CPI) for each type are as follows:
   Type A: 2 CPI
   Type B: 3 CPI
   Type C: 4 CPI
A compiler writer has written two different code sequences, with different instruction counts, to evaluate the same expression:

   Code Sequence | Instruction count for instruction type
                 |   A   |   B   |   C
         1       |   2   |   1   |   4
         2       |   3   |   2   |   1

a) What is the instruction count of each sequence?
b) Which sequence is faster?
c) What is the average CPI for each sequence?
35
Solution to Practice Problem 1
a) Instruction count of each sequence:
   Sequence 1 = 2 + 1 + 4 = 7
   Sequence 2 = 3 + 2 + 1 = 6
   Result: Sequence 2 executes fewer instructions
b) To find which sequence is faster, find the CPU clock cycles for each sequence:
   CPU clock cycles for sequence 1 = 2x2 + 3x1 + 4x4 = 23 cycles
   CPU clock cycles for sequence 2 = 2x3 + 3x2 + 4x1 = 16 cycles
   Result: Sequence 2 is faster
c) To find the CPI (CPU clock cycles / instruction count) of each sequence:
   CPI for sequence 1 = 23/7 = 3.29
   CPI for sequence 2 = 16/6 = 2.67
   Result: Sequence 2 has both fewer instructions and a lower CPI, so it is faster
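
The arithmetic for Practice Problem 1 can be checked with a short Python sketch. It reads the table in the problem statement as sequence 1 = (2, 1, 4) and sequence 2 = (3, 2, 1) instructions of types A, B and C; the dictionary layout and function names are hypothetical.

```python
CPI = {"A": 2, "B": 3, "C": 4}  # clock cycles per instruction, by type

# Instruction mix of each code sequence, as read from the problem's table.
SEQUENCES = {
    1: {"A": 2, "B": 1, "C": 4},
    2: {"A": 3, "B": 2, "C": 1},
}


def instruction_count(mix: dict) -> int:
    """Total number of instructions in a sequence."""
    return sum(mix.values())


def clock_cycles(mix: dict, cpi: dict) -> int:
    """CPU clock cycles = sum over types of (CPI of type x count of type)."""
    return sum(cpi[kind] * count for kind, count in mix.items())


def average_cpi(mix: dict, cpi: dict) -> float:
    """Average CPI = CPU clock cycles / instruction count."""
    return clock_cycles(mix, cpi) / instruction_count(mix)
```

Calling `clock_cycles(SEQUENCES[1], CPI)` and `clock_cycles(SEQUENCES[2], CPI)` gives 23 and 16 cycles; because both sequences run at the same clock rate, the one with fewer cycles is the faster one.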
36
Practice Problems
Instruction Set Principles [Lecture 4-5]