Download presentation
Presentation is loading. Please wait.
Published byLaura Parsons Modified over 9 years ago
1
Computer Systems Organization and Architecture Topic 3: Processor Design
2
CPU must: ◦ Fetch instructions (suruhan ambil) ◦ Interpret _______ (tafsir _______) ◦ _______ data (_______ data) ◦ Process data (proses data) ◦ Write data (tulis data)
4
Registers form the highest level of the memory hierarchy (hierarki ingatan) ◦ Small set of high speed storage locations ◦ ______ storage for data and control information Two types of registers ◦ User-visible May be referenced by assembly-level instructions (suruhan paras perhimpunan) and are thus “_______” to the user ◦ Control (kawalan) and _______ registers Used to control the operation of the CPU Most are not visible to the user
5
General categories based on function ◦ General purpose (Serba guna) Can be assigned a variety of functions Ideally, they are defined _______ to the operations within the instructions ◦ _______ These registers only hold data ◦ Address (Alamat) These registers only hold _______ information Examples: general purpose address registers, segment pointers, stack pointers, index registers ◦ _______ codes (Kod _______) Visible to the user but values set by the CPU as the result of performing operations Example code bits: _______, _______, overflow (limpahan) Bit values are used as the basis for conditional jump instructions (suruhan lompat bersyarat)
6
Design trade off (tukar ganti) between general purpose and specialized registers ◦ General purpose registers _______ flexibility in instruction design ◦ _______ purpose registers permit implicit register specification in instructions - reduces register field size in an instruction ◦ No clear “best” design approach How many registers are enough? ◦ More registers permit more operands (kendalian) to be held within the CPU - reducing memory bandwidth requirements to some extent ◦ More registers cause an _______ in the field sizes needed to specify registers in an instruction word ◦ Locality of reference may not support too many registers ◦ Most machines use _______registers
7
How big (wide)? ◦ Address registers should be _______ enough to hold the longest address ◦ Data registers should be wide enough to hold most data types Would not want to use _______-bit registers if the vast majority of data operations used 16 and 32-bit operands Related to width of memory _______ bus Concatenate registers together to store longer formats B-C registers in the 8085 AccA-AccB registers in the 68HC11
8
These registers are used during the _______, decoding (penyahkodan) and _______ of instructions ◦ Many are not visible to the user / programmer ◦ Some are visible but can not be (easily) modified Typical registers ◦ _______ counter (PC) Points to the next instruction to be executed ◦ _______ register (IR) Contains the instruction being executed (most recently) ◦ Memory _______ register (MAR) Contains the address of a location in memory ◦ Memory _______ / _______ register (MBR) Contains a word of data to be written to memory or the word most recently read ◦ Program _______ word(s) Superset of condition code register Interrupt masks, supervisory modes, etc. Status information
9
A set of bits Includes Condition Codes _______ ◦ Contains the sign of the result of the last arithmetic operation _______ ◦ Set when the result is 0 _______ ◦ Set if an operation resulted in a carry (addition) into or borrow (subtraction) out of a high-order bit _______ ◦ Set if a logical compare result is equality _______ ◦ Used to indicate arithmetic overflow Interrupt enable/disable ◦ Used to enable or disable interrupts Supervisor ◦ Indicates whether the CPU is executing in supervisor or user mode
11
_______ Cycle ◦ May require memory access to fetch operands ◦ _______ addressing requires more memory accesses ◦ Can be thought of as additional instruction ________
13
Depends on CPU design In general: _______ ◦ PC contains _______ of next instruction ◦ Address moved to _______ ◦ Address placed on address bus ◦ Control unit requests memory read ◦ Result placed on _______ bus, copied to MBR, then to IR ◦ Meanwhile PC _______ by 1
14
IR is examined If indirect addressing, indirect cycle is _______ ◦ Right most N bits of _______ transferred to _______ ◦ Control unit requests memory _______ ◦ Result (address of _______) moved to MBR
17
May take many forms Depends on _______ being executed May include ◦ _______ read/write ◦ Input/Output ◦ _______ transfers ◦ _______ operations
18
_______ Current PC saved to allow resumption after interrupt Contents of PC copied to MBR Special memory location (e.g. _______ pointer) loaded to MAR MBR written to _______ PC loaded with address of interrupt handling routine Next instruction (first of _______ handler) can be fetched
20
Prefetch ◦ Fetch accessing main _______ ◦ Execution usually does not _______ main memory ◦ Can fetch next instruction during execution of current instruction ◦ Called instruction _______ Improved Performance ◦ But not doubled: Fetch usually _______ than execution Prefetch more than one instruction? Any jump or _______ means that prefetched instructions are not the required instructions ◦ Add more _______ to improve performance
21
The Central Processing Unit (CPU) is the _______ combination (kombinasi lojik) of the _______ _______ _______ (ALU) and the system’s control unit In this sub-section, we focus on the ALU and its operation ◦ Overview of the ALU ◦ Data representation (Perwakilan data) ◦ Computer Arithmetic and its hardware implementation
22
The ALU is that part of the computer that actually performs _______ and _______ operations on data All other elements of the computer system are there mainly to bring _______ to the ALU for processing or to take _______ from the ALU Registers are used as _______ and _______ for most ALU operations In early machines, _______ and _______ determined the overall structure of the CPU and its ALU ◦ Result was that machines were built around a single register, known as the __________ (penumpuk) ◦ The __________ was used in almost all ALU related _________
23
The _______ and _______of the CPU and the ALU is improved through increases in the complexity of the hardware ◦ Use _______ register sets to store operands, addresses and results ◦ _______ the capabilities of the ALU ◦ Use special hardware to support _______ of execution between points in a program ◦ _______ functional units within the ALU to permit concurrent operations Problem: design a minimal cost yet fully functional ALU ◦ What building block components would be included?
24
Solution: ◦ Only 2 basic _______ are required to produce a fully functional ALU A bit-wide _______ _______ unit A 2-input _______ gate ◦ NAND is a functionally complete logic operation ◦ Similarly, if you can add, all other arithmetic operations can be derived from addition. ◦ To conduct operations on _______ bit words is clearly tedious (menjemukan)! ◦ Goal then is to develop arithmetic and logic circuitry that is algorithmically _______ while remaining cost effective
25
_______-_______ format ◦ Positional representation using n bits ◦ Left most bit position is the sign bit 0 for _______ number 1 for _______ number ◦ Remaining n-1 bits represent the _______ ◦ Range: {-2 n-1 -1, +2 n-1 -1} ◦ Problems: Sign must be considered during arithmetic operations Dual representation of zero (-0 and +0)
26
Ones ______________ format ◦ Binary case of diminished (menyusut) _______ complement ◦ Negative numbers are represented by a bit-by-bit ______________ of the (positive) magnitude (the process of negation) ◦ Sign bit interpreted as in sign-magnitude format ◦ Examples (8-bit words): +42 = 0 00101010 - 42 = 1 11010101 ◦ Still have a _______ representation for zero (all zeros and all ones)
27
Twos ______________ format ◦ Binary case of radix complement ◦ Negative numbers, -X, are represented by the pseudo- positive number 2 n - |X| ◦ With 2 n symbols 2 n-1 -1 _______ numbers 2 n-1 _______ numbers ◦ Given the representation for +X, the representation for -X is found by taking the 1s complement of +X and adding 1 ◦ Caution: avoid confusion with “2s complement _______ (representation) and the 2s complement _______
28
◦ Converting between two word lengths (e.g., convert an 8- bit format into a 16-bit format) requires a sign extension: The _______ bit is extended from its current location up to the new location All bits in the extension take on the value of the old _______ bit +18= 00010010 +18= 00000000 00010010 -18= 11101110 -18= 11111111 11101110
30
Use of a single _______ adder is the simplest hardware ◦ Must implement an n-repetition for-loop for an n-bit addition ◦ This is lots of _______ for a typical addition Use a _______ adder unit instead ◦ n full adder units cascaded together ◦ In adding X and Y together unit i adds X i and Y i to produce SUM i and CARRY i ◦ Carry out of each stage is the carry in to the next stage ◦ Worst case add time is n times the delay of each unit -- despite the _______ operation of each adder unit -- Order (n) delay ◦ With signed numbers, watch out for _______: when adding 2 positive or 2 negative numbers, _______ has occurred if the result has the _______ sign
31
Alternatives to the ripple adder ◦ Must allow for the worst case delay in a ripple adder ◦ In most cases, _______ signals do not propagate through the entire adder ◦ Provide additional hardware to detect where carries will occur or when the carry _______ is completed ◦ Carry Completion Sensing Adders use additional circuitry to detect the time when all carries are completed Signal control unit that add is finished Essentially an ______________ device Typical add times are O(log n)
32
◦ Carry ___________ Adders Predict in advance what adder stage of a ripple adder will generate a carry out Use prediction to avoid the carry propagation delays -- generate all of the carries at once Add time is a _______, regardless of the width, n, of the word -- O(1) Problem: prediction in stage i requires information from all previous stages -- gates to implement this require large numbers of inputs, making this adder impractical for even moderate values of n
33
To perform X-Y, realize that X-Y = X+(-Y) Therefore, the following hardware is “typical”
35
A number of methods exist to perform integer multiplication ◦ Repeated _______: add the multiplicand to itself “multiplier” times ◦ Shift and add -- traditional “pen and paper” way of multiplying (extended to binary format) ◦ High speed (special purpose) hardware multipliers _______ addition ◦ Least sophisticated method ◦ Just use adder over and over again ◦ If the multiplier is n bits, can have as many as 2 n iterations of addition -- O(2 n ) !!!! ◦ Not used in an _______
36
Shift and add ◦ Computer’s version of the pen and paper approach: 1011 (11) x1101 (13) =========== 1011 00000 Partial products 101100 1011000 =========== 10001111 (143) ◦ The computer version accumulates the partial products into a running (partial) sum as the algorithm progresses ◦ Each partial product generation results in an _______ and _______ operation
37
Shift and add hardware for unsigned integers
38
Shift and add flowchart for unsigned integers
39
To multiply signed numbers (2s ____________) ◦ Normal shift and add does not work (problem in the basic algorithm of no sign extension to 2 n bits) ◦ ________ all numbers to their positive magnitudes, multiple, then figure out the correct sign ◦ Use a method that works for both positive and negative numbers ________ algorithm is popular (recoding the multiplier) ◦ ________ algorithm As in S&A, strings of 0s in the ________ only require shifting (no addition steps) “Recode” strings of 1s to permit similar ________ String of 1s from 2 u down to 2 v is treated as 2 u+1 - 2 v
40
In other words, - At the right end of a string of 1s in the multiplier, perform a ________ - At the left end of the string perform an ________ - For all of the 1s in between, just do ________ Hardware modifications required in (Figure shift and add hardware for unsigned integers) - Ability to perform ________ - Ability to perform ________ shifting rather than logical shifting (for sign extension) - A flip flop for bit Q -1 To determine ________ (add and shift, subtract and shift, shift) examine the bits Q 0 Q -1 - 00 or 11: just shift - 10: ________ and shift - 01: ________ and shift
41
Booth’s algorithm for multiplication
42
Advantages of Booth: - Treats positive and negative numbers ________ - Strings of 1s and 0s can be skipped over with shift operations for faster ________ time High performance multipliers ◦ ________ the computation time by employing more hardware than would normally be found in a S&A-type multiplier unit ◦ Not generally found in general-purpose processors due to expense ◦ Examples Combinational hardware multipliers Pipelined Wallace Tree adders from Carry-Save Adder units
43
Once you have committed to implementing multiplication, implementing division is a relatively easy next step that utilizes much of the same hardware Want to find quotient, Q, and remainder, R, such that D = Q x V + R Restoring division for ________ integers ◦ Algorithm adapted from the traditional “pen and paper” approach ◦ Algorithm is of time complexity O(n) for n-bit dividend ◦ Uses essentially the same ALU hardware as the ________ multiplication algorithm Adder / subtractor unit ________ wide shift register AQ that can be shifted to the left ________ for the divisor Control logic
44
Restoring division algorithm for unsigned integers
45
For two’s complement numbers, must deal with the ________ extension “problem” Algorithm: ◦ Load M with divisor, AQ with dividend (using sign bit extension) ◦ ________ AQ left 1 position ◦ If M and A have same sign, A A-M, otherwise A A+M ◦ Q 0 1 if sign bit of A has not changed or (A=0 AND Q=0), otherwise Q 0 =0 and restore *A ◦ Repeat ________ and +/- operations for all bits in Q ◦ Remainder is in A, quotient in Q If the signs of the divisor and the dividend were the same, quotient is correct, otherwise, Q is the 2’s complement of the quotient
46
2’s complement division examples
47
________ fixed point schemes do not have the ability to represent very large or very small numbers Need the ability to dynamically ________ the decimal point to a convenient location Format: +/-M x R +/-E Significand / mantissas are stored in a ________ format ◦ Either 1.xxxxx or 0.1xxxxx ◦ Since the 1 is required, don’t need to explicitly store it in the data word -- insert it for calculations only Exponents can be positive or negative values ◦ Use ________ (Excess coding) to avoid operating on negative exponents ◦ ________ is added to all exponents to store as positive numbers
49
For a fixed n-bit representation length, 2 n combinations of symbols ◦ If floating point ________ the range of numbers in the format (compared to integer representation) then the “spacing” between the numbers must increase This causes a ________ in the format’s precision ◦ If more bits are allocated to the exponent, range is ________ at the expense of decreased precision ◦ Similarly, more significand bits increases the ________ and reduces the range ◦ The ________ is chosen at design time and is not explicitly represented in the format Small -- smaller range Large -- increased range but loss of significant bits as a result of mantissa alignment when normalizing
50
Problems to deal with in the format ◦ Representation of ________ ◦ Over and ________ and how to detect ◦ ________ operations IEEE 754 format ◦ Defines single and double ________ formats (32 and 64 bits) ◦ Standardizes formats across many different platforms ◦ Radix 2 ◦ Single Range 10 -38 to 10 +38 8-bit exponent with 127 bias 23-bit mantissa ◦ Double Range 10 -308 to 10 +308 11-bit exponent with 1023 bias 52-bit mantissa
51
IEEE 754 Formats
52
Floating point arithmetic operations ◦ Addition and subtraction ________ significand Add or subtract significand Post ________ ◦ Multiplication ________ exponents Multiply significand Post normalize ◦ Division ________ exponents Divide significand Post normalize
56
In this section, we have focused on the operation of the CPU ◦ Registers and their use ◦ Instruction execution Looked at the basicd concepts associated with computer arithmetic ◦ Number representation ◦ Basic ALU construction ◦ Hardware and software implementations of multiplication and division operations ◦ Floating point numbers and operations
58
Computer Organization and Architecture, 6th Edition. Stallings, W. Prentice Hall. Computer Organization and Design. David A. Patterson, John L. Hennessy. Morgan Kaufmann
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.