Published by James Arron Ray. Modified over 9 years ago.
1
INSTRUCTION PIPELINING
2
What is pipelining? Instruction pipelining improves CPU performance by overlapping instruction fetch with instruction execution. The 8086 microprocessor has two blocks: the BIU (Bus Interface Unit) and the EU (Execution Unit). The BIU performs all bus operations, such as fetching instructions, reading and writing operands in memory, and calculating the addresses of memory operands. The fetched instruction bytes are transferred to the instruction queue. The EU executes instructions from the instruction byte queue. Both units operate asynchronously, which gives the 8086 an overlapping instruction fetch and execution mechanism called pipelining.
3
INSTRUCTION PIPELINING The first stage fetches an instruction and buffers it. When the second stage is free, the first stage passes it the buffered instruction. While the second stage is executing the instruction, the first stage takes advantage of any unused memory cycles to fetch and buffer the next instruction. This is called instruction prefetch or fetch overlap.
4
Inefficiency in two-stage instruction pipelining There are two reasons: 1. Execution time is generally longer than fetch time, so the fetch stage may have to wait for some time before it can empty its buffer. 2. When a conditional branch occurs, the address of the next instruction to be fetched is unknown, so the execution stage has to wait while the next instruction is fetched.
5
Two-stage instruction pipelining [Figure: simplified view and expanded view of a two-stage pipeline with Fetch and Execute stages; labels: instruction, result, new address, wait, discard]
6
Decomposition of instruction processing To gain further speedup, the pipeline must have more stages (six in this case): Fetch instruction (FI), Decode instruction (DI), Calculate operands, i.e. effective addresses (CO), Fetch operands (FO), Execute instruction (EI), Write operand (WO).
7
SIX STAGES OF INSTRUCTION PIPELINING
Fetch Instruction (FI): Read the next expected instruction into a buffer.
Decode Instruction (DI): Determine the opcode and the operand specifiers.
Calculate Operands (CO): Calculate the effective address of each source operand.
Fetch Operands (FO): Fetch each operand from memory. Operands in registers need not be fetched.
Execute Instruction (EI): Perform the indicated operation and store the result.
Write Operand (WO): Store the result in memory.
8
Timing diagram for instruction pipeline operation
9
High efficiency of instruction pipelining The timing diagram assumes the following: all stages are of equal duration; each instruction goes through all six stages of the pipeline; all stages can be performed in parallel; there are no memory conflicts, so all accesses can occur simultaneously. Under these assumptions the instruction pipeline works very efficiently and gives high performance.
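The ideal case described above can be sketched in a few lines of Python. This is only an illustration under the slide's assumptions (equal stage durations, no conflicts, no branches). The function names and the choice of nine instructions are hypothetical and do not come from the slides.

```python
# Ideal six-stage pipeline: every stage takes one time unit and
# a new instruction enters the pipeline in every time unit.
STAGES = ["FI", "DI", "CO", "FO", "EI", "WO"]

def ideal_schedule(num_instructions, stages=STAGES):
    """Return {instruction: {stage: time_unit}} for an ideal pipeline."""
    schedule = {}
    for i in range(1, num_instructions + 1):
        # Instruction i occupies stage number s (0-based) in time unit i + s.
        schedule[i] = {stage: i + s for s, stage in enumerate(stages)}
    return schedule

def total_time(num_instructions, num_stages=len(STAGES)):
    """Time units needed to finish all instructions in the ideal case."""
    # The first instruction needs num_stages units; each later one adds 1.
    return num_stages + (num_instructions - 1)

if __name__ == "__main__":
    n = 9  # assumed example size
    print("pipelined time units:  ", total_time(n))
    print("unpipelined time units:", n * len(STAGES))
```

With k stages and n instructions the ideal pipeline finishes in k + (n - 1) time units, compared with n × k time units without overlap.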
10
Limits to performance enhancement The factors affecting performance are: 1. If the six stages are not of equal duration, there will be some waiting time at various stages. 2. A conditional branch instruction can invalidate several instruction fetches. 3. An interrupt is an unpredictable event. 4. Register and memory conflicts. 5. The CO stage may depend on the contents of a register that could be altered by a previous instruction that is still in the pipeline.
11
Effect of conditional branch on instruction pipeline operation
12
Conditional branch instructions Assume that instruction 3 is a conditional branch to instruction 15. Until the instruction is executed, there is no way of knowing which instruction will come next. The pipeline simply loads the next instruction in sequence and executes it. The branch is not determined until the end of time unit 7. During time unit 8, instruction 15 enters the pipeline. No instruction completes during time units 9 through 12. This is the performance penalty incurred because the branch could not be anticipated.
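The penalty can be reproduced with a small calculation. The sketch below assumes the six-stage pipeline from the earlier slides, one time unit per stage, and a branch that is resolved at the end of its EI stage. The function names are hypothetical.

```python
NUM_STAGES = 6  # FI, DI, CO, FO, EI, WO

def completion_times(branch_instr=3, target=15, after_target=1):
    """Time unit in which each surviving instruction finishes (WO stage)."""
    done = {}
    # Instructions up to and including the branch enter one per time unit.
    for i in range(1, branch_instr + 1):
        done[i] = i + NUM_STAGES - 1
    # The branch is resolved at the end of its EI stage (its fifth stage).
    resolve_time = branch_instr + 4
    # Instructions fetched after the branch are discarded; the target
    # is fetched in the time unit after the branch is resolved.
    first_fetch = resolve_time + 1
    for k in range(after_target + 1):
        done[target + k] = first_fetch + k + NUM_STAGES - 1
    return done

def idle_units(done):
    """Time units between first and last completion with no completion."""
    finished = set(done.values())
    return [t for t in range(min(finished), max(finished) + 1)
            if t not in finished]

if __name__ == "__main__":
    times = completion_times()
    print(times)
    print("no completions in time units:", idle_units(times))
```

With the default arguments, instruction 15 is fetched in time unit 8 and completes in time unit 13, so time units 9 through 12 produce no completed instruction.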
13
Simple pattern for high performance Two factors frustrate this simple pattern for high performance: 1. At each stage of the pipeline, there is some overhead involved in moving data from buffer to buffer and in performing various preparation and delivery functions. This overhead lengthens the execution time of a single instruction. It is significant when sequential instructions are logically dependent, either through heavy use of branching or through memory access dependencies. 2. The amount of control logic required to handle memory and register dependencies and to optimize the use of the pipeline increases enormously with the number of stages.
14
Six-stage CPU instruction pipeline
15
THANK YOU
16
8086 Pin Function By: Madhu Oruganti SNIST,
17
Pin Diagram
18
Pin Functions Of the 40 pins, 32 have the same function in minimum and maximum mode; the remaining 8 pins have different functions in the two modes. The following pins have the same function in both modes.
19
Symbol: AD15 - AD0, Pin No. 39, 2-16 Type: I/O ADDRESS DATA BUS: time multiplexed memory/IO address (T1), and data (T2, T3, TW, T4) bus. These lines are active HIGH and float to 3-state OFF during interrupt acknowledge and local bus ``hold acknowledge''.
20
Symbol: A19/S6, A18/S5, A17/S4, A16/S3 Pin No.: 35 - 38 Type: O Address/Status lines: address during T1, then status during T2, T3, TW, T4. S5: IF flag condition; S6: LOW.
A17/S4    A16/S3  Characteristics
0 (Low)   0       Alternate Data
0         1       Stack
1 (High)  0       Code or none
1         1       Data
21
Symbol: BHE#/S7 Pin No.: 34 Type: O Bus High Enable / Status:
BHE#  A0  Characteristics
0     0   Whole word from even location
0     1   Upper byte from/to odd address
1     0   Lower byte from/to even address
1     1   None
22
Symbol: RD# Pin No.: 32 Type: O Read: RD# is active LOW during read cycle in T2, T3 and Tw clocks and indicates that processor is performing memory or I/O read
23
Symbol: READY Pin No.: 22 Type: I Ready signal is received from memory or I/O devices to indicate the completion of data transfer Synchronized by 8284 clock generator
24
Symbol: INTR Pin No.: 18 Type: I Interrupt Request: Level triggered input received from interrupting device Sampled during last clock of each instruction cycle A subroutine is vectored through IVT if interrupt enable flag (IF) is SET
25
Symbol: TEST# Pin No.: 23 Type: I Test: Input is examined by the ‘wait’ instruction, if TEST# is LOW processor will continue execution otherwise wait in an idle state.
26
Symbol: NMI Pin No.: 17 Type: I Non Maskable Interrupt: Edge triggered input causes a TYPE 2 interrupt. Not maskable internally by software.
27
Symbol: RESET Pin No.: 21 Type: I Reset: Input causes the processor to immediately terminate its present activity Must be HIGH for at least 4 clock cycles
28
Symbol: CLK Pin No.: 19 Type: I Clock: provides the basic timing for the processor and bus controller. It is asymmetric with a 33% duty cycle to provide optimized internal timing.
29
Symbol: Vcc Pin No.: 40 Vcc: +5V power supply pin.
30
Symbol: GND Pin No.: 1, 20 GROUND
31
Symbol: MN/MX# Pin No.: 33 Type: I MINIMUM/MAXIMUM: indicates which mode the processor is to operate in. HIGH indicates minimum mode (single-processor system); LOW indicates maximum mode (multiprocessor system).
32
Pins having different functions in maximum mode Pins 24 to 31 have different functions in maximum mode, as explained below.
33
Symbol: S2#, S1#, S0# Pin No.: 26-28 Type: O Status: active during T4, T1, and T2 and is returned to the passive state (1, 1, 1) during T3 or during TW when READY is HIGH Used by the 8288 Bus Controller to generate all memory and I/O access control signals
34
S2  S1  S0  Characteristics
0   0   0   Interrupt Acknowledge
0   0   1   Read I/O Port
0   1   0   Write I/O Port
0   1   1   Halt
1   0   0   Code Access
1   0   1   Read Memory
1   1   0   Write Memory
1   1   1   Passive
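As an illustration, the status table can be turned into a lookup that treats S2 S1 S0 as a 3-bit number. This is a sketch of the decoding only, not of the 8288 Bus Controller itself, and the names are hypothetical.

```python
# Bus cycle types indexed by the 3-bit value S2 S1 S0 (S2 is the MSB).
BUS_CYCLES = [
    "Interrupt Acknowledge",  # 000
    "Read I/O Port",          # 001
    "Write I/O Port",         # 010
    "Halt",                   # 011
    "Code Access",            # 100
    "Read Memory",            # 101
    "Write Memory",           # 110
    "Passive",                # 111
]

def decode_status(s2, s1, s0):
    """Return the bus cycle type for the logic levels on S2#, S1#, S0#."""
    return BUS_CYCLES[(s2 << 2) | (s1 << 1) | s0]

if __name__ == "__main__":
    print(decode_status(1, 0, 1))
```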
35
Symbol: RQ#/GT0#, RQ#/GT1# Pin No.: 30, 31 Type: I/O Request/Grant: these pins are used by other local bus masters to force the processor to release the local bus at the end of the processor's current bus cycle. RQ#/GT0# has higher priority than RQ#/GT1#.
36
Symbol: LOCK# Pin No.: 29 Type: O LOCK: output indicates that other system bus masters are not to gain control of the system bus while LOCK is active LOW. Activated by the ``LOCK'' prefix instruction and remains active until the completion of the next instruction.
37
Symbol: QS1, QS0 Pin No.: 24, 25 Type: O Queue Status: The queue status is valid during the CLK cycle after which the queue operation is performed.
QS1  QS0  Characteristics
0    0    No Operation
0    1    First Byte of Op Code from Queue
1    0    Empty the Queue
1    1    Subsequent Byte from Queue
38
Pins having different functions in minimum mode Pins 24 to 31 have different functions in minimum mode, as explained below.
39
Symbol: M/IO# Pin No.: 28 Type: O Status Line: used to distinguish a memory access from an I/O access HIGH for memory operation and LOW for I/O operations
40
Symbol: WR# Pin No.: 29 Type: O Write: indicates that the processor is performing a write memory or write I/O cycle
41
Symbol: INTA# Pin No.: 24 Type: O Interrupt Acknowledgement: used as a read strobe for interrupt acknowledge cycles Active LOW during T2, T3 and TW of each interrupt acknowledge cycle.
42
Symbol: ALE Pin No.: 25 Type: O Address Latch Enable: It is a HIGH pulse active during T1 of any bus cycle Provided by the processor to latch the address into the 8282/8283 address latch.
43
Symbol: DT/R# Pin No.: 27 Type: O Data Transmit/Receive: used to control the direction of data flow through the transceiver
44
Symbol: DEN# Pin No.: 26 Type: O Data Enable: provided as an output enable for the 8286/8287 in a minimum system which uses the transceiver
45
Symbol: HOLD, HLDA Pin No.: 31, 30 Type: I, O Hold: indicates that another master is requesting a local bus "hold". The processor receiving the "hold" request will issue HLDA (HIGH) as an acknowledgement.
46
Email:madhuoruganti@sreenidhi.edu.in
47
Combinational Circuits Madhu Oruganti. SNIST
48
Outline Boolean Algebra Decoder Encoder MUX
49
History: Computer and the Rationalist Modern research issues in AI are formed and evolve through a combination of historical, social and cultural pressures. The rationalist tradition had an early proponent in Plato, and was continued through the writings of Pascal, Descartes, and Leibniz. For the rationalist, the external world is reconstructed through the clear and distinct ideas of mathematics.
50
History: Development of Formal Logic The goal of creating a formal language for thought also appears in the work of George Boole, another 19th-century mathematician whose work must be included in the roots of AI. The importance of Boole’s accomplishment is in the extraordinary power and simplicity of the system he devised: Three Operations
51
Three Operations three basic Boolean operations can be defined arithmetically as follows. x ∧ y=xy x ∨ y=x + y − xy ¬x=1 − x
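A quick way to convince oneself that these arithmetic definitions behave like AND, OR and NOT on the values 0 and 1 is to compare them with Python's own logical operators. The function names below are hypothetical; this is a minimal check, not part of the original slides.

```python
from itertools import product

def b_and(x, y):
    return x * y          # x AND y = xy

def b_or(x, y):
    return x + y - x * y  # x OR y = x + y - xy

def b_not(x):
    return 1 - x          # NOT x = 1 - x

def check_operations():
    """Compare the arithmetic forms with Python's logic on all 0/1 inputs."""
    for x, y in product((0, 1), repeat=2):
        if b_and(x, y) != int(bool(x) and bool(y)):
            return False
        if b_or(x, y) != int(bool(x) or bool(y)):
            return False
        if b_not(x) != int(not x):
            return False
    return True

if __name__ == "__main__":
    print("arithmetic definitions agree with AND/OR/NOT:", check_operations())
```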
52
Boolean function and logic diagram Boolean algebra: Deals with binary variables and logic operations operating on those variables. Logic diagram: Composed of graphic symbols for logic gates. A simple circuit sketch that represents inputs and outputs of Boolean functions.
53
Basic Identities of Boolean Algebra
(1) x + 0 = x
(2) x · 0 = 0
(3) x + 1 = 1
(4) x · 1 = x
(5) x + x = x
(6) x · x = x
(7) x + x’ = 1
(8) x · x’ = 0
(9) x + y = y + x
(10) xy = yx
(11) x + (y + z) = (x + y) + z
(12) x(yz) = (xy)z
(13) x(y + z) = xy + xz
(14) x + yz = (x + y)(x + z)
(15) (x + y)’ = x’y’
(16) (xy)’ = x’ + y’
(17) (x’)’ = x
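Because each variable takes only the values 0 and 1, any identity can be verified by trying every combination. The sketch below does this for the distributive laws (13), (14), De Morgan's laws (15), (16) and double complement (17). It uses Python's bitwise operators on 0/1 values as a stand-in for the Boolean operations, and the helper names are hypothetical.

```python
from itertools import product

def NOT(x):
    return 1 - x

# Each entry compares the left and right side of one identity.
# On 0/1 values, | plays the role of + (OR) and & the role of . (AND).
IDENTITIES = {
    13: lambda x, y, z: (x & (y | z)) == ((x & y) | (x & z)),
    14: lambda x, y, z: (x | (y & z)) == ((x | y) & (x | z)),
    15: lambda x, y, z: NOT(x | y) == (NOT(x) & NOT(y)),
    16: lambda x, y, z: NOT(x & y) == (NOT(x) | NOT(y)),
    17: lambda x, y, z: NOT(NOT(x)) == x,
}

def holds(identity):
    """True if the identity is satisfied for all 8 input combinations."""
    return all(identity(x, y, z) for x, y, z in product((0, 1), repeat=3))

results = {number: holds(check) for number, check in IDENTITIES.items()}

if __name__ == "__main__":
    for number, ok in results.items():
        print(f"identity ({number}):", "holds" if ok else "fails")
```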
54
Gates Refer to the hardware to implement Boolean operators. The most basic gates are
55
Boolean function and truth table
56
Outline Boolean Algebra Decoder Encoder MUX
57
Decoder Accepts a value and decodes it. The active output corresponds to the value of the n inputs. Consists of: Inputs (n); Outputs (2^n, numbered from 0 to 2^n - 1); Selectors / Enable (active high or active low)
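A behavioural sketch of such a decoder, assuming an active-high enable and active-high outputs (the function name and signature are hypothetical):

```python
def decode(value, n, enable=1):
    """Return the 2**n output lines of an n-input decoder as a list of 0/1.

    Output number `value` is driven to 1 when the decoder is enabled;
    every other output is 0. With enable=0 all outputs are 0.
    """
    if not 0 <= value < 2 ** n:
        raise ValueError("value does not fit in n input bits")
    return [1 if enable and i == value else 0 for i in range(2 ** n)]

if __name__ == "__main__":
    print(decode(2, 2))        # 2-to-4 decoder, input value 2
    print(decode(5, 3))        # 3-to-8 decoder, input value 5
    print(decode(5, 3, 0))     # disabled decoder
```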
58
The truth table of 2-to-4 Decoder
59
2-to-4 Decoder
61
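The truth table of 3-to-8 Decoder
A2 A1 A0 | D0 D1 D2 D3 D4 D5 D6 D7
0  0  0  | 1  0  0  0  0  0  0  0
0  0  1  | 0  1  0  0  0  0  0  0
0  1  0  | 0  0  1  0  0  0  0  0
0  1  1  | 0  0  0  1  0  0  0  0
1  0  0  | 0  0  0  0  1  0  0  0
1  0  1  | 0  0  0  0  0  1  0  0
1  1  0  | 0  0  0  0  0  0  1  0
1  1  1  | 0  0  0  0  0  0  0  1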
62
3-to-8 Decoder
63
3-to-8 Decoder with Enable
64
Decoder Expansion Combine two or more small decoders with enable inputs to form a larger decoder. Example: a 3-to-8-line decoder constructed from two 2-to-4-line decoders. The MSB (A2) is connected to the enable inputs: if A2 = 0, the upper decoder is enabled; if A2 = 1, the lower decoder is enabled.
65
Decoder Expansion
66
Combining two 2-to-4 decoders to form one 3-to-8 decoder using the enable input. The highest bit is used for the enables.
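The expansion can be sketched as follows, assuming active-high enables and that the upper decoder drives D0..D3 while the lower one drives D4..D7. The function names are hypothetical.

```python
def decoder_2to4(a1, a0, enable):
    """2-to-4 decoder with an active-high enable input."""
    index = (a1 << 1) | a0
    return [1 if enable and i == index else 0 for i in range(4)]

def decoder_3to8(a2, a1, a0):
    """3-to-8 decoder built from two 2-to-4 decoders.

    The highest bit A2 selects which small decoder is enabled:
    A2 = 0 enables the upper decoder, A2 = 1 enables the lower one.
    """
    upper = decoder_2to4(a1, a0, enable=1 - a2)  # outputs D0..D3
    lower = decoder_2to4(a1, a0, enable=a2)      # outputs D4..D7
    return upper + lower

if __name__ == "__main__":
    for value in range(8):
        bits = ((value >> 2) & 1, (value >> 1) & 1, value & 1)
        print(bits, decoder_3to8(*bits))
```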
67
How about a 4-to-16 decoder? How many 3-to-8 decoders are needed? How many 2-to-4 decoders?
68
Outline Boolean Algebra Decoder Encoder Mux
69
Encoders Perform the inverse operation of a decoder: 2^n (or fewer) input lines and n output lines
70
Encoders
71
Encoders with OR gates
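In an OR-gate encoder, each output bit is the OR of all input lines whose index has that bit set. The sketch below assumes a plain (non-priority) encoder with exactly one active input; the function name is hypothetical.

```python
def encode(lines):
    """Encode a one-hot list of 2**n inputs into n output bits (MSB first)."""
    if sum(lines) != 1:
        raise ValueError("a plain encoder expects exactly one active input")
    n = (len(lines) - 1).bit_length()  # number of output lines
    outputs = []
    for bit in reversed(range(n)):
        # OR together every input whose index has this bit set.
        feeding = [lines[i] for i in range(len(lines)) if (i >> bit) & 1]
        outputs.append(int(any(feeding)))
    return outputs

if __name__ == "__main__":
    one_hot = [0, 0, 0, 0, 0, 1, 0, 0]   # input line 5 is active
    print(encode(one_hot))
```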
73
Outline Boolean Algebra Decoder Encoder Mux
74
Multiplexer (MUX) A selector chooses a single data input and passes it to the MUX output It has one output selected at a time. A multiplexer can use addressing bits to select one of several input bits to be the output.
75
Function table with enable
76
4-to-1 line multiplexer
S1  S0  F
0   0   I0
0   1   I1
1   0   I2
1   1   I3
For a 2^n-to-1 MUX, n is 2 for this MUX, which means 2 selection lines, S0 and S1.
77
Multiplexer (MUX) Consists of: Inputs (multiple) = 2^n; Output (single); Selectors (number depends on number of inputs) = n; Enable (active high or active low)
78
Multiplexers versus decoders A multiplexer uses n binary select bits to choose from a maximum of 2^n unique input lines. Decoders have 2^n output lines, while multiplexers have only one output line. The output of the multiplexer is the data input whose index is specified by the n-bit code.
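Behaviourally, a multiplexer is an indexed selection: the select bits form a binary number and the input with that index appears on the output. A minimal sketch, with a hypothetical function name and the select bits given MSB first:

```python
def mux(inputs, select_bits):
    """Return the data input whose index is given by the select bits.

    `inputs` must contain 2**n entries for n select bits (MSB first).
    """
    n = len(select_bits)
    if len(inputs) != 2 ** n:
        raise ValueError("need exactly 2**n inputs for n select bits")
    index = 0
    for bit in select_bits:
        index = (index << 1) | bit  # build the binary index bit by bit
    return inputs[index]

if __name__ == "__main__":
    data = ["I0", "I1", "I2", "I3"]
    for s1 in (0, 1):
        for s0 in (0, 1):
            print(s1, s0, mux(data, [s1, s0]))
```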
79
Multiplexer Versus Decoder Note that the multiplexer has an extra OR gate. A1 and A0 are the two inputs of the decoder. The multiplexer has four data inputs plus two selects. [Figure: 4-to-1 multiplexer and 2-to-4 decoder]
80
Cascading multiplexers Using three 2-to-1 MUXes to make one 4-to-1 MUX
S1  S0  F
0   0   I0
0   1   I1
1   0   I2
1   1   I3
81
Example: Construct an 8-to-1 multiplexer using 2-to-1 multiplexers. [Figure: tree of 2-to-1 MUXes with input pairs I0 I1, I2 I3, I4 I5, I6 I7, selects S0, S1, S2, enable E, and output F]
S2  S1  S0  F
0   0   0   I0
0   0   1   I1
0   1   0   I2
0   1   1   I3
1   0   0   I4
1   0   1   I5
1   1   0   I6
1   1   1   I7
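One way to read the construction is as a three-level tree of seven 2-to-1 multiplexers: S0 selects within each input pair, S1 selects between pairs, and S2 selects between the two halves. The sketch below models that tree; the enable input is left out and the function names are hypothetical.

```python
def mux2(a, b, s):
    """2-to-1 multiplexer: output a when s = 0, b when s = 1."""
    return b if s else a

def mux8(inputs, s2, s1, s0):
    """8-to-1 multiplexer built only from 2-to-1 multiplexers."""
    # Level 1: four MUXes, each choosing within a pair using S0.
    level1 = [mux2(inputs[i], inputs[i + 1], s0) for i in range(0, 8, 2)]
    # Level 2: two MUXes choosing between neighbouring pairs using S1.
    level2 = [mux2(level1[i], level1[i + 1], s1) for i in range(0, 4, 2)]
    # Level 3: one MUX choosing between the two halves using S2.
    return mux2(level2[0], level2[1], s2)

if __name__ == "__main__":
    data = [f"I{i}" for i in range(8)]
    for value in range(8):
        bits = ((value >> 2) & 1, (value >> 1) & 1, value & 1)
        print(bits, mux8(data, *bits))
```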
82
Example: Construct an 8-to-1 multiplexer using one 2-to-1 multiplexer and two 4-to-1 multiplexers.
S2  S1  S0  X
0   0   0   I0
0   0   1   I1
0   1   0   I2
0   1   1   I3
1   0   0   I4
1   0   1   I5
1   1   0   I6
1   1   1   I7
83
Quadruple 2-to-1 Line Multiplexer Used to supply four bits to the output. In this case two inputs four bits each.
84
Quadruple 2-to-1 Line Multiplexer
E (Enable)  S (Select)  Y (Output)
0           X           All 0’s
1           0           A
1           1           B
85
Sequential circuits part 2: implementation, analysis & design
86
More summer fashion SR is one of 4 basic flip flops common in computer design Others can all be constructed from SR; they are: JK D (data) T (toggle)
87
JK flip flop Resolves undefined transition in SR J input acts like S (sets device) K acts like R (resets) When JK = 11, have toggle condition: switch from one state to other
88
Implementation of JK flip flop
89
JK flip flop implementation If JK = 00, SR = 00 because of AND – so SR won’t change state when clocked
90
JK flip flop implementation If JK = 10, R must be 0: if Q=0, Q’=1, so SR=10, the set condition: flip flop will change state (to Q=1) if Q=1, Q’=0, SR=00 (stable condition) so flip flop stays in Q=1
91
JK flip flop implementation If JK = 01, final state is Q=0 (analogous to JK=10)
92
JK flip flop implementation If JK=11, Q connects directly to R, Q’ to S so if Q=0, SR=10, so Q=1 if Q=1, SR=01, so Q=0
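The four cases above follow from feeding the SR inputs through AND gates, S = J AND Q’ and R = K AND Q. A small next-state model makes this easy to check; the function names are hypothetical and the clock is implicit (each call represents one clock pulse).

```python
def sr_next(q, s, r):
    """Next state of an SR flip flop after one clock pulse."""
    if s and r:
        raise ValueError("SR = 11 is undefined")
    if s:
        return 1   # set
    if r:
        return 0   # reset
    return q       # SR = 00: hold the current state

def jk_next(q, j, k):
    """JK flip flop built from an SR flip flop and two AND gates."""
    s = j & (1 - q)   # S = J AND Q'
    r = k & q         # R = K AND Q
    return sr_next(q, s, r)

if __name__ == "__main__":
    for j in (0, 1):
        for k in (0, 1):
            for q in (0, 1):
                print(f"J={j} K={k} Q={q} -> Q(t+1)={jk_next(q, j, k)}")
```

Because S and R are gated by Q’ and Q, they can never both be 1, so the undefined SR = 11 case is never reached.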
93
D flip flop D: data; one input + CP Q(t+1) independent of Q(t) – depends only on value of D at time t D flip flop holds data until next pulse
94
Constructing registers Can use D flip flops to construct individual bits of registers – one signal sent to each bit Setting/resetting flip flop requires a 1 signal on exactly one of its input lines – CP restricts incoming signal to appropriate time so device remains in sync D is split in 2, with one half inverted – so always 1 true, 1 false on data line Since CP usually false, both inputs normally 0 (no change in flip flop) When clock goes high, one of 2 lines (S or R) delivers 1
95
Device select signal Used in combination with CP & D signals to determine if register should send or receive data When one register is to send to another, 3 simultaneous signals sent to each register: clock device select send or receive All 3 ANDed together to indicate that specific register should send or receive at specific time
96
T flip flop T stands for Toggle like D, has one input + CP acts like control line that specifies selective toggle if T=0, flip flop doesn’t change; if T=1, toggles
97
Implementation of T flip flop Identical to JK, with J=K
98
General sequential network Sequential circuit: interconnection of gates & flip flops All gates can be grouped conceptually as combinational network, all flip flops as group of state registers Between clock pulses, combinational part produces output; amount of time needed depends on number of gates in net
99
General sequential network Arrows: one or more connecting lines I/O lines: connections to external environment Arrow between boxes: input lines to flip flops Clock line assumed but not shown
100
Hardware analysis vs. design Analysis: determine output given input and sequential network Design: input and output are known; need to determine makeup of sequential network General approach: construct state transition table and transition diagram determine output stream for given input stream
101
Excitation table The excitation table is a design tool for constructing circuits from a given type of flip-flop Given the desired transition from Q(t) to Q(t +1), what inputs are necessary to make the transition happen?
102
Characteristic table vs. excitation table for SR flip flop Characteristic table: tells what the next state is, given the current input and current state. Excitation table: tells what the current input must be, given the current state and the desired next state.
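The relationship between the two tables can be shown by deriving one from the other: for every desired transition, search the characteristic function for the inputs that produce it. This is a sketch for the SR flip flop only, with hypothetical function names; the undefined input SR = 11 is excluded from the search.

```python
from itertools import product

def sr_next(q, s, r):
    """Characteristic function of the SR flip flop (SR = 11 not allowed)."""
    if s:
        return 1
    if r:
        return 0
    return q

def excitation_sr():
    """Map each transition (Q(t), Q(t+1)) to the (S, R) inputs causing it."""
    table = {}
    for q, q_next in product((0, 1), repeat=2):
        table[(q, q_next)] = [
            (s, r)
            for s, r in product((0, 1), repeat=2)
            if (s, r) != (1, 1) and sr_next(q, s, r) == q_next
        ]
    return table

if __name__ == "__main__":
    for transition, inputs in excitation_sr().items():
        print(transition, inputs)
```

A transition with two possible input pairs corresponds to a don't care entry in the usual excitation table; for example, 0 to 0 works with SR = 00 or SR = 01, i.e. S = 0, R = X.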
103
Sequential analysis Step 1: List all possible combinations of current state and current input in an analysis table Step 2: For each combination, compute the output and the current inputs to the state registers Step 3: From the characteristic table, determine the next state and construct the state transition table and diagram
104
Example problem State registers: FFA & FFB (T flip flops) Combinational circuit inputs: X1 AND B (TA) X2 OR A (TB) TA & TB are inputs to FFA & FFB output: B’ AND X1 (Y)
105
Example problem 2 flip flops, so 4 possible states: AB 00 01 10 11 2 inputs, so 4 possible input combinations: X1X2 00 01 10 11
106
Example problem Given a state (AB) and an input (X1X2): what is output? what will be the state after CP? 16 possible answers, as shown on next slide
107
Analysis table for sample problem circuit The first four columns list the possible combinations of initial state and initial input. From the logic diagram, we know: Y(t) = X1(t) AND B’(t); TA(t) = X1(t) AND B(t); TB(t) = X2(t) OR A(t). Compute the next three columns from these equations. Compute the last two from: the characteristic table for the T flip flop, the initial state of the flip flop, and the flip flop’s initial input.
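The 16 rows of the analysis table can be generated mechanically from the three equations and the T flip flop rule (the state toggles when T = 1, i.e. Q(t+1) = Q(t) XOR T). The sketch below is one way to do that; the function name and the column order are assumptions.

```python
from itertools import product

def analysis_table():
    """Rows of (A, B, X1, X2, Y, TA, TB, A_next, B_next) for the example."""
    rows = []
    for a, b, x1, x2 in product((0, 1), repeat=4):
        y = x1 & (1 - b)    # Y  = X1 AND B'
        ta = x1 & b         # TA = X1 AND B
        tb = x2 | a         # TB = X2 OR A
        a_next = a ^ ta     # T flip flop: toggle when T = 1
        b_next = b ^ tb
        rows.append((a, b, x1, x2, y, ta, tb, a_next, b_next))
    return rows

if __name__ == "__main__":
    print("A B X1 X2 | Y TA TB | A+ B+")
    for a, b, x1, x2, y, ta, tb, a_next, b_next in analysis_table():
        print(f"{a} {b} {x1}  {x2}  | {y} {ta}  {tb}  | {a_next}  {b_next}")
```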
108
State transition table The table is a simple rearrangement of selected columns from the table on the previous slide. For a given initial state A(t)B(t) and input X1(t)X2(t), it lists the next state A(t+1)B(t+1) and the initial output Y(t). Entries are listed as ordered pairs: next state followed by initial output.
109
State transition diagram Easier to visualize circuit behavior Transitions listed as ordered pairs of input followed by initial output, with slash separator
110
Asynchronous inputs An asynchronous input changes state of a flip-flop immediately without regard to CP Preset sets Q to 1 Clear clears Q to 0 Used to initialize the state of a machine Normal operation: both lines 0
111
Sequential design Given the state transition diagram, the output, and the type of flip-flop to be used, design the combinational circuit. Any unused input combinations or unused states are don’t care conditions. 2^n states are possible with n flip-flops.
112
Design steps Step 1: In a design table, list the initial state, input, and output, and from the transition diagram list the next state Step 2: Use the excitation table for the given type of flip-flop to determine the input required for the state registers Step 3: Use Karnaugh maps to design a minimized two-level circuit for each flip-flop input
113
Sample problem
114
Design table for sample problem
115
Sequential design & K-maps Each flip flop in the problem can be considered a function of four variables: initial state (AB) input (X1X2) To design the combinational circuit we need a 4-variable K-map for each flip flop input
116
K-maps for sample problem Figures a and b below show K-maps for S & R inputs to FFA Row values are AB, columns are X1X2 X1X2 = 00 is a don’t care condition for both inputs, so first column of both tables is X
117
K-maps for sample problem Figures c and d show inputs to FFB Note that we can take advantage of don’t care conditions to minimize circuit
118
Resulting circuit with original spec
119
K-map & circuit for output Y
120
Another look at the register Basic building block of instruction set architecture array of D flip flops; each is bit in register common clock line connected to all flip flops; # of flip flops doesn’t affect speed of load operation because all receive clock signal simultaneously
121
Memory Conceptually, main memory is just a big array of registers Input: address lines, control lines, data lines Data lines are bidirectional (output also) Control signals: CS: Chip select, to enable or select the memory chip WE: Write enable, to write or store a memory word to the chip OE: Output enable, to enable the output buffer to read a word from the chip
122
Memory chips Storage capacity of each is identical (512 bits); the left one uses 8-bit words, the right one uses 1-bit words. Generally, a chip with 2^n words has n address lines.
123
Memory access To store a word (memory write) Select chip by setting CS to 1 Put data and address on the bus and set WE to 1 To retrieve a word (memory read) Select chip by setting CS to 1 Put address on the bus, set OE to 1, and read the data on the bus
124
4 x 2 memory chip 2 address lines (A0, A1) & 2 data lines (D0, D1) Stores 4 2-bit words each bit is D flip flop Address lines drive 2 x 4 decoder 1 output is 1, other 3 0 line with 1 signal selects row of D flip flops that make up word accessed by chip
125
Closer look Diagram below shows implementation of “Read enable” box Alphabet soup: WE: write enable CS: chip select OE: output enable MMV: monostable multivibrator (CP)
126
Read Enable Three normal modes: CS=0 (chip not selected) CS=1, WE=1, OE=0 (chip selected for write) CS=1, WE=0, OE=1 (chip selected for read) WE & OE not permitted to be 1 at same time
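The three modes can be modelled with a small class. This is a behavioural sketch of a 4 x 2 chip under the slide's rules (active-high CS, WE and OE; WE and OE never 1 together), not a gate-level model of the circuit. The class and method names are hypothetical.

```python
class MemoryChip:
    """Behavioural model of a small memory chip with CS, WE and OE."""

    def __init__(self, address_lines=2, word_bits=2):
        self.word_bits = word_bits
        self.words = [0] * (2 ** address_lines)  # 2**n words for n lines

    def access(self, cs, we, oe, address, data=0):
        """Apply one set of control signals; return the word on a read."""
        if we and oe:
            raise ValueError("WE and OE must not be 1 at the same time")
        if not cs:
            return None                 # chip not selected: no effect
        if we:
            mask = (1 << self.word_bits) - 1
            self.words[address] = data & mask   # memory write
            return None
        if oe:
            return self.words[address]  # memory read
        return None

if __name__ == "__main__":
    chip = MemoryChip()
    chip.access(cs=1, we=1, oe=0, address=2, data=0b11)
    print(chip.access(cs=1, we=0, oe=1, address=2))
```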
127
Memory types: volatile SRAM: Static random access memory most closely resembles model we’ve seen advantage: fast disadvantage: large – several transistors required for each bit cell DRAM: Dynamic RAM overcomes size problem of SRAM: one transistor, one capacitor per cell advantage: high capacity disadvantage: relatively slow because requires refresh operation
128
Memory types: non-volatile ROM: Read-only memory Simplest type, ROM, is prewritten to spec by manufacturer – can’t be overwritten PROM: Programmable ROM: user can write once (by blowing embedded fuses) – can’t be overwritten EPROM: Erasable PROM: can be wiped out & reprogrammed (requires removal from computer)
129
Memory types: non-volatile EEPROM: Electrically erasable PROM Like EPROM, but doesn’t require removal to reprogram Can reprogram individual cell (doesn’t have to be whole chip) Flash memory: A type of EEPROM flash card is array of flash chips flash drive has interface circuitry to mimic hard drive