1
CHAPTER 9: PIPELINE AND VECTOR PROCESSING
2
CONTENTS
Parallel Processing
Pipelining
Arithmetic Pipeline
Instruction Pipeline
RISC Pipeline
Vector Processing
Array Processors
3
Figure 9-1 Processor with multiple functional units.
The processor registers, which connect to memory, feed the following functional units operating in parallel: adder-subtractor, integer multiply, logic unit, shift unit, incrementer, floating-point add-subtract, floating-point multiply, floating-point divide.
4
Classification by instruction stream and data stream (Flynn's classification).
Single instruction stream, single data stream (SISD).
Single instruction stream, multiple data stream (SIMD).
Multiple instruction stream, single data stream (MISD).
Multiple instruction stream, multiple data stream (MIMD).
5
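Figure 9-2 Example of pipelining.
Inputs Ai and Bi are loaded into R1 and R2, which feed the multiplier. The product is held in R3 and Ci in R4, which feed the adder. The sum is held in R5.
R1 ← Ai, R2 ← Bi          Input Ai and Bi
R3 ← R1 * R2, R4 ← Ci     Multiply and input Ci
R5 ← R3 + R4              Add Ci to product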
6
Table 9-1 Content of registers in pipeline example.
Clock pulse   Segment 1      Segment 2         Segment 3
number        R1     R2      R3       R4       R5
1             A1     B1      --       --       --
2             A2     B2      A1*B1    C1       --
3             A3     B3      A2*B2    C2       A1*B1+C1
4             A4     B4      A3*B3    C3       A2*B2+C2
5             A5     B5      A4*B4    C4       A3*B3+C3
6             A6     B6      A5*B5    C5       A4*B4+C4
7             A7     B7      A6*B6    C6       A5*B5+C5
8             --     --      A7*B7    C7       A6*B6+C6
9             --     --      --       --       A7*B7+C7
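The register contents in Table 9-1 can be traced with a short simulation. The following is a minimal sketch in Python, not part of the original slides; the function name `run_pipeline` and the use of `None` for an empty register are assumptions made for illustration.

```python
def run_pipeline(a, b, c):
    """Trace R1..R5 for the three-segment pipeline computing Ai*Bi + Ci."""
    n = len(a)
    r1 = r2 = r3 = r4 = r5 = None   # None marks an empty register
    trace, results = [], []
    for clock in range(n + 2):      # n inputs plus 2 pulses to drain the pipe
        # All registers load on the same clock pulse, so the new values are
        # computed from the old ones, last segment first.
        r5 = r3 + r4 if r3 is not None else None    # segment 3: add
        if r1 is not None:                          # segment 2: multiply, input Ci
            r3, r4 = r1 * r2, c[clock - 1]
        else:
            r3 = r4 = None
        if clock < n:                               # segment 1: input Ai and Bi
            r1, r2 = a[clock], b[clock]
        else:
            r1 = r2 = None
        trace.append((clock + 1, r1, r2, r3, r4, r5))
        if r5 is not None:
            results.append(r5)
    return trace, results


trace, results = run_pipeline([1, 2, 3], [4, 5, 6], [7, 8, 9])
for row in trace:
    print(row)
```

Each tuple in `trace` corresponds to one row of the table: the clock pulse number followed by R1 through R5. With seven input triples the loop runs for nine pulses, as in Table 9-1.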
7
Figure 9-3 Four-segment pipeline.
Input → S1 → R1 → S2 → R2 → S3 → R3 → S4 → R4, with a common clock applied to all registers.
8
Figure 9-4 Space-time diagram for pipeline.
Clock cycle:   1   2   3   4   5   6   7   8   9
Segment 1      T1  T2  T3  T4  T5  T6
Segment 2          T1  T2  T3  T4  T5  T6
Segment 3              T1  T2  T3  T4  T5  T6
Segment 4                  T1  T2  T3  T4  T5  T6
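The space-time diagram follows a simple rule: task Tj occupies segment i during clock cycle i + j - 1, so k segments finish n tasks in k + n - 1 cycles (4 + 6 - 1 = 9 in Figure 9-4). The sketch below, in Python, generates the diagram from that rule; the function name `space_time` is hypothetical.

```python
def space_time(k, n):
    """Return the rows of a space-time diagram for k segments and n tasks."""
    cycles = k + n - 1                  # cycles needed to complete all n tasks
    rows = []
    for seg in range(1, k + 1):
        row = []
        for cycle in range(1, cycles + 1):
            task = cycle - seg + 1      # task in this segment during this cycle
            row.append(f"T{task}" if 1 <= task <= n else "--")
        rows.append(row)
    return rows


for seg, row in enumerate(space_time(4, 6), start=1):
    print(f"Segment {seg}: " + " ".join(row))
```

Idle slots are printed as `--`. They appear while the pipe fills at the start and drains at the end.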
9
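Figure 9-5 Multiple functional units in parallel.
Successive instructions Ii, Ii+1, Ii+2, Ii+3 are each dispatched to a separate functional unit (P1, P2, P3, ...) so that they execute in parallel.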
10
Arithmetic Pipeline
1. Compare the exponents.
2. Align the mantissas.
3. Add or subtract the mantissas.
4. Normalize the result.
11
Figure 9-6 Pipeline for floating-point addition and subtraction.
Inputs: exponents a and b, mantissas A and B. Registers (R) separate the segments.
Segment 1: Compare exponents by subtraction (produces the difference).
Segment 2: Choose exponent; align mantissas.
Segment 3: Add or subtract mantissas.
Segment 4: Adjust exponent; normalize result.
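The four segments can be followed step by step in software. This is a minimal sketch in Python, assuming decimal operands given as (mantissa, exponent) pairs with fractional mantissas; the function name `fp_add` and the sample operands are illustrative and do not come from the slides. It runs the segments one after another for a single operand pair rather than overlapping several pairs as the hardware pipeline would.

```python
from decimal import Decimal


def fp_add(x, y):
    """Add two decimal floating-point numbers given as (mantissa, exponent)."""
    (ma, ea), (mb, eb) = x, y
    # Segment 1: compare the exponents by subtraction.
    diff = ea - eb
    # Segment 2: choose the larger exponent and align the other mantissa.
    if diff >= 0:
        exp, mb = ea, mb / Decimal(10) ** diff
    else:
        exp, ma = eb, ma / Decimal(10) ** (-diff)
    # Segment 3: add the mantissas (pass a negated mantissa to subtract).
    m = ma + mb
    # Segment 4: normalize so that 0.1 <= |m| < 1, adjusting the exponent.
    while abs(m) >= 1:
        m, exp = m / 10, exp + 1
    while m != 0 and abs(m) < Decimal("0.1"):
        m, exp = m * 10, exp - 1
    return m, exp


mantissa, exponent = fp_add((Decimal("0.9504"), 3), (Decimal("0.8200"), 2))
print(mantissa, exponent)
```

For 0.9504 × 10^3 and 0.8200 × 10^2, the aligned mantissas sum to 1.0324, which normalization turns into 0.10324 × 10^4.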
12
Instruction Pipeline
1. Fetch the instruction from memory.
2. Decode the instruction.
3. Calculate the effective address.
4. Fetch the operands from memory.
5. Execute the instruction.
6. Store the result in the proper place.
13
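Figure 9-7 Four-segment CPU pipeline.
Segment 1: Fetch instruction from memory.
Segment 2: Decode instruction and calculate effective address. Branch? If yes, update PC and empty the pipe.
Segment 3: Fetch operand from memory.
Segment 4: Execute instruction. Interrupt? If yes, perform interrupt handling, update PC and empty the pipe; if no, continue with the next instruction.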
14
Segments and their purpose.
FI is the segment that fetches an instruction.
DA is the segment that decodes the instruction and calculates the effective address.
FO is the segment that fetches the operand.
EX is the segment that executes the instruction.
15
Figure 9-8 Timing of instruction pipeline.
Step:            1   2   3   4   5   6   7   8   9   10  11  12  13
Instruction: 1   FI  DA  FO  EX
             2       FI  DA  FO  EX
    (Branch) 3           FI  DA  FO  EX
             4               FI  --  --  FI  DA  FO  EX
             5                   --  --  --  FI  DA  FO  EX
             6                                   FI  DA  FO  EX
             7                                       FI  DA  FO  EX
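The timing in Figure 9-8 can be reproduced with a simple rule: instruction i normally enters FI at step i, and every instruction after a branch is delayed until the branch has left EX. The sketch below, in Python, encodes that rule. The function name `timing` is hypothetical, a single branch is assumed, and the discarded first fetch of the instruction that follows the branch (step 4 in the figure) is not recorded.

```python
SEGMENTS = ["FI", "DA", "FO", "EX"]


def timing(num_instructions, branch):
    """Map each instruction number to {step: segment} for a 4-segment pipeline.

    `branch` is the number of the single branch instruction. Instructions
    after it are fetched only once the branch has completed EX.
    """
    k = len(SEGMENTS)
    table = {}
    for i in range(1, num_instructions + 1):
        # Without a branch, instruction i enters FI at step i; after the
        # branch, the pipe loses k - 1 steps.
        start = i if i <= branch else i + (k - 1)
        table[i] = {start + s: SEGMENTS[s] for s in range(k)}
    return table


for instr, steps in timing(7, 3).items():
    print(instr, steps)
```

With seven instructions and a branch at instruction 3, the last instruction completes EX at step 13, matching the figure.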
16
Pipeline Conflicts
Resource conflicts
Data dependency conflicts
Branch difficulties
17
Three-segment instruction pipeline
I: Instruction fetch
A: ALU operation
E: Execute instruction
18
Delayed Load
1. LOAD:  R1 ← M[address 1]
2. LOAD:  R2 ← M[address 2]
3. ADD:   R3 ← R1 + R2
4. STORE: M[address 3] ← R3
19
Figure 9-9 Three-segment pipeline timing.
(a) Pipeline timing with data conflict
Clock cycles:      1  2  3  4  5  6
1. Load R1         I  A  E
2. Load R2            I  A  E
3. Add R1+R2             I  A  E
4. Store R3                 I  A  E
(b) Pipeline timing with delayed load
Clock cycles:      1  2  3  4  5  6  7
1. Load R1         I  A  E
2. Load R2            I  A  E
3. No-operation          I  A  E
4. Add R1+R2                I  A  E
5. Store R3                    I  A  E
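The delayed-load fix in Figure 9-9 amounts to inserting a no-operation after any load whose result is read by the very next instruction. A compiler pass for this might look like the sketch below, in Python. The tuple layout (opcode, destination, sources) and the function name `insert_delayed_loads` are assumptions made for illustration, not something the slides specify.

```python
def insert_delayed_loads(program):
    """Insert a NOP after any LOAD whose result the next instruction reads.

    Each instruction is (opcode, destination, sources); this layout is an
    assumption made for the sketch.
    """
    out = []
    for instr in program:
        if out:
            prev_op, prev_dest, _ = out[-1]
            _, _, sources = instr
            # The loaded value is not yet available when the next
            # instruction reaches segment A, so delay it by one cycle.
            if prev_op == "LOAD" and prev_dest in sources:
                out.append(("NOP", None, ()))
        out.append(instr)
    return out


program = [
    ("LOAD", "R1", ("M[address 1]",)),
    ("LOAD", "R2", ("M[address 2]",)),
    ("ADD", "R3", ("R1", "R2")),
    ("STORE", "M[address 3]", ("R3",)),
]
for instr in insert_delayed_loads(program):
    print(instr)
```

Applied to the four-instruction program from the Delayed Load slide, the pass places one NOP between the second LOAD and the ADD, giving the five-instruction sequence of the delayed-load timing.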
20
Figure 9-10 Examples of delayed branch: (a) using no-operation instructions.
Clock cycles:          1  2  3  4  5  6  7  8  9  10
1. Load                I  A  E
2. Increment              I  A  E
3. Add                       I  A  E
4. Subtract                     I  A  E
5. Branch to X                     I  A  E
6. No-operation                       I  A  E
7. No-operation                          I  A  E
8. Instruction in X                         I  A  E
21
Figure 9-10 Examples of delayed branch: (b) rearranging the instructions.
Clock cycles:          1  2  3  4  5  6  7  8
1. Load                I  A  E
2. Increment              I  A  E
3. Branch to X               I  A  E
4. Add                          I  A  E
5. Subtract                        I  A  E
6. Instruction in X                   I  A  E
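The rearrangement in Figure 9-10 moves the two instructions that precede the branch into the slots that would otherwise hold no-operations. The sketch below, in Python, performs that move and counts clock cycles. The function names are hypothetical, and the code assumes without checking that the moved instructions do not affect the branch, which a real compiler would have to verify.

```python
def fill_delay_slots(program, slots=2):
    """Move the `slots` instructions just before the branch to just after it.

    Assumes (without checking) that the moved instructions do not affect the
    branch condition or target, so the reordering preserves program meaning.
    """
    b = next(i for i, name in enumerate(program) if name.startswith("Branch"))
    moved = program[max(0, b - slots):b]
    return program[:b - len(moved)] + [program[b]] + moved + program[b + 1:]


def cycles(num_instructions, segments=3):
    """Clock cycles for an unstalled pipeline to finish the given instructions."""
    return segments + num_instructions - 1


original = ["Load", "Increment", "Add", "Subtract", "Branch to X"]
print(fill_delay_slots(original))
print(cycles(8), cycles(6))   # with two NOPs vs. rearranged, each plus the instruction in X
```

With two no-operations the sequence has eight instructions and takes 10 clock cycles; rearranged, it has six instructions and takes 8.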
22
Applications of Vector Processing
Long-range weather forecasting.
Petroleum exploration.
Seismic data analysis.
Medical diagnosis.
Aerodynamics and space flight simulations.
23
Figure 9-11 Instruction format for vector processor.
| Operation code | Base address source 1 | Base address source 2 | Base address destination | Vector length |
24
Figure 9-12 Pipeline for calculating an inner product.
Source A and Source B feed a multiplier pipeline, whose output feeds an adder.
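The data flow of Figure 9-12 can be mimicked in software: each product leaving the multiplier is added to a running partial sum. The sketch below, in Python, assumes a four-segment adder pipeline, which the slide does not state. With that assumption the adder keeps four interleaved partial sums that are combined at the end. The function name `inner_product` is illustrative.

```python
def inner_product(a, b, adder_segments=4):
    """Inner product using one interleaved partial sum per adder segment.

    The number of adder segments is an assumption made for this sketch.
    """
    partial = [0] * adder_segments
    for i, (x, y) in enumerate(zip(a, b)):
        product = x * y                          # output of the multiplier pipeline
        partial[i % adder_segments] += product   # accumulate into interleaved sums
    return sum(partial)                          # final combination of partial sums


print(inner_product([1, 2, 3, 4, 5], [6, 7, 8, 9, 10]))
```

Interleaving lets a pipelined adder accept a new product every cycle without waiting for the previous sum to emerge.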
25
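Figure 9-13 Multiple module memory organization.
Each memory module consists of a memory array with its own address register (AR) and data register (DR). All modules connect to a common address bus and a common data bus.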
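One common way to use such modules is low-order interleaving, where consecutive addresses fall in consecutive modules so that several accesses can overlap. The sketch below, in Python, shows the address split. The interleaving scheme, the four-module default and the function name `locate` are all assumptions, since the slide shows only the module structure.

```python
def locate(address, num_modules=4):
    """Split an address into (module, word within module).

    Low-order interleaving with four modules is an assumption of this sketch.
    """
    return address % num_modules, address // num_modules


for address in range(8):
    print(address, locate(address))
```

Consecutive addresses 0 through 7 map to modules 0, 1, 2, 3, 0, 1, 2, 3, so a vector stored at consecutive addresses spreads its elements across all modules.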
26
Types of Array Processors
Attached array processor
SIMD array processor
27
Figure 9-14 Attached array processor with host computer.
The general-purpose computer connects to the attached array processor through an input-output interface. Its main memory connects to the array processor's local memory through a high-speed memory-to-memory bus.
28
Figure 9-15 SIMD array processor organization.
A master control unit, with access to main memory, controls processing elements PE1, PE2, PE3, ..., PEn, each with its own local memory M1, M2, M3, ..., Mn.
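The organization in Figure 9-15 can be modeled as one instruction broadcast to every processing element, each of which applies it to operands in its own local memory. The sketch below, in Python, is illustrative only: the function name `broadcast` is hypothetical, and the lockstep execution of the PEs is modeled with an ordinary sequential loop.

```python
def broadcast(operation, local_memories):
    """Apply one instruction to the operands held in every PE's local memory."""
    return [operation(*operands) for operands in local_memories]


# Vector add C = A + B with one element pair held in each local memory Mi.
a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
local_memories = list(zip(a, b))
c = broadcast(lambda x, y: x + y, local_memories)
print(c)
```

The master control unit corresponds to the single call to `broadcast`, and each tuple in `local_memories` stands for the contents of one Mi.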