Chapter One Introduction to Pipelined Processors
Principle of Designing Pipeline Processors (Design Problems of Pipeline Processors)
Job Sequencing and Collision Prevention
Consider the reservation table given below, with an initiation A made at t = 1.
[Reservation table: rows Sa, Sb, Sc; columns t = 1 to 6; marks labelled A]
Consider the next initiation made at t = 2. The second initiation fits easily into the reservation table.
[Reservation table: rows Sa, Sb, Sc; columns t = 1 to 8; marks labelled A1 and A2]
Now consider the case when the first initiation is made at t = 1 and the second at t = 3. Here the markings A1 and A3 fall in the same stage during the same time unit. This is called a collision, and it must be avoided.
[Reservation table: rows Sa, Sb, Sc; columns t = 1 to 8; A1 and A3 share a cell in row Sb]
Terminologies
Latency: the time difference between two initiations, in units of the clock period.
Forbidden Latency: a latency that results in a collision.
Forbidden Latency Set: the set of all forbidden latencies.
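General Method of finding Latency
Considering all initiations, the forbidden latencies are 2 and 5.
[Reservation table: rows Sa, Sb, Sc; columns t = 1 to 11; initiations A1 to A6 made at successive time units. Row Sa shows A6 and A1 in the same cell (latency 5); row Sb shows the pairs A1A3, A2A4, A3A5 and A4A6 in shared cells (latency 2)]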
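Shortcut Method of finding Latency
For each stage (row) of the reservation table, take the distance between every pair of marks in that row. The union of these distances over all rows is the forbidden latency set.
Forbidden Latency Set = {5} ∪ {2} ∪ {2} = {2, 5}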
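The shortcut method can be sketched in a few lines of Python. The table below is hypothetical: only the Sa marks at t = 1 and t = 6 and the 2-unit spacing of the marks on Sb and Sc follow from the slides; the exact Sb and Sc columns, and the names `reservation_table` and `forbidden_latencies`, are assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical reservation table: stage -> time units in which it is used.
# Sa at t = 1 and t = 6 follows from the slides; the Sb and Sc columns are
# assumed, keeping only the 2-unit spacing between their marks.
reservation_table = {
    "Sa": [1, 6],
    "Sb": [2, 4],
    "Sc": [3, 5],
}

def forbidden_latencies(table):
    """Union, over all stages, of the distances between every pair of marks."""
    forbidden = set()
    for marks in table.values():
        for t1, t2 in combinations(sorted(marks), 2):
            forbidden.add(t2 - t1)
    return forbidden

print(sorted(forbidden_latencies(reservation_table)))  # [2, 5]
```

Each row contributes its own set of distances ({5}, {2} and {2} here), matching the union written above.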
Latency Sequence: the sequence of latencies between successive initiations. For a reservation table (RT), the number of valid initiations and latencies is infinite.
Latency Cycle: among the infinitely many possible latency sequences, the periodic ones are significant, e.g. {1, 3, 3, 1, 3, 3, …}. The subsequence that repeats itself is called the latency cycle, e.g. {1, 3, 3}.
Period of cycle: the sum of the latencies in a latency cycle (1 + 3 + 3 = 7).
Average Latency (AL): the average of the latencies over the latency cycle (AL = 7/3 = 2.33).
To design a pipeline, we need a control strategy that maximizes the throughput (number of results per unit time). Maximizing throughput is equivalent to minimizing AL.
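As a small illustration of these two definitions, the following sketch computes the period and average latency of a latency cycle. The function names `period` and `average_latency` are chosen here for illustration only.

```python
from fractions import Fraction

def period(cycle):
    """Sum of the latencies in one latency cycle."""
    return sum(cycle)

def average_latency(cycle):
    """Period divided by the number of initiations in the cycle."""
    return Fraction(sum(cycle), len(cycle))

cycle = (1, 3, 3)
print(period(cycle))                            # 7
print(average_latency(cycle))                   # 7/3
print(round(float(average_latency(cycle)), 2))  # 2.33
```

Using `Fraction` keeps the average exact, which makes it easy to compare two cycles without rounding.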
A control strategy based on an aperiodic latency sequence is impractical to design. Thus the design problem is to arrive at a latency cycle having minimal average latency.
State Diagram
From the shortcut method, the forbidden latency set is F = {5} ∪ {2} ∪ {2} = {2, 5}.
The initial collision vector (ICV) is a binary vector formed from F such that
C = (Cn … C2 C1), where Ci = 1 if i ∈ F and Ci = 0 otherwise.
Thus in our example, F = {2, 5} gives C = (1 0 0 1 0), with n = 5, the largest forbidden latency.
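A minimal sketch of this construction, assuming the vector length n is taken as the largest forbidden latency; the function name `initial_collision_vector` is illustrative.

```python
def initial_collision_vector(forbidden):
    """Return the ICV as a bit string (Cn ... C2 C1)."""
    n = max(forbidden)
    # Walk i from n down to 1 so that C1 ends up as the rightmost bit.
    return "".join("1" if i in forbidden else "0" for i in range(n, 0, -1))

print(initial_collision_vector({2, 5}))  # 10010
```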
The procedure is as follows:
1. Start with the ICV.
2. For each unprocessed state CVi, and for each bit i of CVi which is 0, do the following:
   (a) Shift CVi right by i bits.
   (b) Drop the i rightmost bits.
   (c) Append zeros to the left.
   (d) Logically OR the result with the ICV.
   (e) If step (d) results in a new state, form a new node for this state. Join it to the node of CVi by an arc marked i.
3. Continue this shifting process until no more new states can be generated.
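The procedure above can be sketched as follows. This is an illustrative implementation, not a definitive one: collision vectors are held as bit strings, the helper names `next_state` and `build_state_diagram` are assumptions, and only the arcs for zero bits are generated (arcs for latencies longer than the vector are not modeled).

```python
def next_state(cv, icv, i):
    """Shift cv right by i bits (zero fill on the left), then OR with the ICV."""
    n = len(icv)
    shifted = int(cv, 2) >> i
    return format(shifted | int(icv, 2), f"0{n}b")

def build_state_diagram(icv):
    """Return {state: {latency: next_state}} for every reachable state."""
    n = len(icv)
    arcs = {}
    unprocessed = [icv]
    while unprocessed:
        cv = unprocessed.pop()
        if cv in arcs:
            continue  # state already processed
        arcs[cv] = {}
        for i in range(1, n + 1):
            if cv[n - i] == "1":
                continue  # bit i (counted from the right) is 1: forbidden
            nxt = next_state(cv, icv, i)
            arcs[cv][i] = nxt
            unprocessed.append(nxt)
    return arcs

diagram = build_state_diagram("10010")
for state in sorted(diagram):
    print(state, diagram[state])
```

For the ICV 10010 this yields three states, 10010, 10011 and 11011, with arcs marked 1, 3 and 4.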
[State diagram, built up step by step. CVi is the shifted vector and CV* is the resulting state.]
Initial state 1 0 0 1 0 (zero bits at i = 1, 3, 4):
  i = 1: ICV 10010 OR CVi 01001 gives CV* 11011 (new state; arc marked 1)
  i = 3: ICV 10010 OR CVi 00010 gives CV* 10010 (self-loop; arc marked 3)
  i = 4: ICV 10010 OR CVi 00001 gives CV* 10011 (new state; arc marked 4)
State 1 0 0 1 1 (zero bits at i = 3, 4):
  i = 3: ICV 10010 OR CVi 00010 gives CV* 10010 (arc marked 3 back to the initial state)
  i = 4: ICV 10010 OR CVi 00001 gives CV* 10011 (self-loop; arc marked 4)
State 1 1 0 1 1 (zero bit at i = 3):
  i = 3: ICV 10010 OR CVi 00011 gives CV* 10011 (arc marked 3)
Long latencies, from every state:
  ICV 10010 OR CVi 00000 gives CV* 10010 (arc back to the initial state). Since latency 5 is forbidden (bit 5 of the ICV is 1), this arc covers latencies of 6 or more and is marked 6+.
The state with all zeros has a self-loop, which corresponds to an empty pipeline. It is possible to wait an indefinite number of time units before the next initiation, giving latency cycles of the form (7), (8), (9), (10), etc.
Simple Cycle: a latency cycle in which each state is encountered only once.
Complex Cycle: a cycle that consists of more than one simple cycle. It is enough to look for simple cycles.
Greedy Cycle: a simple cycle is a greedy cycle if each latency in the cycle is the minimal latency (outgoing arc) from the corresponding state in the cycle. A good task initiation sequence should include the greedy cycle.
Simple cycles & Greedy cycles
Which are the simple cycles? Which are the greedy cycles?
The simple cycles are (3), (6), (1,3,3), (4,3) and (4). Here (6) is the self-loop on the initial state; latency 5 itself is forbidden.
The greedy cycle is (1,3,3).
In the above example, the cycle that offers the minimum average latency (MAL) is (1, 3, 3): MAL = (1 + 3 + 3)/3 = 2.333.
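One way to find the greedy cycle is to start at the initial state and always follow the smallest outgoing latency until a state repeats. The sketch below assumes the arcs of the example state diagram, entered by hand and omitting the long-latency arcs back to the initial state; the names `arcs` and `greedy_cycle` are illustrative.

```python
from fractions import Fraction

# Arcs of the example state diagram: state -> {latency: next state}.
# Entered by hand from the diagram; long-latency return arcs are omitted.
arcs = {
    "10010": {1: "11011", 3: "10010", 4: "10011"},
    "11011": {3: "10011"},
    "10011": {3: "10010", 4: "10011"},
}

def greedy_cycle(arcs, start):
    """Follow the minimal outgoing latency from each state until a state repeats."""
    visited = []
    latencies = []
    state = start
    while state not in visited:
        visited.append(state)
        latency = min(arcs[state])
        latencies.append(latency)
        state = arcs[state][latency]
    # Keep only the latencies from the first visit of the repeated state onward.
    first = visited.index(state)
    return tuple(latencies[first:])

cycle = greedy_cycle(arcs, "10010")
print(cycle)                             # (1, 3, 3)
print(Fraction(sum(cycle), len(cycle)))  # 7/3
```

In this example the greedy cycle also gives the MAL of 7/3; the sketch does not check that against the other simple cycles.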
[Reservation table: rows Sa, Sb, Sc; columns t = 1 to 13; initiations A1, A2, A5 and A8 scheduled with the latency cycle (1, 3, 3)]
UQ: Problem
Consider the reservation table given below.
[Reservation table: rows S1, S2, S3, S4, S5 against numbered time-unit columns; marks shown as x]
1. Find the forbidden set of latencies.
2. State the collision vector.
3. Draw the state transition diagram.
4. List the simple cycles and greedy cycles.
5. Calculate the MAL (minimum average latency).