Pipelining Example: Cycle 1
for (i = 0; i < 100; i++) a[i] = b[i] + b[i+1] + b[i+2];
Inputs: b[0], b[1], b[2]
Neither adder has produced a result yet.
Pipelining Example: Cycle 2
Inputs: b[1], b[2], b[3]
After first adder: b[0]+b[1], with b[2] carried alongside
Pipelining Example: Cycle 3
Inputs: b[2], b[3], b[4]
After first adder: b[1]+b[2], with b[3] carried alongside
After second adder: b[0]+b[1]+b[2]
Pipelining Example: Cycle 4
Inputs: b[3], b[4], b[5]
After first adder: b[2]+b[3], with b[4] carried alongside
After second adder: b[1]+b[2]+b[3]
Output: a[0]
The first output appears; it takes 4 cycles to fill the pipeline.
Pipelining Example: Cycle 5
Inputs: b[4], b[5], b[6]
After first adder: b[3]+b[4], with b[5] carried alongside
After second adder: b[2]+b[3]+b[4]
Output: a[1]
From this point on the pipeline produces one output per cycle, with 99 more outputs until completion.
Total cycles = 4 (pipeline fill) + 99 = 103
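The cycle-by-cycle walkthrough above can be checked with a small software model. The following is a minimal sketch in C, not the actual hardware description: the struct `pipe_regs`, the function `run_pipeline`, and the `valid` flags are hypothetical names introduced for illustration, and the four stages (load inputs, first add, second add, write output) are assumed from the slides.

```c
#include <assert.h>

/* Pipeline registers between stages (illustrative names, not from the slides). */
typedef struct {
    int in0, in1, in2;          /* stage 1: b[i], b[i+1], b[i+2] just loaded     */
    int sum01, dly2;            /* stage 2: b[i]+b[i+1], b[i+2] carried along    */
    int sum012;                 /* stage 3: (b[i]+b[i+1]) + b[i+2]               */
    int valid1, valid2, valid3; /* whether each stage currently holds real data  */
} pipe_regs;

/* Computes a[i] = b[i] + b[i+1] + b[i+2] for i in [0, n) through the
 * modeled pipeline. b must hold n + 2 elements.
 * Returns the total number of clock cycles used. */
int run_pipeline(const int *b, int *a, int n)
{
    assert(n > 0);
    pipe_regs r = {0};
    int cycle = 0;
    int issued = 0;   /* iterations whose inputs have been loaded */
    int written = 0;  /* results stored to a[]                    */

    while (written < n) {
        cycle++;
        /* Update from the last stage backwards so that every stage
         * consumes what its predecessor held in the previous cycle. */

        /* Stage 4: write the finished sum to a[]. */
        if (r.valid3) {
            a[written++] = r.sum012;
        }
        /* Stage 3: second adder. */
        r.sum012 = r.sum01 + r.dly2;
        r.valid3 = r.valid2;
        /* Stage 2: first adder, third operand carried alongside. */
        r.sum01 = r.in0 + r.in1;
        r.dly2 = r.in2;
        r.valid2 = r.valid1;
        /* Stage 1: load the next three inputs, if any remain. */
        if (issued < n) {
            r.in0 = b[issued];
            r.in1 = b[issued + 1];
            r.in2 = b[issued + 2];
            r.valid1 = 1;
            issued++;
        } else {
            r.valid1 = 0;
        }
    }
    return cycle;
}
```

Calling `run_pipeline(b, a, 100)` with a 102-element `b` writes `a[0]` in cycle 4 and one result per cycle afterwards, so the function returns 4 + 99 = 103, matching the count on the slide. In this model, a loop of `n` iterations takes `n + 3` cycles.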
Entire Circuit
Components: Controller; Input Address Generator, RAM, and Buffer on the input side; Pipelined Datapath; Buffer, Output Address Generator, and RAM on the output side.
The input RAM delivers "streams" of data to the datapath.
A separate RAM stores the "streams" of data written from the datapath.
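To make the roles of the blocks concrete, here is a rough software sketch of the streaming structure in C. Everything in it is an assumption made for illustration: the slide does not say how the address generators or buffers work, so `addr_gen` is modeled as a simple counter, `window_buf` as a three-word sliding window, and the datapath as a single combinational sum (its internal pipelining is left out here).

```c
#include <assert.h>

#define WINDOW 3  /* the datapath consumes b[i], b[i+1], b[i+2] */

/* Hypothetical address generator: emits next, next+1, ... up to end-1. */
typedef struct { int next; int end; } addr_gen;

static int addr_gen_done(const addr_gen *g) { return g->next >= g->end; }
static int addr_gen_next(addr_gen *g) { return g->next++; }

/* Hypothetical input buffer: holds the last WINDOW words read from RAM. */
typedef struct { int w[WINDOW]; int count; } window_buf;

static void window_push(window_buf *wb, int word)
{
    for (int k = 0; k < WINDOW - 1; k++) {
        wb->w[k] = wb->w[k + 1];   /* shift older words down */
    }
    wb->w[WINDOW - 1] = word;
    if (wb->count < WINDOW) wb->count++;
}

/* Controller loop: streams n_in words out of in_ram, feeds the datapath,
 * and streams the results into a separate out_ram.
 * Returns the number of results written. */
int run_circuit(const int *in_ram, int n_in, int *out_ram)
{
    assert(n_in >= WINDOW);
    addr_gen in_gen = {0, n_in};
    addr_gen out_gen = {0, n_in - WINDOW + 1};
    window_buf buf = {{0}, 0};

    while (!addr_gen_done(&in_gen)) {
        /* Input side: address generator -> RAM -> buffer. */
        window_push(&buf, in_ram[addr_gen_next(&in_gen)]);

        if (buf.count == WINDOW) {
            /* Datapath, modeled here without its pipeline stages. */
            int result = buf.w[0] + buf.w[1] + buf.w[2];

            /* Output side: address generator selects where to store. */
            assert(!addr_gen_done(&out_gen));
            out_ram[addr_gen_next(&out_gen)] = result;
        }
    }
    return out_gen.next;
}
```

With a 102-word `in_ram`, `run_circuit` writes 100 results to `out_ram`, one for each iteration of the original loop. Reading each input word only once and reusing it from the buffer is one plausible reason for placing a buffer between the RAM and the datapath, though the slide itself does not state this.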