Implementing an ISA, part II - Control. David E. Culler, CS61CL, Nov 4, 2009, Lecture 10 (UCB CS61CL F09 Lec 10)

1 Implementing an ISA, part II - Control. David E. Culler, CS61CL, Nov 4, 2009, Lecture 10. UCB CS61CL F09 Lec 10

2 Review: TinyMIPS
Reg-Reg instructions (op == 0)
– addu: R[rd] := R[rs] + R[rt]; pc := pc+4
– subu: R[rd] := R[rs] - R[rt]; pc := pc+4
Reg-Immed (op != 0)
– lw: R[rt] := Mem[ R[rs] + signEx(Im16) ]
– sw: Mem[ R[rs] + signEx(Im16) ] := R[rt]
Jumps
– j: PC := PC[31..28] || addr || 00
– jr: PC := R[rs]
Branches
– BEQ: PC := (R[rs] == R[rt]) ? PC + signEx(im16) : PC+4
– BLTZ: PC := (R[rs] < 0) ? PC + signEx(im16) : PC+4
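
The register-transfer descriptions above can be read as a tiny interpreter. The following is a minimal sketch in Python, not the course's implementation: the decoded-instruction dictionary, the names `step` and `sign_ext16`, and the word-addressed `mem` dictionary are all assumptions made for illustration. Branch targets follow this slide literally (PC + signEx(im16), with no shift).

```python
MASK32 = 0xFFFFFFFF

def sign_ext16(imm):
    """Sign-extend a 16-bit immediate to a Python int."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm

def step(regs, mem, pc, instr):
    """Execute one already-decoded instruction and return the next pc.

    instr is a hypothetical dict such as {"op": "addu", "rs": 1, "rt": 2, "rd": 3}.
    """
    op = instr["op"]
    rs, rt, rd = instr.get("rs", 0), instr.get("rt", 0), instr.get("rd", 0)
    imm = sign_ext16(instr.get("imm", 0))
    if op == "addu":
        regs[rd] = (regs[rs] + regs[rt]) & MASK32
    elif op == "subu":
        regs[rd] = (regs[rs] - regs[rt]) & MASK32
    elif op == "lw":
        regs[rt] = mem.get((regs[rs] + imm) & MASK32, 0)
    elif op == "sw":
        mem[(regs[rs] + imm) & MASK32] = regs[rt]
    elif op == "j":
        # PC[31..28] || addr || 00
        return (pc & 0xF0000000) | ((instr["addr"] << 2) & 0x0FFFFFFF)
    elif op == "jr":
        return regs[rs]
    elif op == "beq":
        if regs[rs] == regs[rt]:
            return (pc + imm) & MASK32
    elif op == "bltz":
        if regs[rs] & 0x80000000:  # sign bit set means R[rs] < 0
            return (pc + imm) & MASK32
    else:
        raise ValueError("unknown op: " + op)
    return (pc + 4) & MASK32
```

For example, with R[1] = 5 and R[2] = 7, stepping an addu with rd = 3 leaves 12 in R[3] and returns pc + 4.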

3 Review: DataPath + Control
[Figure: datapath with PC, IR, RAM (A, D), register file (Asel, Bsel, Dsel, ld; rs, rt, rd), ALU (A, B, Ci, ~), sign extension (xt), +4, ||, +. Control points: pc2A, ld_pc, ld_ir, wrt, m2D, ld_reg, sx_sel, comp, s2A, s2D, b2D, rt_sel, i2D, npc_sel]

4 Control State Machine (abstract)
States:
– I-Fetch: IR := Mem[pc]
– AddU: R[rd] := R[rs] + R[rt]; pc := pc+4
– SubU: R[rd] := R[rs] - R[rt]; pc := pc+4
– LW: R[rt] := mem[R[rs]+sx16]; pc := pc+4
– SW: mem[R[rs]+sx16] := R[rt]; pc := pc+4
– J: pc := pc[31..28] || addr || 00
– JR: pc := R[rs]
– BR-taken: pc := pc + sx16||00
– BR-not taken: pc := pc + 4
Transition labels: reset; ~reset & OP==addu; ~reset & OP==lw; ~reset & ( ((OP==beq) & ~EQ) | ((OP==Bneg) & ~N) )

5 Ifetch: IR := mem[pc]
RAM_addr <- A <- PC; (pc2A, ~s2A)
IR_in <- D <- RAM_data; (~i2D, m2D, ~b2D, ~s2D)
IR := IR_in; (ld_ir, ~ld_pc, ~ld_reg, ~wrt)
[Figure: datapath and control points]

6 Control State Machine
States:
– I-Fetch: IR := Mem[pc]
  pc2A, ~s2A, ~i2D, m2D, ~b2D, ~s2D, ld_ir, ~ld_pc, ~ld_reg, ~wrt
– AddU: R[rd] := R[rs] + R[rt]; pc := pc+4
– SubU: R[rd] := R[rs] - R[rt]; pc := pc+4
– LW: R[rt] := mem[R[rs]+sx16]; pc := pc+4
– SW: mem[R[rs]+sx16] := R[rt]; pc := pc+4
– J: pc := pc[31..28] || addr || 00
– JR: pc := R[rs]
– BR-taken: pc := pc + sx16||00
– BR-not taken: pc := pc + 4
Transition labels: reset; ~reset & OP==addu; ~reset & OP==lw; ~reset & ( ((OP==beq) & ~EQ) | ((OP==Bneg) & ~N) )

7 Exec: R[rd] := R[rs] + R[rt]; pc := pc+4
npc_sel=0, ld_pc, ~pc2A, ~ld_ir, ~i2D, ~wrt, ~m2D, ~rt_sel, ld_reg, ~b2D, ~sx_sel, ~comp, ~s2A, s2D
[Figure: datapath and control points]

8 Control State Machine
States:
– I-Fetch: IR := Mem[pc]
  pc2A, ~s2A, ~i2D, m2D, ~b2D, ~s2D, ld_ir, ~ld_pc, ~ld_reg, ~wrt
– AddU: R[rd] := R[rs] + R[rt]; pc := pc+4
  npc_sel=0, ld_pc, ~pc2A, ~ld_ir, ~i2D, ~wrt, ~m2D, ~rt_sel, ld_reg, ~b2D, ~sx_sel, ~comp, ~s2A, s2D
– SubU: R[rd] := R[rs] - R[rt]; pc := pc+4
– LW: R[rt] := mem[R[rs]+sx16]; pc := pc+4
– SW: mem[R[rs]+sx16] := R[rt]; pc := pc+4
– J: pc := pc[31..28] || addr || 00
– JR: pc := R[rs]
– BR-taken: pc := pc + sx16||00
– BR-not taken: pc := pc + 4
Transition labels: reset; ~reset & OP==addu; ~reset & OP==lw; ~reset & ( ((OP==beq) & ~EQ) | ((OP==Bneg) & ~N) )

9 Exec: R[rd] := R[rs] - R[rt]; pc := pc+4
npc_sel=0, ld_pc, ~pc2A, ~ld_ir, ~i2D, ~wrt, ~m2D, ~rt_sel, ld_reg, ~b2D, ~sx_sel, comp, ~s2A, s2D
[Figure: datapath and control points]

10 Exec LW: R[rt] := Mem[R[rs]+SXim16]
npc_sel=0, ld_pc, m2D, rt_sel, ld_reg, sx_sel, s2A
[Figure: datapath and control points]

11 Exec SW: Mem[R[rs]+SXim16] := R[rt]
[Figure: datapath and control points]

12 Exec J: PC := PC[31..28] || addr || 00
npc_sel=1, ld_pc, i2D
[Figure: datapath and control points]

13 Exec JR: PC := R[rs]
npc_sel=2, ld_pc, s2D, sx_sel=2
[Figure: datapath and control points]

14 Exec Br Taken: PC := PC + SX16
npc_sel=3, ld_pc, i2D
[Figure: datapath and control points]

15 Controller Specification
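
One way to write down a controller specification is as a table from state to control word. The sketch below collects only the settings stated on slides 5 to 14; it is not the table from this slide, which did not survive in the transcript. The names `ASSERTED`, `control_word`, and `bus_conflicts` are hypothetical, and the grouping of pc2A/s2A as drivers of A and i2D/m2D/b2D/s2D as drivers of D is an assumption inferred from the Ifetch slide.

```python
# Control points named on the datapath slides.
SIGNALS = ["pc2A", "ld_pc", "ld_ir", "wrt", "m2D", "ld_reg", "sx_sel",
           "comp", "s2A", "s2D", "b2D", "rt_sel", "i2D", "npc_sel"]

# Settings per state, transcribed from slides 5-14; unlisted signals are 0.
# SW is omitted because its slide lists no control settings.
ASSERTED = {
    "ifetch":   {"pc2A": 1, "m2D": 1, "ld_ir": 1},
    "addu":     {"npc_sel": 0, "ld_pc": 1, "ld_reg": 1, "s2D": 1},
    "subu":     {"npc_sel": 0, "ld_pc": 1, "ld_reg": 1, "comp": 1, "s2D": 1},
    "lw":       {"npc_sel": 0, "ld_pc": 1, "m2D": 1, "rt_sel": 1,
                 "ld_reg": 1, "sx_sel": 1, "s2A": 1},
    "j":        {"npc_sel": 1, "ld_pc": 1, "i2D": 1},
    "jr":       {"npc_sel": 2, "ld_pc": 1, "s2D": 1, "sx_sel": 2},
    "br_taken": {"npc_sel": 3, "ld_pc": 1, "i2D": 1},
}

# Assumed bus drivers (inferred from the Ifetch slide, not stated as such).
A_DRIVERS = ("pc2A", "s2A")
D_DRIVERS = ("i2D", "m2D", "b2D", "s2D")

def control_word(state):
    """Return the full control word for a state, defaulting signals to 0."""
    word = dict.fromkeys(SIGNALS, 0)
    word.update(ASSERTED[state])
    return word

def bus_conflicts(word):
    """List the buses that have more than one driver enabled."""
    conflicts = []
    for bus, drivers in (("A", A_DRIVERS), ("D", D_DRIVERS)):
        if sum(1 for d in drivers if word[d]) > 1:
            conflicts.append(bus)
    return conflicts
```

A table like this makes simple sanity checks cheap, for instance confirming that no listed state enables two drivers of the same bus at once.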

16 Administration
HW7 due midnight
Mid Term 2 Monday 11/9
– 5:30 – 7:30, Rm 145 Dwinelle
– alternate Friday 11/4 3:00-5:00, Rm 310 Soda
– Review session Sunday 5-7, 306 Soda
Project 3
– incremental lab check offs
Flex lab Mon (9-1) and Tues (9-5)
– midterm final prep
– project 3 help

17 Controller Implementation
[Figure: controller block; labels: clk, reset, exec, op, eq, n, npc_sel, s2D]

18 Combinational Logic per Ctrl Point
[Figure: logic per control point; labels: exec, pc2A, exec, ldPC, wrt, exec, op]

19 Multiplexor Control
[Figure: control words selected by op; values as extracted: I 01000000100001, I 3100100x00xx00, I 01000000100101]

20 Faster Clock
Clock Period > Longest path from reg out to input + reg delay
[Figure: datapath and control points; additional labels: Addr, A, B, S]

21 Multi-Cycle Controller State Machine
States:
– I-Fetch: IR := Mem[pc]
– Op / Dcd: A := R[rs], B := R[rt]
– AddU: S := A+B; pc := pc+4
– SubU: S := A - B; pc := pc+4
– LW: S := A+sx16; pc := pc+4
– SW: S := A+sx16; pc := pc+4
– J: pc := pc[31..28] || addr || 00
– JR: pc := R[rs]
– BR-taken: pc := pc + sx16||00
– BR-not taken: pc := pc + 4
Follow-on states:
– R[rd] := S
– read: MAR := S; R[rd] := D
– wrt: MAR := S; MDR := B’

22 Time State control
Move control word through the stages
Decode per stage
Active stage moves “around the ring”
[Figure: datapath; labels: PC, IR, IR_ex, IR_mem, IR_wb, mem]

23 Time State Control
Stages: ifetch, decode, exec, mem, wb
[Figure: datapath and control points; additional labels: Addr, A, B, S]
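
The idea of an active stage moving "around the ring" can be modeled as a one-hot ring counter over the five stage names. This is a minimal sketch under that assumption; the names `reset_state`, `next_state`, and `active_stage` are hypothetical and the slides do not specify the encoding.

```python
STAGES = ["ifetch", "decode", "exec", "mem", "wb"]

def reset_state():
    """One-hot state with ifetch active."""
    return [1, 0, 0, 0, 0]

def next_state(onehot):
    """Rotate the single 1 to the next stage, wrapping from wb to ifetch."""
    return onehot[-1:] + onehot[:-1]

def active_stage(onehot):
    """Name of the stage whose bit is set."""
    return STAGES[onehot.index(1)]
```

Starting from reset, two clock ticks make exec the active stage, and five ticks return to ifetch.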

24 More regular multi-cycle execution
States:
– I-Fetch: IR := Mem[pc]
– Op / Dcd: A := R[rs], B := R[rt]
– AddU: S := A+B; pc := pc+4
– SubU: S := A - B; pc := pc+4
– LW: S := A+sx16; pc := pc+4
– SW: S := A+sx16; pc := pc+4
– J: pc := pc[31..28] || addr || 00
– JR: pc := R[rs]
– BR-taken: pc := pc + sx16||00
– BR-not taken: pc := pc + 4
Follow-on states:
– R[rd] := S
– R[rd] := S’
– read: MAR := S; R[rd] := D
– wrt: MAR := S; MDR := B’
– S’ := S

25 Sequence of Multi-step Operations
Operation implemented as sequence of steps on distinct resources
– wash => dry => fold
Multiple independent Operations
[Figure: stages IF, DCD, Ex, Mem, WB]

26 Technology Trends
Clock Rate: ~30% per year
Transistor Density: ~35%
Chip Area: ~15%
Transistors per chip: ~55%
Total Performance Capability: ~100%
by the time you graduate...
– 3x clock rate (>10 GHz)
– 10x transistor count (100 Billion transistors)
– 30x raw capability
plus 16x dram density, 32x disk density (60% per year)
Network bandwidth, …

27 Pipelining
Overlap consecutive operations

28 Definition: Performance
Performance is in units of things per sec
– bigger is better
If we are primarily concerned with response time
– performance(x) = 1 / execution_time(x)
"X is n times faster than Y" means
– n = Performance(X) / Performance(Y) = Execution_time(Y) / Execution_time(X)
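
As a quick numeric check of the definition, the ratio can be computed directly from execution times. This is a small sketch; the function names and the timings in the usage note are made up for illustration.

```python
def performance(execution_time):
    """Performance as the reciprocal of execution time (things per second)."""
    return 1.0 / execution_time

def times_faster(time_x, time_y):
    """Return n such that X is n times faster than Y."""
    return time_y / time_x
```

For instance, if X takes 10 s and Y takes 15 s on the same task, then X is 1.5 times faster than Y.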

29 Pipeline Performance
N operations performed in k steps each
Sequential Time: N*k
Lower bound: N (1 every cycle)
Pipeline Time: k – 1 + N
Bound on Speedup on k-stage pipeline < k
Speedup(k,N) = Time(1,N)/Time(k,N) = N*k / (N+k-1) = k / (1 + (k-1)/N)
[Figure labels: StartUp Cost: k-1, Peak Rate, Half Power point]
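
The timing formulas on this slide are easy to evaluate numerically. The sketch below uses hypothetical function names; reading the "Half Power point" label as the N at which speedup reaches half of the peak value k is an interpretation, and under that reading the formula gives N = k - 1.

```python
def sequential_time(n, k):
    """Cycles to run n operations of k steps each, one after another."""
    return n * k

def pipeline_time(n, k):
    """Cycles on a k-stage pipeline: k - 1 to fill, then one result per cycle."""
    return k - 1 + n

def speedup(n, k):
    """Speedup(k, N) = N*k / (N + k - 1)."""
    return sequential_time(n, k) / pipeline_time(n, k)

def half_power_n(k):
    """N at which speedup equals k / 2 (assumed meaning of the slide label)."""
    return k - 1
```

With k = 5, a single operation sees no speedup, four operations reach 2.5 (half of the peak), and the speedup approaches but never reaches 5 as N grows.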

30 Performance Trends
[Figure label: MIPS R3000]

31 Processor Performance (1.35X before, 1.55X now)
[Figure label: 1.54X/yr]

32 Pipelined control
[Figure: datapath; labels: PC, IR, IR_ex, IR_mem, IR_wb, imem, Dmem]

33 Pipelined Instruction Execution
Fetch Instruction Every cycle
Launch into a pipeline
What if they are not independent?
– structural hazards
» two operations need to use same resource
– data dependence
» later instruction needs to use the value produced by an earlier one
Detect
Wait till hazard clears
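
The "detect, then wait" policy for data dependences can be illustrated with a toy scheduler. This is a rough sketch under stated assumptions, not the lecture's design: a 5-stage pipeline with no forwarding, a result written in the last stage, a dependent instruction modeled as simply delaying its fetch until its sources are written, and instructions given as hypothetical dicts with `dst` and `srcs` fields.

```python
def schedule(program, depth=5):
    """Return the cycle in which each instruction is launched.

    A consumer may launch no earlier than depth - 1 cycles after the
    producer of any of its source registers (assumed: no forwarding).
    """
    ready_at = {}   # register -> earliest launch cycle for a consumer
    launches = []
    cycle = 0
    for instr in program:
        # Wait (insert bubbles) until every source register is available.
        start = max([cycle] + [ready_at.get(r, 0) for r in instr["srcs"]])
        launches.append(start)
        if instr["dst"] is not None:
            ready_at[instr["dst"]] = start + depth - 1
        cycle = start + 1
    return launches

def total_cycles(program, depth=5):
    """Cycles until the last instruction leaves the pipeline."""
    return schedule(program, depth)[-1] + depth

# r3 := r1 + r2; r4 := r3 - r1 (depends on r3); r5 := r1 + r2 (independent)
example = [
    {"dst": 3, "srcs": [1, 2]},
    {"dst": 4, "srcs": [3, 1]},
    {"dst": 5, "srcs": [1, 2]},
]
```

In this model the example launches at cycles 0, 4, and 5, so the dependence costs three bubble cycles compared with the hazard-free launches at 0, 1, and 2.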

34 Pipelined “Bubble”
[Figure: datapath; labels: PC, IR, IR_ex, IR_mem, IR_wb, imem, Dmem]

