1
Understanding the TigerSHARC ALU pipeline
Determining the speed of one stage of an IIR filter
2
Speed IIR -- stage 1, M. Smith, ECE, University of Calgary, Canada, 2 / 30, 6/18/2015
Understanding the TigerSHARC ALU pipeline
TigerSHARC has many pipelines.
If these pipelines stall, the processor speed goes down.
We need to understand how the ALU pipeline works.
Learn to use the pipeline viewer.
The answer may be different for floating point and integer operations.
3
Register File and COMPUTE Units
4
Simple Example IIR -- Biquad
For (Stages = 0 to 3) Do
    S0 = Xin * H5 + S2 * H3 + S1 * H4
    Yout = S0 * H0 + S1 * H1 + S2 * H2
    S2 = S1
    S1 = S0
State variables: S0, S1, S2
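The biquad equations on this slide can be sketched in C++ as follows. This is a minimal sketch, not the author's code: the names `BiquadState` and `biquadStage` are hypothetical, and the coefficient array `h[0..5]` is assumed to map directly onto H0..H5 in the order the slide uses them.

```cpp
#include <cassert>

// State carried from one input sample to the next (S1 and S2 on the slide).
struct BiquadState {
    float s1 = 0.0f;
    float s2 = 0.0f;
};

// One biquad stage, written directly from the slide's equations.
// h points at six coefficients; h[0]..h[5] stand for H0..H5 (assumed ordering).
float biquadStage(float xin, const float* h, BiquadState& state) {
    assert(h != nullptr);
    const float s0   = xin * h[5] + state.s2 * h[3] + state.s1 * h[4];
    const float yout = s0 * h[0] + state.s1 * h[1] + state.s2 * h[2];
    state.s2 = state.s1;  // S2 = S1
    state.s1 = s0;        // S1 = S0
    return yout;
}
```

Calling `biquadStage` once per input sample with the same `BiquadState` object reproduces the delay-line behaviour of the loop body; a multi-stage filter would keep one state object (and, presumably, one coefficient set) per stage.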
5
Set up the tests. We want to make sure the answer stays correct as the code changes.
6
Step 1: Stub plus return value
Build an assembly language stub for float iirASM(void);
Make it return the floating point value 40.5 to show that a float can be returned.
J8 is an INTEGER register, so how can we return 40.5?
ANSWER: WE DON’T. We return the “bit pattern” for 40.5, which is an “INTEGER”.
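The idea of returning the "bit pattern" of 40.5 in an integer register can be illustrated on a host machine. The sketch below assumes a 32-bit IEEE 754 single-precision `float`; the helper names `floatBits`, `bitsToFloat` and `checkRoundTrip` are hypothetical and are not part of any TigerSHARC tool.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");

// Reinterpret a float as the 32-bit integer that holds the same bits.
std::uint32_t floatBits(float value) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}

// Reverse direction: treat a 32-bit integer as the bits of a float.
float bitsToFloat(std::uint32_t bits) {
    float value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}

// The conversion is lossless for any non-NaN value.
void checkRoundTrip(float value) {
    assert(bitsToFloat(floatBits(value)) == value);
}
```

Under that IEEE 754 assumption, `floatBits(40.5f)` is `0x42220000`: sign 0, biased exponent 132 (40.5 = 1.265625 × 2^5), and mantissa bits 0100010 followed by zeros. That integer is what an integer register would carry back to the caller.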
7
Code does not work when passing back floats with the J8 register
8
Code does work when using the XR8 register. NOTE: NOT XFR8
9
Step 2: Using C++ code as comments, set up the coefficients
XFR0 = 0.0;;    Does not exist
XR0 = 0.0;;     DOES EXIST
Bit patterns require integer registers.
Leave what you wanted to do behind as comments.
11
Modify the C++ code so that it can be translated into assembly code
Can only have 1 instruction per line.
Code must execute sequentially, so remember the ;;
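One way to picture this rewriting step is to take the biquad equations and split them so that every C++ statement performs a single operation, which then maps onto one assembly instruction. The sketch below is an illustration only: `StageState`, `stageCompact`, `stageOneOpPerLine` and the scratch variable `temp` are hypothetical names, and the exact split the author used is not shown in the transcript.

```cpp
#include <cassert>

struct StageState {
    float s1 = 0.0f;
    float s2 = 0.0f;
};

// Compact form, as the equations appear on the biquad slide.
float stageCompact(float xin, const float* h, StageState& st) {
    assert(h != nullptr);
    const float s0   = xin * h[5] + st.s2 * h[3] + st.s1 * h[4];
    const float yout = s0 * h[0] + st.s1 * h[1] + st.s2 * h[2];
    st.s2 = st.s1;
    st.s1 = s0;
    return yout;
}

// Same arithmetic, one operation per statement, in strictly sequential order.
float stageOneOpPerLine(float xin, const float* h, StageState& st) {
    assert(h != nullptr);
    float s0 = xin;                // S0 = Xin
    s0 = s0 * h[5];                // S0 = S0 * H5
    float temp = st.s2 * h[3];     // temp = S2 * H3
    s0 = s0 + temp;                // S0 = S0 + temp
    temp = st.s1 * h[4];           // temp = S1 * H4
    s0 = s0 + temp;                // S0 = S0 + temp
    float yout = s0 * h[0];        // Yout = S0 * H0
    temp = st.s1 * h[1];           // temp = S1 * H1
    yout = yout + temp;            // Yout = Yout + temp
    temp = st.s2 * h[2];           // temp = S2 * H2
    yout = yout + temp;            // Yout = Yout + temp
    st.s2 = st.s1;                 // S2 = S1
    st.s1 = s0;                    // S1 = S0
    return yout;
}
```

Keeping the compact version around gives a reference to test the step-by-step version against, in the same spirit as the tests set up earlier in the deck.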
12
Start with the S0 = Xin instruction
Can’t use XFR8 = XFR6 to copy a register.
13
Since XFR8 = XFR6 is not allowed
Try XR8 = R6;
SIMD: Single Instruction, Multiple Data. R6 means move XR6 and YR6 (a multiple data move described in 1 instruction).
Try XR8 = XR6
14
Some operations are FLOAT operations and must have XFR on the left side of the equation BUT only R on the right.
Some operations are SISD operations and must have XR on both sides of the equation (or just R on both sides, making them SIMD on X and Y, with garbage happening on Y).
Personally, I think all these problems are “assembler” issues and could be made consistent.
15
Disconnect from target and go to simulator
16
Activate Simulator
17
Rebuild the project and set breakpoints at the start and end of the ASM code
18
Activate the pipeline viewer
19
Adjust the pipeline window so you can see all the instruction pipeline stages
20
PIPELINE STAGES
See page 8-34 of the Processor manual.
Instruction fetch -- F1, F2, F3 and F4
Fetch Unit Pipe: memory driven.
128 bits fetched: may make up 1, 2, 3, or 4 instructions (or parts of a couple of instructions).
Instructions go into the IAB, the instruction alignment buffer.
Integer ALU pipe -- PD, D, I and A
21
PIPELINE STAGES
See page 8-34 of the Processor manual.
10 pipeline stages, but they may be completely desynchronized (happen semi-independently).
Instruction fetch -- F1, F2, F3 and F4
Integer ALU -- PreDecode, Decode, Integer, Access
Compute Block -- EX1 and EX2
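As a rough mental model of the ten stages, the sketch below treats them as a single lock-step pipeline in which each instruction line advances one stage per cycle and nothing ever stalls. That is a deliberate simplification and an assumption of this sketch: the slide itself notes that the fetch, integer ALU and compute block pipes can run semi-independently. The function names are hypothetical.

```cpp
#include <cassert>
#include <string>

// The ten stage names listed on the slide, in order.
static const char* const kStageNames[10] = {
    "F1", "F2", "F3", "F4", "PD", "D", "I", "A", "EX1", "EX2"};
constexpr int kStageCount = 10;

// Idealized assumption: instruction line `line` (0-based) enters F1 on cycle
// `line` and then moves forward one stage per cycle with no stalls.
// Returns the stage name, or an empty string if the line is not in the pipe.
std::string stageAtCycle(int line, int cycle) {
    assert(line >= 0 && cycle >= 0);
    const int stage = cycle - line;
    if (stage < 0 || stage >= kStageCount) {
        return "";
    }
    return kStageNames[stage];
}

// Cycles from the first line entering F1 until the last line leaves EX2,
// again assuming no stalls at all.
int idealCycles(int lines) {
    assert(lines >= 0);
    return lines == 0 ? 0 : kStageCount + lines - 1;
}
```

In this idealized picture every extra instruction line costs exactly one more cycle once the pipeline is full, so any larger cycle count seen in the pipeline viewer points to stalls.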
22
PIPELINE STAGES
See page 8-34 of the Processor manual.
Instruction fetch -- F1, F2, F3 and F4
Fetch Unit Pipe: memory driven, not instruction driven.
128 bits fetched: may make up 1, 2, 3, or 4 instruction lines (or parts of a couple of instruction lines).
Instructions are fetched into the IAB, the instruction alignment buffer.
23
PIPELINE STAGES
See page 8-34 of the Processor manual.
Integer ALU pipe -- PD, D, I and A
PreDecode: the next COMPLETE instruction line (1, 2, 3 or 4 instructions) is fetched from the IAB.
Decode: different instructions are dispatched to different execution units (J-IALU, K-IALU, Compute Blocks).
Data memory access starts in the Integer stage.
A stands for Access stage.
Results are not available until the EX2 stage, but (by register forwarding) can sometimes be accessed earlier.
24
PIPELINE STAGES
See page 8-34 of the Processor manual.
Compute Block: EX1 and EX2
The result is always written to the target register on the rising edge of CCLK after stage EX2.
The following is guaranteed:
R2 = R0 + R1; R6 = R2 * R3;;
In R2 = R0 + R1, R2 is updated at the end of the instruction.
In R6 = R2 * R3, the value R2 held at the beginning of the instruction is used.
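The guarantee for `R2 = R0 + R1; R6 = R2 * R3;;` can be mimicked in C++ by reading every source register from a snapshot taken at the start of the instruction line and writing results afterwards. This is a behavioural sketch only, with integer registers for simplicity; `RegisterFile`, `executeParallelLine` and `executeAsTwoLines` are hypothetical names, and the second function models what would happen if the two operations were split by `;;` into separate lines.

```cpp
#include <array>
#include <cassert>  // used only by callers that check results

using RegisterFile = std::array<int, 8>;  // stands in for R0..R7 of one compute block

// Models the single instruction line:  R2 = R0 + R1; R6 = R2 * R3;;
// Both operations see the register values from the beginning of the line.
void executeParallelLine(RegisterFile& r) {
    const RegisterFile before = r;   // values at the beginning of the line
    r[2] = before[0] + before[1];    // R2 is updated at the end of the line
    r[6] = before[2] * before[3];    // uses the OLD R2
}

// For contrast, models two separate lines:  R2 = R0 + R1;;  R6 = R2 * R3;;
void executeAsTwoLines(RegisterFile& r) {
    r[2] = r[0] + r[1];
    r[6] = r[2] * r[3];              // uses the NEW R2
}
```

With R0 = 1, R1 = 2, R2 = 10 and R3 = 3, the parallel line leaves R6 = 30 (old R2 times R3), whereas the two-line version leaves R6 = 9.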
25
We are only interested in the later stages of the pipeline. Adjust the properties.
26
Run the code until the first ASM breakpoint. Note the cycle number: 39830.
Then run again until the second ASM breakpoint is reached.
Calculate the execution time.
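The calculation is just the difference between the two cycle readings. In the sketch below, 39830 is the first reading from this slide; the second reading and the clock frequency are not given in the transcript, so both are left as parameters, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Cycles spent between the two breakpoints.
std::uint64_t elapsedCycles(std::uint64_t startCycle, std::uint64_t endCycle) {
    assert(endCycle >= startCycle);
    return endCycle - startCycle;
}

// Convert a cycle count to seconds for a given core clock (CCLK) frequency.
// coreClockHz is an assumption supplied by the caller, not a value from the slides.
double cyclesToSeconds(std::uint64_t cycles, double coreClockHz) {
    assert(coreClockHz > 0.0);
    return static_cast<double>(cycles) / coreClockHz;
}
```

For example, a hypothetical second reading of 39856 would give `elapsedCycles(39830, 39856) == 26`, which matches the 26 cycles the pipeline viewer reports later in the deck.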
28
Pipeline viewer says 26 cycles but what do we expect 8 cycles
29
Pipeline viewer says 26 cycles but what do we expect -- 21 13 cycles expected
Where are the extra cycles coming from, and how easy is it to code in such a way that the extra cycles can be removed?
ANSWER: Fairly straightforward in idea, but can be difficult in practice.
30
Understanding the TigerSHARC ALU pipeline
TigerSHARC has many pipelines.
If these pipelines stall, the processor speed goes down.
We need to understand how the ALU pipeline works.
Learn to use the pipeline viewer.
The answer may be different for floating point and integer operations.