Generating a software loop with memory accesses TigerSHARC assembly syntax.

1 Generating a software loop with memory accesses TigerSHARC assembly syntax

2 10/1/2016 TigerSHARC assemble code 2, M. Smith, ECE, University of Calgary, Canada 2 / 38 Concepts: Learning just enough TigerSHARC assembly code to make a software loop “work”. Comparing the timings for rectification of integer and floating point arrays using debug C++ code, Release C++ code, and our FIRST_ASM code. Looking in “MIXED mode” at the code generated by the compiler.

3 Test Driven Development. CUSTOMER and DEVELOPER: work with the customer to check that the tests properly express what the customer wants done. This is an iterative process with the customer “heavily involved” (“Agile” methodology).

4 Note the special marker. Compiler optimization: FLOATS 927 to 304 (THREE FOLD), INTS 960 to 150 (SIX FOLD). Why the difference, can we do better, and do we want to? Note the failures: what are they?

5 Write tests about passing values back from an assembly code routine.

6 More detailed look at the code. Single semi-colons. Double semi-colons. Start function label. End function label: used for “profiling code”. The label format is similar to 68K: it needs a leading underscore and a final colon. As with 68K and Blackfin, the code needs a .section, but the name and format are different. As with 68K, it needs a .align statement. Is the “4” in bytes (8 bits) or words (32 bits)? As with 68K, it needs .global to tell other code that this function exists.

7 Return registers. There are many, depending on what you need to return. Here we need to use J8 as the return register to pass back an “integer” pointer. Many registers are available, so we need the ability to control usage: J0 to J31 registers (integers and pointers) (SISD mode); XR0 to XR31 registers (integers) (SISD mode); XFR0 to XFR31 registers (floats) (SISD mode). Did I also mention: K0 to K31 registers (integers and pointers) (SISD mode); YR0 to YR31, YFR0 to YFR31 (SIMD mode); XYR, YXR and R registers (SIMD mode); and also the MIMD modes; and the double registers and the quad registers.
    #define return_pt_J8 J8    // J8 is a VOLATILE, NON-PRESERVED register

8 Parameter passing. SPACES for the first four parameters ARE ALWAYS present on the stack (as with 68K), but the first four parameters are passed in registers (J4, J5, J6 and J7 most of the time) (as with MIPS and Blackfin). The parameters passed in registers are often stored into the spaces on the stack (like the MIPS) as the first step when assembly code functions call assembly code functions. J4, J5, J6 and J7 are volatile, non-preserved registers.

9 Can we pass back the start of the final array? We are still passing tests by accident, and this needs to be a conditional return value.

10 What we need to know, based on experiences from other processors: Can we return from an assembly language routine without crashing the processor? Return a parameter from an assembly language routine (is it the same for ints and floats?). Pass parameters into assembly language (is it the same for ints and floats?). Do IF THEN ELSE statements. Read and write values to memory. Read and write values in a loop. Do some mathematics on the values fetched from memory. All this stuff is demonstrated by coding HalfWaveRectifyASM( ).

11 Why is ELSE a keyword? A FOUR PART ELSE INSTRUCTION IS LEGAL:
    IF JLT; ELSE, J1 = J2 + J3;     // Conditional execution -- if true
            ELSE, XR1 = XR2 + XR3;  // Conditional -- if true
            YFR1 = YFR2 + YFR3;;    // Unconditional -- always
    IF JLT; DO, J1 = J2 + J3;       // Conditional execution -- if true
            DO, XR1 = XR2 + XR3;    // Conditional -- if true
            YFR1 = YFR2 + YFR3;;    // Unconditional -- always
Having this sort of format means that the instruction pipeline is not disrupted when we do IF statements.

12 The label name is not the problem. NOTE: This is “C-like” syntax, but it is not “C”. The statement must end in ;; not ; ONE semicolon = end of instruction. TWO semicolons = end of parallel instruction line.

13 Add dual semi-colons everywhere; worry about “multiple issues” later. This dual semi-colon is so important that you MUST code review for it all the time, or else you waste a lot of time in the Lab. Key in exams / quizzes. At last, an error I know how to fix.

14 Well, I thought I understood it! Speed issue: JUMP instructions can’t be too close together when stored in memory. This is not normally a problem when the “if” code is larger.

15 Add a single instruction line of 4 NOPs: nop; nop; nop; nop;; This is a TEMPORARY fix. Fix the last remaining error, in handling the IF THEN ELSE, as part of Assignment 1. Worry about code efficiency later (refactor), when all the code is working.

16 What we need to know, based on experiences from other processors: Can we return from an assembly language routine without crashing the processor? Return a parameter from an assembly language routine (is it the same for ints and floats?). Pass parameters into assembly language (is it the same for ints and floats?). Do IF THEN ELSE statements. Read and write values to memory. Read and write values in a loop. Do some mathematics on the values fetched from memory. All this stuff is demonstrated by coding HalfWaveRectifyASM( ).

17 Target: changing this C++ code into assembly (to get “more” speed). The code we generated yesterday was similar to parts of this, but not equivalent. Re-factor the code to make the assembly code and C++ functionality equivalent.
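
The C++ target itself was shown as an image and is not part of this transcript. As a rough sketch only, a half-wave rectifier of the kind the deck describes might look like the following. The function name HalfWaveRectifyCPP, the parameter names, and the rule "return NULL when N <= 0" are assumptions pieced together from the later slides (return a pointer to the start of the final array, conditionally return NULL), not the author's actual code.

```cpp
#include <cassert>

// Hypothetical C++ reference version of the rectifier (names and signature
// are assumptions). Negative samples become 0, everything else is copied.
int *HalfWaveRectifyCPP(const int input[], int output[], int N) {
    if (N <= 0) {
        return nullptr;                  // conditional return value: nothing to do
    }
    assert(input != nullptr && output != nullptr);

    for (int count = 0; count < N; count++) {
        if (input[count] < 0) {
            output[count] = 0;           // clip the negative half of the wave
        } else {
            output[count] = input[count];
        }
    }
    return output;                       // start of the final array
}
```

Written this way, each piece (the guard, the loop control, the memory read, the IF THEN ELSE, the memory write) maps onto one of the items in the "what we need to know" list.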

18 The code was not exactly what we designed (the C++ equivalent). NEXT STEP: re-factor, and retest after the re-factoring.

19 Refactored C++ code. I THINK I UNDERSTAND ENOUGH TO CHANGE THE FORMAT OF THE IF-THEN-ELSE TO OPTIMIZE THIS PARTICULAR CODE BIT. USE: IF TRUE, EXECUTE THIS STATEMENT (SINGLE LINE). Avoiding JUMPS in the main flow of the code will speed the flow of the code. Almost right: SYNTAX ERROR. Look in the manual to find the correct syntax. IF NJLE; DO, J8 = 0

20 No syntax errors (no CODE ERRORS), but the code does not work (CODE DEFECTS). We don’t have enough code to pass all the tests, but we are failing tests we did not expect to fail.

21 Run “forensic tests” to find out where the DEFECT is being introduced. Identify the mistake by removing “code sections”. Without the IF.

22 Add another line to the code. We can now spot the error: the new format of the IF-THEN-ELSE is doing exactly the opposite of what we want. It does IF NOT TRUE, return NULL (0). Need JLE, not NJLE.

23 Assignment 1: code the following as a software loop, following the MIPS / Blackfin approach. DONE DURING TUTORIAL.
    int CalculateSum(void) {
        int sum = 0;
        for (int count = 0; count < 6; count++) {
            sum = sum + count;
        }
        return sum;
    }

24 Reminder: a software for-loop becomes a “while loop” with an initial test.
    int CalculateSum(void) {
        int sum = 0;
        int count = 0;
        while (count < 6) {
            sum = sum + count;
            count++;
        }
        return sum;
    }
Do a line by line translation into assembly code.

25 USE SOFTWARE LOOP HERE. Do the loop control first. We have some jumps too close together. NOTE: JGE is ILLEGAL, USE NJLT. Customize? #define JGE NJLT
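
One way to picture "loop control first" before writing any assembly is to rewrite the while loop with explicit labels and jumps, so that each C++ line corresponds to roughly one assembly instruction line. The sketch below is only an illustration: the function name CalculateSumGoto and the labels LOOP_TEST and LOOP_END are made up here, and the exit test is written as "NOT less than" to mirror the NJLT condition mentioned on the slide.

```cpp
#include <cassert>

// Goto-style rewrite of CalculateSum(): the shape a line by line assembly
// translation would take. Labels stand in for assembly labels.
int CalculateSumGoto(void) {
    int sum = 0;
    int count = 0;                      // must be set, or a stale value is used

LOOP_TEST:
    if (!(count < 6)) goto LOOP_END;    // exit when NOT (count < 6), cf. NJLT
    sum = sum + count;                  // loop body
    count++;
    goto LOOP_TEST;                     // jump back to the test

LOOP_END:
    assert(count == 6);                 // the loop only exits at the limit
    return sum;
}
```

The conditional jump at LOOP_TEST and the unconditional jump at the bottom of the body are the two jumps of this structure; in a loop this short they sit close together in memory.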

26 Run the tests with 4 nop padding to check that we get out of the loop as expected. Adding 4 nops: lose 1 cycle, gain an hour by not trying to solve the problem. If you need the 1 cycle, refactor the code later.

27 Accessing memory: basic mode. The special register J31 acts as zero when used in additions. Pt_J5 is a pointer register into an array. Value_J1 is being used as a data register. J registers are like MIPS registers (used as both pointer and data), NOT like 68K or Blackfin registers, which can be used as either data or address registers but not both. NOTE: Later we will find that using TigerSHARC registers for data operations is a BAD idea.
    1. Value_J1 = [Pt_J5];;         read value from the memory location pointed to by J5. Compare to Blackfin: Value_R0 = [Pt_P0];;
    2. Value_J1 = [Pt_J5 + J31];;   read value from the memory location pointed to by J5. I read somewhere that this CAN be faster than just Value_J1 = [Pt_J5];; (NEED TO CONFIRM)

28 Accessing memory, step 2: basic mode. Pt_J5 is a pointer register into an array. Offset_J4 is used as an offset. Value_J1 is being used as a data register to receive the memory value (load / store architecture).
    1. Read_J1 = [Pt_J5 + Offset_J4];;    read value from the memory location pointed to by (J5 + J4). PRE-MODIFY: address used is J5 + J4, no change in J5.
    2. Read_J1 = [Pt_J5 += Offset_J4];;   read value from the memory location pointed to by J5, and then perform the add operation on the J5 register (points to the NEXT location). POST-MODIFY: address used is J5, then perform J5 = J5 + J4.
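
The difference between the two addressing forms can be mimicked with ordinary C++ pointers. This is a sketch for intuition only, not TigerSHARC code: ReadResult, readPreModify and readPostModify are hypothetical names, and offsets are counted in array elements as C++ pointer arithmetic does.

```cpp
#include <cassert>

// Result of one memory read: the value that lands in Read_J1 and the
// content of the pointer register Pt_J5 after the access.
struct ReadResult {
    int value;
    const int *pt;
};

// Models  Read_J1 = [Pt_J5 + Offset_J4];;
// PRE-MODIFY: the address used is J5 + J4, and J5 itself is unchanged.
ReadResult readPreModify(const int *pt, int offset) {
    assert(pt != nullptr);
    return { *(pt + offset), pt };
}

// Models  Read_J1 = [Pt_J5 += Offset_J4];;
// POST-MODIFY: the address used is J5, then J5 = J5 + J4.
ReadResult readPostModify(const int *pt, int offset) {
    assert(pt != nullptr);
    return { *pt, pt + offset };
}
```

With an offset of 1, the post-modify form reads the current element and leaves the pointer on the next one, which is the natural form for stepping through an array inside a loop.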

29 Add in the memory accesses. DON'T FORGET: TigerSHARC = RISC PROCESSOR, LOAD/STORE ONLY, like MIPS and Blackfin. You must place the value into a register, and then copy the register to memory.
    NO    [J5 + J0] = 0;;
    NO    J3 = 0; [J5 + J0] = J3;;     Uses the wrong J3. Remember that TigerSHARC can handle parallel instructions.
    YES   J3 = 0;; [J5 + J0] = J3;;
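
Why the single-semicolon version "uses the wrong J3" can be pictured with a small model in which every instruction on one parallel line sees the register values from before that line started. This is a simplified illustration of the slide's point, not a description of the real pipeline; the Regs struct and both function names are hypothetical, and J5 + J0 is treated as a plain array index.

```cpp
#include <cassert>

// Tiny register-file model: only the registers used on the slide.
struct Regs {
    int J0;   // offset
    int J3;   // value to store
    int J5;   // base index into mem
};

// Models  J3 = 0; [J5 + J0] = J3;;  issued as ONE instruction line.
// Both instructions read the registers as they were before the line.
void storeZeroOneLine(Regs &r, int mem[]) {
    const Regs before = r;                      // snapshot at start of the line
    assert(before.J5 + before.J0 >= 0);
    r.J3 = 0;                                   // takes effect after the line
    mem[before.J5 + before.J0] = before.J3;     // stores the OLD J3
}

// Models  J3 = 0;; [J5 + J0] = J3;;  issued as TWO instruction lines.
void storeZeroTwoLines(Regs &r, int mem[]) {
    assert(r.J5 + r.J0 >= 0);
    r.J3 = 0;                                   // first line completes
    mem[r.J5 + r.J0] = r.J3;                    // second line sees the new J3
}
```

In this model the one-line version writes whatever was left in J3 by earlier code, while the two-line version writes the intended zero.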

30 Understand the error message: “Too many J resource usage” = missing ;; We are unintentionally doing the parallel instruction line [J5 + J0] = J2; J0 = J0 + 1;;

31 Note: a missing label is not an assembler error, it’s a linker error. Fix warnings: it may be days before you try to link, and then the DEFECT is hard to find.

32 NOW that the assembler knows where “CONTINUE” is, it can tell you that you have two JUMP instructions too close together. Fix with the magic 4 nops, and lose one cycle per loop.

33 Not getting the expected test results. Something is logically wrong (DEFECT).

34 Obvious question: are we even getting into the loop? Add a BREAKPOINT to TEST the code flow. (We don’t add BREAKPOINTS to follow the code in detail.) CODE NEVER GOT TO BREAKPOINT means the code never entered the loop. We forgot to do count = 0, so we are not even getting into the loop, as there is a garbage value already in Count_J0 from code we executed earlier (DEFECT).

35 Not bad for a first effort: faster than the compiler in debug mode.

36 Where did the float ASM code suddenly appear from?
    Integer 0     has bit pattern 0x0000 0000
    Float 0.0     has bit pattern 0x0000 0000
    Integer +6    has format b 0??? ???? ???? ???? ???? ???? ???? ????
    Float +6.0    has format b 0??? ???? ???? ???? ???? ???? ???? ????
    Integer -6    has format b 1??? ???? ???? ???? ???? ???? ???? ????
    Float -6.0    has format b 1??? ???? ???? ???? ???? ???? ???? ????
The formats are very different, but the sign bit is in the same place. Float algorithm: if S == 1 (negative), set to zero; otherwise leave unchanged. This is the same as the integer algorithm, so just re-use the integer algorithm with a change of name.
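
The sign-bit argument can be checked on a desktop compiler. The sketch below assumes a 32-bit IEEE 754 float and a two's complement 32-bit integer, and the function names rectifyBits and rectifyFloatViaInt are made up for this illustration. It copies the float's bit pattern into an integer, applies the integer rectify rule, and copies the result back.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Integer rectify rule applied to a raw 32-bit pattern:
// if the sign bit is set (value is negative), replace with 0.
std::int32_t rectifyBits(std::int32_t bits) {
    return (bits < 0) ? 0 : bits;
}

// Float rectify that re-uses the integer algorithm on the float's bits.
float rectifyFloatViaInt(float x) {
    static_assert(sizeof(float) == sizeof(std::int32_t),
                  "sketch assumes a 32-bit float");
    std::int32_t bits;
    std::memcpy(&bits, &x, sizeof bits);   // view the float as an integer
    bits = rectifyBits(bits);              // integer 0 is also float 0.0
    float y;
    std::memcpy(&y, &bits, sizeof y);      // view the integer as a float
    assert(!(y < 0.0f));                   // result is never negative
    return y;
}
```

The trick works because the all-zero bit pattern means zero in both formats and the sign bit sits in the same position, so the integer test "is it negative" answers the same question for the float.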

37 Final code: the float rectify code just has a different name.

38 What we NOW KNOW: Can we return from an assembly language routine without crashing the processor? Return a parameter from an assembly language routine (is it the same for ints and floats?). Pass parameters into assembly language (is it the same for ints and floats?). Do IF THEN ELSE statements. Read and write values to memory. Read and write values in a loop. Do some mathematics on the values fetched from memory. All this stuff is demonstrated by coding HalfWaveRectifyASM( ).

