Generating the “Rectify” code (C++ and assembly code)


Generating the “Rectify” code (C++ and assembly code) Prelaboratory assignment information

Concepts
- Concepts of C++ “stubs”
- Forcing the test to fail
- Generating valid C++ code to satisfy the tests
- Need for “name mangling” for overloaded functions
- How do you find out the mangled name so it can be used in assembly code?
- Learning just enough TigerSHARC assembly code to make things “work”
11/15/2018 TigerSHARC assemble code 1, M. Smith, ECE, University of Calgary, Canada

“C++ stub”: just enough information to satisfy the compiler

    #include <stdio.h>
    #include <string.h>
    #include <stdlib.h>

    int *HalfWaveRectifyDebug(int initial_array[], int final_array[], int N)
    {
        return NULL;
    }

Integer rectify: force the tests to fail (a test of the test)
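
The “test of the test” idea can be sketched as follows. This is a minimal illustration that assumes a plain boolean check rather than the course's actual test framework, which the slides do not show. `CheckReturnsFinalArray` and `ConfirmStubFailsCheck` are hypothetical names; the stub is the one from the “C++ stub” slide.

```cpp
#include <cassert>
#include <cstddef>

// Stub from the "C++ stub" slide: just enough to satisfy the compiler.
int *HalfWaveRectifyDebug(int initial_array[], int final_array[], int N) {
    (void)initial_array; (void)final_array; (void)N;  // deliberately unused
    return NULL;
}

// Hypothetical check (not the course's test framework): for a valid N the
// function is expected to hand back the address of the final array.
bool CheckReturnsFinalArray() {
    int initial_array[4] = {1, -2, 3, -4};
    int final_array[4] = {0, 0, 0, 0};
    return HalfWaveRectifyDebug(initial_array, final_array, 4) == final_array;
}

// With only the stub in place, the check must report failure.
void ConfirmStubFailsCheck() {
    assert(!CheckReturnsFinalArray());
}
```

Seeing the check fail against the stub gives some confidence that a later pass comes from real code and not from a test that can never fail.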

Passing integer rectify

Add the “ASM” tests. We want the link to fail, so that the linker error shows the name-mangled function name.
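
A small C++ sketch of why mangling is needed. The function names `WhichRectify` and `WhichRectifyC` are hypothetical, chosen only for illustration. The exact mangled spelling depends on the compiler, so it is not shown here; it has to be read from the tool output, as the slide suggests.

```cpp
#include <cassert>

// Two overloads share one source-level name. The C++ compiler gives each a
// different (mangled) linker symbol built from the parameter types.
int WhichRectify(int *, int *, int) { return 1; }       // integer version
int WhichRectify(float *, float *, int) { return 2; }   // float version

// extern "C" switches mangling off: one plain symbol, so no overloading.
extern "C" int WhichRectifyC(int *, int *, int) { return 3; }

// The argument types alone decide which overload is called.
void CheckOverloadSelection() {
    int ints[2] = {1, -1};
    float floats[2] = {1.0f, -1.0f};
    assert(WhichRectify(ints, ints, 2) == 1);
    assert(WhichRectify(floats, floats, 2) == 2);
    assert(WhichRectifyC(ints, ints, 2) == 3);
}
```

An assembly code function that is called through a C++ prototype has to use the mangled symbol as its label, which is why the name must be found before writing the ASM stub.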

Generate assembly code
- Do the code in steps, attempting to satisfy one test at a time
- Learn “the assembler” in steps
- Get “some idea” of the issues we need to learn about as we go along
- Just enough knowledge “to get things to work”
- Worry about full details later

Add ASM code stub: Lab1/HalfWaveRectifyASMint.asm

Where the test failed within the test file

What we have learned
- We passed the “very general” test: we managed to call and return from an assembly code function and did not crash the system
- We passed some specific tests in the test file “by accident”. Which tests, and why did they pass?
- CJUMP is the “way to return” from an assembly code function to C++
- Instruction format is interesting:

    nop; nop; nop;;
    cjump;;

  A single ; separates instructions that are executed together. A double ;; indicates the end of a “grouped” instruction.
- When jumps are involved, TigerSHARC seems to prefer code that involves “four 32-bit instructions” because of a “BTB requirement”

More detailed look at the code
- As with 68K, needs a .section, but the name and format are different
- As with 68K, needs an .align statement. Is the “4” in bytes (8 bits) or words (32 bits)?
- As with 68K, needs .global to tell other code that this function exists
- Single semi-colons and double semi-colons
- Start function label and end function label
- Label format similar to 68K: needs leading underscore and final colon

Need to know
- How do we return “an integer pointer”? Need to look at the C++ manual for coding conventions
- As with 68K, expect to have:
  - Volatile registers: function variate registers that DON’T need to be conserved
  - Non-volatile registers: function invariate registers that DO need to be conserved

Return registers
- There are many, depending on what you need to return. Here we need to use J8
- Many registers available, so we need the ability to control usage:
  - J0 to J31: registers (integers and pointers) (SISD mode)
  - XR0 to XR31: registers (integers) (SISD mode)
  - XFR0 to XFR31: registers (floats) (SISD mode)
  - Did I also mention K0 to K31: registers (integers and pointers) (SISD mode)
  - YR0 to YR31, YFR0 to YFR31 (SIMD mode)
  - XYR, YXR and R registers (SIMD mode)
  - And also the MIMD modes
  - And the …….

    #define return_pt_J8 J8    // J8 is a VOLATILE register

Using J8 for the returned int * value. Now passing this test “by accident”. Should be conditionally passing back NULL.

Conditional tests
Need to code: returning a NULL or the starting address of the final array

    int *HalfWaveRectifyRelease(int initial_array[ ], int final_array[ ], int N)
        if (N < 1) return_pt = NULL;
        else /* after some calculations */ return_pt = &final_array[0];

Questions to ask the instruction manual
- How are parameters passed to us? On the stack (as with 68K) or in registers / stack (as with MIPS)? The answer turns out to be more like MIPS
- How do you do an IF?
- How do you do conditional jumps?
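
For reference, a C++ sketch of the behaviour the assembly code has to reproduce. The NULL-or-final-array return comes from the slide. The loop body is an assumption, since the slide only says “after some calculations”: it uses the usual half-wave rule of keeping positive samples and replacing negative ones with zero. `CheckRectifySketch` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>

int *HalfWaveRectifyRelease(int initial_array[], int final_array[], int N) {
    if (N < 1) {
        return NULL;                 // nothing to process
    }
    for (int i = 0; i < N; i++) {
        // Assumed rule: keep positive samples, replace negative ones with 0.
        final_array[i] = (initial_array[i] > 0) ? initial_array[i] : 0;
    }
    return final_array;              // starting address of the final array
}

// Hypothetical check of both return paths.
void CheckRectifySketch() {
    int initial_array[4] = {3, -1, 0, -7};
    int final_array[4] = {9, 9, 9, 9};
    assert(HalfWaveRectifyRelease(initial_array, final_array, 4) == final_array);
    assert(final_array[0] == 3 && final_array[1] == 0);
    assert(final_array[2] == 0 && final_array[3] == 0);
    assert(HalfWaveRectifyRelease(initial_array, final_array, 0) == NULL);
}
```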

Parameter passing
- Spaces for the first four parameters are present on the stack (as with 68K)
- But the first four parameters are passed in registers (J4, J5, J6 and J7 most of the time) (as with MIPS)
- The parameters passed in registers are often stored into the spaces on the stack (like the MIPS) when assembly code functions call assembly code functions
- J4, J5, J6 and J7 are volatile registers

Coding convention

    // int *HalfWaveRectifyRelease(int initial_array[ ],
    //                             int final_array[ ], int N)
    #define initial_pt_inpar1 J4
    #define final_pt_inpar2 J5
    #define N_inpar3 J6
    #define return_pt_J8 J8

Doing an IF (N < 1) JUMP type of instruction

68K version

    CMP.L #1, D1        ; Performs D1 - 1 and sets the condition code flags
    BLT ELSE            ; Branch if result of D1 - 1 < 0
                        ; BLT tests D1 - 1 < 0 (that is, D1 < 1); BLE would
                        ; test D1 - 1 <= 0, which is not D1 < 1

TigerSHARC version

    COMP(N_inpar3, 1);;     // Perform N - 1 test
    IF JLT, JUMP ELSE;;     // Note use of comma , and semi-colons ;;

Same possible error on BOTH processors
- 68K: which of BLE, BLT or BGT?
- TigerSHARC: which of JLE, JLT or NJLE?
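
The “which condition?” question can be checked with a small C++ model. This is only a sketch of the arithmetic behind the flag tests, not of either processor; the function names are hypothetical, and the mnemonics in the comments are the ones named on the slide.

```cpp
#include <cassert>

// Model of the test made after "compare with 1", which evaluates N - 1.
bool TakenWithLessThan(int N)    { return (N - 1) <  0; }  // BLT / JLT
bool TakenWithLessOrEqual(int N) { return (N - 1) <= 0; }  // BLE / JLE

// The two conditions agree for N = 0 and differ at the boundary N = 1.
void CheckBoundaryCase() {
    assert(TakenWithLessThan(0) && TakenWithLessOrEqual(0));
    assert(!TakenWithLessThan(1));      // N = 1 is valid: must not branch
    assert(TakenWithLessOrEqual(1));    // the "or equal" form wrongly branches
}
```

Only the “less than” form matches `if (N < 1)`, so a test with N = 1 is the one that exposes a wrong choice of branch.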

- ELSE is a KEYWORD
- Missing ;; means all these instructions are joined into one “line” of more than 4 instructions
- Note: END_IF is not defined, and this is not yet recognized as an error

Why is ELSE a keyword

    IF JLT; ELSE, J1 = J2 + J3;         // Conditional execution
            ELSE, XR1 = XR2 + XR3;      // Conditional
            YFR1 = YFR2 + YFR3;;        // Unconditional

    IF JLT; DO, J1 = J2 + J3;           // Conditional execution
            DO, XR1 = XR2 + XR3;        // Conditional
            DO, YFR1 = YFR2 + YFR3;;    // Conditional

I think I have also seen an IF, DO, ELSE instruction that can be used under special circumstances

Personally, because of name mangling issues, I cut-and-paste the function name into labels
Two issues
- Jumps can be predicted to happen (default)
- Quad stuff issue

QUAD and predicted jumps
- Apparently there are both predicted and unpredicted jumps
- All jumps are very disruptive to the TigerSHARC pipeline
- Uses something called the “Branch Target Buffer” (BTB) to help overcome this. Saw this on the AMD-29050 RISC processor
- Probably a 4-instructions-per-line cache, so that jumps need to have 4 instructions between them to work

Not the best solution. But at this stage “best” is not needed; “working” is needed

NEXT STEP: the code was not exactly what we designed (the C++ equivalent), so refactor, and retest after the refactoring

Exercise: code the following as a software loop, following the 68K approach

    extern "C" int CalculateSum(void) {
        int sum = 0;
        for (int count = 0; count < 6; count++) {
            sum = sum + count;
        }
        return sum;
    }

extern "C" means that this function is “C” compatible rather than “C++”. No overloading (requiring name-mangling) permitted

Reminder: a software for-loop becomes a “while loop” with an initial test

    extern "C" int CalculateSum(void) {
        int sum = 0;
        int count = 0;
        while (count < 6) {
            sum = sum + count;
            count++;
        }
        return sum;
    }

Do a line by line translation
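
One way to prepare for the line by line translation is to rewrite the while loop in C++ using only labels, a conditional jump and an unconditional jump, since that is the shape the assembly code will take. This is a sketch, not the course solution; the function name `CalculateSumGotoForm` and the labels `WHILE_TEST` and `END_WHILE` are assumptions made for illustration.

```cpp
#include <cassert>

extern "C" int CalculateSumGotoForm(void) {
    int sum = 0;
    int count = 0;
WHILE_TEST:
    if (!(count < 6)) goto END_WHILE;   // initial test: leave when count >= 6
    sum = sum + count;                  // loop body
    count++;
    goto WHILE_TEST;                    // jump back to repeat the test
END_WHILE:
    return sum;
}

// The loop adds 0 + 1 + 2 + 3 + 4 + 5.
void CheckCalculateSum() {
    assert(CalculateSumGotoForm() == 15);
}
```

Each C++ line here maps onto roughly one assembly step: a compare, a conditional jump out of the loop, the body, and an unconditional jump back to the test.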

Concepts
- Concepts of C++ “stubs”
- Forcing the test to fail
- Generating valid C++ code to satisfy the tests
- Need for “name mangling” for overloaded functions
- How do you find out the mangled name so it can be used in assembly code?
- Learning just enough TigerSHARC assembly code to make things “work”