Overview of SHARC processor ADSP-21061 Program Flow and other stuff


Overview of SHARC processor ADSP-21061 Program Flow and other stuff
M. R. Smith, Electrical and Computer Engineering, University of Calgary, Alberta, Canada
smithmr@ucalgary.ca

To be tackled today
  Reference sources
  Program Flow
  Some warnings of expected errors
  Code review and code review standards
11/30/2018 ENCM515 -- Review of SHARC Processor Copyright smithmr@ucalgary.ca

Reference Sources
  ADSP-2106x SHARC User's Manual, 2nd edition, Analog Devices -- provided to everybody
  ENCM515 SHARC Reference card
  ENCM515 Course, Reference and Laboratory Notes
  Check web-pages for links to VisualDSP++, Compiler, Assembler, Linker and other tools
  Also see ECE-ADI-Project (link from Dr. Smith Home Page)
  SHARC Navigator Tutorial Tool
    See January 2004 web pages for link -- shows basic assembly language operations using simple animation

Picture Source
  SHARC Navigator Tutorial Tool
  Talik Alukaidey (T. Alukaidey@herts.ac.uk), Dept. of EEE, University of Hertfordshire, Hatfield, U.K.

ADSP-2106x Core Architecture
[Block diagram. Labelled blocks: DAG 1 (8 x 4 x 32), DAG 2 (8 x 4 x 24), cache memory (32 x 48), program sequencer, bus connect, timer, flags, JTAG test & emulation, register file (16 x 40), floating & fixed-point multiplier with fixed-point accumulator, 32-bit barrel shifter, floating-point & fixed-point ALU. Buses: PMA (24 bits), DMA (32 bits), PMD (48 bits), DMD (40 bits).]

Processor Pipelines
  CISC processor -- complex addressing modes -- less memory used for storing instructions
  ADSP-2106X -- any instruction that makes the processor pipelines (note plural) slow has been discarded
    Compute Unit Pipeline
    Memory Unit Pipeline
    Instruction Pipeline
  Some instructions can be described in fewer bits than other instructions, and can therefore be combined with other instructions -- multi-function instructions

PIPELINE ISSUES
[Pipeline trace diagram. Columns: FETCH, DECODE, EXECUTE/WRITEBACK. Instructions shown: R1 = R2 + R3, R4 = ..., R7 = R8 + R9, R10 = ..., R13 = ... Each instruction is annotated in EXECUTE/WRITEBACK with the register it writes and "ASTAT SET".]

PIPELINE ISSUES
[Pipeline trace diagram. Columns: FETCH, DECODE, EXECUTE/WRITEBACK. Instructions shown: R1 = ..., COMP(R1,R2), R4 = R5 + R6, R7 = R8 + R9, R10 = ..., R13 = ... COMP is annotated "AZ and AN set"; the register instructions are annotated "ASTAT SET".]
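
The trace diagrams above can be rebuilt by hand or with a few lines of code. The following is a minimal sketch, not a SHARC simulator: it assumes a three-stage pipeline (the FETCH, DECODE, EXECUTE/WRITEBACK columns of the slides), one instruction entering per cycle, and straight-line code with no jumps or stalls. The function names `pipeline_trace` and `format_trace` are made up for this illustration.

```python
STAGES = ("FETCH", "DECODE", "EXECUTE/WRITEBACK")


def pipeline_trace(instructions):
    """Return one row per cycle: the instruction (or None) held in each stage."""
    depth = len(STAGES)
    rows = []
    # The last instruction is fetched in cycle len(instructions) and needs
    # depth - 1 more cycles to reach the final stage.
    for cycle in range(len(instructions) + depth - 1):
        row = []
        for stage in range(depth):
            index = cycle - stage  # instruction fetched `stage` cycles ago
            in_range = 0 <= index < len(instructions)
            row.append(instructions[index] if in_range else None)
        rows.append(tuple(row))
    return rows


def format_trace(rows):
    """Render the rows as a plain-text table, one line per cycle."""
    lines = ["cycle  " + "  ".join(f"{name:<18}" for name in STAGES)]
    for number, row in enumerate(rows, start=1):
        cells = "  ".join(f"{(text or '-'):<18}" for text in row)
        lines.append(f"{number:<5}  {cells}")
    return "\n".join(lines)


# Three instructions taken from the slides.
program = ["R1 = R2 + R3", "R4 = R5 + R6", "R7 = R8 + R9"]
trace = pipeline_trace(program)
print(format_trace(trace))
```

In this model the first result is written in cycle 3, and after that one instruction completes every cycle as long as nothing disturbs the flow.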

PIPELINE ISSUES -- JUMP NOT TAKEN
[Pipeline trace diagram. Columns: FETCH, DECODE, EXECUTE/WRITEBACK. Instructions shown: R1 = ..., COMP(R1,R2), IF EQ JUMP ELSE, R7 = R8 + R9, R10 = ..., ELSE: R13 = ... COMP is annotated "AZ and AN set"; the register instructions are annotated "ASTAT SET".]

PIPELINE ISSUES -- JUMP TAKEN
[Pipeline trace diagram. Columns: FETCH, DECODE, EXECUTE/WRITEBACK. Instructions shown: R1 = ..., COMP(R1,R2), IF EQ JUMP ELSE, R7 = R8 + R9, R10 = ..., ELSE: R13 = ... COMP is annotated "AZ and AN set"; the register instructions are annotated "ASTAT SET".]
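
The difference between the two jump traces can be sketched as a list of what occupies the EXECUTE/WRITEBACK stage in each cycle. This is a rough model under one stated assumption: when a non-delayed jump is taken, the two instructions already sitting in FETCH and DECODE are aborted, so two execute slots are lost. The tuple layout, the `execute_stream` helper and the "(aborted)" marker are invented for the illustration; the truncated instructions from the slide are kept as "R10 = ..." and "R13 = ...".

```python
BRANCH_PENALTY = 2  # assumed: instructions in FETCH and DECODE are aborted


def execute_stream(program, labels, max_cycles=50):
    """List what occupies EXECUTE/WRITEBACK cycle by cycle.

    program: list of (text, jump_target, taken) tuples; jump_target is a
    label name or None, taken says whether the jump condition is met.
    labels: maps label name -> index into program.
    """
    stream = []
    pc = 0
    while pc < len(program) and len(stream) < max_cycles:
        text, target, taken = program[pc]
        stream.append(text)
        if target is not None and taken:
            # The two instructions behind the jump never execute.
            stream.extend(["(aborted)"] * BRANCH_PENALTY)
            pc = labels[target]
        else:
            pc += 1
    return stream


def build(taken):
    """The code fragment of the jump slides, with the jump outcome fixed."""
    return [
        ("COMP(R1, R2)", None, False),
        ("IF EQ JUMP ELSE", "ELSE", taken),
        ("R7 = R8 + R9", None, False),
        ("R10 = ...", None, False),
        ("R13 = ...", None, False),  # ELSE:
    ]


labels = {"ELSE": 4}
not_taken = execute_stream(build(False), labels)
taken = execute_stream(build(True), labels)
print(not_taken)
print(taken)
```

Under these assumptions, jumping over two instructions costs exactly as many execute slots as running them, which is one way to see why program flow changes should be avoided where speed matters.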

PROGRAM FLOW CHANGES
  PROGRAM FLOW CHANGES ARE TO BE AVOIDED IF YOU WANT TO HAVE PROGRAM SPEED
    PROGRAMMING TECHNIQUES
    HARDWARE TECHNIQUES
    COMBO TECHNIQUES

Program Flow
  The obvious instructions
    JUMP -- JUMP to some new memory location to fetch the next instruction
    CALL -- Store the next instruction location (return address), then JUMP to some new memory location to fetch the next instruction
  don't exist as such -- too destructive to program flow
  IF condition JUMP and IF condition CALL do exist
    These instructions are treated as NOPs if the condition is FALSE

STANDARD IF THEN ELSE
  IF (TEST) {
      CODE IF TEST IS TRUE;
  }
  ELSE {
      CODE IF TEST IS FALSE;
  }

STANDARD IF THEN ELSE
  IF (TEST) {
      CODE IF TEST IS TRUE;
  }
  ELSE {
      CODE IF TEST IS FALSE;
  }
  Recode as
          TEST
          IF TEST IS FALSE JUMP ELSE
          CODE IF TEST IS TRUE;
          JUMP ENDIF
  ELSE:   CODE IF TEST IS FALSE;
  ENDIF:

STANDARD IF THEN ELSE
  IF (TEST) {
      CODE IF TEST IS TRUE;
  }
  ELSE {
      CODE IF TEST IS FALSE;
  }
  Recode as
          COMP(Fx, Fy);
          IF NE JUMP ELSE;
          CODE IF TEST IS TRUE;
          JUMP ENDIF;
  ELSE:   CODE IF TEST IS FALSE;
  ENDIF:
  COURSE EXPECTATION -- CODE INDENTATION

PIPELINE ISSUES -- TEST TRUE
[Pipeline trace diagram. Columns: FETCH, DECODE, EXECUTE/WRITEBACK. Instructions shown: R1 = ..., COMP(R1,R2), IF NE JUMP ELSE, R7 = R8 + R9, JUMP ENDIF, ELSE: R13 = ..., ENDIF: R7 = ... COMP is annotated "AZ and AN set"; the register instruction is annotated "ASTAT SET".]

PIPELINE ISSUES -- TEST FALSE
[Pipeline trace diagram. Columns: FETCH, DECODE, EXECUTE/WRITEBACK. Instructions shown: R1 = ..., COMP(R1,R2), IF NE JUMP ELSE, R7 = R8 + R9, JUMP ENDIF, ELSE: R13 = ..., ENDIF: R7 = ... COMP is annotated "AZ and AN set"; the register instruction is annotated "ASTAT SET".]

Efficient if-then-else
  Efficient coding practices to be demonstrated in this course
  Efficient coding practices change depending on algorithm being used
  What looks efficient -- may not be
      If (R1 == R2) R3 = 0;
      else R3 = 1;

PIPELINE ISSUES -- TEST TRUE
[Pipeline trace diagram. Columns: FETCH, DECODE, EXECUTE/WRITEBACK. Instructions shown: COMP(R1,R2), IF NE JUMP ELSE, R3 = 0, JUMP ENDIF, ELSE: R3 = 1, ENDIF: ... COMP is annotated "AZ and AN set"; some entries are marked "Refetched".]
  9 to endif completion

PIPELINE ISSUES -- TEST FALSE
[Pipeline trace diagram. Columns: FETCH, DECODE, EXECUTE/WRITEBACK. Instructions shown: COMP(R1,R2), IF NE JUMP ELSE, R3 = 0, JUMP ENDIF, ELSE: R3 = 1, ENDIF: ... COMP is annotated "AZ and AN set".]
  7 to endif completion

KNOW YOUR CODE PATH
  NE MOST OFTEN -- code as
      If (R1 == R2) R3 = 0;
      else R3 = 1;
    FASTER -- IN THIS CASE
  EQ MOST OFTEN -- code as
      If (R1 != R2) R3 = 1;
      else R3 = 0;
    FASTER -- IN THIS CASE
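
The advice can be checked with a small expected-cost calculation. The sketch below takes the two figures from the trace slides at face value (9 to endif completion when the test is true, 7 when it is false) and reads them as cycle counts. It further assumes that the reversed coding has the same two path costs with the roles of the branches swapped. The function names are invented for the illustration.

```python
CYCLES_TEST_TRUE = 9   # slide figure: fall through, then JUMP ENDIF
CYCLES_TEST_FALSE = 7  # slide figure: one taken IF NE JUMP ELSE


def expected_cycles(p_test_true):
    """Average cost of the if-then-else when the test is true with this probability."""
    if not 0.0 <= p_test_true <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    return p_test_true * CYCLES_TEST_TRUE + (1.0 - p_test_true) * CYCLES_TEST_FALSE


def better_coding(p_equal):
    """Pick the cheaper of the two codings when R1 == R2 occurs with p_equal."""
    cost_eq_test = expected_cycles(p_equal)        # If (R1 == R2) R3 = 0; else R3 = 1;
    cost_ne_test = expected_cycles(1.0 - p_equal)  # If (R1 != R2) R3 = 1; else R3 = 0;
    if cost_ne_test < cost_eq_test:
        return "If (R1 != R2)"
    return "If (R1 == R2)"


for p in (0.1, 0.5, 0.9):
    print(p, expected_cycles(p), better_coding(p))
```

Put differently, under these assumptions the most frequent case belongs in the ELSE branch, because that path needs only one jump.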

Other alternatives
      R3 = 1;
      IF (R1 == R2) R3 = 0;
  Standard approach
          COMP(R1, R2);
          IF NE JUMP ENDIF;
          R3 = 0;
  ENDIF:
  Avoids a JUMP, since the else assignment is done whether it is needed or not

DO pipeline trace
[Exercise: two blank pipeline tables with columns F, D, E/W, one for IF JUMP TAKEN and one for IF JUMP NOT TAKEN. Instructions to trace: R3 = ..., COMP, IF JUMP ENDIF.]

Other alternatives
      R3 = 1;
      IF (R1 == R2) R3 = 0;
  Use IF COMPUTE -- 21061 approach
      R3 = 1;
      COMP(R1, R2);
      IF EQ R3 = R3 - R3;
  Avoids 2 JUMPs
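
A quick way to convince yourself that the jump-free version computes the same result is to compare both forms over a range of inputs. The sketch below models the registers as plain integers and the AZ flag as a boolean set when the COMP operands are equal. It checks the logic only and says nothing about timing; the function names are invented for the illustration.

```python
def if_then_else(r1, r2):
    """Reference behaviour: If (R1 == R2) R3 = 0; else R3 = 1;"""
    if r1 == r2:
        r3 = 0
    else:
        r3 = 1
    return r3


def conditional_compute(r1, r2):
    """Jump-free form: default first, then a conditional compute."""
    r3 = 1            # R3 = 1;
    az = (r1 == r2)   # COMP(R1, R2);  modelled as: AZ set when equal
    if az:            # IF EQ ...
        r3 = r3 - r3  # ... R3 = R3 - R3;  a compute operation yielding 0
    return r3


mismatches = [
    (a, b)
    for a in range(-4, 5)
    for b in range(-4, 5)
    if if_then_else(a, b) != conditional_compute(a, b)
]
print(mismatches)
```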

DO pipeline trace
[Exercise: two blank pipeline tables with columns F, D, E/W, one for IF TRUE and one for IF FALSE. Instructions to trace: R3 = ..., COMP, IF COMPUTE, ENDIF.]

Why is this not allowed?
      IF EQ R3 = R3 - R3;    is okay
  BUT
      IF EQ R3 = 0;          is illegal
  IF EQ COMPUTE is the instruction format
    R3 = R3 - R3; is a compute operation (23 bit)
    R3 = 0; is a UREG operation using a 32 bit constant -- illegal
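
The bit-budget argument can be made concrete with some arithmetic. The sketch below is illustrative only. The 48-bit instruction word matches the 48-bit PMD bus and the 32 x 48 cache on the architecture slide, and the 23-bit compute field and 32-bit constant come from this slide. The 5-bit condition field and 8-bit UREG field are assumptions made for the example and should be checked against the User's Manual opcode tables.

```python
WORD_BITS = 48           # instruction word width (48-bit PMD bus, 32 x 48 cache)
COMPUTE_BITS = 23        # from the slide
CONSTANT_BITS = 32       # from the slide
ASSUMED_COND_BITS = 5    # assumption: width of the IF-condition field
ASSUMED_UREG_BITS = 8    # assumption: width of a universal-register field


def spare_bits(*field_widths):
    """Bits left in one instruction word after the given fields are placed."""
    return WORD_BITS - sum(field_widths)


# IF EQ R3 = R3 - R3;  -> condition + compute operation
conditional_compute_spare = spare_bits(COMPUTE_BITS, ASSUMED_COND_BITS)

# IF EQ R3 = 0;  -> condition + destination register + 32-bit constant
conditional_constant_spare = spare_bits(
    CONSTANT_BITS, ASSUMED_UREG_BITS, ASSUMED_COND_BITS
)

print(conditional_compute_spare)   # bits left for opcode and other fields
print(conditional_constant_spare)  # bits left for opcode and other fields
```

With these assumed widths the conditional compute leaves 20 bits for the opcode and anything else, while a conditional load of a 32-bit constant would leave only 3, which gives a feel for why the format has no room for a condition.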

Allowed or not?
      IF EQ R3 = (R2 + R2) / 2;   is okay
      IF EQ R3 = MIN(R2, R2);     is okay
  BUT
      IF EQ R3 = R2;              is not a valid compute operation
  IF EQ COMPUTE is the instruction format
    R3 = (R2 + R2) / 2; and R3 = PASS R2; are compute operations (23 bit)
    R3 = R2; is a UREG operation using a second UREG, BUT IT IS ALLOWED
  How to check -- put the instruction or instruction sequence through a project build (assemble and link)

SHARC Program Flow
  SEE REF-CARD
  <reladdr6>
  Key issues
    Condition affects ALL of the instruction
    Compute and jump both become conditional
    JUMP and also JUMP (DB)

Delayed branch
  Rather than
          R4 = 6;
          R8 = 7;
          JUMP _Label;
          ………
  _Label:
    5 cycles from R4 to start of execution of _Label instruction
  Use Delayed Branch
          JUMP _Label (DB);
          R4 = 6;
          R8 = 7;
          ………
  _Label:
    3 cycles from JUMP to start of execution of _Label instruction
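
The 5-cycle and 3-cycle figures can be reproduced by counting execute slots. This is a minimal sketch assuming that a non-delayed jump loses the two slots behind it, while a delayed jump (DB) fills those same two slots with the instructions that follow it. The helper `slots_before_label` is invented for the illustration and recognises jumps only by their text.

```python
BRANCH_PENALTY = 2  # assumed: two pipeline slots sit behind every jump


def slots_before_label(sequence):
    """Count execute slots used before the instruction at the label starts."""
    for position, text in enumerate(sequence):
        if text.startswith("JUMP"):
            if text.endswith("(DB)"):
                # Delayed branch: the next two instructions run in the delay slots.
                delay_slots = sequence[position + 1:position + 1 + BRANCH_PENALTY]
                if len(delay_slots) != BRANCH_PENALTY:
                    raise ValueError("a delayed branch needs two following instructions")
                return position + 1 + len(delay_slots)
            # Non-delayed branch: the two slots behind the jump are wasted.
            return position + 1 + BRANCH_PENALTY
    raise ValueError("sequence contains no JUMP")


normal = ["R4 = 6", "R8 = 7", "JUMP _Label"]
delayed = ["JUMP _Label (DB)", "R4 = 6", "R8 = 7"]

print(slots_before_label(normal))   # R4, R8, JUMP, plus two lost slots
print(slots_before_label(delayed))  # JUMP (DB), then R4 and R8 in the delay slots
```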

DO pipeline trace
[Exercise: two pipeline tables with columns F, D, E/W, one for NORMAL JUMP and one for DELAYED JUMP. Instructions to trace: R4 = ..., R8 = ..., JUMP or JUMP (DB), NEXT1, NEXT2, LABEL.]

Delayed Branch -- A killer!
  SEE REF-CARD

Special JUMP instructions to handle “C”
  CJUMP -- getting to “C” compatible subroutine
    Processor architecture customized for C
    Replaces 3 instructions for faster operations
    Inefficient for use in ENCM515
      Will not be having assembly code calling other subroutines (95%) -- Why bother since slow!
  RFRAME -- returning to “C” environment
    Part of MAGIC lines of code
    See reference card

To be tackled today
  Reference sources
  Program Flow
  Some warnings of expected errors
  Code review and code review standards