Comparing 68k (CISC) with 21k (Superscalar RISC DSP)

M. R. Smith, Electrical and Computer Engineering, University of Calgary, Alberta, Canada
ucalgary.ca

ENCM515 -- Compare 68k and 21k Copyright smithmr@ucalgary.ca
To be tackled today
  When to use assembly code
  Useful sub-set of 68K CISC instructions
  Recap
    Effective addressing modes
    Load/Store programming style for 68K
  Load/Store architecture of 21K by comparison with 68K
12/2/2018 ENCM Compare 68k and 21k Copyright

“Reminder”
  Reuse the following ENCM415 concepts
    Don’t use “Assembly Code” unless you “really have to”
    Write in “C/C++” whenever appropriate
    Connect to the hardware “in assembler” using instructions that always work -- RISC-like (MIPS)
    Understand the linkages between “assembly” and “C”
    Customize “C” only when necessary
  ENCM515
    Basic requirement for “Custom DSP” code -- you need to know the features of the processor
    Recognize that speed comes from instructions that work only under special conditions because of processor architectural constraints -- opcode size, bus availability

Very limited set of instructions used in assembly code most of the time
  Operational instructions
    MOVE
    ADD, SUB (FADD, FSUB)
    AND, OR
  Program flow
    BRA, JMP, JSR, RTS, TRAP
    CMP, BNE, BEQ
    BHI, BLO, BLS (unsigned branches)
    BGE, BLT, BGT (signed branches)

Easiest way to program 68K in assembly
  Have a PSP process to avoid the stupid mistakes that stop you getting to the stuff that is worth doing
  Never bother with the complex EA-mode instructions
    You don’t gain much anyway
  Program the CISC as if it had a “LOAD/STORE” architecture like the MIPS processor
    MOVE memory to register (LOAD)
    MOVE register to memory (STORE)
    OPERATE register on register -- memory access in FETCH only
  Plus a few other non-RISC instructions that you find very useful (e.g. ADD.L #5, D0)
  Customize for speed later -- if it is worth the effort
    EASIER TO CUSTOMIZE when in this “simple” mode

Easiest way to program 21k in assembly
  Have a PSP process to avoid the stupid mistakes that stop you getting to the stuff that is worth doing
  Never bother with the complex EA-mode instructions
    You don’t gain much anyway
  Program the Superscalar RISC DSP, which has a “LOAD/STORE” architecture like the MIPS processor PLUS DSP-special
    MOVE memory to register (LOAD)
    MOVE register to memory (STORE)
    OPERATE register on register
  Plus a few other non-RISC instructions that you find very useful (e.g. ADD.L #5, D0)
  Customize for speed later -- if it is worth the effort

Some of the effective address modes for 68k MOVE
  Register to Register -- RISC like
    MOVE.L D1, D0             [D0] <- [D1] (31:0)
    21k equivalent            R0 = R1;
  Immediate to Register -- RISC like
    MOVE.L #0x5000, D1        [D1] <- 0x5000 (31:0)
    21k equivalent            R1 = 0x5000;
  Memory to Register -- RISC like
    MOVE.L 0x5000, D1         [D1] <- [M(0x5000)] (31:0)
    21k equivalent            R1 = dm(0x5000);
  Memory to Memory -- CISC
    MOVE.L 0x5000, 0x6000     [M(0x6000)] <- [M(0x5000)] (31:0)
    21k equivalent

Look behind the instruction at the architecture
  68k --- MOVE.L D1, D0
    Involves fetching the instruction (4 cycles); everything else is then done without extra (slow) memory operations
  21k --- R0 = R1
    Involves fetching the instruction (1 cycle); everything else is then done without extra memory operations
    Pipelining issue

  68k --- MOVE.L #0x5000, D0
    Involves fetching the instruction (4 cycles), then fetching the hi (4 cycles) and low (4 cycles) components of the constant stored in program space; everything else is then done without extra memory operations
    Really MOVE.L #0x00005000, D0
  21k --- R0 = 0x5000
    Involves fetching the instruction (1 cycle); everything else is then done without extra memory operations
    More like MOVEQ.L #0x5000, D0, where the constant is built into the op-code (an analogy only: the real 68k MOVEQ holds just an 8-bit constant)

  68k --- MOVE.L 0x5000, D0
    Involves fetching the instruction (4), then fetching the hi (4) and low (4) components of the address constant stored in program space, then fetching the hi (4) and low (4) values from adjacent addresses in data space; everything else is then done without extra memory operations
    Again really MOVE.L 0x00005000, D0
  21k --- R0 = dm(0x5000)
    Involves fetching the instruction (1) and then later fetching the value from data memory space (1)
    More like MOVE.L (Address_temp), D0 with the address register being preloaded during the instruction fetch

Some of the effective address modes for ADD
  Register to Register -- RISC like
    ADD.L D1, D0              [D0] <- [D0] + [D1]
    21k equivalent            R0 = R0 + R1;
  Immediate to Register -- CISC
    ADD.L #0x5000, D1         [D1] <- [D1] + 0x5000
  Memory to Register -- CISC
    ADD.L 0x5000, D1          [D1] <- [D1] + [M(0x5000)]
  Memory to Memory -- CISC
    ADD.L 0x5000, 0x6000      illegal on 68K
                              [M(0x6000)] <- [M(0x6000)] + [M(0x5000)]
    21k illegal too

  68k --- ADD.L #0x5000, D0
    Involves fetching the instruction (4), then fetching the hi (4) and low (4) components of the constant stored in program space, and then doing the addition during the “execution” phase
    On the 68k the 32-bit add takes extra cycles
  21k --- R1 = 0x5000; R0 = R1 + R0;
    Involves fetching the two instructions; everything else is then done without extra memory operations
    More like MOVEQ.L #0x5000, D0

Basic LOAD/STORE operations
  CAREFUL!!!! 21k -- NOT QUITE
    2 memory busses
  LOAD -- Memory to register
    [Reg] <- [Memory(address)]
    68k   MOVE.L 0x5000, D1       [D1] <- [Memory(0x5000)]
    21k   R1 = dm(0x5000);
          R1 = pm(0x5000);
          F1 = dm(0x5000);
  STORE -- Register to Memory
    [Memory(address)] <- [Reg]
    68k   MOVE.L D1, 0x5000       [Memory(0x5000)] <- [D1]
    21k   dm(0x5000) = R1;
          pm(0x5000) = R1;

  LOAD register with a constant
    [Reg] <- constant value
    68k   MOVE.L #0x5000, D1      [D1] <- 0x5000
    21k   R1 = 0x5000;
  CAREFUL!!!! 21k -- NOT QUITE
    Can’t always make parallel

Basic Register-to Register operations
  CAREFUL!!!! 21k -- NOT QUITE
    especially when parallel
  LOAD -- Register to register
    [Reg] <- [Reg2]
    68k   MOVE.L D1, D0           [D0] <- [D1]
    21k   R0 = R1;
          Sometimes R0 = pass R1; is better
  Operation -- Register to register
    [Reg] <- [Reg] Operation [Reg2]
    68k   ADD.L D1, D0            [D0] <- [D0] + [D1]
    21k   R0 = R1 + R2; is also possible on 21k

Basic 68k Register-to Register operations
  Operation -- Register to register
    [Reg] <- [Reg] Operation [Reg2]
    ADD.L D0, D1      [D1] <- [D1] + [D0]
    SUB.L D0, D1      [D1] <- [D1] - [D0]
    AND.L D0, D1      [D1] <- [D1] & [D0]
    OR.L D0, D1       [D1] <- [D1] | [D0]
    CMP.L D0, D1      [BB] <- [D1] - [D0]
    ASR #3, D0        [D0] <- [D0] >> 3 (signed)
    LSR #3, D0        [D0] <- [D0] >> 3 (unsigned)

Basic 21k Register-to Register operations
  Operation -- Register to register
    [Reg] <- [Reg1] Operation [Reg2]
    [D1] <- [D2] + [D0]
    [D1] <- [D1] - [D2]
    [D1] <- [D1] & [D2]
    [D1] <- [D1] | [D2]
    Compare
    [D0] <- [D0] >> 3 (signed)
    [D0] <- [D0] >> 3 (unsigned)
  YOU COMPLETE THE 21k Instructions

Basic Indirect Addressing Operations to Memory
  CAREFUL!!!! 21k -- NOT QUITE
  LOAD INDIRECT
    [Reg] <- [Memory([AddressReg2])]
    68k   MOVE.L (A0), D0         [D0] <- [Memory([A0])]
    21k   R1 = dm(0, I4);
          R1 = dm(I4, 0);
          R1 = pm(I12, 0);
          R1 = dm(I12, 0);        NO!
  LOAD INDIRECT with CONSTANT offset
    [Reg] <- [Memory([AddressReg2 + offset])]
    68k   MOVE.L (8, A0), D0      [D0] <- [Memory([A0] + 8)]
    21k   R1 = dm(2, I4);
          R1 = dm(I4, 2);         NO!
  Same with store operations

Indirect Addressing Operations to Memory
  LOAD INDIRECT with Register offset
    [Reg] <- [Memory([AddressReg2 + offset])]
    68k   MOVE.L (A0,D1), D0      [D0] <- [Memory([A0] + [D1])]     D1 used as loop counter
    21k   R0 = dm(R1, I4);        NO!!
          M4 = R1; R0 = dm(M4, I4);
          R1 = dm(I4, M4);        NO!!
          R1 = pm(M12, I12);
          R1 = pm(M4, I12);       NO!!
    Same with store operations
  LOAD INDIRECT with Register + constant offset
    [Reg] <- [Memory([AddressReg2 + offset1 + offset2])]
    68k   MOVE.L (8, A0, D1), D0  [D0] <- [Memory([A0] + [D1] + 8)]
    21k   NO!! -- needs multiple 21k instructions
  CAREFUL!!!! 21k -- NOT QUITE

MOVE.L (8,A0,D1),D0
  Fetch the MOVE instruction (4 cycles)
  Fetch the value 8 (4 cycles)
  Move A0 to ALU then add D1 (loop variable)
  Move result of ALU to ALU then add 8 (structure offset)
  Move result to address register -- fetch memory value and store in high part of D0 (4 cycles)
  Move result of ALU and add 2 (next address) (?)
  Move result to address register -- fetch (4 cycles) memory value and store in low part of D0
  Note A0 and D1 must remain unchanged

MOVE.L (8,A0,D1),D0 21k style
  A0 -> I4, D1 -> R1, D0 -> R0
  Fetch the MOVE instruction (4 cycles)
  Fetch the value 8 (4 cycles)                          R2 = 8;
  Move A0 to ALU then add D1                            R2 = R1 + R2;
  Move result of ALU to ALU then add 8                  M4 = R2;
  Move result to address register -- fetch memory value and store in high part of D0
  Move result of ALU and add 2 (next address)
  Move result to address register -- fetch memory value and store in low part of D0
                                                        R0 = dm(M4, I4)
  If using 21k hardware loop, how do you access the loop counter with minimum overhead?

  LOAD INDIRECT with register post-increment
    [Reg] <- [Memory([AddressReg2])]
    [AddressReg2] <- [AddressReg2] + 4
    68k   MOVE.L (A0)+, D0        [D0] <- [Memory([A0])] ; [A0] <- [A0] + 4
    21k   R0 = dm(I4, 1);
  LOAD INDIRECT with register pre-decrement
    [AddressReg2] <- [AddressReg2] - 4
    [Reg] <- [Memory([AddressReg2])]
    68k   MOVE.L -(A0), D0        [A0] <- [A0] - 4 ; [D0] <- [Memory([A0])]
    21k   modify(I4, -1); R0 = dm(0, I4);

21k processor is DSP
  Digital Signal Processing processor
  Customized for DSP
  In real life, the programmer must be really close to the architecture if speed is wanted
  However, most of the time, treat it like a version of the 68K

Compare MOVE on 21K and 68K
  Register to Register
    R1 = R0                           MOVE.L D0, D1
  Immediate to Register
    R0 = 0x5000                       MOVE.L #0x5000, D0
  Memory to Register
    R0 = dm(0x5000)                   MOVE.L 0x5000, D0
    R0 = pm(0x5000)                   No equivalent
  Memory to Memory
    -- No equivalent                  MOVE.L 0x5000, 0x6000
    R0 = dm(0x5000); dm(0x6000) = R0;

Comparing ADD operations
  Register to Register Add
    R1 = R1 + R0                      ADD.L D0, D1
  Immediate to Register Add
    -- No equivalent                  ADD.L #0x5000, D0
    R1 = 0x5000; R0 = R1 + R0;
  Memory to Register Add
    -- No equivalent                  ADD.L 0x5000, D0
    What are the equivalents?
  Memory to Memory
    Not available on EITHER processor
    What are the equivalents?

Easiest way to program 21K assembly
  Can’t bother with the complex instructions
  DSP has “LOAD/STORE” architecture like the MIPS processor
    MOVE memory to register (LOAD)
    MOVE register to memory (STORE)
    OPERATE register on register
    There are no other types of instruction
  Customize for speed later using hardware
  Develop a process to avoid the standard simple errors so that you can get to the stuff that is important
    Most of you will not bother to use the process for 5 minutes in order to avoid wasting 1 hour of time

  LOAD -- Memory to register
    [Reg] <- [Memory(address)]
    R0 = dm(0x5000)                   MOVE.L 0x5000, D0
  STORE -- Register to Memory
    [Memory(address)] <- [Reg]
    pm(0x5000) = R0                   no 68k equivalent for pm

  LOAD register with a constant
    [Reg] <- constant value
    R0 = 0x5000                       MOVE.L #0x5000, D0

  LOAD -- Register to register
    [Reg] <- [Reg2]
    R0 = R1                           MOVE.L D1, D0
  Operation -- Register to register
    [Reg] <- [Reg] Operation [Reg2]
    R1 = R1 + R0                      ADD.L D0, D1
    R1 = R2 + R3                      -- no equivalent --

  Operation -- Register to register
    [Reg] <- [Reg] Operation [Reg2]
    R1 = R1 + R0                      ADD.L D0, D1
    R1 = R1 - R0                      SUB.L D0, D1
    R1 = R1 AND R0                    AND.L D0, D1
    R1 = R1 OR R0                     OR.L D0, D1
    -- many alternatives --           CMP.L D0, D1

Basic Indirect Addressing Operations to Memory
  LOAD INDIRECT
    [Reg] <- [Memory([AddressReg2])]
    R0 = dm(I0)                       MOVE.L (A0), D0
  LOAD INDIRECT with CONSTANT offset
    [Reg] <- [Memory([AddressReg2 + offset])]
    R0 = dm(2, I4)                    MOVE.L (8, A0), D0
    but R0 = pm(2, I12)               -- No need for distinction --
    Special DAGS for custom data and program memory ops

  LOAD INDIRECT with Register offset
    [Reg] <- [Memory([AddressReg2 + offset])]
    R0 = dm(M4, I4)                   MOVE.L (A0,D1), D0
    Order is absolutely key -- dm(I4, M4) means something VERY different
    Same with store operations
  LOAD INDIRECT with Register + constant offset
    [Reg] <- [Memory([AddressReg2 + offset1 + offset2])]
    -- NO Equivalent                  MOVE.L (8, A0, D1), D0
    but wait till Lab. 2, 3 and 4 for some REALLY fancy SHARC addressing modes

  LOAD INDIRECT with register post-increment
    [Reg] <- [Memory([AddressReg2])]
    [AddressReg2] <- [AddressReg2] + 4
    R0 = dm(I4, M6)                   MOVE.L (A0)+, D0
    (with M6 preset to 1)
    R0 = dm(I4, 1) -- An instruction that is only useful on a Monday/Weds and our labs are on Friday and exams on Tues!
  LOAD INDIRECT with register pre-decrement
    R0 = dm(I4, M7)                   MOVE.L -(A0), D0
    (with M7 preset to -1)
    Note: dm(I4, M7) modifies I4 after the access, so unlike -(A0) the read happens before the decrement
    R0 = dm(I4, -1) -- Only useful on a Monday/Weds
    R0 = dm(I4, M15) illegal but R0 = pm(I12, M15) is OKAY

You complete, without next slide
  // long int value = 6;
  // Memory[2000] = value;
  // Memory[3000] = 7;
  // long int *pt = &Memory[4000];
  // *pt = value;
  // *pt = 9;
  // *pt++ = value + 1;
  // *pt-- = value + 2;

Fix RISC architecture and speed Issues
  #define valueR1 R1
  valueR1 = 6;                    // long int value = 6;
  dm(2000) = valueR1;             // Memory[2000] = value;
  #define tempR0 R0
  tempR0 = 7;
  dm(3000) = tempR0;              // Memory[3000] = 7;
  #define ptI4 I4                 // long int *pt = &Memory[4000];
  ptI4 = 4000;
  dm(ptI4) = valueR1;             // *pt = value;
  tempR0 = 9;                     // *pt = 9;
  dm(ptI4, M5) = tempR0;          // M5 preset to 0 by C start-up procedure
  #define tempR2 R2
  tempR2 = valueR1 + 1;           // *pt++ = value + 1;
  dm(ptI4, M6) = tempR2;          // M6 preset to +1 by C start-up procedure
  tempR0 = 2;                     // *pt-- = value + 2;
  tempR2 = valueR1 + tempR0;
  dm(ptI4, M7) = tempR2;          // M7 preset to -1 by C start-up procedure

NON-NEGOTIABLE
  NON-NEGOTIABLE -- means that is the way the processor is designed and you can’t fight it
  NON-NEGOTIABLE -- means that if you don’t do it this way you will waste a lot of time in the labs on the simple stuff -- and lose many marks in quizzes
  NON-NEGOTIABLE -- means that this is fixed, standard, life
  Develop a simple PSP process to review code to make sure this stuff is not there, so that you can get onto the interesting stuff
  CONTRACT -- The moment the class stops making 80% of these simple errors, I will stop taking most marks off in the quizzes for the simple stuff

Tackled today
  When to use assembly code
  Useful sub-set of 68K CISC instructions
  Recap
    Effective addressing modes
    Load/Store programming style for 68K
  Load/Store architecture of 21K by comparison with 68K
  21K architecture is customized for DSP
