Multiply Instructions

Integer multiplication (32-bit result)
Long integer multiplication (64-bit result)
Built-in Multiply Accumulate Unit (MAC)
Multiply-accumulate instructions add the product to a running total

32-bit result:
  MUL    Multiply
  MLA    Multiply accumulate
64-bit result:
  UMULL  Unsigned multiply
  UMLAL  Unsigned multiply accumulate
  SMULL  Signed multiply
  SMLAL  Signed multiply accumulate

Multiply instructions
There are some important differences from the other arithmetic instructions:
  Immediate second operands are not supported
  The result register must not be the same as the first source register

Multiplication
  MUL R0, R1, R2    ; R0 = (R1 x R2)[31:0]
Features:
  The second operand cannot be an immediate
  The result register must be different from the first operand
  The cycle count depends on the core type
  If the S bit is set, the C flag is meaningless
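The [31:0] truncation in the MUL comment can be illustrated with a small Python model. This is only a sketch of the arithmetic, not of cycle counts or flag behaviour; the helper name mul and the constant MASK32 are made up for the illustration.

```python
MASK32 = 0xFFFFFFFF  # hypothetical helper constant: keeps a value to 32 bits


def mul(rm: int, rs: int) -> int:
    """Model of MUL Rd, Rm, Rs: only bits [31:0] of the product are kept."""
    return (rm * rs) & MASK32


# The low 32 bits of a product are the same whether the operands are read
# as signed or unsigned, so one MUL instruction serves both cases.
minus_two = (-2) & MASK32              # 32-bit two's complement pattern of -2
low_word_of_minus_six = mul(minus_two, 3)

# A product that needs more than 32 bits loses its upper part.
truncated = mul(0x10000, 0x10000)      # the full product is 2**32

print(hex(low_word_of_minus_six), hex(truncated))
```

When the upper half of the product matters, the 64-bit multiply instructions are the ones to use.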

Multiplication

2-D array indexing

Multiplication Multiply-accumulate (2D array indexing)
  MLA R4, R3, R2, R1          @ R4 = R3xR2+R1
Multiplication by a constant can often be implemented more efficiently using a shifted register operand:
  MOV R1, #35
  MUL R2, R0, R1
or
  ADD R0, R0, R0, LSL #2      @ R0' = 5xR0
  RSB R2, R0, R0, LSL #3      @ R2 = 7xR0'
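To check that the ADD/RSB pair really computes 35 x R0, and to show how MLA maps onto 2-D array indexing, here is a rough Python model. The function names and the row-major layout are assumptions made for the illustration; the slide does not say how the array is stored.

```python
MASK32 = 0xFFFFFFFF  # hypothetical helper constant: 32-bit register width


def add_lsl(rn: int, rm: int, shift: int) -> int:
    """ADD Rd, Rn, Rm, LSL #shift  ->  Rd = Rn + (Rm << shift)."""
    return (rn + (rm << shift)) & MASK32


def rsb_lsl(rn: int, rm: int, shift: int) -> int:
    """RSB Rd, Rn, Rm, LSL #shift  ->  Rd = (Rm << shift) - Rn."""
    return ((rm << shift) - rn) & MASK32


def mla(rm: int, rs: int, rn: int) -> int:
    """MLA Rd, Rm, Rs, Rn  ->  Rd = Rm x Rs + Rn (low 32 bits)."""
    return (rm * rs + rn) & MASK32


def times_35(r0: int) -> int:
    r0 = add_lsl(r0, r0, 2)      # R0' = R0 + 4 x R0 = 5 x R0
    return rsb_lsl(r0, r0, 3)    # R2  = 8 x R0' - R0' = 7 x R0' = 35 x R0


def element_index(row: int, ncols: int, col: int) -> int:
    """Row-major index of a[row][col], computed with a single MLA."""
    return mla(row, ncols, col)


print(times_35(3), element_index(3, 10, 4))
```

The shift-and-add form avoids loading the constant into a register and avoids the multiplier altogether, which is where the saving comes from.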

The following instructions produce the full 64-bit result:
  <mul>{<cond>}{S} RdLo, RdHi, Rm, Rs
The 64-bit multiply types are UMULL, UMLAL, SMULL, SMLAL.
Example:
  UMULL r6, r5, r3, r9    ; multiplies the values of r3 and r9
The least significant 32 bits of the 64-bit result are stored in r6 (RdLo) and the most significant 32 bits in r5 (RdHi).
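How the 64-bit product is split into two registers, and how the signed and unsigned variants differ, can be sketched in Python. The function names are illustrative; each returns the pair (low word, high word).

```python
MASK32 = 0xFFFFFFFF            # hypothetical helper constants
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed32(x: int) -> int:
    """Reinterpret a 32-bit pattern as a two's complement signed value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def umull(rm: int, rs: int) -> tuple[int, int]:
    """Unsigned 32 x 32 -> 64 multiply, returned as (low word, high word)."""
    product = (rm & MASK32) * (rs & MASK32)
    return product & MASK32, product >> 32


def smull(rm: int, rs: int) -> tuple[int, int]:
    """Signed 32 x 32 -> 64 multiply, returned as (low word, high word)."""
    product = (to_signed32(rm) * to_signed32(rs)) & MASK64
    return product & MASK32, product >> 32


# The same bit patterns give different high words in the two variants:
# 0xFFFFFFFF is 4294967295 when unsigned but -1 when signed.
print([hex(w) for w in umull(0xFFFFFFFF, 2)])
print([hex(w) for w in smull(0xFFFFFFFF, 2)])
```

The low word is identical in both cases; only the high word depends on signedness, which is why the 32-bit MUL needs no signed/unsigned variants but the long multiplies do.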

Summary- Multiply instructions

Flow control instructions
Determine the instruction to be executed next
  pc-relative offset
  BX and BLX: Thumb-related instructions

Branch instruction
        B label
        …
label:  …
Conditional branches
        MOV R0, #0
loop:   …
        ADD R0, R0, #1
        CMP R0, #10
        BNE loop

Branch and link
The BL instruction saves the return address to R14 (lr)
        BL sub          @ call sub
        CMP R1, #5      @ return to here
        MOVEQ R1, #0
        …
sub:    …               @ sub entry point
        MOV PC, LR      @ return

Branch conditions

Branches

Conditional execution
        CMP R0, #5
        BEQ bypass          @ if (R0!=5) {
        ADD R1, R1, R0      @   R1=R1+R0-R2
        SUB R1, R1, R2      @ }
bypass: …

        CMP R0, #5
        ADDNE R1, R1, R0
        SUBNE R1, R1, R2
The conditionally executed version is smaller and faster.
Rule of thumb: if the conditional sequence is three instructions or fewer, it is better to use conditional execution than a branch.

Data Transfer Instructions

Register- indirect addressing

Addressing Modes in ARM
Load and store instructions have three primary addressing modes: offset, pre-indexed, and post-indexed.
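The difference between the three modes is when the offset is applied and whether the base register is updated. A minimal Python sketch follows, assuming memory is a dictionary from address to word and registers a dictionary from name to value; both are simplifications invented for the illustration, as are the function names.

```python
def ldr_offset(mem, regs, rd, rn, offset):
    """LDR Rd, [Rn, #offset]: access Rn + offset; Rn is left unchanged."""
    regs[rd] = mem[regs[rn] + offset]


def ldr_pre_indexed(mem, regs, rd, rn, offset):
    """LDR Rd, [Rn, #offset]!: access Rn + offset and write that address back to Rn."""
    regs[rn] = regs[rn] + offset
    regs[rd] = mem[regs[rn]]


def ldr_post_indexed(mem, regs, rd, rn, offset):
    """LDR Rd, [Rn], #offset: access Rn, then update Rn to Rn + offset."""
    address = regs[rn]
    regs[rd] = mem[address]
    regs[rn] = address + offset


# Three consecutive words starting at the made-up address 0x100.
memory = {0x100: 11, 0x104: 22, 0x108: 33}

registers = {"R0": 0, "R1": 0x100}
ldr_post_indexed(memory, registers, "R0", "R1", 4)
print(registers)   # R0 holds the word at 0x100; R1 now points at the next word
```

Post-indexing is the natural fit for walking an array, since each load leaves the base register pointing at the next element.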

One-dimensional array
Example:

One-dimensional array
Example: Another Approach:

Modifying the Status Registers
Only indirectly
  MRS moves contents from CPSR/SPSR to a selected GPR
  MSR moves contents from a selected GPR to CPSR/SPSR
Only in privileged modes

PSR Transfer Instructions
PSR layout:
  bit 31 N, bit 30 Z, bit 29 C, bit 28 V, bit 27 Q, bit 24 J
  bit 7 I, bit 6 F, bit 5 T, bits 4 to 0 mode
  remaining bits undefined
  field masks: f = bits 31 to 24, s = bits 23 to 16, x = bits 15 to 8, c = bits 7 to 0

MRS and MSR allow contents of CPSR / SPSR to be transferred to / from a general purpose register.
Syntax:
  MRS{<cond>} Rd,<psr>              ; Rd = <psr>
  MSR{<cond>} <psr[_fields]>,Rm     ; <psr[_fields]> = Rm
where
  <psr> = CPSR or SPSR
  [_fields] = any combination of 'fsxc'
Also an immediate form:
  MSR{<cond>} <psr_fields>,#Immediate
In User Mode, all bits can be read but only the condition flags (_f) can be written.

The status registers are split into four 8-bit fields that can be individually written:
  bits 31 to 24: the flags field (NZCV flags and 4 unused bits)
  bits 23 to 16: the status field (unused in Arch 3, 4 & 4T)
  bits 15 to 8: the extension field (unused in Arch 3, 4 & 4T)
  bits 7 to 0: the control field (I & F interrupt disable bits, 5 processor mode bits, and the T bit on ARMv4T)

The immediate form of MSR can be used with any of the field masks, but care must be taken to follow a read-modify-write strategy so that currently unallocated bits are not affected. Otherwise the code could have a distinctly different effect on future cores where such bits are allocated. When used with the flag bits, the immediate form is shielded from this because the unallocated bits in that field can be considered read-only.

For MSR operations, we recommend writing only the minimum number of fields, because future ARM implementations may need extra cycles to write specific fields; not writing fields you don't want to change keeps any such extra cycles to a minimum. For example, an MRS/BIC/ORR/MSR sequence whose only purpose is to change processor mode is best written with the last instruction being MSR CPSR_c,Rm, though any other set of fields that includes "c" will also work.
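The field masks and the MRS/BIC/ORR/MSR sequence described above can be modelled in a few lines of Python. This is a sketch of the bit manipulation only: privilege checks are not modelled, and the names FIELD_MASKS, msr and change_mode are invented for the illustration.

```python
MASK32 = 0xFFFFFFFF

# One 8-bit mask per field letter, following the field boundaries given above.
FIELD_MASKS = {
    "f": 0xFF000000,   # flags field, bits 31 to 24
    "s": 0x00FF0000,   # status field, bits 23 to 16
    "x": 0x0000FF00,   # extension field, bits 15 to 8
    "c": 0x000000FF,   # control field, bits 7 to 0
}

MODE_MASK = 0x1F       # the five processor mode bits
MODE_USER = 0b10000
MODE_SVC = 0b10011


def msr(psr: int, value: int, fields: str) -> int:
    """Model of MSR <psr>_<fields>, Rm: only the selected fields are replaced."""
    mask = 0
    for name in fields:
        mask |= FIELD_MASKS[name]
    return (psr & ~mask & MASK32) | (value & mask)


def change_mode(cpsr: int, new_mode: int) -> int:
    """Read-modify-write mode change; every other bit is preserved."""
    rm = cpsr                    # MRS Rm, CPSR
    rm &= ~MODE_MASK             # BIC Rm, Rm, #0x1F
    rm |= new_mode               # ORR Rm, Rm, #new_mode
    return msr(cpsr, rm, "c")    # MSR CPSR_c, Rm


cpsr = 0x60000000 | MODE_USER    # Z and C flags set, User mode
print(hex(change_mode(cpsr, MODE_SVC)))
```

Because only the c field is written, the flags in the top byte pass through untouched even if Rm were to hold stale flag values.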

Software Interrupt: SWI instruction
  Forces CPU into supervisor mode
  Usage: SWI #n
  Encoding fields: Cond, Opcode, Ordinal
  Maximum 2^24 calls
  Suitable for running privileged code and making OS calls

Software Interrupt (SWI)
Encoding:
  bits 31 to 28: Cond (condition field)
  bits 27 to 24: 1111
  bits 23 to 0: SWI number (ignored by processor)
Causes an exception trap to the SWI hardware vector.
The SWI handler can examine the SWI number to decide what operation has been requested.
By using the SWI mechanism, an operating system can implement a set of privileged operations which applications running in user mode can request.
Syntax: SWI{<cond>} <SWI number>
In effect, a SWI is a user-defined instruction. It is used for calling the operating system (it switches to privileged mode).
The SWI number field can be used to specify the operation code, e.g. SWI 1 start a new task, SWI 2 allocate memory, etc.
Using a number has the advantage that the O.S. can have different revisions, and the same application code will work on each O.S. revision.
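Since the processor ignores the SWI number, the handler has to pull it out of the instruction word itself. A rough Python sketch of that decode-and-dispatch step follows; the service table reuses the two example numbers from the text, and the function names are hypothetical.

```python
def decode_swi(instruction: int) -> tuple[int, int]:
    """Split a 32-bit SWI instruction word into (condition, SWI number)."""
    condition = (instruction >> 28) & 0xF
    opcode = (instruction >> 24) & 0xF
    if opcode != 0xF:
        raise ValueError("not a SWI instruction")
    number = instruction & 0x00FFFFFF    # 24-bit field, hence 2^24 possible calls
    return condition, number


# Hypothetical service table, using the example numbers given in the text.
SERVICES = {
    1: "start a new task",
    2: "allocate memory",
}


def swi_handler(instruction: int) -> str:
    """Look up the requested operation from the SWI number."""
    _, number = decode_swi(instruction)
    return SERVICES[number]


print(swi_handler(0xEF000001))   # SWI 1 with the "always" condition (0xE)
```

Keeping the mapping in a table indexed by number is what lets an operating system change its internals between revisions while applications keep issuing the same SWI numbers.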
