Introduction to Assembly Chapter 2


Introduction to Assembly Chapter 2 Sepehr Naimi www.NicerLand.com

Topics
  ARM's CPU
    Its architecture
    Some simple programs
  Data memory access
  Program memory
  RISC architecture

ARM's CPU
  ALU
  Registers
    General-purpose registers (R0 to R12)
    R13 (SP), R14 (LR), R15 (PC)
    CPSR: flags N, Z, C, V; control bits I, F, T; mode bits M4-M0
  Instruction decoder
  Instruction register


Some simple instructions
1. MOV (move)

MOV Rd, #k      ; Rd = k, where k is an 8-bit value
Examples:
  MOV R5, #53           ; R5 = 53
  MOV R9, #0x27         ; R9 = 0x27
  MOV R3, #0b11101100   ; R3 = 0b11101100

MOV Rd, Rs      ; Rd = Rs
Examples:
  MOV R5, R2            ; R5 = R2
  MOV R9, R7            ; R9 = R7

LDR pseudo-instruction (loading 32-bit values)

LDR Rd, =k      ; Rd = k, where k is a 32-bit value
Examples:
  LDR R5, =5543                  ; R5 = 5543
  LDR R9, =0x123456              ; R9 = 0x123456
  LDR R4, =0b10110110011011001   ; R4 = 0b10110110011011001
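A minimal sketch contrasting the two ways of loading a constant. It assumes GNU assembler syntax (comments start with @) and the Linux exit convention used by the later programs in this chapter (r7 = 1, svc 0); the register choices are arbitrary.

```asm
    .text
    .global _start
_start:
    mov r1, #0x27          @ small constant: fits in MOV's immediate field
    ldr r2, =0x12345678    @ full 32-bit constant: needs the LDR pseudo-instruction
    ldr r3, =5543          @ decimal constants work the same way
    mov r0, #0             @ exit status 0
    mov r7, #1             @ 1 = exit system call number (Linux)
    svc 0
```

With the GNU assembler, `ldr Rd, =k` becomes a single MOV when k fits in an immediate, and otherwise a load from a constant the assembler places in memory near the code.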

Some simple instructions
2. Arithmetic calculation

General form: opcode destination, source1, source2
Opcodes: ADD, SUB, AND, etc.
Examples:
  ADD R5, R2, R1    ; R5 = R2 + R1
  SUB R5, R9, #23   ; R5 = R9 - 23

Instruction          Description
ADD Rd, Rn, Op2 *    Add Rn to Op2 and place the result in Rd
ADC Rd, Rn, Op2      Add Rn to Op2 with carry and place the result in Rd
AND Rd, Rn, Op2      AND Rn with Op2 and place the result in Rd
BIC Rd, Rn, Op2      AND Rn with NOT of Op2 and place the result in Rd
CMP Rn, Op2          Compare Rn with Op2 and set the status bits of CPSR **
CMN Rn, Op2          Compare Rn with the negative of Op2 and set the status bits
EOR Rd, Rn, Op2      Exclusive-OR Rn with Op2 and place the result in Rd
MVN Rd, Op2          Store the bitwise NOT of Op2 in Rd
MOV Rd, Op2          Move (copy) Op2 to Rd
ORR Rd, Rn, Op2      OR Rn with Op2 and place the result in Rd
RSB Rd, Rn, Op2      Subtract Rn from Op2 and place the result in Rd
RSC Rd, Rn, Op2      Subtract Rn from Op2 with carry and place the result in Rd
SBC Rd, Rn, Op2      Subtract Op2 from Rn with carry and place the result in Rd
SUB Rd, Rn, Op2      Subtract Op2 from Rn and place the result in Rd
TEQ Rn, Op2          Exclusive-OR Rn with Op2 and set the status bits of CPSR
TST Rn, Op2          AND Rn with Op2 and set the status bits of CPSR

*  Op2 can be an immediate 8-bit value #K, which can be 0-255 in decimal (00-FF in hex). Op2 can also be a register Rm. Rd, Rn, and Rm are any of the general-purpose registers.
** CPSR is discussed later in this chapter.
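The following sketch chains a few of the logic instructions from the table so the effect of each can be followed by hand. It assumes GNU assembler syntax and the Linux exit convention (r7 = 1, svc 0), which returns the low byte of r0 as the exit status; the constants are arbitrary.

```asm
    .text
    .global _start
_start:
    mov r1, #0xF0
    and r2, r1, #0x3C      @ r2 = 0xF0 AND 0x3C = 0x30
    orr r2, r2, #0x03      @ r2 = 0x30 OR  0x03 = 0x33
    bic r2, r2, #0x01      @ r2 = 0x33 AND NOT 0x01 = 0x32
    eor r2, r2, #0xFF      @ r2 = 0x32 XOR 0xFF = 0xCD
    mvn r3, #0             @ r3 = bitwise NOT of 0 = 0xFFFFFFFF
    rsb r0, r2, #0xFF      @ r0 = 0xFF - r2 = 0x32 (reverse subtract)
    mov r7, #1             @ 1 = exit system call number (Linux)
    svc 0                  @ exit status = 0x32 = 50
```

Note that RSB subtracts the first source from the second, which is why the final result is 0xFF - 0xCD rather than 0xCD - 0xFF.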

A simple program
Write a program that calculates 19 + 95.

  MOV R6, #19       ; R6 = 19
  MOV R2, #95       ; R2 = 95
  ADD R6, R6, R2    ; R6 = R6 + R2

A simple program
Write a program that calculates 19 + 95 - 5.

Version 1 (three source registers):
  MOV R1, #19       ; R1 = 19
  MOV R2, #95       ; R2 = 95
  MOV R3, #5        ; R3 = 5
  ADD R6, R1, R2    ; R6 = R1 + R2
  SUB R6, R6, R3    ; R6 = R6 - R3

Version 2 (reusing R2):
  MOV R1, #19       ; R1 = 19
  MOV R2, #95       ; R2 = 95
  ADD R6, R1, R2    ; R6 = R1 + R2
  MOV R2, #5        ; R2 = 5
  SUB R6, R6, R2    ; R6 = R6 - R2
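As a sketch of how the 19 + 95 - 5 calculation could be turned into a complete program, the version below wraps it in the _start entry point and Linux exit call that the later slides use. It assumes GNU assembler syntax on a 32-bit ARM Linux system such as Raspberry Pi OS; the file name sum.s in the comments is hypothetical.

```asm
@ Assumed build steps (GNU binutils): as -o sum.o sum.s
@                                     ld -o sum sum.o
    .text
    .global _start
_start:
    mov r1, #19            @ r1 = 19
    mov r2, #95            @ r2 = 95
    add r0, r1, r2         @ r0 = 19 + 95 = 114
    sub r0, r0, #5         @ r0 = 114 - 5 = 109
    mov r7, #1             @ 1 = exit system call number (Linux)
    svc 0                  @ r0 is returned as the exit status
```

Running the program and then typing `echo $?` in the shell should print 109, since the exit status is the low byte of r0.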

Von Neumann architecture in ARM7

Harvard architecture in ARM9 and Cortex

Memory Map in Raspberry Pi

LDR (Load register)
  LDR Rd, [Rx]      ; Rd = [Rx]
Example:
  LDR R4, =0x20000000
  LDR R1, [R4]      ; R1 = [0x20000000]

STR (Store register)
  STR Rx, [Rd]      ; [Rd] = Rx
Example: [0x20000000] = 0x12345678
  LDR R5, =0x12345678
  LDR R2, =0x20000000
  STR R5, [R2]      ; [R2] = R5

Example: Add the contents of location 0x90 to the contents of location 0x94 and store the result in location 0x300.
Solution:
  LDR R6, =0x90     ; R6 = 0x90
  LDR R1, [R6]      ; R1 = [0x90]
  LDR R6, =0x94     ; R6 = 0x94
  LDR R2, [R6]      ; R2 = [0x94]
  ADD R2, R2, R1    ; R2 = R2 + R1
  LDR R6, =0x300    ; R6 = 0x300
  STR R2, [R6]      ; [0x300] = R2

Example: Write a program that copies the contents of location 0x80 into location 0x88.
Solution:
  LDR R2, =0x80     ; R2 = 0x80
  LDR R1, [R2]      ; R1 = [0x80]
  LDR R2, =0x88     ; R2 = 0x88
  STR R1, [R2]      ; [0x88] = R1

LDRB, LDRH, STRB, STRH

Data size    Bits   Load instruction used   Store instruction used
Byte         8      LDRB                    STRB
Half-word    16     LDRH                    STRH
Word         32     LDR                     STR

  LDR  Rd, [Rs]       STR  Rs, [Rd]
  LDRB Rd, [Rs]       STRB Rs, [Rd]
  LDRH Rd, [Rs]       STRH Rs, [Rd]
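To illustrate the three data sizes, the sketch below reads the same memory word as a word, a half-word, and a byte, then stores the byte into a second variable. It assumes GNU assembler syntax, a little-endian system (the usual configuration on Raspberry Pi Linux), and the Linux exit convention; the labels value and copy are hypothetical.

```asm
    .text
    .global _start
_start:
    ldr  r4, =value        @ r4 = address of value
    ldr  r1, [r4]          @ r1 = 0x12345678 (whole word)
    ldrh r2, [r4]          @ r2 = 0x5678 (low half-word on little-endian)
    ldrb r3, [r4]          @ r3 = 0x78 (lowest byte on little-endian)
    ldr  r5, =copy         @ r5 = address of copy
    strb r3, [r5]          @ write only one byte; the other three stay 0
    ldr  r0, [r5]          @ r0 = 0x00000078
    mov  r7, #1            @ 1 = exit system call number (Linux)
    svc  0                 @ exit status = 0x78 = 120

    .data
value: .word 0x12345678
copy:  .word 0
```

LDRB and LDRH fill the upper bits of the destination register with zeros, so r2 and r3 hold only the loaded half-word or byte.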

Status Register (CPSR)
Bits: N (Negative), Z (Zero), C (Carry), V (oVerflow), I (Interrupt), T (Thumb)

Example: Show the status of the Z flag after the subtraction of 0x9C from 0x9C in the following instructions:
  LDR R0, =0x9C
  LDR R1, =0x9C
  SUBS R0, R0, R1   ; subtract R1 from R0
Solution:
    9C    1001 1100
  - 9C    1001 1100
    00    0000 0000     R0 = 0x00
Z = 1 because R0 is zero after the subtraction.
C = 1 because R1 is not bigger than R0 and there is no borrow from the D32 bit.

Example: Show the status of the C and Z flags after the addition of 0x0000009C and 0xFFFFFF64 in the following instructions:
  LDR R0, =0x9C
  LDR R1, =0xFFFFFF64
  ADDS R0, R0, R1   ; add R1 to R0
Solution:
     0000009C      00000000 00000000 00000000 10011100
   + FFFFFF64      11111111 11111111 11111111 01100100
    100000000    1 00000000 00000000 00000000 00000000     R0 = 0x00000000
C = 1 because there is a carry beyond the D31 bit.
Z = 1 because R0 (the result) has the value 0 after the addition.

Example: Show the status of the Z flag after the subtraction of 0x73 from 0x52 in the following instructions:
  LDR R0, =0x52
  LDR R1, =0x73
  SUBS R0, R0, R1   ; subtract R1 from R0
Solution (low byte shown):
    52    0101 0010
  - 73    0111 0011
    DF    1101 1111     R0 = 0xFFFFFFDF
Z = 0 because R0 has a value other than zero after the subtraction.
C = 0 because R1 is bigger than R0 and there is a borrow from the D32 bit.

Example: Show the status of the C and Z flags after the addition of 0x38 and 0x2F in the following instructions:
  MOV R6, #0x38     ; R6 = 0x38
  MOV R7, #0x2F     ; R7 = 0x2F
  ADDS R6, R6, R7   ; add R7 to R6
Solution:
    38    00000000 00000000 00000000 00111000
  + 2F    00000000 00000000 00000000 00101111
    67    00000000 00000000 00000000 01100111     R6 = 0x67
C = 0 because there is no carry beyond the D31 bit.
Z = 0 because R6 (the result) has a value other than 0 after the addition.

Example: Show the status of the Z flag after the subtraction of 0x23 from 0xA5 in the following instructions:
  LDR R0, =0xA5
  LDR R1, =0x23
  SUBS R0, R0, R1   ; subtract R1 from R0
Solution:
    0xA5    1010 0101
  - 0x23    0010 0011
    0x82    1000 0010     R0 = 0x82
Z = 0 because R0 has a value other than 0 after the subtraction.
C = 1 because R1 is not bigger than R0 and there is no borrow from the D32 bit.
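One way to check such flag predictions on real hardware is to copy the status register into a general-purpose register and return its top four bits. The sketch below does this for the 0x9C - 0x9C case. It is an illustration only: it assumes GNU assembler syntax in ARM (not Thumb) mode and the Linux exit convention, and it uses the MRS instruction and a shifted operand, neither of which is covered in these slides.

```asm
    .text
    .global _start
_start:
    ldr  r0, =0x9C
    ldr  r1, =0x9C
    subs r0, r0, r1        @ result is 0: Z = 1, C = 1 (no borrow)
    mrs  r2, cpsr          @ copy the status register into r2
    mov  r0, r2, lsr #28   @ r0 = bits 31..28 = N Z C V as a 4-bit number
    mov  r7, #1            @ 1 = exit system call number (Linux)
    svc  0                 @ exit status = 0b0110 = 6 (N=0, Z=1, C=1, V=0)
```

Changing the two constants to the values of the other examples gives a quick way to compare the hand-worked flags with what the CPU reports.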

Assembler

Instructions, pseudo-instructions, and assembler directives

Instructions: the real instructions of the CPU
  mov r5, #23
  add r4, r5, r6

Pseudo-instructions: translated into one or more real instructions by the assembler
  ldr r5, =0x253234

Assembler directives: directions to the assembler itself, not CPU instructions
  .equ myValue, 0x3fffc045
  .global _start

Some Widely Used Directives

Directive   Description
.text       Informs the assembler that a code section begins.
.data       Informs the assembler that an initialized data section begins.
.global     Informs the assembler that a name or symbol will be referenced in other files.
.extern     Informs the assembler that the code accesses a name or symbol defined in another file.
.thumb      Forces the assembler to convert the next instructions to Thumb machine instructions.
.arm        Forces the assembler to convert the next instructions to ARM machine instructions.

.global and .extern directives

file1.s:
    .text
    .extern myFunc
    ...
    bl myFunc

file2.s:
    .text
    .global myFunc
myFunc:
    add r2, r1, r5
    ...

Assembler Directives: .equ

.equ name, value

Example:
    .equ COUNT, 0x25
    mov r1, #COUNT        @ r1 = 0x25
    mov r2, #COUNT + 3    @ r2 = 0x28

Assembler Directives: .include

.include "filename.ext"

hFile.inc:
    .equ SREG, 0x3f
    .equ SPL, 0x3d
    .equ SPH, 0x3e
    ....

Program.asm:
    .include "hFile.inc"

A simple program

@ ARM Assembly language program to add some data
@ and store the SUM in r0.
    .text
    .global _start
_start:                   @ the beginning point for ARM assembly programs
    mov r1, #0x25         @ r1 = 0x25
    mov r2, #0x34         @ r2 = 0x34
    add r0, r2, r1        @ r0 = r2 + r1
    mov r7, #1
    svc 0                 @ system call to terminate the program

Memory allocation using .byte, .hword, and .word

Directive   Description
.byte       Allocates one or more bytes of memory and defines the initial runtime contents of the memory.
.hword      Allocates one or more halfwords of memory and defines the initial runtime contents of the memory. The data is not aligned.
.word       Allocates one or more words of memory and defines the initial runtime contents of the memory. The data is not aligned.
.float      Allocates one or more words of memory and initializes them with a floating-point number.

A Simple Code that Stores Fixed Data in Program Memory

@ storing data in program memory.
    .text
    .global _start
_start:
    ldr r2, =our_fixed_data   @ point to our_fixed_data
    @ load r0 with the contents of memory pointed to by r2
    ldrb r0, [r2]
    @ terminate the program
    mov r7, #1
    svc 0
our_fixed_data:
    .byte 0x55, 0x33, 1, 2, 3, 4, 5, 6
    .word 0x23222120, 0x30
    .hword 0x4540, 0x50

Defining variables a, b, and c in RAM

    .text
    .global _start
_start:
    @ r1 = a
    ldr r0, =a            @ r0 = addr. of a
    ldr r1, [r0]          @ r1 = value of a
    @ r2 = b
    ldr r0, =b            @ r0 = addr. of b
    ldr r2, [r0]          @ r2 = value of b
    @ c = r1 + r2 (c = a + b)
    add r3, r1, r2        @ r3 = a + b
    ldr r0, =c            @ r0 = addr. of c
    str r3, [r0]          @ c = r3
    mov r7, #1            @ terminate the program
    svc 0
    @ allocate the following in data memory
    .data
a:  .word 5
b:  .word 4
c:  .word 0

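Flash memory and PC register

    .text
    .global _start
_start:
    mov r1, #0x25
    mov r2, #0x34
    add r0, r2, r1
    mov r7, #1
    svc 0

Each instruction is 32 bits wide, so the PC advances by 4 bytes per instruction:

Address        Machine code   Instruction
_start         E3A01025       mov r1, #0x25
_start + 4     E3A02034       mov r2, #0x34
_start + 8     E0820001       add r0, r2, r1
_start + 12    E3A07001       mov r7, #1
_start + 16    EF000000       svc 0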

How to speed up the CPU
  Increase the clock frequency
    Higher frequency means more power consumption and more heat
    Limitations
  Change the architecture
    Pipelining
    RISC

Changing the architecture: RISC vs. CISC
  CISC (Complex Instruction Set Computer)
    Put as many instructions as you can into the CPU.
  RISC (Reduced Instruction Set Computer)
    Reduce the number of instructions, and use the hardware resources more efficiently.

RISC architecture
Feature 1: RISC processors have a fixed instruction size.
  It makes the task of the instruction decoder easier.
  In ARM the instructions are 4 bytes.
  In Thumb2 the instructions are either 2 or 4 bytes.
  In CISC processors, instructions have different lengths. E.g., in the 8051:
    CLR C           ; a 1-byte instruction
    ADD A, #20H     ; a 2-byte instruction
    LJMP HERE       ; a 3-byte instruction

RISC architecture
Feature 2: reduce the number of instructions
  Pros: reduces the number of transistors used
  Cons: can make assembly programming more difficult; can lead to using more memory

RISC architecture
Feature 3: limit the addressing modes
  Advantage: the control logic can be hardwired
  Disadvantage: can make assembly programming more difficult

RISC architecture
Feature 4: Load/Store

  LDR R8, =0x20
  LDR R0, [R8]
  ADD R0, R0, R1
  LDR R8, =0x230
  STR R0, [R8]
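In a load/store architecture, arithmetic instructions work only on registers, so even incrementing a variable in memory takes a load, an add, and a store. The sketch below shows that pattern as a complete program; it assumes GNU assembler syntax and the Linux exit convention, and the label counter is hypothetical.

```asm
    .text
    .global _start
_start:
    ldr r8, =counter       @ r8 = address of counter
    ldr r0, [r8]           @ load:  r0 = counter (41)
    add r0, r0, #1         @ compute in registers only: r0 = 42
    str r0, [r8]           @ store: counter = 42
    mov r7, #1             @ 1 = exit system call number (Linux)
    svc 0                  @ exit status = 42

    .data
counter: .word 41
```

A CISC processor might offer a single instruction that increments a memory location directly; here the three steps are always explicit.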

RISC architecture
Feature 5 (Harvard architecture): separate buses for opcodes and operands
  Advantage: opcodes and operands can go in and out of the CPU together.
  Disadvantage: leads to more cost in general-purpose computers.

Example: while one instruction executes, the next is decoded and a third is fetched.
  LDR R2, [R4]      ; R2 = [R4]
  ADD R0, R0, R1    ; R0 = R0 + R1
  SUB R3, R3, R4

  Fetch:    SUB R3, R3, R4
  Decode:   ADD R0, R0, R1
  Execute:  LDR R2, [R4]

Figure: the CPU connects to code memory and to data memory, each through its own address bus, data bus, and control bus.

RISC architecture Feature 6: more than 95% of instructions are executed in 1 machine cycle

RISC architecture
Feature 7: RISC processors have a large register file, typically at least 32 registers.
  Decreases the need for stack and memory accesses.
  ARM has 16 registers (R0 to R15); R13, R14, and R15 serve as SP, LR, and PC.