Published by Maverick Hambley. Modified over 10 years ago.
2
ARM System-on-Chip Architecture
INTRODUCTION
ARM is a RISC processor architecture.
It targets applications that need small size and high performance.
Its simple architecture gives low power consumption.
3
TIMELINE (1/2)
1985: Acorn Computer Group manufactures the first commercial RISC microprocessor.
1990: Acorn and Apple participation leads to the founding of Advanced RISC Machines (ARM).
1991: ARM6, the first embeddable RISC microprocessor.
1992 – 1994: Various companies use ARM (Sharp, Samsung); in 1993 ARM7, the first multimedia microprocessor, is introduced.
4
TIMELINE (2/2)
1995: Introduction of Thumb and ARM8.
1996 – 2000: Alcatel, Hyundai, Philips and Sony use ARM; in 1999 ARM cooperates with Ericsson on the development of Bluetooth.
2000 – 2002: ARM's share of the 32-bit embedded RISC microprocessor market is 80%. ARM Developer Suite is introduced.
5
THE ARM ARCHITECTURE
6
GENERAL INFO (1/2)
AIM: Simple design
Load – store architecture
32-bit data bus
3 addressing modes
7
GENERAL INFO (2/2)
Simple architecture + simple instruction set + code density
Result: small size and low power consumption
8
Registers
31 general-purpose 32-bit registers, 16 of which (r0 – r15) are visible at any one time
7 modes of operation
Each mode sees a different set of registers and has a different level of control over the CPSR.
9
ARM Programming Model
[Figure: register banks by mode]
User mode: r0 – r12, r13, r14, r15 (PC), CPSR (usable in user mode)
FIQ mode: r8_fiq – r12_fiq, r13_fiq, r14_fiq, SPSR_fiq
SVC mode: r13_svc, r14_svc, SPSR_svc
Abort mode: r13_abt, r14_abt, SPSR_abt
IRQ mode: r13_irq, r14_irq, SPSR_irq
Undefined mode: r13_und, r14_und, SPSR_und
The banked registers and SPSRs are available in system modes only.
10
CPSR
[Figure: ARM CPSR format]
N: Negative
Z: Zero
C: Carry
V: Overflow
Q: Saturation (for enhanced DSP instructions)
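As a rough illustration of how the N, Z, C and V flags come out of a flag-setting addition such as ADDS, here is a minimal Python model. The function name `adds_flags` is hypothetical, the Q flag is not modelled, and this is a sketch of the arithmetic rather than a description of any particular core.

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide


def adds_flags(a, b):
    """Model a 32-bit ADDS: return (result, flags) with flags N, Z, C, V."""
    a &= MASK32
    b &= MASK32
    full = a + b                      # unbounded sum, used to detect carry out
    result = full & MASK32
    flags = {
        "N": result >> 31,            # bit 31 of the result
        "Z": int(result == 0),        # result is zero
        "C": int(full > MASK32),      # unsigned carry out of bit 31
        # signed overflow: both operands differ in sign from the result
        "V": ((a ^ result) & (b ^ result)) >> 31,
    }
    return result, flags
```

For instance, adding 1 to 0x7FFFFFFF sets N and V (the signed result wrapped), while adding 1 to 0xFFFFFFFF sets Z and C.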
11
Memory Organization
Address bus: 32 bits
1 word = 32 bits
12
Instruction Set
Three instruction types:
Data processing
Data transfer
Control flow
13
Supervisor mode
Code running in user mode cannot perform operations outside user privileges; the operating system handles them.
Using "supervisor calls", a user program enters system level, where the requested system functions are performed.
14
I/O System
ARM handles peripherals as "memory-mapped devices with interrupt support".
Interrupts:
IRQ: normal interrupt
FIQ: fast interrupt
15
Exceptions
Exceptions: interrupts, supervisor calls, traps.
When an exception takes place:
The value of the PC is copied to r14_exc and the CPSR is saved in SPSR_exc.
The operating mode changes to the respective exception mode.
The PC is set to the exception handler's vector address.
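The exception entry sequence can be sketched as a small Python model. The vector addresses are the standard ARM low-vector addresses; the dictionary layout and the name `take_exception` are hypothetical. The model copies the PC to r14_exc unchanged, as the slide states, and ignores the pipeline offsets a real core applies, so treat it as a simplification.

```python
# Standard ARM exception vector addresses (low vectors).
VECTORS = {
    "undefined": 0x04, "swi": 0x08, "prefetch_abort": 0x0C,
    "data_abort": 0x10, "irq": 0x18, "fiq": 0x1C,
}
# Mode entered for each exception type.
EXCEPTION_MODE = {
    "undefined": "und", "swi": "svc", "prefetch_abort": "abt",
    "data_abort": "abt", "irq": "irq", "fiq": "fiq",
}


def take_exception(cpu, kind):
    """Update a dict-based CPU model on exception entry and return it."""
    mode = EXCEPTION_MODE[kind]
    cpu["r14_" + mode] = cpu["pc"]      # save return information in r14_exc
    cpu["spsr_" + mode] = cpu["cpsr"]   # save the status register in SPSR_exc
    cpu["mode"] = mode                  # switch to the exception mode
    cpu["pc"] = VECTORS[kind]           # jump to the vector address
    return cpu
```

Calling `take_exception({"pc": 0x8004, "cpsr": 0x10, "mode": "usr"}, "irq")` leaves the model in irq mode with the PC at 0x18 and the old PC in `r14_irq`.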
16
ARM programming model
[Figure repeated: r13, r14 and SPSR are banked for each of the fiq, svc, abort, irq and undefined modes; fiq mode additionally banks r8 – r12]
17
THE ARM INSTRUCTION SET
18
Data Processing Instructions (1/2)
Arithmetic operations:
    ADD  r0, r1, r2    ; r0 := r1 + r2, flags not updated
    ADDS r0, r1, r2    ; r0 := r1 + r2, flags updated
Logical operations:
    AND  r0, r1, r2    ; r0 := r1 AND r2
Register movement:
    MOV  r0, r2
Comparison:
    CMP  r1, r2
19
Data Processing Instructions (2/2)
Operands:
Immediate operands:
    ADD r3, r3, #1
Shifted register operands:
    ADD r3, r2, r1, LSL #3
Miscellaneous data processing instructions:
Multiplication:
    MUL r4, r3, r2
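To make the shifted register operand concrete: `ADD r3, r2, r1, LSL #3` computes r2 plus r1 shifted left by three bits, that is r2 + 8 × r1, truncated to 32 bits. The helper name `add_lsl` below is hypothetical and the code is only a sketch of the arithmetic.

```python
MASK32 = 0xFFFFFFFF


def add_lsl(rn, rm, shift):
    """Model ADD rd, rn, rm, LSL #shift and return the 32-bit value of rd."""
    shifted = (rm << shift) & MASK32   # bits shifted past bit 31 are lost
    return (rn + shifted) & MASK32
```

With r2 = 10 and r1 = 3, `add_lsl(10, 3, 3)` gives 34, which is 10 + 8 × 3.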
20
Data transfer instructions
Load and store instructions:
    LDR r0, [r1]
    STR r0, [r1]
Offset:
    LDR r0, [r1, #4]
Post-indexed:
    LDR r0, [r1], #16
Auto-indexed (pre-indexed with write-back):
    LDR r0, [r1, #16]!
Multiple data transfers:
    LDMIA r1, {r0, r2, r5}
21
Examples
PRE:  r0 = 0x00000000
      r1 = 0x00009000
      mem32[0x00009000] = 0x01010101
      mem32[0x00009004] = 0x02020202
    LDR r0, [r1, #4]!
POST: r0 = 0x02020202
      r1 = 0x00009004
22
Examples
PRE:  r0 = 0x00000000
      r1 = 0x00009000
      mem32[0x00009000] = 0x01010101
      mem32[0x00009004] = 0x02020202
    LDR r0, [r1, #4]
POST: r0 = 0x02020202
      r1 = 0x00009000
23
Examples
PRE:  r0 = 0x00000000
      r1 = 0x00009000
      mem32[0x00009000] = 0x01010101
      mem32[0x00009004] = 0x02020202
    LDR r0, [r1], #4
POST: r0 = 0x01010101
      r1 = 0x00009004
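The three LDR examples differ only in when the offset is applied and whether the base register is written back. The Python sketch below models that behaviour on a dictionary of registers and a dictionary of memory words. The function name `ldr` and the mode labels are hypothetical, and the case where the destination and base are the same register is not handled.

```python
def ldr(regs, mem, rd, rn, offset=0, mode="offset"):
    """Model LDR rd, [rn ...] for three addressing modes.

    mode = "offset"   -> LDR rd, [rn, #offset]    (base unchanged)
    mode = "pre_wb"   -> LDR rd, [rn, #offset]!   (base updated before the load)
    mode = "post"     -> LDR rd, [rn], #offset    (base updated after the load)
    """
    base = regs[rn]
    address = base if mode == "post" else base + offset
    regs[rd] = mem[address]
    if mode in ("pre_wb", "post"):
        regs[rn] = base + offset       # write the new base back
    return regs


def fresh_state():
    """PRE state shared by the three examples."""
    regs = {"r0": 0x00000000, "r1": 0x00009000}
    mem = {0x00009000: 0x01010101, 0x00009004: 0x02020202}
    return regs, mem
```

Running each mode from `fresh_state()` with an offset of 4 reproduces the POST values listed in the three examples.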
24
Examples
PRE:  mem32[0x80018] = 0x03
      mem32[0x80014] = 0x02
      mem32[0x80010] = 0x01
      r0 = 0x00080010
    LDMIA r0!, {r1-r3}
POST: r0 = 0x0008001c
      r1 = 0x00000001
      r2 = 0x00000002
      r3 = 0x00000003
25
Examples
PRE:  mem32[0x8001c] = 0x04
      mem32[0x80018] = 0x03
      mem32[0x80014] = 0x02
      mem32[0x80010] = 0x01
      r0 = 0x00080010
    LDMIB r0!, {r1-r3}
POST: r0 = 0x0008001c
      r1 = 0x00000002
      r2 = 0x00000003
      r3 = 0x00000004
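The difference between LDMIA (increment after) and LDMIB (increment before) can be checked with a short Python model. The name `ldm` and the use of integer register numbers as dictionary keys are assumptions made for this sketch. Registers are loaded in ascending register-number order from ascending addresses.

```python
def ldm(regs, mem, rn, reglist, before=False, writeback=True):
    """Model LDMIA (before=False) or LDMIB (before=True) with optional '!'."""
    address = regs[rn]
    for reg in sorted(reglist):        # lowest register gets the lowest address
        if before:
            address += 4               # IB: step to the next word, then load
        regs[reg] = mem[address]
        if not before:
            address += 4               # IA: load, then step to the next word
    if writeback:
        regs[rn] = address             # '!' writes the final address back
    return regs


MEM = {0x80010: 0x01, 0x80014: 0x02, 0x80018: 0x03, 0x8001C: 0x04}
```

Starting from r0 = 0x80010, the IA form loads 1, 2, 3 and the IB form loads 2, 3, 4; both leave r0 = 0x8001c, matching the two examples.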
26
Conditional execution
Instructions can be executed conditionally, without branches:
    CMP   r2, r3        ; subtract and set flags
    ADDGE r4, r5, r6    ; if r2 >= r3 (signed)
    SUBLT r4, r5, r6    ; else
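The GE and LT conditions are decided from the flags that CMP sets: GE holds when N equals V, LT when they differ. The Python sketch below models the CMP / ADDGE / SUBLT sequence; the function names are hypothetical and only the flags needed here are interpreted.

```python
MASK32 = 0xFFFFFFFF


def cmp_flags(a, b):
    """Model CMP a, b: return (N, Z, C, V) for the 32-bit subtraction a - b."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    n = result >> 31
    z = int(result == 0)
    c = int(a >= b)                            # carry set means no borrow
    v = ((a ^ b) & (a ^ result)) >> 31         # signed overflow on subtraction
    return n, z, c, v


def add_or_sub(r2, r3, r5, r6):
    """Return r4 after CMP r2, r3 / ADDGE r4, r5, r6 / SUBLT r4, r5, r6."""
    n, _, _, v = cmp_flags(r2, r3)
    if n == v:                                 # GE: signed r2 >= r3
        return (r5 + r6) & MASK32
    return (r5 - r6) & MASK32                  # LT: signed r2 < r3
```

Note that equal operands take the ADDGE path, and that 0xFFFFFFFF is treated as -1, so it compares as less than 1.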
27
Conditional execution mnemonics
28
Control flow instructions
Branch instruction: B label
Conditional branch: BNE label
Branch and link: BL label
        BL   Loop
        …
Loop    …
        MOV  PC, r14    ; return
29
Example 1
        AREA    ARMex, CODE, READONLY   ; Name this block of code ARMex
        ENTRY                           ; Mark first instruction to execute
start   MOV     r0, #10                 ; Set up parameters
        MOV     r1, #3
        ADD     r0, r0, r1              ; r0 = r0 + r1
stop    MOV     r0, #0x18               ; angel_SWIreason_ReportException
        LDR     r1, =0x20026            ; ADP_Stopped_ApplicationExit
        SWI     0x123456                ; ARM semihosting SWI
        END                             ; Mark end of file
30
Example 2
        AREA    subrout, CODE, READONLY ; Name this block of code
        ENTRY                           ; Mark first instruction to execute
start   MOV     r0, #10                 ; Set up parameters
        MOV     r1, #3
        BL      doadd                   ; Call subroutine
stop    MOV     r0, #0x18               ; angel_SWIreason_ReportException
        LDR     r1, =0x20026            ; ADP_Stopped_ApplicationExit
        SWI     0x123456                ; ARM semihosting SWI
doadd   ADD     r0, r0, r1              ; Subroutine code
        MOV     pc, lr                  ; Return from subroutine
        END                             ; Mark end of file
31
ARM ORGANIZATION AND IMPLEMENTATION
32
3-Stage Pipeline (ARM7, 80 MHz)
Stages: Fetch, Decode, Execute
Throughput: 1 instruction / cycle
33
5-stage pipeline (1/2)
Program execution time: $T_{prog} = \frac{N_{inst} \times CPI}{f_{clk}}$
Ways to reduce it:
Increase $f_{clk}$: logic simplification in each pipeline stage.
Reduce CPI: reduce the number of multicycle instructions.
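Program execution time is commonly modelled as the instruction count times the average cycles per instruction (CPI), divided by the clock frequency. The Python sketch below evaluates that relation. The function name and the CPI figures used in the comparison are hypothetical illustration values, not measurements of any ARM core; only the 80 MHz and 150 MHz clock rates come from the pipeline slides.

```python
def execution_time(n_inst, cpi, f_clk_hz):
    """Return execution time in seconds: n_inst * cpi / f_clk_hz."""
    return n_inst * cpi / f_clk_hz


# Assumed workload of one million instructions.
slow = execution_time(1_000_000, 2.0, 80_000_000)    # assumed CPI of 2.0 at 80 MHz
fast = execution_time(1_000_000, 1.5, 150_000_000)   # assumed CPI of 1.5 at 150 MHz
```

Both levers act in the same direction here: the higher clock rate and the lower CPI each shorten the run, so `fast` is smaller than `slow`.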
34
5-stage pipeline (ARM9, 150 MHz) (2/2)
Stages: Fetch, Decode, Execute, Buffer / Data, Write-Back
35
ARM coprocessor interface
ARM supports up to 16 coprocessors, which can be emulated in software.
Each coprocessor has up to 16 general-purpose registers.
ARM is a load – store architecture, and coprocessors follow the same model.
Coprocessors usually handle on-chip functions, such as cache and memory management.
36
ARCHITECTURAL SUPPORT FOR HIGH – LEVEL LANGUAGES
37
Floating-point accelerator (1/2)
For floating-point operations, ARM has the FPE software emulator and the FPA10 hardware floating-point accelerator.
FPA10 includes:
Coprocessor interface
Load / store unit
Register bank (eight 80-bit registers)
ALU (adder, multiplier, divider)
38
Floating-point accelerator (2/2)
39
APCS (1/2)
APCS (ARM Procedure Call Standard) is a set of rules governing how C procedures receive their arguments and return their results.
It assigns specific roles to the general-purpose registers (r0 – r3: arguments, r4 – r8: register variables, r10: stack limit, etc.).
Procedure call and return:
        BL   Loop
        …
Loop    …
        MOV  pc, lr
40
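APCS (2/2)
C code:
    void f1(int a) { f2(a); }
Assembly code (schematic; a full ascending stack is assumed, as the SUB that discards the argument implies):
f1      LDR     r0, [r13]           ; load argument a from the top of the stack
        STR     r14, [r13, #4]!     ; push the return address
        STR     r0, [r13, #4]!      ; push a as the argument for f2
        BL      f2                  ; call f2
        SUB     r13, r13, #4        ; discard the argument
        LDR     r15, [r13], #-4     ; pop the return address into the PC
[Figure: stack pointer at offsets 0, 4, 8, 16]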
41
THUMB PROGRAMMER’S MODEL
42
General information
Thumb objective: code density.
Thumb has a 16-bit instruction set: a subset of the ARM instruction set encoded in a 16-bit space.
With appropriate use, it brings significant benefits in:
Power efficiency
Performance
43
Going in and out of Thumb mode
From ARM state, use the BX instruction, e.g. BX r0.
If r0[0] is 1, the T bit in the CPSR is set to 1 and the PC is set to the address given by the remaining bits of r0.
Instructions are assembled as 16-bit Thumb instructions when the appropriate directive is used.
Using the BX instruction from Thumb state, with bit 0 of the target register clear, returns to ARM state.
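The state switch performed by BX depends only on bit 0 of the target register, which can be sketched in a few lines of Python. The name `bx` is hypothetical, and the model returns just the new PC and T bit without touching the rest of the CPSR.

```python
MASK32 = 0xFFFFFFFF


def bx(target):
    """Model BX: return (pc, t_bit) for a 32-bit branch target value."""
    t_bit = target & 1                 # bit 0 selects Thumb (1) or ARM (0) state
    pc = target & ~1 & MASK32          # the remaining bits form the address
    return pc, t_bit
```

`bx(0x8001)` yields a PC of 0x8000 in Thumb state, while `bx(0x9000)` stays in, or returns to, ARM state. This is why an ARM header that enters a Thumb routine branches to the routine's address plus 1.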
44
The Thumb programmer's model
Thumb registers
45
ARM vs. Thumb (1/3)
Thumb:
Code size about 70% of the equivalent ARM code
40% more instructions
45% faster code with 16-bit memory
Requires about 30% less external memory
ARM:
40% faster code when coupled with a 32-bit memory
46
ARM vs. Thumb (2/3)
If performance is critical: ARM
If cost and power consumption are critical: Thumb
47
ARM and Thumb interaction
A 32-bit ARM system can switch to Thumb mode for specific routines in order to meet power and memory constraints.
A 16-bit system can use an on-chip 32-bit memory for ARM-state routines, and a 16-bit off-chip memory with Thumb code for the rest of the application.
48
Example 3
        AREA    ThumbSub, CODE, READONLY ; Name this block of code
        ENTRY                           ; Mark first instruction to execute
        CODE32                          ; Subsequent instructions are ARM
header  ADR     r0, start + 1           ; Processor starts in ARM state,
        BX      r0                      ; so small ARM code header used
                                        ; to call Thumb main program
        CODE16                          ; Subsequent instructions are Thumb
start   MOV     r0, #10                 ; Set up parameters
        MOV     r1, #3
        BL      doadd                   ; Call subroutine
stop    MOV     r0, #0x18               ; angel_SWIreason_ReportException
        LDR     r1, =0x20026            ; ADP_Stopped_ApplicationExit
        SWI     0xAB                    ; Thumb semihosting SWI
doadd   ADD     r0, r0, r1              ; Subroutine code
        MOV     pc, lr                  ; Return from subroutine
        END                             ; Mark end of file
49
Example 4
Implement the following pseudocode in ARM and Thumb assembly. Which is more efficient in terms of execution time, and which in terms of code size?
    if r1 > r2 then
        r3 = r4 + r5
        r6 = r4 - r5
    else
        r3 = r4 - r5
        r6 = r4 + r5
50
Example 5
Write an ARM assembly program that loads data from memory location 0x40, sets bits 3 to 5, clears bits 0 to 2 and leaves the remaining bits unchanged. Test it using 0xAD as input data.
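One way to check an answer to Example 5 is to work out the expected result separately. The Python sketch below applies the two masks that an ORR and a BIC instruction would use; the function name `set_and_clear` is hypothetical, and this models only the bit arithmetic, not the load from 0x40.

```python
SET_MASK = 0x38     # bits 3 to 5
CLEAR_MASK = 0x07   # bits 0 to 2


def set_and_clear(value):
    """Set bits 3..5, clear bits 0..2, leave the other bits of a 32-bit word."""
    value |= SET_MASK                      # what ORR rX, rX, #0x38 would do
    value &= ~CLEAR_MASK & 0xFFFFFFFF      # what BIC rX, rX, #0x07 would do
    return value
```

For the suggested input 0xAD (binary 1010 1101), setting bits 3 to 5 gives 0xBD and clearing bits 0 to 2 then gives 0xB8, so a correct program should leave 0xB8 in the register.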
51
ARCHITECTURAL SUPPORT FOR SYSTEM DEVELOPMENT
52
The ARM memory interface A basic ARM memory system
53
AMBA (1/4)
Advanced Microcontroller Bus Architecture (AMBA):
Advanced High-performance Bus (AHB)
Advanced System Bus (ASB)
Advanced Peripheral Bus (APB)
AMBA objectives:
Technology independence
To encourage modular system design
54
AMBA (2/4)
A typical AMBA-based system
55
AMBA (3/4)
AHB bus:
Burst transactions
Split transactions
Data bus width: 64 – 128 bits
56
AMBA (4/4)
AMBA Design Kit (ADK): an environment that assists designers in developing AMBA-based components and SoC designs.
57
Signal Processing Support (1/2)
Piccolo DSP coprocessor.
Various data memories for maximizing throughput.
58
Signal Processing Support (2/2) Piccolo
59
MEMORY HIERARCHY
60
Memory hierarchy
Moving down the table: larger size, lower speed.
Memory type      Size               Speed
Registers        32-bit             A few nsec
On-chip cache    8 – 32 kbytes      10 nsec
Off-chip cache   100 – 200 kbytes   10 – 30 nsec
RAM              Mbytes             100 nsec
61
On-chip memory
Necessary for performance.
Some systems prefer on-chip RAM to an on-chip cache: it is simpler, cheaper and less power-hungry.
62
Cache types
Cache types:
Unified cache
Separate instruction and data caches
Performance: hit rate and miss rate
Compulsory miss: the first time an address is accessed
Capacity miss: the cache is full
Conflict miss: two addresses compete for the same place in the cache
63
Replacement policy and implementation
Replacement policies: Least Recently Used (LRU), Least Frequently Used (LFU), data prediction
Implementations: fully associative, direct-mapped, set-associative
64
Direct-mapped cache (1/2)
Each line of data is stored together with a tag derived from its memory address.
65
Direct-mapped cache (2/2)
Each memory location has exactly one place in the cache.
Tag and data can be accessed at the same time.
The tag RAM is smaller than the data RAM and therefore has a shorter access time, so the tag comparison can complete within the data RAM access.
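A direct-mapped lookup can be sketched by splitting an address into tag, index and byte offset. In the Python model below, the geometry (16-byte lines, 256 lines, 4 Kbytes in total) and all names are assumptions chosen for illustration; the slides do not specify a cache size.

```python
LINE_BYTES = 16    # assumed line length
NUM_LINES = 256    # assumed number of lines (4 Kbytes in total)


def split_address(addr):
    """Return (tag, index, offset) for a byte address."""
    offset = addr % LINE_BYTES
    index = (addr // LINE_BYTES) % NUM_LINES
    tag = addr // (LINE_BYTES * NUM_LINES)
    return tag, index, offset


class DirectMappedCache:
    """Tracks only the tag stored in each line; data is not modelled."""

    def __init__(self):
        self.tags = [None] * NUM_LINES

    def access(self, addr):
        """Return True on a hit; on a miss, load the line and return False."""
        tag, index, _ = split_address(addr)
        hit = self.tags[index] == tag
        self.tags[index] = tag         # a miss replaces whatever was there
        return hit
```

Addresses 0x0000 and 0x1000 share index 0 under this geometry, so alternating between them misses every time. That is the conflict miss described earlier.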
66
2 – way set – associative cache. (1/3)
67
Set-associative cache (2/3)
An n-way set-associative cache gives each address n possible locations (ways) within its set.
Two addresses that would compete for the same spot in a direct-mapped cache can be stored in different locations and accessed independently.
68
Set-associative cache (3/3)
Set selection:
Random allocation
Least recently used (LRU)
Round-robin (cyclic)
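The benefit of associativity, and the effect of LRU selection, can be seen in a small Python model of a 2-way set-associative cache. The class name and the geometry (128 sets, 2 ways, 16-byte lines) are assumptions for illustration, and only tags are tracked.

```python
class SetAssociativeCache:
    """n-way set-associative cache with LRU replacement (tags only)."""

    def __init__(self, num_sets=128, ways=2, line_bytes=16):
        self.num_sets = num_sets
        self.ways = ways
        self.line_bytes = line_bytes
        # Each set is a list of tags, least recently used first.
        self.sets = [[] for _ in range(num_sets)]

    def access(self, addr):
        """Return True on a hit; on a miss, allocate a way and return False."""
        line = addr // self.line_bytes
        index = line % self.num_sets
        tag = line // self.num_sets
        tags = self.sets[index]
        hit = tag in tags
        if hit:
            tags.remove(tag)           # will be re-appended as most recent
        elif len(tags) == self.ways:
            tags.pop(0)                # evict the least recently used tag
        tags.append(tag)
        return hit
```

With this geometry, addresses 0x0000 and 0x0800 map to the same set but occupy different ways, so both hit on re-access. A third address in that set, such as 0x1000, evicts whichever of the two was used least recently.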
69
Fully associative (1/2)
70
Write strategies
Write-through: all write operations are passed to main memory.
Write-through with buffered write: write operations are passed to main memory through the write buffer.
Copy-back (write-back): write operations update only the cache; main memory is updated when the modified line is replaced.
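The practical difference between the strategies is the amount of traffic reaching main memory. The Python sketch below counts memory write transactions for a sequence of stores under a strong simplifying assumption: the cache is large enough that each modified line is written back exactly once. The function name and the 16-byte line size are hypothetical.

```python
def memory_writes(write_addresses, policy, line_bytes=16):
    """Count main-memory write transactions for a list of store addresses."""
    if policy == "write-through":
        # Every store goes to main memory (buffered or not, the count is the same).
        return len(write_addresses)
    if policy == "copy-back":
        # Stores only dirty cache lines; each dirty line is written back once.
        dirty_lines = {addr // line_bytes for addr in write_addresses}
        return len(dirty_lines)
    raise ValueError("unknown policy: " + policy)
```

Four stores that touch two lines cost four memory writes with write-through but only two line write-backs with copy-back. Each write-back moves a whole line, so the saving depends on how often stores revisit the same line.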
71
Cache feature summary
72
'Perfect' cache performance
73
MMU (1/3)
Two memory management approaches:
Segmentation
Paging
74
MMU (2/3)
Segmented memory management:
75
MMU (3/3)
Paging memory management:
76
ARCHITECTURAL SUPPORT FOR OPERATING SYSTEMS
77
CP15
On-chip coprocessor that controls the MMU, cache and protection unit.
Control takes place through its registers, using instructions executed in supervisor mode.
78
Protection Unit
Simpler alternative to the MMU.
Requires simpler software and hardware.
Does not use translation tables; uses 8 protection regions instead.
79
ARM DEVELOPER SUITE
80
ARMULATOR (1/2)
ARMulator: emulator of various ARM processors.
Allows project development in C, C++ or assembly.
It comes with a debugger, compilers and an assembler; the entire set is called ARM Developer Suite (ADS).
81
ARMULATOR (2/2)
Possible project options:
ARM and Thumb interworking
Mixing C, C++ and assembly
Code for ROM
Exception handlers
MM
82
ARMULATOR TUTORIAL
CODEWARRIOR ENVIRONMENT