ARM7 Microprocessor

ARM7 Microprocessor
Thank you, chairman. Good morning, everyone. The title of my presentation is “Schedule-Aware performance estimation of communication architecture for efficient design space exploration”.

Contents
  System overview
  Introduction to ARM
  ARM7 Instruction Set Architecture
  ARM7 Microarchitecture

System
[Diagram: a system is split into a SW system and a HW system]
  SW system: application, OS & middleware (metrics: code density, code execution speed)
  HW system: microprocessor, memory system, peripherals, controller (metrics: size, power consumption, throughput)

Hardware/Software System Architecture

Microprocessor
Factors in deciding processor architecture for a system
  Operating environment
    General purpose system
    Special / limited purpose system (embedded system)
  Required performance
    Is high throughput required? (e.g. clock speed, pipeline depth)
    Are optimized functionalities required? (e.g. communication)
    Is power consumption control critical?
  Tradeoffs
    High performance = high power
    Many functionalities = high power & size

Microprocessor (cont’d)
High performance, general purpose microprocessor
  Processor architecture & performance
    General purpose processors
    Very high performance (e.g. throughput, clock speed, etc.)
    Provide various functionalities (e.g. multimedia instruction set)
    High throughput at the cost of high power
  Software vs. hardware
    Implementation overhead is in software
    Software optimization is not critical
  Examples
    Intel Pentium class
    AMD processors

Microprocessor (cont’d)
Medium performance, embedded processors
  Processor architecture & performance
    Embedded processors
    Relatively high performance
    Provide limited but specialized functionalities (e.g. low power)
    Architecture is decided by its main application environment
  Software vs. hardware
    Implementation overhead is balanced between HW and SW
    Hardware is optimized for a limited range of tasks
    Software optimization in terms of hardware utilization is critical
  Examples
    ARMx processor
    MIPSx processor

Microprocessor (cont’d)
Low performance, cost-effective processors
  Processor architecture & performance
    Low performance
    Provide basic functionalities
    Used in simple systems where cost is critical
  Examples
    8051, 8086, 8088
    Motorola 68k series

Contents
  System overview
  Introduction to ARM
  ARM7 Instruction Set Architecture
  ARM7 Microarchitecture

ARM (Advanced RISC Machines)
  Strengths
    High performance
    Low price
    Very low power consumption
    Good development environment
  Weakness
    Lack of DSP operations
  Opportunities
    Mobile computing trend
    Coming of the post-PC age
  Threats
    None at present

Contents
  System overview
  Introduction to ARM
  ARM7 Instruction Set Architecture
  ARM7 Microarchitecture

ARM processor overview
What is the ARMx processor?
  Designed by ARM (Advanced RISC Machines)
  Standard 32-bit SoC processor (most widely used)
  Balanced performance & size / power
ARM(T) architecture
  Supports THUMB mode (16-bit instructions)
  Load-store architecture
    Data processing operations only operate on register contents, not directly on memory contents
    Powerful load & store instructions (e.g. indexing)
  Conditional execution of all instructions (condition flags)
  Memory-mapped I/O
  Four-word-deep write buffer
  Two-way set-associative, unified 8K-byte cache (instruction cache and data cache)

Load/store architecture
Access to memory is provided through a pair of dedicated instructions:
  load: copy a value from memory into a register
  store: copy a value from a register into memory
The alternative to load/store is found in CISC processors, which offer a variety of addressing modes. With addressing modes, all instructions (for example arithmetic instructions) are able to use operands that are directly in memory. Since all of the operations can reach memory directly, there is no need for special load and store instructions.
Eliminating these addressing modes is one of the ways that RISC processors are able to simplify the instruction set.
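The load/store idea can be sketched with a toy register machine. This is a minimal illustration and not ARM code: the memory layout, the register names and the helpers `load`, `store`, `add` and `add_from_memory` are all assumptions made for the example.

```python
# Toy load/store machine: arithmetic works on registers only.
# Addresses, register names and helper names are illustrative assumptions.

memory = {0x100: 7, 0x104: 5, 0x108: 0}    # a few words of toy memory
regs = {f"r{i}": 0 for i in range(4)}      # four toy registers


def load(rd, addr):
    """load: copy a value from memory into a register."""
    regs[rd] = memory[addr]


def store(rs, addr):
    """store: copy a value from a register into memory."""
    memory[addr] = regs[rs]


def add(rd, rn, rm):
    """Data-processing instruction: operands and result are registers."""
    regs[rd] = (regs[rn] + regs[rm]) & 0xFFFFFFFF


def add_from_memory():
    """Compute mem[0x108] = mem[0x100] + mem[0x104] in load/store style."""
    load("r0", 0x100)        # memory -> register
    load("r1", 0x104)        # memory -> register
    add("r2", "r0", "r1")    # register-only arithmetic
    store("r2", 0x108)       # register -> memory
    return memory[0x108]
```

A CISC-style machine with memory operands could express the same work in a single instruction; here it takes two loads, one add and one store.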

ARM7 Programmer’s model
  Overview
  Operational Modes
  Exceptions

Overview
From the programmer’s point of view, the ARM can be in one of two states
  Normal (ARM) state: executes 32-bit, word-aligned ARM instructions
  THUMB state: operates with 16-bit, half-word-aligned THUMB instructions
Transition between these two states does not affect the processor mode or the contents of the registers
THUMB instructions are one-half the bit width of normal ARM instructions
  Produce very high-density code
  If the memory bus is 16 or 8 bits wide, THUMB instructions perform better than normal ARM instructions
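The claim about narrow buses comes down to simple arithmetic: an instruction wider than the bus needs several bus transfers to fetch. The sketch below only counts transfers per instruction; the function name `bus_fetches` is an assumption, and the model ignores that a THUMB program may need more instructions than the equivalent ARM program.

```python
import math


def bus_fetches(instr_bits, bus_bits):
    """Bus transfers needed to fetch one instruction of the given width."""
    return math.ceil(instr_bits / bus_bits)


# On a 16-bit bus, a 32-bit ARM instruction needs two transfers,
# a 16-bit THUMB instruction only one.
arm_on_16 = bus_fetches(32, 16)
thumb_on_16 = bus_fetches(16, 16)
```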

Overview
Memory formats
  Memory is viewed as a linear collection of bytes numbered upwards from zero
  Bytes 0 to 3 hold the first stored word, bytes 4 to 7 the second, and so on
  Words in memory can be treated as being stored in either Big-Endian or Little-Endian format
Big-Endian format: the most significant byte of a word is stored at the lowest numbered byte and the least significant byte at the highest numbered byte (byte 0 of the memory system is therefore connected to data lines 31 through 24)
Little-Endian format: the lowest numbered byte in a word is considered the word’s least significant byte, and the highest numbered byte the most significant (byte 0 of the memory system is therefore connected to data lines 7 through 0)
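The two byte orders can be checked with a short sketch that uses Python's built-in `int.to_bytes` and `int.from_bytes`. The helper names `word_to_bytes` and `bytes_to_word` are assumptions for illustration.

```python
def word_to_bytes(word, big_endian):
    """Return the four bytes of a 32-bit word, lowest address first."""
    order = "big" if big_endian else "little"
    return list(word.to_bytes(4, order))


def bytes_to_word(byte_list, big_endian):
    """Rebuild a 32-bit word from four bytes given lowest address first."""
    order = "big" if big_endian else "little"
    return int.from_bytes(bytes(byte_list), order)


# Big-endian: byte 0 holds the most significant byte (0x11).
# Little-endian: byte 0 holds the least significant byte (0x44).
big = word_to_bytes(0x11223344, big_endian=True)
little = word_to_bytes(0x11223344, big_endian=False)
```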

Little- and big-endian memory organizations
Unaligned instruction fetches or data accesses will incur errors

ARM7 Operational Modes
Table of ARM7 operational modes
  Mode            Abbr.  Description
  User            USR    Normal application execution environment*
  Fast Interrupt  FIQ    Response-time-critical interrupt
  Interrupt       IRQ    General purpose interrupt
  Supervisor      SVC    Protected mode for operating system
  Abort           ABT    Virtual memory protection & management
  Undefined       UND    Undefined instruction (reserved for coprocessor)
  System          SYS    Privileged user mode for operating system
*User mode is subdivided into ARM and THUMB mode

IRQ Mode
When the nIRQ signal is asserted, the ARM changes to IRQ mode

FIQ Mode
When the nFIQ pin signal is asserted, the ARM enters FIQ mode

Supervisor mode
On reset or on an SWI instruction, the ARM enters Supervisor mode

Abort Mode
On an attempt to fetch a non-existent instruction or to access an illegal memory address, the ARM enters Abort mode
The programmer can use the BKPT instruction to enter Abort mode (on cores that implement BKPT, i.e. ARMv5T and later)

System mode and Undefined mode
System mode
  Is not entered by any exception
  Intended for use by operating system tasks which need access to system resources
  Software must be used to enter this mode
Undefined mode
  When the ARM CPU tries to decode an illegal instruction, it enters Undefined mode

Register File Structure
The ARM processor has a total of 37 registers
  General purpose register file (GPR)
    31 general-purpose registers, including a program counter
    These registers are 32 bits wide
  Program status register file (PSR)
    6 status registers
    These registers are also 32 bits wide

Register File Structure
Table of ARM7 general purpose register (GPR) file
  Purpose        Register  USR/SYS  ABT  UND  SVC  IRQ  FIQ
                 R0        R0       R0   R0   R0   R0   R0
                 R1        R1       R1   R1   R1   R1   R1
                 R2        R2       R2   R2   R2   R2   R2
                 R3        R3       R3   R3   R3   R3   R3
                 R4        R4       R4   R4   R4   R4   R4
                 R5        R5       R5   R5   R5   R5   R5
                 R6        R6       R6   R6   R6   R6   R6
                 R7        R7       R7   R7   R7   R7   R7
                 R8        R8       R8   R8   R8   R8   R8
                 R9        R9       R9   R9   R9   R9   R9
                 R10       R10      R10  R10  R10  R10  R10
                 R11       R11      R11  R11  R11  R11  R11
                 R12       R12      R12  R12  R12  R12  R12
  Stack Pointer  R13       R13      R13  R13  R13  R13  R13
  Link Register  R14       R14      R14  R14  R14  R14  R14
  PC             R15       R15      R15  R15  R15  R15  R15

ARM7 GPR (cont’d)
Visible register set
  Registers that are visible during a specific mode
  16 x 32-bit registers are visible in any mode
  Some registers are shared, some are not
Banked registers
  Registers that share the same index
  Only one of a set of banked registers is visible in each mode
  R13 (SP) and R14 (LR) are banked
  FIQ has 5 additional banked registers
  Register dump overhead is reduced at context switch
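The banking rules can be sketched as a lookup from (mode, register index) to a physical register, which also reproduces the register totals quoted in these slides. The assumptions here are that the five additional FIQ registers are R8 to R12 (as in the ARM architecture), and that the physical names such as `R13_svc` and the helper names are purely illustrative.

```python
MODES = ["USR", "SYS", "SVC", "ABT", "UND", "IRQ", "FIQ"]


def physical_register(mode, index):
    """Name of the physical register seen as R<index> in the given mode."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    if not 0 <= index <= 15:
        raise ValueError(f"register index out of range: {index}")
    if index == 15:
        return "R15"                        # the PC is shared by all modes
    if mode == "FIQ" and 8 <= index <= 14:
        return f"R{index}_fiq"              # FIQ banks R8-R12 plus SP and LR
    if mode in ("SVC", "ABT", "UND", "IRQ") and index in (13, 14):
        return f"R{index}_{mode.lower()}"   # SP and LR banked per mode
    return f"R{index}"                      # shared with USR/SYS


def count_general_purpose():
    """Count distinct physical general-purpose registers across all modes."""
    return len({physical_register(m, i) for m in MODES for i in range(16)})


# 31 general-purpose registers plus 6 status registers
# (CPSR and one SPSR for each of the five exception modes) give 37.
TOTAL_REGISTERS = count_general_purpose() + 6
```

Because an exception mode gets its own stack pointer and link register, a handler can start running without first saving the interrupted program's R13 and R14.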

ARM7 GPR (cont’d)
R13: Stack Pointer
[Diagram: R13 is a banked register; the CPSR acts as the selector that chooses among R13_USER, R13_SVC, R13_ABORT and R13_UNDEF]

ARM7 GPR (cont’d)
37 registers in total; at most 18 registers are visible at a time
[Diagram: for each mode (USR/SYS, ABT, UND, SVC, IRQ, FIQ) the visible registers are R0 to R14, R15 (PC), CPSR and, in the exception modes, SPSR]

ARM7 GPR (cont’d)
R13, Stack Pointer
  Used when stacks are implemented
  Used when a context switch occurs
  Stores the stack pointer value of tasks
R14, Link Register
  Used when a mode change with return occurs
  Stores the return address (current PC)
R15, Program Counter
  Used to store the current instruction address
  A write to R15 is equivalent to a branch instruction

Instruction Pipeline
A three-stage pipeline is used
  Fetch, Decode, Execute
The program counter points to the instruction being fetched rather than to the instruction being executed
The Program Counter (PC) value seen by an executing instruction is always two instructions ahead of that instruction's own address
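The "two instructions ahead" rule can be written out as a small calculation: with 4-byte ARM instructions the PC reads as the executing instruction's address plus 8, and with 2-byte THUMB instructions as its address plus 4. The function names below are assumptions for illustration.

```python
def pc_seen_by_executing(instr_addr, thumb=False):
    """PC value read by the instruction at instr_addr while it executes."""
    size = 2 if thumb else 4      # instruction size in bytes
    return instr_addr + 2 * size  # two instructions ahead


def pipeline_snapshot(execute_addr, thumb=False):
    """Addresses occupying each stage while execute_addr is in EXECUTE."""
    size = 2 if thumb else 4
    return {
        "execute": execute_addr,
        "decode": execute_addr + size,
        "fetch": execute_addr + 2 * size,  # this is what the PC holds
    }
```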

The Relationship between pipeline and PC Normal ARM Mode

The Relationship between pipeline and PC THUMB Mode

Pipeline and return address

Program Status Register Files (PSR)
Table of ARM7 program status register file
  Register  USR/SYS  ABT   UND   SVC   IRQ   FIQ
  CPSR      CPSR     CPSR  CPSR  CPSR  CPSR  CPSR
  SPSR      -        SPSR  SPSR  SPSR  SPSR  SPSR
CPSR
  Stores the current processor state
  Contains condition flags and control bits
SPSR
  Stores the processor state from before entering an exception mode
  Structure is identical to the CPSR

ARM7 PSR (cont’d) ARM7 CPSR / SPSR Format

ARM7 PSR (cont’d)
Control bits I: interrupt mask bits (I, F)
  Can be set or reset in privileged modes
  If '1', the corresponding IRQ or FIQ requests are ignored
Control bits II: THUMB bit (T)
  Must not be altered directly by software
  Is set or reset by H/W
  If '1', the processor is running in THUMB state, else in ARM state
Control bits III: mode bits (M4 ~ M0)
  Reflect the current processor mode
  Can be changed in privileged modes (results in a mode change)
  In user mode they are changed only automatically, by H/W
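A sketch of decoding a CPSR value into its fields follows. The bit positions (flags N, Z, C, V in bits 31 to 28, I in bit 7, F in bit 6, T in bit 5, mode in bits 4 to 0) and the mode encodings are taken from the ARM architecture rather than from these slides, and the names `decode_cpsr` and `MODE_NAMES` are assumptions.

```python
# Mode-bit encodings M[4:0] as defined by the ARM architecture.
MODE_NAMES = {
    0b10000: "USR",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "SVC",
    0b10111: "ABT",
    0b11011: "UND",
    0b11111: "SYS",
}


def decode_cpsr(cpsr):
    """Split a 32-bit CPSR/SPSR value into flags, control bits and mode."""
    return {
        "N": bool((cpsr >> 31) & 1),  # negative
        "Z": bool((cpsr >> 30) & 1),  # zero
        "C": bool((cpsr >> 29) & 1),  # carry
        "V": bool((cpsr >> 28) & 1),  # overflow
        "I": bool((cpsr >> 7) & 1),   # 1 = IRQ masked
        "F": bool((cpsr >> 6) & 1),   # 1 = FIQ masked
        "T": bool((cpsr >> 5) & 1),   # 1 = THUMB state
        "mode": MODE_NAMES.get(cpsr & 0x1F, "invalid"),
    }


# Example value: Z and C set, both interrupts masked, Supervisor mode.
example = decode_cpsr(0x600000D3)
```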

Exceptions
Mode changes can be made under
  Software control
  External interrupts
  Exception process
The modes other than user mode are privileged modes
  Have full access to system resources
  Can change mode freely
Exception modes
  FIQ
  IRQ
  Supervisor mode
  Abort: data abort and instruction prefetch abort
  Undefined

Exception
Task flow
  Class      Cause
  Interrupt  External stimulus
  Fault      Internal cause
  Trap       Trap instruction

Exception (cont’d)
ARM7 (ISA v4) Exceptions
  Type                   Class      Description (cause)
  Reset                  -          Power up
  Undefined Instruction  FAULT      Invalid / coprocessor instruction
  Prefetch Abort         FAULT      TLB miss for instruction
  Data Abort             FAULT      TLB miss for data access
  IRQ                    INTERRUPT  Normal interrupt
  FIQ                    INTERRUPT  Fast interrupt (no context switch)
  SW Interrupt           TRAP       SWI instruction (software trap)

Exception (cont’d)
ARM7 (ISA v4) Exception Vectors
  Exception              Address     Mode on entry
  Reset                  0x00000000  Supervisor
  Undefined Instruction  0x00000004  Undefined
  SW Interrupt           0x00000008  Supervisor
  Prefetch Abort         0x0000000C  Abort
  Data Abort             0x00000010  Abort
  IRQ                    0x00000018  IRQ
  FIQ                    0x0000001C  FIQ
  Reserved               0x00000014  Reserved

ARM Exceptions (cont’d)
On entry (done automatically by the ARM)
  1) Completes the current instruction (except for the reset exception)
  2) Changes to the operating mode corresponding to the particular exception
  3) Saves the address of the following instruction in r14 of the new mode
  4) Saves the old value of the CPSR in the SPSR of the new mode
  5) Disables IRQ exceptions: sets bit 7 of the CPSR
  6) If it is a FIQ exception, disables further FIQs: sets bit 6 of the CPSR
  7) Forces the PC to the address of the exception handler
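The entry sequence above can be sketched as a small state update. This is a simplified model under stated assumptions: the processor state is a plain dictionary, the caller supplies the return address, the processor is assumed to be in ARM state, and the names `enter_exception`, `VECTORS` and `MODE_BITS` are hypothetical. The vector addresses follow the exception vector table; the mode-bit encodings come from the ARM architecture.

```python
# (vector address, mode entered) for each exception type.
VECTORS = {
    "reset": (0x00, "SVC"),
    "undefined": (0x04, "UND"),
    "swi": (0x08, "SVC"),
    "prefetch_abort": (0x0C, "ABT"),
    "data_abort": (0x10, "ABT"),
    "irq": (0x18, "IRQ"),
    "fiq": (0x1C, "FIQ"),
}
MODE_BITS = {"USR": 0b10000, "FIQ": 0b10001, "IRQ": 0b10010,
             "SVC": 0b10011, "ABT": 0b10111, "UND": 0b11011}
I_BIT = 1 << 7  # IRQ disable
F_BIT = 1 << 6  # FIQ disable


def enter_exception(state, kind, return_addr):
    """Apply the automatic entry steps to a toy processor state."""
    vector, mode = VECTORS[kind]
    old_cpsr = state["cpsr"]
    state["lr"][mode] = return_addr             # r14 of the new mode
    state["spsr"][mode] = old_cpsr              # old CPSR -> SPSR of new mode
    new_cpsr = (old_cpsr & ~0x1F) | MODE_BITS[mode]  # switch mode bits
    new_cpsr |= I_BIT                           # disable IRQ
    if kind == "fiq":
        new_cpsr |= F_BIT                       # disable further FIQs
    state["cpsr"] = new_cpsr & 0xFFFFFFFF
    state["pc"] = vector                        # jump to the handler
    return state


def new_state(cpsr=0x10, pc=0x8000):
    """Fresh toy state: user mode, no banked values saved yet."""
    return {"cpsr": cpsr, "pc": pc, "spsr": {}, "lr": {}}
```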

ARM Exceptions (cont’d)
On exit
  1) Restore user registers
  2) Restore the CPSR from the SPSR
  3) Set the proper return address in the PC
Conflict in performing steps 2) and 3)
  If step 2) is performed before step 3): the lower bits of the CPSR determine the operating mode, so restoring the CPSR makes the banked r14 (which holds the return address) inaccessible
  If step 3) is performed before step 2): the exception handler loses control, and the code that performs step 2) is never reached
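The conflict disappears if both values are read from the exception mode's banked registers before either is written, which is what a single return instruction achieves. The sketch below models that atomic return on a toy state dictionary. The resolution itself is not stated on the slide: it reflects the ARM architecture's return forms such as `MOVS pc, lr` and `SUBS pc, lr, #4`, and the per-exception offsets are likewise taken from the architecture. The names `return_from_exception` and `RETURN_OFFSET` are hypothetical.

```python
# Bytes subtracted from the banked r14 to obtain the return address.
RETURN_OFFSET = {
    "swi": 0,
    "undefined": 0,
    "irq": 4,
    "fiq": 4,
    "prefetch_abort": 4,
    "data_abort": 8,   # re-executes the instruction that aborted
}


def return_from_exception(state, kind, mode):
    """Restore PC and CPSR together, as one indivisible step."""
    # Read both banked values while still in the exception mode...
    new_pc = state["lr"][mode] - RETURN_OFFSET[kind]
    new_cpsr = state["spsr"][mode]
    # ...then commit them together, so neither step can hide the other.
    state["pc"], state["cpsr"] = new_pc, new_cpsr
    return state
```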

ARM7 Instruction Set Overview
  A load-store architecture
  Auto-increment/decrement addressing
  Load/store multiple
  64-bit multiplication / MAC operations
  Conditional execution (not strictly RISC-style)

ARM Instruction Set Format

Condition Code
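Since the condition-code figure is not reproduced in this transcript, here is a hedged sketch of how the 4-bit condition field is evaluated against the N, Z, C and V flags. The encodings follow the ARM architecture; the names `CONDITIONS` and `condition_passed` are assumptions, and the reserved encoding 0b1111 is deliberately left out.

```python
# 4-bit condition field -> (mnemonic, test on the N, Z, C, V flags).
CONDITIONS = {
    0b0000: ("EQ", lambda n, z, c, v: z),
    0b0001: ("NE", lambda n, z, c, v: not z),
    0b0010: ("CS", lambda n, z, c, v: c),
    0b0011: ("CC", lambda n, z, c, v: not c),
    0b0100: ("MI", lambda n, z, c, v: n),
    0b0101: ("PL", lambda n, z, c, v: not n),
    0b0110: ("VS", lambda n, z, c, v: v),
    0b0111: ("VC", lambda n, z, c, v: not v),
    0b1000: ("HI", lambda n, z, c, v: c and not z),
    0b1001: ("LS", lambda n, z, c, v: (not c) or z),
    0b1010: ("GE", lambda n, z, c, v: n == v),
    0b1011: ("LT", lambda n, z, c, v: n != v),
    0b1100: ("GT", lambda n, z, c, v: (not z) and n == v),
    0b1101: ("LE", lambda n, z, c, v: z or n != v),
    0b1110: ("AL", lambda n, z, c, v: True),
}


def condition_passed(cond, n, z, c, v):
    """True if an instruction with this condition field would execute."""
    _, test = CONDITIONS[cond]
    return bool(test(n, z, c, v))
```

An instruction whose condition fails simply acts as a no-op, which is how short conditional sequences avoid branches.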

Contents
  System overview
  Introduction to ARM
  ARM7 Instruction Set Architecture
  ARM7 Microarchitecture

ARM7 Core
[Block diagram labels: Debugger, ARM7 Core, ICache, DCache, Wrapper, SRAM, Arbiter]

ARM7 Datapath
  ARM7 Datapath
    Pipeline Model
    Datapath Overview
    Clock Scheme
  IF Stage: Address Mux. & Incrementer Block
  ID Stage: Register File
  EXE Stage
    Overview of EXE Stage
    ALU
    Multiplier

Datapath - Microprocessor
[Diagram labels: general purpose register, process unit, enable signal, control logic]

ARM7 Pipeline Model
ARM7: standard 3-stage pipelined architecture (FETCH, DECODE, EXECUTE)
FETCH: fetch instruction
  Select/increment PC
  Read next instruction
  Related blocks: address selector, address incrementer, address register
DECODE: decode instruction
  Generate control signals
  Generate immediate
  Read from register file
  Related blocks: control logic (decoder), register file
EXECUTE: execute instruction
  Arithmetic / logic
  Calculate branch address
  Load / store
  Related blocks: shifter, multiplier, ALU
*Register write back (WB) is hidden

ARM7 Pipeline Model (cont’d)
[Figure: normal instruction flow]
[Figure: stalls needed for longer instructions]

Data Hazard on r1
  add r1,r2,r3
  sub r4,r1,r3
  and r6,r1,r7
  or  r8,r1,r9
  xor r10,r1,r11
[Pipeline diagram: instruction order against time (clock cycles); each instruction passes through IF (Im), ID/RF (Reg), EX (ALU), MEM (Dm) and WB (Reg)]
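The dependences in this sequence can be found mechanically: every later instruction that reads r1 before r1 is overwritten depends on the `add`. The sketch below only lists these read-after-write pairs and does not model pipeline timing or forwarding; the three-operand text format and the names `parse` and `raw_dependences` are assumptions made for the example.

```python
PROGRAM = [
    "add r1,r2,r3",
    "sub r4,r1,r3",
    "and r6,r1,r7",
    "or r8,r1,r9",
    "xor r10,r1,r11",
]


def parse(instr):
    """Split 'op rd,rn,rm' into (op, destination, [sources])."""
    op, operands = instr.split(None, 1)
    regs = [r.strip() for r in operands.split(",")]
    return op, regs[0], regs[1:]


def raw_dependences(program):
    """List (producer index, consumer index, register) RAW pairs."""
    found = []
    for i, producer in enumerate(program):
        _, dest, _ = parse(producer)
        for j in range(i + 1, len(program)):
            _, later_dest, sources = parse(program[j])
            if dest in sources:
                found.append((i, j, dest))
            if later_dest == dest:
                break  # value overwritten: later reads see the new producer
    return found
```

Whether a given pair actually stalls the pipeline depends on the distance between the two instructions and on the forwarding hardware, which this sketch leaves out.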

ARM7 Pipeline Model (cont’d)
CISC behavior of ARM7
  Many ARM instructions are complex
  An instruction consists of one or more micro-operations
  Execution time is not equally distributed among instructions
ARM7 pipeline is unbalanced
  The execute stage of ARM7 is the bottleneck
  Shift, ALU and ICache access are done in a single stage
  ARM9 expanded EXE to EXE-MEM (thus IF-ID-EXE-MEM-WB)
Instructions that take more than one EXE cycle
  All multiply instructions (due to complexity)
  All instructions that read 3 register values
  All LOAD/STORE instructions
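The effect of multi-cycle execute stages on a 3-stage pipeline can be estimated with a very simple model: two cycles to fill FETCH and DECODE, then each instruction occupies EXECUTE for its own number of cycles while the earlier stages stall behind it. This is a sketch under strong assumptions (straight-line code, single-cycle fetch and decode, no branches), the per-instruction cycle counts in the example are hypothetical, and the name `total_cycles` is an assumption.

```python
def total_cycles(execute_cycles):
    """Cycles to run a straight-line sequence on a 3-stage pipeline.

    execute_cycles[i] is how many cycles instruction i spends in EXECUTE.
    """
    if not execute_cycles:
        return 0
    fill = 2  # FETCH and DECODE of the first instruction
    return fill + sum(execute_cycles)


# Four single-cycle instructions: 2 fill cycles + 4 execute cycles.
ideal = total_cycles([1, 1, 1, 1])
# A hypothetical 3-cycle instruction in the middle adds two stall cycles.
with_stall = total_cycles([1, 3, 1, 1])
```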

ARM7 Datapath Overview
[Datapath diagram: FETCH, DECODE, EXECUTE, (WB)]
*Pipeline registers are omitted

ARM7 Clock Scheme
ARM7 clock phases
  ARM7 generates 2 non-overlapping internal clock phases
  Some data blocks operate during phase 1 or phase 2 only
  E.g. shifter (phase 1), ALU (phase 2)

ARM7 IF Stage
Instruction fetch stage diagram (example)
[Diagram labels: next instruction address (to ICache); exception; ALU; address mux. + reg.; +2/4 increment; incrementer bus]
*The PC value stored in R15 should always be the address of the instruction in EXE plus 8

ARM7 ID Stage
Instruction decode stage diagram (example)
[Diagram labels: OP code; PSR (PSR read, PSR write, PSR out); GPR (GPR read1, GPR read2, GPR write, read1 data, read2 data); data bus A; data bus B]
Read port: sampled at start of phase 1
Write port: sampled at start of phase 2
*PC port is omitted for simplicity

ARM7 Execution Stage Execute Stage Diagram (example)