ARM Processor Architecture



Outline
- ARM processor core
- Memory hierarchy

ARM Processor Core

ARM7TDMI Processor Core
- Current low-end ARM core for applications such as digital mobile phones
- TDMI stands for:
  - T: Thumb, the 16-bit compressed instruction set
  - D: on-chip Debug support, enabling the processor to halt in response to a debug request
  - M: enhanced Multiplier, which yields a full 64-bit result with high performance
  - I: EmbeddedICE hardware
- Von Neumann architecture
- 3-stage pipeline
- CPI ~ 1.9

ARM7TDMI Block Diagram

ARM7TDMI Core Diagram

ARM7TDMI Interface Signals (1/4)

ARM7TDMI Interface Signals (2/4)
(A leading backslash denotes an active-low signal.)
- Clock control
  - All state changes within the processor are controlled by mclk, the memory clock
  - Internal clock = mclk AND \wait
  - The eclk clock output reflects the clock used by the core
- Memory interface
  - 32-bit address A[31:0]
  - Bidirectional data bus D[31:0], with separate data out Dout[31:0] and data in Din[31:0]
  - seq indicates that the memory address will be sequential to that used in the previous cycle

ARM7TDMI Interface Signals (3/4)
- lock: indicates that the processor should keep the bus to ensure the atomicity of the read and write phases of a swap (SWP) instruction
- \r/w: read or write
- mas[1:0]: encodes the memory access size (byte, half-word or word)
- bl[3:0]: externally controlled enables on the latches for each of the 4 bytes on the data input bus
- MMU interface
  - \trans (translation control): 0 = user mode, 1 = privileged mode
  - \mode[4:0]: bottom 5 bits of the CPSR (inverted)
  - abort: disallows the access
- State
  - Tbit: indicates whether the processor is currently executing ARM or Thumb instructions
- Configuration
  - bigend: selects big-endian or little-endian operation

ARM7TDMI Interface Signals (4/4)
- Interrupt
  - \fiq: fast interrupt request, higher priority
  - \irq: normal interrupt request
  - isync: allows the interrupt synchronizer to be bypassed
- Initialization
  - \reset: starts the processor from a known state, executing from address 0x00000000
[Table: ARM7TDMI characteristics]

Memory Access
- The ARM7 is a Von Neumann, load/store architecture, i.e.:
  - A single 32-bit data bus carries both instructions and data
  - Only the load/store instructions (and SWP) access memory
- Memory is addressed as a 32-bit address space
- Data can be 8-bit bytes, 16-bit half-words or 32-bit words; memory may be seen as a byte line folded into 4-byte words
- Words must be aligned to 4-byte boundaries, and half-words to 2-byte boundaries
- Always ensure that the memory controller supports all three access sizes
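The alignment rule and the "byte line folded into 4-byte words" picture can be sketched in a few lines of Python. This is only an illustration: the names ACCESS_SIZES, is_aligned and word_and_byte_offset are hypothetical and do not come from the slides.

```python
# Access sizes in bytes, as listed on the slide: byte, half-word, word.
ACCESS_SIZES = {"byte": 1, "half-word": 2, "word": 4}


def is_aligned(address, kind):
    """Return True if `address` is a multiple of the access size for `kind`."""
    return address % ACCESS_SIZES[kind] == 0


def word_and_byte_offset(address):
    """Fold the byte line into 4-byte words: (word index, byte offset in word)."""
    return address // 4, address % 4
```

For example, address 0x1002 is a legal half-word address but not a legal word address, and byte 0x1007 is byte 3 of word 0x401.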

ARM Memory Interface
- Sequential (S cycle): (nMREQ, SEQ) = (0, 1)
  - The ARM core requests a transfer to or from an address which is either the same as, or one word or one half-word greater than, the preceding address.
- Non-sequential (N cycle): (nMREQ, SEQ) = (0, 0)
  - The ARM core requests a transfer to or from an address which is unrelated to the address used in the preceding cycle.
- Internal (I cycle): (nMREQ, SEQ) = (1, 0)
  - The ARM core does not require a transfer, as it is performing an internal function, and no useful prefetching can be performed at the same time.
- Coprocessor register transfer (C cycle): (nMREQ, SEQ) = (1, 1)
  - The ARM core wishes to use the data bus to communicate with a coprocessor, but does not require any action by the memory system.
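The four cycle types amount to a two-bit lookup, which a memory controller model might decode as below. This is a minimal sketch: the function names are hypothetical, and classify_accesses makes two simplifying assumptions that the slides do not state (the first access is treated as non-sequential, and an address step of 0, 2 or 4 bytes counts as sequential regardless of the access size).

```python
# (nMREQ, SEQ) -> cycle type, taken from the list above.
CYCLE_TYPES = {
    (0, 1): "S",  # sequential
    (0, 0): "N",  # non-sequential
    (1, 0): "I",  # internal
    (1, 1): "C",  # coprocessor register transfer
}


def cycle_type(nmreq, seq):
    """Decode the two control signals into a cycle type letter."""
    return CYCLE_TYPES[(nmreq, seq)]


def classify_accesses(addresses):
    """Label each memory access in a trace as "S" or "N".

    Assumption: an access is sequential when its address is the same as,
    or 2 or 4 bytes greater than, the previous address.
    """
    labels = []
    previous = None
    for address in addresses:
        if previous is not None and address - previous in (0, 2, 4):
            labels.append("S")
        else:
            labels.append("N")
        previous = address
    return labels
```

A trace such as 0x100, 0x104, 0x108, 0x200 starts with an N cycle, continues with two S cycles, and returns to an N cycle at the jump to 0x200.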

Cached ARM7TDMI Macrocells
[Figure: block diagram with labels ARM core, CP15, EmbeddedICE & JTAG, MMU, instruction & data cache, write buffer, AMBA interface; virtual and physical address paths; instruction & data paths]
- ARM710T
  - 8K unified write-through cache
  - Full memory management unit supporting virtual memory
  - Write buffer
- ARM720T
  - As ARM710T but with WinCE support
- ARM740T
  - 8K unified write-through cache
  - Memory protection unit
  - Write buffer

Processor Core vs. CPU Core
- Processor core
  - The engine that fetches instructions and executes them
  - E.g.: ARM7TDMI, ARM9TDMI, ARM9E-S
- CPU core
  - Consists of the ARM processor core and some tightly coupled function blocks
  - Cache and memory management blocks
  - E.g.: ARM710T, ARM720T, ARM740T, ARM920T, ARM922T, ARM940T, ARM946E-S, and ARM966E-S
[Figure: ARM710T]

Memory Hierarchy

Memory Size and Speed
[Figure: memory hierarchy; axes: access time, capacity, cost]
- Registers (small, fast, expensive)
- On-chip cache memory
- 2nd-level off-chip cache
- Main memory
- Hard disk (large, slow, cheap)

Caches (1/2)
- A cache memory is a small, very fast memory that retains copies of recently used memory values.
- It is usually implemented on the same chip as the processor.
- Caches work because programs normally display the property of locality, which means that at any particular time they tend to execute the same instructions many times on the same areas of data.
- An access to an item which is in the cache is called a hit, and an access to an item which is not in the cache is a miss.
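The effect of locality on the hit rate can be illustrated with a toy cache model. The direct-mapped organization, the 4-line by 16-byte geometry and the class name DirectMappedCache are assumptions chosen for brevity; the slides do not specify a cache organization.

```python
class DirectMappedCache:
    """Toy direct-mapped cache that only tracks tags, hits and misses."""

    def __init__(self, num_lines=4, line_bytes=16):
        self.num_lines = num_lines
        self.line_bytes = line_bytes
        self.tags = [None] * num_lines  # one tag per line; None means empty
        self.hits = 0
        self.misses = 0

    def access(self, address):
        """Look up `address`; return True on a hit, False on a miss."""
        block = address // self.line_bytes
        index = block % self.num_lines
        tag = block // self.num_lines
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.tags[index] = tag  # fill the line on a miss
        self.misses += 1
        return False


# A loop that re-reads the same eight words ten times.
cache = DirectMappedCache()
for _ in range(10):
    for address in range(0, 32, 4):
        cache.access(address)
```

Only the first touch of each 16-byte line misses, so this loop produces 2 misses and 78 hits out of 80 accesses.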

Caches (2/2)
- A processor can have one of the following two organizations:
  - A unified cache
    - This is a single cache for both instructions and data
  - Separate instruction and data caches
    - This organization is sometimes called a modified Harvard architecture

Unified instruction and data cache

Separate data and instruction caches

Cache Write Strategies
- Write-through
  - All write operations are passed to main memory
- Write-through with buffered write
  - All write operations are still passed to main memory and the cache is updated as appropriate, but instead of slowing the processor down to main memory speed, the write address and data are stored in a write buffer which can accept the write information at high speed
- Copy-back (write-back)
  - The cache is not kept coherent with main memory: writes update only the cache, and main memory is updated later when the modified line is written back
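The difference in main-memory traffic between write-through and copy-back can be sketched by counting memory writes for a short trace. This is a simplified model under stated assumptions: the 16-byte line size is arbitrary, the cache is assumed large enough that no line is evicted before a final flush, and the function names are hypothetical. Buffered write-through generates the same memory traffic as plain write-through; it differs only in how long the processor waits.

```python
LINE_BYTES = 16  # assumed cache line size, for illustration only


def memory_writes_write_through(write_addresses):
    """Write-through: every processor write is also a main-memory write."""
    return len(write_addresses)


def memory_writes_copy_back(write_addresses):
    """Copy-back: each modified line is written to memory once, at flush time.

    Assumes no line is evicted before the final flush.
    """
    dirty_lines = {address // LINE_BYTES for address in write_addresses}
    return len(dirty_lines)


# Five writes to one word, one to its neighbour, one to a distant address.
writes = [0x100] * 5 + [0x104, 0x200]
```

For this trace, write-through performs 7 memory writes while copy-back performs 2, one per modified line.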

Summary (1/2)
ARM Processor Family
Processor family   # of pipeline stages   Memory organization   Clock rate   MIPS/MHz
ARM6               3                      Von Neumann           25 MHz
ARM7               3                      Von Neumann           66 MHz       0.9
ARM8               5                      Von Neumann           72 MHz       1.2
ARM9               5                      Harvard               200 MHz      1.1
ARM10              6                      Harvard               400 MHz      1.25
StrongARM          5                      Harvard               233 MHz      1.15
ARM11              8                      Von Neumann/          550 MHz
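Multiplying the clock rate by the MIPS/MHz figure gives a rough throughput estimate for each family. The sketch below uses only the rows for which both numbers appear in the summary; the names CORES and estimated_mips are hypothetical, and the result is a back-of-the-envelope figure rather than a benchmark.

```python
# (clock rate in MHz, MIPS per MHz) for rows where both values are given.
CORES = {
    "ARM7": (66, 0.9),
    "ARM8": (72, 1.2),
    "ARM9": (200, 1.1),
    "ARM10": (400, 1.25),
    "StrongARM": (233, 1.15),
}


def estimated_mips(clock_mhz, mips_per_mhz):
    """Estimated throughput in MIPS = clock rate (MHz) x MIPS/MHz."""
    return clock_mhz * mips_per_mhz


for name, (clock_mhz, mips_per_mhz) in CORES.items():
    print(f"{name}: about {estimated_mips(clock_mhz, mips_per_mhz):.0f} MIPS")
```

On these figures the ARM10 at 400 MHz comes out at about 500 MIPS, roughly eight times the ARM7 at 66 MHz.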

Summary (2/2)
- Memory hierarchy
  - Unified cache vs. separate instruction and data caches
  - Write-through with buffered write