Mobile Handset Microprocessor
Outline
- Terms
- ISA Basics
- ARM
- Comparison between ARM and x86
Terms
- ARM: Advanced RISC Machine
- CISC: Complex Instruction Set Computer
- ISA: Instruction Set Architecture
- RISC: Reduced Instruction Set Computer
- x86: a generic name given to Intel processors beginning with the 8086 processor released in 1978
ARM and x86
- The microprocessor (or CPU, processor) is the brain of a computing device
- Two major types of microprocessors
  - Intel x86-based processors dominate the desktop market
  - ARM-based processors dominate the mobile handset market
- Before introducing the ARM architecture, we need to review some ISA basics, which help distinguish ARM from other architectures
Instruction Set Architecture (ISA)
- A set of computer hardware instructions, for example
  - Arithmetic operations: ADD r0, r1, r2 ; r0 = r1 + r2
  - Register movement: MOV r0, r2 ; r0 = r2
- An interface that allows easy communication between high-level software and low-level hardware
More about ISA
- It defines the set of instructions, their binary formats and their operation specifications
- It has a significant impact on a microprocessor’s performance, cost and complexity
- ISAs are commonly grouped into two categories
  - Complex Instruction Set Computer (CISC) architecture
  - Reduced Instruction Set Computer (RISC) architecture
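To make "binary formats" concrete, here is a minimal sketch that packs two of the example instructions into 32-bit words. It assumes the classic 32-bit ARM (A32) data-processing layout with a plain register as the second operand, no shift, and the "always" condition; the function name `encode_data_processing` is made up for this illustration.

```python
# Sketch: pack a register-operand ARM (A32) data-processing instruction
# into its 32-bit binary format. Assumes no shift and no flag update (S = 0).

OPCODES = {"AND": 0b0000, "ADD": 0b0100, "MOV": 0b1101}
COND_ALWAYS = 0b1110  # condition field meaning "always execute"


def encode_data_processing(mnemonic, rd, rn, rm):
    """Return the 32-bit word for e.g. ADD rd, rn, rm."""
    word = COND_ALWAYS << 28          # bits 31-28: condition
    word |= OPCODES[mnemonic] << 21   # bits 24-21: operation code
    word |= rn << 16                  # bits 19-16: first source register
    word |= rd << 12                  # bits 15-12: destination register
    word |= rm                        # bits 3-0: second source register
    return word


add_word = encode_data_processing("ADD", rd=0, rn=1, rm=2)  # ADD r0, r1, r2
mov_word = encode_data_processing("MOV", rd=0, rn=0, rm=2)  # MOV r0, r2 (rn unused)

print(f"ADD r0, r1, r2 -> 0x{add_word:08X}")
print(f"MOV r0, r2     -> 0x{mov_word:08X}")
```

Both instructions occupy exactly one 32-bit word, and the register numbers sit at fixed bit positions. That regularity is what keeps the decode hardware simple.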
CISC Overview
- It contains a large set of instructions that range from very simple to very complex and specialized
- Each instruction might perform a series of operations inside the processor
- It makes chips easier to program and uses memory more efficiently
- Typical example of a CISC chip: Intel x86 processors
Why CISC?
- Memory in the early days was slow and expensive
  - Bigger program -> more memory space -> higher cost and slower execution
- It shifts the burden of generating machine code from the compiler to the processor
  - For example, instead of having the compiler generate a long sequence of machine code to calculate a square root, a CISC processor would have a built-in ability to do so
CISC Disadvantages
- As computer technology develops, new and more complex instructions are constantly added
- More complex instructions make the hardware more complex
- About 20% of the instructions are used repeatedly; the remaining 80% are used infrequently
RISC Overview
- It came out after the CISC architecture
- It defines a small, simple and highly optimized set of instructions, rather than the more specialized instructions often found in CISC
- It does not simply reduce the size of the instruction set; the amount of work each instruction accomplishes is also reduced
RISC History
- 1975: The first RISC project started at IBM
- 1980: The original RISC prototype computer (801 Minicomputer) was built at IBM
- 1981: The MIPS architecture grew out of a graduate course at Stanford University
- 1982: The Berkeley RISC project delivered RISC-I and RISC-II
- 1990s: RISC-based commercial products started to flourish and became prevalent
RISC Attributes
- The instruction set contains simple, basic instructions
- The size of the instruction set is reduced
- Each instruction has the same length
- Most instructions complete in one machine cycle
- It uses pipelining, which allows the processor to work on several instructions at the same time
CISC vs RISC
| Feature | CISC | RISC |
|---|---|---|
| Philosophy | Emphasis on hardware | Emphasis on software |
| Number of cycles per instruction | Multiple | Single |
| Code size | Small | Large |
| Power | Many watts | Several milliwatts |
| Computing speed | Faster | Slower |
| Cost | Expensive | Cheaper |
| Temperature | Needs a fan | Lower |
ARM: Advanced RISC Machines
- ARM is a family of instruction set architectures for computer processors based on the RISC architecture
- ARM is developed by the British company ARM Holdings
- 95% of the world’s smartphones use ARM-based processors
ARM Holdings
- ARM Holdings develops the instruction set and the ARM architecture, but does not manufacture ARM products
- It periodically releases updates to its designs
- ARM Holdings licenses chip designs to third parties, who design their own products
ARM Chip Manufacturers
- Apple
- AppliedMicro
- Samsung
- Atmel
- Texas Instruments
- Qualcomm
- Nvidia
- NXP
- Broadcom
- Cypress
- Freescale
- STMicroelectronics
ARM Chip Manufacturers (continued)
- These manufacturers implement the licensed architectures by putting the processor core inside their chipsets, in combination with whatever GPUs, memory, interfaces, radios and other components they choose
- That is why two chipsets from different companies can both appear to contain the same processor
ARM Business Model
- ARM designs technology for energy-efficient chips
- ARM licenses the technology to a semiconductor partner
  - ARM gets a license fee, typically several million dollars for each design
- The partner develops chips using ARM’s designs
  - ARM receives a royalty on each chip sold, based on a percentage of the chip price
- The device manufacturer builds consumer products
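The two income streams on this slide (a one-off license fee plus a per-chip royalty) can be written as simple arithmetic. The sketch below is illustrative only: the fee, royalty rate, chip price and volume are hypothetical numbers chosen for the example, not figures from ARM.

```python
def arm_revenue(license_fee, royalty_rate, chip_price, chips_sold):
    """Income from one licensed design: up-front fee plus per-chip royalty."""
    royalty_per_chip = royalty_rate * chip_price
    return license_fee + royalty_per_chip * chips_sold


# Hypothetical inputs, for illustration only
revenue = arm_revenue(
    license_fee=5_000_000,    # one-off fee for the design, in dollars
    royalty_rate=0.02,        # royalty as a fraction of the chip price
    chip_price=10.0,          # dollars per chip
    chips_sold=100_000_000,   # units shipped by the partner
)
print(f"Total revenue from this design: ${revenue:,.0f}")
```

With volumes of this size, the royalty term dominates the license fee. That is why the model rewards designs that ship in very large numbers.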
ARM Version Releases
| Version | Release Year | Features | Implementations | Typical Applications |
|---|---|---|---|---|
| ARMv1 | 1985 | First commercial product | ARM1 | BBC Micro computers |
| ARMv2 | 1987 | Coprocessor support | ARM2, ARM3 | Acorn Archimedes computers |
| ARMv3 | 1992 | 32-bit, 25 MHz clock | ARM6, ARM7 | Zarlink GPS receiver, Acorn Risc PC 700 |
| ARMv4 | 1996 | Thumb support | ARM7TDMI, ARM8, ARM9TDMI, StrongARM | Nintendo DS, Garmin navigation devices, HP Jornada 7xx |
| ARMv5 | 1999 | DSP, Jazelle extensions | ARM10, XScale | Samsung SGH-i780, BlackBerry 8700, HTC Universal |
| ARMv6 | 2001 | SIMD, TrustZone, multiprocessing | ARM11, ARM11MP | Raspberry Pi, Samsung I5700, iPhone 3G |
| ARMv7 | 2004 | Floating point | Cortex-A series, Cortex-R series, Cortex-M3, Cortex-M4 | iPhone 3GS/4/4S/5/5C, Google Nexus S, Apple iPad, HTC Desire, Samsung Galaxy S2/S3 |
| ARMv8 | 2011 | 64-bit | Cortex-A53, Cortex-A57 | Samsung Galaxy Note 4, iPhone 5S/6/6 Plus |
A Generic ARM-based Design
[Block diagram showing an ARM core, 32-bit RAM, 16-bit RAM, 8-bit ROM, an interrupt controller, peripherals and I/O]
ARM Instruction Set
- Three instruction types
  - Data processing
  - Data transfer
  - Control flow
Data Processing Instructions
- Arithmetic operations: ADD r0, r1, r2 ; r0 = r1 + r2
- Logical operations: AND r0, r1, r2 ; r0 = r1 AND r2
- Register movement: MOV r0, r2 ; r0 = r2
- Comparison: CMP r1, r2 ; set condition flags from r1 - r2
- Multiplication: MUL r4, r3, r2 ; r4 = r3 * r2
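The effect of these five instructions on the register file can be sketched with a toy model. This is not an ARM emulator: the `execute` function, the tuple format and the starting register values are assumptions made for the illustration, and only the Z and N flags are modelled for CMP.

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide


def execute(instr, regs, flags):
    """Apply one data-processing instruction to the register file."""
    op, *args = instr
    if op == "ADD":
        rd, rn, rm = args
        regs[rd] = (regs[rn] + regs[rm]) & MASK32
    elif op == "AND":
        rd, rn, rm = args
        regs[rd] = regs[rn] & regs[rm]
    elif op == "MUL":
        rd, rn, rm = args
        regs[rd] = (regs[rn] * regs[rm]) & MASK32
    elif op == "MOV":
        rd, rm = args
        regs[rd] = regs[rm]
    elif op == "CMP":
        rn, rm = args
        diff = (regs[rn] - regs[rm]) & MASK32   # result is discarded, only flags kept
        flags["Z"] = diff == 0                  # set when the operands are equal
        flags["N"] = bool(diff >> 31)           # set when the result is negative
    else:
        raise ValueError(f"unknown instruction: {op}")


regs = {f"r{i}": 0 for i in range(5)}
regs["r1"], regs["r2"], regs["r3"] = 6, 3, 5    # hypothetical starting values
flags = {"Z": False, "N": False}

program = [
    ("ADD", "r0", "r1", "r2"),   # r0 = 6 + 3
    ("AND", "r0", "r1", "r2"),   # r0 = 6 AND 3
    ("MOV", "r0", "r2"),         # r0 = r2
    ("CMP", "r1", "r2"),         # compare 6 with 3
    ("MUL", "r4", "r3", "r2"),   # r4 = 5 * 3
]
for instr in program:
    execute(instr, regs, flags)

print(regs, flags)
```

Every instruction reads and writes registers only; none of them touches memory. That separation is the load/store style that the data transfer instructions complete.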
Data Transfer Instructions
- Move data between ARM registers and memory
- Load instruction: LDR r0, [r1] ; r0 = memory[r1]
- Store instruction: STR r0, [r1] ; memory[r1] = r0
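A minimal sketch of the load and store semantics, using a dictionary as a stand-in for memory. The addresses and values are hypothetical, and the helper names `ldr` and `str_` exist only for this example.

```python
# Toy memory: maps an address to the word stored there (hypothetical contents)
memory = {0x100: 42, 0x104: 0}
regs = {"r0": 0, "r1": 0x100}


def ldr(rd, rn):
    """LDR rd, [rn]: load the word at the address held in rn into rd."""
    regs[rd] = memory[regs[rn]]


def str_(rd, rn):
    """STR rd, [rn]: store the value of rd at the address held in rn."""
    memory[regs[rn]] = regs[rd]


ldr("r0", "r1")        # r0 = memory[0x100]
regs["r0"] += 1        # work on the value while it is in a register
regs["r1"] = 0x104     # point r1 at a different address
str_("r0", "r1")       # memory[0x104] = r0

print(regs["r0"], memory)
```

The register inside the brackets holds an address, not data. The same LDR or STR instruction can therefore reach any location just by changing r1.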
Control Flow Instructions
- Determine which instructions get executed next
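One way to picture control flow is a program counter that normally advances by one instruction, unless a branch overwrites it. The toy interpreter below is a sketch under that assumption. B and BNE are real ARM branch mnemonics, but the tuple format, the use of list indices as branch targets and the small loop program are invented for the example.

```python
def run(program, regs):
    """Execute a list of instruction tuples; return how many were executed."""
    pc = 0          # program counter: index of the next instruction
    zero = False    # Z flag, set by CMP when its operands are equal
    executed = 0

    def value(operand):
        # Operands are register names (str) or immediate integers
        return regs[operand] if isinstance(operand, str) else operand

    while pc < len(program):
        op, *args = program[pc]
        pc += 1                      # default: fall through to the next instruction
        executed += 1
        if op == "ADD":
            regs[args[0]] = value(args[1]) + value(args[2])
        elif op == "SUB":
            regs[args[0]] = value(args[1]) - value(args[2])
        elif op == "CMP":
            zero = value(args[0]) == value(args[1])
        elif op == "BNE":            # branch only if the last CMP found "not equal"
            if not zero:
                pc = args[0]
        elif op == "B":              # unconditional branch
            pc = args[0]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return executed


# Sum 3 + 2 + 1 by looping until the counter in r1 reaches zero
regs = {"r0": 0, "r1": 3}
program = [
    ("ADD", "r0", "r0", "r1"),   # 0: r0 = r0 + r1
    ("SUB", "r1", "r1", 1),      # 1: r1 = r1 - 1
    ("CMP", "r1", 0),            # 2: compare the counter with zero
    ("BNE", 0),                  # 3: loop back to instruction 0 while r1 != 0
]
executed = run(program, regs)
print(regs, executed)
```

The program is only four instructions long, yet twelve instructions execute. The conditional branch is what turns a straight-line listing into a loop.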
3-Stage Pipeline
- Each instruction’s processing can be divided into three stages: fetch, decode and execute
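The benefit of splitting work into three stages shows up when the stages overlap across instructions. The sketch below builds the cycle-by-cycle occupancy of an idealized pipeline. It assumes every stage takes one cycle and that there are no stalls or branches, and the instruction names in `program` are just labels.

```python
STAGES = ["fetch", "decode", "execute"]


def pipeline_schedule(instructions):
    """Return, for each clock cycle, which instruction occupies each stage."""
    total_cycles = len(instructions) + len(STAGES) - 1
    schedule = []
    for cycle in range(total_cycles):
        row = {}
        for depth, stage in enumerate(STAGES):
            idx = cycle - depth   # the instruction that entered `depth` cycles ago
            row[stage] = instructions[idx] if 0 <= idx < len(instructions) else None
        schedule.append(row)
    return schedule


program = ["ADD", "MOV", "CMP", "MUL"]
schedule = pipeline_schedule(program)
for cycle, row in enumerate(schedule, start=1):
    print(f"cycle {cycle}: {row}")

unpipelined_cycles = len(program) * len(STAGES)
print(f"pipelined: {len(schedule)} cycles, unpipelined: {unpipelined_cycles} cycles")
```

Under these assumptions, n instructions finish in n + 2 cycles instead of 3n. Once the pipeline is full, one instruction completes every cycle.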
Why ARM Architecture
- ARM processors significantly reduce cost, heat and power use compared with x86 processors
- Such reductions are desirable for portable, battery-powered devices such as smartphones
- E.g., the ARM7100 consumes 72 mW when operating at 14 MIPS, while an Intel Atom (x86) consumes ~1 W
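The ARM7100 figures can be turned into a per-instruction energy cost with one division. The sketch uses only the two numbers quoted on the slide. No matching MIPS figure is given for the Atom, so the same calculation is not attempted for it.

```python
# Figures quoted on the slide for the ARM7100
power_watts = 72e-3                  # 72 mW
instructions_per_second = 14e6       # 14 MIPS

# Power divided by throughput gives energy per instruction
energy_per_instruction_nj = power_watts / instructions_per_second * 1e9
milliwatts_per_mips = (power_watts * 1e3) / (instructions_per_second / 1e6)

print(f"{energy_per_instruction_nj:.2f} nJ per instruction")
print(f"{milliwatts_per_mips:.2f} mW per MIPS")
```

Energy per instruction is a fairer basis for comparing chips than raw power, because it accounts for how much work is done while the power is being drawn.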
What Makes ARM-based Chips Power Efficient
- Slower speed
  - They use lower-speed transistors, which require lower voltage and so reduce power consumption
- Smaller scale
  - Fewer transistors are used because ARM is a RISC architecture: operations are processed in small, simple chunks, at the expense of more machine code
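The link between lower voltage, lower speed and lower power can be illustrated with the usual first-order model for switching power in CMOS logic, P = a * C * V^2 * f. This model is a textbook approximation brought in for the example and does not come from the slides. The capacitance, voltages and frequencies below are hypothetical.

```python
def dynamic_power(capacitance_f, voltage_v, frequency_hz, activity=1.0):
    """First-order CMOS switching power: activity * C * V^2 * f (in watts)."""
    return activity * capacitance_f * voltage_v ** 2 * frequency_hz


# Hypothetical operating points for the same chip
fast = dynamic_power(capacitance_f=1e-9, voltage_v=1.2, frequency_hz=1.0e9)
slow = dynamic_power(capacitance_f=1e-9, voltage_v=0.9, frequency_hz=0.5e9)

ratio = slow / fast
print(f"fast: {fast:.3f} W, slow: {slow:.3f} W, ratio: {ratio:.3f}")
```

Because voltage enters squared, halving the clock and trimming the voltage by a quarter cuts power to well under a third in this model. Running slower pays off more than linearly.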
What Makes ARM-based Chips Power Efficient (continued)
- Sleep mode
  - Some modern ARM processors save power by going into sleep mode until they receive instructions to do something
  - x86 processors have relied mainly on reducing the core frequency so that they can run at a lower voltage
- Simple instructions
  - The instruction set stays simple and minimal
  - Extension is done through co-processors
ARM Extensions
- DSP (Digital Signal Processing) enhancement
  - Improves the architecture for DSP and multimedia use
  - Includes variations on instructions such as signed multiply-accumulate, saturated add and subtract, and count leading zeros
- Java support
  - Allows Java bytecode to be executed directly on the ARM architecture
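The three DSP-style operations named above are easy to describe in ordinary code, which also shows why dedicated instructions for them are useful. The functions below model the arithmetic only, assuming 32-bit values. They are illustrative helpers, not the ARM instructions themselves, and the multiply-accumulate ignores accumulator overflow.

```python
INT32_MAX = 2**31 - 1
INT32_MIN = -(2**31)


def saturated_add(a, b):
    """Signed 32-bit add that clamps at the limits instead of wrapping around."""
    return max(INT32_MIN, min(INT32_MAX, a + b))


def count_leading_zeros(x):
    """Number of zero bits above the highest set bit of an unsigned 32-bit value."""
    return 32 - x.bit_length()


def multiply_accumulate(acc, a, b):
    """acc + a * b, the basic step of filters and dot products."""
    return acc + a * b


# A dot product is a chain of multiply-accumulate steps
samples = [1, -2, 3]
weights = [4, 5, 6]
acc = 0
for s, w in zip(samples, weights):
    acc = multiply_accumulate(acc, s, w)

print(saturated_add(INT32_MAX, 1), count_leading_zeros(1), acc)
```

Saturation matters in audio and video code, where an overflow that wraps from the largest positive value to a large negative one is far more noticeable than a result that is merely clipped.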
ARM Extensions (continued)
- Multimedia extension
  - Adds a 64/128-bit instruction set that provides standardized acceleration for media and signal processing applications
- Security extensions
  - Called TrustZone technology
  - Provide two virtual processors with hardware-based access control, instead of adding another dedicated security core
ARM and x86 Instructions
- An example in simplified pseudo-assembly: multiplying two numbers in memory
  - Data 1 is in location 2:3 and data 2 is in location 5:2; the result is stored back in location 2:3
- Intel x86:
  MULT 2:3, 5:2
- ARM:
  LDR A, 2:3
  LDR B, 5:2
  MUL A, B
  STR 2:3, A
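The two instruction sequences can be traced side by side in a small simulation. It follows the slide's simplified pseudo-assembly rather than real x86 or ARM syntax. The memory contents (6 and 7) and the function names `run_cisc` and `run_risc` are assumptions made for the example.

```python
def run_cisc(memory):
    """MULT 2:3, 5:2 : one instruction loads both operands, multiplies and stores."""
    memory["2:3"] = memory["2:3"] * memory["5:2"]
    return 1  # instructions executed


def run_risc(memory):
    """The load/store sequence: each instruction does one small step."""
    regs = {}
    regs["A"] = memory["2:3"]            # LDR A, 2:3
    regs["B"] = memory["5:2"]            # LDR B, 5:2
    regs["A"] = regs["A"] * regs["B"]    # MUL A, B
    memory["2:3"] = regs["A"]            # STR 2:3, A
    return 4, regs                       # instructions executed, final registers


cisc_memory = {"2:3": 6, "5:2": 7}       # hypothetical operands
risc_memory = {"2:3": 6, "5:2": 7}

cisc_count = run_cisc(cisc_memory)
risc_count, risc_regs = run_risc(risc_memory)

print(cisc_memory["2:3"], cisc_count)
print(risc_memory["2:3"], risc_count, risc_regs)
```

Both versions leave the same product in location 2:3. The load/store version needs four instructions instead of one, but afterwards operand B is still sitting in a register and can be reused without another memory access.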
x86 vs ARM
| Feature | x86 | ARM |
|---|---|---|
| Operand reuse | Must reload the operand into a register because the registers are automatically erased after computation | The operand can be reused because it remains in the register until another value is loaded in its place |
| Code length | Relatively short | More lines of code |
| RAM usage | Little RAM is required | More RAM is used |
| Hardware | Requires more transistors, more hardware space and more power | Requires fewer transistors, less hardware space and less power |
| Focus | Emphasis on speed and performance | Emphasis mainly on power consumption |
| Compatibility | Compatible with most operating systems, such as Windows, Linux and Android | Mainly mobile and embedded operating systems, such as Android, iOS and Linux |
| Products | Laptops, desktops and servers | Smartphones, tablets |