ARM Processor Operation on Windows CE. 김대홍, Software Team Lead, ㈜씨랩시스. Agenda: Introduction to ARM; Features of each ARM version; ARM on Windows CE; ARM directory structure in Windows CE 5.0; Support for specific ARM instructions on Windows CE.

Presentation transcript:

ARM Processor Operation on Windows CE. 김대홍, Software Team Lead, ㈜씨랩시스

Agenda
- Introduction to ARM
- Features of each ARM version
- ARM on Windows CE
- ARM directory structure in Windows CE 5.0
- Support for specific ARM instructions on Windows CE

Who is ARM?
- Founded in November 1990, spun out of Acorn Computers
- Designs the ARM RISC processor cores
- Licenses ARM core designs to semiconductor partners, who fabricate the chips and sell them to their customers; ARM does not fabricate silicon itself
- Develops technologies to assist with the design-in of the ARM architecture: software tools, boards, debug hardware, application software, bus architectures, peripherals, etc.

The ARM® Connected Community

Segment Converging on Consumer

Licensing Status

ARM Instruction
- Instruction code width: 32-bit (ARM), 16-bit (Thumb), 8-bit (Java)
- DSP extension instructions
- Thumb-2 (ARM1156)
- TrustZone (ARM1176)
- Instruction types: data processing instructions, data transfer instructions, control flow

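Thumb
- Thumb is the instruction set that compresses 32-bit ARM instructions into 16-bit instructions
- A program built from Thumb instructions is about 70% of the size of the same program built from ARM instructions
- On 16-bit memory, Thumb code runs at about 130% of the speed of ARM code
- A Thumb system consumes less power than an ARM system
- A Thumb system costs less than an ARM system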

Jazelle
- Jazelle-enabled ARM cores execute 8-bit Java bytecode
- 95% of bytecodes executed in hardware (typical)
- Normal JVM: 1.0 CaffeineMarks/MHz; ARM9EJ: 5.5 CaffeineMarks/MHz
- Significantly more power-efficient
- < 12K extra gates (ARM9EJ-S vs ARM9E-S)

Agenda
- Introduction to ARM
- Features of each ARM version
- ARM on Windows CE
- ARM directory structure in Windows CE 5.0
- Support for specific ARM instructions on Windows CE

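Features by ARM family
| ARM7 | ARM9 | ARM10 | ARM11
Pipeline depth | 3 stages | 5 stages | 6 stages | 8 stages
Speed | | | |
mW/MHz | 0.06 mW/MHz | 0.19 mW/MHz (+ cache) | 0.5 mW/MHz (+ cache) | 0.4 mW/MHz (+ cache)
MIPS/MHz | | | |
Architecture | Von Neumann | Harvard | Harvard | Harvard
Multiplier | 8x32 | 8x32 | 16x32 | 16x32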

ARM nomenclature: ARM{x}{y}{z}{T}{D}{M}{I}{E}{J}{F}{-S}
- x: family
- y: MMU/MPU
- z: cache
- T: Thumb 16-bit decoder
- D: JTAG debug
- M: fast multiplier
- I: EmbeddedICE macrocell
- E: DSP extension instructions
- J: Jazelle
- F: VFP unit
- S: synthesizable version
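The naming scheme above can be turned into a small decoder. The following is a minimal sketch, not part of the original slides: the function name `arm_core_features` and the `ARM_FEAT_*` flags are hypothetical, and the parser only looks at the suffix letters that appear literally in the name.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical feature flags, one per suffix letter of the naming scheme. */
enum {
    ARM_FEAT_T = 1 << 0,  /* Thumb 16-bit decoder      */
    ARM_FEAT_D = 1 << 1,  /* JTAG debug                */
    ARM_FEAT_M = 1 << 2,  /* fast multiplier           */
    ARM_FEAT_I = 1 << 3,  /* EmbeddedICE macrocell     */
    ARM_FEAT_E = 1 << 4,  /* DSP extension             */
    ARM_FEAT_J = 1 << 5,  /* Jazelle                   */
    ARM_FEAT_F = 1 << 6,  /* VFP unit                  */
    ARM_FEAT_S = 1 << 7   /* synthesizable ("-S")      */
};

/* Returns a mask of ARM_FEAT_* flags, or -1 if the name does not start with "ARM". */
int arm_core_features(const char *name)
{
    int mask = 0;
    const char *p;

    if (strncmp(name, "ARM", 3) != 0)
        return -1;

    p = name + 3;
    while (isdigit((unsigned char)*p))   /* skip the x, y, z digits */
        p++;

    for (; *p != '\0'; p++) {
        if (*p == '-') {                 /* "-S" marks a synthesizable core */
            if (p[1] == 'S') {
                mask |= ARM_FEAT_S;
                p++;
            }
            continue;
        }
        switch (*p) {
        case 'T': mask |= ARM_FEAT_T; break;
        case 'D': mask |= ARM_FEAT_D; break;
        case 'M': mask |= ARM_FEAT_M; break;
        case 'I': mask |= ARM_FEAT_I; break;
        case 'E': mask |= ARM_FEAT_E; break;
        case 'J': mask |= ARM_FEAT_J; break;
        case 'F': mask |= ARM_FEAT_F; break;
        default:  break;                 /* unknown letters are ignored */
        }
    }
    return mask;
}
```

For example, `arm_core_features("ARM926EJ-S")` reports the E, J and S flags. The sketch deliberately does not infer features that a name implies without spelling out.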

Features by ARM version
Version | Example cores | ISA* enhancements
ARMv4 | StrongARM | Signed/unsigned halfword and byte load-store instructions; System mode added
ARMv4T | ARM7TDMI/ARM9TDMI, ARM720T/ARM920T | Thumb
ARMv5TE | ARM9E-S, ARM966E-S, ARM1020E, XScale | Improved ARM/Thumb interworking; enhanced multiply instructions; DSP instructions; faster multiply-accumulate
ARMv5TEJ | ARM7EJ and ARM926EJ | Java acceleration
ARMv6 | ARM1136EJ-S | SIMD; unaligned data support; multi-processing
*ISA (Instruction Set Architecture)

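Cache policies of ARM cores
Core | Write policy | Replacement policy | Allocation policy
ARM720T | write-through | random | read-miss
ARM740T | write-through | random | read-miss
ARM920T | write-through, write-back | random, round-robin | read-miss
ARM926EJS | write-through, write-back | random | read-miss
ARM940E | write-through, write-back | random, round-robin | read-miss
ARM1020E | write-through, write-back | random, round-robin | read-miss
ARM1026EJS | write-through, write-back | random, round-robin | read-miss
Intel StrongARM | write-through, write-back | random | read-miss
Intel XScale | write-through, write-back | random | read-miss, write-miss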

ARM9TDMI Cached Macrocells
- Due to the complexity of attaching it to a memory system, the ARM9TDMI is not licensed as a stand-alone core
- It is available in a range of cached macrocells
- ARM922T: 2 x 8K caches, Memory Management Unit (MMU), write buffer
- ARM920T: as ARM922T but with 2 x 16K caches
- ARM940T: 2 x 4K caches, Memory Protection Unit (MPU)

ARM9E Processor Core
- ARM9E is based on the ARM9TDMI core
- Architecture V5TE support
  - Improved ARM/Thumb interworking
  - New 32x16 and 16x16 multiply instructions
  - New count leading zeros instruction
  - New saturated maths instructions
- Core implementation differences
  - Single-cycle 32x16 multiplier implementation
  - EmbeddedICE Logic RT

ARM926EJ-S Overview
- Jazelle state allows direct execution of Java bytecodes
- ARM926EJ-S
  - ARM9EJ-S core
  - Configurable instruction and data caches
  - Instruction and data TCM interfaces
  - Memory Management Unit
  - 2 x 32-bit AHB bus interfaces: instructions and data

ARM1136 Family Overview
- ARM1136JF-S
  - Synthesizable ARM V6 architecture high-performance core
  - 8-stage pipeline
  - Static and dynamic branch prediction
  - Return stack
  - Low-latency interrupt mode
  - Physically tagged 4-64K I & D caches
  - Internal configurable TCMs
  - Four main memory ports
  - Jazelle technology
  - Integrated VFP coprocessor
- ARM1136J-S: as above but with no VFP

ARM Architecture v5TE
Architecture v5TE contains the full v4T ARM and Thumb instruction sets plus:
- Improved support for interworking (covered in the ARM/Thumb Interworking module)
- Breakpoint instruction (ARM and Thumb)
- Count Leading Zeros instruction
- Extended coprocessor instructions: MCR2 etc.
- Support for saturated mathematics
- Packed half-word signed multiplication instructions
- Double-word coprocessor transfer instructions: MCRR/MRRC

Intel StrongARM Overview
- ARM V4 architecture (no Thumb support)
- 5-stage pipeline, reduced branch penalty
- Improved multiplier (typically 2 cycles faster than ARM9TDMI)
- No support for Multi-ICE debugging (JTAG limited to connectivity test)
- No external coprocessor bus
- SA-110: 16K I & D caches, 8 x 16-byte write buffer
- SA-1100/1110
  - On-chip peripherals, memory controller
  - Smaller cache sizes
  - PID register
  - Instruction breakpointing via CP15

Intel XScale Overview
- Architecture V5TE compatibility
- 7-8 stage pipeline with statistical branch prediction
- 32K data and instruction caches, plus 2K data mini-cache
- 8-entry write buffer, 4-entry fill and pend buffers
- Full 32-bit coprocessor interface
- Debug and performance monitoring logic (via CP14)
- Multiply-accumulate block (as CP0)
- Configurable core clock speed – MHz from 33-66 MHz input clock
- Async input bus clock to 100 MHz (max 1/3 of bus core clock)
- e.g. Intel Processor: interrupt controller (implemented as CP13), ECC, memory protection

Agenda
- Introduction to ARM
- Features of each ARM version
- ARM on Windows CE
- ARM directory structure in Windows CE 5.0
- Support for specific ARM instructions on Windows CE

The microprocessor families supported in Windows CE
- ARM: architectures v4, v4T, Thumb, v5TE, and Intel XScale
- x86
- SHx: Renesas SuperH SH4 microprocessors
- MIPS: NEC, Toshiba, Philips Semiconductor, Integrated Device Technologies, LSI Logic, Quantum Effect Design

Changes in ARM support by CE version
- Windows CE 5.0: the ARMV4 kernel was merged into the ARMV4I kernel.
- Windows CE .NET 4.2: the ARMV4T (Thumb) kernel was merged into the ARMV4I kernel. The ARMV4I kernel still supports 16-bit Thumb applications.

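Notes on using ARM CPUs (I)
- The ARM kernel places no restrictions on register usage
- Physical addresses can be statically (direct-)mapped to virtual addresses in 1 MB units
- OEMAddressTable: the static mapping table
- The kernel uses this table to create two regions
  - 0x ~ 0x9FFFFFFF: caching and buffering enabled
  - 0xA ~ 0xBFFFFFFF: caching and buffering disabled
- The OEMInterruptHandlerFIQ function must be present in the OAL for the build to succeed, even if it is never used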
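The static mapping described above can be modelled in a few lines of C. This is only an illustrative sketch: the `AddrTableEntry` struct, the table contents and the `phys_to_virt` helper are hypothetical, and the region bases 0x80000000 (cached) and 0xA0000000 (uncached) are assumed here rather than taken from the slide. In a real BSP the table is typically declared in assembly source, not as a C array.

```c
#include <stdint.h>

/* Hypothetical model of one OEMAddressTable row: cached virtual base,
 * physical base, and size in megabytes. */
typedef struct {
    uint32_t virt_base;
    uint32_t phys_base;
    uint32_t size_mb;
} AddrTableEntry;

/* Assumed distance between the cached and the uncached view of a mapping. */
#define UNCACHED_OFFSET 0x20000000u

/* Example table with made-up entries; a row with size 0 terminates it. */
static const AddrTableEntry g_table[] = {
    { 0x80000000u, 0x30000000u, 64 },  /* hypothetical 64 MB SDRAM bank    */
    { 0x84000000u, 0x48000000u, 1  },  /* hypothetical 1 MB register block */
    { 0, 0, 0 }
};

/* Translate a physical address to its cached or uncached virtual address.
 * Returns 0 when the address is not covered by the table. */
uint32_t phys_to_virt(uint32_t pa, int cached)
{
    const AddrTableEntry *e;

    for (e = g_table; e->size_mb != 0; e++) {
        uint32_t size = e->size_mb << 20;       /* megabytes to bytes */
        /* Unsigned subtraction wraps for pa < phys_base, so a single
         * comparison checks both ends of the range. */
        if (pa - e->phys_base < size) {
            uint32_t va = e->virt_base + (pa - e->phys_base);
            return cached ? va : va + UNCACHED_OFFSET;
        }
    }
    return 0;
}
```

Device registers would normally be accessed through the uncached view, while ordinary RAM goes through the cached one.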

Notes on using ARM CPUs (II)
- Nested interrupts: the ARM CPU supports two interrupts
  - IRQ: Interrupt Request
  - FIQ: Fast Interrupt Request
- When GetSystemInfo is used to obtain processor information, the dwProcessorType member of the SYSTEM_INFO structure does not return the correct value
  - Set the CEProcessorType global variable to PROCESSOR_ARM720 or PROCESSOR_STRONGARM, depending on the system
  - Implement IOCTL_PROCESSOR_INFORMATION and set CEProcessorType in the OEMInit function

CPU Initialization: ARM Kernels
- Switch the CPU to Supervisor mode
- Disable IRQ and FIQ
- Disable the MMU, I-cache and D-cache
- Flush or invalidate the I-cache, D-cache and TLB, and drain the write buffer
- Configure the GPIO, memory controller, interrupt controller and RTC
- Obtain the physical address of OEMAddressTable and store it in R0
- Jump to KernelStart
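The first three steps come down to setting or clearing a handful of bits in the CPSR and in the CP15 control register. The sketch below only computes the register values in portable C so that it can run anywhere; the function names are hypothetical, and on real hardware the values would be written with MSR and MCR instructions in the assembly startup code.

```c
#include <stdint.h>

/* CPSR fields (classic ARM programmer's model). */
#define CPSR_MODE_MASK 0x1Fu
#define CPSR_MODE_SVC  0x13u        /* Supervisor mode */
#define CPSR_F_BIT     0x40u        /* 1 = FIQ masked  */
#define CPSR_I_BIT     0x80u        /* 1 = IRQ masked  */

/* CP15 control register (c1) bits. */
#define CP15_CTRL_M    (1u << 0)    /* MMU enable     */
#define CP15_CTRL_C    (1u << 2)    /* D-cache enable */
#define CP15_CTRL_I    (1u << 12)   /* I-cache enable */

/* CPSR value for startup: Supervisor mode with IRQ and FIQ masked.
 * Condition flags and other bits are left untouched. */
uint32_t startup_cpsr(uint32_t cpsr)
{
    cpsr = (cpsr & ~CPSR_MODE_MASK) | CPSR_MODE_SVC;
    return cpsr | CPSR_I_BIT | CPSR_F_BIT;
}

/* Control register value for startup: MMU and both caches disabled. */
uint32_t startup_control(uint32_t ctrl)
{
    return ctrl & ~(CP15_CTRL_M | CP15_CTRL_C | CP15_CTRL_I);
}
```

Starting from User mode with interrupts enabled (CPSR low byte 0x10), `startup_cpsr` yields 0xD3 in the low byte.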

CPU Initialization: XScale performance improvement
- On XScale, performance can be improved by changing the Branch Target Buffer Enable bit
- Change it in the OEMInit function

CPU Dependencies for OAL Functions (I)
Columns: MIPSII, MIPSII_FP, MIPSIV, MIPSIV_FP, ARMV4, ARMV4I, ARMV4T, SH4, x86
CacheErrorHandler: XX
InitClock: XXXXX
OEMARMCacheMode: XX
OEMCacheRangeFlush: XXXX
OEMClearDebugCommError: X
OEMDataAbortHandler: X
OEMFlushCache: X
OEMGetExtensionDRAM: XXXXXX
OEMGetRealTime: XXXXXX
OEMIdle: XXXXXX
OEMInit: XXXXXX
OEMInitDebugSerial: XXXXXX
OEMInterruptDisable: XXXXXX
OEMInterruptDone: XXXXXX

CPU Dependencies for OAL Functions (II)
Columns: MIPSII, MIPSII_FP, MIPSIV, MIPSIV_FP, ARMV4, ARMV4I, ARMV4T, SH4, x86
OEMInterruptEnable: XXXXXX
OEMInterruptHandler: XX
OEMInterruptHandlerFIQ: XX
OEMIoControl: XXXXXX
OEMNMI: X
OEMNMIHandler: X
OEMPowerOff: XXXXXX
OEMReadDebugByte: XXXXXX
OEMSetAlarmTime: XXXXXX
OEMSetRealTime: XXXXXX
OEMWriteDebugByte: XXXXXX
OEMWriteDebugString: XXXXXX
SC_GetTickCount: XXXXXX

Agenda
- Introduction to ARM
- Features of each ARM version
- ARM on Windows CE
- ARM directory structure in Windows CE 5.0
- Support for specific ARM instructions on Windows CE

The production-quality OEM adaptation layer (OAL)
- Shortens the OAL development process
- Provides the OAL in the form of improved components
- Offers a consistent, easy-to-use model across processor families

Directories
- %_WINCEROOT%\Platform\ : the production-quality OAL model
  - Memory-mapped configuration files
  - Some include files that define the memory layout for the hardware platform, matching Config.bib
  - Some glue logic: board-level customization code that unites everything in the %_WINCEROOT%\Platform\Common directory
- %_WINCEROOT%\Platform\Common : contains the CPU-specific OAL routines
- %_WINCEROOT%\Public\Common\Oak\CSP : the chip support package (CSP) directory contains a collection of system-on-a-chip (SOC) and CPU or chipset-level peripheral drivers. You can port the CSP driver for a core peripheral, without modification, to any new hardware platform environment that uses the same SOC or chipset.

\Platform\ : the production-quality OAL BSPs
- Samsung SMDK2410: %_WINCEROOT%\Platform\SMDK2410
- Intel Mainstone II: %_WINCEROOT%\Platform\MainstoneII

\Platform\Common (I)
\Platform\Common subdirectory | Description
Src\ARM | Contains the ARM processor-specific OAL code.
Src\ARM\ARM920T | Contains all the CPU OAL code required for the ARM920T processor.
Src\ARM\ARM920T\Abort | Contains the abort routines specific to the ARM920T CPU.
Src\ARM\ARM920T\Cache | Contains the cache routines specific to the ARM920T CPU.
Src\ARM\ARM926 | Contains the CPU OAL code required for the ARM926 processor.
Src\ARM\ARM926\Cache | Contains the cache routines specific to the ARM926 CPU.
Src\ARM\Common | Contains routines that are generic to ARM-based hardware platforms.
Src\ARM\Common\Cache | Contains the cache routines that are common to all ARM CPUs.
Src\ARM\Common\Memory | Contains the memory translation routines that are common to all ARM CPUs. The memory routines translate physical addresses to virtual addresses, and vice versa.

\Platform\Common (II)
\Platform\Common subdirectory | Description
Src\ARM\Inc | Contains include files that are generic to all ARM CPUs.
Src\ARM\Intel | Contains the OAL code specific to the Intel hardware platform.
Src\ARM\Intel\PXA250 | Contains routines specific to the PXA250 processor.
Src\ARM\Intel\PXA27x | Contains routines specific to the PXA27x processor.
Src\ARM\Intel\SA1100 | Contains routines specific to the SA1100 processor.
Src\ARM\Samsung | Contains the OAL code specific to the Samsung hardware platform.
Src\ARM\Samsung\S3C2410x | Contains routines specific to the S3C2410x processor.

\Public\Common\Oak\CSP : the production-quality OAL
- CSP\ \ \.
- Example, the serial function CSP driver: \Public\Common\Oak\CSP\ARM\SAMSUNG\S3C2410X\SERIAL

Agenda
- Introduction to ARM
- Features of each ARM version
- ARM on Windows CE
- ARM directory structure in Windows CE 5.0
- Support for specific ARM instructions on Windows CE

ARM10 Intrinsic Functions
- CLZ: counts the leading zeroes before the first 1-bit. The common intrinsic _CountLeadingZeros accesses the CLZ instruction.
- BKPT: creates a soft breakpoint. The common intrinsic __trap accesses the BKPT instruction.
- _swi: generates a call to the OS using the SWI software interrupt instruction.
- __emit: inserts a specified instruction into the instruction stream.
- _MoveFromCoprocessor, _MoveFromCoprocessor2: read data from the ARM coprocessor.
- _MoveToCoprocessor, _MoveToCoprocessor2: write data to the ARM coprocessor.
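To make the CLZ semantics concrete, here is a minimal portable sketch of what the instruction computes. The function name `clz32` is hypothetical and the loop is a reference model, not the Windows CE intrinsic itself; a compiler intrinsic would map to a single instruction instead.

```c
#include <stdint.h>

/* Reference model of ARM CLZ: the number of zero bits above the most
 * significant 1-bit. As on ARM, an input of 0 yields 32. */
unsigned clz32(uint32_t x)
{
    unsigned n = 0;

    if (x == 0)
        return 32;
    while ((x & 0x80000000u) == 0) {  /* shift left until the top bit is set */
        x <<= 1;
        n++;
    }
    return n;
}
```

A typical use is normalisation: shifting a value left by `clz32(x)` moves its highest set bit to bit 31.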

ARM DSP-enhanced Intrinsic Functions (I)
Functions | Corresponding ARM DSP instruction | Description
_SmulAddLo_SW_SL, _SmulAddHi_SW_SL, _SmulAddHiLo_SW_SL, _SmulAddLoHi_SW_SL | SMLAxy | A signed-integer multiply and accumulate operation: 16x16-bit multiply followed by a 32-bit add.
_SmulAddWLo_SW_SL, _SmulAddWHi_SW_SL | SMLAWy | A 32x16-bit multiply operation, followed by a 32-bit add of the upper 32 bits of the 48-bit product.
_SmulAddHi_SW_SQ, _SmulAddLo_SW_SQ, _SmulAddHiLo_SW_SQ, _SmulAddLoHi_SW_SQ | SMLALxy | A 16x16-bit multiply operation, followed by a 64-bit add of the product with a 64-bit integer.
_SmulLo_SW_SL, _SmulHi_SW_SL, _SmulHiLo_SW_SL, _SmulLoHi_SW_SL | SMULxy | A signed-integer 16x16-bit multiply operation.
_SmulWLo_SW_SL, _SmulWHi_SW_SL | SMULWy | A signed-integer 32x16-bit multiply operation, returning the upper 32 bits.
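The arithmetic behind these instructions can be sketched in portable C. The helper names below (`smul_bt`, `smulw_b`, `smla_bb`) are hypothetical and follow the instruction mnemonics rather than the intrinsic names, and the Q (overflow) flag that SMLAxy may set is not modelled. The code assumes a two's-complement target with arithmetic right shift for signed values, which holds for common compilers.

```c
#include <stdint.h>

/* Signed low and high halfwords of a 32-bit word. */
static int16_t lo16(int32_t w) { return (int16_t)(uint16_t)((uint32_t)w & 0xFFFFu); }
static int16_t hi16(int32_t w) { return (int16_t)(uint16_t)((uint32_t)w >> 16); }

/* SMULBT-style: bottom halfword of a times top halfword of b (16x16 -> 32). */
int32_t smul_bt(int32_t a, int32_t b)
{
    return (int32_t)lo16(a) * (int32_t)hi16(b);
}

/* SMULWB-style: 32x16 multiply, keeping the upper 32 bits of the
 * 48-bit product. */
int32_t smulw_b(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * lo16(b)) >> 16);
}

/* SMLABB-style: 16x16 multiply followed by a 32-bit add. The add wraps on
 * overflow, so it is done in unsigned arithmetic to avoid undefined behaviour. */
int32_t smla_bb(int32_t a, int32_t b, int32_t acc)
{
    int32_t prod = (int32_t)lo16(a) * (int32_t)lo16(b);
    return (int32_t)((uint32_t)prod + (uint32_t)acc);
}
```

The x and y letters in the mnemonics select which halfword of each operand is used, which is why one instruction pattern maps to several Lo/Hi intrinsic variants.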

ARM DSP-enhanced Intrinsic Functions (II)
Functions | Corresponding ARM DSP instruction | Description
_AddSatInt | QADD | A saturating add instruction.
_SubSatInt | QSUB | A saturating subtract instruction.
_DAddSatInt | QDADD | An instruction to double an integer and saturate, and then add to a second integer and saturate.
_DSubSatInt | QDSUB | An instruction to double an integer and saturate, and then subtract from a second integer and saturate.
_ReadCoProcessor | MRRC | An operation to transfer values from a coprocessor to two ARM registers.
_WriteCoProcessor | MCRR | An operation to transfer two ARM register values to a coprocessor.
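Saturating arithmetic clamps a result to the representable range instead of wrapping. The following sketch models the four saturating instructions in portable C by widening to 64 bits; the function names are hypothetical stand-ins for the mnemonics, not the Windows CE intrinsics, and the Q flag is not modelled.

```c
#include <stdint.h>

/* Clamp a 64-bit intermediate result to the signed 32-bit range. */
static int32_t sat32(int64_t v)
{
    if (v > INT32_MAX) return INT32_MAX;
    if (v < INT32_MIN) return INT32_MIN;
    return (int32_t)v;
}

/* QADD: saturating add. */
int32_t qadd(int32_t a, int32_t b)  { return sat32((int64_t)a + b); }

/* QSUB: saturating subtract. */
int32_t qsub(int32_t a, int32_t b)  { return sat32((int64_t)a - b); }

/* QDADD: double b with saturation, then add to a with saturation. */
int32_t qdadd(int32_t a, int32_t b) { return sat32((int64_t)a + sat32(2 * (int64_t)b)); }

/* QDSUB: double b with saturation, then subtract from a with saturation. */
int32_t qdsub(int32_t a, int32_t b) { return sat32((int64_t)a - sat32(2 * (int64_t)b)); }
```

Note that QDADD and QDSUB saturate twice, once after the doubling and once after the add or subtract, so the result can differ from saturating the exact value of a ± 2b.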

ARM XScale Intrinsic Functions
Functions | ARM XScale instruction | Description
_SmulAdd_SL_ACC | MIA | Multiplies the signed value in register Rs by the signed value in register Rm, and then adds the result to the 40-bit accumulator.
_SmulAddPack_2SW_ACC | MIAPH | Performs two 16x16 signed multiplications on packed half-word data and accumulates these to a single 40-bit accumulator.
_SmulAddLo_SW_ACC, _SmulAddHi_SW_ACC, _SmulAddLoHi_SW_ACC, _SmulAddHiLo_SW_ACC | MIAxy | Performs one 16-bit signed multiplication and accumulates the result to a single 40-bit accumulator.
_WriteCoProcessor | MAR | Moves 64 bits of data from ARM registers to coprocessor registers.
_ReadCoProcessor | MRA | Moves 64 bits of data to ARM registers from coprocessor registers.
_PreLoad | PLD | This instruction is used as a hint to the memory system that a memory access from the specified address will occur shortly.
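The 40-bit accumulator operations can be modelled with a 64-bit integer that is wrapped back to 40 bits after every update. This is a sketch under stated assumptions: the names `wrap40`, `mia` and `miaph` are hypothetical, the accumulator is assumed to wrap around rather than saturate, and the code relies on two's-complement conversion from `uint64_t` to `int64_t`, as provided by common compilers.

```c
#include <stdint.h>

/* Keep the low 40 bits of v and sign-extend from bit 39. */
static int64_t wrap40(int64_t v)
{
    uint64_t u = (uint64_t)v & 0xFFFFFFFFFFull;
    if (u & 0x8000000000ull)
        u |= ~0xFFFFFFFFFFull;
    return (int64_t)u;
}

/* MIA-style: acc += rm * rs, with a signed 32x32 multiply. */
int64_t mia(int64_t acc, int32_t rm, int32_t rs)
{
    return wrap40(acc + (int64_t)rm * rs);
}

/* MIAPH-style: two signed 16x16 multiplies on the packed low and high
 * halfwords of rm and rs, both accumulated into the same accumulator. */
int64_t miaph(int64_t acc, int32_t rm, int32_t rs)
{
    int64_t lo = (int64_t)(int16_t)(uint16_t)((uint32_t)rm & 0xFFFFu)
               * (int16_t)(uint16_t)((uint32_t)rs & 0xFFFFu);
    int64_t hi = (int64_t)(int16_t)(uint16_t)((uint32_t)rm >> 16)
               * (int16_t)(uint16_t)((uint32_t)rs >> 16);
    return wrap40(acc + lo + hi);
}
```

The extra eight guard bits above 32 are what let a long multiply-accumulate loop, such as a dot product, run for many iterations before the accumulator overflows.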

References
- ARM System Developer’s Guide (Korean edition)
- “ARM Technical Training Course” manual

© 2004 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.