Published by Cory Lindsey. Modified over 9 years ago.
1
Renesas Electronics America Inc.
ID 130C: Increasing Application Performance and Data Throughput with SH-2A MCUs
Dean Chang, Product Marketing Manager
12 October 2010
Version: 1.1
2
2 Mr. Dean Chang
Product Marketing Manager
  SH-2A MCU/MPUs
  Wi-Fi Wireless LAN Partners
  Building Automation Segment Marketing
Previous Experience
  Responsible for the launch of a new line of Cortex-M3 MCUs at Fujitsu Semiconductor
  Active in wireless standards activities such as Wi-Fi (IEEE 802.11), WiMAX (802.16; board member and chair of the service provider working group), Bluetooth (802.15) and ZigBee (802.15)
BSEE from Cal Poly San Luis Obispo
3
3 Renesas Technology and Solution Portfolio
Microcontrollers & Microprocessors: #1 market share worldwide*
Analog and Power Devices: #1 market share in low-voltage MOSFET**
ASIC, ASSP & Memory: advanced and proven technologies
Solutions for Innovation
* MCU: 31% revenue basis, from Gartner "Semiconductor Applications Worldwide Annual Market Share: Database", 25 March 2010
** Power MOSFET: 17.1% on unit basis, from Marketing Eye 2009
6
6 Microcontroller and Microprocessor Line-up
Superscalar, MMU, Multimedia: up to 1200 DMIPS; 45, 65 & 90 nm process; video and audio processing on Linux; server, industrial & automotive
Up to 500 DMIPS; 150 & 90 nm process; 600 µA/MHz, 1.5 µA standby; medical, automotive & industrial
Legacy cores: next-generation migration to RX
High Performance CPU, FPU, DSC
Embedded Security
Up to 10 DMIPS; 130 nm process; 350 µA/MHz, 1 µA standby; capacitive touch
Up to 25 DMIPS; 150 nm process; 190 µA/MHz, 0.3 µA standby; application-specific integration
Up to 25 DMIPS; 180, 90 nm process; 1 mA/MHz, 100 µA standby; crypto engine, hardware security
Up to 165 DMIPS; 90 nm process; 500 µA/MHz, 2.5 µA standby; Ethernet, CAN, USB, motor control, TFT display
High Performance CPU, Low Power
Ultra Low Power
General Purpose
SuperH
7
7 Innovation – Smart Energy Network
Diagram labels: Utility Electric Meter; IP Network Core; Solar Inverter; Utility Smart Meter
8
8 Our SH-2A MCU Solution
High-performance SH-2A
  Process and core efficiencies equivalent to application processors (2 DMIPS/MHz)
  Combined with high-performance memory and peripherals on MCUs
The SH-2A with Floating Point Unit (FPU)
  Integrates enough functionality to replace a DSP + MCU with a single chip
  Benefits include smaller form factors, lower cost and less EMI
  Software development is simple because everything is coded on a single platform
Diagram: Application Processor Core with FPU + MCU Memory and Peripherals
9
9 Agenda
SH-2A Key Features and Benefits
SH-2A Architecture
Peripherals to Enhance Your Applications
Applications
Development Tools and Starter Kits
Q&A
10
10 Key Takeaways
By the end of this session you will be able to:
  Identify the strengths of the SH-2A architecture and peripherals
  Identify the right applications for SH-2A MCUs
  Understand some key ways to optimize performance while minimizing CPU bandwidth
11
11 SH-2A Key Features and Benefits
SH-2A MCUs shine when the customer needs performance & high throughput
Superscalar RISC core
  2.0 DMIPS/MHz
  Double-precision 64-bit FPU
  Short 9-cycle interrupt latency
High-speed memory
  Industry's fastest 10 ns flash memory (1 MB)
  Large SRAM, up to 128 KB
Peripherals
  Multifunction timers, 2 inverters supported
  12-bit ADC, fast 1.0 µs sampling rate
Rich connectivity
  10/100 Ethernet
  CAN 2.0
  USB 2.0
12
12 SH-2A Architecture
13
13 Superscalar versus Dual Core
Scalar: one thread, one instruction at a time
  Single instruction stream, single pipeline: fetch, decode, execute
Superscalar: one or more threads, more than one instruction at a time
  Example: one thread, two instructions at a time: 2 fetch, 2 decode, 2 execute
Dual core: 2 independent threads
14
14 SH-2A Features: Superscalar Pipeline / Floating Point Unit
Pipeline: 5 stages (1 to 5), superscalar
Core variants shown: SH-2A-FPU (CPU + FPU) and CPU core only
Performance: 2.4 DMIPS/MHz from RAM, 2.0 DMIPS/MHz from flash
15
15 SH-2A Offers the Highest DMIPS/MHz
Source: respective vendors' web sites.
16
16 QUESTIONS?
Q: What is the DMIPS/MHz performance of the SH-2A when executing out of flash memory?
A: 2.0 DMIPS/MHz
Q: Does an SH-2A allow you to execute 4 simultaneous instructions?
A: No, 2 simultaneous instructions
17
17 Floating Point Unit
2 million floating-point operations per second (MFLOPS) per MHz
  Total of 400 MFLOPS @ 200 MHz
IEEE 754-compliant
  Easily share data with other systems
Single (32-bit) & double (64-bit) precision
  Precise and faster control loops & algorithms
Designed for embedded systems
  Automatic scaling of floating format
Supports FMAC, FABS, FLOAT, FDIV, FSQRT, etc.

Function (double precision)   Time* (ns)
sin                           680
cos                           650
tan                           900
asin                          995
acos                          1225
atan                          695
log                           910
exp                           950
pow                           1140

* Based on SH7203 (SH-2A core with FPU), 200 MHz execution from SDRAM with cache enabled. Performance using a flash-based MCU & FPU at 200 MHz is not available at this time.
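The throughput and timing figures on this slide can be cross-checked with a little arithmetic. The sketch below is only an illustration: the function names are made up for this example, and it assumes the quoted times scale directly with a 200 MHz core clock.

```c
#include <assert.h>

/* Peak FPU throughput in MFLOPS, using the slide's figure of
 * 2 MFLOPS per MHz of core clock. */
double peak_mflops(double core_mhz)
{
    assert(core_mhz > 0.0);
    return 2.0 * core_mhz;
}

/* Convert an execution time in nanoseconds into core clock cycles:
 * cycles = time_ns * core_mhz / 1000. */
double ns_to_cycles(double time_ns, double core_mhz)
{
    assert(time_ns >= 0.0 && core_mhz > 0.0);
    return time_ns * core_mhz / 1000.0;
}
```

With these helpers, peak_mflops(200.0) gives the slide's 400 MFLOPS, and the 680 ns quoted for a double-precision sin corresponds to 136 core cycles at 200 MHz.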
18
18 FPU Advantages
Floating-point math is easy to understand
  Simulations
Hardware-FPU math is faster and requires less code space
  Chart: FPU performance of a polynomial formula (R32C @ 32 MHz)
  SUM(An * x^n), where n = 0 to 5 and A0 to A5 are constants

With floating point:
    // Read the ADC code into a float value
    rawADCFloatValue = (float)adcCode;
    // Linearize the ADC value
    actualTemperatureValue = 1.23456 * rawADCFloatValue + 45.8;

Without floating point:
    // Read the ADC code
    rawADCFixedValue = adcCode;
    // Linearize the ADC value
    actualTemperatureValue = FIX12_MUL(FIX12_fromfloat(1.23456), rawADCFixedValue) + FIX12_fromfloat(45.8);
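The polynomial SUM(An * x^n) mentioned on this slide can be evaluated with Horner's rule in either number format. The following is a minimal sketch, not the code behind the slide's benchmark. The slide does not define FIX12_MUL or FIX12_fromfloat, so the fix12_* helpers here are hypothetical stand-ins that assume a signed Q20.12 format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Evaluate a[0] + a[1]*x + ... + a[n-1]*x^(n-1) with Horner's rule. */
float poly_eval_float(const float *a, size_t n, float x)
{
    assert(a != NULL && n > 0);
    float acc = a[n - 1];
    for (size_t i = n - 1; i-- > 0; ) {
        acc = acc * x + a[i];
    }
    return acc;
}

/* Hypothetical Q20.12 fixed-point type: 12 fractional bits. */
typedef int32_t fix12_t;
#define FIX12_ONE (1 << 12)

/* Convert a float to Q20.12, rounding to nearest. */
fix12_t fix12_from_float(float v)
{
    return (fix12_t)(v * (float)FIX12_ONE + (v >= 0.0f ? 0.5f : -0.5f));
}

/* Multiply two Q20.12 values through a 64-bit intermediate.
 * The right shift of a negative value is implementation-defined in C;
 * common compilers perform an arithmetic shift. */
fix12_t fix12_mul(fix12_t a, fix12_t b)
{
    return (fix12_t)(((int64_t)a * (int64_t)b) >> 12);
}

/* Same Horner loop, with every multiply going through fix12_mul. */
fix12_t poly_eval_fix12(const fix12_t *a, size_t n, fix12_t x)
{
    assert(a != NULL && n > 0);
    fix12_t acc = a[n - 1];
    for (size_t i = n - 1; i-- > 0; ) {
        acc = fix12_mul(acc, x) + a[i];
    }
    return acc;
}
```

The float version reads like the formula. The fixed-point version needs a conversion helper, a widening multiply and a shift for every term, which is the extra code and effort that the slide attributes to working without an FPU.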
19
19 SH-2A Fast Interrupt Response
Typical MCUs: INT trigger, CPU latency, save context (by compiler), user code, restore context
SH-2A MCU: INT trigger, 9 cycles (CPU latency + save context), user code, restore context
Hardware saves the context in a register bank LIFO: 15 register banks (LIFO) plus one primary register bank
Drawing not to scale

MCU          Interrupt latency (cycles)
SH7216       9
Cortex-M3    18+
ARM7TDMI     24 to 42
PIC32        18 to 40+
20
20 SH-2A Bus Structure
Harvard architecture
Blocks: SH-2A CPU (superscalar) with FPU; on-chip RAM; on-chip flash; cache controller; cache memory; DMAC/DTC; Bus State Controller; bridge; peripherals (timers, ADC, SCI, PORT)
Buses: F bus (instruction); M bus (data); I bus (internal bus); P bus (peripheral bus); external bus to the SDRAM, SRAM, etc. interface
Access labels in the diagram: 32 bit / 1 cycle (three places); 16 bit / 3 cycles
Cache memory: instruction/data cache 8 KB / 8 KB, 4-way set associative (LRU)
21
21 High Performance 100 MHz Flash
Figure: processing performance versus MCU frequency
  SH with 100 MHz flash: no wait up to 100 MHz
  Competing MCU with 30 MHz flash: no wait up to 30 MHz, then 1 wait cycle, then 2 wait cycles as the frequency rises
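The effect shown in this figure can be approximated with a simple model. The sketch below assumes every flash access must last at least one full flash cycle and that no cache or prefetch buffer hides the delay. Real devices usually have such buffers, so treat the numbers as an upper bound on the penalty. The function names are illustrative only.

```c
#include <assert.h>

/* Wait states needed so that one flash access spans at least one flash
 * cycle: access cycles = ceil(cpu_mhz / flash_mhz), wait states = that - 1. */
unsigned flash_wait_states(unsigned cpu_mhz, unsigned flash_mhz)
{
    assert(cpu_mhz > 0 && flash_mhz > 0);
    unsigned access_cycles = (cpu_mhz + flash_mhz - 1) / flash_mhz;
    return access_cycles - 1;
}

/* Effective fetch rate in MHz when every fetch pays the wait states. */
double effective_fetch_mhz(unsigned cpu_mhz, unsigned flash_mhz)
{
    return (double)cpu_mhz / (double)(flash_wait_states(cpu_mhz, flash_mhz) + 1);
}
```

Under this model a 100 MHz core with 100 MHz flash fetches at the full 100 MHz. The same core with 30 MHz flash needs wait states on every access and fetches at a fraction of its clock rate.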
22
22 Fast Flash = More RAM for Application
Competitor MCU with slow flash
  Code or frequently used tables must be placed in RAM to achieve full speed
  Slow flash: slower access; RAM: fastest access
  Less RAM for data
SH MCU with fast flash
  Code stays in fast flash: fast access; RAM: fastest access
  More RAM for data
Result: SH MCUs can execute similar applications with less RAM
23
23 QUESTIONS?
Q: Name one unique feature of the FPU in the SH-2A relative to other flash-based MCUs.
A: It supports double-precision (64-bit) floating-point math.
Q: How many register banks does the SH-2A contain, and why does it matter?
A: 16. They reduce the interrupt latency to just 9 cycles.
Q: What is the maximum frequency of the flash without adding a wait state?
A: 100 MHz
24
24 Peripherals to Enhance your Application
25
25 Multi-function Timer Units
Timer Unit 2
  6x 16-bit timers
  12x PWMs
  2x encoder inputs
  Dead time compensation
  Auto shutdown
  ADC trigger, DTC trigger, DMA trigger
Timer Unit 2S
  3x 16-bit timers
  8x PWMs
  100 MHz clock
  Dead time compensation
  Auto shutdown
  ADC trigger, DTC trigger, DMA trigger
Support for two 3-phase motors at the same time
Computational power to support advanced sensorless vector algorithms
Up to 8 different operational modes
26
26 Dual 12-bit ADC with 8 Channels
Multiple ADC result registers
3 simultaneous sample & hold
ADC clock up to 50 MHz
Fast 1 µs conversion rate
Analog input supports 0-5 V
Diagram (SH7216, single ADC shown here): S/H, SAR ADC, INT
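As a small worked example of the 12-bit, 0-5 V range above, a raw conversion result can be scaled to millivolts with integer math. This is a sketch under one assumption: the full-scale code 4095 maps to the reference voltage. The device's hardware manual defines the exact transfer function, and the names below are illustrative.

```c
#include <assert.h>
#include <stdint.h>

#define ADC_BITS      12u
#define ADC_MAX_CODE  ((1u << ADC_BITS) - 1u)   /* 4095 */

/* Convert a raw 12-bit ADC code to millivolts, rounding to nearest.
 * vref_mv is the analog reference in millivolts (5000 for a 0-5 V input).
 * Assumes code 4095 corresponds to vref_mv. */
uint32_t adc_code_to_millivolts(uint16_t code, uint32_t vref_mv)
{
    assert(code <= ADC_MAX_CODE);
    return ((uint32_t)code * vref_mv + ADC_MAX_CODE / 2u) / ADC_MAX_CODE;
}
```

For instance, a mid-scale code of 2048 with a 5000 mV reference converts to 2501 mV.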
27
27 External Memory Interface
Supported memories: Flash/ROM, SRAM, burst ROM, SDRAM
8 CS regions
Separate read & write wait cycles for each CS
Bus arbitrator
SDRAM auto refresh
Little/big endian
8/16/32-bit bus
28
28 Data Transfer Controller (DTC)
Diagram labels: IRQ; on-chip Data Transfer Controller; INTC; IRQ clear
Fewer interrupts increase CPU efficiency
29
29 QUESTIONS?
Q: Name one kind of advanced motor algorithm that is supported by the SH-2A multi-function timers.
A: Sensorless vector algorithms
Q: Name an advantage of having a Data Transfer Controller.
A: It extends the number of DMAs via software, limited only by memory, and it reduces the number of interrupts on the CPU for greater efficiency.
30
30 Rich Connectivity
31
31 10/100 Ethernet MAC
Full- and half-duplex modes
Can connect to any MII-compliant PHY
Magic Packet detection & Wake-on-LAN
Transmit and receive FIFO: 2 KB each
Two integrated DMA channels
TCP/IP
  Open-source TCP/IP supporting uIP in the Renesas Demonstration Kit
  Many TCP/IP options available from third parties
Diagram: SH7216 (100pin) 10/100 Ethernet MAC, MII, PHY, magnetics
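For readers unfamiliar with Magic Packet detection: the pattern is six 0xFF bytes followed by sixteen repetitions of the target station's 6-byte MAC address, and it may sit anywhere in the frame payload. On the SH7216 the MAC hardware does the matching. The software sketch below only illustrates what is being matched, and all function names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN        6
#define MAGIC_REPEATS  16
#define MAGIC_LEN      (MAC_LEN + MAC_LEN * MAGIC_REPEATS)   /* 102 bytes */

/* Write the Magic Packet pattern for `mac` into out[0..MAGIC_LEN-1]. */
void magic_packet_build(uint8_t out[MAGIC_LEN], const uint8_t mac[MAC_LEN])
{
    assert(out != NULL && mac != NULL);
    memset(out, 0xFF, MAC_LEN);                 /* synchronization stream */
    for (int i = 0; i < MAGIC_REPEATS; i++) {
        memcpy(out + MAC_LEN + i * MAC_LEN, mac, MAC_LEN);
    }
}

/* Return 1 if the pattern for `mac` appears anywhere in buf, else 0. */
int magic_packet_match(const uint8_t *buf, size_t len, const uint8_t mac[MAC_LEN])
{
    uint8_t pattern[MAGIC_LEN];
    assert(buf != NULL && mac != NULL);
    if (len < MAGIC_LEN) {
        return 0;
    }
    magic_packet_build(pattern, mac);
    for (size_t off = 0; off + MAGIC_LEN <= len; off++) {
        if (memcmp(buf + off, pattern, MAGIC_LEN) == 0) {
            return 1;
        }
    }
    return 0;
}
```

Because the hardware performs this comparison itself, the CPU can stay in a low-power state until a frame addressed to the station's own MAC arrives.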
32
32 Controller Area Network (CAN)
Block diagram: CPU interface; common control/status registers; control registers; CAN 2.0B protocol engine; acceptance filter; message buffer; signals INTs, clock, data, control, RX, TX
Message buffer: 15 Tx/Rx + 1 Rx
Up to 1 Mbps data rate
Unique features: hardware support to simplify SW & reduce CPU load
  Disable automatic retransmission on bus error
  Automatic priority-based transmission: mailbox number or ID-based
Devices: SH7137, SH7147, SH7286, SH7216
33
33 USB 2.0 Full Speed Device
Block diagram: status & control; FIFO; USB engine; transceiver; D+, D-
Integrated USB transceiver
External 48 MHz clock or shared 12 MHz + PLL clock
Ability to disable the USB module to save power
128-byte FIFO on transmit and receive
Devices: SH7285, SH7286, SH7216
34
34 Renesas Wi-Fi Solutions
Wi-Fi side: SPI, UART I/F; IEEE 802.11a/b/g/n; integrated TCP/IP stack; up to 10 Mbps
Software: SPI/UART Redpine driver; example demo
MCU side: 32-bit RISC flash MCU; up to 400 DMIPS @ 200 MHz; 32/64-bit FPU, Ethernet, USB, CAN
www.am.renesas.com/wifi
35
35 Low Cost Motor Control Demo Board
On-board 24 VDC PMAC motor
  USB powered to 6000 RPM
  External power to 10000 RPM
Drive a larger motor with an external power module
Pre-programmed vector control algorithm
3-shunt current detection
Hall & encoder connectors
PC application to learn/experiment
  Real-time display of parameters
36
36 Target Applications
37
37 Target Applications
Factory Automation: precision motion control; industrial connectivity; fast I/O control; operator panels
Scientific & Medical: signal analysis; quiet motor control; connectivity; operator panels
Building Automation: high-end security systems; image processing; speech, connectivity; thermostat, control panel; control panels
Office Automation: image processing; precise stepper control; connectivity; operator panels
White Goods: energy-efficient motor control; information displays
Consumer: media players; user interfaces
38
38 SH7216 Application Example Factory Automation
39
39 Real Life Performance Enhancements
Analog data collection with hardware assist
  Combination of MTU timers, ADC, DMA and buffers
  Saves 7% CPU bandwidth
Data Transfer Controller: sound pump
  Announces phone numbers
  A single interrupt for 10 s of sound, rather than an interrupt for every sample at 8 kHz
Data Transfer Controller: data scattering
  ADC collects data from 4 different sources
  After completion, one interrupt is generated
  Data is automatically stored at different buffer locations
Complex stepper motor S-curve profiles in flash
  Timer sets the "profile rate" and triggers the DTC
  DTC transfers PWM data based on the profile
All features increase CPU efficiency
40
40 Development Tools & Software Solutions
41
41 Development Tools C/C++ Compilers MULTI ® KPIT GNU Tools FREE Evaluation Systems Motor Control Emulators E10A-USB E200F FREE Development Environments MULTI ® FREE Sample Code & Libraries from Renesas FREE RTOS & Middleware
42
42 Hardware Debuggers
E10A: on-chip debug interfaces
  USB E10A for JTAG
  Advanced User Debug (AUD) version
    Pipeline trace
    RTOS aware
E200F: full in-circuit emulators
  Non-intrusive debugging
    Application uses all package pins
    Application uses all ROM & RAM
  Advanced debug features
    Complex events
    Full bus trace
    Coverage
Seamless integration with HEW
43
43 Free SW from Renesas
Sample code for major peripherals
Vector control motor algorithms
Ethernet send/receive, open-source TCP/IP supporting uIP
CAN API: compatible with the R8C & R32C API
USB device: CDC, MSC, HID
Fixed-point math & DSP libraries for SH-2A FPU
Available on am.renesas.com
44
44 Third Party Support by SW Components
Columns: Third Party; IDE; Compiler; Debug; RTOS; TCP/IP Stack; USB Device; USB Host; Graphics; File
CMX: - - - SH7216 SH7264 SH7216 No - SH7216
Express Logic: - - - SH7216 SH7264 SH7216 SH7264 SH7216 SH7264
FreeRTOS.org: - - - SH7216 SH7264 SH7216 - - - -
IAR: SH7216 SH7264 SH7216 SH7264 SH7216 SH7264 - - - - - -
KPIT GNU Tools: SH7216 SH7264 SH7216 SH7264 SH7216 SH7264 - - - - - -
Jungo: - - - - - SH7264 - -
Micrium: - - - SH7216 SH7264 SH7216 SH7264* SH7216 SH7264* SH7216 SH7264*
Micro Digital: - - - No SH7264 - No
RoweBots: - - - SH7216 SH7264 SH7216* SH7264* SH7216* SH7264* - SH7216 SH7264
Segger: - - - SH7216 SH7264 SH7216 SH7264* SH7264 SH7216 SH7264
Legend: * = in development; No = not yet; '-' = not offered
45
45 Our SH-2A MCU Solution
High-performance core at 2 DMIPS/MHz
Combined with high-performance memory and peripherals on MCUs
A single chip can replace a DSP + MCU combination
  Smaller form factor
  Lower cost
  Simplified design
Diagram: Application Processor Core with FPU + MCU Memory and Peripherals
46
46 Questions?
47
47 Innovation – Smart Energy Network
Diagram labels: Utility Electric Meter; IP Network Core; Solar Inverter; Utility Smart Meter
48
48 Thank You!
49
49 Appendix
50
50 Analog Data Collection and DSP Processing
MTU2 triggers ADC (accurate sample rate)
ADC complete triggers DMAC
DMAC transfers data to buffer
Half interrupt (PING ready) can be used to trigger the filter task
Complete interrupt (PONG ready) triggers the filter task and reloads
Hardware assist to acquire and transfer data to the buffer saves 7% CPU bandwidth
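The PING/PONG flow on this slide can be modelled in plain C. This is a host-side sketch, not SH7216 driver code: pingpong_dma_write stands in for one ADC-complete to DMAC transfer, the two callback invocations stand in for the half and complete interrupts, and every name and buffer size is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HALF_LEN 4u   /* samples per half-buffer (illustrative size) */

typedef struct {
    uint16_t buf[2 * HALF_LEN];   /* PING = first half, PONG = second half */
    size_t   write_idx;           /* where the modelled DMAC writes next */
    void   (*on_half_ready)(const uint16_t *half, size_t n);
} pingpong_t;

void pingpong_init(pingpong_t *pp, void (*cb)(const uint16_t *, size_t))
{
    assert(pp != NULL && cb != NULL);
    pp->write_idx = 0;
    pp->on_half_ready = cb;
}

/* Model of one ADC-complete -> DMAC transfer into the buffer. */
void pingpong_dma_write(pingpong_t *pp, uint16_t sample)
{
    assert(pp != NULL);
    pp->buf[pp->write_idx++] = sample;
    if (pp->write_idx == HALF_LEN) {
        /* Half interrupt: PING is ready while PONG keeps filling. */
        pp->on_half_ready(&pp->buf[0], HALF_LEN);
    } else if (pp->write_idx == 2 * HALF_LEN) {
        /* Complete interrupt: PONG is ready, transfer reloads to the start. */
        pp->on_half_ready(&pp->buf[HALF_LEN], HALF_LEN);
        pp->write_idx = 0;
    }
}

/* Example consumer: a trivial "filter" that averages each half-buffer. */
uint32_t g_last_average;
unsigned g_blocks_done;

void filter_task(const uint16_t *half, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += half[i];
    }
    g_last_average = sum / (uint32_t)n;
    g_blocks_done++;
}
```

The point of the arrangement is that the CPU is only involved once per half-buffer, and it always processes the half that the DMAC is not currently writing.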
51
51 DSP Processing
DMAC interrupt "passes" the buffer to the DSP filters/library
The buffer is passed to the filter task created by the Signal Processing Library
Extensive Signal Processing Library available for free
52
52 Data Transfer Control (DTC) Sound "Pump"
MTU2 triggers DTC at the sample rate
DTC sends data out the "analog" (sound) port: D/A, PWM, etc.
DTC "chains" to the next part of the sound
DTC continues until complete
Example: announcing a phone number
  8 kHz sample rate
  10 numbers
  CPU runs through the 10 blocks, updating the source pointer to the correct "numbers"
  Software enables DTC start at the head of the chain
Diagram: a chain of DTC transfers, one block per digit ("1" to "9" and "0"), ending in a single interrupt
Only one interrupt is serviced for 10 s of sound, rather than interrupts at 8 kHz
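The chained-transfer idea can be sketched as a linked list of descriptors walked by a software model of the DTC. The structure layout and names below are hypothetical and do not reflect the real DTC register or descriptor format. The last helper simply restates the slide's arithmetic: 8 kHz for 10 s is 80,000 per-sample interrupts, replaced here by one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor: one block of sound samples plus a chain link. */
typedef struct dtc_desc {
    const uint8_t *src;            /* start of this block of samples */
    size_t count;                  /* number of samples in the block */
    const struct dtc_desc *next;   /* next block, NULL at the end of the chain */
} dtc_desc_t;

/* Stand-in for the D/A or PWM data register and some bookkeeping. */
typedef struct {
    uint8_t  last_sample;
    size_t   samples_out;
    unsigned interrupts;
} sound_port_t;

/* Software model: each inner iteration is one timer-triggered transfer. */
void dtc_play_chain(const dtc_desc_t *head, sound_port_t *port)
{
    assert(port != NULL);
    for (const dtc_desc_t *d = head; d != NULL; d = d->next) {
        for (size_t i = 0; i < d->count; i++) {
            port->last_sample = d->src[i];
            port->samples_out++;
        }
    }
    port->interrupts++;   /* single interrupt after the final block */
}

/* Interrupts a per-sample ISR would need for the same playback. */
unsigned long per_sample_interrupts(unsigned long sample_rate_hz,
                                    unsigned long seconds)
{
    return sample_rate_hz * seconds;
}
```

To announce a phone number, the CPU would point each descriptor's src at the recorded block for the matching digit and link the descriptors in dialing order before starting the chain.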
53
53 DTC "Data Scattering"
ADC collects 4 pieces of "unrelated" data
  4 contiguous result registers
ADC "complete" triggers DTC
DTC scatters the ADC data to the correct (non-contiguous) buffers
Interrupt after ALL buffers are updated
  Post flags for tasks waiting on data
Diagram: DTC transfers followed by one interrupt
  Current sample transfer to Motor Control
  "Pot" reading sample transfer to User Interface
  Current sample transfer to PFC Current Control
  Voltage sample transfer to PFC Voltage Control
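Data scattering amounts to a table of source and destination pairs that is processed without CPU involvement. The sketch below models that table in software. The struct and function are illustrative assumptions, not the DTC's actual descriptor format, and the copy loop stands in for the hardware transfers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scatter entry: one ADC result register and its destination. */
typedef struct {
    const volatile uint16_t *src;   /* one ADC result register */
    uint16_t *dst;                  /* buffer slot owned by one task */
} scatter_entry_t;

/* Copy every entry, then report the number of interrupts raised:
 * exactly one, after ALL destinations have been updated. */
unsigned dtc_scatter(const scatter_entry_t *table, size_t n)
{
    assert(table != NULL && n > 0);
    for (size_t i = 0; i < n; i++) {
        *table[i].dst = *table[i].src;
    }
    return 1u;
}
```

In the slide's example the four destinations would belong to the motor control, user interface, PFC current control and PFC voltage control tasks, each of which is flagged from the single interrupt.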
54
54 DTC "Data Gathering"
Input task is "malloc-ing" buffers as data comes in
  128-byte chunks
After collecting 4 buffers, the data must be written to a "sector" of the FLASH drive (512 bytes)
Operation:
  CPU updates the DTC source pointers to the 4 buffers
  Command sequence to the FLASH drive (Write Sector command)
  Start DTC to transfer the "data block"
  DTC interrupt: "free" the buffers; the FLASH drive writer task goes idle
Diagram: buffers 1 to 4 in memory (non-contiguous), DTC transfers 1 to 4, FLASH drive
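Gathering is the mirror image of scattering: several non-contiguous source buffers feed one destination. In the sketch below each memcpy stands in for one DTC transfer, and a RAM array stands in for the FLASH drive's sector. The sizes come from the slide, while the function and macro names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CHUNK_LEN          128u
#define SECTOR_LEN         512u
#define CHUNKS_PER_SECTOR  (SECTOR_LEN / CHUNK_LEN)   /* 4 */

/* Gather four separately allocated 128-byte chunks into one 512-byte
 * sector image, in order. Each memcpy models one DTC transfer. */
void dtc_gather_sector(uint8_t sector[SECTOR_LEN],
                       const uint8_t *const chunks[CHUNKS_PER_SECTOR])
{
    assert(sector != NULL && chunks != NULL);
    for (size_t i = 0; i < CHUNKS_PER_SECTOR; i++) {
        assert(chunks[i] != NULL);
        memcpy(sector + i * CHUNK_LEN, chunks[i], CHUNK_LEN);
    }
}
```

On the real device the destination would be the drive's data port rather than RAM, and the buffers would be freed from the DTC completion interrupt as the slide describes.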
55
55 Complex Stepper Profiles Using DTC
S-curve profiles in FLASH
Timer sets the "profile rate" and triggers the DTC
DTC transfers PWM data based on the profile
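An S-curve profile is just a table of values that ramps smoothly at both ends. The slide does not say how its tables are generated. As one possible approach, the sketch below fills a table using the cubic smoothstep s(t) = 3t^2 - 2t^3, which is an assumption chosen for simplicity. In practice such a table would typically be precomputed and stored as constant data in flash for the DTC to read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill table[0..n-1] with step rates that ramp from start_hz to end_hz
 * along s(t) = 3t^2 - 2t^3, with t running from 0 to 1.
 * Values are rounded to the nearest integer. */
void scurve_fill(uint16_t *table, size_t n, uint16_t start_hz, uint16_t end_hz)
{
    assert(table != NULL && n >= 2 && end_hz >= start_hz);
    for (size_t i = 0; i < n; i++) {
        double t = (double)i / (double)(n - 1);
        double s = t * t * (3.0 - 2.0 * t);
        table[i] = (uint16_t)(start_hz + s * (double)(end_hz - start_hz) + 0.5);
    }
}
```

The curve changes slowly near both ends and fastest in the middle, which limits jerk at the start and end of a move. The timer's "profile rate" then decides how quickly the DTC steps through the table.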
56
Renesas Electronics America Inc.