‘C’ for Microcontrollers, Just Being Efficient Lloyd Moore, President



Agenda
- Microcontroller Resources
- Knowing Your Environment
- Memory Usage
- Code Structure
- Interrupts
- Math Tricks
- Optimization

Disclaimer
- Some microcontroller techniques necessarily trade one benefit for another, typically lower resource usage at the cost of maintainability
- The point of this presentation is to show various techniques that can be used as needed
- Use these suggestions when necessary
- Feel free to suggest better solutions as we go along

Microcontroller Resources
- EVERYTHING resides on one die inside one package: RAM, Flash, Processor, I/O
- Cost is a MAJOR design consideration
  - Typical costs are $0.25 to $25 each (in quantities of 1000s)
- RAM: 16 BYTES to 32 KB typical
- Flash/ROM: 384 BYTES to 256 KB
- Clock speed: 4 MHz to 80 MHz typical
  - Much lower for battery-saving modes (32 kHz)
- Bus is 8, 16, or 32 bits wide (just like the old days)

Other Considerations
- Specialized resources often present
  - Counters, UART, USB PHY, LCD controller
- Portability inside families a big concern
  - Across families, not so much
- Typically no operating system present
  - May have hardware-centric API, or just raw registers!
- No floating point hardware
  - May have other math hardware (MAC, CRC)
- No protected memory / MMU
  - Do have specialized memory segments

Power Consumption
- Microcontrollers are typically used in battery-operated devices
- Power requirements can be EXTREMELY tight
  - Energy harvesting applications
  - Long-term battery installations (remote controls, hard-to-reach devices, etc.)
- EVERY instruction executed consumes power, even if you have the time!

Know Your Environment
- Traditionally we ignore hardware details
- Need to tailor code to hardware available
  - Specialized hardware MUCH more efficient
- Compilers typically have extensions
  - Interrupt: specifies code as being an ISR
  - Memory model: may handle banked memory and/or simultaneous-access banks
  - Multiple data pointers / address generators
- Debugger may use some resources

Memory Usage
- Use ‘const’ to put data into program memory
- Alignment / padding issues
  - Typically NOT an issue, non-aligned access ok
- Avoid dynamic memory allocation
  - Takes extra space and processing time
  - Memory fragmentation a big issue
- Use and reuse static buffers
  - Reduces variable passing overhead
  - Allows for smaller / faster code due to reduced indirections
  - Does bring back overwrite bugs if not done carefully
- Use the appropriate variable type
  - Don’t use int and double for everything!!
  - Affects processing time as well as storage
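The memory suggestions above can be sketched roughly as follows. This is only an illustration under assumptions: the names (k_gamma_table, g_scratch, fill_scratch, scratch_data) are hypothetical, and whether 'const' data actually lands in flash depends on the toolchain and memory model.

```c
#include <stdint.h>

/* Read-only table: on many microcontroller toolchains 'const' data is
   placed in flash/ROM rather than RAM (placement is toolchain-specific). */
static const uint8_t k_gamma_table[4] = { 0u, 16u, 64u, 255u };

/* One statically allocated scratch buffer, reused instead of malloc(). */
#define SCRATCH_SIZE 16u
static uint8_t g_scratch[SCRATCH_SIZE];

/* Fill the shared buffer with table values and return the number of bytes
   written. The counters are uint8_t because the sizes fit in 8 bits. */
uint8_t fill_scratch(uint8_t count)
{
    uint8_t i;
    if (count > SCRATCH_SIZE) {
        count = SCRATCH_SIZE;   /* guard against writing past the buffer */
    }
    for (i = 0u; i < count; i++) {
        g_scratch[i] = k_gamma_table[i & 3u];
    }
    return count;
}

/* Read-only view of the shared buffer for other modules. */
const uint8_t *scratch_data(void)
{
    return g_scratch;
}
```

The length clamp is there because a shared static buffer is exactly where the overwrite bugs mentioned above tend to come back.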

Char vs. Int Increment on 8051

    char cX;
    cX++;

    000A          MOV   DPTR,#cX
    000D  E0      MOVX  A,@DPTR
    000E  04      INC   A
    000F  F0      MOVX  @DPTR,A

- 6 bytes of Flash
- 4 instruction cycles

    int iX;
    iX++;

    0000          MOV   DPTR,#iX
    0003  E4      CLR   A
    0004  75F001  MOV   B,#01H
    0007          LCALL ?C?IILDX

- 10 bytes of Flash + subroutine overhead
- Many more than 4 instruction cycles with an LCALL

Code Structure
- Count down instead of up
  - Saves a subtraction (the compare against the loop limit) on all processors
  - DJNZ-style instruction on some processors
- Pointers vs. array notation
  - Generally better using pointers
- Bit shifting
  - May not always generate what you think
  - May or may not have barrel shifter hardware
  - May or may not have logical vs. arithmetic shifts
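A minimal sketch of the first two points, assuming a plain byte buffer; the function names are made up for illustration, and whether the second form is actually smaller or faster depends on the compiler, so check the listing.

```c
#include <stdint.h>

/* Sum 'len' bytes by indexing and counting up. */
uint16_t sum_up(const uint8_t *buf, uint8_t len)
{
    uint16_t total = 0u;
    uint8_t i;
    for (i = 0u; i < len; i++) {
        total += buf[i];
    }
    return total;
}

/* Same result, counting down with a walking pointer. The loop test is a
   compare against zero, which some cores fold into a DJNZ-style instruction. */
uint16_t sum_down(const uint8_t *buf, uint8_t len)
{
    uint16_t total = 0u;
    while (len != 0u) {
        total += *buf++;
        len--;
    }
    return total;
}
```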

Shifting Example

    cX = cX << 3;

                RLC   A
                RLC   A
                RLC   A
          F8    ANL   A,#0F8H

    cA = 3;
    cX = cX << cA;

    000B        MOV   DPTR,#cA
    000E  E0    MOVX  A,@DPTR
    000F  FE    MOV   R6,A
    0010  EF    MOV   A,R7
          A806  MOV   R0,AR6
                INC   R0
                SJMP  ?C
    0016        ?C0004:
    0016  C3    CLR   C
                RLC   A
    0018        ?C
          D8FC  DJNZ  R0,?C0004

- Constants turn into separate statements
- Variables turn into loops
- Both of these can be one instruction with a barrel shifter

More Code Structure
- Actual parameters typically passed in registers if available
  - Keep function parameters to fewer than 3
  - May also be passed on stack or special parameter area
  - May be more efficient to pass pointer to struct
- Global variables
  - While generally frowned upon for most code, can be very helpful here
  - Typically ends up being a direct access
- Read assembly code for critical areas
- Know which optimizations are present
  - Small compilers do not always have common optimizations
  - Inline, loop unrolling, loop invariant, pointer conversion
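As a rough illustration of the pass-a-pointer-to-struct suggestion, the sketch below groups four settings behind one pointer. ChannelConfig and channel_output are hypothetical names, and the actual register usage depends on the compiler's calling convention.

```c
#include <stdint.h>

/* Hypothetical channel settings, grouped so one pointer replaces four arguments. */
typedef struct {
    uint8_t mode;
    uint8_t gain;
    uint8_t offset;
    uint8_t enabled;
} ChannelConfig;

/* Two parameters in total: a pointer and a sample. Returns the scaled
   sample, saturated to 255, or 0 when the channel is disabled. */
uint8_t channel_output(const ChannelConfig *cfg, uint8_t sample)
{
    uint16_t scaled;
    if (!cfg->enabled) {
        return 0u;
    }
    /* Worst case 255 * 255 + 255 = 65280, which still fits in 16 bits. */
    scaled = (uint16_t)((uint16_t)sample * cfg->gain) + cfg->offset;
    return (scaled > 255u) ? 255u : (uint8_t)scaled;
}
```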

Indexed Array vs Pointer on M8C

    ucMode = g_Channels[uc_Channel].ucMode;

    01DC  52FC    mov   A,[X-4]
    01DE  5300    mov   [__r1],A
    01E0          mov   A,0
    01E2  08      push  A
    01E3          mov   A,[__r1]
    01E5  08      push  A
    01E6          mov   A,0
    01E8  08      push  A
    01E9          mov   A,7
    01EB  08      push  A
    01EC  7C0000  xcall __mul16
    01EF  38FC    add   SP,-4
    01F1  5F0000  mov   [__r1],[__rX]
    01F4  5F0000  mov   [__r0],[__rY]
    01F7          add   [__r1],<_g_Channels
    01FA  0E0000  adc   [__r0],>_g_Channels
    01FD  3E00    mvi   A,[__r1]
    01FF  5403    mov   [X+3],A

    ucMode = pChannel->ucMode;

    01ED  5201    mov   A,[X+1]
    01EF  5300    mov   [__r1],A
    01F1  3E00    mvi   A,[__r1]
    01F3          mov   [X+5],A

- Does the same thing
- Saves 29 bytes of memory AND a call to a 16-bit multiplication routine!
- Pointer version will be at least 4x faster to execute as well, maybe 10x
- Most compilers are not this bad, but you do find some!

Interrupts
- Generally implemented as individual hardware vectors with a small amount of program memory at the location
- ISR is what you get: no OS, no threads, no IST (interrupt service thread)
  - Can use a flag with main loop to get IST behavior for less time-critical code
- Also very common to use interrupts to simulate threads
  - Interrupt itself takes the place of the WaitFor_XXX or signal
  - Follows very naturally for hardware tasks and timers
- Generally an “interrupt” keyword is provided

Interrupt Example

    static volatile unsigned char g_TimerTriggered;  //volatile: written by the ISR, read by main

    void main()
    {
        ConfigureTimer0();
        g_TimerTriggered = 0;
        GlobalEnableInterrupt();
        while(1)
        {
            if(g_TimerTriggered)
            {
                g_TimerTriggered = 0;  //Could also disable the timer interrupt here
                DoTimerTask();         //to avoid a race condition resetting g_TimerTriggered
            }
            //Can put optional sleep here, interrupts can wake up processor
        }
    }

    void Timer0ISR(void) interrupt 1 using 2  //Keil C51 style: interrupt number 1 (Timer 0), register bank 2
    {
        g_TimerTriggered = 1;
        //Can put other small, quick work here
    }
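For readers without an 8051 toolchain, the same flag handshake can be sketched in portable C. This is an assumption-laden stand-in: timer_isr_body and timer_event_pending are hypothetical names, and the ISR body is an ordinary function here so it can be called from a host test. On real hardware it would carry the compiler's interrupt keyword or attribute.

```c
#include <stdint.h>

/* Set by the ISR, cleared by the main loop. 'volatile' forces every access
   to really read or write memory instead of a cached register copy. */
static volatile uint8_t g_timer_flag = 0u;

/* Stand-in for the real ISR body: keep it short, just record the event. */
void timer_isr_body(void)
{
    g_timer_flag = 1u;
}

/* Called from the main loop: returns 1 once per recorded event, else 0.
   The test-and-clear is not atomic, so an interrupt arriving between the
   test and the clear is merged with the event being handled. */
uint8_t timer_event_pending(void)
{
    if (g_timer_flag) {
        g_timer_flag = 0u;
        return 1u;
    }
    return 0u;
}
```

If merged events matter, one option is to disable the timer interrupt around the test-and-clear, as the comment in the slide's example suggests.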

Switch Statement Implementation
- Switch statements can be implemented in various ways
  - Sequential compares
  - In-line table lookup for case block
  - Special function with lookup table
- Specific implementation can also vary based on the case clauses
  - Clean sequence (1, 2, 3, 4, 5)
  - Gaps in sequence (1, 10, 30, 255)
  - Ordering of sequence (5, 4, 1, 2, 3)
- Knowing which method gets implemented is critical to optimizing!

Switch Statement Example

    switch(cA)
    {
        case 0: cX = 4; break;
        case 1: cX = 10; break;
        case 2: cX = 30; break;
        default: cX = 0; break;
    }

    0006        MOV   DPTR,#cA
    0009  E0    MOVX  A,@DPTR
    000A  FF    MOV   R7,A
    000B  EF    MOV   A,R7
    000C        LCALL ?C?CCASE
    000F  0000  DW    ?C
                DB    00H
                DW    ?C
                DB    01H
                DW    ?C
                DB    02H
                DW    00H
    001A  0000  DW    ?C
    001C        ?C0002:
    001C        MOV   DPTR,#cX
    001F  7404  MOV   A,#04H
    0021  F0    MOVX  @DPTR,A
                SJMP  ?C

- More blocks follow for each case
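When the case values form a clean sequence like the one in this example, a hand-written lookup table is one possible alternative to the switch. The sketch below is not from the slides; k_case_values and lookup_cx are hypothetical names, and whether it beats the compiler's own table depends on the toolchain.

```c
#include <stdint.h>

/* Values for cases 0, 1 and 2 from the switch example. 'const' lets the
   toolchain keep the table in program memory. */
static const uint8_t k_case_values[3] = { 4u, 10u, 30u };

/* Equivalent of the switch: table lookup for 0..2, default value 0 otherwise. */
uint8_t lookup_cx(uint8_t cA)
{
    return (cA < 3u) ? k_case_values[cA] : 0u;
}
```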

Bit Variables
- Some processors have special memory areas and op-codes for single-bit storage
- Saves overhead of masking operations
- Some compilers key off bit-field notation, some need a keyword (frequently ‘bit’)

    struct { unsigned int foo : 1; } flags;
    unsigned int my_bit : 1;
    bit my_bit;
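For comparison, here is a small standard-C sketch of the two portable options: a bit-field struct, and the manual masking that bit variables are meant to avoid. The names (StatusFlags, FLAG_READY, set_flag and so on) are illustrative only. The 'bit' keyword itself is a compiler extension and is not shown.

```c
#include <stdint.h>

/* Standard C bit-fields: the compiler generates the masking, and on parts
   with bit-addressable memory it may use single-bit instructions instead. */
typedef struct {
    unsigned int ready : 1;
    unsigned int error : 1;
} StatusFlags;

/* Manual equivalent using masks on a plain byte. */
#define FLAG_READY 0x01u
#define FLAG_ERROR 0x02u

uint8_t set_flag(uint8_t flags, uint8_t mask)
{
    return (uint8_t)(flags | mask);
}

uint8_t clear_flag(uint8_t flags, uint8_t mask)
{
    return (uint8_t)(flags & (uint8_t)~mask);
}

/* Returns 1 if any bit in 'mask' is set in 'flags', otherwise 0. */
uint8_t test_flag(uint8_t flags, uint8_t mask)
{
    return (uint8_t)((flags & mask) != 0u);
}
```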

Math Tricks
- Floating point math is VERY expensive on microcontrollers
  - No hardware support
  - Typically 32 bits for float, 64 bits for double
  - Support provided by a BIG library
- Can use fixed point math in many cases
  - Basically the same as integer math, but the binary point is moved inside the integer
  - The bits of a binary number really carry the weights: 2^7 + 2^6 + … + 2^2 + 2^1 + 2^0
  - To make a fixed point number just adjust the exponents: 2^6 + 2^5 + … + 2^1 + 2^0 + 2^-1 (note: 2^-1 = 0.5)
- Assume an 8-bit value: range = [0, 255]
- Assume one bit after the binary point: XXXXXXX.X
  - Range is now [0, 127.5]
- Addition and subtraction stay the same so long as only fixed point numbers with the same binary point location are used together! Multiplication and division also need a shift to put the binary point back.
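A minimal sketch of the 8-bit, one-fraction-bit format described above. The type name ufix7_1 and the helper names are invented for this example, and overflow handling is left out to keep it short.

```c
#include <stdint.h>

#define FRAC_BITS 1u        /* one bit to the right of the binary point */

typedef uint8_t ufix7_1;    /* hypothetical name: 7 integer bits, 1 fraction bit */

/* Convert a whole number (0..127) to fixed point. */
ufix7_1 fix_from_int(uint8_t whole)
{
    return (ufix7_1)(whole << FRAC_BITS);
}

/* Addition is plain integer addition when both binary points match. */
ufix7_1 fix_add(ufix7_1 a, ufix7_1 b)
{
    return (ufix7_1)(a + b);
}

/* A product carries twice the fraction bits, so shift it back once.
   The low bit is truncated, not rounded. */
ufix7_1 fix_mul(ufix7_1 a, ufix7_1 b)
{
    uint16_t product = (uint16_t)((uint16_t)a * b);
    return (ufix7_1)(product >> FRAC_BITS);
}

/* Whole-number part, dropping the fraction bit. */
uint8_t fix_to_int(ufix7_1 v)
{
    return (uint8_t)(v >> FRAC_BITS);
}
```

In this format the raw value 5 stands for 2.5 and the raw value 4 stands for 2.0, so fix_mul(5, 4) yields raw 10, which is 5.0.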

More Math Tricks
- You may not have multiply and/or divide ops!
- Decomposing operations can help
  - X * 5 = X * 4 + X
  - (X * 4) can become 2 shift left operations
- Formulas should also be restructured for the math available:
  - Y = ax^2 + bx + c : 1 Pow or Mult, 2 Mult, 2 Add
  - Y = x (ax + b) + c : 2 Mult, 2 Add
- Lookup tables can be great for limited domain problems
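Both restructurings can be written out directly in C. This is a sketch with made-up function names; the integer widths are chosen only so the small examples below do not overflow.

```c
#include <stdint.h>

/* x * 5 decomposed as (x * 4) + x: two left shifts and one add. */
uint16_t times5(uint8_t x)
{
    return (uint16_t)(((uint16_t)x << 2) + x);
}

/* y = a*x^2 + b*x + c evaluated as x*(a*x + b) + c: 2 multiplies, 2 adds.
   The caller is assumed to keep the inputs small enough for 32 bits. */
int32_t quadratic(int16_t a, int16_t b, int16_t c, int16_t x)
{
    return (int32_t)x * ((int32_t)a * x + b) + c;
}
```

For example, with a = 2, b = 3, c = 4 and x = 5 the restructured form computes 5 * (10 + 3) + 4 = 69, the same as 2 * 25 + 15 + 4.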

Optimization
- Step 0: Before coding anything, think about risk points and prototype unknowns!!!
- Step 1: Get it working!!
  - Fast but wrong is of no use to anyone
  - Optimization will typically reduce readability
- Step 2: Profile to know where to optimize
  - Usually only one or two routines are critical
  - You need to have specific performance metrics to target

Optimization (continued)
- Step 3: Let the tools do as much as they can
  - Turn off debugging!
  - Select the correct memory model
  - Select the correct optimization level
- Step 4: Do it manually
  - Read the generated code! Might be able to make a simple code or structure change.
  - Last: think about assembly coding

Summary
- Microcontroller hardware is much simpler than most of us are used to
- Be familiar with the hardware in your microcontroller
- Be familiar with your compiler options and how it translates your code
- For time or space critical code, look at the assembly listing from time to time

Questions?