Basics of Embedded Systems IAX0230 Embedded Software Programming

Prof. Dr. Kalle Tammemäe Prof. Dr.-Ing. Thomas Hollstein Dr. Uljana Reinsalu

Outline Program design challenges Assembler C for embedded
Storage classes Qualifiers Organization of the code Compiling for embedded Cross-compilers Design patterns Concurrent code in C for bare-metal Intro to Lab9 and Lab10

Embedded programming challenges
Embedded programs must: meet system deadlines; handle memory resources efficiently; meet power consumption requirements. Advances in design techniques and tools make it possible to use languages more abstract than assembler, which allows higher design productivity.

Embedded programming challenges
However … In order to get an efficient implementation we must: understand the underlying hardware; understand our tools (compiler); understand how to program the hardware.

Embedded SW difference from usual SW programming
Registers and memory can be addressed directly; interrupts can be handled and interact with the main program; individual bits must be considered. Embedded SW functionality and efficiency depend strongly on the hardware used. Variables of fixed size are frequently used, such as uint8_t, int16_t. Predicting embedded SW execution time requires good knowledge of the hardware used.

Usual challenges for embedded SW
Human resources: embedded software programming is closely connected with hardware, so programmers must have a good overview of hardware and electronics. Memory limits. Processor performance. Power consumption: very important for battery-powered devices. Device price.

Advantage of Assembler
All hardware features can be used. Instructions that the C compiler does not recognize can be used. No optimization is needed when translating assembler to machine code. Programs of a precisely known size and high speed can be created. There are 203 instructions in the ARM Cortex-M4 instruction set…

Disadvantage of Assembler
Code is difficult to read It is difficult to write big program without bugs It is difficult to work in groups It is difficult to port code from one architecture to another It is very difficult to create reusable code

When to use Assembler In places where it is not possible to program in another language (e.g. C): some schedulers; when the compiler is not familiar with some op-codes; if you need to fill cycles with a delay.
void CPUsleep(void){
  _asm ("WFI\n"); // wait for interrupt
  return;
}
[Q] What does the underscore before the name mean? And two underscores?
int ADD32(int s1, int s2){
  int res;
  _asm{ ADD res, s1, s2 }
  return res;
}

When to use Assembler For programs which must have a certain size, such as bootloaders, or at least their first stages. For program parts which require very high speed (if they cannot be written in another language): mathematical functions; interrupt handling. In critical parts of the program, when you are not sure how the compiler would translate the code. It is better to encapsulate assembler code in separate functions. Be careful when using registers: apart from a few, most registers are used by the compiler. Restore registers at the end of the assembler code. Do not interfere with PC, SP and LR. Scratch registers are not expected to be preserved upon returning from a called subroutine; this applies to r0–r3. Preserved ("variable") registers are expected to have their original values upon returning from a called subroutine; this applies to r4–r8 and r10–r11; use PUSH {r4,..} and POP {r4,...}. R9 is the platform register.

C programming language
C is a high-level language Abstracts hardware Expressive Readable Analyzable C is a procedural language The programmer explicitly specifies steps Program composed of procedures Functions/subroutines C is based on expressions C is a sequential imperative language C is compiled (not interpreted) Code is analyzed as a whole (not line by line)

Why C?
C is popular: big community of programmers; a lot of libraries are written in C. C influenced many languages. C is considered close-to-machine (HW): language of choice when careful coordination and control is required; straightforward behavior (typically); the program can be as fast as an ASM program if a good compiler is in use. Typically used to program low-level software (with some assembly): drivers, runtime systems, operating systems, schedulers, … As of October 2017: Java, C, C++, C#, Python, JavaScript … 9th: Assembly language!

Programming in C C has been designed with respect to an efficient implementation; ‘unnecessary’ features have been left out: There is no Boolean type in classic C, integers are used instead (C99 later added _Bool and <stdbool.h>). There is no garbage collection. Programmers can access memory locations that are not declared …

Programming in C C programmers have a huge responsibility!
C-code can be very fast BUT it is easy to produce bugs, which are not caught by the compiler: Memory leaks Accessing wrong memory locations Assignments instead of comparison … Be extremely careful, when programming in C!

Data structures in memory
In order to design efficient programs it is often important to know how data structures are located in memory. The number of cache misses can often be reduced. Example: an array is located as a contiguous sequence in memory. int x = 5; int y = 6; int a[5];
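A minimal sketch of why the memory layout matters, assuming a small two-dimensional array. The names ROWS, COLS, sum_row_major and offset_of_element are hypothetical and chosen only for illustration. C stores arrays row by row, so the inner loop below touches consecutive addresses, which is the access pattern that tends to keep cache misses low.

```c
#include <stddef.h>

#define ROWS 4   /* illustrative sizes, not from the lecture */
#define COLS 8

/* Row-major traversal: the inner loop walks consecutive addresses,
   so every fetched cache line is used completely. */
long sum_row_major(const int m[ROWS][COLS])
{
    long sum = 0;
    for (size_t r = 0; r < ROWS; r++) {
        for (size_t c = 0; c < COLS; c++) {
            sum += m[r][c];
        }
    }
    return sum;
}

/* Byte offset of m[r][c] from the start of the array. */
size_t offset_of_element(size_t r, size_t c)
{
    return (r * COLS + c) * sizeof(int);
}
```

Swapping the two loops would visit the same elements but jump COLS * sizeof(int) bytes between accesses, which is typically less cache friendly.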

Typecasting in C Typecasting is the conversion from one value type to another. Typecasting from a smaller type to a bigger one is usually not dangerous, e.g. casting a uint16_t variable to uint32_t does no harm in memory. Be careful with registers (fixed size)! When typecasting from a bigger type to a smaller one, the value range must always be checked. Pointer typecasting from a smaller to a bigger type requires great caution: the C compiler does not give any error if such a typecast corrupts memory belonging to another function. It is considered good programming practice to typecast explicitly whenever there is a type conversion.
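The range check before a narrowing typecast can be sketched as follows. The helper name narrow_u32_to_u16 is hypothetical; it only illustrates the rule that a bigger-to-smaller conversion should be checked first.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: narrows a 32-bit value to 16 bits only when
   the value fits. Returns false and leaves *out untouched otherwise. */
bool narrow_u32_to_u16(uint32_t in, uint16_t *out)
{
    if (in > UINT16_MAX) {
        return false;        /* value would be truncated */
    }
    *out = (uint16_t)in;     /* explicit cast documents the conversion */
    return true;
}
```

The caller decides what to do when the value does not fit (saturate, report an error, ...); the sketch deliberately does not pick a policy.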

Example: typecasting in array
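#include <stdio.h>
#include <stdint.h>
int main (void){
  uint8_t arr[] = {0x11, 0x22};
  uint16_t *ptr;
  ptr = (uint16_t*)&arr[0];
  printf ("arr: 0x%02x,0x%02x\n",arr[0],arr[1]);
  printf ("ptr: 0x%04x\n", *ptr);
  ptr = (uint16_t*)&arr[1]; //erroneous typecast.
  printf ("ptr: 0x%04x\n", *ptr);
  return 0;
}
/* output */
arr: 0x11,0x22
ptr: 0x1122
ptr: 0x2200
The ptr values shown correspond to big-endian byte order; on a little-endian machine the first read yields 0x2211. The second read is misaligned and reaches one byte past the end of the array, so its result is not defined.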

C bitwise operators Bitwise complement (~) Bitwise AND (&)
Bitwise exclusive OR (^) Bitwise inclusive OR (|) Left shift (<<) 0’s are shifted in Right shift (>>) for unsigned data types 0’s are shifted in  for signed data types the sign bit is shifted in
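A short sketch of how the shift and mask operators combine to read and write a bit field inside a register-sized value. The function names extract_bits and insert_bits are hypothetical, and the sketch assumes pos + width does not exceed 32. Only unsigned types are used, so zeros are always shifted in.

```c
#include <stdint.h>

/* Mask with the lowest 'width' bits set (width between 1 and 32). */
static uint32_t low_mask(unsigned width)
{
    return (width >= 32u) ? 0xFFFFFFFFu : (((uint32_t)1 << width) - 1u);
}

/* Read 'width' bits starting at bit position 'pos'. */
uint32_t extract_bits(uint32_t value, unsigned pos, unsigned width)
{
    return (value >> pos) & low_mask(width);
}

/* Replace 'width' bits starting at 'pos' with 'field'. */
uint32_t insert_bits(uint32_t value, unsigned pos, unsigned width,
                     uint32_t field)
{
    uint32_t mask = low_mask(width) << pos;
    return (value & ~mask) | ((field << pos) & mask);
}
```

For example, extract_bits(0xABCD, 4, 8) yields 0xBC, and insert_bits(0xABCD, 4, 8, 0x12) yields 0xA12D.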

Logic operations Logic Instructions
AND{S} {Rd,} Rn, <op2> ; Rd=Rn&op2 ORR{S} {Rd,} Rn, <op2> ; Rd=Rn|op2 EOR{S} {Rd,} Rn, <op2> ; Rd=Rn^op2 BIC{S} {Rd,} Rn, <op2> ; Rd=Rn&(~op2) ORN{S} {Rd,} Rn, <op2> ; Rd=Rn|(~op2) [J.Valvano Lec]

Storage class auto Default storage class for variables declared in a function. An automatic variable is only visible in the block it is declared in. System allocates memory when entering the block. System releases memory when leaving the block. Variable value is lost when leaving the block. As it is the default, there is no need to specify it explicitly… int f(int x) { /* both a and b are of storage class auto */ int a; auto int b; … }

Storage class extern Default storage class for variables declared outside a function An extern variable is visible in the whole file after its declaration (global variable) System allocates permanent storage for the variable The keyword extern is used to tell the compiler to look for the variable or function elsewhere in another file

Storage class register
register storage class may only be declared inside functions It promotes faster access to the content of the variable Behaves in the same way as auto, but gives an additional recommendation to compiler that variable should be placed in register Compiler does not have to follow this advice Number of registers is limited in CPU int f() { register int i; … for(i=0; i < 1000; i++) a[i] = i; …}

Storage class static In contrast to auto variables, static variables retain their values: permanent storage is assigned to the variable. This means its value is kept between function calls. int counter(void) { static int i = 0; i++; return i; }

The type qualifier volatile
The type qualifier volatile indicates that a variable can be changed by other parts of the hardware in the system This statement is used to tell the compiler to evaluate the variable every time, rather than optimizing it. In practice, only following types of variables could change: Memory-mapped peripheral registers Global variables modified by an interrupt service routine Global variables accessed by multiple tasks within a multi-thread application

Peripheral registers & volatile
When compiler optimizations are turned on, the compiler may read the register only once, before the loop:
uint8_t *pReg = (uint8_t *) 0x1234;
//Wait for register to become non-zero
while (*pReg == 0) { }
// Do something else
The compiler generates (the load, mov, stays outside the loop):
mov ptr, #0x1234
mov
loop: bz loop
Using ‘volatile’, the load is repeated on every iteration:
uint8_t volatile *pReg = (uint8_t volatile *) 0x1234;
mov ptr, #0x1234
loop: mov
bz loop

Interrupt Service Routines & volatile
ISRs often set variables that are tested in mainline code. The compiler has no idea that etx_rcvd can be changed in the ISR => all code after the ‘while’ loop could be removed => declare etx_rcvd as volatile.
int volatile etx_rcvd = FALSE;
void main() {
  ...
  while (!etx_rcvd) {
    // wait
  }
}
interrupt void rx_isr(void) {
  if (ETX == rx_char) {
    etx_rcvd = TRUE;
  }
}

Multithread applications & volatile
It is common to exchange information via a shared memory location.
int volatile cntr;
void task1(void) {
  cntr = 0;
  while (cntr == 0) {
    sleep(1);
  }
  ...
}
void task2(void) {
  cntr++;
  sleep(10);
}

Structures Struct –mechanism to store composite information; type is defined by user Allocation of variable of type playerType will occupy 4 bytes in RAM We can access the individual elements of this variable using the syntax name.element struct player{ unsigned char Xpos; // first element unsigned char Ypos; // second element unsigned short Score; // third element }; typedef struct player playerType; playerType Sprite; Sprite.Xpos = 10; Sprite.Ypos = 20; Sprite.Score = 12000;

If memory size is constant
Several small variables could be packed into a single variable: Set of masks Bit-fields Unions

Set of masks #define FLAG1 1 #define FLAG2 2 #define FLAG3 4 …
char flags; /* contains all flags */ if ((flags & (FLAG1 | FLAG3)) == 0) /*true if both flags are 0 */
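The usual mask operations (set, clear, toggle, test) can be sketched as small helpers. The flag values follow the slide; the helper names are hypothetical, and uint8_t is used instead of plain char so that the result does not depend on the signedness of char.

```c
#include <stdint.h>

#define FLAG1 1u
#define FLAG2 2u
#define FLAG3 4u

/* Set every bit that is set in mask. */
uint8_t set_flags(uint8_t flags, uint8_t mask)
{
    return (uint8_t)(flags | mask);
}

/* Clear every bit that is set in mask. */
uint8_t clear_flags(uint8_t flags, uint8_t mask)
{
    return (uint8_t)(flags & (uint8_t)~mask);
}

/* Invert every bit that is set in mask. */
uint8_t toggle_flags(uint8_t flags, uint8_t mask)
{
    return (uint8_t)(flags ^ mask);
}

/* Returns 1 if all bits selected by mask are 0, as in the slide's test. */
int all_clear(uint8_t flags, uint8_t mask)
{
    return (flags & mask) == 0;
}
```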

Bit-fields
A bit-field is a packed representation, where several small variables are integrated into one variable. struct myfield { unsigned var1 : 4; /* values from 0 to 15 */ int var2 : 3; /* values from -4 to 3 */ } x; … x.var1 = 6; Implementation of bit-fields is machine dependent!

Union A union defines a set of alternative values that can be stored in the same portion of memory Programmer has responsibility to keep track of the data type stored in a union The size of the union corresponds to the largest data type union int_or_float { int i; float f; } u;

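Union
#include <stdio.h>
union int_or_float { int i; float f; } u;
int main( ) {
  union int_or_float data;
  data.i = 10;
  data.f = 220.5;
  printf( "data.i : %d\n", data.i);
  printf( "data.f : %f\n", data.f);
  return 0;
}
Output:
data.i :
data.f : 220.5
After data.f is written, data.i no longer holds 10: both members share the same memory, so data.i shows the bit pattern of the float.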
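One common way for the programmer to keep track of the data type stored in a union is to pair it with a tag. The following is a sketch of that idea, not something prescribed by the lecture; all names (value_kind, tagged_value, make_int, make_float, get_int) are hypothetical.

```c
#include <stdbool.h>

enum value_kind { KIND_INT, KIND_FLOAT };

struct tagged_value {
    enum value_kind kind;   /* records which union member is valid */
    union {
        int   i;
        float f;
    } as;
};

struct tagged_value make_int(int i)
{
    struct tagged_value v;
    v.kind = KIND_INT;
    v.as.i = i;
    return v;
}

struct tagged_value make_float(float f)
{
    struct tagged_value v;
    v.kind = KIND_FLOAT;
    v.as.f = f;
    return v;
}

/* Writes *out and returns true only if v currently holds an int. */
bool get_int(const struct tagged_value *v, int *out)
{
    if (v->kind != KIND_INT) {
        return false;
    }
    *out = v->as.i;
    return true;
}
```

The tag costs a few extra bytes, which has to be weighed against the memory the union saves.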

Organization of code file
Opening comments The overall purpose of the SW module The names of programmers The creation and last update dates The HW/SW configuration required to use module Copyright information Including .h files Extern references #define statements Struct union enum statements Global variables and constants Prototypes of private functions Implementations of the functions Employ run-time testing

C coding tips Use local variable as much as possible (local variables use CPU registers whereas global variables use RAM); Use unsigned data type where possible; Use pointers to access structures and unions; Use “static const” class to avoid run-time copying of structures, unions, and arrays; Count down “for” loops.

Compiler, assembler and linking

ARM Cortex-M4 software development procedure
Keil ARM MDX Vision (v4.74, v5) TI CCS (v6) [Bai Y Book, p. 157]

Kahoot time!

SW development environment
[J.Valvano Lec]

Compiling C code
Compiler steps: process macros; check syntax; transform the code to op-codes; optimize; generate object file. Linking: link all object files; generate final executable file.

For good embedded programs
An awareness of compiler functionality, code generation and optimization is essential. Although there is little need to write assembly language code, an ability to read and understand the generated code is very useful. Optimization should be used with care and tuned appropriately for the debug phase and for shipping code.

Linker: main purpose how the sections in the input files should be mapped into the output file; to control the memory layout of the output file; TI TM4C123G

Cross-Compilation Cross-compiler: generates code for a platform other than the one on which the compiler is running. Example: compiling mobile application code under Windows for Android, iOS, Tizen and Windows Mobile. Embedded systems development environments are typically built on cross-compilers.

Design patterns A design pattern is a typical, repeatable solution to a problem that occurs in many different SW designs. For such design patterns there exists: a general description of the problem; a good understanding of the problem; a good understanding of an efficient solution to the problem. The designer can take an existing solution and customize it to solve their own problem. Example design patterns: state machine; circular buffer.

State machine
A state machine describes a reactive system characterized in terms of: input received; current state of the system. State machines are useful in many contexts: parsing user input; responding to complex stimuli; controlling sequential outputs; controlling the data path. State machines are key elements in hardware and software designs!

Example: seat belt controller

C implementation of seat belt controller
#define IDLE 0 #define SEATED 1 #define BELTED 2 #define BUZZER 3 switch (state) { case IDLE: if (seat) { state = SEATED; timer_on = TRUE; } break; case SEATED: if (belt) state = BELTED; else if (timer) state = BUZZER; break; case BELTED: if(!seat) state = IDLE; else if(!belt) state = SEATED; break; case BUZZER: if(belt) state = BELTED; else if(!seat) state = IDLE; break; }
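The switch statement above can be wrapped in a step function so that the transitions can be exercised on a host PC. This is a sketch under assumptions: the function name belt_step, the enum in place of the #define constants, and passing the inputs as parameters are all illustrative choices; "timer" is taken to mean that the timeout has expired.

```c
#include <stdbool.h>

enum belt_state { IDLE, SEATED, BELTED, BUZZER };

/* Performs one transition of the seat belt controller and returns
   the next state. *timer_on is set when the timer must be started. */
enum belt_state belt_step(enum belt_state state, bool seat, bool belt,
                          bool timer, bool *timer_on)
{
    switch (state) {
    case IDLE:
        if (seat) { state = SEATED; *timer_on = true; }
        break;
    case SEATED:
        if (belt) state = BELTED;        /* belt fastened in time */
        else if (timer) state = BUZZER;  /* timeout expired */
        break;
    case BELTED:
        if (!seat) state = IDLE;
        else if (!belt) state = SEATED;
        break;
    case BUZZER:
        if (belt) state = BELTED;
        else if (!seat) state = IDLE;
        break;
    }
    return state;
}
```

Keeping the transition logic free of hardware access makes it possible to test the state graph before the sensors are connected.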

Buffers for streaming data
In many embedded systems data streams arrive on a regular basis and have to be processed on the fly. The implementation has to react to each new sample and must also avoid unnecessary use of memory.

Producer-Consumer
The producer sends data to a consumer. The producer sends data at a non-constant rate and may send data in bursts. The consumer reads data at a non-constant rate. The average rate of the producer is less than the average rate of the consumer.

Streaming data – the problem
Data is inserted to the buffer tail (push) and taken from the head (pop) How shall such a buffer be implemented?

Very naïve implementation
A memory location buf with space for N items is reserved. The head is always located in buf[0]. The tail is always located in buf[N-1]. Not efficient! Every time an item is popped from the head, all remaining data items have to be moved.

Circular Buffer The circular buffer is a very important design pattern for processing data streams. Instead of moving data, pointers are moved!

C implementation of circular buffer
#define N 10 int buf[N]; int head = 0; … int pushpop(int n){ int oldheadval= buf[head]; buf[head]=n; if(head == N-1) head=0; else head++; return oldheadval; } The example expects the buffer is full of useful information. Always the last old value will be returned, and new value written instead, moving the head further

Circular buffer Circular buffer can also be used for buffers with a non-constant number of elements Head points to the head element Tail points to the first empty element What happens, if an item is added to a full buffer?
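A minimal sketch of a circular buffer with a non-constant number of elements. It assumes one possible answer to the question above, namely that a push into a full buffer is rejected (overwriting the oldest element would be another valid policy). The struct and function names are hypothetical, and the explicit count field is an assumption made to distinguish "full" from "empty".

```c
#include <stdbool.h>
#include <stddef.h>

#define N 4   /* small size chosen only for illustration */

struct ring {
    int    buf[N];
    size_t head;    /* index of the oldest element */
    size_t tail;    /* index of the first empty slot */
    size_t count;   /* number of stored elements */
};

/* Inserts v at the tail. Returns false if the buffer is full. */
bool ring_push(struct ring *r, int v)
{
    if (r->count == N) {
        return false;
    }
    r->buf[r->tail] = v;
    r->tail = (r->tail + 1) % N;   /* wrap around at the end */
    r->count++;
    return true;
}

/* Removes the oldest element into *out. Returns false if empty. */
bool ring_pop(struct ring *r, int *out)
{
    if (r->count == 0) {
        return false;
    }
    *out = r->buf[r->head];
    r->head = (r->head + 1) % N;
    r->count--;
    return true;
}
```

If the producer runs in an interrupt and the consumer in the main program, the shared fields would additionally need protection against concurrent access; the sketch ignores that.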

Programming Concurrency
Assumption: one processor is available for execution. Three programs (processes) P1, P2, P3 should run in parallel. The execution time of P1, P2, P3 can vary (branch prediction success, availability of resources, cache hits/misses). BCET(P1): best case execution time of P1. WCET(P1): worst case execution time of P1. The difference between BCET and WCET can be quite significant (many clock cycles). The processes have to share the "resource" processor: how to control that? A fair assignment scheme is needed.

Time-Triggered Approach
A timer controls the periodic execution interval TC. The starting time of P1 is deterministic, but not that of P2 and P3. The condition \sum_{i=1}^{3} WCET(P_i) \le TC must be fulfilled. [Figure: a timer interrupt triggers the processor, which executes P1, P2, P3 once per interval TC; I/O is accessed by polling.]
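The condition that the sum of the worst case execution times must fit into the interval TC can be checked with a few lines of C. The function name schedule_fits and the choice of unit (any common time unit, e.g. clock cycles) are assumptions for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if the sum of all WCET values does not exceed the
   period tc. All values must use the same time unit. */
bool schedule_fits(const uint32_t *wcet, size_t n, uint32_t tc)
{
    uint64_t sum = 0;            /* 64 bits: the sum cannot overflow */
    for (size_t i = 0; i < n; i++) {
        sum += wcet[i];
    }
    return sum <= tc;
}
```

With WCET values of 200, 300 and 400 cycles, a period of 900 cycles is just sufficient, while 899 is not.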

Event-Triggered Approach
The system "reacts" to external events in a defined way. A pre-emptive scheduling method based on a process hierarchy is needed (more in the operating system chapter; future topic). [Figure: I/O and timer interrupts trigger the processor; execution order P1, P2, P3, P2, P1.]

Intro to LAB9: Functional debugging
Instrumentation: dump into array without filtering. Assume PortA and PortB have strategic 8-bit information. Cnt will be used to index into the buffers. Cnt must be set to index the first element before the debugging begins. The debugging instrument saves the strategic port information into the respective buffers. Use the debugger to display the results after the program is done.
#define SIZE 20
uint8_t ABuf[SIZE];
uint8_t BBuf[SIZE];
uint32_t Cnt = 0; // index of the first element, set before debugging begins
void Save(void){
  if(Cnt < SIZE){
    ABuf[Cnt]=GPIO_PORTA_DATA_R;
    BBuf[Cnt]=GPIO_PORTB_DATA_R;
    Cnt++;
  }
}

Intro to LAB10: FSM: Traffic light controller
SysTick Timer/Counter operation: 24-bit counter decrements at bus clock frequency. With an 80 MHz bus clock, it decrements every 12.5 ns. Counting is from n → 0. Setting n appropriately will make the counter a modulo n+1 counter. That is: next_value = (current_value-1) mod (n+1). Sequence: n,n-1,n-2,n-3… 2,1,0,n,n-1… [J.Valvano Lec]
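The counting sequence and the arithmetic behind the reload value can be sketched on a host PC. The function names are hypothetical, and the 80 MHz bus clock is taken from the text above. Note that the modulo formula is written mathematically; with unsigned C arithmetic the wrap from 0 back to n is expressed more safely with a condition.

```c
#include <stdint.h>

/* Next value of a down-counter with reload value n:
   n, n-1, ..., 2, 1, 0, n, n-1, ... */
uint32_t systick_next(uint32_t current, uint32_t n)
{
    return (current == 0u) ? n : (current - 1u);
}

/* Number of bus cycles in a delay given in microseconds,
   assuming an 80 MHz bus clock (80 cycles per microsecond). */
uint32_t cycles_for_us(uint32_t us)
{
    return us * 80u;
}

/* Reload value for a delay of 'cycles' bus cycles: a modulo n+1
   counter needs n = cycles - 1. Assumes cycles >= 1. */
uint32_t reload_for_cycles(uint32_t cycles)
{
    return cycles - 1u;
}
```

For a 10 ms delay this gives 800000 cycles, which still fits into the 24-bit counter (maximum 0x00FFFFFF).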

SysTick Timer Initialization (4 steps)
Step 1: Clear ENABLE to stop the counter. Step 2: Specify the RELOAD value. Step 3: Clear the counter via NVIC_ST_CURRENT_R. Step 4: Set NVIC_ST_CTRL_R: CLK_SRC = 1 (bus clock is the only option), INTEN = 0 for no interrupts, ENABLE = 1 to enable. [J.Valvano Lec]

SysTick Timer in C
#define NVIC_ST_CTRL_R (*((volatile uint32_t *)0xE000E010))
#define NVIC_ST_RELOAD_R (*((volatile uint32_t *)0xE000E014))
#define NVIC_ST_CURRENT_R (*((volatile uint32_t *)0xE000E018))
void SysTick_Init(void){
  NVIC_ST_CTRL_R = 0; // 1) disable SysTick during setup
  NVIC_ST_RELOAD_R = 0x00FFFFFF; // 2) maximum reload value
  NVIC_ST_CURRENT_R = 0; // 3) any write to CURRENT clears it
  NVIC_ST_CTRL_R = 0x ; // 4) enable SysTick with core clock
}
// The delay parameter is in units of the 80 MHz core clock (12.5 ns)
void SysTick_Wait(uint32_t delay){
  NVIC_ST_RELOAD_R = delay-1; // number of counts
  NVIC_ST_CURRENT_R = 0; // any value written to CURRENT clears
  while((NVIC_ST_CTRL_R&0x )==0){ // wait for flag
  }
}
// Call this routine to wait for delay*10ms
void SysTick_Wait10ms(uint32_t delay){
  unsigned long i;
  for(i=0; i<delay; i++){
    SysTick_Wait(800000); // wait 10ms
  }
}
[J.Valvano Lec]

Finite State Machine (FSM)
Finite State Machines (FSMs) Set of inputs, outputs, states and transitions State graph defines input/output relationship What is a state? Description of current conditions What is a state graph? Graphical interconnection between states What is a controller? Software that inputs, outputs, changes state Accesses the state graph [J.Valvano Lec]

What is a finite state machine? Inputs (sensors) Outputs (actuators) Controller State graph [J.Valvano Lec]

Moore FSM output value depends only on the current state, inputs affect the state transitions significance is being in a state Input: when to change state Output: definition of being in that state [J.Valvano Lec]

Moore FSM Execution Sequence Perform output corresponding to the current state Wait a prescribed amount of time (optional) Read inputs Change state, which depends on the input and the current state Go back to 1. and repeat [J.Valvano Lec]

FSM Implementation Data Structure embodies the FSM Linked Structure
multiple identically-structured nodes statically-allocated fixed-size linked structures one-to-one mapping FSM state graph and linked structure one structure for each state Linked Structure pointer (or link) to other nodes (define next states) Table structure indices to other nodes (define next states) [J.Valvano Lec]

Traffic Light Control goN, PB5-0 = makes it green on North and red on East waitN, PB5-0 = makes it yellow on North and red on East goE, PB5-0 = makes it red on North and green on East waitE, PB5-0 = makes it red on North and yellow on East [J.Valvano Lec]

Traffic Light Control [J.Valvano Lec]

FSM Data Structure in C (Indexes)
struct State {
  uint32_t Out;
  uint32_t Time; // 10 ms units
  uint32_t Next[4]; // list of next states
};
typedef const struct State STyp;
#define goN 0
#define waitN 1
#define goE 2
#define waitE 3
STyp FSM[4] = {
  {0x21,3000,{goN,waitN,goN,waitN}},
  {0x22, 500,{goE,goE,goE,goE}},
  {0x0C,3000,{goE,goE,waitE,waitE}},
  {0x14, 500,{goN,goN,goN,goN}}
};
[J.Valvano Lec]
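The table-driven Moore FSM can be exercised without hardware by separating the table lookup from the port access. The sketch below repeats the table from the slide; the function names fsm_output and fsm_next are hypothetical, an enum replaces the #define constants, and the 2-bit input is simply used as an index into Next[], as the table layout suggests.

```c
#include <stdint.h>

struct State {
    uint32_t Out;       /* output pattern for the lights */
    uint32_t Time;      /* wait time in 10 ms units */
    uint32_t Next[4];   /* next state for each 2-bit input */
};
typedef const struct State STyp;

enum { goN, waitN, goE, waitE };

static STyp FSM[4] = {
    {0x21, 3000, {goN, waitN, goN, waitN}},
    {0x22,  500, {goE, goE, goE, goE}},
    {0x0C, 3000, {goE, goE, waitE, waitE}},
    {0x14,  500, {goN, goN, goN, goN}}
};

/* Moore output: depends only on the current state. */
uint32_t fsm_output(uint32_t state)
{
    return FSM[state].Out;
}

/* Transition: depends on the current state and the input. */
uint32_t fsm_next(uint32_t state, uint32_t input)
{
    return FSM[state].Next[input & 3u];   /* keep index in range */
}
```

On the target, the controller loop would write fsm_output(state) to the port, wait FSM[state].Time units with the SysTick routines, read the sensors and then call fsm_next.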
