Operating System Requirements for Embedded Systems (Rabi Mahapatra)

Presentation transcript:

1. Operating System Requirements for Embedded Systems
Rabi Mahapatra

2. Complexity trends in OS
OS functionality drives the complexity. The device range spans: sensors, small controllers, routers, home appliances, mobile phones, PDAs, game machines.

3. Requirements of an EOS
Memory resident: size is an important consideration
– Data structures optimized
– Kernel optimized, usually written in assembly language
Support for signaling and interrupts
Real-time scheduling: tightly coupled scheduler and interrupts
Power-management capabilities
– Power-aware scheduling
– Control of non-processor resources

4. Embedded OS design approach
Traditional OS: monolithic or distributed. For embedded, a layered design is the key (Constantine D. P., UIUC 2000). The layers, bottom up: basic loader; interrupt & signalling; real-time scheduler; memory management; power management; networking support; custom device support.

5. Power management by the OS
Static approach: rely on pre-set parameters.
– Example: switch off power to devices that have been idle for a while (a pre-calculated number of cycles). Used in laptops now.
Dynamic approach: based on the dynamic condition of workloads and on specified power-optimization guidelines.
– Example: progressively restrict multitasking, reduce context switching, even avoid cache/memory accesses, etc.
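The static approach above can be sketched in a few lines. The threshold value and the `Device` class are illustrative, not from the slides; the point is that the shutdown decision uses only a fixed pre-set parameter, never the workload.

```python
# Sketch of the *static* policy: power a device off once it has been idle
# for a fixed, pre-calculated number of cycles.  Illustrative values only.

IDLE_THRESHOLD = 1000  # pre-set parameter (cycles)

class Device:
    def __init__(self):
        self.powered = True
        self.idle_cycles = 0

    def tick(self, request_arrived):
        """Advance one cycle; shut off after IDLE_THRESHOLD idle cycles."""
        if request_arrived:
            self.idle_cycles = 0
            self.powered = True       # wake on demand
        else:
            self.idle_cycles += 1
            if self.idle_cycles >= IDLE_THRESHOLD:
                self.powered = False  # static shutdown

d = Device()
for _ in range(1000):
    d.tick(request_arrived=False)
assert not d.powered    # idle long enough -> switched off
d.tick(request_arrived=True)
assert d.powered        # a request wakes the device
```

A dynamic policy would replace the fixed `IDLE_THRESHOLD` with a value derived from the observed workload.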

6. Popular E-OS
WinCE (proprietary, optimized assembly), VxWorks, Micro Linux, MuCOS, Java Virtual Machine (picoJava) OS
– Most likely the first open EOS!

7. Interrupts
Each device has a 1-bit "arm" register, set by software if interrupts from that device are to be accepted. The CCR is used to program the interrupts. A good design should provide for extensibility in the number of devices that can issue interrupts and in the number of ISRs.
Handling is either polled or vectored, depending on the nature of the processors and I/O devices:
– Polling: dedicated controllers, data acquisition with periodicity, and slow I/O devices
– Interrupts: real-time environments, where events are unpredictable and asynchronous
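A minimal sketch of the polled style for the slow, periodic case the slide describes. The device model (a status flag standing in for a hardware status register) is hypothetical.

```python
# Polled I/O sketch: software repeatedly reads a status bit and services
# the device inline when it is ready.  Fits periodic, slow devices.

class SlowDevice:
    """Becomes ready every `period` polls -- the periodic case."""
    def __init__(self, period):
        self.period = period
        self.count = 0

    def status_ready(self):            # software reads a status bit
        self.count += 1
        return self.count % self.period == 0

def poll(device, n_polls):
    """Busy-wait polling loop: check the status register each iteration."""
    serviced = 0
    for _ in range(n_polls):
        if device.status_ready():
            serviced += 1              # service the device inline
    return serviced

assert poll(SlowDevice(period=10), 100) == 10
```

The cost is the busy-waiting itself, which is why unpredictable, asynchronous events favor interrupts instead.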

8. Direct Memory Access
DMA is used when low latency and/or high bandwidth is required (disk I/O, video output, low-latency data acquisition).
Software DMA: starts with a normal interrupt; the ISR sets the device registers and initiates I/O; the processor returns to normal operation; on completion of I/O the device informs the processor.
Hardware DMA: the above sequence implemented in hardware.
Burst DMA: used when buffers are placed in the I/O devices (e.g., a disk).
– Low-latency asynchronous I/O cannot use burst DMA.
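The software-DMA sequence above can be written out as a small event trace. All names, register fields, and values here are illustrative.

```python
# Event-trace sketch of *software DMA*: a normal interrupt starts the
# transfer, the ISR programs the device registers and initiates I/O, the
# CPU resumes normal work, and a completion interrupt ends the transfer.

trace = []

def isr_start(dev_regs, src, length):
    dev_regs.update(addr=src, count=length)  # ISR sets device registers
    trace.append("isr: registers set, I/O initiated")

def cpu_normal_work():
    trace.append("cpu: back to normal operation")  # CPU runs while I/O proceeds

def isr_complete():
    trace.append("isr: device signalled completion")

regs = {}
isr_start(regs, src=0x1000, length=512)
cpu_normal_work()
isr_complete()

assert regs == {"addr": 0x1000, "count": 512}
assert trace[-1] == "isr: device signalled completion"
```

Hardware DMA performs the same sequence without involving the CPU between start and completion.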

9. Real-Time Scheduling
Interrupts are heavily used in scheduling when real-time events must be completed by some deadline. Events (threads, tasks, processes) need priorities, deadlines, blocking, restoring and nesting. This is an NP-hard problem without an optimal solution in general; greedy heuristics are proposed as working solutions under some assumptions.
Dynamic RT scheduling: use greedy heuristics together with priority-based interrupts.
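One well-known greedy heuristic of the kind the slide alludes to is earliest-deadline-first (EDF): always run the ready task with the nearest deadline. This simulation sketch is illustrative and not a specific algorithm from the deck.

```python
# EDF sketch: order tasks by deadline, run them back to back, and check
# whether each finishes in time.

def edf_schedule(tasks):
    """tasks: list of (name, exec_time, deadline).  Returns (order, all_met)."""
    order, t = [], 0
    for name, exec_time, deadline in sorted(tasks, key=lambda x: x[2]):
        t += exec_time                 # run the nearest-deadline task next
        order.append((name, t <= deadline))
    return order, all(met for _, met in order)

order, ok = edf_schedule([("A", 2, 10), ("B", 3, 4), ("C", 1, 7)])
assert [name for name, _ in order] == ["B", "C", "A"]
assert ok   # B done at t=3<=4, C at t=4<=7, A at t=6<=10
```

In a real EOS the same priority ordering is typically realized through priority-based interrupts rather than an offline sort.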

10. OS-directed power reduction
Dynamic power management: determine the power state of a device based on the current workload, moving through power transitions according to a shutdown policy. Usually, instead of a simple power off/on, there are dynamic voltage settings and variable clock speeds, hence multiple power states.
Previous work:
– Shut down a device if it has been idle long enough
– Hardware-centric: observe past requests at the device to predict future idleness; no OS information, no study of the characteristics of requesters
– Stochastic models that assume random requests without distinguishing the source of the requester

11. OS-directed power reduction
Disk request sources: compiler, text editor, ftp program. Network card: internet browser or telnet session? It is important to have an accurate model of the requesters in a concurrent environment (Task-Based Power Management, TBPM), a software-centric approach.
Two methods to reduce power: adjust the CPU clock speed; use sleeping states.

12. Process states
The classic process-state diagram: new → ready (admitted); ready → running (scheduler dispatch); running → ready (interrupt); running → waiting (I/O or event wait); waiting → ready (I/O or event completion); running → terminated (exit).

13. TBPM's supplement to device drivers
Four problems:
– Requests are generated by multiple tasks; TBPM uses knowledge from the OS kernel to separate tasks.
– Tasks are created, executed and terminated (the device driver has no knowledge of multiple tasks and their termination).
– Tasks have different characteristics in device utilization.
– A task can generate requests only while running; TBPM considers the CPU time of tasks when deciding power states.
Data structures:
– Device-requester utilization matrix U(d, r): utilization of device d by requester r
– Processor utilization vector P(r): percentage of processor time used by requester r

14. Updating U, P
Example U matrix:

              gcc    emacs   netscape
    HDD        12     0.7      0.4
    NIC         0     0        2.3

A matrix element is the reciprocal of the average time between requests (TBR), maintained as a discounted average:

    TBRn = α · TBR + (1 − α) · TBRn-1,   0 < α < 1
    U(d, r) = 1 / TBRn

If α = 0, TBRn stays constant at the first TBR; for α = 1, TBRn is simply the last TBR.
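The discounted update on this slide is a one-liner; here it is written out so the two boundary cases of α can be checked directly.

```python
# Discounted (exponentially weighted) average of time-between-requests:
#   TBRn = alpha * TBR_new + (1 - alpha) * TBR_prev,  U(d, r) = 1 / TBRn

def update_tbr(tbr_prev, tbr_new, alpha):
    """0 < alpha < 1 in practice; alpha = 0 keeps the previous estimate,
    alpha = 1 keeps only the latest sample."""
    return alpha * tbr_new + (1 - alpha) * tbr_prev

def utilization(tbr_n):
    return 1.0 / tbr_n

assert update_tbr(10.0, 50.0, alpha=0.0) == 10.0  # never moves
assert update_tbr(10.0, 50.0, alpha=1.0) == 50.0  # last sample only
assert update_tbr(10.0, 50.0, alpha=0.5) == 30.0  # in between
assert utilization(2.0) == 0.5
```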

15. Updating U, P
P(r) is the percentage of CPU time spent executing task r:

    P(r) = CPU time(r) / Σ CPU time over all requesters

P is updated with a sliding-window scheme, not the discounted scheme used for U:
– For I/O-bound bursty requests, TBR shows high utilization but cannot capture the running-time requirements.
– The sliding window computes how CPU time is distributed among processes. The window should be long enough to sample all processes yet short enough to reflect workload variation.
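A sliding-window estimate of P(r) can be sketched as below. The window length and the deque-based bookkeeping are illustrative; the slide fixes only the scheme, not an implementation.

```python
# Sliding-window P(r): each requester's share of the CPU time used in the
# last `window` scheduler ticks.
from collections import Counter, deque

class CpuShare:
    def __init__(self, window):
        self.window = deque(maxlen=window)  # last `window` scheduler ticks

    def record(self, task):
        self.window.append(task)            # task that ran this tick

    def p(self, task):
        """P(r): the task's share of CPU time inside the window."""
        if not self.window:
            return 0.0
        return Counter(self.window)[task] / len(self.window)

s = CpuShare(window=4)
for task in ["gcc", "gcc", "emacs", "gcc"]:
    s.record(task)
assert s.p("gcc") == 0.75
s.record("emacs")            # oldest "gcc" tick slides out of the window
assert s.p("gcc") == 0.5
```

A long window smooths over all processes; a short one tracks workload variation, which is exactly the tension the slide notes.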

16. Shutdown condition
Break-even time: the minimum length of idle time for which shutting down saves energy.
– Depends on device characteristics
– Independent of workloads
Performance considerations:
– Interactive systems: many shutdowns issued in a short time increase response time and degrade "perceived interactivity".
– Users may react to the sluggish response, causing a steep increase in system load.
– One remedy: restrict two consecutive shutdowns to be no closer than the wake-up time (say).
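The break-even time follows from requiring that sleeping save more energy than the shutdown/wake-up transition costs. The device parameters below are made-up numbers; only the formula reflects the definition on the slide.

```python
# Break-even time: sleeping for t must beat staying awake,
#   p_work * t >= p_sleep * t + e_transition
# so  t >= e_transition / (p_work - p_sleep).
# Note all quantities are device properties -- independent of the workload.

def break_even_time(p_work, p_sleep, e_transition):
    return e_transition / (p_work - p_sleep)

def should_shut_down(predicted_idle, p_work, p_sleep, e_transition):
    return predicted_idle >= break_even_time(p_work, p_sleep, e_transition)

t_be = break_even_time(p_work=2.0, p_sleep=0.5, e_transition=6.0)  # W, W, J
assert t_be == 4.0                             # seconds
assert should_shut_down(5.0, 2.0, 0.5, 6.0)    # idle longer than break-even
assert not should_shut_down(3.0, 2.0, 0.5, 6.0)
```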

17. TBPM procedure
Integration of power management with process management: the process-state diagram (new, ready, running, waiting, terminated) is annotated with TBPM actions: allocate a requester column when a task is created, update P(r) and U(d, r) as the task runs and issues I/O, and delete the column when the task terminates.

18. TBPM procedure
A requester column is allocated when a new task is created and deleted when the task terminates. Utilization is initialized to zero and updated when a request is issued. The PM evaluates the utilization in the process scheduler.
In a lightly loaded system:
– Sparse requests will not cause the PM to keep a device in the working state for long, since P(r) is small for such a requester.
Under heavy workload:
– A device that is not used frequently is shut down by the PM after its use, since U(d) is small.
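The column bookkeeping described above can be sketched with a dict of dicts standing in for the U matrix; the class and method names are illustrative.

```python
# Requester-column lifecycle from the slide: allocate at task creation
# (zeroed), update on each request, delete at task termination.

class UtilizationTable:
    def __init__(self, devices):
        self.devices = devices
        self.u = {}                    # u[requester][device] = U(d, r)

    def allocate(self, requester):
        self.u[requester] = {d: 0.0 for d in self.devices}  # set to zero

    def request(self, requester, device, utilization):
        self.u[requester][device] = utilization  # updated on a request

    def delete(self, requester):
        del self.u[requester]          # column removed at termination

t = UtilizationTable(devices=["HDD", "NIC"])
t.allocate("gcc")
assert t.u["gcc"] == {"HDD": 0.0, "NIC": 0.0}
t.request("gcc", "HDD", 12.0)
assert t.u["gcc"]["HDD"] == 12.0
t.delete("gcc")
assert "gcc" not in t.u
```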

19. Experiments
Platform:
– Personal computer, TBPM implemented in the Linux kernel (Red Hat 6.0)
– Controls the power states of a hard disk and a (wireless) network transmitter
– Kernel and device drivers modified on a PC with X Window and networking, configured as a client; server daemons (http server, internet news server) turned off; cron tasks scheduled at low frequencies
Power-state changes in the HD and NIC are emulated with two states, working and sleeping:
– To compare with other PM policies, power-state changes were emulated without actually setting the hardware power state. A set of variables recorded device statistics (numbers of shutdowns and wake-ups) under the various policies.
– See Table 1 for hardware parameters.

20. Experimental results
Other PM policies compared:
– Exponential regression on the relationship between two adjacent idle periods
– Event-driven semi-Markov model
– A policy that sets the timeout value to Tbe (the break-even time)
– Timeouts of one and two minutes
At least 10 hours of workload were run. Table 2 shows the comparison:
– Ts: time in the sleeping state; Nd: number of shutdowns; St: longest sequence that causes delay over 30 s; Pa: average power in W; R: power consumption relative to TBPM.

21. Dynamic voltage scaling in processors
Processor usage model:
– Compute-intensive: uses full throughput
– Low-speed: a fraction of full throughput; fast processing not required
– Idle
Desired throughput ranges from system idle, through background and long-latency processes, up to the maximum processor speed for compute-intensive, short-latency processes.

22. Why DVS
Design objective of a processor: provide the highest possible peak throughput for compute-intensive tasks while maximizing battery life over the remaining low-speed and idle periods.
Common power-saving technique: reduce the clock frequency during non-compute-intensive activity.
– This reduces power but not the total energy consumed per task, since energy per task is frequency independent to a first-order approximation.
– Conversely, reducing the voltage improves energy efficiency but compromises peak throughput.
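The first bullet follows from first-order CMOS relations: dynamic power is P ≈ C·V²·f, but a task of N cycles takes N/f seconds, so energy per task is E = P·N/f ≈ C·V²·N, with f cancelling out. The capacitance and cycle counts below are made-up values used only to exercise the formula.

```python
# First-order check that lowering frequency alone does not reduce energy
# per task, while lowering voltage does (quadratically).

def power(c, v, f):
    return c * v * v * f                      # dynamic power, P = C * V^2 * f

def energy_per_task(c, v, f, n_cycles):
    return power(c, v, f) * (n_cycles / f)    # = C * V^2 * N; f cancels

C, N = 1e-9, 1_000_000
e_fast = energy_per_task(C, 3.3, f=100e6, n_cycles=N)
e_slow = energy_per_task(C, 3.3, f=25e6,  n_cycles=N)
assert abs(e_fast - e_slow) < 1e-12       # slower clock, same energy per task
e_lowv = energy_per_task(C, 1.65, f=25e6, n_cycles=N)
assert abs(e_lowv / e_fast - 0.25) < 1e-9 # halving V quarters the energy
```

This is why DVS pairs the frequency reduction with a voltage reduction: the slower clock creates the timing slack that makes the lower voltage safe.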

23. DVS
If both clock frequency and voltage are dynamically varied in response to computational load demands, then the energy consumed per task can be reduced during low-computation periods while retaining peak throughput when required.
This strategy, which achieves the highest possible energy efficiency for time-varying computational loads, is called DVS.

24. DVS overview
Key components:
– an OS that can intelligently vary the processor speed
– a regulation loop that generates the minimum voltage required for the desired frequency
– a processor that can operate over a wide voltage range
Circuit characteristics? (the frequency-voltage relationship)
HW or SW control of processor speed?
– SW, since hardware cannot know whether the instruction being executed is part of a compute-intensive task!
Control from the application program?
– No: an application cannot set the processor speed, being unaware of other tasks. But it can give useful information about its requirements.

25. DVS
As the frequency varies, Vdd must vary in order to optimize energy consumption. But the software is not aware of the minimum supply voltage required for a given speed; it is a function of the hardware implementation, process variation and temperature. A ring oscillator provides this translation.
(Block diagram: an ARM8 CPU with 16 KB cache, coprocessor and write buffer, plus 64 KB SRAM and an I/O chip on the bus; the CPU sends Fdesired to a VCO/regulator, which converts the 3.3 V battery voltage into the Vdd supplied to the chip.)

26. DVS
Know the transition time and transition energy in order to know the cost of interrupts and the wake-up latency.
The voltage scheduler is a new OS component:
– Controls the processor speed by writing the desired frequency to a system control register, which the regulation loop uses to adjust voltage and frequency
– Operates the processor at the minimum throughput level required by the current tasks, minimizing energy consumption
– Note: determining the optimal frequency and scheduling jobs are independent of each other
– Hence, the voltage scheduler can be retrofitted to an existing OS

27. Voltage scheduling algorithm
Determines the optimal clock frequency by combining the computation requirements of all active tasks in the system, ensuring that latency requirements are met given the task ordering of the temporal scheduler. The multiple-task case is complex: the workload must be predicted, and the prediction is updated by the voltage scheduler at the end of each task. A research issue!
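The combination step above can be sketched as follows: pick the single clock frequency that just fits every task's cumulative work into the time left before its latency requirement, keeping the temporal scheduler's task order. This is purely illustrative; the deck leaves the actual algorithm open.

```python
# Sketch: minimum clock frequency that meets every deadline when tasks run
# in the temporal scheduler's order, clamped to the hardware maximum.

def optimal_frequency(tasks, f_max):
    """tasks: list of (cycles_remaining, deadline_seconds) in scheduler
    order.  The chosen rate must cover the cumulative work by each deadline."""
    f, done = 0.0, 0.0
    for cycles, deadline in tasks:
        done += cycles
        f = max(f, done / deadline)   # rate needed to meet this deadline
    return min(f, f_max)              # clamp to the hardware maximum

# Two tasks: 1e6 cycles due at t=0.5 s, then 4e6 more due at t=1 s.
f = optimal_frequency([(1e6, 0.5), (4e6, 1.0)], f_max=100e6)
assert f == 5e6   # the second deadline dominates: 5e6 cycles / 1.0 s
```

In practice the cycle counts are the predicted workloads, so the scheduler must revise them after each task completes, which is the research issue the slide flags.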

