Chapter 6 Processes and Operating Systems — Prof. Chung-Ta King, Department of Computer Science, National Tsing Hua University (slides adapted from the textbook slides)
Overview Processes Context switching Operating systems Scheduling policies Interprocess communication Evaluation and optimization Design example: telephone answering machine
Introduction: When multiple operations must be performed at widely varying times, a single program easily becomes too complex. Two abstractions help us build such applications: process: defines the state of an executing program => compartmentalizes functions; operating system: provides the mechanism for switching execution between processes => encapsulates the control for switching. Switching the processor state among multiple tasks lets us build applications with more complex functionality and greater flexibility in meeting timing requirements.
Why multiple processes? Need to structure programs to perform multiple tasks. Processes help us manage timing complexity: receive and send data at multiple rates, e.g., multimedia, automotive; asynchronous input, e.g., user interfaces, communication systems. Multirate systems make meeting timing requirements even more complex: certain operations must be executed periodically, and each is executed at its own rate.
Example: engine control. Tasks for the engine controller: spark control, crankshaft sensing, fuel/air mixture, oxygen sensor, Kalman filter. The controller must fire the spark plugs periodically, set the throttle, adjust the fuel/air mixture, etc.
Life without processes: code turns into a mess: interruptions of one task for another lead to spaghetti code, e.g.:
  A_code(); … B_code();
  if (C) C_code();
  switch (x) { case C: C(); case D: D(); ... }
[Figure: timeline showing execution jumping back and forth among tasks A, B, and C]
Early multitasking: co-routines (r13 holds the return address for co-routine 1; r14 holds the return address for co-routine 2):
Co-routine 1:
  ADR r14,co2a
  co1a: …
  ADR r13,co1b
  MOV r15,r14
  co1b: …
  ADR r13,co1c
  co1c: ...
Co-routine 2:
  co2a: …
  ADR r14,co2b
  MOV r15,r13
  co2b: …
  ADR r14,co2c
  co2c: …
Co-routine methodology: a co-routine has several different entry points, giving hooks for nonhierarchical calls and returns. Like a subroutine, but the caller determines the return address. Co-routines voluntarily give up control to other co-routines. The pattern of control transfers is embedded in the code => difficult to handle timing requirements.
Processes A process is a unique execution of a program. A process is defined by its code and data Several copies of a program may run simultaneously or at different times A process has its own state: registers memory The operating system manages processes
Processes and CPUs. Activation record: copy of process state. Context switch: current CPU context goes out, new CPU context goes in. [Figure: CPU (PC, registers) and memory holding the activation records of process 1 and process 2]
Terms Thread = lightweight process: a process that shares memory space with other processes avoid the cost and complexity of memory management units that provide strict separation between memory spaces Reentrancy: ability of a program to be executed several times with the same results.
Processes in POSIX: create a process with fork(). The parent and child are exactly the same in code and data, except for the return value of fork(). [Figure: process a forks into process a (parent) and process a (child)]
fork(): The fork system call creates a child process:
  childid = fork();
  if (childid == 0) {
    /* child operations */
  } else { /* childid = child process id */
    /* parent operations */
  }
execv(): Overlays the child with new code:
  childid = fork();
  if (childid == 0) {
    execv("mychild", childargs); /* "mychild" is the file containing the child's code */
    perror("execv");
    exit(1);
  }
wait(): The parent waits for the child to terminate and collects its exit status; a combined sketch follows:
  childid = fork();
  if (childid == 0) {
    /* child things */
  } else {
    /* parent things */
    wait(&cstatus);
    exit(0);
  }
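A minimal sketch putting fork(), execv(), and wait() together. The program path "mychild" and its argument list are placeholders, not part of any real system:
  #include <stdio.h>
  #include <stdlib.h>
  #include <sys/types.h>
  #include <sys/wait.h>
  #include <unistd.h>

  int main(void) {
      /* argument list for the child program; "mychild" is a placeholder path */
      char *childargs[] = { "mychild", NULL };
      int cstatus;

      pid_t childid = fork();          /* create the child process */
      if (childid == 0) {
          execv("mychild", childargs); /* overlay the child with new code */
          perror("execv");             /* reached only if execv fails */
          exit(1);
      } else if (childid > 0) {
          wait(&cstatus);              /* parent blocks until the child exits */
          printf("child exited with status %d\n", WEXITSTATUS(cstatus));
      } else {
          perror("fork");
      }
      return 0;
  }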
Overview Processes Context switching Operating systems Scheduling policies Interprocess communication Evaluation and optimization Design example: telephone answering machine
Context switching: moving the CPU from one executing process to another. It must be transparent (the process does not know it was stopped) and fast. Who controls when the context is switched? How is the context switched?
Co-operative multitasking: a restricted form of context switching in which one process voluntarily gives up the CPU to another. An improvement on co-routines: it hides the context-switching mechanism, but still relies on processes to give up the CPU. Each process allows a context switch by calling cswitch(). A scheduler chooses which process runs next.
Cooperative multitasking processes:
Process 1:
  if (x > 2) sub1(y);
  else sub2(y, z);
  cswitch();
  proca(a, b, c);
Process 2:
  proc_data(r, s, t);
  cswitch();
  if (val1 == 3) abc(val2);
Scheduler:
  save_state(current);
  p = choose_process();
  load_and_go(p);
Context switching in ARM: save and restore process contexts. [Figure: CPU registers (r0–r14, PC, CPSR) saved to and restored from the current process's context block in memory]
Cooperative context switching in ARM.
Save the old process:
  STMIA r13,{r0-r14}^   ; store all user registers to memory pointed to by r13
  MRS r0,SPSR           ; copy the SPSR into r0
  STMDB r13,{r0,r15}    ; save SPSR and PC to the context block
Start the new process:
  ADR r0,NEXTPROC       ; NEXTPROC: variable pointing to the next process's context block
  LDR r13,[r0]
  LDMDB r13,{r0,r14}
  MSR SPSR,r0
  LDMIA r13,{r0-r14}^
  MOVS pc,r14
Problems with co-operative multitasking: programming errors can keep other processes out, because the CPU must be given up voluntarily: a process may never give up the CPU, or may wait too long to switch and miss input. Example:
  process2() {
    x = global1; /* input to the process */
    while (x < 500)
      x = aproc(global2);
    cswitch();
  }
If x never reaches 500, the process never switches.
Preemptive multitasking: the most powerful form of multitasking: the OS controls when contexts are switched and determines which process runs next. It takes advantage of the interrupt mechanism: a timer interrupt transfers control to the OS, which switches contexts. This reduces the consequences of programming errors and allocates CPU time more efficiently. [Figure: timer interrupting the CPU]
Preemptive context switching: the timer interrupt gives control to the OS, which saves the interrupted process's state in its activation record. The OS then chooses the next process to run and installs that process's activation record as the current CPU state. [Figure: timeline P1 → OS → P1 → OS → P2, with a timer interrupt triggering each OS invocation]
Processes and UML: a process is an active class---an independent thread of control. Signal: an object that is passed between processes for communication. [Figure: active class processClass1 with myAttributes, myOperations(), signals start and resume, and acomm: datasignal]
Designing with active objects: can mix normal and active objects. [Figure: collaboration of p1: processClass1, a: rawMsg, w: wrapperClass, ahat: fullMsg, master: masterClass]
Overview Processes Context switching Operating systems Scheduling policies Interprocess communication Evaluation and optimization Design example: telephone answering machine
Operating systems The operating system controls resources: who gets the CPU when I/O takes place how much memory is allocated The most important resource is the CPU itself CPU access controlled by the scheduler OS simplifies the control code required to coordinate processes scheduling is centralized
Process state: a process can be in one of three states: executing on the CPU, ready to run, or waiting for data. Process priorities are used to choose the next executing process. [State diagram: ready → executing when the process gets the CPU; executing → ready when preempted; executing → waiting when it needs data; waiting → ready when it gets data; waiting → executing when it gets both data and the CPU]
Priority-driven scheduling Each process has a priority. CPU goes to highest-priority process that is ready. Priorities determine scheduling policy: fixed priority; time-varying priorities.
Priority-driven scheduling example Rules: each process has a fixed priority (1 highest); highest-priority ready process gets CPU; process continues until done. Processes P1: priority 1, execution time 10 P2: priority 2, execution time 30 P3: priority 3, execution time 20
Priority-driven scheduling example (continued): P2 becomes ready at t=0, P1 at t=15, P3 at t=18. Timeline: P2 runs from t=0; P1 preempts it at t=15 and runs until t=25; P2 resumes and finishes at t=40; P3 then runs from t=40 to t=60. A small simulation sketch follows.
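A small C sketch that simulates the fixed-priority preemptive rules above one time unit at a time; the process table (names, release times, execution times) is taken from this example, and the field names are illustrative:
  #include <stdio.h>

  struct proc { const char *name; int priority; int release; int remaining; };

  int main(void) {
      /* priority 1 is highest; times follow the example above */
      struct proc p[] = {
          { "P1", 1, 15, 10 },
          { "P2", 2,  0, 30 },
          { "P3", 3, 18, 20 },
      };
      int n = 3;

      for (int t = 0; t < 60; t++) {
          int run = -1;
          for (int i = 0; i < n; i++)      /* pick the highest-priority ready process */
              if (p[i].release <= t && p[i].remaining > 0 &&
                  (run < 0 || p[i].priority < p[run].priority))
                  run = i;
          if (run >= 0) {
              p[run].remaining--;          /* run it for one time unit */
              printf("t=%2d: %s\n", t, p[run].name);
          }
      }
      return 0;
  }
Running it reproduces the timeline: P2 for t=0..14, P1 for t=15..24, P2 again until t=40, then P3 until t=60.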
Operating system structure OS needs to keep track of: process priorities; scheduling state; process activation record. Processes may be created: statically before system starts; dynamically during execution (process state records kept in a linked list) When some events occur aperiodically and infrequently, it is reasonable to create a process to handle the event when it occurs
Operating system structure: the OS generally executes in a protected mode. ARM uses SWI (software interrupt) to provide OS access: it puts the CPU into supervisor mode, and the exception vector table directs execution to the proper place in the OS. SWI's single argument passes a parameter (the requested service) to the OS.
Process timing requirements Process initiation disciplines: Periodic process: executes on (almost) every period. Aperiodic process: executes on demand. Analyzing aperiodic process sets is harder---must consider worst-case combinations of process activations.
Timing requirements on processes. Period: interval between process activations. Rate requirement: how quickly processes must be initiated; its inverse gives the initiation interval. Initiation time: time at which a process goes from waiting to ready. Deadline: time at which a process must finish.
Timing requirements. [Figure: three cases for process P1: (1) aperiodic process: deadline measured from the initiating event; (2) periodic process initiated at the start of each period: deadline within the period; (3) periodic process initiated by an event within the period: deadline measured from the initiating event, period still defined]
Timing violations What happens if a process doesn’t finish by its deadline? Hard deadline: system fails if missed. Soft deadline: user may notice, but system doesn’t necessarily fail.
Example: Space Shuttle software error. The Space Shuttle's first launch was delayed by a software timing error: the primary control system PASS and the backup system BFS failed to synchronize. A change to one routine added a delay that threw off the start-time calculation, giving a 1-in-67 chance of the timing problem.
Interprocess communication Interprocess communication (IPC): OS provides mechanisms so that processes can pass data. Two types of semantics: blocking: sending process waits for response; non-blocking: sending process continues.
IPC styles. Shared memory: processes have some memory in common; they must cooperate to avoid destroying or missing data. Message passing: processes send messages along a communication channel---no common address space. The two styles are logically equivalent.
Shared memory on a bus: need a flag to tell one CPU when data from the other CPU is ready. A uni-directional flag is easy to implement with a memory write; the problem arises with a bi-directional flag that both CPUs write. [Figure: CPU 1 and CPU 2 sharing a memory over a bus]
Race condition in shared memory Problem when two CPUs try to write the same location: CPU 1 reads flag and sees 0. CPU 2 reads flag and sees 0. CPU 1 sets flag to one and writes location. CPU 2 sets flag to one and overwrites location.
Atomic test-and-set: the problem can be solved with an atomic test-and-set over the bus: a single bus operation reads a memory location, tests it, and writes it. ARM provides test-and-set with SWP Rd,Rm,Rn: the memory location pointed to by Rn is loaded into Rd, and Rm is written to that location. Example spin loop:
  ADR r0,SEMAPHORE
  MOV r1,#1
GETFLAG
  SWP r1,r1,[r0]   ; swap 1 into the flag, old flag value into r1
  CMP r1,#0
  BNE GETFLAG      ; flag was already set: keep trying
Critical regions: a critical region is a section of code that cannot be interrupted by another process. Examples: writing shared memory, accessing an I/O device. Access to a critical region is controlled with semaphores: get access with P(), perform the critical-region operations, then release with V(). P() and V() can be implemented with test-and-set, as in the sketch below.
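A minimal C11 sketch of guarding a critical region with an atomic test-and-set---the same idea the ARM SWP loop implements. The shared counter and the P/V names are just for illustration:
  #include <stdatomic.h>
  #include <stdio.h>

  static atomic_flag lock = ATOMIC_FLAG_INIT;   /* the binary "semaphore" flag */
  static int shared_count = 0;                  /* data protected by the flag */

  static void P(void) {                         /* acquire: spin on test-and-set */
      while (atomic_flag_test_and_set(&lock))
          ;                                     /* flag was already set: keep trying */
  }

  static void V(void) {                         /* release: clear the flag */
      atomic_flag_clear(&lock);
  }

  int main(void) {
      P();
      shared_count++;                           /* critical region */
      V();
      printf("count = %d\n", shared_count);
      return 0;
  }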
Message passing on a network: messages are stored at the endpoints, not in the communication link. Good for applications built from autonomous components, especially ones that operate at different rates. [Figure: CPU 1 and CPU 2, each with a message box, exchanging messages]
Process data dependencies express relationships among processes running at the same rate: one process may not be able to start until the processes it depends on finish. Dependencies form a partial ordering of process execution and are captured in a directed acyclic graph, the task graph. [Figure: task graph over processes P1–P4] A small sketch of testing readiness from such a graph follows.
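A small sketch, under assumed names, of recording data dependencies as a task graph and testing whether a process is ready (all of its predecessors have finished):
  #include <stdbool.h>
  #include <stdio.h>

  #define MAXPRED 4

  struct task {
      const char *name;
      bool done;
      struct task *pred[MAXPRED];   /* tasks that must finish first */
      int npred;
  };

  static bool ready(const struct task *t) {
      for (int i = 0; i < t->npred; i++)
          if (!t->pred[i]->done)
              return false;          /* a predecessor has not finished yet */
      return !t->done;
  }

  int main(void) {
      struct task p1 = { "P1", false, { 0 }, 0 };
      struct task p2 = { "P2", false, { 0 }, 0 };
      struct task p3 = { "P3", false, { &p1, &p2 }, 2 };  /* P3 depends on P1 and P2 */

      p1.done = true;
      printf("P3 ready? %s\n", ready(&p3) ? "yes" : "no"); /* no: P2 not done */
      p2.done = true;
      printf("P3 ready? %s\n", ready(&p3) ? "yes" : "no"); /* yes */
      return 0;
  }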
Other operating system functions The role of OS may be viewed as managing shared resources Date/time File system: for organizing large data sets, even without hard disks Networking Security Debugging facility
Overview Processes Context switching Operating systems Scheduling policies Interprocess communication Evaluation and optimization Design example: telephone answering machine
Embedded vs. general-purpose scheduling Workstations try to avoid starving processes of CPU access. Fairness = access to CPU. Embedded systems must meet deadlines. Low-priority processes may not run for a long time
The scheduling problem Scheduling policy: defines how processes are selected for promotion from ready to running Can we meet all deadlines? Must be able to meet deadlines in all cases. How much CPU horsepower do we need to meet our deadlines?
Metrics How do we evaluate a scheduling policy: Ability to satisfy all deadlines. CPU utilization: percentage of time devoted to useful work Scheduling overhead: time required to make scheduling decision
POSIX for real-time scheduling: defined under _POSIX_PRIORITY_SCHEDULING. Use sched_setscheduler() to change the scheduling policy:
  #include <sched.h>
  int i, my_process_id;
  struct sched_param my_sched_params;
  …
  i = sched_setscheduler(my_process_id, SCHED_FIFO, &my_sched_params);
Priority-driven algorithms. Fixed-priority algorithms assign the same priority to all the jobs in each task; examples: RM (Rate-Monotonic), DM (Deadline-Monotonic). Dynamic-priority algorithms assign different priorities to the individual jobs in each task; example: EDF (Earliest Deadline First).
Rate monotonic scheduling RMS [Liu and Layland, 73]: widely-used, analyzable scheduling policy Analysis is called Rate Monotonic Analysis (RMA) Assumptions: All processes run periodically on single CPU. Zero context switch time. No data dependencies between processes. Process execution time is constant. Deadline is at end of respective period. Highest-priority ready process runs.
RMS priorities Optimal (fixed) priority assignment: shortest-period process gets highest priority; priority inversely proportional to period; break ties arbitrarily. No fixed-priority scheme does better In terms of CPU utilization while ensuring all processes meet their deadlines
RMS example. Process parameters: P1: execution time 1, period 4; P2: execution time 2, period 6; P3: execution time 3, period 12. [Figure: resulting RM schedule over one 12-unit hyperperiod]
Examples of RM schedules: an RM schedule of T1 = (period 2, execution time 0.9) and T2 = (period 5, execution time 2.3). T1 utilization = 0.9/2 = 0.45; T2 utilization = 2.3/5 = 0.46. [Figure: RM schedule of T1 and T2 over 10 time units]
RMS example with no feasible priority assignment. Process parameters: P1: execution time 2, period 4; P2: execution time 3, period 6; P3: execution time 3, period 12. No assignment of priorities can satisfy all deadlines: in any 12-time-unit interval, P1 must execute 3 times, P2 twice, and P3 once, requiring (3×2) + (2×3) + 3 = 15 time units of CPU---more than the 12 available.
Process parameters: Ti is the computation time of process i; ti is its period. [Figure: process Pi executing for computation time Ti within each period ti]
RMS CPU utilization: utilization for n processes is U = Σi Ti/ti. RMS is guaranteed to meet all deadlines when U ≤ n(2^(1/n) − 1); as the number of tasks approaches infinity, this bound approaches ln 2 ≈ 69%. So RMS cannot always use 100% of the CPU, even with zero context-switch overhead---idle cycles must be kept available to handle the worst case. Within this bound, however, RMS guarantees that all processes will always meet their deadlines. A quick check of the bound in C follows.
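A small C check of the earlier RMS example (P1–P3 above) against the Liu–Layland bound n(2^(1/n) − 1); the arrays simply restate that example's execution times and periods:
  #include <math.h>
  #include <stdio.h>

  int main(void) {
      double T[] = { 1, 2, 3 };   /* execution times of P1..P3 */
      double t[] = { 4, 6, 12 };  /* periods of P1..P3 */
      int n = 3;

      double U = 0.0;
      for (int i = 0; i < n; i++)
          U += T[i] / t[i];                        /* U = sum of Ti/ti */

      double bound = n * (pow(2.0, 1.0 / n) - 1);  /* Liu-Layland bound */
      printf("U = %.3f, bound = %.3f\n", U, bound);
      if (U <= bound)
          printf("schedulable under RMS (sufficient test)\n");
      else
          printf("bound inconclusive: check the schedule explicitly\n");
      return 0;
  }
For this process set U = 0.833, above the bound of about 0.78; the test is only sufficient, so the schedule must be checked explicitly (and in fact it meets its deadlines).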
RMS utilization example: P1: period 4, execution time 2; P2: period 12, execution time 1. Over one 12-unit hyperperiod, CPU utilization = (2 × 3 + 1)/12 = 0.583. [Figure: RM schedule showing P1 running in each of its three periods and P2 once]
Rate-monotonic analysis. Response time: the time a process takes to finish once it is ready. Critical instant: the scheduling state that gives the worst-case response time; it occurs when all higher-priority processes become ready to execute at the same time as the process under analysis. [Figure: critical instant of P4, with interfering higher-priority processes P1, P2, P3 released together]
RMS implementation is efficient: scan the processes and choose the highest-priority active process. POSIX SCHED_FIFO can be used for RMS:
  int i, mypid;
  struct sched_param my_params;
  mypid = getpid();
  i = sched_getparam(mypid, &my_params);
  my_params.sched_priority = maxval;
  i = sched_setparam(mypid, &my_params);
Earliest-deadline-first scheduling. EDF: a dynamic-priority scheduling scheme. The process closest to its deadline has the highest priority. Requires recalculating priorities at every timer interrupt.
Example of the EDF algorithm with T1 = (2, 0.9) and T2 = (5, 2.3), comparing absolute deadlines: J1.1's deadline is 2 vs. J2.1's 5, so J1.1 > J2.1; J1.2's deadline is 4 vs. J2.1's 5, so J1.2 preempts J2.1; J1.3's deadline is 6 vs. J2.1's 5, so J2.1 > J1.3. At time 4.1, J2.1 completes and J1.3 starts to execute. [Figure: resulting EDF schedule of T1 and T2 over 10 time units]
EDF analysis: EDF can use 100% of the CPU, but its scheduling cost is higher because priorities must be recomputed and the ready queue reordered. Under overload, EDF may still miss deadlines, and it is hard to predict which process will miss; under RM, it is the lowest-priority tasks that miss their deadlines.
EDF implementation: on each timer interrupt, compute the time to each deadline and choose the ready process closest to its deadline (see the sketch below). Generally considered too expensive to use in practice.
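A minimal sketch of the per-interrupt EDF decision: among ready processes, pick the one with the earliest absolute deadline. The process table and field names are illustrative:
  #include <stdio.h>

  struct proc { const char *name; int ready; long deadline; };

  /* returns the index of the ready process with the earliest deadline, or -1 */
  static int edf_choose(struct proc *p, int n) {
      int best = -1;
      for (int i = 0; i < n; i++)
          if (p[i].ready && (best < 0 || p[i].deadline < p[best].deadline))
              best = i;
      return best;
  }

  int main(void) {
      struct proc table[] = {
          { "P1", 1, 40 },
          { "P2", 1, 25 },
          { "P3", 0, 10 },   /* not ready, so it is not considered */
      };
      int choice = edf_choose(table, 3);
      if (choice >= 0)
          printf("run %s\n", table[choice].name);   /* prints P2 */
      return 0;
  }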
Fixing scheduling problems What if your set of processes is unschedulable? Change deadlines in requirements. Reduce execution times of processes. Get a faster CPU. Can we relax the assumptions?
Relaxing the assumption that processes are self-contained. Priority inversion: a low-priority process can keep a high-priority process from running if resources are not taken into account: the low-priority process grabs an I/O device; the high-priority process needs that device but can't get it until the low-priority process is done. This can even cause deadlock. Solution: give priorities to system resources; a process inherits the priority of a resource it holds, so a low-priority process temporarily inherits the device's higher priority while using it.
Relaxing the assumption of data independence: data dependencies allow us to improve utilization because they restrict which combinations of processes can run simultaneously; e.g., if P2 depends on P1, the two can't run at the same time. [Figure: dependency from P1 to P2]
Assumption on context switch time Non-zero context switch time can push limits of a tight schedule. Hard to calculate effects---depends on order of context switches. In practice, OS context switch overhead is small.
POSIX scheduling policies SCHED_FIFO: RMS SCHED_RR: round-robin within a priority level, processes are time-sliced in round-robin fashion SCHED_OTHER: undefined scheduling policy used to mix non-real-time and real-time processes.
Overview Processes Context switching Operating systems Scheduling policies Interprocess communication Evaluation and optimization Design example: telephone answering machine
Signals A Unix mechanism for simple communication between processes. Analogous to an interrupt---forces execution of a process at a given location. But a signal is caused by one process with a function call. No data---can only pass type of signal.
POSIX signals Must declare a signal handler for the process using sigaction(). Handler is called when signal is received. A signal can be sent with sigqueue(): sigqueue(destpid,SIGRTMAX-1,sval)
POSIX signal types SIGABRT: abort SIGTERM: terminate process SIGFPE: floating point exception SIGILL: illegal instruction SIGKILL: unavoidable process termination SIGUSR1, SIGUSR2: user defined
Signal example: define a handler for SIGUSR1 and register it (a matching sender sketch follows):
  #include <signal.h>
  extern void usr1_handler(int);  /* handler for SIGUSR1 */
  struct sigaction act, oldact;
  int retval;
  /* set up the descriptor data structure */
  act.sa_flags = 0;
  sigemptyset(&act.sa_mask);      /* initialize the signal mask */
  act.sa_handler = usr1_handler;  /* add the handler */
  /* tell the OS about the handler */
  retval = sigaction(SIGUSR1, &act, &oldact);
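A companion sketch for the sending side: queue a real-time signal carrying a small value with sigqueue(). The target PID and the payload value are placeholders, and the receiver would need to install its handler with SA_SIGINFO to see the value:
  #include <signal.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <sys/types.h>

  int main(int argc, char *argv[]) {
      if (argc < 2) {
          fprintf(stderr, "usage: %s <pid>\n", argv[0]);
          return 1;
      }
      pid_t destpid = (pid_t)atoi(argv[1]); /* placeholder: receiver's PID */

      union sigval sval;
      sval.sival_int = 42;                  /* illustrative payload value */

      /* queue real-time signal SIGRTMAX-1 with the attached value */
      if (sigqueue(destpid, SIGRTMAX - 1, sval) < 0) {
          perror("sigqueue");
          return 1;
      }
      return 0;
  }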
POSIX semaphores: POSIX supports counting semaphores under the _POSIX_SEMAPHORES option. A semaphore with N resources does not block callers until N processes hold the semaphore. Semaphores are given names, e.g., /sem1:
  int i, oflags;
  sem_t *my_semaphore; /* descriptor */
  my_semaphore = sem_open("/sem1", oflags);
  /* do useful work here */
  i = sem_close(my_semaphore);
POSIX counting semaphores: P() is sem_wait(), V() is sem_post():
  int i;
  i = sem_wait(my_semaphore); /* P */
  /* do useful work */
  i = sem_post(my_semaphore); /* V */
  /* sem_trywait() tests without blocking */
  i = sem_trywait(my_semaphore);
POSIX shared memory is supported under _POSIX_SHARED_MEMORY_OBJECTS. Use shm_open() to open a shared memory object; it returns a descriptor for the object:
  objdesc = shm_open("/memobj1", O_RDWR, 0); /* name, flags, mode (mode matters only with O_CREAT) */
Then map the object into the process's address space; MAP_SHARED propagates all writes to all sharing processes:
  if (mmap(addr, len, PROT_READ | PROT_WRITE, MAP_SHARED, objdesc, 0) == MAP_FAILED) {
    /* error */
  }
A combined sketch using a semaphore to protect the shared data follows.
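A sketch combining the two mechanisms above: a shared counter in a POSIX shared memory object, protected by a named counting semaphore. The object names "/memobj1" and "/sem1" follow the slides; sizes and error handling are simplified for illustration:
  #include <fcntl.h>
  #include <semaphore.h>
  #include <stdio.h>
  #include <sys/mman.h>
  #include <unistd.h>

  int main(void) {
      /* create/open the shared memory object and size it for one int */
      int objdesc = shm_open("/memobj1", O_CREAT | O_RDWR, 0600);
      if (objdesc < 0) { perror("shm_open"); return 1; }
      if (ftruncate(objdesc, sizeof(int)) < 0) { perror("ftruncate"); return 1; }

      int *counter = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                          MAP_SHARED, objdesc, 0);
      if (counter == MAP_FAILED) { perror("mmap"); return 1; }

      /* named semaphore used as a mutex (initial value 1) */
      sem_t *sem = sem_open("/sem1", O_CREAT, 0600, 1);
      if (sem == SEM_FAILED) { perror("sem_open"); return 1; }

      sem_wait(sem);        /* P(): enter the critical region */
      (*counter)++;         /* update the shared data */
      sem_post(sem);        /* V(): leave the critical region */

      printf("counter = %d\n", *counter);
      sem_close(sem);
      munmap(counter, sizeof(int));
      close(objdesc);
      return 0;
  }
Running two copies of this program shows both processes updating the same counter.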
POSIX message-based communication Unix pipe supports messages between processes. % foo files | baz > file2 Parent process uses pipe() to create a pipe. Pipe is created before child is created so that pipe ID can be passed to child. An alternative is message queues under _POSIX_MESSAGE_PASSING
POSIX pipe example (pipe_ends[0] is the read end, pipe_ends[1] is the write end):
  /* create the pipe */
  if (pipe(pipe_ends) < 0) {
    perror("pipe");
    break;
  }
  /* create the process */
  childid = fork();
  if (childid == 0) { /* child reads from pipe_ends[0] */
    childargs[0] = pipe_ends[0]; /* pass the read-end descriptor to the child (in real code, formatted as a string argument) */
    execv("mychild", childargs);
    perror("execv");
    exit(1);
  } else { /* parent writes to pipe_ends[1] */
    …
  }
POSIX message queue example:
  struct mq_attr mq_attr;  /* attributes of the queue */
  mqd_t myq;               /* queue descriptor */
  mq_attr.mq_maxmsg = 50;  /* max number of messages */
  mq_attr.mq_msgsize = 64; /* max size of a message */
  mq_attr.mq_flags = 0;
  myq = mq_open("/q1", O_CREAT | O_RDWR, S_IRWXU, &mq_attr);
  …
  /* enqueue and dequeue */
  char data[MAXLEN], rcvbuf[MAXLEN];
  if (mq_send(myq, data, len, priority) < 0) { /* errors */ }
  nbytes = mq_receive(myq, rcvbuf, MAXLEN, &prio);
Overview Processes Context switching Operating systems Scheduling policies Interprocess communication Evaluation and optimization Design example: telephone answering machine
Evaluating performance May want to test: context switch time assumptions; scheduling policy. Can use OS simulator to exercise process set, trace system behavior.
Processes and caches Processes can cause additional caching problems. Even if individual processes are well-behaved, processes may interfere with each other. Worst-case execution time with bad behavior is usually much worse than execution time with good cache behavior.
Power optimization Power management: determining how system resources are scheduled/used to control power consumption. OS can manage for power just as it manages for time. OS reduces power by shutting down units. May have partial shutdown modes.
Power management and performance are often at odds: entering a power-down mode consumes energy and time, and leaving the power-down mode also consumes energy and time.
Simple power management policies Request-driven: power up once request is received. Adds delay to response. Predictive shutdown: try to predict how long you have before next request. May start up in advance of request in anticipation of a new request. If you predict wrong, you will incur additional delay while starting up.
Probabilistic shutdown: assume service requests are probabilistic and optimize expected values of power consumption and response time. Simple probabilistic policy: shut down after time Ton; turn back on after waiting for Toff. (A toy sketch of such a policy follows.)
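A toy C sketch of a timeout policy of this flavor: stay powered for Ton after the last request, then shut down and remain off for at least Toff before powering up for a new request. The times, the request trace, and the state names are all illustrative:
  #include <stdio.h>

  #define TON  5   /* idle time before shutting down (illustrative) */
  #define TOFF 3   /* minimum time to stay off (illustrative) */

  enum pstate { POWERED, OFF };

  int main(void) {
      int requests[] = { 1, 2, 10, 11, 20 };   /* toy request arrival times */
      int nreq = 5, next = 0;

      enum pstate state = POWERED;
      int last_activity = 0, off_since = 0;

      for (int t = 0; t <= 25; t++) {
          int request = (next < nreq && requests[next] == t);
          if (request) next++;

          if (state == OFF && request && t - off_since >= TOFF) {
              state = POWERED;               /* wake up to serve the request */
              printf("t=%2d: power up\n", t);
          }
          /* a real policy would queue requests that arrive before TOFF expires */
          if (state == POWERED && request)
              last_activity = t;             /* reset the idle timer */
          if (state == POWERED && t - last_activity >= TON) {
              state = OFF;                   /* idle too long: shut down */
              off_since = t;
              printf("t=%2d: power down\n", t);
          }
      }
      return 0;
  }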
Advanced Configuration and Power Interface (ACPI): an open standard for power management services. [Figure: software stack: applications and device drivers above the OS kernel and its power management component, which sits on the ACPI BIOS, which controls the hardware platform]
ACPI global power states: G0: working state. G1: sleeping state, with sub-states S1: low wake-up latency with no loss of context; S2: low latency with loss of CPU/cache state; S3: low latency with loss of all state except memory; S4: lowest-power sleeping state with all devices off. G2: soft off. G3: mechanical off.
Overview Processes Context switching Operating systems Scheduling policies Interprocess communication Evaluation and optimization Design example: telephone answering machine
Theory of operation: compress audio using adaptive differential pulse code modulation (ADPCM). [Figure: analog waveform over time and its ADPCM code values in the range -3..3]
ADPCM coding: samples are coded in a small alphabet with positive and negative values, {-3,-2,-1,1,2,3}. The coder minimizes the error between the predicted value and the actual signal value.
ADPCM compression system. [Figure: encoder: samples feed a quantizer whose output is the transmitted code; a feedback path through an inverse quantizer and integrator forms the prediction. Decoder: an inverse quantizer followed by an integrator reconstructs the samples.] A toy sketch of this loop follows.
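A toy C sketch of the encode/decode loop in the block diagram: quantize the difference between each sample and the integrator's prediction, and rebuild the signal with the same inverse quantizer and integrator on both sides. The step size, alphabet handling, and sample values are illustrative only---this is not the real ADPCM standard:
  #include <stdio.h>

  #define STEP 0.25   /* illustrative quantizer step size */

  /* quantizer: map the prediction error to the small alphabet {-3,...,-1,1,...,3} */
  static int quantize(double err) {
      int q = (int)(err / STEP);
      if (q > 3) q = 3;
      if (q < -3) q = -3;
      if (q == 0) q = (err >= 0) ? 1 : -1;   /* alphabet has no zero */
      return q;
  }

  /* inverse quantizer */
  static double dequantize(int code) { return code * STEP; }

  int main(void) {
      double samples[] = { 0.1, 0.35, 0.6, 0.7, 0.5, 0.2 };
      int n = 6;

      double enc_integrator = 0.0;   /* encoder's copy of the reconstruction */
      double dec_integrator = 0.0;   /* decoder's integrator */

      for (int i = 0; i < n; i++) {
          int code = quantize(samples[i] - enc_integrator); /* encode the prediction error */
          enc_integrator += dequantize(code);               /* track what the decoder will see */

          dec_integrator += dequantize(code);               /* decoder reconstruction */
          printf("x=%.2f  code=%+d  xhat=%.2f\n", samples[i], code, dec_integrator);
      }
      return 0;
  }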
Telephone system terms Subscriber line: line to phone. Central office: telephone switching system. Off-hook: phone active. On-hook: phone inactive.
Real and simulated subscriber line Real subscriber line: 90V RMS ringing signal; companded analog signals; lightning protection, etc. Simulated subscriber line: microphone input; speaker output; switches for ring, off-hook, etc.
Requirements
Comments on analysis DRAM requirement influenced by DRAM price. Details of user interface protocol could be tested on a PC-based prototype.
Answering machine class diagram. [Class diagram relating Controls, Record, Playback, Outgoing-message, Incoming-message, and the physical interface classes Microphone*, Line-in*, Line-out*, Speaker*, Buttons*, and Lights*]
Physical interface classes: Microphone*: sample(); Line-in*: sample(), ring-indicator(); Line-out*: sample(), pick-up(); Buttons*: record-OGM, play; Lights*: messages, num-messages; Speaker*: sample().
Message classes: Message (base class): length, start-adrs, next-msg, samples. Outgoing-message: adds length = 30 sec. Incoming-message: adds msg-time.
Operational classes: Controls: operate(); Record: record-msg(); Playback: playback-msg().
Software components Front panel module. Speaker module. Telephone line module. Telephone input and output modules. Compression module. Decompression module.
Controls activate behavior. [State diagram: compute button and line activations; depending on the activation, play the OGM, record a new OGM, play the ICMs, erase messages, or answer an incoming call (play OGM, wait for timeout, allocate an ICM, record the ICM; the answer path also includes an erase step).]
Record-msg and playback-msg behaviors:
record-msg:
  nextadrs = 0;
  repeat {
    msg.samples[nextadrs] = sample(source);
    nextadrs++;
  } until end(source);
playback-msg:
  nextadrs = 0;
  repeat {
    speaker.sample() = msg.samples[nextadrs];
    nextadrs++;
  } until nextadrs == msg.length;
Hardware platform: CPU; memory; front panel; two A/D converters: subscriber line and microphone; two D/A converters: subscriber line and speaker.
Component design and testing: must test performance as well as functionality---compression time shouldn't dominate other tasks. Test for error conditions: memory overflow; trying to delete an empty message set, etc.
System integration and testing Can test partial integration on host platform; full testing requires integration on target platform. Simulate phone line for tests: it’s legal; easier to produce test conditions.