CIS 191: Linux and Unix Class 6 February 25th, 2016
Good News
Happy Spring Break
NO CLASS NEXT THURSDAY
Homework 5 still due March 17th, 3pm
Bad News
Quiz now
Outline
  Scheduled Jobs
  Processes
  Concurrent Operation and Scheduling
  Unix Process Management
  Devices
Daemons
Daemons are just background processes
They’re typically used to provide services that must always run
  – sshd = ssh server process
  – udevd = hardware daemon
  – syslogd = logging daemon
  – (Names end in “d” by convention)
Normal users may never realize they exist
But you want to be a power user
Running tasks periodically
Sometimes you want to run a program on a schedule
  – Generally for administrative tasks
  – “Water my lawn at noon every other day in the summer”
In UNIX systems, your tasks are more likely to be clearing an error log or backing up your thesis
Cron
The cron daemon handles periodic tasks
  – So if you want a task to run periodically, you must submit jobs to the cron daemon
  – It wakes up once a minute, checks which of its assigned tasks are due in that minute, and runs them
Many system tasks use cron
  – logging
  – backups
  – updates
Using cron
Don’t start cron yourself
From the man page (OS X):
  – “The cron utility is launched by launchd when it sees the existence of /etc/crontab or files in /usr/lib/cron/tabs. There should be no need to start it manually.”
cron starts by itself when a crontab file specifies jobs that should be run!
  – So, if you want to interact with cron, edit either the system crontab file or your personal crontab file
crontabs
In general, you don’t want to edit the system crontab
  – Unless you really know what you’re doing!
To edit your user-specific crontab, run
  – $ crontab -e
  – This launches your favorite editor, as specified in the shell’s VISUAL or EDITOR environment variable
To set vim as your default editor for command-line programs:
  export EDITOR=vim
  export VISUAL=vim
crontab format
Each line in a crontab file represents a task you would like the cron daemon to execute
MIN HR DOM MN DOW CMD
  – MIN: minute of the hour
  – HR: hour of the day
  – DOM: day of the month
  – MN: month of the year
  – DOW: day of the week (0=Sunday,1=Monday,2=Tuesday,…)
  – CMD: your command, followed by whatever arguments you’d like to pass to it
If both DOM and DOW are restricted (neither is *), the command will be run when either field is satisfied
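To make the six columns concrete, here is a minimal sketch that splits one crontab line into its fields with the shell's read builtin. The function name describe_cron_line, the sample entry and the --full argument are made up for illustration.

```shell
# describe_cron_line LINE
# Splits one crontab entry into the five time fields and the command.
# read does not expand *, so wildcard fields stay literal.
describe_cron_line() {
    read -r min hr dom mn dow cmd <<EOF
$1
EOF
    echo "minute=$min hour=$hr day-of-month=$dom month=$mn day-of-week=$dow"
    echo "command=$cmd"
}

# Hypothetical entry: everything after the fifth field is the command line.
describe_cron_line '30 08 10 06 * /path-to/my_sweet_backup.sh --full'
```

The last variable given to read soaks up the rest of the line, which is why a command with arguments ends up intact in cmd.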
crontab syntax
Again, recall the * wildcard (yes, it is important)
  – Wildcard: 30 08 10 06 * /path-to/my_sweet_backup.sh
      Run backup at 8:30 on June 10th, regardless of what weekday it is
  – Comma: 00 11,16,21 * * * /path-to/my_spam_script.sh
      Spam peeps at 11:00AM, 4:00PM, and 9:00PM every day
  – Hyphen: 00 09-18 * * 1-5 /path-to/check_db_status.sh
      Check database status every hour from 9AM to 6PM, M-F
  – * * * * * /path-to/tick.sh
      Run a ticker script every minute (every field is a wildcard, and the cron daemon wakes up every minute)
  – */10 * * * * /path-to/ten_minutes.sh
      Runs ten_minutes.sh every ten minutes every day
See this awesome link for more information!
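If you want to convince yourself how the wildcard, comma, hyphen and step forms behave, a rough matcher for a single field can be sketched in plain sh. cron_field_matches is a hypothetical helper, not part of cron, and it assumes a field that counts from 0 (minute or hour).

```shell
# cron_field_matches FIELD VALUE
# Succeeds (exit status 0) when VALUE satisfies one minute or hour field.
# Handles *, lists (a,b,c), ranges (a-b) and steps (*/n or a-b/n) only.
cron_field_matches() {
    rest=$1
    value=${2#0}; value=${value:-0}       # 08 -> 8 so it is not read as octal
    while [ -n "$rest" ]; do
        part=${rest%%,*}                  # take the next comma-separated item
        case $rest in
            *,*) rest=${rest#*,} ;;
            *)   rest= ;;
        esac
        step=1
        case $part in
            */*) step=${part#*/}; part=${part%%/*} ;;
        esac
        case $part in
            '*') lo=0; hi=$value ;;       # wildcard: any value is in range
            *-*) lo=${part%-*}; hi=${part#*-} ;;
            *)   lo=$part; hi=$part ;;
        esac
        lo=${lo#0}; lo=${lo:-0}
        hi=${hi#0}; hi=${hi:-0}
        if [ "$value" -ge "$lo" ] && [ "$value" -le "$hi" ] &&
           [ $(( (value - lo) % step )) -eq 0 ]; then
            return 0
        fi
    done
    return 1
}

if cron_field_matches '*/10' 40; then echo "minute 40 matches */10"; fi
if cron_field_matches '11,16,21' 16; then echo "hour 16 matches 11,16,21"; fi
if ! cron_field_matches '09-18' 8; then echo "hour 8 is outside 09-18"; fi
```

A real cron implementation also handles names like jan or mon and fields that start at 1, which this sketch leaves out.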
Caveat
If something goes wrong, you won’t see any warning or error!
Test your cron commands carefully before you put them in crontab
Always use absolute file paths to be safe
Check permissions too: can the job execute the script and access its files?
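One way to soften this caveat is to have every job leave a trace in a log file. The wrapper below is only a sketch: run_logged is a hypothetical name, and the paths in the commented crontab line are placeholders in the same style as the slides.

```shell
# run_logged LOGFILE COMMAND [ARGS...]
# Appends a timestamp, the command's stdout and stderr, and its exit status
# to LOGFILE, then returns that exit status.
run_logged() {
    log=$1
    shift
    status=0
    {
        echo "[$(date '+%Y-%m-%d %H:%M:%S')] running: $*"
        "$@" || status=$?
        echo "exit status: $status"
    } >> "$log" 2>&1
    return "$status"
}

# In a crontab, the same idea fits on one line with absolute paths:
# */10 * * * * /path-to/ten_minutes.sh >> /path-to/ten_minutes.log 2>&1
```

After a failed run, the log holds the error message that would otherwise have vanished.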
Scheduling a one-off task with at
Suppose you want to manage submissions right before and right after lecture, but you’re lecturing at that time
The at utility is perfect for this
  – Allows you to specify a time, and then enter commands to run at the time you’ve specified
Perfect for, say, disabling hw2 submission at 1:30, enabling hw2-late submission at 1:30, and enabling hw3 submission at 3:00
at syntax
at 1:30PM sep 17
>project -d hw2
>project -e hw2-late
>^D
job 2 at Wed Sep 17 13:30:
Running at
In Ubuntu, you’ll probably need to install at
  – sudo apt-get install at
  – It should just work after this…
In OS X, at relies on the atrun daemon to manage its jobs
  – atrun is disabled by default
  – To enable: launchctl load -w /System/Library/LaunchDaemons/com.apple.atrun.plist
  – See man atrun for details
Viewing and Manipulating the atq
at -l (or atq) lists the jobs you’ve submitted to at which have not run yet
at -c followed by a job number cats the job text to the terminal for you to view
at -r followed by a job number removes that job from the atq
Side note about notifications on Mac OS
Super cool trick
  osascript -e 'display notification "Lorem ipsum dolor" with title "Title"'
Sends a desktop notification
Can be combined with cron/at, better than any reminder/todo app!
Processes
Processes are “programs in execution”
  – They have associated context (data and execution)
  – The execution must progress sequentially
Processes are not programs or applications
  – We could run a program any number of times
  – Each instance of a program is run as a process!
  – Each process has its own address space and context
Why Processes?
Operating systems (including Linux versions) have to execute a lot of different programs
  – And these programs have to run concurrently…
Concurrent Processes
  – Word processor, browser, all running at once
  – Internally, memory management processes
  – Device serving processes…
Processes are Separated
Each process has its own context in which it executes
In fact, each process believes it has access to the entire machine!
  – This isn’t true, of course; that would lead to disaster
  – We aren’t going to go into the details of how this is implemented in this class, but you’ll get a full dose of it in both CIS 380 (OS) and CIS 371 (computer organization)
  – Yes, it’s sufficiently complicated to be fully covered in two classes
Process Contexts
PC (Program Counter): address of the next instruction
Stack: temporary data (function parameters, return addresses, locals)
Heap: dynamically allocated memory (objects, strings)
Static Data: globals
Code: the text that’s executed
Process Creation
Parent processes create child processes (fork)
  – This forms a tree of processes
A process is identified and managed through its process identifier (pid)
  – View processes on your machine with the ps command
  – ps -e (or ps aux) lets you see all processes (including the OS’s)
Unix Process Hierarchy
A Unix process is spawned by a parent process
A Unix process can spawn any number of children
The “root” process is called init, and it is created when the operating system is booted
  – So every process can be traced back to init
Each process has a unique process ID number
Unix Shell and Processes
When you type a command, the shell clones itself (fork)
The child process then calls exec, which causes it to stop executing the shell and start executing your command
The parent shell waits for the child to terminate
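You can watch the fork, exec and wait steps from the shell itself. This sketch starts a child in the background so that the waiting is explicit; the exit code 7 is arbitrary.

```shell
echo "parent shell pid: $$"

sh -c 'exit 7' &              # the shell forks; the child execs sh
child=$!                      # pid of the most recent background child

status=0
wait "$child" || status=$?    # the parent blocks here until the child terminates
echo "child $child exited with status $status"
```

For a foreground command the shell does the same thing, except that it calls wait right away instead of handing the prompt back.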
Recall the unix shell and subshells
Put commands in parentheses ( ) to execute them in a subshell
  cd ~/Documents/ ; (cd ~; pwd;) ; pwd
Put commands in curly braces { } to group them without spawning a subshell
  cd ~/Documents/ ; { cd ~; pwd; } ; pwd
End a command with the & symbol to execute it in a subshell in the background
  – Commands in the background can still write to stdout, so their output may interleave with your prompt; they are stopped if they try to read from the terminal
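The difference between ( ) and { } is easiest to see with a variable instead of a directory change. A small sketch (the variable names are arbitrary):

```shell
x=outer

( x=subshell )          # runs in a child process; the assignment is lost
after_parens=$x

{ x=braces; }           # runs in the current shell; the assignment sticks
after_braces=$x

echo "after ( ): $after_parens"   # prints "after ( ): outer"
echo "after { }: $after_braces"   # prints "after { }: braces"
```

The same reasoning explains the pwd examples: a cd inside parentheses changes the working directory of the subshell only.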
Keeping track of Processes
This is the kernel’s job
A data structure, usually called the process table, keeps track of each active process
Entries in this table are called process control blocks (PCBs)
So a process’ context is described by its PCB!
  – The kernel can stop and restart processes
  – In other words, the state of execution of the process is stored
Process states
Running
  – The process is currently executing. Only one process can be executing at a time on a single core.
Blocked
  – The process is waiting
  – Perhaps for I/O?
Ready
  – The process would like to run
How to schedule?
We need to schedule processes so that the system runs “well”
But what does “well” mean?
It doesn’t necessarily mean what’s best for any particular application
  – If you’re an application writer, you should consider the operating system to be a malicious adversary…
  – That mindset makes it easier to write robust programs
We need to balance between…
Efficiency
  – Spend as much time in user processes as we can
Fairness
  – Avoid starving processes for compute time
Priority Handling
  – Need to allow more important processes better service
Real-time constraints
  – Need to have a guaranteed level of service
Hardware constraints
  – How much time do we waste switching processes?
Take One: FCFS
First Come First Serve
  – When a job arrives, place it at the end of the service queue
  – The dispatcher selects the first job and runs it to completion of a CPU burst
It’s a nice, simple algorithm
But it’s inappropriate for interactive systems…
  – Why?
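A quick way to see the problem is to add up waiting times by hand. This sketch assumes every job arrives at time 0 in the order given and has a single CPU burst; fcfs_waits and the burst lengths are made up.

```shell
# fcfs_waits BURST...
# Prints how long each job waits under FCFS, then the average (integer division).
fcfs_waits() {
    [ "$#" -gt 0 ] || return 1
    clock=0
    total=0
    count=0
    for burst in "$@"; do
        count=$((count + 1))
        echo "job $count: burst $burst, waited $clock"
        total=$((total + clock))
        clock=$((clock + burst))    # the next job cannot start until this one is done
    done
    echo "average wait: $((total / count))"
}

fcfs_waits 24 3 3    # one long job ahead of two short ones: average wait 17
fcfs_waits 3 3 24    # same jobs, long one last: average wait 3
```

The short jobs (think keystrokes) get stuck behind the long one, which is exactly what an interactive user notices.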
Take two: Shortest Job First
The ready queue is treated as a priority queue, with the shortest job first
Then run as in FCFS
This has a provably optimal average waiting time
But starvation is possible
And you can’t actually implement it on real computers, because job lengths aren’t known in advance
  – Would have to approximate based on past CPU bursts
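To compare the two orderings, here is a sketch that sorts the bursts shortest-first and computes the average wait. It pretends the burst lengths are known up front, which is the very thing a real system cannot know; sjf_order and average_wait are hypothetical helpers.

```shell
# sjf_order BURST... : print the bursts shortest first, one per line
sjf_order() {
    printf '%s\n' "$@" | sort -n
}

# average_wait BURST... : average waiting time when jobs run in the given order,
# assuming all of them are ready at time 0 (integer division)
average_wait() {
    clock=0
    total=0
    for burst in "$@"; do
        total=$((total + clock))
        clock=$((clock + burst))
    done
    echo $((total / $#))
}

sorted=$(sjf_order 24 3 3)
echo "arrival order: average wait $(average_wait 24 3 3)"     # 17
echo "shortest first: average wait $(average_wait $sorted)"   # 3
```

Starvation shows up if short jobs keep arriving: the 24-unit job would keep getting sorted to the back.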
Take three: Round Robin
First come first serve, but preemptive
  – So a process can be interrupted against its will
We have a circular ready queue, otherwise similar to FCFS
Arriving jobs are placed at the end
The dispatcher selects the first job in the queue and runs it to the completion of a CPU burst, or until the time quantum expires
If the quantum expires, the job is placed at the end of the queue
Take three: Round Robin
This is a simple, low-overhead approach that works for interactive systems
But it relies on the time quantum value
  – If it’s too large (longer than a CPU burst), we approach FCFS
  – If it’s too small, we spend too much time context switching
  – A typical value is on the order of tens of milliseconds
A good rule of thumb is to choose a quantum so that the large majority of jobs finish their CPU burst in one quantum
Round robin assumes that all processes are equally important
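Round robin is short enough to simulate with the shell's positional parameters acting as the ready queue. This is a sketch under simple assumptions: all jobs arrive at time 0, context switches cost nothing, and the job names and bursts are invented.

```shell
# round_robin QUANTUM NAME:BURST...
# Prints the time at which each job finishes.
round_robin() {
    quantum=$1
    shift
    clock=0
    while [ "$#" -gt 0 ]; do
        name=${1%%:*}
        left=${1#*:}
        shift                                        # dispatcher takes the first job
        if [ "$left" -gt "$quantum" ]; then
            clock=$((clock + quantum))
            set -- "$@" "$name:$((left - quantum))"  # quantum expired: back of the queue
        else
            clock=$((clock + left))
            echo "$name finished at $clock"
        fi
    done
}

round_robin 4 A:24 B:3 C:3
# B finished at 7
# C finished at 10
# A finished at 30
```

Compared with FCFS on the same jobs, the short jobs B and C no longer wait for all 24 units of A. Try a quantum of 30 to see the schedule collapse back into FCFS.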
Task priority
But priority can be very important…
  – We want user-facing processes to be addressed first
  – Keyboard input, text processors
How can we select/compute the priorities?
Multi-Level Feedback Queues
Use the round robin technique, but with several levels of queues
The time quantum gets bigger as you go “down” the levels; priority decreases as you go down
Processes start in the highest level queue
If a process relinquishes control, it leaves the queue until it’s ready, then is re-inserted at the end of the same queue
If a process is not complete at the end of a time quantum, it is put at the end of the next queue down
A process in a given queue is not scheduled until all the higher queues are empty
What happens here?
Processes that do a lot of I/O congregate in the higher level queues
  – They don’t have to spend much time doing computation
  – Most of their time is spent waiting for devices
Processes that do a lot of processing sink into lower queues, where they’ll have more time to execute (when they are scheduled)
This will result in a more responsive system overall
  – Why?
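The sinking behaviour can be reproduced with a toy simulation. Everything specific here is an assumption made for the sketch: three levels, quanta of 2, 4 and 8 time units, all jobs ready at time 0, and no job ever blocking for I/O (so the "relinquishes control" rule is not modelled).

```shell
# mlfq NAME:BURST...
# Prints which job runs at which level and when it finishes.
mlfq() {
    q0="$*"; q1=""; q2=""
    clock=0
    while [ -n "$q0$q1$q2" ]; do
        # Only look at a lower queue when every higher queue is empty.
        if [ -n "$q0" ]; then level=0; quantum=2; queue=$q0
        elif [ -n "$q1" ]; then level=1; quantum=4; queue=$q1
        else level=2; quantum=8; queue=$q2
        fi
        set -- $queue                   # split the chosen queue into jobs
        name=${1%%:*}
        left=${1#*:}
        shift
        case $level in                  # the job leaves the front of its queue
            0) q0="$*" ;;
            1) q1="$*" ;;
            2) q2="$*" ;;
        esac
        if [ "$left" -le "$quantum" ]; then
            clock=$((clock + left))
            echo "$name: ran $left at level $level, finished at t=$clock"
        else
            clock=$((clock + quantum))
            left=$((left - quantum))
            # Used the whole quantum: end of the next queue down
            # (the bottom queue simply cycles round robin).
            if [ "$level" -eq 0 ]; then
                q1="${q1:+$q1 }$name:$left"; next=1
            else
                q2="${q2:+$q2 }$name:$left"; next=2
            fi
            echo "$name: ran $quantum at level $level, queued at level $next"
        fi
    done
}

mlfq short:2 long:12
```

The short job finishes inside its first quantum at the top level, while the long job drops a level each time it uses a full quantum and gets a longer slice once it is finally scheduled.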
Completely Fair Scheduler
In practice, most Linux distributions today use the “Completely Fair Scheduler”, or CFS
It uses a red-black tree to order the processes that would like to execute by the amount of time they have spent in execution so far
The process which has spent the least time in execution is selected next
For more information, check out the Wikipedia page on CFS!
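The selection rule itself fits in a few lines. In this sketch a plain sort stands in for the red-black tree, and the task names, starting runtimes and the 10-unit slice are all invented; it illustrates the "least runtime goes next" idea only, not the kernel's actual implementation.

```shell
# pick_next NAME:RUNTIME...
# Prints the entry with the least accumulated runtime.
pick_next() {
    printf '%s\n' "$@" | sort -t: -k2,2n | head -n 1
}

# run_slices COUNT NAME:RUNTIME...
# Repeatedly runs the least-served task for one hypothetical 10-unit slice.
run_slices() {
    count=$1
    shift
    n=0
    while [ "$n" -lt "$count" ]; do
        next=$(pick_next "$@")
        name=${next%%:*}
        ran=${next#*:}
        echo "run $name (runtime so far: $ran)"
        for entry in "$@"; do           # rebuild the list with the updated runtime
            shift
            if [ "$entry" = "$next" ]; then
                entry="$name:$((ran + 10))"
            fi
            set -- "$@" "$entry"
        done
        n=$((n + 1))
    done
}

run_slices 4 editor:5 compiler:40 backup:20
# run editor (runtime so far: 5)
# run editor (runtime so far: 15)
# run backup (runtime so far: 20)
# run editor (runtime so far: 25)
```

The tree matters in the real scheduler because finding and reinserting the minimum must stay cheap with many runnable processes; sorting the whole list every time would not scale.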
Scheduling in Unix
Based on the idea of multi-level feedback queues
Priorities from -64 to 63 (lower = higher priority)
Negative numbers are reserved for processes waiting to service devices
Time quantum is 0.1 seconds
  – Empirically found to be the longest quantum which doesn’t reduce perceived responsiveness for editors, etc.
  – A longer quantum also means fewer context switches
Priorities are adjusted to reflect resource requirements (waiting for an event) and resource consumption (CPU time)
Unix CPU Scheduler
There are two values (stored in the PCB/process table entry)
  – p_cpu: estimate of recent CPU usage
  – p_nice: a user/OS settable weighting factor (-20 to 20); default is 0, negative is higher priority, positive is lower priority
Each process’ priority is calculated periodically
  – priority = base + p_cpu + p_nice
p_cpu is incremented each time the system clock ticks and the process is found to be executing
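Plugging numbers into the formula shows how p_nice shifts a process up or down. The base of 50 and the p_cpu of 12 below are made-up values for illustration; only the formula comes from the slide.

```shell
# unix_priority BASE P_CPU P_NICE
# The slide's formula; a lower result means the process is scheduled sooner.
unix_priority() {
    echo $(( $1 + $2 + $3 ))
}

echo "nice 0:   $(unix_priority 50 12 0)"    # 62
echo "nice -5:  $(unix_priority 50 12 -5)"   # 57, favoured
echo "nice 10:  $(unix_priority 50 12 10)"   # 72, pushed back
```

From the command line, the nice and renice utilities are how a user sets this weighting for a command or a running process.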
Manually managing processes
You can view which processes are running by using the ps command
  – This can be piped to grep to find a particular process
You can then stop a running process with the kill command (it does more than kill…)
  – Really it sends a signal, based on what you pass it
  – kill -9 sends SIGKILL! This can’t be handled or ignored.
  – kill -2 sends SIGINT (same as ctrl-C)
  – And other things! Check out the man page for more details.
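Here is a small, safe way to try signals out: start a throwaway sleep in the background and signal it. The sketch uses SIGTERM (15) for the polite case; the exit statuses reported by wait are 128 plus the signal number in bash and dash, and may differ in other shells.

```shell
sleep 30 &                      # a long-running child to practise on
pid=$!
kill -15 "$pid"                 # SIGTERM: a request the process could handle
term_status=0
wait "$pid" || term_status=$?
echo "after SIGTERM, wait reported $term_status"

sleep 30 &
pid=$!
kill -9 "$pid"                  # SIGKILL: cannot be caught or ignored
kill_status=0
wait "$pid" || kill_status=$?
echo "after SIGKILL, wait reported $kill_status"
```

In everyday use you would find the pid with something like ps -e | grep sleep instead of $!, and reach for -9 only after a plain kill has failed.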
Viewing live process update feed
You can run the top command to see a continuously updating view of the processes running on your machine
Like running ps over and over again…
You can also install htop, a tricked-out version of top, through apt-get
Remember /dev?
/dev contains system device files
Linux systems accomplish tons of magic by treating devices as special files
  – And by pretending that certain non-device objects are files…
  – Linux employs devices to provide lightweight “system services”
The contents of /dev have odd permissions if you check with ls -l
Device Files are “Pseudofiles”
When you read from or write to a device “file”, the operating system silently intercepts your request and feeds it to a driver
The driver knows how to interact with the device
Drivers are typically written in C
  – Can anyone tell me why that might be?
Device File Permissions
If you took the liberty of running ls -l on the files in /dev, you might notice an unusual first character in the permissions column
A “b” means that this is a “block” device
You could also see a “c”, which would mean it is a “character” device
The size field has been replaced by a comma-separated pair of numbers
  – The first value is the major device number
  – The second value is the minor device number
Types of Devices
Character Devices
  – Denoted by “c” character at start of permissions
  – Provide direct, unbuffered access to the hardware
  – Examples: serial ports, keyboards, sound cards, modems
Block Devices
  – Denoted by “b” character at start of permissions
  – Provide buffered access to the hardware
  – Always allowed to read/write any sized block
  – Buffered => we don’t know when the data will be written
Pseudo Devices
  – Fake devices that are useful
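The shell's test command can tell you which kind of device file you are looking at without reading ls -l output. A minimal sketch (device_kind is a hypothetical helper name):

```shell
# device_kind PATH : report how the kernel classifies PATH
device_kind() {
    if [ -b "$1" ]; then
        echo "block device"
    elif [ -c "$1" ]; then
        echo "character device"
    elif [ -e "$1" ]; then
        echo "not a device file"
    else
        echo "no such file"
    fi
}

echo "/dev/null: $(device_kind /dev/null)"
echo "/: $(device_kind /)"
```

Pointing it at a disk such as /dev/sda would report a block device on systems where that name exists.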
Some Pseudo Devices
/dev/null
  – Accepts all data written to it and does nothing
/dev/full
  – Always full; returns a continuous stream of NULL bytes when read and returns a “disk full” error when written to
/dev/zero
  – Produces an endless stream of zero bytes when read
/dev/random and /dev/urandom
  – Provide a stream of random bytes
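These devices are harmless to poke at from the command line. The sketch below reads a few bytes from each; the byte counts are arbitrary, and the /dev/full part is guarded because that device is assumed to exist only on Linux.

```shell
# /dev/null swallows writes and reads back as empty
echo "discard me" > /dev/null
null_bytes=$(wc -c < /dev/null | tr -d ' ')

# /dev/zero hands out as many zero bytes as you ask for
zero_bytes=$(head -c 16 /dev/zero | wc -c | tr -d ' ')

# /dev/urandom hands out random bytes; od shows them as hex
random_hex=$(head -c 4 /dev/urandom | od -An -tx1 | tr -d ' \n')

echo "read $null_bytes bytes from /dev/null and $zero_bytes from /dev/zero"
echo "four random bytes: $random_hex"

# /dev/full: writes are expected to fail as if the disk were full
if [ -c /dev/full ]; then
    if ! echo test > /dev/full 2>/dev/null; then
        echo "write to /dev/full failed"
    fi
fi
```

/dev/full is handy for checking that your own scripts notice a failed write.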
Hard Disk Partitions
Each computer may have several hard drives attached, and each hard drive can be divided into several partitions
  – Windows assigns each partition its own drive letter (like C:)
  – Linux allows you to specify where the data on a given partition should appear in the filesystem
Every hard drive is assigned a name in /dev
  – Like /dev/sda for the first drive, or /dev/sdb for the second
  – Names start with sd followed by a letter
The nth partition of the drive sdb is sdb(n)
  – So sdb3 is the third partition on the second hard disk
Mounting and Unmounting
To use a partition, you can use the mount command
  – The usage is mount device location
  – For example, mount /dev/sda2 /media/windows
  – You’ll probably need to be root to run mount
The mounted filesystems and devices are tracked in /etc/mtab
umount unmounts a directory
  – Note the absence of the ‘n’ in umount
fstab
Static filesystem information is written in /etc/fstab
  – According to the man page, “it is the duty of the system administrator to properly create and maintain this file”
  – At boot, fstab tells the system which filesystems should be mounted
  – Afterwards, fstab is used (by mount/umount) to describe how to mount and unmount filesystems
fstab entries contain the filesystem location, the mount point, the filesystem type, options, and two numbers: whether the dump utility should back up the filesystem, and the order in which fsck checks it at boot
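An fstab entry is just six whitespace-separated fields, so it can be picked apart the same way as any other line of text. The entry below is hypothetical: it reuses the /dev/sda2 and /media/windows names from the mount example and assumes an ntfs filesystem with default options.

```shell
# describe_fstab_line LINE
# Splits one fstab entry into its six fields and prints them with labels.
describe_fstab_line() {
    read -r device mount_point fs_type options dump pass <<EOF
$1
EOF
    echo "mount $device at $mount_point as $fs_type (options: $options)"
    echo "dump flag: $dump, fsck order: $pass"
}

describe_fstab_line '/dev/sda2 /media/windows ntfs defaults 0 0'
# mount /dev/sda2 at /media/windows as ntfs (options: defaults)
# dump flag: 0, fsck order: 0
```

With a line like this in /etc/fstab, mount /media/windows is enough, since mount looks up the device and type itself.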