Published by Philip Miller. Modified over 8 years ago.
1
Unix Commands
Xiaolan Zhang, Spring 2013
2
Outline
    awk
    Commands working with files
    Process-related commands
3
Some useful tips
Bash stores the command history:
    Use the UP/DOWN arrow keys to browse it
    Use "history" to show past commands
Repeat a previous command:
    !n repeats command number n from the history, e.g., !239
    !string repeats the most recent command starting with string, e.g., !g++
Search for a command:
    Type Ctrl-r and then a string; Bash searches previous commands for a match
File name autocompletion: the "tab" key
4
awk: what is it?
    A programming language designed to simplify many common text processing tasks
    Online manual: info system vs. man system
    Version issue: old awk (before the mid-1980s) and new awk (after); implementations include awk, oawk, nawk, gawk, mawk, …
5
Overview
    awk [ -F fs ] [ -v var=value ... ] 'program' [ -- ] [ var=value ... ] [ file(s) ]
    awk [ -F fs ] [ -v var=value ... ] -f programfile [ -- ] [ var=value ... ] [ file(s) ]
-F option: defines the field separator
Program:
    Consists of pairs of a pattern and a braced action, e.g.,
        /zhang/ {print $3}
        NR<10 {print $0}
    Provided on the command line or in a file
Initialization:
    With the -v option: takes effect before the program is started
    Other var=value assignments: may be interspersed with file names, and apply to the files supplied after them
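A minimal sketch of the -F and -v options described above. The sample records and the variable names sample, user, and out are made up for illustration; they are not from the slides.

```shell
# Hypothetical sample data in /etc/passwd format (not from the slides).
sample='alice:x:1001:100:Alice:/home/alice:/bin/bash
bob:x:1002:100:Bob:/home/bob:/bin/sh'

# -F ':' sets the field separator; -v user=alice assigns the awk variable
# "user" before the program starts, so it is usable in the pattern.
out=$(printf '%s\n' "$sample" | awk -F ':' -v user=alice '$1 == user { print $1, $3 }')
echo "$out"
```

Passing values with -v avoids splicing shell variables into the quoted awk program text.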
6
awk script
An executable file that starts with the line #!/bin/awk -f

    #!/bin/awk -f
    BEGIN { lines = 0; total = 0 }
    { lines++; total += $1 }
    END {
        if (lines > 0)
            print "average is", total/lines
        else
            print "no records"
    }

Demo: $ average.awk avg.data
7
awk programming model
Input: awk views an input stream as a collection of records, each of which can be further subdivided into fields.
    Normally, a record is a line, and a field is a word of one or more non-whitespace characters.
    However, what constitutes a record and a field is entirely under the control of the programmer, and their definitions can even be changed during processing.
8
awk program
An awk program consists of pairs of patterns and braced actions, possibly supplemented by functions that implement the actions.
For each pattern that matches the input, the action is executed; all patterns are examined for every input record.
    pattern { action }    Run action if pattern matches
Either part of a pattern/action pair may be omitted.
If the pattern is omitted, the action is applied to every input record:
    { action }            Run action for every record
If the action is omitted, the default action is to print the matching record on standard output:
    pattern               Print record if pattern matches
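The three forms of a pattern/action pair can be sketched as follows. The fruit data and the shell variable names are invented for this illustration.

```shell
# Hypothetical input: a name and a count per line.
data='apple 3
banana 7
cherry 12'

# pattern { action }: print the first field of records whose second field exceeds 5.
both=$(printf '%s\n' "$data" | awk '$2 > 5 { print $1 }')

# { action } only: the action runs for every record (here it prints the record number).
action_only=$(printf '%s\n' "$data" | awk '{ print NR }')

# pattern only: the default action prints the whole matching record.
pattern_only=$(printf '%s\n' "$data" | awk '/^b/')

printf '%s\n' "$both" "$action_only" "$pattern_only"
```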
9
BEGIN and END patterns
The action associated with BEGIN is performed just once, before any command-line files or ordinary command-line assignments are processed, but after any leading -v option assignments have been done. It is normally used to handle any special initialization tasks required by the program.
The END action is performed just once, after all of the input data has been processed. It is normally used to produce summary reports or to perform cleanup actions.
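A small sketch of when BEGIN, ordinary, and END actions run. The three input numbers are arbitrary sample values.

```shell
# Feed three one-field records to a program with BEGIN, per-record, and END actions.
out=$(printf '%s\n' 4 8 15 | awk '
BEGIN { print "start" }                            # runs once, before any input is read
      { total += $1 }                              # runs for every record
END   { print "records:", NR, "total:", total }    # runs once, after all input
')
echo "$out"
```

In the END action NR still holds the number of the last record read, which is why it can serve as a line count.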
10
Input is switched automatically from one input file to the next, and awk itself normally handles the opening, reading, and closing of each input file.
11
Action
    Enclosed by braces
    Statements: separated by newline or ;
        Assignment statement
        print statement
        if statement, if/else statement
        while loop, do/while loop
        for loop (the three-part form, and the one-part form: for (key in array))
        break, continue
13
Using awk
To cut:
    awk -F ':' '{print $1,$3;}' /etc/passwd
To simulate head (first 10 lines):
    awk 'NR<=10 {print $0}' /etc/passwd
To count lines:
    awk 'END {print NR}' /etc/passwd
What's my UID (numerical user id)?
    awk -F ':' '/^zhang/ {print $3}' /etc/passwd
14
Doing something new
Output the logarithm of the numbers in the first field:
    echo 10 | awk '{print $0,log($0)}'
Sum all fields together:
    awk '{sum=0; for (i=1;i<=NF;i++) sum+=$i; print sum}' data2
How about a weighted sum? Four fields with weights (0.1, 0.3, 0.4, 0.2):
    awk '{sum=$1*0.1+$2*0.3+$3*0.4+$4*0.2; print sum}' data2
15
Awk variables
Differences from C/C++ variables:
    Initialized to 0 or the empty string
    No need to declare; variable types are decided by context
    All variables are global (even those first used inside a function body!)
Difference from shell variables:
    Referenced without $, except for the fields $0, $1, …, $NF
Conversion between numeric and string values:
    N=123; s="" N     ## s is assigned "123"
    S="123"; N=0+S    ## N is assigned 123
Floating point arithmetic operations:
    awk '{print $1 "F=" ($1-32)*5/9 "C"}' data
    echo 38 | awk '{print $1 "F=" ($1-32)*5/9 "C"}'
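The conversion rules above can be checked with a short sketch. The names s, t, m, and unset_var are illustrative only; unset_var is deliberately never assigned.

```shell
out=$(awk 'BEGIN {
    n = 123
    s = "" n      # concatenation with the empty string gives the string "123"
    t = "42abc"
    m = 0 + t     # adding 0 forces a number; the leading numeric prefix 42 is used
    # An unassigned variable compares equal to both 0 and the empty string.
    print length(s), m, (unset_var == 0), (unset_var == "")
}')
echo "$out"
```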
18
Working with strings
    length(a): returns the length of a string
    substr(a, start, len): returns a copy of the substring of len characters, starting at the start-th character of a
        substr("abcde", 2, 3) returns "bcd"
    toupper(a), tolower(a): lettercase conversion
    index(a, find): returns the starting position of find in a
        index("abcde", "cd") returns 3
    match(a, regexp): matches string a against the regular expression regexp; returns the index of the match if matching succeeds, otherwise returns 0
        Similar to (a ~ regexp), which returns 1 or 0
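A quick sketch exercising the functions listed above on the slide's own "abcde" string. The regular expressions /c+d/ and /x/ are examples chosen here.

```shell
out=$(awk 'BEGIN {
    a = "abcde"
    # length, substring, case conversion, literal search, regexp search (hit and miss)
    print length(a), substr(a, 2, 3), toupper(a), index(a, "cd"), match(a, /c+d/), match(a, /x/)
}')
echo "$out"
```

After a successful match(), awk also sets RSTART and RLENGTH to the position and length of the matched text.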
19
Working with strings (2)
    sub(regexp, replacement, target)
    gsub(regexp, replacement, target)    -- global
Matches target against regexp, and replaces the leftmost (sub) or all (gsub) longest matches by the string replacement
E.g., gsub(/[^$-0-9.,]/, "*", amount)
    Replaces every character that is not legal in an amount with *
To extract string constants from a file:
    sub(/^[^"]+"/, "", value)    ## replace everything up to and including the first " by the empty string
    sub(/".*$/, "", value)       ## replace everything from the next " onward by the empty string
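A sketch of sub versus gsub, including the two-step extraction of a quoted string shown above. The sample strings are invented for this example.

```shell
out=$(awk 'BEGIN {
    value = "x = \"hello\" + y"
    sub(/^[^"]+"/, "", value)    # drop everything up to and including the first quote
    sub(/".*$/, "", value)       # drop the closing quote and everything after it

    s = "a-b-c"
    n = gsub(/-/, "+", s)        # gsub replaces every match and returns the count
    t = "a-b-c"
    sub(/-/, "+", t)             # sub replaces only the leftmost match

    print value, n, s, t
}')
echo "$out"
```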
20
Working with strings (3)
split(string, array, regexp): breaks string into pieces stored in array, using the delimiter given by regexp

    function split_path(target) {
        n = split(target, paths, "/")
        for (k = 1; k <= n; k++)
            print paths[k]
        ## Alternative way to iterate through the array:
        ## for (path in paths)
        ##     print paths[path]
    }
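A small sketch of split on a path. The path /usr/local/bin and the array name parts are illustrative. Note that the leading "/" yields an empty first element.

```shell
out=$(awk 'BEGIN {
    n = split("/usr/local/bin", parts, "/")
    # Elements are numbered from 1; the text before the first "/" is empty.
    for (k = 1; k <= n; k++)
        printf "%s%d=[%s]", (k > 1 ? " " : ""), k, parts[k]
    print ""
}')
echo "$out"
```

The numeric loop visits elements in order; a for (k in parts) loop visits the same elements but in an unspecified order.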
21
String formatting
    sprintf(), printf()
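As a hedged illustration of the two functions named on this slide: sprintf returns the formatted string, while printf writes it to the output. The format string and values below are arbitrary examples.

```shell
out=$(awk 'BEGIN {
    # %-6s: left-justified in 6 columns; %5.2f: width 5 with 2 decimals; %03d: zero-padded to 3 digits
    s = sprintf("%-6s|%5.2f|%03d", "pi", 3.14159, 7)
    printf "%s\n", s    # unlike print, printf adds no newline unless the format has one
}')
echo "$out"
```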
22
awk array variables
    Arrays can be indexed using integers
    Associative arrays: arrays can also be indexed using strings
Example: weighted sum
    Read the weights from a file
    Calculate the weighted sum for another file using those weights
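A minimal sketch of an associative array used as a counter. The names and grades in the input are made up for this example.

```shell
# Count how many records carry each grade (second field).
out=$(printf '%s\n' 'alice A' 'bob B' 'carol A' | awk '
{ count[$2]++ }                           # the grade string itself is the array index
END { print count["A"], count["B"] }
')
echo "$out"
```

Array elements spring into existence on first use, starting from the usual initial value of 0 or the empty string.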
23
weightedsum.awk

    #!/bin/awk -f
    NR==1 {    ## read the weights
        for (num=1; num<=NF; num++) {
            w[num] = $num
        }
    }
    NR==2 {    ## read the letter-grade mapping thresholds
        Athresh = $1
        Bthresh = $2
        Cthresh = $3
        Dthresh = $4
    }
    NR>2 {     ## process each record
        sum = 0    ## needed: sum is global and would otherwise carry over from the previous record
        for (col=1; col<=NF; col++)
            sum += ($col*w[col])
        printf("%s %d ", $0, sum)
        if (sum>=Athresh) print "A"
        else if (sum>=Bthresh) print "B"
        else if (sum>=Cthresh) print "C"
        else if (sum>=Dthresh) print "D"
        else print "F"
    }

A $ is needed when referring to the fields of the record; no $ for other variables!
To do:
    1. Try using data2
    2. Use an array to store the four thresholds
    3. Check to make sure the weights sum up to 1
24
Associative array
Suppose the input file is as follows:

    0.1 0.2 0.3 0.4    ## weights
    A 90               ## A if total is greater than or equal to 90
    B 80
    C 70
    D 60
    F 0
    alice 100 100 100 200
    jack 10 10 10 300
    smith 20 20 20 200
    john 30 30 30 200
    zack 10 10 10 10
25
    #!/bin/awk -f
    NR==1 {    ## read the weights
        for (num=1; num<=NF; num++) {
            w[num] = $num
        }
    }
    /^[A-F] / {    ## read the letter-grade mapping thresholds
        thresh[$1] = $2    ## index by the letter, store the threshold
    }
    /^[a-z]/ {     ## this code is executed once for each student line
        sum = 0
        for (col=2; col<=NF; col++)
            sum += ($col*w[col-1])
        printf("%s %d ", $0, sum)
        if (sum>=thresh["A"]) print "A"
        else if (sum>=thresh["B"]) print "B"
        else if (sum>=thresh["C"]) print "C"
        else if (sum>=thresh["D"]) print "D"
        else print "F"
    }
26
Awk user-defined function
Can be defined anywhere: before, after, or between pattern/action groups
    Convention: placed after the pattern/action code, in alphabetical order
Definition:
    function name(arg1, arg2, …, argn)
    {
        statement(s)
    }
Call:
    name(exp1, exp2, …, expn);
    result = name(exp1, exp2, …, expn);
return statement: return expr
    Terminates the current function and returns control to the caller with the value of expr
    Default value: 0 or "" (empty string)
Named arguments: local variables of the function; they hide global variables with the same name
27
Variable and argument

    function a(num) {
        for (n=1; n<=num; n++)
            printf("%s", "*")
    }
    {
        n = $1
        a(n)
        print n
    }

Warning: variables used in a function body but not included in the argument list are global variables
Todo:
    1. What's the output? echo 3 | awk -f global_var.ark
    2. Try it …
28
Solution: make n a local variable
It is hard to avoid variables with the same name, especially i, j, k, ...

    function a(num,    n) {
        for (n=1; n<=num; n++)
            printf("%s", "*")
    }
    {
        n = $1
        a(n)
        print n
    }

Todo:
    1. What's the output now? echo 3 | awk -f global_var.ark
Convention: list non-argument local variables last, with extra leading spaces
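A sketch that runs both versions of a() side by side. Holding the program text in the shell variables global_version and local_version is a convenience chosen here, not something from the slides.

```shell
# Version 1: n is not in the argument list, so the loop clobbers the global n.
global_version='
function a(num) { for (n = 1; n <= num; n++) printf "%s", "*" }
{ n = $1; a(n); print n }'

# Version 2: n is listed as an extra argument, so it is local to a().
local_version='
function a(num,    n) { for (n = 1; n <= num; n++) printf "%s", "*" }
{ n = $1; a(n); print n }'

out_global=$(echo 3 | awk "$global_version")
out_local=$(echo 3 | awk "$local_version")
printf '%s\n%s\n' "$out_global" "$out_local"
```

In the first version the loop leaves the global n at 4 when it exits; in the second the caller's n keeps its value of 3.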
29
Awk function: factoring.awk

    #!/bin/awk -f
    function factor(number) {
        factors = ""    ## initialize string storing the factoring result
        m = number      ## m: remaining part to be factored
        for (i=2; (m>1) && (i^2<=m); )    ## try i; i starts from 2 and goes up to sqrt of m
        {
            ## code omitted …
        }
        if (m>1 && factors!="")    ## if m is not yet 1,
            factors = factors " * " m
        print number, (factors=="") ? " is prime " : (" = " factors)
    }
    { factor($1) }    ## call factor function to factor first field for each record

Do these:
    1. Test it: echo 2013 | factoring.awk
    2. Modify it to return the factors string, instead of printing it
    3. Add a function, isPrime. Hint: you can call factor()
    4. For each line of input, count the number of prime numbers in the line
30
User-controlled Input
Usually, one does not worry about reading from files
    You specify what to do with each line of input
Sometimes, you want to
    Read the next record: in order to process the current one …
    Read different files:
        Dictionary files versus text files (to spell check): need to load the dictionary files first …
    Read a record from a pipeline
Use getline
31
User-controlled Input
32
Interactive awk

    $ awk 'BEGIN {print "Hi:"; getline answer; print "You said: ", answer;}'
    Hi:
    Yes?
    You said:  Yes?

To load a dictionary:
    nwords = 1
    while ((getline words[nwords] < dictfile) > 0)    ## dictfile: variable holding the dictionary file name
        nwords++
To get the current time into a variable:
    "date" | getline now
    close("date")
    print "time is now: " now
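A runnable sketch of the two getline forms above: reading from a named file and reading from a command. The temporary word list, the awk variable dict, and the "echo hello" command are stand-ins chosen for this example, and mktemp is assumed to be available.

```shell
# Build a small stand-in dictionary file.
tmpfile=$(mktemp)
printf '%s\n' apple banana cherry > "$tmpfile"

out=$(awk -v dict="$tmpfile" 'BEGIN {
    nwords = 0
    # getline var < file returns 1 per record read, 0 at end of file, -1 on error.
    while ((getline line < dict) > 0)
        words[++nwords] = line
    close(dict)

    # command | getline var reads one line of the output of the command.
    "echo hello" | getline greeting
    close("echo hello")

    print nwords, words[2], greeting
}')
rm -f "$tmpfile"
echo "$out"
```

Testing the return value with > 0 matters: a missing file makes getline return -1, which a bare while (getline ...) would treat as true forever.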
33
Output redirection: to files
print or printf to a file, using > and >>

    #!/bin/awk -f
    # usage: copy.awk file1 file2 … filen target=targetfile
    BEGIN {
        for (k=0; k<ARGC; k++)
            if (ARGV[k] ~ /target=/) {    ## extract target file name
                target_file = substr(ARGV[k], 8)
            }
        printf "" > target_file    ## create or truncate the target file
        close(target_file)
    }
    { print FILENAME, $0 >> target_file }
    END { close(target_file) }    ## optional, as files will be closed upon termination
34
Output redirection: to pipeline
sort_pipe.awk

    #!/bin/awk -f
    # demonstrate using pipeline
    BEGIN { FS = ":" }
    {   # select username for users using bash
        if ($7 ~ "/bin/bash")
            print $1 >> "tmp.txt"
    }
    END {
        close("tmp.txt")    ## flush the output before reading it back
        while ((getline < "tmp.txt") > 0) {
            cmd = "mail -s Fellow_BASH_USER " $0
            print "Hello," $0 | cmd    ## send an email to every bash user
            close(cmd)
        }
        close("tmp.txt")
    }
35
Execute external command
Using the system function (similar to C/C++)
    E.g., system("rm -f tmp") to remove a file
        if (system("rm -f tmp") != 0)
            print "failed to rm tmp"
A shell is started to run the command line passed as the argument
    It inherits the awk program's standard input/output/error
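A harmless sketch of checking the value returned by system(), using the standard true and false commands in place of rm so that nothing is deleted.

```shell
out=$(awk 'BEGIN {
    status = system("true")          # the exit status of the command is returned
    print "true returned", status
    if (system("false") != 0)        # a nonzero status signals failure
        print "false reported failure"
}')
echo "$out"
```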
36
Outline
    awk
    Commands working with files
    Process-related commands
37
df - report file system disk space usage
    df [OPTION]... [FILE]...
    Shows information about the file system on which each FILE resides, or all file systems by default.
du - estimate file space usage
    du [OPTION]... [FILE]...
    Summarizes disk usage of each FILE, recursively for directories.
quota - display disk usage and limits
38
What's in a file?
Files are organized in a hierarchical directory structure
    Each file has a name, resides under a directory, and is associated with some administrative information (permission, owner)
Contents of a file:
    Text (ASCII) file (such as your C/C++ source code)
    Executable file (commands)
    A link to other files, …
To check the type of a file: file filename
To view an "octal dump" of a file: od filename
39
ln - make links between files
    ln -s /path/to/file1.txt /path/to/file2.txt
40
Compare file contents
Suppose you carefully maintain different versions of your projects (so that you can undo some changes), and want to check what the differences are.
    cmp file1 file2: finds the first place where two files differ (in terms of line and character)
    diff file1 file2: reports all lines that are different
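A sketch of using cmp and diff on two versions of a file. The file names v1 and v2 and their contents are invented, and mktemp -d is assumed to be available.

```shell
dir=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$dir/v1"
printf 'one\n2\nthree\n'   > "$dir/v2"

# cmp -s prints nothing and reports only through its exit status (0 means identical).
if cmp -s "$dir/v1" "$dir/v2"; then same=yes; else same=no; fi

# diff marks lines from the first file with "<" and from the second with ">".
diff_lines=$( { diff "$dir/v1" "$dir/v2" || true; } | grep -c '^[<>]')

rm -rf "$dir"
echo "$same $diff_lines"
```

Here only the second line differs, so diff reports one "<" line and one ">" line.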
41
Working with files (chapter 10)
42
Outline
    awk
    Commands working with files
    Process-related commands
43
The workings of the shell
For each command line, the shell creates a new child process to run the command
    Sequential commands: e.g., date; who
        The two commands are run in sequence
    Pipelined commands: e.g., ls -l | wc
        The two programs are loaded and executed simultaneously
The shell waits for their completion, and then displays the prompt to get the next command …
44
Important concept: Process
Early computers ran one job from start to finish
Multiprogramming was popularized later
    Multiple programs are loaded in memory, and the system switches between them when one is waiting for I/O => increased CPU utilization
Timesharing: a variant of multiprogramming, in which each user has an online terminal (multiple users sharing the system)
45
Process
A process is an instance of a running program
    It is associated with a unique number, the process ID
    The OS stores its running state
A process is different from a program
    wc, ls, a.out, … are programs, i.e., executable files which program […]
    When you run a program, you start a process to execute the program's code
    Multiple processes can run the same program
At any time, there are multiple processes in the system
    One of them is running; the rest are either waiting for I/O or waiting to be scheduled
46
Loading Program
Programs are stored in secondary storage (hard disks, CD-ROM, DVD)
To process data, the CPU requires a working area, the main memory
    Also called: RAM (random access memory), primary storage, and internal memory
Before a program is run, it must first be copied from the slow secondary storage into the fast main memory
    This provides the CPU with fast access to the instructions to execute
47
ps command
To report a snapshot of current processes: ps
By default: reports processes belonging to the current user and associated with the same terminal as the invoker.
Example:

    [zhang@storm ~]$ ps
      PID TTY          TIME CMD
    15002 pts/2    00:00:00 bash
    15535 pts/2    00:00:00 ps

List all processes: ps -e
48
BSD style output of ps
Learn more about the command using man ps

    [zhang@storm ~]$ ps axu
    USER  PID %CPU %MEM  VSZ RSS TTY STAT START TIME COMMAND
    root    1  0.0  0.0 2112 672 ?   Ss   Jan17 0:11 init [3]
    root    2  0.0  0.0    0   0 ?   S<   Jan17 0:00 [kthreadd]
    root    3  0.0  0.0    0   0 ?   S<   Jan17 0:00 [migration/0]
    root    4  0.0  0.0    0   0 ?   S<   Jan17 0:00 [ksoftirqd/0]
    root    5  0.0  0.0    0   0 ?   S<   Jan17 0:00 [watchdog/0]
    root    6  0.0  0.0    0   0 ?   S<   Jan17 0:00 [migration/1]
    root    7  0.0  0.0    0   0 ?   S<   Jan17 0:00 [ksoftirqd/1]
    root    8  0.0  0.0    0   0 ?   S<   Jan17 0:00 [watchdog/1]
    root    9  0.0  0.0    0   0 ?   S<   Jan17 0:00 [migration/2]
49
Run program in background
To start some time-consuming job, and go on to do something else:
    $ command [ [ - ] option(s) ] [ option argument(s) ] [ command argument(s) ] &
    wc ch* > wc.out &
The shell starts a process to run the command and does not wait for its completion, i.e., it goes back to read and parse the next command
Shell builtin command: wait
Kill a process: kill pid
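A minimal sketch of starting a background job and waiting for it. The one-second sleep stands in for a time-consuming job, and the temporary file is only there to capture its result; mktemp is assumed to be available.

```shell
tmp=$(mktemp)

( sleep 1; echo done > "$tmp" ) &    # "&" starts the job without waiting for it
bg_pid=$!                            # $! holds the process ID of the last background job

# The shell is free to do other work here while the job runs.

wait "$bg_pid"                       # block until that particular job has finished
result=$(cat "$tmp")
rm -f "$tmp"
echo "$result"
```

The same process ID could be passed to kill to stop the job early instead of waiting for it.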
50
Some useful commands
To let a process keep running even after you log off (no hangup):
    nohup COMMAND &
    Output will be saved in nohup.out
To run your program with low priority:
    nice [OPTION] [COMMAND [ARG]...]
51
Some useful commands
To start programs at a specified time (e.g., midnight):
    at [-V] [-q queue] [-f file] [-mldv] timespec...
By default, at reads the programs from standard input:

    [zhang@storm assignment]$ date
    Mon Jan 31 21:51:38 EST 2011
    [zhang@storm assignment]$ at 10pm < todo
    job 15 at Mon Jan 31 22:00:00 2011
    [zhang@storm assignment]$ more todo
    echo "HI!"
    ls | wc -l > temp
52
top command
top - display Linux tasks
SYNOPSIS
    top -hv | -bcHisS -d delay -n iterations -p pid [, pid...]
    The traditional switches '-' and whitespace are optional.
The top program provides a dynamic real-time view of a running system. It can display system summary information as well as a list of tasks currently being managed by the Linux kernel. The types of system summary information shown and the types, order and size of information displayed for tasks are all user configurable, and that configuration can be made persistent across restarts.
The program provides a limited interactive interface for process manipulation as well as a much more extensive interface for personal configuration -- encompassing every aspect of its operation. And while top is referred to throughout this document, you are free to name the program anything you wish. That new name, possibly an alias, will then be reflected on top's display and used when reading and writing a configuration file.
53
Outline
    awk (chapter 9)
    Commands working with files (chapter 10)
    Process-related commands (chapter 13)