Adv. UNIX: Filters/41
Advanced UNIX
Special Topics in Comp. Eng. 1
Semester 2,
Filters (Part II, Sobell)

Objectives
– to discuss five useful filters: tr, grep, awk, sed, and find

1. tr
Format: tr [options] string1 [string2]
tr reads its standard input and translates each character in string1 to the corresponding character in string2.

Examples
    $ echo 12abc3def4 | tr 'abcdef' 'xyzabc'
    12xyz3abc4
    $ echo 12abc3def4 | tr '[a-c][d-f]' '[x-z][a-c]'
    12xyz3abc4
    $ cat foo.txt | tr '[A-Z]' '[a-z]'

    $ tr '\015' ' ' < file1 > file2
    – \015 is carriage return
    $ cat mail.txt | tr -s '<TAB>' ' ' > new-mail.txt
    – <TAB> represents a tab character; could write \011
    – -s squeezes each run of repeated string2 characters in the output down to one
    $ echo 'Can you read this?' | tr -d 'aeiou'
    Cn y rd ths?

"rot13" Text
    $ echo Gur chapuyvar bs gur wbxr vf... | tr '[N-Z][A-M][n-z][a-m]' '[A-M][N-Z][a-m][n-z]'
    The punchline of the joke is...
Popular in ’s.

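Because rot13 moves every letter 13 places along a 26-letter alphabet, applying it twice gives back the original text. A minimal sketch of that round trip, assuming a POSIX shell and tr; the function name rot13 and the variable names are illustrative and do not come from the slides:

```shell
# rot13 as a reusable filter: reads stdin, writes the rotated text to stdout.
rot13() {
    tr 'A-Za-z' 'N-ZA-Mn-za-m'
}

plain='The punchline of the joke is...'
encoded=$(printf '%s\n' "$plain" | rot13)     # encode once
decoded=$(printf '%s\n' "$encoded" | rot13)   # encode again to decode

echo "$encoded"
echo "$decoded"
```

The same filter therefore serves for both hiding and revealing the text.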
2. grep
Format: grep [options] pattern [file-list]
Searches one or more files, line by line, for a pattern (a regular expression). The action taken depends on the options.

Variants of grep
grep     Uses basic RE patterns.
fgrep    Fast grep. Pattern can only be an ordinary string.
egrep    Extended grep. Pattern can use full REs.

grep options
-c    print a count of matching lines
-i    ignore case in pattern during search
-l    list filenames with a match
-n    precede each matching line by its line number
-v    print lines that do not match pattern

Examples
File testa    File testb    File testc
aaabb         aaaaa         AAAAA
bbbcc         bbbbb         BBBBB
ff-ff         ccccc         CCCCC
cccdd         ddddd         DDDDD
dddaa

    $ grep bb testa
    aaabb
    bbbcc
    $ grep -v bb testa
    ff-ff
    cccdd
    dddaa
    $ grep -n bb testa
    1:aaabb
    2:bbbcc

    $ grep bb *
    testa:aaabb
    testa:bbbcc
    testb:bbbbb
    $ grep -i bb *
    testa:aaabb
    testa:bbbcc
    testb:bbbbb
    testc:BBBBB
    $ grep -i BB *
    testa:aaabb
    testa:bbbcc
    testb:bbbbb
    testc:BBBBB

Fancier Patterns
    $ grep 'fun..ion' file
    $ grep -n '^#define' file
    $ grep '^#de[a-z]*' file
    $ egrep 'while|if' *.c
    $ egrep '[0-9]+' *.c

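On current systems the behaviour of fgrep and egrep is also available as grep -F and grep -E. The sketch below contrasts the three pattern types on a small scratch file. It assumes mktemp -d is available (common, but not required by POSIX); the file name sample and its contents are made up for illustration:

```shell
# Work in a scratch directory so nothing outside it is touched.
dir=$(mktemp -d)
printf '%s\n' 'a.c' 'abc' 'if (x)' 'while (y)' > "$dir/sample"

# Basic RE: '.' matches any single character, so both a.c and abc match.
basic=$(grep -c 'a.c' "$dir/sample")

# Fixed string (fgrep behaviour): '.' is an ordinary character, so only a.c matches.
fixed=$(grep -F -c 'a.c' "$dir/sample")

# Extended RE (egrep behaviour): '|' separates alternatives.
extended=$(grep -E -c 'while|if' "$dir/sample")

echo "basic=$basic fixed=$fixed extended=$extended"
rm -r "$dir"
```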
3. awk
Format:
    awk program file-list
    awk -f program-file file-list
awk is a pattern-scanning and action-processing language.
The action language is very like C.

Overview
3.1. Patterns & Actions
3.2. awk Processing Cycle
3.3. How awk Sees a Line
3.4. Pattern Expressions
3.5. ',' Range Operator
3.6. Many Built-in Functions
3.7. BEGIN and END
3.8. First awk Program File: pr_header
3.9. Action Language
3.10. Associative Arrays

3.1. Patterns & Actions
An awk program consists of:
    pattern {action}
    pattern {action}
    :

3.2. awk Processing Cycle
1. Read the next input line.
2. Apply all awk patterns sequentially.
3. If a pattern matches, do its action.
4. Go to step (1).

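The cycle above means that every pattern is tried against every line, so one line can trigger more than one action. A small sketch, assuming a POSIX awk; the three input lines and the A:/B: labels are invented for illustration:

```shell
# Two pattern/action pairs; each input line is tested against both, in order.
out=$(printf '%s\n' 'ford ltd' 'chevy nova' 'ford bronco' |
    awk '/ford/ { print "A: " $0 }
         /o/    { print "B: " $0 }')
echo "$out"
```

Lines containing ford are printed twice (once per matching pattern); chevy nova matches only the second pattern.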
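Example
    $ cat cars
    plym    fury     77   73   2500
    chevy   nova     79   60   3000
    ford    mustang  65   45   10000
    volvo   gl       78   102  9850
    ford    ltd      83   15   10500
    chevy   nova     80   50   3500
    fiat    600      65   115  450
    honda   accord   81   30   6000
    ford    thundbd  84   10   17000
    toyota  tercel   82   180  750
    chevy   impala   65   85   1550
    ford    bronco   83   25   9500
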
    $ awk '/chevy/ {print}' cars
    chevy   nova     79   60   3000
    chevy   nova     80   50   3500
    chevy   impala   65   85   1550
    $ awk '/chevy/' cars
    chevy   nova     79   60   3000
    chevy   nova     80   50   3500
    chevy   impala   65   85   1550
    $ awk '/^h/' cars
    honda   accord   81   30   6000

3.3. How awk Sees a Line
awk views each line as a record consisting of fields separated by whitespace (blanks or tabs).
Each field is referred to by a variable of the form $n:
– $1, $2, $3, etc.
– $0 refers to the whole line (record)
The current line number is stored in NR.

    $ awk '{print $3, $1}' cars
    77 plym
    79 chevy
    65 ford
    :
    83 ford
    $ awk '/chevy/ {print $3, $1}' cars
    79 chevy
    80 chevy
    65 chevy

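Field splitting does not depend on how much whitespace separates the fields. A brief sketch, assuming a POSIX awk and printf; the two input lines are illustrative, the first using several blanks and a tab as separators:

```shell
# Line 1 separates fields with three blanks and a tab, line 2 with single blanks;
# awk sees three fields in both cases. NR is the record (line) number.
out=$(printf 'ford   mustang\t65\nfiat 600 65\n' | awk '{ print NR, $1, $3 }')
echo "$out"
```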
3.4. Pattern Expressions
Format: variable OP pattern
OP forms:
– matching: ~ !~
– arithmetic: < <= == != >= >
– boolean: && || !

    $ awk '$1 ~ /h/' cars
    chevy   nova     79   60   3000
    chevy   nova     80   50   3500
    honda   accord   81   30   6000
    chevy   impala   65   85   1550
    $ awk '$1 ~ /^h/' cars
    honda   accord   81   30   6000

    $ awk '$2 ~ /^[tm]/ {print $3, $2, "$" $5}' cars
    65 mustang $10000
    84 thundbd $17000
    82 tercel $750
    $ awk '$3 ~ /5$/ {print $3, $1, "$" $5}' cars
    65 ford $10000
    65 fiat $450
    65 chevy $1550

    $ awk '$3 == 65' cars
    ford    mustang  65   45   10000
    fiat    600      65   115  450
    chevy   impala   65   85   1550
    $ awk '$5 <= 3000' cars
    plym    fury     77   73   2500
    chevy   nova     79   60   3000
    fiat    600      65   115  450
    toyota  tercel   82   180  750
    chevy   impala   65   85   1550

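    $ awk '$5 >= "2000" && $5 < "9000"' cars
    plym    fury     77   73   2500
    chevy   nova     79   60   3000
    chevy   nova     80   50   3500
    fiat    600      65   115  450
    honda   accord   81   30   6000
    toyota  tercel   82   180  750
    – quoted values are compared as strings, character by character, so 450 and 750 also match
    $ awk '$5 >= 2000 && $5 < 9000' cars
    plym    fury     77   73   2500
    chevy   nova     79   60   3000
    chevy   nova     80   50   3500
    honda   accord   81   30   6000
    – unquoted values are compared as numbers
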
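The difference between the two comparisons can be isolated on a three-line input. This is a sketch under the assumption of a POSIX awk; the three prices are sample values chosen to show the effect:

```shell
# String comparison: "450" sorts after "2000" because '4' > '2',
# and "9500" is not below "9000" because '5' > '0' in the second position.
string_cmp=$(printf '450\n2500\n9500\n' | awk '$1 >= "2000" && $1 < "9000"')

# Numeric comparison: only 2500 lies in the range 2000..8999.
numeric_cmp=$(printf '450\n2500\n9500\n' | awk '$1 >= 2000 && $1 < 9000')

echo "string:  $string_cmp"
echo "numeric: $numeric_cmp"
```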
3.5. ',' Range Operator
Format: pattern1, pattern2
Selects a range of lines.
– the first line of the range matches pattern1
– the last line of the range matches pattern2
May return several groups of lines.

    $ awk '/volvo/, /fiat/' cars
    volvo   gl       78   102  9850
    ford    ltd      83   15   10500
    chevy   nova     80   50   3500
    fiat    600      65   115  450
    $ awk 'NR == 2, NR == 4' cars
    chevy   nova     79   60   3000
    ford    mustang  65   45   10000
    volvo   gl       78   102  9850

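    $ awk '/chevy/, /ford/' cars
    chevy   nova     79   60   3000
    ford    mustang  65   45   10000
    chevy   nova     80   50   3500
    fiat    600      65   115  450
    honda   accord   81   30   6000
    ford    thundbd  84   10   17000
    chevy   impala   65   85   1550
    ford    bronco   83   25   9500
– three groups: lines 1-2, lines 3-6, and lines 7-8 of the output
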
3.6. Many Built-in Functions
length(str)    length of string str
length         length of current line
split(string, array, delimiter)
               split string into parts at each delimiter, and place the parts in array
– split("a bcd ef g1", arr, " ")

    $ awk 'length > 23 {print NR}' cars
– prints the line numbers of the lines longer than 23 characters

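The split() call from the slide can be tried directly. A minimal sketch, assuming a POSIX awk; the names n, i, and out are illustrative:

```shell
# split() fills arr[1]..arr[n] and returns n, the number of pieces.
out=$(echo 'a bcd ef g1' |
    awk '{
        n = split($0, arr, " ")
        for (i = 1; i <= n; i++)
            print i, arr[i], length(arr[i])   # index, piece, length of piece
    }')
echo "$out"
```

Note that awk arrays filled by split() are indexed from 1, not 0.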
3.7. BEGIN and END
BEGIN {action}    executed before the first line is processed
END {action}      executed after the last line is processed
    $ awk 'END {print NR, "cars for sale."}' cars
    12 cars for sale.

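BEGIN, a pattern-less action, and END are often combined: one-time setup, per-line work, then a summary. A short sketch, assuming a POSIX awk; the input numbers and the variable total are illustrative:

```shell
out=$(printf '3\n4\n5\n' |
    awk 'BEGIN { print "start" }                      # runs once, before any input
               { total += $1 }                        # runs for every line
         END   { print NR " lines, total " total }')  # runs once, after the last line
echo "$out"
```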
3.8. First awk Program File
    $ cat pr_header
    #
    # pr_header
    #
    BEGIN {
        print "Make Model Year Miles Price"
        print " "
    }
    {print}

    $ awk -f pr_header cars
    Make Model Year Miles Price

    plym    fury     77   73   2500
    chevy   nova     79   60   3000
    :
    :
    chevy   impala   65   85   1550
    ford    bronco   83   25   9500

redirect_out
    $ cat redirect_out
    /chevy/ {print > "chev.txt"}
    /ford/ {print > "ford.txt"}
    END {print "done."}
    $ awk -f redirect_out cars
    done.
    $ cat chev.txt
    chevy   nova     79   60   3000
    chevy   nova     80   50   3500
    chevy   impala   65   85   1550

3.9. Action Language
Very C like:
– var = expr
– if (cond) stat1 else stat2
– while (cond) stat
– for (expr1; cond; expr2) stat
– printf "format", expr1, expr2, ...
– { stat1 ; stat2 ; ... ; statN }
User-defined variables do not need to be declared.

Long statements, conditions, and expressions may need to be typed over several lines.
Use '\' to hide the newline:
    if ($3 > 2000 && \
        $ && \
        $3 < 3000) print $3

price_range
    $ cat price_range
    {
    if ($5 <= 5000) $5 = "inexpensive"
    else if (5000 < $5 && $5 < 10000) $5 = "please ask"
    else if ($5 >= 10000) $5 = "expensive"
    printf "%-10s %-8s 19%2d %5d %-12s\n", \
        $1, $2, $3, $4, $5
    }

    $ awk -f price_range cars
    plym       fury     1977    73 inexpensive
    chevy      nova     1979    60 inexpensive
    :
    :
    ford       bronco   1983    25 please ask

summary
    $ cat summary
    BEGIN {
        yearsum = 0 ; costsum = 0
        newcostsum = 0 ; newcnt = 0
    }
    { yearsum += $3 ; costsum += $5 }
    $3 > 80 { newcostsum += $5 ; newcnt++ }
    END {
        printf "Avg. car age: %3.1f yrs\n", \
            90 - (yearsum/NR)
        printf "Avg. car cost: $%7.2f\n", \
            costsum/NR
        printf "Avg. newer car cost: $%7.2f\n", \
            newcostsum/newcnt
    }

    $ awk -f summary cars
    Avg. car age: 13.2 yrs
    Avg. car cost: $6216.67
    Avg. newer car cost: $8750.00

3.10. Associative Arrays
Arrays that use strings as indexes:
– array[string] = value
Special for-loop for awk arrays:
– for (elem in array) action

manuf
    $ cat manuf
    {manuf[$1]++}
    END {
        for (name in manuf) \
            print name, manuf[name]
    }

    $ awk -f manuf cars
    honda 1
    fiat 1
    volvo 1
    ford 4
    plym 1
    chevy 3
    toyota 1

Sorted Output
Sort by first column (i.e. by name):
    $ awk -f manuf cars | sort
Sort by second column (i.e. by number):
    $ awk -f manuf cars | sort -k 2
– older versions of sort wrote this as sort +1; use sort -k 2n to compare the counts as numbers

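The order in which for (elem in array) visits the elements is not defined, which is why the output is piped through sort. A sketch of counting and then ranking by count, assuming a POSIX awk and sort; the six input lines and the array name count are illustrative:

```shell
# Count the first field of each line, then sort by count (numeric, descending),
# breaking ties by name.
out=$(printf '%s\n' 'ford ltd' 'chevy nova' 'ford bronco' \
                    'fiat 600' 'ford mustang' 'chevy impala' |
    awk '{ count[$1]++ }
         END { for (make in count) print make, count[make] }' |
    sort -k2,2nr -k1,1)
echo "$out"
```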
4. sed
Format: sed 'list of ed commands' file
Reads lines one at a time from the input file
– applies the ed commands in order to each line
– writes the edited line to stdout
ed is an old UNIX editor
– vi without full-screen mode
– did you think vi was tough :)

4.1. Search and Replace
The 's' command searches for a pattern (a regular expression) and replaces it with the new string:
    's/pattern/new-string/g'
– 'g' means global (everywhere on the line)

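The effect of the g flag is easiest to see on a line that contains the pattern twice. A minimal sketch, assuming a POSIX sed; the input text and variable names are illustrative:

```shell
# Without g, only the first match on each line is replaced.
first_only=$(echo 'UNIX and UNIX' | sed 's/UNIX/UNIX(TM)/')

# With g, every match on the line is replaced.
every=$(echo 'UNIX and UNIX' | sed 's/UNIX/UNIX(TM)/g')

echo "$first_only"
echo "$every"
```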
Examples
    $ sed 's/UNIX/UNIX(TM)/g' file > new-file
    $ sed 's/^/<TAB>/' file > new-file
    – put a tab at the start of every line (no g needed); <TAB> represents a tab character
    $ sed 's/[ <TAB>][ <TAB>]*/\
    /g' file > new-file
    – replace every sequence of blanks or tabs with a newline (a backslash followed by an actual newline)
    – this splits the input into 1 word/line

    $ who
    ad   tty1  Sep 29 07:14
    ron  tty3  Sep 29 10:31
    td   tty4  Sep 29 08:36
    $ who | sed 's/ .* / /'
    ad 07:14
    ron 10:31
    td 08:36
    $
– replaces the first blank and everything that follows it (as much as possible, including more blanks) up to the last blank with a single blank

More Information
sed can use most ed commands, not just s.
See the entry on sed in Sobell, p

5. find
Format: find starting-directory matching-conditions-and-actions
find searches all the directories below the starting directory.
– it carries out the specified actions on the files that match the specified conditions

Basic Example
Assume we are in my home directory, and want to find the cars file (used in the awk examples):
    $ find . -name cars -print
    ./teach/adv-unix/filters/cars
    $
– . is the starting point
– -name cars is the condition
– -print is the action

5.1. Some Matching Conditions
-name nm      the filename is nm
-type ty      ty is a file type: f = file, d = directory, etc.
-user usr     the file's owner is usr
-group grp    the file's group owner is grp

-atime n      file was last accessed exactly n days ago
-mtime n      file was last modified exactly n days ago
-size n       file is exactly n 512-byte blocks long
Can use + or - in front of n to mean more or less than n.

5.2. Example Conditions
-mtime +7     last modified more than 7 days ago
-size +100    larger than 50K (100 blocks of 512 bytes)
"And"ing conditions:
    -atime +60 -mtime +120
– files last accessed more than 2 months ago and last modified more than 4 months ago

"Or"ing conditions:
    \( -mtime +7 -o -atime +30 \)
– files last modified more than 7 days ago or last accessed more than 30 days ago
"Not":
    -name \*.dat \! -name gold.dat
– all ".dat" files except gold.dat

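The "not" and "or" forms can be tried safely in a scratch directory. A sketch, assuming a POSIX shell and find plus mktemp -d (common, but not required by POSIX); the file names are made up for the demonstration:

```shell
# Build a throwaway directory with a few empty files.
dir=$(mktemp -d)
touch "$dir/a.dat" "$dir/gold.dat" "$dir/notes.txt" "$dir/core"

# "And" combined with "not": every .dat file except gold.dat.
not_gold=$(cd "$dir" && find . -name '*.dat' ! -name gold.dat -print)

# "Or": the alternatives are grouped with escaped parentheses.
either=$(cd "$dir" && find . \( -name core -o -name notes.txt \) -print | sort)

echo "$not_gold"
echo "$either"
rm -r "$dir"
```

Quoting the pattern as '*.dat' has the same effect as the \*.dat form on the slide: it stops the shell from expanding the wildcard before find sees it.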
5.3. Some Actions
-print       display pathname of matching file
-exec cmd    execute cmd on file
-ok cmd      prompt before executing cmd on file
Commands must end with \; and use {} to mean the matching file, e.g.:
    -ok rm {} \;

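With -exec, the command runs once per matching file, with {} replaced by that file's pathname. A small sketch, assuming a POSIX find and mktemp -d; the files a.txt and b.txt are created only for the demonstration:

```shell
dir=$(mktemp -d)
printf 'one\ntwo\n' > "$dir/a.txt"
printf 'three\n'    > "$dir/b.txt"

# cat is run once for each matching file; the combined output is counted by wc.
lines=$(find "$dir" -name '*.txt' -exec cat {} \; | wc -l)

echo "total lines: $lines"
rm -r "$dir"
```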
5.4. Examples
    $ find . -name \*.c -print
– Starting from the current directory, display the pathnames of all the files ending in ".c"
    $ find . \( -name core -o -name junk \) -print -ok rm {} \;
– Print the pathnames of all the core and junk files in the current directory and below, and prompt to remove them.

    $ find /usr -size +1000 -mtime +30 -exec ls -l {} \;
– Display a long listing of all the files under /usr larger than about 500K (1000 blocks of 512 bytes) that have not been modified in a month.

5.5. Problems with Permissions
A find over the entire filesystem will print many error messages when access is denied to other users' directories.
These error messages (sent to stderr) can be redirected to /dev/null (a UNIX "black hole").

Example
Search for a file/directory called zip anywhere below the root directory:
    $ find / -name zip -print
    find: /exports/tmp/code/ : Permission denied
    find: /exports/tmp/code/ : Permission denied
    find: /exports/home/suthon/private: Permission denied
    find: /exports/home/cj/mail: Permission denied
    :
    :

Redirect standard error to the black hole using 2>
    $ find / -name zip -print 2> /dev/null
    /exports/home/s /project/zip
    /exports/home/s /project/zip
    /exports/home/s /zip
    /exports/home/s /zip/zip
    $