Lecture 7.2 awk

History of AWK
The name AWK
– Initials of its designers: Alfred V. Aho, Peter J. Weinberger, and Brian W. Kernighan
– First appeared in 1977; stable release in 1985
– In BSD and OS X: bawk or nawk. In GNU/Linux: gawk
The basic function of AWK
– Search files for lines that contain certain patterns

Basic command
pattern {action}
– the pattern can be a regular expression
– the pattern is optional
awk '/foo/{print $0}' file
– use single quotes so that the shell does not interpret the awk program
– prints every whole line that matches the pattern; $0 is the whole line
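A minimal sketch of the three forms a rule can take (pattern only, action only, pattern plus action). The three input lines are made up for illustration and are fed on stdin instead of from a file.

```shell
# Hypothetical three-line input (not one of the lecture files).
input='foo one
bar two
foo three'

# Pattern only: the default action is to print the matching line.
pattern_only=$(printf '%s\n' "$input" | awk '/foo/')

# Action only: with no pattern, the action runs on every line.
action_only=$(printf '%s\n' "$input" | awk '{print $2}')

# Pattern and action together.
both=$(printf '%s\n' "$input" | awk '/foo/{print $2}')

printf '%s\n' "$pattern_only" "$action_only" "$both"
```

The first command prints the two foo lines in full, the second prints the second word of all three lines, and the third prints the second word of the foo lines only.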

The Basics of AWK
A line is called a record
Text separated by a delimiter is called a field
$0, $1, ... etc.
– $0 : the whole line
– $1 : the first field in a line
NR : number of records read so far
– also the line number
NF : number of fields in the current line

Demos
Print the line number, the employee's first name, and the number of fields
% cat employees
Tom Jones /12/
Mary Adams /4/
Sally Chang /22/
Billy Black /23/
% awk '{print NR, $1, NF}' employees
1 Tom 5
2 Mary 5
3 Sally 5
4 Billy 5
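The same command can be tried on input where the number of fields changes from line to line, which makes NF easier to see. This is only a sketch: the three records are invented, and $NF (the field whose number is NF, i.e. the last field) is an addition not used in the demo above.

```shell
# Hypothetical input with a different number of fields on each line.
result=$(printf '%s\n' 'Tom Jones 4424' 'Mary Adams' 'Sally' |
  awk '{print NR, $1, NF, $NF}')

printf '%s\n' "$result"
# 1 Tom 3 4424
# 2 Mary 2 Adams
# 3 Sally 1 Sally
```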

The Basics of AWK cont.
FS : field separator
– default is spaces and/or tabs (leading blanks are stripped)
– change with -F (e.g. -F:, -F'[ :\t]')
OFS : output field separator
– default is a single space
– change with '{OFS=DELIMITER}'
% awk '{OFS="-----"} {print $1, $2}' employees
Tom-----Jones
Mary-----Adams
Sally-----Chang
Billy-----Black
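A short sketch of how FS and OFS interact, assuming a made-up colon-separated record. It also shows a detail that is easy to miss: OFS is only inserted where print has commas between its arguments.

```shell
line='Tom Jones:4424:5/12/66'   # hypothetical record

# -F sets the input separator; BEGIN sets OFS before any line is read.
with_commas=$(printf '%s\n' "$line" |
  awk -F: 'BEGIN{OFS="-----"} {print $1, $2}')

# Without commas the fields are concatenated and OFS is not used.
no_commas=$(printf '%s\n' "$line" |
  awk -F: 'BEGIN{OFS="-----"} {print $1 $2}')

# A bracket expression splits on either ":" or a space.
first_name=$(printf '%s\n' "$line" | awk -F'[: ]' '{print $1}')

printf '%s\n' "$with_commas" "$no_commas" "$first_name"
# Tom Jones-----4424
# Tom Jones4424
# Tom
```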

Demos
Use ':' as the delimiter instead of space
awk -F: '{print $1}' newlist.txt
Use both ':' and ' ' as delimiters
% cat newlist.txt
Tom Jones:4424:5/12/66:
Mary Adams:5346:11/4/63:28765
Sally Chang:1654:7/22/54:
Billy Black:1683:9/23/44:
% awk -F: '{print $1}' newlist.txt
Tom Jones
Mary Adams
Sally Chang
Billy Black
% awk -F'[: ]' '{print $1}' newlist.txt
Tom
Mary
Sally
Billy

Awk patterns and actions
awk '$3<4000' employees
– print lines where $3 is less than 4000
awk '/Tom/{print "Hello, " $1}' employees
– find the lines that contain Tom, then print "Hello, Tom"
awk '$1 !~ /ly$/{print $1}' employees
– print the first names that do not end with ly
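The three kinds of pattern above (numeric comparison, regular expression on the whole line, regular expression on one field) can be tried on a small input. The "name salary" records below are invented for this sketch, so the salary is in $2 here rather than $3.

```shell
# Hypothetical "name salary" records.
input='Tom 3500
Sally 4200
Billy 3900
Mary 5100'

# Comparison pattern, no action: print whole lines with salary below 4000.
low=$(printf '%s\n' "$input" | awk '$2 < 4000')

# Regex pattern tested against the whole line, with an action.
hello=$(printf '%s\n' "$input" | awk '/Tom/{print "Hello, " $1}')

# Regex tested against one field: names that do not end in "ly".
not_ly=$(printf '%s\n' "$input" | awk '$1 !~ /ly$/{print $1}')

printf '%s\n' "$low" "$hello" "$not_ly"
```

Here the first command keeps the Tom and Billy lines, the second prints "Hello, Tom", and the third prints Tom and Mary.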

Awk in script
# file: awk_first
/Tom/{print "Tom's birthday is " $3}
/Mary/{print NR, $0}    # print the line number and the whole line
/^Sally/{print "Hi, Sally. " $1 " has salary of $" $4 "."}
% awk -F: -f awk_first newlist.txt
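A self-contained sketch of the -f workflow. The script is written to a temporary file created by mktemp (standing in for awk_first), and the two input records are hypothetical, modeled on newlist.txt.

```shell
# Write a small awk program to a temporary file.
script=$(mktemp)
cat > "$script" <<'EOF'
# every rule is tried, in order, against every input line
/Tom/  { print "Tom's birthday is " $3 }
/Mary/ { print NR, $0 }
EOF

# Hypothetical colon-separated input on stdin.
out=$(printf '%s\n' 'Tom Jones:4424:5/12/66' 'Mary Adams:5346:11/4/63' |
  awk -F: -f "$script")
rm -f "$script"

printf '%s\n' "$out"
# Tom's birthday is 5/12/66
# 2 Mary Adams:5346:11/4/63
```

Keeping the program in a file avoids shell quoting problems, such as the apostrophe in "Tom's", which would end a single-quoted program on the command line.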

AWK comparison expression
Operator  Meaning                          Example
<         Less than                        $1 < 100
<=        Less than or equal to            $1 <= 100
==        Equal to                         $1 == 100
!=        Not equal to                     $1 != 100
>=        Greater than or equal to         $1 >= 100
>         Greater than                     $1 > 100
~         Match by regular expression      $1 ~ /ly$/
!~        Not match by regular expression  $1 !~ /^T/
Conditional expression
– condition ? exp1 : exp2
Logical operations
– &&, ||, !
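The operators can be combined freely in a pattern. A small sketch, assuming invented "name score" records:

```shell
# Hypothetical "name score" records.
input='Sally 90
Tom 120
Billy 100
Ted 40'

# && : numeric comparison and regex match must both hold.
both=$(printf '%s\n' "$input" | awk '$2 >= 90 && $1 ~ /ly$/ {print $1}')

# || : either condition is enough.
either=$(printf '%s\n' "$input" | awk '$2 > 100 || $1 ~ /^T/ {print $1}')

# ! : negate a condition.
negated=$(printf '%s\n' "$input" | awk '!($2 >= 90) {print $1}')

# The conditional expression picks one of two values.
labels=$(printf '%s\n' "$input" |
  awk '{print $1, ($2 > 100 ? "high" : "ok")}')

printf '%s\n' "$both" "$either" "$negated" "$labels"
```

The parentheses around the conditional expression matter inside print: a bare > after print would be read as output redirection.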

Conditional expression example
% awk '{max=($1>$2) ? $1 : $2; print max}' filename
% cat needmax.txt
% awk '{max=($1>$2) ? $1 : $2; print max}' needmax.txt
The same with if/else:
% awk '{max=0; if ($1 > $2) max = $1; else max = $2; print max}' needmax.txt
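A runnable sketch of both forms, using made-up two-column input in place of needmax.txt. Note that awk variables are written without a dollar sign: max is a variable, while $max would mean "the field whose number is stored in max".

```shell
# Hypothetical two-column input standing in for needmax.txt.
input='3 9
12 5
7 7'

# Conditional expression.
ternary=$(printf '%s\n' "$input" |
  awk '{max = ($1 > $2) ? $1 : $2; print max}')

# Equivalent if/else statement.
if_else=$(printf '%s\n' "$input" |
  awk '{if ($1 > $2) max = $1; else max = $2; print max}')

printf '%s\n' "$ternary"
# 9
# 12
# 7
```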

datafile
northwest NW Joel Craig
western WE Sharon Kelly
southwest SW Chris Foster
southern SO May Chin
southeast SE Derek Johnson
eastern EA Susan Beal
northeast NE TJ Nichols
north NO Val Shultz
central CT Sheri Watson
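
More comparison examples
awk '$7==5{print $7+5}' datafile
– for lines where the 7th field is 5, print the 7th field plus 5
awk '$2=="CT"{print $1, $2}' datafile
– print the 1st and 2nd fields of lines whose 2nd field is CT
awk '!($2=="NW") || $1 ~ /south/{print $0}' datafile
– print lines whose 2nd field is not NW or whose 1st field contains south
awk '$8 > 10 && $8 < 17' datafile
– print lines whose 8th field is greater than 10 and less than 17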

Math operators
awk '/southern/{print $ }' datafile
awk '/southern/{print $8 - 10}' datafile
awk '/southern/{print $8 / 2}' datafile
awk '/southern/{print $8 * 2}' datafile
awk '/northeast/ {print $8 % 3}' datafile
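To see the arithmetic operators produce actual numbers, here is a sketch on a single invented record; the number sits in $2 in this example, not in $8 as in datafile.

```shell
# Hypothetical record; field 2 is the number used below.
calc=$(printf '%s\n' 'southern 17' |
  awk '/southern/{print $2 + 10, $2 - 10, $2 * 2, $2 / 2, $2 % 3}')

printf '%s\n' "$calc"
# 27 7 34 8.5 2
```

Division is floating point, so 17 / 2 gives 8.5 rather than 8.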

Assignment Operator
Assignment operators: =, +=, -=, *=, /=, %=, ^=
Increment and decrement: ++, --
awk '$3 == "Chris"{$3 = "Christian"; print}' datafile
– if a line's 3rd field is "Chris", change it to Christian and print the line
awk '/Derek/{$8+=12; print $8}' datafile
– add 12 to the 8th field of lines containing Derek and print the new value
awk '{$7^=2; print $7}' datafile
– square the 7th field and print it
awk '{x=1; y=x++; print x, y}' datafile
– post-increment: y gets the old value of x, so this prints "2 1" for every line
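The same operators on a one-line input, as a sketch. The record is hypothetical and has only three fields, so the field numbers differ from the datafile commands above.

```shell
# Hypothetical record: region, first name, count.
line='southwest Chris 3'

# Assigning to a field rebuilds $0, so a bare print shows the change.
renamed=$(printf '%s\n' "$line" |
  awk '$2 == "Chris"{$2 = "Christian"; print}')

# Compound assignment on a field.
added=$(printf '%s\n' "$line" | awk '{$3 += 12; print $3}')
squared=$(printf '%s\n' "$line" | awk '{$3 ^= 2; print $3}')

# Post-increment: y receives the old value of x.
incr=$(printf '%s\n' "$line" | awk '{x = 1; y = x++; print x, y}')

printf '%s\n' "$renamed" "$added" "$squared" "$incr"
# southwest Christian 3
# 15
# 9
# 2 1
```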

BEGIN Patterns
A BEGIN pattern is followed by an action block that is executed before AWK processes any lines from the input file.
– can run an awk command without an input file
% awk 'BEGIN{print "Hello"}'
% awk 'BEGIN{FS=":"; OFS="—"; ORS="\n\n"}{print $1, $2, $3}' newlist.txt
Tom Jones /12/66
Mary Adams /4/63
Sally Chang /22/54
Billy Black /23/44
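A sketch of both uses of BEGIN: running with no input at all, and setting the separators before the first record is read. The two records are hypothetical, and "|" as the output separator is an arbitrary choice for this example.

```shell
# BEGIN alone needs no input file.
hello=$(awk 'BEGIN{print "Hello"}')

# Separators set in BEGIN are in effect from the first record on.
# ORS="\n\n" ends every output record with a blank line.
out=$(printf '%s\n' 'Tom Jones:4424:5/12/66' 'Mary Adams:5346:11/4/63' |
  awk 'BEGIN{FS=":"; OFS="|"; ORS="\n\n"} {print $1, $2, $3}')

printf '%s\n' "$hello" "$out"
# Hello
# Tom Jones|4424|5/12/66
#
# Mary Adams|5346|11/4/63
```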

END Patterns
An END pattern runs its action after all input has been processed
% awk 'END{print "The number of records is " NR}' employees
The number of records is 4
% awk '/Mary/{count++} END{print "Mary was found " count " times."}' employees
Mary was found 1 times.
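A sketch of counting in the main rule and reporting in END, on invented input. The last command shows one assumption-free detail worth knowing: a variable that was never incremented is empty, so adding 0 forces it to print as a number.

```shell
# Hypothetical list of names.
input='Tom Jones
Mary Adams
Sally Chang
Mary Black'

# In END, NR still holds the number of the last record read.
total=$(printf '%s\n' "$input" |
  awk 'END{print "The number of records is " NR}')

# Count matches while reading, report once at the end.
marys=$(printf '%s\n' "$input" |
  awk '/Mary/{count++} END{print "Mary was found " count " times."}')

# No match: count was never set, so (count + 0) turns "" into 0.
none=$(printf '%s\n' "$input" |
  awk '/Zed/{count++} END{print "Zed was found " (count + 0) " times."}')

printf '%s\n' "$total" "$marys" "$none"
# The number of records is 4
# Mary was found 2 times.
# Zed was found 0 times.
```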

Redirections and Pipes
> : save to file
>> : append to file
awk '$7 >= 5 {print $1, $2, $7 > "out.txt"}' datafile
– instead of printing to the screen, save the output to the file out.txt
% cat out.txt
western WE 5
eastern EA 5
north NO 5
central CT 5
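A sketch of > and >> inside awk, assuming a temporary file from mktemp in place of out.txt and invented "region value" records. The file name is passed in with -v so that it does not have to be hard-coded in the program.

```shell
outfile=$(mktemp)   # temporary path standing in for out.txt

# ">" truncates the file when awk first opens it; later prints in the
# same run are added to it, so both matching lines end up in the file.
printf '%s\n' 'western 5' 'southern 3' 'eastern 7' |
  awk -v out="$outfile" '$2 >= 5 {print $1, $2 > out}'
saved=$(cat "$outfile")

# ">>" keeps what is already in the file and appends to it.
printf '%s\n' 'north 9' |
  awk -v out="$outfile" '{print $1, $2 >> out}'
appended=$(cat "$outfile")
rm -f "$outfile"

printf '%s\n' "$appended"
# western 5
# eastern 7
# north 9
```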

Pipe
% awk '/ly/{print $1, $2}' employees
Sally Chang
Billy Black
% awk '/ly/{print $1, $2 | "sort"}' employees
Billy Black
Sally Chang
% awk '/ly/{print $1, $2 | "sort | head -n1"}' employees
Billy Black
# you can chain as many commands as you want
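A sketch of piping print output into a shell command, on an invented three-name input. All prints that use the same command string share one pipe, so sort sees every matching line before it writes anything.

```shell
# Hypothetical list of names.
input='Sally Chang
Tom Jones
Billy Black'

# Every matching line goes into the same "sort" process.
sorted=$(printf '%s\n' "$input" | awk '/ly/{print $1, $2 | "sort"}')

# The command string is handed to the shell, so it can be a pipeline.
first=$(printf '%s\n' "$input" |
  awk '/ly/{print $1, $2 | "sort | head -n1"}')

printf '%s\n' "$sorted" "$first"
# Billy Black
# Sally Chang
# Billy Black
```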

There is much more to explore
– conditional statements
– loops
– arrays
– user-defined functions
Suggested reading: Chapter 6 of the book