awk
awk? Named after its authors: Alfred Aho, Peter Weinberger, Brian Kernighan (Bell Labs, 1977)
You think awk is bad? A winner of the International Obfuscated C Code Contest (one-liner):
    main(int c,char**v){return!m(v[1],v[2]);}m(char*s,char*t) {return*t-42?*s?63==*t|*s==*t&&m(s+1,t+1):!*t:m(s,t+1)||*s&&m(s+1,t);}
Awk variants
    awk  - original from AT&T - 1977
    nawk - A newer & improved (AT&T) - 1993
    gawk - The Free Software Foundation - 1985-88
Why awk?
    Excellent filter and report writer.
    Processing rows and columns is easier in awk than in most conventional programming languages.
    Can be considered a pseudo-C interpreter: it understands the same arithmetic operators as C.
    Has string manipulation functions, so it can search for particular strings and modify the output.
    Has associative arrays, which are incredibly useful.
Pattern Action Pairs
    condition { action }
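A minimal sketch of pattern-action pairs, assuming made-up colon-separated input (the names and years are hypothetical, not taken from any course file). It shows one rule with both a condition and an action, and one rule with only a condition, where awk falls back to printing the whole line:

```shell
# Hypothetical sample input in the form name:year
data='alice:1999
bob:2005
carol:2012'

out=$(printf '%s\n' "$data" | awk -F: '
    # condition and action: print the name when the year is after 2000
    $2 > 2000 { print $1 }
    # condition only: the default action prints the whole line
    /^alice/
')
printf '%s\n' "$out"
```

Every rule is tested against every input line, in the order the rules are written.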
awk Syntax
    { [ statement ] ... }
    variable = expression
    print [ expression-list ] [ > expression ]
    printf format [ , expression-list ] [ > expression ]
    next
    exit
awk Syntax - more
    if ( conditional ) statement [ else statement ]
    while ( conditional ) statement
    for ( expression ; conditional ; expression ) statement
    for ( variable in array ) statement
    break
    continue
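The control-flow statements listed above can be sketched in a single BEGIN block, which runs without reading any input. The variable names (kind, n) are illustrative only:

```shell
out=$(awk 'BEGIN {
    # for loop with if/else: label 1..4 as odd or even
    for (i = 1; i <= 4; i++) {
        if (i % 2 == 0) {
            kind = "even"
        } else {
            kind = "odd"
        }
        printf "%d is %s\n", i, kind
    }
    # while loop with break: find the first power of 2 above 10
    n = 1
    while (1) {
        if (n > 10) {
            break
        }
        n *= 2
    }
    print "first power of 2 above 10:", n
}')
printf '%s\n' "$out"
```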
BEGIN ... END
    BEGIN     { do something before the main body }
    condition { action: the main body }
    END       { do this after the main body }
e.g., create a file called fields:
    #!/bin/awk -f
    BEGIN { FS = ":" }
    {
        print "Name:\t ", $1
        print "Year:\t ", $2, "\tMovie: ", $3
    }
    END {
        print "Number of records:\t ", NR
        print "Number of fields:\t ", NF
    }
    $ chmod 755 fields
    $ fields moviedb
    (or ./fields moviedb if the current directory is not in your $PATH)
Run as awk or bash script?
    #!/bin/sh
    awk '
    BEGIN { print "Using bash -f" }
    { print $8, "\t", $3 }
    END { print " -- completed --" }
    '
Or
    #!/bin/awk -f
    BEGIN { print "Using awk -f" }
Arithmetic Operators
    Operator   Type         Meaning
    +          Arithmetic   Addition
    -          Arithmetic   Subtraction
    *          Arithmetic   Multiplication
    /          Arithmetic   Division
    %          Arithmetic   Modulo
    <space>    String       Concatenation
Arithmetic Operators Examples
    Expression   Result
    8+5          13
    8-5          3
    8*5          40
    8/5          1.6
    8%5          3
    8 5          85
What's the output of: x = 2+1*3 8
Same as: (2+(1*3)) "8"
Result: "58"
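The table entries and the precedence question can be checked directly in a BEGIN block. This is a small sketch; the point it illustrates is that concatenation binds more loosely than the arithmetic operators, so the arithmetic is evaluated first:

```shell
out=$(awk 'BEGIN {
    print 8 / 5          # division is floating point
    print 8 % 5          # remainder
    print 8 5            # two values side by side are concatenated
    x = 2 + 1 * 3 8      # parsed as (2 + (1 * 3)) concatenated with 8
    print x
}')
printf '%s\n' "$out"
```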
    #0 Put this code in a file called avg
    BEGIN { FS = "\t" }
    #1 Expect 1st record = number of students
    NR == 1 {
        print "Number of students: ", $1
        total = 0
        next
    }
    #2 Print each remaining record and add its score to total
    {
        print $1, "\t", $2
        total += $2
    }
    #3 NR also counts the 1st record, so divide by NR-1
    END { print "Average = ", total/(NR-1) }
    $ cp ~tan/public/scores .
    $ awk -f avg scores
    #!/bin/awk -f
    # File: match
    # Counts number of lines matching regex
    BEGIN { total=0 }
    /^\..*:$/ {
        # line begins with a ".", followed by any number of chars
        # and ends in a colon
        print "Found: ", $0    # $0 means whole line
        total += 1
    }
    END {
        print "\n---------------------------------"
        print "#Matches = ", total
    }
    $ chmod 755 match; cp ~tan/public/test .
    $ match test
Comparing Regex
    Operator   Meaning
    ~          Matches
    !~         Doesn't match
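A short sketch of the two operators applied to a single field rather than to the whole line. The sample lines are made up for illustration:

```shell
# Hypothetical sample input: the 1st field is what gets tested
data='.1 first
.22 second
abc third'

out=$(printf '%s\n' "$data" | awk '
    # 1st field starts with a literal dot followed by digits
    $1 ~ /^\.[0-9]+/  { print "yes", $1 }
    # 1st field does not match that regex
    $1 !~ /^\.[0-9]+/ { print "no", $1 }
')
printf '%s\n' "$out"
```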
    #!/bin/awk -f
    # File: matchregex
    # Counts number of lines where 1st field matches regex
    BEGIN { total=0 }
    $1 ~ /^\.[0-9]+.*/ {
        # 1st field begins with a ".", followed by one or more digits,
        # then any number of chars
        print "Found:\t", $0
        total += 1
    }
    END { print "#Matches = ", total }
    $ chmod 755 matchregex
    $ matchregex test
Really Weird Syntax !!! Embedding awk in bash
    #!/bin/sh
    # Bash's arguments vs. awk's arguments:
    # inside the single quotes, $1 is awk's 1st field;
    # outside them, $1 is the shell script's 1st argument.
    # Find an acronym. File: lookup, DBfile: acronyms (copy from ~tanjs/public)
    awk '$1 == find' find=$1 acronyms
    # or
    awk '$1 ~ find' find=$1 acronyms
Parameters passed to awk are specified after the awk script!!
    $ chmod 755 lookup
    $ lookup GOT acronyms
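A sketch of the two usual ways to hand a shell value to awk, using a throwaway database file with made-up entries. The function name lookup mirrors the slide; the -v form is the standard alternative to an assignment placed after the script. One difference worth seeing: an assignment operand is only processed when awk reaches it in the argument list, so it is still unset inside BEGIN:

```shell
lookup() {
    # Inside the single quotes, $1 is the 1st field of each line as awk sees it.
    # Outside them, "$1" and "$2" are this shell function's arguments.
    awk '$1 == find' find="$1" "$2"
}

# Hypothetical sample database: abbreviation, then its expansion
db=$(mktemp)
printf 'CPU Central Processing Unit\nRAM Random Access Memory\n' > "$db"

# Form used on the slide: assignment given after the awk script
a=$(lookup RAM "$db")

# Same lookup with -v, which sets the variable before the program starts
b=$(awk -v find=RAM '$1 == find' "$db")

# An assignment operand has not been processed yet when BEGIN runs
c=$(awk 'BEGIN { print "[" find "]" }' find=RAM "$db")

rm -f "$db"
printf '%s\n%s\n%s\n' "$a" "$b" "$c"
```

Quoting "$1" in the shell keeps an argument containing spaces intact.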
For loop
    #!/bin/awk -f
    BEGIN {
        sum = 0
        for (i = 1; i <= 10; i++) {
            sum += i    # add i to the running total
            printf "The sum of integers up to %d is %d\n", i, sum
        }
        # now end
        exit
    }
Associative Arrays
    #!/bin/awk -f
    # filename: assArray
    BEGIN { FS = "\t" }
    { acro[$1] = $2 }
    END {
        for ( abbrev in acro )
            print abbrev, acro[abbrev]
    }
    $ assArray acronyms
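Another common use of associative arrays is counting, sketched here with made-up input words. Because awk does not guarantee the order in which for ( key in array ) visits the keys, the output is piped through sort to make it predictable:

```shell
out=$(printf 'red\nblue\nred\ngreen\nred\nblue\n' | awk '
    # use the word itself as the array index and count occurrences
    { count[$1]++ }
    END {
        for (word in count)
            print word, count[word]
    }
' | sort)
printf '%s\n' "$out"
```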
bash & awk combo
    #!/bin/sh
    # arraylookup: look for an abbreviation in a file using an associative array
    # Syntax: arraylookup <abbrev> <file>
    assArray "$2" | grep "$1"
    $ arraylookup GOT acronyms