CS 403: Programming Languages Fall 2004 Department of Computer Science University of Alabama Joel Jones
Other Filters (cont.)
  tr inputChars outputChar(s)
    tr a-z A-Z maps lower case to upper case
    Flags:
      -s squeezes multiple occurrences of a character in the input to a single character in the output
      -c takes the complement of the first argument, e.g. tr -c ab matches every character except a and b
    tr also understands character ranges
  uniq removes duplicate adjacent lines
    Flags:
      -c adds the count of duplicate lines at the beginning of each line
  '\012' is a newline (octal escape)
  Pair Up: Write a pipeline that prints the 10 most frequent words in its input.
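A small sketch of the tr and uniq behaviour described on this slide. The function names (upper, squeeze_spaces, words) and the sample strings are made up for illustration; only tr, sort and uniq themselves come from the slide.

```shell
# Map lower case to upper case.
upper() { tr a-z A-Z; }

# Squeeze each run of spaces down to a single space.
squeeze_spaces() { tr -s ' '; }

# Turn every run of non-letters into one newline, giving one word per line.
words() { tr -sc A-Za-z '\012'; }

printf 'hello   world\n' | upper            # HELLO   WORLD
printf 'hello   world\n' | squeeze_spaces   # hello world

# uniq only merges adjacent duplicates, so sort first; -c prefixes the count.
printf 'a b, a\n' | words | sort | uniq -c  # count 2 for a, count 1 for b
```

The exact column padding that uniq -c puts in front of each count differs between implementations, so scripts should not depend on it.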
Printing 10 most common words
    cat $* |                  # tr doesn't take filename arguments
    tr -sc A-Za-z '\012' |    # all non-alpha become newline
    sort |
    uniq -c |                 # get the count
    sort -n |                 # sort by count
    tail                      # prints 10 by default
  Use the man command for help, e.g. man sort
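One possible variation on the pipeline, sketched as a shell function that reads standard input. It is not the slide's solution: it folds case first and lists the most frequent word first by using sort -rn with head instead of sort -n with tail. The name top_words and the sample sentence are hypothetical.

```shell
# top_words: hypothetical helper, prints up to 10 most frequent words on stdin,
# most frequent first, counting "The" and "the" as the same word.
top_words() {
    tr A-Z a-z |            # fold upper case to lower case
    tr -sc a-z '\012' |     # every run of non-letters becomes one newline
    sort |                  # bring identical words together
    uniq -c |               # prefix each distinct word with its count
    sort -rn |              # largest count first
    head                    # first 10 lines by default
}

printf 'The cat and the dog and THE bird\n' | top_words
```

Here the first output line carries the count 3 for "the"; the order of words with equal counts depends on the sort implementation.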
sed
  sed [options] 'list of commands' filenames …
  Commands
    s/re1/re2/     replaces the first match of regular expression re1 with re2 on every line
    s/re1/re2/g    replaces every match of re1 with re2 on every line
    #command       runs command at line number #; e.g. sed 3q prints the first 3 lines, then quits
    /re1/q         prints lines up to and including the first one matching re1, then quits
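The four commands can be tried on a few lines of made-up input. The sample function below is a hypothetical stand-in for a file.

```shell
# Four lines of sample input (illustrative only).
sample() { printf 'one one\ntwo one\nthree\nfour\n'; }

sample | sed 's/one/1/'    # first match per line:  "1 one", "two 1", three, four
sample | sed 's/one/1/g'   # every match per line:  "1 1", "two 1", three, four
sample | sed 3q            # quit at line 3: prints the first three lines
sample | sed '/two/q'      # quit at first match: prints "one one", "two one"
```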
sed (cont.)
  Pair Up: Write a sed command line that indents a line by adding four spaces to the beginning of the line
    sed 's/^/    /' file
  More commands
    /re1/s/re2/re3/    replaces the first match of re2 with re3 on every line matching re1
  Pair Up: What does the indenting command above do with empty lines? Write a sed command line that fixes this problem.
    sed '/./s/^/    /' file      # or
    sed '/^$/!s/^/    /' file
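To see the empty-line problem, it helps to make trailing spaces visible. The sketch below uses sed's l command, which marks the end of each line with $. The function names and sample input are hypothetical.

```shell
# Three lines, the middle one empty (illustrative input).
sample() { printf 'first\n\nsecond\n'; }

# Indents every line, so the empty line turns into four spaces.
indent_all() { sed 's/^/    /'; }

# Indents only lines that contain at least one character.
indent_nonempty() { sed '/./s/^/    /'; }

sample | indent_all | sed -n l        # middle line shows as "    $"
sample | indent_nonempty | sed -n l   # middle line shows as "$"
```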
sed (cont.)
  Commands (cont.)
    /re/d    deletes lines matching re
    /re/p    prints lines matching re
  Options
    -n             turns off automatic printing
    -f filename    takes sed commands from filename
  Pair Up: What does the sed command sed '/the/p' < file do?
  Pair Up: Write a sed command line that does the same thing as: grep re file
    sed -n '/re/p' file
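A quick experiment shows how p interacts with automatic printing. The sample input is made up; the comparison with grep assumes a plain pattern with no special characters.

```shell
# Three lines, two of which contain "the" (illustrative input).
sample() { printf 'the cat\na dog\nthe end\n'; }

sample | sed '/the/p'      # matching lines come out twice, others once: 5 lines
sample | sed -n '/the/p'   # only matching lines, same as: grep the
sample | sed '/the/d'      # only non-matching lines, same as: grep -v the
```

Without -n, every line is printed once automatically, and p prints matching lines a second time.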
awk
  awk [options] 'program' filenames …
  Like sed, but program is different:
    pattern { action }
    pattern { action }
    …
  awk reads the input in filenames one line at a time and, when a pattern matches, executes the corresponding action
  Patterns
    Regular expressions
    C-like expressions
awk (cont.)
  Pattern or action is optional
    Pattern missing: perform action on every line
    Action missing: print every line matching pattern
  Simple action
    print: without an argument, prints the current line
  Pair Up: Write an awk command line that does the same thing as: grep re file
    awk '/re/' file
  Pair Up: Write an awk command line that does the same thing as: cat filenames …
    awk '{ print }' filenames …
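The three program shapes (pattern only, action only, pattern plus action) can be compared on the same made-up input. The sample function is hypothetical, and length($0) > 4 is just one example of a C-like expression used as a pattern.

```shell
# Three short lines of sample input (illustrative only).
sample() { printf 'alpha\nbeta\ngamma\n'; }

sample | awk '/mm/'                       # pattern only: prints gamma
sample | awk '{ print }'                  # action only: prints all three lines
sample | awk 'length($0) > 4 { print }'   # both: prints alpha and gamma
```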
awk (cont.)
  Variables
    $0: entire line; $1 through $NF: fields of the line
      E.g. awk '{ print $2 }' textFile prints the 2nd field of every line of textFile
      E.g. who | awk '{ print $5, $1 }' | sort prints login time and user name, sorted by time (assuming the time is the 5th field of who's output)
    NF is the number of fields on the current line
    NR is the number of records (lines) read so far
  Options
    -Fchar: sets char as the field separator
  Pair Up: Write an awk command line that prints user names out of /etc/passwd, where the user name is the first field and fields are colon separated.
awk (cont.)
    awk -F: '{ print $1 }' /etc/passwd
  N.B. many Unix systems no longer keep every user account in /etc/passwd
  Field breaking
    Default is to split on spaces and tabs: a run of contiguous white space counts as a single separator, and leading separators are discarded
    Setting a separator with -F causes leading separators to be counted, so a leading separator yields an empty first field
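The difference between default splitting and an explicit separator shows up in NF. The helper names and input strings below are hypothetical, chosen only to make the field counts easy to check by hand.

```shell
# Number of fields per line with default (white space) splitting.
count_fields() { awk '{ print NF }'; }

# Number of fields per line when every colon is a separator.
count_colon_fields() { awk -F: '{ print NF }'; }

# First colon-separated field, as in the /etc/passwd exercise.
first_colon_field() { awk -F: '{ print $1 }'; }

printf '  a   b\n' | count_fields           # 2: leading blanks and runs are ignored
printf ':a::b\n' | count_colon_fields       # 4: fields are "", a, "", b
printf 'root:x:0:0\n' | first_colon_field   # root

# NR, NF and $NF (the last field) for each line.
printf 'a b c\nd e\n' | awk '{ print NR, NF, $NF }'   # "1 3 c" then "2 2 e"
```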
awk (cont.)
  More on patterns
    Print user name of people who have no password
      $2 == ""           2nd field is empty
      $2 ~ /^$/          2nd field matches empty string
      $2 !~ /./          2nd field doesn't match any character
      length($2) == 0    length of 2nd field is zero
  Pair Up: Write an awk command line that prints lines of input that have an odd number of fields.
    awk 'NF % 2 != 0' files
  Pair Up: Write an awk command line that is the shortest equivalent of cat.
    awk '/^/' files
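The equivalent empty-field patterns can be checked against a made-up, passwd-like input. The accounts function and its three records are hypothetical; each pattern is paired with { print $1 } so that only the user name is printed.

```shell
# Colon-separated sample records; bob has an empty 2nd field (illustrative only).
accounts() { printf 'alice:x:1\nbob::2\ncarol:x:3\n'; }

accounts | awk -F: '$2 == "" { print $1 }'          # bob
accounts | awk -F: '$2 ~ /^$/ { print $1 }'         # bob
accounts | awk -F: '$2 !~ /./ { print $1 }'         # bob
accounts | awk -F: 'length($2) == 0 { print $1 }'   # bob

# Lines with an odd number of fields: "a b c" (3) and "f" (1), but not "d e" (2).
printf 'a b c\nd e\nf\n' | awk 'NF % 2 != 0'
```

An empty line has NF equal to 0, which is even, so the last pattern skips it.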