Unix Utilities (sort/uniq) CS465 – Unix
The sort command
Sorts lines.
Default behavior: a case-sensitive, ASCII-alphabetic line sort, starting at the beginning of each line (this is the ordering of the C/POSIX locale; other locales may order differently).
Sort options can be used to sort on different fields and in different ways.
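
The default ordering can be seen with a small sketch. The fruit names are made-up sample data, and LC_ALL=C is set here as an assumption so that plain ASCII ordering applies regardless of the system locale.

```shell
# Hypothetical sample data. LC_ALL=C forces byte (ASCII) ordering,
# in which all uppercase letters sort before all lowercase letters.
default_sorted=$(printf 'banana\nApple\ncherry\nBanana\napple\n' | LC_ALL=C sort)
printf '%s\n' "$default_sorted"

# With -f, upper and lower case are treated as the same for comparison.
folded_sorted=$(printf 'banana\nApple\ncherry\nBanana\napple\n' | LC_ALL=C sort -f)
printf '%s\n' "$folded_sorted"
```

The first command lists Apple and Banana ahead of apple and banana; with -f the two spellings of each word end up next to each other.
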
sort options
Format: $ sort [options] [files]
Options:
  +n            skip n fields before sorting, i.e. sort from field n+1 to end of line (older method)
  -kx           sort from field x to end of line (new method)
  +n -m         sort from field n+1 to field m (older method)
  -kx,y         sort from field x to field y (new method)
  -kx,x -ky,y   sort on field x, then on field y
sort options
Format: $ sort [options] [files]
Options:
  -b        ignore leading whitespace
  -d        dictionary order (blanks and alphanumeric chars only)
  -f        ignore case (upper/lower considered same)
  -n        sort in numeric order
  -o file   output to named file
  -r        sort in reverse (descending) order
  -tc       separate fields using c (default is whitespace)
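
A minimal sketch of how -t, -k and -n combine. The colon-separated records (name:dept:amount) are hypothetical sample data, not taken from the handout.

```shell
# Hypothetical records: name:dept:amount
data='carol:sales:300
alice:eng:1200
bob:eng:95'

# Field 3 only, compared numerically, with ":" as the field separator.
by_amount=$(printf '%s\n' "$data" | LC_ALL=C sort -t: -n -k3,3)
printf '%s\n' "$by_amount"

# Field 2 first, then field 1 to order records within the same dept.
by_dept_then_name=$(printf '%s\n' "$data" | LC_ALL=C sort -t: -k2,2 -k1,1)
printf '%s\n' "$by_dept_then_name"
```

Without -n the amounts would be compared as text, so 1200 would come before 300.
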
sort examples
$ sort +1 list1
  # sort list1 starting from field 2 to the end of the line
$ sort -k2,3 list2
  # sort list2 based upon the second and third fields together
$ sort -k3,3 -k5,5 list3
  # sort list3 on the third field, then the fifth field
sort examples
$ ls -l | sort -k9 -r
  # sort long listing of current directory in reverse filename (field 9) order
$ sort -k3 -o slist2 list2
  # sort list2, starting with the third field, and output the results to slist2
$ sort -k2 -b list3 > slist3
  # sort list3, starting with field 2 and ignoring leading blanks, and place the output in slist3
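
One practical difference between -o and shell redirection, sketched with a hypothetical scratch file: -o may name one of the input files, whereas redirecting onto the input file would empty it before sort reads it.

```shell
# Hypothetical scratch file; mktemp -d keeps the demo out of the current directory.
workdir=$(mktemp -d)
printf '3 c\n1 a\n2 b\n' > "$workdir/list"

# -o may name an input file: sort reads all of its input before writing.
# By contrast, "sort list > list" lets the shell truncate list first.
sort -o "$workdir/list" "$workdir/list"

in_place=$(cat "$workdir/list")
printf '%s\n' "$in_place"
rm -r "$workdir"
```
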
sort examples
$ sort -k2 sortfile.txt
bruce 1
david 10
edward 12
albert 2
chris 20
$
$ sort -n -k2 sortfile.txt
bruce 1
albert 2
david 10
edward 12
chris 20
$
$ sort sortfile.txt
albert 2
bruce 1
chris 20
david 10
edward 12
$
Handout
Review sort examples on handout.
uniq command
Removes duplicate lines from a file:
$ cat ab.txt
aaa
bbb
$ uniq ab.txt
aaa
bbb
Duplicate lines in the file must be adjacent, so uniq is often used with sort:
$ sort ab.txt | uniq > ab-uniq.txt
Using sort with uniq
$ cat fruit
apple
banana
apple
banana
$
$ uniq fruit
apple
banana
apple
banana
$
$ sort fruit | uniq
apple
banana
$
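
The same point as a runnable sketch, using the four-line fruit list as inline data. The last command uses sort -u, which is not covered in these slides; it is shown only as a common shorthand for sort | uniq when no uniq options are needed.

```shell
fruit='apple
banana
apple
banana'

# No two equal lines are adjacent, so uniq alone removes nothing.
uniq_only=$(printf '%s\n' "$fruit" | uniq)

# Sorting first makes the duplicates adjacent, so uniq can collapse them.
sorted_uniq=$(printf '%s\n' "$fruit" | sort | uniq)

# sort -u sorts and drops duplicate lines in one step.
sort_u=$(printf '%s\n' "$fruit" | sort -u)

printf '%s\n' "$uniq_only" "--" "$sorted_uniq" "--" "$sort_u"
```
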
uniq options
  -c    print each line once, along with a count of occurrences of each
  -d    print duplicate lines once (and don’t print any unique lines)
  -fN   do not compare the first N fields (skip fields)
  -u    print ONLY unique lines (discard ALL duplicates)
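
A common use of -c is a frequency count: sort, count with uniq -c, then sort the counts in descending numeric order. The word list below is hypothetical sample data; -d and -u are run on the same input for comparison.

```shell
words='pear
apple
pear
fig
pear
apple'

# Count each distinct line, then order by count, largest first.
# (The width of the leading padding before each count varies by implementation.)
counts=$(printf '%s\n' "$words" | sort | uniq -c | sort -rn)
printf '%s\n' "$counts"

# -d keeps one copy of each repeated line; -u keeps only lines that never repeat.
only_dups=$(printf '%s\n' "$words" | sort | uniq -d)
only_unique=$(printf '%s\n' "$words" | sort | uniq -u)
printf '%s\n' "$only_dups" "--" "$only_unique"
```
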
uniq examples
$ cat names
Bill
Pam
Pam
Ron
Sue
Sue
$
$ uniq names
Bill
Pam
Ron
Sue
$
$ uniq -d names
Pam
Sue
$
$ uniq -c names
   1 Bill
   2 Pam
   1 Ron
   2 Sue
$
$ uniq -u names
Bill
Ron
$
uniq examples
$ cat names
Bill Jones
Pam Smith
Sue Smith
Paul Jones
Dave Smith
Ron Smith
$
$ sort -k2 names
Bill Jones
Paul Jones
Dave Smith
Pam Smith
Ron Smith
Sue Smith
$
  # lines with the same second field are ordered by comparing the whole line
$ sort -k2 names | uniq -f1
Bill Jones
Dave Smith
$
$ sort -k2 names | uniq -f1 -c
   2 Bill Jones
   4 Dave Smith
$