UNIX Chapter 10 Advanced File Processing Mr. Mohammad Smirat
Regular Expression
Commonly used tools that accept regular expressions include awk, ed, grep, egrep, and vi, but the level of regular-expression support is not the same across these tools.
Regular Expression (cont.)
The following vi commands illustrate the use of regular expressions:
/ [0-9] /
    Search forward for a single standalone digit (a digit with a space on each side).
?\.c[1-7]?
    Search backward for a string that contains .c followed by a single digit from 1 to 7.
:1,$s/:$/./
    Over the whole file, replace a colon (:) at the end of a line with a period (.).
:.,$s/^[Hh]ello /Greetings /
    From the current line to the end of the file, replace the word hello or Hello at the start of a line with the word Greetings.
:1,$s/^ *//
    Remove any leading spaces from every line.
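The same patterns can be tried outside the editor. The following is a minimal sketch, assuming a small made-up file (note.txt and its contents are hypothetical); it uses sed and grep, which accept the same basic regular expressions as the vi commands above.

```shell
# Hypothetical sample text; the file name and contents are made up for illustration.
workdir=$(mktemp -d)
printf '%s\n' 'Dear all:' '   Hello world' 'version 7 of main.c3' > "$workdir/note.txt"

# Same idea as  :1,$s/:$/./  (replace a colon at the end of a line with a period)
colon_fixed=$(sed 's/:$/./' "$workdir/note.txt")

# Same idea as  :1,$s/^ *//  (strip leading spaces from every line)
trimmed=$(sed 's/^ *//' "$workdir/note.txt")

# Same idea as  / [0-9] /  (lines containing a digit with a space on each side)
standalone_digit=$(grep ' [0-9] ' "$workdir/note.txt")

printf '%s\n' "$colon_fixed" "$trimmed" "$standalone_digit"
rm -r "$workdir"
```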
Regular Expression (cont.)
A friend helped you type a very long report but consistently typed “thorogh” instead of “thorough”. How do you fix the document in one command?
:1,$s/thorogh/thorough/g
If the friend typed only lines 223 through 456, restrict the substitution to that range:
:223,456s/thorogh/thorough/g
The phone company has moved the 710 exchange to a new area code, so you must dial the area code 023 before all 710 exchange numbers. How do you fix the phone company file in one command?
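The slide leaves the phone-number question open, and the answer depends on how numbers are written in the file. The sketch below is one possible approach under a stated assumption: each 710 number appears as 710- followed by four digits. The files report.txt and phones.txt and their contents are hypothetical. It uses sed, whose s command takes the same line ranges and patterns as vi.

```shell
workdir=$(mktemp -d)

# Line-range substitution, like  :223,456s/thorogh/thorough/g  (here lines 2-3).
printf '%s\n' 'a thorogh check' 'thorogh and thorogh' 'still thorogh' > "$workdir/report.txt"
fixed=$(sed '2,3s/thorogh/thorough/g' "$workdir/report.txt")

# Assumed number format: "710-" followed by four digits.
# The & in the replacement stands for the matched text, so 023- is put in front of it.
# In vi the same substitution would be  :1,$s/710-[0-9][0-9][0-9][0-9]/023-&/g
printf '%s\n' 'Ali 710-1234' 'Nabeel 555-9876' 'Office 710-0000 and 710-1111' > "$workdir/phones.txt"
updated=$(sed 's/710-[0-9][0-9][0-9][0-9]/023-&/g' "$workdir/phones.txt")

printf '%s\n' "$fixed" "$updated"
rm -r "$workdir"
```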
Compressing Files
compress [options] [file-list]
The output is a compressed .Z file, or standard output if the input comes from standard input.
Options:
    -c  write the compressed data to standard output (the display screen) instead of a .Z file.
    -f  force compression (no prompt).
    -v  display the compression percentage and the names of the compressed files.
$ compress -v t1 t2
t1: -- replaced with t1.Z compression: 14.50%
t2: -- replaced with t2.Z compression: 17.89%
$ uncompress -v t1.Z t2.Z
t1.Z: -- replaced with t1
t2.Z: -- replaced with t2
Compressing Files (cont.)
zcat [options] [file-list]
Write the uncompressed (original) contents of the compressed files, concatenated, to standard output.
$ zcat t2.Z
This file is being used to explain test two.
$
The file t2.Z remains unchanged.
Sorting Files
sort [options] [file-list]
Sort the lines of the ASCII files in file-list.
Options:
    -b  ignore leading blanks.
    -d  sort in dictionary order: consider only letters, digits, and blanks.
    -f  treat lowercase and uppercase letters as equivalent.
    +n1 [-n2]  use a field as the sort key, starting at +n1 and ending at -n2, or at the end of the line if -n2 is not specified.
    -r  sort in reverse order.
Sorting Files (cont.)
$ sort students
    The file is sorted by all characters, from left to right.
$ sort +1 students
    The file is sorted on a key starting with the second field (field #1), because the first field is field #0.
$ sort +3 -r -b students
    The file is sorted on a key starting with the fourth field, in reverse order, ignoring leading white space.
$ sort +1 -3 +3 -b students
    The primary key starts with the second field and ends with the third field; the secondary key starts at the fourth field and runs to the end of the line. Leading white space is ignored.
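The +n1 -n2 notation is the older key syntax, and some current sort implementations no longer accept it. The sketch below shows the same kinds of sorts with the POSIX -k option, which counts fields from 1 instead of 0. The students file and its records are hypothetical, and LC_ALL=C is set only to make the ordering predictable.

```shell
workdir=$(mktemp -d)
# Hypothetical records: first name, last name, ID
printf '%s\n' 'Sara Zaid 103' 'Ali Nasser 101' 'Omar Hamad 102' > "$workdir/students"

# Whole-line sort, like:  sort students
by_line=$(LC_ALL=C sort "$workdir/students")

# Key starts at the second field, like:  sort +1 students
by_last=$(LC_ALL=C sort -k 2 "$workdir/students")

# Key starts at the third field, reversed, leading blanks ignored,
# like:  sort +2 -r -b students
by_id_desc=$(LC_ALL=C sort -b -r -k 3 "$workdir/students")

printf '%s\n' "$by_line" "$by_last" "$by_id_desc"
rm -r "$workdir"
```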
Searching for Commands and Files
find directory-list expression
Search the directories in directory-list to locate files that meet the criteria described by the expression.
Options:
    -inum N  search for files with inode number N.
    -name pattern  search for files whose names match the pattern.
    -user name  search for files owned by the user name.
    -links N  search for files with N links.
Searching for Commands and Files (cont.)
$ find ~ -name JUST.gif
    Search the home directory and display the pathname of every file named JUST.gif.
$ find /usr/include -name socket.h
    Search the directory /usr/include for socket.h and print the absolute pathname of the file.
$ find /usr . -inum 2342
    Search /usr and . (the present working directory) for all files that have inode number 2342 and print the pathnames of all such files.
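A small, self-contained way to try these searches is sketched below. The directory tree is hypothetical and is created in a temporary directory. The -type f test is not on the slide; it is added here because directories also have link counts and would otherwise match -links 2 on many file systems.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/pics/old" "$workdir/src"
touch "$workdir/pics/old/JUST.gif" "$workdir/src/socket.h"
# Give socket.h a second hard link so that its link count becomes 2.
ln "$workdir/src/socket.h" "$workdir/src/socket_link.h"

# Locate a file by name anywhere under the starting directory.
by_name=$(find "$workdir" -name JUST.gif)

# Locate regular files that have exactly two links.
two_links=$(find "$workdir" -type f -links 2 | sort)

printf '%s\n' "$by_name" "$two_links"
rm -r "$workdir"
```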
Searching Files
grep [options] pattern [file-list]
egrep [options] expression [file-list]
fgrep [options] string [file-list]
Search the files in file-list for the given pattern, expression, or string; if no file-list is given, take input from standard input.
Options:
    -c  print only the number of matching lines.
    -i  ignore the case of letters.
    -l  print only the names of files that contain a match.
    -n  print the line number along with each matching line.
    -v  print the non-matching lines.
    -w  match the pattern only as a whole word.
Searching Files (cont.)
$ grep Ali students
    Display the lines in the file students that contain the string Ali.
$ grep -n Ali students
    Display the matching lines for the string Ali in the file students, each with its line number.
$ grep -n include *.c
    Search all files with the extension .c in the present working directory for the string include.
$ grep '^[A-H]' students
    Display the lines in the file students that start with a letter from A through H. The ^ specifies the beginning of the line.
Searching Files (cont.)
$ egrep "^J" students
    Display the lines in the file students that start with the letter J. The ^ indicates the beginning of the line.
$ egrep "^J|^K" students
    Display the lines in the file students that start with the letter J or K.
$ grep '\<Ke' students
    Display the lines that contain words starting with the string Ke. Note: \< indicates the start of a word.
$ egrep -v "Ali|Nabeel" students
    Display the lines in the file students that do not contain Ali or Nabeel.
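The searches above can be reproduced on a small sample. In this sketch the students file and the names in it are hypothetical, and grep -E is used in place of egrep; it is the POSIX spelling of the same command.

```shell
workdir=$(mktemp -d)
# Hypothetical student names, one per line.
printf '%s\n' 'Ali Nasser' 'Jamal Kettani' 'Kareem Saleh' 'Nabeel Omar' 'Hala Yousef' > "$workdir/students"

# Matching lines with line numbers.
numbered=$(grep -n Ali "$workdir/students")

# Lines that start with a letter from A through H.
a_to_h=$(grep '^[A-H]' "$workdir/students")

# Lines that start with J or K (extended regular expression alternation).
j_or_k=$(grep -E '^J|^K' "$workdir/students")

# Lines that contain neither Ali nor Nabeel.
neither=$(grep -E -v 'Ali|Nabeel' "$workdir/students")

printf '%s\n' "$numbered" "$a_to_h" "$j_or_k" "$neither"
rm -r "$workdir"
```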
Cutting and Pasting
cut -blist [-n] [file-list]
cut -clist [file-list]
cut -flist [-dchar] [-s] [file-list]
Cut out fields (columns) of a table in a file.
Options:
    -b list  treat each byte as a column and cut the bytes specified in list.
    -c list  treat each character as a column and cut the characters specified in list.
    -d char  use char instead of the tab character as the field separator.
    -f list  cut the fields specified in list.
    -n  do not split characters (used with the -b option).
    -s  do not output lines that do not contain the delimiter character.
Cutting and Pasting (cont.)
$ cut -f1,2 students
    Select the first and second fields of the file students.
$ cut -f1,2,5 students
    Select the first, second, and fifth fields of the file students.
$ cut -f1-3 students
    Select fields one through three of the file students.
$ cut -d: -f5,1,6 /etc/passwd
    The -d option sets : as the field separator; fields 1, 5, and 6 of the file /etc/passwd are displayed. cut writes the selected fields in the order they appear in the file, not in the order listed.
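To see the field selection without touching the real /etc/passwd, the sketch below runs the same cut options on a hypothetical file, passwd.sample, that holds two made-up records in the colon-separated passwd layout.

```shell
workdir=$(mktemp -d)
# Hypothetical passwd-style records: seven colon-separated fields per line.
printf '%s\n' \
  'ali:x:1001:100:Ali Nasser:/home/ali:/bin/sh' \
  'sara:x:1002:100:Sara Zaid:/home/sara:/bin/sh' > "$workdir/passwd.sample"

# Fields 1, 5, and 6; the output keeps file order even though the list says 5,1,6.
picked=$(cut -d: -f5,1,6 "$workdir/passwd.sample")

# A range of fields: one through three.
first_three=$(cut -d: -f1-3 "$workdir/passwd.sample")

printf '%s\n' "$picked" "$first_three"
rm -r "$workdir"
```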
Cutting and Pasting (cont.)
paste [options] file-list
Horizontally concatenate the files in file-list.
Options:
    -d list  use the characters in list, instead of the tab character, to separate the joined columns.
$ paste students student_record
    Display both files on the screen, concatenated horizontally.
$ cut -f1-3 students > table1
$ cut -f4 student_record > table2
$ paste table1 table2
$ rm table1 table2
    Display the first three fields of students alongside the fourth field of student_record.
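The cut-then-paste sequence can be tried end to end as sketched below. The two tab-separated files and their contents are hypothetical stand-ins for students and student_record, and the intermediate tables are kept inside a temporary directory.

```shell
workdir=$(mktemp -d)
# Hypothetical tab-separated tables with four fields each.
printf 'Ali\tNasser\t101\tCS\nSara\tZaid\t103\tMath\n' > "$workdir/students"
printf '101\tA\t3.5\tSenior\n103\tB\t3.1\tJunior\n' > "$workdir/student_record"

# Keep fields 1-3 of one file and field 4 of the other.
cut -f1-3 "$workdir/students" > "$workdir/table1"
cut -f4 "$workdir/student_record" > "$workdir/table2"

# Join the two tables line by line; the default separator is a tab.
merged=$(paste "$workdir/table1" "$workdir/table2")

# The same join with a colon between the two tables.
with_colon=$(paste -d: "$workdir/table1" "$workdir/table2")

printf '%s\n' "$merged" "$with_colon"
rm -r "$workdir"
```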
File Encryption and Decryption
crypt [options]
Encrypt (or decrypt) standard input and send the result to standard output.
Options:
    key  password to be used to perform the encryption and decryption.
    -k   use the value of the environment variable CRYPTKEY.