1
grep (Global Regular Expression Print)
Operation
- Search a group of files
- Find all lines that contain a particular regular expression pattern
- Write the matching lines to standard output (redirect it to save the result in a file)
- grep returns to the prompt with no extra output when it is done
Syntax: grep [-cilLnrsvwx] pattern [list of files]
Examples
- Find information about the user harley
  >grep harley /etc/passwd
- Find all lines in the files containing the string xxx
  >grep xxx [list of files]
2
grep Flags
1. -c  count the number of matching lines
2. -i  ignore case when searching for matches
3. -l  list the names of files containing matches
4. -L  list the names of files that do not have a match
5. -n  write the line number in front of each line
6. -r  perform a recursive directory search
7. -s  suppress warning and error messages
8. -v  search for lines that do not match the pattern
9. -w  search only for complete words
10. -x  select only lines that exactly match the pattern
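A minimal sketch of a few of these flags in action. The file fruits.txt and its contents are made up for illustration, and the results in the comments assume a POSIX-style grep.

```shell
# Work in a throwaway directory so nothing real is touched.
dir=$(mktemp -d)
printf 'apple\nApple pie\nbanana\npineapple\n' > "$dir/fruits.txt"

count=$(grep -c apple "$dir/fruits.txt")       # lines containing "apple": 2
nocase=$(grep -ci apple "$dir/fruits.txt")     # same, ignoring case: 3
numbered=$(grep -n banana "$dir/fruits.txt")   # line number first: 3:banana
inverted=$(grep -vi apple "$dir/fruits.txt")   # lines without a match: banana
whole=$(grep -w apple "$dir/fruits.txt")       # "apple" as a complete word only
exact=$(grep -xc apple "$dir/fruits.txt")      # lines that are exactly "apple": 1

echo "$count $nocase $numbered $inverted $whole $exact"
rm -r "$dir"
```

Note that -w rejects "pineapple" because the match is preceded by a letter, while plain grep accepts it.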
3
Regular Expressions
Industry-standard way to specify patterns
- In Java: string.matches("pattern");
- In Java: string.replaceAll("pattern", string)
Meta-characters/operators (some need to be escaped)
  ^      beginning of a line
  $      end of a line
  *      match 0 or more of the previous group
  +      match 1 or more of the previous group
  ?      match 0 or 1 of the previous group
  {n}    match n of the previous group
  {m,n}  match m to n of the previous group
  {n,}   match n or more of the previous group
  |      match either the group before or the group after
  .      match any character except a newline
  \      interpret the following meta-character or operator literally
Note: Many UNIX programs use these (vi, sed, more, grep, awk)
4
Regular Expression Examples
  Regular Expression       String        Match
  [a-z](12){3}[c-e]{3}     a121212cde    Yes
  a.*e+                    abc12cde      Yes
  a.*f                     abc12cde      No
  ^a.*e$                   abc12cde      Yes
  ^b*e$                    abc12cde      No
  ^a*e$                    abc12cde      No
  \^.*\$                   ^ab12cd$      Yes
  ^.*$                     ^ab12cd$      Yes
  ^*$                      ^ab12cd$      No
Note: To use ( ) { } or + with grep, use the -E (extended) switch or precede them with \
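Several rows of the table can be checked directly with grep -E. This is only a sketch: the helper function matches is a hypothetical name introduced here, not part of grep.

```shell
# matches PATTERN STRING: prints Yes if the extended regex matches, else No.
matches() {
    if printf '%s\n' "$2" | grep -Eq -- "$1"; then echo Yes; else echo No; fi
}

r1=$(matches '[a-z](12){3}[c-e]{3}' 'a121212cde')   # Yes
r2=$(matches 'a.*e+' 'abc12cde')                    # Yes
r3=$(matches 'a.*f' 'abc12cde')                     # No: there is no f
r4=$(matches '^a.*e$' 'abc12cde')                   # Yes
r5=$(matches '^b*e$' 'abc12cde')                    # No: the line starts with a
r6=$(matches '\^.*\$' '^ab12cd$')                   # Yes: ^ and $ are literal here

echo "$r1 $r2 $r3 $r4 $r5 $r6"
```

Single quotes keep the shell from touching $, \ and *, so grep sees each pattern exactly as typed.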
5
More grep Examples
Contents of a file called homework
  Math: problems 12-10 to 12-33, due Monday
  BasketWeaving: make a 6-inch basket, DONE
  Psychology: essay on Animal Existentialism, due end of term
  Surfing: catch at least 10 grep commands
>grep -v DONE homework            displays all but line 2
>grep -c DONE homework            displays 1
>grep -wi ".*a.*" homework        displays all lines
>grep -w "m.*e" homework          displays line 2
>grep -i "d.*e" homework          displays lines 1, 2 and 3
>grep '\(Ma\|DO\).*' homework     displays lines 1 and 2
Note: the last example escapes the parentheses and the vertical bar
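The claims on this slide can be reproduced by recreating the homework file in a temporary directory. This is a sketch: the last command uses -E with unescaped ( | ) instead of the escaped form, and cut and paste are added only to reduce each result to a list of line numbers.

```shell
dir=$(mktemp -d)
cat > "$dir/homework" <<'EOF'
Math: problems 12-10 to 12-33, due Monday
BasketWeaving: make a 6-inch basket, DONE
Psychology: essay on Animal Existentialism, due end of term
Surfing: catch at least 10 grep commands
EOF

not_done=$(grep -vc DONE "$dir/homework")    # number of lines without DONE
done_count=$(grep -c DONE "$dir/homework")   # number of lines with DONE
# For the rest, keep only the line numbers that -n puts before each match.
word_lines=$(grep -nw 'm.*e' "$dir/homework" | cut -d: -f1 | paste -sd ' ' -)
d_e_lines=$(grep -ni 'd.*e' "$dir/homework" | cut -d: -f1 | paste -sd ' ' -)
either=$(grep -nE '(Ma|DO)' "$dir/homework" | cut -d: -f1 | paste -sd ' ' -)

echo "$not_done / $done_count / $word_lines / $d_e_lines / $either"
rm -r "$dir"
```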
6
Sorting Data
Background
- Each line in a file is a record
- Each line is a series of fields separated by spaces and/or tabs
Commands
>sort fileName
    sorts fileName on the 1st field of each line
>sort -k 6 fileName
    sorts on the 6th field of each line
>sort -n -k 5 fileName
    sorts on the 5th field numerically
>sort -t ':' -k4,4r -k3,3 fileName
    sorts descending on the 4th field, and then ascending on the 3rd, with ':' as a delimiter
>sort -t ':' fileName
    sorts using ':' as a separator character
>sort -u -k2r fileName
    sorts in reverse on the 2nd field and removes duplicates (output must be unique)
>sort -k 3,4
    in a pipe, sorts by the key from field 3 through field 4
>sort -k5n -k8
    sorts numerically by the 5th field and alphabetically by the 8th
Note: a key written as -k N runs from field N to the end of the line; write -k N,N to limit it to a single field
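A short sketch of field-based sorting. The staff file and its name:department:salary layout are hypothetical; cut and paste are used only to show the resulting order compactly.

```shell
dir=$(mktemp -d)
# Hypothetical records: name:department:salary
printf 'kim:sales:300\njoe:ops:1200\nann:sales:45\n' > "$dir/staff"

# Sort on the 1st field.
by_name=$(sort -t ':' -k1,1 "$dir/staff" | cut -d: -f1 | paste -sd ' ' -)
# Sort on the 3rd field as text: "1200" comes before "300" and "45".
lexical=$(sort -t ':' -k3,3 "$dir/staff" | cut -d: -f1 | paste -sd ' ' -)
# Sort on the 3rd field numerically: 45, 300, 1200.
numeric=$(sort -t ':' -k3,3n "$dir/staff" | cut -d: -f1 | paste -sd ' ' -)
# Unique department names.
depts=$(cut -d: -f2 "$dir/staff" | sort -u | paste -sd ' ' -)

echo "$by_name / $lexical / $numeric / $depts"
rm -r "$dir"
```

The difference between the lexical and numeric results is why the n modifier matters for number columns.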
7
SED (Stream Editor)
SED is a filter
- Input from stdin or a file
- Output to stdout or a file
- Modifies the input to produce the output
- Non-interactive
Processing
- Read from an input stream
- Perform line-oriented commands
- Write to an output stream
Syntax: >sed [-i] command | [-e command] … [file]
8
Search and Replace
Search, change and redirect to newFile
  >sed 's/cat/dog/g' file > newFile
Search, change, and edit the file in place
  >sed -i 's/cat/dog/g' file
Apply the search to a specific range of lines
  >sed '5,10s/cat/dog/g' file
Apply the search only to lines containing OK
  >sed '/OK/s/cat/dog/g' names
Apply the search only to lines containing 2 consecutive digits
  >sed '/[0-9]\{2\}/s/cat/dog/g' names
Delete a range of lines
  >sed '5,10d' file
Note: single quotes suppress the shell's interpretation of special characters
Note: This syntax works in vi, more, awk
Note: You must escape the characters: +, { and } for it to work
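A small sketch of the addressing forms above, run on a made-up three-line file called pets. The -i form is left out on purpose because it rewrites the file.

```shell
dir=$(mktemp -d)
printf 'cat one\ncat two OK\nthe cat sat on the cat\n' > "$dir/pets"

all=$(sed 's/cat/dog/g' "$dir/pets")          # every occurrence on every line
first_only=$(printf 'the cat sat on the cat\n' | sed 's/cat/dog/')  # no g flag
ranged=$(sed '1,2s/cat/dog/g' "$dir/pets")    # only lines 1 through 2
ok_only=$(sed '/OK/s/cat/dog/g' "$dir/pets")  # only lines containing OK
deleted=$(sed '1,2d' "$dir/pets")             # drop lines 1 through 2

printf '%s\n---\n%s\n---\n%s\n' "$all" "$ok_only" "$deleted"
rm -r "$dir"
```

Without the g flag only the first "cat" on a line is replaced, so first_only is "the dog sat on the cat".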
9
Complex Commands
>sed -i \
    -e 's/mon/Monday/g' \
    -e 's/tue/Tuesday/g' \
    -e 's/wed/Wednesday/g' \
    -e 's/thu/Thursday/g' \
    -e 's/fri/Friday/g' \
    -e 's/sat/Saturday/g' \
    -e 's/sun/Sunday/g' \
    calendar
- The backslash is a continuation character
- Each -e specifies another command (expression)
- The trailing g means change every occurrence on each line, not just the first
10
AWK
AWK (Aho, Weinberger, Kernighan)
Special-purpose programming language
- Interpreted
- Useful for UNIX scripts
Purposes
- Filter text files based on supplied patterns
- Produce reports
- Callable from "vi"
- Create simple databases
- Simple mathematical operations
- Creating scripts
Not good for large, complicated tasks
Other interpreted languages: perl, php
11
General Syntax
- The single quotes cause the shell to ignore special characters
- The various clauses are optional
- Much of the syntax for clauses is C and Java compatible
- The patterns utilize regular expressions
Program layout
  BEGIN   { actions }
  pattern { actions }
  pattern { actions }
  END     { actions }
Invocation
  >awk 'program' file
12
AWK General Operation
Each file consists of a series of records
Each record is a series of fields
Defaults
- Record separator: newline character
- Field separator: white space characters
Flow of Operation
- Read the input file line by line
- If the line matches a pattern, run the actions for that pattern
- Otherwise skip the line
13
Some AWK Simple Examples
1. Print fields of records in a file
   >awk '{print $5, $6, $7, $8}' fileName
2. Print lines with a search string
   >awk '/gold/ {print}' fileName
3. Print the number of records
   >awk 'END {print NR, "records"}' fileName
4. Print records using a condition
   >awk '{if ($3 …
   >awk '$2 > max {print $2}' fileName
5. Compare a field to a regular expression
   >awk '$2 ~ /[0-9]+/ {print $2}' fileName
6. Use variables
   >awk '/gold/ {sum += $2} END {print "value = " sum}' \
       fileName
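A runnable sketch of the same kinds of one-liners. The coins file, its two-column layout, and the threshold 3 are assumptions made up for this illustration.

```shell
dir=$(mktemp -d)
# Hypothetical data: item name, then weight.
printf 'gold 3\nsilver 10\ngold 4\n' > "$dir/coins"

swapped=$(awk 'NR == 1 {print $2, $1}' "$dir/coins")     # fields of the first record
records=$(awk 'END {print NR, "records"}' "$dir/coins")  # record count
heavy=$(awk '$2 > 3 {print $1}' "$dir/coins" | paste -sd ' ' -)  # numeric condition
total=$(awk '/gold/ {sum += $2} END {print "value = " sum}' "$dir/coins")

echo "$swapped / $records / $heavy / $total"
rm -r "$dir"
```

The variable sum needs no declaration: awk treats an unset variable as 0 in arithmetic.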
14
A Longer AWK command
>awk -F ';' \
  'BEGIN \
    {num_gold=0; wt_gold=0; } \
  \
  /[Gg]old/ { num_gold++; wt_gold += $2; } \
  \
  END \
    { printf("\n Gold Pieces: %2d %5.2f\n", \
        num_gold, wt_gold); \
    }' \
  goldFile
Contents of goldFile
  Gold;3.5
  Silver;2.25
  Bronze;5.31
  Gold;23.22
  gold;0.22
Output
  Gold Pieces:  3 26.94
Note: The backslashes mark continuation lines
Note: Semicolons delimit the fields in the file
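The command can be tried end to end by first creating the data file. In this sketch the file is written with semicolon separators to match the -F ';' option, and the program is laid out on separate lines inside the quotes, so no continuation backslashes are needed.

```shell
dir=$(mktemp -d)
printf 'Gold;3.5\nSilver;2.25\nBronze;5.31\nGold;23.22\ngold;0.22\n' > "$dir/goldFile"

summary=$(awk -F ';' '
    BEGIN     { num_gold = 0; wt_gold = 0 }
    /[Gg]old/ { num_gold++; wt_gold += $2 }     # matches Gold and gold
    END       { printf("Gold Pieces: %d %.2f\n", num_gold, wt_gold) }
' "$dir/goldFile")

echo "$summary"    # prints: Gold Pieces: 3 26.94
rm -r "$dir"
```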
15
Execute Program in a file
Program file
  # awk program summarizing a coin collection
  BEGIN { num_gold=0; wt_gold=0; }
  /[Gg]old/ { num_gold++; wt_gold += $2 }
  END {
    val_gold = 485 * wt_gold;
    printf("\n Gold Pieces: %2d", num_gold);
    printf("\n Gold Weight: %5.2f", wt_gold);
    printf("\n Gold Value: %7.2f\n", val_gold);
  }
Running it
  >awk -F ';' -f [program file] [data file]
Output
  Gold Pieces:  3
  Gold Weight: 26.94
  Gold Value: 13065.90
16
Invoking AWK
>awk [-F fs] ['program'] [-f programFile] [variables] [- | file]
- fs: a field separator (default: space, tab)
- program: an AWK program
- programFile: a file containing an AWK program
- variables: a series of variables to initialize
  >awk -f program f1=file2 f2=file1 > output
- - means accept AWK input from STDIN
- file: a file containing data to process
Note: AWK is often invoked repeatedly in shell scripts
17
Search Patterns
- An exact string: /The/
- A string starting a line: /^The/
- A string ending a line: /The$/
- A string ignoring the case of its first letter: /[Tt]he/
- Decimal: /[0-9]*\.[0-9]*/
- Alphanumeric: /[a-zA-Z0-9]*/
- Choice between two strings: /(da|De).*/
- Numeric: /[+-]?[0-9]+/
- Any Boolean expression: $4>90 or $4>$5
Note: Some utilities require \(, \) and \| if you use the ( ) | regular expression characters
18
Built in Variables
- NR: Number of records read so far (the total when used in END)
- NF: Number of fields in the current record
- FILENAME: The current input file
- FS: Field separator character
- RS: Record separator character
- OFS: Output field separator character
- ORS: Output record separator character
- OFMT: The default output format for numbers printed with print
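A brief sketch showing NR, NF, FS and OFS at work on input piped in from printf. The sample records are made up.

```shell
# NR is the record number, NF the field count, $NF the last field.
report=$(printf 'a b c\nd e\n' | awk '{ print NR, NF, $NF }')

# FS splits the input on ":"; assigning to a field makes awk rebuild
# the record using OFS, so the output is joined with "-".
joined=$(printf 'a:b:c\n' | awk 'BEGIN { FS = ":"; OFS = "-" } { $1 = $1; print }')

printf '%s\n%s\n' "$report" "$joined"
```

The first command prints "1 3 c" and "2 2 e"; the second prints "a-b-c".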
19
Arrays and control structures
Indexed and associative arrays
- By index: months[3] = "March";
- Associative: debts["Kim"] = 1000;
- Note: by convention arrays index from one, not zero (split numbers its elements from 1)
Control structures
- Counter controlled: for (i=1; i<100; i++) data[i] = i;
- Iterator: for (i in myArray) print i, myArray[i];
- Pre test: i=0; while (i<20) { data[i] = i; i++ }
- Condition: if (i==1) print debts["Kim"]; else print debts["Joe"];
             print ((i==1) ? debts["Kim"] : debts["Joe"]);
Unconditional control statements
- break: jump out of a loop
- continue: next iteration
- next: get next line of input
- exit: exit the AWK program
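A minimal sketch of an associative array and a counter-controlled loop. The names and amounts are made up; the output is piped through sort because the order of a for (name in array) loop is unspecified.

```shell
# Sum the amounts per name in an associative array, then list the totals.
totals=$(printf 'Kim 1000\nJoe 250\nKim 500\n' | awk '
    { debts[$1] += $2 }                 # key: first field, value: running sum
    END {
        for (name in debts)             # iteration order is not guaranteed
            print name, debts[name]
    }' | sort)

# Fill an indexed array with a counter-controlled loop.
squares=$(awk 'BEGIN {
    for (i = 1; i <= 3; i++) data[i] = i * i
    print data[1], data[2], data[3]
}')

printf '%s\n%s\n' "$totals" "$squares"
```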
20
Built-in functions
- Square root: print sqrt(3.6)
- Integer portion: print int(3.2)
- Substring: print substr("abcde", 3, 2);
- Split: n = split("a;b;c;d;e", letters, ";");
- Position: print index("gorbachev", "bach");
Note: if a substring doesn't exist, index returns 0
Note: Strings index from one, not zero
Note: split stores the pieces in the array (letters[1] to letters[n]) and returns their count
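The functions can be tried from a BEGIN block, which runs without any input file. In this sketch sqrt(16) is used instead of sqrt(3.6) so the result is a whole number.

```shell
out=$(awk 'BEGIN {
    print int(3.2)                          # 3
    print substr("abcde", 3, 2)             # cd: 2 characters from position 3
    print index("gorbachev", "bach")        # 4: positions count from 1
    print index("gorbachev", "xyz")         # 0: not found
    n = split("a;b;c;d;e", letters, ";")    # fills letters[1] .. letters[5]
    print n, letters[1], letters[5]         # 5 a e
    print sqrt(16)                          # 4
}' | paste -sd ' ' -)

echo "$out"
```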
21
printf
printf(template, arguments);
- printf applies the template to the arguments
- Formats are specified in the template
  %d for integer output
  %o for octal
  %x for hexadecimal
  %s for string
  %e for exponential format
  %f for floating point format
- Greater control
  %5.2f means 5 characters wide, with two digits after the decimal point
  %-8.4s means left justify, 8 wide, print at most 4 characters
  %08d means 8 wide, padded with leading zeroes
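A quick sketch of the width and precision rules. Square brackets are placed around each result only to make the padding visible.

```shell
f=$(awk 'BEGIN { printf("[%5.2f]", 3.14159) }')         # [ 3.14]
s=$(awk 'BEGIN { printf("[%-8.4s]", "abcdefgh") }')     # [abcd    ]
d=$(awk 'BEGIN { printf("[%05d]", 42) }')               # [00042]
x=$(awk 'BEGIN { printf("%d %o %x", 255, 255, 255) }')  # 255 377 ff

printf '%s\n%s\n%s\n%s\n' "$f" "$s" "$d" "$x"
```

Unlike print, printf adds no newline on its own, so templates usually end in \n.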
22
Escape Characters
- New line: \n
- Carriage return: \r
- Backspace: \b
- Horizontal tab: \t
- Form feed: \f
- A quote: \"
- A backslash: \\
23
AWK redirection and pipes
Create a file with the first field
  >awk '{print $1 >> "file" }'
Pipe output to another utility, here one that translates lower case to upper case
  >ls -l | awk '{print $9}' | tr '[a-z]' '[A-Z]'
Sort the grades file numerically on the 5th field and print the first field
  >sort -k 5n grades | awk '{print $1}'
List .txt files smaller than 2000 bytes, largest first
  >ls -l | grep '\.txt$' | awk '$5 < 2000 {print $9, $5}' | sort -nr -k 2
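A self-contained sketch of the same ideas on made-up grade data instead of ls output, since ls -l columns vary between systems. The output file name is passed in with -v as the variable out, a name chosen here for illustration.

```shell
dir=$(mktemp -d)

# Redirection inside awk: append the first field of each record to a file.
printf 'alice 90\nbob 85\n' | awk -v out="$dir/names" '{ print $1 >> out }'
names=$(paste -sd ' ' "$dir/names")

# Pipe awk output into tr to convert lower case to upper case.
upper=$(printf 'alice 90\nbob 85\n' | awk '{print $1}' | tr 'a-z' 'A-Z' | paste -sd ' ' -)

# Sort numerically on the 2nd field, then print the 1st field.
ranked=$(printf 'alice 90\nbob 85\ncarol 100\n' | sort -k2,2n | awk '{print $1}' | paste -sd ' ' -)

echo "$names / $upper / $ranked"
rm -r "$dir"
```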
24
More Examples
Print Bush's grades
  >awk '/Bush/ {print $3, $4}' grades
Print first name, last name, and quiz 3 grade for everyone who got more than 90 on quizzes 1 and 2
  >awk '{if ($4>90 && $5>90) print $3, $2, $6}' grades
  >awk '$4>90 && $5>90 {print $3, $2, $6}' grades
Print the username for the user with userid 502
  >awk -F: '{if ($3==502) print $1}' /etc/passwd
  >awk -F: '$3==502 {print $1}' /etc/passwd