
1 Filters and Utilities

2 Notes: This is a simple overview of the filtering capability Some of these commands are very powerful ▫Only showing some of the basics of a few of the commands

3 Reminder: Grave accent ▫AKA backtick or backquote ▫Used for command substitution in bash and other Linux utilities and languages ▫Typical use:  put a command between a pair of `  the std out of the command is substituted ▫Example:  #echo The date is:`date`! #The date is:Sun Mar 17 15:51:28 EDT 2013!
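A minimal sketch of command substitution, assuming a POSIX-style shell such as bash. The `$( )` form is not mentioned on the slide; it is shown here only because it is equivalent to backticks and nests without escaping. The variable names are illustrative.

```shell
# Backtick form, as on the slide: the std out of date replaces `date`
echo "The date is:`date`!"

# Equivalent $( ) form
echo "The date is:$(date)!"

# The substituted text can be stored in a variable
greeting=`printf '%s' hello`

# $( ) nests without extra escaping; the inner command runs first
nested=$(printf '%s' "outer $(printf '%s' inner)")

echo "$greeting"
echo "$nested"
```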

4 What are Filters? ▫Use std in and std out  Monitor the input  Modify data as appropriate  Change  Delete  Move  "as appropriate"  Send data to standard out

5 Filter examples Simple ▫pr ▫cmp ▫diff ▫comm ▫head ▫tail ▫cut ▫paste ▫sort ▫uniq ▫tr Complex ▫grep ▫sed Filter/script ▫awk

6 pr: Paginate Files Prepare files for printing Adds: ▫Headers ▫Footers ▫Formatted text Default adds 5 lines before and after text on page Options: ▫Make columns ▫Set page length ▫Set page width ▫Number lines in output

7 cmp: Byte by Byte Compare Compares two files Terminates on first delta ▫Echoes the location of first mismatch  Reports the byte offset and line number ▫Returns:  Exit status 0 (true) if identical  Exit status 1 (false) if the files differ
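The return value of cmp can be used directly in an `if`. The sketch below assumes throwaway files created under a `mktemp -d` directory; the file names and the variables `same_ab` and `same_ac` are made up for illustration.

```shell
tmp=$(mktemp -d)
printf 'alpha\nbeta\n' > "$tmp/a.txt"
printf 'alpha\nbeta\n' > "$tmp/b.txt"
printf 'alpha\nbetA\n' > "$tmp/c.txt"

# -s suppresses the message so only the exit status is used
if cmp -s "$tmp/a.txt" "$tmp/b.txt"; then same_ab=yes; else same_ab=no; fi
if cmp -s "$tmp/a.txt" "$tmp/c.txt"; then same_ac=yes; else same_ac=no; fi
echo "a vs b identical: $same_ab"
echo "a vs c identical: $same_ac"

# Without -s, cmp reports where the first difference is;
# "|| true" keeps the non-zero status from stopping a script run with set -e
cmp "$tmp/a.txt" "$tmp/c.txt" || true

rm -rf "$tmp"
```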

8 comm: What Is Common between files Compares files line by line ▫Requires sorted files to work properly Returns 3 types of differently indented lines ▫Lines unique to first file ▫Lines unique to second file ▫Lines common to both Output is “weird” in columns ▫1st col is lines unique to 1st file ▫2nd col is lines unique to 2nd file ▫3rd col is common lines comm.sh in ~/ITIS3110/bashscripts commbad.sh (with error)
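The three columns are easier to work with when two of them are suppressed. This sketch is not the comm.sh script mentioned on the slide; it assumes two small, already sorted files with made-up names and uses the standard `-1`, `-2`, `-3` options to hide a column.

```shell
tmp=$(mktemp -d)
printf 'apple\nbanana\ncherry\n' > "$tmp/one"
printf 'banana\ncherry\ndate\n' > "$tmp/two"

# All three columns: unique to one, unique to two, common
comm "$tmp/one" "$tmp/two"

# Suppress columns 1 and 2: only lines common to both
common=$(comm -12 "$tmp/one" "$tmp/two")
# Suppress columns 2 and 3: only lines unique to the first file
only_first=$(comm -23 "$tmp/one" "$tmp/two")
# Suppress columns 1 and 3: only lines unique to the second file
only_second=$(comm -13 "$tmp/one" "$tmp/two")

echo "common: $common"
echo "only in first: $only_first"
echo "only in second: $only_second"
rm -rf "$tmp"
```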

9 diff: "How to make files the same" Details how to change one file to make it the same as the other ▫For each delta, gives instructions for how to change the first file to match the second
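A small sketch of what those instructions look like in diff's default output format, assuming two made-up three-line files. `2c2` reads as "change line 2 of the first file into line 2 of the second"; `<` marks the old text and `>` the new text.

```shell
tmp=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$tmp/old"
printf 'one\n2\nthree\n' > "$tmp/new"

# diff exits with 1 when the files differ, so capture the status explicitly
status=0
changes=$(diff "$tmp/old" "$tmp/new") || status=$?

echo "$changes"
echo "exit status: $status"
rm -rf "$tmp"
```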

10 head: Display beginning of file Show the first n lines of a file ▫Default is 10 ▫Can change with -n x Example use: ▫Want to re-edit the last file you edited: ▫ nano `ls -t | head -n 1`  ls -t: list by time, newest first  head -n 1: list first entry  Feed as a parameter to nano with the backticks

11 tail: Display end of file Show the last n lines of a file ▫Default is 10 ▫Can change with -n x Options ▫-f  Monitor the file as it grows  Must terminate with an interrupt (typically ^C) ▫-c  Do the last n chars instead of lines
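head and tail can be tried on a generated file. The sketch below assumes a made-up 20-line file; piping head into tail to pick a middle range is a common idiom and not something the slides state.

```shell
tmp=$(mktemp -d)
f="$tmp/twenty.file"
i=1
while [ "$i" -le 20 ]; do
  echo "line $i"
  i=$((i + 1))
done > "$f"

first3=$(head -n 3 "$f")      # first three lines
last2=$(tail -n 2 "$f")       # last two lines
lastchars=$(tail -c 3 "$f")   # last three characters: "20" plus the newline
middle=$(head -n 7 "$f" | tail -n 3)   # lines 5 through 7

echo "$first3"
echo "$last2"
echo "$middle"
rm -rf "$tmp"
```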

12 cut: Splitting a file vertically Cuts a range out by: ▫Columns  Good for fixed length entries  -c range  -c1-4 ▫Fields  Good for delimited entries  Tab is default  -d specifies delimiter  -d/ set the / as the delimiter  -f specifies the fields to use  -f1,4 specifies the first and fourth fields
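A short sketch of the column and field forms of cut on made-up input strings. The variable names are illustrative.

```shell
# Columns: characters 1 through 4 of a fixed-length entry
cols=$(printf 'abcdefgh\n' | cut -c1-4)

# Fields: / as the delimiter, first and fourth fields
fields=$(printf 'a/b/c/d\n' | cut -d/ -f1,4)

# Fields with the default tab delimiter, second field only
tabbed=$(printf 'x\ty\tz\n' | cut -f2)

echo "$cols"
echo "$fields"
echo "$tabbed"
```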

13 paste: Paste files vertically Paste two files together line by line Can be used on a single file to join multiple sequential lines together ▫ -s  Do serial on a single file ▫ -d  Separate joined element with the list of delimiters
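The two modes of paste can be compared on a pair of made-up three-line files: joining two files line by line, and joining the lines of one file serially with `-s`.

```shell
tmp=$(mktemp -d)
printf 'a\nb\nc\n' > "$tmp/names"
printf '1\n2\n3\n' > "$tmp/nums"

# Line by line, with : as the delimiter instead of the default tab
side=$(paste -d: "$tmp/names" "$tmp/nums")

# Serial: join all lines of a single file into one line
serial=$(paste -s -d, "$tmp/names")

echo "$side"
echo "$serial"
rm -rf "$tmp"
```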

14 sort: Order files Put files in order ▫Default is ascending order, comparing whole lines starting at column 1  ASCII order Options: ▫-t  Define a delimiter ▫-k  Used with -t, which field to use  Can have multiple keys  Use commas to separate ranges  Use -k again to denote a new field  Can sort on columns in a field  Use a dot to separate ▫-n  Treat a field as a number, not as ASCII characters  Remember the number 1 is different than the character "1" ▫-u  Remove repeated lines
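The difference between ASCII and numeric ordering shows up as soon as the numbers have different lengths. This sketch assumes a made-up colon-delimited file; `LC_ALL=C` is set only to make the ordering predictable across systems.

```shell
tmp=$(mktemp -d)
f="$tmp/people"
printf 'carol:30\nalice:9\nbob:100\nalice:9\n' > "$f"

# Sort on field 1 only
by_name=$(LC_ALL=C sort -t: -k1,1 "$f")

# Sort on field 2 as text: "100" sorts before "30", which sorts before "9"
by_ascii=$(LC_ALL=C sort -t: -k2,2 "$f")

# Sort on field 2 as a number
by_num=$(LC_ALL=C sort -t: -k2,2n "$f")

# Sort whole lines and drop repeats
no_repeats=$(LC_ALL=C sort -u "$f")

echo "$by_ascii"
echo "$by_num"
rm -rf "$tmp"
```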

15 uniq: Locating identical lines Returns one copy of each line, collapsing repeated lines ▫Only adjacent repeats are detected, so the input is usually sorted first ▫Options:  -u  Return only the non-repeated lines  -d  Return only the repeated lines ▫But only one copy of each  -c  Return the count of how many times each line is repeated
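A sketch of the uniq options on made-up input, sorted first so that repeated lines are adjacent. The width of the count column printed by `-c` varies between implementations, so the leading spaces are stripped with sed here.

```shell
sorted=$(printf 'b\na\nb\nc\na\nb\n' | sort)

one_each=$(printf '%s\n' "$sorted" | uniq)        # one copy of every line
non_repeated=$(printf '%s\n' "$sorted" | uniq -u) # lines that occur once
repeated=$(printf '%s\n' "$sorted" | uniq -d)     # one copy of each repeated line
counts=$(printf '%s\n' "$sorted" | uniq -c | sed 's/^ *//')

echo "$one_each"
echo "$counts"
```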

16 tr: Translate characters Changes one set of characters to another, default input is the standard input Example: ▫ #tr 'ab' 'cd'  std in: This is abnormal  std out: This is cdnormcl  std in: absolute  std out: cdsolute  std in: ab a b c  std out: cd c d c  ^C ▫Note: a -> c and b -> d, not ab -> cd ▫Note: ^D can be used to denote end of file to tr instead of the shown ^C, which stops the tr process

17 tr: Translate characters More examples: ▫Can be used to translate case for a file  tr a-z A-Z <file1 or tr '[a-z]' '[A-Z]' <file1  Takes the input from file1 with the < redirection  Turns all lower case letters to upper case  Output goes to std out ▫Get rid of characters  tr -d 'a-z' <file1  Gets rid of all lower case chars from file1  Again output is std out ▫Compressing repeated chars  tr -s ' ' <file1  Changes repeated spaces to a single space
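The tr examples can be checked non-interactively by piping text in with printf instead of typing at the terminal. The input strings and variable names below are made up.

```shell
# Character-for-character mapping: a -> c and b -> d
mapped=$(printf 'This is abnormal\n' | tr 'ab' 'cd')

# Lower case to upper case
upper=$(printf 'hello world\n' | tr 'a-z' 'A-Z')

# Delete all lower case characters
deleted=$(printf 'abc123def\n' | tr -d 'a-z')

# Squeeze runs of spaces down to a single space
squeezed=$(printf 'a   b    c\n' | tr -s ' ')

echo "$mapped"
echo "$upper"
echo "$deleted"
echo "$squeezed"
```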

21 Regular Expression A pattern to match strings of text which is: ▫Concise ▫Flexible Used by many programming languages and operating systems

22 Regular Expressions BRE ▫Basic Regular Expression ERE ▫Extended Regular Expression IRE ▫Interval Regular Expression TRE ▫Tagged Regular Expression

23 Character class Set of characters enclosed within square brackets [ ] ▫Can be a list of single characters  [aD1]  a, D, and the character 1 only ▫Can be a range of characters  [a-zA-Z]  All the upper and lower case chars ▫Negate a class  [^0-9]  Not the numeric chars 0-9

24 Regular Expressions * ▫Refers to the immediately preceding character ▫Any number of repeated character(s)  0 or more  Used with other patterns  A* ▫Anything that matches 0 or more ‘A’s in a row ▫ s*print will match sprint, ssprint, sssprint and print ! Note: this is not related to the familiar shell wildcard *

25 Regular Expressions . ▫Any character  Exactly one ▫ S... will match Sort, Sxxx, S123, …  Any four char string starting with S  As a whole-string match, does not match Sabcd (5 characters) ▫Note: .* means 0 or more of any character Pattern anchors ▫^  Pattern starts at the beginning of a line ▫$  Pattern ends at the end of a line

26 Extended Regular Expressions | ▫Either one of a set ▫ a|b  Matches if an a or a b  Must be one of them  Not written inside [ ], where | is an ordinary character ( and ) ▫Group the chars between the parentheses so that | applies only inside the group ▫ ‘animaltype:(dog|cat)’  look for animaltype:dog or animaltype:cat  ( ) is used to group patterns
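The regular expression slides can be tried with grep on a made-up word list. The patterns are anchored with ^ and $ so that each one must match the whole line; `-c` prints the number of matching lines and `-E` switches grep to extended regular expressions.

```shell
tmp=$(mktemp -d)
w="$tmp/words"
printf 'print\nsprint\nssprint\nSort\nS123\nSabcd\n' > "$w"
printf 'animaltype:dog\nanimaltype:cat\nanimaltype:cow\n' >> "$w"

# s* is zero or more s: matches print, sprint, ssprint
star=$(grep -c '^s*print$' "$w")

# S followed by exactly three characters: Sort and S123, not Sabcd
dot=$(grep -c '^S...$' "$w")

# Alternation inside a group needs an extended regular expression
alt=$(grep -E -c 'animaltype:(dog|cat)' "$w")

# Negated class: lines containing at least one non-digit
nondigit=$(printf 'abc\n123\n' | grep '[^0-9]')

echo "$star $dot $alt $nondigit"
rm -rf "$tmp"
```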

29 grep – Search a pattern Searches for a pattern in a file ▫ grep options pattern filename(s)  std in is used if there is no filename  Can also pipe data to grep ▫Notes:  Pattern does not need to be quoted if it has no delimiters or special chars in it  Can always use quotes to be safe

30 grep - Options -i ▫Ignore case -v ▫Display only the lines that do not match the expression ▫Typically want to check the return code -l ▫Display filenames only  Useful when grepping multiple files -e ▫Marks the next argument as the expression  Useful when the expression begins with - -x ▫Match entire line -f file ▫Takes expression from a file ▫Great if you have a messy or complex regular expression
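A sketch of these options on made-up files (this is not the bigfile3 used in the slides). The last lines show the return code being checked in an `if`, using `-q` to suppress output; `-q` is not listed on the slide and is included as a common companion to return-code checks.

```shell
tmp=$(mktemp -d)
printf 'file 1 text\nfile 3 text\nFile 1 TEXT\n-n flag\n' > "$tmp/sample.txt"
printf 'nothing here\n' > "$tmp/other.txt"
printf 'file 3\n' > "$tmp/patterns"

ci=$(grep -i -c 'text' "$tmp/sample.txt")        # ignore case, count lines
inverted=$(grep -v 'text' "$tmp/sample.txt")     # lines NOT matching
dash=$(grep -e '-n' "$tmp/sample.txt")           # expression starting with -
whole=$(grep -x 'file 1 text' "$tmp/sample.txt") # entire line must match
names=$(cd "$tmp" && grep -l 'flag' sample.txt other.txt)  # filenames only
from_file=$(grep -f "$tmp/patterns" "$tmp/sample.txt")     # expression from a file

# Return code: 0 if something matched, 1 if nothing matched
if grep -q 'missing' "$tmp/sample.txt"; then found=yes; else found=no; fi

echo "$ci"; echo "$inverted"; echo "$names"; echo "$found"
rm -rf "$tmp"
```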

31 grep - examples Examples: ▫ #grep 3 bigfile3 file 3 text ▫ #grep file bigfile3 file 1 text file 1 text file 3 text file 1 text file 1 test file 1 test Try grep for text and test also… #cat bigfile3 file 1 text file 3 text file 1 text file 1 test

33 sed – Stream Editor Edit a file(s) with a specified action ▫ sed options 'address action' file(s) Basics: ▫Take input from the file(s) ▫Performs the action on the lines read from the file(s) ▫Sends output to std out Uses: ▫Select part(s) of a file  By line  By content ▫Edit a file  e.g. create a template, then use sed to customize for a run Oddities ▫Usually need -n to get rid of unwanted duplicated or original lines

34 sed – Line addressing Select specific lines ▫ #sed '3q' tenline.file Line 1 Line 2 Line 3  Selects the first 3 lines then quits ▫ #sed -n '$p' tenline.file Last Line  Prints last line  $ - last line  p – print  Show with and without the -n option ▫ #sed -n '5,7p' tenline.file Line 5 Line 6 Line 7  Prints lines 5 through 7 #cat tenline.file Line 1 Line 2 Line 3 Line 4 Line 5 Line 6 Line 7 Line 8 Line 9 Last Line

35 sed – Line addressing Select specific lines with ; ▫ #sed -n '1p;3p;$p' tenline.file Line 1 Line 3 Last Line  Prints line 1, 3 and the last line ($) ! Will negate operations ▫ #sed -n '3,$!p' tenline.file Line 1 Line 2  Does not print line 3 through the end Notes: ▫By default sed will echo the input lines as well as the selected lines  Without -n you get duplicated lines  Use -n to not echo the input lines

36 sed – Context addressing Use a pattern to identify lines to work with ▫Use / to delimit the pattern Examples ▫ #sed -n '/2/p' tenline.file Line 2  Find all lines with 2 in them and print ▫ #sed -n '/^2/p' tenline.file  Finds all lines that start with 2 and prints them (none in tenline.file)  ^ - start of the line #cat tenline.file Line 1 Line 2 Line 3 Line 4 Line 5 Line 6 Line 7 Line 8 Line 9 Last Line
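The addressing examples can be reproduced by rebuilding tenline.file with the contents shown in the slides. This is a sketch under that assumption; the temporary directory and variable names are made up. The last command counts the output lines without -n to show the duplicated line.

```shell
tmp=$(mktemp -d)
f="$tmp/tenline.file"
i=1
while [ "$i" -le 9 ]; do
  echo "Line $i"
  i=$((i + 1))
done > "$f"
echo "Last Line" >> "$f"

first3=$(sed '3q' "$f")              # quit after line 3
last=$(sed -n '$p' "$f")             # last line only
range=$(sed -n '5,7p' "$f")          # lines 5 through 7
picked=$(sed -n '1p;3p;$p' "$f")     # lines 1, 3 and the last
negated=$(sed -n '3,$!p' "$f")       # everything except line 3 to the end
context=$(sed -n '/2/p' "$f")        # lines containing 2

# Without -n the last line is echoed and printed: 11 lines from a 10-line file
dup_count=$(sed '$p' "$f" | wc -l | tr -d ' ')

echo "$picked"
echo "$dup_count"
rm -rf "$tmp"
```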

37 sed – Writing selected lines to a file Can use w to write the selected lines to a file Example ▫ sed -n '/2/w twos.file' tenline.file  w instead of p puts the output to a file  -n keeps the input lines from being echoed to std out

38 sed – Text editing Can edit the stream ▫i  Insert ▫a  Append ▫c  Change ▫d  Delete ▫s  Substitute

39 sed - editing Example: inserting ▫ #sed '1i\ >#!/bin/bash\ ># using the bash shell >' test.sh > $$  Notes:  1i inserts text before line 1  \ is a continuation character within the quotes  > at the start of a line is the shell's continuation prompt, not typed input  Input is the code or text in test.sh  Redirecting the output to a file named $$ (the shell's process ID, used as a temporary file name)  Ends up with the 2 new lines at the beginning in $$  Can further modify $$

40 sed - editing Use s to indicate substitution Example: substituting ▫ sed 's/a/b/' file  replaces a with b for the first instance on each line ▫ sed 's/a/b/g' file  g (global) replaces a with b for all instances on each line
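The effect of the g flag is easiest to see on a line with several matches. This sketch uses made-up input; the last command illustrates the template idea from the sed overview slide with a hypothetical NAME placeholder.

```shell
# Without g: only the first a on the line is replaced
first_only=$(printf 'banana\n' | sed 's/a/b/')

# With g: every a on the line is replaced
all=$(printf 'banana\n' | sed 's/a/b/g')

# Customizing a template line (NAME is a made-up placeholder)
custom=$(printf 'Hello NAME, welcome\n' | sed 's/NAME/student/')

echo "$first_only"
echo "$all"
echo "$custom"
```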

