CSE 303 Concepts and Tools for Software Development Richard C. Davis UW CSE – 10/9/2006 Lecture 6 – String Processing
Administrivia
Assignment 2 is due in one week
Office hours
  – Mine are after class today
  – I’ll have extra office hours later this week
  – TA office hours will be posted by next lecture
Follow the turn-in instructions!
  – username-hw2
  – Plain text in readme.txt

Tip of the Day
Replacing Strings in Emacs
  – M-x query-replace
      Enter the string to search for
      Enter the replacement string
      Enter “y” or “n” for each match found
  – M-x replace-string
      Same, but does not prompt for each match

Last Time
Powerful Unix Tools
  – grep
  – find
  – diff
  – wc

Today
String Processing
  – sed : single lines
  – More complex
      awk : more complex processing
      perl, python, ruby : general programming
Shell Wrap-Up

Automating File Editing
We’ve learned to automate simple tasks
  – Move around files
  – Start/stop processes
  – Change user environment/permissions
But what about…
  – Changing strings
  – Repetitive edits to multiple files
sed can help (used in HW2)

Sed : A Stream EDitor
sed
  – Non-interactive editor
  – Performs editing actions
  – Actions defined in a “script”
  – Stream-oriented
      Input from file or stdin
      Script processes each line
      Output goes to stdout

How Sed Works
Each line is copied to the “pattern space”
All editing commands are applied
  – To the data in the pattern space
  – In sequence
The original input file does not change
Edits can be restricted to a subset of lines

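A minimal sketch of the pattern-space model described above. The file name pets.txt and its contents are made up for illustration; the point is only that sed writes edited lines to stdout and leaves the input file alone.

```shell
# Hypothetical sample file (name and contents are assumptions for this demo).
tmp=$(mktemp -d)
printf 'one cat\ntwo cats\n' > "$tmp/pets.txt"

# Each line is read into the pattern space, edited, and written to stdout.
edited=$(sed 's/cat/dog/' "$tmp/pets.txt")
echo "$edited"

# The file on disk still holds the original text.
original=$(cat "$tmp/pets.txt")
echo "$original"

rm -rf "$tmp"
```

To keep the edited text, redirect stdout to a new file rather than to the input file itself.
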
Command-Line Syntax
Method 1: One-line syntax
  – sed [options] 'command' file(s)
  – sed -e 'cmd1' -e 'cmd2' file(s)
Method 2: Script file holds commands
  – sed [options] -f script file(s)

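The two invocation methods can be compared side by side. This is a sketch only: in.txt and edit.sed are hypothetical file names, and the two substitutions are arbitrary.

```shell
# Hypothetical input file for the demo.
tmp=$(mktemp -d)
printf 'hello world\n' > "$tmp/in.txt"

# Method 1: two commands given with -e, applied in order to each line.
one_line=$(sed -e 's/hello/goodbye/' -e 's/world/moon/' "$tmp/in.txt")

# Method 2: the same commands stored one per line in a script file.
printf 's/hello/goodbye/\ns/world/moon/\n' > "$tmp/edit.sed"
from_file=$(sed -f "$tmp/edit.sed" "$tmp/in.txt")

echo "$one_line"    # goodbye moon
echo "$from_file"   # goodbye moon

rm -rf "$tmp"
```
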
Search and Replace with Sed
Most common use
  – sed 's/pattern/replacement/g' file
  – Means “replace every (longest) substring that matches pattern with replacement”
Common variations
  – Omit g at end: replace only the first match on each line
  – Put num at end: replace only the num-th match on each line
  – sed -n : suppress normal output
  – Put p at end: print lines where a substitution was made
  – sed -r : use “extended” regular expressions (-r is a GNU sed option; -E is the more portable spelling)

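The flag variations are easiest to see on a tiny input. The two-line input below is invented for the demo; the flags themselves are standard sed.

```shell
# Invented two-line input: the first line has three a's, the second has none.
input=$(printf 'a-a-a\nb-b\n')

# g: replace every match on each line.
all=$(printf '%s\n' "$input" | sed 's/a/X/g')

# No flag: replace only the first match on each line.
first=$(printf '%s\n' "$input" | sed 's/a/X/')

# Numeric flag: replace only the 2nd match on each line.
second=$(printf '%s\n' "$input" | sed 's/a/X/2')

# -n with p: print only the lines where a substitution happened.
only_changed=$(printf '%s\n' "$input" | sed -n 's/a/X/2p')

echo "$all"; echo "$first"; echo "$second"; echo "$only_changed"
```
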
Using the Pattern Space
Can replace with all or part of a match
Special characters in replacement
  – & : the entire matched string
  – \1 : the string that matched the 1st set of parentheses
  – \2 : the string that matched the 2nd set of parentheses
  – …

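A short sketch of & and the numbered groups, using an invented one-line input. In sed's basic regular expressions, groups are written with \( and \).

```shell
# Invented input line for the demo.
line='Linux 2.6'

# & stands for the text that the pattern matched.
bracketed=$(printf '%s\n' "$line" | sed 's/Linux/[&]/')

# \1 and \2 refer to the first and second \( ... \) groups.
swapped=$(printf '%s\n' "$line" | sed 's/\(Linux\) \(.*\)/\2 \1/')

echo "$bracketed"   # [Linux] 2.6
echo "$swapped"     # 2.6 Linux
```
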
Examples
Not so useful
  – sed 's/a/b/g' ex1.txt
  – sed 's/a/b/' ex1.txt
  – sed 's/a/b/2' ex1.txt
  – sed -n 's/a/b/2p' ex1.txt
More useful
  – sed 's/.*Linux \(.*\).*/\1:/' ex2.txt
  – sed 's/.*Linux.*/&:/' ex2.txt
Newline note
  – The \n is not in the text matched against and is (re-)added when printed

Sed Command Details
General syntax of sed commands
  – [address[,address]][!]command[args]
Address specifies the lines to look at
  – Address types
      Line with a particular number, e.g.: 3
      Lines matching a pattern, e.g.: /SAVE/
  – Using two addresses specifies a range of lines
  – Using ! means “use lines not specified in address”
Other commands
  – d : delete lines

More Sed Examples
Delete lines 3-5
  – sed '3,5 d' ex3.c
Delete lines that don’t contain SAVE
  – sed '/SAVE/! d' ex3.c
Delete lines that start with //
  – sed '/^\/\// d' ex3.c
Delete lines from /* through */ (inclusive)
  – sed '/\/\*/, /\*\// d' ex3.c

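The address forms can be tried on a small file. The contents of ex3.c below are an assumption made up for this sketch; the real ex3.c from class is not reproduced here. The ^ anchor limits the // pattern to lines that begin with it.

```shell
# Made-up stand-in for ex3.c (six lines).
tmp=$(mktemp -d)
cat > "$tmp/ex3.c" <<'EOF'
// header comment
int x; // SAVE
/* start
middle
end */
int y;
EOF

# Numeric range: delete lines 3 through 5.
by_number=$(sed '3,5d' "$tmp/ex3.c")

# Negated pattern address: delete every line that does not contain SAVE.
save_only=$(sed '/SAVE/!d' "$tmp/ex3.c")

# Anchored pattern: delete only lines that begin with //.
no_line_comments=$(sed '/^\/\//d' "$tmp/ex3.c")

# Pattern range: delete from the line with /* through the line with */.
no_block_comment=$(sed '/\/\*/,/\*\//d' "$tmp/ex3.c")

echo "$save_only"   # int x; // SAVE
rm -rf "$tmp"
```
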
Advanced Sed Features
Commands so far: substitute, print, delete
Other commands (not used in class)
  – Append, replace with block, insert, translate
  – Branch to label
  – Multi-line patterns
  – The hold space for fancy editing
      E.g., copy and paste of lines
Need these? Use a more powerful language

Awk
Processes text files
  – File contains records
      Separated by newlines (default)
  – Records contain fields
      Separated by whitespace (default)
Why use awk?
  – Generate reports from logs
  – Process results of an experiment
(Named after its authors: Aho, Weinberger, and Kernighan)

Running Awk
One-line syntax
  – awk [options] 'script' file(s)
Script file
  – awk [options] -f scriptFile file(s)

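Both ways of running awk, as a sketch. The files scores.txt and names.awk are hypothetical names chosen for this example.

```shell
# Hypothetical data file: one record per line, two whitespace-separated fields.
tmp=$(mktemp -d)
printf 'alice 90\nbob 75\n' > "$tmp/scores.txt"

# One-line syntax: the script is given directly on the command line.
inline=$(awk '{ print $1 }' "$tmp/scores.txt")

# Script file: the same script is stored in a file and loaded with -f.
printf '{ print $1 }\n' > "$tmp/names.awk"
from_file=$(awk -f "$tmp/names.awk" "$tmp/scores.txt")

echo "$inline"
echo "$from_file"
rm -rf "$tmp"
```
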
Awk Functionality
Script structure
  – pattern { procedure }
Records processed one at a time
  – Pattern restricts to matching records
Fields accessed with $1, … $n
BEGIN and END patterns
  – For procedures before/after processing file

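A small report that uses every piece of the script structure: a BEGIN block, a pattern that selects records, field access, and an END block. The score data is invented for illustration.

```shell
# Invented data: name in field 1, score in field 2.
tmp=$(mktemp -d)
printf 'alice 90\nbob 75\ncarol 82\n' > "$tmp/scores.txt"

report=$(awk '
  BEGIN    { print "High scores:" }                 # runs before any input
  $2 >= 80 { print $1; total += $2; n++ }           # runs for matching records
  END      { print "average", total / n }           # runs after all input
' "$tmp/scores.txt")

echo "$report"
rm -rf "$tmp"
```

Here the pattern `$2 >= 80` skips bob's record, so the average covers only alice and carol.
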
Advanced Awk Features
awk is a very powerful language
  – Looping constructs
  – Arrays
  – Functions
  – Fancy printing
  – Powerful math functions
Need these? Use Perl, Python, or Ruby

More Powerful Script Languages
Perl, Python, and Ruby
Interpreted
Write scripts like bash
  – Prefix script with #!
  – Make executable with chmod
Pre-compiled to an internal form before running (fast!)

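The #! and chmod steps work the same way for any interpreter. The sketch below uses /bin/sh only because it is available everywhere; for Perl, Python, or Ruby the first line would name that interpreter instead, and its path is system-dependent. The script name hello is hypothetical.

```shell
# Write a tiny script whose first line names its interpreter.
tmp=$(mktemp -d)
cat > "$tmp/hello" <<'EOF'
#!/bin/sh
echo "hello from a script"
EOF

# Mark it executable, then run it directly by path.
chmod +x "$tmp/hello"
greeting=$("$tmp/hello")

echo "$greeting"
rm -rf "$tmp"
```
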
Perl
Practical Extraction and Report Language
  – Or “Pathologically Eclectic Rubbish Lister”
Language properties
  – Excellent pattern matching
  – “Kitchen Sink” syntax
  – No objects in original version

Python
Fully object-oriented
Simpler syntax
Allows different styles
  – Procedural
  – Functional

Ruby
Fully object-oriented
Syntax more similar to Smalltalk
Many different ways to do the same things
  – Harder to debug

Summary
String Processing
  – sed : quick mods to single lines
  – awk : more complex record processing
  – perl, python, ruby : learn one
That’s all for the shell!
Note: We don’t require you to know how to use any scripting tools other than sed in this class, but we do require you to know when you should consider learning to use one of these tools.

Next Time
Introduction to C!