Time to talk about your class projects!

Shell Scripting Awk (lecture 2)

Basic structure of AWK use

The essential organization of an AWK program follows the form:

    pattern { action }

The pattern specifies when the action is performed.

Like most UNIX utilities, AWK is line oriented. That is, the pattern specifies a test that is performed with each line read as input. If the condition is true, then the action is taken. The default pattern is something that matches every line. This is the blank or null pattern.
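
As a small illustration of the pattern { action } form, the sketch below runs two one-line programs over the same input. The three input lines and the variable names (input, matched, all) are made up for this example; the first program has a regular-expression pattern, the second has no pattern and so acts on every line.

```shell
# Hypothetical three-line input (not from the lecture).
input='ok: started
err: disk full
ok: done'

# With a pattern: the action runs only on lines matching /err/.
matched=$(printf '%s\n' "$input" | awk '/err/ { print "match:", $0 }')

# With the null pattern: the action runs on every line.
all=$(printf '%s\n' "$input" | awk '{ print NR ": " $0 }')

printf '%s\n' "$matched"
printf '%s\n' "$all"
```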

Program syntax

BEGIN { } : the begin block contains all modifications to built-in variables and anything you want done before awk procedures are implemented
{ } : list of procedures carried out on all lines
END { } : the end block contains all final calculations or print summaries

As you might expect, BEGIN and END specify actions to be taken before any lines are read and after the last line is read. The AWK program:

    BEGIN { print "START" }
          { print }
    END   { print "STOP" }

adds one line before and one line after the input file.
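
To see the effect, the START/STOP program can be fed a short input from the shell. The two input lines and the variable name out are placeholders chosen for this sketch.

```shell
# Feed two hypothetical lines through the START/STOP program.
out=$(printf 'line one\nline two\n' |
    awk 'BEGIN { print "START" } { print } END { print "STOP" }')

# START and STOP bracket the unchanged input lines.
printf '%s\n' "$out"
```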

Example:

    #!/usr/bin/nawk -f
    BEGIN {
        FS=":"    #the -F of the command line becomes FS in a script
    }
    { print $1 }
    END { print "Finished working on this file" }

    % chmod 755 example.awk
    % ./example.awk /etc/passwd | tail
    noaccess
    nobody4
    Finished working on this file
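
The relationship between -F and FS can be checked with a single record. This is only a sketch: the passwd-style line below is an invented sample, and the variable names (line, from_flag, from_begin) are not part of the lecture.

```shell
# Hypothetical passwd-style record with colon-separated fields.
line='root:x:0:0:root:/root:/bin/sh'

# Field separator given on the command line ...
from_flag=$(printf '%s\n' "$line" | awk -F: '{ print $1 }')

# ... and the same separator set through FS in a BEGIN block.
from_begin=$(printf '%s\n' "$line" | awk 'BEGIN { FS = ":" } { print $1 }')

printf '%s %s\n' "$from_flag" "$from_begin"
```

Both forms pick out the same first field, which is why a command-line -F turns into an FS assignment when the program moves into a script file.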

Input file:

    Jimmy the Weasel
    100 Pleasant Drive
    San Francisco, CA

    Big Tony
    200 Incognito Ave.
    Suburbia, WA

    Cousin Vinnie
    Vinnie's Auto Shop
    300 City Alley
    Sosueme, OR

Awk script:

    #!/usr/bin/awk -f
    BEGIN {
        FS="\n"
        RS=""
        ORS=""
    }
    {
        x=1
        while ( x<NF ) {
            print $x "\t"
            x++
        }
        print $NF "\n"
    }
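
A minimal sketch of what the address script does, run inline from the shell on the first two records only. Setting RS to the empty string makes each blank-line-separated block one record, and FS="\n" makes each line of the block one field. The variable names input and out are assumptions of this example.

```shell
# Two blank-line-separated records taken from the slide's input file.
input='Jimmy the Weasel
100 Pleasant Drive
San Francisco, CA

Big Tony
200 Incognito Ave.
Suburbia, WA'

# Join the lines of each record with tabs; ORS="" stops print from adding
# a newline after every field, so the record ends with an explicit "\n".
out=$(printf '%s\n' "$input" | awk 'BEGIN { FS = "\n"; RS = ""; ORS = "" }
{
    x = 1
    while (x < NF) {
        print $x "\t"
        x++
    }
    print $NF "\n"
}')

printf '%s\n' "$out"
```

Each address comes out as a single tab-separated line, one line per record.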

Looping Constructs

awk loop syntax is very similar to C and Perl.

while: continues to loop as long as the condition is true

    while ( x==y ) {
        commands
    }

do/while: do the following set of commands, while the condition is true

    do {
        commands
    } while ( x==y )

The difference between while and do/while is when the condition is tested. It is tested prior to running the commands for a while loop, but tested after the set of commands is run once in a do/while loop.
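
The difference in when the condition is tested shows up when the condition is false from the start. In this sketch (the value 5 and the messages are arbitrary choices), the while body never runs, but the do/while body runs once.

```shell
out=$(awk 'BEGIN {
    x = 5
    # Condition is tested first and is already false: body is skipped.
    while (x < 5) { print "while body ran"; x++ }

    x = 5
    # Body runs once, then the condition is tested and the loop stops.
    do { print "do/while body ran"; x++ } while (x < 5)
}')

printf '%s\n' "$out"
```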

for loops

One of the most common loop structures is the for loop, which steps a counter through a range, typically to visit every field of a line or every element of an array.

    for ( x=1; x<=NF; x++ ) {    #in awk, field and array numbering starts at 1
        commands
    }

* if you take anything away from this lecture, memorize the above for loop syntax
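
As one possible use of this for loop, the sketch below adds up the fields of each input line. The two input lines and the names sum and out are invented for the example.

```shell
# Hypothetical numeric input: one line with three fields, one with two.
out=$(printf '3 4 5\n10 20\n' | awk '{
    sum = 0
    for (x = 1; x <= NF; x++) sum += $x    # visit fields $1 .. $NF
    print NF " fields, sum " sum
}')

printf '%s\n' "$out"
```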

break and continue

break: breaks out of a loop
continue: skips the rest of the loop body and starts the next iteration

    x=1
    while (1) {
        if ( x == 4 ) {
            x++
            continue
        }
        print "iteration",x
        if ( x > 20 ) {
            break
        }
        x++
    }
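
The break/continue loop can be run as-is inside a BEGIN block, since it needs no input. The wrapper below is a sketch; only the BEGIN wrapper and the variable name out are additions.

```shell
out=$(awk 'BEGIN {
    x = 1
    while (1) {
        if (x == 4) { x++; continue }   # 4 is skipped: nothing is printed for it
        print "iteration", x
        if (x > 20) { break }           # the loop ends right after printing 21
        x++
    }
}')

printf '%s\n' "$out"
```

The output runs from "iteration 1" to "iteration 21" with no line for 4, because the break test comes after the print.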

if/else/else if

if statements work much like they did in bash, but the syntax is a bit different (no then or fi)

    if ( conditional1 ) {
        commands
    } else if ( conditional2 ) {    #optional
        commands
    } else {                        #optional
        commands
    }

You can have an if statement without an else if or else, but you can't have an else if or else without an if.
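
A short sketch of the if / else if / else chain, classifying the first field of each line. The three input numbers are arbitrary.

```shell
# Hypothetical input: one number per line.
out=$(printf '5\n-2\n0\n' | awk '{
    if ($1 > 0) {
        print $1, "is positive"
    } else if ($1 < 0) {
        print $1, "is negative"
    } else {
        print $1, "is zero"
    }
}')

printf '%s\n' "$out"
```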

Arrays

- by convention, numeric array indices in awk start at 1 (in most programming languages, with exceptions such as Fortran and MATLAB, arrays start at 0)
- mis-indexing arrays is one of the most common bugs in any code
- arrays are commonly indexed by numbers, but in awk they can also be indexed by strings

to explicitly set an array element, use brackets to specify which index of the array you are setting

    myarray[1]="jim"    #note, strings appear in quotes
    myarray[2]=456

or

    myarray["name"]="jim"    #index strings appear in quotes too

to reference an array element, use brackets to specify what index you want

    for ( x in myarray ) {
        print myarray[x]
    }

#x is set to each index of the array in turn by the in operator, but the order in which the indices are visited is not defined
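
A minimal sketch of string-indexed arrays and the for ( x in myarray ) loop. Because the traversal order is not defined, the output is piped through sort here so that it is repeatable; the "id" index is an invented second entry.

```shell
out=$(awk 'BEGIN {
    myarray["name"] = "jim"
    myarray["id"] = 456          # hypothetical second element
    for (x in myarray) {
        print x "=" myarray[x]   # x holds the index, myarray[x] the value
    }
}' | sort)

printf '%s\n' "$out"
```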

to delete an array element, use the delete command

    delete myarray[1]

to test if an element exists, use an if statement with the in operator

    if ( 1 in myarray ) {
        print "It's there"
    } else {
        print "It's missing"
    }
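
The membership test and delete can be combined in one short run. This sketch checks the same index before and after deletion; the messages are placeholders. Using in for the test matters because merely referencing myarray[1] in a comparison would create the element.

```shell
out=$(awk 'BEGIN {
    myarray[1] = "jim"

    if (1 in myarray) { print "before delete: there" }
    else              { print "before delete: missing" }

    delete myarray[1]

    if (1 in myarray) { print "after delete: there" }
    else              { print "after delete: missing" }
}')

printf '%s\n' "$out"
```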

you can also set arrays using the split command

    split("string",destination array,separator)

split returns the number of indices

    numelements=split("Jan,Feb,Mar,Apr,May",mymonths,",")

so that numelements=5 and mymonths[1]="Jan"
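
The split call from the slide can be checked directly. This sketch prints the returned count together with the first and last elements.

```shell
out=$(awk 'BEGIN {
    numelements = split("Jan,Feb,Mar,Apr,May", mymonths, ",")
    # Elements are stored at indices 1 .. numelements.
    print numelements, mymonths[1], mymonths[numelements]
}')

printf '%s\n' "$out"
```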

Formatted output

printf : the formatted print function, which uses the standard C syntax

    %s specifies strings
    %d specifies integers
    %f specifies floating point values

    printf("%s %s version %d\n", "Hello", "world", 2)

prints

    Hello world version 2

you can control how many spaces are reserved for the formatted print (%) by adding numbers

    %10s    - reserves 10 characters for the string
    %5d     - reserves 5 spaces for the integer
    %10.2f  - reserves 10 spaces for the float and prints only to the 100ths place

With %10.2f, the value 9.05 prints as "      9.05".

The default format is right justified. To make formatted text left justified, add a - after the %

With %-10.2f, the value 9.05 prints as "9.05      ".
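
Padding is hard to see on a slide, so this sketch wraps each formatted value in square brackets to make the reserved width visible. The brackets and the sample values 42 and "awk" are additions for illustration only.

```shell
out=$(awk 'BEGIN {
    printf("[%10.2f]\n", 9.05)    # right justified in 10 columns
    printf("[%-10.2f]\n", 9.05)   # left justified in 10 columns
    printf("[%5d]\n", 42)         # integer in 5 columns
    printf("[%10s]\n", "awk")     # string in 10 columns
}')

printf '%s\n' "$out"
```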

sprintf

sends the formatted print to a string variable rather than to stdout

    n=sprintf("%d plus %d is %d", a, b, a+b)
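
A small sketch showing that sprintf produces no output by itself: the formatted text is stored in n and only appears when n is printed. The values of a and b and the surrounding brackets are arbitrary choices for the example.

```shell
out=$(awk 'BEGIN {
    a = 2; b = 3
    n = sprintf("%d plus %d is %d", a, b, a + b)   # nothing is printed here
    print "[" n "]"                                # the string is printed here
}')

printf '%s\n' "$out"
```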

Sub-strings

substr : allows you to cut specific characters from strings. A similar function is available in Perl.

    substr(string,startcharacter,numberofcharacters)

    oldstring="How are you?"
    newstr=substr(oldstring,9,3)

What is newstr in this example?

Other string functions

length : returns the number of characters in a string

    length(oldstring)    returns 12

index : returns the start character of one string in another

    index(oldstring,"you")    returns 9

tolower/toupper : converts a string to all lower or to all upper case
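
The string functions covered so far can be tried together on the lecture's sample string. This is a sketch; the search string "zzz" is an invented example of a substring that is not present.

```shell
out=$(awk 'BEGIN {
    oldstring = "How are you?"
    print substr(oldstring, 9, 3)    # 3 characters starting at character 9
    print length(oldstring)          # number of characters
    print index(oldstring, "you")    # position of the substring
    print index(oldstring, "zzz")    # 0 when the substring is absent
    print toupper(oldstring)
}')

printf '%s\n' "$out"
```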

subroutines (aka functions)

Format: "function", then the name, and then the parameters separated by commas, inside parentheses. The "{ }" code block contains the code that you'd like this function to execute.

    function monthdigit(mymonth) {
        return (index(months,mymonth)+3)/4
    }

nawk provides a "return" statement that allows the function to return a value.

    function monthdigit(mymonth) {
        return (index(months,mymonth)+3)/4
    }

This function converts a month name in a 3-letter string format into its numeric equivalent. For example, this:

    print monthdigit("Mar")

will print this:

    3

What does this do?

    index(months,mymonth)

The built-in string function index returns the starting position of the occurrence of a substring (the second parameter) in another string (the first parameter), or 0 if the substring isn't found.

    months="Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec"
    print index(months,"Aug")

prints

    29

To get the number associated with the month (based on the string with the 12 months), add 3 to the index (29+3=32) and divide by 4 (32/4=8, Aug is the 8th month). The string months was designed so the calculation gave the month number.
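
The month arithmetic can be verified by calling the function for a few names. In this sketch, "Foo" is a made-up name that is not in the months string; index then returns 0 and the function yields 0.75, so a caller may want to check for a whole-number result.

```shell
out=$(awk '
function monthdigit(mymonth) {
    return (index(months, mymonth) + 3) / 4
}
BEGIN {
    months = "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec"
    print index(months, "Aug"), monthdigit("Aug"), monthdigit("Mar"), monthdigit("Dec")
    print monthdigit("Foo")    # hypothetical unknown month: (0+3)/4
}')

printf '%s\n' "$out"
```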

Matching Regular Expressions

match : searches for a regular expression and sets the built-in variables RSTART to the start character and RLENGTH to the matched string length. match also returns the start character (0 if there is no match).

    start=match(oldstring,/you/)    #note, regexp format
    print start, RSTART, RLENGTH

prints

    9 9 3
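
A short sketch of match on the lecture's sample string, including the failing case. The pattern /xyz/ is an invented example of a regular expression that does not match; awk then sets RSTART to 0 and RLENGTH to -1.

```shell
out=$(awk 'BEGIN {
    oldstring = "How are you?"

    start = match(oldstring, /you/)
    print start, RSTART, RLENGTH    # commas put spaces between the values

    match(oldstring, /xyz/)         # no match
    print RSTART, RLENGTH
}')

printf '%s\n' "$out"
```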

String substitution

sub and gsub : serve as single search and replace or global search and replace functions that work with regular expressions

    sub(regexp,replacestring,oldstring)

    oldstring="How are you doing today?"
    sub(/o/,"O",oldstring)     #this changes the given string
    print oldstring
    gsub(/o/,"O",oldstring)
    print oldstring

prints

    HOw are you doing today?
    HOw are yOu dOing tOday?
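
Both functions also return the number of replacements made, which this sketch prints in front of the modified string. The string is reset before the gsub call so that each count refers to the original text; the variable n is an addition for the example.

```shell
out=$(awk 'BEGIN {
    oldstring = "How are you doing today?"
    n = sub(/o/, "O", oldstring)      # replaces the first match only
    print n, oldstring

    oldstring = "How are you doing today?"
    n = gsub(/o/, "O", oldstring)     # replaces every match
    print n, oldstring
}')

printf '%s\n' "$out"
```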

Input file:

    23 Aug 2000    food    -       -    Y    Jimmy's Buffet
    Aug 2000       -       inco    -    Y    Boss Man

Note, there are tabs between the fields, which you can't really see with this screen copy.

Example Script

    #!/usr/bin/awk -f
    BEGIN {
        #set global variables and built-in variables
        FS="\t+"
        months="Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec"
    }

    #set subroutines (aka functions)
    function monthdigit(mymonth) {
        return (index(months,mymonth)+3)/4
    }

    function doincome(mybalance) {
        mybalance[curmonth,$3] += amount
        mybalance[0,$3] += amount
    }

    function doexpense(mybalance) {
        mybalance[curmonth,$2] -= amount
        mybalance[0,$2] -= amount
    }

    function dotransfer(mybalance) {
        mybalance[0,$2] -= amount
        mybalance[curmonth,$2] -= amount
        mybalance[0,$3] += amount
        mybalance[curmonth,$3] += amount
    }

    #main program
    {
        curmonth=monthdigit(substr($1,4,3))
        amount=$7

        #record all the categories encountered
        if ( $2 != "-" )
            globcat[$2]="yes"
        if ( $3 != "-" )
            globcat[$3]="yes"

        #tally up the transaction properly
        if ( $2 == "-" ) {
            if ( $3 == "-" ) {
                print "Error: inc and exp fields are both blank!"
                exit 1
            } else {
                #this is income
                doincome(balance)
                if ( $5 == "Y" )
                    doincome(balance2)
            }
        } else if ( $3 == "-" ) {
            #this is an expense
            doexpense(balance)
            if ( $5 == "Y" )
                doexpense(balance2)
        } else {
            #this is a transfer
            dotransfer(balance)
            if ( $5 == "Y" )
                dotransfer(balance2)
        }
    }
    #end of main program

    END {
        bal=0
        bal2=0
        for (x in globcat) {
            bal=bal+balance[0,x]
            bal2=bal2+balance2[0,x]
        }
        printf("Your available funds: %10.2f\n", bal)
        printf("Your account balance: %10.2f\n", bal2)
    }

Input file:

    23 Aug 2000    food    -       -    Y    Jimmy's Buffet
    Aug 2000       -       inco    -    Y    Boss Man

Output to the screen:

    Your available funds:
    Your account balance: