Unix and Software Tools (P51UST)
Awk Programming (3)
Ruibin Bai (Room AB326)
Division of Computer Science
The University of Nottingham Ningbo, China

Contents
Missing things
– Reading input from a pipeline
– More string functions
– System variables
– Forcing variable types
– Arithmetic functions

Reading Input From a Pipe
The UNIX "who am i" command gives output of the following form:
    $ who am i
    zlizrb1 pts/21 Apr 17 15:15 (pcname)
This output can be piped to getline:
– "who am i" | getline
Here, $0 is set to the output of the command and the line is parsed into fields, so that "zlizrb1" is put in $1, "pts/21" in $2, and so on. The system variable NF is set to the number of fields.

Reading Input From a Pipe
    $ awk 'BEGIN { "who am i" | getline
                   name = $1
                   FS = ":" }
           $1 == name { print $5 }' /etc/passwd
This script pipes the result of the "who am i" command to getline, which parses it into fields.
The variable name is assigned the value of field 1, and the field separator FS is then set to ":".
The script then tests whether the first field ($1) of each line in /etc/passwd is the same as the value stored in name (the fields in /etc/passwd are separated by ":").
If so, the 5th field of /etc/passwd is printed (it contains the corresponding user's full name).
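The pipe-to-getline pattern can be tried with a command whose output is predictable. This is a minimal sketch, assuming a POSIX awk; the function name first_word_of is hypothetical, and echo stands in for "who am i" so the result does not depend on the login session.

```shell
# Hypothetical helper: run the command given as $1 and report the first
# field of its first output line, plus the field count.
first_word_of() {
    awk -v cmd="$1" 'BEGIN {
        # cmd | getline reads one line into $0 and splits it into fields
        if ((cmd | getline) > 0)
            print $1 " (NF=" NF ")"
        close(cmd)
    }'
}

first_word_of "echo zlizrb1 pts/21 Apr 17 15:15"
```

The call prints the first field followed by the number of fields that getline found on the line.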

Some Important Limitations
There is a limit to the number of pipes and files that the system can have open at any one time
– This limit varies from system to system
– In most implementations of awk, up to 10 open files are allowed. Use the close() function!
Some other limits are:
– Number of fields per record
– Characters per input record
– Characters per field
– See the awk manual page for more information

Using close() with Pipes and Files
Why use close()?
– So your program can open as many pipes and files as it needs without exceeding the system limit
– It allows your program to run the same command twice
– You may need close() to force an output pipe to finish its work
    { do something | "sort > myFile" }
    END {
        close("sort > myFile")
        while ((getline < "myFile") > 0) {
            do more stuff
        }
    }
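The "run the same command twice" point can be seen in a small sketch, assuming a POSIX awk. The function name read_twice is hypothetical. Without the first close(), the second getline would find the command's output already exhausted and leave second empty.

```shell
# Hypothetical helper: read the output of the same command twice.
read_twice() {
    awk 'BEGIN {
        cmd = "echo hello"
        cmd | getline first     # starts cmd and reads its only line
        close(cmd)              # finish with this pipe
        cmd | getline second    # cmd is started again from scratch
        close(cmd)
        print first, second
    }'
}

read_twice
```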

Advanced String Functions (1)
gsub(regex,s,str)
– Globally substitutes s for each match of the regular expression regex in the string str.
– Returns the number of substitutions.
– If str is not supplied, $0 is used.
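A short sketch of both forms of gsub, assuming a POSIX awk; the function name gsub_demo and the sample strings are made up for illustration.

```shell
# Hypothetical helper showing gsub with and without a target string.
gsub_demo() {
    echo "the cat sat on the mat" | awk '{
        n = gsub(/at/, "og")        # no third argument: $0 is modified
        print n, $0
        str = "a-a"
        m = gsub(/a/, "b", str)     # third argument: only str is modified
        print m, str
    }'
}

gsub_demo
```

Each output line shows the return value (the number of substitutions) followed by the modified string.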

Advanced String Functions (2)
asort(src[,d])
– Supported in gawk
– Sorts the array src by its element values.
– If d is specified, src is copied into d and d is sorted instead.
– The indices of the sorted array are replaced with the values 1 to the number of elements in the array.

Advanced String Functions (3)
asorti(src[,d])
– Supported in gawk
– Like asort(), but the sorting is based on the indices of the array, not on the element values.
– The original indices become the values of the sorted array.

asort vs asorti
    BEGIN {
        arr[1] = "a"; arr[2] = "d"
        arr[4] = "f"; arr[8] = "c"
        asort(arr, arrcpy1)
        asorti(arr, arrcpy2)
        # for (i in ...) visits elements in no particular order
        print "Original array"
        for (i in arr)
            print "arr[" i "]=" arr[i]
        print "Array after using asort"
        for (i in arrcpy1) {
            print "arr[" i "]=" arrcpy1[i]
        }
        print "Array after using asorti"
        for (i in arrcpy2) {
            print "arr[" i "]=" arrcpy2[i]
        }
    }
    $ awk -f asorti.awk
    Original array
    arr[4]=f
    arr[8]=c
    arr[1]=a
    arr[2]=d
    Array after using asort
    arr[4]=f
    arr[1]=a
    arr[2]=c
    arr[3]=d
    Array after using asorti
    arr[4]=8
    arr[1]=1
    arr[2]=2
    arr[3]=4

System Variables that are Arrays
Two system variables that are arrays are covered here:
– ARGV
  An array containing the command line arguments given to awk.
  The number of elements is stored in another variable called ARGC (not an array).
  The array is indexed from 0 (unlike other arrays in awk).
  The last element is therefore ARGV[ARGC-1].
  – E.g. ARGV[ARGC-1], ARGV[2]
  The first element is the name of the command that invoked the script.

System Variables that are Arrays (2)
– ENVIRON
  An array containing environment variables
  Each element is the value of one variable in the current environment
  The index of each element is the name of the environment variable
  – E.g. ENVIRON["PATH"], ENVIRON["SHELL"]
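ENVIRON can be checked with a variable set just for one awk invocation. This is a minimal sketch, assuming a POSIX awk; DEMO_VAR and the function name show_env are hypothetical names chosen for the example.

```shell
# Hypothetical helper: pass a variable through the environment and read it
# back inside awk by name.
show_env() {
    DEMO_VAR="from the shell" awk 'BEGIN {
        print ENVIRON["DEMO_VAR"]   # index is the variable name
    }'
}

show_env
```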

ARGV Example
    BEGIN {
        for (x = 0; x < ARGC; x++)
            print ARGV[x]
        print ARGC
    }
Output
    $ awk -f parameters.awk 2007 G51UST "Gail Hopkins" students=80 -
    awk
    2007
    G51UST
    Gail Hopkins
    students=80
    -
    6

Use of Backslash
Backslash can be used:
– To continue strings across new lines
    marian$ awk 'BEGIN { print "hello, \
    > world" }'
Output
    hello, world

Forcing Variable Types
In awk, you do not declare variables or give them types.
Sometimes you want to force awk to treat a variable as a particular type, e.g. as a number or as a string.
– To force a variable, x, to be treated as a number, put in the line: x = x + 0
– To force a variable, x, to be treated as a string, put in the line: x = x ""
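Why forcing a type matters shows up most clearly in comparisons. This is a small sketch, assuming a POSIX awk; the function name type_demo and the sample values are made up. Each print outputs 1 for true and 0 for false.

```shell
# Hypothetical helper: the same values compared as strings and as numbers.
type_demo() {
    awk 'BEGIN {
        x = "10"; y = "9"           # string constants
        print (x < y)               # string comparison: "1" sorts before "9"
        print (x+0 < y+0)           # forced to numbers: 10 < 9 is false
        n = 10; m = 9               # numeric constants
        print (n < m)               # numeric comparison: false
        s = n ""; t = m ""          # forced to strings
        print (s < t)               # string comparison again: true
    }'
}

type_demo
```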

Built-in Arithmetic Functions
awk has a number of built-in arithmetic functions. Some are shown below:
– exp(x)   Returns e to the power x
– int(x)   Returns a truncated value of x
– sqrt(x)  Returns the square root of x
– cos(x)   Returns the cosine of x
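The four functions listed can be tried with arguments whose results are exact. This is a minimal sketch, assuming a POSIX awk; the function name arithmetic_demo is hypothetical.

```shell
# Hypothetical helper: call each listed function on simple arguments.
arithmetic_demo() {
    awk 'BEGIN {
        print int(3.9), int(-3.9)   # truncation drops the fraction: 3 -3
        print sqrt(16)              # 4
        print exp(0), cos(0)        # 1 1
    }'
}

arithmetic_demo
```

Note that int() truncates toward zero, so a negative value is not rounded down.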

A Summary of Awk Functions
Type          Function or commands
Arithmetic    exp, int, sqrt, sin, cos, rand, srand, log, atan2
String        asort, asorti, gsub, index, length, split, substr, tolower, toupper
Control Flow  if/else, do/while, for, break, continue, return
I/O           print, printf, getline, next, nextfile, close, fflush
Programming   system, delete, function