Chapter 12: gawk Yes it sounds funny. In this chapter … Intro Patterns Actions Control Structures Putting it all together.


Chapter 12: gawk Yes it sounds funny

In this chapter … Intro Patterns Actions Control Structures Putting it all together

gawk? GNU awk awk == Aho, Weinberger and Kernighan Pattern processing language Filters data and generates reports

gawk, cont'd
Syntax:
gawk [options] [program] [file-list]
gawk [options] -f program-file [file-list]
Essentially, program is a list of patterns to match, each paired with the actions to perform on a match
The program can be given either on the command line or in a file

gawk program
A gawk program contains one or more lines in the format
pattern { action }
Pattern determines which lines of data to select
Action determines what to do with those lines
Default pattern is all lines
Default action is to print the line
Use single quotes around the program on the command line
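The two defaults can be sketched as follows. The cars function is a hypothetical stand-in for the cars file, with invented values, and plain awk is invoked so the sketch also runs where gawk is not installed (gawk accepts the same programs).

```shell
# Hypothetical sample data standing in for the "cars" file (values invented).
cars() {
  printf '%s\n' \
    'chevy malibu 1999 60 3000' \
    'ford mustang 1965 45 10000' \
    'bmw 325i 1985 115 450'
}

# Pattern only: the default action prints every matching line.
pattern_only() { cars | awk '/ford/'; }

# Action only: the default pattern selects every line.
action_only() { cars | awk '{ print $1 }'; }

pattern_only
action_only
```

The single quotes keep the shell from interpreting $1 and the braces before awk sees them.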

Patterns
Simple numeric or string comparisons: < <= == != >= >
Regular expressions (see Appendix A)
–The ~ operator matches pattern
–The !~ operator does not match pattern
Combinations using || (OR) and && (AND)

Patterns, cont'd
BEGIN – before any lines are processed
END – after all lines are processed
pattern1,pattern2 – a range that starts with a line matching pattern1 and ends with the next line matching pattern2, inclusive. After matching pattern2, gawk attempts to match pattern1 again
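A minimal sketch of BEGIN, END and a range pattern working together. The input lines are invented, and awk is used here in place of gawk, which accepts the same program.

```shell
# Hypothetical input (values invented for illustration).
cars() {
  printf '%s\n' 'chevy malibu' 'volvo s80' 'ford taurus' 'bmw 325i' 'honda accord'
}

range_demo() {
  cars | awk '
    BEGIN { print "start" }                  # runs before the first line is read
    /volvo/, /bmw/ { print $1 }              # volvo line through the next bmw line
    END { print "done after", NR, "lines" }  # runs after the last line
  '
}

range_demo
```

With this input the range selects the volvo, ford and bmw lines; the chevy and honda lines fall outside it.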

Variables
$0 – the current record (line)
$1-$n – fields in current record
FS – input field separator (default: SPACE / TAB)
NF – number of fields in record
NR – current record number
RS – input record separator (default: NEWLINE)
OFS – output field separator
ORS – output record separator
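As an illustration of these variables, the sketch below splits colon-separated input and joins the output with a different separator. The input lines and the " -> " separator are arbitrary choices for the example, and awk stands in for gawk.

```shell
fields_demo() {
  printf '%s\n' 'root:x:0' 'daemon:x:1' |
    awk 'BEGIN { FS = ":"; OFS = " -> " }   # set separators before any input is read
         { print NR, $1, NF }               # record number, first field, field count
        '
}

fields_demo
```

Setting FS in BEGIN has the same effect as passing -F: on the command line.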

Associative Arrays
A variable type similar to an array, but with strings as indexes (instead of integers)
Ex (a literal string index must be quoted; an unquoted name is treated as a variable)
–myAssocArray["name"] = "Bob"
–myAssocArray["hometown"] = "Austin"
Ex
–studentGrades[ ] = 75
–studentGrades[ ] = 100
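A short sketch of string indexes in use, including the in operator for testing whether an index exists. The array name person and the "age" key are made up for the example; awk is used so the sketch runs without gawk.

```shell
assoc_demo() {
  awk 'BEGIN {
    person["name"] = "Bob"
    person["hometown"] = "Austin"
    # "in" tests for an index without creating it
    if ("name" in person) print person["name"], "lives in", person["hometown"]
    if (!("age" in person)) print "no age recorded"
  }'
}

assoc_demo
```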

Pattern examples
$1 ~ /^[A-Z]/
–Matches records where the first field starts with a capital letter
$3 <= $5
–Matches records where the third field is less than or equal to the fifth field
$2 > 5000 && $1 !~ /exempt/
–Matches records where the second field is greater than 5000 and the first field does not contain exempt

Functions
length(str) – returns length of str
–Returns length of the current line if str is omitted
int(num) – returns integer portion of num
tolower(str) – converts chars to lower case
toupper(str) – converts chars to upper case
substr(str,pos,len) – returns substring of str starting at pos with length len
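The functions listed here can be tried on a single made-up input line. awk is invoked in place of gawk, which provides the same functions.

```shell
functions_demo() {
  printf '%s\n' 'Chevy 12.75' |
    awk '{ print length($1), toupper($1), tolower($1), substr($1, 2, 3), int($2) }'
}

functions_demo
```

Note that substr counts positions from 1, so position 2 of Chevy is the letter h.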

Actions
Default action is to print the entire record
Using print, you can print particular parts (i.e., fields)
–Ex. { print $1 }
Put literal strings in double quotes (the single quotes belong to the shell and enclose the whole program)
By default, items separated only by spaces are concatenated
–Use a comma to separate them with OFS
Ex. { print $1, $5 }
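The difference between concatenation and comma-separated items is easiest to see with OFS set to something visible. The input line and the "-" separator below are assumptions made for the example, and awk stands in for gawk.

```shell
print_demo() {
  printf '%s\n' 'chevy malibu 1999 60 3000' |
    awk 'BEGIN { OFS = "-" }
         {
           print $1 $2              # no comma: the two fields are concatenated
           print $1, $2             # comma: the fields are joined with OFS
           print "Price:", "$" $5   # string literals go in double quotes
         }'
}

print_demo
```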

Actions, cont'd
Separate multiple actions with semicolons
Other actions usually involve variables (e.g., counters and accumulators)
Variables need not be formally initialized
–By default set to zero or the null string
Standard operators function normally
* / % + - = ++ -- += -= *= /= %=
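A counter and an accumulator can be sketched like this. Neither variable is initialized, so both start at zero. The data is invented, and awk is used in place of gawk.

```shell
# Hypothetical sample data (values invented); the fifth field plays the role of a price.
cars() {
  printf '%s\n' \
    'chevy malibu 1999 60 3000' \
    'ford mustang 1965 45 10000' \
    'bmw 325i 1985 115 500'
}

price_summary() {
  cars | awk '{ total += $5; count++ }            # two actions separated by a semicolon
              END { print count, total, total / count }'
}

price_summary
```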

Actions, cont'd
Instead of print you can use printf (C-style)
Syntax:
–printf "control-string", arg1, arg2 … argn
–control-string contains one or more conversion specifications
–%[-][x[.y]]conv
  - – left justify
  x – minimum field width
  y – decimal places
  conv: d – decimal, f – floating point, s – string
Ex: %.2f – floating point with two decimal places
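The conversion pieces can be seen side by side in one format string. The input line and the | column markers are choices made for this example, and awk is used in place of gawk.

```shell
printf_demo() {
  printf '%s\n' 'chevy 3000' |
    awk '{ printf "%-8s|%6d|%8.2f\n", $1, $2, $2 }'
  # %-8s  : string, left justified, padded to 8 characters
  # %6d   : integer, right justified in 6 characters
  # %8.2f : floating point, 8 characters wide, 2 decimal places
}

printf_demo
```

Unlike print, printf adds no newline of its own, so the control string ends with \n.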

Control Structures gawk programs can utilize several control structures Can use if-else, while, for, break and continue All are C-style in syntax (what did the K in gawk stand for?)

if … else Syntax: if (condition) { commands } else { commands }

while Syntax: while (condition) { commands }

for Syntax: for (init; condition; increment) { commands } You can use break and continue for both for and while loops
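A small sketch that exercises all of these structures in a BEGIN block, so no input file is needed. The variable names and the numbers are arbitrary, and awk stands in for gawk.

```shell
control_demo() {
  awk 'BEGIN {
    for (i = 1; i <= 10; i++) {
      if (i % 2 == 0) continue     # skip even numbers
      if (i > 7) break             # leave the loop once i passes 7
      sum += i                     # adds 1, 3, 5 and 7
    }
    n = 3
    while (n > 0) {                # builds the string "321"
      out = out n
      n--
    }
    if (sum > 10) {
      print "sum", sum, out
    } else {
      print "small", sum, out
    }
  }'
}

control_demo
```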

Examples
gawk '{print}' cars
gawk '/chevy/' cars
gawk '{print $3, $1}' cars
gawk '/chevy/ {print $3, $1}' cars
gawk '$1 ~ /^h/' cars
gawk '2000 <= $5 && $5 < 9000' cars
gawk '/volvo/, /bmw/' cars
gawk '{print $3, $1, "$" $5}' cars
gawk 'BEGIN {print "Car Info"}' cars

Putting it all together
BEGIN {
    print " Miles"
    print "Make Model Year (000) Price"
    print \
    " "
}
{
    if ($1 ~ /ply/) $1 = "plymouth"
    if ($1 ~ /chev/) $1 = "chevrolet"
    printf "%-10s %-8s %2d %5d $ %8.2f\n",\
        $1, $2, $3, $4, $5
}

Results
gawk -f printf_demo cars
Miles
Make Model Year (000) Price
plymouth fury $
chevrolet malibu $
ford mustang $
volvo s $
ford thundbd $
chevrolet malibu $
bmw 325i $
honda accord $
ford taurus $
toyota rav $
chevrolet impala $
ford explor $

Associative Arrays
gawk '{manuf[$1]++}
END {for (name in manuf) print name,\
manuf[name]}' cars | sort
bmw 1
chevy 3
ford 4
honda 1
plym 1
toyota 1
volvo 1

Standalone Scripts
Alternative to issuing gawk -f at the command line
Just like making a shell script: the first line defines what runs the script
#!/bin/gawk -f
Then begin your patterns/actions

Advanced gawk
getline – allows you to manually pull lines from input
–Useful if you need to loop through data
Coprocess – direct input or output through a second process, using the |& operator
A coprocess can be network based, using a special filename of the form /inet/tcp/0/host/port
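One way getline might be used is to consume input two lines at a time, as sketched below. The key/value layout of the input is an assumption made for the example. Only the plain getline var form is shown, which also works in awk; the |& coprocess operator is specific to gawk and is not exercised here.

```shell
pairs_demo() {
  printf '%s\n' 'name' 'Bob' 'town' 'Austin' |
    awk '{
      key = $0                      # the line awk read automatically
      if ((getline value) > 0)      # manually pull the next line into value
        print key "=" value
    }'
}

pairs_demo
```

getline returns 1 on success, 0 at end of input and -1 on error, which is why the result is compared with 0.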