
Awk 2 – more awk

AWK INVOCATION AND OPERATION

The "-F" option changes Awk's "field separator" character. Awk regards each line of input data (a RECORD) as composed of multiple "fields", which are essentially words separated by blank spaces. A blank space (or a tab character) is the default "field separator". In some cases the input data is divided by another character, for example a ":", and Awk needs to be told to use a different field separator. To invoke Awk with ":" as the field separator, we write:

    awk -F:...
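As a minimal sketch of the "-F" option, the following splits colon-separated records and prints the first and third fields. The sample records are invented for illustration and only loosely resemble password-file entries.

```shell
# Hypothetical colon-separated sample data (not from the slides).
users=$(printf '%s\n' 'alice:x:1001' 'bob:x:1002' | awk -F: '{ print $1, $3 }')

# Each output line holds field 1 and field 3, joined by a space.
echo "$users"
```

Without "-F:", each whole line would be a single field, because the sample lines contain no blanks or tabs.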

Initialize Awk variables on the command line (soft coding)

    /gold/   { num_gold++;   wt_gold += $2 }
    /silver/ { num_silver++; wt_silver += $2 }
    END { val_gold = 485 * wt_gold
          val_silver = 16 * wt_silver
          ...

The prices of gold and silver could be specified by variables, say, "pg" and "ps":

    END { val_gold = pg * wt_gold
          val_silver = ps * wt_silver
          ...

and then the program would be invoked with variable initializations on the command line as follows:

    awk -f summary.awk pg=485 ps=16 coins.txt
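The command-line initialization can be tried end to end with the sketch below. The file contents are assumptions made for illustration: the three-line coins.txt is invented, and summary.awk is a cut-down version that only prints the two values.

```shell
# Work in a scratch directory so no existing files are touched.
dir=$(mktemp -d)

# Hypothetical sample data: metal name, then weight in ounces.
printf '%s\n' 'gold 1' 'silver 10' 'gold 0.5' > "$dir/coins.txt"

# A cut-down summary.awk; pg and ps are supplied on the command line.
cat > "$dir/summary.awk" <<'EOF'
/gold/   { wt_gold += $2 }
/silver/ { wt_silver += $2 }
END {
    print "gold value:", pg * wt_gold
    print "silver value:", ps * wt_silver
}
EOF

# The assignments are processed before coins.txt is read.
summary=$(awk -f "$dir/summary.awk" pg=485 ps=16 "$dir/coins.txt")
echo "$summary"

rm -r "$dir"
```

Assignments given this way take effect when Awk reaches them in the argument list, so they are visible in the main rules and in END but not in a BEGIN block.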

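SEARCH PATTERNS (1)

/The/ -- matches any line containing the string "The".
/^The/ -- matches any line that begins with "The".
/The$/ -- matches any line that ends with "The".
/\$/ -- matches any line containing a literal "$" (the backslash removes its special meaning).
/[Tt]he/ -- matches any line containing "The" or "the".
/[a-z]/ -- matches any line containing a lowercase letter.
/[a-zA-Z0-9]/ -- matches any line containing a letter or a digit.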

A range of characters can also be excluded by preceding the range with a "^" inside the brackets. For example:

/^[^a-zA-Z0-9]/ -- matches any line whose first character is not a letter or digit.

A "|" allows regular expressions to be logically ORed. For example:

/(^Germany)|(^Netherlands)/ -- matches lines that start with the word "Germany" or the word "Netherlands".

The "." special character allows "wildcard" matching, meaning it stands for any single character. For example:

/wh./ -- matches "who", "why", and any other string that has the characters "wh" followed by any one character.
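The three constructs above (an excluded range, "|" alternation and the "." wildcard) can be checked against a few lines of input. This is only a sketch; the sample lines are invented so that each pattern selects something different.

```shell
# Hypothetical sample lines, chosen to exercise each pattern.
lines=$(printf '%s\n' 'Germany 1990' 'Netherlands 1988' '#comment' 'who is there')

# First character is neither a letter nor a digit.
not_alnum=$(printf '%s\n' "$lines" | awk '/^[^a-zA-Z0-9]/')

# Line starts with "Germany" or with "Netherlands"; print only the first field.
either=$(printf '%s\n' "$lines" | awk '/(^Germany)|(^Netherlands)/ { print $1 }')

# "wh" followed by any one character.
wildcard=$(printf '%s\n' "$lines" | awk '/wh./')

echo "$not_alnum"
echo "$either"
echo "$wildcard"
```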

These elements can be combined. For example:

/^[+-]?[0-9]+$/ -- matches any line that consists only of a (possibly signed) integer number.

The pattern is built up as follows:

/^ -- anchor the match at the beginning of the line.
/^[+-]? -- allow an optional "+" or "-" sign (the "?" makes the preceding item optional).
/^[+-]?[0-9]+ -- require one or more digits "0" through "9" (the "+" means "one or more").
/^[+-]?[0-9]+$/ -- require that the line end with the number.
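A quick way to see which lines the integer pattern accepts is to feed it a mix of candidates. The inputs below are invented test cases, including a decimal, a number with trailing letters and an empty line.

```shell
# Only lines that are entirely a (possibly signed) integer pass through.
ints=$(printf '%s\n' '42' '-7' '+15' '3.14' '12abc' '' | awk '/^[+-]?[0-9]+$/')

echo "$ints"
```

A pattern with no action prints every matching line, which is why no "{ print }" is needed here.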

SEARCH PATTERNS (2)

The search can be constrained to a single field within the input line. For example:

$1 ~ /^France$/ -- searches for lines whose first field ($1) is the word "France", while:

$1 !~ /^Norway$/ -- searches for lines whose first field is not the word "Norway".

It is possible to search for an entire series or "block" of consecutive lines in the text. For example:

/^Ireland/,/^Summary/ -- matches a block of text whose first line begins with "Ireland" and whose last line begins with "Summary".
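The field-constrained matches and the range pattern can be sketched as follows. The six input lines and their numbers are made up purely to give each pattern something to select.

```shell
# Hypothetical sample data: a name followed by an arbitrary number.
data=$(printf '%s\n' 'France 67' 'Norway 5' 'Ireland 5' 'Dublin 1' 'Summary 78' 'Iceland 0')

# Lines whose first field is exactly "France".
france=$(printf '%s\n' "$data" | awk '$1 ~ /^France$/')

# Count the lines whose first field is not "Norway".
not_norway=$(printf '%s\n' "$data" | awk '$1 !~ /^Norway$/ { n++ } END { print n }')

# Every line from the one starting with "Ireland" through the one starting with "Summary".
block=$(printf '%s\n' "$data" | awk '/^Ireland/,/^Summary/')

echo "$france"
echo "$not_norway"
echo "$block"
```

The range is inclusive at both ends, and the line "Iceland 0" after "Summary" is not part of the block.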

Comparison operations

Awk supports search patterns using a full range of comparison operations:

<   Less than.
<=  Less than or equal to.
==  Equal to.
!=  Not equal to.
>=  Greater than or equal to.
>   Greater than.
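Comparison operations are used as patterns in the same position as a regular expression. The sketch below assumes a made-up two-column input (an item name and a quantity) and selects lines by numeric and string comparison.

```shell
# Hypothetical sample data: item name, then quantity.
stock=$(printf '%s\n' 'pen 3' 'book 12' 'lamp 25')

# Numeric comparison on the second field.
at_least_12=$(printf '%s\n' "$stock" | awk '$2 >= 12 { print $1 }')

# String comparison on the first field.
pen_line=$(printf '%s\n' "$stock" | awk '$1 == "pen"')

# Two comparisons combined with a logical AND.
small_not_pen=$(printf '%s\n' "$stock" | awk '$1 != "pen" && $2 < 20 { print $1 }')

echo "$at_least_12"
echo "$pen_line"
echo "$small_not_pen"
```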

Awk's built-in variables 1

NR: Keeps a running count of the input lines (records) read so far.
NF: Holds the number of fields (words) in the current input line. The last field in the input line can be designated by $NF.
FILENAME: Contains the name of the current input file.
FS: Contains the "field separator" character used to divide fields on the input line. The default is "white space", meaning space and tab characters. FS can be reassigned to another character to change the field separator.
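A short sketch of NR, NF, $NF and FS in use; the input lines are invented for illustration.

```shell
# For each line: its line number, its field count, and its last field.
report=$(printf '%s\n' 'a b c' 'd e' | awk '{ print NR, NF, $NF }')

# Reassign FS in a BEGIN block so it applies before the first line is read.
second=$(printf '%s\n' 'x,y,z' | awk 'BEGIN { FS = "," } { print $2 }')

echo "$report"
echo "$second"
```

Setting FS in BEGIN has the same effect as passing "-F," on the command line.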

Awk's built-in variables 2

RS: Stores the current "record separator" character. Since, by default, an input line is the input record, the default record separator character is a "newline".
OFS: Stores the "output field separator", which separates the fields when Awk prints them. The default is a "space" character.
ORS: Stores the "output record separator", which separates the output lines when Awk prints them. The default is a "newline" character.
OFMT: Stores the format for numeric output. The default format is "%.6g", which will be explained when "printf" is discussed.
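The effect of RS, OFS, ORS and OFMT can be seen by changing each one in a BEGIN block. This is a sketch with invented input; the separator and format values are arbitrary choices for the demonstration.

```shell
# OFS goes between the fields of "print $1, $2"; ORS ends each output record.
joined=$(printf '%s\n' 'a b' 'c d' | awk 'BEGIN { OFS = "-"; ORS = ";" } { print $1, $2 }')

# With RS set to ";", the input is split into records at semicolons instead of newlines.
records=$(printf 'a;b;c' | awk 'BEGIN { RS = ";" } { print NR ": " $0 }')

# print formats non-integer numbers with OFMT: first the default "%.6g", then "%.2f".
numbers=$(awk 'BEGIN { print 3.14159265; OFMT = "%.2f"; print 3.14159265 }')

echo "$joined"
echo "$records"
echo "$numbers"
```

OFS is only inserted where the print statement separates its arguments with commas; fields joined by plain concatenation are not affected.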