CS 403: Programming Languages Fall 2004 Department of Computer Science University of Alabama Joel Jones.

Other Filters (cont.)
tr inputChars outputChar(s)
  tr a-z A-Z maps lower case to upper case
  Flags: -s squeezes multiple occurrences of a character in the input to a single character in the output; -c takes the complement of the first argument, e.g. tr -c ab matches every character except a and b.
  tr also understands character ranges.
uniq removes duplicate adjacent lines
  Flags: -c adds count of duplicate lines at beginning
Note: '\012' is a newline
Pair Up: Write a pipeline that prints the 10 most frequent words in its input.
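To make the tr and uniq flags above concrete, here is a small sketch. The sample strings and variable names are made up for illustration, and the -c example adds a replacement character and keeps newline in the set so the output stays on one line.

```shell
# tr reads standard input only, so feed it through a pipe.
upper=$(printf 'hello world\n' | tr a-z A-Z)
echo "$upper"

# -s squeezes each run of the listed character into a single character.
squeezed=$(printf 'aaabbbccc\n' | tr -s a)
echo "$squeezed"

# -c complements the first set: everything except a, b and newline becomes '-'.
complemented=$(printf 'abcab\n' | tr -c 'ab\n' '-')
echo "$complemented"

# uniq -c merges only ADJACENT duplicates, so x is reported twice here.
# awk is used just to normalise the padding uniq puts before each count.
counts=$(printf 'x\nx\ny\nx\n' | uniq -c | awk '{ print $1, $2 }')
echo "$counts"
```

Because uniq only looks at neighbouring lines, input is normally sorted first when a total count per distinct line is wanted.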

Printing 10 most common words
    cat $* |                 # tr doesn't take filename arguments
    tr -sc A-Za-z '\012' |   # all non-alpha become newline
    sort |
    uniq -c |                # get the count
    sort -n |                # sort by count
    tail                     # prints 10 by default
Use the man command to look for help, e.g. man sort
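One way to try the pipeline is to wrap it in a function and feed it a short sentence. The function name top_words and the sample text are assumptions made for this sketch, and "$@" is used in place of $* so that filenames containing spaces would survive.

```shell
# top_words: hypothetical wrapper around the word-frequency pipeline.
top_words() {
    cat "$@" |                 # tr does not take filename arguments
    tr -sc 'A-Za-z' '\012' |   # each run of non-letters becomes one newline
    sort |                     # bring identical words together
    uniq -c |                  # prefix each distinct word with its count
    sort -n |                  # sort by count, ascending
    tail                       # last 10 lines, i.e. the 10 highest counts
}

sample='the cat and the dog and the bird'
result=$(printf '%s\n' "$sample" | top_words)
echo "$result"
```

The most frequent word ends up on the last line because sort -n sorts in ascending order; sort -nr followed by head would list it first instead.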

sed
sed [options] 'list of commands' filenames …
Commands
  s/re1/re2/ substitutes re2 for regular expression re1, first instance on every line
  s/re1/re2/g substitutes re2 for regular expression re1, every instance on every line
  #command applies command at line number #, e.g. sed 3q quits at line 3, so it prints the first 3 lines
  /re1/q prints lines up to and including the first one matching re1, then quits
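The four commands above can be compared on one small input. This is only a sketch; the sample lines and variable names are invented for illustration.

```shell
input=$(printf 'one one\ntwo one\nthree\nfour\n')

# s/re/replacement/ changes only the first match on each line.
first=$(printf '%s\n' "$input" | sed 's/one/1/')
echo "$first"

# The g flag changes every match on each line.
all=$(printf '%s\n' "$input" | sed 's/one/1/g')
echo "$all"

# 3q: quit when line 3 is reached, so lines 1 to 3 are printed.
head3=$(printf '%s\n' "$input" | sed 3q)
echo "$head3"

# /two/q: print up to and including the first line matching "two", then quit.
upto=$(printf '%s\n' "$input" | sed '/two/q')
echo "$upto"
```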

sed (cont.)
Pair Up: Write a sed command line that indents a line by adding four spaces to the beginning of the line
    sed 's/^/    /' file
More commands
  /re1/s/re2/re3/ substitutes re3 for regular expression re2, first instance on every line matching re1
Pair Up: What does the indenting sed command above do with empty lines? Write a sed command line that fixes this problem.
    sed '/./s/^/    /' file
    # or
    sed '/^$/!s/^/    /' file
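The difference between the plain indent and the two fixes shows up on input containing an empty line. A minimal sketch, with made-up sample text and variable names:

```shell
# The plain substitution also indents the empty line, leaving trailing blanks.
naive=$(printf 'alpha\n\nbeta\n' | sed 's/^/    /')

# /./ selects only lines that contain at least one character.
careful=$(printf 'alpha\n\nbeta\n' | sed '/./s/^/    /')

# /^$/! selects every line that is NOT empty; the result is the same.
negated=$(printf 'alpha\n\nbeta\n' | sed '/^$/!s/^/    /')

# Count truly empty lines in each result.
printf '%s\n' "$naive"   | grep -c '^$'
printf '%s\n' "$careful" | grep -c '^$'
```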

sed (cont.)
Commands (cont.)
  /re/d deletes lines matching re
  /re/p prints lines matching re
Options
  -n turns off automatic printing
  -f filename takes sed commands from filename
Pair Up: What does the sed command sed '/the/p' < file do?
Pair Up: Write a sed command line that does the same thing as: grep re file
    sed -n '/re/p' file
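Running p with and without -n on the same input shows why the option matters. This is a sketch on invented sample lines.

```shell
lines=$(printf 'the cat\na dog\nthe end\n')

# Without -n every line is printed automatically, and p prints
# matching lines a second time.
doubled=$(printf '%s\n' "$lines" | sed '/the/p')
echo "$doubled"

# With -n only the explicit p output remains, which is what grep prints.
only=$(printf '%s\n' "$lines" | sed -n '/the/p')
grepped=$(printf '%s\n' "$lines" | grep 'the')

# d is the inverse: matching lines are dropped.
deleted=$(printf '%s\n' "$lines" | sed '/the/d')
echo "$deleted"
```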

awk
awk [options] 'program' filenames …
Like sed, but program is different:
    pattern { action }
    pattern { action }
    …
awk reads the input in filenames one line at a time and, when a pattern matches, executes the corresponding action
Patterns
  Regular expressions
  C-like expressions
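A two-rule program illustrates both kinds of pattern. The input and the printed labels are made up for this sketch; every rule is tried against each line, in program order.

```shell
log=$(printf 'ok\nerror here\n')

out=$(printf '%s\n' "$log" | awk '
    /error/ { print "match: " $0 }   # regular-expression pattern
    NR == 1 { print "first line" }   # C-like expression pattern
')
echo "$out"
```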

awk (cont.)
Pattern or action is optional
  Pattern missing: perform action on every line
  Action missing: print every line matching pattern
Simple action
  print: without argument prints current line
Pair Up: Write an awk command line that does the same thing as: grep re file
    awk '/re/' file
Pair Up: Write an awk command line that does the same thing as: cat filenames …
    awk '{ print }' filenames …
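Both equivalences can be checked directly against grep and the original input. A small sketch on invented data:

```shell
data=$(printf 'red apple\ngreen pear\nred plum\n')

# Pattern with no action: print every matching line, like grep.
awk_grep=$(printf '%s\n' "$data" | awk '/red/')
real_grep=$(printf '%s\n' "$data" | grep 'red')

# Action with no pattern: runs on every line, like cat.
awk_cat=$(printf '%s\n' "$data" | awk '{ print }')

echo "$awk_grep"
echo "$awk_cat"
```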

awk (cont.)
Variables
  $0: entire line; $1-$NF: fields of line
  E.g. awk '{ print $2 }' textFile prints 2nd field of every line of textFile
  E.g. who | awk '{ print $5, $1 }' | sort prints time of login and name, sorted by time (where the time is the 5th field of who's output)
  NF is number of fields on current line
  NR is number of records (lines) read so far
Options
  -Fchar: sets char as the field separator
Pair Up: Write an awk command line that prints user names out of /etc/passwd, where the user name is the first field and fields are colon separated.
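The built-in variables are easiest to see on a couple of short records. The sample records below are invented for this sketch.

```shell
records=$(printf 'alice staff 42\nbob admin\n')

# $2 is the second whitespace-separated field of each line.
second=$(printf '%s\n' "$records" | awk '{ print $2 }')
echo "$second"

# NR is the current line number, NF the field count, $NF the last field.
summary=$(printf '%s\n' "$records" | awk '{ print NR, NF, $NF }')
echo "$summary"
```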

awk (cont.)
    awk -F: '{ print $1 }' /etc/passwd
N.B. most unix systems don’t store users in /etc/passwd anymore
Field breaking
  Default is on space and tab; multiple contiguous white space counts as a single separator, and leading separators are discarded
  Setting the separator causes leading separators to be counted
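The two field-breaking rules can be observed by printing NF. This sketch uses made-up lines in place of the real /etc/passwd so it does not depend on the machine it runs on.

```shell
# Default splitting: leading blanks are dropped and runs of blanks
# count as one separator, so this line has 2 fields.
default_split=$(printf '  a   b\n' | awk '{ print NF ":" $1 }')
echo "$default_split"

# With -F: every colon separates, so the leading colon yields an
# empty first field and the doubled colon an empty third field.
colon_split=$(printf ':a::b\n' | awk -F: '{ print NF ":" $1 ":" $2 }')
echo "$colon_split"

# Same idea as the /etc/passwd command, on sample passwd-style lines.
users=$(printf 'root:x:0:0\njoel:x:1000:1000\n' | awk -F: '{ print $1 }')
echo "$users"
```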

awk (cont.)
More on patterns
Print user name of people who have no password
  $2 == ""          2nd field is empty
  $2 ~ /^$/         2nd field matches empty string
  $2 !~ /./         2nd field doesn't match any character
  length($2) == 0   length of 2nd field is zero
Pair Up: Write an awk command line that prints lines of input that have an odd number of fields.
    awk 'NF % 2 != 0' files
Pair Up: Write an awk command line that is the shortest equivalent of cat.
    awk '/^/' files
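The four patterns for an empty second field should all select the same lines. The sketch below checks that on an invented passwd-style sample, then tries the two Pair Up answers; the variable names are assumptions.

```shell
passwd_sample=$(printf 'root:x:0\nguest::500\njoel:x:1000\n')

# Each pattern is spliced into an awk program that prints the user name.
no_password=$(
    for pattern in '$2 == ""' '$2 ~ /^$/' '$2 !~ /./' 'length($2) == 0'; do
        printf '%s\n' "$passwd_sample" | awk -F: "$pattern { print \$1 }"
    done
)
echo "$no_password"

# Lines with an odd number of fields; the empty line has NF == 0.
odd=$(printf 'a b c\nd e\nf\n\n' | awk 'NF % 2 != 0')
echo "$odd"

# /^/ matches the start of every line, so every line is printed.
copied=$(printf '%s\n' "$passwd_sample" | awk '/^/')
echo "$copied"
```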