Software I: Utilities and Internals

Slides:



Advertisements
Similar presentations
 *, ? And [ …] . Any single character  ^ beginning of a line  $ end of the line.
Advertisements

CS 497C – Introduction to UNIX Lecture 25: - Simple Filters Chin-Chih Chang
Unix Utilities (sort/uniq) CS465 – Unix. The sort command Sorts lines Default behavior: Do a case-sensitive, ascii- alphabetic line sort, starting at.
CS 497C – Introduction to UNIX Lecture 23: - Simple Filters Chin-Chih Chang
Guide To UNIX Using Linux Third Edition
Lecture 02CS311 – Operating Systems 1 1 CS311 – Lecture 02 Outline UNIX/Linux features – Redirection – pipes – Terminating a command – Running program.
Grep, comm, and uniq. The grep Command The grep command allows a user to search for specific text inside a file. The grep command will find all occurrences.
Introduction to UNIX GPS Processing and Analysis with GAMIT/GLOBK/TRACK T. Herring, R. King. M. Floyd – MIT UNAVCO, Boulder - July 8-12, 2013 Directory.
CSCI 330 T HE UNIX S YSTEM File operations. OPERATIONS ON REGULAR FILES 2 CSCI The UNIX System Create Edit Display Contents Display Contents Print.
Unix Files, IO Plumbing and Filters The file system and pathnames Files with more than one link Shell wildcards Characters special to the shell Pipes and.
CSC 4630 Meeting 2 January 22, Filters Definition: A filter is a program that takes a text file as an input and produces a text file as an output.
Unix Filters Text processing utilities. Filters Filter commands – Unix commands that serve dual purposes: –standalone –used with other commands and pipes.
UNIX Filters.
CS 124/LINGUIST 180 From Languages to Information Unix for Poets (in 2014) Dan Jurafsky (From Chris Manning’s modification of Ken Church’s presentation)
Chapter 4: UNIX File Processing Input and Output.
Advanced File Processing
Agenda User Profile File (.profile) –Keyword Shell Variables Linux (Unix) filters –Purpose –Commands: grep, sort, awk cut, tr, wc, spell.
Guide To UNIX Using Linux Fourth Edition
LIN 6932 Unix Lecture 6 Hana Filip. LIN 6932 HW6 - Part II solutions posted on my website see syllabus.
Unix Talk #2 (sed). 2 You have learned…  Regular expressions, grep, & egrep  grep & egrep are tools used to search for text in a file  AWK -- powerful.
Introduction to Unix (CA263) File Processing. Guide to UNIX Using Linux, Third Edition 2 Objectives Explain UNIX and Linux file processing Use basic file.
Unix programming Term: III B.Tech II semester Unit-II PPT Slides Text Books: (1)unix the ultimate guide by Sumitabha Das (2)Advanced programming.
Chapter 5: Advanced Editors awk, sed, tr, cut. Objectives: After studying this lesson, you should be able to: –awk: a pattern scanning and processing.
CS 403: Programming Languages Lecture 21 Fall 2003 Department of Computer Science University of Alabama Joel Jones.
Regular expressions Used by several different UNIX commands, including ed, sed, awk, grep A period ‘.’ matches any single characters.X. matches any X.
CS 403: Programming Languages Fall 2004 Department of Computer Science University of Alabama Joel Jones.
Advanced File Processing. 2 Objectives Use the pipe operator to redirect the output of one command to another command Use the grep command to search for.
Chapter Five Advanced File Processing Guide To UNIX Using Linux Fourth Edition Chapter 5 Unix (34 slides)1 CTEC 110.
Chapter Five Advanced File Processing. 2 Objectives Use the pipe operator to redirect the output of one command to another command Use the grep command.
Module 6 – Redirections, Pipes and Power Tools.. STDin 0 STDout 1 STDerr 2 Redirections.
Pipes and Filters Copyright © Software Carpentry 2010 This work is licensed under the Creative Commons Attribution License See
I/O Redirection and Regular Expressions February 9 th, 2004 Class Meeting 4.
Introduction to Unix – CS 21 Lecture 12. Lecture Overview A few more bash programming tricks The here document Trapping signals in bash cut and tr sed.
Introduction to Unix (CA263) File Processing (continued) By Tariq Ibn Aziz.
Chapter Five Advanced File Processing. 2 Lesson A Selecting, Manipulating, and Formatting Information.
LIN Unix Lecture 7 Hana Filip. LIN Text Processing Command Line Utility Programs (cont.) sed LAST WEEK wc sort tr uniq awk TODAY join paste.
Chapter Four I/O Redirection1 System Programming Shell Operators.
I/O Redirection & Regular Expressions CS 2204 Class meeting 4 *Notes by Doug Bowman and other members of the CS faculty at Virginia Tech. Copyright
Advanced Text Processing. 222 Lecture Overview  Character manipulation commands cut, paste, tr  Line manipulation commands sort, uniq, diff  Regular.
CS 124/LINGUIST 180 From Languages to Information Unix for Poets (in 2013) Christopher Manning Stanford University.
Awk- An Advanced Filter by Prof. Shylaja S S Head of the Dept. Dept. of Information Science & Engineering, P.E.S Institute of Technology, Bangalore
– Introduction to the Shell 1/21/2016 Introduction to the Shell – Session Introduction to the Shell – Session 3 · Job control · Start,
CS 124/LINGUIST 180 From Languages to Information
1 Lecture 10 Introduction to AWK COP 3344 Introduction to UNIX.
Lecture 1: Introduction, Basic UNIX Advanced Programming Techniques.
ORAFACT Text Processing. ORAFACT Searching Inside Files grep - searches for patterns within files grep [options] [[-e] pattern] filename [...] -n shows.
Uniq The uniq command is useful when you need to find duplicate lines in a file. The basic format of the command is uniq in_file out_file In this format,
UNIX commands Head More (press Q to exit) Cat – Example cat file – Example cat file1 file2 Grep – Grep –v ‘expression’ – Grep –A 1 ‘expression’ – Grep.
Lesson 6-Using Utilities to Accomplish Complex Tasks.
In the last class, Filters and delimiters The sample database pr command head and tail commands cut and paste commands.
6/13/2016Course material created by D. Woit 1 CPS 393 Introduction to Unix and C START OF WEEK 3 (UNIX) 6/13/2016Course material created by D. Woit 1.
Filters and Utilities. Notes: This is a simple overview of the filtering capability Some of these commands are very powerful ▫Only showing some of the.
AWK One tool to create them all AWK Marcel Nijenhof Eth-0 11 Augustus 2010.
Introduction to Textutils sort, uniq, wc, cut, grep, sed, awk ● Steve Walsh ● Linux Users of Victoria ● November, 2007.
Regular Expressions Copyright Doug Maxwell (
Cs288 Intensive Programming in Linux
Lesson 5-Exploring Utilities
CS 124/LINGUIST 180 From Languages to Information
The UNIX Shell Learning Objectives:
Chapter 6 Filters.
Linux command line basics III: piping commands for text processing
PROGRAMMING THE BASH SHELL PART IV by İlker Korkmaz and Kaya Oğuz
CS 403: Programming Languages
INTRODUCTION TO UNIX: The Shell Command Interface
CS 124/LINGUIST 180 From Languages to Information
Folks Carelli, Instructor Kutztown University
The Linux Command Line Chapter 6
Guide To UNIX Using Linux Third Edition
CS 124/LINGUIST 180 From Languages to Information
Lab 7: Filtering.
Presentation transcript:

Software I: Utilities and Internals Lecture 5 – Filters * Modified from Dr. Robert Siegfried original presentation

What Are Filters? A filter is a UNIX program that reads input (usually stdin), performs some transformation on it and writes it (usually to stdout). This follows the UNIX/Linux model of building simple components and then combining them to create more powerful applications. We might use grep or tail to select some of our input, sort to sort it, wc to count characters and/or lines, etc.

Examples of Filters UNIX filters include: grep – selects lines from standard input based on whether they contain a specified pattern. There are also egrep and fgrep. sort – places lines of input in order sed – "stream editor" – allows the user to perform certain specified transformation on the input. awk – named for Alfred Aho, Peter Weinberger and Brian Kernighan, it offers much more power in transforming input than sed.

sort sort sorts lines of input in ASCII order. The user has a certain amount of control over which column is used as the sort key: sort –f - fold upper and lower case together sort –d - sorts by "dictionary order", ignores everything except blanks and alphanumerics sort – n - sorts by numeric order sort –o filename - places sorted output in filename sort +number - skip the first number columns

sort – Some Examples ls | sort –f sort files in alphabetic order ls –s | sort –n sort small files first ls –s | sort –nr sort large files first ls –l | sort –nrk5 sort by byte count largest first who | sort –nk4 sort by login, oldest first sort k3f fleas sort by third field (ignore case)

uniq Retain only duplicate lines Retain only unique lines uniq can do one of 4 different things: Retain only duplicate lines Retain only unique lines Eliminate duplicate lines Count how many duplicate lines there are. Syntax uniq [-cdu] [infile [outfile]] -c prefixes line with number of occurrences -d only print duplicate lines -u only print unique lines

uniq – An Example [SIEGFRIE@panther ~]$ cat data Barbara Al Kathy [SIEGFRIE@panther ~]$ uniq -d data [SIEGFRIE@panther ~]$ uniq -u data

[SIEGFRIE@panther ~]$ uniq data Barbara Al Kathy [SIEGFRIE@panther ~]$ uniq -c data 1 Barbara 2 Al 1 Kathy [SIEGFRIE@panther ~]$

uniq and sort Together [SIEGFRIE@panther ~]$ cat CS270 Dan George Alice Roger Stuart Abigail Steven [SIEGFRIE@panther ~]$ cat CS271 Polly Molly Solly

Show those taking both classes (duplicates): [SIEGFRIE@panther ~]$ sort CS270 CS271 | uniq -d Abigail Alice Dan Steven Show those taking only one class (uniques): [SIEGFRIE@panther ~]$ sort CS270 CS271 | uniq -u George Molly Polly Roger Solly Stuart

List everyone taking either class [SIEGFRIE@panther ~]$ sort CS270 CS271 | uniq Abigail Alice Dan George Molly Polly Roger Solly Steven Stuart

Now list the number of classes each is taking [SIEGFRIE@panther ~]$ sort CS270 CS271 | uniq -c 2 Abigail 2 Alice 2 Dan 1 George 1 Molly 1 Polly 1 Roger 1 Solly 2 Steven 1 Stuart [SIEGFRIE@panther ~]$

comm comm file1 file2 – compare file1 and file2 line by line and prints the output in 3 columns: lines appearing in file1 only lines appearing in file2 only lines appearing in both files comm -1 suppress column 1 comm -2 suppress column 2 comm -3 suppress column 3

comm – An Example [SIEGFRIE@panther ~]$ cat CS270 Dan George Alice Roger Stuart Abigail Steven [SIEGFRIE@panther ~]$ cat CS271 Polly Molly Solly

List only those in csc270: [SIEGFRIE@panther ~]$ comm -1 CS270 CS271 Dan Alice Steven Polly Molly Solly Abigail List only those in 271: [SIEGFRIE@panther ~]$ comm -2 CS270 CS271 George Roger Stuart

List those in both [SIEGFRIE@panther ~]$ comm -3 CS270 CS271 Alice George Roger Steven Polly Molly Solly Abigail Stuart OR sort first with : comm -1 <(sort -n csc270file) <(sort -n csc271file)

tr tr –c complements tr –s replace sequences with a single char tr translates characters in a file tr a-z A-Z maps lower-case letters into upper case. tr A-Z a-z maps upper-case letters into lower case. tr –c complements tr –s replace sequences with a single char

tr – An Example [SIEGFRIE@panther ~]$ cat bin/sq cat $* | tr -sc A-Za-z '\012' | # Compress nonletters into newlines sort | # sort them uniq -c | # give a count sort -n | # sort by that count tail | # print the last term pr -5 # in 5 columns [SIEGFRIE@panther ~]$