Chapter Five: Advanced File Processing. Lesson A: Selecting, Manipulating, and Formatting Information.


1 Chapter Five Advanced File Processing

2 Lesson A: Selecting, Manipulating, and Formatting Information

3 Objectives
Use the pipe operator to redirect the output of one command to another command
Use the grep command to search for a specified pattern in a file
Use the uniq command to remove duplicate lines from a file

4 Objectives
Use the comm and diff commands to compare two files
Use the wc command to count words, characters, and lines in a file
Use the manipulate and format commands: sed, tr, and pr

5 Advancing Your File Processing Skills – Selection Commands
The select commands extract data

6 Advancing Your File Processing Skills – Manipulation and Transformation Commands
The manipulation and transformation commands alter data and transform it into useful and appealing formats

7 Using the Select Commands: Using Pipes
The pipe operator (|) redirects the output of one command to the input of another command. On most US keyboards the pipe character is typed with Shift plus the backslash (\) key, usually located just above the Enter key; on the keycap it often looks like a broken vertical bar.
– An example is redirecting the output of the ls command to the more command: ls | more
– The pipe operator can connect several commands on the same command line: first_command | second_command | third_command
The output of the first command goes into the second command as input, and the output of the second command goes into the third command as input. For example: cat products | cut -f2 -d: | sort
– This reads the products file, cuts out the second colon-delimited field (the description), and sorts the result by description.
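
A minimal sketch of the three-stage pipeline described above. The contents of the products file and its field layout (number, description, price) are made up for illustration; the slides do not show the actual file.

```shell
# Work in a scratch directory so nothing in the current directory is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data, one record per line: number:description:price
printf '%s\n' '103:Widget:4.50' '101:Gadget:9.99' '102:Bracket:1.25' > products

# cut keeps field 2 (-f2) using ':' as the delimiter (-d:); sort orders the result.
descriptions=$(cat products | cut -f2 -d: | sort)
printf '%s\n' "$descriptions"
```

Each stage only sees the output of the stage before it, so sort receives the descriptions alone, not the full records.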

8 Using Pipes
Using pipe operators to connect commands is useful when viewing directory information

9 Using the grep Command
Used to search one or more files for a specific pattern, such as a word or phrase
grep's options and wildcard support allow for powerful search operations
You can increase grep's usefulness by combining it with other commands, such as head or tail

10 Using the grep Command
grep can take input from other commands and can also provide input to other commands

11 grep
grep is extremely useful for finding files based on the text they contain.
– Options:
  -i ignores case
  -l lists only the names of files that contain a match
  -c counts the matching lines
  -r searches recursively through all subdirectories beneath the named directory
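
The four options can be tried on a small scratch directory, as sketched below. The file names and their contents are invented for the example. Note that -r is supported by GNU and BSD grep but is not required by older UNIX versions.

```shell
# Build a scratch directory tree with three hypothetical log files.
workdir=$(mktemp -d)
mkdir -p "$workdir/notes/old"
printf '%s\n' 'Error: disk full' 'all clear' 'error: retry' > "$workdir/notes/log1.txt"
printf '%s\n' 'nothing to report' > "$workdir/notes/log2.txt"
printf '%s\n' 'ERROR: archived' > "$workdir/notes/old/log3.txt"
cd "$workdir/notes" || exit 1

# -i matches "error" in any letter case; -c prints the number of matching lines.
match_count=$(grep -ic 'error' log1.txt)

# -l prints only the names of the files that contain a match.
matching_files=$(grep -il 'error' log1.txt log2.txt)

# -r descends into subdirectories; sort makes the order of names predictable.
recursive_files=$(grep -ril 'error' . | sort)

printf '%s\n' "$match_count" "$matching_files" "$recursive_files"
```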

12 grep
grep can also search for text within the output of a command.
– For example, the ps (process status) command can display all of the processes currently running on the system. To see only the lines that mention a particular user, type: ps aux | grep username

13 Using the uniq Command
Removes duplicate lines from a file
It compares only consecutive lines, so the input must be sorted for every duplicate to be removed
The -d option outputs one copy of each line that has a duplicate
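
The sketch below shows why uniq needs sorted input, using an invented list of words in which no two duplicates are adjacent.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'pear' 'apple' 'pear' 'fig' 'apple' 'pear' > "$workdir/fruit"

# Unsorted: no duplicate lines are next to each other, so uniq removes nothing.
unsorted_count=$(uniq "$workdir/fruit" | wc -l)

# Sorted first: duplicates become consecutive and uniq collapses them.
distinct=$(sort "$workdir/fruit" | uniq)

# -d prints one copy of each line that occurs more than once.
repeated=$(sort "$workdir/fruit" | uniq -d)

printf '%s\n' "$distinct"
printf '%s\n' "$repeated"
```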

14 Using the comm Command
Used to identify lines that two sorted files have in common
Unlike uniq, it does not remove duplicates, and it works with two files rather than one
It compares file1 and file2 line by line and produces three-column output
– Column one contains lines found only in file1
– Column two contains lines found only in file2
– Column three contains lines found in both files
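
A short sketch of the three columns, using two invented sorted name lists. The -1, -2, and -3 options, which suppress the corresponding column, are not mentioned on the slide; they are standard comm options used here to isolate one column at a time.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'alice' 'bob' 'carol' > file1
printf '%s\n' 'bob' 'carol' 'dave' > file2

# Full three-column output (columns are separated by tabs).
comm file1 file2

# Suppress columns 2 and 3 to keep only the lines unique to file1, and so on.
only_in_file1=$(comm -23 file1 file2)
only_in_file2=$(comm -13 file1 file2)
in_both=$(comm -12 file1 file2)
```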

15 Using the diff Command
Attempts to determine the minimal set of changes needed to convert file1 to file2
The output displays the line(s) that differ
The associated codes in the output indicate which lines must be added (a), deleted (d), or changed (c) for the files to match
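
As an illustration, the sketch below compares two invented three-line files. In diff's default output, lines taken from file1 are prefixed with < and lines taken from file2 with >.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'one' 'two' 'three' > file1
printf '%s\n' 'one' 'three' 'four' > file2

# diff exits with status 1 when the files differ, so "|| true" keeps a
# strict (set -e) shell from treating that as a failure.
changes=$(diff file1 file2 || true)
printf '%s\n' "$changes"
```

Here the output contains a d (delete) entry for the line "two" and an a (add) entry for the line "four", which together describe how to turn file1 into file2.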

16 Using the wc Command
Used to count the number of lines, words, and bytes or characters in text files
You may specify all three options in one invocation of the command
If you don't specify any options, you see counts of lines, words, and bytes (in that order)
wc -c filename displays the byte count of the file
wc -l filename displays the line count of the file
wc -w filename displays the word count of the file

17 Using the wc Command
The options for the wc command:
-l for lines
-w for words
-c for bytes (use -m to count characters)
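
A quick sketch of the three counts on an invented two-line file. Reading the file through < makes wc print the number alone, without the file name.

```shell
workdir=$(mktemp -d)
printf 'the quick brown fox\njumps over\n' > "$workdir/sample"

line_count=$(wc -l < "$workdir/sample")
word_count=$(wc -w < "$workdir/sample")
byte_count=$(wc -c < "$workdir/sample")

# With no options, wc prints lines, words, and bytes, followed by the file name.
wc "$workdir/sample"
```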

18 Using the Manipulate and Format Commands
These commands are sed, tr, and pr
They are used to edit and transform the appearance of data before it is displayed or printed

19 Introducing sed
sed is a UNIX editor that allows you to make global changes to large files.
It is a noninteractive stream editor: you specify changes by line number or pattern, on the command line or in a script, and do not use the arrow keys to move around within the file.
Minimum requirements are an input file and a command that lets sed know what actions to apply to the file
sed commands have two general forms
– Specify an editing command on the command line
– Specify a script file containing sed commands

20 Introducing sed
The many options of sed allow you to create new files containing exactly the data you specify

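21 sed options
Options (given with a hyphen)
-n suppresses the default output, so only lines that are explicitly printed appear: sed -n 3,4p file1 prints lines 3 and 4 (p must follow the range to print the specified lines)
-e specifies an editing command; repeat it to apply multiple commands in one invocation
Editing commands (no hyphen)
d deletes the addressed lines
s substitutes specified text
a\ appends text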

22 sed examples
sed [-options] [command] [file(s)]
– Can specify a command to be performed on one or more files
sed [-options] [-f scriptfile] [file(s)]
– Can use a script file to specify a set of commands to be performed on one or more files
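
Both general forms can be sketched on a small file. The contents of file1 and the script name edits.sed are hypothetical; the append command (a\) is left out because its exact syntax varies between sed implementations.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'line one' 'line two' 'line three' 'line four' 'line five' > file1

# -n suppresses the default output; 3,4p then prints only lines 3 and 4.
printed=$(sed -n '3,4p' file1)

# d deletes the addressed lines; all other lines pass through unchanged.
deleted=$(sed '2,4d' file1)

# s substitutes text; the g flag replaces every occurrence on each line.
substituted=$(sed 's/line/row/g' file1 | head -n 1)

# -e can be repeated to apply several commands in one invocation.
combined=$(sed -e '1d' -e 's/five/FIVE/' file1 | tail -n 1)

# -f reads the editing commands from a script file instead.
printf '%s\n' '1,3d' 's/line/row/' > edits.sed
scripted=$(sed -f edits.sed file1)
printf '%s\n' "$scripted"
```

None of these commands modify file1; sed writes the edited text to standard output, which can be redirected to create a new file.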

23 Translating Characters Using the tr Command
tr copies data from the standard input to the standard output, substituting or deleting characters specified by options and patterns
Options
-d deletes the specified characters
-s squeezes each run of a repeated character into a single occurrence
The patterns are strings, and the strings are sets of characters
A popular use of tr is converting lowercase characters to uppercase
– tr '[a-z]' '[A-Z]' < product1 converts all lowercase characters in the file to uppercase
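
The sketch below exercises the translation and both options on an invented product1 file. The patterns are quoted so that the shell does not try to expand the brackets as file name wildcards.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'widget  blue' 'gadget  red' > product1

# Translate every lowercase letter to its uppercase counterpart.
upper=$(tr '[a-z]' '[A-Z]' < product1)

# -d deletes every character in the set (here, the lowercase vowels).
no_vowels=$(tr -d 'aeiou' < product1)

# -s squeezes each run of repeated spaces down to a single space.
squeezed=$(tr -s ' ' < product1)

printf '%s\n' "$upper" "$no_vowels" "$squeezed"
```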

24 Using the pr Command to Format Your Output
pr prints specified files on the standard output in paginated form
By default, pr formats the specified files into single-column pages of 66 lines
Each page has a five-line header containing the file name, its latest modification date, and the current page number, and a five-line trailer consisting of blank lines
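
A small sketch of the default page layout, assuming a pr implementation that pads a short page to its full length (as GNU pr does). The file name list and its contents are invented; the -t and -2 options are standard pr options that the slide does not cover.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'alpha' 'beta' 'gamma' 'delta' > list

# Default layout: even a four-line file fills one whole 66-line page,
# because the header, body padding, and trailer are all emitted.
page_lines=$(pr list | wc -l)

# -t omits the header and trailer, leaving just the text.
body_only=$(pr -t list)

# -2 arranges the text in two columns (shown here without the header).
pr -2 -t list
```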

27 Lesson B: Using UNIX File-Processing Tools to Create an Application

28 Objectives
Design a new file-processing application
Design and create files to implement the application
Use awk to generate formatted output

29 Objectives
Use cut, sort, and join to organize and transform selected file information
Develop customized shell scripts to extract and combine file data
Test individual shell scripts and combine all scripts into a final shell program

30 Designing a New File-Processing Application
The most important phase in developing a new application is the design
The design defines the information an application needs to produce
The design also defines how to organize this information into files, records, and fields, which are called logical structures

31 Designing Records
The first task is to define the fields in the records and produce a record layout
A record layout identifies each field by name and data type (numeric or nonnumeric)
Design the file record to store only those fields relevant to the record's primary purpose

32 Linking Files with Keys
Multiple files are joined by a key: a common field that each of the linked files shares
Another important task in the design phase is to plan a way to join the files
The flexibility to gather information from multiple files composed of simple, short records is the essence of a relational database system. UNIX provides several commands that offer this flexibility
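
One of those commands is join, named in the Lesson B objectives. The sketch below links two flat files on a shared key. The file names, record layouts, and data are all hypothetical, since the transcript does not preserve the deck's actual programmer and project files.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical layouts:
#   programmers: id:name:project_code
#   projects:    project_code:title
printf '%s\n' '101:Johnson:2' '102:King:1' '103:Moore:2' > programmers
printf '%s\n' '1:Payroll' '2:Inventory' > projects

# join expects both inputs sorted on the key, so sort programmers by field 3.
sort -t: -k3,3 programmers > programmers.sorted

# -t: sets the delimiter; -1 3 and -2 1 name the key field in each file.
report=$(join -t: -1 3 -2 1 programmers.sorted projects)
printf '%s\n' "$report"
```

Each output line starts with the key, followed by the remaining fields from the first file and then those from the second.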

34 Creating the Programmer and Project Files
With the basic design complete, you now implement your application design
UNIX file processing predominantly uses flat files. Working with these files is easy because you can create and manipulate them with text editors like vi and Emacs

36 Formatting Output
The awk command is used to prepare formatted output
For the purposes of developing a new file-processing application, we will focus primarily on the printf action of the awk command
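
A minimal sketch of awk's printf action. The programmers file, its colon-delimited layout (id, name, salary), and the chosen column widths are assumptions made for the example.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical records: id:name:salary
printf '%s\n' '101:Johnson:52000' '102:King:61500' > programmers

# -F: splits each record on ':'. %-10s left-justifies the name in 10 columns;
# %8.2f right-justifies the salary in 8 columns with 2 decimal places.
report=$(awk -F: '{ printf "%-10s %8.2f\n", $2, $3 }' programmers)
printf '%s\n' "$report"
```

Fixed-width format specifiers are what make the columns line up regardless of how long each name is.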

37 Formatting Output
awk provides a shortcut to other UNIX commands

38 Using a Shell Script to Implement the Application
Shell scripts should contain:
– The commands to execute
– Comments to identify and explain the script so that users or programmers other than the author can understand how it works
Use the pound (#) character to mark comments in a script file

39 Running a Shell Script
You can run a shell script in virtually any shell that you have on your system
The Bash shell accepts more variations in command structures than other shells
Run the script by typing sh followed by the name of the script, or make the script executable and type ./ before the script name
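
Both ways of running a script are sketched below with a hypothetical one-line script named hello.sh, which also shows # comments in use.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create a tiny commented script (the name and contents are illustrative).
cat > hello.sh <<'EOF'
#!/bin/sh
# hello.sh: prints a greeting so the two ways of running it can be compared
echo "Hello from a script"
EOF

# Option 1: pass the script to a shell explicitly.
via_sh=$(sh hello.sh)

# Option 2: make the script executable, then run it by path.
chmod +x hello.sh
via_path=$(./hello.sh)

printf '%s\n' "$via_sh" "$via_path"
```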

40 Putting It All Together to Produce the Report
An effective way to develop applications is to combine many small scripts in a larger script file
Have the last script added to the larger script print a report indicating script functions and results
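
The idea can be sketched with two small stage scripts and a larger script that runs them in order. The script names and their messages are placeholders, not the scripts from the original deck.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two small scripts, each of which can be tested on its own first.
printf '%s\n' '#!/bin/sh' '# step1.sh: hypothetical first stage' \
    'echo "step 1: programmers listed"' > step1.sh
printf '%s\n' '#!/bin/sh' '# step2.sh: hypothetical second stage' \
    'echo "step 2: projects listed"' > step2.sh

# The larger script runs each stage in turn and ends with a summary line.
cat > report.sh <<'EOF'
#!/bin/sh
# report.sh: combines the tested stages into one program
sh ./step1.sh
sh ./step2.sh
echo "report complete: 2 steps run"
EOF

output=$(sh report.sh)
printf '%s\n' "$output"
```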

43 Chapter Summary
The UNIX file-processing commands can be organized into two categories: (1) select and (2) manipulation and transformation
The uniq command removes duplicate lines from a sorted file
The comm command compares file1 and file2 line by line and shows which lines are unique to each file and which are common to both
The diff command attempts to determine the minimal set of changes needed to convert file1 into file2

44 Chapter Summary
The tr command copies data read from the standard input to the standard output, substituting or deleting the characters specified
The sed command is a stream editor designed to make global changes to large files
The pr command prints files on the standard output in paginated form
The design of a file-processing application reflects what the application needs to produce
Use a record layout to identify each field by name and data type

45 Chapter Summary
Shell programs should contain commands to execute programs and comments to identify and explain the programs. The pound (#) character denotes comments
Write shell scripts in stages so that you can test each part before combining them into one script
Using small shell scripts and combining them in a final shell script file is an effective way to develop applications
