

1 CIT 140: Introduction to IT (Slide #1)
CSC 140: Introduction to IT
Advanced File Processing

2 Topics
1. Compressing files
   1. compress
   2. gzip
   3. bzip2
2. Archiving files: tar
3. Sorting files: sort

3 Data Compression
Problem: How can we store X bytes using only Y < X bytes?
Solution: Find redundancies in the data.
1. Run-length encoding
   Encode repetitions as the repeated value and a count. Ex: thethethe -> the3
2. Dictionary encoding
   Build a dictionary of words and encode each word with a number. Common words: the, an, is, this
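A minimal Python sketch of run-length encoding, offered as an illustration rather than as how any real compressor is implemented. The function names are made up for this example. `rle_encode` works on single characters, while `smallest_repeating_unit` handles the slide's case of a whole string built from one repeated unit.

```python
from itertools import groupby


def rle_encode(text):
    """Character-level run-length encoding as (value, count) pairs."""
    return [(char, len(list(run))) for char, run in groupby(text)]


def rle_decode(pairs):
    """Rebuild the original string from (value, count) pairs."""
    return "".join(value * count for value, count in pairs)


def smallest_repeating_unit(text):
    """Return (unit, count) with unit * count == text, using the shortest unit."""
    for size in range(1, len(text) + 1):
        if len(text) % size == 0:
            unit = text[:size]
            count = len(text) // size
            if unit * count == text:
                return unit, count
    return text, 1  # only reached for the empty string


print(rle_encode("aaabbc"))                  # [('a', 3), ('b', 2), ('c', 1)]
print(smallest_repeating_unit("thethethe"))  # ('the', 3)
```

Run-length encoding only pays off when runs are long. Text with few repeats can grow, since every value also carries a count.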

4 Data Compression
"Ask not what your country can do for you -- ask what you can do for your country."
Dictionary: 1 ask, 2 not, 3 what, 4 your, 5 country, 6 can, 7 do, 8 for, 9 you
Encoded version: "1 2 3 4 5 6 7 8 9 -- 1 3 9 6 7 8 4 5."
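The dictionary example can be reproduced with a short Python sketch. It assumes words are numbered in order of first appearance, and that case and punctuation are dropped before encoding. The slide keeps the dash and the period, so that part is a simplification. The function names are illustrative only.

```python
import re

QUOTE = ("Ask not what your country can do for you -- "
         "ask what you can do for your country.")


def dictionary_encode(text):
    """Number each distinct word by first appearance and encode the text."""
    words = re.findall(r"[a-z]+", text.lower())
    dictionary = {}
    codes = []
    for word in words:
        if word not in dictionary:
            dictionary[word] = len(dictionary) + 1
        codes.append(dictionary[word])
    return dictionary, codes


def dictionary_decode(dictionary, codes):
    """Map the numbers back to words (lowercase, without punctuation)."""
    by_number = {number: word for word, number in dictionary.items()}
    return " ".join(by_number[code] for code in codes)


dictionary, codes = dictionary_encode(QUOTE)
print(dictionary)
print(codes)  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 3, 9, 6, 7, 8, 4, 5]
print(dictionary_decode(dictionary, codes))
```

The saving comes from the second half of the sentence, where every word is already in the dictionary.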

5 Compressing Files: compress
compress [-c] [-d] [-l] [-v] file1 [file2, …]
    -c  Send output to stdout.
    -d  Decompress instead of compressing.
    -v  Provide verbose output.

6 Compressing Files Old School
The compress command
compress [options] [file-list]

7 Uncompressing Files Old School
The uncompress command

8 Compressing Files: gzip
gzip [-#] [-c] [-d] [-l] [-v] file1 [file2, …]
    -#  Specify compression level. Default=6.
    -c  Send output to stdout.
    -d  Decompress instead of compressing.
    -l  List compression stats.
    -v  Provide verbose output.

9 Compressing Files: gzip
    > man bash > bash.man
    > man tcsh > tcsh.man
    > ls -l *man
    -rw-r--r-- 1 waldenj 267350 Oct 4 19:48 bash.man
    -rw-r--r-- 1 waldenj 239534 Oct 4 19:48 tcsh.man
    > gzip *.man
    > ls -l *gz
    -rw-r--r-- 1 waldenj 71333 Oct 4 19:45 bash.man.gz
    -rw-r--r-- 1 waldenj 69759 Oct 4 19:45 tcsh.man.gz
    > gzip -l *gz
    compressed  uncompressed  ratio  uncompressed_name
         71333        267350  73.3%  bash.man
         69759        239534  70.8%  tcsh.man
        141092        506884  72.1%  (totals)
    >
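The effect of the compression level can be tried out with Python's standard `gzip` module, which writes the same format as the command. This is a sketch on made-up sample data, not a benchmark. `space_saved` is a hypothetical helper that computes roughly the figure `gzip -l` lists as "ratio"; the exact number from the tool may differ slightly.

```python
import gzip


def space_saved(original, compressed):
    """Percentage of bytes saved by compression."""
    return 100.0 * (1 - len(compressed) / len(original))


# Hypothetical sample data: highly repetitive text compresses very well.
data = b"the quick brown fox jumps over the lazy dog\n" * 2000

for level in (1, 6, 9):  # 6 is the command's default level
    packed = gzip.compress(data, compresslevel=level)
    assert gzip.decompress(packed) == data  # compression is lossless
    print(f"level {level}: {len(data)} -> {len(packed)} bytes, "
          f"{space_saved(data, packed):.1f}% saved")
```

Real text such as a man page is far less repetitive than this sample, so expect savings closer to the 70% range shown in the session.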

10 Uncompressing Files: gunzip
    > gunzip bash.man.gz
    > ls -l *man *gz
    -rw-r--r-- 1 waldenj 267350 Oct 4 19:45 bash.man
    -rw-r--r-- 1 waldenj 69759 Oct 4 19:45 tcsh.man.gz
    > gzip -v bash.man
    bash.man: 73.3% -- replaced with bash.man.gz
    > gzip -dc bash.man.gz | less
    User Commands BASH(1)
    NAME
    bash - GNU Bourne-Again Shell
    …
    > ls -l *man *gz
    -rw-r--r-- 1 waldenj 71333 Oct 4 19:45 bash.man.gz
    -rw-r--r-- 1 waldenj 69759 Oct 4 19:45 tcsh.man.gz
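The session shows that gzip and gunzip replace the file they work on, while `gzip -dc` only prints the contents. A rough Python sketch of those three behaviours follows. The helper names and the sample file name are assumptions made for this example, and the helpers ignore details the real tools handle, such as permissions and timestamps.

```python
import gzip
import os
import shutil
import tempfile


def gzip_file(path, level=6):
    """Write path + '.gz' and remove the original, like plain `gzip path`."""
    target = path + ".gz"
    with open(path, "rb") as src, gzip.open(target, "wb", compresslevel=level) as dst:
        shutil.copyfileobj(src, dst)
    os.remove(path)
    return target


def gunzip_file(path):
    """Restore the file from a path ending in '.gz' and remove the archive."""
    target = path[:-len(".gz")]
    with gzip.open(path, "rb") as src, open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    os.remove(path)
    return target


def gzip_cat(path):
    """Return the decompressed bytes and leave the file alone, like `gzip -dc`."""
    with gzip.open(path, "rb") as src:
        return src.read()


with tempfile.TemporaryDirectory() as workdir:
    sample = os.path.join(workdir, "sample.txt")  # hypothetical file name
    with open(sample, "wb") as handle:
        handle.write(b"hello, gzip\n" * 100)

    packed = gzip_file(sample)
    print(os.listdir(workdir))   # only sample.txt.gz remains
    print(gzip_cat(packed)[:12])
    gunzip_file(packed)
    print(os.listdir(workdir))   # sample.txt is back
```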

11 Modern Compression: bzip2
bzip2 [-#] [-c] [-d] [-v] file1 [file2, …]
    -#  Specify compression level. Default=9.
    -c  Send output to stdout.
    -d  Decompress instead of compressing.
    -v  Provide verbose output.

12 Modern Compression: bzip2
    > bzip2 -v bash.man tcsh.man
    bash.man: 4.821:1, 1.659 bits/byte, 79.26% saved, 267350 in, 55456 out.
    tcsh.man: 4.259:1, 1.878 bits/byte, 76.52% saved, 239534 in, 56236 out.
    > ls -l *bz2
    -rw-r--r-- 1 waldenj 55456 Oct 4 19:45 bash.man.bz2
    -rw-r--r-- 1 waldenj 56236 Oct 4 19:48 tcsh.man.bz2
    > bzip2 -d bash.man.bz2
    > bunzip2 tcsh.man.bz2
    > ls -l *.man
    -rw-r--r-- 1 waldenj 267350 Oct 4 19:45 bash.man
    -rw-r--r-- 1 waldenj 239534 Oct 4 19:48 tcsh.man
    > bzip2 -dc bash.man.bz2 | less
    User Commands BASH(1)
    NAME
    bash - GNU Bourne-Again Shell
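The three figures that `bzip2 -v` prints are simple arithmetic on the input and output sizes. The sketch below recomputes them from the byte counts in the session and then runs a round trip through Python's standard `bz2` module. `bzip2_stats` is a hypothetical helper and the sample text is made up.

```python
import bz2


def bzip2_stats(bytes_in, bytes_out):
    """Return (ratio, bits per byte, percent saved) for the given sizes."""
    ratio = bytes_in / bytes_out
    bits_per_byte = 8 * bytes_out / bytes_in
    percent_saved = 100 * (1 - bytes_out / bytes_in)
    return round(ratio, 3), round(bits_per_byte, 3), round(percent_saved, 2)


# Sizes reported for bash.man and tcsh.man in the session above.
print(bzip2_stats(267350, 55456))  # (4.821, 1.659, 79.26)
print(bzip2_stats(239534, 56236))  # (4.259, 1.878, 76.52)

# Round trip on hypothetical sample data at the default level 9.
data = b"GNU Bourne-Again Shell\n" * 5000
packed = bz2.compress(data, compresslevel=9)
assert bz2.decompress(packed) == data
print(bzip2_stats(len(data), len(packed)))
```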

13 Displaying Compressed Files
zcat: identical to compress -dc
gzcat: identical to gzip -dc
bzcat: identical to bzip2 -dc
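The same idea, reading a compressed file without unpacking it on disk, can be sketched in Python by picking a reader from the file extension. `cat_compressed` and the file names are assumptions made for this example. Python's standard library has no reader for compress's .Z format, so only .gz and .bz2 are handled.

```python
import bz2
import gzip
import os
import tempfile

# Map each supported extension to the matching standard-library opener.
OPENERS = {".gz": gzip.open, ".bz2": bz2.open}


def cat_compressed(path):
    """Return the decompressed text of a .gz or .bz2 file, leaving it in place."""
    extension = os.path.splitext(path)[1]
    opener = OPENERS.get(extension)
    if opener is None:
        raise ValueError(f"unsupported extension: {extension!r}")
    with opener(path, "rt", encoding="utf-8") as handle:
        return handle.read()


with tempfile.TemporaryDirectory() as workdir:
    gz_path = os.path.join(workdir, "note.txt.gz")    # hypothetical file names
    bz_path = os.path.join(workdir, "note.txt.bz2")
    with gzip.open(gz_path, "wt", encoding="utf-8") as handle:
        handle.write("stored with gzip\n")
    with bz2.open(bz_path, "wt", encoding="utf-8") as handle:
        handle.write("stored with bzip2\n")

    print(cat_compressed(gz_path), end="")
    print(cat_compressed(bz_path), end="")
```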

14 Compression Benchmarks
    > ls -l patch*
    -rw-r--r-- 1 waldenj 28944395 Oct 4 19:37 patch-2.6.13
    -rw-r--r-- 1 waldenj 10238237 Oct 4 19:37 patch-2.6.13.Z
    -rw-r--r-- 1 waldenj 5009926 Oct 4 19:37 patch-2.6.13.bz2
    -rw-r--r-- 1 waldenj 6220228 Oct 4 19:37 patch-2.6.13.gz

    Compression Tool    Compression Ratio
    compress            64.6%
    gzip                78.5%
    bzip2               82.7%
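The percentages in the benchmark table follow from the file sizes in the listing: each one is the share of the original size that compression removed. A small Python check, with `percent_saved` as an illustrative helper name:

```python
ORIGINAL = 28944395  # patch-2.6.13, uncompressed

COMPRESSED = {
    "compress": 10238237,  # patch-2.6.13.Z
    "gzip": 6220228,       # patch-2.6.13.gz
    "bzip2": 5009926,      # patch-2.6.13.bz2
}


def percent_saved(original, compressed):
    """Share of the original size removed by compression, in percent."""
    return round(100 * (1 - compressed / original), 1)


savings = {tool: percent_saved(ORIGINAL, size) for tool, size in COMPRESSED.items()}
for tool, saved in savings.items():
    print(f"{tool:<10}{saved}%")
# compress  64.6%
# gzip      78.5%
# bzip2     82.7%
```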

15 Archiving Files: tar
tar [-c] [-t] [-x] [-v] [-f file.tar] file1 [file2, …]
    -c  Create a new tape archive.
    -f  Read or write the specified archive file instead of the tape device.
    -t  List the archive's table of contents.
    -v  Provide verbose output.
    -x  eXtract archive contents.

16 Archiving Files: tar
    > tar -cvf manpages.tar *.man
    bash.man
    tcsh.man
    > ls -l manpages.tar
    -rw-r--r-- 1 waldenj 512000 Oct 4 21:01 manpages.tar
    > tar -tf manpages.tar
    bash.man
    tcsh.man
    > tar -tvf manpages.tar
    -rw-r--r-- waldenj/students 267350 2005-10-04 19:45 bash.man
    -rw-r--r-- waldenj/students 239534 2005-10-04 19:48 tcsh.man
    > mkdir tmp
    > cd tmp
    > tar -xvf ../manpages.tar
    bash.man
    tcsh.man
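The create, list and read steps of the tar session can be sketched with Python's standard `tarfile` module. The file contents are placeholders invented for the example, and the sketch reads a member straight from the archive instead of extracting it to disk as `tar -x` does.

```python
import os
import tarfile
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    names = ["bash.man", "tcsh.man"]
    for name in names:
        with open(os.path.join(workdir, name), "wb") as handle:
            handle.write(f"placeholder text for {name}\n".encode())

    archive = os.path.join(workdir, "manpages.tar")

    # Roughly `tar -cf manpages.tar bash.man tcsh.man`
    with tarfile.open(archive, "w") as tar:
        for name in names:
            tar.add(os.path.join(workdir, name), arcname=name)

    with tarfile.open(archive, "r") as tar:
        # Roughly `tar -tvf manpages.tar`: member names and sizes
        listing = [(member.name, member.size) for member in tar.getmembers()]
        # Read one member's contents without writing it to disk
        first = tar.extractfile("bash.man").read().decode()

print(listing)
print(first, end="")
```

Like the command, a plain tar archive is not compressed. It only bundles files together, which is why the archive in the session is larger than the two files combined.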

17 Other File Compression Tools
PKzip/WinZip: zip, unzip
ARJ: arj, unarj
RAR: rar, unrar

18 Sorting
Ordering a set of items by some criterion.
Systems in which sorting is used include:
- Words in a dictionary.
- Names of people in a telephone directory.
- Numbers.

19 Sorting: sort
sort [-d] [-f] [-i] [-n] [-r] [-u] file1 [file2, …]
    -d  Sort in dictionary order (only blanks and alphanumeric characters are compared).
    -f  Ignore case of letters.
    -i  Ignore non-printable characters.
    -n  Sort in numerical order.
    -r  Reverse order of sort.
    -u  Do not list duplicate lines in output.

20 sort Example
    > cat days.txt
    Sunday
    Monday
    Tuesday
    Wednesday
    Thursday
    Friday
    Saturday
    > sort days.txt
    Friday
    Monday
    Saturday
    Sunday
    Thursday
    Tuesday
    Wednesday

21 sort Example
    > cat days.txt
    Sunday
    Monday
    Tuesday
    Wednesday
    Thursday
    Friday
    Saturday
    > sort -r days.txt
    Wednesday
    Tuesday
    Thursday
    Sunday
    Saturday
    Monday
    Friday

22 sort Example
    > cat numbers.txt
    101
    5571
    58
    2001
    9
    > sort numbers.txt
    101
    2001
    5571
    58
    9
    > sort -n numbers.txt
    9
    58
    101
    2001
    5571
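The numbers example shows why -n matters: without it the lines are compared character by character, so "101" sorts before "58". The Python sketch below reproduces the three examples. `sort_lines` is a hypothetical helper that loosely mirrors the -n, -r, -f and -u options. It compares strings by character code, which matches sort only in a plain C locale, and it assumes every line is an integer when `numeric` is set.

```python
def sort_lines(lines, numeric=False, reverse=False, ignore_case=False, unique=False):
    """Sort text lines, loosely mirroring sort's -n, -r, -f and -u options."""
    if unique:
        lines = list(dict.fromkeys(lines))  # drop exact duplicates, keep order
    if numeric:
        key = int        # assumes every line is a plain integer
    elif ignore_case:
        key = str.upper  # compare as if all letters were upper case
    else:
        key = None       # compare character by character
    return sorted(lines, key=key, reverse=reverse)


days = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"]
numbers = ["101", "5571", "58", "2001", "9"]

print(sort_lines(days))                   # like: sort days.txt
print(sort_lines(days, reverse=True))     # like: sort -r days.txt
print(sort_lines(numbers))                # ['101', '2001', '5571', '58', '9']
print(sort_lines(numbers, numeric=True))  # ['9', '58', '101', '2001', '5571']
```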

