Data Compression.


Presentation on theme: "Data Compression."— Presentation transcript:

1 Data Compression

2 Behind The Scenes
Compression is used for:
~50% of web traffic
Most audio/video files
Sometimes every file on a drive

3 How Is This Possible? Entire King James Bible : 4,834,757 bytes
Zip Archive Containing It: 1,339,843 bytes
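As a quick check on how much the zip archive saves, the two byte counts from this slide can be compared directly. This is only arithmetic on the slide's numbers; the variable names are made up for illustration.

```python
original_bytes = 4_834_757   # King James Bible, figure from the slide
zipped_bytes = 1_339_843     # zip archive containing it, figure from the slide

ratio = original_bytes / zipped_bytes      # how many times smaller the zip is
fraction = zipped_bytes / original_bytes   # zip size as a share of the original

print(f"compression ratio: {ratio:.2f} to 1")
print(f"zip is {fraction:.1%} of the original size")
```

The archive is a bit more than a quarter of the original size, so roughly 3.6 bytes of text are stored per byte of archive.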

4 More Questions Why does this file: Compress differently than:

5 Trick 1: Run Length Encoding
Describe repetition as: (How many times)What to repeat A

6 RLE Examples
ABABABABABAB : 6AB
AAABBBBBAAACC : 3A,5B,3A,2C
111110010101010101 : (5)1,(1)0,(6)01
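The RLE examples can be reproduced with a minimal sketch like the one below. The function names are hypothetical, and the encoder is deliberately simple: it only finds runs of a single repeated character, so it produces 3A,5B,3A,2C but would not discover a multi-character unit such as 6AB on its own. The decoder accepts any repeated unit.

```python
from itertools import groupby

def rle_encode(text):
    """Collapse each run of identical characters into a (count, char) pair."""
    return [(len(list(group)), char) for char, group in groupby(text)]

def rle_decode(pairs):
    """Expand (count, unit) pairs back into a string; a unit may be several characters."""
    return "".join(unit * count for count, unit in pairs)

pairs = rle_encode("AAABBBBBAAACC")
print(",".join(f"{count}{char}" for count, char in pairs))   # 3A,5B,3A,2C
print(rle_decode([(6, "AB")]))                               # ABABABABABAB
print(rle_decode([(5, "1"), (1, "0"), (6, "01")]))           # 111110010101010101
```

Note that run length encoding only helps when runs are long: a string with no repetition would grow, since every character gains a count.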

7 Trick 2: Same As Earlier
Describe patterns with instructions to go back x characters and copy y characters.
ABCDEFG-b7c7 : "Write down ABCDEFG, then go back 7 characters and copy the next 7 characters to the end of what you have"
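The "go back x, copy y" instructions can be followed mechanically. Below is a minimal sketch of a decoder for the notation used on these slides; the function name `expand` and the parsing rules (pieces separated by `-`, with `b<x>c<y>` meaning a back-copy and anything else meaning literal text) are assumptions made for illustration, not a real file format.

```python
import re

def expand(instructions):
    """Decode strings like 'AB-b2c6': literal text, or 'b<x>c<y>' = go back x, copy y."""
    out = []
    for token in instructions.split("-"):
        match = re.fullmatch(r"b(\d+)c(\d+)", token)
        if match is None:
            out.extend(token)              # plain text: write it down as-is
            continue
        back, count = int(match.group(1)), int(match.group(2))
        if back < 1 or back > len(out):
            raise ValueError(f"cannot go back {back} characters")
        start = len(out) - back
        for i in range(count):
            # Copy one character at a time, so the copy may run into
            # characters that this same instruction has just written.
            out.append(out[start + i])
    return "".join(out)

print(expand("ABCDEFG-b7c7"))     # ABCDEFGABCDEFG
print(expand("AB-b2c6"))          # ABABABAB
print(expand("AB-b2c2-C-b2c5"))   # ABABCBCBCB
```

Copying character by character is what lets `b2c6` work even though only two characters exist when the copy starts: each copied character becomes available for the next step.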

8 Same As Earlier ABCDEFG-b7c7

9 Same As Earlier ABCDEFG-b7c7 ABCDEFG

10 Same As Earlier ABCDEFG-b7c7 ABCDEFG

11 Same As Earlier ABCDEFG-b7c7 ABCDEFGA

12 Same As Earlier ABCDEFG-b7c7 ABCDEFGAB

13 Same As Earlier ABCDEFG-b7c7 ABCDEFGABC

14 Same As Earlier ABCDEFG-b7c7 ABCDEFGABCD

15 Same As Earlier ABCDEFG-b7c7 ABCDEFGABCDE

16 Same As Earlier ABCDEFG-b7c7 ABCDEFGABCDEF

17 Same As Earlier ABCDEFG-b7c7 ABCDEFGABCDEFG

18 Same As Earlier ABCDEFG-b7c7 ABCDEFGABCDEFG

19 Same As Earlier AB-b2c6

20 Same As Earlier AB-b2c6 AB

21 Same As Earlier AB-b2c6 AB

22 Same As Earlier AB-b2c6 ABA

23 Same As Earlier AB-b2c6 ABAB

24 Same As Earlier AB-b2c6 ABABA

25 Same As Earlier AB-b2c6 ABABAB

26 Same As Earlier AB-b2c6 ABABABA

27 Same As Earlier AB-b2c6 ABABABAB

28 Same As Earlier AB-b2c6 ABABABAB

29 Same As Earlier AB-b2c2-C-b2c5

30 Same As Earlier AB-b2c2-C-b2c5 AB

31 Same As Earlier AB-b2c2-C-b2c5 AB

32 Same As Earlier AB-b2c2-C-b2c5 ABAB

33 Same As Earlier AB-b2c2-C-b2c5 ABAB

34 Same As Earlier AB-b2c2-C-b2c5 ABABC

35 Same As Earlier AB-b2c2-C-b2c5 ABABC

36 Same As Earlier AB-b2c2-C-b2c5 ABABCB

37 Same As Earlier AB-b2c2-C-b2c5 ABABCBC

38 Same As Earlier AB-b2c2-C-b2c5 ABABCBCB

39 Same As Earlier AB-b2c2-C-b2c5 ABABCBCBC

40 Same As Earlier AB-b2c2-C-b2c5 ABABCBCBCB

41 Same As Earlier AB-b2c2-C-b2c5 ABABCBCBCB

42 Shorter Symbol Trick
Use the minimum number of bits to represent the different symbols in a message.
More common symbols get a shorter representation.

43 More Common
A is most common in this pattern: AAAABAAC
So maybe we can use a shorter code for it: with A = 0, B = 10, C = 11 the pattern takes 10 bits, versus 16 bits at a fixed 2 bits per symbol.

44 Why Does It Work
No code is a prefix of another.
010110010 decodes to ABCAAB
Reading a bit: 0 : it is an A, 1 : keep going
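The decoding rule can be sketched as follows. The table A = 0, B = 10, C = 11 is the one consistent with the slide's bit string and its 10-bit total; the function names are hypothetical.

```python
CODE = {"A": "0", "B": "10", "C": "11"}   # assumed table, consistent with the slide's example

def encode(message, code=CODE):
    """Replace each symbol with its bit pattern."""
    return "".join(code[symbol] for symbol in message)

def decode(bits, code=CODE):
    """Read bits left to right; emit a symbol as soon as a full code word is seen."""
    lookup = {pattern: symbol for symbol, pattern in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in lookup:          # complete code word: emit it and start over
            out.append(lookup[current])
            current = ""
    if current:
        raise ValueError("bits ended in the middle of a code word")
    return "".join(out)

print(decode("010110010"))       # ABCAAB
print(len(encode("AAAABAAC")))   # 10
```

Because no code word is the start of another, the decoder never has to look ahead or back up: the moment the bits read so far match a code word, that must be the symbol.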

45 Building a Code CS160 Reader… Huffman Code Building
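The slide points to the reader for the details of Huffman code building. As a rough sketch of the usual approach (repeatedly merge the two least frequent groups, prefixing one side with 0 and the other with 1), something like the following works; the function name and the tie-breaking order are choices made here, not taken from the reader.

```python
import heapq
from collections import Counter

def huffman_code(message):
    """Return {symbol: bit string}, built by repeatedly merging the two rarest groups."""
    counts = Counter(message)
    if not counts:
        return {}
    if len(counts) == 1:                   # only one distinct symbol: give it one bit
        return {next(iter(counts)): "0"}
    # Heap entries are (frequency, tie_breaker, {symbol: code_so_far}).
    # The unique tie_breaker keeps Python from ever comparing the dicts.
    heap = [(freq, i, {sym: ""}) for i, (sym, freq) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]

code = huffman_code("AAAABAAC")
total_bits = sum(len(code[s]) for s in "AAAABAAC")
print(code, total_bits)
```

For AAAABAAC this gives A a 1-bit code and B and C 2-bit codes, 10 bits in total. The exact 0/1 labels may differ from a hand-built table, since swapping the two sides of any merge gives an equally short code.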

46 Lossless Compression
Can recreate the original perfectly
Algorithms: run length encoding, same as earlier, shorter symbol
Examples: zip files, web traffic

47 Lossy Compression
The original can NOT be recreated perfectly

48 My Kids Kb

49 Every Other Line/Column Removed

50 Remaining pixels packed back down : 320Kb

51 Blown back up vs original
Original Compressed

52 Only keep every 4th line/column : 81 Kb
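The line/column removal shown on these slides can be sketched on a toy image. Here an "image" is just a list of rows of numbers, and blowing it back up repeats each kept pixel; both function names and the repeat-the-pixel enlargement are assumptions, since the slides do not say how the picture was enlarged.

```python
def downsample(pixels, step):
    """Keep every `step`-th row and column of a 2D list of pixel values."""
    return [row[::step] for row in pixels[::step]]

def upsample(pixels, step):
    """Blow the image back up by repeating each kept pixel step x step times."""
    out = []
    for row in pixels:
        wide = [value for value in row for _ in range(step)]
        out.extend(list(wide) for _ in range(step))
    return out

image = [[10 * r + c for c in range(4)] for r in range(4)]   # toy 4x4 "image"
small = downsample(image, 2)
restored = upsample(small, 2)

print(small)               # [[0, 2], [20, 22]]
print(restored == image)   # False: the dropped pixels are gone for good
```

Keeping every other line and column leaves a quarter of the pixels, and keeping every 4th leaves a sixteenth, which fits the drop from 320Kb to 81 Kb between the two slides.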

53 Real JPEG Image broken into blocks of pixels

54 Real JPEG Each block processed separately

55 Real JPEG Each block is processed to look for compressible patterns

56 Real JPEG The patterns can more or less recreate the image
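JPEG's block processing is based on a discrete cosine transform (DCT) over 8x8 blocks, which rewrites pixel values as weights of cosine-shaped patterns. The sketch below is a heavy simplification under stated assumptions: it transforms a single 8-pixel row rather than a full block, the pixel values are made up, and weak patterns are simply dropped below an arbitrary threshold, whereas real JPEG divides by a quantization table and then entropy-codes the result.

```python
import math

N = 8   # JPEG blocks are 8x8; this sketch handles one 8-pixel row

def dct(samples):
    """Orthonormal DCT-II: express pixel values as weights of cosine patterns."""
    out = []
    for k in range(N):
        scale = math.sqrt((1 if k == 0 else 2) / N)
        out.append(scale * sum(x * math.cos(math.pi * (n + 0.5) * k / N)
                               for n, x in enumerate(samples)))
    return out

def idct(coeffs):
    """Inverse transform: rebuild pixel values from the pattern weights."""
    out = []
    for n in range(N):
        out.append(sum(math.sqrt((1 if k == 0 else 2) / N) * c
                       * math.cos(math.pi * (n + 0.5) * k / N)
                       for k, c in enumerate(coeffs)))
    return out

row = [52, 55, 61, 66, 70, 61, 64, 73]                 # made-up pixel values
coeffs = dct(row)
kept = [c if abs(c) >= 5 else 0.0 for c in coeffs]     # lossy step: drop weak patterns
approx = [round(v) for v in idct(kept)]
print(approx)
```

With nothing dropped, `idct(dct(row))` reproduces the row up to floating-point error; the loss comes only from discarding weights, and the more that are discarded, the rougher the recreated image.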

57 JPEG 200% No compress Low compress Med compress High compress

