Data Compression. How Is This Possible? Entire King James Bible : 4,834,757 bytes Zip Archive Containing It: 1,339,843 bytes.

Presentation transcript:

Data Compression

How Is This Possible? Entire King James Bible: 4,834,757 bytes. Zip archive containing it: 1,339,843 bytes.
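
A quick sketch of the arithmetic behind those two figures. The variable names are illustrative and not from the slides; only the two byte counts are.

```python
original_bytes = 4_834_757   # King James Bible as plain text (from the slide)
zipped_bytes = 1_339_843     # zip archive containing it (from the slide)

ratio = zipped_bytes / original_bytes
print(f"Archive is {ratio:.1%} of the original size")
print(f"Space saved: {1 - ratio:.1%}")
```

So the archive is a bit more than a quarter of the original size.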

More Questions Why does this file: Compress differently than:

Behind The Scenes Compression used for: – ~50% of web traffic – Most audio/video files – Sometimes for every file on a drive

Trick 1: Describe the contents of this file in as few words as possible…

Trick 1: Run Length Encoding : – Describe repetition as: (How many times)What to repeat – A

RLE Examples ABABABABABAB → 6AB; AAABBBBBAAACC → 3A,5B,3A,2C; 111110010101010101 → (5)1,(1)0,(6)01
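
A minimal sketch of run length encoding for single-character runs, in the "count then symbol" style of the 3A,5B,3A,2C example. The function names are made up for illustration, and this simple version does not detect repeated multi-character groups such as 6AB.

```python
from itertools import groupby

def rle_encode(text):
    """Encode each run of identical characters as '<count><char>'."""
    return ",".join(f"{len(list(run))}{char}" for char, run in groupby(text))

def rle_decode(encoded):
    """Invert rle_encode. Assumes the symbols are not digits or commas."""
    if not encoded:
        return ""
    out = []
    for pair in encoded.split(","):
        count, char = int(pair[:-1]), pair[-1]
        out.append(char * count)
    return "".join(out)

print(rle_encode("AAABBBBBAAACC"))   # 3A,5B,3A,2C
print(rle_decode("3A,5B,3A,2C"))     # AAABBBBBAAACC
```

Decoding gives back exactly the original text, which is what makes this a lossless technique.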

RLE Fail ABCDEF 1A,1B,1C,1D,1E,1F

RLE Fail 2 This file doesn't just have A's: 80A,1newline,80A,1newline,80A,1newline…
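
Both failure cases can be seen by listing the runs an encoder would emit. This is a sketch only; `runs` is a hypothetical helper name, and the three-line sample stands in for the file on the slide.

```python
from itertools import groupby

def runs(text):
    """Return the (count, symbol) pairs a run-length encoder would emit."""
    return [(len(list(group)), symbol) for symbol, group in groupby(text)]

# No repetition at all: one pair per character, so nothing is saved.
print(runs("ABCDEF"))

# Lines of 80 A's: each line shrinks, but the same two pairs repeat
# for every line and RLE has no way to say "same as the previous line".
lines_of_a = ("A" * 80 + "\n") * 3
print(runs(lines_of_a))
```

The second output is the same (80, 'A'), (1, newline) pair over and over, which is the kind of repetition the next trick targets.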

Trick 2 Same As Earlier: – Describe patterns with instructions to go back x and copy y characters ABCDEFG-b7c7 "Write down ABCDEFG, then go back 7 characters and copy the next 7 characters to the end of what you have"

Same As Earlier ABCDEFG-b7c7

Same As Earlier ABCDEFG-b7c7 ABCDEFG

Same As Earlier ABCDEFG-b7c7 ABCDEFGA

Same As Earlier ABCDEFG-b7c7 ABCDEFGAB

Same As Earlier ABCDEFG-b7c7 ABCDEFGABC

Same As Earlier ABCDEFG-b7c7 ABCDEFGABCD

Same As Earlier ABCDEFG-b7c7 ABCDEFGABCDE

Same As Earlier ABCDEFG-b7c7 ABCDEFGABCDEF

Same As Earlier ABCDEFG-b7c7 ABCDEFGABCDEFG

Same As Earlier AB-b2c6

Same As Earlier AB-b2c6 AB

Same As Earlier AB-b2c6 ABA

Same As Earlier AB-b2c6 ABAB

Same As Earlier AB-b2c6 ABABA

Same As Earlier AB-b2c6 ABABAB

Same As Earlier AB-b2c6 ABABABA

Same As Earlier AB-b2c6 ABABABAB

Same As Earlier AB-b2c2-C-b2c5

Same As Earlier AB-b2c2-C-b2c5 AB

Same As Earlier AB-b2c2-C-b2c5 ABA

Same As Earlier AB-b2c2-C-b2c5 ABAB

Same As Earlier AB-b2c2-C-b2c5 ABABC

Same As Earlier AB-b2c2-C-b2c5 ABABCB

Same As Earlier AB-b2c2-C-b2c5 ABABCBC

Same As Earlier AB-b2c2-C-b2c5 ABABCBCB

Same As Earlier AB-b2c2-C-b2c5 ABABCBCBC

Same As Earlier AB-b2c2-C-b2c5 ABABCBCBCB
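
The walkthroughs above can be sketched as a small decoder for the slides' notation. Assumptions: tokens are separated by '-', a token of the form b<x>c<y> means "go back x characters and copy y characters", and literal text never contains '-' or looks like an instruction. The function name `decode` is illustrative.

```python
import re

def decode(instructions):
    """Expand a string such as 'ABCDEFG-b7c7' into the text it describes."""
    out = []
    for token in instructions.split("-"):
        match = re.fullmatch(r"b(\d+)c(\d+)", token)
        if match is None:
            out.extend(token)              # literal characters, written as-is
            continue
        back, count = int(match.group(1)), int(match.group(2))
        start = len(out) - back
        if back < 1 or start < 0:
            raise ValueError(f"cannot go back {back} characters")
        for i in range(count):
            # Copy one character at a time so the copy can run into
            # text that this same instruction has just written.
            out.append(out[start + i])
    return "".join(out)

print(decode("ABCDEFG-b7c7"))      # ABCDEFGABCDEFG
print(decode("AB-b2c6"))           # ABABABAB
print(decode("AB-b2c2-C-b2c5"))    # ABABCBCBCB
```

Copying character by character is what lets AB-b2c6 work even though only two characters exist when the copy starts.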

Shorter Symbol Trick Normally text is 8-bit ASCII – 8bits = 256 possibilities

Shorter Symbol Trick If the message is just A's and B's, we are wasting space by giving each one a full 8 bits. Why not: A = 0, B = 1

Shorter Symbol Trick Shorter Symbol Trick: – Use minimum number of bits to represent different symbols in message – More common symbols get shorter representation

More Common This message: AAAABAAC Three symbols, need 2 bits each – Could do AAAABAAC in 16 bits (8 symbols × 2 bits)
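
A sketch of the fixed-length version. The particular bit patterns are an assumption (symbols numbered in alphabetical order); the slide only fixes the width at 2 bits.

```python
import math

message = "AAAABAAC"
symbols = sorted(set(message))                          # ['A', 'B', 'C']
bits_per_symbol = math.ceil(math.log2(len(symbols)))    # 2 bits cover up to 4 symbols

# Hypothetical assignment: A=00, B=01, C=10.
codes = {s: format(i, f"0{bits_per_symbol}b") for i, s in enumerate(symbols)}
encoded = "".join(codes[s] for s in message)

print(codes)
print(encoded, len(encoded), "bits")
```

Every symbol costs the same 2 bits here, however often it appears.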

More Common But A is more common: AAAABAAC So maybe we can use a shorter code for it: AAAABAAC in 10 bits (1 bit for each of the six A's, 2 bits each for B and C)
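
A sketch of the variable-length version. The code for A (a single 0) follows the slides; the exact codes for B and C are an assumption chosen to be 2 bits each.

```python
# Assumed table: A=0 is from the slides, B=10 and C=11 are illustrative.
codes = {"A": "0", "B": "10", "C": "11"}

message = "AAAABAAC"
encoded = "".join(codes[s] for s in message)
print(encoded, len(encoded), "bits")   # 0000100011 10 bits
```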

Why Does it Work No code is a prefix of another – 0 : it is an A – 1 : keep going ABCAAB
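
A sketch of how a reader can decode such a stream one bit at a time. The code table is assumed (A=0 as on the slide, B=10 and C=11 for illustration), and `decode_prefix` is a made-up name.

```python
def decode_prefix(bits, codes):
    """Decode a bit string using a code in which no code is a prefix of another."""
    inverse = {code: symbol for symbol, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:       # a complete code: emit it and start over
            out.append(inverse[current])
            current = ""
    if current:
        raise ValueError("bit string ended in the middle of a code")
    return "".join(out)

codes = {"A": "0", "B": "10", "C": "11"}
print(decode_prefix("010110010", codes))   # ABCAAB
```

Because a 0 can only ever be an A, the decoder never has to look ahead or back up.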

Why Does it Work A BAD code  – 0 : is it an A? is it the start of a D? ABDA

Building a Code CS160 Reader… – Huffman Code Building
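
A minimal sketch of Huffman code building, assuming the standard approach of repeatedly merging the two least frequent groups with a priority queue. It is not taken from the CS160 reader, and the exact 0/1 labels depend on how ties are broken.

```python
import heapq
from collections import Counter

def huffman_codes(message):
    """Return {symbol: bit string} for a Huffman code built from symbol counts."""
    counts = Counter(message)
    if not counts:
        return {}
    if len(counts) == 1:                       # only one distinct symbol
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: code_so_far}).
    heap = [(n, i, {s: ""}) for i, (s, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)      # the two least frequent groups
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

message = "AAAABAAC"
codes = huffman_codes(message)
total_bits = sum(len(codes[s]) for s in message)
print(codes, total_bits)
```

For AAAABAAC this gives A a 1-bit code and B and C 2-bit codes, for 10 bits in total.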

Lossy Compression Lossless compression : – Can recreate original perfectly – Algorithms: Run length encoding, same as earlier, shorter symbol – Examples: zip files, www traffic

Lossy Compression Lossy compression – Original can NOT be recreated perfectly

My Kids Kb

Every Other Line/Column Removed

Remaining pixels packed back down : 320Kb

Blown back up vs original: Original / Compressed

Only keep every 4th line/column : 81 Kb
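
A sketch of this "throw away lines and columns" idea on a toy grid of numbers standing in for pixels. The function names are illustrative, and blowing the image back up by repeating each kept pixel is an assumption about how the comparison image was made.

```python
def downsample(pixels, step):
    """Keep every step-th row and every step-th column of a 2D pixel grid."""
    return [row[::step] for row in pixels[::step]]

def upsample(pixels, step):
    """Blow the image back up by repeating each kept pixel step x step times."""
    out = []
    for row in pixels:
        wide = [p for p in row for _ in range(step)]
        out.extend(list(wide) for _ in range(step))
    return out

image = [[r * 8 + c for c in range(8)] for r in range(8)]   # toy 8x8 "image"

half = downsample(image, 2)
print(len(half) * len(half[0]), "of 64 pixels kept")        # 16 of 64 pixels kept

restored = upsample(half, 2)
print(restored == image)                                    # False: detail is gone
```

Keeping every other line and column keeps a quarter of the pixels; keeping every 4th keeps a sixteenth, which is in line with the drop from 320Kb to 81Kb.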

Real JPEG Image broken into blocks of pixels

Real JPEG Each block processed separately

Real JPEG Block processed, to look for compressible patterns

Real JPEG Patterns can more or less recreate image
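
A sketch of the first step only, cutting an image into blocks. The 8x8 block size is the one baseline JPEG uses and is not stated in the transcript. The pattern-finding step is not shown, and `split_into_blocks` is a hypothetical helper name.

```python
def split_into_blocks(pixels, size=8):
    """Yield (top, left, block) for each size x size tile of a 2D pixel grid.

    Assumes the height and width are multiples of size.
    """
    for top in range(0, len(pixels), size):
        for left in range(0, len(pixels[0]), size):
            block = [row[left:left + size] for row in pixels[top:top + size]]
            yield top, left, block

image = [[r * 16 + c for c in range(16)] for r in range(16)]   # toy 16x16 image
blocks = list(split_in_blocks(image)) if False else list(split_into_blocks(image))
print(len(blocks), "blocks")   # 4 blocks
```

Each block can then be handled on its own, as the slides describe.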

JPEG at 200%: no compression, low compression, medium compression, high compression