1 Data Compression Engineering Math Physics (EMP) Steve Lyon Electrical Engineering.

Presentation transcript:

1 Data Compression Engineering Math Physics (EMP) Steve Lyon Electrical Engineering

2 Why Compress?
Digital information is represented in bits
– Text: characters (each encoded as a number)
– Audio: sound samples
– Image: pixels
More bits means more resources
– Storage (e.g., memory or disk space)
– Bandwidth (e.g., time to transmit over a link)
Compression reduces the number of bits
– Use less storage space (or store more items)
– Use less bandwidth (or transmit faster)
– Cost is increased processing time/CPU hardware

3 Do we really need to compress?
Video
– TV: 640x480 pixels (ideal US broadcast TV)
– 3 colors/pixel (Red, Green, Blue)
– 1 byte (values from 0 to 255) for each color → ~900,000 bytes per picture (frame)
– 30 frames/second → ~27 MB/sec
– DVD holds ~5 GB → can store ~3 minutes of uncompressed video on a DVD → must compress
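The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: the frame size, color depth, and frame rate come from the slide, while reading "~5 GB" as 5 x 10^9 bytes and the function names are assumptions made for illustration.

```python
# Figures from the slide: 640x480 pixels, 3 colors per pixel,
# 1 byte per color, 30 frames per second.
WIDTH, HEIGHT = 640, 480
BYTES_PER_PIXEL = 3
FRAMES_PER_SECOND = 30
DVD_BYTES = 5 * 10**9  # assumption: "~5 GB" read as 5 x 10^9 bytes


def bytes_per_frame():
    """Size of one uncompressed frame in bytes."""
    return WIDTH * HEIGHT * BYTES_PER_PIXEL


def bytes_per_second():
    """Uncompressed video data rate in bytes per second."""
    return bytes_per_frame() * FRAMES_PER_SECOND


def minutes_on_dvd():
    """How many minutes of uncompressed video fit on the DVD."""
    return DVD_BYTES / bytes_per_second() / 60


if __name__ == "__main__":
    print(bytes_per_frame())    # 921600, i.e. roughly 900,000 bytes
    print(bytes_per_second())   # 27648000, i.e. roughly 27 MB/sec
    print(minutes_on_dvd())     # close to 3 minutes
```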

4 Compression Pipeline
Sender and receiver must agree
– Sender/writer compresses the raw data
– Receiver/reader un-compresses the compressed data
Example: digital photography (compress → uncompress)

5 Two Kinds of Compression
Lossless
– Only exploits redundancy in the data
– So, the data can be reconstructed exactly
– Necessary for most text documents (e.g., legal documents, computer programs, and books)
Lossy
– Exploits both data redundancy and human perception
– So, some of the information is lost forever
– Acceptable for digital audio, images, and video

6 Lossless: Huffman Encoding
Normal encoding of text
– Fixed number of bits for each character
ASCII with seven bits for each character
– Allows representation of 2^7 = 128 characters
– Use 97 for ‘a’, 98 for ‘b’, …, 122 for ‘z’
But, some characters occur more often than others
– Letter ‘a’ occurs much more often than ‘x’
Idea: assign fewer bits to more-popular symbols
– Encode ‘a’ as “000”
– Encode ‘x’ as “ ”

7 Lossless: Huffman Encoding
Challenge: generating an efficient encoding
– Smaller codes for popular characters
– Longer codes for unpopular characters
[Figures: English text frequency distribution; Morse code]
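One way to generate such an encoding is the standard Huffman construction: repeatedly merge the two least frequent groups of symbols, prefixing one group's codes with "0" and the other's with "1". The sketch below assumes symbol frequencies are counted from the input text itself; the function names are illustrative and do not come from the slides.

```python
import heapq
from collections import Counter


def huffman_code(text):
    """Return a dict mapping each symbol in text to a bit string."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: code_so_far}).
    # The tie breaker keeps Python from ever comparing the dicts.
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, group1 = heapq.heappop(heap)  # least frequent group
        w2, _, group2 = heapq.heappop(heap)  # second least frequent group
        merged = {s: "0" + c for s, c in group1.items()}
        merged.update({s: "1" + c for s, c in group2.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(text, code):
    """Concatenate the code word of every character."""
    return "".join(code[ch] for ch in text)


def decode(bits, code):
    """Decode by reading bits until they match a code word (prefix-free)."""
    inverse = {v: k for k, v in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)


if __name__ == "__main__":
    message = "aaaaaaaabbbc"
    code = huffman_code(message)
    bits = encode(message, code)
    print(code, len(bits), "bits vs", 7 * len(message), "bits in 7-bit ASCII")
```

Because frequent symbols are merged last, they end up closest to the root of the tree and receive the shortest codes.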

8 Lossless: Run-Length Encoding
Sometimes the same symbol repeats
– Such as “eeeeeee” or “eeeeetnnnnnn”
– That is, a run of “e” symbols or a run of “n” symbols
Idea: capture the symbol only once
– Count the number of times the symbol occurs
– Record the symbol and the number of occurrences
Examples
– So, “eeeeeee” becomes
– So, “eeeeetnnnnnn” becomes
Useful for fax machines
– Lots of white, separated by occasional black
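Run-length encoding is short enough to sketch in full. The slide's exact output notation did not survive in the transcript, so the (symbol, count) pair format used here is an assumption, as are the function names.

```python
from itertools import groupby


def rle_encode(text):
    """Collapse each run of identical symbols into a (symbol, count) pair."""
    return [(symbol, len(list(run))) for symbol, run in groupby(text)]


def rle_decode(pairs):
    """Expand (symbol, count) pairs back into the original string."""
    return "".join(symbol * count for symbol, count in pairs)


if __name__ == "__main__":
    print(rle_encode("eeeeeee"))        # [('e', 7)]
    print(rle_encode("eeeeetnnnnnn"))   # [('e', 5), ('t', 1), ('n', 6)]
```

Note that an isolated symbol such as the "t" above costs more to store as a pair than as a single character, which is why the technique pays off mainly on data with long runs, like scanned fax pages.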

9 Image Compression
Benefits of reducing the size
– Consume less storage space and network bandwidth
– Reduce the time to load, store, and transmit the image
Redundancy in the image
– Neighboring pixels often the same, or at least similar
– E.g., the blue sky
Human perception factors
– Human eye is not sensitive to high spatial frequencies

Approximating arbitrary functions (curves)
How can we represent some arbitrary function by some simple ones?
Ex. This mountain range [figure]

Approximating with a sum of cosines
[Figure panels: n=0, constant; n=1, ½ wavelength; n=5, 2½ wavelengths; n=15, 7½ wavelengths]

Approximation with 5 terms

Approximation with 15 terms

Approximation with 45 terms

Approximation with 145 terms

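16 Discrete cosine transform
How do we determine the coefficients of each term?
– How much of “3 wavelengths” vs. “47 wavelengths”?
– Look at the fit and tweak the coefficients?
– Maybe for a couple
– Insane for 145
Idea: look at
$$\int_0^{\pi} \cos(nx)\cos(mx)\,dx = \begin{cases} 0 & \text{if } n \ne m \\ \pi/2 & \text{if } n = m \ne 0 \\ \pi & \text{if } n = m = 0 \end{cases}$$
So, if $f(x) = \sum_{n} f_n \cos(nx)$, then $f_n = \frac{2}{\pi}\int_0^{\pi} f(x)\cos(nx)\,dx$ for $n \ge 1$, and $f_0 = \frac{1}{\pi}\int_0^{\pi} f(x)\,dx$.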
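The orthogonality idea can be tried numerically. The sketch below assumes the function lives on the interval [0, π] and uses a simple midpoint rule for the integrals; the interval, the helper names, and the kinked test curve standing in for the "mountain range" are all illustrative choices rather than anything taken from the slides.

```python
import math


def integrate(g, a, b, steps=2000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))


def cosine_coefficients(f, n_terms):
    """Coefficients f_0 .. f_(n_terms-1) of f on [0, pi], via orthogonality."""
    coeffs = []
    for n in range(n_terms):
        integral = integrate(lambda x: f(x) * math.cos(n * x), 0.0, math.pi)
        # The integral of cos(nx)^2 over [0, pi] is pi for n = 0, pi/2 otherwise.
        norm = math.pi if n == 0 else math.pi / 2
        coeffs.append(integral / norm)
    return coeffs


def evaluate(coeffs, x):
    """Sum of coeffs[n] * cos(n x)."""
    return sum(c * math.cos(n * x) for n, c in enumerate(coeffs))


def max_error(f, coeffs, samples=200):
    """Largest gap between f and its cosine approximation on a sample grid."""
    xs = [math.pi * (i + 0.5) / samples for i in range(samples)]
    return max(abs(f(x) - evaluate(coeffs, x)) for x in xs)


if __name__ == "__main__":
    ridge = lambda x: abs(x - 1.0)  # a curve with a sharp kink
    for terms in (5, 15, 45):
        print(terms, max_error(ridge, cosine_coefficients(ridge, terms)))
```

No tweaking by eye is needed: each coefficient comes from one integral, and the printed error shrinks as more terms are kept.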

17 Can manipulate the Fourier coefficients (f_n)
Often, most of the information is in the first few f_n → low frequencies
Ex. “filter” and keep only the low frequencies → compression
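Keeping only the low frequencies can be sketched on sampled data with a discrete cosine transform. The version below is the orthonormal DCT-II and its inverse written out directly. The choice of that variant, the function names, and the test signal are assumptions for illustration, since the slides do not say which discrete form they use.

```python
import math


def dct(samples):
    """Orthonormal DCT-II of a list of samples."""
    n = len(samples)
    out = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                               for i, x in enumerate(samples)))
    return out


def idct(coeffs):
    """Inverse of dct(): rebuild the samples from the coefficients."""
    n = len(coeffs)
    out = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
            total += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(total)
    return out


def keep_low_frequencies(coeffs, keep):
    """Zero every coefficient except the first `keep` (the low frequencies)."""
    return [c if k < keep else 0.0 for k, c in enumerate(coeffs)]


if __name__ == "__main__":
    n = 32
    # A smooth signal: a constant plus one slow cosine.
    signal = [10 + 3 * math.cos(math.pi * (i + 0.5) * 2 / n) for i in range(n)]
    compressed = keep_low_frequencies(dct(signal), keep=4)  # 4 numbers instead of 32
    restored = idct(compressed)
    print(max(abs(a - b) for a, b in zip(signal, restored)))
```

For a smooth signal like this one, nearly all the information sits in the first few coefficients, so discarding the rest changes the reconstruction very little.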

18 Periodic functions
Produces a periodic curve: [figure]
Cosine transforms particularly good for representing periodic signals
– Like sound (music)

19 Example: Digital Audio
Sampling the analog signal
– Sample at some fixed rate
– Each sample is an arbitrary real number
Quantizing each sample
– Round each sample to one of a finite number of values
– Represent each sample in a fixed number of bits
[Figure: 4-bit representation (values 0-15)]
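Sampling and quantizing can be sketched as two small functions. The signal range of -1 to 1, the uniform spacing of the levels, and the function names are assumptions for illustration; the slide only specifies a fixed sampling rate and a 4-bit representation with values 0 to 15.

```python
import math


def sample(signal, rate_hz, duration_s):
    """Evaluate signal(t) at a fixed sampling rate."""
    count = round(rate_hz * duration_s)
    return [signal(i / rate_hz) for i in range(count)]


def quantize(value, bits, lo=-1.0, hi=1.0):
    """Map a real value in [lo, hi] to an integer level in 0 .. 2**bits - 1."""
    levels = 2 ** bits
    clipped = min(max(value, lo), hi)
    level = int((clipped - lo) / (hi - lo) * levels)
    return min(level, levels - 1)  # the top edge belongs to the last level


def dequantize(level, bits, lo=-1.0, hi=1.0):
    """Return the midpoint of the interval that `level` stands for."""
    step = (hi - lo) / 2 ** bits
    return lo + (level + 0.5) * step


if __name__ == "__main__":
    tone = lambda t: math.sin(2 * math.pi * 440 * t)  # a 440 Hz tone
    samples = sample(tone, rate_hz=8000, duration_s=0.002)
    print([quantize(s, bits=4) for s in samples])
```

Quantizing is where information is lost: every sample within one step of width (hi - lo) / 2**bits collapses to the same level.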

20 Example: Digital Audio
Speech
– Sampling rate: 8000 samples/second
– Sample size: 8 bits per sample
– Rate: 64 kbps
Compact Disc (CD)
– Sampling rate: 44,100 samples/second
– Sample size: 16 bits per sample
– Rate: 705.6 kbps for mono, 1.411 Mbps for stereo
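The rates on this slide follow from multiplying sampling rate, sample size, and channel count. A small sketch (the function name is illustrative):

```python
def bit_rate(samples_per_second, bits_per_sample, channels=1):
    """Uncompressed audio bit rate in bits per second."""
    return samples_per_second * bits_per_sample * channels


if __name__ == "__main__":
    print(bit_rate(8000, 8))        # speech: 64000 bps = 64 kbps
    print(bit_rate(44100, 16))      # CD mono: 705600 bps = 705.6 kbps
    print(bit_rate(44100, 16, 2))   # CD stereo: 1411200 bps, about 1.411 Mbps
```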

21 Example: Digital Audio
Audio data requires too much bandwidth
– Speech: 64 kbps is too high for a dial-up modem user
– Stereo music: 1.411 Mbps exceeds most access rates
Compression to reduce the size
– Remove redundancy
– Remove details that humans tend not to perceive
Example audio formats
– Speech: GSM (13 kbps), G.729 (8 kbps), and G (6.4 and 5.3 kbps)
– Stereo music: MPEG 1 layer 3 (MP3) at 96 kbps, 128 kbps, and 160 kbps

Joint Photographic Experts Group (JPEG)
Lossy compression
[Figure: the same image at three file sizes; the surviving labels read 34 KB and 8 KB]

23 Contrast Sensitivity Curve

24 How JPEG works 1
Digital cameras (CCDs) output RGB
– Eyes most sensitive to intensity
– Less sensitive to color variations
Convert image to YCbCr
– Y = intensity ~ (R+G+B)
  – Gives black & white
  – B&W TVs could use that when color TV first came out
– Cb ~ (B – Y)
– Cr ~ (R – Y)
Sometimes leave as RGB, which gives a poorer-quality JPEG
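The slide gives the conversion only up to proportionality. The sketch below fills in the weights commonly used with JPEG files (the JFIF convention based on ITU-R BT.601), which are an assumption here and not something the slides state; the function names are illustrative too.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB values to Y, Cb, Cr (JFIF-style weights, assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # intensity: weighted sum of R, G, B
    cb = 128 + (b - y) / 1.772              # blue difference, centered on 128
    cr = 128 + (r - y) / 1.402              # red difference, centered on 128
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """Invert rgb_to_ycbcr()."""
    r = y + 1.402 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b


if __name__ == "__main__":
    print(rgb_to_ycbcr(128, 128, 128))  # a gray pixel: Cb and Cr sit at 128
    print(rgb_to_ycbcr(200, 30, 30))    # a red pixel: Cr rises well above 128
```

A gray pixel carries no color information, so its Cb and Cr land on the neutral value 128 and only Y varies, which is the black-and-white signal mentioned on the slide.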

25 How JPEG works 2
Either RGB or YCbCr gives 3 8-bit “planes”
– Process separately
Process image in 8-pixel x 8-pixel blocks
– 2-dimensional discrete cosine transform (DCT), N_1 = N_2 = 8
– Just matrix multiplication
– Produces 8x8 matrix (B) of spatial frequencies
– “Quantize”: divide each element by a fixed number
  – High-frequency coefficients divided by larger number
  – If result is small, set to 0 (the lossy part)
  – Can be “lossier” on Cb and Cr than on Y
Lossless compression to squeeze out the 0’s
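The "just matrix multiplication" step and the quantization step can be sketched for a single 8x8 block. The orthonormal DCT matrix below is a standard choice, but the quantization matrix is a made-up example that merely grows toward high frequencies; it is not the table from the JPEG standard, and all names are illustrative.

```python
import math

N = 8  # JPEG block size


def dct_matrix():
    """8x8 orthonormal DCT-II matrix D; row k holds the k-th cosine."""
    rows = []
    for k in range(N):
        scale = math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
        rows.append([scale * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                     for n in range(N)])
    return rows


def matmul(a, b):
    """Plain 8x8 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def dct_2d(block):
    """2D DCT of a block A, computed as B = D * A * D^T."""
    d = dct_matrix()
    return matmul(matmul(d, block), transpose(d))


def example_quant_matrix(base=8, step=4):
    """Hypothetical quantization matrix: larger divisors at higher frequencies."""
    return [[base + step * (i + j) for j in range(N)] for i in range(N)]


def quantize(coeffs, q):
    """Divide each coefficient by its divisor and round; small values become 0."""
    return [[round(coeffs[i][j] / q[i][j]) for j in range(N)] for i in range(N)]


if __name__ == "__main__":
    # A smooth block: brightness rises gently from left to right.
    block = [[100 + j for j in range(N)] for _ in range(N)]
    for row in quantize(dct_2d(block), example_quant_matrix()):
        print(row)
```

For a smooth block almost every quantized entry is zero, and those runs of zeros are what the final lossless stage squeezes out.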

[Figure: block of pixels (really 8 by 8) → 2D discrete cosine transform (DCT) → 2D DCT of block, running from low frequency to high frequency → division by the quantization matrix (which accentuates the low frequencies) and rounding → quantized pixel matrix]

27 JPEG Artifacts
– JPEG does not compress text or diagrams well.
– Here same file size as lossless compression (GIF)
– Get “halos” around letters, lines, etc.
– Lines and text have sharp edges, which JPEG smears
– Get “blotchy” appearance when heavily compressed
– Have 8x8 blocks of all one color: only the constant term in the DCT remained

28 Conclusion
“Raw” digital information often has many more bits than necessary
– Redundancies and patterns we can use
– Information that is imperceptible to people
Lossless compression
– Used when must be able to exactly recreate original
– Find common patterns (letter frequencies, repeats, etc.)
Lossy compression
– Can get very large compression ratios, from a few to 1000’s
– Exploit redundancy and human perception
  – Remove information we (people) don’t need
– Too much compression degrades the signals