Data Representation :: Compression


Data Representation :: Compression jamie@drfrostmaths.com www.drfrostmaths.com @DrFrostMaths Last modified: 19th July 2019

www.drfrostmaths.com: Everything is completely free. Why not register? Registering on the DrFrostMaths platform allows you to save all your code and progress in the various Computer Science mini-tasks. Your code on any mini-tasks will be preserved. It also gives you access to the maths platform, allowing you to practise GCSE and A Level questions from Edexcel, OCR and AQA.
Note: The Tiffin/DFM Computer Science course uses JavaScript as its core language. Most code examples are therefore in JavaScript.
Using these slides: Green question boxes can be clicked while in Presentation mode to reveal their answers. Slides are intentionally designed to double up as revision notes for students, while being optimised for classroom usage. The Mini-Tasks on the DFM platform are purposely ordered to correspond to these slides, giving you flexibility over your lesson structure.

Learning Objectives
Directly from the OCR specification: [specification extract not captured in this transcript]
Not in the syllabus: compression algorithms, i.e. how data is compressed (although we will touch upon these just for your interest).

The need for compression
Compression is reducing the amount of data needed for a file/data stream. There are many reasons why we’d want to use compression:
1. Web pages load more quickly. Some of the larger JavaScript files on DrFrostMaths I run through a tool at www.jscompress.com. This removes whitespace, renames variables to single letters and uses various syntax shortcuts to reduce code length. The resulting file is more than 50% smaller. By convention, ‘minified’ JavaScript files are given names ending in .min.js.

2. Files take up less storage space. A ‘zip’ file is a compressed collection of files. It has the advantage of treating a directory as a single file (making it easier to send), and it also takes up less overall space.

3. Files/data take up less bandwidth. Bandwidth is the amount of data that can be transferred in a fixed amount of time. Having to download less data may save on your mobile phone bill! Chrome on Android phones can use a ‘compression proxy’: requests for web data go via Google’s servers, which compress the data before delivering it to your phone.

4. Emails have limited attachment size. While email standards such as MIME don’t have a theoretical maximum attachment size, in practice most email services impose a limit.
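To make reason 1 concrete, here is a small hand-written sketch of what minifying JavaScript does. It is not the output of www.jscompress.com or any other tool; the function names `calculateAverage` and `a` are invented for illustration. Both versions behave identically, but the second drops whitespace and uses single-letter names.

```javascript
// Readable version: descriptive names, whitespace and line breaks.
function calculateAverage(listOfNumbers) {
  let runningTotal = 0;
  for (const currentNumber of listOfNumbers) {
    runningTotal += currentNumber;
  }
  return runningTotal / listOfNumbers.length;
}

// Hand-minified version (illustrative only): same logic, far fewer characters.
function a(l){let t=0;for(const n of l)t+=n;return t/l.length}

// Compare the source lengths to estimate the saving as a percentage.
const originalLength = calculateAverage.toString().length;
const minifiedLength = a.toString().length;
const percentSaved = 100 * (1 - minifiedLength / originalLength);
console.log(`Minified source is about ${percentSaved.toFixed(0)}% smaller`);
```

The saving on a real file depends on how much whitespace and how many long names it contains, so the figure printed here should not be read as typical.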

Lossy vs Lossless Compression
For some data, any compression must allow the full original data to be reconstructed. For example, compressed code would not function correctly if some of the code were lost, and compressed files might similarly be corrupted if some of the original data could not be recovered after uncompressing.
Lossless compression allows the original data to be reconstructed in full. Data is only removed temporarily, while it is in compressed form.
However, for audio or image files, we can sometimes tolerate losing some of the original data, at the expense of quality. [Figure: reducing the audio sample size; axis label: time]
Lossy compression permanently discards some of the data.

Lossy vs Lossless Compression
Lossless
  Advantages: No reduction in quality: an image will look exactly the same and audio will sound exactly the same.
  Disadvantages: Relatively small reduction in file size.
  Example audio/visual file types: png (image), gif (image), wav (audio)
Lossy
  Advantages: Larger reduction in file size/reduced bandwidth. Commonly used, therefore most software can read such data.
  Disadvantages: Loses data, so the original can’t be reconstructed. Can’t be used on files which must preserve all data. Loss of quality may be noticeable if compression is high.
  Example audio/visual file types: jpg (image), mp3 (audio)
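The difference between the two kinds of compression can be sketched in a few lines. Run-length encoding is used here as a simple stand-in for lossless compression (it is not a technique named in these slides), and reducing the sample size of audio from 16 bits to 8 bits stands in for lossy compression. All function names are hypothetical.

```javascript
// Lossless stand-in: run-length encoding stores each run as [character, count].
function rleEncode(text) {
  const runs = [];
  for (const ch of text) {
    const last = runs[runs.length - 1];
    if (last && last[0] === ch) {
      last[1] += 1;          // extend the current run
    } else {
      runs.push([ch, 1]);    // start a new run
    }
  }
  return runs;
}

function rleDecode(runs) {
  return runs.map(([ch, count]) => ch.repeat(count)).join("");
}

// Lossy stand-in: keep only the top 8 bits of each 16-bit unsigned sample.
function reduceSampleSize(samples16) {
  return samples16.map((s) => s >> 8);
}

// Scaling back up cannot restore the discarded low bits.
function restoreSampleSize(samples8) {
  return samples8.map((s) => s << 8);
}

const text = "aaabccdddd";
console.log(rleDecode(rleEncode(text)) === text); // original recovered in full

const samples = [1000, 1001, 65535];
console.log(restoreSampleSize(reduceSampleSize(samples))); // not the original values
```

The encoded text decodes back to exactly the original string, whereas the two distinct samples 1000 and 1001 become indistinguishable once the low bits are discarded.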

JPEGs
We saw on the previous slide that JPGs result in a permanent loss in image quality. For images with blocks of colour (such as the example image on the slide), JPEG compression tends to introduce quite a lot of ‘noise’, so PNGs/GIFs tend to be better for ‘graphic art’. JPEG compression tends to work much better on photos, and JPEG is the file format typically output by cameras.
[Figure: the same image at a decreasing compression rate]
We can customise the amount of JPEG compression. Higher compression reduces file size but also reduces quality, as the images on the slide demonstrate.

For your interest :: Image Compression Algorithms (Not in the syllabus)
PNG compression
[Source: Pink Kitty 111] This image shows the ‘relative cost’ (in number of bits) of each pixel, with blue the fewest bits and red the most. As you can see, areas of the same colour take up less space. Repeating textures (e.g. the ends of the bananas) also take up less space because of how the compression algorithm works.

PNG compression uses an algorithm known as DEFLATE, which has two stages:
#1 :: LZSS Compression
This identifies repeating sequences of characters and replaces each repeat with a reference back to an earlier occurrence. For images, this corresponds to efficiently compressing repeating segments within the image. In the slide’s example, (5,3) means we’re reusing the word that starts at position 5 (the ‘S’) and is 3 characters long.
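The idea of (position, length) references can be sketched as follows. This is a simplified illustration rather than the exact scheme inside DEFLATE: positions are counted from 0 at the start of the text, the whole earlier text is searched, and the minimum match length of 3 is an assumption. The function names are hypothetical.

```javascript
// Repeats shorter than this are stored as plain characters (assumed threshold).
const MIN_MATCH = 3;

// Returns a list of tokens: either a single character (literal)
// or a [position, length] pair referring back to earlier text.
function lzssEncode(text) {
  const tokens = [];
  let i = 0;
  while (i < text.length) {
    let bestPos = -1;
    let bestLen = 0;
    // Try every earlier start position and keep the longest match.
    for (let start = 0; start < i; start++) {
      let len = 0;
      // For simplicity the match must lie entirely within already-seen text.
      while (start + len < i && i + len < text.length &&
             text[start + len] === text[i + len]) {
        len++;
      }
      if (len > bestLen) {
        bestLen = len;
        bestPos = start;
      }
    }
    if (bestLen >= MIN_MATCH) {
      tokens.push([bestPos, bestLen]);
      i += bestLen;
    } else {
      tokens.push(text[i]);
      i += 1;
    }
  }
  return tokens;
}

// Rebuilds the text by copying literals and expanding each reference.
function lzssDecode(tokens) {
  let out = "";
  for (const token of tokens) {
    if (typeof token === "string") {
      out += token;
    } else {
      const [pos, len] = token;
      out += out.slice(pos, pos + len);
    }
  }
  return out;
}

console.log(lzssEncode("abcabcabc")); // three literals, then two references to position 0
```

Because decoding rebuilds the text exactly, this stage is lossless: the more repetition the data contains, the more characters are replaced by short references.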

#2 :: Huffman Coding
Typically we would use the same number of bits for each character, e.g. 8 bits. But it would be more space-efficient to use a varying number of bits for each character, so that more common characters use fewer bits and less common characters use more bits.
Suppose we had just 4 letters used in our data: a, b, c, d. The decimals on the slide show the proportion of the time each letter appears, e.g. ‘a’ appears 40% of the time (and thus should have the fewest bits).
Symbol  Code
a       0
b       10
c       110
d       111
Because no code is a prefix of any other code (e.g. 10 doesn’t appear as the first two digits of any other code), there is no ambiguity in how the string is split up.
e.g. 0110101110110101110 splits as 0 110 10 111 0 110 10 111 0, which decodes to acbdacbda.
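The worked example above can be checked with a short sketch. The code table is taken from the slide, with the code 0 for ‘a’ inferred from the decoded example; the function names are hypothetical. Decoding reads one bit at a time and emits a symbol as soon as the bits read so far match a code, which is only safe because no code is a prefix of another.

```javascript
// Code table from the slide ('a' -> "0" is inferred from the worked example).
const CODES = new Map([
  ["a", "0"],
  ["b", "10"],
  ["c", "110"],
  ["d", "111"],
]);

// Replace each symbol with its variable-length code.
function huffmanEncode(text, codes = CODES) {
  let bits = "";
  for (const ch of text) {
    if (!codes.has(ch)) throw new Error(`No code for symbol: ${ch}`);
    bits += codes.get(ch);
  }
  return bits;
}

// Read bits left to right, emitting a symbol whenever a full code is seen.
function huffmanDecode(bits, codes = CODES) {
  const symbols = new Map();
  for (const [symbol, code] of codes) symbols.set(code, symbol);

  let out = "";
  let current = "";
  for (const bit of bits) {
    current += bit;
    if (symbols.has(current)) {
      out += symbols.get(current);
      current = "";
    }
  }
  if (current !== "") throw new Error("Trailing bits do not form a complete code");
  return out;
}

const encoded = huffmanEncode("acbdacbda");
console.log(encoded, encoded.length);   // 19 bits in total
console.log(huffmanDecode(encoded));
```

At a fixed 8 bits per character the same 9 letters would need 72 bits, compared with 19 bits here.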

JPEG compression
JPEG compression is considerably more complicated and uses a large amount of mathematics. But a summary:
1. The colour model is converted from RGB (Red-Green-Blue) to Y’CbCr, where Y’ represents brightness and Cb and Cr are two colour components. Because the brightness is confined to a single value (rather than spread across R, G and B), and because human visual perception is dominated by brightness over colour, we can compress colour information more efficiently.
2. The brightness (Y’) of each pixel is initially preserved, but each of the colour components Cb and Cr is ‘downsampled’, such that each 2×2 block is replaced with a single colour value.
3. Each 8×8 block undergoes a Discrete Cosine Transform for each of Y’, Cb and Cr, which represents the block as a sum of 2D cosine waves. [Figure: a single 2D cosine wave]
4. We use bit encoding techniques similar to those used for PNGs, e.g. Huffman encoding.
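The first three steps of this summary can be sketched as follows. The conversion uses the full-range coefficients of the JFIF convention; averaging each 2×2 block is an assumption (the slides only say each block is replaced by a single value), as is an image with even width and height. The function names are hypothetical, and a real JPEG encoder has further stages not shown here.

```javascript
// Step 1: convert one RGB pixel (0-255 per channel) to brightness + two colour components.
function rgbToYCbCr(r, g, b) {
  return {
    y: 0.299 * r + 0.587 * g + 0.114 * b,
    cb: 128 - 0.168736 * r - 0.331264 * g + 0.5 * b,
    cr: 128 + 0.5 * r - 0.418688 * g - 0.081312 * b,
  };
}

// Step 2: replace each 2x2 block of a colour channel with its average.
// Assumes the channel is a 2D array with even width and height.
function downsample2x2(channel) {
  const out = [];
  for (let row = 0; row < channel.length; row += 2) {
    const outRow = [];
    for (let col = 0; col < channel[row].length; col += 2) {
      const sum = channel[row][col] + channel[row][col + 1] +
                  channel[row + 1][col] + channel[row + 1][col + 1];
      outRow.push(sum / 4);
    }
    out.push(outRow);
  }
  return out;
}

// Step 3: 2D Discrete Cosine Transform of an 8x8 block.
// Each output value says how much of one 2D cosine wave the block contains.
function dct8x8(block) {
  const N = 8;
  const scale = (k) => (k === 0 ? Math.SQRT1_2 : 1);
  const coeffs = [];
  for (let u = 0; u < N; u++) {
    const rowOut = [];
    for (let v = 0; v < N; v++) {
      let sum = 0;
      for (let x = 0; x < N; x++) {
        for (let y = 0; y < N; y++) {
          sum += block[x][y] *
            Math.cos(((2 * x + 1) * u * Math.PI) / 16) *
            Math.cos(((2 * y + 1) * v * Math.PI) / 16);
        }
      }
      rowOut.push(0.25 * scale(u) * scale(v) * sum);
    }
    coeffs.push(rowOut);
  }
  return coeffs;
}

// A flat block has only one non-zero coefficient (the top-left one).
const flatBlock = Array.from({ length: 8 }, () => Array(8).fill(10));
console.log(dct8x8(flatBlock)[0][0]);
```

Downsampling quarters the number of colour values stored, and for smooth blocks most cosine coefficients come out close to zero, which is what makes the later encoding stage effective.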

Exam Question (OCR Sample Question Paper)
Notice that you need to give the practical implication of each benefit (even if it’s really obvious!)

Coding Mini-Tasks Return to the DrFrostMaths site to complete the various tasks on compression.