Visual Representation of Information SIMS-201 Converting gray scale and color images to binary. Image compression.
Overview Chapter 5: From the real world to Images and Video Introduction to visual representation and display Converting images to gray scale Color representation Video
Introduction to Visual Representation and Display: Images play a fundamental role in the representation, storage and transmission of information. Earlier, we learned how to represent information that was in the form of numbers and text with binary digits. In this chapter we will learn how to represent still and time-varying images with binary digits. “A picture is worth ten thousand words.” But it takes a whole lot more than that!
Image Issues: The world we live in is analog, or continuously varying. Digitizing, or making discrete, involves problems, so we make some approximations and weigh the tradeoffs involved. While digitizing, we need to consider the following facts: we are producing information for human use, and human vision has limitations. We take advantage of this fact to produce displays that are “good enough”.
Digital Information for Humans: Many digital systems take advantage of human limitations (visual, aural, etc.). Human gray scale acuity is 2% of full brightness; that is, most people can detect at most 50 gray levels (6 bits). The human eye can resolve about 60 “lines per degree of visual arc”, a measure of the ability of the eye to resolve fine detail. When we look at an 8.5 x 11” sheet of paper at 1 foot (landscape), the viewing angles are 49.25 degrees for the horizontal dimension and 39 degrees for the vertical dimension. We can therefore distinguish: 49.25 degrees x 60 = 2955 horizontal lines; 39 degrees x 60 = 2340 vertical lines. These numbers give us a clue about the length of the code needed to capture images.
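The viewing angles and line counts quoted on this slide can be reproduced with a little trigonometry. The sketch below is one way to do it, assuming the sheet is centred on the line of sight so that each dimension subtends an angle of 2 x atan((size / 2) / distance). The function names are illustrative and are not part of the course material.

```python
import math

LINES_PER_DEGREE = 60  # resolving power of the eye quoted on the slide


def viewing_angle_deg(size_in, distance_in):
    """Angle (degrees) subtended by a dimension centred on the line of sight.

    Assumption: the viewer looks at the middle of the sheet.
    """
    return 2 * math.degrees(math.atan((size_in / 2) / distance_in))


def resolvable_lines(size_in, distance_in):
    """Number of lines the eye can distinguish across that dimension."""
    return viewing_angle_deg(size_in, distance_in) * LINES_PER_DEGREE


def bits_for_levels(levels):
    """Smallest number of bits that can label `levels` distinct values."""
    return math.ceil(math.log2(levels))


# 8.5 x 11 inch sheet, landscape, viewed from 12 inches
horizontal = viewing_angle_deg(11, 12)   # about 49.25 degrees
vertical = viewing_angle_deg(8.5, 12)    # about 39 degrees

# 2% gray scale acuity gives about 50 levels, which needs 6 bits
gray_levels = round(1 / 0.02)
gray_bits = bits_for_levels(gray_levels)
```

The slide rounds the angles to 49.25 and 39 degrees before multiplying by 60, so the unrounded line counts from `resolvable_lines` differ from 2955 and 2340 by a fraction of a line.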
Visual Arc: “lines per degree of visual arc”. When an image is brought closer to the eye, we can resolve more detail. Humans can resolve 60 lines per degree of visual arc. A line requires two strings of pixels: one black, one white. Pixel: the smallest unit of representation for visual information.
Pixels: A pixel is the smallest unit of representation for visual information. Each pixel in a digitized image represents one intensity (brightness) level (gray scale or color). [Figure: 13 x 13 grid = 169 pixels, shown in gray scale and in color]
To form a black line, we need to arrange a string of black pixels parallel to a string of white pixels. The eye will discern this black-white transition as a line. So, two rows of pixels are needed (one black and one white). For our paper example, the number of pixels needed to represent the total image on a page is: (2 x 2955) x (2 x 2340) = 27,658,800 pixels per page. This number of pixels would be sufficient to represent any image on the page with no visible degradation compared to a perfect (unpixelized) image at a distance of one foot. As the number of pixels that form an image (spatial resolution) decreases, the amount of data that needs to be transmitted, stored or processed decreases as well. However, the tradeoff is that the quality of the image degrades as a result.
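The page-level pixel count follows directly from the two line counts and the two-pixels-per-line rule. A minimal sketch of that arithmetic is shown here; the function name `pixels_for_page` and its `pixels_per_line` parameter are illustrative choices.

```python
def pixels_for_page(horizontal_lines, vertical_lines, pixels_per_line=2):
    """Pixels needed when each resolvable line takes `pixels_per_line` pixels
    (one black row and one white row, as described in the text)."""
    width_px = pixels_per_line * horizontal_lines
    height_px = pixels_per_line * vertical_lines
    return width_px * height_px


total = pixels_for_page(2955, 2340)
print(f"{total:,} pixels per page")  # 27,658,800 pixels per page
```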
A note about printer resolution: When dealing with printers we often quote the resolution in terms of dots per inch (dpi), which corresponds to pixels per inch in our example. It is popular to set laser or ink-jet printers to 600 dpi of resolution. However, if we hold the paper closer, we might see edges between the pixels, because our ability to resolve fine detail increases. Then we would need a higher-resolution printer, for example 720 dpi or 1200 dpi.
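To see why holding the paper closer calls for more dots per inch, the same viewing-angle model can be turned into a rough pixels-per-inch estimate. This is only a sketch that extends the paper example: it assumes 60 lines per degree, two pixels per line, and a sheet centred on the line of sight. The 6-inch viewing distance and the function name `required_ppi` are illustrative assumptions, not figures from the slides.

```python
import math

LINES_PER_DEGREE = 60   # from the slides
PIXELS_PER_LINE = 2     # one black row plus one white row


def required_ppi(size_in, distance_in):
    """Rough pixels per inch needed along one dimension of the page.

    Assumption: the dimension is centred on the line of sight, so it
    subtends 2 * atan((size / 2) / distance) degrees.
    """
    angle = 2 * math.degrees(math.atan((size_in / 2) / distance_in))
    pixels = PIXELS_PER_LINE * LINES_PER_DEGREE * angle
    return pixels / size_in


at_one_foot = required_ppi(11, 12)   # roughly 537 ppi along the 11 inch side
held_closer = required_ppi(11, 6)    # noticeably higher at 6 inches
```

Under these assumptions the one-foot figure lands a little under 600 dpi, and the requirement rises as the page moves closer to the eye.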
How many pixels should be used? If too few pixels are used, the image appears “coarse”. [Figures: 16 x 16 (256 pixels); 64 x 64 (4096 pixels)]
Digitizing Images (gray scale): The first step in digitizing a “black and white” image composed of an array of gray shades is to divide the image into a number of pixels, depending on the required spatial resolution. The number of brightness levels to be represented by each pixel is assigned next. If we wish to use, for example, 6 bits for the brightness level of each pixel, then each pixel can represent one of 64 (2^6) different brightness levels (shades of gray, from black to white, also called grayscale levels). Then each pixel would have a 6-bit number associated with it, representing the brightness level (shade) that is closest to the actual brightness level at that pixel (in the perfect continuous image we are digitizing).
This process is known as quantization (we will learn more about this later in the course) - it is the process of rounding off actual continuous values so that they can be represented by a fixed number of binary digits As a result of the operations just described, the analog image is digitized and represented by a string of binary digits… 1010010101010101010
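Quantization as described above can be sketched in a few lines. The example assumes brightness is given as a number between 0.0 (black) and 1.0 (white) and that the levels are evenly spaced. Both are simplifying assumptions, and the function names are illustrative.

```python
def quantize(brightness, bits):
    """Return the index of the gray level closest to `brightness`.

    `brightness` is assumed to lie in [0.0, 1.0]; there are 2**bits
    evenly spaced levels, numbered 0 (black) to 2**bits - 1 (white).
    """
    if not 0.0 <= brightness <= 1.0:
        raise ValueError("brightness must be between 0.0 and 1.0")
    levels = 2 ** bits
    # Adding 0.5 before truncating rounds to the nearest level.
    return int(brightness * (levels - 1) + 0.5)


def to_bitstring(level, bits):
    """Write a level index as a fixed-width binary string."""
    return format(level, f"0{bits}b")


def encode_pixels(pixels, bits):
    """Concatenate the binary codes of a sequence of brightness values."""
    return "".join(to_bitstring(quantize(p, bits), bits) for p in pixels)


# Three pixels (black, mid-gray, white) at 3 bits per pixel
stream = encode_pixels([0.0, 0.5, 1.0], bits=3)
print(stream)  # 000100111
```

The printed string is the kind of bit sequence the slide refers to: each group of three bits is one pixel's rounded brightness level.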
In the figures below, each pixel in the image is represented by 6 bits, 3 bits and 1 bit. The effect of varying the number of bits used to represent each pixel is evident. [Figures: 6-bit image (64 gray levels); 3-bit image (8 gray levels); 1-bit image (pure black and white)]
How much storage is needed? Total number of bits required for storage = total number of pixels x number of bits used per pixel. For example: a black and white photo of 64 x 64 pixels using 32 gray levels (5 bits) needs 64 x 64 x 5 = 20,480 bits = 2,560 bytes = 2560/1024 KB = 2.5 KB. We recall that data storage is measured in bytes; 1 KB represents 2^10 or 1024 bytes.
Another example: a black and white photo of 256 x 256 pixels with 6 bits per pixel (64 gray levels). How much storage is needed? 256 x 256 x 6 = 393,216 bits; 393,216/8 = 49,152 bytes; 49,152/1024 = 48 KB.
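Both storage examples use the same formula, so they can be checked with one small helper. This is a sketch with an illustrative function name, using the slides' convention of 8 bits per byte and 1024 bytes per KB.

```python
def image_storage(width, height, bits_per_pixel):
    """Return (bits, bytes, kilobytes) for an uncompressed image.

    Uses 8 bits per byte and 1 KB = 1024 bytes, as in the slides.
    """
    bits = width * height * bits_per_pixel
    n_bytes = bits / 8
    kilobytes = n_bytes / 1024
    return bits, n_bytes, kilobytes


small = image_storage(64, 64, 5)     # 32 gray levels
large = image_storage(256, 256, 6)   # 64 gray levels

print(small)  # (20480, 2560.0, 2.5)
print(large)  # (393216, 49152.0, 48.0)
```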
A note about resolution: Since the total number of bits required for storage = total number of pixels x number of bits used per pixel, there are two ways to reduce the number of bits needed (size of file) to represent an image: reduce the total number of pixels (dpi), or reduce the number of bits used per pixel (number of gray levels). Applying these, however, reduces the quality of the image. The first results in low spatial resolution (the image appears coarse). The second results in poor brightness resolution, as seen in the previous slides. A better approach to reducing the amount of storage needed is to apply image compression.
Digitizing Images (color): Any color can be created by adding red, green and blue light in the appropriate proportions. Thus, we can represent a color with 3 numbers indicating the amounts of red, green, and blue light that combine to produce the color. If we wish to digitize a color image, we again first need to divide the image into pixels. Next, we need to know the amount of red, green and blue (RGB) that comprises the color at each pixel location. Finally, we convert these three levels to a binary number of a predefined length. For example, if we use 3 bits for each color value, we can represent 8 intensity levels each of red, green and blue. This representation requires 9 bits per pixel, which allows 512 (2^9) different colors per pixel.
Example: a color photo of 256 x 256 pixels with 9 bits per pixel (3 bits each for red, green and blue). 256 x 256 x 9 = 589,824 bits; 589,824/8 = 73,728 bytes; 73,728/1024 = 72 KB of memory is needed to store this color photo.
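One way to picture the 9-bit color code is to pack the three 3-bit levels side by side into a single number. The sketch below assumes red occupies the highest bits, then green, then blue. That ordering and the function names are illustrative choices, not something the slides specify.

```python
def pack_rgb(r, g, b, bits=3):
    """Pack three color levels into one integer of 3 * bits bits.

    Assumed layout (illustrative): red in the high bits, then green, then blue.
    """
    max_level = (1 << bits) - 1
    for value in (r, g, b):
        if not 0 <= value <= max_level:
            raise ValueError(f"levels must be between 0 and {max_level}")
    return (r << (2 * bits)) | (g << bits) | b


def unpack_rgb(value, bits=3):
    """Recover the (r, g, b) levels from a packed integer."""
    mask = (1 << bits) - 1
    return (value >> (2 * bits)) & mask, (value >> bits) & mask, value & mask


code = pack_rgb(5, 2, 7)
print(format(code, "09b"))   # 101010111
print(unpack_rgb(code))      # (5, 2, 7)

colors = 2 ** 9                       # 512 distinct colors
storage_bits = 256 * 256 * 9          # 589,824 bits
storage_kb = storage_bits / 8 / 1024  # 72.0 KB
```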
Hue Luminance Saturation: Another approach to color representation of images is Hue, Luminance and Saturation (HLS). This system does not represent colors by combinations of other colors, but it still uses 3 numerical values. Hue: represents where the pure color component falls on a scale that extends across the visible light spectrum (from red to violet). Luminance: represents how light or dark a pixel is. Saturation: how “pure” the color is, i.e. how much it is diluted by the addition of white (100% saturation means no dilution with white). Let us look at how this system works with the PowerPoint color palette on this box:
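Python's standard `colorsys` module offers an RGB-to-HLS conversion that can be used to experiment with these three values. Note the assumptions: `colorsys` scales all inputs and outputs to the range 0.0 to 1.0, and its hue runs around a color wheel that starts and ends at red, so it is a close relative of the scale described on the slide rather than an exact match.

```python
import colorsys

# Inputs are (red, green, blue) fractions between 0.0 and 1.0.
# colorsys.rgb_to_hls returns (hue, luminance, saturation) in that order.
pure_red = colorsys.rgb_to_hls(1.0, 0.0, 0.0)
muted_red = colorsys.rgb_to_hls(0.75, 0.25, 0.25)
mid_gray = colorsys.rgb_to_hls(0.5, 0.5, 0.5)
pure_blue = colorsys.rgb_to_hls(0.0, 0.0, 1.0)

print(pure_red)   # (0.0, 0.5, 1.0): fully saturated red
print(muted_red)  # (0.0, 0.5, 0.5): same hue and luminance, half the saturation
print(mid_gray)   # (0.0, 0.5, 0.0): no color at all, only luminance
```

The pure and muted reds share a hue and luminance and differ only in saturation, while the gray pixel has zero saturation.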
Video: Human perception of movement is slow (visual persistence/latency). Studies show that humans can only take in 20 different images per second before they begin to blur together. If these images are sufficiently similar, then the blurring which takes place appears to the eye to resemble motion, in the same way we discern it when an object moves smoothly in the real world. We can detect higher rates of flicker, but only to about 50 per second. This phenomenon has been used since the beginning of the 20th century to produce “moving pictures”, or movies. Movies show 24 frames per second. TV works similarly, but instead of a frame, TV refreshes in lines across the tube. This same phenomenon can be used to create digitized video, a video signal stored in binary form.
Video We have already discussed how individual images are digitized; digital video simply consists of a sequence of digitized still images, displayed at a rate sufficiently high to appear as continuous motion to the human visual system. The individual images are obtained by a digital camera that acquires a new image at a sufficiently fast rate (say, 60 times per second), to create a time-sampled version of the scene in motion Because of human visual latency, these samples at certain instants in time are sufficient to capture all of the information that we are capable of taking in!
Adding up the bits: Assume a screen that is 512 x 512 pixels, about the same resolution as a good TV set. Assume 3 bits per color per pixel, for a total of 9 bits per pixel. Let's say we want the scene to change 60 times per second, so that we don't see any flicker or choppiness. This means we will need 512 x 512 pixels x 9 bits per pixel x 60 frames per second x 3600 seconds = about 510 billion bits per hour, just for the video. Francis Ford Coppola's The Godfather, at over 3 hours, would require about 191 GB (over 191 billion bytes) of memory using this approach. This almost sounds like an offer we can refuse. But do films actually require this much storage? Fortunately, no. The reason we can represent video with significantly fewer bits than in this example is compression: techniques that take advantage of certain predictabilities and redundancies in video information to reduce the amount of information to be stored.
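The video estimate is a straightforward product, which makes it easy to check and to vary. The sketch below uses an illustrative function name and treats the film as exactly 3 hours long, which is an assumption; the slide only says "over 3 hours".

```python
def video_bits(width, height, bits_per_pixel, frames_per_second, seconds):
    """Bits needed for uncompressed video of the given size and duration."""
    return width * height * bits_per_pixel * frames_per_second * seconds


bits_per_hour = video_bits(512, 512, 9, 60, 3600)
print(f"{bits_per_hour:,} bits per hour")    # 509,607,936,000 bits per hour

# Assumed running time of exactly 3 hours
film_bytes = video_bits(512, 512, 9, 60, 3 * 3600) // 8
print(f"{film_bytes:,} bytes")               # 191,102,976,000 bytes
```

Changing the frame rate, bit depth or resolution in the call shows how quickly the uncompressed total grows.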