Chapter 3: Data Representation

Computers use bits to represent all types of data, including text, numerical values, sounds, images, and animation. How many bits does it take to represent a piece of data that could have one of, say, 1000 values?

If only one bit is used, then there are only two possible values: 0 and 1. If two bits are used, then there are four possible values: 00, 01, 10, and 11. Three bits produce eight possible values: 000, 001, 010, 011, 100, 101, 110, and 111. Four bits produce 16 values; five bits produce 32; six produce 64; and so on. Continuing in this fashion, we see that k bits produce 2^k possible values.

Since 2^9 is 512 and 2^10 is 1024, we need ten bits to represent a piece of data that could have one of 1000 values. Mathematically, this is the "ceiling" of the base-two logarithm, i.e., the number of times you can divide by two (rounding up) until you reach the value one:

1000/2 = 500, 500/2 = 250, 250/2 = 125, 125/2 = 63, 63/2 = 32, 32/2 = 16, 16/2 = 8, 8/2 = 4, 4/2 = 2, 2/2 = 1
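The bit count above can be checked with a short sketch. The function names `bits_needed` and `halving_steps` are illustrative and do not come from the slides; the second function mirrors the repeated halving, rounding up at each step.

```python
def bits_needed(values):
    """Smallest k such that 2**k >= values (the ceiling of log2)."""
    if values < 1:
        raise ValueError("values must be at least 1")
    return (values - 1).bit_length()


def halving_steps(values):
    """Count divisions by two, rounding up, until the value reaches one."""
    steps = 0
    while values > 1:
        values = (values + 1) // 2  # integer division that rounds up
        steps += 1
    return steps


print(bits_needed(1000), halving_steps(1000))
```

Both functions agree for any positive count; the integer-only arithmetic avoids floating-point rounding near exact powers of two.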
Representing Integers with Bits

Two's complement notation was established so that addition of positive and negative integers follows the same pattern as ordinary binary addition.

[Table: 4-bit patterns and their integer values, followed by worked addition examples. In two of the examples the true sum falls outside the 4-bit range, so the displayed results, -7 and 7, are marked OVERFLOW. The individual bit patterns are not recoverable from the source.]
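Since the slide's table did not survive, the following sketch regenerates a 4-bit two's complement table and shows how overflow arises. The helper names and the particular sums (5 + 4 and -4 + -5) are assumptions chosen for illustration; they are not necessarily the examples on the original slide.

```python
BITS = 4


def to_signed(pattern, bits=BITS):
    """Interpret an unsigned bit pattern as a two's complement integer."""
    sign_bit = 1 << (bits - 1)
    return pattern - (1 << bits) if pattern & sign_bit else pattern


def add_wrapped(a, b, bits=BITS):
    """Add two signed values keeping only the low `bits` bits.

    Returns (result, overflowed), where overflowed is True when the
    wrapped result differs from the true sum.
    """
    mask = (1 << bits) - 1
    result = to_signed((a + b) & mask, bits)
    return result, result != a + b


# Print every 4-bit pattern next to the integer it represents.
for pattern in range(1 << BITS):
    print(f"{pattern:04b}  {to_signed(pattern):3d}")

print(add_wrapped(3, -5))   # fits in four bits
print(add_wrapped(5, 4))    # true sum 9 does not fit
print(add_wrapped(-4, -5))  # true sum -9 does not fit
```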
Two's Complement Coding & Decoding

How do we code -44 in two's complement notation using 8 bits?
First, write the value 44 in binary using 8 bits: 00101100
Starting on the right side, skip over all zeros and the first one: the rightmost 100 stays unchanged.
Continue moving left, complementing each bit: 00101 becomes 11010.
The result is -44 in 8-bit two's complement notation: 11010100

How do we decode from two's complement into an integer?
Starting on the right side, skip over all zeros and the first one.
Continue moving left, complementing each bit.
Finally, convert the resulting positive bit code into an integer: 76
So, the original negative bit code must have represented: -76
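The skip-then-complement rule can be sketched on bit strings as follows. The function names are illustrative, and the 8-bit pattern used for the decoding example, 10110100, is an assumption: it is the 8-bit code that decodes to -76, but the pattern itself did not survive in the source.

```python
def negate_bits(bits):
    """Keep everything from the rightmost 1 onward; complement the rest."""
    last_one = bits.rfind("1")
    if last_one == -1:
        return bits  # all zeros: zero negates to itself
    flipped = "".join("1" if b == "0" else "0" for b in bits[:last_one])
    return flipped + bits[last_one:]


def encode(value, width=8):
    """Two's complement bit string for value (range is not checked)."""
    magnitude = format(abs(value), f"0{width}b")
    return negate_bits(magnitude) if value < 0 else magnitude


def decode(bits):
    """Integer represented by a two's complement bit string."""
    if bits[0] == "0":
        return int(bits, 2)
    return -int(negate_bits(bits), 2)


print(encode(-44))         # 8-bit code for -44
print(decode("10110100"))  # assumed example pattern
```

The same `negate_bits` helper serves both directions because negating twice returns the original pattern.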
Representing Real Numbers with Bits

When representing a real number like 17.15 in binary form, a rather complicated approach is taken.
Using only powers of two, we note that 17 is 2^4 + 2^0 and .15 is 2^-3 + 2^-6 + 2^-7 + 2^-10 + 2^-11 + …
So, in pure binary form, 17.15 would be 10001.00100110011…
In "scientific notation", this would be 1.000100100110011… × 2^4

The standard single-precision floating-point format uses 32 bits. The first bit is a sign bit (0 for positive, 1 for negative). The next eight are a bias-127 exponent (i.e., the actual exponent plus 127). The last 23 bits are the mantissa (i.e., the exponent-less scientific notation value, without the leading 1).

So, 17.15 would have the following floating-point notation: 0 10000011 00010010011001100110011
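The three fields can be inspected with Python's standard `struct` module, which packs a value into the 32-bit single-precision layout. This is a minimal sketch; the function name `float32_fields` is illustrative, and 17.15 is used on the assumption that it is the slide's example value.

```python
import struct


def float32_fields(x):
    """Return (sign, exponent, mantissa) bit strings of x as a 32-bit float."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    bits = format(raw, "032b")
    return bits[0], bits[1:9], bits[9:]


sign, exponent, mantissa = float32_fields(17.15)
print(sign, exponent, mantissa)
print("actual exponent:", int(exponent, 2) - 127)  # remove the bias of 127
```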
Representing Text with Bits

ASCII: American Standard Code for Information Interchange

ASCII code was developed as a means of converting text into a binary notation. Each character has a 7-bit representation. For example, CAT would be represented by the bits: 1000011 1000001 1010100
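A minimal sketch of the conversion, using Python's built-in `ord` to look up each character's code. The function name `ascii_bits` is illustrative.

```python
def ascii_bits(text, width=7):
    """Return the 7-bit ASCII pattern of each character in text."""
    patterns = []
    for ch in text:
        code = ord(ch)
        if code > 127:
            raise ValueError(f"{ch!r} is not an ASCII character")
        patterns.append(format(code, f"0{width}b"))
    return patterns


print(" ".join(ascii_bits("CAT")))
```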
Fax Machines

In order to transmit a facsimile of a document over telephone lines, fax machines were developed to essentially convert the document into a grid of tiny black and white rectangles.

[Figure: sample document reading "This important document must be faxed immediately!!!"]

A standard 8.5 × 11 inch page is divided into 1145 rows and 1728 columns, producing approximately 2 million rectangles, each roughly 0.005 × 0.01 inch. Each rectangle is scanned by the transmitting fax machine and determined to be either predominantly white or predominantly black.

We could just use the binary nature of this black/white approach (e.g., 1 for black, 0 for white) to fax the document, but that would require 2 million bits per page!
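The figures above follow from simple arithmetic, sketched here. The variable names are illustrative, and the page is assumed to be 8.5 inches wide by 11 inches tall.

```python
ROWS, COLUMNS = 1145, 1728                 # grid size from the text
PAGE_WIDTH_IN, PAGE_HEIGHT_IN = 8.5, 11.0  # assumed page orientation

cells = ROWS * COLUMNS                     # one bit per rectangle
cell_width_in = PAGE_WIDTH_IN / COLUMNS
cell_height_in = PAGE_HEIGHT_IN / ROWS

print(f"{cells:,} rectangles, so {cells:,} bits uncompressed")
print(f"each about {cell_width_in:.4f} x {cell_height_in:.4f} inch")
```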
CCITT Fax Conversion Code

By using one sequence of bits to represent a long run of a single color (either black or white), the fax code can be compressed to a fraction of the two-million-bit code that would otherwise be needed.

[Table: run length, with the corresponding white code and black code, in three column groups. The table entries are not recoverable from the source.]
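The first step of this scheme, collapsing a scan row into runs of a single color, can be sketched as follows. This shows only the run-length step; the actual CCITT code words that replace each run are not reproduced here because the table is missing from the source. The function name and the sample row are illustrative assumptions.

```python
from itertools import groupby


def run_lengths(row):
    """Collapse a row of '0' (white) and '1' (black) cells into runs."""
    return [(color, len(list(group))) for color, group in groupby(row)]


# Hypothetical scan row: 1728 cells with one short black mark.
row = "0" * 20 + "1" * 5 + "0" * 1703
runs = run_lengths(row)
print(runs)
print(f"{len(row)} cells collapse into {len(runs)} runs")
```

A mostly white row collapses into a handful of runs, which is why assigning short code words to common run lengths shrinks the page so much.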
Binary Code Interpretation

How is the following binary code interpreted?

1010 0111 1011 1101 0000 0110 0101 1100 0011 1100 1111 1100 1010 1110

In "programmer's shorthand" (hexadecimal notation): A7 BD 06 5C 3C FC AE

As ASCII text (seven bits per character): So(space)easy.

As CCITT fax conversion code: [Figure: the same bits decoded as an alternating sequence of white and black runs. The run lengths are not recoverable from the source.]

As a two's complement integer: the leading bit is 1, so the code is the negation of a positive value, namely -24,843,437,912,294,226.
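Three of these readings can be reproduced from the hexadecimal form given above. This is a sketch under the assumption that the code is exactly 56 bits long (14 hexadecimal digits, or eight 7-bit characters); the variable names are illustrative.

```python
CODE_HEX = "A7BD065C3CFCAE"   # hexadecimal form from the text
WIDTH = 4 * len(CODE_HEX)     # 56 bits, an assumption about the code length

unsigned = int(CODE_HEX, 16)
bits = format(unsigned, f"0{WIDTH}b")

# Reading 1: 7-bit ASCII, one character per group of seven bits.
text = "".join(chr(int(bits[i:i + 7], 2)) for i in range(0, WIDTH, 7))

# Reading 2: two's complement, negative when the leading bit is 1.
signed = unsigned - (1 << WIDTH) if bits[0] == "1" else unsigned

print(bits)
print(repr(text))
print(f"{signed:,}")
```

The same bits yield text or a large negative number depending only on the interpretation applied, which is the point of the slide.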
Representing Audio Data with Bits

Audio files are digitized by sampling the audio signal thousands of times per second and then "quantizing" each sample (i.e., rounding off to one of several discrete values).

The ability to recreate the original analog audio depends on the resolution (i.e., the number of quantization levels used) and the sampling rate.
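Sampling and quantizing can be sketched as below. The 440 Hz tone, the 8000 samples-per-second rate, and the 16 quantization levels are assumed values chosen for illustration, and the signal is assumed to stay within the range -1 to 1.

```python
import math


def sample_and_quantize(signal, num_samples, sample_rate, levels):
    """Sample signal(t) and round each sample to one of `levels` values.

    Returns level indices in the range 0 .. levels - 1.
    """
    step = 2 / (levels - 1)  # spacing between adjacent levels over [-1, 1]
    indices = []
    for n in range(num_samples):
        amplitude = signal(n / sample_rate)
        indices.append(round((amplitude + 1) / step))
    return indices


def tone(t):
    """A 440 Hz sine wave (hypothetical test signal)."""
    return math.sin(2 * math.pi * 440 * t)


LEVELS = 16
samples = sample_and_quantize(tone, num_samples=80, sample_rate=8000, levels=LEVELS)
bits_per_sample = (LEVELS - 1).bit_length()
print(samples[:10])
print(f"{bits_per_sample} bits per sample, {len(samples) * bits_per_sample} bits total")
```

Raising `levels` improves the resolution and raising `sample_rate` captures faster changes; both increase the number of bits stored.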
Representing Still Images with Bits

Digital images are composed of three fields of color intensity measurements, separated into a grid of thousands of pixels (picture elements).

The size of the grid (the image's resolution) determines how clearly the image can be displayed.

[Figure: one image rendered at increasing grid sizes, from 2 × 2 up to 512 × 512.]
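The cost of a larger grid can be estimated with a short sketch. It assumes three color measurements per pixel at 8 bits each, a common choice that this slide does not state; the function name is illustrative.

```python
def raw_image_bits(width, height, channels=3, bits_per_channel=8):
    """Bits needed to store an uncompressed image (assumed 8 bits per channel)."""
    return width * height * channels * bits_per_channel


for size in (2, 4, 8, 16, 512):
    print(f"{size} x {size}: {raw_image_bits(size, size):,} bits")
```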
RGB Color Representation

In digital display systems, each pixel in an image is represented as an additive combination of the three primary color components: red, green, and blue.

[Table: TrueColor examples, with columns Red, Green, Blue, and Result. The table entries are not recoverable from the source.]

Printers, however, use a subtractive color system, in which the complementary colors of red, green, and blue (cyan, magenta, and yellow) are applied in inks and toners in order to subtract colors from a viewer's perception.
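The relationship between the two systems can be sketched as follows, assuming 8 bits per component (24-bit TrueColor) and an idealized conversion in which each ink amount is simply the complement of the matching light component. Real printers use more elaborate color models; the function names are illustrative.

```python
def pack_rgb(red, green, blue):
    """Combine three 8-bit components into one 24-bit value."""
    return (red << 16) | (green << 8) | blue


def rgb_to_cmy(red, green, blue):
    """Idealized complement: the ink that absorbs each light component."""
    return 255 - red, 255 - green, 255 - blue


print(f"{pack_rgb(255, 255, 0):06X}")  # full red plus full green
print(rgb_to_cmy(255, 0, 0))           # inks needed to print pure red
```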
Compressing Images with JPEG

The Joint Photographic Experts Group developed an elaborate procedure for compressing color image files:

First, the original image is split into 8 × 8 squares of pixels. Each square is split into three 8 × 8 grids indicating the levels of lighting and of blue and red coloration the square contains.

After the values in the three grids are rounded off to reduce the number of bits needed, each grid is traversed in a zig-zag pattern to maximize the chances that consecutive values will be equal, which, as with fax machines, reduces the bit requirement even further.

Depending on how severely the values were rounded, the restored image will be either a good representation of the original (with a high bit count) or a bad representation (with a low bit count).
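The zig-zag traversal can be sketched as below. The function names are illustrative, and the sample grid is a hypothetical rounded block whose nonzero values sit in the top-left corner, which is the situation the traversal is designed to exploit.

```python
def zigzag_order(n=8):
    """Return (row, column) pairs of an n x n grid in zig-zag order."""
    order = []
    for diagonal in range(2 * n - 1):  # cells where row + column == diagonal
        rows = range(max(0, diagonal - n + 1), min(diagonal, n - 1) + 1)
        if diagonal % 2 == 0:
            rows = reversed(rows)      # even diagonals run toward the top right
        order.extend((row, diagonal - row) for row in rows)
    return order


def zigzag_values(grid):
    """Flatten a square grid into a list following the zig-zag order."""
    return [grid[row][col] for row, col in zigzag_order(len(grid))]


# Hypothetical rounded block: only the top-left corner is nonzero.
grid = [[0] * 8 for _ in range(8)]
grid[0][0], grid[0][1], grid[1][0] = 12, 3, -2
print(zigzag_values(grid))
```

The traversal gathers the nonzero values at the front and leaves one long run of zeros at the end, which a run-length code can then store compactly.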
Representing Video with Bits

Video images are merely a sequence of still images, shown in rapid succession.

One means of compressing such a vast amount of data is to use the JPEG technique on each frame, thus exploiting each image's spatial redundancy. The resulting image frames are called intra-frames (I-frames).

Video also possesses temporal redundancy, i.e., consecutive frames are usually nearly identical, with only a small percentage of the pixels changing color significantly. So video can be compressed further by periodically replacing several I-frames with predictive frames (P-frames), which contain only the differences between the predictive frame and the last I-frame in the sequence. P-frames are generally about one-third the size of the corresponding I-frames.

The Moving Picture Experts Group (MPEG) went even further by using bidirectional frames (B-frames) sandwiched between I-frames and P-frames (and between consecutive P-frames). Each B-frame includes just enough information to allow the original frame to be recreated by blending the previous and next I/P-frames. B-frames are generally about half as big as the corresponding P-frames (i.e., one-sixth the size of the corresponding I-frames).
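The size ratios quoted above give a rough estimate of the savings for a sequence of frames. In this sketch the frame pattern "IBBPBBPBBPBB" is a hypothetical example, not one taken from the slides, and the ratios are the approximate ones stated in the text.

```python
from fractions import Fraction

# Approximate frame sizes relative to an I-frame, as stated in the text.
RELATIVE_SIZE = {"I": Fraction(1), "P": Fraction(1, 3), "B": Fraction(1, 6)}


def relative_size(pattern):
    """Size of a frame sequence relative to coding every frame as an I-frame."""
    total = sum(RELATIVE_SIZE[frame] for frame in pattern)
    return total / len(pattern)


ratio = relative_size("IBBPBBPBBPBB")  # hypothetical 12-frame pattern
print(ratio, f"(about {float(ratio):.0%} of the all-I-frame size)")
```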