Standard Data Encoding

Summary of Standard Data Encoding Forms
© Alan T. Pinck / Algonquin College; 2003

Character Encoding
Storage Unit = byte
ASCII: ‘A’:41h … ‘Z’:5Ah; add 20h for lower case; ‘0’:30h … ‘9’:39h; blank:20h; carriage return:0Dh; line feed:0Ah
EBCDIC: ‘A’:C1h … ‘I’:C9h; ‘J’:D1h … ‘R’:D9h; ‘S’:E2h … ‘Z’:E9h; subtract 40h for lower case; ‘0’:F0h … ‘9’:F9h; blank:40h

A “byte” is a collection of bits which forms the standard storage unit for the encoded pattern representing a character on some particular computer. While there is some variation in the size of a byte, by far the most common byte size is 8 bits.

The ASCII encoding system was designed as a 7-bit code; 8-bit “extended” versions use the original 7-bit patterns with a leading 0-valued bit. When the leading bit has a 1-value, the code has no standardized meaning in the ASCII system. The upper case letters are encoded in ASCII starting with the letter ‘A’ encoded as 41(hex), ‘B’ as 42(hex), and so on up to ‘Z’ as 5A(hex). Adding the ASCII code value for a space, 20(hex), to the code value for any upper case letter gives the coding pattern for the corresponding lower case letter. The digits ‘0’ to ‘9’ are encoded with the values 30 to 39 (hex) inclusive. The carriage return has a code of 0D(hex) and the line feed has a code of 0A(hex).

In EBCDIC, the first 9 upper case letters, ‘A’ to ‘I’, are encoded with the values from C1 to C9 (hex) inclusive. The next 9 upper case letters, ‘J’ to ‘R’, are encoded with the values from D1 to D9 (hex) inclusive. The last 8 upper case letters, ‘S’ to ‘Z’, are encoded with the values E2 to E9 (hex) inclusive. Lower case letters are encoded by subtracting the value of an EBCDIC space, 40(hex), from the code for the corresponding upper case letter. EBCDIC digits, ‘0’ to ‘9’, are encoded with the values F0 to F9 (hex) inclusive.
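The case-conversion rules above can be checked with a short sketch. It assumes Python's built-in "cp037" codec as a representative EBCDIC code page (the slides do not name a specific code page), and the function names are illustrative only.

```python
# Sketch: ASCII and EBCDIC letter codes and the upper-to-lower case rules.
# Assumption: "cp037" stands in for EBCDIC; the slides do not name a code page.

ASCII_SPACE = 0x20
EBCDIC_SPACE = 0x40


def ascii_to_lower(code: int) -> int:
    """Add 20h (the ASCII space) to an upper case ASCII letter code."""
    if not 0x41 <= code <= 0x5A:
        raise ValueError("not an upper case ASCII letter code")
    return code + ASCII_SPACE


def ebcdic_to_lower(code: int) -> int:
    """Subtract 40h (the EBCDIC space) from an upper case EBCDIC letter code."""
    in_a_to_i = 0xC1 <= code <= 0xC9
    in_j_to_r = 0xD1 <= code <= 0xD9
    in_s_to_z = 0xE2 <= code <= 0xE9
    if not (in_a_to_i or in_j_to_r or in_s_to_z):
        raise ValueError("not an upper case EBCDIC letter code")
    return code - EBCDIC_SPACE


def ebcdic_code(ch: str) -> int:
    """Return the EBCDIC (cp037) code of a single character."""
    return ch.encode("cp037")[0]


if __name__ == "__main__":
    for letter in "AIJRSZ":
        print(f"{letter}: ASCII {ord(letter):02X}h, EBCDIC {ebcdic_code(letter):02X}h")
```

Note that the three EBCDIC letter ranges are not contiguous, so a simple "code minus code of ‘A’" calculation does not give a letter's position in the alphabet the way it does in ASCII.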

Basic Integer Numeric Forms
Storage Unit = word
Unsigned Binary: position numbers (starting at 0 on right); position weights: 2^position
2’s Complement: subtract unsigned binary version of absolute value from 0; or (if working in binary) reverse bits and add 1

In computer architecture terminology, a “word” is a collection of bits used to encode a particular computer’s basic integer form. There is much more variation between computers with respect to word size than byte size. Typical word sizes are 16 bits, 32 bits, or 64 bits.

Within any collection of bits (and especially within collections representing numbers), individual bits are identified by position numbers, starting with the right-most bit as position 0 and going up in steps of 1 as we move to the left. For unsigned binary, the encoded value is equal to the sum of the weights of the bits which are turned “on”. The weight of any particular bit is defined as 2 to the power of its position number.

Although, historically and theoretically, several different forms could be used to encode integer values which may be either negative or positive, in practice the only form in general use is 2’s Complement. In 2’s Complement, positive values are encoded identically to unsigned binary values. Negative values are encoded by subtracting the absolute value of the number to be encoded, in its unsigned binary form, from zero; any borrow out of the left-most position is discarded. Note that this method works whether using binary, hexadecimal (or even octal) arithmetic. Alternatively, a negative value can be encoded in 2’s Complement by writing the binary pattern of its absolute value in unsigned binary, reversing the bits of this pattern, and then adding 1 to the result.
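As a minimal sketch of the rules above, the following decodes an unsigned bit pattern by summing position weights and encodes a negative value in 2’s Complement by both described methods. The function names and the use of bit strings are illustrative assumptions, not part of the original slides.

```python
# Sketch: unsigned binary weights and the two ways of forming a 2's Complement.
# All names here are hypothetical helpers chosen for illustration.

def unsigned_value(bits: str) -> int:
    """Sum 2**position for every bit that is on; position 0 is the right-most bit."""
    return sum(2 ** pos for pos, bit in enumerate(reversed(bits)) if bit == "1")


def twos_complement_by_subtraction(value: int, width: int) -> str:
    """Subtract the absolute value from zero, keeping only `width` bits."""
    result = (0 - abs(value)) % (2 ** width)  # the borrow past the word is discarded
    return format(result, f"0{width}b")


def twos_complement_by_inversion(value: int, width: int) -> str:
    """Write the absolute value in binary, reverse every bit, then add 1."""
    magnitude = format(abs(value), f"0{width}b")
    flipped = "".join("1" if bit == "0" else "0" for bit in magnitude)
    return format((int(flipped, 2) + 1) % (2 ** width), f"0{width}b")


if __name__ == "__main__":
    print(unsigned_value("1011"))
    print(twos_complement_by_subtraction(-5, 8))
    print(twos_complement_by_inversion(-5, 8))
```

Both encoders give the same pattern for any value whose magnitude fits in the word, which is why the two methods can be used interchangeably.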

Standard Integer Numeric Flags
Zero: all bits in result word are 0; may not be a “true” zero, e.g. AAAh + 556h (12-bit word) would turn Zero “on”
Carry: result wrong (too large or less than zero) for unsigned binary
Sign: copy of left-most bit of result word
Overflow: result is wrong if treated as 2’s complement (result sign is logically impossible for given operand signs)

When performing arithmetic operations, especially addition and subtraction, on basic numeric encoded values, 4 flags are commonly used to indicate various possibilities about the results. These flags may later be used to conditionally perform alternate collections of instructions.

The Zero flag is turned on only if all the bits in the result word are turned off; that is, the Zero flag is on if the result word contains the value zero. Notice that in certain situations the result word might contain the value zero (and the Zero flag would be turned on) when the logical result of an addition is not zero but is too large to fit in a word.

The Carry flag is used to indicate that the result of an arithmetic operation is too large to fit into the result word or that it is less than 0. If the Carry flag is on, the result cannot be correctly decoded using unsigned binary.

The Sign flag is a copy of the left-most bit of the result word. It is only relevant when the result word is intended to be interpreted as a signed (2’s complement) value. If the Sign flag is on, the result word represents a value less than 0 using the 2’s complement system.

The Overflow flag is turned on when the result word of an arithmetic operation is incorrect if interpreted using the 2’s complement system. For addition this means that either two positive values have been added together and generated a result which appears to be negative, or two negative values have been added together and generated a result which appears to be positive. Similarly, for subtraction, the Overflow flag will be turned on if a negative number is subtracted from a positive number and the result word appears to be negative, or if a positive number is subtracted from a negative number and the result word appears to be positive.
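The flag rules described above can be sketched as follows. This is an illustration under stated assumptions, not a model of any particular processor: the function names are hypothetical, and the Carry flag is assumed to be set on a borrow during subtraction (some real architectures use the opposite convention).

```python
# Sketch: computing Zero, Carry, Sign and Overflow for a fixed-width add/subtract.
# Assumption: Carry is set when a subtraction needs a borrow (result below zero).

def _flags(a: int, b: int, full: int, width: int, same_sign_rule: bool):
    mask = (1 << width) - 1
    sign_bit = 1 << (width - 1)
    result = full & mask
    signs_equal = (a & sign_bit) == (b & sign_bit)
    result_differs = (result & sign_bit) != (a & sign_bit)
    # Addition overflows when operand signs match; subtraction when they differ.
    operands_qualify = signs_equal if same_sign_rule else not signs_equal
    flags = {
        "zero": result == 0,
        "carry": full > mask or full < 0,
        "sign": bool(result & sign_bit),
        "overflow": operands_qualify and result_differs,
    }
    return result, flags


def add_with_flags(a: int, b: int, width: int = 16):
    """Add two word patterns and report the four flags."""
    mask = (1 << width) - 1
    a, b = a & mask, b & mask
    return _flags(a, b, a + b, width, same_sign_rule=True)


def sub_with_flags(a: int, b: int, width: int = 16):
    """Subtract b from a (both word patterns) and report the four flags."""
    mask = (1 << width) - 1
    a, b = a & mask, b & mask
    return _flags(a, b, a - b, width, same_sign_rule=False)


if __name__ == "__main__":
    # The 12-bit example from the Zero flag summary.
    print(add_with_flags(0xAAA, 0x556, width=12))
```

In the 12-bit example the result word is zero and Carry is on, so a program that checks only the Zero flag would wrongly conclude that the unsigned sum is zero.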

Basic Float Form : 32-bit IEEE-754
1 bit: sign (0=positive; 1=negative)
8 bits: excess-127 binary exponent
23 bits: normalized binary mantissa without leading 1.

There are several different methods for encoding floating point values. The most common of these are the 32-bit and 64-bit IEEE-754 schemes. (Only the 32-bit method is discussed in this course.)

Using the 32-bit IEEE form: the first bit is a sign indicator for the entire float value; the next 8 bits contain an unsigned binary value which is 127 more than the actual binary exponent (this is called an excess-127 value); the remaining 23 bits form a normalized binary mantissa without the leading 1.

A normalized number is composed of a mantissa with exactly one non-zero digit to the left of the (possibly implied) decimal point, plus an exponent which indicates how many positions the decimal point should be moved to the right (or to the left if the exponent is negative) in order to form the real, intended value. For binary encoded values, a normalized mantissa will always start with a 1 (since this is the only non-zero digit); therefore, rather than waste a position on this constant, it is omitted from the actual encoded form.

Once the implied 1. has been inserted before the encoded part of the mantissa and the point has been moved the number of positions indicated by the exponent, the decimal value can be calculated using standard position weights (note that the first position to the right of the binary point has a weight of 0.5, the next position has a weight of 0.25, the next a weight of 0.125, and so on).
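The decoding steps described above can be sketched as below. The function names are illustrative, and the sketch deliberately handles only normalized values: patterns with a stored exponent of 0 or 255 (used by IEEE-754 for zero, denormals, infinities and NaN) are not covered in these notes and are rejected here.

```python
# Sketch: decoding a normalized 32-bit IEEE-754 pattern field by field.
# Assumption: special patterns (stored exponent 0 or 255) are out of scope.
import struct


def decode_float32(pattern: int) -> float:
    """Decode sign, excess-127 exponent and 23-bit mantissa into a value."""
    sign = (pattern >> 31) & 0x1
    stored_exponent = (pattern >> 23) & 0xFF
    fraction_bits = pattern & 0x7FFFFF
    if stored_exponent in (0, 255):
        raise ValueError("special patterns are outside this sketch")

    mantissa = 1.0  # the implied leading 1.
    for i in range(23):
        if fraction_bits & (1 << (22 - i)):
            mantissa += 2.0 ** -(i + 1)  # weights 0.5, 0.25, 0.125, ...

    value = mantissa * 2.0 ** (stored_exponent - 127)
    return -value if sign else value


def float32_pattern(x: float) -> int:
    """Return the 32-bit pattern the struct module produces for x (for comparison)."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


if __name__ == "__main__":
    print(decode_float32(0x40A00000))
    print(hex(float32_pattern(5.0)))
```

For the pattern 40A00000(hex), the stored exponent is 129 (an actual exponent of 2) and the mantissa bits give 1.25, so the value is 1.25 × 4 = 5.0.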

Hybrid Character-Numeric Forms
(sign is always in right-most byte)
Zoned Decimal: Fd Fd … Fd sd
BCD
Packed Decimal (a form of BCD): dd dd … dd ds (always an odd number of digits)
d: hex value in range 0 to 9 inclusive
s: C (hex) positive (also accepts A, E, and F as positive); D (hex) negative (also accepts B as negative)

Zoned decimal values are almost identical to the EBCDIC character encoding of the number when expressed as a decimal value; the one difference is that the right-most byte contains an indication of whether the value is positive or negative. Except for the right-most byte, every digit is represented using a hexadecimal value of F followed by the decimal digit. The right-most byte replaces the F with a value indicating the sign. As generated by a computer system, this value will be either C(hex) for positive or D(hex) for negative values; however, the hex values A, E, and F are treated as valid alternative positive indicators, and the hex value B is treated as an alternative negative indicator.

BCD (or Binary Coded Decimal) is a collection of several different coding forms used to encode numbers in terms of their decimal representation. Packed Decimal is one particular example of BCD. Specifically, a Packed Decimal number can be formed from its corresponding Zoned Decimal pattern by removing all the hexadecimal F digits and shifting the sign indicator value from the second group of 4 bits from the right to the right-most group of 4 bits. The right-most byte will thus contain a decimal digit value and a sign indicator; bytes to the left of this will contain two digits each.
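A minimal sketch of the two hybrid forms follows. The function names are hypothetical, and padding an even digit count with a leading zero digit is an assumption made here so that the packed form fills whole bytes, consistent with the "always an odd number of digits" note.

```python
# Sketch: building Zoned Decimal and Packed Decimal byte patterns.
# Assumption: an even digit count is padded with a leading 0 digit when packing.

POSITIVE_SIGNS = (0xA, 0xC, 0xE, 0xF)
NEGATIVE_SIGNS = (0xB, 0xD)


def zoned_decimal(value: int, digits: int) -> bytes:
    """Encode value as Fd Fd ... sd, with C (positive) or D (negative) as s."""
    text = str(abs(value)).zfill(digits)
    if len(text) > digits:
        raise ValueError("value does not fit in the requested number of digits")
    sign = 0xD if value < 0 else 0xC
    body = [0xF0 | int(d) for d in text[:-1]]
    last = (sign << 4) | int(text[-1])
    return bytes(body + [last])


def zoned_to_packed(zoned: bytes) -> bytes:
    """Drop the F zones and move the sign to the right-most 4 bits: dd ... ds."""
    nibbles = [b & 0x0F for b in zoned] + [zoned[-1] >> 4]
    if len(nibbles) % 2:
        nibbles.insert(0, 0)  # leading zero digit so the nibbles fill whole bytes
    return bytes((nibbles[i] << 4) | nibbles[i + 1] for i in range(0, len(nibbles), 2))


def packed_value(packed: bytes) -> int:
    """Decode a Packed Decimal pattern, accepting the alternative sign codes."""
    nibbles = []
    for b in packed:
        nibbles += [b >> 4, b & 0x0F]
    *digits, sign = nibbles
    if any(d > 9 for d in digits):
        raise ValueError("digit positions must hold 0 to 9")
    magnitude = int("".join(str(d) for d in digits))
    if sign in NEGATIVE_SIGNS:
        return -magnitude
    if sign in POSITIVE_SIGNS:
        return magnitude
    raise ValueError("invalid sign indicator")


if __name__ == "__main__":
    z = zoned_decimal(-1234, 4)
    print(z.hex(), zoned_to_packed(z).hex(), packed_value(zoned_to_packed(z)))
```

The decoder accepts the alternative sign indicators (A, E, F as positive; B as negative), while the encoder only ever produces C or D, matching the description of computer-generated values.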

End of Lecture
