Chapter 3 Data Representation.

1 Chapter 3 Data Representation

2 Chapter goals
Describe numbering systems and their use in data representation. Compare and contrast various data representation methods. Describe how nonnumeric data is represented.

3 Data representation
Humans have many symbolic forms to represent information: alphabets, numbers, pictograms. A computer can only represent information with electrical signals: is a circuit on or off?

4 Computers, numbers, and binary data
Computers only use on/off signals to represent information. These signals can only represent numeric data. Even character-based data is represented as a number.

5 Why binary data?
An electrical circuit has two easily distinguished states, on and off. Binary numbers use only the digits 0 and 1. Data is stored as collections of binary numbers.

6 Binary numbers are “computer friendly”
Binary numbers are signals that can be transported easily. They can be processed (transformed) by two-state electrical devices that are easy to design and fabricate. These devices (AND gates, OR gates, adders) are strung together like an assembly line to carry out a function. An AND gate has two inputs: if both are true (both 1), the output is true (1); in all other cases the output is false (0). An OR gate has two inputs: if both are false (both 0), the output is false (0); in all other cases the output is true (1).

7 Logic gates
Each gate has two inputs and one output. The input values are looked up in a truth table, and the corresponding value is output.
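
The truth tables for the AND and OR gates described above can be sketched in a few lines of Python. The function names (`and_gate`, `or_gate`, `truth_table`) are illustrative and not from the slides.

```python
from itertools import product


def and_gate(a: int, b: int) -> int:
    """Output is 1 only when both inputs are 1."""
    return a & b


def or_gate(a: int, b: int) -> int:
    """Output is 0 only when both inputs are 0."""
    return a | b


def truth_table(gate):
    """List (input_a, input_b, output) for every combination of two bits."""
    return [(a, b, gate(a, b)) for a, b in product((0, 1), repeat=2)]


if __name__ == "__main__":
    for name, gate in (("AND", and_gate), ("OR", or_gate)):
        print(name)
        for a, b, out in truth_table(gate):
            print(f"  {a} {b} -> {out}")
```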

8 Boolean algebra
A system of logic developed by George Boole, a 19th-century mathematician. Circuits built on it can determine whether two values are equal, not equal, less than, greater than, and so on. Boolean algebra also allows the CPU to carry out binary arithmetic (see White, pp. 36-37).
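
One way to see how Boolean operations add up to binary arithmetic is a ripple-carry adder built from gate-level operations. This is a minimal sketch, not the treatment in White: it assumes an XOR operation (which can itself be built from AND, OR, and NOT), and the names `half_adder`, `full_adder`, and `add_bits` are hypothetical.

```python
def half_adder(a: int, b: int) -> tuple:
    """Add two bits: sum is XOR of the inputs, carry is AND of the inputs."""
    return a ^ b, a & b


def full_adder(a: int, b: int, carry_in: int) -> tuple:
    """Add two bits plus an incoming carry using two half adders and an OR."""
    partial_sum, carry1 = half_adder(a, b)
    total, carry2 = half_adder(partial_sum, carry_in)
    return total, carry1 | carry2


def add_bits(x: int, y: int, width: int = 4) -> tuple:
    """Add two unsigned values bit by bit; return (result, carry_out)."""
    carry = 0
    result = 0
    for i in range(width):  # start at the low-order bit
        bit_sum, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit_sum << i
    return result, carry


if __name__ == "__main__":
    print(add_bits(5, 6))  # 0101 + 0110 fits in 4 bits
    print(add_bits(9, 8))  # 1001 + 1000 produces a carry out of 4 bits
```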

9 Binary numbers
Binary digits can be combined into a positional numbering system. The base for decimal numbers is 10; the base for binary numbers is 2. Each position to the left is worth twice the position to its right (1, 2, 4, 8, ...).

10 Terminology for number systems
The base is also referred to as the radix. Binary numbers have a radix of 2. Decimal numbers have a radix of 10. The radix point separates whole values from fractional values. The decimal point is a kind of radix point.

11 Base 2 positional example
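
A small worked example of positional weights in base 2, including digits to the right of the radix point. The values 1011 and 101.101 are chosen here for illustration and are not taken from the original slide; `binary_to_decimal` is a hypothetical helper name.

```python
def binary_to_decimal(bits: str) -> float:
    """Sum each binary digit times its positional weight (a power of 2)."""
    whole, _, fraction = bits.partition(".")
    if any(digit not in "01" for digit in whole + fraction):
        raise ValueError("only the digits 0 and 1 are allowed")
    value = 0.0
    # Whole part: weights 1, 2, 4, 8, ... moving left from the radix point.
    for position, digit in enumerate(reversed(whole)):
        value += int(digit) * 2 ** position
    # Fractional part: weights 1/2, 1/4, 1/8, ... moving right.
    for position, digit in enumerate(fraction, start=1):
        value += int(digit) * 2 ** -position
    return value


if __name__ == "__main__":
    print(binary_to_decimal("1011"))     # 8 + 0 + 2 + 1
    print(binary_to_decimal("101.101"))  # 4 + 0 + 1 + 1/2 + 0 + 1/8
```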

12 Numbering systems
A higher base (radix) means fewer positions are needed to represent a number. Base 2 needs many more positions than base 10. Base 16 (hexadecimal) is often used to represent binary numbers: it takes fewer positions to represent a number and has a direct correspondence with binary.

13 Computers & binary numbers
Each digit of a binary number is called a bit. A bit string is a group of bits that describes a single value.

14 Bit strings
The leftmost bit (most significant bit) is called the high-order bit. The rightmost bit (least significant bit) is called the low-order bit. 8 bits make a byte. Programming languages, spreadsheets, and similar tools automatically translate from base 10 to base 2 and back again.

15 Hexadecimal notation
The base (radix) is 16, which makes it more compact than binary. The symbols used are 0-9 and A-F. One hexadecimal position corresponds to 4 bits. Hexadecimal is used to designate memory locations and colors (HTML and VB). Octal notation (base 8) is also used (IBM mainframes).
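
The one-hex-digit-to-four-bits correspondence can be checked with a short sketch. `hex_to_binary` is an illustrative name, and the sample values are arbitrary.

```python
def hex_to_binary(hex_digits: str) -> str:
    """Expand each hexadecimal digit into its own group of 4 bits."""
    return " ".join(format(int(digit, 16), "04b") for digit in hex_digits)


if __name__ == "__main__":
    print(hex_to_binary("2F"))  # each digit becomes one 4-bit group
    print(int("FA", 16))        # built-in conversion from base 16 to base 10
    print(format(250, "X"))     # and back from base 10 to base 16
```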

16 Goals of computer data representation
Any representation format for numeric data represents a balance among several factors, including compactness, accuracy, range, ease of manipulation, and standardization.

17 Balancing objectives
Compactness and range pull against each other: the more compact the format, the smaller the range. Accuracy increases with the number of bits used, especially for real numbers. Example: 1/3, or 0.3333... (a non-terminating fraction).

18 Other objectives
Ease of manipulation: does the format make it easier for the processor to perform operations? Standardization: is the data in a standard format, allowing simple transfer between computers?

19 CPU standard data types
Integer, real number, character, Boolean, and memory address. A modern CPU can manage these basic data types. Note that the character type refers to a single letter or symbol.

20 Integer data types
Unsigned: the value is assumed to be positive. Signed: one bit (usually the high-order bit) indicates the sign; 0 is positive, 1 is negative.
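
How the sign bit changes the range of a fixed-width integer can be sketched as follows. The signed range shown assumes two's complement encoding, which is the common choice but is an assumption here; `integer_range` is a hypothetical helper.

```python
def integer_range(bits: int, signed: bool) -> tuple:
    """Return (smallest, largest) value for a fixed-width integer.

    The signed case assumes two's complement, where the high-order
    bit carries a negative weight.
    """
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


if __name__ == "__main__":
    for width in (8, 16, 32):
        print(width, integer_range(width, False), integer_range(width, True))
```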

21 Representing negative integers
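Two common schemes are excess notation and two's complement. Two's complement allows subtraction to be carried out as addition. To negate a number, it is converted to its complement (every bit is flipped) and 1 is added to the result. When the negated value is added to another binary number, the carry bit out of the high-order position is ignored.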
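
The two's complement steps (complement, add 1, ignore the carry) can be sketched on 8-bit values. The width of 8 bits and the names `twos_complement` and `subtract` are choices made for this illustration.

```python
def twos_complement(value: int, bits: int = 8) -> int:
    """Flip every bit of value, then add 1, keeping only `bits` bits."""
    mask = (1 << bits) - 1
    return ((value ^ mask) + 1) & mask


def subtract(a: int, b: int, bits: int = 8) -> int:
    """Compute a - b as a + (two's complement of b), ignoring the carry bit."""
    mask = (1 << bits) - 1
    return (a + twos_complement(b, bits)) & mask  # masking drops the carry


if __name__ == "__main__":
    print(format(twos_complement(5), "08b"))  # bit pattern that stands for -5
    print(subtract(12, 5))                    # 12 - 5 done as an addition
```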

22 Range and overflow
Most CPUs use a fixed width of 32 or 64 bits to represent an integer. For small numbers, the format is padded with leading zeros. The machine processes fixed-width information more easily than variable-width information.

23 Integer overflow
If a number is too big for the fixed-width integer format, the CPU signals an overflow error. The width of the integer format is a tradeoff between the risk of overflow and wasted space (padded zeros). CPUs often use double-precision data types for arithmetic operations: long is a double-width integer, and double is a double-precision floating point number.
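
Python integers grow as needed, so overflow has to be simulated. The sketch below checks a sum against an assumed 32-bit signed (two's complement) range and raises an error when it does not fit; `add_signed` is a hypothetical helper, and real CPUs typically report overflow through a status flag rather than an exception.

```python
def add_signed(a: int, b: int, bits: int = 32) -> int:
    """Add two integers, raising OverflowError if the sum needs more bits."""
    lowest = -(1 << (bits - 1))
    highest = (1 << (bits - 1)) - 1
    result = a + b
    if not lowest <= result <= highest:
        raise OverflowError(f"{result} does not fit in {bits} signed bits")
    return result


if __name__ == "__main__":
    print(add_signed(2_000_000_000, 100_000_000))  # fits in 32 bits
    try:
        add_signed(2**31 - 1, 1)                   # one past the largest value
    except OverflowError as error:
        print("overflow:", error)
    print(add_signed(2**31 - 1, 1, bits=64))       # a wider format has room
```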

24 Representing real numbers
This is a more complicated problem than storing integers. Real numbers contain whole and fractional components. How can both parts be represented together in one format?

25 Fixed format for real numbers
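A fixed format (a set number of positions on each side of the radix point) is not flexible enough: it needs to store both numbers with a large whole part and a small fractional part, and numbers with a small whole part and a large fractional part.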

26 Floating point notation
Any real number can be rewritten in floating point (scientific) notation as a mantissa multiplied by a power of the base. A positive number written as a mantissa × 10¹ is stored as the mantissa, the exponent (1), and the sign (+). A negative number written as a mantissa × 10² is stored as the mantissa, the exponent (2), and the sign (-). Note: the number is first translated into base 2 before it is put into floating point format.

27 IEEE floating point format for real numbers
The high-order sign bit refers to the mantissa, not the exponent. The 8-bit exponent is stored in excess notation to allow for negative values. The 23-bit mantissa is an ordinary binary number. What is the bit pattern in floating point for ? For -143.99?
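
The second question can be explored with Python's standard `struct` module, which packs a value into the 32-bit IEEE 754 single-precision layout. This is a sketch for inspection only; `float32_fields` is a hypothetical helper. Two details go beyond the slide: the exponent is stored in excess-127 form, and the stored 23 bits omit the leading 1 of a normalized mantissa.

```python
import struct


def float32_fields(x: float) -> tuple:
    """Split the 32-bit IEEE 754 pattern of x into (sign, exponent, mantissa)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                # 1 bit, high order
    exponent = (bits >> 23) & 0xFF   # 8 bits, stored in excess-127 notation
    mantissa = bits & 0x7FFFFF       # 23 bits, leading 1 not stored
    return sign, exponent, mantissa


if __name__ == "__main__":
    sign, exponent, mantissa = float32_fields(-143.99)
    print(f"sign={sign} exponent={exponent:08b} mantissa={mantissa:023b}")
    print("true exponent:", exponent - 127)
```

Since 143.99 lies between 2⁷ and 2⁸, the true exponent is 7 and the stored exponent is 7 + 127 = 134.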

28 Floating point range
The number of bits in the floating point format limits the range of both the exponent and the mantissa. Overflow (a number too large in magnitude) always occurs in the exponent. Underflow occurs when a number is too close to zero, i.e. its negative exponent does not fit.

29 Range for mantissa
The number of bits in the mantissa limits the number of significant digits stored for a real number. 23 bits allow for approximately 7 significant decimal digits. Information that does not fit in the mantissa is discarded: by truncation in the simplest schemes, while IEEE 754 rounds to the nearest representable value by default. Either way, this loss of precision does not raise an overflow condition.
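
The limited precision of a 23-bit mantissa can be observed by round-tripping values through the 32-bit format. A minimal sketch using the standard `struct` module; `to_float32` is a hypothetical helper and the sample values are chosen for illustration.

```python
import struct


def to_float32(x: float) -> float:
    """Store x in 32-bit IEEE 754 format and read it back."""
    return struct.unpack(">f", struct.pack(">f", x))[0]


if __name__ == "__main__":
    # 16777216 is 2**24; the next integer needs one more mantissa bit than
    # the format has, so it comes back as a neighboring representable value.
    print(to_float32(16777216.0))
    print(to_float32(16777217.0))
    # 0.1 has no finite binary expansion, so only about 7 digits survive.
    print(to_float32(0.1))
```

No error is raised in either case; the extra digits are silently lost.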

30 Processing complexity
A general rule of thumb is that floating point operations (+, -, *, etc.) take the CPU about twice as long as the same operations on integers. Floating point operations per second (FLOPS) is a measure of processor speed. Because a floating point value has two parts to manage (mantissa and exponent), it is reasonable to expect the extra work.

31 Character data
Alphabetic letters (upper and lower case), numerals, punctuation marks, and special symbols are called characters. A variable of type character contains only one symbol. A sequence of symbols forming words, sentences, etc. is called a string. Most CPUs and some languages (C) have no built-in data type for strings.

32 How computers store characters
Character data cannot be directly processed by a computer; it must be translated into numbers. Characters are converted into numbers using a table of correspondences between each character and a bit string.
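
Python exposes such a table through the built-in `ord` and `chr` functions, which map between a character and its numeric code (Unicode code points, which agree with ASCII for the characters used here). The helper names `encode` and `decode` are illustrative.

```python
def encode(text: str) -> list:
    """Look up the numeric code of every character in text."""
    return [ord(character) for character in text]


def decode(codes: list) -> str:
    """Turn a list of numeric codes back into a string."""
    return "".join(chr(code) for code in codes)


if __name__ == "__main__":
    codes = encode("Hi")
    print(codes)
    print([format(code, "08b") for code in codes])  # the stored bit strings
    print(decode(codes))
```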

33 Design issues for character coding schemes
The table must be publicly available, and all users must use the same table. A coding scheme is a tradeoff among compactness, ease of manipulation, accuracy, range, and standardization.

34 Examples of character coding schemes
BCD and EBCDIC: older IBM mainframe computers. ASCII: PCs. Unicode: a larger format that allows for expanded and international alphabets (Java and internet applications).

35 ASCII coding scheme
The 7-bit format leaves room in an 8-bit byte for a parity bit (used to check for errors over transmission lines). ASCII has unique codes for all uppercase and lowercase letters, numerals, and other printable characters. It also includes codes for device control.
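
A parity check on a 7-bit ASCII code can be sketched as follows. The slide does not say which parity convention is used; this example assumes even parity with the parity bit in the high-order position, and the function names are hypothetical.

```python
def with_even_parity(character: str) -> int:
    """Return the 8-bit value: 7-bit ASCII code plus an even-parity bit."""
    code = ord(character)
    if code > 0x7F:
        raise ValueError("not a 7-bit ASCII character")
    parity = bin(code).count("1") % 2  # 1 if the code has an odd count of 1s
    return (parity << 7) | code


def parity_ok(byte: int) -> bool:
    """A received byte passes if its total number of 1 bits is even."""
    return bin(byte).count("1") % 2 == 0


if __name__ == "__main__":
    sent = with_even_parity("C")
    print(format(sent, "08b"), parity_ok(sent))
    corrupted = sent ^ 0b00000100       # flip one bit "in transit"
    print(format(corrupted, "08b"), parity_ok(corrupted))
```

A single parity bit detects any one-bit error but cannot locate it, and two flipped bits cancel out and go unnoticed.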

36 Device control
In many applications that handle text, formatting and commands to a device are included in the same stream of data as the text. Examples in applications: word processors (reveal codes), HTML tags. Examples of control codes: CR (carriage return), tab, form feed.

37 Limitations to ASCII
ASCII is not robust enough to represent multiple languages and symbols. The 7-bit format allows for 128 unique codes, while some languages have thousands of symbols. Unicode, as a 16-bit code, has 65,536 entries; later versions of the standard extend beyond 16 bits.

38 Boolean data
The Boolean data type has two values, true and false, and can be stored with one bit. The results of many CPU operations (comparisons) generate a Boolean value that is stored in a register.

39 Memory addresses
Primary storage is a series of contiguous bytes. The CPU must be able to access sections of memory directly. Sections of memory are accessed by their address (location).

40 Formats for memory addresses
Flat memory model: memory starts at address 0 and goes to maximum capacity - 1; simple integers are used to store addresses. Segmented memory model: memory is divided into equal-sized segments called pages, and an address has two parts, for example 00FA:0034, a number for the page and a location within the page.
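
The relationship between the two models can be sketched by splitting a two-part address and mapping it to a flat one. The page size of 64 KiB (0x10000 bytes) is an assumption made for this example, since the slide does not give one, and real architectures combine the two parts in different ways. `parse_segmented` and `to_flat` are hypothetical helpers.

```python
PAGE_SIZE = 0x10000  # assumed page size (64 KiB); not specified in the slides


def parse_segmented(address: str) -> tuple:
    """Split 'PAGE:OFFSET' (both hexadecimal) into two integers."""
    page_text, offset_text = address.split(":")
    return int(page_text, 16), int(offset_text, 16)


def to_flat(address: str, page_size: int = PAGE_SIZE) -> int:
    """Map a page number and offset to a single flat address."""
    page, offset = parse_segmented(address)
    if offset >= page_size:
        raise ValueError("offset lies outside the page")
    return page * page_size + offset


if __name__ == "__main__":
    print(parse_segmented("00FA:0034"))
    print(format(to_flat("00FA:0034"), "08X"))
```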

41 Data structures
These five primitive types are quite limited for representing real-world data such as words and sentences, dates, and database tables. More complex data structures are constructed from these five primitive types. Material in this chapter on data structures will be covered in more detail in the data structures class and IS223 (C++ programming).

42 Data Structures

53 Chapter summary
To be processed by any device, data must be converted from its native format into a form suitable for the processing device. All data, including nonnumeric data, are represented within a modern computer system as strings of binary digits, or bits. Each bit string has a specific data format and coding method.

54 Summary (cont.)
Numeric data is stored using integer, real number, and floating point formats. Characters are converted to numbers by means of a coding table. Boolean data can have only two values, true and false. Programs often need to define and manipulate data in larger and more complex units than the primitive CPU data types.
