Digitizing Discrete Information

Presentation on theme: "Digitizing Discrete Information"— Presentation transcript:

1

2 Digitizing Discrete Information
Digitize Represent info with digits (symbols) Digits: { 0, 1, 2, …, 9 } Or digits: { A, B, C, …, Z } Or any set of distinct symbols

3

4 Symbols, Briefly Prefer short names for symbols
One, two, …, Instead of “asterisk”, “closing parenthesis”, etc. Aside: we shorten many names in IT exclamation point => bang asterisk => star open parenthesis => open paren open curly brace => open brace

5 Ordering Symbols Want order for the digits/symbols
0 – 9 has obvious order But what about { #, …, ) }? Define a collating sequence Digitize: represent info with symbols

6 Fundamental Information Representation
Given digital info, how to store it? Use physical phenomena Light Current Magnetism

7 Fundamental Information Representation
In digital world Don’t care how much, just presence In logical world (basis of computing) True and false

8 Fundamental Information Representation
Physical world can implement logical world Presence => “true” Absence => “false”

9 The PandA Representation
We will use “PandA” for presence and absence representation Only two states Could use false for absent, true for present Or 0 for absent, and 1 for present

10 The PandA Representation
Such a formulation is said to be discrete Discrete means “distinct” or “separable” Opposite of continuous No “shades of gray”

11 Analog vs. Digital Analog is continuous data/information Sound waves

12 Analog vs. Digital Digital is discrete info Obtained by sampling
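
As a rough illustration of sampling, the sketch below measures a continuous sine wave at evenly spaced instants. The 1 Hz wave, the rate of 8 samples per second, and the function name `sample_sine` are assumptions chosen for the example, not values from the slides.

```python
import math

def sample_sine(frequency_hz, sample_rate_hz, duration_s):
    """Measure a continuous sine wave at evenly spaced instants."""
    count = int(sample_rate_hz * duration_s)
    return [math.sin(2 * math.pi * frequency_hz * (i / sample_rate_hz))
            for i in range(count)]

# Hypothetical values: a 1 Hz wave sampled 8 times per second for 1 second.
samples = sample_sine(frequency_hz=1, sample_rate_hz=8, duration_s=1)
print(len(samples))                      # number of discrete values kept
print([round(s, 3) for s in samples])    # the sampled (discrete) data
```

A higher sample rate keeps more of the original wave's shape at the cost of more stored values.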

13 A Binary System PandA encoding is binary

14 Bits Form Symbols PandA unit is a binary digit (bit)
Bit sequences form binary numbers

15 Encoding Bits on a CD-ROM
PandA bit values are pits and lands

16 Bits in Computer Memory
Memory is a long sequence of bits Sidewalk Analogy

17 Sidewalk Memory Imagine clean sidewalk consisting of squares
Presence of a stone on a square => 1 Absence of a stone => 0 Sidewalk: sequence of bits

18 Sidewalk Memory 1 1 1

19 Sidewalk Memory Writing info Put stone on square (1)
Remove stone from square (0) Reading info Check whether a square holds a stone
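
The sidewalk analogy can be sketched with a Python list, one element per square. The sidewalk length of 8 and the helper names are assumptions made for illustration.

```python
# Each square either holds a stone (1) or is empty (0).
sidewalk = [0] * 8   # a clean sidewalk; 8 squares is an arbitrary choice

def put_stone(squares, position):
    """Write a 1 by placing a stone on the square."""
    squares[position] = 1

def remove_stone(squares, position):
    """Write a 0 by clearing the square."""
    squares[position] = 0

def read_square(squares, position):
    """Read a bit by checking whether a stone is present."""
    return squares[position]

put_stone(sidewalk, 1)
put_stone(sidewalk, 4)
remove_stone(sidewalk, 4)
print(sidewalk)
print(read_square(sidewalk, 1))
```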

20 Alternative PandA Encodings
Other ways to encode two states Color of stone Number of stones Another?

21 Combining Bit Patterns
One bit with two states isn’t enough So we combine them
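
To see what combining bits buys, this small sketch lists every pattern that a given number of bits can form. The function name `bit_patterns` is illustrative only.

```python
from itertools import product

def bit_patterns(n):
    """Return every distinct sequence of n bits as a string."""
    return ["".join(bits) for bits in product("01", repeat=n)]

print(bit_patterns(2))        # ['00', '01', '10', '11']
print(len(bit_patterns(4)))   # 16
```

Each extra bit doubles the number of patterns available.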

22

23 Hex Explained Hex numbers are base-16 A long bit sequence is
Error prone to read and write Instead use hex

24 The 16 Hex Digits Hex digits { 0, 1, 2, …, 9, A, B, C, D, E, F }
Can represent 4-bit sequences 0000 = 0 hex 0001 = 1 hex 1001 = 9 hex 1010 = A hex 1111 = F hex

25 Hex to Bits and Back Again
Each hex digit corresponds to 4 bits B A D F A B C 6 ?
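
The digit-by-digit correspondence can be sketched as two small helpers. The function names are made up for this example, and `bits_to_hex` assumes the number of bits is a multiple of 4.

```python
def hex_to_bits(hex_digits):
    """Expand each hex digit into its 4-bit sequence."""
    return " ".join(format(int(d, 16), "04b") for d in hex_digits)

def bits_to_hex(bits):
    """Collapse each group of 4 bits into one hex digit."""
    bits = bits.replace(" ", "")
    groups = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    return "".join(format(int(g, 2), "X") for g in groups)

print(hex_to_bits("A9"))          # 1010 1001
print(bits_to_hex("1111 0001"))   # F1
```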

26 Digitizing Numbers in Binary
Need binary representations for Numbers Characters But also image video sound

27 Counting in Binary Binary numbers (base 2) use digits 0 and 1
Decimal numbers (base 10) use 0 through 9 Counting to ten
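
Counting to ten in both bases can be reproduced with Python's built-in `format`. The variable name `counting` is just for this sketch.

```python
# Binary spelling of each decimal number from 0 through 10.
counting = [format(n, "b") for n in range(11)]

for decimal, binary in enumerate(counting):
    print(decimal, binary)
```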

28 Counting in Binary Place value representation

29 Place Value in a Decimal Number
Example, 1010 (base 10) is (1 × 1000) + (0 × 100) + (1 × 10) + (0 × 1)

30 Place Value in a Binary Number
Binary is base 2 so powers of 2 are used

31 Place Value in a Binary Number
1010 in binary (1 × 8) + (0 × 4) + (1 × 2) + (0 × 1)
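
The same place-value rule works for any base, which the following sketch makes explicit. The helper name `place_value` is an assumption for illustration.

```python
def place_value(digits, base):
    """Sum digit * base**position, counting positions from the right."""
    total = 0
    for position, digit in enumerate(reversed(digits)):
        total += int(digit) * base ** position
    return total

print(place_value("1010", 10))   # 1010
print(place_value("1010", 2))    # 10
```

So the binary number 1010 is (1 × 8) + (0 × 4) + (1 × 2) + (0 × 1), which is ten in decimal.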

32

33 Digitizing Text # of bits determines # of symbols that can be represented n bits => 2^n symbols

34 Digitizing Text To digitize English text Roman letters Arabic numbers
Punctuation Arithmetic symbols

35 Assigning Symbols So we need to represent 26 uppercase
26 lowercase letters 10 numerals 20 punctuation characters 10 arithmetic characters 3 other characters (new line, tab, and backspace) 95 symbols…enough for English

36 Assigning Symbols To represent 95 distinct symbols we need how many bits? Need to represent control characters too
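
One way to work out the bit count is to find the smallest n for which 2^n is at least the number of symbols. The function name `bits_needed` is illustrative.

```python
def bits_needed(symbol_count):
    """Smallest n such that 2**n >= symbol_count."""
    n = 0
    while 2 ** n < symbol_count:
        n += 1
    return n

# 6 bits give only 64 patterns, so 95 symbols need one more bit.
print(bits_needed(95))
```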

37 Assigning Symbols ASCII stands for American Standard Code for Information Interchange Widely used 7-bit code Advantages of a “standard” Interoperability of h/w Communications among programs

38 Extended ASCII: An 8-Bit Code
For other languages 7 bits aren’t enough IBM developed an 8-bit ASCII Uses 1 byte Uses 0 in leftmost bit followed by 7-bit ASCII codes Allows 128 more codes that start with 1 Can handle most Western languages
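
A short sketch of how a 7-bit ASCII code sits inside one byte with a 0 in the leftmost bit. The helper name `to_byte` is hypothetical.

```python
def to_byte(ch):
    """Return the 8-bit pattern for a 7-bit ASCII character."""
    code = ord(ch)
    if code > 127:
        raise ValueError("not a 7-bit ASCII character")
    return format(code, "08b")   # pad to 8 bits, so the leftmost bit is 0

print(to_byte("A"))   # 01000001
```

Codes 128 through 255 are the ones whose leftmost bit is 1.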

39

40 ASCII Character Set (Decimal)
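Decimal - Character
  0 NUL   1 SOH   2 STX   3 ETX   4 EOT   5 ENQ   6 ACK   7 BEL
  8 BS    9 HT   10 NL   11 VT   12 NP   13 CR   14 SO   15 SI
 16 DLE  17 DC1  18 DC2  19 DC3  20 DC4  21 NAK  22 SYN  23 ETB
 24 CAN  25 EM   26 SUB  27 ESC  28 FS   29 GS   30 RS   31 US
 32 SP   33 !    34 "    35 #    36 $    37 %    38 &    39 '
 40 (    41 )    42 *    43 +    44 ,    45 -    46 .    47 /
 48 0    49 1    50 2    51 3    52 4    53 5    54 6    55 7
 56 8    57 9    58 :    59 ;    60 <    61 =    62 >    63 ?
 64 @    65 A    66 B    67 C    68 D    69 E    70 F    71 G
 72 H    73 I    74 J    75 K    76 L    77 M    78 N    79 O
 80 P    81 Q    82 R    83 S    84 T    85 U    86 V    87 W
 88 X    89 Y    90 Z    91 [    92 \    93 ]    94 ^    95 _
 96 `    97 a    98 b    99 c   100 d   101 e   102 f   103 g
104 h   105 i   106 j   107 k   108 l   109 m   110 n   111 o
112 p   113 q   114 r   115 s   116 t   117 u   118 v   119 w
120 x   121 y   122 z   123 {   124 |   125 }   126 ~   127 DEL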

41 ASCII Character Set (Hexadecimal)
Hexadecimal - Character
00 NUL  01 SOH  02 STX  03 ETX  04 EOT  05 ENQ  06 ACK  07 BEL
08 BS   09 HT   0A NL   0B VT   0C NP   0D CR   0E SO   0F SI
10 DLE  11 DC1  12 DC2  13 DC3  14 DC4  15 NAK  16 SYN  17 ETB
18 CAN  19 EM   1A SUB  1B ESC  1C FS   1D GS   1E RS   1F US
20 SP   21 !    22 "    23 #    24 $    25 %    26 &    27 '
28 (    29 )    2A *    2B +    2C ,    2D -    2E .    2F /
30 0    31 1    32 2    33 3    34 4    35 5    36 6    37 7
38 8    39 9    3A :    3B ;    3C <    3D =    3E >    3F ?
40 @    41 A    42 B    43 C    44 D    45 E    46 F    47 G
48 H    49 I    4A J    4B K    4C L    4D M    4E N    4F O
50 P    51 Q    52 R    53 S    54 T    55 U    56 V    57 W
58 X    59 Y    5A Z    5B [    5C \    5D ]    5E ^    5F _
60 `    61 a    62 b    63 c    64 d    65 e    66 f    67 g
68 h    69 i    6A j    6B k    6C l    6D m    6E n    6F o
70 p    71 q    72 r    73 s    74 t    75 u    76 v    77 w
78 x    79 y    7A z    7B {    7C |    7D }    7E ~    7F DEL

42 Beyond ASCII Unicode Uses up to 4 bytes to handle how many characters?
Allows all modern scripts (Kanji, Arabic, Cyrillic, Hebrew, etc.) Contains 7-bit ASCII as the low 128 characters for compatibility (the low 256 match ISO 8859-1, Latin-1) Allows ancient scripts like Egyptian hieroglyphics
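
As an illustration of variable-length encoding, the sketch below uses UTF-8, one common encoding of Unicode (the slides do not name a specific encoding, so this choice is an assumption). The sample characters are arbitrary.

```python
import sys

# Sample characters written as escapes: A, e-acute, euro sign, grinning face.
sample = ["A", "\u00e9", "\u20ac", "\U0001F600"]

# Number of bytes UTF-8 uses for each character.
utf8_lengths = [len(ch.encode("utf-8")) for ch in sample]
print(utf8_lengths)            # [1, 2, 3, 4]

# Size of the Unicode code space: code points 0 through 0x10FFFF.
print(sys.maxunicode + 1)      # 1114112
```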

43 ASCII Coding of Phone Numbers
How to encode in ASCII? Encode each digit with its ASCII byte etc. etc.

44 Another ASCII Example From Lab 1 CSCI ftw! Takes ? bytes to store.
Representation in ASCII? A In Binary?
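
The questions on this slide can be checked with a few lines of Python. This is only a sketch using the built-in `str.encode`; the variable names are illustrative.

```python
message = "CSCI ftw!"
encoded = message.encode("ascii")   # one byte per character

byte_count = len(encoded)
as_hex = " ".join(format(b, "02X") for b in encoded)
as_binary = " ".join(format(b, "08b") for b in encoded)

print(byte_count)
print(as_hex)
print(as_binary)
```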

45 Advantages of Long Encodings
Short encodings save memory Examples of longer encodings NATO Broadcast Alphabet Bar Codes

46 NATO Broadcast Alphabet
NATO alphabet Used for radio communication Purposely inefficient Distinctive amid noise (‘m’ versus ‘n’) Letters represented with word “symbols” a => alpha, b => bravo, c => charlie Digits keep their usual names Except 9 => niner
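
The letter-to-word mapping can be sketched as a lookup table. The slide gives only the first three words; the remaining words are filled in from the standard alphabet, with the slide's spelling "alpha" kept. The function name `spell` is illustrative.

```python
NATO_WORDS = ("alpha bravo charlie delta echo foxtrot golf hotel india juliett "
              "kilo lima mike november oscar papa quebec romeo sierra tango "
              "uniform victor whiskey xray yankee zulu").split()
LETTERS = dict(zip("abcdefghijklmnopqrstuvwxyz", NATO_WORDS))

# Digits keep their usual names, except 9 => niner.
DIGIT_WORDS = "zero one two three four five six seven eight niner".split()
DIGITS = dict(zip("0123456789", DIGIT_WORDS))

def spell(text):
    """Replace each letter or digit with its spoken word; skip anything else."""
    table = {**LETTERS, **DIGITS}
    return " ".join(table[ch] for ch in text.lower() if ch in table)

print(spell("mn9"))   # mike november niner
```

The one-character symbols 'm' and 'n' become the much longer, and much harder to confuse, "mike" and "november".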

47 NATO Broadcast Alphabet

48 Bar Codes Universal Product Codes (UPC) use more bits than necessary
UPC-A encoding uses 7 bits to encode the digits 0 – 9

49 Bar Codes Encodes manufacturer (left side) and product (right side)
Different bit combinations are used for each side One side is complement of the other Bit patterns were chosen to appear as different as possible

50 Bar Codes Encodings for each side make it possible to recognize whether code is upside down
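
A sketch of how the two sides relate and how orientation can be detected. The left-side bit patterns below are recalled from the published UPC-A tables and should be checked against the GS1 specification before reuse; the function names are hypothetical.

```python
# Left-side 7-bit patterns for the digits 0-9 (assumed from the UPC-A tables).
LEFT = ["0001101", "0011001", "0010011", "0111101", "0100011",
        "0110001", "0101111", "0111011", "0110111", "0001011"]

def complement(bits):
    """Flip every bit."""
    return "".join("1" if b == "0" else "0" for b in bits)

# Right-side patterns are the complements of the left-side ones.
RIGHT = [complement(p) for p in LEFT]

def looks_upside_down(first_pattern):
    """Left patterns have an odd number of 1s and right patterns an even
    number, so a scan that begins with an even count started on the right side."""
    return first_pattern.count("1") % 2 == 0

print(looks_upside_down(LEFT[3]))          # False: scan began on the left side
print(looks_upside_down(RIGHT[3][::-1]))   # True: right side read backwards
```

Reversing a pattern does not change how many 1s it has, which is why the count still identifies the side when the code is read backwards.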

51 Metadata and the OED To represent info
Need to convert to binary Need to describe its properties Characteristics of the content also need to be encoded How is the content structured? What other content is it related to? Where was it collected? When was it created or captured? What units is it given in? How should it be displayed? And so on…

52 Metadata and the OED Metadata info describing info
often specified with tags (like with HTML)

53 Properties of Data ASCII encodes characters
Metadata gives properties of data font style color justification margins etc.

54 Properties of Data Content and metadata example

55 Using Tags for Metadata
Oxford English Dictionary (OED) Definitive reference for every English word’s meaning, etymology, and usage Printed version is 20 volumes, weighs 150 pounds, and fills 4 feet of shelf space

56 Structure Tags Digital OED uses tags to indicate structure
<hw> for a headword (word defined) <pr> for pronunciation <ph> for phonetic notations <ps> for part of speech <hm> for homonym numbers <e> for entire entry <hg> for head group (all info at start of definition)

57 Structure Tags Algorithms utilize tags Search Formatting
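
A minimal sketch of a tag-driven search. The sample entry is invented for illustration, and the use of closing tags such as `</hw>` is an assumption, since the slides list only the opening tags. The helper `tag_contents` does not handle a tag nested inside itself.

```python
import re

# Hypothetical entry; not taken from the actual digital OED.
entry = "<e><hg><hw>byte</hw> <pr><ph>baIt</ph></pr> <ps>n.</ps></hg></e>"

def tag_contents(text, tag):
    """Return the text found between each <tag> ... </tag> pair."""
    return re.findall(r"<{0}>(.*?)</{0}>".format(tag), text)

print(tag_contents(entry, "hw"))   # ['byte']
print(tag_contents(entry, "ps"))   # ['n.']
```

Because the headword is marked with its own tag, a search can match headwords only and ignore the same word appearing elsewhere in a definition.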

58

59 Quiz What’s the first step in debugging?
check for obvious isolate the problem reproduce the problem pinpoint Fix the error in this CSS body { color; red }

60 Quiz Like all engineers, programmers begin with a _____________ – a precise description of the input, how the system should behave, and how the output should be produced.

61 Summary Digitizing info Storing info using PandA
Bits, bytes, hex ASCII Metadata

