Bits and the "Why" of Bytes: Representing Information Digitally

Slides:



Advertisements
Similar presentations
Review of HTML Ch. 1.
Advertisements

The Binary Numbering Systems
Review Ch.1,Ch.4,Ch.7. Review of tags covered various header tags Img tag Style, attributes and values alt.
Learning Objectives Explain the link between patterns, symbols, and information Determine possible PandA encodings using a physical phenomenon Encode.
Chapter 7 Representing Information Digitally. Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Learning Objectives Explain.
Chapter 7 Representing Information Digitally. Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Learning Objectives Explain.
Binary Representation
1. Discrete / Continuous Representations Of numbers – binary & decimal Bits Hexadecimal - 'Hex' Representing text Bits and Bytes.
Chapter 8_2 Bits and the "Why" of Bytes: Representing Information Digitally.
1 The Information School of the University of Washington Nov 6fit more-digital © 2006 University of Washington Digital Information INFO/CSE 100,
Chapter 8_1 Bits and the "Why" of Bytes: Representing Information Digitally.
Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Representing Information Digitally Bits and the “Why” of Bytes lawrence snyder.
Chapter 7 Representing Information Digitally. Learning Objectives Explain the link between patterns, symbols, and information Determine possible PandA.
Chapter 8 Bits and the "Why" of Bytes: Representing Information Digitally.
Data Representation (in computer system) Computer Fundamental CIM2460 Bavy LI.
Copyright © 2008 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Fluency with Information Technology Third Edition by Lawrence Snyder Chapter.
Computer Fluency Binary Systems. Humans Decimal Numbers (base 10) Decimal Numbers (base 10) Sign-Magnitude (-324) Sign-Magnitude (-324) Decimal Fractions.
CCE-EDUSAT SESSION FOR COMPUTER FUNDAMENTALS Date: Session III Topic: Number Systems Faculty: Anita Kanavalli Department of CSE M S Ramaiah.
Chapter 3 Data Representation Text Characters. 2 Representing Text To represent a text document in digital form, we need to be able to represent every.
©Brooks/Cole, 2003 Chapter 2 Data Representation.
Chapter 2 Data Representation. Define data types. Visualize how data are stored inside a computer. Understand the differences between text, numbers, images,
Globalisation & Computer Systems week 5 1. Localisation presentations 2.Character representation and UNICODE UNICODE design principles UNICODE character.
Binary Numbers and ASCII and EDCDIC Mrs. Cueni. Data Representation  Human speech is analog because it uses continuous signals (waves) that vary in strength.
Chapter 2 Computer Hardware
Data Representation CS280 – 09/13/05. Binary (from a Hacker’s dictionary) A base-2 numbering system with only two digits, 0 and 1, which is perfectly.
Digital Information  digits are symbols chapter 8 BITS & THE “WHY” OF BYTES.
Chapter 8 Fluency with Information Technology 4 th edition by Lawrence Snyder (slides by Deborah Woodall : 1.
Data Representation.
Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Representing Information Digitally Bits and the “Why” of Bytes lawrence snyder.
1 INFORMATION IN DIGITAL DEVICES. 2 Digital Devices Most computers today are composed of digital devices. –Process electrical signals. –Can only have.
CS151 Introduction to Digital Design
HOW COMPUTERS MANIPULATE DATA Chapter 1 Coming up: Analog vs. Digital.
Logical Circuit Design Week 2,3: Fundamental Concepts in Computer Science, Binary Logic, Number Systems Mentor Hamiti, MSc Office: ,
Globalisation & Computer systems Week 5/6 Character representation ACII and code pages UNICODE.
1 COMS 161 Introduction to Computing Title: The Digital Domain Date: September 6, 2004 Lecture Number: 6.
Data Representation Conversion 24/04/2017.
Data Representation, Number Systems and Base Conversions
DATA REPRESENTATION CHAPTER DATA TYPES Different types of data (Fig. 2.1) The computer industry uses the term “MULTIMEDIA” to define information.
The Information School of the University of Washington Oct 13fit digital1 Digital Representation INFO/CSE 100, Fall 2006 Fluency in Information Technology.
Base 2 Numbering System Chapter 1.
Data Representation. What is data? Data is information that has been translated into a form that is more convenient to process As information take different.
Learning Objectives Explain the link between patterns, symbols, and information Determine possible PandA encodings using a physical phenomenon Encode.
The Information School of the University of Washington 15-Oct-2004cse digital1 Digital Representation INFO/CSE 100, Spring 2005 Fluency in Information.
1 Problem Solving using Computers “Data....Representation, and Storage.
M204 - Data Representation
1.4 Representation of data in computer systems Character.
Chapter 7 Representing Information Digitally. Learning Objectives Explain the link between patterns, symbols, and information Determine possible PandA.
Binary Representation in Text
Binary Representation in Text
Unit 2.6 Data Representation Lesson 2 ‒ Characters
Chapter 8 & 11: Representing Information Digitally
Chapter 3 Data Representation Text Characters
Digitizing Discrete Information
Binary Numbers and ASCII and EDCDIC
Digital Electronics Jess 2008.
Bits and the "Why" of Bytes: Representing Information Digitally
Chapter 3 Data Representation
TOPICS Information Representation Characters and Images
Ch2: Data Representation
Data Representation Conversion 05/12/2018.
Digital Representation
Presenting information as bit patterns
COMS 161 Introduction to Computing
INFO/CSE 100, Spring 2005 Fluency in Information Technology
Chapter 2 Data Representation.
COMS 161 Introduction to Computing
Data Representation Chapter 2 Computer HW (Von Neumann Model) Program
Chapter 3 - Binary Numbering System
Digital Representation of Data
ASCII and Unicode.
Presentation transcript:

Bits and the "Why" of Bytes: Representing Information Digitally Chapter 4 Bits and the "Why" of Bytes: Representing Information Digitally

Digitizing Discrete Information Digitize: Represent information with digits (numerals 0 through 9 when using base 10) Limitation of Digits (Phone#, SS#, etc.) Alternative Representation: Any set of symbols could represent phone number digits, as long as the keypad is labeled accordingly Symbols, Briefly Digits have the advantage of having short names (easy to say) But computer professionals also shorten symbol names (exclamation point pronounced "bang")

Ordering Symbols Another advantage of digits for encoding information is that items can be listed in numerical order To use other symbols, we need an ordering system (collating sequence) Agreed order from smallest to largest value In choosing symbols for encoding, consider how symbols interact with things being encoded … 

The Fundamental Representation of Information The fundamental patters used in IT come when the physical world meets the logical world The most fundamental form of information is the presence or absence of a physical phenomenon (matter, light, magnetism, etc.) In the logical world, the concepts of True and False (T and F) are very important Logic is the foundation of reasoning (human and mechanical/computing)

The PandA Representation By associating True with the presence of a phenomenon and False with its absence, we use the physical world to implement the logical world, and produce information technology PandA is a mnemonic for "Presence and Absence" It is discrete (distinct or separable) — the phenomenon is present or it is not (True or False) No continuous gradation in between!

A Binary System Two patterns — Present and Absent — make PandA a binary system We can give any names to these two patterns as long as we are consistent The PandA basic unit is known as a "bit" (short for binary digit) Sequences of PandA values -- (usually 0’s and 1’s) can be used to represent symbols

Bits in Computer Memory Memory is arranged inside a computer in a very long sequence of bits (places where a phenomenon can be set and detected) Analogy: Sidewalk Memory Each sidewalk square represents a memory slot, and stones represent the presence or absence If a stone is on the square, the value is 1, if not the value is 0

Alternative PandA Encodings There are other ways to encode two states using physical phenomena Use stones on all squares, but black stones for one state and white for the other Use multiple stones of two colors per square: more black than white means 0 and more white than black means 1 Stone in center for one state, off-center for the other Suggestions?

Combining Bit Patterns There are only two bit values (0 and 1). The bits alone can only represent two things, like (won, lost) or (yes, no) If we group bit values into sequences we can represent enough symbols to encode more complex information (one sequence per symbol) PandA has p=2 values, arranging them into n-length sequences, we can create 2n symbols

Hex Explained Previously, we showed how to specify custom colors in HTML using hex digits e.g., <font color ="#FF8E2A"> Hex is short for hexadecimal, base 16 Why use hex? Writing the sequence of bits can be long, tedious, and error-prone

The 16 Hex Digits: 0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F Sixteen digits (or hexits) can be represented perfectly by the 16 possible 4-bit sequences Changing hex digits to bits and back again: Given a sequence of bits, group them in 4's and write the corresponding hex digit Given hex, write associated group of 4 bits

Encoding Text Consider representing the Roman alphabet (Letters A-Z) 26 letters (one case only) We can use multiple-bit patterns to represent each letter How many patterns are required?

Extending the Encoding 26 letters of the Roman alphabet are represented; 32 (2^5) patterns available with 5-bits. How many patterns are left? Can these spaces be used for the Arabic numerals? Enough? What if we need punctuation? Keyboard. Escaped Characters. Etc.

Digitizing Text Early binary representation — 1 and 0 — encoded numbers and keyboard characters Now representation for sound, video, and other types of information are also important For encoding text, what symbols should be included? We want to keep the list small enough to use fewer bits, but we don't want to leave out critical characters

Assigning Symbols 26 uppercase and 26 lowercase Roman letters 10 Arabic numerals 10 arithmetic characters 20 punctuation characters (including space) 3 non-printable characters (new line, tab, backspace) = 95 characters, enough to represent English For 95 symbols, we need 7-bit sequences A standard 7-bit code is ASCII(American Standard Code for Information Interchange)

Extended ASCII: An 8-bit Code By the mid-1960's, it became clear that 7 bit ASCII was not enough to represent text from languages other than English IBM extended ASCII to 8 bits, 256 symbols Called "Extended ASCII," the first half is original 7-bit ASCII with a 0 added at the beginning of each group of bits (byte) Handles most Western languages and additional useful symbols

Example: ASCII Coding a Phone Number How would a computer represent in its memory, the phone number 888 555 1212? We want it stored not as a single large number, but as string of symbols telling what phone buttons to press Encode each digit with its ASCII byte: 8 8 8 5 5 etc. 00111000 00111000 00111000 00110101 00110101 etc.

Beyond ASCII - Unicode Computing in the 21st century uses many languages with many different character sets 1990s - Unicode was created used 2 bytes (16 bits) to represent over 65,000 characters (216) (32 bit version now available) Allows all modern scripts (Kanji, Arabic, Cyrillic, Hebrew, etc.) Contained 8-bit ASCII as the low 256 characters for compatibility Current standards use even more bits to represent over 107,000 characters in 90 scripts Allows ancient scripts like Egyptian hieroglyphics

NATO Broadcast Alphabet The code for broadcast communication is purposefully inefficient, to be distinctive when spoken amid noise. Ex. ‘M’ vs. ‘N’

Text, Data, and MetaData Extended ASCII codes letters and characters well, but most documents contain more than just text. For example: Format information like font, font size, justification, etc. Formatting characters could be added to ASCII, but that mixes the content with the description of its form (metadata) Metadata can be represented using tags, as in HTML

Using Tags to Encode Oxford English Dictionary (OED) printed version is 20 volumes! We could type the entire contents as ASCII characters (in about 120 person-years), but searching would be difficult Suppose you search for the word "set." It is included in many other words like closet, horsetail, settle, etc. How will the software know what characters comprise the definition of set? Incorporate metadata

Structure Tags A special set of tags was developed to specify OED's structure <hw> means headword, the word being defined Other tags label pronunciation <pr>, phonetic notation <ph>, parts of speech <ps> The tags do not print. They are there only to specify structure so the computer knows what part of the dictionary it is looking at

Why "BYTE" Why is BYTE spelled with a Y? Consider Memory errors and Parity The Engineers at IBM were looking for a word for a quantity of memory between a bit and a word (usually 32 bits). Bite seemed appropriate, but they changed the i to a y, to minimize typing errors Half a Byte = Nibble (Nybble?)