Data Encoding COSC 1301.


Data Encoding COSC 1301

Computers and Data: Computers store information as sequences of bits. Computers store many types of data: numbers, text, audio, images, video.

Standards: Look around. How many items do you see that are based on a standard? Standards make our lives simpler and more efficient. Sometimes there aren't any.

Not Much of a Standard: But getting better. http://www.puremobile.ca/insiderblog/tag/chargers

A Small Number of Standards: Imperial vs. metric. http://www.drumsanders.net/drillbitsset-c-8_15_16.html

A Small Number of Standards: Candelabra vs. Edison. http://www.lightbulbmarket.com/product/048702_40-Watt-BA9-Philips-DuraMax-Clear-Long-Life-Bent-Tip-Candelabra-Bulb http://stuartmillerdesign.co.uk/Other%20Services/otherservices.htm

A Small Number of Standards: Candelabra vs. Edison. Even when new things are introduced, they must conform. http://www.lightbulbmarket.com/product/048702_40-Watt-BA9-Philips-DuraMax-Clear-Long-Life-Bent-Tip-Candelabra-Bulb http://stuartmillerdesign.co.uk/Other%20Services/otherservices.htm http://www.seanpaune.com/2007/06/17/compact-fluorescent-light-bulb/

Bitten by Lack of a Single Standard: http://www.iecee.org/whatsnew/archives/whatsnew_2004.html

Bitten by Lack of a Single Standard: Region codes are intended to make it hard for things to be universal. http://en.wikipedia.org/wiki/DVD_region_code

Wishing for Standards: Go to this article to see how complicated the tire size situation is: http://www.sheldonbrown.com/tire-sizing.html Image: http://blogs.villagevoice.com/dailymusto/2010/11/my_rules_for_bi.php

A General Trend Toward Standards: Word Sizes of Early Computers
EDVAC: 44 bits (1947)
MARK 1: 40 bits (1948)
EDSAC: 17 bits (1949). EDSAC had 18 bits, but the first one wasn't usable.
CSIRAC (Australian): 20 bits
UNIVAC I: 12 digits (1951)
IBM 701: 36 bits (1952)
CDC 1604: 48 bits (1959)
CDC 6600: 60 bits (1964)
IBM 360: 32 bits (1965)
x-86: 16 bits (1978)
x-32: 1986
x-64: 64 bits (2004)
Word size: the natural unit of data for a particular processor.
Word: a fixed-size group of bits that is handled as a unit by the hardware of the processor.
Word size = number of bits in a word.
Typically registers are word sized. The largest piece of data that can be transferred to and from memory in a single operation is a word. The largest address size (denoting a location in memory) is a word.
Modern processors: 8, 16, 32 or 64 bits.
Storage for any type of data is typically allocated in multiples of the word size.

Standard: Integer Representation. Representing integers in base 2: 93 is 01011101 (shown here in 8 bits). It is obvious how to represent non-negative integers.
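
The base-2 form of 93 can be checked with a short sketch. The function name `to_binary` and the default width of 8 bits are assumptions made for illustration; the slide does not fix a width at this point.

```python
def to_binary(n: int, width: int = 8) -> str:
    """Convert a non-negative integer to a base-2 string, padded with leading zeros."""
    if n < 0:
        raise ValueError("this sketch handles non-negative integers only")
    digits = []
    while n > 0:
        digits.append(str(n % 2))  # the remainder is the next bit, least significant first
        n //= 2
    bits = "".join(reversed(digits)) or "0"
    return bits.zfill(width)       # pad with leading zeros up to the chosen width


print(to_binary(93))  # 01011101
```

Python's built-in `format(93, "08b")` gives the same string; the loop is spelled out only to show the repeated division by 2.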

Integers: It is obvious how to represent non-negative integers in base 2: 93 is 01011101 (shown here in 8 bits). But what about -93? One option is a sign bit: use the leftmost bit for the sign, giving 11011101.

Integers: It is obvious how to represent non-negative integers. But what about -93? Using a sign bit has a problem: there are two representations of zero, positive zero and negative zero. That is unnecessary complexity. Better representations make it easier for the computer.

Two's Complement: Negative Integers. Start with 93: 01011101. Flip the bits: 10100010. Then add 1: 10100011, which is -93. Now addition and subtraction work. You must tell me how many bits to use, and add leading 0s as needed. A good explanation of why it works: http://www.cs.cornell.edu/~tomf/notes/cps104/twoscomp.html
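
The flip-and-add-1 recipe can be sketched in a few lines. The function name `twos_complement` and the 8-bit default are assumptions for illustration; as the slide says, the number of bits has to be chosen up front.

```python
def twos_complement(n: int, bits: int = 8) -> str:
    """Return the two's complement bit pattern of n using the given number of bits."""
    if not -(1 << (bits - 1)) <= n < (1 << (bits - 1)):
        raise ValueError("value does not fit in the requested number of bits")
    if n >= 0:
        return format(n, f"0{bits}b")
    mask = (1 << bits) - 1
    flipped = ~(-n) & mask                            # flip the bits of the magnitude
    return format((flipped + 1) & mask, f"0{bits}b")  # then add 1


print(twos_complement(93))   # 01011101
print(twos_complement(-93))  # 10100011

# Adding the two patterns and keeping only 8 bits gives zero, so addition works.
total = (int(twos_complement(93), 2) + int(twos_complement(-93), 2)) & 0xFF
print(total)  # 0
```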

A Problem: What should we do about 104.23? If we always want two places after the decimal point, then we could write 10423 and always treat it as though the decimal point were there. Sort of like counting cents instead of dollars. But we want a more general solution.
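
The "count cents instead of dollars" idea can be sketched as fixed-point arithmetic on plain integers. The names `SCALE`, `to_fixed`, and `from_fixed` are hypothetical, and this sketch only handles non-negative values with at most two digits after the point.

```python
SCALE = 100  # two decimal places, like cents per dollar


def to_fixed(text: str) -> int:
    """Parse a non-negative decimal string into an integer count of hundredths."""
    whole, _, frac = text.partition(".")
    frac = (frac + "00")[:2]  # pad or trim to exactly two fractional digits
    return int(whole) * SCALE + int(frac)


def from_fixed(value: int) -> str:
    """Put the decimal point back where we agreed it always goes."""
    return f"{value // SCALE}.{value % SCALE:02d}"


print(to_fixed("104.23"))                               # 10423
print(from_fixed(to_fixed("104.23") + to_fixed("0.77")))  # 105.00
```

The position of the point is fixed by convention rather than stored, which is why this approach does not generalize to numbers that need a different number of places.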

Floating Point Numbers: Floating point representation is exponential/scientific notation. Example: 123.45 can be represented as a decimal floating-point number with the integer 12345 as the significand and -2 as the exponent (and 10 as the base). Its value is given by the following: 123.45 = 12345 × 10^-2. See the following slide to see how a computer stores this.
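
As a rough check of the significand/exponent split, Python's standard `decimal` module exposes exactly these pieces for a base-10 number. This is only an illustration of the notation, not of how binary floating point hardware stores the value.

```python
from decimal import Decimal

# as_tuple() returns the sign, the significand digits, and the base-10 exponent.
sign, digits, exponent = Decimal("123.45").as_tuple()
significand = int("".join(str(d) for d in digits))

print(significand, exponent)  # 12345 -2

# Rebuild the value: significand × 10^exponent
rebuilt = Decimal(significand) * Decimal(10) ** exponent
print(rebuilt)  # 123.45
```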

IEEE Standard - Floating Point. Single Format: 32 bits (4 bytes) to store a floating point number: 1 bit for the sign, 8 bits for the exponent, 23 bits for the mantissa or significand. Double Format: 64 bits (8 bytes) to store a floating point number: 1 bit for the sign, 11 bits for the exponent, 52 bits for the mantissa or significand.
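
The 1 + 8 + 23 layout of the single format can be inspected with the standard `struct` module. The helper name `float32_fields` is an assumption for illustration; the sketch only splits the stored bits into the three fields and does not decode them back into a value.

```python
import struct


def float32_fields(x: float):
    """Split x, stored as an IEEE single, into (sign, exponent, fraction) bit strings."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))  # the same 4 bytes read as an integer
    bits = format(raw, "032b")
    return bits[0], bits[1:9], bits[9:]                  # 1 + 8 + 23 bits


print(float32_fields(1.0))      # ('0', '01111111', '00000000000000000000000')
print(float32_fields(-2.0))     # ('1', '10000000', '00000000000000000000000')
print(float32_fields(0.15625))  # ('0', '01111100', '01000000000000000000000')
```

The exponent field is stored with an offset (bias) of 127 in the single format, which is why 1.0 shows 01111111 rather than all zeros.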

Text: To represent text digitally, we need to be able to represent every possible character that may appear. English: Computers have revolutionized our world. Japanese: コンピュータは私たちの世界に革命をもたらしました。 French: Les ordinateurs ont révolutionné notre monde.

Text: Decide how many characters we need to represent. Then determine the required number of bits. English: 26 letters, 52 for upper and lower case. Plus punctuation... And other languages? Character set: a list of characters and the codes used to represent each. Several character sets have been used over the years; a standard makes processing text easier.

ASCII: American Standard Code for Information Interchange. 1963: 7 bits per character = 128 different symbols. Thought to be enough at the time. The 8th bit in each character byte was used as a check bit or parity bit, to check for errors in transmission of data. Later: Latin-1, an extended ASCII character set. All 8 bits are used to represent a character, so it can represent 256 characters, including accented characters and other special characters.
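
A parity bit can be sketched as follows. The slide does not say whether even or odd parity was used, so even parity is an assumption here, as is the helper name `with_even_parity`.

```python
def with_even_parity(ch: str) -> str:
    """Return 8 bits: an even-parity bit followed by the 7-bit ASCII code of ch."""
    code = ord(ch)
    if code > 127:
        raise ValueError("not a 7-bit ASCII character")
    parity = bin(code).count("1") % 2  # 1 only if the 7 data bits hold an odd number of 1s
    return f"{parity}{code:07b}"


print(with_even_parity("A"))  # 01000001
print(with_even_parity("C"))  # 11000011
```

With even parity every transmitted byte has an even number of 1 bits, so a single flipped bit shows up as an odd count at the receiving end.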

ASCII: http://www.krisl.net/cgi-bin/ascbin.pl http://www.asciitable.com/ Note that most of the control codes are obsolete now.

Representing Text: Fourscore and seven … F = 01000110, o = 01101111, u = 01110101, r = 01110010.

Representing Text: "The number is 17." in hex: 54 68 65 20 6E 75 6D 62 65 72 20 69 73 20 31 37 2E
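
The binary and hex listings on these two slides can be reproduced with a short sketch. The helper names `to_bits` and `to_hex` are made up for illustration.

```python
def to_bits(text: str) -> str:
    """ASCII codes of text as 8-bit binary groups."""
    return " ".join(f"{b:08b}" for b in text.encode("ascii"))


def to_hex(text: str) -> str:
    """ASCII codes of text as two-digit hex groups."""
    return " ".join(f"{b:02X}" for b in text.encode("ascii"))


print(to_bits("Four"))
# 01000110 01101111 01110101 01110010
print(to_hex("The number is 17."))
# 54 68 65 20 6E 75 6D 62 65 72 20 69 73 20 31 37 2E
```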

Computing with Text: Suppose we want to capitalize this entire paragraph: Computers have revolutionized our world. They have changed the course of our daily lives, the way we do science, the way we entertain ourselves, the way that business is conducted, and the way we protect our security. Let’s go back and look at the ASCII table to see how to do that. Subtract octal 40 (decimal 32) from all the lowercase letters (but not the punctuation or the letters that are already capitals). Do it in Python with st.upper()
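
In the ASCII table each lowercase letter sits octal 40 (decimal 32) above its uppercase counterpart, so capitalizing by hand looks roughly like this. The function name `capitalize_ascii` is an assumption for illustration; in practice the built-in `upper()` method does the job.

```python
def capitalize_ascii(text: str) -> str:
    """Uppercase ASCII letters by shifting lowercase codes down by octal 40."""
    out = []
    for ch in text:
        if "a" <= ch <= "z":
            out.append(chr(ord(ch) - 0o40))  # 'a' (octal 141) becomes 'A' (octal 101)
        else:
            out.append(ch)                   # capitals, spaces, punctuation stay as they are
    return "".join(out)


st = "Computers have revolutionized our world."
print(capitalize_ascii(st))  # COMPUTERS HAVE REVOLUTIONIZED OUR WORLD.
print(capitalize_ascii(st) == st.upper())  # True
```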

When We Need More Characters: What about things like 简体字? The Chinese string means “simple writing”. ASCII is not enough for international use.

When We Need More Characters: What about things like 简体字? The Chinese string means “simple writing”. Answer: Unicode. A conversion applet: http://www.pinyin.info/tools/converter/chars2uninumbers.html

Unicode: Previously, a letter mapped directly to some bits: A encoded as 0100 0001. In Unicode, a letter maps to a code point, a number like U+0639. U+ means Unicode, and the numbers are hexadecimal. Every character has a Unicode code point. Examples: U+0041 is the English letter A; U+0639 is the Arabic letter Ain. This doesn't indicate how the code point is encoded as a sequence of bits, though.
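
In Python, `ord()` gives the code point of a character and `chr()` goes the other way, so the U+ notation can be reproduced with a small sketch. The helper name `code_point` is hypothetical.

```python
def code_point(ch: str) -> str:
    """Format the Unicode code point of a single character in U+XXXX notation."""
    return f"U+{ord(ch):04X}"


print(code_point("A"))       # U+0041
print(code_point("\u0639"))  # U+0639 (Arabic letter Ain)
print(chr(0x0041))           # A
```

Nothing here says how the number is laid out in bytes; that is the job of an encoding such as UTF-8.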

Unicode Example: Hello. 5 code points, one code point (i.e., number) per letter: U+0048 U+0065 U+006C U+006C U+006F. How is this stored in memory? There are different standards for this. One standard: UTF-8, a standard system for storing strings of Unicode code points in binary (i.e., U+DDDD stored in some number of bytes).

UTF-8: Code points 0-127 are stored in one byte, so English text looks the same in UTF-8 as in ASCII (backwards compatible). Code points 128 and higher take 2, 3, or 4 bytes (the original design allowed up to 6, but the current standard limits UTF-8 to 4). Hello: U+0048 U+0065 U+006C U+006C U+006F. Stored as: 48 65 6C 6C 6F (same as ASCII). For Hebrew characters, accented letters, etc., you may need more bytes.
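
The byte counts can be checked with Python's built-in `str.encode`. The helper name `utf8_hex` is an assumption for illustration.

```python
def utf8_hex(text: str) -> str:
    """Show the UTF-8 bytes of text as two-digit hex groups."""
    return " ".join(f"{b:02X}" for b in text.encode("utf-8"))


print(utf8_hex("Hello"))   # 48 65 6C 6C 6F  (one byte each, same as ASCII)
print(utf8_hex("é"))       # C3 A9           (two bytes)
print(utf8_hex("\u0639"))  # D8 B9           (Arabic letter Ain, two bytes)
print(len("简".encode("utf-8")))  # 3
```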