LING 408/508: Programming for Linguists, Lecture 2, August 26th

Today’s Topics continuing on from last time … Homework 1

Administrivia No class on: – Monday, September 7th (Labor Day) – Wednesday, November 11th (Veterans Day) – the week after September 11th (out of town), plus Monday the 21st – Monday, October 12th

Introduction: data types What if you want to store numbers larger than 32 bits allow? – Binary Coded Decimal (BCD) – 1 byte can code two digits (each digit 0-9 requires 4 bits) – 1 nibble (4 bits) codes the sign (+/-), e.g. hex C for credit (+) and hex D for debit (-) – sizes: 2 bytes (= 4 nibbles), 2.5 bytes (= 5 nibbles)

Introduction: data types Typically, 64 bits (8 bytes) are used to represent floating point numbers (double precision) – e.g. c = 2.99792458 x 10^8 (m/s) – coefficient: 52 bits (implied leading 1, therefore treat as 53) – exponent: 11 bits (usually not 2's complement; unsigned with bias 2^(11-1) - 1 = 1023) – sign: 1 bit (+/-) – C types: float, double – x86 CPUs have a built-in floating point coprocessor (x87) with 80-bit registers – typical use: e.g. probabilities (Source: Wikipedia)
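
To make the sign/exponent/coefficient layout concrete, here is a minimal Python sketch that pulls the three fields out of a double. The helper name `double_fields` is made up for illustration, and the code assumes the standard IEEE 754 layout (1 + 11 + 52 bits), for which the bias works out to 2^(11-1) - 1 = 1023.

```python
import struct

BIAS = 2 ** (11 - 1) - 1  # 1023 for an 11-bit exponent field


def double_fields(x):
    """Split a 64-bit double into (sign, exponent field, fraction field).

    Hypothetical helper for illustration only.
    """
    # Reinterpret the 8 bytes of the double as one unsigned 64-bit integer.
    (bits,) = struct.unpack(">Q", struct.pack(">d", float(x)))
    sign = bits >> 63                  # 1 bit
    exponent = (bits >> 52) & 0x7FF    # 11 bits, stored with the bias added
    fraction = bits & ((1 << 52) - 1)  # 52 bits, the leading 1 is not stored
    return sign, exponent, fraction


sign, exponent, fraction = double_fields(2.99792458e8)
print("sign:", sign)
print("exponent field:", exponent, "-> power of two:", exponent - BIAS)
print("fraction bits:", f"{fraction:052b}")
```

Subtracting `BIAS` from the stored exponent field gives the actual power of two, which is why the field can stay unsigned.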

Introduction: data types Next time, we'll talk about the representation of characters (letters, symbols, etc.)

Example 1 Recall the speed of light: c = 2.99792458 x 10^8 (m/s) 1. Can a 4 byte integer be used to represent c exactly? – 4 bytes = 32 bits – 32 bits in 2's complement format – Largest positive number is 2^31 - 1 = 2,147,483,647 – c = 299,792,458, which is smaller, so yes
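
The comparison in Example 1 can be checked directly. This is only a sketch: the variable names are illustrative, and `struct.pack("<i", ...)` stands in for "store in a 4-byte two's complement integer".

```python
import struct

C = 299_792_458          # speed of light in m/s
INT32_MAX = 2**31 - 1    # largest value of a 32-bit two's complement integer

print(INT32_MAX)         # 2147483647
print(C <= INT32_MAX)    # True

# struct.pack raises struct.error if the value does not fit in 4 bytes.
packed = struct.pack("<i", C)
restored = struct.unpack("<i", packed)[0]
print(len(packed), restored == C)
```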

Example 2 Recall the speed of light: c = 2.99792458 x 10^8 (m/s) 2. How much memory would you need to encode c using BCD notation? – 9 digits – each digit requires 4 bits (a nibble) – BCD notation includes a sign nibble – total is 10 nibbles = 5 bytes
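
As a rough sketch of the BCD count, the function below packs one decimal digit per nibble and adds a sign nibble (hex C for +, hex D for -). Two things are assumptions rather than something the slide specifies: the sign nibble is placed last, and an odd nibble count is padded with a leading zero nibble. The name `to_packed_bcd` is made up for illustration.

```python
def to_packed_bcd(n):
    """Encode an integer as packed BCD (hypothetical helper).

    Assumptions: one nibble per digit, sign nibble last (0xC = +, 0xD = -),
    and a leading zero nibble as padding when the nibble count is odd.
    """
    sign = 0xC if n >= 0 else 0xD
    nibbles = [int(d) for d in str(abs(n))] + [sign]
    if len(nibbles) % 2:
        nibbles.insert(0, 0)
    # Combine each pair of nibbles into one byte.
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))


encoded = to_packed_bcd(299_792_458)
print(encoded.hex(), len(encoded))  # 299792458c 5
```

Nine digit nibbles plus one sign nibble give ten nibbles, which is exactly 5 bytes with no padding needed.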

Example 3 Recall the speed of light: c = 2.99792458 x 10^8 (m/s) 3. Can the 64 bit floating point representation (double) encode c without loss of precision? – Recall significand precision: 53 bits (52 explicitly stored) – 2^53 - 1 = 9,007,199,254,740,991 – almost 16 digits, so the 9 digits of c fit exactly
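
Python's `float` is a 64-bit double on the usual platforms, so the claim in Example 3 can be tested with a round trip. This is a small sketch with illustrative variable names.

```python
C = 299_792_458
MAX_EXACT = 2**53 - 1   # every integer up to this value fits in 53 bits

print(MAX_EXACT)        # 9007199254740991

# Converting c to a double and back loses nothing.
round_trip_ok = int(float(C)) == C
print(round_trip_ok)    # True

# Just past 53 bits, neighbouring integers collapse onto the same double.
collapsed = float(2**53) == float(2**53 + 1)
print(collapsed)        # True
```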

Example 4 Recall the speed of light: c = 2.99792458 x 10^8 (m/s) The 32 bit floating point representation (float), sometimes called single precision, is composed of 1 bit sign, 8 bits exponent (unsigned with bias 2^(8-1) - 1 = 127), and 23 bits coefficient (24 bits effective). Can it represent c without loss of precision? – 2^24 - 1 = 16,777,215 – Nope: c = 299,792,458 needs more than 24 significant bits
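
Python has no built-in 32-bit float type, but `struct` can round a value to single precision. The sketch below (with the made-up helper `to_float32`) shows what actually gets stored when c is squeezed into a float.

```python
import struct


def to_float32(x):
    """Round x to the nearest 32-bit float and return it as a Python float.

    Hypothetical helper for illustration only.
    """
    return struct.unpack("<f", struct.pack("<f", float(x)))[0]


C = 299_792_458
print(2**24 - 1)            # 16777215
print(to_float32(C))        # 299792448.0
print(to_float32(C) == C)   # False
```

Between 2^28 and 2^29 a float can only step in multiples of 2^(28-23) = 32, so c is rounded to the nearest multiple, 299,792,448.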

Homework 1 For both solutions, show your work, i.e. how you derived your answer. Pi (π) is an irrational number, so it can't be represented precisely! (Wikipedia)

Homework 1 1. Encode Pi as accurately as possible using both the 64 and 32 bit floating point representations. Instruction: draw the diagram and fill in the 1's and 0's. 2. How many decimal places of precision are provided by each of the 64 and 32 bit floating point representations?

Homework 1 Hints How to encode 1: (exp: stored field = bias, i.e. 2^0; significand digits: 1000…; remember: the leading 1 is implicit and not stored) = 1.000… in binary

Homework 1 Hints How to encode 2: (exp: stored field = bias + 1, i.e. 2^1; significand digits: 1000…) = 10.00… in binary

Homework 1 Hints How to encode 3: (exp: stored field = bias + 1, i.e. 2^1; significand digits: 1100…) = 11.00… in binary

Homework 1 Hints How to encode 4: (exp: stored field = bias + 2, i.e. 2^2; significand digits: 1000…) = 100.0… in binary

Homework 1 Hints How to encode 5: (exp: stored field = bias + 2, i.e. 2^2; significand digits: 1010…) = 101.0… in binary

Homework 1 Hints How to encode 6: (exp: stored field = bias + 2, i.e. 2^2; significand digits: 1100…) = 110.0… in binary

Homework 1 Hints How to encode 7: (exp: stored field = bias + 2, i.e. 2^2; significand digits: 1110…) = 111.0… in binary

Homework 1 Hints How to encode 8: (exp: stored field = bias + 3, i.e. 2^3; significand digits: 1000…) = 1000.0… in binary

Homework 1 Hints Decimal 3.5 is 1.11 x 2^1 = 11.1 in binary

Homework 1 Hints Decimal 3.25 is 1.101 x 2^1 = 11.01 in binary

Homework 1 Hints Decimal … is … x 2^1 = … in binary
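
The hints can be cross-checked by printing the stored fields of a few small values in the 32-bit format. This is a sketch under the usual IEEE 754 single-precision layout (1 + 8 + 23 bits, bias 127); `float32_fields` is a made-up helper name. Note that the printed fraction omits the implicit leading 1, so 3 = 1.1 x 2^1 shows up with fraction bits 1000….

```python
import struct

BIAS32 = 2 ** (8 - 1) - 1  # 127 for an 8-bit exponent field


def float32_fields(x):
    """Split a 32-bit float into (sign, exponent field, fraction field).

    Hypothetical helper for illustration only.
    """
    (bits,) = struct.unpack(">I", struct.pack(">f", float(x)))
    sign = bits >> 31                # 1 bit
    exponent = (bits >> 23) & 0xFF   # 8 bits, stored with the bias added
    fraction = bits & 0x7FFFFF       # 23 bits, leading 1 not stored
    return sign, exponent, fraction


for value in [1, 2, 3, 4, 5, 6, 7, 8, 3.5, 3.25]:
    sign, exponent, fraction = float32_fields(value)
    print(f"{value:>5}: sign={sign} exp={exponent:08b} "
          f"(2^{exponent - BIAS32}) frac={fraction:023b}")
```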

Homework 1 Due Friday night (by midnight in my box) Required format (for all homeworks unless otherwise specified): – Plain text or PDF formats only (no .doc, .docx etc.) – Single file only: cut and paste into one document (no multiple attachments) – Subject line: 408/508 Homework 1 – First line: your full name

Introduction: data types How about letters, punctuation, etc.? ASCII – American Standard Code for Information Interchange – Based on English alphabet (upper and lower case) + space + digits + punctuation + control codes (Teletype Model 33) – Question: how many bits do we need? – 7 bits + 1 bit parity – Remember everything is in binary … – C type: char [Figure: Teletype Model 33 ASR teleprinter (Wikipedia)]

Introduction: data types Order is important in sorting! Digits 0-9: there's a connection with BCD. Notice: codes 30 (hex) through 39 (hex), so the low nibble of each code is the digit itself
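
Both points on this slide are easy to see from Python's `ord`. The sketch below checks that the low nibble of each digit's code equals its BCD value, and shows how code order drives sorting; the sample list is made up for illustration.

```python
# Digits '0'..'9' have codes 0x30..0x39: the low nibble is the BCD digit.
low_nibbles = []
for ch in "0123456789":
    code = ord(ch)
    low_nibbles.append(code & 0x0F)
    print(ch, hex(code), f"{code:07b}")

# Sorting compares codes: space < digits < upper case < lower case.
ordered = sorted(["b", "B", "1", " ", "a"])
print(ordered)  # [' ', '1', 'B', 'a', 'b']
```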

Introduction: data types Parity bit: – transmission can be noisy – a parity bit can be added to the ASCII code – can spot single bit transmission errors – even/odd parity: receiver understands each byte should contain an even/odd number of 1 bits – Example: 0 (zero) is ASCII 30 (hex) = 0110000; even parity: 01100000, odd parity: 01100001 – Checking parity: exclusive or (XOR), a basic machine instruction – A xor B is true if either A or B is true but not both – Example: 01100000 (even parity 0), xor bit by bit: 0 xor 1 = 1, xor 1 = 0, xor 0 = 0, xor 0 = 0, xor 0 = 0, xor 0 = 0, xor 0 = 0 x86 assembly language: 1. PF: even parity flag set by arithmetic ops. 2. TEST: AND (don't store result), sets PF 3. JP: jump if PF set Example: MOV al, … ; TEST al, al ; JP …
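
The XOR check can be sketched in a few lines of Python. The function names are made up for illustration, and the code assumes the parity bit is appended after the 7 ASCII bits (as the last bit of the byte); other conventions put it first.

```python
def parity(value):
    """XOR all bits together: 1 if the count of 1 bits is odd, else 0."""
    result = 0
    while value:
        result ^= value & 1
        value >>= 1
    return result


def add_even_parity(code7):
    """Append a parity bit so the 8-bit result has an even number of 1s.

    Assumption: the parity bit goes last (least significant position).
    """
    return (code7 << 1) | parity(code7)


sent = add_even_parity(ord("0"))       # ASCII 0x30 = 0110000
print(f"{sent:08b}", parity(sent))     # 01100000 0

# Flip one bit in transit: the receiver's parity check now yields 1.
corrupted = sent ^ 0b00000100
print(f"{corrupted:08b}", parity(corrupted))
```

A second flipped bit would bring the parity back to 0, which is why this scheme only spots single bit errors.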

Introduction: data types UTF-8 – standard in the post-ASCII world – backwards compatible with ASCII – (previously, different languages had multi-byte character sets that clashed) – the name stands for Universal Character Set (UCS) Transformation Format, 8-bit (Wikipedia)

Introduction: data types Example: – あ Hiragana letter A: UTF-8 (hex): E38182 – Byte 1: E = 1110, 3 = 0011 – Byte 2: 8 = 1000, 1 = 0001 – Byte 3: 8 = 1000, 2 = 0010 – い Hiragana letter I: UTF-8 (hex): E38184 – Shift-JIS (hex): あ: 82A0, い: 82A2
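
The byte values on this slide can be reproduced with Python's built-in codecs (`utf-8` and `shift_jis` are standard codec names). This is just a sketch that prints the same character in both encodings.

```python
results = {}
for ch in "あい":
    utf8 = ch.encode("utf-8")
    sjis = ch.encode("shift_jis")
    results[ch] = (utf8.hex().upper(), sjis.hex().upper())
    print(ch, f"U+{ord(ch):04X}", "UTF-8:", results[ch][0],
          "Shift-JIS:", results[ch][1])
    # Show each UTF-8 byte in binary, as on the slide.
    print("   ", " ".join(f"{b:08b}" for b in utf8))
```

The same character takes three bytes in UTF-8 and two in Shift-JIS, with no bytes in common, which is why a file read with the wrong encoding turns into nonsense.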

Introduction: data types How can you tell what encoding your file is using? Detecting UTF-8 – Microsoft: the first three bytes in the file are EF BB BF, the byte order mark (not all software understands this; not everybody uses it) – HTML: a charset declaration in the page header, e.g. <meta charset="utf-8"> (not always present) – Analyze the file: look for invalid UTF-8 sequences; if any are found, the file is not UTF-8 … Interesting paper: – archive.mozilla.org/projects/intl/UniversalCharsetDetection.html
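
Two of the checks above (the EF BB BF marker and scanning for invalid sequences) can be sketched as follows. The function name and its three return labels are made up for illustration, and this is far simpler than the detector described in the Mozilla paper.

```python
UTF8_BOM = b"\xef\xbb\xbf"


def looks_like_utf8(data):
    """Classify a byte string as 'bom', 'valid', or 'invalid' UTF-8.

    Hypothetical helper: 'valid' only means no invalid sequence was found,
    not that the file is certainly UTF-8 (plain ASCII also passes).
    """
    if data.startswith(UTF8_BOM):
        return "bom"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "invalid"
    return "valid"


print(looks_like_utf8(UTF8_BOM + b"hello"))
print(looks_like_utf8("あ".encode("utf-8")))
print(looks_like_utf8("あ".encode("shift_jis")))
```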

Introduction: data types Filesystem: – different on different computers: sometimes a problem if you mount filesystems across different systems Examples: – FAT32 (File Allocation Table): DOS, Windows, memory cards; limited to 4GB max file size – ExFAT (Extended FAT): SD cards (> 4GB files) – NTFS (New Technology File System): Windows – ext4 (Fourth Extended Filesystem): Linux – HFS+ (Hierarchical File System Plus): Macs

Introduction: data types Filesystem: – different on different computers: sometimes a problem if you mount filesystems across different systems Files: – Name (path from / root) – Type (e.g. .docx, .pptx, .pdf, .html, .txt) – Owner (usually the creator) – Permissions (for the Owner, Group, or Everyone) – need to be opened (to read from or write to) – Mode: read/write/append – Binary/Text – in all programming languages: open command
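
As one instance of the "open command", here is a short Python sketch of the read/write/append modes and the text/binary distinction. The file name `example.txt` is made up, and a temporary directory is used so nothing is left behind.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "example.txt")  # hypothetical file name

    with open(path, "w", encoding="utf-8") as f:   # write: create or truncate
        f.write("first line\n")
    with open(path, "a", encoding="utf-8") as f:   # append: add to the end
        f.write("あ\n")
    with open(path, "r", encoding="utf-8") as f:   # read as text (decoded)
        text = f.read()
    with open(path, "rb") as f:                    # read as binary (raw bytes)
        raw = f.read()

    # Permissions for owner, group, and everyone, e.g. in rwx notation.
    print(stat.filemode(os.stat(path).st_mode))

print(repr(text))
print(raw)
```

Text mode hands back decoded characters, while binary mode shows the stored bytes, including the three UTF-8 bytes for あ.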

Introduction: data types Text files: – text files have lines: how do we mark the end of a line? – End of line (EOL) control character(s): LF 0x0A (Mac/Linux), CR 0x0D (old Macs), CR+LF 0x0D 0x0A (Windows) – End of file (EOF) control character: EOT 0x04 (aka Control-D) – programming languages: NUL (0x00) is used to mark the end of a string (Figure source: binaryvision.nl)
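
A rough way to see which EOL convention a file uses is to look for the control bytes directly. The helper `detect_eol` below is a made-up name and a simplification: it reports the first convention it finds and does not handle files that mix several.

```python
def detect_eol(data):
    """Guess the end-of-line convention of a byte string (hypothetical helper)."""
    if b"\r\n" in data:      # CR+LF, 0x0D 0x0A
        return "CRLF"
    if b"\r" in data:        # CR only, 0x0D
        return "CR"
    if b"\n" in data:        # LF only, 0x0A
        return "LF"
    return None


print(detect_eol(b"one\ntwo\n"))      # LF
print(detect_eol(b"one\r\ntwo\r\n"))  # CRLF
print(detect_eol(b"one\rtwo\r"))      # CR

# str.splitlines() accepts all three conventions when splitting text.
lines = "a\r\nb\rc\nd".splitlines()
print(lines)                           # ['a', 'b', 'c', 'd']
```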