
1 Computer Organization: Computer Architecture, Data Representation, Information Theory
i206, Fall 2010, John Chuang
Some slides adapted from Marti Hearst, Brian Hayes, or Glenn Brookshear

2 Computer Organization
[Course concept map: bits and bytes, binary numbers and number systems, Boolean logic (truth tables, Venn diagrams, DeMorgan's Law), gates and circuits, CPU (ALUs, registers, program counter, instruction register), machine and assembly instructions, programs and algorithms, memory hierarchy, operating systems and processes, compilers/interpreters, data structures and analysis (Big-O), design methodologies (UML, CRC), networks and distributed systems, security and cryptography, formal models (finite automata, regex). Today's portion covers data representation, data compression, and information entropy / Huffman coding.]

3 Computers of Different Shapes and Sizes

4 Computer System Components
1. Hardware – provides basic computing resources (CPU, memory, I/O devices, network)
2. Operating system – controls and coordinates the use of the hardware among the various application programs for the various users
3. Application programs – define the ways in which the system resources are used to solve the computing problems of the users (compilers, database systems, video games, business programs)
4. Users (people, machines, other computers)
Source: Silberschatz, Galvin, Gagne

5 Typical Computer System Hardware Components (Brookshear Fig 2.13)
- Question: What happens in a computer system when I execute an application program?

6 Data Representation
[The same course concept map as slide 2, now highlighting the data representation topics: numbers, text, audio, images and video; data compression (lossless vs. lossy); information entropy and Huffman coding.]

7 Data Representation
- All data stored in and manipulated by a digital computer are represented by patterns of bits:
  - Numbers
  - Text characters
  - Sound
  - Images and videos
  - Anything else…
- Bit = Binary Digit = a symbol whose meaning depends on the application at hand
- Binary: takes on values of '0' or '1'
  - Or equivalently, "FALSE" or "TRUE", "OFF" or "ON"

8 [Image-only slide; no text]

9 Number Systems
- Hexadecimal notation (base 16) is particularly useful as a shorthand notation for streams of bits:
  - Long bit streams are difficult to make sense of
  - The lengths of most bit streams used in a machine are multiples of four
  - Hexadecimal notation is more compact
  - Less error-prone to read, copy, or write by hand
- Example: Ethernet MAC addresses (see the sketch below)
  - 48 binary bits --> 12 hexadecimal digits
  - e.g., 00:1e:c2:b7:e9:77
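
A short Python sketch (illustrative, not from the original deck) of why hexadecimal works as shorthand: each hex digit maps to exactly four bits, so the 48-bit MAC address above can be read as 12 hex digits.

```python
# Each hexadecimal digit corresponds to exactly 4 bits, so hex is a
# compact, less error-prone way to write long bit strings.
mac = "00:1e:c2:b7:e9:77"                        # example MAC address from the slide

hex_digits = mac.replace(":", "")                # 12 hex digits
bits = bin(int(hex_digits, 16))[2:].zfill(48)    # the same value as 48 binary bits

print(hex_digits)   # 001ec2b7e977
print(bits)         # 000000000001111011000010101101111110100101110111
print(len(bits))    # 48
```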

10 Decimal-Binary Conversion
- Decimal to binary: repeatedly divide by 2 and collect the remainders (worked example on slide)
- Binary to decimal: sum each bit times its positional weight, a power of 2 (worked example on slide)
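
The two conversion procedures can be sketched in a few lines of Python (illustrative; Python's built-in bin() and int(s, 2) do the same job).

```python
def decimal_to_binary(n):
    """Repeatedly divide by 2; the remainders, read in reverse, are the bits."""
    if n == 0:
        return "0"
    bits = []
    while n > 0:
        bits.append(str(n % 2))   # remainder is the next (least significant) bit
        n //= 2
    return "".join(reversed(bits))

def binary_to_decimal(bits):
    """Sum each bit times its positional weight (a power of 2)."""
    value = 0
    for bit in bits:
        value = value * 2 + int(bit)
    return value

print(decimal_to_binary(13))       # 1101
print(binary_to_decimal("1101"))   # 13
```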

11 Binary Representation of Fractions and Floating-Point Numbers
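
The slide's worked figures are not reproduced here, but the idea can be sketched: a fractional value is expanded in negative powers of two (halves, quarters, eighths, ...), and many decimal fractions have no finite binary expansion. The helper below is a hypothetical illustration, not from the deck.

```python
def fraction_to_binary(x, max_bits=10):
    """Expand a fraction 0 <= x < 1 in negative powers of 2 (repeated doubling)."""
    bits = []
    while x > 0 and len(bits) < max_bits:
        x *= 2
        bit = int(x)          # 1 if the next power of 1/2 fits, else 0
        bits.append(str(bit))
        x -= bit
    return "0." + "".join(bits)

print(fraction_to_binary(0.625))   # 0.101  (1/2 + 1/8)
print(fraction_to_binary(0.1))     # 0.0001100110  (0.1 has no finite binary form)
```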

12 Representing Text
- Each printable character (letter, punctuation, etc.) is assigned a unique bit pattern.
  - ASCII = 7-bit values for most symbols used in written English text (see Appendix A in Brookshear)
  - Unicode = 16-bit values for most symbols used in most world languages today
  - ISO proposed standard = 32-bit values
- Example: the message 'Hello.' in ASCII encoding (shown below)
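
A quick illustrative check of the slide's example in Python, printing the 7-bit ASCII pattern for each character of 'Hello.':

```python
message = "Hello."
for ch in message:
    code = ord(ch)                        # ASCII/Unicode code point
    print(ch, code, format(code, "07b"))  # character, decimal, 7-bit pattern

# H 72 1001000
# e 101 1100101
# l 108 1101100
# l 108 1101100
# o 111 1101111
# . 46 0101110
```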

13 Audio Encoding
- Telephone:
  - 8,000 samples per second
  - 8 bits per sample
  - -> 64 kbps
- CD-quality audio:
  - 44,100 samples per second
  - 16 bits per sample
  - 2 channels for stereo
  - -> 1.4 Mbps
  - -> a 60-minute music CD takes ~630 MB
- Many audio encoding standards (e.g., H.323 for Voice-over-IP, MP3 for music)
[Figure: the sound wave represented by the sample sequence 0, 1.5, 2.0, 1.5, 2.0, 3.0, 4.0, 3.0, 0]
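
The slide's bit-rate arithmetic, spelled out (an added sketch of the calculation, not part of the original deck):

```python
# Telephone-quality audio
telephone_bps = 8_000 * 8            # samples/sec * bits/sample
print(telephone_bps)                 # 64000 -> 64 kbps

# CD-quality stereo audio
cd_bps = 44_100 * 16 * 2             # samples/sec * bits/sample * channels
print(cd_bps)                        # 1411200 -> ~1.4 Mbps

# A 60-minute CD, in megabytes
cd_bytes = cd_bps * 60 * 60 / 8
print(cd_bytes / 1_000_000)          # ~635 MB (the slide rounds to ~630 MB)
```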

14 Representation of Images and Video
- Image encoding:
  - Number of bits per pixel (e.g., 24 bits for RGB color)
  - Number of pixels in image (e.g., 480*640)
- Video encoding:
  - Number of bits per pixel
  - Number of pixels per frame
  - Number of frames per second

15 Video Encoding Examples
- Uncompressed video:
  - 8 bits per pixel
  - 480*640 pixels per frame
  - 24 frames per second
  - -> ~56 Mbps
- Uncompressed HDTV:
  - 24 bits per pixel
  - 1920*1080 pixels per frame
  - 24-60 frames per second
  - -> 1000-3000 Mbps
- With compression: ~1.5 Mbps after MPEG-1 compression; 10-30 Mbps after MPEG-2 compression (see the arithmetic below)
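
The same back-of-the-envelope arithmetic for video, as an added sketch (the slide's ~56 Mbps figure corresponds to dividing by 2^20 rather than by one million):

```python
# Uncompressed standard video
video_bps = 8 * 480 * 640 * 24        # bits/pixel * pixels/frame * frames/sec
print(video_bps / 1_000_000)          # ~59 Mbps   (58982400 / 2**20 = 56.25)

# Uncompressed HDTV at 24 and 60 frames per second
hdtv_low  = 24 * 1920 * 1080 * 24
hdtv_high = 24 * 1920 * 1080 * 60
print(hdtv_low / 1_000_000, hdtv_high / 1_000_000)   # ~1194 to ~2986 Mbps
```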

16 Data Compression
- Encoding data using fewer bits
- Many encoding standards (e.g., JPEG, MPEG, MP3) include data compression
- Lossless:
  - data received = data sent
  - used for executables, text files, numeric data (see the sketch below)
- Lossy:
  - data received != data sent
  - used for images, video, audio
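
As a concrete illustrative example of a lossless scheme (in the spirit of the "RLE-like" stage mentioned on the JPEG slide), here is a tiny run-length encoder: the decoded output is identical to the input, bit for bit.

```python
from itertools import groupby

def rle_encode(data):
    """Replace each run of identical symbols with a (symbol, count) pair."""
    return [(symbol, len(list(run))) for symbol, run in groupby(data)]

def rle_decode(pairs):
    """Expand each (symbol, count) pair back into its run."""
    return "".join(symbol * count for symbol, count in pairs)

original = "aaaabbbcca"
encoded = rle_encode(original)
print(encoded)                          # [('a', 4), ('b', 3), ('c', 2), ('a', 1)]
assert rle_decode(encoded) == original  # lossless: data received == data sent
```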

17 JPEG
- JPEG: Joint Photographic Experts Group (ISO/ITU)
- Lossy still-image compression
- Three-phase process:
  - process the image in 8x8 pixel blocks
  - DCT: transforms the signal from the spatial domain into an equivalent signal in the frequency domain (lossless)
  - apply quantization to the results (lossy)
  - RLE-like encoding (lossless)
- Compression ratio of 30-to-1 typical
Pipeline: source image -> DCT -> quantization -> encoding -> compressed image
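
A rough numerical sketch of the middle two phases using NumPy (illustrative only; real JPEG uses standard 8x8 quantization tables rather than the single step size assumed here): a 2-D DCT of one 8x8 block followed by quantization, which is the step where information is discarded.

```python
import numpy as np

N = 8
# DCT-II basis matrix: T[u, x] = alpha(u) * cos((2x+1) * u * pi / (2N))
T = np.array([[np.sqrt((1 if u == 0 else 2) / N) *
               np.cos((2 * x + 1) * u * np.pi / (2 * N))
               for x in range(N)] for u in range(N)])

block = np.random.randint(0, 256, (N, N)) - 128   # one 8x8 block, level-shifted

coeffs = T @ block @ T.T                  # forward 2-D DCT (lossless transform)

q = 16                                    # a single illustrative quantization step
quantized = np.round(coeffs / q)          # lossy: small/high-frequency terms -> 0

reconstructed = T.T @ (quantized * q) @ T # dequantize + inverse DCT
print(np.abs(block - reconstructed).max())  # nonzero: some detail was lost
```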

18 MPEG
- Moving Picture Experts Group
- Lossy compression of video:
  - First approximation: JPEG on each frame
  - Also remove inter-frame redundancy
- Frame types:
  - I frames: intrapicture
  - P frames: predicted picture
  - B frames: bidirectional predicted picture
- Example sequence transmitted as I P B B I B B
- Compression ratio: typically 90-to-1
  - 30-to-1 for I frames; P/B frames gain another 3-5x
[Figure: an input stream of frames 1-7 encoded as I, B, P, B, ... frames; forward and bidirectional prediction produce the compressed stream]

19 Video Encoding (with Compression)
[Slide 15 revisited: the same uncompressed rates (~56 Mbps for standard video, 1000-3000 Mbps for HDTV) drop to ~1.5 Mbps with MPEG-1 compression and 10-30 Mbps with MPEG-2 compression.]

20 MP3
- MPEG Audio compression
- Three layers with different data rates:

  Coding      Bit rate    Compression factor
  Layer I     384 kbps    4
  Layer II    192 kbps    8
  Layer III   128 kbps    12

- MP3 is actually MPEG-1 Layer III

21 Data Compression: Entropy Encoding
- Recall that ASCII coding uses a fixed 7 bits to represent each character
- Can we devise a more intelligent encoding scheme, given the knowledge that some characters (e.g., "a", "e") occur more frequently than others (e.g., "q", "z")?
- Huffman code: variable-length entropy encoding
  - Intuition: assign shorter codes to characters that occur more frequently

22 Information Theory in One Slide
- Claude Shannon, 1948
- A set of messages {M} = M_1, M_2, …, M_i, …, M_N
- P_i = probability of occurrence of M_i
- Information content (self-information) of M_i: I_i = log2(1 / P_i)
- Information entropy, the expected value Sum[P_i * I_i], is a measure of the uncertainty associated with a random variable
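
A small illustrative computation of these quantities in Python (the message probabilities below are made up for the example):

```python
import math

# Hypothetical message probabilities P_i (they must sum to 1)
P = {"a1": 0.5, "a2": 0.25, "a3": 0.125, "a4": 0.125}

# Self-information of each message: I_i = log2(1 / P_i)
I = {m: math.log2(1 / p) for m, p in P.items()}
print(I)   # {'a1': 1.0, 'a2': 2.0, 'a3': 3.0, 'a4': 3.0}

# Entropy = expected information per message = Sum[P_i * I_i]
entropy = sum(p * I[m] for m, p in P.items())
print(entropy)   # 1.75 bits per message
```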

23 Huffman Coding
- Choose codes such that messages with lower probability (higher information content) have longer code lengths
- Given: messages with known weights (probabilities of occurrence):
  - Construct a binary tree with each message as a leaf node
  - Iteratively merge the two nodes with the lowest weights
  - Assign a bit (0/1) to each branch
  - Read the tree from root to leaf to obtain the code for each message
- Performance:
  - L_i = code length for M_i
  - Average message length L_M = Sum[P_i * L_i]
  - Average information per message = Sum[P_i * I_i]
- Example code (see the sketch below):

  Message   Code
  a1        0
  a2        10
  a3        111
  a4        110

http://en.wikipedia.org/wiki/Huffman_code
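
A compact Huffman-coding sketch in Python using heapq. The weights below are assumed (not given on the slide), chosen so the resulting code lengths match the a1-a4 table above; tie-breaking, and therefore the exact 0/1 labels, may differ from the slide's tree.

```python
import heapq
from itertools import count

def huffman_codes(weights):
    """Build a Huffman code by repeatedly merging the two lowest-weight nodes."""
    tiebreak = count()   # distinct counters avoid comparing the code dicts on ties
    heap = [(w, next(tiebreak), {m: ""}) for m, w in weights.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, codes1 = heapq.heappop(heap)   # two lowest-weight subtrees
        w2, _, codes2 = heapq.heappop(heap)
        merged = {m: "0" + c for m, c in codes1.items()}   # label one branch 0 ...
        merged.update({m: "1" + c for m, c in codes2.items()})   # ... the other 1
        heapq.heappush(heap, (w1 + w2, next(tiebreak), merged))
    return heap[0][2]

weights = {"a1": 0.5, "a2": 0.25, "a3": 0.125, "a4": 0.125}   # assumed P_i
codes = huffman_codes(weights)
print(codes)    # e.g. {'a1': '0', 'a2': '10', 'a3': '110', 'a4': '111'}

avg_len = sum(weights[m] * len(c) for m, c in codes.items())
print(avg_len)  # 1.75 bits/message, matching the entropy of this distribution
```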

24 Summary
- Different types of data (e.g., numbers, text characters, sounds, images, videos) can all be represented using binary bits
- Compression techniques allow compact encoding of information
- There are many standards for encoding different data types
  - Compression is often part of the encoding standard
- Up next: how these binary-represented data can be stored, communicated, and operated upon

25 Upcoming Events
- Follow-up reading for today's lecture:
  - Wikipedia articles on 'information theory' and 'Huffman coding'
- Reading assignment for next lecture:
  - Brookshear Sections 2.1-2.5, Appendix C
  - Optional: Brookshear Section 2.6

