Management Information Systems Lecture 06 Archiving Information CLARK UNIVERSITY College of Professional and Continuing Education (COPACE)

Plan
- Coding of numeric information
- Coding of textual information
- Coding of graphical information
- Archiving of information
- Shannon-Fano coding
- Huffman coding

Basic terms Coding is the conversion of a message into a code, that is, into a set of symbols transmitted over a communication channel.

Coding of numeric information Binary encoding, used in computing, is based on representing data as a sequence of two characters: 0 and 1. These characters are called binary digits, or bits for short.

Coding of numeric information One bit can represent two values: 0 or 1 (yes or no, true or false, etc.). If the number of bits is increased to two, we can represent four different values: 00, 01, 10, 11. Three bits can encode eight different values: 000, 001, 010, 011, 100, 101, 110, 111.

Coding binary data The general formula is: N = 2^i, where N is the number of distinct coded values and i is the number of bits in the binary code.
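A quick check of this formula (a minimal Python sketch; the chosen bit widths are just examples):

```python
# Number of distinct values N that can be encoded with i binary digits: N = 2 ** i
for i in (1, 2, 3, 8, 16, 24):
    print(f"{i} bit(s) encode {2 ** i} distinct values")
# 1 bit(s) encode 2 distinct values
# 8 bit(s) encode 256 distinct values
# 24 bit(s) encode 16777216 distinct values
```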

Coding of binary integers Principle: the integer is repeatedly divided by 2 until the quotient becomes either zero or one. The remainders of the successive divisions, written from right to left and preceded by the final quotient, form the binary equivalent of the decimal number.

Example
19 : 2 = 9, remainder 1
9 : 2 = 4, remainder 1
4 : 2 = 2, remainder 0
2 : 2 = 1, remainder 0
So, 19 (decimal) = 10011 (binary)
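The same repeated-division procedure can be written as a short Python sketch (the function name to_binary is illustrative; Python's built-in bin() gives the same result):

```python
def to_binary(n: int) -> str:
    """Convert a non-negative integer to binary by repeated division by 2."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(str(n % 2))  # remainder of the division
        n //= 2                    # the quotient becomes the next dividend
    # Remainders are produced from the lowest bit to the highest,
    # so they are written from right to left.
    return "".join(reversed(digits))

print(to_binary(19))   # 10011
print(bin(19))         # 0b10011 (built-in check)
```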

Coding of binary integers To encode integers from 0 to 255, 8 bits are enough. 16-bit coding is used for integers from 0 to 65,535, and 24 bits are used for more than 16.5 million numbers.

Coding of textual information If each character of the alphabet is matched with a certain integer, then binary code can be used to encode textual information. Eight bits are sufficient to encode 256 different characters.

Coding of textual information The American National Standards Institute (ANSI) introduced the ASCII encoding system (American Standard Code for Information Interchange).

Coding of textual information There are two encoding tables in ASCII: the basic one (characters with codes 0-127) and the extended one (codes 128-255).

The extended ASCII character set

Windows 1251 character set

Coding of textual information The use of multiple concurrent encodings arose because of the limited set of 8-bit codes (256). A character set based on 16-bit character encoding, called universal encoding (UNICODE), provides unique codes for 65,536 different characters. For a long time the transition to this system was held back by insufficient computing resources.
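A small Python sketch of the idea: each character is matched with an integer code, and Unicode extends the same principle beyond the 256 values of an 8-bit code page (the sample strings are arbitrary):

```python
text = "Hi!"
codes = [ord(c) for c in text]   # integer code of each character
print(codes)                     # [72, 105, 33]  (ASCII range, fits in 8 bits)

print(text.encode("ascii"))      # b'Hi!'  -- one byte per character
print(ord("Ω"))                  # 937 -- a code point above 255, impossible in an 8-bit code
print("Ω".encode("utf-8"))       # b'\xce\xa9' -- Unicode characters may need several bytes
```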

Coding of graphical information A graphic image is made up of tiny dots (pixels), which form a grid called a raster.

Example: the image magnified seven times

Coding of graphical information Pixels with only two possible colors (black and white) can be encoded by one of two numbers, 0 or 1, so only 1 bit per pixel is needed. For black-and-white (grayscale) illustrations, coding with 256 shades of gray is generally used. How many bits per pixel do we need then?

Example

Coding of graphical information A color image on the screen is obtained by mixing three primary colors: red (R), green (G) and blue (B).

Coding of graphical information

When color images are encoded, any color is decomposed into these basic components. Such a coding system is called RGB. If 256 levels (8 bits) are used to encode each of the primary color components, the system provides about 16.7 million different colors.
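A minimal sketch of the RGB principle: each component gets 8 bits (256 levels), and the three components are packed into one 24-bit value (the helper names are illustrative):

```python
def pack_rgb(r: int, g: int, b: int) -> int:
    """Pack three 8-bit color components (0-255 each) into one 24-bit number."""
    return (r << 16) | (g << 8) | b

def unpack_rgb(color: int) -> tuple[int, int, int]:
    """Split a 24-bit color back into its red, green and blue components."""
    return (color >> 16) & 0xFF, (color >> 8) & 0xFF, color & 0xFF

white = pack_rgb(255, 255, 255)
print(hex(white))            # 0xffffff
print(unpack_rgb(0xFF8000))  # (255, 128, 0) -- orange
print(256 ** 3)              # 16777216 distinct colors in total
```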

Archiving of information Data archiving is the process of converting the information stored in a file into a form that reduces redundancy in its representation and thus requires less storage space.

Archiving of information Archiving (packing) is the process of placing source files into an archive file in a compressed format. Decompression (unpacking) is the process of recovering files from the archive in the exact form they had before archiving.

Archiving of information The aims:
- storing data on disk in a more compact form
- reducing the time (or cost) of transmitting information through communication channels
- simplifying the transfer of files from one computer to another
- protection from unauthorised access

Archiving of information One of the first archiving methods was proposed in 1844 by Samuel Morse in the Morse code system: frequently occurring characters are encoded with shorter sequences.

Archiving of information In the 1940s Claude Shannon, the founder of modern information theory, and, independently of him, Robert Fano developed a universal algorithm for constructing efficient codes. A related algorithm, which produces optimal codes, was later proposed by Huffman. The principle of these algorithms is to encode frequently occurring characters with shorter sequences of bits.

Archiving of information In the 1970s Lempel and Ziv proposed the LZ77 and LZ78 algorithms (LZW is a later extension by Welch). The algorithm finds repeated sequences and replaces them with references to a dynamically built dictionary. Most modern archivers (WinRAR, WinZip) are based on variations of the Lempel-Ziv algorithm.
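The dictionary idea can be illustrated with the LZW variant, which is the simplest to sketch (an illustrative toy encoder, not the exact algorithm used by WinRAR or WinZip):

```python
def lzw_compress(data: str) -> list[int]:
    """Toy LZW encoder: repeated substrings are replaced by dictionary codes."""
    dictionary = {chr(i): i for i in range(256)}  # start with all single characters
    next_code = 256
    w = ""
    output = []
    for c in data:
        wc = w + c
        if wc in dictionary:
            w = wc                      # keep extending the current match
        else:
            output.append(dictionary[w])
            dictionary[wc] = next_code  # grow the dictionary dynamically
            next_code += 1
            w = c
    if w:
        output.append(dictionary[w])
    return output

print(lzw_compress("ABABABAB"))  # [65, 66, 256, 258, 66] -- 5 codes instead of 8 symbols
```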

Archiving of information The compression coefficient is Kc = (Vc / Vr) * 100%, where Kc is the compression coefficient, Vc is the volume of the compressed file, and Vr is the volume of the source file. The degree of compression depends on the archiving program, the method used, and the type of the source file.
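A quick way to measure the coefficient in practice, using Python's standard zlib module (which implements DEFLATE, a combination of LZ77 and Huffman coding); the sample text is arbitrary:

```python
import zlib

source = b"to be or not to be, that is the question " * 100
compressed = zlib.compress(source)

Kc = len(compressed) / len(source) * 100  # Kc = Vc / Vr * 100%
print(f"source: {len(source)} bytes, compressed: {len(compressed)} bytes, Kc = {Kc:.1f}%")
```

Highly repetitive text like this compresses to a small fraction of its original size, so Kc is low; random or already compressed data would stay close to 100%.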

Archiving of information The degree of compression for graphical, text and data files is 5-40%. The degree of compression for executable files is 60-90%. Files that are already archived hardly compress any further: their coefficient stays close to 100%.

Archiving of information A self-extracting archive is an executable module that can unpack the files it contains without running a separate archiver. Large archive files can be divided into several volumes.

Shannon-Fano coding

1. Develop a list of probabilities or frequency counts.
2. Sort the list of symbols by frequency, most frequent first.
3. Divide the list into two parts, with the total frequency count of the left part being as close to the total of the right part as possible.
4. The left part of the list is assigned the binary digit 0, and the right part is assigned the digit 1.
5. Recursively apply steps 3 and 4 to each of the two halves, subdividing groups and adding bits to the codes until each symbol has a code.
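A minimal recursive sketch of these steps in Python (the symbol frequencies and helper names are chosen for illustration only):

```python
def shannon_fano(freqs: dict[str, int]) -> dict[str, str]:
    """Assign Shannon-Fano codes to symbols given their frequency counts."""
    # Steps 1-2: list of (symbol, frequency) pairs sorted by descending frequency.
    symbols = sorted(freqs.items(), key=lambda kv: kv[1], reverse=True)
    codes = {sym: "" for sym, _ in symbols}

    def split(group):
        if len(group) < 2:
            return
        total = sum(f for _, f in group)
        # Step 3: find the split point where the left part's total frequency
        # is as close as possible to the right part's total.
        running, best_i, best_diff = 0, 1, float("inf")
        for i in range(1, len(group)):
            running += group[i - 1][1]
            diff = abs(total - 2 * running)
            if diff < best_diff:
                best_diff, best_i = diff, i
        left, right = group[:best_i], group[best_i:]
        # Step 4: the left part gets '0', the right part gets '1'.
        for sym, _ in left:
            codes[sym] += "0"
        for sym, _ in right:
            codes[sym] += "1"
        # Step 5: recurse on both halves.
        split(left)
        split(right)

    split(symbols)
    return codes

print(shannon_fano({"a": 15, "b": 7, "c": 6, "d": 6, "e": 5}))
# {'a': '00', 'b': '01', 'c': '10', 'd': '110', 'e': '111'}
```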

Huffman coding

Symbol   Code
a1       0
a2       10
a3       110
a4       111

Huffman coding A source generates 4 different symbols, each with a known probability. A binary tree is generated from left to right by taking the two least probable symbols and combining them into an equivalent symbol whose probability equals the sum of the two. The process is repeated until only one symbol remains. The tree can then be read backwards, from right to left, assigning different bits to different branches to obtain each symbol's code.
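A compact Python sketch of this bottom-up construction using a priority queue (the probabilities below are chosen for illustration; the resulting code lengths 1, 2, 3, 3 match the table above, though a particular bit assignment may differ):

```python
import heapq
import itertools

def huffman_codes(probs: dict[str, float]) -> dict[str, str]:
    """Build Huffman codes by repeatedly merging the two least probable nodes."""
    counter = itertools.count()  # tie-breaker so heapq never compares the dicts
    # Each heap entry: (probability, tie_breaker, {symbol: partial_code})
    heap = [(p, next(counter), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, codes1 = heapq.heappop(heap)   # least probable node
        p2, _, codes2 = heapq.heappop(heap)   # second least probable node
        # Prefix one branch with '0' and the other with '1', then merge them
        # into an equivalent node whose probability is the sum of the two.
        merged = {sym: "0" + c for sym, c in codes1.items()}
        merged.update({sym: "1" + c for sym, c in codes2.items()})
        heapq.heappush(heap, (p1 + p2, next(counter), merged))
    return heap[0][2]

print(huffman_codes({"a1": 0.4, "a2": 0.35, "a3": 0.2, "a4": 0.05}))
# {'a1': '0', 'a4': '100', 'a3': '101', 'a2': '11'}
```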