Published by Vivian Owens. Modified over 9 years ago.
1
Lecture 3 Data Representation
2
The Transition from Text-Based Computing to the Graphical OS
In the early 1980s desktop computers began to be introduced with GUI operating systems. The Apple Lisa and Macintosh, as well as Microsoft’s Windows OS, replaced typed (text-only) commands for file and software application management with more user-friendly graphical manipulations performed with a new type of controller, the mouse. Using the mouse, a user could move a file into a directory by clicking and dragging a graphical image of the file onto the graphical image of a folder. These graphical images are associated with the files and directories they represent by standard OS commands that are hidden from the user (this is called abstraction).
3
Origin of the Windows Operating System
The history of Windows dates back to September 1981, when Chase Bishop, a computer scientist, designed the first model of an electronic device and project called “Interface Manager”. Its name was later changed to “Windows”, and it was announced in November 1983 (after the Apple Lisa, but before the Apple Macintosh). The first version of the Windows OS was not released until November 1985.
Source: http://flytesolutions.com/blog/what-is-microsoft-operating-system/
4
Data and Computers
Numbers – Numbers (e.g. integers, floating-point) are just numeric values, but when they are associated with particular units, such as position (latitude and longitude), dates and times, or other physical units of measure, computers enable disciplines like Geographic Information Systems.
Text – By the fall of 1994 it is estimated that more textual information was being stored electronically than in all the printed documents on the planet. Over the past 20 years the amount of electronic-only information has continued to grow exponentially.
Audio – Various encoding schemes have come into common use for digitizing, capturing, copying and replaying music and other sources of sound information. These digital formats include MP3, WAV, and FLAC.
Images and Graphics – There are many different graphics formats. Pixel-based and vector-based representations are two different ways to store and manipulate drawings and other types of graphics data. In addition to lines and common graphics shapes, there are graphics formats for displaying textual characters (see TrueType fonts).
Video – Video is the most demanding form of data with respect to storage and bandwidth requirements. A combination of sound and images displayed at 30 frames per second or higher, with each frame an image that could be several megabytes in size, can quickly overwhelm a computer or network. Special video COmpression/DECompression techniques called CODECs have been devised to reduce the storage space and network bandwidth needed to save and transmit video.
5
Data Compression
Data compression is the process of reducing the amount of space needed to store a piece of data. A data compression technique can be lossless, which means the data can be retrieved without losing any of the original information. Or it can be lossy, in which case some information is lost in the process of compression. The compression ratio gives an indication of how much compression occurs. The compression ratio is the size of the compressed data divided by the size of the original data, so a smaller ratio means more compression.

Format                              Size         Compression ratio
bitmap (bmp)                        790 Kbytes   no compression
JPEG (jpg)                          30 Kbytes    0.038
Graphics Interchange Format (gif)   96 Kbytes    0.122
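The ratio defined above can be checked with a few lines of Python. This is a minimal sketch: the function name compression_ratio is made up for illustration, and the zlib round trip at the end is only one example of a lossless technique, not the method used to produce the file sizes on the slide.

```python
import zlib


def compression_ratio(compressed_size, original_size):
    """Size of the compressed data divided by the size of the original data."""
    return compressed_size / original_size


# File sizes from the slide, in Kbytes.
BMP_KB, JPG_KB, GIF_KB = 790, 30, 96
jpg_ratio = round(compression_ratio(JPG_KB, BMP_KB), 3)
gif_ratio = round(compression_ratio(GIF_KB, BMP_KB), 3)
print("jpg:", jpg_ratio, "gif:", gif_ratio)

# Lossless round trip: decompressing returns exactly the original bytes.
original = b"data representation " * 500
packed = zlib.compress(original)
restored = zlib.decompress(packed)
print("zlib ratio:", round(compression_ratio(len(packed), len(original)), 3))
```

A lossy technique would fail the equality check between the original and restored data, which is what separates the two families.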
6
Graphics Interchange Format (GIF)
GIF is referred to as a lossless compression technique. However, this is true only after the image has been reduced to its 256 most prevalent colors. Reducing 24-bit color to an index of the top 256 colors can significantly degrade the image if it is composed of many different colors.
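The color reduction step can be sketched as follows. This is a simplified illustration, not how any particular GIF encoder works: it assumes the palette is simply the most frequent colors and that every other pixel is mapped to the closest palette entry by squared RGB distance. The names build_palette, nearest_index and to_indexed are hypothetical.

```python
from collections import Counter


def build_palette(pixels, size=256):
    """Return up to `size` of the most frequent (r, g, b) colors."""
    return [color for color, _ in Counter(pixels).most_common(size)]


def nearest_index(color, palette):
    """Index of the palette entry closest to `color` (squared RGB distance)."""
    return min(
        range(len(palette)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(color, palette[i])),
    )


def to_indexed(pixels, size=256):
    """Convert 24-bit pixels to a palette plus one palette index per pixel."""
    palette = build_palette(pixels, size)
    lookup = {color: i for i, color in enumerate(palette)}
    indices = [
        lookup[p] if p in lookup else nearest_index(p, palette) for p in pixels
    ]
    return palette, indices


# Tiny example with a 2-color palette: the rare color (250, 0, 0) is lost
# and replaced by the nearby palette color (255, 0, 0).
pixels = [(255, 0, 0)] * 3 + [(0, 0, 255)] * 2 + [(250, 0, 0)]
palette, indices = to_indexed(pixels, size=2)
print(palette, indices)
```

Any pixel whose color did not make it into the palette is changed for good, which is where the degradation described above comes from; the compression applied to the index values afterwards loses nothing further.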
7
Analog vs Digital Information
Analog data can be represented as a voltage level in an electric circuit, an intensity of light, the strength of a magnetic domain on a magnetic tape, or any smoothly varying signal. Digital data are collections of (usually binary) numeric values. Digitization is the process of converting an analog signal into a collection of discrete sample values. The number of samples taken per second is the temporal resolution, while the number of steps in the amplitude of the signal samples is the depth or precision of the samples. The digitization process creates a fixed level of (discretization) error. However, once a signal has been digitized it can be reproduced indefinitely without further degradation. In contrast, each time an analog copy is made from another analog copy, more noise is induced in the copied signal. Everything's a copy of a copy of a copy.
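A minimal sketch of digitization, assuming the analog signal is given as a Python function of time whose values lie between 0 and 1. The function names and the 440 Hz test tone are illustrative choices, not part of the lecture.

```python
import math


def digitize(signal, duration_s, sample_rate_hz, levels):
    """Sample `signal` at a fixed rate and round each value to one of
    `levels` evenly spaced amplitude steps (0 .. levels - 1)."""
    count = round(duration_s * sample_rate_hz)
    samples = []
    for k in range(count):
        t = k / sample_rate_hz          # time of the k-th sample
        value = signal(t)               # assumed to lie in [0, 1]
        samples.append(round(value * (levels - 1)))
    return samples


def reconstruct(samples, levels):
    """Map the stored integer levels back to amplitudes in [0, 1]."""
    return [q / (levels - 1) for q in samples]


def tone(t):
    """Hypothetical analog source: a 440 Hz tone scaled into [0, 1]."""
    return 0.5 + 0.5 * math.sin(2 * math.pi * 440 * t)


RATE, LEVELS = 22_000, 16
samples = digitize(tone, 0.01, RATE, LEVELS)
rebuilt = reconstruct(samples, LEVELS)
worst_error = max(abs(tone(k / RATE) - v) for k, v in enumerate(rebuilt))
print(len(samples), "samples, worst discretization error:", worst_error)
```

The worst-case error is bounded by half of one amplitude step, and copying the list of integers any number of times does not change it.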
8
Unicode and ASCII Code
Computer languages support the use of ASCII and Unicode. These are encoding schemes for the storage and display of characters from a set of symbols. ASCII encodes 128 characters: the English letters, the digits 0 through 9, and a variety of special marks and control codes, such as commands for placing the cursor on a textual display. The extended ASCII character set uses a byte (8 bits) per symbol. Unicode was originally designed as a 16-bit encoding and can represent many more characters in a wide range of languages; the current standard extends beyond 16 bits, with code points up to U+10FFFF.
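The difference between the two schemes can be seen with Python's built-in ord and str.encode. The helper describe and the sample characters are illustrative choices; UTF-8 and UTF-16 are two common ways of storing Unicode code points as bytes.

```python
def describe(ch):
    """Return the code point of `ch` and its size in two Unicode encodings."""
    return {
        "code_point": ord(ch),
        "utf8_bytes": len(ch.encode("utf-8")),
        "utf16_bytes": len(ch.encode("utf-16-be")),
    }


def is_ascii(ch):
    """True if `ch` is one of the 128 ASCII characters."""
    try:
        ch.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


# 'A', e with acute accent, euro sign, and an emoji outside the 16-bit range.
for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
    print(repr(ch), describe(ch), "ASCII" if is_ascii(ch) else "not ASCII")
```

The last character needs four bytes even in UTF-16, which is why Unicode can no longer be described as a plain 16-bit code.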
9
http://www.unicode.org/charts/ Samples of Unicode
10
Text Compression
Alphabetic information (text) is a fundamental type of data. Therefore, it is important that we find ways to store text efficiently and transmit text efficiently between one computer and another. Our textbook examines three types of text compression:
Keyword Encoding – Common words are encoded using special symbols or other shorter representations.
Run-Length Encoding – In run-length encoding, a sequence of repeated characters is replaced by a flag character, followed by the repeated character, followed by a single digit that indicates how many times the character is repeated. For example, consider the following string of seven repeated ‘A’ characters: AAAAAAA. If we use the ‘*’ character as our flag, this string would be encoded as: *A7
Huffman Encoding – An important characteristic of any Huffman encoding is that no bit string used to represent a character is the prefix of any other bit string used to represent a character. Example bit string: 1000100101101111
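The run-length scheme described above can be sketched in Python. Two details are assumptions added for this illustration: runs shorter than four characters are left as they are (encoding them would not save space), and runs longer than nine are split because the count is a single digit. The sketch also assumes the flag character never appears in the input text.

```python
FLAG = "*"


def rle_encode(text, flag=FLAG, min_run=4):
    """Replace each run of repeated characters with flag + char + digit."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        run = 1
        # Count repeats of ch, capped at 9 so the count fits in one digit.
        while i + run < len(text) and text[i + run] == ch and run < 9:
            run += 1
        if run >= min_run:
            out.append(f"{flag}{ch}{run}")
        else:
            out.append(ch * run)    # short runs are copied unchanged
        i += run
    return "".join(out)


def rle_decode(encoded, flag=FLAG):
    """Expand every flag + char + digit triple back into the repeated run."""
    out = []
    i = 0
    while i < len(encoded):
        if encoded[i] == flag:
            out.append(encoded[i + 1] * int(encoded[i + 2]))
            i += 3
        else:
            out.append(encoded[i])
            i += 1
    return "".join(out)


print(rle_encode("AAAAAAA"))
print(rle_encode("XXXXXnnnnnnnnHELLOoooooo"))
```

Because decoding restores the input exactly, run-length encoding is a lossless technique.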
11
Digitizing Audio Signals
Choosing the number of bits for each sample determines the discretization level or depth of the signal. In this example we have 12 levels between 0 and the maximum amplitude, so we will need a minimum of 4 bits per sample (3 bits give only 8 levels, while 4 bits give 16). If we assume that we are sampling at a rate of 22,000 samples per second (i.e. 22 kHz), then the forty samples above cover a total time interval of 1/22,000 x 40 = 1.8 milliseconds. A three-minute song would therefore require 3 x 60 x 22,000 = 3,960,000 samples of 4 bits each, or 15.84 Mbits = 1.98 Mbytes, to store the music file in a raw (i.e. uncompressed) format.
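The arithmetic on this slide can be reproduced with a short Python sketch. The function names are made up for illustration, and the sizes use decimal units (1 Mbit = 1,000,000 bits), matching the figures on the slide.

```python
def bits_per_sample(levels):
    """Smallest number of bits that can label `levels` distinct values."""
    return max(1, (levels - 1).bit_length())


def raw_audio_size(duration_s, sample_rate_hz, bits):
    """Return (sample count, total bits, total bytes) for uncompressed audio."""
    samples = duration_s * sample_rate_hz
    total_bits = samples * bits
    return samples, total_bits, total_bits / 8


SAMPLE_RATE = 22_000                      # samples per second, from the slide
bits = bits_per_sample(12)                # 12 amplitude levels
window_ms = 40 / SAMPLE_RATE * 1000       # time covered by forty samples
samples, total_bits, total_bytes = raw_audio_size(3 * 60, SAMPLE_RATE, bits)

print(bits, "bits per sample")
print(round(window_ms, 1), "ms for 40 samples")
print(samples, "samples,", total_bits / 1e6, "Mbits,", total_bytes / 1e6, "Mbytes")
```

Changing the sample rate or the number of levels in this sketch shows how quickly raw audio grows, which is the motivation for compressed formats such as MP3.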
12
Representing Images

RED   GREEN   BLUE
  0       0      0
  0       0    255
  0     255      0
  0     255    255
255       0      0
255       0    255
255     255      0
255     255    255
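The eight rows of the table are every combination of 0 and 255 for the three color channels. A short Python sketch can generate them and pack each triple into a single 24-bit value. The hexadecimal "#RRGGBB" notation and the function names are assumptions added for illustration; they do not appear on the slide.

```python
from itertools import product


def rgb_to_int(red, green, blue):
    """Pack three 8-bit channel values into one 24-bit integer."""
    for value in (red, green, blue):
        if not 0 <= value <= 255:
            raise ValueError("each channel must be between 0 and 255")
    return (red << 16) | (green << 8) | blue


def rgb_to_hex(red, green, blue):
    """Format a color as #RRGGBB, two hexadecimal digits per channel."""
    return f"#{rgb_to_int(red, green, blue):06X}"


# All combinations of minimum and maximum intensity, in the table's order.
corners = list(product((0, 255), repeat=3))
for red, green, blue in corners:
    print(f"{red:3d} {green:3d} {blue:3d}  {rgb_to_hex(red, green, blue)}")
```

With 8 bits per channel, each pixel takes 24 bits, giving 256 x 256 x 256 possible colors.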
13
Pixels in a Digitized Image
[Figure; label: pixel]
14
Graphics Formats
Raster (or pixel) graphics are fixed-resolution bitmaps. Vector graphics are mathematical descriptions of shapes that are scalable.
15
Napster
In 1999, Shawn Fanning launched a file-sharing program that took the music industry by storm, rapidly gaining the praise of millions and the criticism of many. Nineteen-year-old Shawn had only recently dropped out of his first year at Northeastern University to pursue a solution to the difficulty of downloading and exchanging music over the Net. With the support of his uncle, Shawn tackled this problem with dedication and ingenuity and, in a few months, developed an answer. Shawn wrote source code that wove together a search engine, file sharing, and Internet Relay Chat, making it possible for anyone to easily access and trade music files.