Media Data and File Formats

Howell Istance © De Montfort University, 2001

Text Files
1. Plain text (unformatted)
ASCII character set is most common; 7 bits are used
This can represent 128 code words
Example character: A
Computers store data in bytes, so one bit is left over
The extra bit can be used for parity or for extended character sets:
Error detection: a parity bit is used (odd parity)
Extended character sets: extend code words to 256 (IBM’s EBCDIC is an 8-bit code)
2. Formatted text
Characters are used to give both text and formatting information (bold, italic, position, etc.)
Also contains information on page numbers, version, index, etc.
Formatted files are usually much larger than their plain text equivalent
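
The odd-parity idea above can be sketched as follows. This is only an illustration: it assumes the parity bit sits in the top (eighth) bit of the byte, and the function names are made up for the example.

```python
def add_odd_parity(ch: str) -> int:
    """Return an 8-bit code word: 7-bit ASCII plus an odd-parity bit on top."""
    code = ord(ch)
    if code > 0x7F:
        raise ValueError("not a 7-bit ASCII character")
    ones = bin(code).count("1")
    # Odd parity: the total number of 1 bits, parity bit included, must be odd.
    parity = 0 if ones % 2 == 1 else 1
    return (parity << 7) | code


def parity_ok(byte: int) -> bool:
    """A received byte passes the check if it holds an odd number of 1 bits."""
    return bin(byte).count("1") % 2 == 1


word = add_odd_parity("A")
print(format(ord("A"), "07b"), format(word, "08b"))  # 1000001 11000001
print(parity_ok(word), parity_ok(word ^ 0b100))      # True False (one bit flipped)
```

A single flipped bit changes the count of 1 bits from odd to even, so the check fails; two flipped bits would go undetected.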

Vector Graphics and Bitmapped Images
Vector Graphics: image represented and stored as a collection of shapes, together with data (parameters) defining how the shapes will be produced and where they will be located
Bitmapped Images: image represented and stored as a collection of pixels which, when displayed, make up the image

Bitmapped images
Pixels make up an image
Any bitmapped image is made up of individual pixels of data.
Each pixel has information about its colour, either as part of the pixel data itself or as an index which points to a lookup table.

Colour - Depth
#FF00A3 = 255,0,163
Each pixel has a colour depth: a certain number of bits is used to define a pixel's colour, e.g. held as an RGB value.
For 256 (or fewer) colours, a palette is used as an index describing which 256 actual colours to use in an image.
For RGB (other schemes exist), each pixel is stored as three components which together make up 24 bits: 8,8,8 (16-bit systems commonly use 5,6,5 bits, for example). Thus each pixel takes 3 bytes of space in raw form.
A palette system stores a table of colours (as RGB values), and each pixel in the image holds a number which is an index into the table.
Example RGB value: 240,200,171
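
As a rough illustration of the two storage schemes described above, the sketch below splits a 24-bit hex colour into its three 8-bit components and then looks pixels up through a palette. The four-entry palette and the pixel indices are invented for the example.

```python
def hex_to_rgb(colour: str) -> tuple:
    """Split a 24-bit '#RRGGBB' string into three 8-bit components."""
    digits = colour.lstrip("#")
    if len(digits) != 6:
        raise ValueError("expected six hexadecimal digits")
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


# Hypothetical palette; a real one may hold up to 256 entries.
palette = [(0, 0, 0), (255, 0, 163), (240, 200, 171), (255, 255, 255)]

# With a palette, each pixel stores only a small index into the table.
indexed_pixels = [1, 1, 2, 0, 3, 1]
rgb_pixels = [palette[i] for i in indexed_pixels]

print(hex_to_rgb("#FF00A3"))  # (255, 0, 163)
print(rgb_pixels[2])          # (240, 200, 171)
```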

Models for bitmapped images
Model consists of a 2-D array of pixel values
May be of a different size and colour depth from the image which will finally be displayed
A view of the image displayed on screen in an image editor is not the model; the view has been transformed, clipped and displayed
You do not see the model

Models in Vector graphics
[Figure: x and y axes with origin at (0,0)]
Model is a series of mathematical constructs, together with data to define the location, size and attributes of each, such as colour, line style...
Constructs include:
shapes (rectangle, oval, lines)
curves
polygons (sets of points, i.e. coordinate pairs, with lines joining consecutive points)
polylines
polygon meshes (sets of points with instructions to show which points are to be joined by lines)

Storing models….as bitmapped images
If the rectangles measure 4.5 cm, then on a 72 dpi (dots per inch) monitor each side will contain 128 pixels
The image will contain 128 * 128 = 16384 pixels
If 3 bytes are used to store each pixel value, then 16384 * 3 = 49152 bytes are required
Size is constant regardless of the complexity of the image within the 128 pixel square
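
The arithmetic on this slide can be checked with a few lines of Python. The helper name is illustrative, and rounding to the nearest whole pixel is an assumption (4.5 cm at 72 dpi is about 127.6 pixels).

```python
CM_PER_INCH = 2.54


def bitmap_size(side_cm: float, dpi: int, bytes_per_pixel: int = 3):
    """Return (pixels per side, total pixels, total bytes) for a square bitmap."""
    side_px = round(side_cm / CM_PER_INCH * dpi)
    total_px = side_px * side_px
    return side_px, total_px, total_px * bytes_per_pixel


print(bitmap_size(4.5, 72))  # (128, 16384, 49152)
```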

Storing models….as vector graphics
(PostScript)
0 1 0 setrgbcolor rectfill
1 0 1 setrgbcolor rectfill
78 bytes required!
But a PostScript renderer is required, which slows the display process and has to be available on the host machine
Size increases as the complexity of the image increases, as more instructions are needed to define the image

Representation as vector graphics…
Vector graphics enable images to be composed of filled shapes
Each object can be manipulated individually
Scaling objects is easy (by applying mathematical transforms to the object definition)

Distorted poppy...
Easy to manipulate individual elements of image here...

Vector representation of complex images
To approach a realistic image, complex definitions of gradient meshes are required
File size approx. 10 MB
Generated in Illustrator
Taken from Wiley book site

Rendered as a bitmapped image
File size of this image is 152K
No longer possible to interact with separate components
Edits and application of effects are done on the vector version, and the end result is saved as a bitmapped image.

GIF Files
Image files hold a lot of data
Image files tend to be large files
To reduce storage space, COMPRESSION techniques are used
One solution is RUN LENGTH ENCODING
Count the number of consecutive pixels that are the same
Decoder uses this count to copy the original pixel X times
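
Run length encoding as described here can be sketched in a few lines. This is a generic illustration, not the encoding used by any particular file format; the pixel values are placeholder colour names.

```python
from itertools import groupby


def rle_encode(pixels):
    """Collapse each run of identical pixels into a (count, value) pair."""
    return [(len(list(run)), value) for value, run in groupby(pixels)]


def rle_decode(runs):
    """Copy each value back out 'count' times."""
    out = []
    for count, value in runs:
        out.extend([value] * count)
    return out


row = ["black"] * 6 + ["gray"] * 4 + ["blue"] * 20
runs = rle_encode(row)
print(runs)  # [(6, 'black'), (4, 'gray'), (20, 'blue')]
```

Thirty pixels shrink to three pairs here; a row with no repeated neighbours would not shrink at all.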

GIF Files
Developed by CompuServe
Used for single or multiple images
Based on LZW compression
Lempel and Ziv invented the original algorithm
Welch developed it further
Replaces repeated strings of data with a TOKEN (a code from a table built up during compression)
LZW can give reasonable compression, approx. 50%
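
A minimal sketch of the LZW idea, under simplifying assumptions: it works on text whose characters all have codes below 256, and it returns a plain list of integer codes rather than the variable-width bit packing a real GIF file uses.

```python
def lzw_compress(text: str) -> list:
    """Replace repeated strings with codes from a table built while reading."""
    table = {chr(i): i for i in range(256)}
    current = ""
    codes = []
    for ch in text:
        candidate = current + ch
        if candidate in table:
            current = candidate            # keep growing the match
        else:
            codes.append(table[current])   # emit code for the longest match
            table[candidate] = len(table)  # remember the new string
            current = ch
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: list) -> str:
    """Rebuild the same table on the fly and expand each code."""
    if not codes:
        return ""
    table = {i: chr(i) for i in range(256)}
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            entry = previous + previous[0]  # code defined by this very step
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[len(table)] = previous + entry[0]
        previous = entry
    return "".join(out)


message = "TOBEORNOTTOBEORTOBEORNOT"
codes = lzw_compress(message)
print(len(message), len(codes))  # 24 16
```

The decoder never needs the table to be transmitted, because it can reconstruct each entry from the codes it has already seen.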

Compression - LZW / RLE
Runs of colour can be defined in a simple Run / Colour / Number format, e.g. R0206 for black, R0304 for gray, RF005 for green, R04FE for red, R0203 for black, R0614 for blue, with colours taken from the palette below.
A lot of images are made up of large sections of flat colour, meaning that a lot of adjacent pixels can be grouped together in a small definition.
For the blue component above (20 pixels, 20 bytes; the count 14 is hexadecimal) we can encode it in 3 bytes: one byte to indicate a ‘run’, one for the ‘index’ to the palette, and one for the length of the run.
Variations exist on this, but the savings in space are obvious.
[Figure: the palette, entries numbered 1 to 6]
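
To make the Run / Colour / Number notation concrete, the sketch below parses one token. It assumes the layout suggested by the blue example: the letter R, then two hexadecimal digits for the palette index and two for the run length. The function name is illustrative.

```python
def parse_run(token: str):
    """Split 'Riicc' into (palette index, run length), both read as hexadecimal."""
    if len(token) != 5 or token[0] != "R":
        raise ValueError("expected 'R' followed by four hex digits")
    return int(token[1:3], 16), int(token[3:5], 16)


index, length = parse_run("R0614")
print(index, length)  # 6 20

# Stored one byte per pixel the run needs 'length' bytes; as a run it needs 3.
print(length, "bytes unencoded versus 3 bytes encoded")
```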

GIF Files
Decompression is fairly quick
Universal standard
Not optimised for image compression
UNISYS holds a patent on LZW, so there may be a problem with royalties

Compression - LZW / RLE
Photographic image: TIF uncompressed = 289k, TIF LZW compressed = 248k
Simple image: TIF uncompressed = 90k, TIF LZW compressed = 5k
Here we see how LZW compression performs on photographic and simple image types.
The large green expanse of the smaller image lends itself to simple, effective compression.
The photographic image is too complex to compress well: there are few open spaces of the same colour.

JPEG Files
Joint Photographic Experts Group
Uses a Fourier-related transform (the Discrete Cosine Transform) to reduce or discard high frequency components in the image
Uses several algorithms, including run-length encoding
Can be lossy; typical artefacts are blockiness, posterisation and ringing

Digital Video
We see a sequence of still images as continuous movement if the rate of presentation is greater than the critical flicker fusion frequency (about 40 images/sec)
Film is shot at 24 frames/second, but each frame is effectively shown twice in projection, giving a refresh rate of 48 frames/second
3 broadcast TV and video standards:
NTSC: US, Canada
PAL: Europe, Australia
SECAM: France, Eastern Europe

Interlacing and Frame rates
For each system, the refresh rate is double the broadcast frame rate, achieved by showing half of one frame (alternate lines) followed by the other half
NTSC (30 frames/second), PAL and SECAM (25 frames/second)

Image sizes and Data Rates…
Consider the amount of data needed to represent a sequence of digital images at NTSC broadcast rate, if 3 bytes are used to represent a pixel value
640 * 480 * 3 = 900 KB per image
1 second at 30 frames/second = 26 MB
1 minute at 30 frames/second = 1.6 GB
For PAL/SECAM, similarly: 768 * 576 * 3 = 1296 KB per image
1 second at 25 frames/second = 31 MB
1 minute at 25 frames/second = 1.85 GB
Data (transfer) rate = data amount per unit time (e.g. 26 MB/sec for NTSC, 31 MB/sec for PAL)
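
The figures on this slide can be reproduced with a short calculation. The sketch assumes binary units (1 KB = 1024 bytes), which is what makes 640 * 480 * 3 come out as 900 KB; the function name is illustrative.

```python
def video_data(width: int, height: int, fps: int, bytes_per_pixel: int = 3):
    """Return uncompressed bytes per frame, per second and per minute."""
    frame = width * height * bytes_per_pixel
    per_second = frame * fps
    return frame, per_second, per_second * 60


KB = 1024
MB = 1024 * KB
GB = 1024 * MB

for name, w, h, fps in [("NTSC", 640, 480, 30), ("PAL/SECAM", 768, 576, 25)]:
    frame, second, minute = video_data(w, h, fps)
    print(f"{name}: {frame / KB:.0f} KB per frame, "
          f"{second / MB:.1f} MB per second, {minute / GB:.2f} GB per minute")
```

The per-minute results land close to the rounded values quoted above; the small differences come from rounding the per-second figure before multiplying by 60.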

Compression techniques
Can’t assume that devices (camera, video card) used to digitise images will be available for playback on the end user's machine
Need to provide a software codec that applies a compression technique suited to the capabilities of the end user's machine
All techniques operate on a sequence of bitmapped images
Video data is normally compressed twice:
when captured (hardware codec; real-time compression needed)
in order to be transmitted (software codec)

Intra- and Inter-frame compression
Spatial (intra-frame) compression compresses each frame in isolation
Lossy techniques are applied, leading to some loss of image quality
Temporal (inter-frame) compression calculates and compresses the differences between frames in a sequence
1 key frame + a succession of (usually) 6 difference frames
A difference frame contains the difference between the original frame and the preceding key frame or preceding difference frame
Time to compress may be (much) longer than time to decompress: an asymmetric codec
Fast decompression times are important
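
The key frame plus difference frame scheme can be illustrated with tiny frames held as lists of pixel values. This is a lossless toy version under stated assumptions: real codecs also apply lossy compression to the differences, and the frame data here is invented.

```python
def encode_sequence(frames):
    """Keep the first frame whole, then store each frame as changes from the one before."""
    key = list(frames[0])
    diffs = [[cur - prev for prev, cur in zip(a, b)]
             for a, b in zip(frames, frames[1:])]
    return key, diffs


def decode_sequence(key, diffs):
    """Rebuild every frame by adding each difference frame to its predecessor."""
    frames = [list(key)]
    for diff in diffs:
        frames.append([p + d for p, d in zip(frames[-1], diff)])
    return frames


frames = [[10, 10, 10, 10], [10, 12, 10, 10], [10, 12, 13, 10]]
key, diffs = encode_sequence(frames)
print(diffs)  # [[0, 2, 0, 0], [0, 0, 3, 0]]
```

When little changes between frames, the difference frames are mostly zeros, which is exactly the kind of data that compresses well.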

Software codecs
Four main contenders for compressing video for delivery on CD-ROM or via the internet: Cinepak, Intel Indeo, Sorenson and MPEG-1
Cinepak, Intel Indeo and Sorenson all use vector quantisation
Frame is divided into small blocks (vectors)
Code book contains typical block patterns
Closest approximation to each block is found in the code book, and the index of that code book entry is stored instead of the original vector
Decompression (fast) is obtained by replacing indices from the data stream with code book entries
Compression (slow) can take as much as 150 times the decompression time
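
A minimal sketch of vector quantisation as outlined above. The three-entry code book and the 2 x 2 greyscale blocks (flattened to four values) are invented for the example, and squared distance is assumed as the measure of closeness; real codecs build much larger code books from the video itself.

```python
def nearest_index(block, code_book):
    """Index of the code book entry with the smallest squared distance to block."""
    def distance(entry):
        return sum((a - b) ** 2 for a, b in zip(block, entry))
    return min(range(len(code_book)), key=lambda i: distance(code_book[i]))


def vq_compress(blocks, code_book):
    """Slow side: search the whole code book for every block."""
    return [nearest_index(block, code_book) for block in blocks]


def vq_decompress(indices, code_book):
    """Fast side: a simple table lookup per block."""
    return [code_book[i] for i in indices]


code_book = [(0, 0, 0, 0), (255, 255, 255, 255), (0, 255, 0, 255)]
blocks = [(3, 1, 0, 2), (250, 251, 255, 249), (10, 240, 5, 250)]

indices = vq_compress(blocks, code_book)
print(indices)                            # [0, 1, 2]
print(vq_decompress(indices, code_book))  # approximations of the original blocks
```

The asymmetry is visible in the code: compression compares each block with every code book entry, while decompression is a single lookup.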

Sound Files
Two main types:
WAV files: digital samples of analogue waveforms
MIDI files: set of instructions to control a computer

WAV Files
Sound is sampled according to the Nyquist Sampling Theorem
SAMPLE RATE: at least 2 x highest frequency
Range of frequencies occurring in human voice extends to about 3400 Hz
Telephone sampling rate is at least 6800 Hz; it is actually 8000 Hz
Range of frequencies the ear can detect extends to about 20,000 Hz
Sampling rate is at least 40,000 Hz; it is actually 44,100 Hz

Conversion to digital
There are 21 signal levels: -10 to 0 to +10
We need 5 bits to represent this range
Note 5 bits gives 32 combinations
Use 0XXXX for positive values
Use 1XXXX for negative values

3 volts is represented by 00011
7 volts is represented by 00111
10 volts is represented by 01010
-3 volts is represented by 10011
-7 volts is represented by 10111
-10 volts is represented by 11010
0 volts is represented by 00000
Each sample is transmitted to an output device sequentially
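
The 5-bit scheme (one sign bit followed by a 4-bit magnitude) can be sketched as below. The function names are illustrative; the magnitude is written in plain binary, so 10 becomes 1010.

```python
def encode_sample(volts: int) -> str:
    """Sign bit (0 positive, 1 negative) followed by a 4-bit magnitude."""
    if not -10 <= volts <= 10:
        raise ValueError("sample outside the -10 to +10 volt range")
    sign = "1" if volts < 0 else "0"
    return sign + format(abs(volts), "04b")


def decode_sample(bits: str) -> int:
    """Reverse of encode_sample."""
    magnitude = int(bits[1:], 2)
    return -magnitude if bits[0] == "1" else magnitude


for v in (3, 7, 10, -3, -7, -10, 0):
    print(v, encode_sample(v))
```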

Quantisation noise
The example uses a 1 volt step size
What if the audio sample is 7.5 volts?
The encoder gives a value of 8 volts
The decoder outputs an 8 volt signal
This 0.5 volt error is called QUANTISATION NOISE
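
The 7.5 volt example can be worked through in code. The sketch assumes that halves round upwards, which matches the slide's result of 8 volts; the function name is illustrative.

```python
import math


def quantise(volts: float, step: float = 1.0) -> float:
    """Round to the nearest multiple of 'step', with halves rounding up."""
    return math.floor(volts / step + 0.5) * step


sample = 7.5
level = quantise(sample)
error = level - sample
print(level, error)  # 8.0 0.5
```

With a 1 volt step the error can never exceed half a volt; halving the step size halves the worst-case error at the cost of one more bit per sample.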

Companding
Most audio signals are quiet: there are more signals at lower levels than at high levels
Companding means using a non-linear scale
For example:
0-5 volts might have 20 values
5-8 volts might have 8 values
8-10 volts might have 2 values
This gives better resolution at lower levels at the expense of high signal levels
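
The three-band example can be turned into a small non-linear quantiser. This is a sketch of the slide's illustrative bands only, not of a standard companding curve; it assumes positive magnitudes (the sign would be handled separately) and reconstructs each sample at the centre of its step.

```python
# (low volts, high volts, number of values) for each band in the example
BANDS = [(0.0, 5.0, 20), (5.0, 8.0, 8), (8.0, 10.0, 2)]


def compand(volts: float) -> float:
    """Quantise with fine steps at low levels and coarse steps at high levels."""
    volts = min(max(volts, 0.0), 10.0)
    for low, high, levels in BANDS:
        if volts <= high:
            step = (high - low) / levels
            index = min(int((volts - low) / step), levels - 1)
            return low + (index + 0.5) * step
    raise AssertionError("unreachable: volts is clamped to 10.0")


for v in (1.1, 6.0, 9.4):
    q = compand(v)
    print(v, q, round(abs(q - v), 4))
```

The step size grows from 0.25 volts in the lowest band to 1 volt in the highest, so quiet samples are reproduced more accurately than loud ones.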

CD Quality WAV files
Use 16 x 2 bits to represent the audio signal (16 bits for each of 2 stereo channels)
This gives 65,536 x 2 “steps”
Quantisation noise is low
A lot of bits will carry no information (low sound levels)
This means a lot of data redundancy
WAV file size becomes large
1 Mbit = 0.7 seconds of sound (1 Mbyte = about 6 seconds)
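
A quick calculation of the CD-quality data rate, assuming the 44,100 Hz sampling rate, 16 bits per sample and 2 channels. It shows how long a megabit and a megabyte of uncompressed audio last.

```python
SAMPLE_RATE = 44_100    # samples per second, per channel
BITS_PER_SAMPLE = 16
CHANNELS = 2

steps = 2 ** BITS_PER_SAMPLE                      # levels per channel
bits_per_second = SAMPLE_RATE * BITS_PER_SAMPLE * CHANNELS
bytes_per_second = bits_per_second // 8

seconds_per_mbit = 1_000_000 / bits_per_second
seconds_per_mbyte = (1024 * 1024) / bytes_per_second

print(steps, bits_per_second, bytes_per_second)   # 65536 1411200 176400
print(round(seconds_per_mbit, 2), round(seconds_per_mbyte, 2))
```

One minute of audio at this rate is 176,400 * 60 bytes, a little over 10 Mbytes.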

MIDI Files
These are digital sound files that hold instructions rather than sampled sound
Control computers, sequencers, etc.
Each bit in the signal is used
Must have a MIDI player to hear the sound
File size is very small compared to WAV files

Audio Compression
ADPCM
  Predicts next sample value
TrueSpeech
  Based on mathematical model of airflow over vocal tract
  Highly efficient (1/16th)
MPEG Audio
  Fits with MPEG Video files

Psychoacoustic model
Throw away samples which will not be perceived, i.e. those under the curve

Zip Files
Popular file compression utility
Based on Lempel-Ziv compression (the usual Deflate method combines LZ77 with Huffman coding)
Used to transfer or store large files
Zipped files give good results for text and WAV files
Poor results for graphics / video (typically 3%)

File Size / Performance
There is a trade-off between:
Speed of loading
File size
Quality
There is no one correct solution for all multimedia applications
