© De Montfort University, Digital Video Howell Istance School of Computing Technology De Montfort University
Moving pictures…
We see a sequence of still images as continuous movement if the rate of presentation is greater than the critical flicker fusion frequency (about 40 images/sec)
Film is shot at 24 frames/second, but each frame is effectively shown twice in projection, giving a refresh rate of 48 frames/second
3 broadcast TV and video standards
–NTSC – US, Canada
–PAL – Europe, Australia
–SECAM – France, Eastern Europe
Interlacing and Frame rates
For each system, the refresh rate is double the broadcast frame rate, achieved by showing half of one frame followed by the other half
NTSC (30 frames/second), PAL and SECAM (25 frames/second)
Image sizes and Data Rates…
Consider the amount of data needed to represent a sequence of digital images at NTSC broadcast rate, if 3 bytes are used to represent a pixel value
–640 * 480 * 3 = 900 Kbytes per image
–1 second at 30 frames/second = 26 Mbytes
–1 minute at 30 frames/second = 1.5 Gbytes
For PAL/SECAM, similar
–768 * 576 * 3 = 1,296 Kbytes per image
–1 second at 25 frames/second = 31 Mbytes
–1 minute at 25 frames/second = 1.85 Gbytes
Data (transfer) rate = amount of data per unit time (e.g. 26 Mbytes/sec for NTSC, 31 Mbytes/sec for PAL)
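The arithmetic on this slide can be reproduced with a short sketch. The function name `uncompressed_bytes` is ours, and binary units (1 Kbyte = 1024 bytes) are an assumption chosen because they match the 900K figure for a single NTSC image.

```python
# Binary units are an assumption; they reproduce the 900K per-image figure.
KB, MB, GB = 1024, 1024**2, 1024**3

def uncompressed_bytes(width, height, fps=1, seconds=1, bytes_per_pixel=3):
    """Bytes needed to store `seconds` of uncompressed video."""
    return width * height * bytes_per_pixel * fps * seconds

ntsc_frame = uncompressed_bytes(640, 480)
ntsc_second = uncompressed_bytes(640, 480, fps=30)
ntsc_minute = uncompressed_bytes(640, 480, fps=30, seconds=60)
pal_second = uncompressed_bytes(768, 576, fps=25)
pal_minute = uncompressed_bytes(768, 576, fps=25, seconds=60)

print(f"NTSC frame:  {ntsc_frame / KB:.0f} Kbytes")
print(f"NTSC second: {ntsc_second / MB:.1f} Mbytes")
print(f"NTSC minute: {ntsc_minute / GB:.2f} Gbytes")
print(f"PAL second:  {pal_second / MB:.1f} Mbytes")
print(f"PAL minute:  {pal_minute / GB:.2f} Gbytes")
```

Dividing the per-second figure by one second gives the data rate directly, which is why the same numbers appear as Mbytes and as Mbytes/sec.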
Sites for digitisation and compression
Camera
–Passes digital data to computer via high-speed interface (IEEE 1394 FireWire); low-grade cameras use a USB interface
–+ve: digital data resistant to effect of noise during transfer
–-ve: no user control over compression/quality trade-off
Computer
–Analogue data from camera passed to video capture card
–Data susceptible to noise, which reduces compression efficiency
–User has greater control over compression
Playback after decompression via external monitor
Compressor/decompressor = ‘codec’
Analogue Broadcast Standards
A field is each interlaced half of a frame
Fields transmitted at grid (mains) frequency
–NTSC (US) 60 Hz – 30 frames/second
–PAL and SECAM (Europe) 50 Hz – 25 frames/second
Adding colour to the NTSC signal interfered with audio, so a correction factor (1000/1001) was applied
NTSC field rate = 60 * 1000/1001 = 59.94 fields/second, giving a frame rate of 29.97 frames/second (half the field rate)
Playback of video on a computer monitor is not interlaced: lines from each field are written into a frame buffer, which is displayed top to bottom (progressive scanning)
Fast monitor refresh enables a lower frame rate without flicker
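A minimal sketch of two ideas from this slide: the NTSC rate correction, and writing the lines of two fields into one frame buffer. The function names are hypothetical, and the assumption that the first field holds the top line is ours; real material can be either field order.

```python
from fractions import Fraction

def ntsc_rates():
    """Return (field_rate, frame_rate) after the 1000/1001 correction."""
    field_rate = Fraction(60) * Fraction(1000, 1001)
    frame_rate = field_rate / 2  # two fields make one frame
    return field_rate, frame_rate

def weave(first_field, second_field):
    """Interleave the scan lines of two fields into one progressive frame.

    Assumes first_field holds lines 0, 2, 4, ... and second_field holds
    lines 1, 3, 5, ... (top field first).
    """
    frame = []
    for first_line, second_line in zip(first_field, second_field):
        frame.append(first_line)
        frame.append(second_line)
    return frame

field_rate, frame_rate = ntsc_rates()
print(float(field_rate), float(frame_rate))
print(weave(["line 0", "line 2"], ["line 1", "line 3"]))
```

Using `Fraction` keeps the rates exact, which matters when frame counts are converted to timecodes over long durations.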
Line/field rates
NTSC
–525 lines/frame: 45 lines contain synchronisation data, 480 picture data
–Represented as 525/59.94
PAL and SECAM
–625 lines/frame: 49 lines contain synchronisation data, 576 picture data
–Represented as 625/50
Mapping film footage shot at 24 frames/second to video
–NTSC uses 3:2 pulldown; PAL shows 24 frames in 24/25 seconds
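The 3:2 pulldown mentioned above can be sketched as follows: film frames are held alternately for 3 fields and 2 fields, so 24 film frames fill 60 video fields (nominally one second of NTSC, ignoring the 1000/1001 correction). The function names are hypothetical, and starting the cadence with 3 fields rather than 2 is an assumption.

```python
def pulldown_fields(film_frames):
    """Repeat film frames alternately for 3 fields and 2 fields."""
    fields = []
    for index, frame in enumerate(film_frames):
        repeats = 3 if index % 2 == 0 else 2
        fields.extend([frame] * repeats)
    return fields

def fields_to_frames(fields):
    """Pair consecutive fields into interlaced video frames."""
    return [tuple(fields[i:i + 2]) for i in range(0, len(fields) - 1, 2)]

one_second = pulldown_fields(range(24))
print(len(one_second), "fields from 24 film frames")
print(fields_to_frames(pulldown_fields("ABCD")))
```

Some of the resulting video frames mix fields from two different film frames, which is the visible cost of this mapping. PAL avoids it by simply running the film about 4% fast.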
Colour Model
Originally required a means of transmitting a colour signal which could be ignored by black and white TV receivers
Separate the ‘brightness’ of an image element from its colour
Y (luminance) = weighted sum of R, G and B
U = (weighting factor) * (B – Y)
V = (weighting factor) * (R – Y)
Analogue TV uses Y’UV
–3 signals combined into 1 composite signal
Digital TV uses Y’CbCr: same idea, different weights
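A minimal sketch of the luminance/colour-difference separation. The slide does not give the weights, so the commonly quoted Rec. 601 values (0.299, 0.587, 0.114 for luminance; 0.492 and 0.877 for U and V) are used here as an assumption, and `rgb_to_yuv` is a hypothetical name.

```python
# Assumed weights (Rec. 601 style); the slide's own values are not given.
WEIGHT_R, WEIGHT_G, WEIGHT_B = 0.299, 0.587, 0.114
WEIGHT_U, WEIGHT_V = 0.492, 0.877

def rgb_to_yuv(r, g, b):
    """Convert R, G, B in the range 0..1 to (Y, U, V)."""
    y = WEIGHT_R * r + WEIGHT_G * g + WEIGHT_B * b
    u = WEIGHT_U * (b - y)
    v = WEIGHT_V * (r - y)
    return y, u, v

print(rgb_to_yuv(0.5, 0.5, 0.5))  # a grey: colour differences near zero
print(rgb_to_yuv(1.0, 0.0, 0.0))  # pure red
```

For any grey, R = G = B = Y, so both colour differences vanish. That is what lets a black and white receiver use Y alone and ignore the rest.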
Down sample chrominance components
Consider a block of 4 pixels (2 horizontal * 2 vertical, 3 bytes each), each with its own Y’, Cr and Cb value
All 4 Y’ values preserved
The 4 (2h2v) Cr values replaced by one Cr value in the sample
Same reduction with Cb values (2h2v = 4:2:0)
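The 2h2v reduction can be sketched on a single chrominance plane. The slide only says the four values are replaced by one; averaging them is one plausible choice and is an assumption here, as is the name `subsample_2h2v`.

```python
def subsample_2h2v(plane):
    """Replace each 2 x 2 block of a chroma plane with its average.

    `plane` is a list of rows with even width and height.
    """
    result = []
    for y in range(0, len(plane), 2):
        row = []
        for x in range(0, len(plane[0]), 2):
            total = (plane[y][x] + plane[y][x + 1]
                     + plane[y + 1][x] + plane[y + 1][x + 1])
            row.append(total / 4)
        result.append(row)
    return result

cr_plane = [[100, 104, 50, 50],
            [102, 106, 50, 54]]
print(subsample_2h2v(cr_plane))

# Storage for one 2 x 2 block: 4 Y' + 4 Cr + 4 Cb before, 4 Y' + 1 Cr + 1 Cb after.
bytes_before = 4 * 3
bytes_after = 4 + 1 + 1
print(bytes_before, "->", bytes_after, "bytes")
```

The block of 4 pixels drops from 12 bytes to 6, a halving of the data before any other compression is applied.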
Sampling Analogue data
Standard CCIR 601 prescribes, for both broadcast standards, 720 luminance samples per picture line and 360 samples of each colour difference value
Chrominance sub-sampling (4:2:2)
Less bandwidth needed to transmit colour than luminance
NTSC frame 720 * 480 pixels
PAL frame 720 * 576 pixels
Sampled digital data then has to be compressed for transmission
[Figure: CCIR 601 sampled frame, 720 samples per line by 480 lines (NTSC) or 576 lines (PAL/SECAM)]
4:2:2 Chrominance Sub sampling
4 Y luminance samples
2 Cr chrominance samples
2 Cb chrominance samples
4:1:1 Chrominance Sub sampling
4 Y luminance samples
1 Cr chrominance sample
1 Cb chrominance sample
4:2:0 Chrominance Sub sampling
4 Y luminance samples on every line
2 Cr chrominance samples and 2 Cb chrominance samples, taken on alternate lines only, so each 2 * 2 block of pixels shares 1 Cr and 1 Cb sample
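The effect of each sub-sampling scheme on storage can be compared with a small sketch. The table below counts samples per 4 pixels averaged over two lines, which is why 4:2:0 is entered as 1 Cr and 1 Cb; one byte per sample and the names used are assumptions.

```python
# (Y, Cr, Cb) samples per 4 pixels, averaged over two lines.
SCHEMES = {
    "4:4:4": (4, 4, 4),
    "4:2:2": (4, 2, 2),
    "4:1:1": (4, 1, 1),
    "4:2:0": (4, 1, 1),  # 2 + 2 on one line, none on the next
}

def bytes_per_pixel(scheme):
    """Average bytes per pixel, assuming one byte per sample."""
    y, cr, cb = SCHEMES[scheme]
    return (y + cr + cb) / 4

def frame_bytes(width, height, scheme):
    """Bytes for one sampled frame under the given scheme."""
    return int(width * height * bytes_per_pixel(scheme))

for name in SCHEMES:
    print(name, bytes_per_pixel(name), frame_bytes(720, 576, name))
```

4:1:1 and 4:2:0 cost the same amount of data; they differ only in whether colour resolution is given up horizontally or shared between horizontal and vertical.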
Standards for captured data
Hardware codecs are used to compress (and decompress) sampled data between camera and storage
2 standards have emerged here
DV (consumer, semi-professional equipment)
–Variations DVCAM, DVCPRO concern tape formats
MPEG-2 (digital broadcast, studio equipment)
–Collection of standards, grouped into profiles and levels
–Most common is ‘main profile at main level’
–CCIR 601 scanning, 4:2:0 chrominance subsampling, data rate of 15 Mbits/second (1.87 Mbytes/sec)
Compression techniques
Can’t assume that the devices (camera, video card) used to digitise images will be available for playback on the end user’s machine
Need to provide a software codec that applies a compression technique suited to the capabilities of the end user’s machine
All techniques operate on a sequence of bitmapped images
Video data is normally compressed twice
–When captured (hardware codec): real-time compression needed
–In order to be transmitted (software codec)
Intra- and Inter-frame compression
Spatial (intra-frame) compression compresses each frame in isolation
–Lossy techniques applied, leading to some loss of image quality
Temporal (inter-frame) compression calculates and compresses the differences between frames in a sequence
–1 key frame + a succession of (usually 6) difference frames
–A difference frame contains the difference between the original frame and the preceding key frame or preceding difference frame
Time to compress may be (much) longer than time to decompress – asymmetric codec
Fast decompression times are important
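A minimal sketch of the key frame plus difference frame idea. Frames are modelled as flat lists of pixel values, each difference is taken against the immediately preceding frame, and the differences are stored losslessly; real codecs go on to compress them, usually lossily. The names `encode`, `decode` and `group_size` are hypothetical.

```python
def encode(frames, group_size=7):
    """Encode frames as key frames and difference frames.

    group_size=7 gives 1 key frame followed by 6 difference frames.
    """
    encoded = []
    previous = None
    for index, frame in enumerate(frames):
        if index % group_size == 0:
            encoded.append(("key", list(frame)))
        else:
            delta = [cur - prev for cur, prev in zip(frame, previous)]
            encoded.append(("diff", delta))
        previous = frame
    return encoded

def decode(encoded):
    """Rebuild the frames by adding each difference to the previous frame."""
    frames = []
    for kind, data in encoded:
        if kind == "key":
            frames.append(list(data))
        else:
            frames.append([prev + d for prev, d in zip(frames[-1], data)])
    return frames

clip = [[10, 10, 10], [10, 11, 10], [10, 11, 12]]
packed = encode(clip)
print(packed)
print(decode(packed) == clip)
```

When little changes between frames the difference frames are mostly zeros, which is what makes them cheap to compress. Decoding is a simple addition per pixel, so it is much faster than any search done while encoding.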
Static.avi
File size: 4.49M bytes
Total duration: seconds
Average data rate: K per second
Image size: 320 x 240
Pixel depth: 24 bits
Frame rate: fps
There are 39 keyframes, 537 delta frames. There are 39 empty frames.
Compressor: 'IV50', Indeo® video 5.10

moving.avi
File size: 7.22M bytes
Total duration: seconds
Average data rate: K per second
Image size: 320 x 240
Pixel depth: 24 bits
Frame rate: fps
There are 46 keyframes, 632 delta frames. There are 31 empty frames.
Compressor: 'IV50', Indeo® video 5.10
Digital Video (DV) - cameras
DV equipment uses a similar compression technique to MJPEG
Chrominance subsampling 4:2:0
Also uses temporal compression
Has to maintain a 3.25 Mbytes/sec data rate (due to demands of DV VTR equipment)
Quality is varied dynamically – if there is little or no motion in a sequence, there is more opportunity for temporal compression to make savings, so less spatial compression is applied, giving higher image quality
Motion JPEG (MJPEG) - cards
Most common approach during capture of analogue video
Loosely defined way of applying JPEG compression
JPEG compression applied to each frame, no temporal compression
Discrete Cosine Transform works just as well on Y’CbCr
Can specify quality setting – compression vs image quality trade-off
Typical data rates of 3 Mbytes/sec – a compression ratio of 7:1, achieved by low- to mid-range capture cards
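The 7:1 figure can be checked roughly against an uncompressed source. The slide does not say what the ratio is measured against; this sketch assumes a CCIR 601 frame with 4:2:2 sub-sampling (2 bytes per pixel) and binary Mbytes, and `raw_rate_422` is a hypothetical name.

```python
MB = 1024**2  # binary Mbytes (assumption)

def raw_rate_422(width, height, fps):
    """Uncompressed bytes per second at 2 bytes per pixel (4:2:2)."""
    return width * height * 2 * fps

pal_rate = raw_rate_422(720, 576, 25)
ntsc_rate = raw_rate_422(720, 480, 30)
mjpeg_rate = 3 * MB

ratio = pal_rate / mjpeg_rate
print(f"Raw PAL rate:  {pal_rate / MB:.1f} Mbytes/sec")
print(f"Raw NTSC rate: {ntsc_rate / MB:.1f} Mbytes/sec")
print(f"Compression ratio at 3 Mbytes/sec: about {ratio:.1f}:1")
```

Under these assumptions the PAL and NTSC raw rates come out identical, because the smaller NTSC frame is offset by its higher frame rate.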
Examples of cameras (’97 prices)
Miro DC-1+ / ATI All in Wonder with Analogue Video Signal M-JPEG / fps
Digital Video Camera fps
Semi Professional Camera fps
(Broadcast Quality = fps)
Examples of cards (’97 prices)
DV editing in real time, real-time video effects, native DV editing (avoids recompression)
Matrox RT2000 £900 (more effects, better integration with software)
Pinnacle DV500 £600 (cheaper, faster, but lower quality)
Software codecs
Four main contenders for compressing video for delivery on CD-ROM or via the internet: Cinepak, Intel Indeo, Sorenson and MPEG-1
Cinepak, Intel Indeo and Sorenson all use vector quantisation
–Frame divided into small blocks (vectors)
–Code book contains typical block patterns
–Closest approximation to each block among the code book entries is worked out, and its index in the code book is stored instead of the original vector
Decompression (fast) is done by replacing indices from the data stream with code book entries
Compression (slow) can take as much as 150 times the decompression time
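Vector quantisation as described above can be sketched in a few lines. This is an illustration only, not how Cinepak, Indeo or Sorenson are actually implemented: blocks are flat tuples of pixel values, closeness is measured by squared difference, and building the code book (the expensive part) is left out. All names are hypothetical.

```python
def distance(block, entry):
    """Sum of squared differences between a block and a code book entry."""
    return sum((a - b) ** 2 for a, b in zip(block, entry))

def vq_encode(blocks, codebook):
    """Replace each block with the index of its closest code book entry."""
    return [
        min(range(len(codebook)), key=lambda i: distance(block, codebook[i]))
        for block in blocks
    ]

def vq_decode(indices, codebook):
    """Look each index up in the code book to rebuild approximate blocks."""
    return [codebook[i] for i in indices]

codebook = [(0, 0, 0, 0), (255, 255, 255, 255), (0, 255, 0, 255)]
blocks = [(10, 5, 0, 3), (250, 251, 255, 240), (20, 230, 10, 250)]

indices = vq_encode(blocks, codebook)
print(indices)
print(vq_decode(indices, codebook))
```

Encoding searches the whole code book for every block, while decoding is a single table lookup per block. That difference is the source of the asymmetry between compression and decompression times.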
Software codecs
Full motion, full screen playback is not possible with mid-range processors (decompression not fast enough)
VHS quality (in terms of lossiness), ¼ frame (320 * frames / second is feasible
Sorenson codec can compress a video with these parameters to (only) 50 Kbytes/sec
–Within capabilities of a multimedia PC or 1x speed CD-ROM
Comparison of S/W Codecs
Codec                      Moving (panning camera)   Static (talking head)
Intel Indeo                ,394K                     4,597K
Cinepak (Quality = 100%)   8,451K (2m 40s)           7,381K
Sorenson 3                 12,199K (40s)             12,393K (35s)
MPEG1                      3,292K (1m 55s)           2,880K (1m 40s)
Original Size (20 secs)    67,500K
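The table can be turned into compression ratios with a short sketch. Only the fully legible sizes are included, the Sorenson figures are our reading of the slide, and the names are hypothetical. The 67,500K original is consistent with 20 seconds of 320 x 240, 24-bit video at 15 frames/second; that frame rate is an inference, not something the slide states.

```python
ORIGINAL_K = 67_500

# Compressed sizes in Kbytes, as read from the comparison table.
SIZES_K = {
    ("Intel Indeo", "static"): 4_597,
    ("Cinepak", "moving"): 8_451,
    ("Cinepak", "static"): 7_381,
    ("Sorenson", "moving"): 12_199,
    ("Sorenson", "static"): 12_393,
    ("MPEG1", "moving"): 3_292,
    ("MPEG1", "static"): 2_880,
}

def compression_ratio(size_k, original_k=ORIGINAL_K):
    """How many times smaller the compressed clip is than the original."""
    return original_k / size_k

# Consistency check on the original size (15 fps is an assumption).
assumed_original_k = 320 * 240 * 3 * 15 * 20 // 1024

for (codec, clip), size_k in sorted(SIZES_K.items()):
    print(f"{codec:12s} {clip:7s} {compression_ratio(size_k):5.1f}:1")
print("Assumed original size:", assumed_original_k, "K")
```

On these figures MPEG1 gives the smallest files for both clips, and every codec except Sorenson does better on the static clip than on the moving one.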
[Example frames at various % compression settings: Cinepak at 15 fps, Indeo, and Intel DVI at 15 fps]