School of Informatics
CG087 Time-based Multimedia Assets
Introduction, Dr Paul Vickers

Introduction to time-based multimedia assets
Dr Paul Vickers
Dr Alf Watson
Emil Petkov

Introduction
Looking at multimedia assets that change over time
  – Sound
  – Video
Sound clearly requires a time dimension to be perceived
By video we mean motion video, time-based sequences of still images (e.g. a synchronised slide show), or a combination of both

What the module covers
Sound theory
  – Physics of sound
  – Psycho-acoustics
Sound language
  – Atmospheric sound
  – Sonic/musical grammars
  – Foley art and sound tracks
  – Sound in games

What the module covers
Sound as a communication/interaction medium
  – Auditory display
  – Sonification, auralisation
  – Earcons, auditory icons
Musical instrument digital interface (MIDI)
  – Definition & history
  – Protocols
  – Input
  – Controllers & tone generators
  – Sequencers

What the module covers
Combining MIDI and audio
  – Sampling & loops
  – Sequencing & mixing
(Compression and streaming
  – What is compression?
  – Types of compression (e.g. WMA, MP3, ATRAC))

What the module covers
Video
  – Digital vs analogue
  – Filming: video language, managing shooting
  – Digitising, optimising, compression systems
  – Synchronising sound to video
  – Compression & streaming
  – Time-based control structures

Who does what
Dr Paul Vickers: Sound
  – Pandon 1.21
  – Tel
Dr Alf Watson: Video
  – Pandon 1.14
  – Tel

Assessment
There is one piece of assessment (no exam)
  – An asset creation, reporting and discussion task
  – Will be handed out in week 1
  – Due in week 13

Introduction to sound
What is sound?
  – If a tree falls in a forest and nobody is there to hear it, does it make a sound?
  – Discuss…
Different from vision
  – We can only attend to one visual stream at once, yet we can monitor several auditory streams
  – When sounds from several sources are mixed, we still perceive them as separate sources

Sound perception
Sound is a construction of the mind
Hearing is defined by the neural coding and processing of information from the auditory system, its integration with information from other sensory systems, and the response to the result
Sound is temporal
  – (What is a sound of zero duration?)
  – Though it has some spatial characteristics. Discuss.

Physical definition
Vibration of an object produces sound. If we hear the vibration, the sound is audible
  – Hitting a table causes vibration
  – Blowing across a bottle causes complex vibrations
  – Plucking a guitar string causes it to vibrate
What’s the difference between the sounds made by hitting a table, blowing a trombone, whistling?
Fourier showed that any vibration can be resolved into a sum of sinusoidal vibrations

Sinusoids
Sinusoids (aka sine waves) describe relationships between displacement and time

Attributes of sine wave
Frequency (no. of cycles per second, Hz)
Amplitude (height of wave)
Starting phase
What does frequency relate to? What does amplitude relate to?
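
The three attributes above can be made concrete with a minimal sketch that samples a sine wave. The function name `sine_wave`, the 8000 Hz sample rate and the 250 Hz example tone are illustrative assumptions, not part of the module material.

```python
import math

def sine_wave(frequency_hz, amplitude, phase_rad, duration_s, sample_rate=8000):
    """Sample amplitude * sin(2*pi*f*t + phase) at a fixed sample rate.

    sample_rate is an assumed value chosen only to keep the example small.
    """
    n_samples = int(round(duration_s * sample_rate))
    return [
        amplitude * math.sin(2 * math.pi * frequency_hz * i / sample_rate + phase_rad)
        for i in range(n_samples)
    ]

# One full cycle (4 ms) of a 250 Hz tone with unit amplitude and zero starting phase.
samples = sine_wave(250, 1.0, 0.0, 0.004)

# Same tone shifted by a quarter cycle: it now starts at its peak.
shifted = sine_wave(250, 1.0, math.pi / 2, 0.004)
```

Changing `frequency_hz` changes how many cycles fit into the same duration, changing `amplitude` scales the height of the wave, and changing `phase_rad` moves the point in the cycle at which the wave starts.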

Attributes of sound
Sound has three physical attributes
  – Frequency
  – Intensity
  – Time
In music we talk about pitch. Is it the same as frequency?
  – Concert pitch is where the A above middle C has a frequency of 440 Hz

Complex sounds
Most sounds are not pure tones, but combinations of frequencies
Pure tone:

Complex sounds
Square wave
Triangle wave
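
As a hedged illustration of Fourier's result, the sketch below approximates a square wave and a triangle wave by summing odd harmonics of a fundamental, using the standard Fourier series for each shape. The function name `additive_wave`, the choice of 50 harmonics and the use of a 250 Hz fundamental are assumptions made for the example.

```python
import math

def additive_wave(kind, fundamental_hz, t, n_harmonics=50):
    """Approximate a unit-amplitude square or triangle wave at time t (seconds)
    by summing the first n_harmonics odd harmonics of the fundamental."""
    if kind not in ("square", "triangle"):
        raise ValueError("kind must be 'square' or 'triangle'")
    total = 0.0
    for m in range(n_harmonics):
        k = 2 * m + 1  # odd harmonic number: 1, 3, 5, ...
        partial = math.sin(2 * math.pi * k * fundamental_hz * t)
        if kind == "square":
            total += partial / k  # harmonic amplitudes fall off as 1/k
        else:
            total += (-1) ** m * partial / k ** 2  # alternating sign, 1/k^2 fall-off
    scale = 4 / math.pi if kind == "square" else 8 / math.pi ** 2
    return scale * total

# Both waves reach their peak a quarter of the way through one 250 Hz cycle.
quarter_period = 1 / (4 * 250)
square_peak = additive_wave("square", 250, quarter_period)
triangle_peak = additive_wave("triangle", 250, quarter_period)
```

The harmonics above the fundamental are weaker than the fundamental itself, and they fall off much faster for the triangle wave than for the square wave, which is one reason the two sound different.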

Pitch perception
All three waves had the same perceived pitch, yet the first contained only one sinusoid, whilst the square and triangle waves contained many lower-intensity sinusoids above the fundamental of 250 Hz
Thus, we tend to perceive the pitch of a tone as its fundamental (or its missing fundamental!)
  – You hear a 100 Hz pitch when presented with a stimulus consisting of the sum of tones at 700, 800, 900 and 1000 Hz: all four tones are harmonics of 100 Hz
Pitch of sine waves varies with intensity! As level increases, pitch tends to fall for low-frequency tones and rise for high-frequency tones
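
The missing-fundamental example can be checked arithmetically. The sketch below finds the largest frequency of which every component is a whole-number multiple by taking the greatest common divisor. This is only a simplification that works for exact integer harmonics; it is not a model of how the auditory system derives pitch, and the name `implied_fundamental` is hypothetical.

```python
from functools import reduce
from math import gcd

def implied_fundamental(harmonics_hz):
    """Largest frequency (Hz) that divides every component exactly."""
    return reduce(gcd, harmonics_hz)

components = [700, 800, 900, 1000]
implied = implied_fundamental(components)             # 100
harmonic_numbers = [f // implied for f in components]  # [7, 8, 9, 10]
```

The stimulus contains the 7th to 10th harmonics of 100 Hz but no energy at 100 Hz itself, which is why the perceived pitch is described as a missing fundamental.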

Timbre
What differentiated the sound of the three tones we heard earlier? How would you describe them?
The quality that differentiates how tones sound is called timbre
Different instruments have different timbres

Intensity
Intensity is a measure of the energy of a signal, expressed in decibels (dB)
  – 0 dB is the threshold of human hearing (10^-12 W/m^2)
  – A 3 dB increase is a doubling of intensity
  – Pain threshold: 120 dB to 140 dB
  – 90 dB (or prolonged exposure to lower levels) can cause permanent hearing damage

Power ratios and decibels
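
The relationship between power ratios and decibels can be sketched as follows. The code uses the standard definition, level in dB = 10 log10(I / I0), and assumes the conventional reference intensity of 10^-12 W/m^2 for the threshold of hearing. The function names are illustrative.

```python
import math

# Assumed reference intensity: the conventional threshold of hearing in W/m^2.
I0 = 1e-12

def intensity_to_db(intensity, reference=I0):
    """Level in dB of an intensity relative to the reference intensity."""
    return 10 * math.log10(intensity / reference)

def db_to_ratio(db):
    """Power (intensity) ratio that corresponds to a level difference in dB."""
    return 10 ** (db / 10)

threshold_db = intensity_to_db(I0)       # the reference itself sits at 0 dB
doubling_db = intensity_to_db(2 * I0)    # doubling the intensity adds about 3 dB
pain_ratio = db_to_ratio(120)            # 120 dB is a ratio of about 10^12

# A small table of level differences against power ratios.
for db in (0, 3, 10, 20, 30):
    print(f"{db:>3} dB  ->  power ratio {db_to_ratio(db):.1f}")
```

Because the scale is logarithmic, every additional 10 dB multiplies the intensity by ten, so the range from the threshold of hearing to the pain threshold spans roughly twelve orders of magnitude.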

Example intensities

Loudness
Loudness is a psychophysical response which does not necessarily equate to intensity. It is usually measured in phons
  – The phon is the level in dB SPL of an equally loud 1,000 Hz tone. All tones judged equal in loudness to a 40 dB SPL 1 kHz tone have a loudness level of 40 phons.
Studies have shown that loudness doubles every 10 dB or so, i.e. a sound must be increased in intensity by a factor of ten to be perceived as twice as loud. Or, it takes 10 violins to sound twice as loud as one violin.
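
The violin example can be worked through numerically. This sketch assumes that identical, independent sources add in intensity (so n sources raise the level by 10 log10(n) dB) and applies the rule of thumb that loudness doubles for every 10 dB. Both are approximations, and the function names are hypothetical.

```python
import math

def level_increase_db(n_sources):
    """Rise in level (dB) when n identical independent sources play together."""
    return 10 * math.log10(n_sources)

def loudness_ratio(delta_db):
    """Approximate perceived loudness ratio, assuming a doubling per 10 dB."""
    return 2 ** (delta_db / 10)

ten_violins = loudness_ratio(level_increase_db(10))  # about 2: twice as loud
two_violins = loudness_ratio(level_increase_db(2))   # roughly 1.2, far from double
```

Under these assumptions, adding a second violin raises the level by only about 3 dB, which is heard as a modest increase rather than a doubling of loudness.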

Equal loudness contours
  – 100 Hz, 52 dB SPL
  – 1 kHz, 40 dB SPL
  – 4 kHz, 37 dB SPL
All judged to be equal in loudness

Psychoacoustics
Psychoacoustics is the psychological study of hearing. The aim of psychoacoustic research is to find out how hearing works.
We have seen that what we hear is not the same as what is presented
Why do we hear things in certain ways?

Auditory scene analysis
“The things that we see are organized into patterns or figures, rather than discrete dots of light of different colours. In hearing, we also tend to organize sounds into auditory objects or streams. Bregman (1994) has termed this process for audition auditory scene analysis”
  – Taken from
For info on ASA see

Other phenomena
The cocktail party effect (Arons, 1992) allows us to pick out one conversation in a babel of voices
The tritone paradox (Deutsch, 1991) shows how our language influences the way we hear music
Auditory illusions
  – Shepard-Risset tones
  – And many others. See what you can find.

References
Arons, B. (1992). “A Review of the Cocktail Party Effect.” Journal of the American Voice I/O Society 12(Jul.):
Bregman, A. S. (1994). Auditory Scene Analysis: The Perceptual Organization of Sound. Cambridge, Massachusetts: The MIT Press.
Deutsch, D. (1991). “The Tritone Paradox: An Influence of Language on Music Perception.” Music Perception 8(4):