3. Audio Technology
Copyright © Denis Hamelin - Ryerson University
What is sound?
Sound is a physical phenomenon caused by the vibration of a material (e.g., a violin string). The vibration triggers pressure fluctuations in the air around the material. These pressure waves propagate through the air, and we hear the sound when a wave reaches our eardrums.
What is sound?
The waveform occurs repeatedly at regular intervals, or periods. Sound waves have a natural origin, so they are never absolutely uniform or perfectly periodic. A sound with a recognizable periodicity is perceived as musical (this includes singing); non-periodic sounds are called noise.
Sound Waves
Without air there is no sound (in space, for example). Sound exhibits wave-like behaviour such as reflection, refraction and diffraction, which makes the design of "surround sound" possible.
Pressure Wave Oscillation
[Diagram: air pressure vs. time, showing the period and amplitude of the wave]
Frequency
A sound's frequency is the reciprocal of its period. It represents the number of periods per second and is measured in hertz (Hz). One kilohertz (kHz) is 1000 oscillations per second, or 1000 Hz.
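The reciprocal relationship between period and frequency can be sketched in a couple of lines (the function name is mine, chosen for illustration):

```python
def frequency_hz(period_s):
    """Frequency is the reciprocal of the period: f = 1 / T (T in seconds)."""
    return 1.0 / period_s

# A wave repeating every 1 millisecond oscillates 1000 times per second (1 kHz).
print(frequency_hz(0.001))  # 1000.0
```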
Frequency Ranges
Infrasonic: 0 to 20 Hz
Audiosonic: 20 Hz to 20 kHz
Ultrasonic: 20 kHz to 1 GHz
Hypersonic: 1 GHz to 10 THz
In multimedia we are concerned with sounds in the audiosonic range.
Wave Length
The wavelength is the physical length of one wave period: the speed of sound divided by the frequency. At roughly 340 m/s in air, a sound with a 20 Hz frequency has a wavelength of about 17 meters, while a sound with a frequency of 20 kHz has a wavelength of about 1.7 centimeters.
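A quick check of those wavelength figures, assuming a speed of sound of 343 m/s (a typical value for air at about 20 °C):

```python
SPEED_OF_SOUND = 343.0  # m/s in air at ~20 degrees C (assumed value)

def wavelength_m(frequency_hz):
    """Wavelength = speed of sound / frequency."""
    return SPEED_OF_SOUND / frequency_hz

print(wavelength_m(20))     # ~17 m for the lowest audible frequency
print(wavelength_m(20000))  # ~0.017 m (1.7 cm) for the highest
```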
Amplitude
A sound has a property called amplitude, which humans perceive subjectively as loudness or volume, measured in decibels (dB). The amplitude of a sound is the deviation of the pressure wave from its mean value.
0 dB - threshold of hearing (no sound)
20 dB - rustling of paper
35 dB - quiet home
70 dB - noisy street
100 dB - iPod at full volume
110 dB - front row at a rock concert
130 dB - pain threshold
160 dB - instant perforation of the eardrum
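Decibels express a ratio between two power levels on a logarithmic scale; a minimal sketch of the conversion (function name is illustrative):

```python
import math

def power_ratio_db(p1, p2):
    """Decibel value of a power ratio: dB = 10 * log10(p1 / p2)."""
    return 10.0 * math.log10(p1 / p2)

# Doubling the power adds about 3 dB; a 100x power ratio is 20 dB.
print(power_ratio_db(2, 1))
print(power_ratio_db(100, 1))  # 20.0
```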
Sound Perception
Sound enters the ear canal. At the eardrum, sound energy (air pressure changes) is transformed into mechanical energy (the eardrum vibrates). The outer ear helps us locate the source of a sound through the relative intensity differences between the two ears. The inner ear transforms the sound into impulses sent to the brain.
Sound Perception
[Diagram of the ear: not transcribed]
Frequency Perception
Humans perceive different frequencies with different sensitivity. Midrange frequencies are easier to perceive than very high and very low frequencies. Sometimes a loud sound will mask a softer one, especially if the two sounds' frequencies are in a similar range. This is important for sound compression.
Frequency Perception
For some frequencies, a sound can be physically softer and still be perceived as louder.
Audio Representation on Computers
The computer measures the wave's amplitude at regular time intervals, generating a series of sampling values (samples). This process is called digitization and is performed by an analog-to-digital converter (ADC). A digital-to-analog converter (DAC) performs the opposite conversion.
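A toy model of what an ADC does — measure a waveform's amplitude at regular intervals. This is a sketch, not real hardware I/O; the function and its parameters are illustrative:

```python
import math

def sample_sine(freq_hz, sample_rate_hz, duration_s):
    """Measure a sine wave's amplitude at regular time intervals (a toy ADC)."""
    n = int(sample_rate_hz * duration_s)
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate_hz)
            for t in range(n)]

# A 1 kHz tone sampled at 8000 Hz for 1 ms yields 8 samples.
samples = sample_sine(1000, 8000, 0.001)
print(len(samples))  # 8
```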
Sampling Rate
The rate of sampling an analog signal is measured in Hz (number of samples per second). The inverse of the sampling frequency is the sampling period or sampling interval: the time between samples. For CD quality we use 44100 Hz; for DVDs it is 48000 Hz.
Sampling Rate
[Diagram: samples taken at regular intervals, each with a measured sample height]
Nyquist Theorem
The Nyquist theorem, also known as the sampling theorem, is a principle followed in the digitization of analog signals. For analog-to-digital conversion (ADC) to result in a faithful reproduction of the signal, the samples of the analog waveform must be taken frequently enough. The number of samples per second is called the sampling rate or sampling frequency.
Nyquist Theorem
Any analog signal consists of components at various frequencies. The simplest case is the sine wave, in which all the signal energy is concentrated at one frequency. In practice, analog signals usually have complex waveforms, with components at many frequencies. The highest frequency component in an analog signal determines the bandwidth of that signal: the higher that frequency, the greater the bandwidth, all other factors held constant.
Nyquist Theorem
If a signal f(t) is sampled at regular intervals of time and at a rate higher than twice the highest significant signal frequency, then the samples contain all the information of the original signal. Digitally sampled audio has a bandwidth of 20 Hz to 20 kHz, so sampling at twice the maximum frequency (40 kHz) achieves good audio quality. CD audio slightly exceeds this, representing a bandwidth of around 22050 Hz (hence the 44100 Hz sampling rate for CDs).
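A consequence of the theorem is that any component above half the sampling rate "folds back" (aliases) into the representable band. A small sketch of that folding arithmetic (the helper is mine, for illustration):

```python
def alias_frequency(f_hz, sample_rate_hz):
    """Apparent frequency of a pure tone after sampling: components above
    sample_rate/2 fold back into the 0..sample_rate/2 band."""
    f = f_hz % sample_rate_hz
    return min(f, sample_rate_hz - f)

# At 44100 Hz, a 20 kHz tone is below the 22050 Hz limit and survives intact;
# a 30 kHz tone would alias down to 14100 Hz.
print(alias_frequency(20000, 44100))  # 20000
print(alias_frequency(30000, 44100))  # 14100
```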
Quantization
After sampling, sound signals are represented by one of a fixed number of values, in a process known as pulse-code modulation (PCM). PCM is a digital representation of an analog signal in which the magnitude of the signal is sampled regularly at uniform intervals, then quantized to a series of symbols in a numeric (usually binary) code.
Quantization
Quantization depends on the number of bits used to measure the height of the waveform. 16-bit CD-quality quantization yields 65,536 (64K) values; 8-bit quantization yields only 256 (telephone quality). Example with 8 levels of quantization (3 bits): [diagram not transcribed]
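A minimal sketch of uniform quantization, mapping a sample in [-1.0, 1.0] to one of 2^bits integer codes (the function is illustrative, not a production quantizer):

```python
def quantize(sample, bits):
    """Map a sample in [-1.0, 1.0] to one of 2**bits integer levels."""
    levels = 2 ** bits
    code = int((sample + 1.0) / 2.0 * levels)
    return min(code, levels - 1)  # clamp +1.0 into the top level

# With 3 bits there are only 8 levels (codes 0..7).
print(quantize(-1.0, 3))  # 0
print(quantize(0.0, 3))   # 4
print(quantize(1.0, 3))   # 7
print(2 ** 16)            # 65536 levels for 16-bit CD quality
```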
Noiseless Channels
Nyquist proved that if an arbitrary signal is run through a low-pass filter of bandwidth H, the filtered signal can be completely reconstructed from only 2H (exact) samples per second. If the signal consists of V discrete levels, Nyquist's theorem states:
max data rate = 2H log2(V) bits/sec
A noiseless 3 kHz channel carrying a binary signal (2 levels) cannot transmit at a rate exceeding 6000 bits per second (2 x 3000 x log2(2)).
Noiseless Channels
We need to send a data rate of 256 kbps over a noiseless channel with a bandwidth of 20 kHz. How many signal levels (quantization) do we need?
max data rate = 2H log2(V) bits/sec
256000 = 2 x 20000 x log2(V)
log2(V) = 6.4
Since 6.4 is not an integer, we must either decrease to 6 bits (64 quantization levels) or increase to 7 bits (128 levels), for bit rates of 240,000 and 280,000 bps respectively. The choice depends on the transmission medium.
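The worked example above can be reproduced by inverting Nyquist's formula for log2(V) (helper name is mine):

```python
import math

def bits_per_symbol(data_rate_bps, bandwidth_hz):
    """Invert max-data-rate = 2 * H * log2(V) to get log2(V)."""
    return data_rate_bps / (2.0 * bandwidth_hz)

b = bits_per_symbol(256000, 20000)
print(b)                                 # 6.4 -> not an integer
print(2 * 20000 * math.floor(b))         # 240000 bps with 6 bits (64 levels)
print(2 * 20000 * math.ceil(b))          # 280000 bps with 7 bits (128 levels)
```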
Noisy Channels
Thermal noise is measured by the ratio of the signal power S to the noise power N (the signal-to-noise ratio S/N). The channel capacity is then:
C = H log2(1 + S/N)
dB = 10 x log10(value1/value2)
The capacity of the voice band of a telephone channel can be determined using the Gaussian model. The bandwidth is 3000 Hz and the signal-to-noise ratio is often 30 dB (S/N = 1000). Therefore C = 3000 log2(1 + 1000), approximately 29902 bps.
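The telephone-channel figure checks out numerically; a sketch combining the dB conversion with the capacity formula (function name is illustrative):

```python
import math

def channel_capacity_bps(bandwidth_hz, snr_db):
    """C = H * log2(1 + S/N), with S/N converted from decibels."""
    snr = 10 ** (snr_db / 10.0)  # 30 dB -> power ratio of 1000
    return bandwidth_hz * math.log2(1 + snr)

print(channel_capacity_bps(3000, 30))  # ~29902 bps for the voice band
```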
Audio Formats
Audio formats are described by sample rate and quantization:
Voice quality - 8-bit quantization, 8000 Hz mono (8 KBytes/sec)
Radio quality - 22 kHz, 8-bit mono (22 KBytes/sec) or stereo (44 KBytes/sec)
CD quality - 16-bit quantization, 44100 Hz linear stereo (176 KBytes/sec)
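Each of those rates is just sample rate x bytes per sample x channels; a quick sketch (function name is mine):

```python
def pcm_bytes_per_sec(sample_rate_hz, bits, channels):
    """Uncompressed PCM data rate: rate * bytes-per-sample * channels."""
    return sample_rate_hz * (bits // 8) * channels

print(pcm_bytes_per_sec(8000, 8, 1))    # 8000   bytes/s (voice quality)
print(pcm_bytes_per_sec(22050, 8, 2))   # 44100  bytes/s (radio, stereo)
print(pcm_bytes_per_sec(44100, 16, 2))  # 176400 bytes/s (CD quality)
```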
Audio Formats
mu-law encoding corresponds to CCITT G.711, the standard for voice data in telephone companies in the USA, Canada and Japan. A-law encoding is used for telephony elsewhere. Both A-law and mu-law are sampled at 8000 samples/second, with roughly 13-14 bits of linear precision compressed to 8-bit samples.
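The idea behind mu-law companding is a logarithmic curve that gives small signals finer resolution before they are rounded to 8 bits. Below is the continuous mu-law formula with the standard mu = 255; note this is a sketch of the curve, not the exact segmented encoder G.711 specifies:

```python
import math

MU = 255.0  # mu-law parameter used by G.711

def mu_law_compress(x):
    """Continuous mu-law companding of a sample in [-1, 1]:
    F(x) = sign(x) * ln(1 + mu*|x|) / ln(1 + mu)."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

# Full scale maps to 1.0; a quiet sample near 0.01 is boosted to ~0.23,
# so more of the 8-bit code space covers quiet signals.
print(mu_law_compress(1.0))   # 1.0
print(mu_law_compress(0.01))
```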
Audio Formats
mu-law and A-law: 8-bit precision. PCM can be stored at various precisions; 16-bit PCM is common. Multiple channels of audio may be interleaved at sample boundaries. Common file formats: au (Sun/NeXT), wav (Microsoft RIFF/waveform), aiff (Apple), RealAudio, mp3.
Audio Quality
[Table not transcribed]
Popular Sampling Rates
[Table not transcribed]
Coding Methods
[Table not transcribed]
3D Sound Projection
The shortest path between the sound source and the listener is called the direct sound path. All other sound paths are reflected, which means they are delayed before they reach the listener's ear.
Music and MIDI
The MIDI standard defines how to code all the elements of musical scores, such as sequences of notes, timing conditions and the instrument to play each note. MIDI is a standard that manufacturers of musical instruments use so that instruments can exchange musical information via computers.
Music and MIDI
MIDI does not transmit an audio signal or media. It simply transmits digital "event messages", such as the pitch and intensity of notes to play, control signals for parameters such as volume, vibrato and panning, cues, and clock signals to set the tempo. Because the music is simply data rather than recorded waveforms, it is stored in a very small file format.
MIDI Interface
Hardware - specifies a MIDI port (which plugs into the computer's serial port) and a MIDI cable.
Data format - includes the instrument specification, the notion of the beginning and end of a note, frequency and sound volume. Data is grouped into MIDI messages, each of which specifies a musical event.
An instrument that satisfies both is a MIDI device (e.g., a synthesizer).
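As a concrete illustration of a MIDI message, here is a sketch of building the standard 3-byte Note On event (status byte 0x90 plus the channel, then note number and velocity, each 0-127):

```python
def note_on(channel, note, velocity):
    """Build a 3-byte MIDI Note On message: status 0x90 | channel,
    followed by the note number and velocity (both 0-127)."""
    assert 0 <= channel <= 15 and 0 <= note <= 127 and 0 <= velocity <= 127
    return bytes([0x90 | channel, note, velocity])

# Middle C (note 60) on channel 0 at velocity 100 -> three bytes on the wire.
msg = note_on(0, 60, 100)
print(msg.hex())  # 903c64
```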
MIDI File Formats
Standard MIDI File (SMF) format: MIDI files are typically created using computer-based sequencing software (or sometimes a hardware-based MIDI instrument or workstation) that organizes MIDI messages into one or more parallel "tracks" for independent recording and editing. In most sequencers, each track is assigned to a specific MIDI channel and/or a specific General MIDI instrument patch.
MIDI File Formats
MIDI Karaoke File (.KAR) format: MIDI-Karaoke files (which use the ".kar" file extension) are an "unofficial" extension of MIDI files, used to add synchronized lyrics to standard MIDI files. SMF players play the music as they would a .mid file but do not display the lyrics unless they have specific support for .kar messages. Players that do support them often display the lyrics synchronized with the music in "follow-the-bouncing-ball" fashion, essentially turning any PC into a karaoke machine.
MIDI Software
Applications include music recording and performance, musical notation and printing, music education, etc. The MIDI standard specifies 16 channels and identifies 128 instruments.
MOD File Format
MOD is a computer file format used primarily to represent music, and was the first module file format. MOD files use the ".MOD" file extension, except on the Amiga, where the original trackers instead use a "mod." prefix scheme, e.g. "mod.echoing". A MOD file contains a set of instruments in the form of samples, a number of patterns indicating how and when the samples are to be played, and a list of what patterns to play in what order.
Human Speech
The human ear is most sensitive in the range 600 Hz to 6000 Hz. Real-time signal generation allows transformation of text into speech without lengthy processing. Synthesized speech must be understandable and must sound natural. Speech transmission methods (coding, recognition and synthesis) aim to achieve a minimal data rate for a given quality.
End of lesson