Digital Sound Dr. Kairui Chen GGC.

What is sound?
Conversion of energy into vibrations in the air or some other elastic medium
  Vocal cords
  Tuning fork
  Guitar strings

Waveforms
Sounds change over time (sound is a function of time), e.g. speech changes constantly
Frequency spectrum: the relative amplitudes of the frequency components alter as the sound changes
A waveform is a plot of amplitude against time
  Provides a graphical view of the characteristics of a changing sound
  Can identify syllables of speech, rhythm of music, quiet and loud passages, etc.

Frequency of Sound Wave
Refers to the number of complete back-and-forth cycles of vibrational motion of the medium particles per unit of time
Unit for frequency: Hz (hertz)
1 Hz = 1 cycle/second

Frequency
[Figure: waveform with 2 complete cycles in 1 second]
Frequency = 2 Hz (i.e., 2 cycles/second)

Frequency
[Figure: waveform with 4 complete cycles in 1 second]
Frequency = 4 Hz (i.e., 4 cycles/second)
Higher frequency than the previous waveform.

Frequency
Sound frequency is often referred to as the pitch of the sound.
Higher pitch -> higher frequency
Lower pitch -> lower frequency
Range of human hearing: roughly 20 Hz to 20 kHz; it varies from person to person, and the upper limit falls as we age

Sound Intensity
Sound intensity:
  an objective measurement
  can be measured with auditory devices
  in decibels (dB)
0 dB: threshold of hearing
  minimum sound pressure level at which humans can hear a sound at a given frequency
  does NOT mean zero sound intensity
  does NOT mean absence of a sound wave
About 120 dB: threshold of pain
  sound intensity that is 10^12 times greater than at 0 dB
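The 10^12 figure follows from the standard decibel definition, where a level in dB is 10 times the base-10 logarithm of the intensity ratio relative to the 0 dB reference. The sketch below is illustrative only; the function names are not from the slides.

```python
import math

def intensity_ratio(level_db):
    """Intensity relative to the 0 dB reference for a level given in decibels."""
    return 10 ** (level_db / 10)

def level_in_db(ratio):
    """Level in decibels for an intensity ratio relative to the reference."""
    return 10 * math.log10(ratio)

# 0 dB means "equal to the reference intensity" (ratio 1), not "no sound".
print(intensity_ratio(0))
# 120 dB corresponds to a ratio of about 10^12.
print(intensity_ratio(120))
print(level_in_db(1e12))
```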

A Single Tone Sound: A Simple Sine Wave Waveform
[Figure: a single sine wave waveform, i.e. a single tone]

Adding Sound Waves
Most sound sources vibrate in complex ways, leading to sounds with components at several different frequencies.
[Figure: a single sine wave waveform (a single tone) added to a second single sine wave waveform (a second single tone) gives a more complex waveform (a more complex sound)]
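Adding sound waves can be sketched numerically by summing sine components point by point. The frequencies and amplitudes below (2 Hz and 4 Hz) are illustrative assumptions, not values from the slides.

```python
import math

def sine_tone(freq_hz, amplitude, t):
    """Value at time t (seconds) of a single sine tone."""
    return amplitude * math.sin(2 * math.pi * freq_hz * t)

def complex_wave(components, t):
    """Sum of several sine tones, given as (freq_hz, amplitude) pairs."""
    return sum(sine_tone(freq, amp, t) for freq, amp in components)

# Two single tones combined into one more complex waveform (assumed values).
components = [(2.0, 1.0), (4.0, 0.5)]
times = [i / 16 for i in range(17)]          # one second in 1/16 s steps
wave = [complex_wave(components, t) for t in times]

for t, value in zip(times, wave):
    print(f"{t:.4f} s  {value:+.3f}")
```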

Digitizing Sound
Suppose we want to digitize this sound wave:
[Figure: the sound wave to be digitized]

Effects of Sampling Rate
[Figure: the original waveform compared with samples taken at sampling rate = 10 Hz and at sampling rate = 20 Hz]

Effects of Sampling Rate
Higher sampling rate:
  The reconstructed wave looks closer to the original wave
  More sample points, more data to record, and thus larger file size
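The trade-off can be seen by sampling the same wave at the two rates shown earlier (10 Hz and 20 Hz). This is a minimal sketch; the 2 Hz test tone and the one-second duration are assumptions chosen for illustration.

```python
import math

def sample_wave(freq_hz, sampling_rate_hz, duration_s):
    """Sample a unit-amplitude sine wave at evenly spaced points in time."""
    count = int(sampling_rate_hz * duration_s)
    return [math.sin(2 * math.pi * freq_hz * i / sampling_rate_hz)
            for i in range(count)]

low = sample_wave(2, 10, 1.0)    # sampling rate = 10 Hz
high = sample_wave(2, 20, 1.0)   # sampling rate = 20 Hz

# Doubling the sampling rate doubles the number of values to store.
print(len(low), len(high))
```

Every second sample of the 20 Hz version falls at the same instant as a sample of the 10 Hz version, so the higher rate keeps everything the lower rate captured and adds the points in between.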

Estimate Thresholds of Sampling Rate Based on Human Hearing
Let's consider these two factors:
  Human hearing range
  A rule called the Nyquist theorem

Nyquist Theorem
We must sample at least 2 points in each sound wave cycle to be able to reconstruct the sound wave satisfactorily.
Sampling rate of the audio ≥ twice the audio frequency (called the Nyquist rate)
The required sampling rate is higher for audio with higher pitch
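The "at least 2 points per cycle" rule can be checked with simple arithmetic. In this sketch the function names and the 440 Hz example tone are assumptions for illustration.

```python
def nyquist_rate(frequency_hz):
    """Minimum sampling rate (Hz) for a tone: twice its frequency."""
    return 2 * frequency_hz

def samples_per_cycle(sampling_rate_hz, frequency_hz):
    """How many sample points fall within one cycle of the wave."""
    return sampling_rate_hz / frequency_hz

# Higher-pitched audio needs a higher sampling rate to keep 2 points per cycle.
for frequency in (440, 5_000, 20_000):
    print(frequency, nyquist_rate(frequency), samples_per_cycle(44_100, frequency))
```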

Choosing Sampling Rate: Example 1
If we consider the human ear's most sensitive range of frequencies (2,000 Hz to 5,000 Hz), what is the lowest of the following sampling rates that still satisfies the Nyquist theorem?
  11,025 Hz: AM Radio Quality/Speech
  22,050 Hz: Near FM Radio Quality (high-end multimedia)
  44,100 Hz: CD Quality
  48,000 Hz: DAT (digital audio tape) Quality
  96,000 Hz: DVD-Audio Quality
  192,000 Hz: DVD-Audio Quality
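One way to work through this kind of question is to compute the Nyquist rate for the highest frequency of interest and pick the lowest listed rate that reaches it. The helper below is a hypothetical sketch, not part of the slides.

```python
# Sampling rates listed on the slide, in Hz.
STANDARD_RATES_HZ = [11_025, 22_050, 44_100, 48_000, 96_000, 192_000]

def lowest_adequate_rate(max_frequency_hz, rates=STANDARD_RATES_HZ):
    """Lowest listed rate that is at least twice the highest frequency."""
    nyquist = 2 * max_frequency_hz
    candidates = [rate for rate in rates if rate >= nyquist]
    return min(candidates) if candidates else None

# Most sensitive range tops out at 5,000 Hz, so the Nyquist rate is 10,000 Hz.
print(lowest_adequate_rate(5_000))
# Full hearing range tops out at 20,000 Hz, so the Nyquist rate is 40,000 Hz.
print(lowest_adequate_rate(20_000))
```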

Choosing Sampling Rate: Example 2
Given the human hearing range (20 Hz to 20,000 Hz) and the Nyquist theorem, why do you think the sampling rate (44,100 Hz) for CD-quality audio is reasonable?
The Nyquist rate for 20,000 Hz is 40,000 Hz.
44,100 Hz > twice the highest audible frequency (the Nyquist rate)

Sampling Rate Examples
  11,025 Hz: AM Radio Quality/Speech
  22,050 Hz: Near FM Radio Quality (high-end multimedia)
  44,100 Hz: CD Quality
  48,000 Hz: DAT (digital audio tape) Quality
  96,000 Hz: DVD-Audio Quality
  192,000 Hz: DVD-Audio Quality

Digitization: Quantization
Each of the discrete amplitude samples obtained from the sampling step is mapped and rounded to the nearest value on a scale of discrete levels.
The number of levels in the scale is expressed as a bit depth: the number of levels is 2 raised to the power of the bit depth.
  More levels: more accurate mapping, better quality, but larger file size
  Fewer levels: less accurate mapping, worse quality, but smaller file size
The bit depth of digital audio is also referred to as its resolution.
For digital audio, higher resolution means higher bit depth.
8-bit audio allows 2^8 = 256 possible levels in the scale
  only use it if some distortion is acceptable, e.g. voice communication
CD-quality audio is 16-bit (i.e., 2^16 = 65,536 possible levels)
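The mapping step can be sketched as rounding each sample to the nearest of 2^(bit depth) evenly spaced levels. This assumes uniform (linear) quantization and samples normalized to the range -1.0 to 1.0; neither detail is stated on the slides.

```python
def quantize(sample, bit_depth):
    """Index of the nearest level for a sample in [-1.0, 1.0] (assumed range)."""
    levels = 2 ** bit_depth
    clipped = max(-1.0, min(1.0, sample))
    return round((clipped + 1.0) / 2.0 * (levels - 1))

def dequantize(index, bit_depth):
    """Amplitude represented by a level index."""
    levels = 2 ** bit_depth
    return index / (levels - 1) * 2.0 - 1.0

sample = 0.3
for bit_depth in (8, 16):
    index = quantize(sample, bit_depth)
    error = abs(dequantize(index, bit_depth) - sample)
    # More levels (higher bit depth) gives a smaller rounding error.
    print(bit_depth, 2 ** bit_depth, index, error)
```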

Digital Audio File Size
File size of uncompressed digital audio is determined by:
  Sampling rate (r)
  Bit depth (s)
  Number of channels
    Mono: single channel
    Stereo: two channels
    Multiple channels
  Duration of the audio in seconds (t)

Let's estimate the file size of a 1-minute CD-quality audio file

1-minute CD Quality Audio
Sampling rate = 44,100 Hz (i.e., 44,100 samples/second)
Bit depth = 16 (i.e., 16 bits/sample)
Stereo (i.e., 2 channels: left and right channels)

File Size of 1-min CD-quality Audio
1 minute = 60 seconds
Total number of samples = 60 seconds × 44,100 samples/second = 2,646,000 samples
Total number of bits required for this many samples = 2,646,000 samples × 16 bits/sample = 42,336,000 bits
This is for one channel.
Total bits for two channels = 42,336,000 bits/channel × 2 channels = 84,672,000 bits

File Size of 1-min CD-quality Audio
84,672,000 bits
= 84,672,000 bits / (8 bits/byte) = 10,584,000 bytes
= 10,584,000 bytes / (1024 bytes/KB) ≈ 10,336 KB
= 10,336 KB / (1024 KB/MB) ≈ 10 MB
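The same estimate can be reproduced in a few lines. The function name is illustrative; the figures are the ones used in the worked example (44,100 Hz, 16-bit, stereo, 60 seconds).

```python
def audio_file_size_bytes(sampling_rate_hz, bit_depth, channels, duration_s):
    """Uncompressed audio size in bytes: samples x bits per sample x channels / 8."""
    total_bits = sampling_rate_hz * bit_depth * channels * duration_s
    return total_bits / 8

size = audio_file_size_bytes(44_100, 16, 2, 60)
print(size)                  # bytes
print(size / 1024)           # KB
print(size / 1024 / 1024)    # MB, roughly 10
```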

General Strategies to Reduce Digital Media File Size
Reduce the sampling rate
Reduce the bit depth
Apply compression
For digital audio, these can also be options:
  reduce the number of channels
  shorten the length of the audio

Reduce Sampling Rate
Sacrifices the fidelity of the digitized audio
Need to weigh the quality against the file size
Need to consider:
  human perception of the audio (e.g., how perceptible is the difference at a lower sampling rate?)
  how the audio is used
    music: may need a higher sampling rate
    short sound clips such as explosions and looping ambient background noise: may work well with a lower sampling rate

Effect of Sampling Rate on File Size
File size = duration × sampling rate × bit depth × number of channels
File size is reduced in the same proportion as the reduction of the sampling rate
Example: Reducing the sampling rate from 44,100 Hz to 22,050 Hz will reduce the file size by half.

Effect of Bit Depth on File Size
File size = duration × sampling rate × bit depth × number of channels
File size is reduced in the same proportion as the reduction of the bit depth
Example: Reducing the bit depth from 16-bit to 8-bit will reduce the file size by half.

Most Common Choices of Bit Depth
8-bit
  usually sufficient for speech
  in general, too low for music
16-bit
  minimal bit depth for music
24-bit
32-bit

Effect of Number of Channels on File Size
File size = duration × sampling rate × bit depth × number of channels
File size is reduced in the same proportion as the reduction of the number of channels
Example: Reducing the number of channels from 2 (stereo) to 1 (mono) will reduce the file size by half.
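Because file size is a plain product of the four factors, halving any one of them halves the result. The sketch below checks this for sampling rate, bit depth and number of channels, using a 60-second CD-quality clip as an assumed baseline.

```python
def file_size_bits(duration_s, sampling_rate_hz, bit_depth, channels):
    """File size = duration x sampling rate x bit depth x number of channels."""
    return duration_s * sampling_rate_hz * bit_depth * channels

base = file_size_bits(60, 44_100, 16, 2)        # baseline (assumed clip)
half_rate = file_size_bits(60, 22_050, 16, 2)   # 44,100 Hz -> 22,050 Hz
half_depth = file_size_bits(60, 44_100, 8, 2)   # 16-bit -> 8-bit
mono = file_size_bits(60, 44_100, 16, 1)        # stereo -> mono

for label, bits in [("half rate", half_rate), ("half depth", half_depth), ("mono", mono)]:
    print(label, bits, bits / base)
```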

Digital Sound Editing Software: Audacity (tutorial and hands-on activity will be given in class)
Timeline divided into tracks
Sound on each track displayed as a waveform
'Scrub' over part of a track, e.g. to find pauses
Cut and paste, drag and drop
May combine many tracks from different recordings (mix-down)

Effects and Filters
Noise gate: removes hiss from music
Low pass and high pass filters
Notch filter: removes a single narrow frequency band
De-esser: removes sibilance
Click repairer: removes clicks from recordings taken from old vinyl records
Reverb: echo effect
etc.

Audio File Compression
Lossless
Lossy
  gets rid of some data, but human perception is taken into consideration so that the data removed causes the least noticeable distortion
  e.g. MP3 (good compression rate while preserving the perceivably high quality of the audio)

Compression
In general, lossy methods are required because of the complex and unpredictable nature of audio data
A CD-quality, stereo, 3-minute song requires over 25 Mbytes
  Its data rate exceeds the bandwidth of a dial-up Internet connection
The difference in the way we perceive sound and image means a different approach from image compression is needed
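Both claims can be checked with rough arithmetic. The CD-quality parameters come from the earlier slides; the 56 kbps dial-up figure is an assumption, since the slides do not state a modem speed.

```python
def data_rate_bps(sampling_rate_hz, bit_depth, channels):
    """Uncompressed audio data rate in bits per second."""
    return sampling_rate_hz * bit_depth * channels

DIALUP_BPS = 56_000   # assumed dial-up modem speed, not given on the slides

cd_rate = data_rate_bps(44_100, 16, 2)
song_bytes = cd_rate * 180 // 8            # 3 minutes = 180 seconds

print(cd_rate)                             # bits per second
print(song_bytes / (1024 * 1024))          # size in MB, comfortably over 25
print(cd_rate / DIALUP_BPS)                # how many times the assumed modem speed
```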

Companding
Non-linear quantization
Higher quantization levels are spaced further apart than lower ones
Quiet sounds are represented in greater detail than loud ones
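The slides do not name a particular companding law. As an illustration only, the sketch below uses mu-law with mu = 255 (an assumed choice) to show how evenly spaced compressed values correspond to amplitude levels that spread out as the sound gets louder.

```python
import math

MU = 255  # assumed parameter; the slides do not specify a companding law

def compress(x, mu=MU):
    """Mu-law compression of a sample x in [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def expand(y, mu=MU):
    """Inverse of compress: recover the amplitude from a compressed value."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)

# Evenly spaced values in the compressed domain...
levels = [expand(i / 4) for i in range(5)]
# ...map to amplitude levels whose spacing grows with loudness.
gaps = [levels[i + 1] - levels[i] for i in range(4)]

print(levels)
print(gaps)
```

Quantizing the compressed value uniformly therefore gives quiet sounds finer steps than loud ones.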

ADPCM
Differential Pulse Code Modulation (DPCM)
  Similar to video inter-frame compression
  Compute a predicted value for the next sample, store the difference between the prediction and the actual value
Adaptive Differential Pulse Code Modulation (ADPCM)
  Dynamically vary the step size used to store the quantized differences
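The differential idea can be sketched with the simplest possible predictor, "the next sample equals the previous one", which is an assumption made here for illustration. The adaptive step size of ADPCM is not modelled, and the sample values are made up.

```python
def dpcm_encode(samples):
    """Store each sample as its difference from the predicted (previous) value."""
    differences = []
    prediction = 0
    for sample in samples:
        differences.append(sample - prediction)
        prediction = sample          # predictor: next sample = this sample
    return differences

def dpcm_decode(differences):
    """Rebuild the samples by adding each difference to the running prediction."""
    samples = []
    prediction = 0
    for difference in differences:
        prediction = prediction + difference
        samples.append(prediction)
    return samples

samples = [100, 104, 109, 112, 110, 105]   # hypothetical sample values
encoded = dpcm_encode(samples)
print(encoded)                             # [100, 4, 5, 3, -2, -5]
print(dpcm_decode(encoded) == samples)     # True
```

After the first value the differences are small numbers, which need fewer bits to store than the samples themselves.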

Perceptually-Based Compression
Identify and discard data that doesn't affect the perception of the signal
Needs a psycho-acoustical model, since the ear and brain do not respond to sound waves in a simple way
  Threshold of hearing: sounds too quiet to hear
  Masking: sound obscured by some other sound

The Threshold of Hearing

Masking

Compression Algorithm
Split the signal into bands of frequencies using filters
  Commonly use 32 bands
Compute the masking level for each band, based on its average value and a psycho-acoustical model
  i.e. approximate the masking curve by a single value for each band
Discard the signal if it is below the masking level
Otherwise quantize using the minimum number of bits that will mask quantization noise

MP3
MPEG Audio, Layer 3
Three layers of audio compression in MPEG-1 (MPEG-2 essentially identical)
From Layer 1 to Layer 3, the encoding process increases in complexity and the data rate for the same quality decreases
  e.g. the same quality at 192 kbps for Layer 1, 128 kbps for Layer 2, 64 kbps for Layer 3
10:1 compression ratio at high quality

AAC
Advanced Audio Coding
Defined in the MPEG-2 standard, extended and incorporated into MPEG-4
Not backward compatible with earlier standards
Higher compression ratios and lower bit rates than MP3
Subjectively better quality than MP3 at the same bit rate

Audio Formats
Platform-specific file formats
  AIFF (Mac), WAV (Windows), AU (Unix)
Multimedia formats used as 'container formats' for sound compressed with different codecs
  QuickTime, Windows Media, RealAudio
MP3 has its own file format, but MP3 data can be included as audio tracks in QuickTime movies and SWFs

MIDI
Musical Instrument Digital Interface
Instructions about how to produce music, which can be interpreted by suitable hardware and/or software
  cf. vector graphics as drawing instructions
Standard protocol for communicating between electronic instruments (synthesizers, samplers, drum machines)
Allows instruments to be controlled by hardware or software sequencers

MIDI and Computers
A MIDI interface allows a computer to send MIDI data to instruments
Store MIDI sequences in files, exchange them between computers, incorporate them into multimedia
A computer can synthesize sounds on a sound card, or play back samples from disk in response to MIDI instructions
  The computer becomes a primitive musical instrument (quality of sound inferior to dedicated instruments)

MIDI Messages
Instructions that control some aspect of the performance of an instrument
Status byte: indicates the type of message
2 data bytes: values of parameters
  e.g. Note On + note number (0..127) + key velocity
Running status: omit the status byte if it is the same as the preceding one
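A Note On message and the running-status shortcut can be sketched as raw bytes. The status value 0x90 combined with a channel number 0..15 follows the MIDI 1.0 channel message layout; the helper names and the note and velocity values are illustrative assumptions.

```python
NOTE_ON = 0x90  # Note On status; the low 4 bits carry the channel (0..15)

def note_on(channel, note, velocity):
    """Build a 3-byte Note On message: status byte plus 2 data bytes."""
    if not 0 <= channel <= 15:
        raise ValueError("channel must be 0..15")
    if not (0 <= note <= 127 and 0 <= velocity <= 127):
        raise ValueError("data bytes must be 0..127")
    return bytes([NOTE_ON | channel, note, velocity])

def with_running_status(messages):
    """Concatenate messages, omitting a status byte that repeats the previous one."""
    stream = bytearray()
    last_status = None
    for message in messages:
        if message[0] == last_status:
            stream.extend(message[1:])   # data bytes only
        else:
            stream.extend(message)
            last_status = message[0]
    return bytes(stream)

first = note_on(0, 60, 100)    # hypothetical note and velocity values
second = note_on(0, 64, 100)
print(first.hex(), second.hex())
print(with_running_status([first, second]).hex())
```

Two Note On messages on the same channel take 5 bytes with running status instead of 6.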

Common Audio File Types
.wav
  Originally created by: IBM, Microsoft
  File info & compression: compressed, uncompressed
  Platforms: Windows
.mp3
  Acronym for: MPEG audio layer 3
  Originally created by: Moving Pictures Experts Group
  File info & compression: good compression rate with perceivably high quality sound
  Platforms: cross-platform
.mov
  Acronym for: QuickTime movie
  Originally created by: Apple
  File info & compression: not just for video; supports an audio track and a MIDI track; a variety of sound compressors; files can be streamed; "Fast Start" technology
  Platforms: cross-platform; requires QuickTime player

Common Audio File Types
.aiff
  Acronym for: Audio Interchange File Format
  Originally created by: Apple
  File info & compression: compressed, uncompressed
  Platforms: Mac, Windows
.au, .snd
  Originally created by: Sun
  File info & compression: compressed
  Platforms: Sun, Unix, Linux
.ra, .rm
  Acronym for: Real Audio
  Originally created by: Real Systems
  File info & compression: compressed; can be streamed with Real Server
  Platforms: cross-platform; requires Real player
.wma
  Acronym for: Windows Media Audio
  Originally created by: Microsoft

Choosing an Audio File Type
Determined by the intended use:
  File size limitation
  Intended audience
  Whether as a source file

File Size Limitations
Is your audio used on the Web?
  file types that offer high compression
  streaming audio file types

Intended Audience
What equipment will your audience use to listen to your audio?
If they are listening on computers, what are their operating systems?
  cross-platform vs. single platform

Whether as a Source File
If you are keeping the file for future editing, choose a file type that:
  is uncompressed, or
  allows lossless compression

References
1. Digital Multimedia, Nigel Chapman and Jenny Chapman.
2. Digital Media Primer, Yue-Ling Wong.
