Published by Elvin George. Modified over 9 years ago.
2
Intro. to Audio Signals
Jyh-Shing Roger Jang (張智星)
http://mirlab.org/jang
MIR Lab, CSIE Dept., National Taiwan Univ., Taiwan
3
What Are Audio Signals?
- Audio signals are…
  - Signals that are audible to humans, such as speech and music
  - The range of fundamental frequencies of audible signals is about 20 ~ 20000 Hz.
    - The range is wider for young people and narrower for the elderly.
Quiz candidate!
4
Voice Generation & Reception
- Steps in voice generation & reception
  - Vibration of the voice source
  - Resonance by surrounding objects
  - Traveling through air (or other media)
  - Reception by membranes and neurons in the inner ear
  - Recognition by the brain
- Instances of voice generation
  - Singing
  - Whistling
  - Guitar
  - Flute
5
Categorization of Audio Signals
- Number of sources
  - Monophonic: example
  - Polyphonic: example
- Waveform
  - Quasi-periodic sound
    - Voiced sound of speech
  - Aperiodic sound
    - Unvoiced sound of speech
- Source types
  - Sounds from animals (bioacoustics)
    - Dog barking, cat meowing, frog croaking, duck quacking
  - Sounds from non-animals
    - Car engines, thunder, musical instruments
6
S/U/V in Speech
- Speech signals can be divided into S, U, V
  - S (silence): no speech activity
  - U (unvoiced): speech activity without vibration of the vocal cords
  - V (voiced): speech activity with vibration of the vocal cords
- How to detect S, U, V?
  - By putting your hand on your throat to feel the vibration
  - By waveform observation
Quiz candidate!
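Waveform observation can be roughly imitated in code. The following is a minimal sketch, not a method from these slides: it assumes that short-time energy separates silence from speech activity and that the zero-crossing rate separates noise-like unvoiced frames from quasi-periodic voiced frames. The test signal, the frame size, and both thresholds are illustrative assumptions.

```matlab
% Hypothetical sketch: label each frame as S, U, or V.
% The thresholds (1e-4 and 0.2) are illustrative assumptions, not tuned values.
fs = 16000;                              % sample rate (Hz)
segLen = 1600;                           % 0.1 s per synthetic segment
t = (0:segLen-1)/fs;
silence  = zeros(1, segLen);             % S: no activity
unvoiced = 0.05*randn(1, segLen);        % U: low-level noise, as a stand-in
voiced   = 0.5*sin(2*pi*150*t);          % V: periodic tone at 150 Hz, as a stand-in
y = [silence, unvoiced, voiced];

frameSize = 400;                         % 25 ms frames, no overlap
frameNum = floor(length(y)/frameSize);
labels = repmat('S', 1, frameNum);
for i = 1:frameNum
    frame = y((i-1)*frameSize+1 : i*frameSize);
    energy = mean(frame.^2);                                 % short-time energy
    zcr = sum(abs(diff(sign(frame))) > 0) / (frameSize-1);   % zero-crossing rate
    if energy < 1e-4
        labels(i) = 'S';                 % almost no energy: silence
    elseif zcr > 0.2
        labels(i) = 'U';                 % many sign changes: noise-like
    else
        labels(i) = 'V';                 % few sign changes: quasi-periodic
    end
end
disp(labels);
```

On real recordings the thresholds would have to be adapted to the microphone and the noise level, so this should be read as an illustration of the idea only.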
7
Tools for General Audio Processing
- Tools for recording and waveform observation
  - Cool Edit
  - GoldWave
  - Audacity
  - MATLAB
- Quiz
  - What is the major difference between the waveforms of speech and whistle?
8
Speech Signal of “Sunday”
- Unvoiced vs. voiced frames
9
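Silence, Unvoiced and Voiced Sounds
- Examples of S, U, V
  - “Six”
  - “資訊系” (Mandarin for "Department of Information Engineering")
(Frame label sequences shown in the figures: suvusvuv, suvsus)
Quiz candidate!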
10
Human Speech Production
11
Source-filter Model for Human Speech Production
- Speech is split into a rapidly varying excitation signal and a slowly varying filter.
- The envelope of the power spectrum contains the vocal tract information.
- Two important characteristics of the model are the fundamental frequency (f0) and the formants (F1, F2, F3, …).
(Figure labels: unvoiced, voiced)
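The source-filter idea can be sketched in a few lines of MATLAB. This is a minimal illustration under stated assumptions, not the model used in the slides: the excitation is an impulse train at f0, and the vocal tract is approximated by two cascaded two-pole resonators. The formant frequencies (700 Hz and 1200 Hz) and the 100 Hz bandwidth are assumed values chosen only for illustration.

```matlab
% Hypothetical sketch of source-filter synthesis for a voiced sound.
fs = 16000;                      % sample rate (Hz)
f0 = 100;                        % fundamental frequency of the source (Hz)
duration = 0.5;                  % seconds
n = round(duration*fs);
period = round(fs/f0);           % samples per glottal cycle

% Source: rapidly varying excitation (one impulse per cycle)
excitation = zeros(1, n);
excitation(1:period:end) = 1;

% Filter: slowly varying vocal tract, here fixed resonances (assumed values)
formants = [700 1200];           % assumed F1 and F2 in Hz
bandwidth = 100;                 % assumed bandwidth in Hz
y = excitation;
for F = formants
    r = exp(-pi*bandwidth/fs);               % pole radius from the bandwidth
    theta = 2*pi*F/fs;                       % pole angle from the formant frequency
    a = [1, -2*r*cos(theta), r^2];           % two-pole resonator denominator
    y = filter(1, a, y);                     % apply the resonance
end
y = 0.8*y/max(abs(y));           % normalize the peak amplitude to 0.8
% sound(y, fs);                  % uncomment to listen
```

Changing f0 alters the perceived pitch while leaving the resonances in place, and changing the entries of `formants` alters the vowel-like quality while leaving the pitch in place, which is the separation the model describes.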
12
The Vocal Tract
13
Glottal Volume Velocity & Resulting Sound Pressure (Voiced)
14
Speech Production
(Figure: glottal pulses pass through the vocal tract to produce the speech signal. In the spectral view, (a) source spectrum + (b) filter function = (c) output energy spectrum.)
15
Videos for Vocal Cords Movement
- Movement of vocal cords
  - http://www.youtube.com/watch?v=mJedwz_r2Pc
  - http://www.youtube.com/watch?v=v9Wdf-RwLcs
16
Parameters for Audio Files
- Three major parameters for recording audio files
  - Sample rate: no. of samples per sec
    - 8 kHz (phone quality)
    - 16 kHz (for common speech recognition)
    - 44.1 kHz (CD quality)
  - Bit resolution: no. of bits for representing a sample
    - 8-bit (uint8 with range: 0~255)
    - 16-bit (int16 with range: -32768~32767)
  - No. of channels
    - Mono
    - Stereo
Quiz candidate!
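The sample ranges quoted for the two bit resolutions can be checked directly in MATLAB. The sketch below also converts a few floating-point samples in [-1, 1] to 16-bit integers; scaling by 32767 is one common convention and is an assumption here, not something stated in the slides.

```matlab
% Ranges of the two integer sample formats
range8  = [intmin('uint8'), intmax('uint8')];    % 8-bit unsigned samples
range16 = [intmin('int16'), intmax('int16')];    % 16-bit signed samples
disp(range8);
disp(range16);

% Hypothetical conversion of floating-point samples to 16-bit integers.
% The scale factor 32767 is an assumed convention.
x = [-1 -0.5 0 0.5 1];
x16 = int16(round(x*32767));
disp(x16);
```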
17
Storage for Audio Files
- Examples of storage requirements
  - 1 min. of recording with fs=16000, nbits=16, #channels=1:
    60 (sec) * 16 (kHz) * 2 (bytes) * 1 (channel) = 1920 KB = 1.92 MB
  - 3 min. of CD music with fs=44.1 kHz, nbits=16, #channels=2:
    180 (sec) * 44.1 (kHz) * 2 (bytes) * 2 (channels) = 31752 KB ≈ 32 MB
Quiz candidate!
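The two storage figures can be reproduced with a short calculation. This is a minimal sketch; the helper name `storageBytes` is hypothetical, and it counts raw sample data only (no file header) with 1 KB taken as 1000 bytes, matching the arithmetic on the slide.

```matlab
% Raw storage in bytes = duration * sample rate * bytes per sample * channels
% (storageBytes is a hypothetical helper name used for illustration)
storageBytes = @(duration, fs, nbits, nChannels) duration*fs*(nbits/8)*nChannels;

speechBytes = storageBytes(60, 16000, 16, 1);    % 1 min. of mono speech
cdBytes     = storageBytes(180, 44100, 16, 2);   % 3 min. of stereo CD music

fprintf('Speech: %d bytes = %.2f MB\n', speechBytes, speechBytes/1e6);
fprintf('CD:     %d bytes = %.2f MB\n', cdBytes, cdBytes/1e6);
```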
18
Other Interesting Phenomena
- Interesting phenomena about audio signals
  - Don’t trust what you have heard! (Vision rules)
  - Perceived speech is highly context dependent:
19
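Hints for Exercises
- How to generate a sine wave signal:
  - Math formula (as implemented by the code below): $y(t) = 0.8\sin(2\pi f t)$
  - MATLAB code:

    duration = 3;                      % length of the signal in seconds
    f = 440;                           % frequency of the sine wave in Hz
    fs = 16000;                        % sample rate in Hz
    time = (0:duration*fs-1)/fs;       % time stamp of each sample in seconds
    y = 0.8*sin(2*pi*f*time);          % sine wave with amplitude 0.8
    plot(time, y);                     % show the waveform
    sound(y, fs);                      % play the signal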