1
VAD (Voice Activity Detector)
By Preetham Nosum
Supervised by Dr. Kepuska
2
VAD
Part of Front End Module – Spectrum, MFCC
Detects speech using various features
Overall flag set when individual flags are all ON
3
Tools
Visual C++
- Parameters stored into respective extensions
Matlab
- Graphs plotted from the data
4
VAD Features
Energy Feature
MFCC Feature
Spectrum Feature
MFCC Enhanced Feature
5
Energy Feature
Input used from the previous module: frame energy
Mean frame energy calculated
Compared to current frame energy
Energy flag set if the difference is high
Mean not calculated during the VAD ON stage
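The energy rule on this slide can be sketched roughly as below. This is a minimal illustration, not the presenter's code: the class name `EnergyFeature`, the fixed `threshold`, the incremental-mean update, and seeding the mean from the first frame are all assumptions.

```cpp
#include <cstddef>

// Hypothetical energy feature; names and threshold are illustrative only.
class EnergyFeature {
public:
    explicit EnergyFeature(double threshold) : threshold_(threshold) {}

    // Feed one frame energy plus the overall VAD state for that frame.
    // Returns the energy flag for the frame.
    bool update(double frameEnergy, bool vadOn) {
        if (count_ == 0) {              // assumption: first frame seeds the mean
            mean_ = frameEnergy;
            count_ = 1;
            return false;
        }
        // Flag is set when the frame energy exceeds the mean by the threshold.
        bool flag = (frameEnergy - mean_) > threshold_;
        if (!vadOn) {                   // mean is frozen while the VAD is ON
            ++count_;
            mean_ += (frameEnergy - mean_) / static_cast<double>(count_);
        }
        return flag;
    }

    double mean() const { return mean_; }

private:
    double threshold_;
    double mean_ = 0.0;
    std::size_t count_ = 0;
};
```

Freezing the mean during the VAD ON stage keeps speech frames from raising the background estimate, which would otherwise make later speech harder to detect.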
6
MFCC Feature
MFCC feature calculated from vector
Each frame compared to the overall mean MFCC
Deviation from mean sets the MFCC flag
Mean not calculated during the VAD ON stage
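A possible reading of the MFCC rule is sketched below. The slide does not say how deviation is measured, so the Euclidean distance, the class name `MfccFeature`, and the `threshold` parameter are assumptions made for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical MFCC feature: distance of each frame's MFCC vector
// from a running mean vector. Distance metric is an assumption.
class MfccFeature {
public:
    MfccFeature(std::size_t dim, double threshold)
        : mean_(dim, 0.0), threshold_(threshold) {}

    bool update(const std::vector<double>& mfcc, bool vadOn) {
        if (mfcc.size() != mean_.size()) return false;  // ignore malformed frames
        if (count_ == 0) {                              // first frame seeds the mean
            mean_ = mfcc;
            count_ = 1;
            return false;
        }
        double sq = 0.0;
        for (std::size_t i = 0; i < mfcc.size(); ++i) {
            double d = mfcc[i] - mean_[i];
            sq += d * d;
        }
        bool flag = std::sqrt(sq) > threshold_;         // deviation from the mean
        if (!vadOn) {                                   // mean frozen during VAD ON
            ++count_;
            for (std::size_t i = 0; i < mfcc.size(); ++i)
                mean_[i] += (mfcc[i] - mean_[i]) / static_cast<double>(count_);
        }
        return flag;
    }

private:
    std::vector<double> mean_;
    double threshold_;
    std::size_t count_ = 0;
};
```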
7
Spectrum Feature
Uses variance for detection
How much the signal changed after each frame
Mean compared to variance
Flag is set after a certain threshold is crossed
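The slide leaves the exact variance definition open. One way to read it, sketched here under stated assumptions, is to take the variance of each frame's spectrum across bins and compare it with a running mean of past variances. The names `spectrumVariance` and `SpectrumFeature`, the population-variance formula, and the always-updating mean are hypothetical choices.

```cpp
#include <cstddef>
#include <vector>

// Population variance of one frame's spectrum magnitudes across bins.
double spectrumVariance(const std::vector<double>& spectrum) {
    if (spectrum.empty()) return 0.0;
    const double n = static_cast<double>(spectrum.size());
    double mean = 0.0;
    for (double v : spectrum) mean += v;
    mean /= n;
    double var = 0.0;
    for (double v : spectrum) var += (v - mean) * (v - mean);
    return var / n;
}

// Hypothetical spectrum feature: flag when the current variance exceeds
// the running mean of past variances by more than a threshold.
class SpectrumFeature {
public:
    explicit SpectrumFeature(double threshold) : threshold_(threshold) {}

    bool update(const std::vector<double>& spectrum) {
        double var = spectrumVariance(spectrum);
        if (count_ == 0) {              // first frame seeds the mean variance
            meanVar_ = var;
            count_ = 1;
            return false;
        }
        bool flag = (var - meanVar_) > threshold_;
        ++count_;                       // assumption: mean is always updated
        meanVar_ += (var - meanVar_) / static_cast<double>(count_);
        return flag;
    }

private:
    double threshold_;
    double meanVar_ = 0.0;
    std::size_t count_ = 0;
};
```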
8
MFCC Enhanced Feature
Uses hybrid of Spectrum and MFCC
Two ways to detect: variance mean and variance of variance
Most sensitive of all features
Flag set ON/OFF based on two different conditions
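The two statistics named here, the mean of the variance and the variance of the variance, can both be tracked incrementally. The sketch below uses Welford's running update for that, and adds a two-threshold helper as one guess at what "two different conditions" for ON and OFF might look like. `RunningStats`, `hysteresisFlag`, and both thresholds are hypothetical, and the slide does not confirm this decision rule.

```cpp
#include <cstddef>

// Welford-style running mean and population variance of a scalar stream.
// Feeding per-frame variances gives the "variance mean" and the
// "variance of variance".
class RunningStats {
public:
    void push(double x) {
        ++n_;
        double delta = x - mean_;
        mean_ += delta / static_cast<double>(n_);
        m2_ += delta * (x - mean_);
    }
    double mean() const { return mean_; }
    double variance() const {
        return n_ > 0 ? m2_ / static_cast<double>(n_) : 0.0;
    }

private:
    std::size_t n_ = 0;
    double mean_ = 0.0;
    double m2_ = 0.0;
};

// Assumed ON/OFF rule: turn ON above onThreshold, turn OFF below
// offThreshold, otherwise keep the previous state.
bool hysteresisFlag(bool current, double value,
                    double onThreshold, double offThreshold) {
    if (!current && value > onThreshold) return true;
    if (current && value < offThreshold) return false;
    return current;
}
```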
9
Logic
Overall flag set when all the flags are turned on
Overall flag turned off when any one of the feature flags is turned off
Waits certain frames to make sure it’s speech
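The combination logic above might look like the following sketch. The slide does not give the number of frames to wait, so `confirmFrames` is a hypothetical parameter, as are the class and argument names.

```cpp
// Hypothetical overall VAD decision: ON only after all four feature flags
// have been ON for confirmFrames consecutive frames; OFF as soon as any
// one of them is OFF.
class VadLogic {
public:
    explicit VadLogic(int confirmFrames) : confirmFrames_(confirmFrames) {}

    bool update(bool energy, bool mfcc, bool spectrum, bool mfccEnhanced) {
        bool all = energy && mfcc && spectrum && mfccEnhanced;
        if (!all) {
            run_ = 0;                   // any flag OFF drops the overall flag
            vadOn_ = false;
        } else {
            if (run_ < confirmFrames_) ++run_;   // capped to avoid overflow
            if (run_ >= confirmFrames_) vadOn_ = true;
        }
        return vadOn_;
    }

private:
    int confirmFrames_;
    int run_ = 0;
    bool vadOn_ = false;
};
```

The returned state is what the energy and MFCC features would consult to decide whether to keep updating their means.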
10
Future
Needs more refinement
Test the MFCC and MFCC Enhanced features with different sets of MFCC vector values
Test with more data
11
References
Discrete-Time Speech Signal Processing, Thomas F. Quatieri
12
Questions?