ECE5526 HW#1 By Clay McCreary
Problem 1
See the following slides for the plots.
Words from lecture slides:
Pg 18: Messages
Pg 27: Test shot
Pg 31: Fisherman
Pg 36: Numerals
Pg 40: Summit
Problem 2
Problem 2 Example
Problem #3
Used the Pitch.m code to integrate pitch information into specgram_nist.m.
Included pitch information for every phoneme.
Pitch.m does not seem to perform well if there is a period of silence during the phoneme sample period (i.e., an unvoiced plosive).
Improving Pitch.m is beyond the scope of this project, but it will need to be improved.
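One possible way to handle the silence problem noted above is to gate the pitch estimate on frame energy and on the strength of the autocorrelation peak, so that silent or unvoiced frames report no pitch instead of a spurious value. The following is a minimal Python sketch of that idea, not the Pitch.m algorithm. The function names, the search range (fmin, fmax), and both thresholds are assumptions chosen for illustration.

```python
import math


def frame_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def estimate_pitch(frame, fs, fmin=60.0, fmax=400.0,
                   energy_floor=1e-4, voicing_threshold=0.3):
    """Autocorrelation pitch estimate in Hz, or None for silent/unvoiced frames.

    energy_floor and voicing_threshold are illustrative values (assumptions),
    not tuned settings.
    """
    # Gate 1: skip frames that are essentially silence.
    if frame_energy(frame) < energy_floor:
        return None

    # Remove the mean so a DC offset does not dominate the autocorrelation.
    mean = sum(frame) / len(frame)
    x = [v - mean for v in frame]
    r0 = sum(v * v for v in x)
    if r0 == 0.0:
        return None

    # Search only lags that correspond to the allowed pitch range.
    min_lag = int(fs / fmax)
    max_lag = min(int(fs / fmin), len(x) - 1)

    best_lag, best_r = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Autocorrelation at this lag, normalized by the zero-lag value.
        r = sum(x[i] * x[i + lag] for i in range(len(x) - lag)) / r0
        if r > best_r:
            best_lag, best_r = lag, r

    # Gate 2: a weak peak suggests the frame is not periodic (unvoiced).
    if best_lag == 0 or best_r < voicing_threshold:
        return None
    return fs / best_lag


if __name__ == "__main__":
    fs = 8000
    tone = [0.5 * math.sin(2 * math.pi * 200 * n / fs) for n in range(320)]
    print(estimate_pitch(tone, fs))         # a 200 Hz test tone
    print(estimate_pitch([0.0] * 320, fs))  # silence
```

Returning None for gated frames lets the plotting code leave a gap in the pitch track rather than draw an unreliable value.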
Problem #3 MATLAB Code
Problem #3 Example
Problem #3 Hypothesis
From “Prosodic Modeling for Improved Speech Recognition and Understanding” by Wang, there are three elements to prosodic modeling:
Pitch
Duration
Energy
Problem #3 Hypothesis cont.
The pitch information incorporated into specgram_nist.m will allow for some prosodic context in the decision-making algorithm. However, pitch by itself is susceptible to errors and is considered noisy and unreliable. To enhance the decision-making capability of the WUW recognizer, duration and energy should also be considered.
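As a rough illustration of how energy and duration could be measured alongside pitch, the Python sketch below computes a short-time log-energy contour and derives a duration from the frames that exceed an energy threshold. It is not part of specgram_nist.m. The function names, the frame and hop sizes, and the -40 dB threshold are assumptions.

```python
import math


def short_time_energy_db(samples, frame_len, hop):
    """Log energy (dB) of each full frame; a small floor avoids log10(0)."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        energies.append(10.0 * math.log10(max(power, 1e-12)))
    return energies


def active_duration_s(energies_db, hop, fs, threshold_db=-40.0):
    """Total duration (seconds) of frames whose energy exceeds threshold_db.

    threshold_db is an illustrative value (assumption), not a tuned setting.
    """
    active = sum(1 for e in energies_db if e > threshold_db)
    return active * hop / fs


if __name__ == "__main__":
    fs = 8000
    silence = [0.0] * 1600                      # 0.2 s of silence
    tone = [0.5 * math.sin(2 * math.pi * 200 * n / fs) for n in range(4800)]
    signal = silence + tone + silence           # 0.6 s tone between pauses
    energy = short_time_energy_db(signal, frame_len=160, hop=160)
    print(len(energy), active_duration_s(energy, hop=160, fs=fs))
```

Counting active frames is the simplest possible duration measure; a practical detector would likely also smooth the contour and require a minimum run length.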
Problem #3 Hypothesis cont.
From my experience, I suggest that a typical WUW would have the following characteristics:
Preceded by a short pause
First syllable at a higher pitch
Short duration
High energy
I hypothesize that incorporating these factors into the WUW recognizer would improve its performance.
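The four hypothesized characteristics could be combined into a simple rule-based check, sketched below in Python. This is only one way to express the hypothesis, not a tested recognizer component. The ProsodicFeatures fields, the wuw_prosody_score function, and every threshold value are hypothetical, and the features are assumed to have been measured beforehand.

```python
from dataclasses import dataclass


@dataclass
class ProsodicFeatures:
    """Prosodic measurements for one candidate word (hypothetical layout)."""
    preceding_pause_s: float         # silence before the word, in seconds
    first_syllable_pitch_hz: float   # mean pitch of the first syllable
    mean_pitch_hz: float             # mean pitch over the whole word
    duration_s: float                # word duration, in seconds
    mean_energy_db: float            # mean log energy over the word


def wuw_prosody_score(f, min_pause_s=0.2, max_duration_s=0.8,
                      min_energy_db=-20.0):
    """Fraction (0.0 to 1.0) of the four hypothesized cues that are present.

    All thresholds are placeholders (assumptions), not measured values.
    """
    cues = [
        f.preceding_pause_s >= min_pause_s,              # preceded by a pause
        f.first_syllable_pitch_hz > f.mean_pitch_hz,     # higher first syllable
        f.duration_s <= max_duration_s,                  # short duration
        f.mean_energy_db >= min_energy_db,               # high energy
    ]
    return sum(cues) / len(cues)


if __name__ == "__main__":
    candidate = ProsodicFeatures(0.3, 220.0, 180.0, 0.5, -12.0)
    print(wuw_prosody_score(candidate))
```

A score like this could be used as one extra input to the recognizer's decision rather than as a decision on its own, since each cue is individually unreliable.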