PROJECT PROPOSAL Shamalee Deshpande
Problem Statement
Extracting soft biometric features: age, gender and accent.
Speaker Database
A speaker database from the LDC Corpus Catalog*.
Preferably, use half of the speaker set for training and the other half for verification of results.
The database should contain speakers of varying gender, age and accent.
* Linguistic Data Consortium, http://www.ldc.upenn.edu/Catalog/
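Possible Computation for Gender: Pitch
In cepstrum analysis, the formant (vocal tract) structure is concentrated at low quefrencies and can be separated from the spectrum, which isolates the pitch frequency as a distinct peak.
LPC can also be used to find pitch.
Pitch is used to classify speech with regard to gender: average male pitch = 100-132 Hz, average female pitch = 142-256 Hz.
Cepstrum computation: Speech -> Window -> DFT -> LOG -> IDFT -> Cepstrum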
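The Window -> DFT -> LOG -> IDFT chain on this slide can be sketched in a few lines of NumPy. This is only a minimal illustration, not the proposal's implementation: the function names, the 80-400 Hz search band, the frame length, and the synthetic test signal are all assumptions made for the example. Only the two pitch ranges in `classify_gender` come from the slide.

```python
import numpy as np

def cepstral_pitch(frame, fs, fmin=80.0, fmax=400.0):
    """Estimate the pitch (Hz) of one voiced frame from its real cepstrum.

    fmin/fmax bound the search band; they are assumed values, not from the slides.
    """
    windowed = frame * np.hamming(len(frame))      # Window
    spectrum = np.fft.rfft(windowed)               # DFT
    log_mag = np.log(np.abs(spectrum) + 1e-12)     # LOG (small offset avoids log(0))
    cepstrum = np.fft.irfft(log_mag)               # IDFT -> real cepstrum
    qmin = int(fs / fmax)                          # shortest pitch period, in samples
    qmax = int(fs / fmin)                          # longest pitch period, in samples
    peak = qmin + int(np.argmax(cepstrum[qmin:qmax + 1]))
    return fs / peak                               # quefrency of the peak -> Hz

def classify_gender(f0):
    """Label a pitch value using the average ranges quoted on the slide."""
    if 100.0 <= f0 <= 132.0:
        return "male"
    if 142.0 <= f0 <= 256.0:
        return "female"
    return "undetermined"                          # outside both quoted ranges

def synth_voiced(f0, fs=16000, n=2048):
    """Hypothetical test signal: harmonics of f0 with 1/k amplitudes."""
    t = np.arange(n) / fs
    n_harm = int((fs / 2) / f0) - 1                # keep all harmonics below Nyquist
    return sum(np.cos(2 * np.pi * k * f0 * t) / k for k in range(1, n_harm + 1))

if __name__ == "__main__":
    for f0 in (120.0, 150.0):
        est = cepstral_pitch(synth_voiced(f0), 16000)
        print(f0, round(est, 1), classify_gender(est))
```

Because the peak is picked on an integer quefrency grid, the estimate is quantized to `fs / peak`; real speech would also need a voiced/unvoiced decision before this step, which the sketch leaves out.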
Possible Computation for Accent
People usually have characteristic styles of pronouncing phonemes from an early age, dependent on the primary language learned.
Cepstral coefficients, presumably MFCCs, may again be used to analyze the speech spectrum and to identify local/non-local speakers in a database.
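As a rough illustration of the MFCC features mentioned on this slide, the sketch below computes them with NumPy only. The frame size, hop, number of mel filters, number of coefficients, and all function names are assumptions chosen for the example; the slides do not specify them, and the classifier that would separate local from non-local speakers is left open.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, fs):
    """Triangular filters spaced evenly on the mel scale from 0 Hz to fs/2."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(fs / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / fs).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):              # rising edge of the triangle
            fb[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):             # falling edge of the triangle
            fb[i - 1, k] = (right - k) / (right - center)
    return fb

def mfcc(signal, fs, n_fft=512, hop=160, n_filters=26, n_ceps=13):
    """Return an array of shape (n_frames, n_ceps); all defaults are assumed values."""
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(n_fft)                  # framing + window
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2 / n_fft  # power spectrum
    energies = power @ mel_filterbank(n_filters, n_fft, fs).T # mel band energies
    log_e = np.log(energies + 1e-10)                          # log compression
    n = np.arange(n_filters)
    # Unnormalized DCT-II basis, one row per cepstral coefficient
    dct = np.cos(np.pi * np.arange(n_ceps)[:, None] * (2 * n[None, :] + 1) / (2 * n_filters))
    return log_e @ dct.T

if __name__ == "__main__":
    fs = 16000
    t = np.arange(fs) / fs
    feats = mfcc(np.sin(2 * np.pi * 300 * t), fs)
    print(feats.shape)
```

Per-speaker statistics of these frame-level vectors (for example their mean) could then be compared between local and non-local speakers, but that step depends on design choices the proposal has not yet made.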
Possible Computation for Age
Source-filter view of speech: the glottal excitation acts as a buzzer, characterized by intensity and pitch; the vocal tract acts as a tube, characterized by formants.
Vocal tract length is said to be a good classifier of the age of a speaker.
Formant frequencies derived using LPC correlate with the length of the vocal tract.
Children are said to have a higher formant frequency range than adults.
Elderly speakers, specifically, are said to have lower formant frequencies (F1, F2, F3) than their younger counterparts, most noticeably for F1.
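One way the LPC-to-formant step could look is sketched below: LPC coefficients are obtained by the autocorrelation method (Levinson-Durbin recursion), and formants are read from the angles of the complex roots of the prediction polynomial. The link to vocal tract length uses the textbook uniform-tube approximation (closed at the glottis, open at the lips), F_n = (2n - 1) c / (4 L). That model, the speed of sound of 35000 cm/s, the LPC order rule, and the 90 Hz / 400 Hz pruning thresholds are assumptions for illustration and do not come from the slides.

```python
import numpy as np

def lpc(frame, order):
    """LPC polynomial [1, a1, ..., ap] via autocorrelation and Levinson-Durbin."""
    w = frame * np.hamming(len(frame))
    r = np.correlate(w, w, mode="full")[len(w) - 1:len(w) + order]  # lags 0..order
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for stage i
        k = -(r[i] + np.dot(a[1:i], r[i - 1:0:-1])) / err
        # Update a[1..i]; the last element becomes a[i] = k
        a[1:i + 1] = a[1:i + 1] + k * np.concatenate((a[i - 1:0:-1], [1.0]))
        err *= 1.0 - k * k
    return a

def formants(frame, fs, order=None):
    """Formant candidates (Hz, ascending) from the roots of the LPC polynomial."""
    if order is None:
        order = 2 + int(fs // 1000)                # common rule of thumb (assumed)
    roots = np.roots(lpc(frame, order))
    roots = roots[np.imag(roots) > 0]              # one root per conjugate pair
    freqs = np.angle(roots) * fs / (2 * np.pi)
    bws = -np.log(np.abs(roots)) * fs / np.pi      # 3 dB bandwidth of each pole
    keep = (freqs > 90.0) & (bws < 400.0)          # assumed pruning thresholds
    return np.sort(freqs[keep])

def vocal_tract_length(formant_hz, n, c=35000.0):
    """Tube length in cm from the n-th formant, uniform-tube model (assumed)."""
    return (2 * n - 1) * c / (4.0 * formant_hz)

if __name__ == "__main__":
    # F1 = 500 Hz corresponds to a 17.5 cm tube under this model
    print(vocal_tract_length(500.0, 1))
```

Under this model a shorter tube gives higher formants, which is consistent with the slide's remark that children have a higher formant frequency range than adults. Real vocal tracts are not uniform tubes, so any length computed this way is only a coarse indicator.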