1 Analysis of Parameter Importance in Speaker Identity Ricardo de Córdoba, Juana M. Gutiérrez-Arriola Speech Technology Group Departamento de Ingeniería Electrónica Universidad Politécnica de Madrid
2 Index
  – Introduction
  – System description
  – Parameter extraction
  – Voice conversion and synthesis
  – Parameter analysis
  – Application to a voice quality task
  – Results
  – Conclusions
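3 Introduction
[Block diagram: the source speaker voice and the target speaker voice are analyzed into parameters; the transformation functions are computed from those parameters; voice conversion applies the transformation functions to the source parameters; the converted parameters are synthesized into target speaker speech.]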
5 System description
7 Parameter Extraction I
We used a 39-parameter synthesizer:
  – F0
  – Glottal source: FLUTTER, KOPEN, RET, SKEW, VELO, Eo, Ee
  – AV, ASP, ATURB, AF
  – F1, F2, F3, F4, F5, F6
  – B1, B2, B3, B4, B5
  – FNZ, FNP, BNZ, BNP, B2P, B3P, B4P, B5P, B6P
  – A2, A3, A4, A5, A6
  – AB, GAIN
8 Parameter Extraction I Glottal parameters:
9 Parameter extraction II
10 Parameter Extraction III
11 Parameter Extraction IV
We calculate F0, AV, AF, and the formant frequencies and bandwidths
Pitch marks and formants are manually revised
Only voiced sounds are transformed
13 Voice conversion I
Linear transformation functions
For each pair of source-target units, we compute the transformation coefficients, which are stored in a file
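The slide's equation did not survive extraction, so the following is only a minimal sketch of what a linear transformation function could look like, not the authors' formula. It assumes one scalar mapping per parameter, target ≈ a · source + b, fitted by ordinary least squares over frames that are already time-aligned between the source and target units. The function names, the F1 values, and the JSON storage format are all hypothetical.

```python
import json

def fit_linear_transform(source, target):
    """Least-squares fit of target ≈ a * source + b for one parameter track."""
    n = len(source)
    if n != len(target) or n < 2:
        raise ValueError("need two aligned tracks with at least 2 frames")
    mean_s = sum(source) / n
    mean_t = sum(target) / n
    var_s = sum((s - mean_s) ** 2 for s in source)
    if var_s == 0.0:
        # Constant source track: fall back to a pure offset (assumption).
        return 1.0, mean_t - mean_s
    cov = sum((s - mean_s) * (t - mean_t) for s, t in zip(source, target))
    a = cov / var_s
    b = mean_t - a * mean_s
    return a, b

def apply_linear_transform(source, coeffs):
    """Map a source parameter track with previously fitted coefficients."""
    a, b = coeffs
    return [a * s + b for s in source]

# Hypothetical, already aligned F1 tracks (Hz) for one source-target unit pair.
source_f1 = [500.0, 520.0, 540.0, 560.0]
target_f1 = [600.0, 622.0, 644.0, 666.0]

coeffs = fit_linear_transform(source_f1, target_f1)
# Coefficients could be kept in a file, e.g. as JSON (format is an assumption).
serialized = json.dumps({"F1": coeffs})
converted_f1 = apply_linear_transform(source_f1, coeffs)
```

In a setup like this, one `(a, b)` pair would be stored per parameter and per unit pair, and conversion would only need to read the coefficients back and apply them to the source tracks.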
14 Synthesis
Formant synthesizer (Klatt)
Concatenation of parameterized units
Prosodic modification by changing the glottal pulse length and the number of glottal pulses
Formant smoothing during unit transitions
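The slides do not say how formant smoothing at unit transitions is done. As an illustration only, one simple option is to interpolate the formant track linearly over a few frames on each side of the joint. The function name `smooth_joint`, the `span` parameter, and the formant values below are hypothetical.

```python
def smooth_joint(left, right, span=2):
    """Concatenate two formant tracks and linearly interpolate the
    `span` frames on each side of the joint (illustrative assumption)."""
    if span < 1 or len(left) < span or len(right) < span:
        raise ValueError("span must be >= 1 and fit inside both units")
    track = list(left) + list(right)
    start = len(left) - span      # first frame of the smoothed region
    end = len(left) + span - 1    # last frame of the smoothed region
    lo, hi = track[start], track[end]
    for i in range(start, end + 1):
        w = (i - start) / (end - start)
        track[i] = (1.0 - w) * lo + w * hi
    return track

# Hypothetical F2 tracks (Hz) of two consecutive units with a jump at the joint.
unit_a = [1500.0, 1500.0, 1500.0, 1500.0]
unit_b = [1800.0, 1800.0, 1800.0, 1800.0]
smoothed = smooth_joint(unit_a, unit_b, span=2)
```

The frames outside the smoothed region keep their original values, so only the transition itself is altered.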
16 Parameter Analysis I
11 speakers (5 female, 6 male)
EUROM1 database in Castilian Spanish
Sentence: “Mi abuelo me animó a estudiar solfeo” (My grandfather encouraged me to study solfa)
Sampling frequency: Fs = 16 kHz
17 Parameter Analysis II
18 Parameter Analysis III
We want to know which parameters are actually relevant for speaker identity
Discriminant functions are linear combinations of variables that best discriminate classes
  – They can be used to rank the variables in terms of their relative contribution to class discrimination
Linear discriminant analysis (LDA) is performed:
  – For each phoneme of the sentence (it does not work well for the whole sentence)
  – Coefficients of the first discriminant function are used to rank the parameters
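The ranking step can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' procedure: the parameters are standardized before the analysis so that coefficient magnitudes are comparable, the first discriminant function is taken as the leading eigenvector of Sw⁻¹Sb, and the frames are synthetic. The function names and the three-parameter toy data are hypothetical.

```python
import numpy as np

def first_discriminant(X, y):
    """Return the coefficients of the first LDA discriminant function."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Assumption: standardize each parameter so that coefficient magnitudes
    # are comparable across parameters with different units.
    # (A parameter with zero variance would need to be dropped first.)
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    n_params = Xs.shape[1]
    grand_mean = Xs.mean(axis=0)
    Sw = np.zeros((n_params, n_params))  # within-class scatter
    Sb = np.zeros((n_params, n_params))  # between-class scatter
    for label in np.unique(y):
        Xc = Xs[y == label]
        mean_c = Xc.mean(axis=0)
        centered = Xc - mean_c
        Sw += centered.T @ centered
        diff = (mean_c - grand_mean)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    # Discriminant directions are eigenvectors of Sw^-1 Sb; the first
    # function is the one with the largest eigenvalue.
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    return eigvecs[:, np.argmax(eigvals.real)].real

def rank_parameters(names, coeffs):
    """Order parameter names by decreasing absolute coefficient."""
    order = np.argsort(-np.abs(coeffs))
    return [names[i] for i in order]

# Synthetic frames of one phoneme for three hypothetical speakers:
# only F0 differs between speakers; F1 and AV share one distribution.
rng = np.random.default_rng(0)
names = ["F0", "F1", "AV"]
frames, labels = [], []
for speaker, f0_mean in enumerate([100.0, 150.0, 200.0]):
    f0 = rng.normal(f0_mean, 5.0, size=40)
    f1 = rng.normal(500.0, 50.0, size=40)
    av = rng.normal(60.0, 3.0, size=40)
    frames.append(np.column_stack([f0, f1, av]))
    labels.extend([speaker] * 40)

X = np.vstack(frames)
coeffs = first_discriminant(X, labels)
ranking = rank_parameters(names, coeffs)
```

Because only F0 separates the synthetic speakers, F0 ends up at the top of `ranking`. With real data the same two functions would be run once per phoneme, as the slide describes.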
19 Application to a Voice Quality Task
We extracted four sentences from the Brian VOQUAL'03 database, one per voice quality: normal, clear, creaky, and relax
We analyzed two phonemes (E and A) of the sentence: “She has left for a great party today”
We wanted to rank parameter importance for discriminating between the four classes:
  – We used the coefficients of the first discriminant function
21 Results I Voice Quality Task
Frame classification for E and A using LDA for the first two discriminant functions
[Figure: two panels (E and A); classes: normal, creaky, clear, relax]
22 Results II Voice Quality Task
Absolute values of the coefficients that multiply each parameter in the first discriminant function of each phoneme
[Figure: two panels (E and A); axis: first function coefficients]
23 Results III Speaker Identity
Number of times each parameter has been the most relevant (top) and the least relevant (bottom) in the first discriminant function
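The counts described on this slide can be obtained by tallying, over all analyzed phonemes, which parameter comes first and which comes last in each per-phoneme ranking. The sketch below uses placeholder rankings and phoneme labels that are purely illustrative and are not the reported results. The function name `tally_extremes` is hypothetical.

```python
from collections import Counter

def tally_extremes(rankings):
    """Count how often each parameter is ranked first and last.

    `rankings` maps a phoneme label to a list of parameter names ordered
    from most to least relevant in the first discriminant function.
    """
    most = Counter(r[0] for r in rankings.values() if r)
    least = Counter(r[-1] for r in rankings.values() if r)
    return most, least

# Placeholder rankings for three phonemes (illustrative only, not real data).
rankings = {
    "phoneme_1": ["F0", "F1", "AV"],
    "phoneme_2": ["F1", "F0", "AV"],
    "phoneme_3": ["F0", "AV", "F1"],
}
most_relevant, least_relevant = tally_extremes(rankings)
```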
25 Conclusions
Parameter importance depends on:
  – the type of speech
  – the gender of the speaker
  – the phonemes under study
Results show that F0, the formant frequencies, and the open quotient (OQ) are the most important parameters for speaker classification