Inter-Class MLLR for Speaker Adaptation
Presenter: 陳彥達
Reference
  Sam-Joo Doh and Richard M. Stern, “Inter-Class MLLR for Speaker Adaptation”, ICASSP 2000.
Outline
  Introduction
  Concept of inter-class MLLR
  Inter-class function training
  Experiments
Introduction
  Why do we use MLLR?
    Simple idea
    Few unknown parameters
    Sparse adaptation data
    Rapid adaptation
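Introduction(2)
  $\hat{\mu}_i = A\mu_i + b$, $i \in$ all Gaussians
  $\hat{\mu}_i = A_1\mu_i + b_1$, $i \in$ Class 1
  $\hat{\mu}_i = A_2\mu_i + b_2$, $i \in$ Class 2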
Introduction(3)
  Single-class MLLR
    All parameters use the same transformation in adaptation
    More reliable for small amounts of adaptation data
  Multi-class MLLR
    Parameters in different classes use different transformations in adaptation
    More reliable for large amounts of adaptation data
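The difference between the two variants can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the paper: the helper name `adapt_means`, the class labels, and all numeric values are assumptions chosen for the example. Each Gaussian mean is mapped by the affine transform (A, b) of the regression class it belongs to.

```python
import numpy as np

def adapt_means(means, class_of, transforms):
    """Apply mu_hat_i = A_c @ mu_i + b_c, where c is the regression class of Gaussian i."""
    adapted = np.empty_like(means)
    for i, mu in enumerate(means):
        A, b = transforms[class_of[i]]
        adapted[i] = A @ mu + b
    return adapted

# Four 2-dimensional Gaussian means (toy values, assumed for illustration).
means = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

# Single-class MLLR: every Gaussian shares one (A, b).
single = {0: (np.eye(2), np.array([0.5, -0.5]))}
single_out = adapt_means(means, [0, 0, 0, 0], single)

# Multi-class MLLR: Gaussians 0-1 are in class 1, Gaussians 2-3 are in class 2.
multi = {
    1: (np.eye(2), np.array([0.5, -0.5])),
    2: (2.0 * np.eye(2), np.array([0.0, 1.0])),
}
multi_out = adapt_means(means, [1, 1, 2, 2], multi)
```

With a single class only one (A, b) has to be estimated, which is why it is the safer choice when adaptation data are scarce; each additional class adds another (A, b) that needs its own data.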
Introduction(4)
  Shortcomings of conventional MLLR
    The number of classes must be chosen according to the amount of adaptation data.
    Classes are independent in multi-class MLLR, so parameters in classes that receive no adaptation data may not be adapted.
  Main idea of inter-class MLLR
    Use the correlation between different classes to compensate for the shortcomings mentioned above.
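Concept of inter-class MLLR
  Inter-class function
    The relation between different classes
    If $(T_{12}, d_{12})$ is the inter-class function between class 1 and class 2:
      $\hat{\mu}_i = A_1\mu_i + b_1$, $i \in$ class 1
      $\hat{\mu}_i = A_1(T_{12}\mu_i + d_{12}) + b_1$, $i \in$ class 2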
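As a short worked step, assume the inter-class function between class 1 and class 2 is an affine map $\mu \mapsto T_{12}\mu + d_{12}$ (the symbol names $T_{12}$ and $d_{12}$ are assumptions used here for illustration). Expanding the class 2 relation shows that the target-class transform $(A_1, b_1)$ then also determines an effective transform for class 2:

$$
\hat{\mu}_i = A_1(T_{12}\mu_i + d_{12}) + b_1 = (A_1 T_{12})\,\mu_i + (A_1 d_{12} + b_1), \quad i \in \text{class 2}
$$

Under this assumption the equivalent class 2 transform is $A_2 = A_1 T_{12}$ and $b_2 = A_1 d_{12} + b_1$, so data observed in class 2 constrain $(A_1, b_1)$ once $(T_{12}, d_{12})$ is known.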
Concept of inter-class MLLR(2)
  Model setup steps
    Define multiple classes (say, classes 1 to n)
    Find the inter-class functions between each pair of classes
  Adaptation steps
    Choose a target class to be adapted (say, class k)
    Rank the other classes according to their “closeness” to the target class
Concept of inter-class MLLR(3)
  If the incoming adaptation data belong to the target class ($i \in$ target class k), use conventional MLLR to find $(A_k, b_k)$
  If the incoming adaptation data belong to another class ($i \notin$ target class k), use the inter-class function to convert them into adaptation data for class k, and then use conventional MLLR to find $(A_k, b_k)$
  Repeat the above steps until all classes are adapted
Concept of inter-class MLLR(4)
  Adaptation data are selected from classes of decreasing proximity to the target class until there are sufficient data to estimate the target function.
  Limit cases
    No neighboring classes used → conventional multi-class MLLR
    All neighboring classes used → conventional single-class MLLR
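The selection rule above can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `select_adaptation_classes`, the numeric closeness scores, and the frame-count threshold `min_frames` as the test for "sufficient data" are all hypothetical, since the slides do not specify how closeness or sufficiency is measured.

```python
def select_adaptation_classes(target, closeness, frames_per_class, min_frames):
    """Return the classes whose data are pooled to estimate the target transform.

    closeness[c] is a hypothetical similarity score between class c and the
    target class (higher means closer). frames_per_class[c] is the amount of
    adaptation data observed for class c.
    """
    selected = [target]
    total = frames_per_class.get(target, 0)
    # Visit neighbors in order of decreasing proximity to the target class.
    neighbors = sorted(
        (c for c in closeness if c != target),
        key=lambda c: closeness[c],
        reverse=True,
    )
    for c in neighbors:
        if total >= min_frames:
            break  # enough data already; stop adding neighbors
        selected.append(c)
        total += frames_per_class.get(c, 0)
    return selected

# Toy example: class 1 is the target; classes 2, 3, 4 are its neighbors.
closeness = {2: 0.9, 3: 0.4, 4: 0.7}
frames = {1: 30, 2: 50, 3: 10, 4: 40}

pooled = select_adaptation_classes(1, closeness, frames, min_frames=100)
no_neighbors = select_adaptation_classes(1, closeness, frames, min_frames=0)
all_neighbors = select_adaptation_classes(1, closeness, frames, min_frames=10**9)
```

The two extreme thresholds reproduce the limit cases: with a threshold of zero only the target class is used, as in multi-class MLLR, and with an unreachable threshold every class is pooled, as in single-class MLLR.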
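Inter-class function training
  $\hat{\mu}_i = A_m\mu_i + b_m$, $i \in$ class m
  $\hat{\mu}_i = A_m(T_{mn}\mu_i + d_{mn}) + b_m$, $i \in$ class n
  Let $\hat{\mu}_i^{(m)} = A_m^{-1}(\hat{\mu}_i - b_m)$
  Then $\hat{\mu}_i^{(m)} = T_{mn}\mu_i + d_{mn}$, $i \in$ class n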
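Inter-class function training(2)
  Assume we have training data from R speakers.
  We use these data to train $(A_{m,s}, b_{m,s})$ for each class and each speaker, i.e. s={1,2,…,R}, m={1,2,…,n}.
  $\hat{\mu}_{i,s} = A_{m,s}(T_{mn}\mu_i + d_{mn}) + b_{m,s}$, $i \in$ class n, for class m and speaker s
  ∴ $A_{m,s}^{-1}(\hat{\mu}_{i,s} - b_{m,s}) = T_{mn}\mu_i + d_{mn}$
Inter-class function training(3)
  Let $\hat{\mu}_{i,s}^{(m)} = A_{m,s}^{-1}(\hat{\mu}_{i,s} - b_{m,s})$
  We use the equation above and the training data of all speakers to obtain $T_{mn}$ and $d_{mn}$ by conventional MLLR.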
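The pooled estimation over speakers can be illustrated with a least-squares sketch. Several things here are assumptions rather than the paper's procedure: the inter-class function is taken to be an affine map (T, d), the speaker-adapted means are assumed to be available directly, and conventional MLLR (a maximum-likelihood estimate weighted by covariances and occupancy counts) is replaced by ordinary least squares, which it reduces to only under identity covariances and equal weights. The function name `estimate_interclass_function` and all numeric values are hypothetical.

```python
import numpy as np

def estimate_interclass_function(mu, mu_hat_by_speaker, A_by_speaker, b_by_speaker):
    """Least-squares fit of (T, d) such that A_s^{-1}(mu_hat_{i,s} - b_s) ~= T mu_i + d.

    mu: (N, D) speaker-independent means of the neighboring class n.
    mu_hat_by_speaker: list of (N, D) adapted means, one array per speaker.
    A_by_speaker, b_by_speaker: target-class (class m) transforms per speaker.
    """
    # Extended means [mu_i, 1] so that the bias d is estimated jointly with T.
    ext = np.hstack([mu, np.ones((mu.shape[0], 1))])
    X_rows, Y_rows = [], []
    for mu_hat, A, b in zip(mu_hat_by_speaker, A_by_speaker, b_by_speaker):
        # Undo the speaker's target-class transform: solve A y = mu_hat - b.
        y = np.linalg.solve(A, (mu_hat - b).T).T
        X_rows.append(ext)
        Y_rows.append(y)
    X = np.vstack(X_rows)
    Y = np.vstack(Y_rows)
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)  # W has shape (D + 1, D)
    return W[:-1].T, W[-1]  # T, d

# Noise-free toy data generated from a known (T, d) and two speakers.
mu = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
T_true = np.array([[1.1, 0.2], [-0.1, 0.9]])
d_true = np.array([0.3, -0.2])
speakers = [
    (np.array([[1.0, 0.1], [0.0, 1.2]]), np.array([0.5, 0.0])),
    (np.array([[0.9, 0.0], [0.2, 1.0]]), np.array([-0.1, 0.4])),
]
mu_hats = [(mu @ T_true.T + d_true) @ A.T + b for A, b in speakers]

T_est, d_est = estimate_interclass_function(
    mu, mu_hats, [A for A, _ in speakers], [b for _, b in speakers]
)
```

Because the toy data are noise-free, the fit recovers the generating (T, d); with real data each speaker contributes a noisy view of the same inter-class relation, and pooling across speakers averages that noise out.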
Experiments Mean-square error from simulated estimates of Gaussian means
Experiments(2)
  Word error rate for different types of MLLR
    25 training speakers
    13 phonetic-based regression classes
    10 testing speakers; 20 sentences per speaker for testing, 5 sentences per speaker for adaptation
    All neighboring classes are used to estimate each target class
    Silence and noise phones are not adapted
Experiments(3)