Research on Machine Learning and Deep Learning
Dr. M.W. Mak, Dept. of EIE, HKPolyU

Machine Learning and Deep Learning
Speech Applications: Speaker Recognition, Speech Emotion Recognition
Bioinformatics Applications: Protein Recognition, ECG Recognition
7 CERG/GRF, 1 CRF, 1 ITF-TCS
3 GRF, 1 ITF-TCS
Deep Learning for Domain Adaptation in Speaker Verification

While state-of-the-art speaker verification systems perform very well in the environment (source domain) for which they were trained, their performance suffers when they are deployed in a new target domain, a phenomenon known as domain mismatch. This GRF project aims to develop a new deep neural network (DNN) for domain adaptation so that only a small amount of labeled data from the target domain will be sufficient for training a system to work in the new domain. A joint network comprising a regression DNN and a classification DNN will be co-trained by semi-supervised deep learning algorithms that exploit the labeled data from the source domain and unlabeled data from the target domain.
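The co-training idea can be illustrated with a minimal sketch of a combined objective. The slide does not specify what the regression DNN predicts or how the two losses are weighted, so everything below is an assumption for illustration: the classification branch is scored with cross-entropy on labeled source-domain examples, the regression branch with mean squared error on unlabeled target-domain examples, and the hypothetical parameter `weight` balances the two. The function names are not taken from the project.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Negative log-probability of the correct speaker label."""
    return -math.log(softmax(logits)[label])


def mean_squared_error(predicted, target):
    """Average squared difference between two equal-length vectors."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def joint_loss(source_batch, target_batch, weight=1.0):
    """Combine a supervised and an unsupervised term (assumed form).

    source_batch: list of (speaker_logits, speaker_label) pairs from the
        labeled source domain, scored by the classification branch.
    target_batch: list of (regression_output, regression_target) pairs from
        the unlabeled target domain, scored by the regression branch.
    weight: hypothetical trade-off between the two terms.
    """
    classification = sum(
        cross_entropy(logits, label) for logits, label in source_batch
    ) / len(source_batch)
    regression = sum(
        mean_squared_error(output, target) for output, target in target_batch
    ) / len(target_batch)
    return classification + weight * regression
```

In a real system both branches would share layers and the loss would be minimized by backpropagation; the sketch only shows how labeled source data and unlabeled target data can contribute to a single training objective.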
Deep Variational Learning for Robust Speaker Verification

To incorporate voice biometrics into remote services, it is important to ensure that recognition accuracy can be maintained even when users are speaking in an adverse environment. The prevalent utterance representation (i-vectors) and scoring method (PLDA) rely on linear models to summarize the spectral characteristics of speech and to marginalize out any nuisance variability not related to speaker recognition. In this GRF project, we propose incorporating speaker labels into the learning algorithm of variational autoencoders. We also question the suitability of the linearity assumption of the i-vector representation and introduce a variational utterance representation that can model more complex distributions of acoustic features.
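One way to picture "incorporating speaker labels into the learning algorithm of variational autoencoders" is to add a speaker-classification term to the usual VAE objective. The sketch below is an assumption, not the project's actual formulation: it uses a squared-error reconstruction term (a Gaussian decoder with fixed variance, up to a constant), the standard closed-form KL divergence between a diagonal Gaussian posterior and a standard normal prior, and a hypothetical weight `alpha` on the speaker cross-entropy. All function names are illustrative.

```python
import math


def kl_to_standard_normal(mean, log_var):
    """KL( N(mean, diag(exp(log_var))) || N(0, I) ) in closed form."""
    return 0.5 * sum(
        m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mean, log_var)
    )


def speaker_nll(logits, label):
    """Negative log-softmax probability of the true speaker."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_norm - logits[label]


def supervised_vae_loss(x, x_recon, mean, log_var,
                        speaker_logits, speaker_label, alpha=1.0):
    """Assumed label-aware VAE loss: reconstruction + KL + alpha * speaker NLL.

    x, x_recon: input features and their reconstruction by the decoder.
    mean, log_var: parameters of the approximate posterior over the latent
        utterance representation.
    speaker_logits: scores from a classifier applied to the latent code.
    alpha: hypothetical weight on the speaker-label term.
    """
    reconstruction = sum((a - b) ** 2 for a, b in zip(x, x_recon))
    kl = kl_to_standard_normal(mean, log_var)
    return reconstruction + kl + alpha * speaker_nll(speaker_logits, speaker_label)
```

Because the encoder and decoder are nonlinear networks rather than the linear maps behind i-vectors and PLDA, a latent representation trained this way is not restricted to the linear-Gaussian assumption questioned above.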
Demonstration: Enrollment and Verification