1-R-43 Neutral-to-Emotional Voice Conversion with Latent Representations of F0 using Generative Adversarial Networks


Zhaojie Luo, Tetsuya Takiguchi, and Yasuo Ariki (Kobe University)

Canonical Correlation Analysis

Overview

Background
Emotional voice conversion changes a neutral voice into a sad, happy, or angry one while keeping the linguistic information unchanged. Application: emotional robot.

Problems
1. The representation of fundamental frequency (F0) is too simple for emotion conversion.
2. The emotional voice data is insufficient.

Goal
1. Apply the continuous wavelet transform (CWT) and cross wavelet transform (XWT) to systematically capture F0 features at different temporal scales.
2. Use the VAE-GAN to train the MCC and AS-CWT features.

Framework
Loss: $L = L_{GAN} + L^{D_l}_{like} + L_{prior}$
[Figure: training model. Blocks: E, G, D. Signals: x, h, x', y. Labels: input, real data, output.]

Dataset
[Samples: not preserved in the transcript.]

Results
Table 1. F0-RMSE results for different emotions. N2A, N2S, and N2H denote the neutral-to-angry, neutral-to-sad, and neutral-to-happy datasets, respectively.
[Figure: MOS evaluation of emotional voice conversion.]
[Table and figure values not preserved. Surviving labels: Source, LG, NN, VAE, GAN, VA-GAN; N2A, N2S, N2H.]
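The first goal, decomposing F0 into several temporal scales with the CWT, can be illustrated with a small sketch. Everything specific here is an assumption for illustration and is not stated on the poster: the Mexican hat mother wavelet, the dyadic scale spacing, the normalisation of log-F0, and the helper names `mexican_hat` and `cwt_f0`. It also assumes the F0 contour has already been interpolated over unvoiced frames.

```python
import numpy as np


def mexican_hat(t):
    """Mexican hat (Ricker) wavelet shape; the constant factor is omitted."""
    return (1.0 - t ** 2) * np.exp(-0.5 * t ** 2)


def cwt_f0(logf0, scales):
    """Hypothetical helper: CWT of a continuous log-F0 contour.

    Returns an array of shape (len(scales), len(logf0)), one row per scale.
    """
    x = np.asarray(logf0, dtype=float)
    # Assumption: zero-mean, unit-variance normalisation before the transform.
    x = (x - x.mean()) / (x.std() + 1e-8)
    n = len(x)
    out = np.empty((len(scales), n))
    for i, s in enumerate(scales):
        half = int(np.ceil(5 * s))           # kernel support of about +/- 5 scales
        t = np.arange(-half, half + 1) / s   # time axis in units of the scale
        kernel = mexican_hat(t) / np.sqrt(s)
        full = np.convolve(x, kernel, mode="full")
        out[i] = full[half:half + n]         # centre crop back to n frames
    return out


# Synthetic contour: a slow phrase-level movement plus a fast local movement.
frames = np.arange(200)
f0_hz = 120 + 20 * np.sin(2 * np.pi * frames / 100) + 5 * np.sin(2 * np.pi * frames / 10)
scales = 2.0 ** np.arange(1, 6)              # 2, 4, 8, 16, 32 frames
coeffs = cwt_f0(np.log(f0_hz), scales)
print(coeffs.shape)
```

Each row of `coeffs` responds to pitch movement at one temporal scale, so small scales follow fast local changes and large scales follow slow phrase-level changes.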
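Table 1 reports F0-RMSE, so a minimal sketch of how such a score could be computed may help. The poster does not describe the exact protocol; this sketch assumes the converted and target contours are already time-aligned, are given in Hz, and mark unvoiced frames with 0. The function name `f0_rmse` is hypothetical.

```python
import numpy as np


def f0_rmse(f0_converted, f0_target):
    """Root-mean-square error between two aligned F0 contours (Hz).

    Assumption: frames where either contour is 0 are unvoiced and skipped.
    """
    a = np.asarray(f0_converted, dtype=float)
    b = np.asarray(f0_target, dtype=float)
    if a.shape != b.shape:
        raise ValueError("contours must be time-aligned and equally long")
    voiced = (a > 0) & (b > 0)
    if not voiced.any():
        raise ValueError("no frame is voiced in both contours")
    return float(np.sqrt(np.mean((a[voiced] - b[voiced]) ** 2)))


# The middle frame is unvoiced in the converted contour, so it is ignored.
score = f0_rmse([100.0, 0.0, 200.0], [103.0, 150.0, 196.0])
print(round(score, 4))
```

A lower score means the converted pitch contour lies closer to the target emotional contour.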

