Voice conversion using Artificial Neural Networks

Slides:



Advertisements
Similar presentations
Adaption Adjusting Model’s parameters for a new speaker. Adjusting all parameters need a huge amount of data (impractical). The solution is to cluster.
Advertisements

6/3/20151 Voice Transformation : Speech Morphing Gidon Porat and Yizhar Lavner SIPL – Technion IIT December
Sensor/Actuator Network Calibration Kamin Whitehouse Nest Retreat, June
November 19, 2009Introduction to Cognitive Science Lecture 20: Artificial Neural Networks I 1 Artificial Neural Network (ANN) Paradigms Overview: The Backpropagation.
Slide 1 EE3J2 Data Mining EE3J2 Data Mining Lecture 15: Introduction to Artificial Neural Networks Martin Russell.
Speaker Adaptation for Vowel Classification
Introduction to Neural Networks Simon Durrant Quantitative Methods December 15th.
Language and Speaker Identification using Gaussian Mixture Model Prepare by Jacky Chau The Chinese University of Hong Kong 18th September, 2002.
Optimal Adaptation for Statistical Classifiers Xiao Li.
Face Recognition Using Neural Networks Presented By: Hadis Mohseni Leila Taghavi Atefeh Mirsafian.
Track: Speech Technology Kishore Prahallad Assistant Professor, IIIT-Hyderabad 1Winter School, 2010, IIIT-H.
EE513 Audio Signals and Systems Statistical Pattern Classification Kevin D. Donohue Electrical and Computer Engineering University of Kentucky.
 C. C. Hung, H. Ijaz, E. Jung, and B.-C. Kuo # School of Computing and Software Engineering Southern Polytechnic State University, Marietta, Georgia USA.
HMM-BASED PSEUDO-CLEAN SPEECH SYNTHESIS FOR SPLICE ALGORITHM Jun Du, Yu Hu, Li-Rong Dai, Ren-Hua Wang Wen-Yi Chu Department of Computer Science & Information.
Adaption Def: To adjust model parameters for new speakers. Adjusting all parameters requires too much data and is computationally complex. Solution: Create.
Isolated-Word Speech Recognition Using Hidden Markov Models
Classification of place of articulation in unvoiced stops with spectro-temporal surface modeling V. Karjigi , P. Rao Dept. of Electrical Engineering,
Utterance Verification for Spontaneous Mandarin Speech Keyword Spotting Liu Xin, BinXi Wang Presenter: Kai-Wun Shih No.306, P.O. Box 1001,ZhengZhou,450002,
Artificial Neural Networks (ANN). Output Y is 1 if at least two of the three inputs are equal to 1.
Prepared by: Waleed Mohamed Azmy Under Supervision:
Learning of Word Boundaries in Continuous Speech using Time Delay Neural Networks Colin Tan School of Computing, National University of Singapore.
International Conference on Intelligent and Advanced Systems 2007 Chee-Ming Ting Sh-Hussain Salleh Tian-Swee Tan A. K. Ariff. Jain-De,Lee.
Research & development Component Score Weighting for GMM based Text-Independent Speaker Verification Liang Lu SNLP Unit, France Telecom R&D Beijing
Machine Learning Seminar: Support Vector Regression Presented by: Heng Ji 10/08/03.
Conceptual Foundations © 2008 Pearson Education Australia Lecture slides for this course are based on teaching materials provided/referred by: (1) Statistics.
Evaluation of Speaker Recognition Algorithms. Speaker Recognition Speech Recognition and Speaker Recognition speaker recognition performance is dependent.
Ekapol Chuangsuwanich and James Glass MIT Computer Science and Artificial Intelligence Laboratory,Cambridge, Massachusetts 02139,USA 2012/07/2 汪逸婷.
Regression Approaches to Voice Quality Control Based on One-to-Many Eigenvoice Conversion Kumi Ohta, Yamato Ohtani, Tomoki Toda, Hiroshi Saruwatari, and.
MUMT611: Music Information Acquisition, Preservation, and Retrieval Presentation on Timbre Similarity Alexandre Savard March 2006.
An Evaluation of Many-to-One Voice Conversion Algorithms with Pre-Stored Speaker Data Sets Daisuke Tani, Yamato Ohtani, Tomoki Toda, Hiroshi Saruwatari.
Speech Parameter Generation From HMM Using Dynamic Features Keiichi Tokuda, Takao Kobayashi, Satoshi Imai ICASSP 1995 Reporter: Huang-Wei Chen.
So Far……  Clustering basics, necessity for clustering, Usage in various fields : engineering and industrial fields  Properties : hierarchical, flat,
Algoritmi e Programmazione Avanzata
DR.D.Y.PATIL POLYTECHNIC, AMBI COMPUTER DEPARTMENT TOPIC : VOICE MORPHING.
Language Project.  Neural networks have a large appeal to many researchers due to their great closeness to the structure of the brain, a characteristic.
A Baseline System for Speaker Recognition C. Mokbel, H. Greige, R. Zantout, H. Abi Akl A. Ghaoui, J. Chalhoub, R. Bayeh University Of Balamand - ELISA.
Artificial Intelligence & Neural Network
1 Pattern Recognition: Statistical and Neural Lonnie C. Ludeman Lecture 25 Nov 4, 2005 Nanjing University of Science & Technology.
Intro. ANN & Fuzzy Systems Lecture 23 Clustering (4)
Jun-Won Suh Intelligent Electronic Systems Human and Systems Engineering Department of Electrical and Computer Engineering Speaker Verification System.
Speech Communication Lab, State University of New York at Binghamton Dimensionality Reduction Methods for HMM Phonetic Recognition Hongbing Hu, Stephen.
Voice Activity Detection based on OptimallyWeighted Combination of Multiple Features Yusuke Kida and Tatsuya Kawahara School of Informatics, Kyoto University,
An Artificial Neural Network Approach to Surface Waviness Prediction in Surface Finishing Process by Chi Ngo ECE/ME 539 Class Project.
All Your Voices Are Belong to Us: Stealing Voices to Fool Humans and Machines Dibya Mukhopadhyay, Maliheh Shirvanian, Nitesh Saxena University of Alabama.
Adaption Def: To adjust model parameters for new speakers. Adjusting all parameters requires an impractical amount of data. Solution: Create clusters and.
Speech Lab, ECE, State University of New York at Binghamton  Classification accuracies of neural network (left) and MXL (right) classifiers with various.
SRINIVAS DESAI, B. YEGNANARAYANA, KISHORE PRAHALLAD A Framework for Cross-Lingual Voice Conversion using Artificial Neural Networks 1 International Institute.
Speaker Verification Using Adapted GMM Presented by CWJ 2000/8/16.
Bab 5 Classification: Alternative Techniques Part 4 Artificial Neural Networks Based Classifer.
AN EXPECTATION MAXIMIZATION APPROACH FOR FORMANT TRACKING USING A PARAMETER-FREE NON-LINEAR PREDICTOR Issam Bazzi, Alex Acero, and Li Deng Microsoft Research.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology Advisor : Dr. Hsu Graduate : Yu Cheng Chen Author: Michael.
IIS for Speech Processing Michael J. Watts
Voice Activity Detection Based on Sequential Gaussian Mixture Model Zhan Shen, Jianguo Wei, Wenhuan Lu, Jianwu Dang Tianjin Key Laboratory of Cognitive.
High Quality Voice Morphing
Self-Organizing Network Model (SOM) Session 11
Deep Feedforward Networks
Mr. Darko Pekar, Speech Morphing Inc.
Building Adaptive Basis Function with Continuous Self-Organizing Map
ARTIFICIAL NEURAL NETWORKS
Advanced Wireless Networks
What is an ANN ? The inventor of the first neuro computer, Dr. Robert defines a neural network as,A human brain like system consisting of a large number.
Final Year Project Presentation --- Magic Paint Face
RECURRENT NEURAL NETWORKS FOR VOICE ACTIVITY DETECTION
Kocaeli University Introduction to Engineering Applications
EE513 Audio Signals and Systems
Introduction to Artificial Intelligence Lecture 24: Computer Vision IV
Introduction to Neural Network
Volume 50, Issue 1, Pages (April 2006)
Auditory Morphing Weyni Clacken
Random Neural Network Texture Model
Presentation transcript:

Voice conversion using Artificial Neural Networks 606410147 蘇宗埔 606410068 莊俊輝

Outline Introduction Voice conversion Experiments Conclusion

Introduction A voice conversion system morphs the utterance of a source speaker so that it is perceived as if spoken by a specified target speaker. We have exploited the mapping abilities of Artificial Neural Networks (ANN) to perform mapping of spectral features of a source speaker to that of a target speaker.

Voice conversion Gaussian Mixture Model (GMM) Gaussian mixture model is used to describe the location of feature parameters and covariance matrix to describe the change of shape Therefore Gaussian mixture model can describe the feature distribution of sound very smoothly. Artificial Neural Network (ANN)   Artificial Neural Network (ANN) models consist of interconnected processing nodes, where each node represents the model of an artificial neuron, and the interconnection between two nodes has a weight associated with it.

Experiments For the ABX test, we presented the listeners with a GMM transformed utterance and an ANN transformed utterance to be compared against X which will always be the original target speaker utterance. To make sure that the listener does not get biased, we have shuffled the position of ANN/GMM transformed utterances i.e., A and B, with X always constant at the end. A total of 32 subjects were asked to participate in the four experiments below, the results of which are provided in Figure.

Conclusion We have exploited the mapping abilities of ANN and it is shown that ANN can be used for spectral transformation in the voice conversion framework on a continuous speech signal. The usefulness of ANN has been demonstrated on different pairs of speakers. Comparison between ANN and GMM based transformation has shown that the ANN based spectral transformation yields better results than that of GMM with MLPG.