G.S. MOZE COLLEGE OF ENGINEERING, BALEWADI, PUNE-45
A PRESENTATION ON Voice Morphing
By: Anil Mahadik
Project Guide: Prof. Sonali Ghote

Contents
Title
Introduction
History
Need for the Vocal Tract Area Function
AR-HMM Analysis
AR-HMM Diagram
Re-synthesis of the Converted Voice
Training Phase
Conversion and Morphing Phase
Applications
Conclusion
References

Title The project title is “Voice Morphing”. It presents flexible voice morphing based on a linear combination of multiple speakers’ vocal tract area functions. Voice morphing, or voice conversion, usually means transformation from a source speaker’s speech to a target speaker’s.

Introduction The main goal of the developed audio morphing methods is the smooth transformation from one sound to another. These techniques are essentially a point-to-point mapping in a feature space. Many applications may benefit from this sort of technology. This research on voice morphing aims to relax that restriction by extending the mapping from point-to-point to area-to-area through the use of multiple speakers.

History Voice morphing is a technology developed at the Los Alamos National Laboratory in New Mexico, USA, by George Papcun and publicly demonstrated in 1999. It enables speech patterns to be cloned, so that an accurate copy of a person's voice can be made and then used to say anything the operator wishes.

Need for the Vocal Tract Area Function Since the 1990s, many techniques for voice conversion have been proposed [1-7]. One successful approach uses a statistical method to map a source speaker’s voice to a target speaker’s, but a weakness of these methods is the discontinuity of formants. The proposed method employs an estimated vocal tract area function to avoid this weakness.

Vocal Tract Area Function (A) Interpolation in the vocal tract area domain is considered to provide a reasonably continuous transition of formants. Estimation of the vocal tract area function implies simultaneous estimation of the voice source characteristics.
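As an illustration only (not the authors' implementation), the sketch below blends two area functions in the log-area domain and maps the result back to an all-pole synthesis filter via Kelly-Lochbaum reflection coefficients and the step-up recursion. All function names are illustrative, and sign and ordering conventions for the lossless-tube model vary between references.

```python
import numpy as np

def interpolate_log_area(area_src, area_tgt, alpha):
    """Blend two vocal tract area functions in the log-area domain.

    alpha = 0 gives the source shape, alpha = 1 the target shape.
    """
    return np.exp((1.0 - alpha) * np.log(area_src) + alpha * np.log(area_tgt))

def area_to_reflection(area):
    """Kelly-Lochbaum reflection coefficients of a lossless tube model.

    area[0] is the section nearest the glottis, area[-1] nearest the lips;
    the sign convention here is one common choice, not the only one.
    """
    return (area[1:] - area[:-1]) / (area[1:] + area[:-1])

def reflection_to_lpc(refl):
    """Step-up recursion: reflection (PARCOR) coefficients -> AR polynomial.

    Returns a = [1, a1, ..., ap]; synthesis is then filtering an excitation
    with 1 / A(z), e.g. scipy.signal.lfilter([1.0], a, excitation).
    """
    a = np.array([1.0])
    for k in refl:
        a = np.concatenate([a, [0.0]]) + k * np.concatenate([[0.0], a[::-1]])
    return a
```

Because the blend is done on tube areas rather than directly on filter coefficients, the formants of the interpolated filter tend to move smoothly between the two speakers, which is the continuity argument made above.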

AR-HMM Analysis To estimate the vocal tract area function, Auto-Regressive Hidden Markov Model (AR-HMM) analysis of speech is introduced. The AR-HMM represents the vocal tract characteristics by an AR model and the glottal source wave by an HMM. The analysis estimates the vocal tract resonance characteristics and the vocal source waves in the maximum likelihood sense.
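The full AR-HMM analysis models the glottal wave with an HMM, which is beyond a short example; as a hedged stand-in, the sketch below shows only the AR half of the decomposition: per-frame LPC by the autocorrelation method (Levinson-Durbin) and inverse filtering to expose a source-like residual. Windowing, frame handling, and the HMM source model are omitted.

```python
import numpy as np
from scipy.signal import lfilter

def lpc_autocorr(frame, order):
    """AR coefficients [1, a1, ..., ap] of one windowed frame (Levinson-Durbin)."""
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:len(frame) + order]
    a, err = np.array([1.0]), r[0]
    for i in range(1, order + 1):
        k = -(a @ r[i:0:-1]) / err                       # reflection coefficient
        a = np.concatenate([a, [0.0]]) + k * np.concatenate([[0.0], a[::-1]])
        err *= 1.0 - k * k                               # prediction error update
    return a, err

def inverse_filter(frame, a):
    """Residual obtained by inverse filtering: an approximation of the source wave."""
    return lfilter(a, [1.0], frame)
```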

Diagram of AR-HMM

Re-synthesis of the converted voice There are two phases: the training phase and the conversion and morphing phase. The procedure of each phase is shown in the diagrams that follow.

Training phase
AR-HMM analysis: Speech samples with the same phonetic content from both the source and the target speaker are analyzed.
Feature alignment: The feature vectors obtained above are time-aligned using dynamic time warping (DTW) to compensate for any differences in duration between the source and target utterances.
Estimation of the conversion function: The aligned vectors are used to train a joint GMM whose parameters are then used to construct a stochastic conversion function (see the sketch after this list).
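A minimal sketch of the last two training steps, assuming parallel source/target feature sequences (e.g. the AR-HMM-derived log area functions and vocal cord cepstra) have already been extracted; all names and the component count are illustrative.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def dtw_path(X, Y):
    """Dynamic time warping path between two feature sequences (frames x dims)."""
    cost = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1)
    n, m = cost.shape
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i, j] = cost[i - 1, j - 1] + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    path, i, j = [], n, m                 # backtrack from the end of both sequences
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([D[i - 1, j - 1], D[i - 1, j], D[i, j - 1]]))
        i, j = (i - 1, j - 1) if step == 0 else (i - 1, j) if step == 1 else (i, j - 1)
    return path[::-1]

def train_conversion_gmm(src_feats, tgt_feats, n_components=8):
    """Fit a joint GMM on DTW-aligned [source; target] feature vectors."""
    pairs = dtw_path(src_feats, tgt_feats)
    joint = np.array([np.concatenate([src_feats[i], tgt_feats[j]]) for i, j in pairs])
    return GaussianMixture(n_components=n_components, covariance_type="full").fit(joint)
```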

Training phase (diagram)

Conversion and morphing phase
AR-HMM analysis: In this case only the source speaker's utterances are used.
Feature transformation: The GMM-based transformation function constructed during training is used to convert every source log vocal tract area function and vocal cord cepstrum into its most likely target equivalent (a sketch of this step follows the list).
Linear interpolation, synthesis of the source wave, and LPC synthesis complete the phase.
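A hedged sketch of the feature transformation step, assuming a joint GMM trained as in the previous sketch: each source vector is mapped to the posterior-weighted conditional mean of the target block (the standard minimum mean-square-error GMM conversion). The interpolation across speakers and the final source-wave and LPC synthesis are not shown.

```python
import numpy as np
from scipy.stats import multivariate_normal

def convert_vector(gmm, x):
    """Map one source feature vector x to its most likely target equivalent.

    The GMM was trained on joint vectors [x; y], so every component mean and
    covariance splits into source (x) and target (y) blocks.
    """
    d = x.shape[0]
    post = np.empty(gmm.n_components)
    cond = np.empty((gmm.n_components, gmm.means_.shape[1] - d))
    for m in range(gmm.n_components):
        mu_x, mu_y = gmm.means_[m, :d], gmm.means_[m, d:]
        S = gmm.covariances_[m]
        S_xx, S_yx = S[:d, :d], S[d:, :d]
        # responsibility of component m given only the source vector
        post[m] = gmm.weights_[m] * multivariate_normal.pdf(x, mu_x, S_xx)
        # conditional mean E[y | x, component m]
        cond[m] = mu_y + S_yx @ np.linalg.solve(S_xx, x - mu_x)
    post /= post.sum()
    return post @ cond
```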

Conversion and morphing phase (diagram)

Applications Applications include the creation of peculiar voices in animation films. Voice morphing also has tremendous possibilities in military psychological warfare and subversion: it is a powerful battlefield weapon that can be used to provide fake orders to enemy troops, appearing to come from their own commanders.

Conclusion This paper has presented a voice morphing method based on mappings in the vocal tract area space and the glottal source wave spectrum, each of which can be modified independently. These features are obtained using AR-HMM analysis of speech. In future work, we will investigate how to improve the quality of voice conversion with interpolation techniques.

References
[1] L.M. Arslan, D. Talkin, “Voice conversion by codebook mapping of line spectral frequencies and excitation spectrum,” Proc. Eurospeech, pp. 1347-1350, 1997.
[2] Y. Stylianou, O. Cappe, “A system for voice conversion based on probabilistic classification and a harmonic plus noise model,” Proc. ICASSP, pp. 281-284, 1998.
[3] A. Kain, “Spectral voice conversion for text-to-speech synthesis,” Proc. ICASSP, pp. 285-288, 1998.
[4] H. Ye, S. Young, “High Quality Voice Morphing,” Proc. IEEE ICASSP, pp. 9-12, 2004.