Look Who’s Talking Now SEM Exchange, Fall 2008 October 9, 2008 1 Montgomery College Speaker Identification Using Pitch Engineering Expo Banquet 2009 05/08/09.

Presentation transcript:

Speaker Identification Using a Pitch Detection Algorithm
Presenters: Estefany Carrillo, Roberto M. Meléndez, Komal Syed
Montgomery College Speech Processing Center
Faculty Advisor: Dr. Uchechukwu Abanulo

Presentation Outline
- Introduction
- Speech Classification Algorithm
- Pitch Detection Algorithm
- Application and Results
- Summary

Objectives
- To estimate the pitch contour of a given speech signal using autocorrelation
- To determine the effectiveness of pitch for speaker identification

Speech Signals
To understand pitch, one must first understand some basic concepts of speech signals.

Voiced vs. Unvoiced Speech
Voiced:
- Quasi-periodic excitation
- Modulation by the vocal tract
- Production of mainly vowels
- High energy
Unvoiced:
- No periodic vibration of the vocal cords
- Noise-like nature
- Production of most consonants
- Low energy

Speech Signals [figure]

Pitch Illustration
- The pitch period is the distance in time from one peak to the next
- It is approximately the same for the same phoneme spoken by the same speaker
- No periodicity, no frequency

How Do We Measure the Pitch Period Automatically?
Correlation: a measure of similarity between two signals. Two signals are compared by:
- Sliding one signal by a certain time lag
- Multiplying the overlapping regions
- Repeating the process and adding the products until there is no more overlap
Cross-correlation: two different signals are compared.
Autocorrelation: a signal is correlated with itself. This results in a maximum peak at zero lag (time = 0), and the rest of the correlation signal tapers off to zero.
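
The slide, multiply, and add steps can be sketched in plain Python. This is a minimal illustration rather than the presenters' implementation: the function names `correlate_at_lag` and `autocorrelation` and the toy signal are assumptions introduced here.

```python
def correlate_at_lag(x, y, lag):
    """Sum of products x[n] * y[n + lag] over the region where both overlap."""
    total = 0.0
    for n in range(len(x)):
        m = n + lag
        if 0 <= m < len(y):
            total += x[n] * y[m]
    return total


def autocorrelation(x, max_lag):
    """Correlate a signal with itself for lags 0..max_lag."""
    return [correlate_at_lag(x, x, lag) for lag in range(max_lag + 1)]


# Toy periodic signal with a period of 4 samples (illustrative only).
signal = [0.0, 1.0, 0.0, -1.0] * 4
r = autocorrelation(signal, 8)
```

In this toy case the largest value sits at lag 0, and the next largest sits at lag 4, the period of the signal. The values shrink at larger lags because fewer samples overlap.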

Rationale for Autocorrelation
1. A periodic (or quasi-periodic) signal will be similar from one period to the next.
2. For each speech frame, the largest peak in the autocorrelation function away from zero lag is therefore expected to occur at the pitch period.

Speech Classification Algorithm

Speech Classification
1. We start with a normalized speech signal (amplitudes from -1 to 1).
2. Since speech is non-stationary (its characteristics change frequently with time), we first segment this signal into short frames (of about 10 ms).
3. We then compute the average energy of each frame.
4. Based on a pre-determined threshold, we classify the speech into voiced, unvoiced, or background.
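
The four steps above can be sketched as follows. Several details are assumptions, since the slide does not give them: average energy is taken as the mean of the squared samples in a frame, frames do not overlap, and two thresholds (`voiced_threshold` and `background_threshold`, both hypothetical names and values) separate the three classes.

```python
def frame_signal(samples, sample_rate, frame_ms=10):
    """Split a signal into consecutive non-overlapping frames of about frame_ms."""
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def average_energy(frame):
    """Mean of the squared sample values (assumed definition of average energy)."""
    return sum(s * s for s in frame) / len(frame)


def classify_frames(samples, sample_rate, voiced_threshold, background_threshold):
    """Label each frame by comparing its average energy against two thresholds."""
    labels = []
    for frame in frame_signal(samples, sample_rate):
        energy = average_energy(frame)
        if energy >= voiced_threshold:
            labels.append("voiced")
        elif energy >= background_threshold:
            labels.append("unvoiced")
        else:
            labels.append("background")
    return labels


# Three synthetic 10 ms frames at 8 kHz: loud, quiet, silent (illustrative only).
demo = [0.5] * 80 + [0.05] * 80 + [0.0] * 80
demo_labels = classify_frames(demo, 8000, voiced_threshold=0.01,
                              background_threshold=0.001)
```

Energy alone is a coarse cue; it follows the slides' observation that voiced speech is high in energy and unvoiced speech is low.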

Pitch Detection Algorithm

Autocorrelation-Based PDA
1. First, we automatically assign a pitch of zero to every unvoiced or silence frame determined by the speech classification algorithm.
2. We then compute the autocorrelation function of each voiced frame.
3. A peak is searched for within the 2 ms to 16 ms lag range.
4. The lag of this peak is considered the pitch period for that frame, and the pitch is computed as the inverse of that lag.
[figure labels: Pitch = 0; Zero lag]
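
A minimal sketch of these steps for a single frame is shown below. The function names are hypothetical, the autocorrelation is left unnormalized, and the 40 ms test frame is an assumption: a frame must be longer than 16 ms for the full lag range to be searchable.

```python
import math


def autocorr(frame, lag):
    """Unnormalized autocorrelation of one frame at a single lag."""
    return sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))


def frame_pitch_hz(frame, sample_rate, is_voiced, min_ms=2.0, max_ms=16.0):
    """Pitch of one frame in Hz; 0.0 for unvoiced or silence frames."""
    if not is_voiced:
        return 0.0
    min_lag = max(1, int(sample_rate * min_ms / 1000))
    max_lag = min(int(sample_rate * max_ms / 1000), len(frame) - 1)
    if max_lag < min_lag:
        return 0.0  # frame too short to search the requested lag range
    # The lag with the largest autocorrelation value is taken as the pitch period.
    best_lag = max(range(min_lag, max_lag + 1),
                   key=lambda lag: autocorr(frame, lag))
    return sample_rate / best_lag


# Synthetic 100 Hz tone, 40 ms long, sampled at 8 kHz (illustrative only).
rate = 8000
tone = [math.sin(2 * math.pi * 100 * n / rate) for n in range(320)]
pitch = frame_pitch_hz(tone, rate, is_voiced=True)
```

The 2 ms to 16 ms search window corresponds to pitch values between 500 Hz and 62.5 Hz, and it also keeps the search away from the zero-lag peak.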

Autocorrelation-Based PDA: Illustration [figure]

Application and Results

Speaker Recognition [block diagram]
Reference Speech -> Feature Extraction -> Model Building
Test Speech -> Feature Extraction -> Comparison -> Recognition Decision -> System Output

Speaker Identification Using PDA [block diagram]
Reference Speech -> Pitch Detection -> Average Pitch of Signal
Test Speech -> Pitch Detection and Average Pitch Computation -> Distance Computation -> Speaker = Minimum Distance -> System Output
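
The identification stage can be sketched as a nearest-match search on average pitch. Two choices here are assumptions, not details from the slides: the average is taken over voiced frames only (frames with nonzero pitch), and the distance is the absolute difference between averages. The speaker names and contour values are made up for illustration.

```python
def average_pitch(contour):
    """Mean pitch over frames with nonzero pitch (assumed: voiced frames only)."""
    voiced = [p for p in contour if p > 0]
    return sum(voiced) / len(voiced) if voiced else 0.0


def identify_speaker(test_contour, reference_contours):
    """Return the reference speaker whose average pitch is closest to the test."""
    test_avg = average_pitch(test_contour)
    return min(reference_contours,
               key=lambda name: abs(average_pitch(reference_contours[name]) - test_avg))


# Made-up pitch contours in Hz; 0 marks an unvoiced or silence frame.
references = {
    "speaker_a": [0, 210, 220, 0, 215],
    "speaker_b": [0, 110, 120, 115],
}
match = identify_speaker([0, 205, 0, 212], references)
```

Because a single number summarizes each speaker, two speakers with similar average pitch cannot be told apart by this criterion.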

Experiment
Group I: 10 women
Group II: 10 men
1. Record each group member twice saying the same phrase.
2. Record each group member saying a different phrase.

Categories
- Case I: Female/Same Phrase
- Case II: Male/Same Phrase
- Case III: Female/Different Phrase
- Case IV: Male/Different Phrase
- Case V: Female and Male/Same Phrase
- Case VI: Female and Male/Different Phrase

Procedure
1. Select a range of thresholds for unvoiced segments of speech: Range = [0.001:0.0005:0.01], that is, from 0.001 to 0.01 in steps of 0.0005.
2. Construct the pitch contour for each of the reference and test speech files for all thresholds.
3. Using the minimum distance criterion, determine the test speaker that matches the reference speaker.
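
The threshold range is written in start:step:stop notation. A small sketch of how it expands into individual values follows; the helper name `threshold_range` is an assumption, and the rounding is there only to avoid floating-point drift.

```python
def threshold_range(start, step, stop, digits=4):
    """Expand start:step:stop into a list of values, inclusive of both ends."""
    count = int(round((stop - start) / step)) + 1
    return [round(start + i * step, digits) for i in range(count)]


thresholds = threshold_range(0.001, 0.0005, 0.01)
```

This yields 19 candidate thresholds, so step 2 builds 19 pitch contours per speech file.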

Pitch Contours: Reference Speaker [figure: amplitude and pitch plotted against time (ms)]

Pitch Contours: Matched Test Speaker [figure: amplitude and pitch plotted against time (ms)]

Best Threshold
3. Select the threshold that gives the maximum number of correctly matched speakers for each category.
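
Selecting the best threshold amounts to a simple search over the candidates. In the sketch below, `count_matches` is a hypothetical callable that returns the number of correctly matched speakers for a given threshold, and the scores are made-up placeholders, not results from the experiment.

```python
def best_threshold(thresholds, count_matches):
    """Return (threshold, matches) with the most matches; ties keep the first seen."""
    best_t, best_n = None, -1
    for t in thresholds:
        n = count_matches(t)
        if n > best_n:
            best_t, best_n = t, n
    return best_t, best_n


# Placeholder scores for illustration only (not data from the study).
toy_scores = {0.001: 6, 0.0015: 8, 0.002: 8, 0.0025: 5}
best = best_threshold(sorted(toy_scores), toy_scores.get)
```

The search would be run once per category, since each category may favor a different threshold.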

Noise
4. Add different levels of noise (5 dB to 30 dB) to:
- Both reference and test speech files
- Only the reference speech file
- Only the test speech files
5. Examine the number of matched speakers vs. the level of SNR (signal-to-noise ratio).
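
One way to add noise at a chosen SNR is to scale the noise power relative to the measured signal power. The slides do not say what kind of noise was used, so white Gaussian noise is an assumption here, as are the function names.

```python
import math
import random


def mean_power(samples):
    """Average of the squared sample values."""
    return sum(s * s for s in samples) / len(samples)


def add_noise(samples, snr_db, rng=None):
    """Add white Gaussian noise (assumed noise type) at the requested SNR in dB."""
    if rng is None:
        rng = random.Random(0)
    noise_power = mean_power(samples) / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in samples]


def measured_snr_db(clean, noisy):
    """SNR in dB estimated from a clean signal and its noisy version."""
    noise = [b - a for a, b in zip(clean, noisy)]
    return 10 * math.log10(mean_power(clean) / mean_power(noise))


# One second of a 100 Hz tone at 8 kHz, corrupted at 20 dB SNR (illustrative only).
clean = [math.sin(2 * math.pi * 100 * n / 8000) for n in range(8000)]
noisy = add_noise(clean, 20, random.Random(1))
```

A lower SNR value means more noise, so 5 dB is the hardest condition in the range and 30 dB the easiest.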

Female/Same Phrase [figure panels: Noise Added to Both Files; Noise Added to Reference File; Noise Added to Test File]

Male/Same Phrase [figure panels: Noise Added to Both Files; Noise Added to Reference File; Noise Added to Test File]

Female/Different Phrase [figure panels: Noise Added to Both Files; Noise Added to Reference File; Noise Added to Test File]

Male/Different Phrase [figure panels: Noise Added to Both Files; Noise Added to Reference File; Noise Added to Test File]

Male and Female/Same Phrase [figure panels: Noise Added to Both Files; Noise Added to Reference File; Noise Added to Test File]

Male and Female/Different Phrase [figure panels: Noise Added to Both Files; Noise Added to Reference File; Noise Added to Test File]

Summary
1. Pitch detection algorithms are heavily dependent on speech segmentation accuracy.
2. Pitch is somewhat effective as a simple speaker identifier.

Results
3. As the signal-to-noise ratio increases, the number of correctly identified speakers increases.
4. There seems to be an optimum signal-to-noise ratio that gives the maximum number of correctly matched speakers.

Presenters: Estefany Carrillo, Roberto M. Meléndez, Komal Syed
Montgomery College Speech Processing Center
Faculty Advisor: Dr. Uchechukwu Abanulo