PROJECT PROPOSAL Shamalee Deshpande

Problem Statement
Extracting soft biometric features from the speech signal:
- Age
- Gender
- Accent

Speaker Database
A speaker database from the LDC Corpus Catalog*. Preferably, half the speaker set is used for training and the latter half for verification of results. The database should contain speakers of varying gender, age, and accent.
*Linguistic Data Consortium, http://www.ldc.upenn.edu/Catalog/
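The half/half split described above can be sketched as follows; the speaker IDs here are invented placeholders for illustration, not actual LDC catalog identifiers.

```python
# Hypothetical sketch of the half/half speaker split; IDs are made up.
speakers = [f"spk{i:03d}" for i in range(1, 101)]  # e.g. 100 speakers from the corpus
half = len(speakers) // 2
train_speakers = speakers[:half]   # first half: training
test_speakers = speakers[half:]    # latter half: verification of results
print(len(train_speakers), len(test_speakers))  # 50 50
```

Splitting by speaker (rather than by utterance) keeps the two sets disjoint, so verification is never done on voices seen in training.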

Possible Computation for Gender
Pitch: in cepstrum analysis, the vocal-tract (formant) contribution is separated from the excitation, isolating the pitch frequency. LPC can also be used to find pitch. Pitch is then used to classify speech by gender: average males = 100-132 Hz, average females = 142-256 Hz.
Block diagram: Speech -> Window -> DFT -> LOG -> IDFT -> Cepstrum
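The Window/DFT/LOG/IDFT chain above can be sketched as a cepstral pitch estimator. This is a minimal illustration, not the proposal's implementation; the frame size, search range, and the synthetic 120 Hz test signal are assumptions.

```python
# Sketch of cepstral pitch estimation following Speech -> Window -> DFT -> LOG -> IDFT.
import numpy as np

def cepstral_pitch(frame, fs, fmin=60.0, fmax=400.0):
    """Estimate pitch (Hz) of one frame via the real cepstrum."""
    windowed = frame * np.hamming(len(frame))      # Window
    spectrum = np.fft.rfft(windowed)               # DFT
    log_mag = np.log(np.abs(spectrum) + 1e-12)     # LOG (guard against log(0))
    cepstrum = np.fft.irfft(log_mag)               # IDFT -> real cepstrum
    # Pitch appears as a peak at quefrency = fs / F0 samples
    qmin, qmax = int(fs / fmax), int(fs / fmin)
    peak = qmin + np.argmax(cepstrum[qmin:qmax])
    return fs / peak

# Synthetic voiced-like frame: harmonics of 120 Hz (a typical male pitch)
fs = 16000
t = np.arange(2048) / fs
f0 = 120.0
frame = sum(np.sin(2 * np.pi * f0 * k * t) / k for k in range(1, 31))
print(cepstral_pitch(frame, fs))  # close to 120 Hz
```

The estimate could then be compared against the average male/female ranges quoted above to make the gender decision.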

Possible Computation for Accent
People usually acquire characteristic styles of pronouncing phonemes from an early age, dependent on the primary language learned. Cepstral coefficients, presumably the MFCCs, may again be used to analyze the speech spectrum and identify local/non-local speakers in a database.
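A minimal sketch of the standard MFCC computation (power spectrum, triangular mel filterbank, log, DCT) is given below; the frame size, 26 filters, and 13 coefficients are common defaults assumed here, not values from the proposal.

```python
# Minimal MFCC sketch: power spectrum -> mel filterbank -> log -> DCT-II.
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(frame, fs, n_filters=26, n_coeffs=13):
    """Compute MFCCs for a single frame."""
    n_fft = len(frame)
    spec = np.abs(np.fft.rfft(frame * np.hamming(n_fft))) ** 2
    # Triangular filters spaced evenly on the mel scale up to Nyquist
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(fs / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / fs).astype(int)
    fbank = np.zeros((n_filters, len(spec)))
    for i in range(n_filters):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        fbank[i, l:c] = (np.arange(l, c) - l) / max(c - l, 1)   # rising slope
        fbank[i, c:r] = (r - np.arange(c, r)) / max(r - c, 1)   # falling slope
    log_energy = np.log(fbank @ spec + 1e-12)
    # DCT-II decorrelates the log filterbank energies
    n = np.arange(n_filters)
    dct = np.cos(np.pi * np.outer(np.arange(n_coeffs), (2 * n + 1) / (2 * n_filters)))
    return dct @ log_energy

fs = 16000
frame = np.random.default_rng(0).standard_normal(512)  # stand-in for a speech frame
coeffs = mfcc(frame, fs)
print(coeffs.shape)  # (13,)
```

Per-frame MFCC vectors like this would be the features fed to whatever local/non-local classifier is chosen.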

Possible Computation for Age
Source-filter model: BUZZER (glottal excitation; characterized by intensity and pitch) -> TUBE (vocal tract; characterized by formants).
Vocal tract length is said to be a good classifier of a speaker's age. Formant frequencies derived using LPC correlate with the length of the vocal tract. Children are said to have a higher formant frequency range than adults. Specifically, elderly speakers are said to have lower formant frequencies F1, F2, F3 than their younger counterparts, most markedly for F1.
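Deriving formants from LPC, as described above, can be sketched as follows: fit an all-pole model by the autocorrelation method, then read formant frequencies off the angles of the poles near the unit circle. The LPC order, the 0.9 pole-magnitude threshold, and the synthetic two-resonance "vowel" are assumptions for illustration.

```python
# Sketch of LPC-based formant estimation (autocorrelation method + polynomial roots).
import numpy as np

def lpc(signal, order):
    """LPC prediction polynomial A(z) via the autocorrelation normal equations."""
    r = np.correlate(signal, signal, mode="full")[len(signal) - 1:][:order + 1]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:order + 1])
    return np.concatenate(([1.0], -a))

def formants(signal, fs, order=10):
    """Formant frequencies = angles of strong complex poles of the LPC model."""
    roots = np.roots(lpc(signal, order))
    # Keep one of each conjugate pair, and only poles close to the unit circle
    roots = roots[(np.imag(roots) > 0.01) & (np.abs(roots) > 0.9)]
    return np.sort(np.angle(roots) * fs / (2.0 * np.pi))

# Synthetic "vowel": white noise shaped by two resonators near 700 Hz and 1200 Hz
fs = 8000
x = np.random.default_rng(1).standard_normal(4000)
for f in (700.0, 1200.0):
    r, w = 0.97, 2.0 * np.pi * f / fs
    a1, a2 = -2.0 * r * np.cos(w), r * r
    y = np.zeros_like(x)
    for n in range(len(x)):
        y[n] = x[n] - a1 * y[n - 1] - a2 * y[n - 2]
    x = y
print(formants(x, fs))  # resonances near 700 Hz and 1200 Hz should appear
```

For age classification, the F1-F3 values recovered this way would be compared across age groups, since longer vocal tracts shift all formants downward.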