Korean Phoneme Discrimination
Ben Lickly


Motivation
Certain Korean phonemes are very difficult for English speakers to distinguish, such as ㅅ (IPA: s) and its tense counterpart ㅆ (IPA: s͈).

Network Inputs
Sound files were edited down to the single relevant phoneme. Mel-frequency cepstral coefficients (MFCCs) were computed over each half of the phoneme: 13 coefficients per half, for 26 total inputs.
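This feature pipeline can be sketched with a from-scratch MFCC in NumPy/SciPy. The FFT size, filter count, and sample rate below are illustrative assumptions (the deck does not state them); each half of the phoneme is reduced to one 13-coefficient vector, as the slide describes.

```python
import numpy as np
from scipy.fftpack import dct

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular mel-spaced filters over the FFT bins."""
    def hz_to_mel(f): return 2595 * np.log10(1 + f / 700)
    def mel_to_hz(m): return 700 * (10 ** (m / 2595) - 1)
    mels = np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        for k in range(l, c):
            fb[i - 1, k] = (k - l) / max(c - l, 1)
        for k in range(c, r):
            fb[i - 1, k] = (r - k) / max(r - c, 1)
    return fb

def mfcc(signal, sr, n_coeffs=13, n_fft=512, n_filters=26):
    # Crude single-frame MFCC: uses only the first n_fft samples;
    # a real implementation would average over overlapping frames.
    spectrum = np.abs(np.fft.rfft(signal, n_fft)) ** 2
    energies = mel_filterbank(n_filters, n_fft, sr) @ spectrum
    return dct(np.log(energies + 1e-10), norm="ortho")[:n_coeffs]

def phoneme_features(signal, sr):
    """The slide's input vector: 13 MFCCs per half of the phoneme."""
    half = len(signal) // 2
    return np.concatenate([mfcc(signal[:half], sr), mfcc(signal[half:], sr)])

sr = 16000
tone = np.sin(2 * np.pi * 440 * np.arange(sr // 4) / sr)  # stand-in "phoneme"
feats = phoneme_features(tone, sr)
print(feats.shape)  # (26,)
```

The two halves are concatenated, giving the 26-dimensional input vector the network consumes.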

Network Outputs
There are only two outputs, one corresponding to ㅅ and one to ㅆ:
Input sound of ㅅ → network outputs [1, 0]
Input sound of ㅆ → network outputs [0, 1]

Network Structure
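The structure diagram did not survive the transcript. Based on the input and output slides and the 5-hidden-node configuration from the Results slide, a minimal forward-pass sketch is below; the sigmoid activations and weight initialization are assumptions, not stated in the deck.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed topology: 26 MFCC inputs -> 5 hidden nodes -> 2 outputs.
N_IN, N_HIDDEN, N_OUT = 26, 5, 2

W1 = rng.normal(0, 0.5, (N_HIDDEN, N_IN))
b1 = np.zeros(N_HIDDEN)
W2 = rng.normal(0, 0.5, (N_OUT, N_HIDDEN))
b2 = np.zeros(N_OUT)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x):
    """Map a 26-dim MFCC feature vector to two class scores."""
    h = sigmoid(W1 @ x + b1)
    return sigmoid(W2 @ h + b2)

y = forward(rng.normal(size=N_IN))
print(y.shape)  # (2,)
```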

Training Data
Sample data was gathered from two native speakers: one male and one female. Korean audio was also downloaded from the internet and edited. A total of 56 samples was collected: 26 samples of each consonant.

Training Method
10% of the data was withheld for validation, and 10,000 epochs of backpropagation were performed. Often the network did not converge within this time frame or did not classify the test data correctly; in these cases, multiple rounds of training were needed.
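The procedure above can be sketched in plain NumPy on synthetic stand-in data. The 26-5-2 topology, sigmoid activations, squared-error loss, and learning rate are assumptions; real MFCC feature vectors would replace the random ones, and the epoch count is reduced from the slide's 10,000 for brevity.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Synthetic stand-ins for 56 26-dim MFCC vectors, made separable so
# training can succeed; one-hot targets: [1,0] for one class, [0,1] for the other.
X = rng.normal(size=(56, 26))
X[:28] += 0.5
X[28:] -= 0.5
T = np.zeros((56, 2))
T[:28, 0] = 1.0
T[28:, 1] = 1.0

# Withhold ~10% of the data for validation, as in the slide.
idx = rng.permutation(56)
val, train = idx[:6], idx[6:]

W1 = rng.normal(0, 0.5, (5, 26)); b1 = np.zeros(5)
W2 = rng.normal(0, 0.5, (2, 5)); b2 = np.zeros(2)
lr = 0.1

for epoch in range(2000):  # the slide uses 10,000 epochs
    for i in train:
        x, t = X[i], T[i]
        h = sigmoid(W1 @ x + b1)
        y = sigmoid(W2 @ h + b2)
        # Backpropagate squared error through both sigmoid layers.
        d_out = (y - t) * y * (1 - y)
        d_hid = (W2.T @ d_out) * h * (1 - h)
        W2 -= lr * np.outer(d_out, h); b2 -= lr * d_out
        W1 -= lr * np.outer(d_hid, x); b1 -= lr * d_hid

def predict(x):
    return np.argmax(sigmoid(W2 @ sigmoid(W1 @ x + b1) + b2))

acc = np.mean([predict(X[i]) == np.argmax(T[i]) for i in val])
print(acc)
```

On this easy synthetic data a single run converges; on the real recordings, as the slide notes, several restarts from fresh random weights were sometimes needed.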

Results
With 5 hidden-layer nodes, the network took approximately 3 hours to converge. With 2 hidden-layer nodes, it took approximately 2.5 hours. With a single hidden-layer node, the network was not able to converge in a reasonable time frame (12 hours).

Potential Improvements
The network does not deal well with sounds that fall between the two extremes. One solution would be to add more diversity to the sample data (e.g., include non-native speakers).

Demo/Questions
Samples of my speech: ㅅ : ㅆ :