Emotional Speech Analysis using Artificial Neural Networks IMCSIT-AAIA October 18-20, 2010 – Wisla, Poland. Jana Tuckova & Martin Sramka Department of.


Emotional Speech Analysis using Artificial Neural Networks
IMCSIT-AAIA, October 18-20, 2010, Wisla, Poland
Jana Tuckova & Martin Sramka
Department of Circuit Theory, CTU - FEE in Prague, Laboratory of Artificial Neural Network Applications

Overview
- Introduction
- Method
  - The patterns based on time and frequency characteristics
  - The patterns based on music theory
  - Combination of both previous approaches
- Experiments and Results
- Conclusion and future work
Acknowledgment: This work was supported by the Czech Science Foundation 102/09/0989 grant.

Introduction
Our aim: a classification of speech emotions.
Why ANN?
- ANN solutions are robust on real-world problems, which is a great advantage, for example, in the processing of noisy signals.
- It is possible to treat various types of input data concurrently.

Introduction
Which way?
- By a description of speech signals formulated by:
  - standard speech processing methods
  - music theory
  - a combination of both methods
- By an ANN approach: MLNN, KSOM

Introduction
MLNN
- with one hidden layer
- the input layer is given by the key linguistic parameters
- the outputs are the various classes of emotions
- the training algorithm: Scaled Conjugate Gradient with superlinear convergence rate
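To make the MLNN description above concrete, here is a minimal sketch of the forward pass of a network with one hidden layer and four emotion outputs. The layer sizes, the tanh and softmax functions, and the random weights are assumptions chosen for illustration; the slides do not state them. Training with Scaled Conjugate Gradient is not shown.

```python
import math
import random

N_INPUTS = 6    # hypothetical number of input parameters per pattern
N_HIDDEN = 8    # hypothetical hidden-layer size
N_CLASSES = 4   # anger, boredom, pleasure, sadness (from the slides)


def init_layer(n_in, n_out, rng):
    """Return small random weights (n_out rows of n_in values) and zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def affine(weights, biases, x):
    """Compute W x + b for one input vector."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]


def softmax(z):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def forward(params, x):
    """One hidden tanh layer followed by a softmax output layer."""
    (w1, b1), (w2, b2) = params
    hidden = [math.tanh(v) for v in affine(w1, b1, x)]
    return softmax(affine(w2, b2, hidden))


rng = random.Random(0)
params = (init_layer(N_INPUTS, N_HIDDEN, rng), init_layer(N_HIDDEN, N_CLASSES, rng))
probs = forward(params, [0.1, -0.3, 0.7, 0.0, 0.5, -0.2])
predicted_class = probs.index(max(probs)) + 1  # classes numbered 1..4
```

With untrained random weights the predicted class is meaningless; the sketch only shows how an input pattern maps to one of the four emotion classes.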

Introduction
KSOM - SSOM
- combines aspects of the VQ method with the topology-preserving ordering of the quantization vectors
- only for well-known input data
- for well-known classes of input data
The database for ANN:
- 216 patterns for training
- 72 for validation
- 72 for test
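The combination of vector quantization with topology-preserving ordering can be sketched as a single self-organizing map update. This is a generic illustration, not the authors' implementation: the Gaussian neighbourhood, the map size, and all numeric values are assumptions, and the supervised variant (SSOM) is not covered.

```python
import math


def best_matching_unit(codebook, x):
    """VQ step: index of the codebook vector closest to x (squared Euclidean)."""
    distances = [sum((w - xi) ** 2 for w, xi in zip(unit, x)) for unit in codebook]
    return distances.index(min(distances))


def som_update(codebook, grid, x, learning_rate, radius):
    """Move the BMU and its map neighbours toward x; return the BMU index."""
    bmu = best_matching_unit(codebook, x)
    for i, unit in enumerate(codebook):
        # Neighbourhood strength depends on distance on the map grid,
        # not on distance in the input space.
        grid_dist_sq = sum((a - b) ** 2 for a, b in zip(grid[i], grid[bmu]))
        h = math.exp(-grid_dist_sq / (2.0 * radius ** 2))
        codebook[i] = [w + learning_rate * h * (xi - w) for w, xi in zip(unit, x)]
    return bmu


# Hypothetical 1 x 3 map with 2-dimensional inputs.
grid = [(0, 0), (0, 1), (0, 2)]
codebook = [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]]
bmu = som_update(codebook, grid, [0.9, 1.0], learning_rate=0.5, radius=1.0)
```

The winning unit moves most strongly toward the input, and its neighbours on the grid follow with decreasing strength, which is what keeps nearby map units similar.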

Corpus creation: Database of Utterances

Words (in Czech)            Words - translation
Jé.                         Whoa.
Má?                         Got it?
Nevím.                      I don't know.
Vidíš?                      Do you see?
Povídej!                    Tell me!
Poezie.                     Poetry.

Sentences (in Czech)        Sentences - translation
To mi nevadí.               I don't mind.
Neumím si to vysvětlit.     I can't explain this.
To bude světový rekord.     It will be a world record.
Jak se ti to líbí?          How do you like it?
Podívej se na nebe!         Look up at the heavens!
Až přijdeš, uvidíš.         When you come, you'll see.

Corpus creation
- The sentences were read by professional actors (2 female + 1 male).
- Speech recording: in a professional recording studio, "wav" format, sampling frequency 44.1 kHz, 24 bit.
- The recorded emotional speech was subjectively evaluated by 4 persons.
- The final database contained 720 patterns: 360 patterns for one-word sentences and 360 patterns for multiword sentences.
- Emotions: 1 - anger (H), 2 - boredom (N), 3 - pleasure (R), 4 - sadness (S).
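The subset sizes quoted for the ANN database (216 for training, 72 for validation, 72 for test) add up to 360, the size of each sentence-type group. The sketch below shows one way such a split could be made; applying it per sentence type, shuffling at random, and the fixed seed are assumptions, since the slides do not describe the procedure.

```python
import random


def split_patterns(patterns, n_train=216, n_val=72, n_test=72, seed=0):
    """Shuffle and split patterns into training, validation and test subsets."""
    if len(patterns) != n_train + n_val + n_test:
        raise ValueError("subset sizes must add up to the number of patterns")
    shuffled = list(patterns)
    random.Random(seed).shuffle(shuffled)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


one_word = list(range(360))  # placeholder IDs for the 360 one-word patterns
train, val, test = split_patterns(one_word)
```

The three subsets are disjoint and together cover all 360 patterns, which corresponds to a 60/20/20 split.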

Method: The Patterns Based on Music Theory
The method is based on the idea of the musical interval: the frequency difference between a specific n-th tone and the reference tone.
Example: the quint is the frequency ratio of the fifth tone divided by the first tone.
[Table: intervals (Int.) from 1st to 8th, their variants (Var., Min/Maj), and frequency ratios (FR); the numeric values are not preserved in this transcript.]

Method: The Patterns Based on Music Theory
- The reference frequency (F0) is given by the choices in each utterance feature.
- The frequency ratios are compared with the musical intervals.
- Circle of fifths: fifth = f3/f2, geometric series.
- Tone affinity: decreases from n=1 to n=7, increases from n=8 to n=13.
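Comparing a frequency ratio with musical intervals can be sketched as follows. The example assumes twelve-tone equal temperament and standard interval names; the slides do not say which tuning system or interval table the authors used, and the function names here are hypothetical.

```python
import math

# Equal-tempered interval names by number of half tones (an assumption; the
# slides do not say which tuning system was used).
INTERVAL_NAMES = [
    "unison", "minor second", "major second", "minor third", "major third",
    "fourth", "tritone", "fifth", "minor sixth", "major sixth",
    "minor seventh", "major seventh", "octave",
]


def half_tones(f, f_ref):
    """Signed distance from f_ref to f in half tones (12 per octave)."""
    return 12.0 * math.log2(f / f_ref)


def nearest_interval(f, f_ref):
    """Return (interval name, rounded half tones), folded into one octave."""
    steps = round(half_tones(f, f_ref))
    folded = abs(steps) % 12
    if folded == 0 and steps != 0:
        folded = 12  # whole octaves are reported as "octave", not "unison"
    return INTERVAL_NAMES[folded], steps


name, steps = nearest_interval(180.0, 120.0)  # frequency ratio 3/2
```

A ratio of 3/2 lies about 7.02 half tones above the reference, so it rounds to seven half tones, the fifth.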

Experimental Results
[Figure: U-matrices for one-word sentences and multi-word sentences. Labels: H - anger, N - boredom, R - pleasure, S - sadness.]

Conclusion - for music theory
Comparison to some publications (successful classifications):
- 54-64 %: standard classifier
- 81 %: ANN, hight note versus 12 half tones, Korean language
Our results (successful classifications):
- 74 % (MLNN)
- SSOM: QE / TE for one-word sentences and multiword sentences (values not preserved in this transcript)

Conclusion - future work
Our effort in future work:
- ANN application in prosody modelling: we want to apply the results of the described experiments with emotional speech to improve the naturalness of synthetic speech.
- ANN application in the analysis of children's disordered speech (developmental dysphasia).

Conclusion - future work
- These different application domains influence the database creation.
- Multiword sentences are more suitable for prosody modelling.
- One-word sentences are suitable for the analysis of children's disordered speech. Why? A speech disorder often manifests as an inability to pronounce whole sentences.

Thank you for your attention.
The End